]> git.itanic.dy.fi Git - linux-stable/log
linux-stable
4 years agoLinux 4.14.154 v4.14.154
Greg Kroah-Hartman [Tue, 12 Nov 2019 18:19:08 +0000 (19:19 +0100)]
Linux 4.14.154

4 years agokvm: x86: mmu: Recovery of shattered NX large pages
Junaid Shahid [Thu, 31 Oct 2019 23:14:14 +0000 (00:14 +0100)]
kvm: x86: mmu: Recovery of shattered NX large pages

commit 1aa9b9572b10529c2e64e2b8f44025d86e124308 upstream.

The page table pages corresponding to broken down large pages are zapped in
FIFO order, so that the large page can potentially be recovered, if it is
not longer being used for execution.  This removes the performance penalty
for walking deeper EPT page tables.

By default, one large page will last about one hour once the guest
reaches a steady state.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agokvm: Add helper function for creating VM worker threads
Junaid Shahid [Thu, 31 Oct 2019 23:14:08 +0000 (00:14 +0100)]
kvm: Add helper function for creating VM worker threads

commit c57c80467f90e5504c8df9ad3555d2c78800bf94 upstream.

Add a function to create a kernel thread associated with a given VM. In
particular, it ensures that the worker thread inherits the priority and
cgroups of the calling thread.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agokvm: mmu: ITLB_MULTIHIT mitigation
Paolo Bonzini [Mon, 4 Nov 2019 11:22:02 +0000 (12:22 +0100)]
kvm: mmu: ITLB_MULTIHIT mitigation

commit b8e8c8303ff28c61046a4d0f6ea99aea609a7dc0 upstream.

With some Intel processors, putting the same virtual address in the TLB
as both a 4 KiB and 2 MiB page can confuse the instruction fetch unit
and cause the processor to issue a machine check resulting in a CPU lockup.

Unfortunately when EPT page tables use huge pages, it is possible for a
malicious guest to cause this situation.

Add a knob to mark huge pages as non-executable. When the nx_huge_pages
parameter is enabled (and we are using EPT), all huge pages are marked as
NX. If the guest attempts to execute in one of those pages, the page is
broken down into 4K pages, which are then marked executable.

This is not an issue for shadow paging (except nested EPT), because then
the host is in control of TLB flushes and the problematic situation cannot
happen.  With nested EPT, again the nested guest can cause problems shadow
and direct EPT is treated in the same way.

[ tglx: Fixup default to auto and massage wording a bit ]

Originally-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agoKVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is active
Paolo Bonzini [Sun, 27 Oct 2019 08:36:37 +0000 (09:36 +0100)]
KVM: vmx, svm: always run with EFER.NXE=1 when shadow paging is active

commit 9167ab79936206118cc60e47dcb926c3489f3bd5 upstream.

VMX already does so if the host has SMEP, in order to support the combination of
CR0.WP=1 and CR4.SMEP=1.  However, it is perfectly safe to always do so, and in
fact VMX also ends up running with EFER.NXE=1 on old processors that lack the
"load EFER" controls, because it may help avoiding a slow MSR write.

SVM does not have similar code, but it should since recent AMD processors do
support SMEP.  So this patch makes the code for the two vendors simpler and
more similar, while fixing an issue with CR0.WP=1 and CR4.SMEP=1 on AMD.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agoKVM: x86: add tracepoints around __direct_map and FNAME(fetch)
Paolo Bonzini [Thu, 4 Jul 2019 09:14:13 +0000 (05:14 -0400)]
KVM: x86: add tracepoints around __direct_map and FNAME(fetch)

commit 335e192a3fa415e1202c8b9ecdaaecd643f823cc upstream.

These are useful in debugging shadow paging.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agoKVM: x86: change kvm_mmu_page_get_gfn BUG_ON to WARN_ON
Paolo Bonzini [Sun, 30 Jun 2019 12:36:21 +0000 (08:36 -0400)]
KVM: x86: change kvm_mmu_page_get_gfn BUG_ON to WARN_ON

commit e9f2a760b158551bfbef6db31d2cae45ab8072e5 upstream.

Note that in such a case it is quite likely that KVM will BUG_ON
in __pte_list_remove when the VM is closed.  However, there is no
immediate risk of memory corruption in the host so a WARN_ON is
enough and it lets you gather traces for debugging.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agoKVM: x86: remove now unneeded hugepage gfn adjustment
Paolo Bonzini [Sun, 23 Jun 2019 17:15:49 +0000 (19:15 +0200)]
KVM: x86: remove now unneeded hugepage gfn adjustment

commit d679b32611c0102ce33b9e1a4e4b94854ed1812a upstream.

After the previous patch, the low bits of the gfn are masked in
both FNAME(fetch) and __direct_map, so we do not need to clear them
in transparent_hugepage_adjust.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agoKVM: x86: make FNAME(fetch) and __direct_map more similar
Paolo Bonzini [Mon, 24 Jun 2019 11:06:21 +0000 (13:06 +0200)]
KVM: x86: make FNAME(fetch) and __direct_map more similar

commit 3fcf2d1bdeb6a513523cb2c77012a6b047aa859c upstream.

These two functions are basically doing the same thing through
kvm_mmu_get_page, link_shadow_page and mmu_set_spte; yet, for historical
reasons, their code looks very different.  This patch tries to take the
best of each and make them very similar, so that it is easy to understand
changes that apply to both of them.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agokvm: mmu: Do not release the page inside mmu_set_spte()
Junaid Shahid [Fri, 4 Jan 2019 00:22:21 +0000 (16:22 -0800)]
kvm: mmu: Do not release the page inside mmu_set_spte()

commit 43fdcda96e2550c6d1c46fb8a78801aa2f7276ed upstream.

Release the page at the call-site where it was originally acquired.
This makes the exit code cleaner for most call sites, since they
do not need to duplicate code between success and the failure
label.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agokvm: Convert kvm_lock to a mutex
Junaid Shahid [Fri, 4 Jan 2019 01:14:28 +0000 (17:14 -0800)]
kvm: Convert kvm_lock to a mutex

commit 0d9ce162cf46c99628cc5da9510b959c7976735b upstream.

It doesn't seem as if there is any particular need for kvm_lock to be a
spinlock, so convert the lock to a mutex so that sleepable functions (in
particular cond_resched()) can be called while holding it.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agokvm: x86, powerpc: do not allow clearing largepages debugfs entry
Paolo Bonzini [Fri, 11 Oct 2019 09:59:48 +0000 (11:59 +0200)]
kvm: x86, powerpc: do not allow clearing largepages debugfs entry

commit 833b45de69a6016c4b0cebe6765d526a31a81580 upstream.

The largepages debugfs entry is incremented/decremented as shadow
pages are created or destroyed.  Clearing it will result in an
underflow, which is harmless to KVM but ugly (and could be
misinterpreted by tools that use debugfs information), so make
this particular statistic read-only.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: kvm-ppc@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agoDocumentation: Add ITLB_MULTIHIT documentation
Gomez Iglesias, Antonio [Mon, 4 Nov 2019 11:22:03 +0000 (12:22 +0100)]
Documentation: Add ITLB_MULTIHIT documentation

commit 7f00cc8d4a51074eb0ad4c3f16c15757b1ddfb7d upstream.

Add the initial ITLB_MULTIHIT documentation.

[ tglx: Add it to the index so it gets actually built. ]

Signed-off-by: Antonio Gomez Iglesias <antonio.gomez.iglesias@intel.com>
Signed-off-by: Nelson D'Souza <nelson.dsouza@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agocpu/speculation: Uninline and export CPU mitigations helpers
Tyler Hicks [Mon, 4 Nov 2019 11:22:02 +0000 (12:22 +0100)]
cpu/speculation: Uninline and export CPU mitigations helpers

commit 731dc9df975a5da21237a18c3384f811a7a41cc6 upstream.

A kernel module may need to check the value of the "mitigations=" kernel
command line parameter as part of its setup when the module needs
to perform software mitigations for a CPU flaw.

Uninline and export the helper functions surrounding the cpu_mitigations
enum to allow for their usage from a module.

Lastly, privatize the enum and cpu_mitigations variable since the value of
cpu_mitigations can be checked with the exported helper functions.

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agox86/cpu: Add Tremont to the cpu vulnerability whitelist
Pawan Gupta [Mon, 4 Nov 2019 11:22:01 +0000 (12:22 +0100)]
x86/cpu: Add Tremont to the cpu vulnerability whitelist

commit cad14885a8d32c1c0d8eaa7bf5c0152a22b6080e upstream.

Add the new cpu family ATOM_TREMONT_D to the cpu vunerability
whitelist. ATOM_TREMONT_D is not affected by X86_BUG_ITLB_MULTIHIT.

ATOM_TREMONT_D might have mitigations against other issues as well, but
only the ITLB multihit mitigation is confirmed at this point.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agox86/bugs: Add ITLB_MULTIHIT bug infrastructure
Vineela Tummalapalli [Mon, 4 Nov 2019 11:22:01 +0000 (12:22 +0100)]
x86/bugs: Add ITLB_MULTIHIT bug infrastructure

commit db4d30fbb71b47e4ecb11c4efa5d8aad4b03dfae upstream.

Some processors may incur a machine check error possibly resulting in an
unrecoverable CPU lockup when an instruction fetch encounters a TLB
multi-hit in the instruction TLB. This can occur when the page size is
changed along with either the physical address or cache type. The relevant
erratum can be found here:

   https://bugzilla.kernel.org/show_bug.cgi?id=205195

There are other processors affected for which the erratum does not fully
disclose the impact.

This issue affects both bare-metal x86 page tables and EPT.

It can be mitigated by either eliminating the use of large pages or by
using careful TLB invalidations when changing the page size in the page
tables.

Just like Spectre, Meltdown, L1TF and MDS, a new bit has been allocated in
MSR_IA32_ARCH_CAPABILITIES (PSCHANGE_MC_NO) and will be set on CPUs which
are mitigated against this issue.

Signed-off-by: Vineela Tummalapalli <vineela.tummalapalli@intel.com>
Co-developed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agox86/speculation/taa: Fix printing of TAA_MSG_SMT on IBRS_ALL CPUs
Josh Poimboeuf [Thu, 7 Nov 2019 02:26:46 +0000 (20:26 -0600)]
x86/speculation/taa: Fix printing of TAA_MSG_SMT on IBRS_ALL CPUs

commit 012206a822a8b6ac09125bfaa210a95b9eb8f1c1 upstream.

For new IBRS_ALL CPUs, the Enhanced IBRS check at the beginning of
cpu_bugs_smt_update() causes the function to return early, unintentionally
skipping the MDS and TAA logic.

This is not a problem for MDS, because there appears to be no overlap
between IBRS_ALL and MDS-affected CPUs.  So the MDS mitigation would be
disabled and nothing would need to be done in this function anyway.

But for TAA, the TAA_MSG_SMT string will never get printed on Cascade
Lake and newer.

The check is superfluous anyway: when 'spectre_v2_enabled' is
SPECTRE_V2_IBRS_ENHANCED, 'spectre_v2_user' is always
SPECTRE_V2_USER_NONE, and so the 'spectre_v2_user' switch statement
handles it appropriately by doing nothing.  So just remove the check.

Fixes: 1b42f017415b ("x86/speculation/taa: Add mitigation for TSX Async Abort")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agox86/tsx: Add config options to set tsx=on|off|auto
Michal Hocko [Wed, 23 Oct 2019 10:35:50 +0000 (12:35 +0200)]
x86/tsx: Add config options to set tsx=on|off|auto

commit db616173d787395787ecc93eef075fa975227b10 upstream.

There is a general consensus that TSX usage is not largely spread while
the history shows there is a non trivial space for side channel attacks
possible. Therefore the tsx is disabled by default even on platforms
that might have a safe implementation of TSX according to the current
knowledge. This is a fair trade off to make.

There are, however, workloads that really do benefit from using TSX and
updating to a newer kernel with TSX disabled might introduce a
noticeable regressions. This would be especially a problem for Linux
distributions which will provide TAA mitigations.

Introduce config options X86_INTEL_TSX_MODE_OFF, X86_INTEL_TSX_MODE_ON
and X86_INTEL_TSX_MODE_AUTO to control the TSX feature. The config
setting can be overridden by the tsx cmdline options.

 [ bp: Text cleanups from Josh. ]

Suggested-by: Borislav Petkov <bpetkov@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agox86/speculation/taa: Add documentation for TSX Async Abort
Pawan Gupta [Wed, 23 Oct 2019 10:32:55 +0000 (12:32 +0200)]
x86/speculation/taa: Add documentation for TSX Async Abort

commit a7a248c593e4fd7a67c50b5f5318fe42a0db335e upstream.

Add the documenation for TSX Async Abort. Include the description of
the issue, how to check the mitigation state, control the mitigation,
guidance for system administrators.

 [ bp: Add proper SPDX tags, touch ups by Josh and me. ]

Co-developed-by: Antonio Gomez Iglesias <antonio.gomez.iglesias@intel.com>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Antonio Gomez Iglesias <antonio.gomez.iglesias@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agox86/tsx: Add "auto" option to the tsx= cmdline parameter
Pawan Gupta [Wed, 23 Oct 2019 10:28:57 +0000 (12:28 +0200)]
x86/tsx: Add "auto" option to the tsx= cmdline parameter

commit 7531a3596e3272d1f6841e0d601a614555dc6b65 upstream.

Platforms which are not affected by X86_BUG_TAA may want the TSX feature
enabled. Add "auto" option to the TSX cmdline parameter. When tsx=auto
disable TSX when X86_BUG_TAA is present, otherwise enable TSX.

More details on X86_BUG_TAA can be found here:
https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html

 [ bp: Extend the arg buffer to accommodate "auto\0". ]

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agokvm/x86: Export MDS_NO=0 to guests when TSX is enabled
Pawan Gupta [Wed, 23 Oct 2019 10:23:33 +0000 (12:23 +0200)]
kvm/x86: Export MDS_NO=0 to guests when TSX is enabled

commit e1d38b63acd843cfdd4222bf19a26700fd5c699e upstream.

Export the IA32_ARCH_CAPABILITIES MSR bit MDS_NO=0 to guests on TSX
Async Abort(TAA) affected hosts that have TSX enabled and updated
microcode. This is required so that the guests don't complain,

  "Vulnerable: Clear CPU buffers attempted, no microcode"

when the host has the updated microcode to clear CPU buffers.

Microcode update also adds support for MSR_IA32_TSX_CTRL which is
enumerated by the ARCH_CAP_TSX_CTRL bit in IA32_ARCH_CAPABILITIES MSR.
Guests can't do this check themselves when the ARCH_CAP_TSX_CTRL bit is
not exported to the guests.

In this case export MDS_NO=0 to the guests. When guests have
CPUID.MD_CLEAR=1, they deploy MDS mitigation which also mitigates TAA.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agox86/speculation/taa: Add sysfs reporting for TSX Async Abort
Pawan Gupta [Wed, 23 Oct 2019 10:19:51 +0000 (12:19 +0200)]
x86/speculation/taa: Add sysfs reporting for TSX Async Abort

commit 6608b45ac5ecb56f9e171252229c39580cc85f0f upstream.

Add the sysfs reporting file for TSX Async Abort. It exposes the
vulnerability and the mitigation state similar to the existing files for
the other hardware vulnerabilities.

Sysfs file path is:
/sys/devices/system/cpu/vulnerabilities/tsx_async_abort

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agox86/speculation/taa: Add mitigation for TSX Async Abort
Pawan Gupta [Wed, 23 Oct 2019 09:30:45 +0000 (11:30 +0200)]
x86/speculation/taa: Add mitigation for TSX Async Abort

commit 1b42f017415b46c317e71d41c34ec088417a1883 upstream.

TSX Async Abort (TAA) is a side channel vulnerability to the internal
buffers in some Intel processors similar to Microachitectural Data
Sampling (MDS). In this case, certain loads may speculatively pass
invalid data to dependent operations when an asynchronous abort
condition is pending in a TSX transaction.

This includes loads with no fault or assist condition. Such loads may
speculatively expose stale data from the uarch data structures as in
MDS. Scope of exposure is within the same-thread and cross-thread. This
issue affects all current processors that support TSX, but do not have
ARCH_CAP_TAA_NO (bit 8) set in MSR_IA32_ARCH_CAPABILITIES.

On CPUs which have their IA32_ARCH_CAPABILITIES MSR bit MDS_NO=0,
CPUID.MD_CLEAR=1 and the MDS mitigation is clearing the CPU buffers
using VERW or L1D_FLUSH, there is no additional mitigation needed for
TAA. On affected CPUs with MDS_NO=1 this issue can be mitigated by
disabling the Transactional Synchronization Extensions (TSX) feature.

A new MSR IA32_TSX_CTRL in future and current processors after a
microcode update can be used to control the TSX feature. There are two
bits in that MSR:

* TSX_CTRL_RTM_DISABLE disables the TSX sub-feature Restricted
Transactional Memory (RTM).

* TSX_CTRL_CPUID_CLEAR clears the RTM enumeration in CPUID. The other
TSX sub-feature, Hardware Lock Elision (HLE), is unconditionally
disabled with updated microcode but still enumerated as present by
CPUID(EAX=7).EBX{bit4}.

The second mitigation approach is similar to MDS which is clearing the
affected CPU buffers on return to user space and when entering a guest.
Relevant microcode update is required for the mitigation to work.  More
details on this approach can be found here:

  https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html

The TSX feature can be controlled by the "tsx" command line parameter.
If it is force-enabled then "Clear CPU buffers" (MDS mitigation) is
deployed. The effective mitigation state can be read from sysfs.

 [ bp:
   - massage + comments cleanup
   - s/TAA_MITIGATION_TSX_DISABLE/TAA_MITIGATION_TSX_DISABLED/g - Josh.
   - remove partial TAA mitigation in update_mds_branch_idle() - Josh.
   - s/tsx_async_abort_cmdline/tsx_async_abort_parse_cmdline/g
 ]

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agox86/cpu: Add a "tsx=" cmdline option with TSX disabled by default
Pawan Gupta [Wed, 23 Oct 2019 09:01:53 +0000 (11:01 +0200)]
x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default

commit 95c5824f75f3ba4c9e8e5a4b1a623c95390ac266 upstream.

Add a kernel cmdline parameter "tsx" to control the Transactional
Synchronization Extensions (TSX) feature. On CPUs that support TSX
control, use "tsx=on|off" to enable or disable TSX. Not specifying this
option is equivalent to "tsx=off". This is because on certain processors
TSX may be used as a part of a speculative side channel attack.

Carve out the TSX controlling functionality into a separate compilation
unit because TSX is a CPU feature while the TSX async abort control
machinery will go to cpu/bugs.c.

 [ bp: - Massage, shorten and clear the arg buffer.
       - Clarifications of the tsx= possible options - Josh.
       - Expand on TSX_CTRL availability - Pawan. ]

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agox86/cpu: Add a helper function x86_read_arch_cap_msr()
Pawan Gupta [Wed, 23 Oct 2019 08:52:35 +0000 (10:52 +0200)]
x86/cpu: Add a helper function x86_read_arch_cap_msr()

commit 286836a70433fb64131d2590f4bf512097c255e1 upstream.

Add a helper function to read the IA32_ARCH_CAPABILITIES MSR.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agox86/msr: Add the IA32_TSX_CTRL MSR
Pawan Gupta [Wed, 23 Oct 2019 08:45:50 +0000 (10:45 +0200)]
x86/msr: Add the IA32_TSX_CTRL MSR

commit c2955f270a84762343000f103e0640d29c7a96f3 upstream.

Transactional Synchronization Extensions (TSX) may be used on certain
processors as part of a speculative side channel attack.  A microcode
update for existing processors that are vulnerable to this attack will
add a new MSR - IA32_TSX_CTRL to allow the system administrator the
option to disable TSX as one of the possible mitigations.

The CPUs which get this new MSR after a microcode upgrade are the ones
which do not set MSR_IA32_ARCH_CAPABILITIES.MDS_NO (bit 5) because those
CPUs have CPUID.MD_CLEAR, i.e., the VERW implementation which clears all
CPU buffers takes care of the TAA case as well.

  [ Note that future processors that are not vulnerable will also
    support the IA32_TSX_CTRL MSR. ]

Add defines for the new IA32_TSX_CTRL MSR and its bits.

TSX has two sub-features:

1. Restricted Transactional Memory (RTM) is an explicitly-used feature
   where new instructions begin and end TSX transactions.
2. Hardware Lock Elision (HLE) is implicitly used when certain kinds of
   "old" style locks are used by software.

Bit 7 of the IA32_ARCH_CAPABILITIES indicates the presence of the
IA32_TSX_CTRL MSR.

There are two control bits in IA32_TSX_CTRL MSR:

  Bit 0: When set, it disables the Restricted Transactional Memory (RTM)
         sub-feature of TSX (will force all transactions to abort on the
 XBEGIN instruction).

  Bit 1: When set, it disables the enumeration of the RTM and HLE feature
         (i.e. it will make CPUID(EAX=7).EBX{bit4} and
  CPUID(EAX=7).EBX{bit11} read as 0).

The other TSX sub-feature, Hardware Lock Elision (HLE), is
unconditionally disabled by the new microcode but still enumerated
as present by CPUID(EAX=7).EBX{bit4}, unless disabled by
IA32_TSX_CTRL_MSR[1] - TSX_CTRL_CPUID_CLEAR.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Reviewed-by: Mark Gross <mgross@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agoKVM: x86: use Intel speculation bugs and features as derived in generic x86 code
Paolo Bonzini [Mon, 19 Aug 2019 15:24:07 +0000 (17:24 +0200)]
KVM: x86: use Intel speculation bugs and features as derived in generic x86 code

commit 0c54914d0c52a15db9954a76ce80fee32cf318f4 upstream.

Similar to AMD bits, set the Intel bits from the vendor-independent
feature and bug flags, because KVM_GET_SUPPORTED_CPUID does not care
about the vendor and they should be set on AMD processors as well.

Suggested-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915/cmdparser: Fix jump whitelist clearing
Ben Hutchings [Mon, 11 Nov 2019 16:13:24 +0000 (08:13 -0800)]
drm/i915/cmdparser: Fix jump whitelist clearing

commit ea0b163b13ffc52818c079adb00d55e227a6da6f upstream.

When a jump_whitelist bitmap is reused, it needs to be cleared.
Currently this is done with memset() and the size calculation assumes
bitmaps are made of 32-bit words, not longs.  So on 64-bit
architectures, only the first half of the bitmap is cleared.

If some whitelist bits are carried over between successive batches
submitted on the same context, this will presumably allow embedding
the rogue instructions that we're trying to reject.

Use bitmap_zero() instead, which gets the calculation right.

Fixes: f8c08d8faee5 ("drm/i915/cmdparser: Add support for backward jumps")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915/gen8+: Add RC6 CTX corruption WA
Imre Deak [Mon, 9 Jul 2018 15:24:27 +0000 (18:24 +0300)]
drm/i915/gen8+: Add RC6 CTX corruption WA

commit 7e34f4e4aad3fd34c02b294a3cf2321adf5b4438 upstream.

In some circumstances the RC6 context can get corrupted. We can detect
this and take the required action, that is disable RC6 and runtime PM.
The HW recovers from the corrupted state after a system suspend/resume
cycle, so detect the recovery and re-enable RC6 and runtime PM.

v2: rebase (Mika)
v3:
- Move intel_suspend_gt_powersave() to the end of the GEM suspend
  sequence.
- Add commit message.
v4:
- Rebased on intel_uncore_forcewake_put(i915->uncore, ...) API
  change.
v5:
- Rebased on latest upstream gt_pm refactoring.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915: Lower RM timeout to avoid DSI hard hangs
Uma Shankar [Tue, 7 Aug 2018 15:45:35 +0000 (21:15 +0530)]
drm/i915: Lower RM timeout to avoid DSI hard hangs

commit 1d85a299c4db57c55e0229615132c964d17aa765 upstream.

In BXT/APL, device 2 MMIO reads from MIPI controller requires its PLL
to be turned ON. When MIPI PLL is turned off (MIPI Display is not
active or connected), and someone (host or GT engine) tries to read
MIPI registers, it causes hard hang. This is a hardware restriction
or limitation.

Driver by itself doesn't read MIPI registers when MIPI display is off.
But any userspace application can submit unprivileged batch buffer for
execution. In that batch buffer there can be mmio reads. And these
reads are allowed even for unprivileged applications. If these
register reads are for MIPI DSI controller and MIPI display is not
active during that time, then the MMIO read operation causes system
hard hang and only way to recover is hard reboot. A genuine
process/application won't submit batch buffer like this and doesn't
cause any issue. But on a compromised system, a malign userspace
process/app can generate such batch buffer and can trigger system
hard hang (denial of service attack).

The fix is to lower the internal MMIO timeout value to an optimum
value of 950us as recommended by hardware team. If the timeout is
beyond 1ms (which will hit for any value we choose if MMIO READ on a
DSI specific register is performed without PLL ON), it causes the
system hang. But if the timeout value is lower than it will be below
the threshold (even if timeout happens) and system will not get into
a hung state. This will avoid a system hang without losing any
programming or GT interrupts, taking the worst case of lowest CDCLK
frequency and early DC5 abort into account.

Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915/cmdparser: Ignore Length operands during command matching
Jon Bloomfield [Thu, 20 Sep 2018 16:45:10 +0000 (09:45 -0700)]
drm/i915/cmdparser: Ignore Length operands during command matching

commit 926abff21a8f29ef159a3ac893b05c6e50e043c3 upstream.

Some of the gen instruction macros (e.g. MI_DISPLAY_FLIP) have the
length directly encoded in them. Since these are used directly in
the tables, the Length becomes part of the comparison used for
matching during parsing. Thus, if the cmd being parsed has a
different length to that in the table, it is not matched and the
cmd is accepted via the default variable length path.

Fix by masking out everything except the Opcode in the cmd tables

Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915/cmdparser: Add support for backward jumps
Jon Bloomfield [Thu, 20 Sep 2018 16:58:36 +0000 (09:58 -0700)]
drm/i915/cmdparser: Add support for backward jumps

commit f8c08d8faee5567803c8c533865296ca30286bbf upstream.

To keep things manageable, the pre-gen9 cmdparser does not
attempt to track any form of nested BB_START's. This did not
prevent usermode from using nested starts, or even chained
batches because the cmdparser is not strictly enforced pre gen9.

Instead, the existence of a nested BB_START would cause the batch
to be emitted in insecure mode, and any privileged capabilities
would not be available.

For Gen9, the cmdparser becomes mandatory (for BCS at least), and
so not providing any form of nested BB_START support becomes
overly restrictive. Any such batch will simply not run.

We make heavy use of backward jumps in igt, and it is much easier
to add support for this restricted subset of nested jumps, than to
rewrite the whole of our test suite to avoid them.

Add the required logic to support limited backward jumps, to
instructions that have already been validated by the parser.

Note that it's not sufficient to simply approve any BB_START
that jumps backwards in the buffer because this would allow an
attacker to embed a rogue instruction sequence within the
operand words of a harmless instruction (say LRI) and jump to
that.

We introduce a bit array to track every instr offset successfully
validated, and test the target of BB_START against this. If the
target offset hits, it is re-written to the same offset in the
shadow buffer and the BB_START cmd is allowed.

Note: This patch deliberately ignores checkpatch issues in the
cmdtables, in order to match the style of the surrounding code.
We'll correct the entire file in one go in a later patch.

v2: set dispatch secure late (Mika)
v3: rebase (Mika)
v4: Clear whitelist on each parse
Minor review updates (Chris)
v5: Correct backward jump batching
v6: fix compilation error due to struct eb shuffle (Mika)

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915/cmdparser: Use explicit goto for error paths
Jon Bloomfield [Thu, 27 Sep 2018 17:23:17 +0000 (10:23 -0700)]
drm/i915/cmdparser: Use explicit goto for error paths

commit 0546a29cd884fb8184731c79ab008927ca8859d0 upstream.

In the next patch we will be adding a second valid
termination condition which will require a small
amount of refactoring to share logic with the BB_END
case.

Refactor all error conditions to jump to a dedicated
exit path, with 'break' reserved only for a successful
parse.

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915: Add gen9 BCS cmdparsing
Jon Bloomfield [Mon, 23 Apr 2018 18:12:15 +0000 (11:12 -0700)]
drm/i915: Add gen9 BCS cmdparsing

commit 0f2f39758341df70202ae1c42d5a1e4ee392b6d3 upstream.

For gen9 we enable cmdparsing on the BCS ring, specifically
to catch inadvertent accesses to sensitive registers

Unlike gen7/hsw, we use the parser only to block certain
registers. We can rely on h/w to block restricted commands,
so the command tables only provide enough info to allow the
parser to delineate each command, and identify commands that
access registers.

Note: This patch deliberately ignores checkpatch issues in
favour of matching the style of the surrounding code. We'll
correct the entire file in one go in a later patch.

v3: rebase (Mika)
v4: Add RING_TIMESTAMP registers to whitelist (Jon)

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915: Allow parsing of unsized batches
Jon Bloomfield [Wed, 1 Aug 2018 16:45:50 +0000 (09:45 -0700)]
drm/i915: Allow parsing of unsized batches

commit 435e8fc059dbe0eec823a75c22da2972390ba9e0 upstream.

In "drm/i915: Add support for mandatory cmdparsing" we introduced the
concept of mandatory parsing. This allows the cmdparser to be invoked
even when user passes batch_len=0 to the execbuf ioctl's.

However, the cmdparser needs to know the extents of the buffer being
scanned. Refactor the code to ensure the cmdparser uses the actual
object size, instead of the incoming length, if user passes 0.

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915: Support ro ppgtt mapped cmdparser shadow buffers
Jon Bloomfield [Tue, 22 May 2018 20:59:06 +0000 (13:59 -0700)]
drm/i915: Support ro ppgtt mapped cmdparser shadow buffers

commit 4f7af1948abcb18b4772fe1bcd84d7d27d96258c upstream.

For Gen7, the original cmdparser motive was to permit limited
use of register read/write instructions in unprivileged BB's.
This worked by copying the user supplied bb to a kmd owned
bb, and running it in secure mode, from the ggtt, only if
the scanner finds no unsafe commands or registers.

For Gen8+ we can't use this same technique because running bb's
from the ggtt also disables access to ppgtt space. But we also
do not actually require 'secure' execution since we are only
trying to reduce the available command/register set. Instead we
will copy the user buffer to a kmd owned read-only bb in ppgtt,
and run in the usual non-secure mode.

Note that ro pages are only supported by ppgtt (not ggtt), but
luckily that's exactly what we need.

Add the required paths to map the shadow buffer to ppgtt ro for Gen8+

v2: IS_GEN7/IS_GEN (Mika)
v3: rebase
v4: rebase
v5: rebase

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915: Add support for mandatory cmdparsing
Jon Bloomfield [Wed, 1 Aug 2018 16:33:59 +0000 (09:33 -0700)]
drm/i915: Add support for mandatory cmdparsing

commit 311a50e76a33d1e029563c24b2ff6db0c02b5afe upstream.

The existing cmdparser for gen7 can be bypassed by specifying
batch_len=0 in the execbuf call. This is safe because bypassing
simply reduces the cmd-set available.

In a later patch we will introduce cmdparsing for gen9, as a
security measure, which must be strictly enforced since without
it we are vulnerable to DoS attacks.

Introduce the concept of 'required' cmd parsing that cannot be
bypassed by submitting zero-length bb's.

v2: rebase (Mika)
v2: rebase (Mika)
v3: fix conflict on engine flags (Mika)

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915: Remove Master tables from cmdparser
Jon Bloomfield [Fri, 8 Jun 2018 17:05:26 +0000 (10:05 -0700)]
drm/i915: Remove Master tables from cmdparser

commit 66d8aba1cd6db34af10de465c0d52af679288cb6 upstream.

The previous patch has killed support for secure batches
on gen6+, and hence the cmdparsers master tables are
now dead code. Remove them.

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915: Disable Secure Batches for gen6+
Jon Bloomfield [Fri, 8 Jun 2018 15:53:46 +0000 (08:53 -0700)]
drm/i915: Disable Secure Batches for gen6+

commit 44157641d448cbc0c4b73c5231d2b911f0cb0427 upstream.

Retroactively stop reporting support for secure batches
through the api for gen6+ so that older binaries trigger
the fallback path instead.

Older binaries use secure batches pre gen6 to access resources
that are not available to normal usermode processes. However,
all known userspace explicitly checks for HAS_SECURE_BATCHES
before relying on the secure batch feature.

Since there are no known binaries relying on this for newer gens
we can kill secure batches from gen6, via I915_PARAM_HAS_SECURE_BATCHES.

v2: rebase (Mika)
v3: rebase (Mika)

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915: Rename gen7 cmdparser tables
Jon Bloomfield [Fri, 20 Apr 2018 21:26:01 +0000 (14:26 -0700)]
drm/i915: Rename gen7 cmdparser tables

commit 0a2f661b6c21815a7fa60e30babe975fee8e73c6 upstream.

We're about to introduce some new tables for later gens, and the
current naming for the gen7 tables will no longer make sense.

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915: Move engine->needs_cmd_parser to engine->flags
Tvrtko Ursulin [Wed, 29 Nov 2017 08:24:09 +0000 (08:24 +0000)]
drm/i915: Move engine->needs_cmd_parser to engine->flags

commit 439e2ee4ca520e72870e4fa44aa0076060ad6857 upstream.

Will be adding a new per-engine flags shortly so it makes sense
to consolidate.

v2: Keep the original code flow in intel_engine_cleanup_cmd_parser.
    (Joonas Lahtinen)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171129082409.18189-1-tvrtko.ursulin@linux.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915: Don't use GPU relocations prior to cmdparser stalls
Chris Wilson [Sat, 26 Aug 2017 13:56:20 +0000 (14:56 +0100)]
drm/i915: Don't use GPU relocations prior to cmdparser stalls

commit 3dbf26ed7b9b40a8cb008ab9ad25703363af815d upstream.

If we are using the cmdparser, we will have to copy the batch and so
stall for the relocations. Rather than prolong that stall by adding more
relocation requests, just use CPU relocations and do the stall upfront.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170826135620.25949-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915: Silence smatch for cmdparser
Chris Wilson [Tue, 7 Nov 2017 15:40:55 +0000 (15:40 +0000)]
drm/i915: Silence smatch for cmdparser

commit 0ffba1fc98e8ec35caae8d50b657296ebb9a9a51 upstream.

drivers/gpu/drm/i915/i915_cmd_parser.c:808:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:811:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:814:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:808:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:811:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:814:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:808:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:811:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:814:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:808:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:811:23: error: not an lvalue
drivers/gpu/drm/i915/i915_cmd_parser.c:814:23: error: not an lvalue

If we move the shift into each case not only do we kill the warning from
smatch, but we shrink the code slightly:

   text    data     bss     dec     hex filename
1267906   20587    3168 1291661  13b58d before
1267890   20587    3168 1291645  13b57d after

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171107154055.19460-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915/cmdparser: Do not check past the cmd length.
Michal Srb [Mon, 5 Feb 2018 16:04:38 +0000 (16:04 +0000)]
drm/i915/cmdparser: Do not check past the cmd length.

commit b3ad99ed45917f42884fee731fa3cf9b8229a26c upstream.

The command MEDIA_VFE_STATE checks bits at offset +2 dwords. However, it is
possible to have MEDIA_VFE_STATE command with length = 0 + LENGTH_BIAS = 2.
In that case check_cmd will read bits from the following command, or even past
the end of the buffer.

If the offset ends up outside of the command length, reject the command.

Fixes: 351e3db2b363 ("drm/i915: Implement command buffer parsing logic")
Signed-off-by: Michal Srb <msrb@suse.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205151745.29292-1-msrb@suse.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205160438.3267-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915/cmdparser: Check reg_table_count before derefencing.
Michal Srb [Mon, 5 Feb 2018 16:04:37 +0000 (16:04 +0000)]
drm/i915/cmdparser: Check reg_table_count before derefencing.

commit b18224e95cb13ef7517aa26e6b47c85117327f11 upstream.

The find_reg function was assuming that there is always at least one table in
reg_tables. It is not always true.

In case of VCS or VECS, the reg_tables is NULL and reg_table_count is 0,
implying that no register-accessing commands are allowed. However, the command
tables include commands such as MI_STORE_REGISTER_MEM. When trying to check
such command, the find_reg would dereference NULL pointer.

Now it will just return NULL meaning that the register was not found and the
command will be rejected.

Fixes: 76ff480ec963 ("drm/i915/cmdparser: Use binary search for faster register lookup")
Signed-off-by: Michal Srb <msrb@suse.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205142916.27092-2-msrb@suse.com
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205160438.3267-1-chris@chris-wilson.co.uk
register lookup")
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915: Prevent writing into a read-only object via a GGTT mmap
Chris Wilson [Thu, 12 Jul 2018 18:53:13 +0000 (19:53 +0100)]
drm/i915: Prevent writing into a read-only object via a GGTT mmap

commit 3e977ac6179b39faa3c0eda5fce4f00663ae298d upstream.

If the user has created a read-only object, they should not be allowed
to circumvent the write protection by using a GGTT mmapping. Deny it.

Also most machines do not support read-only GGTT PTEs, so again we have
to reject attempted writes. Fortunately, this is known a priori, so we
can at least reject in the call to create the mmap (with a sanity check
in the fault handler).

v2: Check the vma->vm_flags during mmap() to allow readonly access.
v3: Remove VM_MAYWRITE to curtail mprotect()

Testcase: igt/gem_userptr_blits/readonly_mmap*
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> #v1
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712185315.3288-4-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915/gtt: Disable read-only support under GVT
Chris Wilson [Thu, 12 Jul 2018 18:53:12 +0000 (19:53 +0100)]
drm/i915/gtt: Disable read-only support under GVT

commit c9e666880de5a1fed04dc412b046916d542b72dd upstream.

GVT is not propagating the PTE bits, and is always setting the
read-write bit, thus breaking read-only support.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712185315.3288-3-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915/gtt: Read-only pages for insert_entries on bdw+
Vivi, Rodrigo [Mon, 6 Aug 2018 21:10:48 +0000 (14:10 -0700)]
drm/i915/gtt: Read-only pages for insert_entries on bdw+

commit 250f8c8140ac0a5e5acb91891d6813f12778b224 upstream.

Hook up the flags to allow read-only ppGTT mappings for gen8+

v2: Include a selftest to check that writes to a readonly PTE are
dropped
v3: Don't duplicate cpu_check() as we can just reuse it, and even worse
don't wholesale copy the theory-of-operation comment from igt_ctx_exec
without changing it to explain the intention behind the new test!
v4: Joonas really likes magic mystery values

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712185315.3288-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agodrm/i915/gtt: Add read only pages to gen8_pte_encode
Jon Bloomfield [Thu, 12 Jul 2018 18:53:10 +0000 (19:53 +0100)]
drm/i915/gtt: Add read only pages to gen8_pte_encode

commit 25dda4dabeeb12af5209b0183c788ef2a88dabbe upstream.

We can set a bit inside the ppGTT PTE to indicate a page is read-only;
writes from the GPU will be discarded. We can use this to protect pages
and in particular support read-only userptr mappings (necessary for
importing PROT_READ vma).

Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180712185315.3288-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agonet: prevent load/store tearing on sk->sk_stamp
Eric Dumazet [Tue, 5 Nov 2019 05:38:43 +0000 (21:38 -0800)]
net: prevent load/store tearing on sk->sk_stamp

[ Upstream commit f75359f3ac855940c5718af10ba089b8977bf339 ]

Add a couple of READ_ONCE() and WRITE_ONCE() to prevent
load-tearing and store-tearing in sock_read_timestamp()
and sock_write_timestamp()

This might prevent another KCSAN report.

Fixes: 3a0ed3e96197 ("sock: Make sock->sk_stamp thread-safe")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agousbip: Fix free of unallocated memory in vhci tx
Suwan Kim [Tue, 22 Oct 2019 09:30:17 +0000 (18:30 +0900)]
usbip: Fix free of unallocated memory in vhci tx

[ Upstream commit d4d8257754c3300ea2a465dadf8d2b02c713c920 ]

iso_buffer should be set to NULL after use and free in the while loop.
In the case of isochronous URB in the while loop, iso_buffer is
allocated and after sending it to server, buffer is deallocated. And
then, if the next URB in the while loop is not a isochronous pipe,
iso_buffer still holds the previously deallocated buffer address and
kfree tries to free wrong buffer address.

Fixes: ea44d190764b ("usbip: Implement SG support to vhci-hcd and stub driver")
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Suwan Kim <suwan.kim027@gmail.com>
Reviewed-by: Julia Lawall <julia.lawall@lip6.fr>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20191022093017.8027-1-suwan.kim027@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agocgroup,writeback: don't switch wbs immediately on dead wbs if the memcg is dead
Tejun Heo [Fri, 8 Nov 2019 20:18:29 +0000 (12:18 -0800)]
cgroup,writeback: don't switch wbs immediately on dead wbs if the memcg is dead

commit 65de03e251382306a4575b1779c57c87889eee49 upstream.

cgroup writeback tries to refresh the associated wb immediately if the
current wb is dead.  This is to avoid keeping issuing IOs on the stale
wb after memcg - blkcg association has changed (ie. when blkcg got
disabled / enabled higher up in the hierarchy).

Unfortunately, the logic gets triggered spuriously on inodes which are
associated with dead cgroups.  When the logic is triggered on dead
cgroups, the attempt fails only after doing quite a bit of work
allocating and initializing a new wb.

While c3aab9a0bd91 ("mm/filemap.c: don't initiate writeback if mapping
has no dirty pages") alleviated the issue significantly as it now only
triggers when the inode has dirty pages.  However, the condition can
still be triggered before the inode is switched to a different cgroup
and the logic simply doesn't make sense.

Skip the immediate switching if the associated memcg is dying.

This is a simplified version of the following two patches:

 * https://lore.kernel.org/linux-mm/20190513183053.GA73423@dennisz-mbp/
 * http://lkml.kernel.org/r/156355839560.2063.5265687291430814589.stgit@buzz

Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Fixes: e8a7abf5a5bd ("writeback: disassociate inodes from dying bdi_writebacks")
Acked-by: Dennis Zhou <dennis@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agomm/filemap.c: don't initiate writeback if mapping has no dirty pages
Konstantin Khlebnikov [Mon, 23 Sep 2019 22:34:45 +0000 (15:34 -0700)]
mm/filemap.c: don't initiate writeback if mapping has no dirty pages

commit c3aab9a0bd91b696a852169479b7db1ece6cbf8c upstream.

Functions like filemap_write_and_wait_range() should do nothing if inode
has no dirty pages or pages currently under writeback.  But they anyway
construct struct writeback_control and this does some atomic operations if
CONFIG_CGROUP_WRITEBACK=y - on fast path it locks inode->i_lock and
updates state of writeback ownership, on slow path might be more work.
Current this path is safely avoided only when inode mapping has no pages.

For example generic_file_read_iter() calls filemap_write_and_wait_range()
at each O_DIRECT read - pretty hot path.

This patch skips starting new writeback if mapping has no dirty tags set.
If writeback is already in progress filemap_write_and_wait_range() will
wait for it.

Link: http://lkml.kernel.org/r/156378816804.1087.8607636317907921438.stgit@buzz
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agocan: flexcan: disable completely the ECC mechanism
Joakim Zhang [Thu, 15 Aug 2019 08:00:26 +0000 (08:00 +0000)]
can: flexcan: disable completely the ECC mechanism

[ Upstream commit 5e269324db5adb2f5f6ec9a93a9c7b0672932b47 ]

The ECC (memory error detection and correction) mechanism can be
activated or not, controlled by the ECCDIS bit in CAN_MECR. When
disabled, updates on indications and reporting registers are stopped.
So if want to disable ECC completely, had better assert ECCDIS bit, not
just mask the related interrupts.

Fixes: cdce844865be ("can: flexcan: add vf610 support for FlexCAN")
Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agox86/apic/32: Avoid bogus LDR warnings
Jan Beulich [Tue, 29 Oct 2019 09:34:19 +0000 (10:34 +0100)]
x86/apic/32: Avoid bogus LDR warnings

[ Upstream commit fe6f85ca121e9c74e7490fe66b0c5aae38e332c3 ]

The removal of the LDR initialization in the bigsmp_32 APIC code unearthed
a problem in setup_local_APIC().

The code checks unconditionally for a mismatch of the logical APIC id by
comparing the early APIC id which was initialized in get_smp_config() with
the actual LDR value in the APIC.

Due to the removal of the bogus LDR initialization the check now can
trigger on bigsmp_32 APIC systems emitting a warning for every booting
CPU. This is of course a false positive because the APIC is not using
logical destination mode.

Restrict the check and the possibly resulting fixup to systems which are
actually using the APIC in logical destination mode.

[ tglx: Massaged changelog and added Cc stable ]

Fixes: bae3a8d3308 ("x86/apic: Do not initialize LDR and DFR for bigsmp")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/666d8f91-b5a8-1afd-7add-821e72a35f03@suse.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agox86/apic: Drop logical_smp_processor_id() inline
Dou Liyang [Thu, 1 Mar 2018 05:59:30 +0000 (13:59 +0800)]
x86/apic: Drop logical_smp_processor_id() inline

[ Upstream commit 8f1561680f42a5491b371b513f1ab8197f31fd62 ]

The logical_smp_processor_id() inline which is only called in
setup_local_APIC() on x86_32 systems has no real value.

Drop it and directly use GET_APIC_LOGICAL_ID() at the call site and use a
more suitable variable name for readability

Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: andy.shevchenko@gmail.com
Cc: bhe@redhat.com
Cc: ebiederm@xmission.com
Link: https://lkml.kernel.org/r/20180301055930.2396-4-douly.fnst@cn.fujitsu.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agox86/apic: Move pending interrupt check code into it's own function
Dou Liyang [Thu, 1 Mar 2018 05:59:28 +0000 (13:59 +0800)]
x86/apic: Move pending interrupt check code into it's own function

[ Upstream commit 9b217f33017715903d0956dfc58f82d2a2d00e63 ]

The pending interrupt check code is mixed with the local APIC setup code,
that looks messy.

Extract the related code, move it into a new function named
apic_pending_intr_clear().

Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: bhe@redhat.com
Cc: ebiederm@xmission.com
Link: https://lkml.kernel.org/r/20180301055930.2396-2-douly.fnst@cn.fujitsu.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoe1000: fix memory leaks
Wenwen Wang [Mon, 12 Aug 2019 05:59:21 +0000 (00:59 -0500)]
e1000: fix memory leaks

[ Upstream commit 8472ba62154058b64ebb83d5f57259a352d28697 ]

In e1000_set_ringparam(), 'tx_old' and 'rx_old' are not deallocated if
e1000_up() fails, leading to memory leaks. Refactor the code to fix this
issue.

Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoigb: Fix constant media auto sense switching when no cable is connected
Manfred Rudigier [Thu, 15 Aug 2019 20:55:20 +0000 (13:55 -0700)]
igb: Fix constant media auto sense switching when no cable is connected

[ Upstream commit 8d5cfd7f76a2414e23c74bb8858af7540365d985 ]

At least on the i350 there is an annoying behavior that is maybe also
present on 82580 devices, but was probably not noticed yet as MAS is not
widely used.

If no cable is connected on both fiber/copper ports the media auto sense
code will constantly swap between them as part of the watchdog task and
produce many unnecessary kernel log messages.

The swap code responsible for this behavior (switching to fiber) should
not be executed if the current media type is copper and there is no signal
detected on the fiber port. In this case we can safely wait until the
AUTOSENSE_EN bit is cleared.

Signed-off-by: Manfred Rudigier <manfred.rudigier@omicronenergy.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agonet: ethernet: arc: add the missed clk_disable_unprepare
Chuhong Yuan [Fri, 1 Nov 2019 12:17:25 +0000 (20:17 +0800)]
net: ethernet: arc: add the missed clk_disable_unprepare

[ Upstream commit 4202e219edd6cc164c042e16fa327525410705ae ]

The remove misses to disable and unprepare priv->macclk like what is done
when probe fails.
Add the missed call in remove.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoNFSv4: Don't allow a cached open with a revoked delegation
Trond Myklebust [Thu, 31 Oct 2019 22:40:32 +0000 (18:40 -0400)]
NFSv4: Don't allow a cached open with a revoked delegation

[ Upstream commit be3df3dd4c70ee020587a943a31b98a0fb4b6424 ]

If the delegation is marked as being revoked, we must not use it
for cached opens.

Fixes: 869f9dfa4d6d ("NFSv4: Fix races between nfs_remove_bad_delegation() and delegation return")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agohv_netvsc: Fix error handling in netvsc_attach()
Haiyang Zhang [Wed, 30 Oct 2019 15:32:13 +0000 (15:32 +0000)]
hv_netvsc: Fix error handling in netvsc_attach()

[ Upstream commit 719b85c336ed35565d0f3982269d6f684087bb00 ]

If rndis_filter_open() fails, we need to remove the rndis device created
in earlier steps, before returning an error code. Otherwise, the retry of
netvsc_attach() from its callers will fail and hang.

Fixes: 7b2ee50c0cd5 ("hv_netvsc: common detach logic")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agonet: hisilicon: Fix "Trying to free already-free IRQ"
Jiangfeng Xiao [Fri, 25 Oct 2019 13:48:22 +0000 (21:48 +0800)]
net: hisilicon: Fix "Trying to free already-free IRQ"

[ Upstream commit 63a41746827cb16dc6ad0d4d761ab4e7dda7a0c3 ]

When rmmod hip04_eth.ko, we can get the following warning:

Task track: rmmod(1623)>bash(1591)>login(1581)>init(1)
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1623 at kernel/irq/manage.c:1557 __free_irq+0xa4/0x2ac()
Trying to free already-free IRQ 200
Modules linked in: ping(O) pramdisk(O) cpuinfo(O) rtos_snapshot(O) interrupt_ctrl(O) mtdblock mtd_blkdevrtfs nfs_acl nfs lockd grace sunrpc xt_tcpudp ipt_REJECT iptable_filter ip_tables x_tables nf_reject_ipv
CPU: 0 PID: 1623 Comm: rmmod Tainted: G           O    4.4.193 #1
Hardware name: Hisilicon A15
[<c020b408>] (rtos_unwind_backtrace) from [<c0206624>] (show_stack+0x10/0x14)
[<c0206624>] (show_stack) from [<c03f2be4>] (dump_stack+0xa0/0xd8)
[<c03f2be4>] (dump_stack) from [<c021a780>] (warn_slowpath_common+0x84/0xb0)
[<c021a780>] (warn_slowpath_common) from [<c021a7e8>] (warn_slowpath_fmt+0x3c/0x68)
[<c021a7e8>] (warn_slowpath_fmt) from [<c026876c>] (__free_irq+0xa4/0x2ac)
[<c026876c>] (__free_irq) from [<c0268a14>] (free_irq+0x60/0x7c)
[<c0268a14>] (free_irq) from [<c0469e80>] (release_nodes+0x1c4/0x1ec)
[<c0469e80>] (release_nodes) from [<c0466924>] (__device_release_driver+0xa8/0x104)
[<c0466924>] (__device_release_driver) from [<c0466a80>] (driver_detach+0xd0/0xf8)
[<c0466a80>] (driver_detach) from [<c0465e18>] (bus_remove_driver+0x64/0x8c)
[<c0465e18>] (bus_remove_driver) from [<c02935b0>] (SyS_delete_module+0x198/0x1e0)
[<c02935b0>] (SyS_delete_module) from [<c0202ed0>] (__sys_trace_return+0x0/0x10)
---[ end trace bb25d6123d849b44 ]---

Currently "rmmod hip04_eth.ko" call free_irq more than once
as devres_release_all and hip04_remove both call free_irq.
This results in a 'Trying to free already-free IRQ' warning.
To solve the problem free_irq has been moved out of hip04_remove.

Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agofjes: Handle workqueue allocation failure
Will Deacon [Fri, 25 Oct 2019 11:06:02 +0000 (12:06 +0100)]
fjes: Handle workqueue allocation failure

[ Upstream commit 85ac30fa2e24f628e9f4f9344460f4015d33fd7d ]

In the highly unlikely event that we fail to allocate either of the
"/txrx" or "/control" workqueues, we should bail cleanly rather than
blindly march on with NULL queue pointer(s) installed in the
'fjes_adapter' instance.

Cc: "David S. Miller" <davem@davemloft.net>
Reported-by: Nicolas Waisman <nico@semmle.com>
Link: https://lore.kernel.org/lkml/CADJ_3a8WFrs5NouXNqS5WYe7rebFP+_A5CheeqAyD_p7DFJJcg@mail.gmail.com/
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoscsi: qla2xxx: stop timer in shutdown path
Nicholas Piggin [Thu, 24 Oct 2019 06:38:04 +0000 (16:38 +1000)]
scsi: qla2xxx: stop timer in shutdown path

[ Upstream commit d3566abb1a1e7772116e4d50fb6a58d19c9802e5 ]

In shutdown/reboot paths, the timer is not stopped:

  qla2x00_shutdown
  pci_device_shutdown
  device_shutdown
  kernel_restart_prepare
  kernel_restart
  sys_reboot

This causes lockups (on powerpc) when firmware config space access calls
are interrupted by smp_send_stop later in reboot.

Fixes: e30d1756480dc ("[SCSI] qla2xxx: Addition of shutdown callback handler.")
Link: https://lore.kernel.org/r/20191024063804.14538-1-npiggin@gmail.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoRDMA/iw_cxgb4: Avoid freeing skb twice in arp failure case
Potnuri Bharat Teja [Fri, 25 Oct 2019 12:34:40 +0000 (18:04 +0530)]
RDMA/iw_cxgb4: Avoid freeing skb twice in arp failure case

[ Upstream commit d4934f45693651ea15357dd6c7c36be28b6da884 ]

_put_ep_safe() and _put_pass_ep_safe() free the skb before it is freed by
process_work(). fix double free by freeing the skb only in process_work().

Fixes: 1dad0ebeea1c ("iw_cxgb4: Avoid touch after free error in ARP failure handlers")
Link: https://lore.kernel.org/r/1572006880-5800-1-git-send-email-bharat@chelsio.com
Signed-off-by: Dakshaja Uppalapati <dakshaja@chelsio.com>
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoUSB: ldusb: use unsigned size format specifiers
Johan Hovold [Tue, 22 Oct 2019 14:32:03 +0000 (16:32 +0200)]
USB: ldusb: use unsigned size format specifiers

[ Upstream commit 88f6bf3846ee90bf33aa1ce848cd3bfb3229f4a4 ]

A recent info-leak bug manifested itself along with warning about a
negative buffer overflow:

ldusb 1-1:0.28: Read buffer overflow, -131383859965943 bytes dropped

when it was really a rather large positive one.

A sanity check that prevents this has now been put in place, but let's
fix up the size format specifiers, which should all be unsigned.

Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20191022143203.5260-3-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoUSB: Skip endpoints with 0 maxpacket length
Alan Stern [Mon, 28 Oct 2019 14:52:35 +0000 (10:52 -0400)]
USB: Skip endpoints with 0 maxpacket length

[ Upstream commit d482c7bb0541d19dea8bff437a9f3c5563b5b2d2 ]

Endpoints with a maxpacket length of 0 are probably useless.  They
can't transfer any data, and it's not at all unlikely that an HCD will
crash or hang when trying to handle an URB for such an endpoint.

Currently the USB core does not check for endpoints having a maxpacket
value of 0.  This patch adds a check, printing a warning and skipping
over any endpoints it catches.

Now, the USB spec does not rule out endpoints having maxpacket = 0.
But since they wouldn't have any practical use, there doesn't seem to
be any good reason for us to accept them.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1910281050420.1485-100000@iolanthe.rowland.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoperf/x86/amd/ibs: Handle erratum #420 only on the affected CPU family (10h)
Kim Phillips [Wed, 23 Oct 2019 15:09:55 +0000 (10:09 -0500)]
perf/x86/amd/ibs: Handle erratum #420 only on the affected CPU family (10h)

[ Upstream commit e431e79b60603079d269e0c2a5177943b95fa4b6 ]

This saves us writing the IBS control MSR twice when disabling the
event.

I searched revision guides for all families since 10h, and did not
find occurrence of erratum #420, nor anything remotely similar:
so we isolate the secondary MSR write to family 10h only.

Also unconditionally update the count mask for IBS Op implementations
that have read & writeable current count (CurCnt) fields in addition
to the MaxCnt field.  These bits were reserved on prior
implementations, and therefore shouldn't have negative impact.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: c9574fe0bdb9 ("perf/x86-ibs: Implement workaround for IBS erratum #420")
Link: https://lkml.kernel.org/r/20191023150955.30292-2-kim.phillips@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoperf/x86/amd/ibs: Fix reading of the IBS OpData register and thus precise RIP validity
Kim Phillips [Wed, 23 Oct 2019 15:09:54 +0000 (10:09 -0500)]
perf/x86/amd/ibs: Fix reading of the IBS OpData register and thus precise RIP validity

[ Upstream commit 317b96bb14303c7998dbcd5bc606bd8038fdd4b4 ]

The loop that reads all the IBS MSRs into *buf stopped one MSR short of
reading the IbsOpData register, which contains the RipInvalid status bit.

Fix the offset_max assignment so the MSR gets read, so the RIP invalid
evaluation is based on what the IBS h/w output, instead of what was
left in memory.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: d47e8238cd76 ("perf/x86-ibs: Take instruction pointer from ibs sample")
Link: https://lkml.kernel.org/r/20191023150955.30292-1-kim.phillips@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agousb: dwc3: remove the call trace of USBx_GFLADJ
Yinbo Zhu [Mon, 29 Jul 2019 06:46:07 +0000 (14:46 +0800)]
usb: dwc3: remove the call trace of USBx_GFLADJ

[ Upstream commit a7d9874c6f3fbc8d25cd9ceba35b6822612c4ebf ]

layerscape board sometimes reported some usb call trace, that is due to
kernel sent LPM tokerns automatically when it has no pending transfers
and think that the link is idle enough to enter L1, which procedure will
ask usb register has a recovery,then kernel will compare USBx_GFLADJ and
set GFLADJ_30MHZ, GFLADJ_30MHZ_REG until GFLADJ_30MHZ is equal 0x20, if
the conditions were met then issue occur, but whatever the conditions
whether were met that usb is all need keep GFLADJ_30MHZ of value is 0x20
(xhci spec ask use GFLADJ_30MHZ to adjust any offset from clock source
that generates the clock that drives the SOF counter, 0x20 is default
value of it)That is normal logic, so need remove the call trace.

Signed-off-by: Yinbo Zhu <yinbo.zhu@nxp.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agousb: gadget: configfs: fix concurrent issue between composite APIs
Peter Chen [Mon, 26 Aug 2019 19:10:55 +0000 (15:10 -0400)]
usb: gadget: configfs: fix concurrent issue between composite APIs

[ Upstream commit 1a1c851bbd706ea9f3a9756c2d3db28523506d3b ]

We meet several NULL pointer issues if configfs_composite_unbind
and composite_setup (or composite_disconnect) are running together.
These issues occur when do the function switch stress test, the
configfs_compsoite_unbind is called from user mode by
echo "" to /sys/../UDC entry, and meanwhile, the setup interrupt
or disconnect interrupt occurs by hardware. The composite_setup
will get the cdev from get_gadget_data, but configfs_composite_unbind
will set gadget data as NULL, so the NULL pointer issue occurs.
This concurrent is hard to reproduce by native kernel, but can be
reproduced by android kernel.

In this commit, we introduce one spinlock belongs to structure
gadget_info since we can't use the same spinlock in usb_composite_dev
due to exclusive running together between composite_setup and
configfs_composite_unbind. And one bit flag 'unbind' to indicate the
code is at unbind routine, this bit is needed due to we release the
lock at during configfs_composite_unbind sometimes, and composite_setup
may be run at that time.

Several oops:

oops 1:
android_work: sent uevent USB_STATE=CONNECTED
configfs-gadget gadget: super-speed config #1: b
android_work: sent uevent USB_STATE=CONFIGURED
init: Received control message 'start' for 'adbd' from pid: 3515 (system_server)
Unable to handle kernel NULL pointer dereference at virtual address 0000002a
init: Received control message 'stop' for 'adbd' from pid: 3375 (/vendor/bin/hw/android.hardware.usb@1.1-servic)
Mem abort info:
  Exception class = DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x00000004
  CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgd = ffff8008f1b7f000
[000000000000002a] *pgd=0000000000000000
Internal error: Oops: 96000004 [#1] PREEMPT SMP
Modules linked in:
CPU: 4 PID: 2457 Comm: irq/125-5b11000 Not tainted 4.14.98-07846-g0b40a9b-dirty #16
Hardware name: Freescale i.MX8QM MEK (DT)
task: ffff8008f2a98000 task.stack: ffff00000b7b8000
PC is at composite_setup+0x44/0x1508
LR is at android_setup+0xb8/0x13c
pc : [<ffff0000089ffb3c>] lr : [<ffff000008a032fc>] pstate: 800001c5
sp : ffff00000b7bbb80
x29: ffff00000b7bbb80 x28: ffff8008f2a3c010
x27: 0000000000000001 x26: 0000000000000000                                                          [1232/1897]
audit: audit_lost=25791 audit_rate_limit=5 audit_backlog_limit=64
x25: 00000000ffffffa1 x24: ffff8008f2a3c010
audit: rate limit exceeded
x23: 0000000000000409 x22: ffff000009c8e000
x21: ffff8008f7a8b428 x20: ffff00000afae000
x19: ffff0000089ff000 x18: 0000000000000000
x17: 0000000000000000 x16: ffff0000082b7c9c
x15: 0000000000000000 x14: f1866f5b952aca46
x13: e35502e30d44349c x12: 0000000000000008
x11: 0000000000000008 x10: 0000000000000a30
x9 : ffff00000b7bbd00 x8 : ffff8008f2a98a90
x7 : ffff8008f27a9c90 x6 : 0000000000000001
x5 : 0000000000000000 x4 : 0000000000000001
x3 : 0000000000000000 x2 : 0000000000000006
x1 : ffff0000089ff8d0 x0 : 732a010310b9ed00

X7: 0xffff8008f27a9c10:
9c10  00000002 00000000 00000001 00000000 13110000 ffff0000 00000002 00208040
9c30  00000000 00000000 00000000 00000000 00000000 00000005 00000029 00000000
9c50  00051778 00000001 f27a8e00 ffff8008 00000005 00000000 00000078 00000078
9c70  00000078 00000000 09031d48 ffff0000 00100000 00000000 00400000 00000000
9c90  00000001 00000000 00000000 00000000 00000000 00000000 ffefb1a0 ffff8008
9cb0  f27a9ca8 ffff8008 00000000 00000000 b9d88037 00000173 1618a3eb 00000001
9cd0  870a792a 0000002e 16188fe6 00000001 0000242b 00000000 00000000 00000000
using random self ethernet address
9cf0  019a4646 00000000 000547f3 00000000 ecfd6c33 00000002 00000000
using random host ethernet address
 00000000

X8: 0xffff8008f2a98a10:
8a10  00000000 00000000 f7788d00 ffff8008 00000001 00000000 00000000 00000000
8a30  eb218000 ffff8008 f2a98000 ffff8008 f2a98000 ffff8008 09885000 ffff0000
8a50  f34df480 ffff8008 00000000 00000000 f2a98648 ffff8008 09c8e000 ffff0000
8a70  fff2c800 ffff8008 09031d48 ffff0000 0b7bbd00 ffff0000 0b7bbd00 ffff0000
8a90  080861bc ffff0000 00000000 00000000 00000000 00000000 00000000 00000000
8ab0  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
8ad0  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
8af0  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

X21: 0xffff8008f7a8b3a8:
b3a8  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
b3c8  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
b3e8  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
b408  00000000 00000000 00000000 00000000 00000000 00000000 00000001 00000000
b428  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
b448  0053004d 00540046 00300031 00010030 eb07b520 ffff8008 20011201 00000003
b468  e418d109 0104404e 00010302 00000000 eb07b558 ffff8008 eb07b558 ffff8008
b488  f7a8b488 ffff8008 f7a8b488 ffff8008 f7a8b300 ffff8008 00000000 00000000

X24: 0xffff8008f2a3bf90:
bf90  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bfb0  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bfd0  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bff0  00000000 00000000 00000000 00000000 f76c8010 ffff8008 f76c8010 ffff8008
c010  00000000 00000000 f2a3c018 ffff8008 f2a3c018 ffff8008 08a067dc ffff0000
c030  f2a5a000 ffff8008 091c3650 ffff0000 f716fd18 ffff8008 f716fe30 ffff8008
c050  f2ce4a30 ffff8008 00000000 00000005 00000000 00000000 095d1568 ffff0000
c070  f76c8010 ffff8008 f2ce4b00 ffff8008 095cac68 ffff0000 f2a5a028 ffff8008

X28: 0xffff8008f2a3bf90:
bf90  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bfb0  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bfd0  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bff0  00000000 00000000 00000000 00000000 f76c8010 ffff8008 f76c8010 ffff8008
c010  00000000 00000000 f2a3c018 ffff8008 f2a3c018 ffff8008 08a067dc ffff0000
c030  f2a5a000 ffff8008 091c3650 ffff0000 f716fd18 ffff8008 f716fe30 ffff8008
c050  f2ce4a30 ffff8008 00000000 00000005 00000000 00000000 095d1568 ffff0000
c070  f76c8010 ffff8008 f2ce4b00 ffff8008 095cac68 ffff0000 f2a5a028 ffff8008

Process irq/125-5b11000 (pid: 2457, stack limit = 0xffff00000b7b8000)
Call trace:
Exception stack(0xffff00000b7bba40 to 0xffff00000b7bbb80)
ba40: 732a010310b9ed00 ffff0000089ff8d0 0000000000000006 0000000000000000
ba60: 0000000000000001 0000000000000000 0000000000000001 ffff8008f27a9c90
ba80: ffff8008f2a98a90 ffff00000b7bbd00 0000000000000a30 0000000000000008
baa0: 0000000000000008 e35502e30d44349c f1866f5b952aca46 0000000000000000
bac0: ffff0000082b7c9c 0000000000000000 0000000000000000 ffff0000089ff000
bae0: ffff00000afae000 ffff8008f7a8b428 ffff000009c8e000 0000000000000409
bb00: ffff8008f2a3c010 00000000ffffffa1 0000000000000000 0000000000000001
bb20: ffff8008f2a3c010 ffff00000b7bbb80 ffff000008a032fc ffff00000b7bbb80
bb40: ffff0000089ffb3c 00000000800001c5 ffff00000b7bbb80 732a010310b9ed00
bb60: ffffffffffffffff ffff0000080f777c ffff00000b7bbb80 ffff0000089ffb3c
[<ffff0000089ffb3c>] composite_setup+0x44/0x1508
[<ffff000008a032fc>] android_setup+0xb8/0x13c
[<ffff0000089bd9a8>] cdns3_ep0_delegate_req+0x44/0x70
[<ffff0000089bdff4>] cdns3_check_ep0_interrupt_proceed+0x33c/0x654
[<ffff0000089bca44>] cdns3_device_thread_irq_handler+0x4b0/0x4bc
[<ffff0000089b77b4>] cdns3_thread_irq+0x48/0x68
[<ffff000008145bf0>] irq_thread_fn+0x28/0x88
[<ffff000008145e38>] irq_thread+0x13c/0x228
[<ffff0000080fed70>] kthread+0x104/0x130
[<ffff000008085064>] ret_from_fork+0x10/0x18

oops2:
composite_disconnect: Calling disconnect on a Gadget that is                      not connected
android_work: did not send uevent (0 0           (null))
init: Received control message 'stop' for 'adbd' from pid: 3359 (/vendor/bin/hw/android.hardware.usb@1.1-service.imx)
init: Sending signal 9 to service 'adbd' (pid 22343) process group...
------------[ cut here ]------------
audit: audit_lost=180038 audit_rate_limit=5 audit_backlog_limit=64
audit: rate limit exceeded
WARNING: CPU: 0 PID: 3468 at kernel_imx/drivers/usb/gadget/composite.c:2009 composite_disconnect+0x80/0x88
Modules linked in:
CPU: 0 PID: 3468 Comm: HWC-UEvent-Thre Not tainted 4.14.98-07846-g0b40a9b-dirty #16
Hardware name: Freescale i.MX8QM MEK (DT)
task: ffff8008f2349c00 task.stack: ffff00000b0a8000
PC is at composite_disconnect+0x80/0x88
LR is at composite_disconnect+0x80/0x88
pc : [<ffff0000089ff9b0>] lr : [<ffff0000089ff9b0>] pstate: 600001c5
sp : ffff000008003dd0
x29: ffff000008003dd0 x28: ffff8008f2349c00
x27: ffff000009885018 x26: ffff000008004000
Timeout for IPC response!
x25: ffff000009885018 x24: ffff000009c8e280
x23: ffff8008f2d98010 x22: 00000000000001c0
x21: ffff8008f2d98394 x20: ffff8008f2d98010
x19: 0000000000000000 x18: 0000e3956f4f075a
fxos8700 4-001e: i2c block read acc failed
x17: 0000e395735727e8 x16: ffff00000829f4d4
x15: ffffffffffffffff x14: 7463656e6e6f6320
x13: 746f6e2009090920 x12: 7369207461687420
x11: 7465676461472061 x10: 206e6f207463656e
x9 : 6e6f637369642067 x8 : ffff000009c8e280
x7 : ffff0000086ca6cc x6 : ffff000009f15e78
x5 : 0000000000000000 x4 : 0000000000000000
x3 : ffffffffffffffff x2 : c3f28b86000c3900
x1 : c3f28b86000c3900 x0 : 000000000000004e

X20: 0xffff8008f2d97f90:
7f90  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
7fb0  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
libprocessgroup: Failed to kill process cgroup uid 0 pid 22343 in 215ms, 1 processes remain
7fd0
Timeout for IPC response!
 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
using random self ethernet address
7ff0  00000000 00000000 00000000 00000000 f76c8010 ffff8008 f76c8010 ffff8008
8010  00000100 00000000 f2d98018 ffff8008 f2d98018 ffff8008 08a067dc
using random host ethernet address
 ffff0000
8030  f206d800 ffff8008 091c3650 ffff0000 f7957b18 ffff8008 f7957730 ffff8008
8050  f716a630 ffff8008 00000000 00000005 00000000 00000000 095d1568 ffff0000
8070  f76c8010 ffff8008 f716a800 ffff8008 095cac68 ffff0000 f206d828 ffff8008

X21: 0xffff8008f2d98314:
8314  ffff8008 00000000 00000000 00000000 00000000 00000000 00000000 00000000
8334  00000000 00000000 00000000 00000000 00000000 08a04cf4 ffff0000 00000000
8354  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
8374  00000000 00000000 00000000 00001001 00000000 00000000 00000000 00000000
8394  e4bbe4bb 0f230000 ffff0000 0afae000 ffff0000 ae001000 00000000 f206d400
Timeout for IPC response!
83b4  ffff8008 00000000 00000000 f7957b18 ffff8008 f7957718 ffff8008 f7957018
83d4  ffff8008 f7957118 ffff8008 f7957618 ffff8008 f7957818 ffff8008 f7957918
83f4  ffff8008 f7957d18 ffff8008 00000000 00000000 00000000 00000000 00000000

X23: 0xffff8008f2d97f90:
7f90  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
7fb0  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
7fd0  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
7ff0  00000000 00000000 00000000 00000000 f76c8010 ffff8008 f76c8010 ffff8008
8010  00000100 00000000 f2d98018 ffff8008 f2d98018 ffff8008 08a067dc ffff0000
8030  f206d800 ffff8008 091c3650 ffff0000 f7957b18 ffff8008 f7957730 ffff8008
8050  f716a630 ffff8008 00000000 00000005 00000000 00000000 095d1568 ffff0000
8070  f76c8010 ffff8008 f716a800 ffff8008 095cac68 ffff0000 f206d828 ffff8008

X28: 0xffff8008f2349b80:
9b80  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9ba0  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9bc0  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9be0  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
9c00  00000022 00000000 ffffffff ffffffff 00010001 00000000 00000000 00000000
9c20  0b0a8000 ffff0000 00000002 00404040 00000000 00000000 00000000 00000000
9c40  00000001 00000000 00000001 00000000 001ebd44 00000001 f390b800 ffff8008
9c60  00000000 00000001 00000070 00000070 00000070 00000000 09031d48 ffff0000

Call trace:
Exception stack(0xffff000008003c90 to 0xffff000008003dd0)
3c80:                                   000000000000004e c3f28b86000c3900
3ca0: c3f28b86000c3900 ffffffffffffffff 0000000000000000 0000000000000000
3cc0: ffff000009f15e78 ffff0000086ca6cc ffff000009c8e280 6e6f637369642067
3ce0: 206e6f207463656e 7465676461472061 7369207461687420 746f6e2009090920
3d00: 7463656e6e6f6320 ffffffffffffffff ffff00000829f4d4 0000e395735727e8
3d20: 0000e3956f4f075a 0000000000000000 ffff8008f2d98010 ffff8008f2d98394
3d40: 00000000000001c0 ffff8008f2d98010 ffff000009c8e280 ffff000009885018
3d60: ffff000008004000 ffff000009885018 ffff8008f2349c00 ffff000008003dd0
3d80: ffff0000089ff9b0 ffff000008003dd0 ffff0000089ff9b0 00000000600001c5
3da0: ffff8008f33f2cd8 0000000000000000 0000ffffffffffff 0000000000000000
init: Received control message 'start' for 'adbd' from pid: 3359 (/vendor/bin/hw/android.hardware.usb@1.1-service.imx)
3dc0: ffff000008003dd0 ffff0000089ff9b0
[<ffff0000089ff9b0>] composite_disconnect+0x80/0x88
[<ffff000008a044d4>] android_disconnect+0x3c/0x68
[<ffff0000089ba9f8>] cdns3_device_irq_handler+0xfc/0x2c8
[<ffff0000089b84c0>] cdns3_irq+0x44/0x94
[<ffff00000814494c>] __handle_irq_event_percpu+0x60/0x24c
[<ffff000008144c0c>] handle_irq_event+0x58/0xc0
[<ffff00000814873c>] handle_fasteoi_irq+0x98/0x180
[<ffff000008143a10>] generic_handle_irq+0x24/0x38
[<ffff000008144170>] __handle_domain_irq+0x60/0xac
[<ffff0000080819c4>] gic_handle_irq+0xd4/0x17c

Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agousb: gadget: composite: Fix possible double free memory bug
Chandana Kishori Chiluveru [Tue, 1 Oct 2019 07:46:48 +0000 (13:16 +0530)]
usb: gadget: composite: Fix possible double free memory bug

[ Upstream commit 1c20c89b0421b52b2417bb0f62a611bc669eda1d ]

composite_dev_cleanup call from the failure of configfs_composite_bind
frees up the cdev->os_desc_req and cdev->req. If the previous calls of
bind and unbind is successful these will carry stale values.

Consider the below sequence of function calls:
configfs_composite_bind()
        composite_dev_prepare()
                - Allocate cdev->req, cdev->req->buf
        composite_os_desc_req_prepare()
                - Allocate cdev->os_desc_req, cdev->os_desc_req->buf
configfs_composite_unbind()
        composite_dev_cleanup()
                - free the cdev->os_desc_req->buf and cdev->req->buf
Next composition switch
configfs_composite_bind()
        - If it fails goto err_comp_cleanup will call the
  composite_dev_cleanup() function
        composite_dev_cleanup()
        - calls kfree up with the stale values of cdev->req->buf and
  cdev->os_desc_req from the previous configfs_composite_bind
  call. The free call on these stale values leads to double free.

Hence, Fix this issue by setting request and buffer pointer to NULL after
kfree.

Signed-off-by: Chandana Kishori Chiluveru <cchiluve@codeaurora.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agousb: gadget: udc: atmel: Fix interrupt storm in FIFO mode.
Cristian Birsan [Fri, 4 Oct 2019 17:10:54 +0000 (20:10 +0300)]
usb: gadget: udc: atmel: Fix interrupt storm in FIFO mode.

[ Upstream commit ba3a1a915c49cc3023e4ddfc88f21e7514e82aa4 ]

Fix interrupt storm generated by endpoints when working in FIFO mode.
The TX_COMPLETE interrupt is used only by control endpoints processing.
Do not enable it for other types of endpoints.

Fixes: 914a3f3b3754 ("USB: add atmel_usba_udc driver")
Signed-off-by: Cristian Birsan <cristian.birsan@microchip.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agousb: fsl: Check memory resource before releasing it
Nikhil Badola [Mon, 21 Oct 2019 10:21:51 +0000 (18:21 +0800)]
usb: fsl: Check memory resource before releasing it

[ Upstream commit bc1e3a2dd0c9954fd956ac43ca2876bbea018c01 ]

Check memory resource existence before releasing it to avoid NULL
pointer dereference

Signed-off-by: Nikhil Badola <nikhil.badola@freescale.com>
Reviewed-by: Ran Wang <ran.wang_1@nxp.com>
Reviewed-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agomacsec: fix refcnt leak in module exit routine
Taehee Yoo [Mon, 21 Oct 2019 18:47:55 +0000 (18:47 +0000)]
macsec: fix refcnt leak in module exit routine

[ Upstream commit 2bce1ebed17da54c65042ec2b962e3234bad5b47 ]

When a macsec interface is created, it increases a refcnt to a lower
device(real device). when macsec interface is deleted, the refcnt is
decreased in macsec_free_netdev(), which is ->priv_destructor() of
macsec interface.

The problem scenario is this.
When nested macsec interfaces are exiting, the exit routine of the
macsec module makes refcnt leaks.

Test commands:
    ip link add dummy0 type dummy
    ip link add macsec0 link dummy0 type macsec
    ip link add macsec1 link macsec0 type macsec
    modprobe -rv macsec

[  208.629433] unregister_netdevice: waiting for macsec0 to become free. Usage count = 1

Steps of exit routine of macsec module are below.
1. Calls ->dellink() in __rtnl_link_unregister().
2. Checks refcnt and wait refcnt to be 0 if refcnt is not 0 in
netdev_run_todo().
3. Calls ->priv_destruvtor() in netdev_run_todo().

Step2 checks refcnt, but step3 decreases refcnt.
So, step2 waits forever.

This patch makes the macsec module do not hold a refcnt of the lower
device because it already holds a refcnt of the lower device with
netdev_upper_dev_link().

Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agobonding: fix unexpected IFF_BONDING bit unset
Taehee Yoo [Mon, 21 Oct 2019 18:47:52 +0000 (18:47 +0000)]
bonding: fix unexpected IFF_BONDING bit unset

[ Upstream commit 65de65d9033750d2cf1b336c9d6e9da3a8b5cc6e ]

The IFF_BONDING means bonding master or bonding slave device.
->ndo_add_slave() sets IFF_BONDING flag and ->ndo_del_slave() unsets
IFF_BONDING flag.

bond0<--bond1

Both bond0 and bond1 are bonding device and these should keep having
IFF_BONDING flag until they are removed.
But bond1 would lose IFF_BONDING at ->ndo_del_slave() because that routine
do not check whether the slave device is the bonding type or not.
This patch adds the interface type check routine before removing
IFF_BONDING flag.

Test commands:
    ip link add bond0 type bond
    ip link add bond1 type bond
    ip link set bond1 master bond0
    ip link set bond1 nomaster
    ip link del bond1 type bond
    ip link add bond1 type bond

Splat looks like:
[  226.665555] proc_dir_entry 'bonding/bond1' already registered
[  226.666440] WARNING: CPU: 0 PID: 737 at fs/proc/generic.c:361 proc_register+0x2a9/0x3e0
[  226.667571] Modules linked in: bonding af_packet sch_fq_codel ip_tables x_tables unix
[  226.668662] CPU: 0 PID: 737 Comm: ip Not tainted 5.4.0-rc3+ #96
[  226.669508] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  226.670652] RIP: 0010:proc_register+0x2a9/0x3e0
[  226.671612] Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 39 01 00 00 48 8b 04 24 48 89 ea 48 c7 c7 a0 0b 14 9f 48 8b b0 e
0 00 00 00 e8 07 e7 88 ff <0f> 0b 48 c7 c7 40 2d a5 9f e8 59 d6 23 01 48 8b 4c 24 10 48 b8 00
[  226.675007] RSP: 0018:ffff888050e17078 EFLAGS: 00010282
[  226.675761] RAX: dffffc0000000008 RBX: ffff88805fdd0f10 RCX: ffffffff9dd344e2
[  226.676757] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff88806c9f6b8c
[  226.677751] RBP: ffff8880507160f3 R08: ffffed100d940019 R09: ffffed100d940019
[  226.678761] R10: 0000000000000001 R11: ffffed100d940018 R12: ffff888050716008
[  226.679757] R13: ffff8880507160f2 R14: dffffc0000000000 R15: ffffed100a0e2c1e
[  226.680758] FS:  00007fdc217cc0c0(0000) GS:ffff88806c800000(0000) knlGS:0000000000000000
[  226.681886] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  226.682719] CR2: 00007f49313424d0 CR3: 0000000050e46001 CR4: 00000000000606f0
[  226.683727] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  226.684725] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  226.685681] Call Trace:
[  226.687089]  proc_create_seq_private+0xb3/0xf0
[  226.687778]  bond_create_proc_entry+0x1b3/0x3f0 [bonding]
[  226.691458]  bond_netdev_event+0x433/0x970 [bonding]
[  226.692139]  ? __module_text_address+0x13/0x140
[  226.692779]  notifier_call_chain+0x90/0x160
[  226.693401]  register_netdevice+0x9b3/0xd80
[  226.694010]  ? alloc_netdev_mqs+0x854/0xc10
[  226.694629]  ? netdev_change_features+0xa0/0xa0
[  226.695278]  ? rtnl_create_link+0x2ed/0xad0
[  226.695849]  bond_newlink+0x2a/0x60 [bonding]
[  226.696422]  __rtnl_newlink+0xb9f/0x11b0
[  226.696968]  ? rtnl_link_unregister+0x220/0x220
[ ... ]

Fixes: 0b680e753724 ("[PATCH] bonding: Add priv_flag to avoid event mishandling")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoipvs: move old_secure_tcp into struct netns_ipvs
Eric Dumazet [Wed, 23 Oct 2019 16:53:03 +0000 (09:53 -0700)]
ipvs: move old_secure_tcp into struct netns_ipvs

[ Upstream commit c24b75e0f9239e78105f81c5f03a751641eb07ef ]

syzbot reported the following issue :

BUG: KCSAN: data-race in update_defense_level / update_defense_level

read to 0xffffffff861a6260 of 4 bytes by task 3006 on cpu 1:
 update_defense_level+0x621/0xb30 net/netfilter/ipvs/ip_vs_ctl.c:177
 defense_work_handler+0x3d/0xd0 net/netfilter/ipvs/ip_vs_ctl.c:225
 process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
 worker_thread+0xa0/0x800 kernel/workqueue.c:2415
 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352

write to 0xffffffff861a6260 of 4 bytes by task 7333 on cpu 0:
 update_defense_level+0xa62/0xb30 net/netfilter/ipvs/ip_vs_ctl.c:205
 defense_work_handler+0x3d/0xd0 net/netfilter/ipvs/ip_vs_ctl.c:225
 process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
 worker_thread+0xa0/0x800 kernel/workqueue.c:2415
 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 7333 Comm: kworker/0:5 Not tainted 5.4.0-rc3+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events defense_work_handler

Indeed, old_secure_tcp is currently a static variable, while it
needs to be a per netns variable.

Fixes: a0840e2e165a ("IPVS: netns, ip_vs_ctl local vars moved to ipvs struct.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoipvs: don't ignore errors in case refcounting ip_vs module fails
Davide Caratti [Sat, 19 Oct 2019 15:34:35 +0000 (17:34 +0200)]
ipvs: don't ignore errors in case refcounting ip_vs module fails

[ Upstream commit 62931f59ce9cbabb934a431f48f2f1f441c605ac ]

if the IPVS module is removed while the sync daemon is starting, there is
a small gap where try_module_get() might fail getting the refcount inside
ip_vs_use_count_inc(). Then, the refcounts of IPVS module are unbalanced,
and the subsequent call to stop_sync_thread() causes the following splat:

 WARNING: CPU: 0 PID: 4013 at kernel/module.c:1146 module_put.part.44+0x15b/0x290
  Modules linked in: ip_vs(-) nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 veth ip6table_filter ip6_tables iptable_filter binfmt_misc intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul ext4 mbcache jbd2 ghash_clmulni_intel snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_nhlt snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd glue_helper joydev pcspkr snd_timer virtio_balloon snd soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi virtio_net net_failover virtio_blk failover virtio_console qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ata_piix ttm crc32c_intel serio_raw drm virtio_pci libata virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: nf_defrag_ipv6]
  CPU: 0 PID: 4013 Comm: modprobe Tainted: G        W         5.4.0-rc1.upstream+ #741
  Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
  RIP: 0010:module_put.part.44+0x15b/0x290
  Code: 04 25 28 00 00 00 0f 85 18 01 00 00 48 83 c4 68 5b 5d 41 5c 41 5d 41 5e 41 5f c3 89 44 24 28 83 e8 01 89 c5 0f 89 57 ff ff ff <0f> 0b e9 78 ff ff ff 65 8b 1d 67 83 26 4a 89 db be 08 00 00 00 48
  RSP: 0018:ffff888050607c78 EFLAGS: 00010297
  RAX: 0000000000000003 RBX: ffffffffc1420590 RCX: ffffffffb5db0ef9
  RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffffc1420590
  RBP: 00000000ffffffff R08: fffffbfff82840b3 R09: fffffbfff82840b3
  R10: 0000000000000001 R11: fffffbfff82840b2 R12: 1ffff1100a0c0f90
  R13: ffffffffc1420200 R14: ffff88804f533300 R15: ffff88804f533ca0
  FS:  00007f8ea9720740(0000) GS:ffff888053800000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f3245abe000 CR3: 000000004c28a006 CR4: 00000000001606f0
  Call Trace:
   stop_sync_thread+0x3a3/0x7c0 [ip_vs]
   ip_vs_sync_net_cleanup+0x13/0x50 [ip_vs]
   ops_exit_list.isra.5+0x94/0x140
   unregister_pernet_operations+0x29d/0x460
   unregister_pernet_device+0x26/0x60
   ip_vs_cleanup+0x11/0x38 [ip_vs]
   __x64_sys_delete_module+0x2d5/0x400
   do_syscall_64+0xa5/0x4e0
   entry_SYSCALL_64_after_hwframe+0x49/0xbe
  RIP: 0033:0x7f8ea8bf0db7
  Code: 73 01 c3 48 8b 0d b9 80 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 89 80 2c 00 f7 d8 64 89 01 48
  RSP: 002b:00007ffcd38d2fe8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
  RAX: ffffffffffffffda RBX: 0000000002436240 RCX: 00007f8ea8bf0db7
  RDX: 0000000000000000 RSI: 0000000000000800 RDI: 00000000024362a8
  RBP: 0000000000000000 R08: 00007f8ea8eba060 R09: 00007f8ea8c658a0
  R10: 00007ffcd38d2a60 R11: 0000000000000206 R12: 0000000000000000
  R13: 0000000000000001 R14: 00000000024362a8 R15: 0000000000000000
  irq event stamp: 4538
  hardirqs last  enabled at (4537): [<ffffffffb6193dde>] quarantine_put+0x9e/0x170
  hardirqs last disabled at (4538): [<ffffffffb5a0556a>] trace_hardirqs_off_thunk+0x1a/0x20
  softirqs last  enabled at (4522): [<ffffffffb6f8ebe9>] sk_common_release+0x169/0x2d0
  softirqs last disabled at (4520): [<ffffffffb6f8eb3e>] sk_common_release+0xbe/0x2d0

Check the return value of ip_vs_use_count_inc() and let its caller return
proper error. Inside do_ip_vs_set_ctl() the module is already refcounted,
we don't need refcount/derefcount there. Finally, in register_ip_vs_app()
and start_sync_thread(), take the module refcount earlier and ensure it's
released in the error path.

Change since v1:
 - better return values in case of failure of ip_vs_use_count_inc(),
   thanks to Julian Anastasov
 - no need to increase/decrease the module refcount in ip_vs_set_ctl(),
   thanks to Julian Anastasov

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoscsi: qla2xxx: Initialized mailbox to prevent driver load failure
Himanshu Madhani [Tue, 22 Oct 2019 19:36:42 +0000 (12:36 -0700)]
scsi: qla2xxx: Initialized mailbox to prevent driver load failure

[ Upstream commit c2ff2a36eff60efb5e123c940115216d6bf65684 ]

This patch fixes issue with Gen7 adapter in a blade environment where one
of the ports will not be detected by driver. Firmware expects mailbox 11 to
be set or cleared by driver for newer ISP.

Following message is seen in the log file:

[   18.810892] qla2xxx [0000:d8:00.0]-1820:1: **** Failed=102 mb[0]=4005 mb[1]=37 mb[2]=20 mb[3]=8
[   18.819596]  cmd=2 ****

[mkp: typos]

Link: https://lore.kernel.org/r/20191022193643.7076-2-hmadhani@marvell.com
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoscsi: lpfc: Honor module parameter lpfc_use_adisc
Daniel Wagner [Tue, 22 Oct 2019 07:21:12 +0000 (09:21 +0200)]
scsi: lpfc: Honor module parameter lpfc_use_adisc

[ Upstream commit 0fd103ccfe6a06e40e2d9d8c91d96332cc9e1239 ]

The initial lpfc_desc_set_adisc implementation in commit
dea3101e0a5c ("lpfc: add Emulex FC driver version 8.0.28") enabled ADISC if

cfg_use_adisc && RSCN_MODE && FCP_2_DEVICE

In commit 92d7f7b0cde3 ("[SCSI] lpfc: NPIV: add NPIV support on top of
SLI-3") this changed to

(cfg_use_adisc && RSC_MODE) || FCP_2_DEVICE

and later in commit ffc954936b13 ("[SCSI] lpfc 8.3.13: FC Discovery Fixes
and enhancements.") to

(cfg_use_adisc && RSC_MODE) || (FCP_2_DEVICE && FCP_TARGET)

A customer reports that after a devloss, an ADISC failure is logged. It
turns out the ADISC flag is set even the user explicitly set lpfc_use_adisc
= 0.

[Sat Dec 22 22:55:58 2018] lpfc 0000:82:00.0: 2:(0):0203 Devloss timeout on WWPN 50:01:43:80:12:8e:40:20 NPort x05df00 Data: x82000000 x8 xa
[Sat Dec 22 23:08:20 2018] lpfc 0000:82:00.0: 2:(0):2755 ADISC failure DID:05DF00 Status:x9/x70000

[mkp: fixed Hannes' email]

Fixes: 92d7f7b0cde3 ("[SCSI] lpfc: NPIV: add NPIV support on top of SLI-3")
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Cc: James Smart <james.smart@broadcom.com>
Link: https://lore.kernel.org/r/20191022072112.132268-1-dwagner@suse.de
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agonet: openvswitch: free vport unless register_netdevice() succeeds
Hillf Danton [Mon, 21 Oct 2019 10:01:57 +0000 (12:01 +0200)]
net: openvswitch: free vport unless register_netdevice() succeeds

[ Upstream commit 9464cc37f3671ee69cb1c00662b5e1f113a96b23 ]

syzbot found the following crash on:

HEAD commit:    1e78030e Merge tag 'mmc-v5.3-rc1' of git://git.kernel.org/..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=148d3d1a600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=30cef20daf3e9977
link: https://syzkaller.appspot.com/bug?extid=13210896153522fe1ee5
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=136aa8c4600000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=109ba792600000

=====================================================================
BUG: memory leak
unreferenced object 0xffff8881207e4100 (size 128):
   comm "syz-executor032", pid 7014, jiffies 4294944027 (age 13.830s)
   hex dump (first 32 bytes):
     00 70 16 18 81 88 ff ff 80 af 8c 22 81 88 ff ff  .p........."....
     00 b6 23 17 81 88 ff ff 00 00 00 00 00 00 00 00  ..#.............
   backtrace:
     [<000000000eb78212>] kmemleak_alloc_recursive  include/linux/kmemleak.h:43 [inline]
     [<000000000eb78212>] slab_post_alloc_hook mm/slab.h:522 [inline]
     [<000000000eb78212>] slab_alloc mm/slab.c:3319 [inline]
     [<000000000eb78212>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3548
     [<00000000006ea6c6>] kmalloc include/linux/slab.h:552 [inline]
     [<00000000006ea6c6>] kzalloc include/linux/slab.h:748 [inline]
     [<00000000006ea6c6>] ovs_vport_alloc+0x37/0xf0  net/openvswitch/vport.c:130
     [<00000000f9a04a7d>] internal_dev_create+0x24/0x1d0  net/openvswitch/vport-internal_dev.c:164
     [<0000000056ee7c13>] ovs_vport_add+0x81/0x190  net/openvswitch/vport.c:199
     [<000000005434efc7>] new_vport+0x19/0x80 net/openvswitch/datapath.c:194
     [<00000000b7b253f1>] ovs_dp_cmd_new+0x22f/0x410  net/openvswitch/datapath.c:1614
     [<00000000e0988518>] genl_family_rcv_msg+0x2ab/0x5b0  net/netlink/genetlink.c:629
     [<00000000d0cc9347>] genl_rcv_msg+0x54/0x9c net/netlink/genetlink.c:654
     [<000000006694b647>] netlink_rcv_skb+0x61/0x170  net/netlink/af_netlink.c:2477
     [<0000000088381f37>] genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
     [<00000000dad42a47>] netlink_unicast_kernel  net/netlink/af_netlink.c:1302 [inline]
     [<00000000dad42a47>] netlink_unicast+0x1ec/0x2d0  net/netlink/af_netlink.c:1328
     [<0000000067e6b079>] netlink_sendmsg+0x270/0x480  net/netlink/af_netlink.c:1917
     [<00000000aab08a47>] sock_sendmsg_nosec net/socket.c:637 [inline]
     [<00000000aab08a47>] sock_sendmsg+0x54/0x70 net/socket.c:657
     [<000000004cb7c11d>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2311
     [<00000000c4901c63>] __sys_sendmsg+0x80/0xf0 net/socket.c:2356
     [<00000000c10abb2d>] __do_sys_sendmsg net/socket.c:2365 [inline]
     [<00000000c10abb2d>] __se_sys_sendmsg net/socket.c:2363 [inline]
     [<00000000c10abb2d>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2363

BUG: memory leak
unreferenced object 0xffff88811723b600 (size 64):
   comm "syz-executor032", pid 7014, jiffies 4294944027 (age 13.830s)
   hex dump (first 32 bytes):
     01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
     00 00 00 00 00 00 00 00 02 00 00 00 05 35 82 c1  .............5..
   backtrace:
     [<00000000352f46d8>] kmemleak_alloc_recursive  include/linux/kmemleak.h:43 [inline]
     [<00000000352f46d8>] slab_post_alloc_hook mm/slab.h:522 [inline]
     [<00000000352f46d8>] slab_alloc mm/slab.c:3319 [inline]
     [<00000000352f46d8>] __do_kmalloc mm/slab.c:3653 [inline]
     [<00000000352f46d8>] __kmalloc+0x169/0x300 mm/slab.c:3664
     [<000000008e48f3d1>] kmalloc include/linux/slab.h:557 [inline]
     [<000000008e48f3d1>] ovs_vport_set_upcall_portids+0x54/0xd0  net/openvswitch/vport.c:343
     [<00000000541e4f4a>] ovs_vport_alloc+0x7f/0xf0  net/openvswitch/vport.c:139
     [<00000000f9a04a7d>] internal_dev_create+0x24/0x1d0  net/openvswitch/vport-internal_dev.c:164
     [<0000000056ee7c13>] ovs_vport_add+0x81/0x190  net/openvswitch/vport.c:199
     [<000000005434efc7>] new_vport+0x19/0x80 net/openvswitch/datapath.c:194
     [<00000000b7b253f1>] ovs_dp_cmd_new+0x22f/0x410  net/openvswitch/datapath.c:1614
     [<00000000e0988518>] genl_family_rcv_msg+0x2ab/0x5b0  net/netlink/genetlink.c:629
     [<00000000d0cc9347>] genl_rcv_msg+0x54/0x9c net/netlink/genetlink.c:654
     [<000000006694b647>] netlink_rcv_skb+0x61/0x170  net/netlink/af_netlink.c:2477
     [<0000000088381f37>] genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
     [<00000000dad42a47>] netlink_unicast_kernel  net/netlink/af_netlink.c:1302 [inline]
     [<00000000dad42a47>] netlink_unicast+0x1ec/0x2d0  net/netlink/af_netlink.c:1328
     [<0000000067e6b079>] netlink_sendmsg+0x270/0x480  net/netlink/af_netlink.c:1917
     [<00000000aab08a47>] sock_sendmsg_nosec net/socket.c:637 [inline]
     [<00000000aab08a47>] sock_sendmsg+0x54/0x70 net/socket.c:657
     [<000000004cb7c11d>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2311
     [<00000000c4901c63>] __sys_sendmsg+0x80/0xf0 net/socket.c:2356

BUG: memory leak
unreferenced object 0xffff8881228ca500 (size 128):
   comm "syz-executor032", pid 7015, jiffies 4294944622 (age 7.880s)
   hex dump (first 32 bytes):
     00 f0 27 18 81 88 ff ff 80 ac 8c 22 81 88 ff ff  ..'........"....
     40 b7 23 17 81 88 ff ff 00 00 00 00 00 00 00 00  @.#.............
   backtrace:
     [<000000000eb78212>] kmemleak_alloc_recursive  include/linux/kmemleak.h:43 [inline]
     [<000000000eb78212>] slab_post_alloc_hook mm/slab.h:522 [inline]
     [<000000000eb78212>] slab_alloc mm/slab.c:3319 [inline]
     [<000000000eb78212>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3548
     [<00000000006ea6c6>] kmalloc include/linux/slab.h:552 [inline]
     [<00000000006ea6c6>] kzalloc include/linux/slab.h:748 [inline]
     [<00000000006ea6c6>] ovs_vport_alloc+0x37/0xf0  net/openvswitch/vport.c:130
     [<00000000f9a04a7d>] internal_dev_create+0x24/0x1d0  net/openvswitch/vport-internal_dev.c:164
     [<0000000056ee7c13>] ovs_vport_add+0x81/0x190  net/openvswitch/vport.c:199
     [<000000005434efc7>] new_vport+0x19/0x80 net/openvswitch/datapath.c:194
     [<00000000b7b253f1>] ovs_dp_cmd_new+0x22f/0x410  net/openvswitch/datapath.c:1614
     [<00000000e0988518>] genl_family_rcv_msg+0x2ab/0x5b0  net/netlink/genetlink.c:629
     [<00000000d0cc9347>] genl_rcv_msg+0x54/0x9c net/netlink/genetlink.c:654
     [<000000006694b647>] netlink_rcv_skb+0x61/0x170  net/netlink/af_netlink.c:2477
     [<0000000088381f37>] genl_rcv+0x29/0x40 net/netlink/genetlink.c:665
     [<00000000dad42a47>] netlink_unicast_kernel  net/netlink/af_netlink.c:1302 [inline]
     [<00000000dad42a47>] netlink_unicast+0x1ec/0x2d0  net/netlink/af_netlink.c:1328
     [<0000000067e6b079>] netlink_sendmsg+0x270/0x480  net/netlink/af_netlink.c:1917
     [<00000000aab08a47>] sock_sendmsg_nosec net/socket.c:637 [inline]
     [<00000000aab08a47>] sock_sendmsg+0x54/0x70 net/socket.c:657
     [<000000004cb7c11d>] ___sys_sendmsg+0x393/0x3c0 net/socket.c:2311
     [<00000000c4901c63>] __sys_sendmsg+0x80/0xf0 net/socket.c:2356
     [<00000000c10abb2d>] __do_sys_sendmsg net/socket.c:2365 [inline]
     [<00000000c10abb2d>] __se_sys_sendmsg net/socket.c:2363 [inline]
     [<00000000c10abb2d>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2363
=====================================================================

The function in net core, register_netdevice(), may fail with vport's
destruction callback either invoked or not. After commit 309b66970ee2
("net: openvswitch: do not free vport if register_netdevice() is failed."),
the duty to destroy vport is offloaded from the driver OTOH, which ends
up in the memory leak reported.

It is fixed by releasing vport unless device is registered successfully.
To do that, the callback assignment is defered until device is registered.

Reported-by: syzbot+13210896153522fe1ee5@syzkaller.appspotmail.com
Fixes: 309b66970ee2 ("net: openvswitch: do not free vport if register_netdevice() is failed.")
Cc: Taehee Yoo <ap420073@gmail.com>
Cc: Greg Rose <gvrose8192@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Hillf Danton <hdanton@sina.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
[sbrivio: this was sent to dev@openvswitch.org and never made its way
 to netdev -- resending original patch]
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoRDMA/uverbs: Prevent potential underflow
Dan Carpenter [Fri, 11 Oct 2019 13:34:19 +0000 (16:34 +0300)]
RDMA/uverbs: Prevent potential underflow

[ Upstream commit a9018adfde809d44e71189b984fa61cc89682b5e ]

The issue is in drivers/infiniband/core/uverbs_std_types_cq.c in the
UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE) function.  We check that:

        if (attr.comp_vector >= attrs->ufile->device->num_comp_vectors) {

But we don't check if "attr.comp_vector" is negative.  It could
potentially lead to an array underflow.  My concern would be where
cq->vector is used in the create_cq() function from the cxgb4 driver.

And really "attr.comp_vector" is appears as a u32 to user space so that's
the right type to use.

Fixes: 9ee79fce3642 ("IB/core: Add completion queue (cq) object actions")
Link: https://lore.kernel.org/r/20191011133419.GA22905@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoscsi: qla2xxx: fixup incorrect usage of host_byte
Hannes Reinecke [Fri, 18 Oct 2019 14:04:58 +0000 (16:04 +0200)]
scsi: qla2xxx: fixup incorrect usage of host_byte

[ Upstream commit 66cf50e65b183c863825f5c28a818e3f47a72e40 ]

DRIVER_ERROR is a a driver byte setting, not a host byte.  The qla2xxx
driver should rather return DID_ERROR here to be in line with the other
drivers.

Link: https://lore.kernel.org/r/20191018140458.108278-1-hare@suse.de
Signed-off-by: Hannes Reinecke <hare@suse.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agonet/mlx5: prevent memory leak in mlx5_fpga_conn_create_cq
Navid Emamdoost [Wed, 25 Sep 2019 03:20:34 +0000 (22:20 -0500)]
net/mlx5: prevent memory leak in mlx5_fpga_conn_create_cq

[ Upstream commit c8c2a057fdc7de1cd16f4baa51425b932a42eb39 ]

In mlx5_fpga_conn_create_cq if mlx5_vector2eqn fails the allocated
memory should be released.

Fixes: 537a50574175 ("net/mlx5: FPGA, Add high-speed connection routines")
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoRDMA/qedr: Fix reported firmware version
Kamal Heib [Mon, 7 Oct 2019 21:07:30 +0000 (00:07 +0300)]
RDMA/qedr: Fix reported firmware version

[ Upstream commit b806c94ee44e53233b8ce6c92d9078d9781786a5 ]

Remove spaces from the reported firmware version string.
Actual value:
$ cat /sys/class/infiniband/qedr0/fw_ver
8. 37. 7. 0

Expected value:
$ cat /sys/class/infiniband/qedr0/fw_ver
8.37.7.0

Fixes: ec72fce401c6 ("qedr: Add support for RoCE HW init")
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Acked-by: Michal KalderonĀ <michal.kalderon@marvell.com>
Link: https://lore.kernel.org/r/20191007210730.7173-1-kamalheib1@gmail.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoHID: intel-ish-hid: fix wrong error handling in ishtp_cl_alloc_tx_ring()
Zhang Lixu [Wed, 16 Oct 2019 00:15:59 +0000 (08:15 +0800)]
HID: intel-ish-hid: fix wrong error handling in ishtp_cl_alloc_tx_ring()

[ Upstream commit 16ff7bf6dbcc6f77d2eec1ac9120edf44213c2f1 ]

When allocating tx ring buffers failed, should free tx buffers, not rx buffers.

Signed-off-by: Zhang Lixu <lixu.zhang@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agodmaengine: xilinx_dma: Fix control reg update in vdma_channel_set_config
Radhey Shyam Pandey [Thu, 26 Sep 2019 10:50:58 +0000 (16:20 +0530)]
dmaengine: xilinx_dma: Fix control reg update in vdma_channel_set_config

[ Upstream commit 6c6de1ddb1be3840f2ed5cc9d009a622720940c9 ]

In vdma_channel_set_config clear the delay, frame count and master mask
before updating their new values. It avoids programming incorrect state
when input parameters are different from default.

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Acked-by: Appana Durga Kedareswara rao <appana.durga.rao@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Link: https://lore.kernel.org/r/1569495060-18117-3-git-send-email-radhey.shyam.pandey@xilinx.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
4 years agoPCI: tegra: Enable Relaxed Ordering only for Tegra20 & Tegra30
Vidya Sagar [Thu, 4 Jul 2019 15:04:28 +0000 (20:34 +0530)]
PCI: tegra: Enable Relaxed Ordering only for Tegra20 & Tegra30

commit 7be142caabc4780b13a522c485abc806de5c4114 upstream.

The PCI Tegra controller conversion to a device tree configurable
driver in commit d1523b52bff3 ("PCI: tegra: Move PCIe driver
to drivers/pci/host") implied that code for the driver can be
compiled in for a kernel supporting multiple platforms.

Unfortunately, a blind move of the code did not check that some of the
quirks that were applied in arch/arm (eg enabling Relaxed Ordering on
all PCI devices - since the quirk hook erroneously matches PCI_ANY_ID
for both Vendor-ID and Device-ID) are now applied in all kernels that
compile the PCI Tegra controlled driver, DT and ACPI alike.

This is completely wrong, in that enablement of Relaxed Ordering is only
required by default in Tegra20 platforms as described in the Tegra20
Technical Reference Manual (available at
https://developer.nvidia.com/embedded/downloads#?search=tegra%202 in
Section 34.1, where it is mentioned that Relaxed Ordering bit needs to
be enabled in its root ports to avoid deadlock in hardware) and in the
Tegra30 platforms for the same reasons (unfortunately not documented
in the TRM).

There is no other strict requirement on PCI devices Relaxed Ordering
enablement on any other Tegra platforms or PCI host bridge driver.

Fix this quite upsetting situation by limiting the vendor and device IDs
to which the Relaxed Ordering quirk applies to the root ports in
question, reported above.

Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
[lorenzo.pieralisi@arm.com: completely rewrote the commit log/fixes tag]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agousbip: Implement SG support to vhci-hcd and stub driver
Suwan Kim [Wed, 28 Aug 2019 03:27:41 +0000 (12:27 +0900)]
usbip: Implement SG support to vhci-hcd and stub driver

commit ea44d190764b4422af4d1c29eaeb9e69e353b406 upstream.

There are bugs on vhci with usb 3.0 storage device. In USB, each SG
list entry buffer should be divisible by the bulk max packet size.
But with native SG support, this problem doesn't matter because the
SG buffer is treated as contiguous buffer. But without native SG
support, USB storage driver breaks SG list into several URBs and the
error occurs because of a buffer size of URB that cannot be divided
by the bulk max packet size. The error situation is as follows.

When USB Storage driver requests 31.5 KB data and has SG list which
has 3584 bytes buffer followed by 7 4096 bytes buffer for some
reason. USB Storage driver splits this SG list into several URBs
because VHCI doesn't support SG and sends them separately. So the
first URB buffer size is 3584 bytes. When receiving data from device,
USB 3.0 device sends data packet of 1024 bytes size because the max
packet size of BULK pipe is 1024 bytes. So device sends 4096 bytes.
But the first URB buffer has only 3584 bytes buffer size. So host
controller terminates the transfer even though there is more data to
receive. So, vhci needs to support SG transfer to prevent this error.

In this patch, vhci supports SG regardless of whether the server's
host controller supports SG or not, because stub driver splits SG
list into several URBs if the server's host controller doesn't
support SG.

To support SG, vhci sets URB_DMA_MAP_SG flag in urb->transfer_flags
if URB has SG list and this flag will tell stub driver to use SG
list. After receiving urb from stub driver, vhci clear URB_DMA_MAP_SG
flag to avoid unnecessary DMA unmapping in HCD.

vhci sends each SG list entry to stub driver. Then, stub driver sees
the total length of the buffer and allocates SG table and pages
according to the total buffer length calling sgl_alloc(). After stub
driver receives completed URB, it again sends each SG list entry to
vhci.

If the server's host controller doesn't support SG, stub driver
breaks a single SG request into several URBs and submits them to
the server's host controller. When all the split URBs are completed,
stub driver reassembles the URBs into a single return command and
sends it to vhci.

Moreover, in the situation where vhci supports SG, but stub driver
does not, or vice versa, usbip works normally. Because there is no
protocol modification, there is no problem in communication between
server and client even if the one has a kernel without SG support.

In the case of vhci supports SG and stub driver doesn't, because
vhci sends only the total length of the buffer to stub driver as
it did before the patch applied, stub driver only needs to allocate
the required length of buffers using only kmalloc() regardless of
whether vhci supports SG or not. But stub driver has to allocate
buffer with kmalloc() as much as the total length of SG buffer which
is quite huge when vhci sends SG request, so it has overhead in
buffer allocation in this situation.

If stub driver needs to send data buffer to vhci because of IN pipe,
stub driver also sends only total length of buffer as metadata and
then sends real data as vhci does. Then vhci receive data from stub
driver and store it to the corresponding buffer of SG list entry.

And for the case of stub driver supports SG and vhci doesn't, since
the USB storage driver checks that vhci doesn't support SG and sends
the request to stub driver by splitting the SG list into multiple
URBs, stub driver allocates a buffer for each URB with kmalloc() as
it did before this patch.

* Test environment

Test uses two difference machines and two different kernel version
to make mismatch situation between the client and the server where
vhci supports SG, but stub driver does not, or vice versa. All tests
are conducted in both full SG support that both vhci and stub support
SG and half SG support that is the mismatch situation. Test kernel
version is 5.3-rc6 with commit "usb: add a HCD_DMA flag instead of
guestimating DMA capabilities" to avoid unnecessary DMA mapping and
unmapping.

 - Test kernel version
    - 5.3-rc6 with SG support
    - 5.1.20-200.fc29.x86_64 without SG support

* SG support test

 - Test devices
    - Super-speed storage device - SanDisk Ultra USB 3.0
    - High-speed storage device - SMI corporation USB 2.0 flash drive

 - Test description

Test read and write operation of mass storage device that uses the
BULK transfer. In test, the client reads and writes files whose size
is over 1G and it works normally.

* Regression test

 - Test devices
    - Super-speed device - Logitech Brio webcam
    - High-speed device  - Logitech C920 HD Pro webcam
    - Full-speed device  - Logitech bluetooth mouse
                         - Britz BR-Orion speaker
    - Low-speed device   - Logitech wired mouse

 - Test description

Moving and click test for mouse. To test the webcam, use gnome-cheese.
To test the speaker, play music and video on the client. All works
normally.

* VUDC compatibility test

VUDC also works well with this patch. Tests are done with two USB
gadget created by CONFIGFS USB gadget. Both use the BULK pipe.

        1. Serial gadget
        2. Mass storage gadget

 - Serial gadget test

Serial gadget on the host sends and receives data using cat command
on the /dev/ttyGS<N>. The client uses minicom to communicate with
the serial gadget.

 - Mass storage gadget test

After connecting the gadget with vhci, use "dd" to test read and
write operation on the client side.

Read  - dd if=/dev/sd<N> iflag=direct of=/dev/null bs=1G count=1
Write - dd if=<my file path> iflag=direct of=/dev/sd<N> bs=1G count=1

Signed-off-by: Suwan Kim <suwan.kim027@gmail.com>
Acked-by: Shuah khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20190828032741.12234-1-suwan.kim027@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agousbip: stub_rx: fix static checker warning on unnecessary checks
Shuah Khan [Fri, 15 Dec 2017 17:05:15 +0000 (10:05 -0700)]
usbip: stub_rx: fix static checker warning on unnecessary checks

commit 10c90120930628e8b959bf58d4a0aaef3ae5d945 upstream.

Fix the following static checker warnings:

The patch c6688ef9f297: "usbip: fix stub_rx: harden CMD_SUBMIT path
to handle malicious input" from Dec 7, 2017, leads to the following
static checker warning:

    drivers/usb/usbip/stub_rx.c:346 get_pipe()
    warn: impossible condition
'(pdu->u.cmd_submit.transfer_buffer_length > ((~0 >> 1))) =>
(s32min-s32max > s32max)'
    drivers/usb/usbip/stub_rx.c:486 stub_recv_cmd_submit()
    warn: always true condition
'(pdu->u.cmd_submit.transfer_buffer_length <= ((~0 >> 1))) =>
(s32min-s32max <= s32max)'

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agousbip: Fix vhci_urb_enqueue() URB null transfer buffer error path
Shuah Khan [Thu, 24 Jan 2019 21:46:42 +0000 (14:46 -0700)]
usbip: Fix vhci_urb_enqueue() URB null transfer buffer error path

commit 2c904963b1dd2acd4bc785b6c72e10a6283c2081 upstream.

Fix vhci_urb_enqueue() to print debug msg and return error instead of
failing with BUG_ON.

Signed-off-by: Shuah Khan <shuah@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agolib/scatterlist: Introduce sgl_alloc() and sgl_free()
Bart Van Assche [Fri, 5 Jan 2018 16:26:46 +0000 (08:26 -0800)]
lib/scatterlist: Introduce sgl_alloc() and sgl_free()

commit e80a0af4759a164214f02da157a3800753ce135f upstream.

Many kernel drivers contain code that allocates and frees both a
scatterlist and the pages that populate that scatterlist.
Introduce functions in lib/scatterlist.c that perform these tasks
instead of duplicating this functionality in multiple drivers.
Only include these functions in the build if CONFIG_SGL_ALLOC=y
to avoid that the kernel size increases if this functionality is
not used.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agosched/fair: Fix -Wunused-but-set-variable warnings
Qian Cai [Tue, 20 Aug 2019 18:40:55 +0000 (14:40 -0400)]
sched/fair: Fix -Wunused-but-set-variable warnings

commit 763a9ec06c409dcde2a761aac4bb83ff3938e0b3 upstream.

Commit:

   de53fd7aedb1 ("sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu-local slices")

introduced a few compilation warnings:

  kernel/sched/fair.c: In function '__refill_cfs_bandwidth_runtime':
  kernel/sched/fair.c:4365:6: warning: variable 'now' set but not used [-Wunused-but-set-variable]
  kernel/sched/fair.c: In function 'start_cfs_bandwidth':
  kernel/sched/fair.c:4992:6: warning: variable 'overrun' set but not used [-Wunused-but-set-variable]

Also, __refill_cfs_bandwidth_runtime() does no longer update the
expiration time, so fix the comments accordingly.

Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ben Segall <bsegall@google.com>
Reviewed-by: Dave Chiluk <chiluk+linux@indeed.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: pauld@redhat.com
Fixes: de53fd7aedb1 ("sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu-local slices")
Link: https://lkml.kernel.org/r/1566326455-8038-1-git-send-email-cai@lca.pw
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agosched/fair: Fix low cpu usage with high throttling by removing expiration of cpu...
Dave Chiluk [Tue, 23 Jul 2019 16:44:26 +0000 (11:44 -0500)]
sched/fair: Fix low cpu usage with high throttling by removing expiration of cpu-local slices

commit de53fd7aedb100f03e5d2231cfce0e4993282425 upstream.

It has been observed, that highly-threaded, non-cpu-bound applications
running under cpu.cfs_quota_us constraints can hit a high percentage of
periods throttled while simultaneously not consuming the allocated
amount of quota. This use case is typical of user-interactive non-cpu
bound applications, such as those running in kubernetes or mesos when
run on multiple cpu cores.

This has been root caused to cpu-local run queue being allocated per cpu
bandwidth slices, and then not fully using that slice within the period.
At which point the slice and quota expires. This expiration of unused
slice results in applications not being able to utilize the quota for
which they are allocated.

The non-expiration of per-cpu slices was recently fixed by
'commit 512ac999d275 ("sched/fair: Fix bandwidth timer clock drift
condition")'. Prior to that it appears that this had been broken since
at least 'commit 51f2176d74ac ("sched/fair: Fix unlocked reads of some
cfs_b->quota/period")' which was introduced in v3.16-rc1 in 2014. That
added the following conditional which resulted in slices never being
expired.

if (cfs_rq->runtime_expires != cfs_b->runtime_expires) {
/* extend local deadline, drift is bounded above by 2 ticks */
cfs_rq->runtime_expires += TICK_NSEC;

Because this was broken for nearly 5 years, and has recently been fixed
and is now being noticed by many users running kubernetes
(https://github.com/kubernetes/kubernetes/issues/67577) it is my opinion
that the mechanisms around expiring runtime should be removed
altogether.

This allows quota already allocated to per-cpu run-queues to live longer
than the period boundary. This allows threads on runqueues that do not
use much CPU to continue to use their remaining slice over a longer
period of time than cpu.cfs_period_us. However, this helps prevent the
above condition of hitting throttling while also not fully utilizing
your cpu quota.

This theoretically allows a machine to use slightly more than its
allotted quota in some periods. This overflow would be bounded by the
remaining quota left on each per-cpu runqueueu. This is typically no
more than min_cfs_rq_runtime=1ms per cpu. For CPU bound tasks this will
change nothing, as they should theoretically fully utilize all of their
quota in each period. For user-interactive tasks as described above this
provides a much better user/application experience as their cpu
utilization will more closely match the amount they requested when they
hit throttling. This means that cpu limits no longer strictly apply per
period for non-cpu bound applications, but that they are still accurate
over longer timeframes.

This greatly improves performance of high-thread-count, non-cpu bound
applications with low cfs_quota_us allocation on high-core-count
machines. In the case of an artificial testcase (10ms/100ms of quota on
80 CPU machine), this commit resulted in almost 30x performance
improvement, while still maintaining correct cpu quota restrictions.
That testcase is available at https://github.com/indeedeng/fibtest.

Fixes: 512ac999d275 ("sched/fair: Fix bandwidth timer clock drift condition")
Signed-off-by: Dave Chiluk <chiluk+linux@indeed.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Phil Auld <pauld@redhat.com>
Reviewed-by: Ben Segall <bsegall@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Hammond <jhammond@indeed.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kyle Anderson <kwa@yelp.com>
Cc: Gabriel Munos <gmunoz@netflix.com>
Cc: Peter Oskolkov <posk@posk.io>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Brendan Gregg <bgregg@netflix.com>
Link: https://lkml.kernel.org/r/1563900266-19734-2-git-send-email-chiluk+linux@indeed.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agoARM: dts: dra7: Disable USB metastability workaround for USB2
Roger Quadros [Tue, 31 Oct 2017 13:26:00 +0000 (15:26 +0200)]
ARM: dts: dra7: Disable USB metastability workaround for USB2

commit b8c9c6fa2002b8fd4a9710f76f80f99c6046d48c upstream.

The metastability workaround causes Erratic errors [1]
on the HighSpeed USB PHY which can cause upto 2 seconds
delay in enumerating to a USB host while in Gadget mode.

Disable the Run/Stop metastability workaround to avoid this
ill effect. We are aware that this opens up the opportunity
for Run/Stop metastability, however this issue has never been
observed in TI releases so we think that Run/Stop metastability
is a lesser evil than the PHY Erratic errors. So disable it.

[1] USB controller trace during gadget enumeration
    irq/90-dwc3-969   [000] d...    52.323145: dwc3_event: event
    (00000901): Erratic Error [U0]
    irq/90-dwc3-969   [000] d...    52.560646: dwc3_event: event
    (00000901): Erratic Error [U0]
    irq/90-dwc3-969   [000] d...    52.798144: dwc3_event: event
    (00000901): Erratic Error [U0]

Signed-off-by: Roger Quadros <rogerq@ti.com>
Acked-by: Felipe Balbi <felipe.balbi@linux/intel.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agocpufreq: ti-cpufreq: add missing of_node_put()
Zumeng Chen [Thu, 5 Sep 2019 16:17:59 +0000 (10:17 -0600)]
cpufreq: ti-cpufreq: add missing of_node_put()

commit 248aefdcc3a7e0cfbd014946b4dead63e750e71b upstream

call of_node_put to release the refcount of np.

Signed-off-by: Zumeng Chen <zumeng.chen@gmail.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agoi2c: omap: Trigger bus recovery in lockup case
Claudio Foellmi [Thu, 5 Sep 2019 16:17:58 +0000 (10:17 -0600)]
i2c: omap: Trigger bus recovery in lockup case

commit 93367bfca98f36cece57c01dbce6ea1b4ac58245 upstream

A very conservative check for bus activity (to prevent interference
in multimaster setups) prevented the bus recovery methods from being
triggered in the case that SDA or SCL was stuck low.
This defeats the purpose of the recovery mechanism, which was introduced
for exactly this situation (a slave device keeping SDA pulled down).

Also added a check to make sure SDA is low before attempting recovery.
If SDA is not stuck low, recovery will not help, so we can skip it.

Note that bus lockups can persist across reboots. The only other options
are to reset or power cycle the offending slave device, and many i2c
slaves do not even have a reset pin.

If we see that one of the lines is low for the entire timeout duration,
we can actually be sure that there is no other master driving the bus.
It is therefore save for us to attempt a bus recovery.

Signed-off-by: Claudio Foellmi <claudio.foellmi@ergon.ch>
Tested-by: Vignesh R <vigneshr@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
[wsa: fixed one return code to -EBUSY]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agoASoC: davinci-mcasp: Fix an error handling path in 'davinci_mcasp_probe()'
Christophe Jaillet [Thu, 5 Sep 2019 16:17:57 +0000 (10:17 -0600)]
ASoC: davinci-mcasp: Fix an error handling path in 'davinci_mcasp_probe()'

commit 1b8b68b05d1868404316d32e20782b00442aba90 upstream

All error handling paths in this function 'goto err' except this one.

If one of the 2 previous memory allocations fails, we should go through
the existing error handling path. Otherwise there is an unbalanced
pm_runtime_enable()/pm_runtime_disable().

Fixes: dd55ff8346a9 ("ASoC: davinci-mcasp: Add set_tdm_slots() support")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 years agoASoC: davinci: Kill BUG_ON() usage
Takashi Iwai [Thu, 5 Sep 2019 16:17:56 +0000 (10:17 -0600)]
ASoC: davinci: Kill BUG_ON() usage

commit befff4fbc27e19b14b343eb4a65d8f75d38b6230 upstream

Don't use BUG_ON() for a non-critical sanity check on production
systems.  This patch replaces with a softer WARN_ON() and an error
path.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>