]> git.itanic.dy.fi Git - linux-stable/log
linux-stable
5 years agoLinux 4.20.8 v4.20.8
Greg Kroah-Hartman [Tue, 12 Feb 2019 19:02:39 +0000 (20:02 +0100)]
Linux 4.20.8

5 years agoath9k: dynack: check da->enabled first in sampling routines
Lorenzo Bianconi [Fri, 2 Nov 2018 20:49:57 +0000 (21:49 +0100)]
ath9k: dynack: check da->enabled first in sampling routines

commit 9d3d65a91f027b8a9af5e63752d9b78cb10eb92d upstream.

Check da->enabled flag first in ath_dynack_sample_tx_ts and
ath_dynack_sample_ack_ts routines in order to avoid useless
processing

Tested-by: Koen Vandeputte <koen.vandeputte@ncentric.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoath9k: dynack: make ewma estimation faster
Lorenzo Bianconi [Fri, 2 Nov 2018 20:49:58 +0000 (21:49 +0100)]
ath9k: dynack: make ewma estimation faster

commit 0c60c490830a1a756c80f8de8d33d9c6359d4a36 upstream.

In order to make propagation time estimation faster,
use current sample as ewma output value during 'late ack'
tracking

Tested-by: Koen Vandeputte <koen.vandeputte@ncentric.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonfsd: Fix error return values for nfsd4_clone_file_range()
Trond Myklebust [Mon, 21 Jan 2019 20:58:38 +0000 (15:58 -0500)]
nfsd: Fix error return values for nfsd4_clone_file_range()

commit e3fdc89ca47ef34dfb6fd5101fec084c3dba5486 upstream.

If the parameter 'count' is non-zero, nfsd4_clone_file_range() will
currently clobber all errors returned by vfs_clone_file_range() and
replace them with EINVAL.

Fixes: 42ec3d4c0218 ("vfs: make remap_file_range functions take and...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.20+
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocacheinfo: Keep the old value if of_property_read_u32 fails
Huacai Chen [Wed, 19 Dec 2018 08:16:03 +0000 (16:16 +0800)]
cacheinfo: Keep the old value if of_property_read_u32 fails

commit 3a34c986324c07dde32903f7bb262e6138e77c2a upstream.

Commit 448a5a552f336bd7b847b1951 ("drivers: base: cacheinfo: use OF
property_read_u32 instead of get_property,read_number") makes cache
size and number_of_sets be 0 if DT doesn't provide there values. I
think this is unreasonable so make them keep the old values, which is
the same as old kernels.

Fixes: 448a5a552f33 ("drivers: base: cacheinfo: use OF property_read_u32 instead of get_property,read_number")
Cc: stable@vger.kernel.org
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoserial: sh-sci: Do not free irqs that have already been freed
Chris Brandt [Mon, 28 Jan 2019 18:25:56 +0000 (13:25 -0500)]
serial: sh-sci: Do not free irqs that have already been freed

commit 4d95987a32db53f3beca76f8c4c8309ef6a5f192 upstream.

Since IRQs might be muxed on some parts, we need to pay attention when we
are freeing them.
Otherwise we get the ugly WARNING "Trying to free already-free IRQ 20".

Fixes: 628c534ae735 ("serial: sh-sci: Improve support for separate TEI and DRI interrupts")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoserial: 8250_pci: Make PCI class test non fatal
Andy Shevchenko [Thu, 24 Jan 2019 21:51:21 +0000 (23:51 +0200)]
serial: 8250_pci: Make PCI class test non fatal

commit 824d17c57b0abbcb9128fb3f7327fae14761914b upstream.

As has been reported the National Instruments serial cards have broken
PCI class.

The commit 7d8905d06405

  ("serial: 8250_pci: Enable device after we check black list")

made the PCI class check mandatory for the case when device is listed in
a quirk list.

Make PCI class test non fatal to allow broken card be enumerated.

Fixes: 7d8905d06405 ("serial: 8250_pci: Enable device after we check black list")
Cc: stable <stable@vger.kernel.org>
Reported-by: Guan Yung Tseng <guan.yung.tseng@ni.com>
Tested-by: Guan Yung Tseng <guan.yung.tseng@ni.com>
Tested-by: KHUENY.Gerhard <Gerhard.KHUENY@bachmann.info>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoserial: fix race between flush_to_ldisc and tty_open
Greg Kroah-Hartman [Thu, 31 Jan 2019 09:43:16 +0000 (17:43 +0800)]
serial: fix race between flush_to_ldisc and tty_open

commit fedb5760648a291e949f2380d383b5b2d2749b5e upstream.

There still is a race window after the commit b027e2298bd588
("tty: fix data race between tty_init_dev and flush of buf"),
and we encountered this crash issue if receive_buf call comes
before tty initialization completes in tty_open and
tty->driver_data may be NULL.

CPU0                                    CPU1
----                                    ----
                                  tty_open
                                   tty_init_dev
                                     tty_ldisc_unlock
                                       schedule
flush_to_ldisc
 receive_buf
  tty_port_default_receive_buf
   tty_ldisc_receive_buf
    n_tty_receive_buf_common
      __receive_buf
       uart_flush_chars
        uart_start
        /*tty->driver_data is NULL*/
                                   tty->ops->open
                                   /*init tty->driver_data*/

it can be fixed by extending ldisc semaphore lock in tty_init_dev
to driver_data initialized completely after tty->ops->open(), but
this will lead to get lock on one function and unlock in some other
function, and hard to maintain, so fix this race only by checking
tty->driver_data when receiving, and return if tty->driver_data
is NULL, and n_tty_receive_buf_common maybe calls uart_unthrottle,
so add the same check.

Because the tty layer knows nothing about the driver associated with the
device, the tty layer can not do anything here, it is up to the tty
driver itself to check for this type of race.  Fix up the serial driver
to correctly check to see if it is finished binding with the device when
being called, and if not, abort the tty calls.

[Description and problem report and testing from Li RongQing, I rewrote
the patch to be in the serial layer, not in the tty core - gregkh]

Reported-by: Li RongQing <lirongqing@baidu.com>
Tested-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Wang Li <wangli39@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoperf tests evsel-tp-sched: Fix bitwise operator
Gustavo A. R. Silva [Tue, 22 Jan 2019 23:34:39 +0000 (17:34 -0600)]
perf tests evsel-tp-sched: Fix bitwise operator

commit 489338a717a0dfbbd5a3fabccf172b78f0ac9015 upstream.

Notice that the use of the bitwise OR operator '|' always leads to true
in this particular case, which seems a bit suspicious due to the context
in which this expression is being used.

Fix this by using bitwise AND operator '&' instead.

This bug was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Fixes: 6a6cd11d4e57 ("perf test: Add test for the sched tracepoint format fields")
Link: http://lkml.kernel.org/r/20190122233439.GA5868@embeddedor
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoperf/core: Don't WARN() for impossible ring-buffer sizes
Mark Rutland [Thu, 10 Jan 2019 14:27:45 +0000 (14:27 +0000)]
perf/core: Don't WARN() for impossible ring-buffer sizes

commit 9dff0aa95a324e262ffb03f425d00e4751f3294e upstream.

The perf tool uses /proc/sys/kernel/perf_event_mlock_kb to determine how
large its ringbuffer mmap should be. This can be configured to arbitrary
values, which can be larger than the maximum possible allocation from
kmalloc.

When this is configured to a suitably large value (e.g. thanks to the
perf fuzzer), attempting to use perf record triggers a WARN_ON_ONCE() in
__alloc_pages_nodemask():

   WARNING: CPU: 2 PID: 5666 at mm/page_alloc.c:4511 __alloc_pages_nodemask+0x3f8/0xbc8

Let's avoid this by checking that the requested allocation is possible
before calling kzalloc.

Reported-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20190110142745.25495-1-mark.rutland@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()
Tony Luck [Fri, 1 Feb 2019 00:33:41 +0000 (16:33 -0800)]
x86/MCE: Initialize mce.bank in the case of a fatal error in mce_no_way_out()

commit d28af26faa0b1daf3c692603d46bc4687c16f19e upstream.

Internal injection testing crashed with a console log that said:

  mce: [Hardware Error]: CPU 7: Machine Check Exception: f Bank 0: bd80000000100134

This caused a lot of head scratching because the MCACOD (bits 15:0) of
that status is a signature from an L1 data cache error. But Linux says
that it found it in "Bank 0", which on this model CPU only reports L1
instruction cache errors.

The answer was that Linux doesn't initialize "m->bank" in the case that
it finds a fatal error in the mce_no_way_out() pre-scan of banks. If
this was a local machine check, then this partially initialized struct
mce is being passed to mce_panic().

Fix is simple: just initialize m->bank in the case of a fatal error.

Fixes: 40c36e2741d7 ("x86/mce: Fix incorrect "Machine check from unknown source" message")
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: x86-ml <x86@kernel.org>
Cc: stable@vger.kernel.org # v4.18 Note pre-v5.0 arch/x86/kernel/cpu/mce/core.c was called arch/x86/kernel/cpu/mcheck/mce.c
Link: https://lkml.kernel.org/r/20190201003341.10638-1-tony.luck@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoperf/x86/intel: Delay memory deallocation until x86_pmu_dead_cpu()
Peter Zijlstra [Wed, 19 Dec 2018 16:53:50 +0000 (17:53 +0100)]
perf/x86/intel: Delay memory deallocation until x86_pmu_dead_cpu()

commit 602cae04c4864bb3487dfe4c2126c8d9e7e1614a upstream.

intel_pmu_cpu_prepare() allocated memory for ->shared_regs among other
members of struct cpu_hw_events. This memory is released in
intel_pmu_cpu_dying() which is wrong. The counterpart of the
intel_pmu_cpu_prepare() callback is x86_pmu_dead_cpu().

Otherwise if the CPU fails on the UP path between CPUHP_PERF_X86_PREPARE
and CPUHP_AP_PERF_X86_STARTING then it won't release the memory but
allocate new memory on the next attempt to online the CPU (leaking the
old memory).
Also, if the CPU down path fails between CPUHP_AP_PERF_X86_STARTING and
CPUHP_PERF_X86_PREPARE then the CPU will go back online but never
allocate the memory that was released in x86_pmu_dying_cpu().

Make the memory allocation/free symmetrical in regard to the CPU hotplug
notifier by moving the deallocation to intel_pmu_cpu_dead().

This started in commit:

   a7e3ed1e47011 ("perf: Add support for supplementary event registers").

In principle the bug was introduced in v2.6.39 (!), but it will almost
certainly not backport cleanly across the big CPU hotplug rewrite between v4.7-v4.15...

[ bigeasy: Added patch description. ]
[ mingo: Added backporting guidance. ]

Reported-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With developer hat on
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> # With maintainer hat on
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: bp@alien8.de
Cc: hpa@zytor.com
Cc: jolsa@kernel.org
Cc: kan.liang@linux.intel.com
Cc: namhyung@kernel.org
Cc: <stable@vger.kernel.org>
Fixes: a7e3ed1e47011 ("perf: Add support for supplementary event registers").
Link: https://lkml.kernel.org/r/20181219165350.6s3jvyxbibpvlhtq@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoperf/x86/intel/uncore: Add Node ID mask
Kan Liang [Sun, 27 Jan 2019 14:53:14 +0000 (06:53 -0800)]
perf/x86/intel/uncore: Add Node ID mask

commit 9e63a7894fd302082cf3627fe90844421a6cbe7f upstream.

Some PCI uncore PMUs cannot be registered on an 8-socket system (HPE
Superdome Flex).

To understand which Socket the PCI uncore PMUs belongs to, perf retrieves
the local Node ID of the uncore device from CPUNODEID(0xC0) of the PCI
configuration space, and the mapping between Socket ID and Node ID from
GIDNIDMAP(0xD4). The Socket ID can be calculated accordingly.

The local Node ID is only available at bit 2:0, but current code doesn't
mask it. If a BIOS doesn't clear the rest of the bits, an incorrect Node ID
will be fetched.

Filter the Node ID by adding a mask.

Reported-by: Song Liu <songliubraving@fb.com>
Tested-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org> # v3.7+
Fixes: 7c94ee2e0917 ("perf/x86: Add Intel Nehalem and Sandy Bridge-EP uncore support")
Link: https://lkml.kernel.org/r/1548600794-33162-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM
Josh Poimboeuf [Wed, 30 Jan 2019 13:13:58 +0000 (07:13 -0600)]
cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM

commit b284909abad48b07d3071a9fc9b5692b3e64914b upstream.

With the following commit:

  73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS")

... the hotplug code attempted to detect when SMT was disabled by BIOS,
in which case it reported SMT as permanently disabled.  However, that
code broke a virt hotplug scenario, where the guest is booted with only
primary CPU threads, and a sibling is brought online later.

The problem is that there doesn't seem to be a way to reliably
distinguish between the HW "SMT disabled by BIOS" case and the virt
"sibling not yet brought online" case.  So the above-mentioned commit
was a bit misguided, as it permanently disabled SMT for both cases,
preventing future virt sibling hotplugs.

Going back and reviewing the original problems which were attempted to
be solved by that commit, when SMT was disabled in BIOS:

  1) /sys/devices/system/cpu/smt/control showed "on" instead of
     "notsupported"; and

  2) vmx_vm_init() was incorrectly showing the L1TF_MSG_SMT warning.

I'd propose that we instead consider #1 above to not actually be a
problem.  Because, at least in the virt case, it's possible that SMT
wasn't disabled by BIOS and a sibling thread could be brought online
later.  So it makes sense to just always default the smt control to "on"
to allow for that possibility (assuming cpuid indicates that the CPU
supports SMT).

The real problem is #2, which has a simple fix: change vmx_vm_init() to
query the actual current SMT state -- i.e., whether any siblings are
currently online -- instead of looking at the SMT "control" sysfs value.

So fix it by:

  a) reverting the original "fix" and its followup fix:

     73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS")
     bc2d8d262cba ("cpu/hotplug: Fix SMT supported evaluation")

     and

  b) changing vmx_vm_init() to query the actual current SMT state --
     instead of the sysfs control value -- to determine whether the L1TF
     warning is needed.  This also requires the 'sched_smt_present'
     variable to exported, instead of 'cpu_smt_control'.

Fixes: 73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS")
Reported-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Joe Mario <jmario@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/e3a85d585da28cc333ecbc1e78ee9216e6da9396.1548794349.git.jpoimboe@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoKVM: nVMX: unconditionally cancel preemption timer in free_nested (CVE-2019-7221)
Peter Shier [Thu, 11 Oct 2018 18:46:46 +0000 (11:46 -0700)]
KVM: nVMX: unconditionally cancel preemption timer in free_nested (CVE-2019-7221)

commit ecec76885bcfe3294685dc363fd1273df0d5d65f upstream.

Bugzilla: 1671904

There are multiple code paths where an hrtimer may have been started to
emulate an L1 VMX preemption timer that can result in a call to free_nested
without an intervening L2 exit where the hrtimer is normally
cancelled. Unconditionally cancel in free_nested to cover all cases.

Embargoed until Feb 7th 2019.

Signed-off-by: Peter Shier <pshier@google.com>
Reported-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reported-by: Felix Wilhelm <fwilhelm@google.com>
Cc: stable@kernel.org
Message-Id: <20181011184646.154065-1-pshier@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agokvm: fix kvm_ioctl_create_device() reference counting (CVE-2019-6974)
Jann Horn [Sat, 26 Jan 2019 00:54:33 +0000 (01:54 +0100)]
kvm: fix kvm_ioctl_create_device() reference counting (CVE-2019-6974)

commit cfa39381173d5f969daf43582c95ad679189cbc9 upstream.

kvm_ioctl_create_device() does the following:

1. creates a device that holds a reference to the VM object (with a borrowed
   reference, the VM's refcount has not been bumped yet)
2. initializes the device
3. transfers the reference to the device to the caller's file descriptor table
4. calls kvm_get_kvm() to turn the borrowed reference to the VM into a real
   reference

The ownership transfer in step 3 must not happen before the reference to the VM
becomes a proper, non-borrowed reference, which only happens in step 4.
After step 3, an attacker can close the file descriptor and drop the borrowed
reference, which can cause the refcount of the kvm object to drop to zero.

This means that we need to grab a reference for the device before
anon_inode_getfd(), otherwise the VM can disappear from under us.

Fixes: 852b6d57dc7f ("kvm: add device control API")
Cc: stable@kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoKVM: x86: work around leak of uninitialized stack contents (CVE-2019-7222)
Paolo Bonzini [Tue, 29 Jan 2019 17:41:16 +0000 (18:41 +0100)]
KVM: x86: work around leak of uninitialized stack contents (CVE-2019-7222)

commit 353c0956a618a07ba4bbe7ad00ff29fe70e8412a upstream.

Bugzilla: 1671930

Emulation of certain instructions (VMXON, VMCLEAR, VMPTRLD, VMWRITE with
memory operand, INVEPT, INVVPID) can incorrectly inject a page fault
when passed an operand that points to an MMIO address.  The page fault
will use uninitialized kernel stack memory as the CR2 and error code.

The right behavior would be to abort the VM with a KVM_EXIT_INTERNAL_ERROR
exit to userspace; however, it is not an easy fix, so for now just
ensure that the error code and CR2 are zero.

Embargoed until Feb 7th 2019.

Reported-by: Felix Wilhelm <fwilhelm@google.com>
Cc: stable@kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoscsi: aic94xx: fix module loading
James Bottomley [Thu, 31 Jan 2019 00:42:12 +0000 (16:42 -0800)]
scsi: aic94xx: fix module loading

commit 42caa0edabd6a0a392ec36a5f0943924e4954311 upstream.

The aic94xx driver is currently failing to load with errors like

sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:03.0/0000:02:00.3/0000:07:02.0/revision'

Because the PCI code had recently added a file named 'revision' to every
PCI device.  Fix this by renaming the aic94xx revision file to
aic_revision.  This is safe to do for us because as far as I can tell,
there's nothing in userspace relying on the current aic94xx revision file
so it can be renamed without breaking anything.

Fixes: 702ed3be1b1b (PCI: Create revision file in sysfs)
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoscsi: cxlflash: Prevent deadlock when adapter probe fails
Vaibhav Jain [Wed, 30 Jan 2019 12:26:51 +0000 (17:56 +0530)]
scsi: cxlflash: Prevent deadlock when adapter probe fails

commit bb61b843ffd46978d7ca5095453e572714934eeb upstream.

Presently when an error is encountered during probe of the cxlflash
adapter, a deadlock is seen with cpu thread stuck inside
cxlflash_remove(). Below is the trace of the deadlock as logged by
khungtaskd:

cxlflash 0006:00:00.0: cxlflash_probe: init_afu failed rc=-16
INFO: task kworker/80:1:890 blocked for more than 120 seconds.
       Not tainted 5.0.0-rc4-capi2-kexec+ #2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/80:1    D    0   890      2 0x00000808
Workqueue: events work_for_cpu_fn

Call Trace:
 0x4d72136320 (unreliable)
 __switch_to+0x2cc/0x460
 __schedule+0x2bc/0xac0
 schedule+0x40/0xb0
 cxlflash_remove+0xec/0x640 [cxlflash]
 cxlflash_probe+0x370/0x8f0 [cxlflash]
 local_pci_probe+0x6c/0x140
 work_for_cpu_fn+0x38/0x60
 process_one_work+0x260/0x530
 worker_thread+0x280/0x5d0
 kthread+0x1a8/0x1b0
 ret_from_kernel_thread+0x5c/0x80
INFO: task systemd-udevd:5160 blocked for more than 120 seconds.

The deadlock occurs as cxlflash_remove() is called from cxlflash_probe()
without setting 'cxlflash_cfg->state' to STATE_PROBED and the probe thread
starts to wait on 'cxlflash_cfg->reset_waitq'. Since the device was never
successfully probed the 'cxlflash_cfg->state' never changes from
STATE_PROBING hence the deadlock occurs.

We fix this deadlock by setting the variable 'cxlflash_cfg->state' to
STATE_PROBED in case an error occurs during cxlflash_probe() and just
before calling cxlflash_remove().

Cc: stable@vger.kernel.org
Fixes: c21e0bbfc485("cxlflash: Base support for IBM CXL Flash Adapter")
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoscsi: sd_zbc: Fix zone information messages
Damien Le Moal [Wed, 30 Jan 2019 06:54:58 +0000 (15:54 +0900)]
scsi: sd_zbc: Fix zone information messages

commit 88fc41c407158a7d2eaa4b2f7cfa289749d456c6 upstream.

Commit bf5054569653 ("block: Introduce blk_revalidate_disk_zones()")
inadvertently broke the message output of sd_zbc_print_zones() because the
zone information initialization of the scsi disk structure was moved to the
second scan run while sd_zbc_print_zones() is called on the first
scan. This leads to the following incorrect message to be printed for any
ZBC or ZAC zoned disks.

"...[sdX] 4294967295 zones of 0 logical blocks + 1 runt zone"

Fix this by initializing sdkp zone size and number of zones early on the
first scan. This does not impact the execution of
blk_revalidate_zones(). This functions is still called only once the block
device capacity is set on the second revalidate run on boot, or if the disk
zone configuration changed (i.e. the disk changed).

Fixes: bf5054569653 ("block: Introduce blk_revalidate_disk_zones()")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agostaging: speakup: fix tty-operation NULL derefs
Johan Hovold [Wed, 30 Jan 2019 09:49:34 +0000 (10:49 +0100)]
staging: speakup: fix tty-operation NULL derefs

commit a1960e0f1639cb1f7a3d94521760fc73091f6640 upstream.

The send_xchar() and tiocmset() tty operations are optional. Add the
missing sanity checks to prevent user-space triggerable NULL-pointer
dereferences.

Fixes: 6b9ad1c742bf ("staging: speakup: add send_xchar, tiocmset and input functionality for tty")
Cc: stable <stable@vger.kernel.org> # 4.13
Cc: Okash Khawaja <okash.khawaja@gmail.com>
Cc: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agousb: gadget: musb: fix short isoc packets with inventra dma
Paul Elder [Wed, 30 Jan 2019 14:13:21 +0000 (08:13 -0600)]
usb: gadget: musb: fix short isoc packets with inventra dma

commit c418fd6c01fbc5516a2cd1eaf1df1ec86869028a upstream.

Handling short packets (length < max packet size) in the Inventra DMA
engine in the MUSB driver causes the MUSB DMA controller to hang. An
example of a problem that is caused by this problem is when streaming
video out of a UVC gadget, only the first video frame is transferred.

For short packets (mode-0 or mode-1 DMA), MUSB_TXCSR_TXPKTRDY must be
set manually by the driver. This was previously done in musb_g_tx
(musb_gadget.c), but incorrectly (all csr flags were cleared, and only
MUSB_TXCSR_MODE and MUSB_TXCSR_TXPKTRDY were set). Fixing that problem
allows some requests to be transferred correctly, but multiple requests
were often put together in one USB packet, and caused problems if the
packet size was not a multiple of 4. Instead, set MUSB_TXCSR_TXPKTRDY
in dma_controller_irq (musbhsdma.c), just like host mode transfers.

This topic was originally tackled by Nicolas Boichat [0] [1] and is
discussed further at [2] as part of his GSoC project [3].

[0] https://groups.google.com/forum/?hl=en#!topic/beagleboard-gsoc/k8Azwfp75CU
[1] https://gitorious.org/beagleboard-usbsniffer/beagleboard-usbsniffer-kernel/commit/b0be3b6cc195ba732189b04f1d43ec843c3e54c9?p=beagleboard-usbsniffer:beagleboard-usbsniffer-kernel.git;a=patch;h=b0be3b6cc195ba732189b04f1d43ec843c3e54c9
[2] http://beagleboard-usbsniffer.blogspot.com/2010/07/musb-isochronous-transfers-fixed.html
[3] http://elinux.org/BeagleBoard/GSoC/USBSniffer

Fixes: 550a7375fe72 ("USB: Add MUSB and TUSB support")
Signed-off-by: Paul Elder <paul.elder@ideasonboard.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agousb: gadget: udc: net2272: Fix bitwise and boolean operations
Gustavo A. R. Silva [Tue, 22 Jan 2019 21:28:08 +0000 (15:28 -0600)]
usb: gadget: udc: net2272: Fix bitwise and boolean operations

commit 07c69f1148da7de3978686d3af9263325d9d60bd upstream.

(!x & y) strikes again.

Fix bitwise and boolean operations by enclosing the expression:

intcsr & (1 << NET2272_PCI_IRQ)

in parentheses, before applying the boolean operator '!'.

Notice that this code has been there since 2011. So, it would
be helpful if someone can double-check this.

This issue was detected with the help of Coccinelle.

Fixes: ceb80363b2ec ("USB: net2272: driver for PLX NET2272 USB device controller")
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agousb: dwc3: gadget: Handle 0 xfer length for OUT EP
Tejas Joglekar [Tue, 22 Jan 2019 07:56:51 +0000 (13:26 +0530)]
usb: dwc3: gadget: Handle 0 xfer length for OUT EP

commit 1e19cdc8060227b0802bda6bc0bd22b23679ba32 upstream.

For OUT endpoints, zero-length transfers require MaxPacketSize buffer as
per the DWC_usb3 programming guide 3.30a section 4.2.3.3.

This patch fixes this by explicitly checking zero length
transfer to correctly pad up to MaxPacketSize.

Fixes: c6267a51639b ("usb: dwc3: gadget: align transfers to wMaxPacketSize")
Cc: stable@vger.kernel.org
Signed-off-by: Tejas Joglekar <joglekar@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agousb: phy: am335x: fix race condition in _probe
Bin Liu [Wed, 16 Jan 2019 17:54:07 +0000 (11:54 -0600)]
usb: phy: am335x: fix race condition in _probe

commit a53469a68eb886e84dd8b69a1458a623d3591793 upstream.

power off the phy should be done before populate the phy. Otherwise,
am335x_init() could be called by the phy owner to power on the phy first,
then am335x_phy_probe() turns off the phy again without the caller knowing
it.

Fixes: 2fc711d76352 ("usb: phy: am335x: Enable USB remote wakeup using PHY wakeup")
Cc: stable@vger.kernel.org # v3.18+
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoarm64: dts: allwinner: a64: Fix USB OTG regulator
Jernej Skrabec [Wed, 9 Jan 2019 18:16:04 +0000 (19:16 +0100)]
arm64: dts: allwinner: a64: Fix USB OTG regulator

commit b1360dcfdaa1d55952e5ec8dd9d99f88965d7ac9 upstream.

Currently, AXP803 driver assumes that reg_drivevbus is input which is
wrong. Unfortunate consequence of that is that none of the USB ports
work on the board, even USB HOST port, because USB PHY driver probing
fails due to missing regulator.

Fix that by adding "x-powers,drive-vbus-en" property to AXP803 node.

Fixes: 14ff5d8f9151 ("arm64: dts: allwinner: a64: Orange Pi Win: Enable USB OTG socket")
Cc: stable@vger.kernel.org
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoirqchip/gic-v3-its: Plug allocation race for devices sharing a DevID
Marc Zyngier [Tue, 29 Jan 2019 10:02:33 +0000 (10:02 +0000)]
irqchip/gic-v3-its: Plug allocation race for devices sharing a DevID

commit 9791ec7df0e7b4d80706ccea8f24b6542f6059e9 upstream.

On systems or VMs where multiple devices share a single DevID
(because they sit behind a PCI bridge, or because the HW is
broken in funky ways), we reuse the save its_device structure
in order to reflect this.

It turns out that there is a distinct lack of locking when looking
up the its_device, and two device being probed concurrently can result
in double allocations. That's obviously not nice.

A solution for this is to have a per-ITS mutex that serializes device
allocation.

A similar issue exists on the freeing side, which can run concurrently
with the allocation. On top of now taking the appropriate lock, we
also make sure that a shared device is never freed, as we have no way
to currently track the life cycle of such object.

Reported-by: Zheng Xiang <zhengxiang9@huawei.com>
Tested-by: Zheng Xiang <zhengxiang9@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agofutex: Handle early deadlock return correctly
Thomas Gleixner [Tue, 29 Jan 2019 22:15:12 +0000 (23:15 +0100)]
futex: Handle early deadlock return correctly

commit 1a1fb985f2e2b85ec0d3dc2e519ee48389ec2434 upstream.

commit 56222b212e8e ("futex: Drop hb->lock before enqueueing on the
rtmutex") changed the locking rules in the futex code so that the hash
bucket lock is not longer held while the waiter is enqueued into the
rtmutex wait list. This made the lock and the unlock path symmetric, but
unfortunately the possible early exit from __rt_mutex_proxy_start() due to
a detected deadlock was not updated accordingly. That allows a concurrent
unlocker to observe inconsitent state which triggers the warning in the
unlock path.

futex_lock_pi()                         futex_unlock_pi()
  lock(hb->lock)
  queue(hb_waiter) lock(hb->lock)
  lock(rtmutex->wait_lock)
  unlock(hb->lock)
                                        // acquired hb->lock
                                        hb_waiter = futex_top_waiter()
                                        lock(rtmutex->wait_lock)
  __rt_mutex_proxy_start()
     ---> fail
          remove(rtmutex_waiter);
     ---> returns -EDEADLOCK
  unlock(rtmutex->wait_lock)
                                        // acquired wait_lock
                                        wake_futex_pi()
                                        rt_mutex_next_owner()
  --> returns NULL
                                          --> WARN

  lock(hb->lock)
  unqueue(hb_waiter)

The problem is caused by the remove(rtmutex_waiter) in the failure case of
__rt_mutex_proxy_start() as this lets the unlocker observe a waiter in the
hash bucket but no waiter on the rtmutex, i.e. inconsistent state.

The original commit handles this correctly for the other early return cases
(timeout, signal) by delaying the removal of the rtmutex waiter until the
returning task reacquired the hash bucket lock.

Treat the failure case of __rt_mutex_proxy_start() in the same way and let
the existing cleanup code handle the eventual handover of the rtmutex
gracefully. The regular rt_mutex_proxy_start() gains the rtmutex waiter
removal for the failure case, so that the other callsites are still
operating correctly.

Add proper comments to the code so all these details are fully documented.

Thanks to Peter for helping with the analysis and writing the really
valuable code comments.

Fixes: 56222b212e8e ("futex: Drop hb->lock before enqueueing on the rtmutex")
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Co-developed-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: Stefan Liebler <stli@linux.ibm.com>
Cc: Sebastian Sewior <bigeasy@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1901292311410.1950@nanos.tec.linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodmaengine: imx-dma: fix wrong callback invoke
Leonid Iziumtsev [Tue, 15 Jan 2019 17:15:23 +0000 (17:15 +0000)]
dmaengine: imx-dma: fix wrong callback invoke

commit 341198eda723c8c1cddbb006a89ad9e362502ea2 upstream.

Once the "ld_queue" list is not empty, next descriptor will migrate
into "ld_active" list. The "desc" variable will be overwritten
during that transition. And later the dmaengine_desc_get_callback_invoke()
will use it as an argument. As result we invoke wrong callback.

That behaviour was in place since:
commit fcaaba6c7136 ("dmaengine: imx-dma: fix callback path in tasklet").
But after commit 4cd13c21b207 ("softirq: Let ksoftirqd do its job")
things got worse, since possible delay between tasklet_schedule()
from DMA irq handler and actual tasklet function execution got bigger.
And that gave more time for new DMA request to be submitted and
to be put into "ld_queue" list.

It has been noticed that DMA issue is causing problems for "mxc-mmc"
driver. While stressing the system with heavy network traffic and
writing/reading to/from sd card simultaneously the timeout may happen:

10013000.sdhci: mxcmci_watchdog: read time out (status = 0x30004900)

That often lead to file system corruption.

Signed-off-by: Leonid Iziumtsev <leonid.iziumtsev@gmail.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodmaengine: bcm2835: Fix abort of transactions
Lukas Wunner [Wed, 23 Jan 2019 08:26:00 +0000 (09:26 +0100)]
dmaengine: bcm2835: Fix abort of transactions

commit 9e528c799d17a4ac37d788c81440b50377dd592d upstream.

There are multiple issues with bcm2835_dma_abort() (which is called on
termination of a transaction):

* The algorithm to abort the transaction first pauses the channel by
  clearing the ACTIVE flag in the CS register, then waits for the PAUSED
  flag to clear.  Page 49 of the spec documents the latter as follows:

  "Indicates if the DMA is currently paused and not transferring data.
   This will occur if the active bit has been cleared [...]"
   https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf

  So the function is entering an infinite loop because it is waiting for
  PAUSED to clear which is always set due to the function having cleared
  the ACTIVE flag.  The only thing that's saving it from itself is the
  upper bound of 10000 loop iterations.

  The code comment says that the intention is to "wait for any current
  AXI transfer to complete", so the author probably wanted to check the
  WAITING_FOR_OUTSTANDING_WRITES flag instead.  Amend the function
  accordingly.

* The CS register is only read at the beginning of the function.  It
  needs to be read again after pausing the channel and before checking
  for outstanding writes, otherwise writes which were issued between
  the register read at the beginning of the function and pausing the
  channel may not be waited for.

* The function seeks to abort the transfer by writing 0 to the NEXTCONBK
  register and setting the ABORT and ACTIVE flags.  Thereby, the 0 in
  NEXTCONBK is sought to be loaded into the CONBLK_AD register.  However
  experimentation has shown this approach to not work:  The CONBLK_AD
  register remains the same as before and the CS register contains
  0x00000030 (PAUSED | DREQ_STOPS_DMA).  In other words, the control
  block is not aborted but merely paused and it will be resumed once the
  next DMA transaction is started.  That is absolutely not the desired
  behavior.

  A simpler approach is to set the channel's RESET flag instead.  This
  reliably zeroes the NEXTCONBK as well as the CS register.  It requires
  less code and only a single MMIO write.  This is also what popular
  user space DMA drivers do, e.g.:
  https://github.com/metachris/RPIO/blob/master/source/c_pwm/pwm.c

  Note that the spec is contradictory whether the NEXTCONBK register
  is writeable at all.  On the one hand, page 41 claims:

  "The value loaded into the NEXTCONBK register can be overwritten so
  that the linked list of Control Block data structures can be
  dynamically altered. However it is only safe to do this when the DMA
  is paused."

  On the other hand, page 40 specifies:

  "Only three registers in each channel's register set are directly
  writeable (CS, CONBLK_AD and DEBUG). The other registers (TI,
  SOURCE_AD, DEST_AD, TXFR_LEN, STRIDE & NEXTCONBK), are automatically
  loaded from a Control Block data structure held in external memory."

Fixes: 96286b576690 ("dmaengine: Add support for BCM2835")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v3.14+
Cc: Frank Pavlic <f.pavlic@kunbus.de>
Cc: Martin Sperl <kernel@martin.sperl.org>
Cc: Florian Meier <florian.meier@koalo.de>
Cc: Clive Messer <clive.m.messer@gmail.com>
Cc: Matthias Reichl <hias@horus.com>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Acked-by: Florian Kauer <florian.kauer@koalo.de>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodmaengine: bcm2835: Fix interrupt race on RT
Lukas Wunner [Wed, 23 Jan 2019 08:26:00 +0000 (09:26 +0100)]
dmaengine: bcm2835: Fix interrupt race on RT

commit f7da7782aba92593f7b82f03d2409a1c5f4db91b upstream.

If IRQ handlers are threaded (either because CONFIG_PREEMPT_RT_BASE is
enabled or "threadirqs" was passed on the command line) and if system
load is sufficiently high that wakeup latency of IRQ threads degrades,
SPI DMA transactions on the BCM2835 occasionally break like this:

ks8851 spi0.0: SPI transfer timed out
bcm2835-dma 3f007000.dma: DMA transfer could not be terminated
ks8851 spi0.0 eth2: ks8851_rdfifo: spi_sync() failed

The root cause is an assumption made by the DMA driver which is
documented in a code comment in bcm2835_dma_terminate_all():

/*
 * Stop DMA activity: we assume the callback will not be called
 * after bcm_dma_abort() returns (even if it does, it will see
 * c->desc is NULL and exit.)
 */

That assumption falls apart if the IRQ handler bcm2835_dma_callback() is
threaded: A client may terminate a descriptor and issue a new one
before the IRQ handler had a chance to run. In fact the IRQ handler may
miss an *arbitrary* number of descriptors. The result is the following
race condition:

1. A descriptor finishes, its interrupt is deferred to the IRQ thread.
2. A client calls dma_terminate_async() which sets channel->desc = NULL.
3. The client issues a new descriptor. Because channel->desc is NULL,
   bcm2835_dma_issue_pending() immediately starts the descriptor.
4. Finally the IRQ thread runs and writes BCM2835_DMA_INT to the CS
   register to acknowledge the interrupt. This clears the ACTIVE flag,
   so the newly issued descriptor is paused in the middle of the
   transaction. Because channel->desc is not NULL, the IRQ thread
   finalizes the descriptor and tries to start the next one.

I see two possible solutions: The first is to call synchronize_irq()
in bcm2835_dma_issue_pending() to wait until the IRQ thread has
finished before issuing a new descriptor. The downside of this approach
is unnecessary latency if clients desire rapidly terminating and
re-issuing descriptors and don't have any use for an IRQ callback.
(The SPI TX DMA channel is a case in point.)

A better alternative is to make the IRQ thread recognize that it has
missed descriptors and avoid finalizing the newly issued descriptor.
So first of all, set the ACTIVE flag when acknowledging the interrupt.
This keeps a newly issued descriptor running.

If the descriptor was finished, the channel remains idle despite the
ACTIVE flag being set. However the ACTIVE flag can then no longer be
used to check whether the channel is idle, so instead check whether
the register containing the current control block address is zero
and finalize the current descriptor only if so.

That way, there is no impact on latency and throughput if the client
doesn't care for the interrupt: Only minimal additional overhead is
introduced for non-cyclic descriptors as one further MMIO read is
necessary per interrupt to check for idleness of the channel. Cyclic
descriptors are sped up slightly by removing one MMIO write per
interrupt.

Fixes: 96286b576690 ("dmaengine: Add support for BCM2835")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v3.14+
Cc: Frank Pavlic <f.pavlic@kunbus.de>
Cc: Martin Sperl <kernel@martin.sperl.org>
Cc: Florian Meier <florian.meier@koalo.de>
Cc: Clive Messer <clive.m.messer@gmail.com>
Cc: Matthias Reichl <hias@horus.com>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Acked-by: Florian Kauer <florian.kauer@koalo.de>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoHID: debug: fix the ring buffer implementation
Vladis Dronov [Tue, 29 Jan 2019 10:58:35 +0000 (11:58 +0100)]
HID: debug: fix the ring buffer implementation

commit 13054abbaa4f1fd4e6f3b4b63439ec033b4c8035 upstream.

Ring buffer implementation in hid_debug_event() and hid_debug_events_read()
is strange allowing lost or corrupted data. After commit 717adfdaf147
("HID: debug: check length before copy_to_user()") it is possible to enter
an infinite loop in hid_debug_events_read() by providing 0 as count, this
locks up a system. Fix this by rewriting the ring buffer implementation
with kfifo and simplify the code.

This fixes CVE-2019-3819.

v2: fix an execution logic and add a comment
v3: use __set_current_state() instead of set_current_state()

Link: https://bugzilla.redhat.com/show_bug.cgi?id=1669187
Cc: stable@vger.kernel.org # v4.18+
Fixes: cd667ce24796 ("HID: use debugfs for events/reports dumping")
Fixes: 717adfdaf147 ("HID: debug: check length before copy_to_user()")
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocuse: fix ioctl
Miklos Szeredi [Wed, 16 Jan 2019 09:27:59 +0000 (10:27 +0100)]
cuse: fix ioctl

commit 8a3177db59cd644fde05ba9efee29392dfdec8aa upstream.

cuse_process_init_reply() doesn't initialize fc->max_pages and thus all
cuse bases ioctls fail with ENOMEM.

Reported-by: Andreas Steinmetz <ast@domdv.de>
Fixes: 5da784cce430 ("fuse: add max_pages to init_out")
Cc: <stable@vger.kernel.org> # v4.20
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agofuse: handle zero sized retrieve correctly
Miklos Szeredi [Wed, 16 Jan 2019 09:27:59 +0000 (10:27 +0100)]
fuse: handle zero sized retrieve correctly

commit 97e1532ef81acb31c30f9e75bf00306c33a77812 upstream.

Dereferencing req->page_descs[0] will Oops if req->max_pages is zero.

Reported-by: syzbot+c1e36d30ee3416289cc0@syzkaller.appspotmail.com
Tested-by: syzbot+c1e36d30ee3416289cc0@syzkaller.appspotmail.com
Fixes: b2430d7567a3 ("fuse: add per-page descriptor <offset, length> to fuse_req")
Cc: <stable@vger.kernel.org> # v3.9
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agofuse: decrement NR_WRITEBACK_TEMP on the right page
Miklos Szeredi [Wed, 16 Jan 2019 09:27:59 +0000 (10:27 +0100)]
fuse: decrement NR_WRITEBACK_TEMP on the right page

commit a2ebba824106dabe79937a9f29a875f837e1b6d4 upstream.

NR_WRITEBACK_TEMP is accounted on the temporary page in the request, not
the page cache page.

Fixes: 8b284dc47291 ("fuse: writepages: handle same page rewrites")
Cc: <stable@vger.kernel.org> # v3.13
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agofuse: call pipe_buf_release() under pipe lock
Jann Horn [Sat, 12 Jan 2019 01:39:05 +0000 (02:39 +0100)]
fuse: call pipe_buf_release() under pipe lock

commit 9509941e9c534920ccc4771ae70bd6cbbe79df1c upstream.

Some of the pipe_buf_release() handlers seem to assume that the pipe is
locked - in particular, anon_pipe_buf_release() accesses pipe->tmp_page
without taking any extra locks. From a glance through the callers of
pipe_buf_release(), it looks like FUSE is the only one that calls
pipe_buf_release() without having the pipe locked.

This bug should only lead to a memory leak, nothing terrible.

Fixes: dd3bb14f44a6 ("fuse: support splice() writing to fuse device")
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoALSA: hda/realtek - Headset microphone support for System76 darp5
Jeremy Soller [Wed, 30 Jan 2019 23:12:31 +0000 (16:12 -0700)]
ALSA: hda/realtek - Headset microphone support for System76 darp5

commit 89e3a5682edaa4e5bb334719afb180256ac7bf78 upstream.

On the System76 Darter Pro (darp5), there is a headset microphone
input attached to 0x1a that does not have a jack detect.  In order to
get it working, the pin configuration needs to be set correctly, and
the ALC269_FIXUP_HEADSET_MODE_NO_HP_MIC fixup needs to be applied.
This is similar to the MIC_NO_PRESENCE fixups for some Dell laptops,
except we have a separate microphone jack that is already configured
correctly.

Signed-off-by: Jeremy Soller <jeremy@system76.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoALSA: hda/realtek - Use a common helper for hp pin reference
Takashi Iwai [Fri, 1 Feb 2019 10:19:50 +0000 (11:19 +0100)]
ALSA: hda/realtek - Use a common helper for hp pin reference

commit 35a39f98567d8d3f1cea48f0f30de1a7e736b644 upstream.

Replace the open-codes in many places with a new common helper for
performing the same thing: referring to the primary headphone pin.

This eventually fixes the potentially missing headphone pin on some
weird devices, too.

Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoALSA: hda/realtek - Fix lose hp_pins for disable auto mute
Kailang Yang [Fri, 1 Feb 2019 08:51:10 +0000 (16:51 +0800)]
ALSA: hda/realtek - Fix lose hp_pins for disable auto mute

commit d561aa0a70bb2e1dd85fde98b6a5561e4175ac3e upstream.

When auto_mute = no or spec->suppress_auto_mute = 1, cfg->hp_pins will
lose value.

Add this patch to find hp_pins value.
I add fixed for ALC282 ALC225 ALC256 ALC294 and alc_default_init()
alc_default_shutup().

Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoALSA: hda - Serialize codec registrations
Takashi Iwai [Wed, 30 Jan 2019 16:46:03 +0000 (17:46 +0100)]
ALSA: hda - Serialize codec registrations

commit 305a0ade180981686eec1f92aa6252a7c6ebb1cf upstream.

In the current code, the codec registration may happen both at the
codec bind time and the end of the controller probe time.  In a rare
occasion, they race with each other, leading to Oops due to the still
uninitialized card device.

This patch introduces a simple flag to prevent the codec registration
at the codec bind time as long as the controller probe is going on.
The controller probe invokes snd_card_register() that does the whole
registration task, and we don't need to register each piece
beforehand.

Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoALSA: usb-audio: Add support for new T+A USB DAC
Udo Eberhardt [Tue, 5 Feb 2019 16:20:47 +0000 (17:20 +0100)]
ALSA: usb-audio: Add support for new T+A USB DAC

commit 3bff2407fbd28fd55ad5b5cccd98fc0c9598f23b upstream.

This patch adds the T+A VID to the generic check in order to enable
native DSD support for T+A devices. This works with the new T+A USB
DAC model SD3100HV and will also work with future devices which
support the XMOS/Thesycon style DSD format.

Signed-off-by: Udo Eberhardt <udo.eberhardt@thesycon.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoALSA: compress: Fix stop handling on compressed capture streams
Charles Keepax [Tue, 5 Feb 2019 16:29:40 +0000 (16:29 +0000)]
ALSA: compress: Fix stop handling on compressed capture streams

commit 4f2ab5e1d13d6aa77c55f4914659784efd776eb4 upstream.

It is normal user behaviour to start, stop, then start a stream
again without closing it. Currently this works for compressed
playback streams but not capture ones.

The states on a compressed capture stream go directly from OPEN to
PREPARED, unlike a playback stream which moves to SETUP and waits
for a write of data before moving to PREPARED. Currently however,
when a stop is sent the state is set to SETUP for both types of
streams. This leaves a capture stream in the situation where a new
start can't be sent as that requires the state to be PREPARED and
a new set_params can't be sent as that requires the state to be
OPEN. The only option being to close the stream, and then reopen.

Correct this issues by allowing snd_compr_drain_notify to set the
state depending on the stream direction, as we already do in
set_params.

Fixes: 49bb6402f1aa ("ALSA: compress_core: Add support for capture streams")
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoxfs: eof trim writeback mapping as soon as it is cached
Brian Foster [Fri, 1 Feb 2019 17:36:36 +0000 (09:36 -0800)]
xfs: eof trim writeback mapping as soon as it is cached

commit aa6ee4ab69293969867ab09b57546d226ace3d7a upstream.

The cached writeback mapping is EOF trimmed to try and avoid races
between post-eof block management and writeback that result in
sending cached data to a stale location. The cached mapping is
currently trimmed on the validation check, which leaves a race
window between the time the mapping is cached and when it is trimmed
against the current inode size.

For example, if a new mapping is cached by delalloc conversion on a
blocksize == page size fs, we could cycle various locks, perform
memory allocations, etc.  in the writeback codepath before the
associated mapping is eventually trimmed to i_size. This leaves
enough time for a post-eof truncate and file append before the
cached mapping is trimmed. The former event essentially invalidates
a range of the cached mapping and the latter bumps the inode size
such the trim on the next writepage event won't trim all of the
invalid blocks. fstest generic/464 reproduces this scenario
occasionally and causes a lost writeback and stale delalloc blocks
warning on inode inactivation.

To work around this problem, trim the cached writeback mapping as
soon as it is cached in addition to on subsequent validation checks.
This is a minor tweak to tighten the race window as much as possible
until a proper invalidation mechanism is available.

Fixes: 40214d128e07 ("xfs: trim writepage mapping to within eof")
Cc: <stable@vger.kernel.org> # v4.14+
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet/mlx5e: Force CHECKSUM_UNNECESSARY for short ethernet frames
Cong Wang [Tue, 4 Dec 2018 06:14:04 +0000 (22:14 -0800)]
net/mlx5e: Force CHECKSUM_UNNECESSARY for short ethernet frames

[ Upstream commit e8c8b53ccaff568fef4c13a6ccaf08bf241aa01a ]

When an ethernet frame is padded to meet the minimum ethernet frame
size, the padding octets are not covered by the hardware checksum.
Fortunately the padding octets are usually zero's, which don't affect
checksum. However, we have a switch which pads non-zero octets, this
causes kernel hardware checksum fault repeatedly.

Prior to:
commit '88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE ...")'
skb checksum was forced to be CHECKSUM_NONE when padding is detected.
After it, we need to keep skb->csum updated, like what we do for RXFCS.
However, fixing up CHECKSUM_COMPLETE requires to verify and parse IP
headers, it is not worthy the effort as the packets are so small that
CHECKSUM_COMPLETE can't save anything.

Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"),
Cc: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Cc: Nikola Ciprich <nikola.ciprich@linuxbox.cz>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet/mlx5e: Use the inner headers to determine tc/pedit offload limitation on decap...
Guy Shattah [Mon, 28 Jan 2019 13:58:07 +0000 (13:58 +0000)]
net/mlx5e: Use the inner headers to determine tc/pedit offload limitation on decap flows

[ Upstream commit 1651925d403e077e3fc86f961905e27c6810e132 ]

In packets that need to be decaped the internal headers
have to be checked, not the external ones.

Fixes: bdd66ac0aeed ("net/mlx5e: Disallow TC offloading of unsupported match/action combinations")
Signed-off-by: Guy Shattah <sguy@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: dsa: b53: Fix for failure when irq is not defined in dt
Arun Parameswaran [Fri, 8 Feb 2019 00:01:18 +0000 (16:01 -0800)]
net: dsa: b53: Fix for failure when irq is not defined in dt

[ Upstream commit 39841cc1cbb69344539c98a1fa9d858ed124c7ba ]

Fixes the issues with non BCM58XX chips in the b53 driver
failing, when the irq is not specified in the device tree.

Removed the check for BCM58XX in b53_srab_prepare_irq(),
so the 'port->irq' will be set to '-EXIO' if the irq is not
specified in the device tree.

Fixes: 16994374a6fc ("net: dsa: b53: Make SRAB driver manage port interrupts")
Fixes: b2ddc48a81b5 ("net: dsa: b53: Do not fail when IRQ are not initialized")
Signed-off-by: Arun Parameswaran <arun.parameswaran@broadcom.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: cls_flower: Remove filter from mask before freeing it
Petr Machata [Mon, 4 Feb 2019 14:50:38 +0000 (14:50 +0000)]
net: cls_flower: Remove filter from mask before freeing it

[ Upstream commit c1f7e02979edd7a3a3e69fe04be60b1d650dc8a7 ]

In fl_change(), when adding a new rule (i.e. fold == NULL), a driver may
reject the new rule, for example due to resource exhaustion. By that
point, the new rule was already assigned a mask, and it was added to
that mask's hash table. The clean-up path that's invoked as a result of
the rejection however neglects to undo the hash table addition, and
proceeds to free the new rule, thus leaving a dangling pointer in the
hash table.

Fix by removing fnew from the mask's hash table before it is freed.

Fixes: 35cc3cefc4de ("net/sched: cls_flower: Reject duplicated rules also under skip_sw")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agobnxt_en: Disable interrupts when allocating CP rings or NQs.
Michael Chan [Thu, 31 Jan 2019 19:31:48 +0000 (14:31 -0500)]
bnxt_en: Disable interrupts when allocating CP rings or NQs.

[ Upstream commit 5e66e35aab335b83d9ffb220d8a3a13986a7a60e ]

When calling firmware to allocate a CP ring or NQ, an interrupt associated
with that ring may be generated immediately before the doorbell is even
setup after the firmware call returns.  When servicing the interrupt, the
driver may crash when trying to access the doorbell.

Fix it by disabling interrupt on that vector until the doorbell is
set up.

Fixes: 697197e5a173 ("bnxt_en: Re-structure doorbells.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agovirtio_net: Account for tx bytes and packets on sending xdp_frames
Toshiaki Makita [Thu, 31 Jan 2019 11:40:30 +0000 (20:40 +0900)]
virtio_net: Account for tx bytes and packets on sending xdp_frames

[ Upstream commit 546f28974d771b124fb0bf7b551b343888cf0419 ]

Previously virtnet_xdp_xmit() did not account for device tx counters,
which caused confusions.
To be consistent with SKBs, account them on freeing xdp_frames.

Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoskge: potential memory corruption in skge_get_regs()
Dan Carpenter [Fri, 1 Feb 2019 08:28:16 +0000 (11:28 +0300)]
skge: potential memory corruption in skge_get_regs()

[ Upstream commit 294c149a209c6196c2de85f512b52ef50f519949 ]

The "p" buffer is 0x4000 bytes long.  B3_RI_WTO_R1 is 0x190.  The value
of "regs->len" is in the 1-0x4000 range.  The bug here is that
"regs->len - B3_RI_WTO_R1" can be a negative value which would lead to
memory corruption and an abrupt crash.

Fixes: c3f8be961808 ("[PATCH] skge: expand ethtool debug register dump")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agosctp: walk the list of asoc safely
Greg Kroah-Hartman [Fri, 1 Feb 2019 14:15:22 +0000 (15:15 +0100)]
sctp: walk the list of asoc safely

[ Upstream commit ba59fb0273076637f0add4311faa990a5eec27c0 ]

In sctp_sendmesg(), when walking the list of endpoint associations, the
association can be dropped from the list, making the list corrupt.
Properly handle this by using list_for_each_entry_safe()

Fixes: 4910280503f3 ("sctp: add support for snd flag SCTP_SENDALL process in sendmsg")
Reported-by: Secunia Research <vuln@secunia.com>
Tested-by: Secunia Research <vuln@secunia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agosctp: check and update stream->out_curr when allocating stream_out
Xin Long [Sun, 3 Feb 2019 19:27:58 +0000 (03:27 +0800)]
sctp: check and update stream->out_curr when allocating stream_out

[ Upstream commit cfe4bd7a257f6d6f81d3458d8c9d9ec4957539e6 ]

Now when using stream reconfig to add out streams, stream->out
will get re-allocated, and all old streams' information will
be copied to the new ones and the old ones will be freed.

So without stream->out_curr updated, next time when trying to
send from stream->out_curr stream, a panic would be caused.

This patch is to check and update stream->out_curr when
allocating stream_out.

v1->v2:
  - define fa_index() to get elem index from stream->out_curr.
v2->v3:
  - repost with no change.

Fixes: 5bbbbe32a431 ("sctp: introduce stream scheduler foundations")
Reported-by: Ying Xu <yinxu@redhat.com>
Reported-by: syzbot+e33a3a138267ca119c7d@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agorxrpc: bad unlock balance in rxrpc_recvmsg
Eric Dumazet [Mon, 4 Feb 2019 16:36:06 +0000 (08:36 -0800)]
rxrpc: bad unlock balance in rxrpc_recvmsg

[ Upstream commit 6dce3c20ac429e7a651d728e375853370c796e8d ]

When either "goto wait_interrupted;" or "goto wait_error;"
paths are taken, socket lock has already been released.

This patch fixes following syzbot splat :

WARNING: bad unlock balance detected!
5.0.0-rc4+ #59 Not tainted
-------------------------------------
syz-executor223/8256 is trying to release lock (sk_lock-AF_RXRPC) at:
[<ffffffff86651353>] rxrpc_recvmsg+0x6d3/0x3099 net/rxrpc/recvmsg.c:598
but there are no more locks to release!

other info that might help us debug this:
1 lock held by syz-executor223/8256:
 #0: 00000000fa9ed0f4 (slock-AF_RXRPC){+...}, at: spin_lock_bh include/linux/spinlock.h:334 [inline]
 #0: 00000000fa9ed0f4 (slock-AF_RXRPC){+...}, at: release_sock+0x20/0x1c0 net/core/sock.c:2798

stack backtrace:
CPU: 1 PID: 8256 Comm: syz-executor223 Not tainted 5.0.0-rc4+ #59
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 print_unlock_imbalance_bug kernel/locking/lockdep.c:3391 [inline]
 print_unlock_imbalance_bug.cold+0x114/0x123 kernel/locking/lockdep.c:3368
 __lock_release kernel/locking/lockdep.c:3601 [inline]
 lock_release+0x67e/0xa00 kernel/locking/lockdep.c:3860
 sock_release_ownership include/net/sock.h:1471 [inline]
 release_sock+0x183/0x1c0 net/core/sock.c:2808
 rxrpc_recvmsg+0x6d3/0x3099 net/rxrpc/recvmsg.c:598
 sock_recvmsg_nosec net/socket.c:794 [inline]
 sock_recvmsg net/socket.c:801 [inline]
 sock_recvmsg+0xd0/0x110 net/socket.c:797
 __sys_recvfrom+0x1ff/0x350 net/socket.c:1845
 __do_sys_recvfrom net/socket.c:1863 [inline]
 __se_sys_recvfrom net/socket.c:1859 [inline]
 __x64_sys_recvfrom+0xe1/0x1a0 net/socket.c:1859
 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x446379
Code: e8 2c b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b 09 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fe5da89fd98 EFLAGS: 00000246 ORIG_RAX: 000000000000002d
RAX: ffffffffffffffda RBX: 00000000006dbc28 RCX: 0000000000446379
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 00000000006dbc20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dbc2c
R13: 0000000000000000 R14: 0000000000000000 R15: 20c49ba5e353f7cf

Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Howells <dhowells@redhat.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoRevert "net: phy: marvell: avoid pause mode on SGMII-to-Copper for 88e151x"
Russell King [Thu, 31 Jan 2019 16:59:46 +0000 (16:59 +0000)]
Revert "net: phy: marvell: avoid pause mode on SGMII-to-Copper for 88e151x"

[ Upstream commit c14f07c6211cc01d52ed92cce1fade5071b8d197 ]

This reverts commit 6623c0fba10ef45b64ca213ad5dec926f37fa9a0.

The original diagnosis was incorrect: it appears that the NIC had
PHY polling mode enabled, which meant that it overwrote the PHYs
advertisement register during negotiation.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Tested-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agords: fix refcount bug in rds_sock_addref
Eric Dumazet [Thu, 31 Jan 2019 16:47:10 +0000 (08:47 -0800)]
rds: fix refcount bug in rds_sock_addref

[ Upstream commit 6fa19f5637a6c22bc0999596bcc83bdcac8a4fa6 ]

syzbot was able to catch a bug in rds [1]

The issue here is that the socket might be found in a hash table
but that its refcount has already be set to 0 by another cpu.

We need to use refcount_inc_not_zero() to be safe here.

[1]

refcount_t: increment on 0; use-after-free.
WARNING: CPU: 1 PID: 23129 at lib/refcount.c:153 refcount_inc_checked lib/refcount.c:153 [inline]
WARNING: CPU: 1 PID: 23129 at lib/refcount.c:153 refcount_inc_checked+0x61/0x70 lib/refcount.c:151
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 23129 Comm: syz-executor3 Not tainted 5.0.0-rc4+ #53
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1db/0x2d0 lib/dump_stack.c:113
 panic+0x2cb/0x65c kernel/panic.c:214
 __warn.cold+0x20/0x48 kernel/panic.c:571
 report_bug+0x263/0x2b0 lib/bug.c:186
 fixup_bug arch/x86/kernel/traps.c:178 [inline]
 fixup_bug arch/x86/kernel/traps.c:173 [inline]
 do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271
 do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:290
 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973
RIP: 0010:refcount_inc_checked lib/refcount.c:153 [inline]
RIP: 0010:refcount_inc_checked+0x61/0x70 lib/refcount.c:151
Code: 1d 51 63 c8 06 31 ff 89 de e8 eb 1b f2 fd 84 db 75 dd e8 a2 1a f2 fd 48 c7 c7 60 9f 81 88 c6 05 31 63 c8 06 01 e8 af 65 bb fd <0f> 0b eb c1 90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 54 49
RSP: 0018:ffff8880a0cbf1e8 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc90006113000
RDX: 000000000001047d RSI: ffffffff81685776 RDI: 0000000000000005
RBP: ffff8880a0cbf1f8 R08: ffff888097c9e100 R09: ffffed1015ce5021
R10: ffffed1015ce5020 R11: ffff8880ae728107 R12: ffff8880723c20c0
R13: ffff8880723c24b0 R14: dffffc0000000000 R15: ffffed1014197e64
 sock_hold include/net/sock.h:647 [inline]
 rds_sock_addref+0x19/0x20 net/rds/af_rds.c:675
 rds_find_bound+0x97c/0x1080 net/rds/bind.c:82
 rds_recv_incoming+0x3be/0x1430 net/rds/recv.c:362
 rds_loop_xmit+0xf3/0x2a0 net/rds/loop.c:96
 rds_send_xmit+0x1355/0x2a10 net/rds/send.c:355
 rds_sendmsg+0x323c/0x44e0 net/rds/send.c:1368
 sock_sendmsg_nosec net/socket.c:621 [inline]
 sock_sendmsg+0xdd/0x130 net/socket.c:631
 __sys_sendto+0x387/0x5f0 net/socket.c:1788
 __do_sys_sendto net/socket.c:1800 [inline]
 __se_sys_sendto net/socket.c:1796 [inline]
 __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1796
 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x458089
Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fc266df8c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 0000000000458089
RDX: 0000000000000000 RSI: 00000000204b3fff RDI: 0000000000000005
RBP: 000000000073bf00 R08: 00000000202b4000 R09: 0000000000000010
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc266df96d4
R13: 00000000004c56e4 R14: 00000000004d94a8 R15: 00000000ffffffff

Fixes: cc4dfb7f70a3 ("rds: fix two RCU related problems")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Cc: rds-devel@oss.oracle.com
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: systemport: Fix WoL with password after deep sleep
Florian Fainelli [Fri, 1 Feb 2019 21:23:38 +0000 (13:23 -0800)]
net: systemport: Fix WoL with password after deep sleep

[ Upstream commit 8dfb8d2cceb76b74ad5b58cc65c75994329b4d5e ]

Broadcom STB chips support a deep sleep mode where all register
contents are lost. Because we were stashing the MagicPacket password
into some of these registers a suspend into that deep sleep then a
resumption would not lead to being able to wake-up from MagicPacket with
password again.

Fix this by keeping a software copy of the password and program it
during suspend.

Fixes: 83e82f4c706b ("net: systemport: add Wake-on-LAN support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: dsa: slave: Don't propagate flag changes on down slave interfaces
Rundong Ge [Sat, 2 Feb 2019 14:29:35 +0000 (14:29 +0000)]
net: dsa: slave: Don't propagate flag changes on down slave interfaces

[ Upstream commit 17ab4f61b8cd6f9c38e9d0b935d86d73b5d0d2b5 ]

The unbalance of master's promiscuity or allmulti will happen after ifdown
and ifup a slave interface which is in a bridge.

When we ifdown a slave interface , both the 'dsa_slave_close' and
'dsa_slave_change_rx_flags' will clear the master's flags. The flags
of master will be decrease twice.
In the other hand, if we ifup the slave interface again, since the
slave's flags were cleared the 'dsa_slave_open' won't set the master's
flag, only 'dsa_slave_change_rx_flags' that triggered by 'br_add_if'
will set the master's flags. The flags of master is increase once.

Only propagating flag changes when a slave interface is up makes
sure this does not happen. The 'vlan_dev_change_rx_flags' had the
same problem and was fixed, and changes here follows that fix.

Fixes: 91da11f870f0 ("net: Distributed Switch Architecture protocol support")
Signed-off-by: Rundong Ge <rdong.ge@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: dsa: mv88e6xxx: Fix counting of ATU violations
Andrew Lunn [Tue, 5 Feb 2019 23:02:58 +0000 (00:02 +0100)]
net: dsa: mv88e6xxx: Fix counting of ATU violations

[ Upstream commit 75c05a74e745ae7d663b04d75777af80ada2233c ]

The ATU port vector contains a bit per port of the switch. The code
wrongly used it as a port number, and incremented a port counter. This
resulted in the wrong interfaces counter being incremented, and
potentially going off the end of the array of ports.

Fix this by using the source port ID for the violation, which really
is a port number.

Reported-by: Chris Healy <Chris.Healy@zii.aero>
Tested-by: Chris Healy <Chris.Healy@zii.aero>
Fixes: 65f60e4582bd ("net: dsa: mv88e6xxx: Keep ATU/VTU violation statistics")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: dsa: Fix NULL checking in dsa_slave_set_eee()
Dan Carpenter [Wed, 6 Feb 2019 15:35:15 +0000 (18:35 +0300)]
net: dsa: Fix NULL checking in dsa_slave_set_eee()

[ Upstream commit 00670cb8a73b10b10d3c40f045c15411715e4465 ]

This function can't succeed if dp->pl is NULL.  It will Oops inside the
call to return phylink_ethtool_get_eee(dp->pl, e);

Fixes: 1be52e97ed3e ("dsa: slave: eee: Allow ports to use phylink")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: dsa: Fix lockdep false positive splat
Marc Zyngier [Sat, 2 Feb 2019 17:53:29 +0000 (17:53 +0000)]
net: dsa: Fix lockdep false positive splat

[ Upstream commit c8101f7729daee251f4f6505f9d135ec08e1342f ]

Creating a macvtap on a DSA-backed interface results in the following
splat when lockdep is enabled:

[   19.638080] IPv6: ADDRCONF(NETDEV_CHANGE): lan0: link becomes ready
[   23.041198] device lan0 entered promiscuous mode
[   23.043445] device eth0 entered promiscuous mode
[   23.049255]
[   23.049557] ============================================
[   23.055021] WARNING: possible recursive locking detected
[   23.060490] 5.0.0-rc3-00013-g56c857a1b8d3 #118 Not tainted
[   23.066132] --------------------------------------------
[   23.071598] ip/2861 is trying to acquire lock:
[   23.076171] 00000000f61990cb (_xmit_ETHER){+...}, at: dev_set_rx_mode+0x1c/0x38
[   23.083693]
[   23.083693] but task is already holding lock:
[   23.089696] 00000000ecf0c3b4 (_xmit_ETHER){+...}, at: dev_uc_add+0x24/0x70
[   23.096774]
[   23.096774] other info that might help us debug this:
[   23.103494]  Possible unsafe locking scenario:
[   23.103494]
[   23.109584]        CPU0
[   23.112093]        ----
[   23.114601]   lock(_xmit_ETHER);
[   23.117917]   lock(_xmit_ETHER);
[   23.121233]
[   23.121233]  *** DEADLOCK ***
[   23.121233]
[   23.127325]  May be due to missing lock nesting notation
[   23.127325]
[   23.134315] 2 locks held by ip/2861:
[   23.137987]  #0: 000000003b766c72 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x338/0x4e0
[   23.146231]  #1: 00000000ecf0c3b4 (_xmit_ETHER){+...}, at: dev_uc_add+0x24/0x70
[   23.153757]
[   23.153757] stack backtrace:
[   23.158243] CPU: 0 PID: 2861 Comm: ip Not tainted 5.0.0-rc3-00013-g56c857a1b8d3 #118
[   23.166212] Hardware name: Globalscale Marvell ESPRESSOBin Board (DT)
[   23.172843] Call trace:
[   23.175358]  dump_backtrace+0x0/0x188
[   23.179116]  show_stack+0x14/0x20
[   23.182524]  dump_stack+0xb4/0xec
[   23.185928]  __lock_acquire+0x123c/0x1860
[   23.190048]  lock_acquire+0xc8/0x248
[   23.193724]  _raw_spin_lock_bh+0x40/0x58
[   23.197755]  dev_set_rx_mode+0x1c/0x38
[   23.201607]  dev_set_promiscuity+0x3c/0x50
[   23.205820]  dsa_slave_change_rx_flags+0x5c/0x70
[   23.210567]  __dev_set_promiscuity+0x148/0x1e0
[   23.215136]  __dev_set_rx_mode+0x74/0x98
[   23.219167]  dev_uc_add+0x54/0x70
[   23.222575]  macvlan_open+0x170/0x1d0
[   23.226336]  __dev_open+0xe0/0x160
[   23.229830]  __dev_change_flags+0x16c/0x1b8
[   23.234132]  dev_change_flags+0x20/0x60
[   23.238074]  do_setlink+0x2d0/0xc50
[   23.241658]  __rtnl_newlink+0x5f8/0x6e8
[   23.245601]  rtnl_newlink+0x50/0x78
[   23.249184]  rtnetlink_rcv_msg+0x360/0x4e0
[   23.253397]  netlink_rcv_skb+0xe8/0x130
[   23.257338]  rtnetlink_rcv+0x14/0x20
[   23.261012]  netlink_unicast+0x190/0x210
[   23.265043]  netlink_sendmsg+0x288/0x350
[   23.269075]  sock_sendmsg+0x18/0x30
[   23.272659]  ___sys_sendmsg+0x29c/0x2c8
[   23.276602]  __sys_sendmsg+0x60/0xb8
[   23.280276]  __arm64_sys_sendmsg+0x1c/0x28
[   23.284488]  el0_svc_common+0xd8/0x138
[   23.288340]  el0_svc_handler+0x24/0x80
[   23.292192]  el0_svc+0x8/0xc

This looks fairly harmless (no actual deadlock occurs), and is
fixed in a similar way to c6894dec8ea9 ("bridge: fix lockdep
addr_list_lock false positive splat") by putting the addr_list_lock
in its own lockdep class.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: dp83640: expire old TX-skb
Sebastian Andrzej Siewior [Mon, 4 Feb 2019 10:20:29 +0000 (11:20 +0100)]
net: dp83640: expire old TX-skb

[ Upstream commit 53bc8d2af08654659abfadfd3e98eb9922ff787c ]

During sendmsg() a cloned skb is saved via dp83640_txtstamp() in
->tx_queue. After the NIC sends this packet, the PHY will reply with a
timestamp for that TX packet. If the cable is pulled at the right time I
don't see that packet. It might gets flushed as part of queue shutdown
on NIC's side.
Once the link is up again then after the next sendmsg() we enqueue
another skb in dp83640_txtstamp() and have two on the list. Then the PHY
will send a reply and decode_txts() attaches it to the first skb on the
list.
No crash occurs since refcounting works but we are one packet behind.
linuxptp/ptp4l usually closes the socket and opens a new one (in such a
timeout case) so those "stale" replies never get there. However it does
not resume normal operation anymore.

Purge old skbs in decode_txts().

Fixes: cb646e2b02b2 ("ptp: Added a clock driver for the National Semiconductor PHYTER.")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agolib/test_rhashtable: Make test_insert_dup() allocate its hash table dynamically
Bart Van Assche [Wed, 30 Jan 2019 18:42:30 +0000 (10:42 -0800)]
lib/test_rhashtable: Make test_insert_dup() allocate its hash table dynamically

[ Upstream commit fc42a689c4c097859e5bd37b5ea11b60dc426df6 ]

The test_insert_dup() function from lib/test_rhashtable.c passes a
pointer to a stack object to rhltable_init(). Allocate the hash table
dynamically to avoid that the following is reported with object
debugging enabled:

ODEBUG: object (ptrval) is on stack (ptrval), but NOT annotated.
WARNING: CPU: 0 PID: 1 at lib/debugobjects.c:368 __debug_object_init+0x312/0x480
Modules linked in:
EIP: __debug_object_init+0x312/0x480
Call Trace:
 ? debug_object_init+0x1a/0x20
 ? __init_work+0x16/0x30
 ? rhashtable_init+0x1e1/0x460
 ? sched_clock_cpu+0x57/0xe0
 ? rhltable_init+0xb/0x20
 ? test_insert_dup+0x32/0x20f
 ? trace_hardirqs_on+0x38/0xf0
 ? ida_dump+0x10/0x10
 ? jhash+0x130/0x130
 ? my_hashfn+0x30/0x30
 ? test_rht_init+0x6aa/0xab4
 ? ida_dump+0x10/0x10
 ? test_rhltable+0xc5c/0xc5c
 ? do_one_initcall+0x67/0x28e
 ? trace_hardirqs_off+0x22/0xe0
 ? restore_all_kernel+0xf/0x70
 ? trace_hardirqs_on_thunk+0xc/0x10
 ? restore_all_kernel+0xf/0x70
 ? kernel_init_freeable+0x142/0x213
 ? rest_init+0x230/0x230
 ? kernel_init+0x10/0x110
 ? schedule_tail_wrapper+0x9/0xc
 ? ret_from_fork+0x19/0x24

Cc: Thomas Graf <tgraf@suug.ch>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoenic: fix checksum validation for IPv6
Govindarajulu Varadarajan [Wed, 30 Jan 2019 14:59:00 +0000 (06:59 -0800)]
enic: fix checksum validation for IPv6

[ Upstream commit 7596175e99b3d4bce28022193efd954c201a782a ]

In case of IPv6 pkts, ipv4_csum_ok is 0. Because of this, driver does
not set skb->ip_summed. So IPv6 rx checksum is not offloaded.

Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodccp: fool proof ccid_hc_[rt]x_parse_options()
Eric Dumazet [Wed, 30 Jan 2019 19:39:41 +0000 (11:39 -0800)]
dccp: fool proof ccid_hc_[rt]x_parse_options()

[ Upstream commit 9b1f19d810e92d6cdc68455fbc22d9f961a58ce1 ]

Similarly to commit 276bdb82dedb ("dccp: check ccid before dereferencing")
it is wise to test for a NULL ccid.

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.0.0-rc3+ #37
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:ccid_hc_tx_parse_options net/dccp/ccid.h:205 [inline]
RIP: 0010:dccp_parse_options+0x8d9/0x12b0 net/dccp/options.c:233
Code: c5 0f b6 75 b3 80 38 00 0f 85 d6 08 00 00 48 b9 00 00 00 00 00 fc ff df 48 8b 45 b8 4c 8b b8 f8 07 00 00 4c 89 f8 48 c1 e8 03 <80> 3c 08 00 0f 85 95 08 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b
kobject: 'loop5' (0000000080f78fc1): kobject_uevent_env
RSP: 0018:ffff8880a94df0b8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8880858ac723 RCX: dffffc0000000000
RDX: 0000000000000100 RSI: 0000000000000007 RDI: 0000000000000001
RBP: ffff8880a94df140 R08: 0000000000000001 R09: ffff888061b83a80
R10: ffffed100c370752 R11: ffff888061b83a97 R12: 0000000000000026
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0defa33518 CR3: 000000008db5e000 CR4: 00000000001406e0
kobject: 'loop5' (0000000080f78fc1): fill_kobj_path: path = '/devices/virtual/block/loop5'
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 dccp_rcv_state_process+0x2b6/0x1af6 net/dccp/input.c:654
 dccp_v4_do_rcv+0x100/0x190 net/dccp/ipv4.c:688
 sk_backlog_rcv include/net/sock.h:936 [inline]
 __sk_receive_skb+0x3a9/0xea0 net/core/sock.c:473
 dccp_v4_rcv+0x10cb/0x1f80 net/dccp/ipv4.c:880
 ip_protocol_deliver_rcu+0xb6/0xa20 net/ipv4/ip_input.c:208
 ip_local_deliver_finish+0x23b/0x390 net/ipv4/ip_input.c:234
 NF_HOOK include/linux/netfilter.h:289 [inline]
 NF_HOOK include/linux/netfilter.h:283 [inline]
 ip_local_deliver+0x1f0/0x740 net/ipv4/ip_input.c:255
 dst_input include/net/dst.h:450 [inline]
 ip_rcv_finish+0x1f4/0x2f0 net/ipv4/ip_input.c:414
 NF_HOOK include/linux/netfilter.h:289 [inline]
 NF_HOOK include/linux/netfilter.h:283 [inline]
 ip_rcv+0xed/0x620 net/ipv4/ip_input.c:524
 __netif_receive_skb_one_core+0x160/0x210 net/core/dev.c:4973
 __netif_receive_skb+0x2c/0x1c0 net/core/dev.c:5083
 process_backlog+0x206/0x750 net/core/dev.c:5923
 napi_poll net/core/dev.c:6346 [inline]
 net_rx_action+0x76d/0x1930 net/core/dev.c:6412
 __do_softirq+0x30b/0xb11 kernel/softirq.c:292
 run_ksoftirqd kernel/softirq.c:654 [inline]
 run_ksoftirqd+0x8e/0x110 kernel/softirq.c:646
 smpboot_thread_fn+0x6ab/0xa10 kernel/smpboot.c:164
 kthread+0x357/0x430 kernel/kthread.c:246
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
Modules linked in:
---[ end trace 58a0ba03bea2c376 ]---
RIP: 0010:ccid_hc_tx_parse_options net/dccp/ccid.h:205 [inline]
RIP: 0010:dccp_parse_options+0x8d9/0x12b0 net/dccp/options.c:233
Code: c5 0f b6 75 b3 80 38 00 0f 85 d6 08 00 00 48 b9 00 00 00 00 00 fc ff df 48 8b 45 b8 4c 8b b8 f8 07 00 00 4c 89 f8 48 c1 e8 03 <80> 3c 08 00 0f 85 95 08 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b
RSP: 0018:ffff8880a94df0b8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8880858ac723 RCX: dffffc0000000000
RDX: 0000000000000100 RSI: 0000000000000007 RDI: 0000000000000001
RBP: ffff8880a94df140 R08: 0000000000000001 R09: ffff888061b83a80
R10: ffffed100c370752 R11: ffff888061b83a97 R12: 0000000000000026
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0defa33518 CR3: 0000000009871000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agothermal: hwmon: inline helpers when CONFIG_THERMAL_HWMON is not set
Eduardo Valentin [Wed, 2 Jan 2019 00:34:03 +0000 (00:34 +0000)]
thermal: hwmon: inline helpers when CONFIG_THERMAL_HWMON is not set

commit 03334ba8b425b2ad275c8f390cf83c7b081c3095 upstream.

Avoid warnings like this:
thermal_hwmon.h:29:1: warning: ‘thermal_remove_hwmon_sysfs’ defined but not used [-Wunused-function]
 thermal_remove_hwmon_sysfs(struct thermal_zone_device *tz)

Fixes: 0dd88793aacd ("thermal: hwmon: move hwmon support to single file")
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoxprtrdma: Don't wake pending tasks until disconnect is done
Chuck Lever [Wed, 19 Dec 2018 15:58:40 +0000 (10:58 -0500)]
xprtrdma: Don't wake pending tasks until disconnect is done

[ Upstream commit 0c0829bcf51aef713806e49b8ea2bac7962f54e2 ]

Transport disconnect processing does a "wake pending tasks" at
various points.

Suppose an RPC Reply is being processed. The RPC task that Reply
goes with is waiting on the pending queue. If a disconnect wake-up
happens before reply processing is done, that reply, even if it is
good, is thrown away, and the RPC has to be sent again.

This window apparently does not exist for socket transports because
there is a lock held while a reply is being received which prevents
the wake-up call until after reply processing is done.

To resolve this, all RPC replies being processed on an RPC-over-RDMA
transport have to complete before pending tasks are awoken due to a
transport disconnect.

Callers that already hold the transport write lock may invoke
->ops->close directly. Others use a generic helper that schedules
a close when the write lock can be taken safely.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoscripts/gdb: fix lx-version string output
Du Changbin [Thu, 3 Jan 2019 23:28:27 +0000 (15:28 -0800)]
scripts/gdb: fix lx-version string output

[ Upstream commit b058809bfc8faeb7b7cae047666e23375a060059 ]

A bug is present in GDB which causes early string termination when
parsing variables.  This has been reported [0], but we should ensure
that we can support at least basic printing of the core kernel strings.

For current gdb version (has been tested with 7.3 and 8.1), 'lx-version'
only prints one character.

  (gdb) lx-version
  L(gdb)

This can be fixed by casting 'linux_banner' as (char *).

  (gdb) lx-version
  Linux version 4.19.0-rc1+ (changbin@acer) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #21 SMP Sat Sep 1 21:43:30 CST 2018

[0] https://sourceware.org/bugzilla/show_bug.cgi?id=20077

[kbingham@kernel.org: add detail to commit message]
Link: http://lkml.kernel.org/r/20181111162035.8356-1-kieran.bingham@ideasonboard.com
Fixes: 2d061d999424 ("scripts/gdb: add version command")
Signed-off-by: Du Changbin <changbin.du@gmail.com>
Signed-off-by: Kieran Bingham <kbingham@kernel.org>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agokernel/kcov.c: mark write_comp_data() as notrace
Anders Roxell [Thu, 3 Jan 2019 23:28:24 +0000 (15:28 -0800)]
kernel/kcov.c: mark write_comp_data() as notrace

[ Upstream commit 634724431607f6f46c495dfef801a1c8b44a96d9 ]

Since __sanitizer_cov_trace_const_cmp4 is marked as notrace, the
function called from __sanitizer_cov_trace_const_cmp4 shouldn't be
traceable either.  ftrace_graph_caller() gets called every time func
write_comp_data() gets called if it isn't marked 'notrace'.  This is the
backtrace from gdb:

 #0  ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:179
 #1  0xffffff8010201920 in ftrace_caller () at ../arch/arm64/kernel/entry-ftrace.S:151
 #2  0xffffff8010439714 in write_comp_data (type=5, arg1=0, arg2=0, ip=18446743524224276596) at ../kernel/kcov.c:116
 #3  0xffffff8010439894 in __sanitizer_cov_trace_const_cmp4 (arg1=<optimized out>, arg2=<optimized out>) at ../kernel/kcov.c:188
 #4  0xffffff8010201874 in prepare_ftrace_return (self_addr=18446743524226602768, parent=0xffffff801014b918, frame_pointer=18446743524223531344) at ./include/generated/atomic-instrumented.h:27
 #5  0xffffff801020194c in ftrace_graph_caller () at ../arch/arm64/kernel/entry-ftrace.S:182

Rework so that write_comp_data() that are called from
__sanitizer_cov_trace_*_cmp*() are marked as 'notrace'.

Commit 903e8ff86753 ("kernel/kcov.c: mark funcs in __sanitizer_cov_trace_pc() as notrace")
missed to mark write_comp_data() as 'notrace'. When that patch was
created gcc-7 was used. In lib/Kconfig.debug
config KCOV_ENABLE_COMPARISONS
depends on $(cc-option,-fsanitize-coverage=trace-cmp)

That code path isn't hit with gcc-7. However, it were that with gcc-8.

Link: http://lkml.kernel.org/r/20181206143011.23719-1-anders.roxell@linaro.org
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Co-developed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoexec: load_script: don't blindly truncate shebang string
Oleg Nesterov [Thu, 3 Jan 2019 23:28:07 +0000 (15:28 -0800)]
exec: load_script: don't blindly truncate shebang string

[ Upstream commit 8099b047ecc431518b9bb6bdbba3549bbecdc343 ]

load_script() simply truncates bprm->buf and this is very wrong if the
length of shebang string exceeds BINPRM_BUF_SIZE-2.  This can silently
truncate i_arg or (worse) we can execute the wrong binary if buf[2:126]
happens to be the valid executable path.

Change load_script() to return ENOEXEC if it can't find '\n' or zero in
bprm->buf.  Note that '\0' can come from either
prepare_binprm()->memset() or from kernel_read(), we do not care.

Link: http://lkml.kernel.org/r/20181112160931.GA28463@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Ben Woodard <woodard@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agofs/epoll: drop ovflist branch prediction
Davidlohr Bueso [Thu, 3 Jan 2019 23:27:09 +0000 (15:27 -0800)]
fs/epoll: drop ovflist branch prediction

[ Upstream commit 76699a67f3041ff4c7af6d6ee9be2bfbf1ffb671 ]

The ep->ovflist is a secondary ready-list to temporarily store events
that might occur when doing sproc without holding the ep->wq.lock.  This
accounts for every time we check for ready events and also send events
back to userspace; both callbacks, particularly the latter because of
copy_to_user, can account for a non-trivial time.

As such, the unlikely() check to see if the pointer is being used, seems
both misleading and sub-optimal.  In fact, we go to an awful lot of
trouble to sync both lists, and populating the ovflist is far from an
uncommon scenario.

For example, profiling a concurrent epoll_wait(2) benchmark, with
CONFIG_PROFILE_ANNOTATED_BRANCHES shows that for a two threads a 33%
incorrect rate was seen; and when incrementally increasing the number of
epoll instances (which is used, for example for multiple queuing load
balancing models), up to a 90% incorrect rate was seen.

Similarly, by deleting the prediction, 3% throughput boost was seen
across incremental threads.

Link: http://lkml.kernel.org/r/20181108051006.18751-4-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jason Baron <jbaron@akamai.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agokernel/hung_task.c: force console verbose before panic
Liu, Chuansheng [Thu, 3 Jan 2019 23:26:27 +0000 (15:26 -0800)]
kernel/hung_task.c: force console verbose before panic

[ Upstream commit 168e06f7937d96c7222037d8a05565e8a6eb00fe ]

Based on commit 401c636a0eeb ("kernel/hung_task.c: show all hung tasks
before panic"), we could get the call stack of hung task.

However, if the console loglevel is not high, we still can not see the
useful panic information in practice, and in most cases users don't set
console loglevel to high level.

This patch is to force console verbose before system panic, so that the
real useful information can be seen in the console, instead of being
like the following, which doesn't have hung task information.

  INFO: task init:1 blocked for more than 120 seconds.
        Tainted: G     U  W         4.19.0-quilt-2e5dc0ac-g51b6c21d76cc #1
  "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  Kernel panic - not syncing: hung_task: blocked tasks
  CPU: 2 PID: 479 Comm: khungtaskd Tainted: G     U  W         4.19.0-quilt-2e5dc0ac-g51b6c21d76cc #1
  Call Trace:
   dump_stack+0x4f/0x65
   panic+0xde/0x231
   watchdog+0x290/0x410
   kthread+0x12c/0x150
   ret_from_fork+0x35/0x40
  reboot: panic mode set: p,w
  Kernel Offset: 0x34000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

Link: http://lkml.kernel.org/r/27240C0AC20F114CBF8149A2696CBE4A6015B675@SHSMSX101.ccr.corp.intel.com
Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoproc/sysctl: fix return error for proc_doulongvec_minmax()
Cheng Lin [Thu, 3 Jan 2019 23:26:13 +0000 (15:26 -0800)]
proc/sysctl: fix return error for proc_doulongvec_minmax()

[ Upstream commit 09be178400829dddc1189b50a7888495dd26aa84 ]

If the number of input parameters is less than the total parameters, an
EINVAL error will be returned.

For example, we use proc_doulongvec_minmax to pass up to two parameters
with kern_table:

{
.procname       = "monitor_signals",
.data           = &monitor_sigs,
.maxlen         = 2*sizeof(unsigned long),
.mode           = 0644,
.proc_handler   = proc_doulongvec_minmax,
},

Reproduce:

When passing two parameters, it's work normal.  But passing only one
parameter, an error "Invalid argument"(EINVAL) is returned.

  [root@cl150 ~]# echo 1 2 > /proc/sys/kernel/monitor_signals
  [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals
  1       2
  [root@cl150 ~]# echo 3 > /proc/sys/kernel/monitor_signals
  -bash: echo: write error: Invalid argument
  [root@cl150 ~]# echo $?
  1
  [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals
  3       2
  [root@cl150 ~]#

The following is the result after apply this patch.  No error is
returned when the number of input parameters is less than the total
parameters.

  [root@cl150 ~]# echo 1 2 > /proc/sys/kernel/monitor_signals
  [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals
  1       2
  [root@cl150 ~]# echo 3 > /proc/sys/kernel/monitor_signals
  [root@cl150 ~]# echo $?
  0
  [root@cl150 ~]# cat /proc/sys/kernel/monitor_signals
  3       2
  [root@cl150 ~]#

There are three processing functions dealing with digital parameters,
__do_proc_dointvec/__do_proc_douintvec/__do_proc_doulongvec_minmax.

This patch deals with __do_proc_doulongvec_minmax, just as
__do_proc_dointvec does, adding a check for parameters 'left'.  In
__do_proc_douintvec, its code implementation explicitly does not support
multiple inputs.

static int __do_proc_douintvec(...){
         ...
         /*
          * Arrays are not supported, keep this simple. *Do not* add
          * support for them.
          */
         if (vleft != 1) {
                 *lenp = 0;
                 return -EINVAL;
         }
         ...
}

So, just __do_proc_doulongvec_minmax has the problem.  And most use of
proc_doulongvec_minmax/proc_doulongvec_ms_jiffies_minmax just have one
parameter.

Link: http://lkml.kernel.org/r/1544081775-15720-1-git-send-email-cheng.lin130@zte.com.cn
Signed-off-by: Cheng Lin <cheng.lin130@zte.com.cn>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agokernel/hung_task.c: break RCU locks based on jiffies
Tetsuo Handa [Thu, 3 Jan 2019 23:26:31 +0000 (15:26 -0800)]
kernel/hung_task.c: break RCU locks based on jiffies

[ Upstream commit 304ae42739b108305f8d7b3eb3c1aec7c2b643a9 ]

check_hung_uninterruptible_tasks() is currently calling rcu_lock_break()
for every 1024 threads.  But check_hung_task() is very slow if printk()
was called, and is very fast otherwise.

If many threads within some 1024 threads called printk(), the RCU grace
period might be extended enough to trigger RCU stall warnings.
Therefore, calling rcu_lock_break() for every some fixed jiffies will be
safer.

Link: http://lkml.kernel.org/r/1544800658-11423-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Paul E. McKenney <paulmck@linux.ibm.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoarm64/sve: ptrace: Fix SVE_PT_REGS_OFFSET definition
Dave Martin [Fri, 4 Jan 2019 13:09:50 +0000 (13:09 +0000)]
arm64/sve: ptrace: Fix SVE_PT_REGS_OFFSET definition

[ Upstream commit ee1b465b303591d3a04d403122bbc0d7026520fb ]

SVE_PT_REGS_OFFSET is supposed to indicate the offset for skipping
over the ptrace NT_ARM_SVE header (struct user_sve_header) to the
start of the SVE register data proper.

However, currently SVE_PT_REGS_OFFSET is defined in terms of struct
sve_context, which is wrong: that structure describes the SVE
header in the signal frame, not in the ptrace regset.

This patch fixes the definition to use the ptrace header structure
struct user_sve_header instead.

By good fortune, the two structures are the same size anyway, so
there is no functional or ABI change.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoHID: lenovo: Add checks to fix of_led_classdev_register
Aditya Pakki [Mon, 24 Dec 2018 21:39:14 +0000 (15:39 -0600)]
HID: lenovo: Add checks to fix of_led_classdev_register

[ Upstream commit 6ae16dfb61bce538d48b7fe98160fada446056c5 ]

In lenovo_probe_tpkbd(), the function of_led_classdev_register() could
return an error value that is unchecked. The fix adds these checks.

Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agothermal: generic-adc: Fix adc to temp interpolation
Bjorn Andersson [Mon, 24 Dec 2018 07:26:44 +0000 (23:26 -0800)]
thermal: generic-adc: Fix adc to temp interpolation

[ Upstream commit 9d216211fded20fff301d0317af3238d8383634c ]

First correct the edge case to return the last element if we're
outside the range, rather than at the last element, so that
interpolation is not omitted for points between the two last entries in
the table.

Then correct the formula to perform linear interpolation based the two
points surrounding the read ADC value. The indices for temp are kept as
"hi" and "lo" to pair with the adc indices, but there's no requirement
that the temperature is provided in descendent order. mult_frac() is
used to prevent issues with overflowing the int.

Cc: Laxman Dewangan <ldewangan@nvidia.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoblock/swim3: Fix regression on PowerBook G3
Finn Thain [Mon, 31 Dec 2018 05:44:09 +0000 (16:44 +1100)]
block/swim3: Fix regression on PowerBook G3

[ Upstream commit 427c5ce4417cba0801fbf79c8525d1330704759c ]

As of v4.20, the swim3 driver crashes when loaded on a PowerBook G3
(Wallstreet).

MacIO PCI driver attached to Gatwick chipset
MacIO PCI driver attached to Heathrow chipset
swim3 0.00015000:floppy: [fd0] SWIM3 floppy controller in media bay
0.00013020:ch-a: ttyS0 at MMIO 0xf3013020 (irq = 16, base_baud = 230400) is a Z85c30 ESCC - Serial port
0.00013000:ch-b: ttyS1 at MMIO 0xf3013000 (irq = 17, base_baud = 230400) is a Z85c30 ESCC - Infrared port
macio: fixed media-bay irq on gatwick
macio: fixed left floppy irqs
swim3 1.00015000:floppy: [fd1] Couldn't request interrupt
Unable to handle kernel paging request for data at address 0x00000024
Faulting instruction address: 0xc02652f8
Oops: Kernel access of bad area, sig: 11 [#1]
BE SMP NR_CPUS=2 PowerMac
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.20.0 #2
NIP:  c02652f8 LR: c026915c CTR: c0276d1c
REGS: df43ba10 TRAP: 0300   Not tainted  (4.20.0)
MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 28228288  XER: 00000100
DAR: 00000024 DSISR: 40000000
GPR00: c026915c df43bac0 df439060 c0731524 df494700 00000000 c06e1c08 00000001
GPR08: 00000001 00000000 df5ff220 00001032 28228282 00000000 c0004ca4 00000000
GPR16: 00000000 00000000 00000000 c073144c dfffe064 c0731524 00000120 c0586108
GPR24: c073132c c073143c c073143c 00000000 c0731524 df67cd70 df494700 00000001
NIP [c02652f8] blk_mq_free_rqs+0x28/0xf8
LR [c026915c] blk_mq_sched_tags_teardown+0x58/0x84
Call Trace:
[df43bac0] [c0045f50] flush_workqueue_prep_pwqs+0x178/0x1c4 (unreliable)
[df43bae0] [c026915c] blk_mq_sched_tags_teardown+0x58/0x84
[df43bb00] [c02697f0] blk_mq_exit_sched+0x9c/0xb8
[df43bb20] [c0252794] elevator_exit+0x84/0xa4
[df43bb40] [c0256538] blk_exit_queue+0x30/0x50
[df43bb50] [c0256640] blk_cleanup_queue+0xe8/0x184
[df43bb70] [c034732c] swim3_attach+0x330/0x5f0
[df43bbb0] [c034fb24] macio_device_probe+0x58/0xec
[df43bbd0] [c032ba88] really_probe+0x1e4/0x2f4
[df43bc00] [c032bd28] driver_probe_device+0x64/0x204
[df43bc20] [c0329ac4] bus_for_each_drv+0x60/0xac
[df43bc50] [c032b824] __device_attach+0xe8/0x160
[df43bc80] [c032ab38] bus_probe_device+0xa0/0xbc
[df43bca0] [c0327338] device_add+0x3d8/0x630
[df43bcf0] [c0350848] macio_add_one_device+0x444/0x48c
[df43bd50] [c03509f8] macio_pci_add_devices+0x168/0x1bc
[df43bd90] [c03500ec] macio_pci_probe+0xc0/0x10c
[df43bda0] [c02ad884] pci_device_probe+0xd4/0x184
[df43bdd0] [c032ba88] really_probe+0x1e4/0x2f4
[df43be00] [c032bd28] driver_probe_device+0x64/0x204
[df43be20] [c032bfcc] __driver_attach+0x104/0x108
[df43be40] [c0329a00] bus_for_each_dev+0x64/0xb4
[df43be70] [c032add8] bus_add_driver+0x154/0x238
[df43be90] [c032ca24] driver_register+0x84/0x148
[df43bea0] [c0004aa0] do_one_initcall+0x40/0x188
[df43bf00] [c0690100] kernel_init_freeable+0x138/0x1d4
[df43bf30] [c0004cbc] kernel_init+0x18/0x10c
[df43bf40] [c00121e4] ret_from_kernel_thread+0x14/0x1c
Instruction dump:
5484d97e 4bfff4f4 9421ffe0 7c0802a6 bf410008 7c9e2378 90010024 8124005c
2f890000 419e0078 81230004 7c7c1b78 <812900242f890000 419e0064 81440000
---[ end trace 12025ab921a9784c ]---

Reverting commit 8ccb8cb1892b ("swim3: convert to blk-mq") resolves the
problem.

That commit added a struct blk_mq_tag_set to struct floppy_state and
initialized it with a blk_mq_init_sq_queue() call. Unfortunately, there
is a memset() in swim3_add_device() that subsequently clears the
floppy_state struct. That means fs->tag_set->ops is a NULL pointer, and
it gets dereferenced by blk_mq_free_rqs() which gets called in the
request_irq() error path. Move the memset() to fix this bug.

BTW, the request_irq() failure for the left mediabay floppy (fd1) is not
a regression. I don't know why it happens. The right media bay floppy
(fd0) works fine however.

Reported-and-tested-by: Stan Johnson <userm57@yahoo.com>
Fixes: 8ccb8cb1892b ("swim3: convert to blk-mq")
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoPCI: imx: Enable MSI from downstream components
Richard Zhu [Fri, 21 Dec 2018 04:33:38 +0000 (04:33 +0000)]
PCI: imx: Enable MSI from downstream components

[ Upstream commit 75cb8d20c112aba70f23d98e3f8d0a38ace16006 ]

The MSI Enable bit in the MSI Capability (PCIe r4.0, sec 7.7.1.2) controls
whether a Function can request service using MSI.

i.MX6 Root Ports implement the MSI Capability and may use MSI to request
service for events like PME, hotplug, AER, etc.  In addition, on i.MX6, the
MSI Enable bit controls delivery of MSI interrupts from components below
the Root Port.

Prior to f3fdfc4ac3a2 ("PCI: Remove host driver Kconfig selection of
CONFIG_PCIEPORTBUS"), enabling CONFIG_PCI_IMX6 automatically also enabled
CONFIG_PCIEPORTBUS, and when portdrv claimed the Root Ports, it set the MSI
Enable bit so it could use PME, hotplug, AER, etc.  As a side effect, that
also enabled delivery of MSI interrupts from downstream components.

The imx6q-pcie driver itself does not depend on portdrv, so set MSI Enable
in imx6q-pcie so MSI from downstream components works even if nobody uses
MSI for the Root Port events.

Fixes: f3fdfc4ac3a2 ("PCI: Remove host driver Kconfig selection of CONFIG_PCIEPORTBUS")
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Sven Van Asbroeck <TheSven73@googlemail.com>
Tested-by: Trent Piepho <tpiepho@impinj.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agothermal: tsens: qcom: do not create duplicate regmap debugfs entries
Srinivas Kandagatla [Thu, 15 Nov 2018 17:43:30 +0000 (17:43 +0000)]
thermal: tsens: qcom: do not create duplicate regmap debugfs entries

[ Upstream commit 4ab248b3b10a58e379e2d32333fff99ea5ca256c ]

Regmap would use device name to create debugfs entries. If the device
has multiple regmaps it is recommended to use name field in regmap_config.
Fix this by providing name to the regmap configs correctly.

Without this patch we would see below error on DB820c.

qcom-tsens 4a9000.thermal-sensor: Failed to create 4a9000.thermal-sensor
debugfs directory

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Acked-by: Amit Kucheria <amit.kucheria@linaro.org>
Tested-by: Amit Kucheria <amit.kucheria@linaro.org>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agokdb: Don't back trace on a cpu that didn't round up
Douglas Anderson [Wed, 5 Dec 2018 03:38:28 +0000 (19:38 -0800)]
kdb: Don't back trace on a cpu that didn't round up

[ Upstream commit 162bc7f5afd75b72acbe3c5f3488ef7e64a3fe36 ]

If you have a CPU that fails to round up and then run 'btc' you'll end
up crashing in kdb becaue we dereferenced NULL.  Let's add a check.
It's wise to also set the task to NULL when leaving the debugger so
that if we fail to round up on a later entry into the debugger we
won't backtrace a stale task.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agothermal: bcm2835: enable hwmon explicitly
Matthias Brugger [Sun, 21 Oct 2018 21:58:48 +0000 (23:58 +0200)]
thermal: bcm2835: enable hwmon explicitly

[ Upstream commit d56c19d07e0bc3ceff366a49b7d7a2440c967b1b ]

By defaul of-based thermal driver do not enable hwmon.
This patch does this explicitly, so that the temperature can be read
through the common hwmon sysfs.

Signed-off-by: Matthias Brugger <mbrugger@suse.com>
Acked-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoblock/swim3: Fix -EBUSY error when re-opening device after unmount
Finn Thain [Mon, 31 Dec 2018 05:44:09 +0000 (16:44 +1100)]
block/swim3: Fix -EBUSY error when re-opening device after unmount

[ Upstream commit 296dcc40f2f2e402facf7cd26cf3f2c8f4b17d47 ]

When the block device is opened with FMODE_EXCL, ref_count is set to -1.
This value doesn't get reset when the device is closed which means the
device cannot be opened again. Fix this by checking for refcount <= 0
in the release method.

Reported-and-tested-by: Stan Johnson <userm57@yahoo.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agofsl/fman: Use GFP_ATOMIC in {memac,tgec}_add_hash_mac_address()
Scott Wood [Fri, 28 Dec 2018 00:29:09 +0000 (18:29 -0600)]
fsl/fman: Use GFP_ATOMIC in {memac,tgec}_add_hash_mac_address()

[ Upstream commit 0d9c9a238faf925823bde866182c663b6d734f2e ]

These functions are called from atomic context:

[    9.150239] BUG: sleeping function called from invalid context at /home/scott/git/linux/mm/slab.h:421
[    9.158159] in_atomic(): 1, irqs_disabled(): 0, pid: 4432, name: ip
[    9.163128] CPU: 8 PID: 4432 Comm: ip Not tainted 4.20.0-rc2-00169-g63d86876f324 #29
[    9.163130] Call Trace:
[    9.170701] [c0000002e899a980] [c0000000009c1068] .dump_stack+0xa8/0xec (unreliable)
[    9.177140] [c0000002e899aa10] [c00000000007a7b4] .___might_sleep+0x138/0x164
[    9.184440] [c0000002e899aa80] [c0000000001d5bac] .kmem_cache_alloc_trace+0x238/0x30c
[    9.191216] [c0000002e899ab40] [c00000000065ea1c] .memac_add_hash_mac_address+0x104/0x198
[    9.199464] [c0000002e899abd0] [c00000000065a788] .set_multi+0x1c8/0x218
[    9.206242] [c0000002e899ac80] [c0000000006615ec] .dpaa_set_rx_mode+0xdc/0x17c
[    9.213544] [c0000002e899ad00] [c00000000083d2b0] .__dev_set_rx_mode+0x80/0xd4
[    9.219535] [c0000002e899ad90] [c00000000083d334] .dev_set_rx_mode+0x30/0x54
[    9.225271] [c0000002e899ae10] [c00000000083d4a0] .__dev_open+0x148/0x1c8
[    9.230751] [c0000002e899aeb0] [c00000000083d934] .__dev_change_flags+0x19c/0x1e0
[    9.230755] [c0000002e899af60] [c00000000083d9a4] .dev_change_flags+0x2c/0x80
[    9.242752] [c0000002e899aff0] [c0000000008554ec] .do_setlink+0x350/0xf08
[    9.248228] [c0000002e899b170] [c000000000857ad0] .rtnl_newlink+0x588/0x7e0
[    9.253965] [c0000002e899b740] [c000000000852424] .rtnetlink_rcv_msg+0x3e0/0x498
[    9.261440] [c0000002e899b820] [c000000000884790] .netlink_rcv_skb+0x134/0x14c
[    9.267607] [c0000002e899b8e0] [c000000000851840] .rtnetlink_rcv+0x18/0x2c
[    9.274558] [c0000002e899b950] [c000000000883c8c] .netlink_unicast+0x214/0x318
[    9.281163] [c0000002e899ba00] [c000000000884220] .netlink_sendmsg+0x348/0x444
[    9.287076] [c0000002e899bae0] [c00000000080d13c] .sock_sendmsg+0x2c/0x54
[    9.287080] [c0000002e899bb50] [c0000000008106c0] .___sys_sendmsg+0x2d0/0x2d8
[    9.298375] [c0000002e899bd30] [c000000000811a80] .__sys_sendmsg+0x5c/0xb0
[    9.303939] [c0000002e899be20] [c0000000000006b0] system_call+0x60/0x6c

Signed-off-by: Scott Wood <oss@buserror.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agogdrom: fix a memory leak bug
Wenwen Wang [Thu, 27 Dec 2018 02:15:13 +0000 (20:15 -0600)]
gdrom: fix a memory leak bug

[ Upstream commit 093c48213ee37c3c3ff1cf5ac1aa2a9d8bc66017 ]

In probe_gdrom(), the buffer pointed by 'gd.cd_info' is allocated through
kzalloc() and is used to hold the information of the gdrom device. To
register and unregister the device, the pointer 'gd.cd_info' is passed to
the functions register_cdrom() and unregister_cdrom(), respectively.
However, this buffer is not freed after it is used, which can cause a
memory leak bug.

This patch simply frees the buffer 'gd.cd_info' in exit_gdrom() to fix the
above issue.

Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoisdn: hisax: hfc_pci: Fix a possible concurrency use-after-free bug in HFCPCI_l1hw()
Jia-Ju Bai [Wed, 26 Dec 2018 14:09:34 +0000 (22:09 +0800)]
isdn: hisax: hfc_pci: Fix a possible concurrency use-after-free bug in HFCPCI_l1hw()

[ Upstream commit 7418e6520f22a2e35815122fa5a53d5bbfa2c10f ]

In drivers/isdn/hisax/hfc_pci.c, the functions hfcpci_interrupt() and
HFCPCI_l1hw() may be concurrently executed.

HFCPCI_l1hw()
  line 1173: if (!cs->tx_skb)

hfcpci_interrupt()
  line 942: spin_lock_irqsave();
  line 1066: dev_kfree_skb_irq(cs->tx_skb);

Thus, a possible concurrency use-after-free bug may occur
in HFCPCI_l1hw().

To fix these bugs, the calls to spin_lock_irqsave() and
spin_unlock_irqrestore() are added in HFCPCI_l1hw(), to protect the
access to cs->tx_skb.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agozram: fix lockdep warning of free block handling
Minchan Kim [Fri, 28 Dec 2018 08:36:33 +0000 (00:36 -0800)]
zram: fix lockdep warning of free block handling

[ Upstream commit 3c9959e025472122a61faebb208525cf26b305d1 ]

Patch series "zram idle page writeback", v3.

Inherently, swap device has many idle pages which are rare touched since
it was allocated.  It is never problem if we use storage device as swap.
However, it's just waste for zram-swap.

This patchset supports zram idle page writeback feature.

* Admin can define what is idle page "no access since X time ago"
* Admin can define when zram should writeback them
* Admin can define when zram should stop writeback to prevent wearout

Details are in each patch's description.

This patch (of 7):

  ================================
  WARNING: inconsistent lock state
  4.19.0+ #390 Not tainted
  --------------------------------
  inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
  zram_verify/2095 [HC0[0]:SC1[1]:HE1:SE0] takes:
  00000000b1828693 (&(&zram->bitmap_lock)->rlock){+.?.}, at: put_entry_bdev+0x1e/0x50
  {SOFTIRQ-ON-W} state was registered at:
    _raw_spin_lock+0x2c/0x40
    zram_make_request+0x755/0xdc9
    generic_make_request+0x373/0x6a0
    submit_bio+0x6c/0x140
    __swap_writepage+0x3a8/0x480
    shrink_page_list+0x1102/0x1a60
    shrink_inactive_list+0x21b/0x3f0
    shrink_node_memcg.constprop.99+0x4f8/0x7e0
    shrink_node+0x7d/0x2f0
    do_try_to_free_pages+0xe0/0x300
    try_to_free_pages+0x116/0x2b0
    __alloc_pages_slowpath+0x3f4/0xf80
    __alloc_pages_nodemask+0x2a2/0x2f0
    __handle_mm_fault+0x42e/0xb50
    handle_mm_fault+0x55/0xb0
    __do_page_fault+0x235/0x4b0
    page_fault+0x1e/0x30
  irq event stamp: 228412
  hardirqs last  enabled at (228412): [<ffffffff98245846>] __slab_free+0x3e6/0x600
  hardirqs last disabled at (228411): [<ffffffff98245625>] __slab_free+0x1c5/0x600
  softirqs last  enabled at (228396): [<ffffffff98e0031e>] __do_softirq+0x31e/0x427
  softirqs last disabled at (228403): [<ffffffff98072051>] irq_exit+0xd1/0xe0

  other info that might help us debug this:
   Possible unsafe locking scenario:

         CPU0
         ----
    lock(&(&zram->bitmap_lock)->rlock);
    <Interrupt>
      lock(&(&zram->bitmap_lock)->rlock);

   *** DEADLOCK ***

  no locks held by zram_verify/2095.

  stack backtrace:
  CPU: 5 PID: 2095 Comm: zram_verify Not tainted 4.19.0+ #390
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
  Call Trace:
   <IRQ>
   dump_stack+0x67/0x9b
   print_usage_bug+0x1bd/0x1d3
   mark_lock+0x4aa/0x540
   __lock_acquire+0x51d/0x1300
   lock_acquire+0x90/0x180
   _raw_spin_lock+0x2c/0x40
   put_entry_bdev+0x1e/0x50
   zram_free_page+0xf6/0x110
   zram_slot_free_notify+0x42/0xa0
   end_swap_bio_read+0x5b/0x170
   blk_update_request+0x8f/0x340
   scsi_end_request+0x2c/0x1e0
   scsi_io_completion+0x98/0x650
   blk_done_softirq+0x9e/0xd0
   __do_softirq+0xcc/0x427
   irq_exit+0xd1/0xe0
   do_IRQ+0x93/0x120
   common_interrupt+0xf/0xf
   </IRQ>

With writeback feature, zram_slot_free_notify could be called in softirq
context by end_swap_bio_read.  However, bitmap_lock is not aware of that
so lockdep yell out:

  get_entry_bdev
  spin_lock(bitmap->lock);
  irq
  softirq
  end_swap_bio_read
  zram_slot_free_notify
  zram_slot_lock <-- deadlock prone
  zram_free_page
  put_entry_bdev
  spin_lock(bitmap->lock); <-- deadlock prone

With akpm's suggestion (i.e.  bitmap operation is already atomic), we
could remove bitmap lock.  It might fail to find a empty slot if serious
contention happens.  However, it's not severe problem because huge page
writeback has already possiblity to fail if there is severe memory
pressure.  Worst case is just keeping the incompressible in memory, not
storage.

The other problem is zram_slot_lock in zram_slot_slot_free_notify.  To
make it safe is this patch introduces zram_slot_trylock where
zram_slot_free_notify uses it.  Although it's rare to be contented, this
patch adds new debug stat "miss_free" to keep monitoring how often it
happens.

Link: http://lkml.kernel.org/r/20181127055429.251614-2-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Joey Pabalinas <joeypabalinas@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agomm/page_alloc.c: don't call kasan_free_pages() at deferred mem init
Waiman Long [Fri, 28 Dec 2018 08:38:51 +0000 (00:38 -0800)]
mm/page_alloc.c: don't call kasan_free_pages() at deferred mem init

[ Upstream commit 3c0c12cc8f00ca5f81acb010023b8eb13e9a7004 ]

When CONFIG_KASAN is enabled on large memory SMP systems, the deferrred
pages initialization can take a long time.  Below were the reported init
times on a 8-socket 96-core 4TB IvyBridge system.

  1) Non-debug kernel without CONFIG_KASAN
     [    8.764222] node 1 initialised, 132086516 pages in 7027ms

  2) Debug kernel with CONFIG_KASAN
     [  146.288115] node 1 initialised, 132075466 pages in 143052ms

So the page init time in a debug kernel was 20X of the non-debug kernel.
The long init time can be problematic as the page initialization is done
with interrupt disabled.  In this particular case, it caused the
appearance of following warning messages as well as NMI backtraces of all
the cores that were doing the initialization.

[   68.240049] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[   68.241000] rcu:  25-...0: (100 ticks this GP) idle=b72/1/0x4000000000000000 softirq=915/915 fqs=16252
[   68.241000] rcu:  44-...0: (95 ticks this GP) idle=49a/1/0x4000000000000000 softirq=788/788 fqs=16253
[   68.241000] rcu:  54-...0: (104 ticks this GP) idle=03a/1/0x4000000000000000 softirq=721/825 fqs=16253
[   68.241000] rcu:  60-...0: (103 ticks this GP) idle=cbe/1/0x4000000000000000 softirq=637/740 fqs=16253
[   68.241000] rcu:  72-...0: (105 ticks this GP) idle=786/1/0x4000000000000000 softirq=536/641 fqs=16253
[   68.241000] rcu:  84-...0: (99 ticks this GP) idle=292/1/0x4000000000000000 softirq=537/537 fqs=16253
[   68.241000] rcu:  111-...0: (104 ticks this GP) idle=bde/1/0x4000000000000000 softirq=474/476 fqs=16253
[   68.241000] rcu:  (detected by 13, t=65018 jiffies, g=249, q=2)

The long init time was mainly caused by the call to kasan_free_pages() to
poison the newly initialized pages.  On a 4TB system, we are talking about
almost 500GB of memory probably on the same node.

In reality, we may not need to poison the newly initialized pages before
they are ever allocated.  So KASAN poisoning of freed pages before the
completion of deferred memory initialization is now disabled.  Those pages
will be properly poisoned when they are allocated or freed after deferred
pages initialization is done.

With this change, the new page initialization time became:

[   21.948010] node 1 initialised, 132075466 pages in 18702ms

This was still about double the non-debug kernel time, but was much
better than before.

Link: http://lkml.kernel.org/r/1544459388-8736-1-git-send-email-longman@redhat.com
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoocfs2: improve ocfs2 Makefile
Larry Chen [Fri, 28 Dec 2018 08:32:46 +0000 (00:32 -0800)]
ocfs2: improve ocfs2 Makefile

[ Upstream commit 9e6aea22802b5684c7e1d69822aeb0844dd01953 ]

Included file path was hard-wired in the ocfs2 makefile, which might
causes some confusion when compiling ocfs2 as an external module.

Say if we compile ocfs2 module as following.
cp -r /kernel/tree/fs/ocfs2 /other/dir/ocfs2
cd /other/dir/ocfs2
make -C /path/to/kernel_source M=`pwd` modules

Acutally, the compiler wil try to find included file in
/kernel/tree/fs/ocfs2, rather than the directory /other/dir/ocfs2.

To fix this little bug, we introduce the var $(src) provided by kbuild.
$(src) means the absolute path of the running kbuild file.

Link: http://lkml.kernel.org/r/20181108085546.15149-1-lchen@suse.com
Signed-off-by: Larry Chen <lchen@suse.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoocfs2: don't clear bh uptodate for block read
Junxiao Bi [Fri, 28 Dec 2018 08:32:57 +0000 (00:32 -0800)]
ocfs2: don't clear bh uptodate for block read

[ Upstream commit 70306d9dce75abde855cefaf32b3f71eed8602a3 ]

For sync io read in ocfs2_read_blocks_sync(), first clear bh uptodate flag
and submit the io, second wait io done, last check whether bh uptodate, if
not return io error.

If two sync io for the same bh were issued, it could be the first io done
and set uptodate flag, but just before check that flag, the second io came
in and cleared uptodate, then ocfs2_read_blocks_sync() for the first io
will return IO error.

Indeed it's not necessary to clear uptodate flag, as the io end handler
end_buffer_read_sync() will set or clear it based on io succeed or failed.

The following message was found from a nfs server but the underlying
storage returned no error.

[4106438.567376] (nfsd,7146,3):ocfs2_get_suballoc_slot_bit:2780 ERROR: read block 1238823695 failed -5
[4106438.567569] (nfsd,7146,3):ocfs2_get_suballoc_slot_bit:2812 ERROR: status = -5
[4106438.567611] (nfsd,7146,3):ocfs2_test_inode_bit:2894 ERROR: get alloc slot and bit failed -5
[4106438.567643] (nfsd,7146,3):ocfs2_test_inode_bit:2932 ERROR: status = -5
[4106438.567675] (nfsd,7146,3):ocfs2_get_dentry:94 ERROR: test inode bit failed -5

Same issue in non sync read ocfs2_read_blocks(), fixed it as well.

Link: http://lkml.kernel.org/r/20181121020023.3034-4-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Changwei Ge <ge.changwei@h3c.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoarch/sh/boards/mach-kfr2r09/setup.c: fix struct mtd_oob_ops build warning
Randy Dunlap [Fri, 28 Dec 2018 08:31:39 +0000 (00:31 -0800)]
arch/sh/boards/mach-kfr2r09/setup.c: fix struct mtd_oob_ops build warning

[ Upstream commit 440e7b379f91acd245d5c8de94d533f40f5dffb3 ]

arch/sh/boards/mach-kfr2r09/setup.c does not need to #include
<mtd/onenand.h>, and doing so causes a build warning, so drop that header
file.

In file included from ../arch/sh/boards/mach-kfr2r09/setup.c:28:
../include/linux/mtd/onenand.h:225:12: warning: 'struct mtd_oob_ops' declared inside parameter list will not be visible outside of this definition or declaration
     struct mtd_oob_ops *ops);

Link: http://lkml.kernel.org/r/702f0a25-c63e-6912-4640-6ab0f00afbc7@infradead.org
Fixes: f3590dc32974 ("media: arch: sh: kfr2r09: Use new renesas-ceu camera driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Suggested-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: Jacopo Mondi <jacopo+renesas@jmondi.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoscripts/decode_stacktrace: only strip base path when a prefix of the path
Marc Zyngier [Fri, 28 Dec 2018 08:31:25 +0000 (00:31 -0800)]
scripts/decode_stacktrace: only strip base path when a prefix of the path

[ Upstream commit 67a28de47faa83585dd644bd4c31e5a1d9346c50 ]

Running something like:

decodecode vmlinux .

leads to interested results where not only the leading "." gets stripped
from the displayed paths, but also anywhere in the string, displaying
something like:

kvm_vcpu_check_block (arch/arm64/kvm/virt/kvm/kvm_mainc:2141)

which doesn't help further processing.

Fix it by only stripping the base path if it is a prefix of the path.

Link: http://lkml.kernel.org/r/20181210174659.31054-3-marc.zyngier@arm.com
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoperf python: Do not force closing original perf descriptor in evlist.get_pollfd()
Jiri Olsa [Wed, 26 Dec 2018 11:21:21 +0000 (12:21 +0100)]
perf python: Do not force closing original perf descriptor in evlist.get_pollfd()

[ Upstream commit a389aece97938966616ce0336466b98b0351ef10 ]

Ondřej reported that when compiled with python3, the python extension
regresses in evlist.get_pollfd function behaviour.

The evlist.get_pollfd function creates file objects from evlist's fds
and returns them in a list. The python3 version also sets them to 'close
the original descriptor' when the object dies (is closed), by passing
True via the 'closefd' arg in the PyFile_FromFd call.

The python's closefd doc says:

  If closefd is False, the underlying file descriptor will be kept open
  when the file is closed.

That's why the following line in python3 closes all evlist fds:

  evlist.get_pollfd()

the returned list is immediately destroyed and that takes down the
original events fds.

Passing closefd as False to PyFile_FromFd to fix this.

Reported-by: Ondřej Lysoněk <olysonek@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jaroslav Škarvada <jskarvad@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 66dfdff03d19 ("perf tools: Add Python 3 support")
Link: http://lkml.kernel.org/r/20181226112121.5285-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agocgroup: fix parsing empty mount option string
Ondrej Mosnacek [Thu, 13 Dec 2018 14:17:37 +0000 (15:17 +0100)]
cgroup: fix parsing empty mount option string

[ Upstream commit e250d91d65750a0c0c62483ac4f9f357e7317617 ]

This fixes the case where all mount options specified are consumed by an
LSM and all that's left is an empty string. In this case cgroupfs should
accept the string and not fail.

How to reproduce (with SELinux enabled):

    # umount /sys/fs/cgroup/unified
    # mount -o context=system_u:object_r:cgroup_t:s0 -t cgroup2 cgroup2 /sys/fs/cgroup/unified
    mount: /sys/fs/cgroup/unified: wrong fs type, bad option, bad superblock on cgroup2, missing codepage or helper program, or other error.
    # dmesg | tail -n 1
    [   31.575952] cgroup: cgroup2: unknown option ""

Fixes: 67e9c74b8a87 ("cgroup: replace __DEVEL__sane_behavior with cgroup2 fs type")
[NOTE: should apply on top of commit 5136f6365ce3 ("cgroup: implement "nsdelegate" mount option"), older versions need manual rebase]
Suggested-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agof2fs: fix sbi->extent_list corruption issue
Sahitya Tummala [Tue, 18 Dec 2018 11:09:24 +0000 (16:39 +0530)]
f2fs: fix sbi->extent_list corruption issue

[ Upstream commit e4589fa545e0020dbbc3c9bde35f35f949901392 ]

When there is a failure in f2fs_fill_super() after/during
the recovery of fsync'd nodes, it frees the current sbi and
retries again. This time the mount is successful, but the files
that got recovered before retry, still holds the extent tree,
whose extent nodes list is corrupted since sbi and sbi->extent_list
is freed up. The list_del corruption issue is observed when the
file system is getting unmounted and when those recoverd files extent
node is being freed up in the below context.

list_del corruption. prev->next should be fffffff1e1ef5480, but was (null)
<...>
kernel BUG at kernel/msm-4.14/lib/list_debug.c:53!
lr : __list_del_entry_valid+0x94/0xb4
pc : __list_del_entry_valid+0x94/0xb4
<...>
Call trace:
__list_del_entry_valid+0x94/0xb4
__release_extent_node+0xb0/0x114
__free_extent_tree+0x58/0x7c
f2fs_shrink_extent_tree+0xdc/0x3b0
f2fs_leave_shrinker+0x28/0x7c
f2fs_put_super+0xfc/0x1e0
generic_shutdown_super+0x70/0xf4
kill_block_super+0x2c/0x5c
kill_f2fs_super+0x44/0x50
deactivate_locked_super+0x60/0x8c
deactivate_super+0x68/0x74
cleanup_mnt+0x40/0x78
__cleanup_mnt+0x1c/0x28
task_work_run+0x48/0xd0
do_notify_resume+0x678/0xe98
work_pending+0x8/0x14

Fix this by not creating extents for those recovered files if shrinker is
not registered yet. Once mount is successful and shrinker is registered,
those files can have extents again.

Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoniu: fix missing checks of niu_pci_eeprom_read
Kangjie Lu [Tue, 25 Dec 2018 07:56:14 +0000 (01:56 -0600)]
niu: fix missing checks of niu_pci_eeprom_read

[ Upstream commit 26fd962bde0b15e54234fe762d86bc0349df1de4 ]

niu_pci_eeprom_read() may fail, so we should check its return value
before using the read data.

Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Acked-by: Shannon Nelson <shannon.lee.nelson@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoum: Avoid marking pages with "changed protection"
Anton Ivanov [Wed, 5 Dec 2018 12:37:41 +0000 (12:37 +0000)]
um: Avoid marking pages with "changed protection"

[ Upstream commit 8892d8545f2d0342b9c550defbfb165db237044b ]

Changing protection is a very high cost operation in UML
because in addition to an extra syscall it also interrupts
mmap merge sequences generated by the tlb.

While the condition is not particularly common it is worth
avoiding.

Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agof2fs: fix use-after-free issue when accessing sbi->stat_info
Sahitya Tummala [Wed, 26 Dec 2018 05:50:29 +0000 (11:20 +0530)]
f2fs: fix use-after-free issue when accessing sbi->stat_info

[ Upstream commit 60aa4d5536ab7fe32433ca1173bd9d6633851f27 ]

iput() on sbi->node_inode can update sbi->stat_info
in the below context, if the f2fs_write_checkpoint()
has failed with error.

f2fs_balance_fs_bg+0x1ac/0x1ec
f2fs_write_node_pages+0x4c/0x260
do_writepages+0x80/0xbc
__writeback_single_inode+0xdc/0x4ac
writeback_single_inode+0x9c/0x144
write_inode_now+0xc4/0xec
iput+0x194/0x22c
f2fs_put_super+0x11c/0x1e8
generic_shutdown_super+0x70/0xf4
kill_block_super+0x2c/0x5c
kill_f2fs_super+0x44/0x50
deactivate_locked_super+0x60/0x8c
deactivate_super+0x68/0x74
cleanup_mnt+0x40/0x78

Fix this by moving f2fs_destroy_stats() further below iput() in
both f2fs_put_super() and f2fs_fill_super() paths.

Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agocifs: check ntwrk_buf_start for NULL before dereferencing it
Ronnie Sahlberg [Wed, 12 Dec 2018 22:06:16 +0000 (08:06 +1000)]
cifs: check ntwrk_buf_start for NULL before dereferencing it

[ Upstream commit 59a63e479ce36a3f24444c3a36efe82b78e4a8e0 ]

RHBZ: 1021460

There is an issue where when multiple threads open/close the same directory
ntwrk_buf_start might end up being NULL, causing the call to smbCalcSize
later to oops with a NULL deref.

The real bug is why this happens and why this can become NULL for an
open cfile, which should not be allowed.
This patch tries to avoid a oops until the time when we fix the underlying
issue.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoMIPS: ralink: Select CONFIG_CPU_MIPSR2_IRQ_VI on MT7620/8
Stefan Roese [Mon, 17 Dec 2018 09:47:48 +0000 (10:47 +0100)]
MIPS: ralink: Select CONFIG_CPU_MIPSR2_IRQ_VI on MT7620/8

[ Upstream commit 0b15394475e3bcaf35ca4bf22fc55d56df67224e ]

Testing has shown, that when using mainline U-Boot on MT7688 based
boards, the system may hang or crash while mounting the root-fs. The
main issue here is that mainline U-Boot configures EBase to a value
near the end of system memory. And with CONFIG_CPU_MIPSR2_IRQ_VI
disabled, trap_init() will not allocate a new area to place the
exception handler. The original value will be used and the handler
will be copied to this location, which might already be used by some
userspace application.

The MT7688 supports VI - its config3 register is 0x00002420, so VInt
(Bit 5) is set. But without setting CONFIG_CPU_MIPSR2_IRQ_VI this
bit will not be evaluated to result in "cpu_has_vi" being set. This
patch now selects CONFIG_CPU_MIPSR2_IRQ_VI on MT7620/8 which results
trap_init() to allocate some memory for the exception handler.

Please note that this issue was not seen with the Mediatek U-Boot
version, as it does not touch EBase (stays at default of 0x8000.0000).
This is strictly also not correct as the kernel (_text) resides
here.

Signed-off-by: Stefan Roese <sr@denx.de>
[paul.burton@mips.com: s/beeing/being/]
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: John Crispin <blogic@openwrt.org>
Cc: Daniel Schwierzeck <daniel.schwierzeck@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agocrypto: ux500 - Use proper enum in hash_set_dma_transfer
Nathan Chancellor [Mon, 10 Dec 2018 23:49:54 +0000 (16:49 -0700)]
crypto: ux500 - Use proper enum in hash_set_dma_transfer

[ Upstream commit 5ac93f808338f4dd465402e91869702eb87db241 ]

Clang warns when one enumerated type is implicitly converted to another:

drivers/crypto/ux500/hash/hash_core.c:169:4: warning: implicit
conversion from enumeration type 'enum dma_data_direction' to different
enumeration type 'enum dma_transfer_direction' [-Wenum-conversion]
                        direction, DMA_CTRL_ACK | DMA_PREP_INTERRUPT);
                        ^~~~~~~~~
1 warning generated.

dmaengine_prep_slave_sg expects an enum from dma_transfer_direction.
We know that the only direction supported by this function is
DMA_TO_DEVICE because of the check at the top of this function so we can
just use the equivalent value from dma_transfer_direction.

DMA_TO_DEVICE = DMA_MEM_TO_DEV = 1

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>