]> git.itanic.dy.fi Git - linux-stable/log
linux-stable
7 years agoLinux 4.5.6 v4.5.6
Greg Kroah-Hartman [Wed, 1 Jun 2016 19:17:36 +0000 (12:17 -0700)]
Linux 4.5.6

7 years agokbuild: move -Wunused-const-variable to W=1 warning level
Arnd Bergmann [Tue, 10 May 2016 21:30:01 +0000 (23:30 +0200)]
kbuild: move -Wunused-const-variable to W=1 warning level

commit c9c6837d39311b0cc14cdbe7c18e815ab44aefb1 upstream.

gcc-6 started warning by default about variables that are not
used anywhere and that are marked 'const', generating many
false positives in an allmodconfig build, e.g.:

arch/arm/mach-davinci/board-da830-evm.c:282:20: warning: 'da830_evm_emif25_pins' defined but not used [-Wunused-const-variable=]
arch/arm/plat-omap/dmtimer.c:958:34: warning: 'omap_timer_match' defined but not used [-Wunused-const-variable=]
drivers/bluetooth/hci_bcm.c:625:39: warning: 'acpi_bcm_default_gpios' defined but not used [-Wunused-const-variable=]
drivers/char/hw_random/omap-rng.c:92:18: warning: 'reg_map_omap4' defined but not used [-Wunused-const-variable=]
drivers/devfreq/exynos/exynos5_bus.c:381:32: warning: 'exynos5_busfreq_int_pm' defined but not used [-Wunused-const-variable=]
drivers/dma/mv_xor.c:1139:34: warning: 'mv_xor_dt_ids' defined but not used [-Wunused-const-variable=]

This is similar to the existing -Wunused-but-set-variable warning
that was added in an earlier release and that we disable by default
now and only enable when W=1 is set, so it makes sense to do
the same here. Once we have eliminated the majority of the
warnings for both, we can put them back into the default list.

We probably want this in backport kernels as well, to allow building
them with gcc-6 without introducing extra warnings.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Olof Johansson <olof@lixom.net>
Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoRevert "scsi: fix soft lockup in scsi_remove_target() on module removal"
Johannes Thumshirn [Tue, 5 Apr 2016 09:50:45 +0000 (11:50 +0200)]
Revert "scsi: fix soft lockup in scsi_remove_target() on module removal"

commit 305c2e71b3d733ec065cb716c76af7d554bd5571 upstream.

Now that we've done a more comprehensive fix with the intermediate
target state we can remove the previous hack introduced with commit
90a88d6ef88e ("scsi: fix soft lockup in scsi_remove_target() on module
removal").

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoscsi: Add intermediate STARGET_REMOVE state to scsi_target_state
Johannes Thumshirn [Tue, 5 Apr 2016 09:50:44 +0000 (11:50 +0200)]
scsi: Add intermediate STARGET_REMOVE state to scsi_target_state

commit f05795d3d771f30a7bdc3a138bf714b06d42aa95 upstream.

Add intermediate STARGET_REMOVE state to scsi_target_state to avoid
running into the BUG_ON() in scsi_target_reap(). The STARGET_REMOVE
state is only valid in the path from scsi_remove_target() to
scsi_target_destroy() indicating this target is going to be removed.

This re-fixes the problem introduced in commits bc3f02a795d3 ("[SCSI]
scsi_remove_target: fix softlockup regression on hot remove") and
40998193560d ("scsi: restart list search after unlock in
scsi_remove_target") in a more comprehensive way.

[mkp: Included James' fix for scsi_target_destroy()]

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Fixes: 40998193560dab6c3ce8d25f4fa58a23e252ef38
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: James Bottomley <jejb@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agohpfs: implement the show_options method
Mikulas Patocka [Tue, 24 May 2016 20:49:18 +0000 (22:49 +0200)]
hpfs: implement the show_options method

commit 037369b872940cd923835a0a589763180c4a36bc upstream.

The HPFS filesystem used generic_show_options to produce string that is
displayed in /proc/mounts.  However, there is a problem that the options
may disappear after remount.  If we mount the filesystem with option1
and then remount it with option2, /proc/mounts should show both option1
and option2, however it only shows option2 because the whole option
string is replaced with replace_mount_options in hpfs_remount_fs.

To fix this bug, implement the hpfs_show_options function that prints
options that are currently selected.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agohpfs: fix remount failure when there are no options changed
Mikulas Patocka [Tue, 24 May 2016 20:47:00 +0000 (22:47 +0200)]
hpfs: fix remount failure when there are no options changed

commit 44d51706b4685f965cd32acde3fe0fcc1e6198e8 upstream.

Commit ce657611baf9 ("hpfs: kstrdup() out of memory handling") checks if
the kstrdup function returns NULL due to out-of-memory condition.

However, if we are remounting a filesystem with no change to
filesystem-specific options, the parameter data is NULL.  In this case,
kstrdup returns NULL (because it was passed NULL parameter), although no
out of memory condition exists.  The mount syscall then fails with
ENOMEM.

This patch fixes the bug.  We fail with ENOMEM only if data is non-NULL.

The patch also changes the call to replace_mount_options - if we didn't
pass any filesystem-specific options, we don't call
replace_mount_options (thus we don't erase existing reported options).

Fixes: ce657611baf9 ("hpfs: kstrdup() out of memory handling")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoUBI: Fix static volume checks when Fastmap is used
Richard Weinberger [Tue, 26 Apr 2016 14:39:48 +0000 (16:39 +0200)]
UBI: Fix static volume checks when Fastmap is used

commit 1900149c835ab5b48bea31a823ea5e5a401fb560 upstream.

Ezequiel reported that he's facing UBI going into read-only
mode after power cut. It turned out that this behavior happens
only when updating a static volume is interrupted and Fastmap is
used.

A possible trace can look like:
ubi0 warning: ubi_io_read_vid_hdr [ubi]: no VID header found at PEB 2323, only 0xFF bytes
ubi0 warning: ubi_eba_read_leb [ubi]: switch to read-only mode
CPU: 0 PID: 833 Comm: ubiupdatevol Not tainted 4.6.0-rc2-ARCH #4
Hardware name: SAMSUNG ELECTRONICS CO., LTD. 300E4C/300E5C/300E7C/NP300E5C-AD8AR, BIOS P04RAP 10/15/2012
0000000000000286 00000000eba949bd ffff8800c45a7b38 ffffffff8140d841
ffff8801964be000 ffff88018eaa4800 ffff8800c45a7bb8 ffffffffa003abf6
ffffffff850e2ac0 8000000000000163 ffff8801850e2ac0 ffff8801850e2ac0
Call Trace:
[<ffffffff8140d841>] dump_stack+0x63/0x82
[<ffffffffa003abf6>] ubi_eba_read_leb+0x486/0x4a0 [ubi]
[<ffffffffa00453b3>] ubi_check_volume+0x83/0xf0 [ubi]
[<ffffffffa0039d97>] ubi_open_volume+0x177/0x350 [ubi]
[<ffffffffa00375d8>] vol_cdev_open+0x58/0xb0 [ubi]
[<ffffffff8124b08e>] chrdev_open+0xae/0x1d0
[<ffffffff81243bcf>] do_dentry_open+0x1ff/0x300
[<ffffffff8124afe0>] ? cdev_put+0x30/0x30
[<ffffffff81244d36>] vfs_open+0x56/0x60
[<ffffffff812545f4>] path_openat+0x4f4/0x1190
[<ffffffff81256621>] do_filp_open+0x91/0x100
[<ffffffff81263547>] ? __alloc_fd+0xc7/0x190
[<ffffffff812450df>] do_sys_open+0x13f/0x210
[<ffffffff812451ce>] SyS_open+0x1e/0x20
[<ffffffff81a99e32>] entry_SYSCALL_64_fastpath+0x1a/0xa4

UBI checks static volumes for data consistency and reads the
whole volume upon first open. If the volume is found erroneous
users of UBI cannot read from it, but another volume update is
possible to fix it. The check is performed by running
ubi_eba_read_leb() on every allocated LEB of the volume.
For static volumes ubi_eba_read_leb() computes the checksum of all
data stored in a LEB. To verify the computed checksum it has to read
the LEB's volume header which stores the original checksum.
If the volume header is not found UBI treats this as fatal internal
error and switches to RO mode. If the UBI device was attached via a
full scan the assumption is correct, the volume header has to be
present as it had to be there while scanning to get known as mapped.
If the attach operation happened via Fastmap the assumption is no
longer correct. When attaching via Fastmap UBI learns the mapping
table from Fastmap's snapshot of the system state and not via a full
scan. It can happen that a LEB got unmapped after a Fastmap was
written to the flash. Then UBI can learn the LEB still as mapped and
accessing it returns only 0xFF bytes. As UBI is not a FTL it is
allowed to have mappings to empty PEBs, it assumes that the layer
above takes care of LEB accounting and referencing.
UBIFS does so using the LEB property tree (LPT).
For static volumes UBI blindly assumes that all LEBs are present and
therefore special actions have to be taken.

The described situation can happen when updating a static volume is
interrupted, either by a user or a power cut.
The volume update code first unmaps all LEBs of a volume and then
writes LEB by LEB. If the sequence of operations is interrupted UBI
detects this either by the absence of LEBs, no volume header present
at scan time, or corrupted payload, detected via checksum.
In the Fastmap case the former method won't trigger as no scan
happened and UBI automatically thinks all LEBs are present.
Only by reading data from a LEB it detects that the volume header is
missing and incorrectly treats this as fatal error.
To deal with the situation ubi_eba_read_leb() from now on checks
whether we attached via Fastmap and handles the absence of a
volume header like a data corruption error.
This way interrupted static volume updates will correctly get detected
also when Fastmap is used.

Reported-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Tested-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoSIGNAL: Move generic copy_siginfo() to signal.h
James Hogan [Mon, 8 Feb 2016 18:43:50 +0000 (18:43 +0000)]
SIGNAL: Move generic copy_siginfo() to signal.h

commit ca9eb49aa9562eaadf3cea071ec7018ad6800425 upstream.

The generic copy_siginfo() is currently defined in
asm-generic/siginfo.h, after including uapi/asm-generic/siginfo.h which
defines the generic struct siginfo. However this makes it awkward for an
architecture to use it if it has to define its own struct siginfo (e.g.
MIPS and potentially IA64), since it means that asm-generic/siginfo.h
can only be included after defining the arch-specific siginfo, which may
be problematic if the arch-specific definition needs definitions from
uapi/asm-generic/siginfo.h.

It is possible to work around this by first including
uapi/asm-generic/siginfo.h to get the constants before defining the
arch-specific siginfo, and include asm-generic/siginfo.h after. However
uapi headers can't be included by other uapi headers, so that first
include has to be in an ifdef __kernel__, with the non __kernel__ case
including the non-UAPI header instead.

Instead of that mess, move the generic copy_siginfo() definition into
linux/signal.h, which allows an arch-specific uapi/asm/siginfo.h to
include asm-generic/siginfo.h and define the arch-specific siginfo, and
for the generic copy_siginfo() to see that arch-specific definition.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Petr Malat <oss@malat.biz>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Christopher Ferris <cferris@google.com>
Cc: linux-arch@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/12478/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agothunderbolt: Fix double free of drom buffer
Andreas Noever [Sun, 10 Apr 2016 10:48:27 +0000 (12:48 +0200)]
thunderbolt: Fix double free of drom buffer

commit 2ffa9a5d76a75abbc1f95c17959fced666095bdd upstream.

If tb_drom_read() fails, sw->drom is freed but not set to NULL.  sw->drom
is then freed again in the error path of tb_switch_alloc().

The bug can be triggered by unplugging a thunderbolt device shortly after
it is detected by the thunderbolt driver.

Clear sw->drom if tb_drom_read() fails.

[bhelgaas: add Fixes:, stable versions of interest]
Fixes: 343fcb8c70d7 ("thunderbolt: Fix nontrivial endpoint devices.")
Signed-off-by: Andreas Noever <andreas.noever@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoIB/srp: Fix a debug kernel crash
Bart Van Assche [Tue, 12 Apr 2016 21:39:18 +0000 (14:39 -0700)]
IB/srp: Fix a debug kernel crash

commit 54f5c9c52d69afa55abf2b034df8d45f588466c3 upstream.

Avoid that the following BUG() is triggered against a debug
kernel:

kernel BUG at include/linux/scatterlist.h:92!
RIP: 0010:[<ffffffffa0467199>]  [<ffffffffa0467199>] srp_map_idb+0x199/0x1a0 [ib_srp]
Call Trace:
 [<ffffffffa04685fa>] srp_map_data+0x84a/0x890 [ib_srp]
 [<ffffffffa0469674>] srp_queuecommand+0x1e4/0x610 [ib_srp]
 [<ffffffff813f5a5e>] scsi_dispatch_cmd+0x9e/0x180
 [<ffffffff813f8b07>] scsi_request_fn+0x477/0x610
 [<ffffffff81298ffe>] __blk_run_queue+0x2e/0x40
 [<ffffffff81299070>] blk_delay_work+0x20/0x30
 [<ffffffff81071f07>] process_one_work+0x197/0x480
 [<ffffffff81072239>] worker_thread+0x49/0x490
 [<ffffffff810787ea>] kthread+0xea/0x100
 [<ffffffff8159b632>] ret_from_fork+0x22/0x40

Fixes: f7f7aab1a5c0 ("IB/srp: Convert to new registration API")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoALSA: hda - Fix headset mic detection problem for one Dell machine
Hui Wang [Wed, 25 May 2016 04:12:32 +0000 (12:12 +0800)]
ALSA: hda - Fix headset mic detection problem for one Dell machine

commit 86c72d1ce91d804e4fa8d90b316a89597dd220f1 upstream.

Add the pin configuration value of this machine into the pin_quirk
table to make DELL1_MIC_NO_PRESENCE apply to this machine.

Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoALSA: hda/realtek - Add support for ALC295/ALC3254
Kailang Yang [Tue, 24 May 2016 08:46:07 +0000 (16:46 +0800)]
ALSA: hda/realtek - Add support for ALC295/ALC3254

commit 7d727869c7b86da0874436ac5675dcdadaf3a0a1 upstream.

Add support for ALC295/ALC3254.
They are simply compatible with ALC225 chip.

Signed-off-by: Kailang Yang <kailang@realtek.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoALSA: hda - Fix headphone noise on Dell XPS 13 9360
Kai-Heng Feng [Fri, 20 May 2016 07:47:23 +0000 (15:47 +0800)]
ALSA: hda - Fix headphone noise on Dell XPS 13 9360

commit 423cd785619ac6778252fbdb916505aa1c153959 upstream.

The headphone has noise when playing sound or switching microphone sources.
It uses the same codec on XPS 13 9350, but with different subsystem ID.
Applying the fixup can solve the issue.
Also, changing the model name to better differentiate models.

v2: Reorder by device ID.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoALSA: hda/realtek - New codecs support for ALC234/ALC274/ALC294
Kailang Yang [Wed, 4 May 2016 07:50:18 +0000 (15:50 +0800)]
ALSA: hda/realtek - New codecs support for ALC234/ALC274/ALC294

commit dcd4f0db6141d6bf2cb897309d5d6f53d1b1696f upstream.

Support new codecs for ALC234/ALC274/ALC294.
This three codecs was the same IC.
But bonding is not the same.

Signed-off-by: Kailang Yang <kailang@realtek.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agomcb: Fixed bar number assignment for the gdd
Andreas Werner [Tue, 3 May 2016 10:42:00 +0000 (12:42 +0200)]
mcb: Fixed bar number assignment for the gdd

commit f75564d343010b025301d9548f2304f48eb25f01 upstream.

The bar number is found in reg2 within the gdd. Therefore
we need to change the assigment from reg1 to reg2 which
is the correct location.

Signed-off-by: Andreas Werner <andreas.werner@men.de>
Fixes: '3764e82e5' drivers: Introduce MEN Chameleon Bus
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agowatchdog: sp5100_tco: properly check for new register layouts
Lucas Stach [Tue, 3 May 2016 17:15:58 +0000 (19:15 +0200)]
watchdog: sp5100_tco: properly check for new register layouts

commit 46856fabe40cc80f92134683cdec7dc0fc8f4000 upstream.

Commits 190aa4304de6 (Add AMD Mullins platform support) and
cca118fa2a0a94 (Add AMD Carrizo platform support) enabled the
driver on a lot more devices, but the following commit missed
a single location in the code when checking if the SB800 register
offsets should be used. This leads to the wrong register being
written which in turn causes ACPI to go haywire.

Fix this by introducing a helper function to check for the new
register layout and use this consistently.

https://bugzilla.kernel.org/show_bug.cgi?id=114201
https://bugzilla.redhat.com/show_bug.cgi?id=1329910
Fixes: bdecfcdb5461 (sp5100_tco: fix the device check for SB800
and later chipsets)
Signed-off-by: Lucas Stach <dev@lynxeye.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoclk: bcm2835: add locking to pll*_on/off methods
Martin Sperl [Mon, 29 Feb 2016 11:39:18 +0000 (11:39 +0000)]
clk: bcm2835: add locking to pll*_on/off methods

commit ec36a5c6682fdd5328abf15c3c67281bed0241d7 upstream.

Add missing locking to:
* bcm2835_pll_divider_on
* bcm2835_pll_divider_off
to protect the read modify write cycle for the
register access protecting both cm_reg and a2w_reg
registers.

Fixes: 41691b8862e2 ("clk: bcm2835: Add support for programming the
audio domain clocks")

Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agolocking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()
Peter Zijlstra [Fri, 20 May 2016 16:04:36 +0000 (18:04 +0200)]
locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()

commit 54cf809b9512be95f53ed4a5e3b631d1ac42f0fa upstream.

Similar to commits:

  51d7d5205d33 ("powerpc: Add smp_mb() to arch_spin_is_locked()")
  d86b8da04dfa ("arm64: spinlock: serialise spin_unlock_wait against concurrent lockers")

qspinlock suffers from the fact that the _Q_LOCKED_VAL store is
unordered inside the ACQUIRE of the lock.

And while this is not a problem for the regular mutual exclusive
critical section usage of spinlocks, it breaks creative locking like:

spin_lock(A) spin_lock(B)
spin_unlock_wait(B) if (!spin_is_locked(A))
do_something()   do_something()

In that both CPUs can end up running do_something at the same time,
because our _Q_LOCKED_VAL store can drop past the spin_unlock_wait()
spin_is_locked() loads (even on x86!!).

To avoid making the normal case slower, add smp_mb()s to the less used
spin_unlock_wait() / spin_is_locked() side of things to avoid this
problem.

Reported-and-tested-by: Davidlohr Bueso <dave@stgolabs.net>
Reported-by: Giovanni Gherdovich <ggherdovich@suse.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoserial: samsung: Reorder the sequence of clock control when call s3c24xx_serial_set_t...
Chanwoo Choi [Thu, 21 Apr 2016 09:58:31 +0000 (18:58 +0900)]
serial: samsung: Reorder the sequence of clock control when call s3c24xx_serial_set_termios()

commit b8995f527aac143e83d3900ff39357651ea4e0f6 upstream.

This patch fixes the broken serial log when changing the clock source
of uart device. Before disabling the original clock source, this patch
enables the new clock source to protect the clock off state for a split second.

Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
7 years agoserial: 8250_mid: recognize interrupt source in handler
Andy Shevchenko [Mon, 4 Apr 2016 14:35:10 +0000 (17:35 +0300)]
serial: 8250_mid: recognize interrupt source in handler

commit c42850f1ae7e70056f852e67bb9dddf927853b47 upstream.

There is a special register that shows interrupt status by source. In
particular case the source can be a combination of DMA Tx, DMA Rx, and UART.

Read the register and call the handlers only for sources that request an
interrupt.

Fixes: 6ede6dcd87aa ("serial: 8250_mid: add support for DMA engine handling from UART MMIO")
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoserial: 8250_mid: use proper bar for DNV platform
Andy Shevchenko [Mon, 4 Apr 2016 14:35:09 +0000 (17:35 +0300)]
serial: 8250_mid: use proper bar for DNV platform

commit 107e15fc1f8d6ef69eac5f175971252f76e82f0d upstream.

Unlike Intel Medfield and Tangier platforms DNV uses PCI BAR0 for IO compatible
resources and BAR1 for MMIO. We need latter in a way to support DMA. Introduce
an additional field in the internal structure and pass PCI BAR based on device
ID.

Reported-by: "Lai, Poey Seng" <poey.seng.lai@intel.com>
Fixes: 6ede6dcd87aa ("serial: 8250_mid: add support for DMA engine handling from UART MMIO")
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoserial: 8250_pci: fix divide error bug if baud rate is 0
David Müller [Wed, 27 Apr 2016 09:58:32 +0000 (11:58 +0200)]
serial: 8250_pci: fix divide error bug if baud rate is 0

commit 6f210c18c1c0f016772c8cd51ae12a02bfb9e7ef upstream.

Since commit 21947ba654a6 ("serial: 8250_pci: replace switch-case by
formula"), the 8250 driver crashes in the byt_set_termios() function
with a divide error. This is caused by the fact that a baud rate of 0 (B0)
is not handled properly. Fix it by falling back to B9600 in this case.

Signed-off-by: David Müller <d.mueller@elsoft.ch>
Fixes: 21947ba654a6 ("serial: 8250_pci: replace switch-case by formula")
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoFix OpenSSH pty regression on close
Brian Bloniarz [Sun, 6 Mar 2016 21:16:30 +0000 (13:16 -0800)]
Fix OpenSSH pty regression on close

commit 0f40fbbcc34e093255a2b2d70b6b0fb48c3f39aa upstream.

OpenSSH expects the (non-blocking) read() of pty master to return
EAGAIN only if it has received all of the slave-side output after
it has received SIGCHLD. This used to work on pre-3.12 kernels.

This fix effectively forces non-blocking read() and poll() to
block for parallel i/o to complete for all ttys. It also unwinds
these changes:

1) f8747d4a466ab2cafe56112c51b3379f9fdb7a12
   tty: Fix pty master read() after slave closes

2) 52bce7f8d4fc633c9a9d0646eef58ba6ae9a3b73
   pty, n_tty: Simplify input processing on final close

3) 1a48632ffed61352a7810ce089dc5a8bcd505a60
   pty: Fix input race when closing

Inspired by analysis and patch from Marc Aurele La France <tsi@tuyoix.net>

Reported-by: Volth <openssh@volth.com>
Reported-by: Marc Aurele La France <tsi@tuyoix.net>
BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=52
BugLink: https://bugzilla.mindrot.org/show_bug.cgi?id=2492
Signed-off-by: Brian Bloniarz <brian.bloniarz@gmail.com>
Reviewed-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agotty/serial: atmel: fix hardware handshake selection
Alexandre Belloni [Tue, 12 Apr 2016 12:51:40 +0000 (14:51 +0200)]
tty/serial: atmel: fix hardware handshake selection

commit 5be605ac9af979265d7b64c160ad9928088a78be upstream.

Commit 1cf6e8fc8341 ("tty/serial: at91: fix RTS line management when
hardware handshake is enabled") actually allowed to enable hardware
handshaking.
Before, the CRTSCTS flags was silently ignored.

As the DMA controller can't drive RTS (as explain in the commit message).
Ensure that hardware flow control stays disabled when DMA is used and FIFOs
are not available.

Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Fixes: 1cf6e8fc8341 ("tty/serial: at91: fix RTS line management when hardware handshake is enabled")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoTTY: n_gsm, fix false positive WARN_ON
Jiri Slaby [Tue, 22 Mar 2016 17:09:51 +0000 (18:09 +0100)]
TTY: n_gsm, fix false positive WARN_ON

commit d175feca89a1c162f60f4e3560ca7bc9437c65eb upstream.

Dmitry reported, that the current cleanup code in n_gsm can trigger a
warning:
WARNING: CPU: 2 PID: 24238 at drivers/tty/n_gsm.c:2048 gsm_cleanup_mux+0x166/0x6b0()
...
Call Trace:
...
 [<ffffffff81247ab9>] warn_slowpath_null+0x29/0x30 kernel/panic.c:490
 [<ffffffff828d0456>] gsm_cleanup_mux+0x166/0x6b0 drivers/tty/n_gsm.c:2048
 [<ffffffff828d4d87>] gsmld_open+0x5b7/0x7a0 drivers/tty/n_gsm.c:2386
 [<ffffffff828b9078>] tty_ldisc_open.isra.2+0x78/0xd0 drivers/tty/tty_ldisc.c:447
 [<ffffffff828b973a>] tty_set_ldisc+0x1ca/0xa70 drivers/tty/tty_ldisc.c:567
 [<     inline     >] tiocsetd drivers/tty/tty_io.c:2650
 [<ffffffff828a14ea>] tty_ioctl+0xb2a/0x2140 drivers/tty/tty_io.c:2883
...

But this is a legal path when open fails to find a space in the
gsm_mux array and tries to clean up. So make it a standard test
instead of a warning.

Reported-by: "Dmitry Vyukov" <dvyukov@google.com>
Cc: Alan Cox <alan@linux.intel.com>
Link: http://lkml.kernel.org/r/CACT4Y+bHQbAB68VFi7Romcs-Z9ZW3kQRvcq+BvHH1oa5NcAdLA@mail.gmail.com
Fixes: 5a640967 ("tty/n_gsm.c: fix a memory leak in gsmld_open()")
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agotty: vt, return error when con_startup fails
Jiri Slaby [Tue, 3 May 2016 15:05:54 +0000 (17:05 +0200)]
tty: vt, return error when con_startup fails

commit 6798df4c5fe0a7e6d2065cf79649a794e5ba7114 upstream.

When csw->con_startup() fails in do_register_con_driver, we return no
error (i.e. 0). This was changed back in 2006 by commit 3e795de763.
Before that we used to return -ENODEV.

So fix the return value to be -ENODEV in that case again.

Fixes: 3e795de763 ("VT binding: Add binding/unbinding support for the VT console")
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Reported-by: "Dan Carpenter" <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoxen/x86: actually allocate legacy interrupts on PV guests
Stefano Stabellini [Wed, 20 Apr 2016 13:15:01 +0000 (14:15 +0100)]
xen/x86: actually allocate legacy interrupts on PV guests

commit 702f926067d2a4b28c10a3c41a1172dd62d9e735 upstream.

b4ff8389ed14 is incomplete: relies on nr_legacy_irqs() to get the number
of legacy interrupts when actually nr_legacy_irqs() returns 0 after
probe_8259A(). Use NR_IRQS_LEGACY instead.

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoKVM: x86: mask CPUID(0xD,0x1).EAX against host value
Paolo Bonzini [Mon, 21 Mar 2016 11:33:00 +0000 (12:33 +0100)]
KVM: x86: mask CPUID(0xD,0x1).EAX against host value

commit 316314cae15fb0e3869b76b468f59a0c83ac3d4e upstream.

This ensures that the guest doesn't see XSAVE extensions
(e.g. xgetbv1 or xsavec) that the host lacks.

Cc: stable@vger.kernel.org
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
[4.5 does have CPUID_D_1_EAX, but earlier kernels don't, so use
 the numeric value.  This is consistent with other occurrences
 of cpuid_mask in arch/x86/kvm/cpuid.c - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoMIPS: KVM: Fix timer IRQ race when writing CP0_Compare
James Hogan [Fri, 22 Apr 2016 09:38:46 +0000 (10:38 +0100)]
MIPS: KVM: Fix timer IRQ race when writing CP0_Compare

commit b45bacd2d048f405c7760e5cc9b60dd67708734f upstream.

Writing CP0_Compare clears the timer interrupt pending bit
(CP0_Cause.TI), but this wasn't being done atomically. If a timer
interrupt raced with the write of the guest CP0_Compare, the timer
interrupt could end up being pending even though the new CP0_Compare is
nowhere near CP0_Count.

We were already updating the hrtimer expiry with
kvm_mips_update_hrtimer(), which used both kvm_mips_freeze_hrtimer() and
kvm_mips_resume_hrtimer(). Close the race window by expanding out
kvm_mips_update_hrtimer(), and clearing CP0_Cause.TI and setting
CP0_Compare between the freeze and resume. Since the pending timer
interrupt should not be cleared when CP0_Compare is written via the KVM
user API, an ack argument is added to distinguish the source of the
write.

Fixes: e30492bbe95a ("MIPS: KVM: Rewrite count/compare timer emulation")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim KrÄ\8dmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoMIPS: KVM: Fix timer IRQ race when freezing timer
James Hogan [Fri, 22 Apr 2016 09:38:45 +0000 (10:38 +0100)]
MIPS: KVM: Fix timer IRQ race when freezing timer

commit 4355c44f063d3de4f072d796604c7f4ba4085cc3 upstream.

There's a particularly narrow and subtle race condition when the
software emulated guest timer is frozen which can allow a guest timer
interrupt to be missed.

This happens due to the hrtimer expiry being inexact, so very
occasionally the freeze time will be after the moment when the emulated
CP0_Count transitions to the same value as CP0_Compare (so an IRQ should
be generated), but before the moment when the hrtimer is due to expire
(so no IRQ is generated). The IRQ won't be generated when the timer is
resumed either, since the resume CP0_Count will already match CP0_Compare.

With VZ guests in particular this is far more likely to happen, since
the soft timer may be frozen frequently in order to restore the timer
state to the hardware guest timer. This happens after 5-10 hours of
guest soak testing, resulting in an overflow in guest kernel timekeeping
calculations, hanging the guest. A more focussed test case to
intentionally hit the race (with the help of a new hypcall to cause the
timer state to migrated between hardware & software) hits the condition
fairly reliably within around 30 seconds.

Instead of relying purely on the inexact hrtimer expiry to determine
whether an IRQ should be generated, read the guest CP0_Compare and
directly check whether the freeze time is before or after it. Only if
CP0_Count is on or after CP0_Compare do we check the hrtimer expiry to
determine whether the last IRQ has already been generated (which will
have pushed back the expiry by one timer period).

Fixes: e30492bbe95a ("MIPS: KVM: Rewrite count/compare timer emulation")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim KrÄ\8dmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoKVM: x86: fix ordering of cr0 initialization code in vmx_cpu_reset
Bruce Rogers [Thu, 28 Apr 2016 20:49:21 +0000 (14:49 -0600)]
KVM: x86: fix ordering of cr0 initialization code in vmx_cpu_reset

commit f24632475d4ffed5626abbfab7ef30a128dd1474 upstream.

Commit d28bc9dd25ce reversed the order of two lines which initialize cr0,
allowing the current (old) cr0 value to mess up vcpu initialization.
This was observed in the checks for cr0 X86_CR0_WP bit in the context of
kvm_mmu_reset_context(). Besides, setting vcpu->arch.cr0 after vmx_set_cr0()
is completely redundant. Change the order back to ensure proper vcpu
initialization.

The combination of booting with ovmf firmware when guest vcpus > 1 and kvm's
ept=N option being set results in a VM-entry failure. This patch fixes that.

Fixes: d28bc9dd25ce ("KVM: x86: INIT and reset sequences are different")
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoKVM: MTRR: remove MSR 0x2f8
Andy Honig [Tue, 17 May 2016 15:41:47 +0000 (17:41 +0200)]
KVM: MTRR: remove MSR 0x2f8

commit 9842df62004f366b9fed2423e24df10542ee0dc5 upstream.

MSR 0x2f8 accessed the 124th Variable Range MTRR ever since MTRR support
was introduced by 9ba075a664df ("KVM: MTRR support").

0x2f8 became harmful when 910a6aae4e2e ("KVM: MTRR: exactly define the
size of variable MTRRs") shrinked the array of VR MTRRs from 256 to 8,
which made access to index 124 out of bounds.  The surrounding code only
WARNs in this situation, thus the guest gained a limited read/write
access to struct kvm_arch_vcpu.

0x2f8 is not a valid VR MTRR MSR, because KVM has/advertises only 16 VR
MTRR MSRs, 0x200-0x20f.  Every VR MTRR is set up using two MSRs, 0x2f8
was treated as a PHYSBASE and 0x2f9 would be its PHYSMASK, but 0x2f9 was
not implemented in KVM, therefore 0x2f8 could never do anything useful
and getting rid of it is safe.

This fixes CVE-2016-3713.

Fixes: 910a6aae4e2e ("KVM: MTRR: exactly define the size of variable MTRRs")
Reported-by: David Matlack <dmatlack@google.com>
Signed-off-by: Andy Honig <ahonig@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agostaging: comedi: das1800: fix possible NULL dereference
H Hartley Sweeten [Fri, 8 Apr 2016 17:14:58 +0000 (10:14 -0700)]
staging: comedi: das1800: fix possible NULL dereference

commit d375278d666760e195693b57415ba0a125cadd55 upstream.

DMA is optional with this driver. If it was not enabled the devpriv->dma
pointer will be NULL.

Fix the possible NULL pointer dereference when trying to disable the DMA
channels in das1800_ai_cancel() and tidy up the comments to fix the
checkpatch.pl issues:
WARNING: line over 80 characters

It's probably harmless in das1800_ai_setup_dma() because the 'desc' pointer
will not be used if DMA is disabled but fix it there also.

Fixes: 99dfc3357e98 ("staging: comedi: das1800: remove depends on ISA_DMA_API limitation")
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agousb: gadget: udc: core: Fix argument of dev_err() in usb_gadget_map_request()
Yoshihiro Shimoda [Mon, 18 Apr 2016 07:53:38 +0000 (16:53 +0900)]
usb: gadget: udc: core: Fix argument of dev_err() in usb_gadget_map_request()

commit 5096c4d3bfa75bdd23c78f799aabd08598afb48f upstream.

The argument of dev_err() in usb_gadget_map_request() should be dev
instead of &gadget->dev.

Fixes: 7ace8fc ("usb: gadget: udc: core: Fix argument of dma_map_single for IOMMU")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoUSB: leave LPM alone if possible when binding/unbinding interface drivers
Alan Stern [Fri, 29 Apr 2016 19:25:17 +0000 (15:25 -0400)]
USB: leave LPM alone if possible when binding/unbinding interface drivers

commit 6fb650d43da3e7054984dc548eaa88765a94d49f upstream.

When a USB driver is bound to an interface (either through probing or
by claiming it) or is unbound from an interface, the USB core always
disables Link Power Management during the transition and then
re-enables it afterward.  The reason is because the driver might want
to prevent hub-initiated link power transitions, in which case the HCD
would have to recalculate the various LPM parameters.  This
recalculation takes place when LPM is re-enabled and the new
parameters are sent to the device and its parent hub.

However, if the driver does not want to prevent hub-initiated link
power transitions then none of this work is necessary.  The parameters
don't need to be recalculated, and LPM doesn't need to be disabled and
re-enabled.

It turns out that disabling and enabling LPM can be time-consuming,
enough so that it interferes with user programs that want to claim and
release interfaces rapidly via usbfs.  Since the usbfs kernel driver
doesn't set the disable_hub_initiated_lpm flag, we can speed things up
and get the user programs to work by leaving LPM alone whenever the
flag isn't set.

And while we're improving the way disable_hub_initiated_lpm gets used,
let's also fix its kerneldoc.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Matthew Giassa <matthew@giassa.net>
CC: Mathias Nyman <mathias.nyman@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agousb: misc: usbtest: fix pattern tests for scatterlists.
Mathias Nyman [Mon, 2 May 2016 08:39:03 +0000 (11:39 +0300)]
usb: misc: usbtest: fix pattern tests for scatterlists.

commit cdc77c82a8286b1181b81b6e5ef60c8e83ded7bc upstream.

The current implemenentation restart the sent pattern for each entry in
the sg list. The receiving end expects a continuous pattern, and test
will fail unless scatterilst entries happen to be aligned with the
pattern

Fix this by calculating the pattern byte based on total sent size
instead of just the current sg entry.

Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Fixes: 8b5249019352 ("[PATCH] USB: usbtest: scatterlist OUT data pattern testing")
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agousb: f_mass_storage: test whether thread is running before starting another
Michal Nazarewicz [Fri, 8 Apr 2016 08:24:11 +0000 (10:24 +0200)]
usb: f_mass_storage: test whether thread is running before starting another

commit f78bbcae86e676fad9e6c6bb6cd9d9868ba23696 upstream.

When binding the function to usb_configuration, check whether the thread
is running before starting another one.  Without that, when function
instance is added to multiple configurations, fsg_bing starts multiple
threads with all but the latest one being forgotten by the driver.  This
leads to obvious thread leaks, possible lockups when trying to halt the
machine and possible more issues.

This fixes issues with legacy/multi¹ gadget as well as configfs gadgets
when mass_storage function is added to multiple configurations.

This change also simplifies API since the legacy gadgets no longer need
to worry about starting the thread by themselves (which was where bug
in legacy/multi was in the first place).

N.B., this patch doesn’t address adding single mass_storage function
instance to a single configuration twice.  Thankfully, there’s no
legitimate reason for such setup plus, if I’m not mistaken, configfs
gadget doesn’t even allow it to be expressed.

¹ I have no example failure though.  Conclusion that legacy/multi has
  a bug is based purely on me reading the code.

Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Tested-by: Ivaylo Dimitrov <ivo.g.dimitrov.75@gmail.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agousb: gadget: f_fs: Fix EFAULT generation for async read operations
Lars-Peter Clausen [Wed, 30 Mar 2016 11:49:14 +0000 (13:49 +0200)]
usb: gadget: f_fs: Fix EFAULT generation for async read operations

commit 332a5b446b7916d272c2a659a3b20909ce34d2c1 upstream.

In the current implementation functionfs generates a EFAULT for async read
operations if the read buffer size is larger than the URB data size. Since
a application does not necessarily know how much data the host side is
going to send it typically supplies a buffer larger than the actual data,
which will then result in a EFAULT error.

This behaviour was introduced while refactoring the code to use iov_iter
interface in commit c993c39b8639 ("gadget/function/f_fs.c: use put iov_iter
into io_data"). The original code took the minimum over the URB size and
the user buffer size and then attempted to copy that many bytes using
copy_to_user(). If copy_to_user() could not copy all data a EFAULT error
was generated. Restore the original behaviour by only generating a EFAULT
error when the number of bytes copied is not the size of the URB and the
target buffer has not been fully filled.

Commit 342f39a6c8d3 ("usb: gadget: f_fs: fix check in read operation")
already fixed the same problem for the synchronous read path.

Fixes: c993c39b8639 ("gadget/function/f_fs.c: use put iov_iter into io_data")
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoUSB: serial: option: add even more ZTE device ids
Lei Liu [Wed, 4 May 2016 08:34:22 +0000 (16:34 +0800)]
USB: serial: option: add even more ZTE device ids

commit 74d2a91aec97ab832790c9398d320413ad185321 upstream.

Add even more ZTE device ids.

Signed-off-by: lei liu <liu.lei78@zte.com.cn>
[johan: rebase and replace commit message ]
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoUSB: serial: option: add more ZTE device ids
lei liu [Tue, 3 May 2016 21:44:19 +0000 (14:44 -0700)]
USB: serial: option: add more ZTE device ids

commit f0d09463c59c2d764a6c6d492cbe6d2c77f27153 upstream.

More ZTE device ids.

Signed-off-by: lei liu <liu.lei78@zte.com.cn>
[properly sort them - gregkh]
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoUSB: serial: option: add support for Cinterion PH8 and AHxx
Schemmel Hans-Christoph [Fri, 29 Apr 2016 08:51:06 +0000 (08:51 +0000)]
USB: serial: option: add support for Cinterion PH8 and AHxx

commit 444f94e9e625f6ec6bbe2cb232a6451c637f35a3 upstream.

Added support for Gemalto's Cinterion PH8 and AHxx products
with 2 RmNet Interfaces and products with 1 RmNet + 1 USB Audio interface.

In addition some minor renaming and formatting.

Signed-off-by: Hans-Christoph Schemmel <hans-christoph.schemmel@gemalto.com>
[johan: sort current entries and trim trailing whitespace ]
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoUSB: serial: io_edgeport: fix memory leaks in probe error path
Johan Hovold [Sun, 8 May 2016 18:07:57 +0000 (20:07 +0200)]
USB: serial: io_edgeport: fix memory leaks in probe error path

commit c8d62957d450cc1a22ce3242908709fe367ddc8e upstream.

URBs and buffers allocated in attach for Epic devices would never be
deallocated in case of a later probe error (e.g. failure to allocate
minor numbers) as disconnect is then never called.

Fix by moving deallocation to release and making sure that the
URBs are first unlinked.

Fixes: f9c99bb8b3a1 ("USB: usb-serial: replace shutdown with disconnect,
release")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoUSB: serial: io_edgeport: fix memory leaks in attach error path
Johan Hovold [Sun, 8 May 2016 18:07:56 +0000 (20:07 +0200)]
USB: serial: io_edgeport: fix memory leaks in attach error path

commit c5c0c55598cefc826d6cfb0a417eeaee3631715c upstream.

Private data, URBs and buffers allocated for Epic devices during
attach were never released on errors (e.g. missing endpoints).

Fixes: 6e8cf7751f9f ("USB: add EPIC support to the io_edgeport driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoUSB: serial: quatech2: fix use-after-free in probe error path
Johan Hovold [Sun, 8 May 2016 18:08:02 +0000 (20:08 +0200)]
USB: serial: quatech2: fix use-after-free in probe error path

commit 028c49f5e02a257c94129cd815f7c8485f51d4ef upstream.

The interface read URB is submitted in attach, but was only unlinked by
the driver at disconnect.

In case of a late probe error (e.g. due to failed minor allocation),
disconnect is never called and we would end up with active URBs for an
unbound interface. This in turn could lead to deallocated memory being
dereferenced in the completion callback.

Fixes: f7a33e608d9a ("USB: serial: add quatech2 usb to serial driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoUSB: serial: keyspan: fix use-after-free in probe error path
Johan Hovold [Sun, 8 May 2016 18:07:58 +0000 (20:07 +0200)]
USB: serial: keyspan: fix use-after-free in probe error path

commit 35be1a71d70775e7bd7e45fa6d2897342ff4c9d2 upstream.

The interface instat and indat URBs were submitted in attach, but never
unlinked in release before deallocating the corresponding transfer
buffers.

In the case of a late probe error (e.g. due to failed minor allocation),
disconnect would not have been called before release, causing the
buffers to be freed while the URBs are still in use. We'd also end up
with active URBs for an unbound interface.

Fixes: f9c99bb8b3a1 ("USB: usb-serial: replace shutdown with disconnect,
release")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoUSB: serial: mxuport: fix use-after-free in probe error path
Johan Hovold [Sun, 8 May 2016 18:08:01 +0000 (20:08 +0200)]
USB: serial: mxuport: fix use-after-free in probe error path

commit 9e45284984096314994777f27e1446dfbfd2f0d7 upstream.

The interface read and event URBs are submitted in attach, but were
never explicitly unlinked by the driver. Instead the URBs would have
been killed by usb-serial core on disconnect.

In case of a late probe error (e.g. due to failed minor allocation),
disconnect is never called and we could end up with active URBs for an
unbound interface. This in turn could lead to deallocated memory being
dereferenced in the completion callbacks.

Fixes: ee467a1f2066 ("USB: serial: add Moxa UPORT 12XX/14XX/16XX
driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agomei: bus: call mei_cl_read_start under device lock
Alexander Usyskin [Tue, 3 May 2016 22:54:21 +0000 (18:54 -0400)]
mei: bus: call mei_cl_read_start under device lock

commit bc46b45a421a64a0895dd41a34d3d2086e1ac7f6 upstream.

Ensure that mei_cl_read_start is called under the device lock
also in the bus layer. The function updates global ctrl_wr_list
which should be locked.

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agomei: amthif: discard not read messages
Alexander Usyskin [Sun, 17 Apr 2016 16:16:04 +0000 (12:16 -0400)]
mei: amthif: discard not read messages

commit 9d04ee11db7bf0d848266cbfd7db336097a0e239 upstream.

When a message is received and amthif client is not in reading state
the message is ignored and left dangling in the queue. This may happen
after one of the amthif host connections is closed w/o completing the
reading. Another client will pick up a wrong message on next read
attempt which will lead to link reset.
To prevent this the driver has to properly discard the message when
amthif client is not in reading state.

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agomei: fix NULL dereferencing during FW initiated disconnection
Alexander Usyskin [Sun, 17 Apr 2016 16:16:03 +0000 (12:16 -0400)]
mei: fix NULL dereferencing during FW initiated disconnection

commit 6a8d648c8d1824117a9e9edb948ed1611fb013c0 upstream.

In the case when disconnection is initiated from the FW
the driver is flushing items from the write control list while
iterating over it:

mei_irq_write_handler()
    list_for_each_entry_safe(ctrl_wr_list)         <-- outer loop
         mei_cl_irq_disconnect_rsp()
             mei_cl_set_disconnected()
                 mei_io_list_flush(ctrl_wr_list)   <-- destorying list

We move the list flushing to the completion routine.

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoBluetooth: vhci: Fix race at creating hci device
Takashi Iwai [Thu, 14 Apr 2016 15:32:19 +0000 (17:32 +0200)]
Bluetooth: vhci: Fix race at creating hci device

commit c7c999cb18da88a881e10e07f0724ad0bfaff770 upstream.

hci_vhci driver creates a hci device object dynamically upon each
HCI_VENDOR_PKT write.  Although it checks the already created object
and returns an error, it's still racy and may build multiple hci_dev
objects concurrently when parallel writes are performed, as the device
tracks only a single hci_dev object.

This patch introduces a mutex to protect against the concurrent device
creations.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoBluetooth: vhci: purge unhandled skbs
Jiri Slaby [Sat, 19 Mar 2016 10:49:43 +0000 (11:49 +0100)]
Bluetooth: vhci: purge unhandled skbs

commit 13407376b255325fa817798800117a839f3aa055 upstream.

The write handler allocates skbs and queues them into data->readq.
Read side should read them, if there is any. If there is none, skbs
should be dropped by hdev->flush. But this happens only if the device
is HCI_UP, i.e. hdev->power_on work was triggered already. When it was
not, skbs stay allocated in the queue when /dev/vhci is closed. So
purge the queue in ->release.

Program to reproduce:
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#include <sys/stat.h>
#include <sys/types.h>
#include <sys/uio.h>

int main()
{
char buf[] = { 0xff, 0 };
struct iovec iov = {
.iov_base = buf,
.iov_len = sizeof(buf),
};
int fd;

while (1) {
fd = open("/dev/vhci", O_RDWR);
if (fd < 0)
err(1, "open");

usleep(50);

if (writev(fd, &iov, 1) < 0)
err(1, "writev");

usleep(50);

close(fd);
}

return 0;
}

Result:
kmemleak: 4609 new suspected memory leaks
unreferenced object 0xffff88059f4d5440 (size 232):
  comm "vhci", pid 1084, jiffies 4294912542 (age 37569.296s)
  hex dump (first 32 bytes):
    20 f0 23 87 05 88 ff ff 20 f0 23 87 05 88 ff ff   .#..... .#.....
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
...
    [<ffffffff81ece010>] __alloc_skb+0x0/0x5a0
    [<ffffffffa021886c>] vhci_create_device+0x5c/0x580 [hci_vhci]
    [<ffffffffa0219436>] vhci_write+0x306/0x4c8 [hci_vhci]

Fixes: 23424c0d31 (Bluetooth: Add support creating virtual AMP controllers)
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoBluetooth: vhci: fix open_timeout vs. hdev race
Jiri Slaby [Sat, 19 Mar 2016 10:05:18 +0000 (11:05 +0100)]
Bluetooth: vhci: fix open_timeout vs. hdev race

commit 373a32c848ae3a1c03618517cce85f9211a6facf upstream.

Both vhci_get_user and vhci_release race with open_timeout work. They
both contain cancel_delayed_work_sync, but do not test whether the
work actually created hdev or not. Since the work can be in progress
and _sync will wait for finishing it, we can have data->hdev allocated
when cancel_delayed_work_sync returns. But the call sites do 'if
(data->hdev)' *before* cancel_delayed_work_sync.

As a result:
* vhci_get_user allocates a second hdev and puts it into
  data->hdev. The former is leaked.
* vhci_release does not release data->hdev properly as it thinks there
  is none.

Fix both cases by moving the actual test *after* the call to
cancel_delayed_work_sync.

This can be hit by this program:
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#include <sys/stat.h>
#include <sys/types.h>

int main(int argc, char **argv)
{
int fd;

srand(time(NULL));

while (1) {
const int delta = (rand() % 200 - 100) * 100;

fd = open("/dev/vhci", O_RDWR);
if (fd < 0)
err(1, "open");

usleep(1000000 + delta);

close(fd);
}

return 0;
}

And the result is:
BUG: KASAN: use-after-free in skb_queue_tail+0x13e/0x150 at addr ffff88006b0c1228
Read of size 8 by task kworker/u13:1/32068
=============================================================================
BUG kmalloc-192 (Tainted: G            E     ): kasan: bad access detected
-----------------------------------------------------------------------------

Disabling lock debugging due to kernel taint
INFO: Allocated in vhci_open+0x50/0x330 [hci_vhci] age=260 cpu=3 pid=32040
...
kmem_cache_alloc_trace+0x150/0x190
vhci_open+0x50/0x330 [hci_vhci]
misc_open+0x35b/0x4e0
chrdev_open+0x23b/0x510
...
INFO: Freed in vhci_release+0xa4/0xd0 [hci_vhci] age=9 cpu=2 pid=32040
...
__slab_free+0x204/0x310
vhci_release+0xa4/0xd0 [hci_vhci]
...
INFO: Slab 0xffffea0001ac3000 objects=16 used=13 fp=0xffff88006b0c1e00 flags=0x5fffff80004080
INFO: Object 0xffff88006b0c1200 @offset=4608 fp=0xffff88006b0c0600
Bytes b4 ffff88006b0c11f0: 09 df 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff88006b0c1200: 00 06 0c 6b 00 88 ff ff 00 00 00 00 00 00 00 00  ...k............
Object ffff88006b0c1210: 10 12 0c 6b 00 88 ff ff 10 12 0c 6b 00 88 ff ff  ...k.......k....
Object ffff88006b0c1220: c0 46 c2 6b 00 88 ff ff c0 46 c2 6b 00 88 ff ff  .F.k.....F.k....
Object ffff88006b0c1230: 01 00 00 00 01 00 00 00 e0 ff ff ff 0f 00 00 00  ................
Object ffff88006b0c1240: 40 12 0c 6b 00 88 ff ff 40 12 0c 6b 00 88 ff ff  @..k....@..k....
Object ffff88006b0c1250: 50 0d 6e a0 ff ff ff ff 00 02 00 00 00 00 ad de  P.n.............
Object ffff88006b0c1260: 00 00 00 00 00 00 00 00 ab 62 02 00 01 00 00 00  .........b......
Object ffff88006b0c1270: 90 b9 19 81 ff ff ff ff 38 12 0c 6b 00 88 ff ff  ........8..k....
Object ffff88006b0c1280: 03 00 20 00 ff ff ff ff ff ff ff ff 00 00 00 00  .. .............
Object ffff88006b0c1290: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
Object ffff88006b0c12a0: 00 00 00 00 00 00 00 00 00 80 cd 3d 00 88 ff ff  ...........=....
Object ffff88006b0c12b0: 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00  . ..............
Redzone ffff88006b0c12c0: bb bb bb bb bb bb bb bb                          ........
Padding ffff88006b0c13f8: 00 00 00 00 00 00 00 00                          ........
CPU: 3 PID: 32068 Comm: kworker/u13:1 Tainted: G    B       E      4.4.6-0-default #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20151112_172657-sheep25 04/01/2014
Workqueue: hci0 hci_cmd_work [bluetooth]
 00000000ffffffff ffffffff81926cfa ffff88006be37c68 ffff88006bc27180
 ffff88006b0c1200 ffff88006b0c1234 ffffffff81577993 ffffffff82489320
 ffff88006bc24240 0000000000000046 ffff88006a100000 000000026e51eb80
Call Trace:
...
 [<ffffffff81ec8ebe>] ? skb_queue_tail+0x13e/0x150
 [<ffffffffa06e027c>] ? vhci_send_frame+0xac/0x100 [hci_vhci]
 [<ffffffffa0c61268>] ? hci_send_frame+0x188/0x320 [bluetooth]
 [<ffffffffa0c61515>] ? hci_cmd_work+0x115/0x310 [bluetooth]
 [<ffffffff811a1375>] ? process_one_work+0x815/0x1340
 [<ffffffff811a1f85>] ? worker_thread+0xe5/0x11f0
 [<ffffffff811a1ea0>] ? process_one_work+0x1340/0x1340
 [<ffffffff811b3c68>] ? kthread+0x1c8/0x230
...
Memory state around the buggy address:
 ffff88006b0c1100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88006b0c1180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88006b0c1200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                  ^
 ffff88006b0c1280: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
 ffff88006b0c1300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

Fixes: 23424c0d31 (Bluetooth: Add support creating virtual AMP controllers)
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agommc: sdhci-pci: Remove MMC_CAP_BUS_WIDTH_TEST for Intel controllers
Adrian Hunter [Fri, 20 May 2016 07:33:47 +0000 (10:33 +0300)]
mmc: sdhci-pci: Remove MMC_CAP_BUS_WIDTH_TEST for Intel controllers

commit 822969369482166050c5b2f7013501505e025c39 upstream.

The CMD19/CMD14 bus width test has been found to be unreliable in
some cases.  It is not essential, so simply remove it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agommc: longer timeout for long read time quirk
Matt Gumbel [Fri, 20 May 2016 07:33:46 +0000 (10:33 +0300)]
mmc: longer timeout for long read time quirk

commit 32ecd320db39bcb007679ed42f283740641b81ea upstream.

008GE0 Toshiba mmc in some Intel Baytrail tablets responds to
MMC_SEND_EXT_CSD in 450-600ms.

This patch will...

() Increase the long read time quirk timeout from 300ms to 600ms. Original
   author of that quirk says 300ms was only a guess and that the number
   may need to be raised in the future.

() Add this specific MMC to the quirk

Signed-off-by: Matt Gumbel <matthew.k.gumbel@intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agodell-rbtn: Ignore ACPI notifications if device is suspended
Gabriele Mazzotta [Tue, 24 May 2016 20:53:08 +0000 (22:53 +0200)]
dell-rbtn: Ignore ACPI notifications if device is suspended

commit ff8651237f39cea60dc89b2d9f25d9ede3fc82c0 upstream.

Some BIOSes unconditionally send an ACPI notification to RBTN when the
system is resuming from suspend. This makes dell-rbtn send an input
event to userspace as if a function key was pressed. Prevent this by
ignoring all the notifications received while the device is suspended.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=106031
Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Tested-by: Alex Hung <alex.hung@canonical.com>
Reviewed-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoACPI / osi: Fix an issue that acpi_osi=!* cannot disable ACPICA internal strings
Lv Zheng [Tue, 3 May 2016 08:48:20 +0000 (16:48 +0800)]
ACPI / osi: Fix an issue that acpi_osi=!* cannot disable ACPICA internal strings

commit 30c9bb0d7603e7b3f4d6a0ea231e1cddae020c32 upstream.

The order of the _OSI related functionalities is as follows:

  acpi_blacklisted()
    acpi_dmi_osi_linux()
      acpi_osi_setup()
    acpi_osi_setup()
      acpi_update_interfaces() if "!*"
      <<<<<<<<<<<<<<<<<<<<<<<<
  parse_args()
    __setup("acpi_osi=")
      acpi_osi_setup_linux()
        acpi_update_interfaces() if "!*"
        <<<<<<<<<<<<<<<<<<<<<<<<
  acpi_early_init()
    acpi_initialize_subsystem()
      acpi_ut_initialize_interfaces()
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  acpi_bus_init()
    acpi_os_initialize1()
      acpi_install_interface_handler(acpi_osi_handler)
      acpi_osi_setup_late()
        acpi_update_interfaces() for "!"
        >>>>>>>>>>>>>>>>>>>>>>>>
  acpi_osi_handler()

Since acpi_osi_setup_linux() can override acpi_dmi_osi_linux(), the command
line setting can override the DMI detection. That's why acpi_blacklisted()
is put before __setup("acpi_osi=").

Then we can notice the following wrong invocation order. There are
acpi_update_interfaces() (marked by <<<<) calls invoked before
acpi_ut_initialize_interfaces() (marked by ^^^^). This makes it impossible
to use acpi_osi=!* correctly from OSI DMI table or from the command line.
The use of acpi_osi=!* is meant to disable both ACPICA
(acpi_gbl_supported_interfaces) and Linux specific strings
(osi_setup_entries) while the ACPICA part should have stopped working
because of the order issue.

This patch fixes this issue by moving acpi_update_interfaces() to where
it is invoked for acpi_osi=! (marked by >>>>) as this is ensured to be
invoked after acpi_ut_initialize_interfaces() (marked by ^^^^). Linux
specific strings are still handled in the original place in order to make
the following command line working: acpi_osi=!* acpi_osi="Module Device".

Note that since acpi_osi=!* is meant to further disable linux specific
string comparing to the acpi_osi=!, there is no such use case in our bug
fixing work and hence there is no one using acpi_osi=!* either from the
command line or from the DMI quirks, this issue is just a theoretical
issue.

Fixes: 741d81280ad2 (ACPI: Add facility to remove all _OSI strings)
Tested-by: Lukas Wunner <lukas@wunner.de>
Tested-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agommc: sdhci-acpi: Remove MMC_CAP_BUS_WIDTH_TEST for Intel controllers
Adrian Hunter [Fri, 20 May 2016 07:33:48 +0000 (10:33 +0300)]
mmc: sdhci-acpi: Remove MMC_CAP_BUS_WIDTH_TEST for Intel controllers

commit 265984b36ce82fec67957d452dd2b22e010611e4 upstream.

The CMD19/CMD14 bus width test has been found to be unreliable in
some cases.  It is not essential, so simply remove it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agommc: sdhci-acpi: Ensure connected devices are powered when probing
Adrian Hunter [Thu, 19 May 2016 13:25:42 +0000 (15:25 +0200)]
mmc: sdhci-acpi: Ensure connected devices are powered when probing

commit e5bbf30733f930a1d17b4ccf19eac88e30a39cc7 upstream.

Some devices connected to the SDHCI controller may have separate enabling
lines that are controlled through GPIO. These devices need to be powered
on and enabled before probing. This is to ensure all devices connected can
be seen by the controller.

Note, for "stable" this patch depends on the following change:
commit 78a898d0e395 ("ACPI / PM: Export acpi_device_fix_up_power()")

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reported-and-tested-by: Laszlo Fiat <laszlo.fiat@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reported-by: Laszlo Fiat <laszlo.fiat@gmail.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112571
Link: http://lkml.kernel.org/r/CA+7w51inLtQSr656bJvOjGG9oQWKYPXH+xxDPJKbeJ=CcrkS9Q@mail.gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoACPI / PM: Export acpi_device_fix_up_power()
Ulf Hansson [Thu, 19 May 2016 13:25:41 +0000 (15:25 +0200)]
ACPI / PM: Export acpi_device_fix_up_power()

commit 78a898d0e39513469858de990de83210fee28ee9 upstream.

Drivers that needs acpi_device_fix_up_power(), allow them to be built as
modules by exporting this function.

Tested-by: Laszlo Fiat <laszlo.fiat@gmail.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agommc: mmc: Fix partition switch timeout for some eMMCs
Adrian Hunter [Thu, 5 May 2016 05:12:28 +0000 (08:12 +0300)]
mmc: mmc: Fix partition switch timeout for some eMMCs

commit 1c447116d017a98c90f8f71c8c5a611e0aa42178 upstream.

Some eMMCs set the partition switch timeout too low.

Now typically eMMCs are considered a critical component (e.g. because
they store the root file system) and consequently are expected to be
reliable.  Thus we can neglect the use case where eMMCs can't switch
reliably and we might want a lower timeout to facilitate speedy
recovery.

Although we could employ a quirk for the cards that are affected (if
we could identify them all), as described above, there is little
benefit to having a low timeout, so instead simply set a minimum
timeout.

The minimum is set to 300ms somewhat arbitrarily - the examples that
have been seen had a timeout of 10ms but were sometimes taking 60-70ms.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agocan: fix handling of unmodifiable configuration options
Oliver Hartkopp [Mon, 21 Mar 2016 19:18:21 +0000 (20:18 +0100)]
can: fix handling of unmodifiable configuration options

commit bb208f144cf3f59d8f89a09a80efd04389718907 upstream.

As described in 'can: m_can: tag current CAN FD controllers as non-ISO'
(6cfda7fbebe) it is possible to define fixed configuration options by
setting the according bit in 'ctrlmode' and clear it in 'ctrlmode_supported'.
This leads to the incovenience that the fixed configuration bits can not be
passed by netlink even when they have the correct values (e.g. non-ISO, FD).

This patch fixes that issue and not only allows fixed set bit values to be set
again but now requires(!) to provide these fixed values at configuration time.
A valid CAN FD configuration consists of a nominal/arbitration bittiming, a
data bittiming and a control mode with CAN_CTRLMODE_FD set - which is now
enforced by a new can_validate() function. This fix additionally removed the
inconsistency that was prohibiting the support of 'CANFD-only' controller
drivers, like the RCar CAN FD.

For this reason a new helper can_set_static_ctrlmode() has been introduced to
provide a proper interface to handle static enabled CAN controller options.

Reported-by: Ramesh Shanmugasundaram <ramesh.shanmugasundaram@bp.renesas.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Reviewed-by: Ramesh Shanmugasundaram <ramesh.shanmugasundaram@bp.renesas.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agonfc: st21nfca: Fix static checker warning
Christophe Ricard [Sat, 30 Apr 2016 07:12:34 +0000 (09:12 +0200)]
nfc: st21nfca: Fix static checker warning

commit b58afe6d6d3a53af165d5946f12c4b08c95acd58 upstream.

Fix static checker warning:
drivers/nfc/st21nfca/i2c.c:530 st21nfca_hci_i2c_acpi_request_resources()
error: 'gpiod_ena' dereferencing possible ERR_PTR()

Fix so that if no enable gpio can be retrieved an -ENODEV is returned.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: dfa8070d7f64 ("nfc: st21nfca: Add support for acpi probing for i2c device.")
Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoirqchip/gic-v3: Configure all interrupts as non-secure Group-1
Marc Zyngier [Fri, 6 May 2016 18:41:56 +0000 (19:41 +0100)]
irqchip/gic-v3: Configure all interrupts as non-secure Group-1

commit 7c9b973061b03af62734f613f6abec46c0dd4a88 upstream.

The GICv3 driver wrongly assumes that it runs on the non-secure
side of a secure-enabled system, while it could be on a system
with a single security state, or a GICv3 with GICD_CTLR.DS set.

Either way, it is important to configure this properly, or
interrupts will simply not be delivered on this HW.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoirqchip/gic: Ensure ordering between read of INTACK and shared data
Will Deacon [Tue, 26 Apr 2016 11:00:00 +0000 (12:00 +0100)]
irqchip/gic: Ensure ordering between read of INTACK and shared data

commit f86c4fbd930ff6fecf3d8a1c313182bd0f49f496 upstream.

When an IPI is generated by a CPU, the pattern looks roughly like:

  <write shared data>
  smp_wmb();
  <write to GIC to signal SGI>

On the receiving CPU we rely on the fact that, once we've taken the
interrupt, then the freshly written shared data must be visible to us.
Put another way, the CPU isn't going to speculate taking an interrupt.

Unfortunately, this assumption turns out to be broken.

Consider that CPUx wants to send an IPI to CPUy, which will cause CPUy
to read some shared_data. Before CPUx has done anything, a random
peripheral raises an IRQ to the GIC and the IRQ line on CPUy is raised.
CPUy then takes the IRQ and starts executing the entry code, heading
towards gic_handle_irq. Furthermore, let's assume that a bunch of the
previous interrupts handled by CPUy were SGIs, so the branch predictor
kicks in and speculates that irqnr will be <16 and we're likely to
head into handle_IPI. The prefetcher then grabs a speculative copy of
shared_data which contains a stale value.

Meanwhile, CPUx gets round to updating shared_data and asking the GIC
to send an SGI to CPUy. Internally, the GIC decides that the SGI is
more important than the peripheral interrupt (which hasn't yet been
ACKed) but doesn't need to do anything to CPUy, because the IRQ line
is already raised.

CPUy then reads the ACK register on the GIC, sees the SGI value which
confirms the branch prediction and we end up with a stale shared_data
value.

This patch fixes the problem by adding an smp_rmb() to the IPI entry
code in gic_handle_irq. As it turns out, the combination of a control
dependency and an ISB instruction from the EOI in the GICv3 driver is
enough to provide the ordering we need, so we add a comment there
justifying the absence of an explicit smp_rmb().

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoInput: pwm-beeper - fix - scheduling while atomic
Manfred Schlaegl [Fri, 27 May 2016 23:36:36 +0000 (16:36 -0700)]
Input: pwm-beeper - fix - scheduling while atomic

commit f49cf3b8b4c841457244c461c66186a719e13bcc upstream.

Pwm config may sleep so defer it using a worker.

On a Freescale i.MX53 based board we ran into "BUG: scheduling while
atomic" because input_inject_event locks interrupts, but
imx_pwm_config_v2 sleeps.

Tested on Freescale i.MX53 SoC with 4.6.0.

Signed-off-by: Manfred Schlaegl <manfred.schlaegl@gmx.at>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agomfd: omap-usb-tll: Fix scheduling while atomic BUG
Roger Quadros [Mon, 9 May 2016 08:28:37 +0000 (11:28 +0300)]
mfd: omap-usb-tll: Fix scheduling while atomic BUG

commit b49b927f16acee626c56a1af4ab4cb062f75b5df upstream.

We shouldn't be calling clk_prepare_enable()/clk_prepare_disable()
in an atomic context.

Fixes the following issue:

[    5.830970] ehci-omap: OMAP-EHCI Host Controller driver
[    5.830974] driver_register 'ehci-omap'
[    5.895849] driver_register 'wl1271_sdio'
[    5.896870] BUG: scheduling while atomic: udevd/994/0x00000002
[    5.896876] 4 locks held by udevd/994:
[    5.896904]  #0:  (&dev->mutex){......}, at: [<c049597c>] __driver_attach+0x60/0xac
[    5.896923]  #1:  (&dev->mutex){......}, at: [<c049598c>] __driver_attach+0x70/0xac
[    5.896946]  #2:  (tll_lock){+.+...}, at: [<c04c2630>] omap_tll_enable+0x2c/0xd0
[    5.896966]  #3:  (prepare_lock){+.+...}, at: [<c05ce9c8>] clk_prepare_lock+0x48/0xe0
[    5.897042] Modules linked in: wlcore_sdio(+) ehci_omap(+) dwc3_omap snd_soc_ts3a225e leds_is31fl319x bq27xxx_battery_i2c tsc2007 bq27xxx_battery bq2429x_charger ina2xx tca8418_keypad as5013 leds_tca6507 twl6040_vibra gpio_twl6040 bmp085_i2c(+) palmas_gpadc usb3503 palmas_pwrbutton bmg160_i2c(+) bmp085 bma150(+) bmg160_core bmp280 input_polldev snd_soc_omap_mcbsp snd_soc_omap_mcpdm snd_soc_omap snd_pcm_dmaengine
[    5.897048] Preemption disabled at:[<  (null)>]   (null)
[    5.897051]
[    5.897059] CPU: 0 PID: 994 Comm: udevd Not tainted 4.6.0-rc5-letux+ #233
[    5.897062] Hardware name: Generic OMAP5 (Flattened Device Tree)
[    5.897076] [<c010e714>] (unwind_backtrace) from [<c010af34>] (show_stack+0x10/0x14)
[    5.897087] [<c010af34>] (show_stack) from [<c040aa7c>] (dump_stack+0x88/0xc0)
[    5.897099] [<c040aa7c>] (dump_stack) from [<c020c558>] (__schedule_bug+0xac/0xd0)
[    5.897111] [<c020c558>] (__schedule_bug) from [<c06f3d44>] (__schedule+0x88/0x7e4)
[    5.897120] [<c06f3d44>] (__schedule) from [<c06f46d8>] (schedule+0x9c/0xc0)
[    5.897129] [<c06f46d8>] (schedule) from [<c06f4904>] (schedule_preempt_disabled+0x14/0x20)
[    5.897140] [<c06f4904>] (schedule_preempt_disabled) from [<c06f64e4>] (mutex_lock_nested+0x258/0x43c)
[    5.897150] [<c06f64e4>] (mutex_lock_nested) from [<c05ce9c8>] (clk_prepare_lock+0x48/0xe0)
[    5.897160] [<c05ce9c8>] (clk_prepare_lock) from [<c05d0e7c>] (clk_prepare+0x10/0x28)
[    5.897169] [<c05d0e7c>] (clk_prepare) from [<c04c2668>] (omap_tll_enable+0x64/0xd0)
[    5.897180] [<c04c2668>] (omap_tll_enable) from [<c04c1728>] (usbhs_runtime_resume+0x18/0x17c)
[    5.897192] [<c04c1728>] (usbhs_runtime_resume) from [<c049d404>] (pm_generic_runtime_resume+0x2c/0x40)
[    5.897202] [<c049d404>] (pm_generic_runtime_resume) from [<c049f180>] (__rpm_callback+0x38/0x68)
[    5.897210] [<c049f180>] (__rpm_callback) from [<c049f220>] (rpm_callback+0x70/0x88)
[    5.897218] [<c049f220>] (rpm_callback) from [<c04a0a00>] (rpm_resume+0x4ec/0x7ec)
[    5.897227] [<c04a0a00>] (rpm_resume) from [<c04a0f48>] (__pm_runtime_resume+0x4c/0x64)
[    5.897236] [<c04a0f48>] (__pm_runtime_resume) from [<c04958dc>] (driver_probe_device+0x30/0x70)
[    5.897246] [<c04958dc>] (driver_probe_device) from [<c04959a4>] (__driver_attach+0x88/0xac)
[    5.897256] [<c04959a4>] (__driver_attach) from [<c04940f8>] (bus_for_each_dev+0x50/0x84)
[    5.897267] [<c04940f8>] (bus_for_each_dev) from [<c0494e40>] (bus_add_driver+0xcc/0x1e4)
[    5.897276] [<c0494e40>] (bus_add_driver) from [<c0496914>] (driver_register+0xac/0xf4)
[    5.897286] [<c0496914>] (driver_register) from [<c01018e0>] (do_one_initcall+0x100/0x1b8)
[    5.897296] [<c01018e0>] (do_one_initcall) from [<c01c7a54>] (do_init_module+0x58/0x1c0)
[    5.897304] [<c01c7a54>] (do_init_module) from [<c01c8a3c>] (SyS_finit_module+0x88/0x90)
[    5.897313] [<c01c8a3c>] (SyS_finit_module) from [<c0107120>] (ret_fast_syscall+0x0/0x1c)
[    5.912697] ------------[ cut here ]------------
[    5.912711] WARNING: CPU: 0 PID: 994 at kernel/sched/core.c:2996 _raw_spin_unlock+0x28/0x58
[    5.912717] DEBUG_LOCKS_WARN_ON(val > preempt_count())

Reported-by: H. Nikolaus Schaller <hns@goldelico.com>
Tested-by: H. Nikolaus Schaller <hns@goldelico.com>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosched/loadavg: Fix loadavg artifacts on fully idle and on fully loaded systems
Vik Heyndrickx [Thu, 28 Apr 2016 18:46:28 +0000 (20:46 +0200)]
sched/loadavg: Fix loadavg artifacts on fully idle and on fully loaded systems

commit 20878232c52329f92423d27a60e48b6a6389e0dd upstream.

Systems show a minimal load average of 0.00, 0.01, 0.05 even when they
have no load at all.

Uptime and /proc/loadavg on all systems with kernels released during the
last five years up until kernel version 4.6-rc5, show a 5- and 15-minute
minimum loadavg of 0.01 and 0.05 respectively. This should be 0.00 on
idle systems, but the way the kernel calculates this value prevents it
from getting lower than the mentioned values.

Likewise but not as obviously noticeable, a fully loaded system with no
processes waiting, shows a maximum 1/5/15 loadavg of 1.00, 0.99, 0.95
(multiplied by number of cores).

Once the (old) load becomes 93 or higher, it mathematically can never
get lower than 93, even when the active (load) remains 0 forever.
This results in the strange 0.00, 0.01, 0.05 uptime values on idle
systems.  Note: 93/2048 = 0.0454..., which rounds up to 0.05.

It is not correct to add a 0.5 rounding (=1024/2048) here, since the
result from this function is fed back into the next iteration again,
so the result of that +0.5 rounding value then gets multiplied by
(2048-2037), and then rounded again, so there is a virtual "ghost"
load created, next to the old and active load terms.

By changing the way the internally kept value is rounded, that internal
value equivalent now can reach 0.00 on idle, and 1.00 on full load. Upon
increasing load, the internally kept load value is rounded up, when the
load is decreasing, the load value is rounded down.

The modified code was tested on nohz=off and nohz kernels. It was tested
on vanilla kernel 4.6-rc5 and on centos 7.1 kernel 3.10.0-327. It was
tested on single, dual, and octal cores system. It was tested on virtual
hosts and bare hardware. No unwanted effects have been observed, and the
problems that the patch intended to fix were indeed gone.

Tested-by: Damien Wyart <damien.wyart@free.fr>
Signed-off-by: Vik Heyndrickx <vik.heyndrickx@veribox.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Doug Smythies <dsmythies@telus.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 0f004f5a696a ("sched: Cure more NO_HZ load average woes")
Link: http://lkml.kernel.org/r/e8d32bff-d544-7748-72b5-3c86cc71f09f@veribox.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoclk: qcom: msm8916: Fix crypto clock flags
Andy Gross [Tue, 3 May 2016 20:24:11 +0000 (15:24 -0500)]
clk: qcom: msm8916: Fix crypto clock flags

commit 2a0974aa1a0b40a92387ea03dbfeacfbc9ba182c upstream.

This patch adds the CLK_SET_RATE_PARENT flag for the crypto core and
ahb blocks.  Without this flag, clk_set_rate can fail for certain
frequency requests.

Signed-off-by: Andy Gross <andy.gross@linaro.org>
Fixes: 3966fab8b6ab ("clk: qcom: Add MSM8916 Global Clock Controller support")
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agocrypto: sun4i-ss - Replace spinlock_bh by spin_lock_irq{save|restore}
Corentin LABBE [Wed, 23 Mar 2016 15:11:24 +0000 (16:11 +0100)]
crypto: sun4i-ss - Replace spinlock_bh by spin_lock_irq{save|restore}

commit bdb6cf9f6fe6d9af905ea34b7c4bb78ea601329e upstream.

The current sun4i-ss driver could generate data corruption when ciphering/deciphering.
It occurs randomly on end of handled data.
No root cause have been found and the only way to remove it is to replace
all spin_lock_bh by their irq counterparts.

Fixes: 6298e948215f ("crypto: sunxi-ss - Add Allwinner Security System crypto accelerator")
Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agocrypto: talitos - fix ahash algorithms registration
Horia Geant? [Thu, 21 Apr 2016 16:24:55 +0000 (19:24 +0300)]
crypto: talitos - fix ahash algorithms registration

commit 3639ca840df953f9af6f15fc8a6bf77f19075ab1 upstream.

Provide hardware state import/export functionality, as mandated by
commit 8996eafdcbad ("crypto: ahash - ensure statesize is non-zero")

Reported-by: Jonas Eymann <J.Eymann@gmx.net>
Signed-off-by: Horia Geant? <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agocrypto: caam - fix caam_jr_alloc() ret code
Catalin Vasile [Fri, 6 May 2016 13:18:53 +0000 (16:18 +0300)]
crypto: caam - fix caam_jr_alloc() ret code

commit e930c765ca5c6b039cd22ebfb4504ea7b5dab43d upstream.

caam_jr_alloc() used to return NULL if a JR device could not be
allocated for a session. In turn, every user of this function used
IS_ERR() function to verify if anything went wrong, which does NOT look
for NULL values. This made the kernel crash if the sanity check failed,
because the driver continued to think it had allocated a valid JR dev
instance to the session and at some point it tries to do a caam_jr_free()
on a NULL JR dev pointer.
This patch is a fix for this issue.

Signed-off-by: Catalin Vasile <cata.vasile@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoring-buffer: Prevent overflow of size in ring_buffer_resize()
Steven Rostedt (Red Hat) [Fri, 13 May 2016 13:34:12 +0000 (09:34 -0400)]
ring-buffer: Prevent overflow of size in ring_buffer_resize()

commit 59643d1535eb220668692a5359de22545af579f6 upstream.

If the size passed to ring_buffer_resize() is greater than MAX_LONG - BUF_PAGE_SIZE
then the DIV_ROUND_UP() will return zero.

Here's the details:

  # echo 18014398509481980 > /sys/kernel/debug/tracing/buffer_size_kb

tracing_entries_write() processes this and converts kb to bytes.

 18014398509481980 << 10 = 18446744073709547520

and this is passed to ring_buffer_resize() as unsigned long size.

 size = DIV_ROUND_UP(size, BUF_PAGE_SIZE);

Where DIV_ROUND_UP(a, b) is (a + b - 1)/b

BUF_PAGE_SIZE is 4080 and here

 18446744073709547520 + 4080 - 1 = 18446744073709551599

where 18446744073709551599 is still smaller than 2^64

 2^64 - 18446744073709551599 = 17

But now 18446744073709551599 / 4080 = 4521260802379792

and size = size * 4080 = 18446744073709551360

This is checked to make sure its still greater than 2 * 4080,
which it is.

Then we convert to the number of buffer pages needed.

 nr_page = DIV_ROUND_UP(size, BUF_PAGE_SIZE)

but this time size is 18446744073709551360 and

 2^64 - (18446744073709551360 + 4080 - 1) = -3823

Thus it overflows and the resulting number is less than 4080, which makes

  3823 / 4080 = 0

an nr_pages is set to this. As we already checked against the minimum that
nr_pages may be, this causes the logic to fail as well, and we crash the
kernel.

There's no reason to have the two DIV_ROUND_UP() (that's just result of
historical code changes), clean up the code and fix this bug.

Fixes: 83f40318dab00 ("ring-buffer: Make removal of ring buffer pages atomic")
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoring-buffer: Use long for nr_pages to avoid overflow failures
Steven Rostedt (Red Hat) [Thu, 12 May 2016 15:01:24 +0000 (11:01 -0400)]
ring-buffer: Use long for nr_pages to avoid overflow failures

commit 9b94a8fba501f38368aef6ac1b30e7335252a220 upstream.

The size variable to change the ring buffer in ftrace is a long. The
nr_pages used to update the ring buffer based on the size is int. On 64 bit
machines this can cause an overflow problem.

For example, the following will cause the ring buffer to crash:

 # cd /sys/kernel/debug/tracing
 # echo 10 > buffer_size_kb
 # echo 8556384240 > buffer_size_kb

Then you get the warning of:

 WARNING: CPU: 1 PID: 318 at kernel/trace/ring_buffer.c:1527 rb_update_pages+0x22f/0x260

Which is:

  RB_WARN_ON(cpu_buffer, nr_removed);

Note each ring buffer page holds 4080 bytes.

This is because:

 1) 10 causes the ring buffer to have 3 pages.
    (10kb requires 3 * 4080 pages to hold)

 2) (2^31 / 2^10  + 1) * 4080 = 8556384240
    The value written into buffer_size_kb is shifted by 10 and then passed
    to ring_buffer_resize(). 8556384240 * 2^10 = 8761737461760

 3) The size passed to ring_buffer_resize() is then divided by BUF_PAGE_SIZE
    which is 4080. 8761737461760 / 4080 = 2147484672

 4) nr_pages is subtracted from the current nr_pages (3) and we get:
    2147484669. This value is saved in a signed integer nr_pages_to_update

 5) 2147484669 is greater than 2^31 but smaller than 2^32, a signed int
    turns into the value of -2147482627

 6) As the value is a negative number, in update_pages_handler() it is
    negated and passed to rb_remove_pages() and 2147482627 pages will
    be removed, which is much larger than 3 and it causes the warning
    because not all the pages asked to be removed were removed.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=118001
Fixes: 7a8e76a3829f1 ("tracing: unified trace buffer")
Reported-by: Hao Qin <QEver.cn@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoasix: Fix offset calculation in asix_rx_fixup() causing slow transmissions
John Stultz [Tue, 17 May 2016 03:36:15 +0000 (20:36 -0700)]
asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions

commit cd9e2e5d3ff148be9ea210f622ce3e8e8292fcd6 upstream.

In testing with HiKey, we found that since
commit 3f30b158eba5 ("asix: On RX avoid creating bad Ethernet
frames"),
we're seeing lots of noise during network transfers:

[  239.027993] asix 1-1.1:1.0 eth0: asix_rx_fixup() Data Header synchronisation was lost, remaining 988
[  239.037310] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x54ebb5ec, offset 4
[  239.045519] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0xcdffe7a2, offset 4
[  239.275044] asix 1-1.1:1.0 eth0: asix_rx_fixup() Data Header synchronisation was lost, remaining 988
[  239.284355] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x1d36f59d, offset 4
[  239.292541] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0xaef3c1e9, offset 4
[  239.518996] asix 1-1.1:1.0 eth0: asix_rx_fixup() Data Header synchronisation was lost, remaining 988
[  239.528300] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x2881912, offset 4
[  239.536413] asix 1-1.1:1.0 eth0: asix_rx_fixup() Bad Header Length 0x5638f7e2, offset 4

And network throughput ends up being pretty bursty and slow with
a overall throughput of at best ~30kB/s (where as previously we
got 1.1MB/s with the slower USB1.1 "full speed" host).

We found the issue also was reproducible on a x86_64 system,
using a "high-speed" USB2.0 port but the throughput did not
measurably drop (possibly due to the scp transfer being cpu
bound on my slow test hardware).

After lots of debugging, I found the check added in the
problematic commit seems to be calculating the offset
incorrectly.

In the normal case, in the main loop of the function, we do:
(where offset is zero, or set to "offset += (copy_length + 1) &
0xfffe" in the previous loop)
    rx->header = get_unaligned_le32(skb->data +
                                    offset);
    offset += sizeof(u32);

But the problematic patch calculates:
    offset = ((rx->remaining + 1) & 0xfffe) + sizeof(u32);
    rx->header = get_unaligned_le32(skb->data + offset);

Adding some debug logic to check those offset calculation used
to find rx->header, the one in problematic code is always too
large by sizeof(u32).

Thus, this patch removes the incorrect " + sizeof(u32)" addition
in the problematic calculation, and resolves the issue.

Cc: Dean Jenkins <Dean_Jenkins@mentor.com>
Cc: "David B. Robins" <linux@davidrobins.net>
Cc: Mark Craske <Mark_Craske@mentor.com>
Cc: Emil Goode <emilgoode@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: YongQin Liu <yongqin.liu@linaro.org>
Cc: Guodong Xu <guodong.xu@linaro.org>
Cc: Ivan Vecera <ivecera@redhat.com>
Cc: linux-usb@vger.kernel.org
Cc: netdev@vger.kernel.org
Reported-by: Yongqin Liu <yongqin.liu@linaro.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agofs/cifs: correctly to anonymous authentication for the NTLM(v2) authentication
Stefan Metzmacher [Tue, 3 May 2016 08:52:30 +0000 (10:52 +0200)]
fs/cifs: correctly to anonymous authentication for the NTLM(v2) authentication

commit 1a967d6c9b39c226be1b45f13acd4d8a5ab3dc44 upstream.

Only server which map unknown users to guest will allow
access using a non-null NTLMv2_Response.

For Samba it's the "map to guest = bad user" option.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11913

Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agofs/cifs: correctly to anonymous authentication for the NTLM(v1) authentication
Stefan Metzmacher [Tue, 3 May 2016 08:52:30 +0000 (10:52 +0200)]
fs/cifs: correctly to anonymous authentication for the NTLM(v1) authentication

commit 777f69b8d26bf35ade4a76b08f203c11e048365d upstream.

Only server which map unknown users to guest will allow
access using a non-null NTChallengeResponse.

For Samba it's the "map to guest = bad user" option.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11913

Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agofs/cifs: correctly to anonymous authentication for the LANMAN authentication
Stefan Metzmacher [Tue, 3 May 2016 08:52:30 +0000 (10:52 +0200)]
fs/cifs: correctly to anonymous authentication for the LANMAN authentication

commit fa8f3a354bb775ec586e4475bcb07f7dece97e0c upstream.

Only server which map unknown users to guest will allow
access using a non-null LMChallengeResponse.

For Samba it's the "map to guest = bad user" option.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11913

Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agofs/cifs: correctly to anonymous authentication via NTLMSSP
Stefan Metzmacher [Tue, 3 May 2016 08:52:30 +0000 (10:52 +0200)]
fs/cifs: correctly to anonymous authentication via NTLMSSP

commit cfda35d98298131bf38fbad3ce4cd5ecb3cf18db upstream.

See [MS-NLMP] 3.2.5.1.2 Server Receives an AUTHENTICATE_MESSAGE from the Client:

   ...
   Set NullSession to FALSE
   If (AUTHENTICATE_MESSAGE.UserNameLen == 0 AND
      AUTHENTICATE_MESSAGE.NtChallengeResponse.Length == 0 AND
      (AUTHENTICATE_MESSAGE.LmChallengeResponse == Z(1)
       OR
       AUTHENTICATE_MESSAGE.LmChallengeResponse.Length == 0))
       -- Special case: client requested anonymous authentication
       Set NullSession to TRUE
   ...

Only server which map unknown users to guest will allow
access using a non-null NTChallengeResponse.

For Samba it's the "map to guest = bad user" option.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11913

Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoremove directory incorrectly tries to set delete on close on non-empty directories
Steve French [Fri, 13 May 2016 02:20:36 +0000 (21:20 -0500)]
remove directory incorrectly tries to set delete on close on non-empty directories

commit 897fba1172d637d344f009d700f7eb8a1fa262f1 upstream.

Wrong return code was being returned on SMB3 rmdir of
non-empty directory.

For SMB3 (unlike for cifs), we attempt to delete a directory by
set of delete on close flag on the open. Windows clients set
this flag via a set info (SET_FILE_DISPOSITION to set this flag)
which properly checks if the directory is empty.

With this patch on smb3 mounts we correctly return
 "DIRECTORY NOT EMPTY"
on attempts to remove a non-empty directory.

Signed-off-by: Steve French <steve.french@primarydata.com>
Acked-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agokvm: arm64: Fix EC field in inject_abt64
Matt Evans [Mon, 16 May 2016 12:54:56 +0000 (13:54 +0100)]
kvm: arm64: Fix EC field in inject_abt64

commit e4fe9e7dc3828bf6a5714eb3c55aef6260d823a2 upstream.

The EC field of the constructed ESR is conditionally modified by ORing in
ESR_ELx_EC_DABT_LOW for a data abort.  However, ESR_ELx_EC_SHIFT is missing
from this condition.

Signed-off-by: Matt Evans <matt.evans@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoarm/arm64: KVM: Enforce Break-Before-Make on Stage-2 page tables
Marc Zyngier [Thu, 28 Apr 2016 15:16:31 +0000 (16:16 +0100)]
arm/arm64: KVM: Enforce Break-Before-Make on Stage-2 page tables

commit d4b9e0790aa764c0b01e18d4e8d33e93ba36d51f upstream.

The ARM architecture mandates that when changing a page table entry
from a valid entry to another valid entry, an invalid entry is first
written, TLB invalidated, and only then the new entry being written.

The current code doesn't respect this, directly writing the new
entry and only then invalidating TLBs. Let's fix it up.

Reported-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoarm64: cpuinfo: Missing NULL terminator in compat_hwcap_str
Julien Grall [Tue, 10 May 2016 14:40:31 +0000 (15:40 +0100)]
arm64: cpuinfo: Missing NULL terminator in compat_hwcap_str

commit f228b494e56d949be8d8ea09d4f973d1979201bf upstream.

The loop that browses the array compat_hwcap_str will stop when a NULL
is encountered, however NULL is missing at the end of array. This will
lead to overrun until a NULL is found somewhere in the following memory.
In reality, this works out because the compat_hwcap2_str array tends to
follow immediately in memory, and that *is* terminated correctly.
Furthermore, the unsigned int compat_elf_hwcap is checked before
printing each capability, so we end up doing the right thing because
the size of the two arrays is less than 32. Still, this is an obvious
mistake and should be fixed.

Note for backporting: commit 12d11817eaafa414 ("arm64: Move
/proc/cpuinfo handling code") moved this code in v4.4. Prior to that
commit, the same change should be made in arch/arm64/kernel/setup.c.

Fixes: 44b82b7700d0 "arm64: Fix up /proc/cpuinfo"
Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoarm64: Implement pmdp_set_access_flags() for hardware AF/DBM
Catalin Marinas [Thu, 5 May 2016 09:44:00 +0000 (10:44 +0100)]
arm64: Implement pmdp_set_access_flags() for hardware AF/DBM

commit 282aa7051b0169991b34716f0f22d9c2f59c46c4 upstream.

The update to the accessed or dirty states for block mappings must be
done atomically on hardware with support for automatic AF/DBM. The
ptep_set_access_flags() function has been fixed as part of commit
66dbd6e61a52 ("arm64: Implement ptep_set_access_flags() for hardware
AF/DBM"). This patch brings pmdp_set_access_flags() in line with the pte
counterpart.

Fixes: 2f4b829c625e ("arm64: Add support for hardware updates of the access and dirty pte bits")
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoarm64: Implement ptep_set_access_flags() for hardware AF/DBM
Catalin Marinas [Wed, 13 Apr 2016 15:01:22 +0000 (16:01 +0100)]
arm64: Implement ptep_set_access_flags() for hardware AF/DBM

commit 66dbd6e61a526ae7d11a208238ae2c17e5cacb6b upstream.

When hardware updates of the access and dirty states are enabled, the
default ptep_set_access_flags() implementation based on calling
set_pte_at() directly is potentially racy. This triggers the "racy dirty
state clearing" warning in set_pte_at() because an existing writable PTE
is overridden with a clean entry.

There are two main scenarios for this situation:

1. The CPU getting an access fault does not support hardware updates of
   the access/dirty flags. However, a different agent in the system
   (e.g. SMMU) can do this, therefore overriding a writable entry with a
   clean one could potentially lose the automatically updated dirty
   status

2. A more complex situation is possible when all CPUs support hardware
   AF/DBM:

   a) Initial state: shareable + writable vma and pte_none(pte)
   b) Read fault taken by two threads of the same process on different
      CPUs
   c) CPU0 takes the mmap_sem and proceeds to handling the fault. It
      eventually reaches do_set_pte() which sets a writable + clean pte.
      CPU0 releases the mmap_sem
   d) CPU1 acquires the mmap_sem and proceeds to handle_pte_fault(). The
      pte entry it reads is present, writable and clean and it continues
      to pte_mkyoung()
   e) CPU1 calls ptep_set_access_flags()

   If between (d) and (e) the hardware (another CPU) updates the dirty
   state (clears PTE_RDONLY), CPU1 will override the PTR_RDONLY bit
   marking the entry clean again.

This patch implements an arm64-specific ptep_set_access_flags() function
to perform an atomic update of the PTE flags.

Fixes: 2f4b829c625e ("arm64: Add support for hardware updates of the access and dirty pte bits")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Ming Lei <tom.leiming@gmail.com>
Tested-by: Julien Grall <julien.grall@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
[will: reworded comment]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoarm64: Ensure pmd_present() returns false after pmd_mknotpresent()
Catalin Marinas [Thu, 5 May 2016 09:44:02 +0000 (10:44 +0100)]
arm64: Ensure pmd_present() returns false after pmd_mknotpresent()

commit 5bb1cc0ff9a6b68871970737e6c4c16919928d8b upstream.

Currently, pmd_present() only checks for a non-zero value, returning
true even after pmd_mknotpresent() (which only clears the type bits).
This patch converts pmd_present() to using pte_present(), similar to the
other pmd_*() checks. As a side effect, it will return true for
PROT_NONE mappings, though they are not yet used by the kernel with
transparent huge pages.

For consistency, also change pmd_mknotpresent() to only clear the
PMD_SECT_VALID bit, even though the PMD_TABLE_BIT is already 0 for block
mappings (no functional change). The unused PMD_SECT_PROT_NONE
definition is removed as transparent huge pages use the pte page prot
values.

Fixes: 9c7e535fcc17 ("arm64: mm: Route pmd thp functions through pte equivalents")
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoarm64: Fix typo in the pmdp_huge_get_and_clear() definition
Catalin Marinas [Thu, 5 May 2016 09:43:59 +0000 (10:43 +0100)]
arm64: Fix typo in the pmdp_huge_get_and_clear() definition

commit 911f56eeb87ee378f5e215469268a7a2f68a5a8a upstream.

With hardware AF/DBM support, pmd modifications (transparent huge pages)
should be performed atomically using load/store exclusive. The initial
patches defined the get-and-clear function and __HAVE_ARCH_* macro
without the "huge" word, leaving the pmdp_huge_get_and_clear() to the
default, non-atomic implementation.

Fixes: 2f4b829c625e ("arm64: Add support for hardware updates of the access and dirty pte bits")
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoperf/core: Fix perf_event_open() vs. execve() race
Peter Zijlstra [Tue, 26 Apr 2016 09:36:53 +0000 (11:36 +0200)]
perf/core: Fix perf_event_open() vs. execve() race

commit 79c9ce57eb2d5f1497546a3946b4ae21b6fdc438 upstream.

Jann reported that the ptrace_may_access() check in
find_lively_task_by_vpid() is racy against exec().

Specifically:

  perf_event_open() execve()

  ptrace_may_access()
commit_creds()
  ... if (get_dumpable() != SUID_DUMP_USER)
  perf_event_exit_task();
  perf_install_in_context()

would result in installing a counter across the creds boundary.

Fix this by wrapping lots of perf_event_open() in cred_guard_mutex.
This should be fine as perf_event_exit_task() is already called with
cred_guard_mutex held, so all perf locks already nest inside it.

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoperf/x86/intel/pt: Generate PMI in the STOP region as well
Alexander Shishkin [Tue, 10 May 2016 13:18:32 +0000 (16:18 +0300)]
perf/x86/intel/pt: Generate PMI in the STOP region as well

commit ab92b232ae05c382c3df0e3d6a5c6d16b639ac8c upstream.

Currently, the PT driver always sets the PMI bit one region (page) before
the STOP region so that we can wake up the consumer before we run out of
room in the buffer and have to disable the event. However, we also need
an interrupt in the last output region, so that we actually get to disable
the event (if no more room from new data is available at that point),
otherwise hardware just quietly refuses to start, but the event is
scheduled in and we end up losing trace data till the event gets removed.

For a cpu-wide event it is even worse since there may not be any
re-scheduling at all and no chance for the ring buffer code to notice
that its buffer is filled up and the event needs to be disabled (so that
the consumer can re-enable it when it finishes reading the data out). In
other words, all the trace data will be lost after the buffer gets filled
up.

This patch makes PT also generate a PMI when the last output region is
full.

Reported-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/1462886313-13660-2-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoLinux 4.5.5 v4.5.5
Greg Kroah-Hartman [Thu, 19 May 2016 01:35:34 +0000 (18:35 -0700)]
Linux 4.5.5

7 years agonf_conntrack: avoid kernel pointer value leak in slab name
Linus Torvalds [Sat, 14 May 2016 18:11:44 +0000 (11:11 -0700)]
nf_conntrack: avoid kernel pointer value leak in slab name

commit 31b0b385f69d8d5491a4bca288e25e63f1d945d0 upstream.

The slab name ends up being visible in the directory structure under
/sys, and even if you don't have access rights to the file you can see
the filenames.

Just use a 64-bit counter instead of the pointer to the 'net' structure
to generate a unique name.

This code will go away in 4.7 when the conntrack code moves to a single
kmemcache, but this is the backportable simple solution to avoiding
leaking kernel pointers to user space.

Fixes: 5b3501faa874 ("netfilter: nf_conntrack: per netns nf_conntrack_cachep")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agobtrfs: Reset IO error counters before start of device replacing
Yauhen Kharuzhy [Tue, 29 Mar 2016 21:17:48 +0000 (14:17 -0700)]
btrfs: Reset IO error counters before start of device replacing

commit 7ccefb98ce3e5c4493cd213cd03714b7149cf0cb upstream.

If device replace entry was found on disk at mounting and its num_write_errors
stats counter has non-NULL value, then replace operation will never be
finished and -EIO error will be reported by btrfs_scrub_dev() because
this counter is never reset.

 # mount -o degraded /media/a4fb5c0a-21c5-4fe7-8d0e-fdd87d5f71ee/
 # btrfs replace status /media/a4fb5c0a-21c5-4fe7-8d0e-fdd87d5f71ee/
 Started on 25.Mar 07:28:00, canceled on 25.Mar 07:28:01 at 0.0%, 40 write errs, 0 uncorr. read errs
 # btrfs replace start -B 4 /dev/sdg /media/a4fb5c0a-21c5-4fe7-8d0e-fdd87d5f71ee/
 ERROR: ioctl(DEV_REPLACE_START) failed on "/media/a4fb5c0a-21c5-4fe7-8d0e-fdd87d5f71ee/": Input/output error, no error

Reset num_write_errors and num_uncorrectable_read_errors counters in the
dev_replace structure before start of replacing.

Signed-off-by: Yauhen Kharuzhy <yauhen.kharuzhy@zavadatar.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoBtrfs: don't use src fd for printk
Josef Bacik [Fri, 25 Mar 2016 14:02:41 +0000 (10:02 -0400)]
Btrfs: don't use src fd for printk

commit c79b4713304f812d3d6c95826fc3e5fc2c0b0c14 upstream.

The fd we pass in may not be on a btrfs file system, so don't try to do
BTRFS_I() on it.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agobtrfs: fallback to vmalloc in btrfs_compare_tree
David Sterba [Wed, 30 Mar 2016 14:01:12 +0000 (16:01 +0200)]
btrfs: fallback to vmalloc in btrfs_compare_tree

commit 8f282f71eaee7ac979cdbe525f76daa0722798a8 upstream.

The allocation of node could fail if the memory is too fragmented for a
given node size, practically observed with 64k.

http://article.gmane.org/gmane.comp.file-systems.btrfs/54689

Reported-and-tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agobtrfs: handle non-fatal errors in btrfs_qgroup_inherit()
Mark Fasheh [Thu, 31 Mar 2016 00:57:48 +0000 (17:57 -0700)]
btrfs: handle non-fatal errors in btrfs_qgroup_inherit()

commit 918c2ee103cf9956f1c61d3f848dbb49fd2d104a upstream.

create_pending_snapshot() will go readonly on _any_ error return from
btrfs_qgroup_inherit(). If qgroups are enabled, a user can crash their fs by
just making a snapshot and asking it to inherit from an invalid qgroup. For
example:

$ btrfs sub snap -i 1/10 /btrfs/ /btrfs/foo

Will cause a transaction abort.

Fix this by only throwing errors in btrfs_qgroup_inherit() when we know
going readonly is acceptable.

The following xfstests test case reproduces this bug:

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"

  here=`pwd`
  tmp=/tmp/$$
  status=1 # failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
   cd /
   rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter

  # remove previous $seqres.full before test
  rm -f $seqres.full

  # real QA test starts here
  _supported_fs btrfs
  _supported_os Linux
  _require_scratch

  rm -f $seqres.full

  _scratch_mkfs
  _scratch_mount
  _run_btrfs_util_prog quota enable $SCRATCH_MNT
  # The qgroup '1/10' does not exist and should be silently ignored
  _run_btrfs_util_prog subvolume snapshot -i 1/10 $SCRATCH_MNT $SCRATCH_MNT/snap1

  _scratch_unmount

  echo "Silence is golden"

  status=0
  exit

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoBtrfs: fix invalid reference in replace_path
Liu Bo [Mon, 21 Mar 2016 21:59:53 +0000 (14:59 -0700)]
Btrfs: fix invalid reference in replace_path

commit 264813acb1c756aebc337b16b832604a0c9aadaf upstream.

Dan Carpenter's static checker has found this error, it's introduced by
commit 64c043de466d
("Btrfs: fix up read_tree_block to return proper error")

It's really supposed to 'break' the loop on error like others.

Cc: Dan Carpenter <dan.carpenter@oracle.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agobtrfs: do not write corrupted metadata blocks to disk
Alex Lyakas [Thu, 10 Mar 2016 11:10:15 +0000 (13:10 +0200)]
btrfs: do not write corrupted metadata blocks to disk

commit 0f805531daa2ebfb5706422dc2ead1cff9e53e65 upstream.

csum_dirty_buffer was issuing a warning in case the extent buffer
did not look alright, but was still returning success.
Let's return error in this case, and also add an additional sanity
check on the extent buffer header.
The caller up the chain may BUG_ON on this, for example flush_epd_write_bio will,
but it is better than to have a silent metadata corruption on disk.

Signed-off-by: Alex Lyakas <alex@zadarastorage.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agobtrfs: csum_tree_block: return proper errno value
Alex Lyakas [Thu, 10 Mar 2016 11:09:46 +0000 (13:09 +0200)]
btrfs: csum_tree_block: return proper errno value

commit 8bd98f0e6bf792e8fa7c3fed709321ad42ba8d2e upstream.

Signed-off-by: Alex Lyakas <alex@zadarastorage.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoBtrfs: do not collect ordered extents when logging that inode exists
Filipe Manana [Thu, 25 Feb 2016 23:19:38 +0000 (23:19 +0000)]
Btrfs: do not collect ordered extents when logging that inode exists

commit 5e33a2bd7ca7fa687fb0965869196eea6815d1f3 upstream.

When logging that an inode exists, for example as part of a directory
fsync operation, we were collecting any ordered extents for the inode but
we ended up doing nothing with them except tagging them as processed, by
setting the flag BTRFS_ORDERED_LOGGED on them, which prevented a
subsequent fsync of that inode (using the LOG_INODE_ALL mode) from
collecting and processing them. This created a time window where a second
fsync against the inode, using the fast path, ended up not logging the
checksums for the new extents but it logged the extents since they were
part of the list of modified extents. This happened because the ordered
extents were not collected and checksums were not yet added to the csum
tree - the ordered extents have not gone through btrfs_finish_ordered_io()
yet (which is where we add them to the csum tree by calling
inode.c:add_pending_csums()).

So fix this by not collecting an inode's ordered extents if we are logging
it with the LOG_INODE_EXISTS mode.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoBtrfs: fix race when checking if we can skip fsync'ing an inode
Filipe Manana [Wed, 24 Feb 2016 07:35:05 +0000 (07:35 +0000)]
Btrfs: fix race when checking if we can skip fsync'ing an inode

commit affc0ff902d539ebe9bba405d330410314f46e9f upstream.

If we're about to do a fast fsync for an inode and btrfs_inode_in_log()
returns false, it's possible that we had an ordered extent in progress
(btrfs_finish_ordered_io() not run yet) when we noticed that the inode's
last_trans field was not greater than the id of the last committed
transaction, but shortly after, before we checked if there were any
ongoing ordered extents, the ordered extent had just completed and
removed itself from the inode's ordered tree, in which case we end up not
logging the inode, losing some data if a power failure or crash happens
after the fsync handler returns and before the transaction is committed.

Fix this by checking first if there are any ongoing ordered extents
before comparing the inode's last_trans with the id of the last committed
transaction - when it completes, an ordered extent always updates the
inode's last_trans before it removes itself from the inode's ordered
tree (at btrfs_finish_ordered_io()).

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoBtrfs: fix deadlock between direct IO reads and buffered writes
Filipe Manana [Thu, 18 Feb 2016 14:28:55 +0000 (14:28 +0000)]
Btrfs: fix deadlock between direct IO reads and buffered writes

commit ade770294df29e08f913e5d733a756893128f45e upstream.

While running a test with a mix of buffered IO and direct IO against
the same files I hit a deadlock reported by the following trace:

[11642.140352] INFO: task kworker/u32:3:15282 blocked for more than 120 seconds.
[11642.142452]       Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.143982] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.146332] kworker/u32:3   D ffff880230ef7988 [11642.147737] systemd-journald[571]: Sent WATCHDOG=1 notification.
[11642.149771]     0 15282      2 0x00000000
[11642.151205] Workqueue: btrfs-flush_delalloc btrfs_flush_delalloc_helper [btrfs]
[11642.154074]  ffff880230ef7988 0000000000000246 0000000000014ec0 ffff88023ec94ec0
[11642.156722]  ffff880233fe8f80 ffff880230ef8000 ffff88023ec94ec0 7fffffffffffffff
[11642.159205]  0000000000000002 ffffffff8147b7f9 ffff880230ef79a0 ffffffff8147b541
[11642.161403] Call Trace:
[11642.162129]  [<ffffffff8147b7f9>] ? bit_wait+0x2f/0x2f
[11642.163396]  [<ffffffff8147b541>] schedule+0x82/0x9a
[11642.164871]  [<ffffffff8147e7fe>] schedule_timeout+0x43/0x109
[11642.167020]  [<ffffffff8147b7f9>] ? bit_wait+0x2f/0x2f
[11642.167931]  [<ffffffff8108afd1>] ? trace_hardirqs_on_caller+0x17b/0x197
[11642.182320]  [<ffffffff8108affa>] ? trace_hardirqs_on+0xd/0xf
[11642.183762]  [<ffffffff810b079b>] ? timekeeping_get_ns+0xe/0x33
[11642.185308]  [<ffffffff810b0f61>] ? ktime_get+0x41/0x52
[11642.186782]  [<ffffffff8147ac08>] io_schedule_timeout+0xa0/0x102
[11642.188217]  [<ffffffff8147ac08>] ? io_schedule_timeout+0xa0/0x102
[11642.189626]  [<ffffffff8147b814>] bit_wait_io+0x1b/0x39
[11642.190803]  [<ffffffff8147bb21>] __wait_on_bit_lock+0x4c/0x90
[11642.192158]  [<ffffffff8111829f>] __lock_page+0x66/0x68
[11642.193379]  [<ffffffff81082f29>] ? autoremove_wake_function+0x3a/0x3a
[11642.194831]  [<ffffffffa0450ddd>] lock_page+0x31/0x34 [btrfs]
[11642.197068]  [<ffffffffa0454e3b>] extent_write_cache_pages.isra.19.constprop.35+0x1af/0x2f4 [btrfs]
[11642.199188]  [<ffffffffa0455373>] extent_writepages+0x4b/0x5c [btrfs]
[11642.200723]  [<ffffffffa043c913>] ? btrfs_writepage_start_hook+0xce/0xce [btrfs]
[11642.202465]  [<ffffffffa043aa82>] btrfs_writepages+0x28/0x2a [btrfs]
[11642.203836]  [<ffffffff811236bc>] do_writepages+0x23/0x2c
[11642.205624]  [<ffffffff811198c9>] __filemap_fdatawrite_range+0x5a/0x61
[11642.207057]  [<ffffffff81119946>] filemap_fdatawrite_range+0x13/0x15
[11642.208529]  [<ffffffffa044f87e>] btrfs_start_ordered_extent+0xd0/0x1a1 [btrfs]
[11642.210375]  [<ffffffffa0462613>] ? btrfs_scrubparity_helper+0x140/0x33a [btrfs]
[11642.212132]  [<ffffffffa044f974>] btrfs_run_ordered_extent_work+0x25/0x34 [btrfs]
[11642.213837]  [<ffffffffa046262f>] btrfs_scrubparity_helper+0x15c/0x33a [btrfs]
[11642.215457]  [<ffffffffa046293b>] btrfs_flush_delalloc_helper+0xe/0x10 [btrfs]
[11642.217095]  [<ffffffff8106483e>] process_one_work+0x256/0x48b
[11642.218324]  [<ffffffff81064f20>] worker_thread+0x1f5/0x2a7
[11642.219466]  [<ffffffff81064d2b>] ? rescuer_thread+0x289/0x289
[11642.220801]  [<ffffffff8106a500>] kthread+0xd4/0xdc
[11642.222032]  [<ffffffff8106a42c>] ? kthread_parkme+0x24/0x24
[11642.223190]  [<ffffffff8147fdef>] ret_from_fork+0x3f/0x70
[11642.224394]  [<ffffffff8106a42c>] ? kthread_parkme+0x24/0x24
[11642.226295] 2 locks held by kworker/u32:3/15282:
[11642.227273]  #0:  ("%s-%s""btrfs", name){++++.+}, at: [<ffffffff8106474d>] process_one_work+0x165/0x48b
[11642.229412]  #1:  ((&work->normal_work)){+.+.+.}, at: [<ffffffff8106474d>] process_one_work+0x165/0x48b
[11642.231414] INFO: task kworker/u32:8:15289 blocked for more than 120 seconds.
[11642.232872]       Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.234109] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.235776] kworker/u32:8   D ffff88020de5f848     0 15289      2 0x00000000
[11642.237412] Workqueue: writeback wb_workfn (flush-btrfs-481)
[11642.238670]  ffff88020de5f848 0000000000000246 0000000000014ec0 ffff88023ed54ec0
[11642.240475]  ffff88021b1ece40 ffff88020de60000 ffff88023ed54ec0 7fffffffffffffff
[11642.242154]  0000000000000002 ffffffff8147b7f9 ffff88020de5f860 ffffffff8147b541
[11642.243715] Call Trace:
[11642.244390]  [<ffffffff8147b7f9>] ? bit_wait+0x2f/0x2f
[11642.245432]  [<ffffffff8147b541>] schedule+0x82/0x9a
[11642.246392]  [<ffffffff8147e7fe>] schedule_timeout+0x43/0x109
[11642.247479]  [<ffffffff8147b7f9>] ? bit_wait+0x2f/0x2f
[11642.248551]  [<ffffffff8108afd1>] ? trace_hardirqs_on_caller+0x17b/0x197
[11642.249968]  [<ffffffff8108affa>] ? trace_hardirqs_on+0xd/0xf
[11642.251043]  [<ffffffff810b079b>] ? timekeeping_get_ns+0xe/0x33
[11642.252202]  [<ffffffff810b0f61>] ? ktime_get+0x41/0x52
[11642.253210]  [<ffffffff8147ac08>] io_schedule_timeout+0xa0/0x102
[11642.254307]  [<ffffffff8147ac08>] ? io_schedule_timeout+0xa0/0x102
[11642.256118]  [<ffffffff8147b814>] bit_wait_io+0x1b/0x39
[11642.257131]  [<ffffffff8147bb21>] __wait_on_bit_lock+0x4c/0x90
[11642.258200]  [<ffffffff8111829f>] __lock_page+0x66/0x68
[11642.259168]  [<ffffffff81082f29>] ? autoremove_wake_function+0x3a/0x3a
[11642.260516]  [<ffffffffa0450ddd>] lock_page+0x31/0x34 [btrfs]
[11642.261841]  [<ffffffffa0454e3b>] extent_write_cache_pages.isra.19.constprop.35+0x1af/0x2f4 [btrfs]
[11642.263531]  [<ffffffffa0455373>] extent_writepages+0x4b/0x5c [btrfs]
[11642.264747]  [<ffffffffa043c913>] ? btrfs_writepage_start_hook+0xce/0xce [btrfs]
[11642.266148]  [<ffffffffa043aa82>] btrfs_writepages+0x28/0x2a [btrfs]
[11642.267264]  [<ffffffff811236bc>] do_writepages+0x23/0x2c
[11642.268280]  [<ffffffff81192a2b>] __writeback_single_inode+0xda/0x5ba
[11642.269407]  [<ffffffff811939f0>] writeback_sb_inodes+0x27b/0x43d
[11642.270476]  [<ffffffff81193c28>] __writeback_inodes_wb+0x76/0xae
[11642.271547]  [<ffffffff81193ea6>] wb_writeback+0x19e/0x41c
[11642.272588]  [<ffffffff81194821>] wb_workfn+0x201/0x341
[11642.273523]  [<ffffffff81194821>] ? wb_workfn+0x201/0x341
[11642.274479]  [<ffffffff8106483e>] process_one_work+0x256/0x48b
[11642.275497]  [<ffffffff81064f20>] worker_thread+0x1f5/0x2a7
[11642.276518]  [<ffffffff81064d2b>] ? rescuer_thread+0x289/0x289
[11642.277520]  [<ffffffff81064d2b>] ? rescuer_thread+0x289/0x289
[11642.278517]  [<ffffffff8106a500>] kthread+0xd4/0xdc
[11642.279371]  [<ffffffff8106a42c>] ? kthread_parkme+0x24/0x24
[11642.280468]  [<ffffffff8147fdef>] ret_from_fork+0x3f/0x70
[11642.281607]  [<ffffffff8106a42c>] ? kthread_parkme+0x24/0x24
[11642.282604] 3 locks held by kworker/u32:8/15289:
[11642.283423]  #0:  ("writeback"){++++.+}, at: [<ffffffff8106474d>] process_one_work+0x165/0x48b
[11642.285629]  #1:  ((&(&wb->dwork)->work)){+.+.+.}, at: [<ffffffff8106474d>] process_one_work+0x165/0x48b
[11642.287538]  #2:  (&type->s_umount_key#37){+++++.}, at: [<ffffffff81171217>] trylock_super+0x1b/0x4b
[11642.289423] INFO: task fdm-stress:26848 blocked for more than 120 seconds.
[11642.290547]       Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.291453] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.292864] fdm-stress      D ffff88022c107c20     0 26848  26591 0x00000000
[11642.294118]  ffff88022c107c20 000000038108affa 0000000000014ec0 ffff88023ed54ec0
[11642.295602]  ffff88013ab1ca40 ffff88022c108000 ffff8800b2fc19d0 00000000000e0fff
[11642.297098]  ffff8800b2fc19b0 ffff88022c107c88 ffff88022c107c38 ffffffff8147b541
[11642.298433] Call Trace:
[11642.298896]  [<ffffffff8147b541>] schedule+0x82/0x9a
[11642.299738]  [<ffffffffa045225d>] lock_extent_bits+0xfe/0x1a3 [btrfs]
[11642.300833]  [<ffffffff81082eef>] ? add_wait_queue_exclusive+0x44/0x44
[11642.301943]  [<ffffffffa0447516>] lock_and_cleanup_extent_if_need+0x68/0x18e [btrfs]
[11642.303270]  [<ffffffffa04485ba>] __btrfs_buffered_write+0x238/0x4c1 [btrfs]
[11642.304552]  [<ffffffffa044b50a>] ? btrfs_file_write_iter+0x17c/0x408 [btrfs]
[11642.305782]  [<ffffffffa044b682>] btrfs_file_write_iter+0x2f4/0x408 [btrfs]
[11642.306878]  [<ffffffff8116e298>] __vfs_write+0x7c/0xa5
[11642.307729]  [<ffffffff8116e7d1>] vfs_write+0x9d/0xe8
[11642.308602]  [<ffffffff8116efbb>] SyS_write+0x50/0x7e
[11642.309410]  [<ffffffff8147fa97>] entry_SYSCALL_64_fastpath+0x12/0x6b
[11642.310403] 3 locks held by fdm-stress/26848:
[11642.311108]  #0:  (&f->f_pos_lock){+.+.+.}, at: [<ffffffff811877e8>] __fdget_pos+0x3a/0x40
[11642.312578]  #1:  (sb_writers#11){.+.+.+}, at: [<ffffffff811706ee>] __sb_start_write+0x5f/0xb0
[11642.314170]  #2:  (&sb->s_type->i_mutex_key#15){+.+.+.}, at: [<ffffffffa044b401>] btrfs_file_write_iter+0x73/0x408 [btrfs]
[11642.316796] INFO: task fdm-stress:26849 blocked for more than 120 seconds.
[11642.317842]       Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.318691] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.319959] fdm-stress      D ffff8801964ffa68     0 26849  26591 0x00000000
[11642.321312]  ffff8801964ffa68 00ff8801e9975f80 0000000000014ec0 ffff88023ed94ec0
[11642.322555]  ffff8800b00b4840 ffff880196500000 ffff8801e9975f20 0000000000000002
[11642.323715]  ffff8801e9975f18 ffff8800b00b4840 ffff8801964ffa80 ffffffff8147b541
[11642.325096] Call Trace:
[11642.325532]  [<ffffffff8147b541>] schedule+0x82/0x9a
[11642.326303]  [<ffffffff8147e7fe>] schedule_timeout+0x43/0x109
[11642.327180]  [<ffffffff8108ae40>] ? mark_held_locks+0x5e/0x74
[11642.328114]  [<ffffffff8147f30e>] ? _raw_spin_unlock_irq+0x2c/0x4a
[11642.329051]  [<ffffffff8108afd1>] ? trace_hardirqs_on_caller+0x17b/0x197
[11642.330053]  [<ffffffff8147bceb>] __wait_for_common+0x109/0x147
[11642.330952]  [<ffffffff8147bceb>] ? __wait_for_common+0x109/0x147
[11642.331869]  [<ffffffff8147e7bb>] ? usleep_range+0x4a/0x4a
[11642.332925]  [<ffffffff81074075>] ? wake_up_q+0x47/0x47
[11642.333736]  [<ffffffff8147bd4d>] wait_for_completion+0x24/0x26
[11642.334672]  [<ffffffffa044f5ce>] btrfs_wait_ordered_extents+0x1c8/0x217 [btrfs]
[11642.335858]  [<ffffffffa0465b5a>] btrfs_mksubvol+0x224/0x45d [btrfs]
[11642.336854]  [<ffffffff81082eef>] ? add_wait_queue_exclusive+0x44/0x44
[11642.337820]  [<ffffffffa0465edb>] btrfs_ioctl_snap_create_transid+0x148/0x17a [btrfs]
[11642.339026]  [<ffffffffa046603b>] btrfs_ioctl_snap_create_v2+0xc7/0x110 [btrfs]
[11642.340214]  [<ffffffffa0468582>] btrfs_ioctl+0x590/0x27bd [btrfs]
[11642.341123]  [<ffffffff8147dc00>] ? mutex_unlock+0xe/0x10
[11642.341934]  [<ffffffffa00fa6e9>] ? ext4_file_write_iter+0x2a3/0x36f [ext4]
[11642.342936]  [<ffffffff8108895d>] ? __lock_is_held+0x3c/0x57
[11642.343772]  [<ffffffff81186a1d>] ? rcu_read_unlock+0x3e/0x5d
[11642.344673]  [<ffffffff8117dc95>] do_vfs_ioctl+0x458/0x4dc
[11642.346024]  [<ffffffff81186bbe>] ? __fget_light+0x62/0x71
[11642.346873]  [<ffffffff8117dd70>] SyS_ioctl+0x57/0x79
[11642.347720]  [<ffffffff8147fa97>] entry_SYSCALL_64_fastpath+0x12/0x6b
[11642.350222] 4 locks held by fdm-stress/26849:
[11642.350898]  #0:  (sb_writers#11){.+.+.+}, at: [<ffffffff811706ee>] __sb_start_write+0x5f/0xb0
[11642.352375]  #1:  (&type->i_mutex_dir_key#4/1){+.+.+.}, at: [<ffffffffa0465981>] btrfs_mksubvol+0x4b/0x45d [btrfs]
[11642.354072]  #2:  (&fs_info->subvol_sem){++++..}, at: [<ffffffffa0465a2a>] btrfs_mksubvol+0xf4/0x45d [btrfs]
[11642.355647]  #3:  (&root->ordered_extent_mutex){+.+...}, at: [<ffffffffa044f456>] btrfs_wait_ordered_extents+0x50/0x217 [btrfs]
[11642.357516] INFO: task fdm-stress:26850 blocked for more than 120 seconds.
[11642.358508]       Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.359376] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.368625] fdm-stress      D ffff88021f167688     0 26850  26591 0x00000000
[11642.369716]  ffff88021f167688 0000000000000001 0000000000014ec0 ffff88023edd4ec0
[11642.370950]  ffff880128a98680 ffff88021f168000 ffff88023edd4ec0 7fffffffffffffff
[11642.372210]  0000000000000002 ffffffff8147b7f9 ffff88021f1676a0 ffffffff8147b541
[11642.373430] Call Trace:
[11642.373853]  [<ffffffff8147b7f9>] ? bit_wait+0x2f/0x2f
[11642.374623]  [<ffffffff8147b541>] schedule+0x82/0x9a
[11642.375948]  [<ffffffff8147e7fe>] schedule_timeout+0x43/0x109
[11642.376862]  [<ffffffff8147b7f9>] ? bit_wait+0x2f/0x2f
[11642.377637]  [<ffffffff8108afd1>] ? trace_hardirqs_on_caller+0x17b/0x197
[11642.378610]  [<ffffffff8108affa>] ? trace_hardirqs_on+0xd/0xf
[11642.379457]  [<ffffffff810b079b>] ? timekeeping_get_ns+0xe/0x33
[11642.380366]  [<ffffffff810b0f61>] ? ktime_get+0x41/0x52
[11642.381353]  [<ffffffff8147ac08>] io_schedule_timeout+0xa0/0x102
[11642.382255]  [<ffffffff8147ac08>] ? io_schedule_timeout+0xa0/0x102
[11642.383162]  [<ffffffff8147b814>] bit_wait_io+0x1b/0x39
[11642.383945]  [<ffffffff8147bb21>] __wait_on_bit_lock+0x4c/0x90
[11642.384875]  [<ffffffff8111829f>] __lock_page+0x66/0x68
[11642.385749]  [<ffffffff81082f29>] ? autoremove_wake_function+0x3a/0x3a
[11642.386721]  [<ffffffffa0450ddd>] lock_page+0x31/0x34 [btrfs]
[11642.387596]  [<ffffffffa0454e3b>] extent_write_cache_pages.isra.19.constprop.35+0x1af/0x2f4 [btrfs]
[11642.389030]  [<ffffffffa0455373>] extent_writepages+0x4b/0x5c [btrfs]
[11642.389973]  [<ffffffff810a25ad>] ? rcu_read_lock_sched_held+0x61/0x69
[11642.390939]  [<ffffffffa043c913>] ? btrfs_writepage_start_hook+0xce/0xce [btrfs]
[11642.392271]  [<ffffffffa0451c32>] ? __clear_extent_bit+0x26e/0x2c0 [btrfs]
[11642.393305]  [<ffffffffa043aa82>] btrfs_writepages+0x28/0x2a [btrfs]
[11642.394239]  [<ffffffff811236bc>] do_writepages+0x23/0x2c
[11642.395045]  [<ffffffff811198c9>] __filemap_fdatawrite_range+0x5a/0x61
[11642.395991]  [<ffffffff81119946>] filemap_fdatawrite_range+0x13/0x15
[11642.397144]  [<ffffffffa044f87e>] btrfs_start_ordered_extent+0xd0/0x1a1 [btrfs]
[11642.398392]  [<ffffffffa0452094>] ? clear_extent_bit+0x17/0x19 [btrfs]
[11642.399363]  [<ffffffffa0445945>] btrfs_get_blocks_direct+0x12b/0x61c [btrfs]
[11642.400445]  [<ffffffff8119f7a1>] ? dio_bio_add_page+0x3d/0x54
[11642.401309]  [<ffffffff8119fa93>] ? submit_page_section+0x7b/0x111
[11642.402213]  [<ffffffff811a0258>] do_blockdev_direct_IO+0x685/0xc24
[11642.403139]  [<ffffffffa044581a>] ? btrfs_page_exists_in_range+0x1a1/0x1a1 [btrfs]
[11642.404360]  [<ffffffffa043d267>] ? btrfs_get_extent_fiemap+0x1c0/0x1c0 [btrfs]
[11642.406187]  [<ffffffff811a0828>] __blockdev_direct_IO+0x31/0x33
[11642.407070]  [<ffffffff811a0828>] ? __blockdev_direct_IO+0x31/0x33
[11642.407990]  [<ffffffffa043d267>] ? btrfs_get_extent_fiemap+0x1c0/0x1c0 [btrfs]
[11642.409192]  [<ffffffffa043b4ca>] btrfs_direct_IO+0x1c7/0x27e [btrfs]
[11642.410146]  [<ffffffffa043d267>] ? btrfs_get_extent_fiemap+0x1c0/0x1c0 [btrfs]
[11642.411291]  [<ffffffff81119a2c>] generic_file_read_iter+0x89/0x4e1
[11642.412263]  [<ffffffff8108ac05>] ? mark_lock+0x24/0x201
[11642.413057]  [<ffffffff8116e1f8>] __vfs_read+0x79/0x9d
[11642.413897]  [<ffffffff8116e6f1>] vfs_read+0x8f/0xd2
[11642.414708]  [<ffffffff8116ef3d>] SyS_read+0x50/0x7e
[11642.415573]  [<ffffffff8147fa97>] entry_SYSCALL_64_fastpath+0x12/0x6b
[11642.416572] 1 lock held by fdm-stress/26850:
[11642.417345]  #0:  (&f->f_pos_lock){+.+.+.}, at: [<ffffffff811877e8>] __fdget_pos+0x3a/0x40
[11642.418703] INFO: task fdm-stress:26851 blocked for more than 120 seconds.
[11642.419698]       Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.420612] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.421807] fdm-stress      D ffff880196483d28     0 26851  26591 0x00000000
[11642.422878]  ffff880196483d28 00ff8801c8f60740 0000000000014ec0 ffff88023ed94ec0
[11642.424149]  ffff8801c8f60740 ffff880196484000 0000000000000246 ffff8801c8f60740
[11642.425374]  ffff8801bb711840 ffff8801bb711878 ffff880196483d40 ffffffff8147b541
[11642.426591] Call Trace:
[11642.427013]  [<ffffffff8147b541>] schedule+0x82/0x9a
[11642.427856]  [<ffffffff8147b6d5>] schedule_preempt_disabled+0x18/0x24
[11642.428852]  [<ffffffff8147c23a>] mutex_lock_nested+0x1d7/0x3b4
[11642.429743]  [<ffffffffa044f456>] ? btrfs_wait_ordered_extents+0x50/0x217 [btrfs]
[11642.430911]  [<ffffffffa044f456>] btrfs_wait_ordered_extents+0x50/0x217 [btrfs]
[11642.432102]  [<ffffffffa044f674>] ? btrfs_wait_ordered_roots+0x57/0x191 [btrfs]
[11642.433259]  [<ffffffffa044f456>] ? btrfs_wait_ordered_extents+0x50/0x217 [btrfs]
[11642.434431]  [<ffffffffa044f6ea>] btrfs_wait_ordered_roots+0xcd/0x191 [btrfs]
[11642.436079]  [<ffffffffa0410cab>] btrfs_sync_fs+0xe0/0x1ad [btrfs]
[11642.437009]  [<ffffffff81197900>] ? SyS_tee+0x23c/0x23c
[11642.437860]  [<ffffffff81197920>] sync_fs_one_sb+0x20/0x22
[11642.438723]  [<ffffffff81171435>] iterate_supers+0x75/0xc2
[11642.439597]  [<ffffffff81197d00>] sys_sync+0x52/0x80
[11642.440454]  [<ffffffff8147fa97>] entry_SYSCALL_64_fastpath+0x12/0x6b
[11642.441533] 3 locks held by fdm-stress/26851:
[11642.442370]  #0:  (&type->s_umount_key#37){+++++.}, at: [<ffffffff8117141f>] iterate_supers+0x5f/0xc2
[11642.444043]  #1:  (&fs_info->ordered_operations_mutex){+.+...}, at: [<ffffffffa044f661>] btrfs_wait_ordered_roots+0x44/0x191 [btrfs]
[11642.446010]  #2:  (&root->ordered_extent_mutex){+.+...}, at: [<ffffffffa044f456>] btrfs_wait_ordered_extents+0x50/0x217 [btrfs]

This happened because under specific timings the path for direct IO reads
can deadlock with concurrent buffered writes. The diagram below shows how
this happens for an example file that has the following layout:

     [  extent A  ]  [  extent B  ]  [ ....
     0K              4K              8K

     CPU 1                                               CPU 2                             CPU 3

DIO read against range
 [0K, 8K[ starts

btrfs_direct_IO()
  --> calls btrfs_get_blocks_direct()
      which finds the extent map for the
      extent A and leaves the range
      [0K, 4K[ locked in the inode's
      io tree

                                                   buffered write against
                                                   range [4K, 8K[ starts

                                                   __btrfs_buffered_write()
                                                     --> dirties page at 4K

                                                                                     a user space
                                                                                     task calls sync
                                                                                     for e.g or
                                                                                     writepages() is
                                                                                     invoked by mm

                                                                                     writepages()
                                                                                       run_delalloc_range()
                                                                                         cow_file_range()
                                                                                           --> ordered extent X
                                                                                               for the buffered
                                                                                               write is created
                                                                                               and
                                                                                               writeback starts

  --> calls btrfs_get_blocks_direct()
      again, without submitting first
      a bio for reading extent A, and
      finds the extent map for extent B

  --> calls lock_extent_direct()

      --> locks range [4K, 8K[
      --> finds ordered extent X
          covering range [4K, 8K[
      --> unlocks range [4K, 8K[

                                                  buffered write against
                                                  range [0K, 8K[ starts

                                                  __btrfs_buffered_write()
                                                    prepare_pages()
                                                      --> locks pages with
                                                          offsets 0 and 4K
                                                    lock_and_cleanup_extent_if_need()
                                                      --> blocks attempting to
                                                          lock range [0K, 8K[ in
                                                          the inode's io tree,
                                                          because the range [0, 4K[
                                                          is already locked by the
                                                          direct IO task at CPU 1

      --> calls
          btrfs_start_ordered_extent(oe X)

          btrfs_start_ordered_extent(oe X)

            --> At this point writeback for ordered
                extent X has not finished yet

            filemap_fdatawrite_range()
              btrfs_writepages()
                extent_writepages()
                  extent_write_cache_pages()
                    --> finds page with offset 0
                        with the writeback tag
                        (and not dirty)
                    --> tries to lock it
                         --> deadlock, task at CPU 2
                             has the page locked and
                             is blocked on the io range
                             [0, 4K[ that was locked
                             earlier by this task

So fix this by falling back to a buffered read in the direct IO read path
when an ordered extent for a buffered write is found.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>