Trinity gets kernel BUG at mm/mmap.c:1963! in about 3 minutes of
mmap testing. That's the VM_BUG_ON(gap_end < gap_start) at the
end of unmapped_area_topdown(). Linus points out how MAP_FIXED
(which does not have to respect our stack guard gap intentions)
could result in gap_end below gap_start there. Fix that, and
the similar case in its alternative, unmapped_area().
Fixes: 1be7107fbe18 ("mm: larger stack guard gap, between vmas") Reported-by: Dave Jones <davej@codemonkey.org.uk> Debugged-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix expand_upwards() on architectures with an upward-growing stack (parisc,
metag and partly IA-64) to allow the stack to reliably grow exactly up to
the address space limit given by TASK_SIZE.
Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace. E.g. glibc uses as large as 64kB alloca() in many commonly
used functions. Others use constructs liks gid_t buffer[NGROUPS_MAX]
which is 256kB or stack strings with MAX_ARG_STRLEN.
This will become especially dangerous for suid binaries and the default
no limit for the stack size limit because those applications can be
tricked to consume a large portion of the stack and a single glibc call
could jump over the guard page. These attacks are not theoretical,
unfortunatelly.
Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations. It is obviously not a full fix because the problem is
somehow inherent, but it should reduce attack space a lot.
One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications. For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).
Implementation wise, first delete all the old code for stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.
Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and and the vma tree's subtree_gap support for that.
Original-patch-by: Oleg Nesterov <oleg@redhat.com> Original-patch-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Tested-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[wt: backport to 4.11: adjust context]
[wt: backport to 4.9: adjust context ; kernel doc was not in admin-guide] Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The alarmtimer code has another source of potentially rearming itself too
fast. Interval timers with a very samll interval have a similar CPU hog
effect as the previously fixed overflow issue.
The reason is that alarmtimers do not implement the normal protection
against this kind of problem which the other posix timer use:
timer expires -> queue signal -> deliver signal -> rearm timer
This scheme brings the rearming under scheduler control and prevents
permanently firing timers which hog the CPU.
Bringing this scheme to the alarm timer code is a major overhaul because it
lacks all the necessary mechanisms completely.
So for a quick fix limit the interval to one jiffie. This is not
problematic in practice as alarmtimers are usually backed by an RTC for
suspend which have 1 second resolution. It could be therefor argued that
the resolution of this clock should be set to 1 second in general, but
that's outside the scope of this fix.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kostya Serebryany <kcc@google.com> Cc: syzkaller <syzkaller@googlegroups.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Dmitry Vyukov <dvyukov@google.com> Link: http://lkml.kernel.org/r/20170530211655.896767100@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On sparc, if we have an alloca() like situation, as is the case with
SHASH_DESC_ON_STACK(), we can end up referencing deallocated stack
memory. The result can be that the value is clobbered if a trap
or interrupt arrives at just the right instruction.
It only occurs if the function ends returning a value from that
alloca() area and that value can be placed into the return value
register using a single instruction.
For example, in lib/libcrc32c.c:crc32c() we end up with a return
sequence like:
%o5 holds the base of the on-stack area allocated for the shash
descriptor. But the return released the stack frame and the
register window.
So if an intererupt arrives between 'return' and 'lduw', then
the value read at %o5+16 can be corrupted.
Add a data compiler barrier to work around this problem. This is
exactly what the gcc fix will end up doing as well, and it absolutely
should not change the code generated for other cpus (unless gcc
on them has the same bug :-)
With crucial insight from Eric Sandeen.
Reported-by: Anatoly Pugachev <matorola@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current code passes the address of tpm_chip as the argument to
dev_get_drvdata() without prior NULL check in
tpm_ibmvtpm_get_desired_dma. This resulted an oops during kernel
boot when vTPM is enabled in Power partition configured in active
memory sharing mode.
The vio_driver's get_desired_dma() is called before the probe(), which
for vtpm is tpm_ibmvtpm_probe, and it's this latter function that
initializes the driver and set data. Attempting to get data before
the probe() caused the problem.
This patch adds a NULL check to the tpm_ibmvtpm_get_desired_dma.
fixes: 9e0d39d8a6a0 ("tpm: Remove useless priv field in struct tpm_vendor_specific") Signed-off-by: Hon Ching(Vicky) Lo <honclo@linux.vnet.ibm.com> Reviewed-by: Jarkko Sakkine <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The .its targets require information about the kernel binary, such as
its entry point, which is extracted from the vmlinux ELF. We therefore
require that the ELF is built before the .its files are generated.
Declare this requirement in the Makefile such that make will ensure this
is always the case, otherwise in corner cases we can hit issues as the
.its is generated with an incorrect (either invalid or stale) entry
point.
Signed-off-by: Paul Burton <paul.burton@imgtec.com> Fixes: cf2a5e0bb4c6 ("MIPS: Support generating Flattened Image Trees (.itb)") Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16179/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The code handling the pop76 opcode (ie. bnezc & jialc instructions) in
__compute_return_epc_for_insn() needs to set the value of $31 in the
jialc case, which is encoded with rs = 0. However its check to
differentiate bnezc (rs != 0) from jialc (rs = 0) was unfortunately
backwards, meaning that if we emulate a bnezc instruction we clobber $31
& if we emulate a jialc instruction it actually behaves like a jic
instruction.
Fix this by inverting the check of rs to match the way the instructions
are actually encoded.
Signed-off-by: Paul Burton <paul.burton@imgtec.com> Fixes: 28d6f93d201d ("MIPS: Emulate the new MIPS R6 BNEZC and JIALC instructions") Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16178/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andrey reported a alartimer related RCU stall while fuzzing the kernel with
syzkaller.
The reason for this is an overflow in ktime_add() which brings the
resulting time into negative space and causes immediate expiry of the
timer. The following rearm with a small interval does not bring the timer
back into positive space due to the same issue.
This results in a permanent firing alarmtimer which hogs the CPU.
Use ktime_add_safe() instead which detects the overflow and clamps the
result to KTIME_SEC_MAX.
idle_task_exit() can be called with IRQs on x86 on and therefore
should use switch_mm(), not switch_mm_irqs_off().
This doesn't seem to cause any problems right now, but it will
confuse my upcoming TLB flush changes. Nonetheless, I think it
should be backported because it's trivial. There won't be any
meaningful performance impact because idle_task_exit() is only
used when offlining a CPU.
Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: f98db6013c55 ("sched/core: Add switch_mm_irqs_off() and use it in the scheduler") Link: http://lkml.kernel.org/r/ca3d1a9fa93a0b49f5a8ff729eda3640fb6abdf9.1497034141.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Starting from MPU6500, accelerometer dlpf is set in a separate
register named ACCEL_CONFIG_2.
Add this new register in the map and set it for the corresponding
chips.
memory_failure() chooses a recovery action function based on the page
flags. For huge pages it uses the tail page flags which don't have
anything interesting set, resulting in:
> Memory failure: 0x9be3b4: Unknown page state
> Memory failure: 0x9be3b4: recovery action for unknown page: Failed
Instead, save a copy of the head page's flags if this is a huge page,
this means if there are no relevant flags for this tail page, we use the
head pages flags instead. This results in the me_huge_page() recovery
action being called:
> Memory failure: 0x9b7969: recovery action for huge page: Delayed
For hugepages that have not yet been allocated, this allows the hugepage
to be dequeued.
Fixes: 524fca1e7356 ("HWPOISON: fix misjudgement of page_action() for errors on mlocked pages") Link: http://lkml.kernel.org/r/20170524130204.21845-1-james.morse@arm.com Signed-off-by: James Morse <james.morse@arm.com> Tested-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Using the syzkaller kernel fuzzer, Andrey Konovalov generated the
following error in gadgetfs:
> BUG: KASAN: use-after-free in __lock_acquire+0x3069/0x3690
> kernel/locking/lockdep.c:3246
> Read of size 8 at addr ffff88003a2bdaf8 by task kworker/3:1/903
>
> CPU: 3 PID: 903 Comm: kworker/3:1 Not tainted 4.12.0-rc4+ #35
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Workqueue: usb_hub_wq hub_event
> Call Trace:
> __dump_stack lib/dump_stack.c:16 [inline]
> dump_stack+0x292/0x395 lib/dump_stack.c:52
> print_address_description+0x78/0x280 mm/kasan/report.c:252
> kasan_report_error mm/kasan/report.c:351 [inline]
> kasan_report+0x230/0x340 mm/kasan/report.c:408
> __asan_report_load8_noabort+0x19/0x20 mm/kasan/report.c:429
> __lock_acquire+0x3069/0x3690 kernel/locking/lockdep.c:3246
> lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
> __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
> _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
> spin_lock include/linux/spinlock.h:299 [inline]
> gadgetfs_suspend+0x89/0x130 drivers/usb/gadget/legacy/inode.c:1682
> set_link_state+0x88e/0xae0 drivers/usb/gadget/udc/dummy_hcd.c:455
> dummy_hub_control+0xd7e/0x1fb0 drivers/usb/gadget/udc/dummy_hcd.c:2074
> rh_call_control drivers/usb/core/hcd.c:689 [inline]
> rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
> usb_hcd_submit_urb+0x92f/0x20b0 drivers/usb/core/hcd.c:1650
> usb_submit_urb+0x8b2/0x12c0 drivers/usb/core/urb.c:542
> usb_start_wait_urb+0x148/0x5b0 drivers/usb/core/message.c:56
> usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
> usb_control_msg+0x341/0x4d0 drivers/usb/core/message.c:151
> usb_clear_port_feature+0x74/0xa0 drivers/usb/core/hub.c:412
> hub_port_disable+0x123/0x510 drivers/usb/core/hub.c:4177
> hub_port_init+0x1ed/0x2940 drivers/usb/core/hub.c:4648
> hub_port_connect drivers/usb/core/hub.c:4826 [inline]
> hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
> port_event drivers/usb/core/hub.c:5105 [inline]
> hub_event+0x1ae1/0x3d40 drivers/usb/core/hub.c:5185
> process_one_work+0xc08/0x1bd0 kernel/workqueue.c:2097
> process_scheduled_works kernel/workqueue.c:2157 [inline]
> worker_thread+0xb2b/0x1860 kernel/workqueue.c:2233
> kthread+0x363/0x440 kernel/kthread.c:231
> ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:424
>
> Allocated by task 9958:
> save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59
> save_stack+0x43/0xd0 mm/kasan/kasan.c:513
> set_track mm/kasan/kasan.c:525 [inline]
> kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:617
> kmem_cache_alloc_trace+0x87/0x280 mm/slub.c:2745
> kmalloc include/linux/slab.h:492 [inline]
> kzalloc include/linux/slab.h:665 [inline]
> dev_new drivers/usb/gadget/legacy/inode.c:170 [inline]
> gadgetfs_fill_super+0x24f/0x540 drivers/usb/gadget/legacy/inode.c:1993
> mount_single+0xf6/0x160 fs/super.c:1192
> gadgetfs_mount+0x31/0x40 drivers/usb/gadget/legacy/inode.c:2019
> mount_fs+0x9c/0x2d0 fs/super.c:1223
> vfs_kern_mount.part.25+0xcb/0x490 fs/namespace.c:976
> vfs_kern_mount fs/namespace.c:2509 [inline]
> do_new_mount fs/namespace.c:2512 [inline]
> do_mount+0x41b/0x2d90 fs/namespace.c:2834
> SYSC_mount fs/namespace.c:3050 [inline]
> SyS_mount+0xb0/0x120 fs/namespace.c:3027
> entry_SYSCALL_64_fastpath+0x1f/0xbe
>
> Freed by task 9960:
> save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59
> save_stack+0x43/0xd0 mm/kasan/kasan.c:513
> set_track mm/kasan/kasan.c:525 [inline]
> kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:590
> slab_free_hook mm/slub.c:1357 [inline]
> slab_free_freelist_hook mm/slub.c:1379 [inline]
> slab_free mm/slub.c:2961 [inline]
> kfree+0xed/0x2b0 mm/slub.c:3882
> put_dev+0x124/0x160 drivers/usb/gadget/legacy/inode.c:163
> gadgetfs_kill_sb+0x33/0x60 drivers/usb/gadget/legacy/inode.c:2027
> deactivate_locked_super+0x8d/0xd0 fs/super.c:309
> deactivate_super+0x21e/0x310 fs/super.c:340
> cleanup_mnt+0xb7/0x150 fs/namespace.c:1112
> __cleanup_mnt+0x1b/0x20 fs/namespace.c:1119
> task_work_run+0x1a0/0x280 kernel/task_work.c:116
> exit_task_work include/linux/task_work.h:21 [inline]
> do_exit+0x18a8/0x2820 kernel/exit.c:878
> do_group_exit+0x14e/0x420 kernel/exit.c:982
> get_signal+0x784/0x1780 kernel/signal.c:2318
> do_signal+0xd7/0x2130 arch/x86/kernel/signal.c:808
> exit_to_usermode_loop+0x1ac/0x240 arch/x86/entry/common.c:157
> prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
> syscall_return_slowpath+0x3ba/0x410 arch/x86/entry/common.c:263
> entry_SYSCALL_64_fastpath+0xbc/0xbe
>
> The buggy address belongs to the object at ffff88003a2bdae0
> which belongs to the cache kmalloc-1024 of size 1024
> The buggy address is located 24 bytes inside of
> 1024-byte region [ffff88003a2bdae0, ffff88003a2bdee0)
> The buggy address belongs to the page:
> page:ffffea0000e8ae00 count:1 mapcount:0 mapping: (null)
> index:0x0 compound_mapcount: 0
> flags: 0x100000000008100(slab|head)
> raw: 0100000000008100000000000000000000000000000000000000000100170017
> raw: ffffea0000ed3020ffffea0000f5f820ffff88003e80efc00000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
> ffff88003a2bd980: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ffff88003a2bda00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >ffff88003a2bda80: fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb fb
> ^
> ffff88003a2bdb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff88003a2bdb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==================================================================
What this means is that the gadgetfs_suspend() routine was trying to
access dev->lock after it had been deallocated. The root cause is a
race in the dummy_hcd driver; the dummy_udc_stop() routine can race
with the rest of the driver because it contains no locking. And even
when proper locking is added, it can still race with the
set_link_state() function because that function incorrectly drops the
private spinlock before invoking any gadget driver callbacks.
The result of this race, as seen above, is that set_link_state() can
invoke a callback in gadgetfs even after gadgetfs has been unbound
from dummy_hcd's UDC and its private data structures have been
deallocated.
include/linux/usb/gadget.h documents that the ->reset, ->disconnect,
->suspend, and ->resume callbacks may be invoked in interrupt context.
In general this is necessary, to prevent races with gadget driver
removal. This patch fixes dummy_hcd to retain the spinlock across
these calls, and it adds a spinlock acquisition to dummy_udc_stop() to
prevent the race.
The net2280 driver makes the same mistake of dropping the private
spinlock for its ->disconnect and ->reset callback invocations. The
patch fixes it too.
Lastly, since gadgetfs_suspend() may be invoked in interrupt context,
it cannot assume that interrupts are enabled when it runs. It must
use spin_lock_irqsave() instead of spin_lock_irq(). The patch fixes
that bug as well.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The bug was caused by dev_release() failing to turn off its
gadget_registered flag after unregistering the gadget driver. As a
result, when a later user closed the device file before writing a
valid set of descriptors, dev_release() thought the gadget had been
registered and tried to unregister it, even though it had not been.
This led to the NULL pointer dereference.
The fix is simple: turn off the flag when the gadget is unregistered.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When plugging an USB webcam I see the following message:
[106385.615559] xhci_hcd 0000:04:00.0: WARN Successful completion on short TX: needs XHCI_TRUST_TX_LENGTH quirk?
[106390.583860] handle_tx_event: 913 callbacks suppressed
With this patch applied, I get no more printing of this message.
xHCI host controllers can have both USB 3.1 and 3.0 extended speed
protocol lists. If the USB3.1 speed is parsed first and 3.0 second then
the minor revision supported will be overwritten by the 3.0 speeds and
the USB3 roothub will only show support for USB 3.0 speeds.
This was the case with a xhci controller with the supported protocol
capability listed below.
In xhci-mem.c, the USB 3.1 speed is parsed first, the min_rev of usb3_rhub
is set as 0x10. And then USB 3.0 is parsed. However, the min_rev of
usb3_rhub will be changed to 0x00. If USB 3.1 device is connected behind
this host controller, the speed of USB 3.1 device just reports 5G speed
using lsusb.
This function only has one caller. Freeing "vdev" here leads to a use
after free bug. There are several other error paths in this function
but this is the only one which frees "vdev". It looks like the kfree()
can be safely removed.
Fixes: 61e9c905df78 ("misc: mic: Enable VOP host side functionality") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This fixes the below crash when ath10k probe firmware fails, NAPI polling tries
to access a rx ring resource which was never allocated. An easy way to
reproduce this is easy to remove all the firmware files, load ath10k modules
and ath10k will crash when calling 'rmmod ath10k_pci'. The fix is to call
napi_enable() from ath10k_pci_hif_start() so that it matches with
napi_disable() being called from ath10k_pci_hif_stop().
Big thanks to Mohammed Shafi Shajakhan who debugged this and provided first
version of the fix. In this patch I just fix the actual problem in pci.c
instead of having a workaround in core.c.
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: __ath10k_htt_rx_ring_fill_n+0x19/0x230 [ath10k_core]
__ath10k_htt_rx_ring_fill_n+0x19/0x230 [ath10k_core]
Reported-by: Ben Greear <greearb@candelatech.com> Fixes: 3c97f5de1f28 ("ath10k: implement NAPI support") Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The timeout for BULK packets was 300ms which is a long time if other
endpoints or devices are waiting for their turn. Changing it to 50ms
greatly increased the overall performance for multi-endpoint devices.
Fixes: 5d3043586db4 ("usb: r8a66597-hcd: host controller driver for R8A6659") Signed-off-by: Chris Brandt <chris.brandt@renesas.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If multiple endpoints on a single device have pending IN URBs and one
endpoint times out due to NAKs (perfectly legal), select a different
endpoint URB to try.
The existing code only checked to see another device address has pending
URBs and ignores other IN endpoints on the current device address. This
leads to endpoints never getting serviced if one endpoint is using NAK as
a flow control method.
Fixes: 5d3043586db4 ("usb: r8a66597-hcd: host controller driver for R8A6659") Signed-off-by: Chris Brandt <chris.brandt@renesas.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Flag the first and only port as removable while also leaving the
remaining bits (including the reserved bit zero) unset in accordance
with the specifications:
"Within a byte, if no port exists for a given location, the bit
field representing the port characteristics shall be 0."
Also add a comment marking the legacy PortPwrCtrlMask field.
The driver uses a relatively large data structure on the stack, which
showed up on my radar as we get a warning with the "latent entropy"
GCC plugin:
drivers/media/usb/pvrusb2/pvrusb2-eeprom.c:153:1: error: the frame size of 1376 bytes is larger than 1152 bytes [-Werror=frame-larger-than=]
The warning is usually hidden as we raise the warning limit to 2048
when the plugin is enabled, but I'd like to lower that again in the
future, and making this function smaller helps to do that without
build regressions.
Further analysis shows that putting an 'i2c_client' structure on
the stack is not really supported, as the embedded 'struct device'
is not initialized here, and we are only saved by the fact that
the function that is called here does not use the pointer at all.
Fix up the root-hub descriptor to accommodate the variable-length
DeviceRemovable and PortPwrCtrlMask fields, while marking all ports as
removable (and leaving the reserved bit zero unset).
Also add a build-time constraint on VHCI_HC_PORTS which must never be
greater than USB_MAXCHILDREN (but this was only enforced through a
KConfig constant).
This specifically fixes the descriptor layout whenever VHCI_HC_PORTS is
greater than seven (default is 8).
This controller disallows to change the PIPE until reading/writing
a packet finishes. However. the previous code is not enough to hold
the lock in some functions. So, this patch fixes it.
Fixes: 746bfe63bba3 ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes an issue that this driver is possible to cause
deadlock by double-spinclocked in renesas_usb3_stop_controller().
So, this patch removes spinlock API calling in renesas_usb3_stop().
(In other words, the previous code had a redundant lock.)
Fixes: 746bfe63bba3 ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes an issue that this driver is possible to access
the registers before pm_runtime_get_sync() if a gadget driver is
installed first. After that, oops happens on R-Car Gen3 environment.
To avoid it, this patch changes the pm_runtime call timing from
probe/remove to udc_start/udc_stop.
Fixes: 746bfe63bba3 ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The logic was broken as it failed to update the response length for
architectures with PAGE_SIZE larger than 4kB. As a result further
extension of the ucontext response struct would fail.
Fixes: d69e3bcf7976 ('IB/mlx5: Mmap the HCA's core clock register to user-space') Signed-off-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We should be allocating enough information for a tiadc_device struct
which is about 400 bytes but instead we allocate enough for a second
iio_dev struct which is over 2000 bytes.
Fixes: fea89e2dfcea ("iio: adc: ti_am335x_adc: use variable names for sizeof() operator") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to the datasheet the RCO must be recalibrated
on every power-on-reset. Also remove mutex locking in the
calibration function since callers other than the probe
function (which doesn't need it) will have a lock.
Fixes: 24ddb0e4bba4 ("iio: Add AS3935 lightning sensor support") Cc: George McCollister <george.mccollister@gmail.com> Signed-off-by: Matt Ranostay <matt.ranostay@konsulko.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Datasheet of each device (lps331ap, lps25h, lps001wp, lps22hb) says that
the pressure and temperature data is a 2's complement.
I'm sending this the slow way, as negative pressures on these are pretty
unusual and the nature of the fixing of multiple device introduction patches
will make it hard to apply to older kernels - Jonathan.
Fixes: 217494e5b780 ("iio:pressure: Add STMicroelectronics pressures driver") Fixes: 2f5effcbd097 ("iio: pressure-core: st: Expand and rename LPS331AP's channel descriptor") Fixes: 7885a8ce6800 ("iio: pressure: st: Add support for new LPS001WP pressure sensor") Fixes: e039e2f5b4da ("iio:st_pressure:initial lps22hb sensor support") Signed-off-by: Marcin Niestroj <m.niestroj@grinn-global.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Standard deviation is calculated as the square root of the variance
where variance is the mean of sample_sum and length. Correct the
computation of statP->stddev in accordance to the proper calculation.
Commit 16fa3dc75c22 ("mfd: omap-usb-tll: HOST TLL platform driver")
added support for USB TLL, but uses OMAP_TLL_CHANNEL_CONF_ULPINOBITSTUFF
bit the wrong way. The comments in the code are correct, but the inverted
use of OMAP_TLL_CHANNEL_CONF_ULPINOBITSTUFF causes the register to be
enabled instead of disabled unlike what the comments say.
Without this change the Wrigley 3G LTE modem on droid 4 EHCI bus can
be only pinged few times before it stops responding.
Fixes: 16fa3dc75c22 ("mfd: omap-usb-tll: HOST TLL platform driver") Signed-off-by: Tony Lindgren <tony@atomide.com> Acked-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
'__vmalloc_start_set' currently only gets set in initmem_init() when
!CONFIG_NEED_MULTIPLE_NODES. This breaks detection of vmalloc address
with virt_addr_valid() with CONFIG_NEED_MULTIPLE_NODES=y, causing
a kernel crash:
[mm/usercopy] 517e1fbeb6: kernel BUG at arch/x86/mm/physaddr.c:78!
Set '__vmalloc_start_set' appropriately for that case as well.
Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Laura Abbott <labbott@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: dc16ecf7fd1f ("x86-32: use specific __vmalloc_start_set flag in __virt_addr_valid") Link: http://lkml.kernel.org/r/1494278596-30373-1-git-send-email-labbott@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When changing hardware control flow for a UART with dedicated RTS/CTS
pins, the new AUTORTS state is not immediately reflected in the
hardware, but only when RTS is raised. However, the serial core does
not call .set_mctrl() after .set_termios(), hence AUTORTS may only
become effective when the port is closed, and reopened later.
Note that this problem does not happen when manually using stty to
change CRTSCTS, as AUTORTS will work fine on next open.
To fix this, call .set_mctrl() from .set_termios() when dedicated
RTS/CTS pins are present, to refresh the AUTORTS or RTS state.
This is similar to what other drivers supporting AUTORTS do (e.g.
omap-serial).
Reported-by: Baumann, Christoph (C.) <cbaumann@visteon.com> Fixes: 33f50ffc253854cf ("serial: sh-sci: Fix support for hardware-assisted RTS/CTS") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
UARTn_FRAME_PARITY_ODD is 0x0300
UARTn_FRAME_PARITY_EVEN is 0x0200
So if the UART is configured for EVEN parity, it would be reported as ODD.
Fix it by correctly testing if the 2 bits are set.
If a CMA allocation failed, the partially constructed BO would be
unreferenced through the normal path, and we might choose to put it in
the BO cache. If we then reused it before it expired from the cache,
the kernel would OOPS.
mtk_hdmi_setup_vendor_specific_infoframe will return before handle
mtk_hdmi_hw_send_info_frame.Because hdmi_vendor_infoframe_pack
returns the number of bytes packed into the binary buffer or
a negative error code on failure.
So correct it.
Fixes: 8f83f26891e1 ("drm/mediatek: Add HDMI support") Signed-off-by: Nickey Yang <nickey.yang@rock-chips.com> Signed-off-by: CK Hu <ck.hu@mediatek.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mac80211 allows to modify the SMPS state of an AP both,
when it is started, and after it has been started. Such a
change will trigger an action frame to all the peers that
are currently connected, and will be remembered so that
new peers will get notified as soon as they connect (since
the SMPS setting in the beacon may not be the right one).
This means that we need to remember the SMPS state
currently requested as well as the SMPS state that was
configured initially (and advertised in the beacon).
The former is bss->req_smps and the latter is
sdata->smps_mode.
Initially, the AP interface could only be started with
SMPS_OFF, which means that sdata->smps_mode was SMPS_OFF
always. Later, a nl80211 API was added to be able to start
an AP with a different AP mode. That code forgot to update
bss->req_smps and because of that, if the AP interface was
started with SMPS_DYNAMIC, we had:
sdata->smps_mode = SMPS_DYNAMIC
bss->req_smps = SMPS_OFF
That configuration made mac80211 think it needs to fire off
an action frame to any new station connecting to the AP in
order to let it know that the actual SMPS configuration is
SMPS_OFF.
Fix that by properly setting bss->req_smps in
ieee80211_start_ap.
Fixes: f69931748730 ("mac80211: set smps_mode according to ap params") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the commit enabling per-CPU station statistics, I inadvertedly
copy-pasted some code to update rx_packets and forgot to change it
to update rx_dropped_misc. Fix that.
This addresses https://bugzilla.kernel.org/show_bug.cgi?id=195953.
Fixes: c9c5962b56c1 ("mac80211: enable collecting station statistics per-CPU") Reported-by: Petru-Florin Mihancea <petrum@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mesh forwarding path checks for address extension mode to fetch
appropriate proxied address and MPP address. Existing condition
that looks for 6 address format is not strict enough so that
frames with improper values are processed and invalid entries
are added into MPP table. Fix that by adding a stricter check before
processing the packet.
Per IEEE Std 802.11s-2011 spec. Table 7-6g1 lists address extension
mode 0x3 as reserved one. And also Table Table 9-13 does not specify
0x3 as valid address field.
Fixes: 9b395bc3be1c ("mac80211: verify that skb data is present") Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When VHT IBSS support was added, the size of the extra elements
wasn't considered in ieee80211_ibss_build_presp(), which makes
it possible that it would overrun the allocated buffer. Fix it
by allocating the necessary space.
Fixes: abcff6ef01f9 ("mac80211: add VHT support for IBSS") Reported-by: Shaul Triebitz <shaul.triebitz@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When adding per-CPU statistics, which added statistics back
to mac80211 for the fast-RX path, I evidently forgot to add
the "stats->packets++" line. The reason for that is likely
that I didn't see it since it's done in defragmentation for
the regular RX path.
Add the missing line to properly count received packets in
the fast-RX case.
Fixes: c9c5962b56c1 ("mac80211: enable collecting station statistics per-CPU") Reported-by: Oren Givon <oren.givon@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently VBUS is turned off while a usb device is detached, and turned
on again by the polling routine. This short period VBUS loss prevents
usb modem to switch mode.
VBUS should be constantly on for host-only mode, so this changes the
driver to not turn off VBUS for host-only mode.
Fixes: 2f3fd2c5bde1 ("usb: musb: Prepare dsps glue layer for PM runtime support") Reported-by: Moreno Bartalucci <moreno.bartalucci@tecnorama.it> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Bin Liu <b-liu@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Current it's strictly checked if PVINFO version matches 1.0
for GVT-g i915 guest which doesn't help for compatibility at
all and forces GVT-g host can't extend PVINFO easily with version
bump for real compatibility check.
This fixes that to check minimal required PVINFO version instead.
v2:
- drop unneeded version macro
- use only major version for sanity check
v3:
- fix up PVInfo value with kernel type
- one indent fix
Commit d63c277dc672e0
("drm/amdgpu: Make display watermark calculations more accurate")
made watermark calculations more accurate, but not for > 4k
resolutions on 32-Bit architectures, as it introduced an integer
overflow for those setups and resolutions.
Fix this by proper u64 casting and division.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Fixes: d63c277dc672 ("drm/amdgpu: Make display watermark calculations more accurate") Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Otherwise, we enable all sorts of forgeries via timing attack.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: linux-wireless@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 27ed3cd2ebf4 (cpufreq: conservative: Fix the logic in frequency
decrease checking) removed the 10 point substraction when comparing the
load against down_threshold but did not remove the related limit for the
down_threshold value. As a result, down_threshold lower than 11 is not
allowed even though values from 1 to 10 do work correctly too. The
comment ("cannot be lower than 11 otherwise freq will not fall") also
does not apply after removing the substraction.
For this reason, allow down_threshold to take any value from 1 to 99
and fix the related comment.
Fixes: 27ed3cd2ebf4 (cpufreq: conservative: Fix the logic in frequency decrease checking) Signed-off-by: Tomasz Wilczyński <twilczynski@naver.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While discussing the possible merits of clang warning about unused initialized
functions, I found one function that was clearly meant to be called but
never actually is.
__ila_hash_secret_init() initializes the hash value for the ila locator,
apparently this is intended to prevent hash collision attacks, but this ends
up being a read-only zero constant since there is no caller. I could find
no indication of why it was never called, the earliest patch submission
for the module already was like this. If my interpretation is right, we
certainly want to backport the patch to stable kernels as well.
I considered adding it to the ila_xlat_init callback, but for best effect
the random data is read as late as possible, just before it is first used.
The underlying net_get_random_once() is already highly optimized to avoid
overhead when called frequently.
Fixes: 7f00feaf1076 ("ila: Add generic ILA translation facility") Link: https://www.spinics.net/lists/kernel/msg2527243.html Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch closes a long standing race in configfs between
the creation of a new symlink in create_link(), while the
symlink target's config_item is being concurrently removed
via configfs_rmdir().
This can happen because the symlink target's reference
is obtained by config_item_get() in create_link() before
the CONFIGFS_USET_DROPPING bit set by configfs_detach_prep()
during configfs_rmdir() shutdown is actually checked..
This originally manifested itself on ppc64 on v4.8.y under
heavy load using ibmvscsi target ports with Novalink API:
My static checker complains that if "lvl" is ULONG_MAX (this is 64 bit)
then some of the strings will overflow. I don't know if that's possible
but it seems simple enough to make the buffers slightly larger.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Waldemar Brodkorb <wbx@openadk.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On some systems there can be a race condition in which no crtc state is
added to the first atomic commit. This results in all crtc's having a
null DDB allocation, causing a FIFO underrun on any update until the
first modeset.
Changes since v1:
- Do not take the connection_mutex, this is already done below.
Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Inspired-by: Mahesh Kumar <mahesh1.kumar@intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Fixes: 98d39494d375 ("drm/i915/gen9: Compute DDB allocation at atomic
check time (v4)") Cc: Mahesh Kumar <mahesh1.kumar@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170531154236.27180-1-maarten.lankhorst@linux.intel.com Reviewed-by: Mahesh Kumar <mahesh1.kumar@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 367d73d2806085bb507ab44c1f532640917fd5ca) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
The scanline counter is bonkers on VLV/CHV DSI. The scanline counter
increment is not lined up with the start of vblank like it is on
every other platform and output type. This causes problems for
both the vblank timestamping and atomic update vblank evasion.
On my FFRD8 machine at least, the scanline counter increment
happens about 1/3 of a scanline ahead of the start of vblank (which
is where all register latching happens still). That means we can't
trust the scanline counter to tell us whether we're in vblank or not
while we're on that particular line. In order to keep vblank
timestamping in working condition when called from the vblank irq,
we'll leave scanline_offset at one, which means that the entire
line containing the start of vblank is considered to be inside
the vblank.
For the vblank evasion we'll need to consider that entire line
to be bad, since we can't tell whether the registers already
got latched or not. And we can't actually use the start of vblank
interrupt to get us past that line as the interrupt would fire
too soon, and then we'd up waiting for the next start of vblank
instead. One way around that would using the frame start
interrupt instead since that wouldn't fire until the next
scanline, but that would require some bigger changes in the
interrupt code. So for simplicity we'll just poll until we get
past the bad line.
v2: Adjust the comments a bit
Cc: Jonas Aaberg <cja@gmx.net> Tested-by: Jonas Aaberg <cja@gmx.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99086 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161215174734.28779-1-ville.syrjala@linux.intel.com Tested-by: Mika Kahola <mika.kahola@intel.com> Reviewed-by: Mika Kahola <mika.kahola@intel.com>
(cherry picked from commit ec1b4ee2834e66884e5b0d3d465f347ff212e372) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For most cases a protection exception in the host (e.g. copy
on write or dirty tracking) on the sie instruction will indicate
an instruction length of 4. Turns out that there are some corner
cases (e.g. runtime instrumentation) where this is not necessarily
true and the ILC is unpredictable.
Let's replace our 4 byte rewind_pad with 3 byte nops to prepare for
all possible ILCs.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linux IRQ #0 is reserved for error reporting and may not be used.
Increase NR_IRQS for one additional slot and increase
irq_domain_add_legacy parameter first_irq value to 1, so that linux
IRQ #0 is not associated with hardware IRQ #0 in legacy IRQ domains.
Introduce macro XTENSA_PIC_LINUX_IRQ for static translation of xtensa
PIC hardware IRQ # to linux IRQ #. Use this macro in XTFPGA platform
data definitions.
This fixes inability to use hardware IRQ #0 in configurations that don't
use device tree and allows for non-identity mapping between linux IRQ #
and hardware IRQ #.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Relying on qede to trigger qedr on startup is problematic. When probing
both if qedr loads slowly then qede can assume qedr is missing and not
trigger it. This patch adds a triggering from qedr and protects against
a race via an atomic bit.
First, log prefix will be truncated to NF_LOG_PREFIXLEN-1, i.e. 127,
at nf_log_packet(), so the extra part is useless.
Second, after adding a log rule with a very very long prefix, we will
fail to dump the nft rules after this _special_ one, but acctually,
they do exist. For example:
# name_65000=$(printf "%0.sQ" {1..65000})
# nft add rule filter output log prefix "$name_65000"
# nft add rule filter output counter
# nft add rule filter output counter
# nft list chain filter output
table ip filter {
chain output {
type filter hook output priority 0; policy accept;
}
}
So now, restrict the log prefix length to NF_LOG_PREFIXLEN-1.
If the element exists and no NLM_F_EXCL is specified, do not bump
set->nelems, otherwise we leak one set element slot. This problem
amplifies if the set is full since the abort path always decrements the
counter for the -ENFILE case too, giving one spare extra slot.
Fix this by moving set->nelems update to nft_add_set_elem() after
successful element insertion. Moreover, remove the element if the set is
full so there is no need to rely on the abort path to undo things
anymore.
Fixes: c016c7e45ddf ("netfilter: nf_tables: honor NLM_F_EXCL flag in set element insertion") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We trigger a soft lockup as we grab nametbl_lock twice if the node
has a pending node up/down or link up/down event while:
- we process an incoming named message in tipc_named_rcv() and
perform an tipc_update_nametbl().
- we have pending backlog items in the name distributor queue
during a nametable update using tipc_nametbl_publish() or
tipc_nametbl_withdraw().
The following are the call chain associated:
tipc_named_rcv() Grabs nametbl_lock
tipc_update_nametbl() (publish/withdraw)
tipc_node_subscribe()/unsubscribe()
tipc_node_write_unlock()
<< lockup occurs if an outstanding node/link event
exits, as we grabs nametbl_lock again >>
tipc_nametbl_withdraw() Grab nametbl_lock
tipc_named_process_backlog()
tipc_update_nametbl()
<< rest as above >>
The function tipc_node_write_unlock(), in addition to releasing the
lock processes the outstanding node/link up/down events. To do this,
we need to grab the nametbl_lock again leading to the lockup.
In this commit we fix the soft lockup by introducing a fast variant of
node_unlock(), where we just release the lock. We adapt the
node_subscribe()/node_unsubscribe() to use the fast variants.
Reported-and-Tested-by: John Thompson <thompa.atl@gmail.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Until now, the subscribers keep track of the subscriptions using
reference count at subscriber level. At subscription cancel or
subscriber delete, we delete the subscription only if the timer
was pending for the subscription. This approach is incorrect as:
1. del_timer() is not SMP safe, if on CPU0 the check for pending
timer returns true but CPU1 might schedule the timer callback
thereby deleting the subscription. Thus when CPU0 is scheduled,
it deletes an invalid subscription.
2. We export tipc_subscrp_report_overlap(), which accesses the
subscription pointer multiple times. Meanwhile the subscription
timer can expire thereby freeing the subscription and we might
continue to access the subscription pointer leading to memory
violations.
In this commit, we introduce subscription refcount to avoid deleting
an invalid subscription.
Reported-and-Tested-by: John Thompson <thompa.atl@gmail.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Until now, the generic server framework maintains the connection
id's per subscriber in server's conn_idr. At tipc_close_conn, we
remove the connection id from the server list, but the connection is
valid until we call the refcount cleanup. Hence we have a window
where the server allocates the same connection to an new subscriber
leading to inconsistent reference count. We have another refcount
warning we grab the refcount in tipc_conn_lookup() for connections
with flag with CF_CONNECTED not set. This usually occurs at shutdown
when the we stop the topology server and withdraw TIPC_CFG_SRV
publication thereby triggering a withdraw message to subscribers.
In this commit, we:
1. remove the connection from the server list at recount cleanup.
2. grab the refcount for a connection only if CF_CONNECTED is set.
Tested-by: John Thompson <thompa.atl@gmail.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In tipc_conn_sendmsg(), we first queue the request to the outqueue
followed by the connection state check. If the connection is not
connected, we should not queue this message.
In this commit, we reject the messages if the connection state is
not CF_CONNECTED.
Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Tested-by: John Thompson <thompa.atl@gmail.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is needed on HS38 cores, for setting up IO-Coherency aperture properly
The polling could perturb the caches and coherecy fabric which could be
wrong in the small window when Master is setting up IOC aperture etc
in arc_cache_init()
We do it only for ARCv2 based builds to not affect EZChip ARCompact
based platform.
For run-on-reset SMP configs, non master cores call a routine which
waits until Master gives it a "go" signal (currently using a shared
mem flag). The same routine then jumps off the well known entry point of
all non Master cores i.e. @first_lines_of_secondary
This patch moves out the last part into one single place in early boot
code.
This is better in terms of absraction (the wait API only waits) and
returns, leaving out the "jump off to" part.
In actual implementation this requires some restructuring of the early
boot code as well as Master now jumps to BSS setup explicitly,
vs. falling thru into it before.
Technically this patch doesn't cause any functional change, it just
moves the ugly #ifdef'ry from assembly code to "C"
On an overloaded system, it is possible that a change in the watchdog
threshold can be delayed long enough to trigger a false positive.
This can easily be achieved by having a cpu spinning indefinitely on a
task, while another cpu updates watchdog threshold.
What happens is while trying to park the watchdog threads, the hrtimers
on the other cpus trigger and reprogram themselves with the new slower
watchdog threshold. Meanwhile, the nmi watchdog is still programmed
with the old faster threshold.
Because the one cpu is blocked, it prevents the thread parking on the
other cpus from completing, which is needed to shutdown the nmi watchdog
and reprogram it correctly. As a result, a false positive from the nmi
watchdog is reported.
Fix this by setting a park_in_progress flag to block all lockups until
the parking is complete.
This is an attempt to cleanup watchdog handlers. Right now,
kernel/watchdog.c implements both softlockup and hardlockup detectors.
Softlockup code is generic. Hardlockup code is arch specific. Some
architectures don't use hardlockup detectors. They use their own
watchdog detectors. To make both these combination work, we have
numerous #ifdefs in kernel/watchdog.c.
We are trying here to make these handlers independent of each other.
Also provide an interface for architectures to implement their own
handlers. watchdog_nmi_enable and watchdog_nmi_disable will be defined
as weak such that architectures can override its definitions.
Thanks to Don Zickus for his suggestions.
Here are our previous discussions
http://www.spinics.net/lists/sparclinux/msg16543.html
http://www.spinics.net/lists/sparclinux/msg16441.html
This patch (of 3):
Move shared macros and definitions to nmi.h so that watchdog.c, new file
watchdog_hld.c or any other architecture specific handler can use those
definitions.
Separate hardlockup code from watchdog.c and move it to watchdog_hld.c.
It is mostly straight forward. Remove everything inside
CONFIG_HARDLOCKUP_DETECTORS. This code will go to file watchdog_hld.c.
Also update the makefile accordigly.
With >=32 CPUs the userfaultfd selftest triggered a graceful but
unexpected SIGBUS because VM_FAULT_RETRY was returned by
handle_userfault() despite the UFFDIO_COPY wasn't completed.
This seems caused by rwsem waking the thread blocked in
handle_userfault() and we can't run up_read() before the wait_event
sequence is complete.
Keeping the wait_even sequence identical to the first one, would require
running userfaultfd_must_wait() again to know if the loop should be
repeated, and it would also require retaking the rwsem and revalidating
the whole vma status.
It seems simpler to wait the targeted wakeup so that if false wakeups
materialize we still wait for our specific wakeup event, unless of
course there are signals or the uffd was released.
Debug code collecting the stack trace of the wakeup showed this:
This always happens when the main userfault selftest thread is running
clone() while glibc runs either mprotect or mmap (both taking mmap_sem
down_write()) to allocate the thread stack of the background threads,
while locking/userfault threads already run at full throttle and are
susceptible to false wakeups that may cause handle_userfault() to return
before than expected (which results in graceful SIGBUS at the next
attempt).
This was reproduced only with >=32 CPUs because the loop to start the
thread where clone() is too quick with fewer CPUs, while with 32 CPUs
there's already significant activity on ~32 locking and userfault
threads when the last background threads are started with clone().
This >=32 CPUs SMP race condition is likely reproducible only with the
selftest because of the much heavier userfault load it generates if
compared to real apps.
We'll have to allow "one more" VM_FAULT_RETRY for the WP support and a
patch floating around that provides it also hidden this problem but in
reality only is successfully at hiding the problem.
False wakeups could still happen again the second time
handle_userfault() is invoked, even if it's a so rare race condition
that getting false wakeups twice in a row is impossible to reproduce.
This full fix is needed for correctness, the only alternative would be
to allow VM_FAULT_RETRY to be returned infinitely. With this fix the WP
support can stick to a strict "one more" VM_FAULT_RETRY logic (no need
of returning it infinite times to avoid the SIGBUS).
Link: http://lkml.kernel.org/r/20170111005535.13832-2-aarcange@redhat.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reported-by: Shubham Kumar Sharma <shubham.kumar.sharma@oracle.com> Tested-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Michael Rapoport <RAPOPORT@il.ibm.com> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Cc: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 8a59f5d25265 ("fs/romfs: return f_fsid for statfs(2)") generates
a 64bit id from sb->s_bdev->bd_dev. This is only correct when romfs is
defined with CONFIG_ROMFS_ON_BLOCK. If romfs is only defined with
CONFIG_ROMFS_ON_MTD, sb->s_bdev is NULL, referencing sb->s_bdev->bd_dev
will triger an oops.
Richard Weinberger points out that when CONFIG_ROMFS_BACKED_BY_BOTH=y,
both CONFIG_ROMFS_ON_BLOCK and CONFIG_ROMFS_ON_MTD are defined.
Therefore when calling huge_encode_dev() to generate a 64bit id, I use
the follow order to choose parameter,
- CONFIG_ROMFS_ON_BLOCK defined
use sb->s_bdev->bd_dev
- CONFIG_ROMFS_ON_BLOCK undefined and CONFIG_ROMFS_ON_MTD defined
use sb->s_dev when,
- both CONFIG_ROMFS_ON_BLOCK and CONFIG_ROMFS_ON_MTD undefined
leave id as 0
When CONFIG_ROMFS_ON_MTD is defined and sb->s_mtd is not NULL, sb->s_dev
is set to a device ID generated by MTD_BLOCK_MAJOR and mtd index,
otherwise sb->s_dev is 0.
This is a try-best effort to generate a uniq file system ID, if all the
above conditions are not meet, f_fsid of this romfs instance will be 0.
Generally only one romfs can be built on single MTD block device, this
method is enough to identify multiple romfs instances in a computer.
Link: http://lkml.kernel.org/r/1482928596-115155-1-git-send-email-colyli@suse.de Signed-off-by: Coly Li <colyli@suse.de> Reported-by: Nong Li <nongli1031@gmail.com> Tested-by: Nong Li <nongli1031@gmail.com> Cc: Richard Weinberger <richard.weinberger@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sctp_addr_id2transport is a function for sockopt to look up assoc by
address. As the address is from userspace, it can be a v4-mapped v6
address. But in sctp protocol stack, it always handles a v4-mapped
v6 address as a v4 address. So it's necessary to convert it to a v4
address before looking up assoc by address.
This patch is to fix it by calling sctp_verify_addr in which it can do
this conversion before calling sctp_endpoint_lookup_assoc, just like
what sctp_sendmsg and __sctp_connect do for the address from users.
Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now sctp gso puts segments into skb's frag_list, then processes these
segments in skb_segment. But skb_segment handles them only when gs is
enabled, as it's in the same branch with skb's frags.
Although almost all the NICs support sg other than some old ones, but
since commit 1e16aa3ddf86 ("net: gso: use feature flag argument in all
protocol gso handlers"), features &= skb->dev->hw_enc_features, and
xfrm_output_gso call skb_segment with features = 0, which means sctp
gso would call skb_segment with sg = 0, and skb_segment would not work
as expected.
This patch is to fix it by setting features param with NETIF_F_SG when
calling skb_segment so that it can go the right branch to process the
skb's frag_list.
Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
bnxt_get_port_module_status() calls bnxt_update_link() which expects
RTNL to be held. In bnxt_sp_task() that does not hold RTNL, we need to
call it with a prior call to bnxt_rtnl_lock_sp() and the call needs to
be moved to the end of bnxt_sp_task().
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
bnxt_update_link() is called from multiple code paths. Most callers,
such as open, ethtool, already hold RTNL. Only the caller bnxt_sp_task()
does not. So it is a bug to take RTNL inside bnxt_update_link().
Fix it by removing the RTNL inside bnxt_update_link(). The function
now expects the caller to always hold RTNL.
In bnxt_sp_task(), call bnxt_rtnl_lock_sp() before calling
bnxt_update_link(). We also need to move the call to the end of
bnxt_sp_task() since it will be clearing the BNXT_STATE_IN_SP_TASK bit.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On some dual port NICs, the speed setting on one port can affect the
available speed on the other port. Add logic to detect these changes
and adjust the advertised speed settings when necessary.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In bnxt_sp_task(), we set a bit BNXT_STATE_IN_SP_TASK so that bnxt_close()
will synchronize and wait for bnxt_sp_task() to finish. Some functions
in bnxt_sp_task() require us to clear BNXT_STATE_IN_SP_TASK and then
acquire rtnl_lock() to prevent race conditions.
There are some bugs related to this logic. This patch refactors the code
to have common bnxt_rtnl_lock_sp() and bnxt_rtnl_unlock_sp() to handle
the RTNL and the clearing/setting of the bit. Multiple functions will
need the same logic. We also need to move bnxt_reset() to the end of
bnxt_sp_task(). Functions that clear BNXT_STATE_IN_SP_TASK must be the
last functions to be called in bnxt_sp_task(). The common scheme will
handle the condition properly.
Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the binding was defined, I was not aware that mt2701 was an earlier
version of the SoC. For sake of consistency, the ethernet driver should
use mt2701 inside the compat string as this is the earliest SoC with the
ethernet core.
The ethernet driver is currently of no real use until we finish and
upstream the DSA driver. There are no users of this binding yet. It should
be safe to fix this now before it is too late and we need to provide
backward compatibility for the mt7623-eth compat string.
Reported-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: John Crispin <john@phrozen.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>