For physical addresses, since the address may exceed 32-bit address range
after calculation, we should use 0x%8.8X%8.8X instead of ACPI_PRINTF_UINT
and ACPI_FORMAT_UINT64() instead of
ACPI_FORMAT_NATIVE_UINT()/ACPI_FORMAT_TO_UINT().
This patch also removes above replaced macros as there are no users.
This is a preparation to switch acpi_physical_address to 64-bit on 32-bit
kernel builds.
Link: https://github.com/acpica/acpica/commit/b6061237 Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Dirk Behme <dirk.behme@gmail.com>
[gdavis: Move tbprint.c changes to tbutils.c due to lack of commit
"42f4786 ACPICA: Split table print utilities to a new a
separate file" in linux-3.10.y] Signed-off-by: George G. Davis <george_davis@mentor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
For physical addresses, since the address may exceed 32-bit address range
after calculation, we should use %8.8X%8.8X (see ACPI_FORMAT_UINT64()) to
convert the %p formats.
This is a preparation to switch acpi_physical_address to 64-bit on 32-bit
kernel builds.
Link: https://github.com/acpica/acpica/commit/7f06739d Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Dirk Behme <dirk.behme@gmail.com>
[gdavis: Move tbinstall.c changes to tbutils.c due to lack of commit
"42f4786 ACPICA: Split table print utilities to a new a
separate file" in linux-3.10.y] Signed-off-by: George G. Davis <george_davis@mentor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2:
- Drop inapplicable changes to drivers/acpi/acpica/utaddress.c and
acpi_tb_install_table()
- Fix similar format issues in acpi_tb_add_table() and
acpi_tb_install_table() that aren't present upstream] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cleanup output for Processor(). Length is a byte, not a word.
Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
OSPMs like Linux expect an acpi_physical_address returning value from
acpi_find_root_pointer(). This triggers warnings if sizeof (acpi_size) doesn't
equal to sizeof (acpi_physical_address):
drivers/acpi/osl.c:275:3: warning: passing argument 1 of 'acpi_find_root_pointer' from incompatible pointer type [enabled by default]
In file included from include/acpi/acpi.h:64:0,
from include/linux/acpi.h:36,
from drivers/acpi/osl.c:41:
include/acpi/acpixf.h:433:1: note: expected 'acpi_size *' but argument is of type 'acpi_physical_address *'
This patch corrects acpi_find_root_pointer().
Link: https://github.com/acpica/acpica/commit/7d9fd643 Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Dirk Behme <dirk.behme@gmail.com> Signed-off-by: George G. Davis <george_davis@mentor.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Cc: Konrad Eisele <konrad@gaisler.com> Cc: Daniel Hellstrom <daniel@gaisler.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
__sync_fetch_and_and and __sync_fetch_and_or are functions that are provided
by gcc and depending on the target architecture may be implemented in libgcc,
which is not always available in the kernel. This leads to a build failure
on ARMv5:
drivers/built-in.o: In function `line6_pcm_release':
:(.text+0x3bfe80): undefined reference to `__sync_fetch_and_and_4'
drivers/built-in.o: In function `line6_pcm_acquire':
:(.text+0x3bff30): undefined reference to `__sync_fetch_and_or_4'
To work around this, we can use the kernel-provided cmpxchg macro.
Build-tested only.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Markus Grabner <grabner@icg.tugraz.at> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.2:
- Adjust context
- Fix up two more instances of __sync_fetch_and_and() that were removed
separately upstream] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The Debian experimental linux source package (3.8.5-1) build fails
with the following errors:
...
MODPOST 2016 modules
ERROR: "__ucmpdi2" [fs/btrfs/btrfs.ko] undefined!
ERROR: "__ucmpdi2" [drivers/md/dm-verity.ko] undefined!
The attached patch resolves this problem. It is based on the s390
implementation of ucmpdi2.c.
Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Running mtd-utils/tests/ubi-tests/io_basic.c could cause
soft lockup or watchdog reset. It is because *updatevol*
will perform ubi_check_volume() after updating finish
and this function will full scan the updated lebs if the
volume is initialized as STATIC_VOLUME.
This patch adds *cond_resched()* in the loop of lebs scan
to avoid soft lockup.
Helped by Richard Weinberger <richard@nod.at>
[ 2158.067096] INFO: rcu_sched self-detected stall on CPU { 1} (t=2101 jiffies g=1606 c=1605 q=56)
[ 2158.172867] CPU: 1 PID: 2073 Comm: io_basic Tainted: G O 3.10.53 #21
[ 2158.172898] [<c000f624>] (unwind_backtrace+0x0/0x120) from [<c000c294>] (show_stack+0x10/0x14)
[ 2158.172918] [<c000c294>] (show_stack+0x10/0x14) from [<c008ac3c>] (rcu_check_callbacks+0x1c0/0x660)
[ 2158.172936] [<c008ac3c>] (rcu_check_callbacks+0x1c0/0x660) from [<c002b480>] (update_process_times+0x38/0x64)
[ 2158.172953] [<c002b480>] (update_process_times+0x38/0x64) from [<c005ff38>] (tick_sched_handle+0x54/0x60)
[ 2158.172966] [<c005ff38>] (tick_sched_handle+0x54/0x60) from [<c00601ac>] (tick_sched_timer+0x44/0x74)
[ 2158.172978] [<c00601ac>] (tick_sched_timer+0x44/0x74) from [<c003f348>] (__run_hrtimer+0xc8/0x1b8)
[ 2158.172992] [<c003f348>] (__run_hrtimer+0xc8/0x1b8) from [<c003fd9c>] (hrtimer_interrupt+0x128/0x2a4)
[ 2158.173007] [<c003fd9c>] (hrtimer_interrupt+0x128/0x2a4) from [<c0246f1c>] (arch_timer_handler_virt+0x28/0x30)
[ 2158.173022] [<c0246f1c>] (arch_timer_handler_virt+0x28/0x30) from [<c0086214>] (handle_percpu_devid_irq+0x9c/0x124)
[ 2158.173036] [<c0086214>] (handle_percpu_devid_irq+0x9c/0x124) from [<c0082bd8>] (generic_handle_irq+0x20/0x30)
[ 2158.173049] [<c0082bd8>] (generic_handle_irq+0x20/0x30) from [<c000969c>] (handle_IRQ+0x64/0x8c)
[ 2158.173060] [<c000969c>] (handle_IRQ+0x64/0x8c) from [<c0008544>] (gic_handle_irq+0x3c/0x60)
[ 2158.173074] [<c0008544>] (gic_handle_irq+0x3c/0x60) from [<c02f0f80>] (__irq_svc+0x40/0x50)
[ 2158.173083] Exception stack(0xc4043c98 to 0xc4043ce0)
[ 2158.173092] 3c80: c4043ce400000019
[ 2158.173102] 3ca0: 1f8a865fc050ad101f8a864c00000031c04b59700003ebce00000000f3550000
[ 2158.173113] 3cc0: bf00bc68000008000003ebcec4043ce0c0186d14c0186cb880000013ffffffff
[ 2158.173130] [<c02f0f80>] (__irq_svc+0x40/0x50) from [<c0186cb8>] (read_current_timer+0x4/0x38)
[ 2158.173145] [<c0186cb8>] (read_current_timer+0x4/0x38) from [<1f8a865f>] (0x1f8a865f)
[ 2183.927097] BUG: soft lockup - CPU#1 stuck for 22s! [io_basic:2073]
[ 2184.002229] Modules linked in: nandflash(O) [last unloaded: nandflash]
Signed-off-by: Wang Kai <morgan.wang@huawei.com> Signed-off-by: hujianyang <hujianyang@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Returns a non-zero value if the current processor implementation requires
an IHB instruction to deal with an instruction hazard as per MIPS R2
architecture specification, zero otherwise.
For a discussion, see http://patchwork.linux-mips.org/patch/9539/.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
[bwh: Backported to 3.2: trim the CPU type list] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
udelay() in PCI/PCIe read/write callbacks cause 30ms IRQ latency on Octeon
platforms because these operations are called from PCI_OP_READ() and
PCI_OP_WRITE() under raw_spin_lock_irqsave().
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Cc: linux-mips@linux-mips.org Cc: David Daney <ddaney@cavium.com> Cc: Rob Herring <robh@kernel.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Masanari Iida <standby24x7@gmail.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Mathias <mathias.rulf@nokia.com>
Patchwork: https://patchwork.linux-mips.org/patch/9576/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The lazy cache flushing implemented in the MIPS kernel suffers from a
race condition that is exposed by do_set_pte() in mm/memory.c.
A pre-condition is a file-system that writes to the page from the CPU
in its readpage method and then calls flush_dcache_page(). One example
is ubifs. Another pre-condition is that the dcache flush is postponed
in __flush_dcache_page().
Upon a page fault for an executable mapping not existing in the
page-cache, the following will happen:
1. Write to the page
2. flush_dcache_page
3. flush_icache_page
4. set_pte_at
5. update_mmu_cache (commits the flush of a dcache-dirty page)
Between steps 4 and 5 another thread can hit the same page and it will
encounter a valid pte. Because the data still is in the L1 dcache the CPU
will fetch stale data from L2 into the icache and execute garbage.
This fix moves the commit of the cache flush to step 3 to close the
race window. It also reduces the amount of flushes on non-executable
mappings because we never enter __flush_dcache_page() for non-aliasing
CPUs.
Regressions can occur in drivers that mistakenly relies on the
flush_dcache_page() in get_user_pages() for DMA operations.
[ralf@linux-mips.org: Folded in patch 9346 to fix highmem issue.]
The stop machine logic can lock up if all but one of the migration
threads make it through the disable-irq step and the one remaining
thread gets stuck in __do_softirq. The reason __do_softirq can hang is
that it has a bail-out based on jiffies timeout, but in the lockup case,
jiffies itself is not incremented.
To work around this, re-add the max_restart counter in __do_irq and stop
processing irqs after 10 restarts.
Thanks to Tejun Heo and Rusty Russell and others for helping me track
this down.
This was introduced in 3.9 by commit c10d73671ad3 ("softirq: reduce
latencies").
It may be worth looking into ath9k to see if it has issues with its irq
handler at a later date.
The hang stack traces look something like this:
------------[ cut here ]------------
WARNING: at kernel/watchdog.c:245 watchdog_overflow_callback+0x9c/0xa7()
Watchdog detected hard LOCKUP on cpu 2
Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc]
Pid: 23, comm: migration/2 Tainted: G C 3.9.4+ #11
Call Trace:
<NMI> warn_slowpath_common+0x85/0x9f
warn_slowpath_fmt+0x46/0x48
watchdog_overflow_callback+0x9c/0xa7
__perf_event_overflow+0x137/0x1cb
perf_event_overflow+0x14/0x16
intel_pmu_handle_irq+0x2dc/0x359
perf_event_nmi_handler+0x19/0x1b
nmi_handle+0x7f/0xc2
do_nmi+0xbc/0x304
end_repeat_nmi+0x1e/0x2e
<<EOE>>
cpu_stopper_thread+0xae/0x162
smpboot_thread_fn+0x258/0x260
kthread+0xc7/0xcf
ret_from_fork+0x7c/0xb0
---[ end trace 4947dfa9b0a4cec3 ]---
BUG: soft lockup - CPU#1 stuck for 22s! [migration/1:17]
Modules linked in: ath9k ath9k_common ath9k_hw ath mac80211 cfg80211 nfsv4 auth_rpcgss nfs fscache nf_nat_ipv4 nf_nat veth 8021q garp stp mrp llc pktgen lockd sunrpc]
irq event stamp: 835637905
hardirqs last enabled at (835637904): __do_softirq+0x9f/0x257
hardirqs last disabled at (835637905): apic_timer_interrupt+0x6d/0x80
softirqs last enabled at (5654720): __do_softirq+0x1ff/0x257
softirqs last disabled at (5654725): irq_exit+0x5f/0xbb
CPU 1
Pid: 17, comm: migration/1 Tainted: G WC 3.9.4+ #11 To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M.
RIP: tasklet_hi_action+0xf0/0xf0
Process migration/1
Call Trace:
<IRQ>
__do_softirq+0x117/0x257
irq_exit+0x5f/0xbb
smp_apic_timer_interrupt+0x8a/0x98
apic_timer_interrupt+0x72/0x80
<EOI>
printk+0x4d/0x4f
stop_machine_cpu_stop+0x22c/0x274
cpu_stopper_thread+0xae/0x162
smpboot_thread_fn+0x258/0x260
kthread+0xc7/0xcf
ret_from_fork+0x7c/0xb0
Signed-off-by: Ben Greear <greearb@candelatech.com> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Pekka Riikonen <priikone@iki.fi> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Rui Xiang <rui.xiang@huawei.com>
In various network workloads, __do_softirq() latencies can be up
to 20 ms if HZ=1000, and 200 ms if HZ=100.
This is because we iterate 10 times in the softirq dispatcher,
and some actions can consume a lot of cycles.
This patch changes the fallback to ksoftirqd condition to :
- A time limit of 2 ms.
- need_resched() being set on current task
When one of this condition is met, we wakeup ksoftirqd for further
softirq processing if we still have pending softirqs.
Using need_resched() as the only condition can trigger RCU stalls,
as we can keep BH disabled for too long.
I ran several benchmarks and got no significant difference in
throughput, but a very significant reduction of latencies (one order
of magnitude) :
In following bench, 200 antagonist "netperf -t TCP_RR" are started in
background, using all available cpus.
Then we start one "netperf -t TCP_RR", bound to the cpu handling the NIC
IRQ (hard+soft)
Before patch :
# netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k
RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind
RT_LATENCY=550110.424
MIN_LATENCY=146858
MAX_LATENCY=997109
P50_LATENCY=305000
P90_LATENCY=550000
P99_LATENCY=710000
MEAN_LATENCY=376989.12
STDDEV_LATENCY=184046.92
After patch :
# netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k
RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind
RT_LATENCY=40545.492
MIN_LATENCY=9834
MAX_LATENCY=78366
P50_LATENCY=33583
P90_LATENCY=59000
P99_LATENCY=69000
MEAN_LATENCY=38364.67
STDDEV_LATENCY=12865.26
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Miller <davem@davemloft.net> Cc: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Rui Xiang <rui.xiang@huawei.com>
Since commit 8a0a9bd4db63bc45e301, this comment in mmap_rnd() does not
hold true as the value returned by get_random_int() will in fact be
different every single call. Remove the comment and simplify the code
back to its original desired form.
This reverts commit a5adc91a4b44b5d1 which is no longer necessary and
also fixes the sparc code that copied this same adjustment.
Signed-off-by: Dan McGee <dpmcgee@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Moritz Mühlenhoff <jmm@inutil.org>
__ptrace_may_access() checks get_dumpable/ptrace_has_cap/etc if task !=
current, this can can lead to surprising results.
For example, a sub-thread can't readlink("/proc/self/exe") if the
executable is not readable. setup_new_exec()->would_dump() notices that
inode_permission(MAY_READ) fails and then it does
set_dumpable(suid_dumpable). After that get_dumpable() fails.
(It is not clear why proc_pid_readlink() checks get_dumpable(), perhaps we
could add PTRACE_MODE_NODUMPABLE)
Change __ptrace_may_access() to use same_thread_group() instead of "task
== current". Any security check is pointless when the tasks share the
same ->mm.
Signed-off-by: Mark Grondona <mgrondona@llnl.gov> Signed-off-by: Ben Woodard <woodard@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Sheng Yong <shengyong1@huawei.com>
The root cause seems to be the
default_send_IPI_mask_allbutself_phys() takes quite some time (I
measured it could be several ms) to complete sending NMIs to all
the other 23 CPUs, and for HZ=250/1000 system, the time is long
enough for a timer interrupt to happen, which will in turn
trigger to kick load balance to a stopped CPU and cause this
warning in native_smp_send_reschedule().
So disabling the local irq before stop_other_cpu() can fix this
problem (tested 25 times reboot ok), and it is fine as there
should be nobody caring the timer interrupt in such reboot
stage.
The latest 3.4 kernel slightly changes this behavior by sending
REBOOT_VECTOR first and only send NMI_VECTOR if the REBOOT_VCTOR
fails, and this patch is still needed to prevent the problem.
Signed-off-by: Feng Tang <feng.tang@intel.com> Acked-by: Don Zickus <dzickus@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20120530231541.4c13433a@feng-i7 Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Vinson Lee <vlee@twopensource.com>
A huge amount of NIC drivers use the DMA API, however if
compiled under 32-bit an very important part of the DMA API can
be ommitted leading to the drivers not working at all
(especially if used with 'swiotlb=force iommu=soft').
As Prashant Sreedharan explains it: "the driver [tg3] uses
DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of
the dma "mapping" and dma_unmap_addr() to get the "mapping"
value. On most of the platforms this is a no-op, but ... with
"iommu=soft and swiotlb=force" this house keeping is required,
... otherwise we pass 0 while calling pci_unmap_/pci_dma_sync_
instead of the DMA address."
As such enable this even when using 32-bit kernels.
Reported-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Prashant Sreedharan <prashant@broadcom.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Chan <mchan@broadcom.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: boris.ostrovsky@oracle.com Cc: cascardo@linux.vnet.ibm.com Cc: david.vrabel@citrix.com Cc: sanjeevb@broadcom.com Cc: siva.kallam@broadcom.com Cc: vyasevich@gmail.com Cc: xen-devel@lists.xensource.com Link: http://lkml.kernel.org/r/20150417190448.GA9462@l.oracle.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.2: also change the def_bool (cond) to
def_bool y + depends] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
We found that after v3.10.73, recvmsg might return -EFAULT while -EINVAL
was expected.
We tested it through the recvmsg01 testcase come from LTP testsuit. It set
msg->msg_namelen to -1 and the recvmsg syscall returned errno 14, which is
unexpected (errno 22 is expected):
Linux mainline has no this bug for commit 08adb7dab fixes it accidentally.
However, it is too large and complex to be backported to LTS 3.10.
Commit 281c9c36 (net: compat: Update get_compat_msghdr() to match
copy_msghdr_from_user() behaviour) made get_compat_msghdr() return
error if msg_sys->msg_namelen was negative, which changed the behaviors
of recvmsg and sendmsg syscall in a lib32 system:
Before commit 281c9c36, get_compat_msghdr() wouldn't fail and it would
return -EINVAL in move_addr_to_user() or somewhere if msg_sys->msg_namelen
was invalid and then syscall returned -EINVAL, which is correct.
And now, when msg_sys->msg_namelen is negative, get_compat_msghdr() will
fail and wants to return -EINVAL, however, the outer syscall will return
-EFAULT directly, which is unexpected.
This patch gets the return value of get_compat_msghdr() as well as
copy_msghdr_from_user(), then returns this expected value if
get_compat_msghdr() fails.
Fixes: 281c9c36 (net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour) Signed-off-by: Junling Zheng <zhengjunling@huawei.com> Signed-off-by: Hanbing Xu <xuhanbing@huawei.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: David Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Current implementation of unfreeze_partials() is so complicated,
but benefit from it is insignificant. In addition many code in
do {} while loop have a bad influence to a fail rate of cmpxchg_double_slab.
Under current implementation which test status of cpu partial slab
and acquire list_lock in do {} while loop,
we don't need to acquire a list_lock and gain a little benefit
when front of the cpu partial slab is to be discarded, but this is a rare case.
In case that add_partial is performed and cmpxchg_double_slab is failed,
remove_partial should be called case by case.
I think that these are disadvantages of current implementation,
so I do refactoring unfreeze_partials().
Minimizing code in do {} while loop introduce a reduced fail rate
of cmpxchg_double_slab. Below is output of 'slabinfo -r kmalloc-256'
when './perf stat -r 33 hackbench 50 process 4000 > /dev/null' is done.
No regression is found, but rather we can see slightly better result.
Acked-by: Christoph Lameter <cl@linux.com> Signed-off-by: Joonsoo Kim <js1304@gmail.com> Signed-off-by: Pekka Enberg <penberg@kernel.org>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Zefan Li <lizefan@huawei.com>
Ben Hutchings [Sat, 1 Aug 2015 18:55:11 +0000 (19:55 +0100)]
debugfs: Fix statfs() regression in 3.2.69
Commit 915f4f86ddc4 ("debugfs: leave freeing a symlink body until inode
eviction", commit 0db59e59299f upstream) changed debugfs to define its
own super_operations and implement the evict_inode operation.
Luis Henriques pointed out that it needs to define the statfs
operation, as in simple_super_operations which it was using before.
Reported-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
There is NULL pointer dereference possible during statistics update if the route
used for OOTB responce is removed at unfortunate time. If the route exists when
we receive OOTB packet and we finally jump into sctp_packet_transmit() to send
ABORT, but in the meantime route is removed under our feet, we take "no_route"
path and try to update stats with IP_INC_STATS(sock_net(asoc->base.sk), ...).
But sctp_ootb_pkt_new() used to prepare responce packet doesn't call
sctp_transport_set_owner() and therefore there is no asoc associated with this
packet. Probably temporary asoc just for OOTB responces is overkill, so just
introduce a check like in all other places in sctp_packet_transmit(), where
"asoc" is dereferenced.
To reproduce this, one needs to
0. ensure that sctp module is loaded (otherwise ABORT is not generated)
1. remove default route on the machine
2. while true; do
ip route del [interface-specific route]
ip route add [interface-specific route]
done
3. send enough OOTB packets (i.e. HB REQs) from another host to trigger ABORT
responce
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: sctp alway uses init_net] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The lockless lookups can return entry that is unlinked.
Sometimes they get reference before last neigh_cleanup_and_release,
sometimes they do not need reference. Later, any
modification attempts may result in the following problems:
1. entry is not destroyed immediately because neigh_update
can start the timer for dead entry, eg. on change to NUD_REACHABLE
state. As result, entry lives for some time but is invisible
and out of control.
2. __neigh_event_send can run in parallel with neigh_destroy
while refcnt=0 but if timer is started and expired refcnt can
reach 0 for second time leading to second neigh_destroy and
possible crash.
Thanks to Eric Dumazet and Ying Xue for their work and analyze
on the __neigh_event_send change.
Fixes: 767e97e1e0db ("neigh: RCU conversion of struct neighbour") Fixes: a263b3093641 ("ipv4: Make neigh lookups directly in output packet path.") Fixes: 6fd6ce2056de ("ipv6: Do not depend on rt->n in ip6_finish_output2().") Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Ying Xue <ying.xue@windriver.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: drop change to __neigh_set_probe_once() which
we don't have] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
PACKET_FANOUT_LB computes f->rr_cur such that it is modulo
f->num_members. It returns the old value unconditionally, but
f->num_members may have changed since the last store. Ensure
that the return value is always < num.
When modifying the logic, simplify it further by replacing the loop
with an unconditional atomic increment.
Fixes: dc99f600698d ("packet: Add fanout support.") Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2:
- Adjust context
- Demux functions return a sock pointer, not an index] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
We need to tell compiler it must not read f->num_members multiple
times. Otherwise testing if num is not zero is flaky, and we could
attempt an invalid divide by 0 in fanout_demux_cpu()
Note bug was present in packet_rcv_fanout_hash() and
packet_rcv_fanout_lb() but final 3.1 had a simple location
after commit 95ec3eb417115fb ("packet: Add 'cpu' fanout policy.")
Fixes: dc99f600698dc ("packet: Add fanout support.") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
After the ->set() spinlocks were removed br_stp_set_bridge_priority
was left running without any protection when used via sysfs. It can
race with port add/del and could result in use-after-free cases and
corrupted lists. Tested by running port add/del in a loop with stp
enabled while setting priority in a loop, crashes are easily
reproducible.
The spinlocks around sysfs ->set() were removed in commit: 14f98f258f19 ("bridge: range check STP parameters")
There's also a race condition in the netlink priority support that is
fixed by this change, but it was introduced recently and the fixes tag
covers it, just in case it's needed the commit is: af615762e972 ("bridge: add ageing_time, stp_state, priority over netlink")
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Fixes: 14f98f258f19 ("bridge: range check STP parameters") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When we come to tear things down in netback_remove() and generate the
uevent it is possible that the xenstore directory has already been
removed (details below).
In such cases netback_uevent() won't be able to read the hotplug
script and will write a xenstore error node.
A recent change to the hypervisor exposed this race such that we now
sometimes lose it (where apparently we didn't ever before).
Instead read the hotplug script configuration during setup and use it
for the lifetime of the backend device.
The apparently more obvious fix of moving the transition to
state=Closed in netback_remove() to after the uevent does not work
because it is possible that we are already in state=Closed (in
reaction to the guest having disconnected as it shutdown). Being
already in Closed means the toolstack is at liberty to start tearing
down the xenstore directories. In principal it might be possible to
arrange to unregister the device sooner (e.g on transition to Closing)
such that xenstore would still be there but this state machine is
fragile and prone to anger...
A modern Xen system only relies on the hotplug uevent for driver
domains, when the backend is in the same domain as the toolstack it
will run the necessary setup/teardown directly in the correct sequence
wrt xenstore changes.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Mark Salyzyn <salyzyn@android.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
----
v2: switch to sock_flag(sk, SOCK_DEAD) and added net/caif/caif_socket.c
v3: return -ECONNRESET in upstream caller of wait function for SOCK_DEAD Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently, the calibration function that corrects the initial offsets
among multiple devices only works the first time. If the function is
called more than once, the calibration fails and bogus offsets will be
programmed into the devices.
In a well hidden spot, the device documentation tells that trigger indexes
0 and 1 are special in allowing the TRIG_IF_LATE flag to actually work.
This patch fixes the issue by using one of the special triggers during the
recalibration method.
Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Commit 59a53afe70fd530040bdc69581f03d880157f15a "powerpc: Don't setup
CPUs with bad status" broke ePAPR SMP booting. ePAPR says that CPUs
that aren't presently running shall have status of disabled, with
enable-method being used to determine whether the CPU can be enabled.
Fix by checking for spin-table, which is currently the only supported
enable-method.
Signed-off-by: Scott Wood <scottwood@freescale.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Emil Medve <Emilian.Medve@Freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Jonathan Toppins <jtoppins@cumulusnetworks.com>
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[jt: fixed up context due to commit 59a53afe70fd530040bdc69581f03d880157f15a
already being backported.] Signed-off-by: Jonathan Toppins <jtoppins@cumulusnetworks.com> Reviewed-by: Andy Gospodarek <gospo@cumulusnetworks.com> Acked-by: Curt Brune <curt@cumulusnetworks.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Add a helper function for finding the index of a string in a string
list property. This helper is useful for bindings that use a separate
*-name property for attaching names to tuples in another property such
as 'reg' or 'gpios'.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
[jt: dropped changes to files not existing in the 3.2 tree] Signed-off-by: Jonathan Toppins <jtoppins@cumulusnetworks.com> Reviewed-by: Andy Gospodarek <gospo@cumulusnetworks.com> Acked-by: Curt Brune <curt@cumulusnetworks.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
->auto_asconf_splist is per namespace and mangled by functions like
sctp_setsockopt_auto_asconf() which doesn't guarantee any serialization.
Also, the call to inet_sk_copy_descendant() was backuping
->auto_asconf_list through the copy but was not honoring
->do_auto_asconf, which could lead to list corruption if it was
different between both sockets.
This commit thus fixes the list handling by using ->addr_wq_lock
spinlock to protect the list. A special handling is done upon socket
creation and destruction for that. Error handlig on sctp_init_sock()
will never return an error after having initialized asconf, so
sctp_destroy_sock() can be called without addrq_wq_lock. The lock now
will be take on sctp_close_sock(), before locking the socket, so we
don't do it in inverse order compared to sctp_addr_wq_timeout_handler().
Instead of taking the lock on sctp_sock_migrate() for copying and
restoring the list values, it's preferred to avoid rewritting it by
implementing sctp_copy_descendant().
Issue was found with a test application that kept flipping sysctl
default_auto_asconf on and off, but one could trigger it by issuing
simultaneous setsockopt() calls on multiple sockets or by
creating/destroying sockets fast enough. This is only triggerable
locally.
Fixes: 9f7d653b67ae ("sctp: Add Auto-ASCONF support (core).") Reported-by: Ji Jianwen <jiji@redhat.com> Suggested-by: Neil Horman <nhorman@tuxdriver.com> Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2:
- Adjust filename, context
- Most per-netns state is global] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
We have two problems in UDP stack related to bogus checksums :
1) We return -EAGAIN to application even if receive queue is not empty.
This breaks applications using edge trigger epoll()
2) Under UDP flood, we can loop forever without yielding to other
processes, potentially hanging the host, especially on non SMP.
This patch is an attempt to make things better.
We might in the future add extra support for rt applications
wanting to better control time spent doing a recv() in a hostile
environment. For example we could validate checksums before queuing
packets in socket receive queue.
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Ben Hutchings [Mon, 15 Jun 2015 02:51:55 +0000 (03:51 +0100)]
pipe: iovec: Fix memory corruption when retrying atomic copy as non-atomic
pipe_iov_copy_{from,to}_user() may be tried twice with the same iovec,
the first time atomically and the second time not. The second attempt
needs to continue from the iovec position, pipe buffer offset and
remaining length where the first attempt failed, but currently the
pipe buffer offset and remaining length are reset. This will corrupt
the piped data (possibly also leading to an information leak between
processes) and may also corrupt kernel memory.
This was fixed upstream by commits f0d1bec9d58d ("new helper:
copy_page_from_iter()") and 637b58c2887e ("switch pipe_read() to
copy_page_to_iter()"), but those aren't suitable for stable. This fix
for older kernel versions was made by Seth Jennings for RHEL and I
have extracted it from their update.
CVE-2015-1805
References: https://bugzilla.redhat.com/show_bug.cgi?id=1202855 Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Worse yet, reading the error message (the filter again) it says that
there was no error, when there clearly was. The issue is that the
code that checks the input does not check for balanced ops. That is,
having an op between a closed parenthesis and the next token.
This would only cause a warning, and fail out before doing any real
harm, but it should still not caues a warning, and the error reported
should work:
Link: http://lkml.kernel.org/r/20150615175025.7e809215@gandalf.local.home Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Reported-by: Vince Weaver <vincent.weaver@maine.edu> Tested-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
[bwh: Backported to 3.2: drop the check for OP_NOT, which we don't have] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Since the addition of sysfs multicast router support if one set
multicast_router to "2" more than once, then the port would be added to
the hlist every time and could end up linking to itself and thus causing an
endless loop for rlist walkers.
So to reproduce just do:
echo 2 > multicast_router; echo 2 > multicast_router;
in a bridge port and let some igmp traffic flow, for me it hangs up
in br_multicast_flood().
Fix this by adding a check in br_multicast_add_router() if the port is
already linked.
The reason this didn't happen before the addition of multicast_router
sysfs entries is because there's a !hlist_unhashed check that prevents
it.
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Fixes: 0909e11758bd ("bridge: Add multicast_router sysfs entries") Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Commit 334c86c494b9 ("MIPS: IRQ: Add stackoverflow detection") added
kernel stack overflow detection, however it only enabled it conditional
upon the preprocessor definition DEBUG_STACKOVERFLOW, which is never
actually defined. The Kconfig option is called DEBUG_STACKOVERFLOW,
which manifests to the preprocessor as CONFIG_DEBUG_STACKOVERFLOW, so
switch it to using that definition instead.
This change allows the driver to recognize newer Elantech touchpads.
Signed-off-by: Yi ju Hong <sam.hung@emc.com.tw> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Newer elantech touchpads are not recognized by the current driver, since it
fails to detect their firmware version number. This prevents more advanced
touchpad features from being usable such as two-finger scrolling. This
patch allows newer touchpads to be detected and be fully functional. Tested
on Sony Vaio SVF13N17PXB.
Signed-off-by: Jordan Rife <jrife0@gmail.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Added detection for newer Elantech touchpads, so that kernel doesn't
fall-back to default PS/2 driver. Supports touchpads released after
~August 2013. Fixes bug:
https://lists.launchpad.net/kernel-packages/msg18481.html
Tested on an Acer Aspire S7-392-6302.
Signed-off by: Matt Walker <matt.g.d.walker@gmail.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
* Fix version recognition in elantech_set_properties
The new hardware reports itself as v7 but the packets'
structure is unaltered.
* Fix packet type recognition in elantech_packet_check_v4
The bitmask used for v6 is too wide, only the last three bits of
the third byte in a packet (packet[3] & 0x03) are actually used to
distinguish between packet types.
Starting from v7, additional information (to be interpreted) is
stored in the remaining bits (packets[3] & 0x1c).
In addition, the value stored in (packet[0] & 0x0c) is no longer
a constant but contains additional information yet to be deciphered.
This change should be backwards compatible with v6 hardware.
Additional-author: Giovanni Frigione <gio.frigione@gmail.com> Signed-off-by: Matteo Delfino <kendatsuba@gmail.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Added the USB serial device ID for the HubZ dual ZigBee
and Z-Wave radio dongle.
Signed-off-by: John D. Blair <johnb@candicontrols.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The driver worked around an error in the MAYA44 USB(+)'s mixer unit
descriptor by aborting before parsing the missing field. However,
aborting parsing too early prevented parsing of the other units
connected to this unit, so the capture mixer controls would be missing.
Fix this by moving the check for this descriptor error after the parsing
of the unit's input pins.
Reported-by: nightmixes <nightmixes@gmail.com> Tested-by: nightmixes <nightmixes@gmail.com> Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Takashi Iwai <tiwai@suse.de>
[bwh: Backported to 3.2:
- Adjust context
- Logging statement was different] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Make the check to skip the rate check more lax, so that it applies
to all hw_version 4 models.
This fixes the touchpad not being detected properly on Asus PU551LA
laptops.
Reported-and-tested-by: David Zafra Gómez <dezeta@klo.es> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
We used to read file_handle twice. Once to get the amount of extra
bytes, and once to fetch the entire structure.
This may be problematic since we do size verifications only after the
first read, so if the number of extra bytes changes in userspace between
the first and second calls, we'll have an incoherent view of
file_handle.
Instead, read the constant size once, and copy that over to the final
structure without having to re-read it again.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Ben Hutchings [Tue, 21 Jul 2015 14:42:59 +0000 (15:42 +0100)]
x86_64: Fix strnlen_user() to not touch memory after specified maximum
Inspired by commit f18c34e483ff ("lib: Fix strnlen_user() to not touch
memory after specified maximum") upstream. This version of
strnlen_user(), no longer present upstream, has a similar off-by-one
error.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Jan Kara <jack@suse.cz>
See https://bugzilla.redhat.com/show_bug.cgi?id=1025672
We need to put() the reference to the scsi host that we got in
pscsi_configure_device(). In VIRTUAL_HOST mode it is associated with
the dev_virt, not the hba_virt.
Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
usb 7-1: New USB device found, idVendor=046d, idProduct=08ca
...
usb 7-1: Warning! Unlikely big volume range (=3072), cval->res is probably wrong.
usb 7-1: [5] FU [Mic Capture Volume] ch = 1, val = 4608/7680/1
Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Add the volume control quirk for avoiding the kernel warning
for the Logitech HD Webcam C525
as in the similar commit 36691e1be6ec551eef4a5225f126a281f8c051c2
for the Logitech HD Webcam C310.
Reported-by: Maksim Boyko <maksim.a.boyko@gmail.com> Tested-by: Maksim Boyko <maksim.a.boyko@gmail.com> Signed-off-by: Maksim Boyko <maksim.a.boyko@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
when we find that a child has died while we'd been trying to ascend,
we should go into the first live sibling itself, rather than its sibling.
Off-by-one in question had been introduced in "deal with deadlock in
d_walk()" and the fix needs to be backported to all branches this one
has been backported to.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[bwh: Backported to 3.2: apply to the 3 copies of this logic we
ended up with] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
match_token() expects a NULL terminator at the end of the token list so
that it would know where to stop. Not having one causes it to overrun
to invalid memory.
In practice, passing a mount option that omfs didn't recognize would
sometimes panic the system.
Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Fixes: a87938b2e246b81b4fb ("fs/binfmt_elf.c: fix bug in loading of PIE binaries") Reported-by: James Hogan <james.hogan@imgtec.com> Cc: Michael Davidson <md@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
This bug has been there since day 1; addresses in the top guest physical
page weren't considered valid. You could map that page (the check in
check_gpte() is correct), but if a guest tried to put a pagetable there
we'd check that address manually when walking it, and kill the guest.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
x86 has variable length encoding. x86 JIT compiler is trying
to pick the shortest encoding for given bpf instruction.
While doing so the jump targets are changing, so JIT is doing
multiple passes over the program. Typical program needs 3 passes.
Some very short programs converge with 2 passes. Large programs
may need 4 or 5. But specially crafted bpf programs may hit the
pass limit and if the program converges on the last iteration
the JIT compiler will be producing an image full of 'int 3' insns.
Fix this corner case by doing final iteration over bpf program.
Fixes: 0a14842f5a3c ("net: filter: Just In Time compiler for x86-64") Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Tested-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When more than a multicast address is present in a MLDv2 report, all but
the first address is ignored, because the code breaks out of the loop if
there has not been an error adding that address.
This has caused failures when two guests connected through the bridge
tried to communicate using IPv6. Neighbor discoveries would not be
transmitted to the other guest when both used a link-local address and a
static address.
This only happens when there is a MLDv2 querier in the network.
The fix will only break out of the loop when there is a failure adding a
multicast address.
The mdb before the patch:
dev ovirtmgmt port vnet0 grp ff02::1:ff7d:6603 temp
dev ovirtmgmt port vnet1 grp ff02::1:ff7d:6604 temp
dev ovirtmgmt port bond0.86 grp ff02::2 temp
After the patch:
dev ovirtmgmt port vnet0 grp ff02::1:ff7d:6603 temp
dev ovirtmgmt port vnet1 grp ff02::1:ff7d:6604 temp
dev ovirtmgmt port bond0.86 grp ff02::fb temp
dev ovirtmgmt port bond0.86 grp ff02::2 temp
dev ovirtmgmt port bond0.86 grp ff02::d temp
dev ovirtmgmt port vnet0 grp ff02::1:ff00:76 temp
dev ovirtmgmt port bond0.86 grp ff02::16 temp
dev ovirtmgmt port vnet1 grp ff02::1:ff00:77 temp
dev ovirtmgmt port bond0.86 grp ff02::1:ff00:def temp
dev ovirtmgmt port bond0.86 grp ff02::1:ffa1:40bf temp
Fixes: 08b202b67264 ("bridge br_multicast: IPv6 MLD support.") Reported-by: Rik Theys <Rik.Theys@esat.kuleuven.be> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Tested-by: Rik Theys <Rik.Theys@esat.kuleuven.be> Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Jonathan Toppins <jtoppins@cumulusnetworks.com>
Multitheaded tests showed that the icv buffer in the current ghash
implementation is not handled correctly. A move of this working ghash
buffer value to the descriptor context fixed this. Code is tested and
verified with an multithreaded application via af_alg interface.
Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Gerald Schaefer <geraldsc@linux.vnet.ibm.com> Reported-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
This adds support for new Xsens device, Motion Tracker Development Board,
using Xsens' own Vendor ID
Signed-off-by: Patrick Riphagen <patrick.riphagen@xsens.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
A non-percpu VIRQ (e.g., VIRQ_CONSOLE) may be freed on a different
VCPU than it is bound to. This can result in a race between
handle_percpu_irq() and removing the action in __free_irq() because
handle_percpu_irq() does not take desc->lock. The interrupt handler
sees a NULL action and oopses.
Only use the percpu chip/handler for per-CPU VIRQs (like VIRQ_TIMER).
256 bytes per sector support has been broken since 2.6.X,
and no-one stepped up to fix this.
So disable support for it.
Signed-off-by: Mark Hounschell <dmarkh@cfl.rr.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Odin.com>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The journal revoke block recovery code does not check r_count for
sanity, which means that an evil value of r_count could result in
the kernel reading off the end of the revoke table and into whatever
garbage lies beyond. This could crash the kernel, so fix that.
However, in testing this fix, I discovered that the code to write
out the revoke tables also was not correctly checking to see if the
block was full -- the current offset check is fine so long as the
revoke table space size is a multiple of the record size, but this
is not true when either journal_csum_v[23] are set.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Jan Kara <jack@suse.cz>
[bwh: Backported to 3.2: journal checksumming is not supported, so only
the first fix is needed] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
In function dmi_present(), dmi_walk_early() calls dmi_table(), which
calls dmi_decode(), which ultimately calls dmi_save_uuid(). This last
function makes a decision based on the value of global variable
dmi_ver. The problem is that this variable is set right _after_
dmi_walk_early() returns. So dmi_save_uuid() always sees dmi_ver == 0
regardless of the actual version implemented.
This causes /sys/class/dmi/id/product_uuid to always use the old
ordering even on systems implementing DMI/SMBIOS 2.6 or later, which
should use the new ordering.
This is broken since kernel v3.8 for legacy DMI implementations and
since kernel v3.10 for SMBIOS 2 implementations. SMBIOS 3
implementations with the 64-bit entry point are not affected.
The first breakage does not matter much as in practice legacy DMI
implementations are always for versions older than 2.6, which is when
the UUID ordering changed. The second breakage is more problematic as
it affects the vast majority of x86 systems manufactured since 2009.
Signed-off-by: Jean Delvare <jdelvare@suse.de> Fixes: 9f9c9cbb6057 ("drivers/firmware/dmi_scan.c: fetch dmi version from SMBIOS if it exists") Fixes: 79bae42d51a5 ("dmi_scan: refactor dmi_scan_machine(), {smbios,dmi}_present()") Acked-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Artem Savkov <artem.savkov@gmail.com> Cc: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Cc: Matt Fleming <matt.fleming@intel.com>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Move the calls to memcpy_fromio() up into the loop in
dmi_scan_machine(), and move the signature checks back down into
dmi_decode(). We need to check at 16-byte intervals but keep a 32-byte
buffer for an SMBIOS entry, so shift the buffer after each iteration.
Merge smbios_present() into dmi_present(), so we look for an SMBIOS
signature at the beginning of the given buffer and then for a DMI
signature at an offset of 16 bytes.
[artem.savkov@gmail.com: use proper buf type in dmi_present()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Reported-by: Tim McGrath <tmhikaru@gmail.com> Tested-by: Tim Mcgrath <tmhikaru@gmail.com> Cc: Zhenzhong Duan <zhenzhong.duan@oracle.com> Signed-off-by: Artem Savkov <artem.savkov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Prerequisite for "firmware: dmi_scan: Fix ordering of product_uuid"]
Recent toolchains force the TOC to be 256 byte aligned. We need
to enforce this alignment in our linker script, otherwise pointers
to our TOC variables (__toc_start, __prom_init_toc_start) could
be incorrect.
If they are bad, we die a few hundred instructions into boot.
Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When the v3 hardware sees more than one finger, it uses the semi-mt
protocol to report the touches. However, it currently works when
num_fingers is 0, 1 or 2, but when it is 3 and above, it sends only 1
finger as if num_fingers was 1.
This confuses userspace which knows how to deal with extra fingers
when all the slots are used, but not when some are missing.
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=90101 Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
According to the RM of wm8958, BCLK DIV 348 doesn't exist, correct it
to 384.
Signed-off-by: Zidan Wang <zidan.wang@freescale.com> Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
It should be "RINPUT3" instead of "LINPUT3" route to "Right Input
Mixer".
Signed-off-by: Zidan Wang <zidan.wang@freescale.com> Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When there is prefix specified, currently we will add this prefix in
widget->name, but not in widget->sname.
it causes failure at snd_soc_dapm_link_dai_widgets:
if (!w->sname || !strstr(w->sname, dai_w->name))
because dai_w->name has prefix added, but w->sname does not.
We should also add prefix for stream name
Signed-off-by: Koro Chen <koro.chen@mediatek.com> Signed-off-by: Mark Brown <broonie@kernel.org>
[bwh: Backported to 3.2:
- Adjust context
- s/prefix/dapm->codec->name_prefix] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
smep_andnot_wp is initialized in kvm_init_shadow_mmu and shadow pages
should not be reused for different values of it. Thus, it has to be
added to the mask in kvm_mmu_pte_write.
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Janusz Dziedzic <janusz.dziedzic@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
[bwh: Backported to 3.2: s/IEEE80211_WEP/WEP/] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Avoton AHCI occasionally sees drive probe timeouts at driver load time.
When this happens SCR_STATUS indicates device detected, but no D2H FIS
reception. Reset the internal link state machines by bouncing
port-enable in the PCS register when this occurs.
Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Tejun Heo <tj@kernel.org>
[bwh: Backported to 3.2:
- Adjust context
- Call ahci_start_engine() directly] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Make ahci_dev_classify available to the ahci platform driver for custom
hard reset function.
Signed-off-by: Rob Herring <rob.herring@calxeda.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Without this flag some versions of these enclosures do not work.
Reported-and-tested-by: Christian Schaller <cschalle@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
If the xHCI host controller has died (ie, device removed) or suffered
other serious fatal error (STS_FATAL), then xhci_irq should handle this
condition with IRQ_HANDLED instead of -ESHUTDOWN.
Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Our event ring consists of only one segment, and we risk filling
the event ring in case we get isoc transfers with short intervals
such as webcams that fill a TD every microframe (125us)
With 64 TRB segment size one usb camera could fill the event ring in 8ms.
A setup with several cameras and other devices can fill up the
event ring as it is shared between all devices.
This has occurred when uvcvideo queues 5 * 32TD URBs which then
get cancelled when the video mode changes. The cancelled URBs are returned
in the xhci interrupt context and blocks the interrupt handler from
handling the new events.
A full event ring will block xhci from scheduling traffic and affect all
devices conneted to the xhci, will see errors such as Missed Service
Intervals for isoc devices, and and Split transaction errors for LS/FS
interrupt devices.
Increasing the TRB_PER_SEGMENT will also increase the default endpoint ring
size, which is welcome as for most isoc transfer we had to dynamically
expand the endpoint ring anyway to be able to queue the 5 * 32TDs uvcvideo
queues.
Isoc TDs usually consist of one TRB, sometimes two. When all goes well we
receive only one success event for a TD, and move the dequeue pointer to
the next TD.
This fails if the TD consists of two TRBs and we get a transfer error
on the first TRB, we will then see two events for that TD.
Fix this by making sure the event we get is for the last TRB in that TD
before moving the dequeue pointer to the next TD. This will resolve some
of the uvc and dvb issues with the
"ERROR Transfer event TRB DMA ptr not part of current TD" error message
Fixes: a0840e2e165a ("IPVS: netns, ip_vs_ctl local vars moved to ipvs struct.") Signed-off-by: Tommi Rantala <tt.rantala@gmail.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
If any memory allocation in resize_stripes fails we will return
-ENOMEM, but in some cases we update conf->pool_size anyway.
This means that if we try again, the allocations will be assumed
to be larger than they are, and badness results.
So only update pool_size if there is no error.
This bug was introduced in 2.6.17 and the patch is suitable for
-stable.
Fixes: ad01c9e3752f ("[PATCH] md: Allow stripes to be expanded in preparation for expanding an array") Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Since acpi_reserve_resources() is defined as a device_initcall(),
there's no guarantee that it will be executed in the right order
with respect to the rest of the ACPI initialization code. On some
systems this leads to breakage if, for example, the address range
that should be reserved for the ACPI fixed registers is given to
the PCI host bridge instead if the race is won by the wrong code
path.
Fix this by turning acpi_reserve_resources() into a void function
and calling it directly from within the ACPI initialization sequence.
Reported-and-tested-by: George McCollister <george.mccollister@gmail.com> Link: http://marc.info/?t=143092384600002&r=1&w=2 Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
There is a race window in dlm_get_lock_resource(), which may return a
lock resource which has been purged. This will cause the process to
hang forever in dlmlock() as the ast msg can't be handled due to its
lock resource not existing.
dlm_get_lock_resource {
...
spin_lock(&dlm->spinlock);
tmpres = __dlm_lookup_lockres_full(dlm, lockid, namelen, hash);
if (tmpres) {
spin_unlock(&dlm->spinlock);
>>>>>>>> race window, dlm_run_purge_list() may run and purge
the lock resource
spin_lock(&tmpres->spinlock);
...
spin_unlock(&tmpres->spinlock);
}
}
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The range check for b-tree level parameter in nilfs_btree_root_broken()
is wrong; it accepts the case of "level == NILFS_BTREE_LEVEL_MAX" even
though the level is limited to values in the range of 0 to
(NILFS_BTREE_LEVEL_MAX - 1).
Since the level parameter is read from storage device and used to index
nilfs_btree_path array whose element count is NILFS_BTREE_LEVEL_MAX, it
can cause memory overrun during btree operations if the boundary value
is set to the level parameter on device.
This fixes the broken sanity check and adds a comment to clarify that
the upper bound NILFS_BTREE_LEVEL_MAX is exclusive.
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
If we find a non-confirmed openowner we jump to exit the function, but do
not set an error value. Fix this by factoring out a helper to do the
check and properly set the error from nfsd4_validate_stateid.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The PM_RESTORE_PREPARE is not handled now in mmc_pm_notify(),
as result mmc_rescan() could be scheduled and executed at
late hibernation restore stages when MMC device is suspended
already - which, in turn, will lead to system crash on TI dra7-evm board:
WARNING: CPU: 0 PID: 3188 at drivers/bus/omap_l3_noc.c:148 l3_interrupt_handler+0x258/0x374() 44000000.ocp:L3 Custom Error: MASTER MPU TARGET L4_PER1_P3 (Idle): Data Access in User mode during Functional access
Hence, add missed PM_RESTORE_PREPARE PM event in mmc_pm_notify().
Fixes: 4c2ef25fe0b8 (mmc: fix all hangs related to mmc/sd card...) Signed-off-by: Grygorii Strashko <Grygorii.Strashko@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
fallocate() checks that the file is extent-based and returns
EOPNOTSUPP in case is not. Other tasks can convert from and to
indirect and extent so it's safe to check only after grabbing
the inode mutex.
Signed-off-by: Davide Italiano <dccitaliano@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[bwh: Backported to 3.2:
- Adjust context
- Add the 'out' label] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The incorrect ordering of operations during cpu dlpar add results in invalid
affinity for the cpu being added. The ibm,associativity property in the
device tree is populated with all zeroes for the added cpu which results in
invalid affinity mappings and all cpus appear to belong to node 0.
This occurs because rtas configure-connector is called prior to making the
rtas set-indicator calls. Phyp does not assign affinity information
for a cpu until the rtas set-indicator calls are made to set the isolation
and allocation state.
Correct the order of operations to make the rtas set-indicator
calls (done in dlpar_acquire_drc) before calling rtas configure-connector.
Fixes: 1a8061c46c46 ("powerpc/pseries: Add kernel based CPU DLPAR handling") Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
[bwh: Backported to 3.2:
- Adjust context
- Keep using goto instead of directly returning] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Unregister GPIOs requested through sysfs at chip remove to avoid leaking
the associated memory and sysfs entries.
The stale sysfs entries prevented the gpio numbers from being exported
when the gpio range was later reused (e.g. at device reconnect).
This also fixes the related module-reference leak.
Note that kernfs makes sure that any on-going sysfs operations finish
before the class devices are unregistered and that further accesses
fail.
The chip exported flag is used to prevent gpiod exports during removal.
This also makes it harder to trigger, but does not fix, the related race
between gpiochip_remove and export_store, which is really a race with
gpiod_request that needs to be addressed separately.
Also note that this would prevent the crashes (e.g. NULL-dereferences)
at reconnect that affects pre-3.18 kernels, as well as use-after-free on
operations on open attribute files on pre-3.14 kernels (prior to
kernfs).
Fixes: d8f388d8dc8d ("gpio: sysfs interface") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
[bwh: Backported to 3.2:
- Adjust filename, context
- Move up initialisation of 'desc' in gpio_export()
- Use global 'gpio_desc' array and gpio_free() function in
gpiochip_unexport()] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The variable for the 'permissive' module parameter used to be static
but was recently changed to be extern. This puts it in the kernel
global namespace if the driver is built-in, so its name should begin
with a prefix identifying the driver.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Fixes: af6fc858a35b ("xen-pciback: limit guest control of command register") Signed-off-by: David Vrabel <david.vrabel@citrix.com>
This phone is already supported by the visor driver.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Johan Hovold <johan@kernel.org>
[bwh: Backported to 3.2: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>