This is an ancient bug that was actually attempted to be fixed once
(badly) by me eleven years ago in commit 4ceb5db9757a ("Fix
get_user_pages() race for write access") but that was then undone due to
problems on s390 by commit f33ea7f404e5 ("fix get_user_pages bug").
In the meantime, the s390 situation has long been fixed, and we can now
fix it by checking the pte_dirty() bit properly (and do it better). The
s390 dirty bit was implemented in abf09bed3cce ("s390/mm: implement
software dirty bits") which made it into v3.9. Earlier kernels will
have to look at the page state itself.
Also, the VM has become more scalable, and what used a purely
theoretical race back then has become easier to trigger.
To fix it, we introduce a new internal FOLL_COW flag to mark the "yes,
we already did a COW" rather than play racy games with FOLL_WRITE that
is very fundamental, and then use the pte dirty flag to validate that
the FOLL_COW flag is still valid.
Commit 2a6fba6 "xfs: only return -errno or success from attr ->put_listent"
changes the returnvalue of __xfs_xattr_put_listen to 0 in case when there is
insufficient space in the buffer assuming that setting context->count to -1
would be enough, but all of the ->put_listent callers only check seen_enough.
This results in a failed assertion:
XFS: Assertion failed: context->count >= 0, file: fs/xfs/xfs_xattr.c, line: 175
in insufficient buffer size case.
This is only reproducible with at least 2 xattrs and only when the buffer
gets depleted before the last one.
Furthermore if buffersize is such that it is enough to hold the last xattr's
name, but not enough to hold the sum of preceeding xattr names listxattr won't
fail with ERANGE, but will suceed returning last xattr's name without the
first character. The first character end's up overwriting data stored at
(context->alist - 1).
Signed-off-by: Artem Savkov <asavkov@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Cc: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Runtime PM should be configured already once we call device_add. See
also the description in this mail thread
https://lists.linuxfoundation.org/pipermail/linux-pm/2009-November/023198.html
or the order of calls e.g. in usb_new_device.
The changed order also helps to avoid scenarios where runtime pm for
&shost->shost_gendev is activated whilst the parent is suspended,
resulting in error message "runtime PM trying to activate child device
hostx but parent yyy is not active".
In addition properly reverse the runtime pm calls in the error path.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The rcar_fcp_enable() function immediately returns successfully when the
FCP device pointer is NULL to avoid forcing the users to check the FCP
device manually before every call. However, the stub version of the
function used when the FCP driver is disabled returns -ENOSYS
unconditionally, resulting in a different API contract for the two
versions of the function.
As a user that requires FCP support will fail at probe time when calling
rcar_fcp_get() if the FCP driver is disabled, the stub version of the
rcar_fcp_enable() function will only be called with a NULL FCP device.
We can thus return 0 unconditionally to align the behaviour with the
normal version of the function.
The req_canceled() callback is used by tpm_transmit() periodically to
check whether the request has been canceled while it is receiving a
response from the TPM.
The TPM_CRB_CTRL_CANCEL register was cleared already in the crb_cancel
callback, which has two consequences:
* Cancel might not happen.
* req_canceled() always returns zero.
A better place to clear the register is when starting to send a new
command. The behavior of TPM_CRB_CTRL_CANCEL is described in the
section 5.5.3.6 of the PTP specification.
Unseal and load operations should be done as an atomic operation. This
commit introduces unlocked tpm_transmit() so that tpm2_unseal_trusted()
can do the locking by itself.
Fixes: 0fe5480303a1 ("keys, trusted: seal/unseal with TPM 2.0 chips") Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ima tries to call ->setxattr() on overlayfs dentry after having locked
underlying inode, which results in a deadlock.
Reported-by: Krisztian Litkey <kli@iki.fi> Fixes: 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We know that 'ret = 0' because it has been tested a few lines above.
So, if 'kzalloc' fails, 0 will be returned instead of an error code.
Return -ENOMEM instead.
Fixes: a0d46a3dfdc3 ("ARM: cpuidle: Register per cpuidle device") Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Qualcomm SPMI GPIO and MPP lines are problematic: the
are fetched from the main MFD driver with platform_get_irq()
which means that at this point they will all be assigned the
flags set up for the interrupts in the device tree.
That is problematic since these are flagged as rising edge
and an this point the interrupt descriptor is assigned a
rising edge, while the only thing the GPIO/MPP drivers really
do is issue irq_get_irqchip_state() on the line to read it
out and to provide a .to_irq() helper for *other* IRQ
consumers.
If another device tree node tries to flag the same IRQ
for use as something else than rising edge, the kernel
irqdomain core will protest like this:
type mismatch, failed to map hwirq-NN for <FOO>!
Which is what happens when the device tree defines two
contradictory flags for the same interrupt line.
To work around this and alleviate the problem, assign 0
as flag for the interrupts taken by the PM GPIO and MPP
drivers. This will lead to the flag being unset, and a
second consumer requesting rising, falling, both or level
interrupts will be respected. This is what the qcom-pm*.dtsi
files already do.
Switched to using the symbolic name IRQ_TYPE_NONE so that
we get this more readable.
This misconfiguration was caused by a copy/pasting the
APQ8064 set-up, the latter has been fixed in a separate
patch.
Tested with one of the SPMI GPIOs: after this I can
successfully request one of these GPIOs as falling edge
from the device tree.
Fixes: 0840ea9e4457 ("ARM: dts: add GPIO and MPP to MSM8660 PMIC") Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: Björn Andersson <bjorn.andersson@linaro.org> Cc: Ivan T. Ivanov <ivan.ivanov@linaro.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Andy Gross <andy.gross@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Andy Gross <andy.gross@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Qualcomm PMIC GPIO and MPP lines are problematic: the
are fetched from the main MFD driver with platform_get_irq()
which means that at this point they will all be assigned the
flags set up for the interrupts in the device tree.
That is problematic since these are flagged as rising edge
and an this point the interrupt descriptor is assigned a
rising edge, while the only thing the GPIO/MPP drivers really
do is issue irq_get_irqchip_state() on the line to read it
out and to provide a .to_irq() helper for *other* IRQ
consumers.
If another device tree node tries to flag the same IRQ
for use as something else than rising edge, the kernel
irqdomain core will protest like this:
type mismatch, failed to map hwirq-NN for <FOO>!
Which is what happens when the device tree defines two
contradictory flags for the same interrupt line.
To work around this and alleviate the problem, assign 0
as flag for the interrupts taken by the PM GPIO and MPP
drivers. This will lead to the flag being unset, and a
second consumer requesting rising, falling, both or level
interrupts will be respected. This is what the qcom-pm*.dtsi
files already do.
Switched to using the symbolic name IRQ_TYPE_NONE so that
we get this more readable.
Fixes: bce360469676 ("ARM: dts: apq8064: add pm8921 mpp support") Fixes: 874443fe9e33 ("ARM: dts: apq8064: Add pm8921 mfd and its gpio node") Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: Björn Andersson <bjorn.andersson@linaro.org> Cc: Ivan T. Ivanov <ivan.ivanov@linaro.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Andy Gross <andy.gross@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Andy Gross <andy.gross@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 215e362dafed ("ARM: 8306/1: loop_udelay: remove bogomips value
limitation") tried to increase the bogomips limitation, but in doing
so messed up udelay such that it always gives about a 5% error in the
delay, even if we use a timer.
Originally, UDELAY_MULT was ((UL(2199023) * HZ) >> 11) and UDELAY_SHIFT
30. Assuming HZ=100, us_delay of 1000 and ticks_per_jiffy of 1660000
(eg, 166MHz timer, 1ms delay) this would calculate:
On x86_32, when an interrupt happens from kernel space, SS and SP aren't
pushed and the existing stack is used. So pt_regs is effectively two
words shorter, and the previous stack pointer is normally the memory
after the shortened pt_regs, aka '®s->sp'.
But in the rare case where the interrupt hits right after the stack
pointer has been changed to point to an empty stack, like for example
when call_on_stack() is used, the address immediately after the
shortened pt_regs is no longer on the stack. In that case, instead of
'®s->sp', the previous stack pointer should be retrieved from the
beginning of the current stack page.
kernel_stack_pointer() wants to do that, but it forgets to dereference
the pointer. So instead of returning a pointer to the previous stack,
it returns a pointer to the beginning of the current stack.
Note that it's probably outside of kernel_stack_pointer()'s scope to be
switching stacks at all. The x86_64 version of this function doesn't do
it, and it would be better for the caller to do it if necessary. But
that's a patch for another day. This just fixes the original intent.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nilay Vaish <nilayvaish@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 0788aa6a23cb ("x86: Prepare removal of previous_esp from i386 thread_info structure") Link: http://lkml.kernel.org/r/472453d6e9f6a2d4ab16aaed4935f43117111566.1471535549.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
__show_regs() fails to dump the PKRU state when the debug registers are in
their default state because there is a return statement on the debug
register state.
Change the logic to report PKRU value even when debug registers are in
their default state.
Fixes:c0b17b5bd4b7 ("x86/mm/pkeys: Dump PKRU with other kernel registers") Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: http://lkml.kernel.org/r/20160910183045.4618-1-nicolas.iooss_linux@m4x.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a CPU is physically added to a system then the MADT table is not
updated.
If subsequently a kdump kernel is started on that physically added CPU then
the ACPI enumeration fails to provide the information for this CPU which is
now the boot CPU of the kdump kernel.
As a consequence, generic_processor_info() is not invoked for that CPU so
the number of enumerated processors is 0 and none of the initializations,
including the logical package id management, are performed.
We have code which relies on the correctness of the logical package map and
other information which is initialized via generic_processor_info().
Executing such code will result in undefined behaviour or kernel crashes.
This problem applies only to the kdump kernel because a normal kexec will
switch to the original boot CPU, which is enumerated in MADT, before
jumping into the kexec kernel.
The boot code already has a check for num_processors equal 0 in
prefill_possible_map(). We can use that check as an indicator that the
enumeration of the boot CPU did not happen and invoke generic_processor_info()
for it. That initializes the relevant data for the boot CPU and therefore
prevents subsequent failure.
[ tglx: Refined the code and rewrote the changelog ]
Signed-off-by: Prarit Bhargava <prarit@redhat.com> Fixes: 1f12e32f4cd5 ("x86/topology: Create logical package id") Cc: Peter Zijlstra <peterz@infradead.org> Cc: Len Brown <len.brown@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: dyoung@redhat.com Cc: Eric Biederman <ebiederm@xmission.com> Cc: kexec@lists.infradead.org Link: http://lkml.kernel.org/r/1475514432-27682-1-git-send-email-prarit@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The array has a size of MAX_LOCAL_APIC, which can be as large as 32k, so it
can consume up to 128k.
The array has been there forever and was never used for anything useful
other than a version mismatch check which was introduced in 2009.
There is no reason to store the version in an array. The kernel is not
prepared to handle different APIC versions anyway, so the real important
part is to detect a version mismatch and warn about it, which can be done
with a single variable as well.
[ tglx: Massaged changelog ]
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> CC: Andy Lutomirski <luto@amacapital.net> CC: Borislav Petkov <bp@alien8.de> CC: Brian Gerst <brgerst@gmail.com> CC: Mike Travis <travis@sgi.com> Link: http://lkml.kernel.org/r/20160913181232.30815-1-dvlasenk@redhat.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Our XSAVE features are divided into two categories: those that
generate FPU exceptions, and those that do not. MPX and pkeys do
not generate FPU exceptions and thus can not be used lazily. We
disable them when lazy mode is forced on.
We have a pair of masks to collect these two sets of features, but
XFEATURE_MASK_PKRU was added to the wrong mask: XFEATURE_MASK_LAZY.
Fix it by moving the feature to XFEATURE_MASK_EAGER.
Note: this only causes problem if you boot with lazy FPU mode
(eagerfpu=off) which is *not* the default. It also only affects
hardware which is not currently publicly available. It looks like
eager mode is going away, but we still need this patch applied
to any kernel that has protection keys and lazy mode, which is 4.6
through 4.8 at this point, and 4.9 if the lazy removal isn't sent
to Linus for 4.9.
Fixes: c8df40098451 ("x86/fpu, x86/mm/pkeys: Add PKRU xsave fields and data structures") Signed-off-by: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave@sr71.net> Link: http://lkml.kernel.org/r/20161007162342.28A49813@viggo.jf.intel.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a CPU is about to be offlined we call fixup_irqs() that resets IRQ
affinities related to the CPU in question. The same thing is also done when
the system is suspended to S-states like S3 (mem).
For each IRQ we try to complete any on-going move regardless whether the
IRQ is actually part of x86_vector_domain. For each IRQ descriptor we fetch
its chip_data, assume it is of type struct apic_chip_data and manipulate it
by clearing old_domain mask etc. For irq_chips that are not part of the
x86_vector_domain, like those created by various GPIO drivers, will find
their chip_data being changed unexpectly.
Below is an example where GPIO chip owned by pinctrl-sunrisepoint.c gets
corrupted after resume:
# cat /sys/kernel/debug/gpio
gpiochip0: GPIOs 360-511, parent: platform/INT344B:00, INT344B:00:
gpio-511 ( |sysfs ) in hi
ec776ef6bbe1 ("x86/mm: Add support for the non-standard protected e820 type")
Christoph references the original patch I wrote implementing pmem support.
The intent of the 'max_pfn' changes in that commit were to enable persistent
memory ranges to be covered by the struct page memmap by default.
However, that approach was abandoned when Christoph ported the patches [1], and
that functionality has since been replaced by devm_memremap_pages().
In the meantime, this max_pfn manipulation is confusing kdump [2] that
assumes that everything covered by the max_pfn is "System RAM". This
results in kdump hanging or crashing.
In some places, dump_backtrace() is called with a NULL tsk parameter,
e.g. in bug_handler() in arch/arm64, or indirectly via show_stack() in
core code. The expectation is that this is treated as if current were
passed instead of NULL. Similar is true of unwind_frame().
Commit a80a0eb70c358f8c ("arm64: make irq_stack_ptr more robust") didn't
take this into account. In dump_backtrace() it compares tsk against
current *before* we check if tsk is NULL, and in unwind_frame() we never
set tsk if it is NULL.
Due to this, we won't initialise irq_stack_ptr in either function. In
dump_backtrace() this results in calling dump_mem() for memory
immediately above the IRQ stack range, rather than for the relevant
range on the task stack. In unwind_frame we'll reject unwinding frames
on the IRQ stack.
In either case this results in incomplete or misleading backtrace
information, but is not otherwise problematic. The initial percpu areas
(including the IRQ stacks) are allocated in the linear map, and dump_mem
uses __get_user(), so we shouldn't access anything with side-effects,
and will handle holes safely.
This patch fixes the issue by having both functions handle the NULL tsk
case before doing anything else with tsk.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Fixes: a80a0eb70c358f8c ("arm64: make irq_stack_ptr more robust") Acked-by: James Morse <james.morse@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Yang Shi <yang.shi@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If userspace creates a PMU for the VCPU, but doesn't create an in-kernel
irqchip, then we end up in a nasty path where we try to take an
uninitialized spinlock, which can lead to all sorts of breakages.
Luckily, QEMU always creates the VGIC before the PMU, so we can
establish this as ABI and check for the VGIC in the PMU init stage.
This can be relaxed at a later time if we want to support PMU with a
userspace irqchip.
Cc: Shannon Zhao <shannon.zhao@linaro.org> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a guest TLB entry is replaced by TLBWI or TLBWR, we only invalidate
TLB entries on the local CPU. This doesn't work correctly on an SMP host
when the guest is migrated to a different physical CPU, as it could pick
up stale TLB mappings from the last time the vCPU ran on that physical
CPU.
Therefore invalidate both user and kernel host ASIDs on other CPUs,
which will cause new ASIDs to be generated when it next runs on those
CPUs.
We're careful only to do this if the TLB entry was already valid, and
only for the kernel ASID where the virtual address it mapped is outside
of the guest user address range.
The MMCR2 register is available twice, one time with number 785
(privileged access), and one time with number 769 (unprivileged,
but it can be disabled completely). In former times, the Linux
kernel was using the unprivileged register 769 only, but since
commit 8dd75ccb571f3c92c ("powerpc: Use privileged SPR number
for MMCR2"), it uses the privileged register 785 instead.
The KVM-PR code then of course also switched to use the SPR 785,
but this is causing older guest kernels to crash, since these
kernels still access 769 instead. So to support older kernels
with KVM-PR again, we have to support register 769 in KVM-PR, too.
Fixes: 8dd75ccb571f3c92c48014b3dabd3d51a115ab41 Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Early during boot topology_update_package_map() computes
logical_pkg_ids for all present processors.
Later, when processors are brought up, identify_cpu() updates
these values based on phys_pkg_id which is a function of
initial_apicid. On PV guests the latter may point to a
non-existing node, causing logical_pkg_ids to be set to -1.
Intel's RAPL uses logical_pkg_id (as topology_logical_package_id())
to index its arrays and therefore in this case will point to index
65535 (since logical_pkg_id is a u16). This could lead to either a
crash or may actually access random memory location.
As a workaround, we recompute topology during CPU bringup to reset
logical_pkg_id to a valid value.
(The reason for initial_apicid being bogus is because it is
initial_apicid of the processor from which the guest is launched.
This value is CPUID(1).EBX[31:24])
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/built-in.o: In function `wm8350_i2c_probe':
core.c:(.text+0x828b0): undefined reference to `__devm_regmap_init_i2c'
Makefile:953: recipe for target 'vmlinux' failed
Fixes: 52b461b86a9f ("mfd: Add regmap cache support for wm8350") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
set_bit() and clear_bit() take the bit number so this code is really
doing "1 << (1 << irq)" which is a double shift bug. It's done
consistently so it won't cause a problem unless "irq" is more than 4.
Fixes: 70c6cce04066 ('mfd: Support 88pm80x in 80x driver') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Member "status" of struct usb_sg_request is managed by usb core. A
spin lock is used to serialize the change of it. The driver could
check the value of req->status, but should avoid changing it without
the hold of the spinlock. Otherwise, it could cause race or error
in usb core.
This patch could be backported to stable kernels with version later
than v3.14.
Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Roger Tseng <rogerable@realtek.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, usb-line6 module exports an array of MIDI manufacturer ID and
usb-pod module uses it. However, the declaration is not the definition in
common header. The difference is explicit length of array. Although
compiler calculates it and everything goes well, it's better to use the
same representation between definition and declaration.
This commit fills the length of array for usb-line6 module. As a small
good sub-effect, this commit suppress below warnings from static analysis
by sparse v0.5.0.
The DragonFly quirk added in 42e3121d90f4 ("ALSA: usb-audio: Add a more
accurate volume quirk for AudioQuest DragonFly") applies a custom dB map
on the volume control when its range is reported as 0..50 (0 .. 0.2dB).
However, there exists at least one other variant (hw v1.0c, as opposed
to the tested v1.2) which reports a different non-sensical volume range
(0..53) and the custom map is therefore not applied for that device.
This results in all of the volume change appearing close to 100% on
mixer UIs that utilize the dB TLV information.
Add a fallback case where no dB TLV is reported at all if the control
range is not 0..50 but still 0..N where N <= 1000 (3.9 dB). Also
restrict the quirk to only apply to the volume control as there is also
a mute control which would match the check otherwise.
Fixes: 42e3121d90f4 ("ALSA: usb-audio: Add a more accurate volume quirk for AudioQuest DragonFly") Signed-off-by: Anssi Hannula <anssi.hannula@iki.fi> Reported-by: David W <regulars@d-dub.org.uk> Tested-by: David W <regulars@d-dub.org.uk> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The pointer callbacks of ali5451 driver may return the value at the
boundary occasionally, and it results in the kernel warning like
snd_ali5451 0000:00:06.0: BUG: , pos = 16384, buffer size = 16384, period size = 1024
It seems that folding the position offset is enough for fixing the
warning and no ill-effect has been seen by that.
The musb driver calls into this phy driver to disable/enable squelch
detection. This function was introduced in 24fe86a617c5 ("phy: sun4i-usb:
Add a sunxi specific function for setting squelch-detect"). This
function in turn calls sun4i_usb_phy_write, which uses a mutex to
guard the common access register. Unfortunately musb does this
in atomic context, which results in the following warning with lock
debugging enabled:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:97
in_atomic(): 1, irqs_disabled(): 128, pid: 96, name: kworker/0:2
CPU: 0 PID: 96 Comm: kworker/0:2 Not tainted 4.8.0-rc4-00181-gd502f8ad1c3e #13
Hardware name: Allwinner sun8i Family
Workqueue: events musb_deassert_reset
[<c010bc01>] (unwind_backtrace) from [<c0109237>] (show_stack+0xb/0xc)
[<c0109237>] (show_stack) from [<c02a669b>] (dump_stack+0x67/0x74)
[<c02a669b>] (dump_stack) from [<c05d68c9>] (mutex_lock+0x15/0x2c)
[<c05d68c9>] (mutex_lock) from [<c02c3589>] (sun4i_usb_phy_write+0x39/0xec)
[<c02c3589>] (sun4i_usb_phy_write) from [<c03e6327>] (musb_port_reset+0xfb/0x184)
[<c03e6327>] (musb_port_reset) from [<c03e4917>] (musb_deassert_reset+0x1f/0x2c)
[<c03e4917>] (musb_deassert_reset) from [<c012ecb5>] (process_one_work+0x129/0x2b8)
[<c012ecb5>] (process_one_work) from [<c012f5e3>] (worker_thread+0xf3/0x424)
[<c012f5e3>] (worker_thread) from [<c0132dbd>] (kthread+0xa1/0xb8)
[<c0132dbd>] (kthread) from [<c0105f31>] (ret_from_fork+0x11/0x20)
Since the register access is mmio, we can use a spinlock to guard this
specific access, rather than the mutex that guards the entire phy.
Fixes: ba4bdc9e1dc0 ("PHY: sunxi: Add driver for sunxi usb phy") Cc: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Chen-Yu Tsai <wens@csie.org> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 50c763f8c1bac ("usb: dwc3: Set the ClearPendIN bit on Clear
Stall EP command") sets ClearPendIN bit for all IN endpoints of
v2.60a+ cores. This causes ClearStall command fails on 2.60+ cores
operating in HighSpeed mode.
In page 539 of 2.60a specification:
"When issuing Clear Stall command for IN endpoints in SuperSpeed
mode, the software must set the "ClearPendIN" bit to '1' to
clear any pending IN transcations, so that the device does not
expect any ACK TP from the host for the data sent earlier."
It's obvious that we only need to apply this rule to those IN
endpoints that currently operating in SuperSpeed mode.
Fixes: 50c763f8c1bac ("usb: dwc3: Set the ClearPendIN bit on Clear Stall EP command") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In commit 27727df240c7 ("Avoid taking lock in NMI path with
CONFIG_DEBUG_TIMEKEEPING"), I changed the logic to open-code
the timekeeping_get_ns() function, but I forgot to include
the unit conversion from cycles to nanoseconds, breaking the
function's output, which impacts users like perf.
Add the proper use of timekeeping_delta_to_ns() to convert
the cycle delta to nanoseconds as needed.
Thanks to Brendan and Alexei for finding this quickly after
the v4.8 release. Unfortunately the problematic commit has
landed in some -stable trees so they'll need this fix as
well.
Many apologies for this mistake. I'll be looking to add a
perf-clock sanity test to the kselftest timers tests soon.
Fixes: 27727df240c7 "timekeeping: Avoid taking lock in NMI path with CONFIG_DEBUG_TIMEKEEPING" Reported-by: Brendan Gregg <bgregg@netflix.com> Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Tested-and-reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/r/1475636148-26539-1-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since commit 71723f95463d "PM / runtime: print error when activating a
child to unactive parent" I see the following error message:
scsi host2: usb-storage 1-3:1.0
scsi host2: runtime PM trying to activate child device host2 but parent
(1-3:1.0) is not active
Digging into it it seems to be related to the problem described in the
commit message for cd998ded5c12 "i2c: designware: Prevent runtime
suspend during adapter registration" as scsi_add_host also calls
device_add and after the call to device_add the parent device is
suspended.
Fix this by using the approach from the mentioned commit and getting
the runtime pm reference before calling scsi_add_host.
One of the laptops has the codec ALC256 on it, applying the
ALC255_FIXUP_DELL1_MIC_NO_PRESENCE can fix the problem, the rest
of laptops have the codec ALC295 on them, they are similar to machines
with ALC225, applying the ALC269_FIXUP_DELL1_MIC_NO_PRESENCE can fix
the problem.
We have two new Dell laptop models, they have the same ALC255 pin
definition, but not in the pin quirk table yet, as a result, the
headset microphone can't work. After adding the definition in the
table, the headset microphone works well.
Turns out it was totally wrong. The memory is supposed to be bound to
the kref, as the original code was doing correctly, not the
device/driver binding as the devm_kzalloc() would cause.
This fixes an oops when read would be called after the device was
unbound from the driver.
Reported-by: Ladislav Michl <ladis@linux-mips.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In v_recv_cmd_submit(), urb_p->urb->pipe has the type unsigned int
(which is 32-bit long on x86_64) but 11<<30 results in a 34-bit integer.
Therefore the 2 leading bits are truncated and
urb_p->urb->pipe &= ~(11 << 30);
has the same meaning as
urb_p->urb->pipe &= ~(3 << 30);
This second statement seems to be how the code was intended to be
written, as PIPE_ constants have values between 0 and 3.
The overflow has been detected with a clang warning:
drivers/usb/usbip/vudc_rx.c:145:27: warning: signed shift result
(0x2C0000000) requires 35 bits to represent, but 'int' only has 32
bits [-Wshift-overflow]
urb_p->urb->pipe &= ~(11 << 30);
~~ ^ ~~
Commit 367e8560e8d7a62d96e9b1d644028a3816e04206 introduced a bug
in fbtft-core where fps is always 0, this is because variable
update_time is not assigned correctly.
This patch fixes a NULL pointer dereference caused by a race codition in
the probe function of the legousbtower driver. It re-structures the
probe function to only register the interface after successfully reading
the board's firmware ID.
The probe function does not deregister the usb interface after an error
receiving the devices firmware ID. The device file registered
(/dev/usb/legousbtower%d) may be read/written globally before the probe
function returns. When tower_delete is called in the probe function
(after an r/w has been initiated), core dev structures are deleted while
the file operation functions are still running. If the 0 address is
mappable on the machine, this vulnerability can be used to create a
Local Priviege Escalation exploit via a write-what-where condition by
remapping dev->interrupt_out_buffer in tower_write. A forged USB device
and local program execution would be required for LPE. The USB device
would have to delay the control message in tower_probe and accept
the control urb in tower_open whilst guest code initiated a write to the
device file as tower_delete is called from the error in tower_probe.
This bug has existed since 2003. Patch tested by emulated device.
Reported-by: James Patrick-Evans <james@jmp-e.com> Tested-by: James Patrick-Evans <james@jmp-e.com> Signed-off-by: James Patrick-Evans <james@jmp-e.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
That just generally kills the machine, and makes debugging only much
harder, since the traces may long be gone.
Debugging by assert() is a disease. Don't do it. If you can continue,
you're much better off doing so with a live machine where you have a
much higher chance that the report actually makes it to the system logs,
rather than result in a machine that is just completely dead.
The only valid situation for BUG_ON() is when continuing is not an
option, because there is massive corruption. But if you are just
verifying that something is true, you warn about your broken assumptions
(preferably just once), and limp on.
Fixes: 22f2ac51b6d6 ("mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()") Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When TIF_SINGLESTEP is set for a task, the single-step state machine is
enabled and we must take care not to reset it to the active-not-pending
state if it is already in the active-pending state.
Unfortunately, that's exactly what user_enable_single_step does, by
unconditionally setting the SS bit in the SPSR for the current task.
This causes failures in the GDB testsuite, where GDB ends up missing
expected step traps if the instruction being stepped generates another
trap, e.g. PTRACE_EVENT_FORK from an SVC instruction.
This patch fixes the problem by preserving the current state of the
stepping state machine when TIF_SINGLESTEP is set on the current thread.
Cc: <stable@vger.kernel.org> Reported-by: Yao Qi <yao.qi@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus Torvalds [Sun, 2 Oct 2016 22:23:00 +0000 (15:23 -0700)]
Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
"Three relatively small fixes for ARM:
- Roger noticed that dma_max_pfn() was calculating the upper limit
wrongly, by adding the PFN offset of memory twice.
- A fix from Robin to correct parsing of MPIDR values when the
address size is larger than one BE32 unit.
- A fix from Srinivas to ensure that we do not rely on the boot
loader (or previous Linux kernel) setting the translation table
base register a certain way in the decompressor, which can lead to
crashes"
* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8618/1: decompressor: reset ttbcr fields to use TTBR0 on ARMv7
ARM: 8617/1: dma: fix dma_max_pfn()
ARM: 8616/1: dt: Respect property size when parsing CPUs
ARM: 8618/1: decompressor: reset ttbcr fields to use TTBR0 on ARMv7
If the bootloader uses the long descriptor format and jumps to
kernel decompressor code, TTBCR may not be in a right state.
Before enabling the MMU, it is required to clear the TTBCR.PD0
field to use TTBR0 for translation table walks.
The commit dbece45894d3a ("ARM: 7501/1: decompressor:
reset ttbcr for VMSA ARMv7 cores") does the reset of TTBCR.N, but
doesn't consider all the bits for the size of TTBCR.N.
Clear TTBCR.PD0 field and reset all the three bits of TTBCR.N to
indicate the use of TTBR0 and the correct base address width.
Fixes: dbece45894d3 ("ARM: 7501/1: decompressor: reset ttbcr for VMSA ARMv7 cores") Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Srinivas Ramana <sramana@codeaurora.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus Torvalds [Sun, 2 Oct 2016 18:04:29 +0000 (11:04 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"The last regression fixes for 4.8 final:
- Two patches addressing the fallout of the CR4 optimizations which
caused CR4-less machines to fail.
- Fix the VDSO build on big endian machines
- Take care of FPU initialization if no CPUID is available otherwise
task struct size ends up being zero
- Fix up context tracking in case load_gs_index fails"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/entry/64: Fix context tracking state warning when load_gs_index fails
x86/boot: Initialize FPU and X86_FEATURE_ALWAYS even if we don't have CPUID
x86/vdso: Fix building on big endian host
x86/boot: Fix another __read_cr4() case on 486
x86/init: Fix cr4_init_shadow() on CR4-less machines
1) Fix wrong TCP checksums on MTU probing when checksum offloading is
disabled, from Douglas Caetano dos Santos.
2) Fix qdisc backlog updates in qfq and sfb schedulers, from Cong Wang.
3) Route lookup flow key protocol value is wrong in ip6gre_xmit_other(),
fix from Lance Richardson.
4) Scheduling while atomic in multicast routing code of ipv4 and ipv6,
fix from Nikolay Aleksandrov.
5) Fix packet alignment in fec driver, from Eric Nelson.
6) Fix perf regression in sctp due to struct layout and cache misses,
from Xin Long.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
sctp: fix the issue sctp_diag uses lock_sock in rcu_read_lock
sctp: change to check peer prsctp_capable when using prsctp polices
sctp: remove prsctp_param from sctp_chunk
sctp: move sent_count to the memory hole in sctp_chunk
tg3: Avoid NULL pointer dereference in tg3_io_error_detected()
act_ife: Fix false encoding
act_ife: Fix external mac header on encode
VSOCK: Don't dec ack backlog twice for rejected connections
Revert "net: ethernet: bcmgenet: use phydev from struct net_device"
net: fec: align IP header in hardware
net: fec: remove QUIRK_HAS_RACC from i.mx27
net: fec: remove QUIRK_HAS_RACC from i.mx25
ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_route
ip6_gre: fix flowi6_proto value in ip6gre_xmit_other()
tcp: fix a compile error in DBGUNDO()
tcp: fix wrong checksum calculation on MTU probing
sch_sfb: keep backlog updated with qlen
sch_qfq: keep backlog updated with qlen
can: dev: fix deadlock reported after bus-off
Paul Burton [Fri, 30 Sep 2016 16:25:01 +0000 (17:25 +0100)]
MIPS: CM: Fix mips_cm_max_vp_width for non-MT kernels on MT systems
When discovering the number of VPEs per core, smp_num_siblings will be
incorrect for kernels built without support for the MIPS MultiThreading
(MT) ASE running on systems which implement said ASE. This leads to
accesses to VPEs in secondary cores being performed incorrectly since
mips_cm_vp_id calculates the wrong ID to write to the local "other"
registers. Fix this by examining the number of VPEs in the core as
reported by the CM.
This patch presumes that the number of VPEs will be the same in each
core of the system. As this path only applies to systems with CM version
2.5 or lower, and this property is true of all such known systems, this
is likely to be fine but is described in a comment for good measure.
Linus Torvalds [Sat, 1 Oct 2016 14:37:15 +0000 (07:37 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fix from James Bottomley:
"One final fix before 4.8.
There was a memory leak triggered by turning scsi mq off due to the
fact that we assume on host release that the already running hosts
weren't mq based because that's the state of the global flag (even
though they were).
Fix it by tracking this on a per host host basis"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: Avoid that toggling use_blk_mq triggers a memory leak
Linus Torvalds [Sat, 1 Oct 2016 04:25:09 +0000 (21:25 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fix from Dmitry Torokhov:
"One small change to make joydev (which is used by older games) to bind
to devices that export Z axis but not X or Y (such as TRC rudder)"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: joydev - recognize devices with Z axis as joysticks
Merge more fixes from Andrew Morton:
"Three fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
include/linux/property.h: fix typo/compile error
ocfs2: fix deadlock on mmapped page in ocfs2_write_begin_nolock()
mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()
John Youn [Fri, 30 Sep 2016 22:11:35 +0000 (15:11 -0700)]
include/linux/property.h: fix typo/compile error
This fixes commit d76eebfa175e ("include/linux/property.h: fix build
issues with gcc-4.4.4").
With that commit we get the following compile error when using the
PROPERTY_ENTRY_INTEGER_ARRAY macro.
include/linux/property.h:201:39: error: `u32_data' undeclared (first
use in this function)
PROPERTY_ENTRY_INTEGER_ARRAY(_name_, u32, _val_)
^
include/linux/property.h:193:17: note: in definition of macro
`PROPERTY_ENTRY_INTEGER_ARRAY'
{ .pointer = { _type_##_data = _val_ } }, \
^
This needs a '.' to reference the union member. It seems this was just
overlooked here since it is done correctly in similar constructs in
other parts of the original commit.
This fix is in preparation of upcoming commits that will use this macro.
Fixes: commit d76eebfa175e ("include/linux/property.h: fix build issues with gcc-4.4.4") Link: http://lkml.kernel.org/r/2de3b929290d88a723ed829a3e3cbd02044714df.1475114627.git.johnyoun@synopsys.com Signed-off-by: John Youn <johnyoun@synopsys.com> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Eric Ren [Fri, 30 Sep 2016 22:11:32 +0000 (15:11 -0700)]
ocfs2: fix deadlock on mmapped page in ocfs2_write_begin_nolock()
The testcase "mmaptruncate" of ocfs2-test deadlocks occasionally.
In this testcase, we create a 2*CLUSTER_SIZE file and mmap() on it;
there are 2 process repeatedly performing the following operations
respectively: one is doing memset(mmaped_addr + 2*CLUSTER_SIZE - 1, 'a',
1), while the another is playing ftruncate(fd, 2*CLUSTER_SIZE) and then
ftruncate(fd, CLUSTER_SIZE) again and again.
In ocfs2_write_begin_nolock(), we first grab the pages and then allocate
disk space for this write; ocfs2_try_to_free_truncate_log() will be
called if -ENOSPC is returned; if we're lucky to get enough clusters,
which is usually the case, we start over again.
But in ocfs2_free_write_ctxt() the target page isn't unlocked, so we
will deadlock when trying to grab the target page again.
Also, -ENOMEM might be returned in ocfs2_grab_pages_for_write().
Another deadlock will happen in __do_page_mkwrite() if
ocfs2_page_mkwrite() returns non-VM_FAULT_LOCKED, and along with a
locked target page.
These two errors fail on the same path, so fix them by unlocking the
target page manually before ocfs2_free_write_ctxt().
Jan Kara helps me clear out the JBD2 part, and suggest the hint for root
cause.
Changes since v1:
1. Also put ENOMEM error case into consideration.
Link: http://lkml.kernel.org/r/1474173902-32075-1-git-send-email-zren@suse.com Signed-off-by: Eric Ren <zren@suse.com> Reviewed-by: He Gang <ghe@suse.com> Acked-by: Joseph Qi <joseph.qi@huawei.com> Cc: Mark Fasheh <mfasheh@suse.de> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Fri, 30 Sep 2016 22:11:29 +0000 (15:11 -0700)]
mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()
Antonio reports the following crash when using fuse under memory pressure:
kernel BUG at /build/linux-a2WvEb/linux-4.4.0/mm/workingset.c:346!
invalid opcode: 0000 [#1] SMP
Modules linked in: all of them
CPU: 2 PID: 63 Comm: kswapd0 Not tainted 4.4.0-36-generic #55-Ubuntu
Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
task: ffff88040cae6040 ti: ffff880407488000 task.ti: ffff880407488000
RIP: shadow_lru_isolate+0x181/0x190
Call Trace:
__list_lru_walk_one.isra.3+0x8f/0x130
list_lru_walk_one+0x23/0x30
scan_shadow_nodes+0x34/0x50
shrink_slab.part.40+0x1ed/0x3d0
shrink_zone+0x2ca/0x2e0
kswapd+0x51e/0x990
kthread+0xd8/0xf0
ret_from_fork+0x3f/0x70
which corresponds to the following sanity check in the shadow node
tracking:
BUG_ON(node->count & RADIX_TREE_COUNT_MASK);
The workingset code tracks radix tree nodes that exclusively contain
shadow entries of evicted pages in them, and this (somewhat obscure)
line checks whether there are real pages left that would interfere with
reclaim of the radix tree node under memory pressure.
While discussing ways how fuse might sneak pages into the radix tree
past the workingset code, Miklos pointed to replace_page_cache_page(),
and indeed there is a problem there: it properly accounts for the old
page being removed - __delete_from_page_cache() does that - but then
does a raw raw radix_tree_insert(), not accounting for the replacement
page. Eventually the page count bits in node->count underflow while
leaving the node incorrectly linked to the shadow node LRU.
To address this, make sure replace_page_cache_page() uses the tracked
page insertion code, page_cache_tree_insert(). This fixes the page
accounting and makes sure page-containing nodes are properly unlinked
from the shadow node LRU again.
Also, make the sanity checks a bit less obscure by using the helpers for
checking the number of pages and shadows in a radix tree node.
Fixes: 449dd6984d0e ("mm: keep page cache radix tree nodes in check") Link: http://lkml.kernel.org/r/20160919155822.29498-1-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Antonio SJ Musumeci <trapexit@spawn.link> Debugged-by: Miklos Szeredi <miklos@szeredi.hu> Cc: <stable@vger.kernel.org> [3.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
... can be reproduced by running the GS testcase of the ldt_gdt test unit in
the x86 selftests.
do_int80_syscall_32() will call enter_form_user_mode() to convert context
tracking state from user state to kernel state. The load_gs_index() call
can fail with user gsbase, gsbase will be fixed up and proceed if this
happen.
However, enter_from_user_mode() will be called again in the fixed up path
though it is context tracking kernel state currently.
This patch fixes it by just fixing up gsbase and telling lockdep that IRQs
are off once load_gs_index() failed with user gsbase.
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1475197266-3440-1-git-send-email-wanpeng.li@hotmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fixes: 57f90c3dfc75 ("x86/vdso: Error out if the vDSO isn't a valid DSO") Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: linux-next@vger.kernel.org Link: http://lkml.kernel.org/r/20160929193442.GA16617@gate.crashing.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Andy Lutomirski [Thu, 29 Sep 2016 19:48:11 +0000 (12:48 -0700)]
x86/boot: Fix another __read_cr4() case on 486
The condition for reading CR4 was wrong: there are some CPUs with
CPUID but not CR4. Rather than trying to make the condition exact,
use __read_cr4_safe().
Fixes: 18bc7bd523e0 ("x86/boot: Synchronize trampoline_cr4_features and mmu_cr4_features directly") Reported-by: david@saggiorato.net Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Link: http://lkml.kernel.org/r/8c453a61c4f44ab6ff43c29780ba04835234d2e5.1475178369.git.luto@kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Xin Long [Wed, 28 Sep 2016 18:55:44 +0000 (02:55 +0800)]
sctp: fix the issue sctp_diag uses lock_sock in rcu_read_lock
When sctp dumps all the ep->assocs, it needs to lock_sock first,
but now it locks sock in rcu_read_lock, and lock_sock may sleep,
which would break rcu_read_lock.
This patch is to get and hold one sock when traversing the list.
After that and get out of rcu_read_lock, lock and dump it. Then
it will traverse the list again to get the next one until all
sctp socks are dumped.
For sctp_diag_dump_one, it fixes this issue by holding asoc and
moving cb() out of rcu_read_lock in sctp_transport_lookup_process.
Fixes: 8f840e47f190 ("sctp: add the sctp_diag.c file") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Wed, 28 Sep 2016 18:37:28 +0000 (02:37 +0800)]
sctp: change to check peer prsctp_capable when using prsctp polices
Now before using prsctp polices, sctp uses asoc->prsctp_enable to
check if prsctp is enabled. However asoc->prsctp_enable is set only
means local host support prsctp, sctp should not abandon packet if
peer host doesn't enable prsctp.
So this patch is to use asoc->peer.prsctp_capable to check if prsctp
is enabled on both side, instead of asoc->prsctp_enable, as asoc's
peer.prsctp_capable is set only when local and peer both enable prsctp.
Fixes: a6c2f792873a ("sctp: implement prsctp TTL policy") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Wed, 28 Sep 2016 18:37:27 +0000 (02:37 +0800)]
sctp: remove prsctp_param from sctp_chunk
Now sctp uses chunk->prsctp_param to save the prsctp param for all the
prsctp polices, we didn't need to introduce prsctp_param to sctp_chunk.
We can just use chunk->sinfo.sinfo_timetolive for RTX and BUF polices,
and reuse msg->expires_at for TTL policy, as the prsctp polices and old
expires policy are mutual exclusive.
This patch is to remove prsctp_param from sctp_chunk, and reuse msg's
expires_at for TTL and chunk's sinfo.sinfo_timetolive for RTX and BUF
polices.
Note that sctp can't use chunk's sinfo.sinfo_timetolive for TTL policy,
as it needs a u64 variables to save the expires_at time.
This one also fixes the "netperf-Throughput_Mbps -37.2% regression"
issue.
Fixes: a6c2f792873a ("sctp: implement prsctp TTL policy") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Wed, 28 Sep 2016 18:37:26 +0000 (02:37 +0800)]
sctp: move sent_count to the memory hole in sctp_chunk
Now pahole sctp_chunk, it has 2 memory holes:
struct sctp_chunk {
struct list_head list;
atomic_t refcnt;
/* XXX 4 bytes hole, try to pack */
...
long unsigned int prsctp_param;
int sent_count;
/* XXX 4 bytes hole, try to pack */
This patch is to move up sent_count to fill the 1st one and eliminate
the 2nd one.
It's not just another struct compaction, it also fixes the "netperf-
Throughput_Mbps -37.2% regression" issue when overloading the CPU.
Fixes: a6c2f792873a ("sctp: implement prsctp TTL policy") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Milton Miller [Thu, 29 Sep 2016 16:24:08 +0000 (13:24 -0300)]
tg3: Avoid NULL pointer dereference in tg3_io_error_detected()
While the driver is probing the adapter, an error may occur before the
netdev structure is allocated and attached to pci_dev. In this case,
not only netdev isn't available, but the tg3 private structure is also
not available as it is just math from the NULL pointer, so dereferences
must be skipped.
The following trace is seen when the error is triggered:
This patch avoids the NULL pointer dereference by moving the access after
the netdev NULL pointer check on tg3_io_error_detected(). Also, we add a
check for netdev being NULL on tg3_io_resume() [suggested by Michael Chan].
Fixes: 0486a063b1ff ("tg3: prevent ifup/ifdown during PCI error recovery") Fixes: dfc8f370316b ("net/tg3: Release IRQs on permanent error") Tested-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: Milton Miller <miltonm@us.ibm.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Acked-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'drm-fixes-for-v4.8-final' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"drm fixes for final 4.8.
One big regression fix for udl, along with two amdgpu fixes and two
nouveau fixes.
All seems pretty safe and useful"
* tag 'drm-fixes-for-v4.8-final' of git://people.freedesktop.org/~airlied/linux:
drm/udl: fix line iterator in damage handling
drm/radeon/si/dpm: add workaround for for Jet parts
drm/amdgpu: disable CRTCs before teardown
drm/nouveau: Revert "bus: remove cpu_coherent flag"
drm/nouveau/fifo/nv04: avoid ramht race against cookie insertion
Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
- Four fixes for "flush hint" support.
Flush hints are addresses advertised by the ACPI 6+ NFIT (NVDIMM
Firmware Interface Table) that when written and fenced guarantee that
writes pending in platform write buffers (outside the cpu) have been
flushed to media. They might also be used by hypervisors as a
trigger condition to flush guest-persistent memory ranges to storage.
Fix a potential data corruption issue, a broken definition of the
hint array, a wrong allocation size for the unit test implementation
of the flush hint table, and missing NULL check in an error path.
The unit test, while it did not prevent these bugs from being
merged, at least triggered occasional crashes in advance of
production usages.
- Fix handling of ACPI DSM error status results. The DSM mechanism
allows communication with platform and memory device firmware. We
correctly parse known errors, but were silently ignoring others.
Fix it to consistently fail any command with a non-zero status return
that we otherwise do not interpret / handle.
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
libnvdimm, region: fix flush hint table thinko
nfit: fail DSMs that return non-zero status by default
libnvdimm: fix devm_nvdimm_memremap() error path
tools/testing/nvdimm: fix allocation range for mock flush hint tables
nvdimm: fix PHYS_PFN/PFN_PHYS mixup
Paul Burton [Fri, 2 Sep 2016 14:17:31 +0000 (15:17 +0100)]
MIPS: Fix detection of unsupported highmem with cache aliases
The paging_init() function contains code which detects that highmem is
in use but unsupported due to dcache aliasing. However this code was
ineffective because it was being run before the caches are probed,
meaning that cpu_has_dc_aliases would always evaluate to false (unless a
platform overrides it to a compile-time constant) and the detection of
the unsupported case is never triggered. The kernel would then go on to
attempt to use highmem & either hit coherency issues or trigger the
BUG_ON in flush_kernel_dcache_page().
Fix this by running paging_init() later than cpu_cache_init(), such that
the cpu_has_dc_aliases macro will evaluate correctly & the unsupported
highmem case will be detected successfully.
This then leads to a formerly hidden issue in that
mem_init_free_highmem() will attempt to free all highmem pages, even
though we're avoiding use of them & don't have valid page structs for
them. This leads to an invalid pointer dereference & a TLB exception.
Avoid this by skipping the loop in mem_init_free_highmem() if
cpu_has_dc_aliases evaluates true.
Signed-off-by: Paul Burton <paul.burton@imgtec.com> Cc: Rabin Vincent <rabinv@axis.com> Cc: Matt Redfearn <matt.redfearn@imgtec.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Alexander Sverdlin <alexander.sverdlin@gmail.com> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: Jaedon Shin <jaedon.shin@gmail.com> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Sergey Ryazanov <ryazanov.s.a@gmail.com> Cc: Jonas Gorski <jogo@openwrt.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14184/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Fri, 2 Sep 2016 15:07:10 +0000 (16:07 +0100)]
MIPS: Malta: Fix IOCU disable switch read for MIPS64
Malta boards used with CPU emulators feature a switch to disable use of
an IOCU. Software has to check this switch & ignore any present IOCU if
the switch is closed. The read used to do this was unsafe for 64 bit
kernels, as it simply casted the address 0xbf403000 to a pointer &
dereferenced it. Whilst in a 32 bit kernel this would access kseg1, in a
64 bit kernel this attempts to access xuseg & results in an address
error exception.
Fix by accessing a correctly formed ckseg1 address generated using the
CKSEG1ADDR macro.
Whilst modifying this code, define the name of the register and the bit
we care about within it, which indicates whether PCI DMA is routed to
the IOCU or straight to DRAM. The code previously checked that bit 0 was
also set, but the least significant 7 bits of the CONFIG_GEN0 register
contain the value of the MReqInfo signal provided to the IOCU OCP bus,
so singling out bit 0 makes little sense & that part of the check is
dropped.
Paul Burton [Fri, 19 Aug 2016 17:15:40 +0000 (18:15 +0100)]
MIPS: Fix BUILD_ROLLBACK_PROLOGUE for microMIPS
When the kernel is built for microMIPS, branches targets need to be
known to be microMIPS code in order to result in bit 0 of the PC being
set. The branch target in the BUILD_ROLLBACK_PROLOGUE macro was simply
the end of the macro, which may be pointing at padding rather than at
code. This results in recent enough GNU linkers complaining like so:
mips-img-linux-gnu-ld: arch/mips/built-in.o: .text+0x3e3c: Unsupported branch between ISA modes.
mips-img-linux-gnu-ld: final link failed: Bad value
Makefile:936: recipe for target 'vmlinux' failed
make: *** [vmlinux] Error 1
Fix this by changing the branch target to be the start of the
appropriate handler, skipping over any padding.
Paul Burton [Fri, 19 Aug 2016 17:18:28 +0000 (18:18 +0100)]
MIPS: clear execution hazard after changing FTLB enable
On current P-series cores from Imagination the FTLB can be enabled or
disabled via a bit in the Config6 register, and an execution hazard is
created by changing the value of bit. The ftlb_disable function already
cleared that hazard but that does no good for other callers. Clear the
hazard in the set_ftlb_enable function that creates it, and only for the
cores where it applies.
This has the effect of reverting c982c6d6c48b ("MIPS: cpu-probe: Remove
cp0 hazard barrier when enabling the FTLB") which was incorrect.
Signed-off-by: Paul Burton <paul.burton@imgtec.com> Fixes: c982c6d6c48b ("MIPS: cpu-probe: Remove cp0 hazard barrier when enabling the FTLB") Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14023/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Fri, 19 Aug 2016 17:18:27 +0000 (18:18 +0100)]
MIPS: Configure FTLB after probing TLB sizes from config4
On some cores (proAptiv, P5600) we make use of the sizes of the TLBs
to determine the desired FTLB:VTLB write ratio. However set_ftlb_enable
& thus calculate_ftlb_probability is called before decode_config4. This
results in us calculating a probability based on zero sizes, and we end
up setting FTLBP=3 for a 3:1 FTLB:VTLB write ratio in all cases. This
will make abysmal use of the available FTLB resources in the affected
cores.
Fix this by configuring the FTLB probability after having decoded
config4. However we do need to have enabled the FTLB before that point
such that fields in config4 actually reflect that an FTLB is present. So
set_ftlb_enable is now called twice, with flags indicating that it
should configure the write probability only the second time.
Signed-off-by: Paul Burton <paul.burton@imgtec.com> Fixes: cf0a8aa0226d ("MIPS: cpu-probe: Set the FTLB probability bit on supported cores") Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14022/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Paul Burton [Fri, 19 Aug 2016 17:18:26 +0000 (18:18 +0100)]
MIPS: Stop setting I6400 FTLBP
The FTLBP field in Config7 for the I6400 is intended as chicken bits for
debugging rather than as a field that software actually makes use of.
For best performance, FTLBP should be left at its default value of 0
with all TLB writes hitting the FTLB by default.
Additionally, since set_ftlb_enable is called from decode_configs before
decode_config4 which determines the size of the TLBs, this was
previously always setting FTLBP=3 for a 3:1 FTLB:VTLB write ratio which
makes abysmal use of the available FTLB resources.
This effectively reverts b0c4e1b79d8a ("MIPS: Set up FTLB probability
for I6400").
Signed-off-by: Paul Burton <paul.burton@imgtec.com> Fixes: b0c4e1b79d8a ("MIPS: Set up FTLB probability for I6400") Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14021/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
MIPS: DEC: Avoid la pseudo-instruction in delay slots
When expanding the la or dla pseudo-instruction in a delay slot the GNU
assembler will complain should the pseudo-instruction expand to multiple
actual instructions, since only the first of them will be in the delay
slot leading to the pseudo-instruction being only partially executed if
the branch is taken. Use of PTR_LA in the dec int-handler.S leads to
such warnings:
arch/mips/dec/int-handler.S: Assembler messages:
arch/mips/dec/int-handler.S:149: Warning: macro instruction expanded into multiple instructions in a branch delay slot
arch/mips/dec/int-handler.S:198: Warning: macro instruction expanded into multiple instructions in a branch delay slot
Steven J. Hill [Fri, 26 Aug 2016 19:02:04 +0000 (14:02 -0500)]
MIPS: Octeon: mark GPIO controller node not populated after IRQ init.
We clear the OF_POPULATED flag for the GPIO controller node on Octeon
processors. Otherwise, none of the devices hanging on the GPIO lines
are probed. The 'gpio-leds' driver on OCTEON failed to probe in addition
to other devices on Cavium 71xx and 78xx development boards.
Fixes: 15cc2ed6dcf9 ("of/irq: Mark initialised interrupt controllers as populated") Signed-off-by: Steven J. Hill <steven.hill@cavium.com> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Cc: David Daney <david.daney@cavium.com> Cc: Rob Herring <robh@kernel.org> Cc: linux-mips@linux-mips.org Cc: devicetree@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14091/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
arch_uprobe_pre_xol needs to emulate a branch if a branch instruction
has been replaced with a breakpoint, but in fact an uninitialised local
variable was passed to the emulator routine instead of the original
instruction
Generic kernel code implements a weak version of set_orig_insn that
moves cached 'insn' from arch_uprobe to the original code location when
the trap is removed.
MIPS variant used arch_uprobe->orig_inst which was never initialised
properly, so this code only inserted a nop instead of the original
instruction. With that change orig_inst can also be safely removed.
Matt Redfearn [Thu, 22 Sep 2016 10:59:47 +0000 (11:59 +0100)]
MIPS: smp-cps: Avoid BUG() when offlining pre-r6 CPUs
Commit 0d2808f338c7 ("MIPS: smp-cps: Add support for CPU hotplug of
MIPSr6 processors") added a call to mips_cm_lock_other in order to lock
the CPC in CPUs containing a version 3 or higher Coherence Manager,
which use the general CM core other register, where previous CMs had a
dedicated core other register for the CPC.
A kernel BUG() is triggered, however, if mips_cm_lock_other is called
with a VP other than 0 on a CPU with CM < 3, a condition introduced by 0d2808f338c7.
Avoid the BUG() by always locking VP0 when locking the CPC, since the
required register, cpc_stat_conf, is shared by all vps in a core.
Fixes: 0d2808f338c7 ("MIPS: smp-cps: Add support for CPU hotplug...) Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com> Cc: Qais Yousef <qsyousef@gmail.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14297/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Roger Quadros [Thu, 29 Sep 2016 07:32:55 +0000 (08:32 +0100)]
ARM: 8617/1: dma: fix dma_max_pfn()
Since commit 6ce0d2001692 ("ARM: dma: Use dma_pfn_offset for dma address translation"),
dma_to_pfn() already returns the PFN with the physical memory start offset
so we don't need to add it again.
This fixes USB mass storage lock-up problem on systems that can't do DMA
over the entire physical memory range (e.g.) Keystone 2 systems with 4GB RAM
can only do DMA over the first 2GB. [K2E-EVM].
What happens there is that without this patch SCSI layer sets a wrong
bounce buffer limit in scsi_calculate_bounce_limit() for the USB mass
storage device. dma_max_pfn() evaluates to 0x8fffff and bounce_limit
is set to 0x8fffff000 whereas maximum DMA'ble physical memory on Keystone 2
is 0x87fffffff. This results in non DMA'ble pages being given to the
USB controller and hence the lock-up.
NOTE: in the above case, USB-SCSI-device's dma_pfn_offset was showing as 0.
This should have really been 0x780000 as on K2e, LOWMEM_START is 0x80000000
and HIGHMEM_START is 0x800000000. DMA zone is 2GB so dma_max_pfn should be
0x87ffff. The incorrect dma_pfn_offset for the USB storage device is because
USB devices are not correctly inheriting the dma_pfn_offset from the
USB host controller. This will be fixed by a separate patch.
Fixes: 6ce0d2001692 ("ARM: dma: Use dma_pfn_offset for dma address translation") Cc: stable@vger.kernel.org Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Olof Johansson <olof@lixom.net> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Linus Walleij <linus.walleij@linaro.org> Reported-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Robin Murphy [Mon, 26 Sep 2016 15:50:55 +0000 (16:50 +0100)]
ARM: 8616/1: dt: Respect property size when parsing CPUs
Whilst MPIDR values themselves are less than 32 bits, it is still
perfectly valid for a DT to have #address-cells > 1 in the CPUs node,
resulting in the "reg" property having leading zero cell(s). In that
situation, the big-endian nature of the data conspires with the current
behaviour of only reading the first cell to cause the kernel to think
all CPUs have ID 0, and become resoundingly unhappy as a consequence.
Take the full property length into account when parsing CPUs so as to
be correct under any circumstances.
Cc: Russell King <linux@armlinux.org.uk> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
David S. Miller [Thu, 29 Sep 2016 00:40:52 +0000 (20:40 -0400)]
sparc64: Fix non-SMP build.
Need to provide a dummy smp_fill_in_cpu_possible_map.
Fixes: 9b2f753ec237 ("sparc64: Fix cpu_possible_mask if nr_cpus is set") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mem-hotplug: use nodes that contain memory as mask in new_node_page()
scripts/recordmcount.c: account for .softirqentry.text
dma-mapping.h: preserve unmap info for CONFIG_DMA_API_DEBUG
mm,ksm: fix endless looping in allocating memory when ksm enable
Li Zhong [Wed, 28 Sep 2016 22:22:38 +0000 (15:22 -0700)]
mem-hotplug: use nodes that contain memory as mask in new_node_page()
9bb627be47a5 ("mem-hotplug: don't clear the only node in new_node_page()")
prevents allocating from an empty nodemask, but as David points out, it is
still wrong. As node_online_map may include memoryless nodes, only
allocating from these nodes is meaningless.
This patch uses node_states[N_MEMORY] mask to prevent the above case.
Fixes: 9bb627be47a5 ("mem-hotplug: don't clear the only node in new_node_page()") Fixes: 394e31d2ceb4 ("mem-hotplug: alloc new page from a nearest neighbor node when mem-offline") Link: http://lkml.kernel.org/r/1474447117.28370.6.camel@TP420 Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Suggested-by: David Rientjes <rientjes@google.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@suse.cz> Cc: John Allen <jallen@linux.vnet.ibm.com> Cc: Xishi Qiu <qiuxishi@huawei.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
scripts/recordmcount.c: account for .softirqentry.text
be7635e7287e ("arch, ftrace: for KASAN put hard/soft IRQ entries into
separate sections") added .softirqentry.text section, but it was not added
to recordmcount. So functions in the section are untracable. Add the
section to scripts/recordmcount.c and scripts/recordmcount.pl.
Fixes: be7635e7287e ("arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections") Link: http://lkml.kernel.org/r/1474902626-73468-1-git-send-email-dvyukov@google.com Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Steve Rostedt <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> [4.6+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>