When event on child inodes are sent to the parent inode mark and
parent inode mark was not marked with FAN_EVENT_ON_CHILD, the event
will not be delivered to the listener process. However, if the same
process also has a mount mark, the event to the parent inode will be
delivered regadless of the mount mark mask.
This behavior is incorrect in the case where the mount mark mask does
not contain the specific event type. For example, the process adds
a mark on a directory with mask FAN_MODIFY (without FAN_EVENT_ON_CHILD)
and a mount mark with mask FAN_CLOSE_NOWRITE (without FAN_ONDIR).
A modify event on a file inside that directory (and inside that mount)
should not create a FAN_MODIFY event, because neither of the marks
requested to get that event on the file.
Fixes: 1968f5eed54c ("fanotify: use both marks when possible") Cc: stable <stable@vger.kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
[natechancellor: Fix small conflict due to lack of 3cd5eca8d7a2f] Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The autofs file system mkdir inode operation blindly sets the created
directory mode to S_IFDIR | 0555, ingoring the passed in mode, which can
cause selinux dac_override denials.
But the function also checks if the caller is the daemon (as no-one else
should be able to do anything here) so there's no point in not honouring
the passed in mode, allowing the daemon to set appropriate mode when
required.
Link: http://lkml.kernel.org/r/152361593601.8051.14014139124905996173.stgit@pluto.themaw.net Signed-off-by: Ian Kent <raven@themaw.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We want it only for the stuff created by SB_KERNMOUNT mounts, *not* for
their copies. As it is, creating a deep stack of bindings of /proc/*/ns/*
somewhere in a new namespace and exiting yields a stack overflow.
Cc: stable@kernel.org Reported-by: Alexander Aring <aring@mojatatu.com> Bisected-by: Kirill Tkhai <ktkhai@virtuozzo.com> Tested-by: Kirill Tkhai <ktkhai@virtuozzo.com> Tested-by: Alexander Aring <aring@mojatatu.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
if we ever hit rpc_gssd_dummy_depopulate() dentry passed to
it has refcount equal to 1. __rpc_rmpipe() drops it and
dput() done after that hits an already freed dentry.
Cc: stable@kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we patch an alternate feature section, we have to adjust any
relative branches that branch out of the alternate section.
But currently we have a bug if we have a branch that points to past
the last instruction of the alternate section, eg:
FTR_SECTION_ELSE
1: b 2f
or 6,6,6
2:
ALT_FTR_SECTION_END(...)
nop
This will result in a relative branch at 1 with a target that equals
the end of the alternate section.
That branch does not need adjusting when it's moved to the non-else
location. Currently we do adjust it, resulting in a branch that goes
off into the link-time location of the else section, which is junk.
The fix is to not patch branches that have a target == end of the
alternate section.
The label .Llast_fixup\@ is jumped to on page fault within the final
byte set loop of memset (on < MIPSR6 architectures). For some reason, in
this fault handler, the v1 register is randomly set to a2 & STORMASK.
This clobbers v1 for the calling function. This can be observed with the
following test code:
static int __init __attribute__((optimize("O0"))) test_clear_user(void)
{
register int t asm("v1");
char *test;
int j, k;
pr_info("\n\n\nTesting clear_user\n");
test = vmalloc(PAGE_SIZE);
for (j = 256; j < 512; j++) {
t = 0xa5a5a5a5;
if ((k = clear_user(test + PAGE_SIZE - 256, j)) != j - 256) {
pr_err("clear_user (%px %d) returned %d\n", test + PAGE_SIZE - 256, j, k);
}
if (t != 0xa5a5a5a5) {
pr_err("v1 was clobbered to 0x%x!\n", t);
}
}
return 0;
}
late_initcall(test_clear_user);
Which demonstrates that v1 is indeed clobbered (MIPS64):
Testing clear_user
v1 was clobbered to 0x1!
v1 was clobbered to 0x2!
v1 was clobbered to 0x3!
v1 was clobbered to 0x4!
v1 was clobbered to 0x5!
v1 was clobbered to 0x6!
v1 was clobbered to 0x7!
Since the number of bytes that could not be set is already contained in
a2, the andi placing a value in v1 is not necessary and actively
harmful in clobbering v1.
Reported-by: James Hogan <jhogan@kernel.org> Signed-off-by: Matt Redfearn <matt.redfearn@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/19109/ Signed-off-by: James Hogan <jhogan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The __clear_user function is defined to return the number of bytes that
could not be cleared. From the underlying memset / bzero implementation
this means setting register a2 to that number on return. Currently if a
page fault is triggered within the memset_partial block, the value
loaded into a2 on return is meaningless.
The label .Lpartial_fixup\@ is jumped to on page fault. In order to work
out how many bytes failed to copy, the exception handler should find how
many bytes left in the partial block (andi a2, STORMASK), add that to
the partial block end address (a2), and subtract the faulting address to
get the remainder. Currently it incorrectly subtracts the partial block
start address (t1), which has additionally been clobbered to generate a
jump target in memset_partial. Fix this by adding the block end address
instead.
This issue was found with the following test code:
int j, k;
for (j = 0; j < 512; j++) {
if ((k = clear_user(NULL, j)) != j) {
pr_err("clear_user (NULL %d) returned %d\n", j, k);
}
}
Which now passes on Creator Ci40 (MIPS32) and Cavium Octeon II (MIPS64).
Suggested-by: James Hogan <jhogan@kernel.org> Signed-off-by: Matt Redfearn <matt.redfearn@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/19108/ Signed-off-by: James Hogan <jhogan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The MIPS kernel memset / bzero implementation includes a small_memset
branch which is used when the region to be set is smaller than a long (4
bytes on 32bit, 8 bytes on 64bit). The current small_memset
implementation uses a simple store byte loop to write the destination.
There are 2 issues with this implementation:
1. When EVA mode is active, user and kernel address spaces may overlap.
Currently the use of the sb instruction means kernel mode addressing is
always used and an intended write to userspace may actually overwrite
some critical kernel data.
2. If the write triggers a page fault, for example by calling
__clear_user(NULL, 2), instead of gracefully handling the fault, an OOPS
is triggered.
Fix these issues by replacing the sb instruction with the EX() macro,
which will emit EVA compatible instuctions as required. Additionally
implement a fault fixup for small_memset which sets a2 to the number of
bytes that could not be cleared (as defined by __clear_user).
Reported-by: Chuanhua Lei <chuanhua.lei@intel.com> Signed-off-by: Matt Redfearn <matt.redfearn@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/18975/ Signed-off-by: James Hogan <jhogan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Doing `ioctl(HIDIOCGFEATURE)` in a tight loop on a hidraw device
and then disconnecting the device, or unloading the driver, can
cause a NULL pointer dereference.
When a hidraw device is destroyed it sets 0 to `dev->exist`.
Most functions check 'dev->exist' before doing its work, but
`hidraw_get_report()` was missing that check.
Some rawmidi compat ioctls lack of the input substream checks
(although they do check only for rfile->output). This many eventually
lead to an Oops as NULL substream is passed to the rawmidi core
functions.
Fix it by adding the proper checks before each function call.
Two years ago I tried an AMD Radeon E8860 embedded GPU with the drm driver.
The dmesg output included driver warnings about an invalid PCIe lane width.
Tracking the problem back led to si_set_pcie_lane_width_in_smc().
The calculation of the lane widths via ATOM_PPLIB_PCIE_LINK_WIDTH_MASK and
ATOM_PPLIB_PCIE_LINK_WIDTH_SHIFT macros did not increment the resulting
value, per the comment in pptable.h ("lanes - 1"), and per usage elsewhere.
Applying the increment silenced the warnings.
The code has not changed since, so either my analysis was incorrect or the
bug has gone unnoticed. Hence submitting this as an RFC.
Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Paul Parsons <lost.distance@yahoo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If some metadata block, such as an allocation bitmap, overlaps the
superblock, it's very likely that if the file system is mounted
read/write, the results will not be pretty. So disallow r/w mounts
for file systems corrupted in this particular way.
Backport notes:
3.18.y is missing bc98a42c1f7d ("VFS: Convert sb->s_flags & MS_RDONLY to sb_rdonly(sb)")
and e462ec50cb5f ("VFS: Differentiate mount flags (MS_*) from internal superblock flags")
so we simply use the sb MS_RDONLY check from pre bc98a42c1f7d in place of the sb_rdonly
function used in the upstream variant of the patch.
If the root directory has an i_links_count of zero, then when the file
system is mounted, then when ext4_fill_super() notices the problem and
tries to call iput() the root directory in the error return path,
ext4_evict_inode() will try to free the inode on disk, before all of
the file system structures are set up, and this will result in an OOPS
caused by a NULL pointer dereference.
The commit 02a5d6925cd3 ("ALSA: pcm: Avoid potential races between OSS
ioctls and read/write") split the PCM preparation code to a locked
version, and it added a sanity check of runtime->oss.prepare flag
along with the change. This leaded to an endless loop when the stream
gets XRUN: namely, snd_pcm_oss_write3() and co call
snd_pcm_oss_prepare() without setting runtime->oss.prepare flag and
the loop continues until the PCM state reaches to another one.
As the function is supposed to execute the preparation
unconditionally, drop the invalid state check there.
The previous fix 40cab6e88cb0 ("ALSA: pcm: Return -EBUSY for OSS
ioctls changing busy streams") introduced some mutex unbalance; the
check of runtime->oss.rw_ref was inserted in a wrong place after the
mutex lock.
This patch fixes the inconsistency by rewriting with the helper
functions to lock/unlock parameters with the stream check.
OSS PCM stream management isn't modal but it allows ioctls issued at
any time for changing the parameters. In the previous hardening
patch ("ALSA: pcm: Avoid potential races between OSS ioctls and
read/write"), we covered these races and prevent the corruption by
protecting the concurrent accesses via params_lock mutex. However,
this means that some ioctls that try to change the stream parameter
(e.g. channels or format) would be blocked until the read/write
finishes, and it may take really long.
Basically changing the parameter while reading/writing is an invalid
operation, hence it's even more user-friendly from the API POV if it
returns -EBUSY in such a situation.
This patch adds such checks in the relevant ioctls with the addition
of read/write access refcount.
Although we apply the params_lock mutex to the whole read and write
operations as well as snd_pcm_oss_change_params(), we may still face
some races.
First off, the params_lock is taken inside the read and write loop.
This is intentional for avoiding the too long locking, but it allows
the in-between parameter change, which might lead to invalid
pointers. We check the readiness of the stream and set up via
snd_pcm_oss_make_ready() at the beginning of read and write, but it's
called only once, by assuming that it remains ready in the rest.
Second, many ioctls that may change the actual parameters
(i.e. setting runtime->oss.params=1) aren't protected, hence they can
be processed in a half-baked state.
This patch is an attempt to plug these holes. The stream readiness
check is moved inside the read/write inner loop, so that the stream is
always set up in a proper state before further processing. Also, each
ioctl that may change the parameter is wrapped with the params_lock
for avoiding the races.
The issues were triggered by syzkaller in a few different scenarios,
particularly the one below appearing as GPF in loopback_pos_update.
When device boots with T > T_trip_1 and requests interrupt,
the race condition takes place. The interrupt comes before
THERMAL_DEVICE_ENABLED is set. This leads to an attempt to
reading sensor value from irq and disabling the sensor, based on
the data->mode field, which expected to be THERMAL_DEVICE_ENABLED,
but still stays as THERMAL_DEVICE_DISABLED. Afher this issue
sensor is never re-enabled, as the driver state is wrong.
Fix this problem by setting the 'data' members prior to
requesting the interrupts.
Clearfog boards can come with a CPU clocked at 1600MHz (commercial)
or 1333MHz (industrial).
They have also some dip-switches to select a different clock (666, 800,
1066, 1200).
The funny thing is that the recovery button is on the MPP34 fq selector.
So, when booting an industrial board with this button down, the frequency
666MHz is selected (and the kernel didn't boot).
This patch add all the missing clocks.
The only mode I didn't test is 2GHz (uboot found 4294MHz instead :/ ).
Fixes: 0e85aeced4d6 ("clk: mvebu: add clock support for Armada 380/385") Cc: <stable@vger.kernel.org> # 3.16.x: 9593f4f56cf5: clk: mvebu: armada-38x: add support for 1866MHz variants Cc: <stable@vger.kernel.org> # 3.16.x Signed-off-by: Richard Genoud <richard.genoud@gmail.com> Acked-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Linksys WRT3200ACM CPU is clocked at 1866MHz. Add 1866MHz to the
list of supported CPU frequencies. Also update multiplier and divisor
for the l2clk and ddrclk.
Noticed by the following warning:
[ 0.000000] Selected CPU frequency (16) unsupported
Signed-off-by: Ralph Sennhauser <ralph.sennhauser@gmail.com> Reviewed-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A spinlock is held while updating the internal copy of the IRQ mask,
but not while writing it to the actual IMASK register. After the lock
is released, an IRQ can occur before the IMASK register is written.
If handling this IRQ causes the mask to be changed, when the handler
returns back to the middle of the first mask update, a stale value
will be written to the mask register.
If this causes an IRQ to become unmasked that cannot have its status
cleared by writing a 1 to it in the IREG register, e.g. the SDIO IRQ,
then we can end up stuck with the same IRQ repeatedly being fired but
not handled. Normally the MMC IRQ handler attempts to clear any
unexpected IRQs by writing IREG, but for those that cannot be cleared
in this way then the IRQ will just repeatedly fire.
This was resulting in lockups after a while of using Wi-Fi on the
CI20 (GitHub issue #19).
Resolve by holding the spinlock until after the IMASK register has
been updated.
Cc: stable@vger.kernel.org Link: https://github.com/MIPS/CI20_linux/issues/19 Fixes: 61bfbdb85687 ("MMC: Add support for the controller on JZ4740 SoCs.") Tested-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: Alex Smith <alex.smith@imgtec.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This fixes a harmless UBSAN where root could potentially end up
causing an overflow while bumping the entropy_total field (which is
ignored once the entropy pool has been initialized, and this generally
is completed during the boot sequence).
This is marginal for the stable kernel series, but it's a really
trivial patch, and it fixes UBSAN warning that might cause security
folks to get overly excited for no reason.
The driver misses implementation of PM hook that undoes what
->freeze_noirq() does after the hibernation image is created. This means
the control channel is not resumed properly and the Thunderbolt bus
becomes useless in later stages of hibernation (when the image is stored
or if the operation fails).
Fix this by pointing ->thaw_noirq to driver nhi_resume_noirq(). This
makes sure the control channel is resumed properly.
Fixes: 23dd5bb49d98 ("thunderbolt: Add suspend/hibernate support") Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
SSM2602 driver is broken on recent kernels (at least
since 4.9). User space applications such as amixer or
alsamixer get EIO when attempting to access codec
controls via the relevant IOCTLs.
Root cause of these failures is the regcache_hw_init
function in drivers/base/regmap/regcache.c, which
prevents regmap cache initalization from the
reg_defaults_raw element of the regmap_config structure
when registers are write only. It also disables the
regmap cache entirely when all registers are write only
or volatile as is the case for the SSM2602 driver.
Using the reg_defaults element of the regmap_config
structure rather than the reg_defaults_raw element to
initalize the regmap cache avoids the logic in the
regcache_hw_init function entirely. It also makes this
driver consistent with other ASoC codec drivers, as
this driver was the ONLY codec driver that used the
reg_defaults_raw element to initalize the cache.
Tested on Digilent Zybo Z7 development board which has
a SSM2603 codec chip connected to a Xilinx Zynq SoC.
Signed-off-by: James Kelly <jamespeterkelly@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The OPAL NVRAM driver does not sleep in case it gets OPAL_BUSY or
OPAL_BUSY_EVENT from firmware, which causes large scheduling
latencies, and various lockup errors to trigger (again, BMC reboot
can cause it).
Fix this by converting it to the standard form OPAL_BUSY loop that
sleeps.
Fixes: 628daa8d5abf ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks")
Depends-on: 34dd25de9fe3 ("powerpc/powernv: define a standard delay for OPAL_BUSY type retry loops") Cc: stable@vger.kernel.org # v3.2+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is the start of an effort to tidy up and standardise all the
delays. Existing loops have a range of delay/sleep periods from 1ms
to 20ms, and some have no delay. They all loop forever except rtc,
which times out after 10 retries, and that uses 10ms delays. So use
10ms as our standard delay. The OPAL maintainer agrees 10ms is a
reasonable starting point.
The idea is to use the same recipe everywhere, once this is proven to
work then it will be documented as an OPAL API standard. Then both
firmware and OS can agree, and if a particular call needs something
else, then that can be documented with reasoning.
This is not the end-all of this effort, it's just a relatively easy
change that fixes some existing high latency delays. There should be
provision for standardising timeouts and/or interruptible loops where
possible, so non-fatal firmware errors don't cause hangs.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
asm/barrier.h is not always included after asm/synch.h, which meant
it was missing __SUBARCH_HAS_LWSYNC, so in some files smp_wmb() would
be eieio when it should be lwsync. kernel/time/hrtimer.c is one case.
__SUBARCH_HAS_LWSYNC is only used in one place, so just fold it in
to where it's used. Previously with my small simulator config, 377
instances of eieio in the tree. After this patch there are 55.
opal_nvram_write currently just assumes success if it encounters an
error other than OPAL_BUSY or OPAL_BUSY_EVENT. Have it return -EIO
on other errors instead.
Fixes: 628daa8d5abf ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks") Cc: stable@vger.kernel.org # v3.2+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Acked-by: Stewart Smith <stewart@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
USB3 specification 10.10, Enhanced SuperSpeed hubs only support selective
suspend and resume, they do not support global suspend/resume where the
hub downstream facing ports states are not affected.
When system enters hibernation it first enters freeze process where only
the root hub enters suspend, usb_port_suspend() is not called for other
devices, and suspend status flags are not set for them. Other devices are
expected to suspend globally. Some external USB3 hubs will suspend the
downstream facing port at global suspend. These devices won't be resumed
at thaw as the suspend status flag is not set.
A USB3 removable hard disk connected through a USB3 hub that won't resume
at thaw will fail to synchronize SCSI cache, return “cmd cmplt err -71”
error, and needs a 60 seconds timeout which causing system hang for 60s
before the USB host reset the port for the USB3 removable hard disk to
recover.
Fix this by always calling usb_port_suspend() during freeze for USB3
devices.
Mike Lothian reported that plugging in a USB-C device does not work
properly in his Dell Alienware system. This system has an Intel Alpine
Ridge Thunderbolt controller providing USB-C functionality. In these
systems the USB controller (xHCI) is hotplugged whenever a device is
connected to the port using ACPI-based hotplug.
The ACPI description of the root port in question is as follows:
Device (RP01)
{
Name (_ADR, 0x001C0000)
Device (PXSX)
{
Name (_ADR, 0x02)
Method (_RMV, 0, NotSerialized)
{
// ...
}
}
Here _ADR 0x02 means device 0, function 2 on the bus under root port (RP01)
but that seems to be incorrect because device 0 is the upstream port of the
Alpine Ridge PCIe switch and it has no functions other than 0 (the bridge
itself). When we get ACPI Notify() to the root port resulting from
connecting a USB-C device, Linux tries to read PCI_VENDOR_ID from device 0,
function 2 which of course always returns 0xffffffff because there is no
such function and we never find the device.
In Windows this works fine.
Now, since we get ACPI Notify() to the root port and not to the PXSX device
we should actually start our scan from there as well and not from the
non-existent PXSX device. Fix this by checking presence of the slot itself
(function 0) if we fail to do that otherwise.
While there use pci_bus_read_dev_vendor_id() in get_slot_status(), which is
the recommended way to read Device and Vendor IDs of devices on PCI buses.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=198557 Reported-by: Mike Lothian <mike@fireburn.co.uk> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A toolstack may delete the vif frontend and backend xenstore entries
while xen-netfront is in the removal code path. In that case, the
checks for xenbus_read_driver_state would return XenbusStateUnknown, and
xennet_remove would hang indefinitely. This hang prevents system
shutdown.
xennet_remove must be able to handle XenbusStateUnknown, and
netback_changed must also wake up the wake_queue for that state as well.
Fixes: 5b5971df3bc2 ("xen-netfront: remove warning when unloading module") Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Cc: Eduardo Otubo <otubo@redhat.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are only 19 PIOB pins having primary names PB0-PB18. Not all of them
have a 'C' function. So the pinctrl property mask ends up being the same as the
other SoC of the at91sam9x5 series.
Reported-by: Marek Sieranski <marek.sieranski@microchip.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Cc: <stable@vger.kernel.org> # v3.8+ Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
musb->endpoints[] has array size MUSB_C_NUM_EPS.
We must check array bounds before accessing the array and not afterwards.
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: Bin Liu <b-liu@ti.com> Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We've got a bug report indicating a kernel panic at booting on an x86-32
system, and it turned out to be the invalid PCI resource assigned after
reallocation. __find_resource() first aligns the resource start address
and resets the end address with start+size-1 accordingly, then checks
whether it's contained. Here the end address may overflow the integer,
although resource_contains() still returns true because the function
validates only start and end address. So this ends up with returning an
invalid resource (start > end).
There was already an attempt to cover such a problem in the commit 47ea91b4052d ("Resource: fix wrong resource window calculation"), but
this case is an overseen one.
This patch adds the validity check of the newly calculated resource for
avoiding the integer overflow problem.
Bugzilla: http://bugzilla.opensuse.org/show_bug.cgi?id=1086739 Link: http://lkml.kernel.org/r/s5hpo37d5l8.wl-tiwai@suse.de Fixes: 23c570a67448 ("resource: ability to resize an allocated resource") Signed-off-by: Takashi Iwai <tiwai@suse.de> Reported-by: Michael Henders <hendersm@shaw.ca> Tested-by: Michael Henders <hendersm@shaw.ca> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Ram Pai <linuxram@us.ibm.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
One use of the reiserfs_warning() macro in journal_init_dev() is missing
a parameter, causing the following warning:
REISERFS warning (device loop0): journal_init_dev: Cannot open '%s': %i journal_init_dev:
This also causes a WARN_ONCE() warning in the vsprintf code, and then a
panic if panic_on_warn is set.
Please remove unsupported %/ in format string
WARNING: CPU: 1 PID: 4480 at lib/vsprintf.c:2138 format_decode+0x77f/0x830 lib/vsprintf.c:2138
Kernel panic - not syncing: panic_on_warn set ...
Just add another string argument to the macro invocation.
While UBI and UBIFS seem to work at first sight with MLC NAND, you will
most likely lose all your data upon a power-cut or due to read/write
disturb.
In order to protect users from bad surprises, refuse to attach to MLC
NAND.
Cc: stable@vger.kernel.org Signed-off-by: Richard Weinberger <richard@nod.at> Acked-by: Boris Brezillon <boris.brezillon@bootlin.com> Acked-by: Artem Bityutskiy <dedekind1@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When opening a device with write access, ubiblock_open returns an error
code. Currently, this error code is -EPERM, but this is not the right
value.
The open function for other block devices returns -EROFS when opening
read-only devices with FMODE_WRITE set. When used with dm-verity, the
veritysetup userspace tool is expecting EROFS, and refuses to use the
ubiblock device.
Use -EROFS for ubiblock as well. As a result, veritysetup accepts the
ubiblock device as valid.
Cc: stable@vger.kernel.org Fixes: 9d54c8a33eec (UBI: R/O block driver on top of UBI volumes) Signed-off-by: Romain Izard <romain.izard.pro@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If ubifs_wbuf_sync() fails we must not write a master node with the
dirty marker cleared.
Otherwise it is possible that in case of an IO error while syncing we
mark the filesystem as clean and UBIFS refuses to recover upon next
mount.
Cc: <stable@vger.kernel.org> Fixes: 1e51764a3c2a ("UBIFS: add new flash file system") Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On receiving a packet the state index points to the rstate which must be
used to fill up IP and TCP headers. But if the state index points to a
rstate which is unitialized, i.e. filled with zeros, it gets stuck in an
infinite loop inside ip_fast_csum trying to compute the ip checsum of a
header with zero length.
./scripts/faddr2line vmlinux slhc_uncompress+0x464/0x468 output:
ip_fast_csum at arch/arm64/include/asm/checksum.h:40
(inlined by) slhc_uncompress at drivers/net/slip/slhc.c:615
Adding a variable to indicate if the current rstate is initialized. If
such a packet arrives, move to toss state.
Signed-off-by: Tejaswi Tanikella <tejaswit@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On an Output queue, both EMPTY and PENDING buffer states imply that the
buffer is ready for completion-processing by the upper-layer drivers.
So for a non-QEBSM Output queue, get_buf_states() merges mixed
batches of PENDING and EMPTY buffers into one large batch of EMPTY
buffers. The upper-layer driver (ie. qeth) later distuingishes PENDING
from EMPTY by inspecting the slsb_state for
QDIO_OUTBUF_STATE_FLAG_PENDING.
But the merge logic in get_buf_states() contains a bug that causes us to
erronously also merge ERROR buffers into such a batch of EMPTY buffers
(ERROR is 0xaf, EMPTY is 0xa1; so ERROR & EMPTY == EMPTY).
Effectively, most outbound ERROR buffers are currently discarded
silently and processed as if they had succeeded.
Note that this affects _all_ non-QEBSM device types, not just IQD with CQ.
Fix it by explicitly spelling out the exact conditions for merging.
For extracting the "get initial state" part out of the loop, this relies
on the fact that get_buf_states() is never called with a count of 0. The
QEBSM path already strictly requires this, and the two callers with
variable 'count' make sure of it.
Fixes: 104ea556ee7f ("qdio: support asynchronous delivery of storage blocks") Cc: <stable@vger.kernel.org> #v3.2+ Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Immediate retry of EQBS after CCQ 96 means that we potentially misreport
the state of buffers inspected during the first EQBS call.
This occurs when
1. the first EQBS finds all inspected buffers still in the initial state
set by the driver (ie INPUT EMPTY or OUTPUT PRIMED),
2. the EQBS terminates early with CCQ 96, and
3. by the time that the second EQBS comes around, the state of those
previously inspected buffers has changed.
If the state reported by the second EQBS is 'driver-owned', all we know
is that the previous buffers are driver-owned now as well. But we can't
tell if they all have the same state. So for instance
- the second EQBS reports OUTPUT EMPTY, but any number of the previous
buffers could be OUTPUT ERROR by now,
- the second EQBS reports OUTPUT ERROR, but any number of the previous
buffers could be OUTPUT EMPTY by now.
Effectively, this can result in both over- and underreporting of errors.
If the state reported by the second EQBS is 'HW-owned', that doesn't
guarantee that the previous buffers have not been switched to
driver-owned in the mean time. So for instance
- the second EQBS reports INPUT EMPTY, but any number of the previous
buffers could be INPUT PRIMED (or INPUT ERROR) by now.
This would result in failure to process pending work on the queue. If
it's the final check before yielding initiative, this can cause
a (temporary) queue stall due to IRQ avoidance.
Fixes: 25f269f17316 ("[S390] qdio: EQBS retry after CCQ 96") Cc: <stable@vger.kernel.org> #v3.2+ Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As found by the ubsan checker, the value of the 'index' variable can be
out of range for the bc[] array:
UBSAN: Undefined behaviour in arch/parisc/kernel/drivers.c:655:21
index 6 is out of range for type 'char [6]'
Backtrace:
[<104fa850>] __ubsan_handle_out_of_bounds+0x68/0x80
[<1019d83c>] check_parent+0xc0/0x170
[<1019d91c>] descend_children+0x30/0x6c
[<1059e164>] device_for_each_child+0x60/0x98
[<1019cd54>] parse_tree_node+0x40/0x54
[<1019d86c>] check_parent+0xf0/0x170
[<1019d91c>] descend_children+0x30/0x6c
[<1059e164>] device_for_each_child+0x60/0x98
[<1019d938>] descend_children+0x4c/0x6c
[<1059e164>] device_for_each_child+0x60/0x98
[<1019cd54>] parse_tree_node+0x40/0x54
[<1019cffc>] hwpath_to_device+0xa4/0xc4
At put_v4l2_window32(), it tries to access kp->clips. However,
kp points to an userspace pointer. So, it should be obtained
via get_user(), otherwise it can OOPS:
Revert "xhci: plat: Register shutdown for xhci_plat"
Pixel 2 field testers reported that when they tried to reboot their
phones with some USB devices plugged in, the reboot would get wedged and
eventually trigger watchdog reset. Once the Pixel kernel team found a
reliable repro case, they narrowed it down to this commit's 4.4.y
backport. Reverting the change made the issue go away.
While building ipv6 datagram we currently allow arbitrary large
extheaders, even beyond pmtu size. The syzbot has found a way
to exploit the above to trigger the following splat:
When a host fragments an IPv6 datagram, it MUST include the entire
IPv6 Header Chain in the First Fragment.
So this patch addresses the issue dropping datagrams with excessive
extheader length. It also updates the error path to report to the
calling socket nonnegative pmtu values.
The issue apparently predates git history.
v1 -> v2: cleanup error path, as per Eric's suggestion
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+91e6f9932ff122fa4410@syzkaller.appspotmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes a bug in the tcf_dump_walker function that can cause some actions
to not be reported when dumping a large number of actions. This issue
became more aggrevated when cookies feature was added. In particular
this issue is manifest when large cookie values are assigned to the
actions and when enough actions are created that the resulting table
must be dumped in multiple batches.
The number of actions returned in each batch is limited by the total
number of actions and the memory buffer size. With small cookies
the numeric limit is reached before the buffer size limit, which avoids
the code path triggering this bug. When large cookies are used buffer
fills before the numeric limit, and the erroneous code path is hit.
For example after creating 32 csum actions with the cookie aaaabbbbccccdddd
$ tc actions ls action csum
total acts 26
action order 0: csum (tcp) action continue
index 1 ref 1 bind 0
cookie aaaabbbbccccdddd
.....
action order 25: csum (tcp) action continue
index 26 ref 1 bind 0
cookie aaaabbbbccccdddd
total acts 6
action order 0: csum (tcp) action continue
index 28 ref 1 bind 0
cookie aaaabbbbccccdddd
......
action order 5: csum (tcp) action continue
index 32 ref 1 bind 0
cookie aaaabbbbccccdddd
Note that the action with index 27 is omitted from the report.
Fixes: 4b3550ef530c ("[NET_SCHED]: Use nla_nest_start/nla_nest_end")" Signed-off-by: Craig Dillabaugh <cdillaba@mojatatu.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
pci_set_drvdata() is called only after registering the net_device,
therefore we could run into a NPE if one of the functions using
driver_data is called before it's set.
Fix this by calling pci_set_drvdata() before registering the
net_device.
This fix is a candidate for stable. As far as I can see the
bug has been there in kernel version 3.2 already, therefore
I can't provide a reference which commit is fixed by it.
The fix may need small adjustments per kernel version because
due to other changes the label which is jumped to if
register_netdev() fails has changed over time.
Reported-by: David Miller <davem@davemloft.net> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use valid_name() to make sure user does not provide illegal
device name.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use valid_name() to make sure user does not provide illegal
device name.
Fixes: ed1efb2aefbb ("ipv6: Add support for IPsec virtual tunnel interfaces") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use dev_valid_name() to make sure user does not provide illegal
device name.
syzbot caught the following bug :
BUG: KASAN: stack-out-of-bounds in strlcpy include/linux/string.h:300 [inline]
BUG: KASAN: stack-out-of-bounds in ip6gre_tunnel_locate+0x334/0x860 net/ipv6/ip6_gre.c:339
Write of size 20 at addr ffff8801afb9f7b8 by task syzkaller851048/4466
Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use dev_valid_name() to make sure user does not provide illegal
device name.
syzbot caught the following bug :
BUG: KASAN: stack-out-of-bounds in strlcpy include/linux/string.h:300 [inline]
BUG: KASAN: stack-out-of-bounds in ipip6_tunnel_locate+0x63b/0xaa0 net/ipv6/sit.c:254
Write of size 33 at addr ffff8801b64076d8 by task syzkaller932654/4453
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use dev_valid_name() to make sure user does not provide illegal
device name.
syzbot caught the following bug :
BUG: KASAN: stack-out-of-bounds in strlcpy include/linux/string.h:300 [inline]
BUG: KASAN: stack-out-of-bounds in __ip_tunnel_create+0xca/0x6b0 net/ipv4/ip_tunnel.c:257
Write of size 20 at addr ffff8801ac79f810 by task syzkaller268107/4482
Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We want to use dev_valid_name() to validate tunnel names,
so better use strnlen(name, IFNAMSIZ) than strlen(name) to make
sure to not upset KASAN.
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When dev_set_promiscuity(1) succeeds but dev_set_allmulti(1) fails,
dev_set_promiscuity(-1) should be done before going to the err path.
Otherwise, dev->promiscuity will leak.
Fixes: 7e1a1ac1fbaa ("bonding: Check return of dev_set_promiscuity/allmulti") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is actually a dead lock caused by sync slave hwaddr from master when
the master is the slave's 'slave'. This dead loop check is actually done
by netdev_master_upper_dev_link. However, Commit 1f718f0f4f97 ("bonding:
populate neighbour's private on enslave") moved it after dev_mc_sync.
This patch is to fix it by moving dev_mc_sync after master_upper_dev_link,
so that this loop check would be earlier than dev_mc_sync. It also moves
if (mode == BOND_MODE_8023AD) into if (!bond_uses_primary) clause as an
improvement.
Note team driver also has this issue, I will fix it in another patch.
Fixes: 1f718f0f4f97 ("bonding: populate neighbour's private on enslave") Reported-by: Beniamino Galvani <bgalvani@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
vlan_vids_add_by_dev is called right after dev hwaddr sync, so on
the err path it should unsync dev hwaddr. Otherwise, the slave
dev's hwaddr will never be unsync when this err happens.
Fixes: 1ff412ad7714 ("bonding: change the bond's vlan syncing functions with the standard ones") Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We tried to remove vq poll from wait queue, but do not check whether
or not it was in a list before. This will lead double free. Fixing
this by switching to use vhost_poll_stop() which zeros poll->wqh after
removing poll from waitqueue to make sure it won't be freed twice.
Cc: Darren Kenny <darren.kenny@oracle.com> Reported-by: syzbot+c0272972b01b872e604a@syzkaller.appspotmail.com Fixes: 2b8b328b61c79 ("vhost_net: handle polling errors when setting backend") Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Local variable description: ----address@SYSC_bind
Variable was created at:
SYSC_bind+0x6f/0x4b0 net/socket.c:1461
SyS_bind+0x54/0x80 net/socket.c:1460
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: Neil Horman <nhorman@tuxdriver.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Local variable description: ----addr@___sys_recvmsg
Variable was created at:
___sys_recvmsg+0xd5/0x810 net/socket.c:2172
__sys_recvmmsg+0x54e/0xdb0 net/socket.c:2313
Bytes 8-15 of 16 are uninitialized
==================================================================
Kernel panic - not syncing: panic_on_warn set ...
Once dst has been cached in socket via sk_setup_caps(),
it is illegal to call ip_rt_put() (or dst_release()),
since sk_setup_caps() did not change dst refcount.
We can still dereference it since we hold socket lock.
Caugth by syzbot :
BUG: KASAN: use-after-free in atomic_dec_return include/asm-generic/atomic-instrumented.h:198 [inline]
BUG: KASAN: use-after-free in dst_release+0x27/0xa0 net/core/dst.c:185
Write of size 4 at addr ffff8801c54dc040 by task syz-executor4/20088
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
KMSAN reports use of uninitialized memory in the case when |alen| is
smaller than sizeof(struct sockaddr_nl), and therefore |nladdr| isn't
fully copied from the userspace.
Signed-off-by: Alexander Potapenko <glider@google.com> Fixes: 1da177e4c3f41524 ("Linux-2.6.12-rc2") Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
skb mac header is not necessarily set at the time skb_network_protocol()
is called. Use skb->data instead.
BUG: KASAN: slab-out-of-bounds in skb_network_protocol+0x46b/0x4b0 net/core/dev.c:2739
Read of size 2 at addr ffff8801b3097a0b by task syz-executor5/14242
Fixes: 19acc327258a ("gso: Handle Trans-Ether-Bridging protocol in skb_network_protocol()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Pravin B Shelar <pshelar@ovn.org> Reported-by: Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When dealing with key handling for shared futexes, we can drastically reduce
the usage/need of the page lock. 1) For anonymous pages, the associated futex
object is the mm_struct which does not require the page lock. 2) For inode
based, keys, we can check under RCU read lock if the page mapping is still
valid and take reference to the inode. This just leaves one rare race that
requires the page lock in the slow path when examining the swapcache.
Additionally realtime users currently have a problem with the page lock being
contended for unbounded periods of time during futex operations.
Task A
get_futex_key()
lock_page()
---> preempted
Now any other task trying to lock that page will have to wait until
task A gets scheduled back in, which is an unbound time.
With this patch, we pretty much have a lockless futex_get_key().
Experiments show that this patch can boost/speedup the hashing of shared
futexes with the perf futex benchmarks (which is good for measuring such
change) by up to 45% when there are high (> 100) thread counts on a 60 core
Westmere. Lower counts are pretty much in the noise range or less than 10%,
but mid range can be seen at over 30% overall throughput (hash ops/sec).
This makes anon-mem shared futexes much closer to its private counterpart.
Signed-off-by: Mel Gorman <mgorman@suse.de>
[ Ported on top of thp refcount rework, changelog, comments, fixes. ] Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Chris Mason <clm@fb.com> Cc: Darren Hart <dvhart@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: dave@stgolabs.net Link: http://lkml.kernel.org/r/1455045314-8305-3-git-send-email-dave@stgolabs.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Chenbo Feng <fengc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus pointed out that there is a much more efficient way of avoiding
the problem that we were trying to address in commit 9dfa7bba35ac0:
"fix race in drivers/char/random.c:get_reg()".
virtio_net: check return value of skb_to_sgvec in one more location
Kernels that do not have f6b10209b90d ("virtio-net: switch to use
build_skb() for small buffer") will have an extra call to skb_to_sgvec
that is not handled by e2fcad58fd23 ("virtio_net: check return value of
skb_to_sgvec always"). Since the former does not appear to be stable
material, just fix the call up directly.
Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[natechancellor: backport to 3.18] Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[natechancellor: backport to 3.18] Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
[nc: Adjust context due to lack of 000ae7b2690e2 and fca11ebde3f0] Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some devices have the control dlci stay in ADM mode instead of the UA
mode. This can seen at least on droid 4 when trying to open the ts
27.010 mux port. Enabling n_gsm debug mode shows the control dlci
always respond with DM to SABM instead of UA:
Note that this is different issue from other n_gsm -EL2HLT issues such
as timeouts when the control dlci does not respond at all.
The ADM mode seems to be a quite common according to "RF Wireless World"
article "GSM Issue-UE sends SABM and gets a DM response instead of
UA response":
This issue is most commonly observed in GSM networks where in UE sends
SABM and expects network to send UA response but it ends up receiving
DM response from the network. SABM stands for Set asynchronous balanced
mode, UA stands for Unnumbered Acknowledge and DA stands for
Disconnected Mode.
An RLP entity can be in one of two modes:
- Asynchronous Balanced Mode (ABM)
- Asynchronous Disconnected Mode (ADM)
Currently Linux kernel closes the control dlci after several retries
in gsm_dlci_t1() on DM. This causes n_gsm /dev/gsmtty ports to produce
error code -EL2HLT when trying to open them as the closing of control
dlci has already set gsm->dead.
Let's fix the issue by allowing control dlci stay in ADM mode after the
retries so the /dev/gsmtty ports can be opened and used. It seems that
it might take several attempts to get any response from the control
dlci, so it's best to allow ADM mode only after the SABM retries are
done.
Note that for droid 4 additional patches are needed to mux the ttyS0
pins and to toggle RTS gpio_149 to wake up the mdm6600 modem are also
needed to use n_gsm. And the mdm6600 modem needs to be powered on.
Cc: linux-serial@vger.kernel.org Cc: Alan Cox <alan@llwyncelyn.cymru> Cc: Jiri Prchal <jiri.prchal@aksignal.cz> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Marcel Partap <mpartap@gmx.net> Cc: Michael Scott <michael.scott@linaro.org> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Russ Gorby <russ.gorby@intel.com> Cc: Sascha Hauer <s.hauer@pengutronix.de> Cc: Sebastian Reichel <sre@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The status of SAS PHY is in sas_phy->enabled. There is an issue that the
status of a remote SAS PHY may be initialized incorrectly: if disable
remote SAS PHY through sysfs interface (such as echo 0 >
/sys/class/sas_phy/phy-1:0:0/enable), then reboot the system, and we
will find the status of remote SAS PHY which is disabled before is
1 (cat /sys/class/sas_phy/phy-1:0:0/enable). But actually the status of
remote SAS PHY is disabled and the device attached is not found.
In SAS protocol, NEGOTIATED LOGICAL LINK RATE field of DISCOVER response
is 0x1 when remote SAS PHY is disabled. So initialize sas_phy->enabled
according to the value of NEGOTIATED LOGICAL LINK RATE field.
Signed-off-by: chenxiang <chenxiang66@hisilicon.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The intend purpose here was to goto out if smp_execute_task() returned
error. Obviously something got screwed up. We will never get these link
error statistics below:
In such scenario that there are some flash only volumes
, and some cached devices, when many tasks request these devices in
writeback mode, the write IOs may fall to the same bucket as bellow:
| cached data | flash data | cached data | cached data| flash data|
then after writeback of these cached devices, the bucket would
be like bellow bucket:
| free | flash data | free | free | flash data |
So, there are many free space in this bucket, but since data of flash
only volumes still exists, so this bucket cannot be reclaimable,
which would cause waste of bucket space.
In this patch, we segregate flash only volume write streams from
cached devices, so data from flash only volumes and cached devices
can store in different buckets.
Compare to v1 patch, this patch do not add a additionally open bucket
list, and it is try best to segregate flash only volume write streams
from cached devices, sectors of flash only volumes may still be mixed
with dirty sectors of cached device, but the number is very small.
[mlyle: fixed commit log formatting, permissions, line endings]
Currently, when a cached device detaching from cache, writeback thread is
not stopped, and writeback_rate_update work is not canceled. For example,
after the following command:
echo 1 >/sys/block/sdb/bcache/detach
you can still see the writeback thread. Then you attach the device to the
cache again, bcache will create another writeback thread, for example,
after below command:
echo ba0fb5cd-658a-4533-9806-6ce166d883b9 > /sys/block/sdb/bcache/attach
then you will see 2 writeback threads.
This patch stops writeback thread and cancels writeback_rate_update work
when cached device detaching from cache.
Compare with patch v1, this v2 patch moves code down into the register
lock for safety in case of any future changes as Coly and Mike suggested.
An invalid opcode indicates something seriously wrong with the
input AML file. The AML parser is immediately confused and lost,
causing the resulting parse tree to be ill-formed. The actual
disassembly can then cause numerous unrelated errors and faults.
This change aborts the disassembly upon discovery of such an
opcode during the AML parse phase.
Link: https://github.com/acpica/acpica/commit/ed0389cb Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It is reported that on Linux, RTC driver complains wrong errors on
hardware reduced platform:
[ 4.085420] ACPI Warning: Could not enable fixed event - real_time_clock (4) (20160422/evxface-654)
This patch fixes this by correctly adding runtime reduced hardware check.
Reported by Chandan Tagore, fixed by Lv Zheng.
Link: https://github.com/acpica/acpica/commit/99bc3bec Tested-by: Chandan Tagore <tagore.chandan@gmail.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Broadcom BCM20702 Bluetooth controller in ThinkPad-T530 devices
report support for the Set Event Mask Page 2 command, but actually do
return an error when trying to use it.
Since these controllers do not support any feature that would require
the event mask page 2 to be modified, it is safe to not send this
command at all. The default value is all bits set to zero.
When an ldc control-only packet is received during data exchange in
read_nonraw(), a new rx head is calculated but the rx queue head is not
actually advanced (rx_set_head() is not called) and a branch is taken to
'no_data' at which point two things can happen depending on the value
of the newly calculated rx head and the current rx tail:
- If the rx queue is determined to be not empty, then the wrong packet
is picked up.
- If the rx queue is determined to be empty, then a read error (EAGAIN)
is eventually returned since it is falsely assumed that more data was
expected.
The fix is to update the rx head and return in case of a control only
packet during data exchange.
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Aaron Young <aaron.young@oracle.com> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com> Reviewed-by: Liam Merwick <liam.merwick@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This warning is caused by the lock held by sctp_getsockopt() is on one
socket, while the other lock that sctp_close() is getting later is on
the newly created (which failed) socket during peeloff operation.
This patch is to avoid this warning by use lock_sock with subclass
SINGLE_DEPTH_NESTING as Wang Cong and Marcelo's suggestion.
Reported-by: Dmitry Vyukov <dvyukov@google.com> Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
VF clients are configured as enforced, meaning firmware is validating
the correctness of their ethertype/vid during transmission.
Once txvlan is disabled, VF would start getting SKBs for transmission
here vlan is on the payload - but it'll pass the packet's ethertype
instead of the vid, leading to firmware declaring it as malicious.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The improved type-checking version of container_of() triggers a warning for
xchg_xen_ulong, pointing out that 'xen_ulong_t' is unsigned, but atomic64_t
contains a signed value:
drivers/xen/events/events_2l.c: In function 'evtchn_2l_handle_events':
drivers/xen/events/events_2l.c:187:1020: error: call to '__compiletime_assert_187' declared with attribute error: pointer type mismatch in container_of()
This adds a cast to work around the warning.
Cc: Ian Abbott <abbotti@mev.co.uk> Fixes: 85323a991d40 ("xen: arm: mandate EABI and use generic atomic operations.") Fixes: daa2ac80834d ("kernel.h: handle pointers to arrays better in container_of()") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Acked-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If a kernel modules is compressed, it should be decompressed before
running objdump to parse binary data correctly. This fixes a failure of
object code reading test for me.
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170608073109.30699-8-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes a problem where the AR8035 PHY can't be
detected on an Cisco Meraki MR24, if the ethernet cable is
not connected on boot.
Russell Senior provided steps to reproduce the issue:
|Disconnect ethernet cable, apply power, wait until device has booted,
|plug in ethernet, check for interfaces, no eth0 is listed.
|
|This appears to be a problem during probing of the AR8035 Phy chip.
|When ethernet has no link, the phy detection fails, and eth0 is not
|created. Plugging ethernet later has no effect, because there is no
|interface as far as the kernel is concerned. The relevant part of
|the boot log looks like this:
|this is the failing case:
|
|[ 0.876611] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode
|[ 0.882532] /plb/opb/ethernet@ef600c00: reset timeout
|[ 0.888546] /plb/opb/ethernet@ef600c00: can't find PHY!
|and the succeeding case:
|
|[ 0.876672] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode
|[ 0.883952] eth0: EMAC-0 /plb/opb/ethernet@ef600c00, MAC 00:01:..
|[ 0.890822] eth0: found Atheros 8035 Gigabit Ethernet PHY (0x01)
Based on the comment and the commit message of
commit 23fbb5a87c56 ("emac: Fix EMAC soft reset on 460EX/GT").
This is because the AR8035 PHY doesn't provide the TX Clock,
if the ethernet cable is not attached. This causes the reset
to timeout and the PHY detection code in emac_init_phy() is
unable to detect the AR8035 PHY. As a result, the emac driver
bails out early and the user left with no ethernet.
In order to stay compatible with existing configurations, the driver
tries the current reset approach at first. Only if the first attempt
timed out, it does perform one more retry with the clock temporarily
switched to the internal source for just the duration of the reset.
Cc: Chris Blake <chrisrblake93@gmail.com> Reported-by: Russell Senior <russell@personaltelco.net> Fixes: 23fbb5a87c56e98 ("emac: Fix EMAC soft reset on 460EX/GT") Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When ftrace is used with kprobes, it is possible for a kprobe to contain
an invalid location (ie. only initialised to 0 and not to a specific
location in the code). Trying to perform a cache flush on such location
leads to a crash r4k_flush_icache_range().
fixrange_init operates at PMD-granularity and expects the addresses to
be PMD-size aligned, but currently that might not be the case for
PKMAP_BASE unless it is defined properly, so ensure a correct alignment
is used before passing the address to fixrange_init.
fixed mappings: only align the start address that is passed to
fixrange_init rather than the value before adding the size, as we may
end up with uninitialised upper part of the range.