This patch fixes a NULL pointer dereference OOPs with pSCSI backends
within target_core_stat.c code. The bug is caused by a configfs attr
read if no pscsi_dev_virt->pdv_sd has been configured.
Reported-by: Olaf Hering <olaf@aepfle.de> Cc: <stable@vger.kernel.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
This patch fixes a iser specific logout bug where early complete()
of conn->conn_logout_comp in iscsit_close_connection() was causing
isert_wait4logout() to complete too soon, triggering a use after
free NULL pointer dereference of iscsi_conn memory.
The complete() was originally added for traditional iscsi-target
when a ISCSI_LOGOUT_OP failed in iscsi_target_rx_opcode(), but given
iser-target does not wait in logout failure, this special case needs
to be avoided.
This patch fixes a NULL pointer dereference triggered by a late
target_configure_device() -> alloc_workqueue() failure that results
in target_free_device() being called with DF_CONFIGURED already set,
which subsequently OOPses in destroy_workqueue() code.
Currently this only happens at modprobe target_core_mod time when
core_dev_setup_virtual_lun0() -> target_configure_device() fails,
and the explicit target_free_device() gets called.
To address this bug originally introduced by commit 0fd97ccf45, go
ahead and move DF_CONFIGURED to end of target_configure_device()
code to handle this special failure case.
ehrpwm tbclk is wrongly modelled as deriving from dpll_per_m2_ck.
The TRM says tbclk is derived from SYSCLKOUT. SYSCLKOUT nothing but the
functional clock of pwmss (l4ls_gclk).
Fix this by changing source of ehrpwmx_tbclk to l4ls_gclk.
Fixes: 4da1c67719f61 ("add tbclk data for ehrpwm") Signed-off-by: Vignesh R <vigneshr@ti.com> Acked-by: Tero Kristo <t-kristo@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
ehrpwm tbclk is wrongly modelled as deriving from dpll_per_m2_ck.
The TRM says tbclk is derived from SYSCLKOUT. SYSCLKOUT nothing but the
functional clock of pwmss (l4ls_gclk).
Fix this by changing source of ehrpwmx_tbclk to l4ls_gclk.
Fixes: 9e100ebafb91: ("Fix ehrpwm tbclk data") Signed-off-by: Vignesh R <vigneshr@ti.com> Acked-by: Tero Kristo <t-kristo@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Fixes: ee6c750761 (ARM: dts: dra7 clock data)
On DRA7x, For DPLL_IVA, the ref clock(CLKINP) is connected to sys_clk1 and
the bypass input(CLKINPULOW) is connected to iva_dpll_hs_clk_div clock.
But the bypass input is not directly routed to bypass clkout instead
both CLKINP and CLKINPULOW are connected to bypass clkout via a mux.
This mux is controlled by the bit - CM_CLKSEL_DPLL_IVA[23]:DPLL_BYP_CLKSEL
and it's POR value is zero which selects the CLKINP as bypass clkout.
which means iva_dpll_hs_clk_div is not the bypass clock for dpll_iva_ck
Fix this by adding another mux clock as parent in bypass mode.
This design is common to most of the PLLs and the rest have only one bypass
clock. Below is a list of the DPLLs that need this fix:
DPLL_IVA, DPLL_DDR,
DPLL_DSP, DPLL_EVE,
DPLL_GMAC, DPLL_PER,
DPLL_USB and DPLL_CORE
Signed-off-by: Ravikumar Kattekola <rk@ti.com> Acked-by: Tero Kristo <t-kristo@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
at91rm9200 standby and suspend to ram has been broken since 00482a4078f4. It is wrongly using AT91_BASE_SYS which is a physical address
and actually doesn't correspond to any register on at91rm9200.
Use the correct at91_ramc_base[0] instead.
Fixes: 00482a4078f4 (ARM: at91: implement the standby function for pm/cpuidle) Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
USB vbus 5V is from PMIC SWBST, so set swbst_reg as vbus's
parent reg, it fixed a bug that the voltage of vbus is incorrect
due to swbst_reg is disabled after boots up.
Prevent 16-bit APIC IDs being truncated by using correct mask. This fixes
booting large systems, where the wrong core would receive the startup and
init IPIs, causing hanging.
Because prior GCC versions always emitted NOPs on ALIGN directives, but
gcc5 started omitting them.
.LSTARTFDEDLSI1 says:
/* HACK: The dwarf2 unwind routines will subtract 1 from the
return address to get an address in the middle of the
presumed call instruction. Since we didn't get here via
a call, we need to include the nop before the real start
to make up for it. */
.long .LSTART_sigreturn-1-. /* PC-relative start address */
But commit 69d0627a7f6e ("x86 vDSO: reorder vdso32 code") from 2.6.25
replaced .org __kernel_vsyscall+32,0x90 by ALIGN right before
__kernel_sigreturn.
Of course, ALIGN need not generate any NOP in there. Esp. gcc5 collapses
vclock_gettime.o and int80.o together with no generated NOPs as "ALIGN".
So fix this by adding to that point at least a single NOP and make the
function ALIGN possibly with more NOPs then.
Kudos for reporting and diagnosing should go to Richard.
Reported-by: Richard Biener <rguenther@suse.de> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Acked-by: Andy Lutomirski <luto@amacapital.net> Cc: <stable@vger.kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1425543211-12542-1-git-send-email-jslaby@suse.cz Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
POWER supports irqfds but forgot to advertise them. Some userspace does
not check for the capability, but others check it---thus they work on
x86 and s390 but not POWER.
To avoid that other architectures in the future make the same mistake, let
common code handle KVM_CAP_IRQFD the same way as KVM_CAP_IRQFD_RESAMPLE.
Reported-and-tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Cc: stable@vger.kernel.org Fixes: 297e21053a52f060944e9f0de4c64fad9bcd72fc Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
drop_fpu() does clear_used_math() and usually this is correct
because tsk == current.
However switch_fpu_finish()->restore_fpu_checking() is called before
__switch_to() updates the "current_task" variable. If it fails,
we will wrongly clear the PF_USED_MATH flag of the previous task.
So use clear_stopped_child_used_math() instead.
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150309171041.GB11388@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
math_state_restore() assumes it is called with irqs disabled,
but this is not true if the caller is __restore_xstate_sig().
This means that if ia32_fxstate == T and __copy_from_user()
fails, __restore_xstate_sig() returns with irqs disabled too.
This triggers:
BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:41
dump_stack
___might_sleep
? _raw_spin_unlock_irqrestore
__might_sleep
down_read
? _raw_spin_unlock_irqrestore
print_vma_addr
signal_fault
sys32_rt_sigreturn
Change __restore_xstate_sig() to call set_used_math()
unconditionally. This avoids enabling and disabling interrupts
in math_state_restore(). If copy_from_user() fails, we can
simply do fpu_finit() by hand.
[ Note: this is only the first step. math_state_restore() should
not check used_math(), it should set this flag. While
init_fpu() should simply die. ]
Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20150307153844.GB25954@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The kernel crypto API logic requires the caller to provide the
length of (ciphertext || authentication tag) as cryptlen for the
AEAD decryption operation. Thus, the cipher implementation must
calculate the size of the plaintext output itself and cannot simply use
cryptlen.
The RFC4106 GCM decryption operation tries to overwrite cryptlen memory
in req->dst. As the destination buffer for decryption only needs to hold
the plaintext memory but cryptlen references the input buffer holding
(ciphertext || authentication tag), the assumption of the destination
buffer length in RFC4106 GCM operation leads to a too large size. This
patch simply uses the already calculated plaintext size.
In addition, this patch fixes the offset calculation of the AAD buffer
pointer: as mentioned before, cryptlen already includes the size of the
tag. Thus, the tag does not need to be added. With the addition, the AAD
will be written beyond the already allocated buffer.
Note, this fixes a kernel crash that can be triggered from user space
via AF_ALG(aead) -- simply use the libkcapi test application
from [1] and update it to use rfc4106-gcm-aes.
Using [1], the changes were tested using CAVS vectors to demonstrate
that the crypto operation still delivers the right results.
[1] http://www.chronox.de/libkcapi.html
CC: Tadeusz Struk <tadeusz.struk@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
This updates the bit sliced AES module to the latest version in the
upstream OpenSSL repository (e620e5ae37bc). This is needed to fix a
bug in the XTS decryption path, where data chunked in a certain way
could trigger the ciphertext stealing code, which is not supposed to
be active in the kernel build (The kernel implementation of XTS only
supports round multiples of the AES block size of 16 bytes, whereas
the conformant OpenSSL implementation of XTS supports inputs of
arbitrary size by applying ciphertext stealing). This is fixed in
the upstream version by adding the missing #ifndef XTS_CHAIN_TWEAK
around the offending instructions.
The upstream code also contains the change applied by Russell to
build the code unconditionally, i.e., even if __LINUX_ARM_ARCH__ < 7,
but implemented slightly differently.
Cc: stable@vger.kernel.org Fixes: e4e7f10bfc40 ("ARM: add support for bit sliced AES using NEON instructions") Reported-by: Adrian Kotelba <adrian.kotelba@gmail.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Tested-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
As pointed by recent post[1] on exploiting DRAM physical imperfection,
/proc/PID/pagemap exposes sensitive information which can be used to do
attacks.
This disallows anybody without CAP_SYS_ADMIN to read the pagemap.
On the Cortex-A9-based Armada SoCs, the MPIC is not the primary interrupt
controller. Yet, it still has to handle some per-cpu interrupt.
To do so, it is chained with the GIC using a per-cpu interrupt. However, the
current code only call irq_set_chained_handler, which is called and enable that
interrupt only on the boot CPU, which means that the parent per-CPU interrupt
is never unmasked on the secondary CPUs, preventing the per-CPU interrupt to
actually work as expected.
This was not seen until now since the only MPIC PPI users were the Marvell
timers that were not working, but not used either since the system use the ARM
TWD by default, and the ethernet controllers, that are faking there interrupts
as SPI, and don't really expect to have interrupts on the secondary cores
anyway.
Add a CPU notifier that will enable the PPI on the secondary cores when they
are brought up.
When printing the driver_override parameter when it is 4095 and 4094 bytes
long, the printing code would access invalid memory because we need count+1
bytes for printing.
Fixes: 782a985d7af2 ("PCI: Introduce new device binding path using pci_dev.driver_override") Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> CC: stable@vger.kernel.org # v3.16+ CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> CC: Alexander Graf <agraf@suse.de> CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
It's directly caused by 89d3cf6 [SCSI] libsas: add mutex for SMP task
execution, but shows a deeper cause: expander functions expect to be able to
cast to and treat domain devices as expanders. The correct fix is to only do
expander discover when we know we've got an expander device to avoid wrongly
casting a non-expander device.
AIO_PREAD requests call ->aio_read() with iovec on caller's stack, so if
we are going to access it asynchronously, we'd better get ourselves
a copy - the one on kernel stack of aio_run_iocb() won't be there
anymore. function/f_fs.c take care of doing that, legacy/inode.c
doesn't...
Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Otherwise the guest can abuse that control to cause e.g. PCIe
Unsupported Request responses by disabling memory and/or I/O decoding
and subsequently causing (CPU side) accesses to the respective address
ranges, which (depending on system configuration) may be fatal to the
host.
Note that to alter any of the bits collected together as
PCI_COMMAND_GUEST permissive mode is now required to be enabled
globally or on the specific device.
This is CVE-2015-2150 / XSA-120.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Using the pvops kernel a NULL pointer dereference was detected on a
large machine (144 processors) when booting as dom0 in
evtchn_fifo_unmask() during assignment of a pirq.
The event channel in question was the first to need a new entry in
event_array[] in events_fifo.c. Unfortunately xen_irq_info_pirq_setup()
is called with evtchn being 0 for a new pirq and the real event channel
number is assigned to the pirq only during __startup_pirq().
It is mandatory to call xen_evtchn_port_setup() after assigning the
event channel number to the pirq to make sure all memory needed for the
event channel is allocated.
Commit df9e26d093d3 ("rtc: s3c: add support for RTC of Exynos3250 SoC")
added an "rtc_src" DT property to specify the clock used as a source to
the S3C real-time clock.
Not all SoCs needs this so commit eaf3a659086e ("drivers/rtc/rtc-s3c.c:
fix initialization failure without rtc source clock") changed to check
the struct s3c_rtc_data .needs_src_clk to conditionally grab the clock.
But that commit didn't update the data for each IP version so the RTC
broke on the boards that needs a source clock. This is the case of at
least Exynos5250 and Exynos5440 which uses the s3c6410 RTC IP block.
This commit fixes the S3C rtc on the Exynos5250 Snow and Exynos5420
Peach Pit and Pi Chromebooks.
Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Chanwoo Choi <cw00.choi@samsung.com> Cc: Doug Anderson <dianders@chromium.org> Cc: Olof Johansson <olof@lixom.net> Cc: Kevin Hilman <khilman@linaro.org> Cc: Tyler Baker <tyler.baker@linaro.org> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The internal framebuffers we create to remap legacy cursor ioctls to
plane operations for the universal plane support shouldn't be linke to
the file like normal userspace framebuffers. This bug goes back to the
original universal cursor plane support introduced in
drm: Support legacy cursor ioctls via universal planes when possible (v4)
The isn't too disastrous since fbs are small, we only create one when the
cursor bo gets changed and ultimately they'll be reaped when the window
server restarts.
Conceptually we'd want to just pass NULL for file_priv when creating it,
but the driver needs the file to lookup the underlying buffer object for
cursor id. Instead let's move the file_priv linking out of
add_framebuffer_internal() into the addfb ioctl implementation, which is
the only place it is needed. And also rename the function for a more
accurate since it only creates the fb, but doesn't add it anywhere.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> (fix & commit msg) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (provider of lipstick) Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Rob Clark <robdclark@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Experimental lockdep annotation added to the TTM lock has unveiled a
couple of lock dependency violations in the vmwgfx driver. In both
cases it turns out that the device_private::reservation_sem is not
needed so the offending code is moved out of that lock.
To take down the MOB and GMR memory types, the driver may have to issue
fence objects and thus make sure that the fence manager is taken down
after those memory types.
Reorder device init accordingly.
This reverts commit e4df3a0b6228
("i2c: core: Dispose OF IRQ mapping at client removal time")
Calling irq_dispose_mapping() will destroy the mapping and disassociate
the IRQ from the IRQ chip to which it belongs. Keeping it is OK, because
existent mappings are reused properly.
Also, this commit breaks drivers using devm* for IRQ management on
OF-based systems because devm* cleanup happens in device code, after
bus's remove() method returns.
According to a report from Yuxuan Shui, nilfs2 in kernel 3.19 got stuck
during recovery at mount time. The code path that caused the deadlock was
as follows:
nilfs_fill_super()
load_nilfs()
nilfs_salvage_orphan_logs()
* Do roll-forwarding, attach segment constructor for recovery,
and kick it.
nilfs_segctor_thread()
nilfs_segctor_thread_construct()
* A lock is held with nilfs_transaction_lock()
nilfs_segctor_do_construct()
nilfs_segctor_drop_written_files()
iput()
iput_final()
write_inode_now()
writeback_single_inode()
__writeback_single_inode()
do_writepages()
nilfs_writepage()
nilfs_construct_dsync_segment()
nilfs_transaction_lock() --> deadlock
This can happen if commit 7ef3ff2fea8b ("nilfs2: fix deadlock of segment
constructor over I_SYNC flag") is applied and roll-forward recovery was
performed at mount time. The roll-forward recovery can happen if datasync
write is done and the file system crashes immediately after that. For
instance, we can reproduce the issue with the following steps:
< nilfs2 is mounted on /nilfs (device: /dev/sdb1) >
# dd if=/dev/zero of=/nilfs/test bs=4k count=1 && sync
# dd if=/dev/zero of=/nilfs/test conv=notrunc oflag=dsync bs=4k
count=1 && reboot -nfh
< the system will immediately reboot >
# mount -t nilfs2 /dev/sdb1 /nilfs
The deadlock occurs because iput() can run segment constructor through
writeback_single_inode() if MS_ACTIVE flag is not set on sb->s_flags. The
above commit changed segment constructor so that it calls iput()
asynchronously for inodes with i_nlink == 0, but that change was
imperfect.
This fixes the another deadlock by deferring iput() in segment constructor
even for the case that mount is not finished, that is, for the case that
MS_ACTIVE flag is not set.
Normally _regulator_do_enable() isn't called on an already-enabled
rdev. That's because the main caller, _regulator_enable() always
calls _regulator_is_enabled() and only calls _regulator_do_enable() if
the rdev was not already enabled.
However, there is one caller of _regulator_do_enable() that doesn't
check: regulator_suspend_finish(). While we might want to make
regulator_suspend_finish() behave more like _regulator_enable(), it's
probably also a good idea to make _regulator_do_enable() robust if it
is called on an already enabled rdev.
At the moment, _regulator_do_enable() is _not_ robust for already
enabled rdevs if we're using an ena_pin. Each time
_regulator_do_enable() is called for an rdev using an ena_pin the
reference count of the ena_pin is incremented even if the rdev was
already enabled. This is not as intended because the ena_pin is for
something else: for keeping track of how many active rdevs there are
sharing the same ena_pin.
Here's how the reference counting works here:
* Each time _regulator_enable() is called we increment
rdev->use_count, so _regulator_enable() calls need to be balanced
with _regulator_disable() calls.
* There is no explicit reference counting in _regulator_do_enable()
which is normally just a warapper around rdev->desc->ops->enable()
with code for supporting delays. It's not expected that the
"ops->enable()" call do reference counting.
* Since regulator_ena_gpio_ctrl() does have reference counting
(handling the sharing of the pin amongst multiple rdevs), we
shouldn't call it if the current rdev is already enabled.
Note that as part of this we cleanup (remove) the initting of
ena_gpio_state in regulator_register(). In _regulator_do_enable(),
_regulator_do_disable() and _regulator_is_enabled() is is clear that
ena_gpio_state should be the state of whether this particular rdev has
requested the GPIO be enabled. regulator_register() was initting it
as the actual state of the pin.
Fixes: 967cfb18c0e3 ("regulator: core: manage enable GPIO list") Signed-off-by: Doug Anderson <dianders@chromium.org> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The _regulator_do_enable() call ought to be a no-op when called on an
already-enabled regulator. However, as an optimization
_regulator_enable() doesn't call _regulator_do_enable() on an already
enabled regulator. That means we never test the case of calling
_regulator_do_enable() during normal usage and there may be hidden
bugs or warnings. We have seen warnings issued by the tps65090 driver
and bugs when using the GPIO enable pin.
Let's match the same optimization that _regulator_enable() in
regulator_suspend_finish(). That may speed up suspend/resume and also
avoids exposing hidden bugs.
[Use much clearer commit message from Doug Anderson]
Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The LDOs are documented in the rk808 datasheet to have a soft start
time of 400us. Add that to the driver. If this time takes longer on
a certain board the device tree should be able to override with
"regulator-enable-ramp-delay".
This fixes some dw_mmc probing problems (together with other patches
posted to the mmc maiing lists) on rk3288.
Signed-off-by: Doug Anderson <dianders@chromium.org> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
EEH recovery for bnx2x based adapters is not reliable on all Power
systems using the default hot reset, which can result in an
unrecoverable EEH error. Forcing the use of fundamental reset
during EEH recovery fixes this.
Cc: stable<stable@vger.kernel.org> Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The NDDB register holds the data that are needed by the read and write
commands.
However, during a read PIO access, the datasheet specifies that after each 32
bytes read in that register, when BCH is enabled, we have to make sure that the
RDDREQ bit is set in the NDSR register.
This fixes an issue that was seen on the Armada 385, and presumably other mvebu
SoCs, when a read on a newly erased page would end up in the driver reporting a
timeout from the NAND.
Cc: <stable@vger.kernel.org> # v3.14 Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com> Acked-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The commit [ef403edb7558: ALSA: hda - Don't access stereo amps for
mono channel widgets] fixed the handling of mono widgets in general,
but it still misses an exceptional case: namely, a mono mixer widget
taking a single stereo input. In this case, it has stereo volumes
although it's a mono widget, and thus we have to take care of both
left and right input channels, as stated in HD-audio spec ("7.1.3
Widget Interconnection Rules").
This patch covers this missing piece by adding proper checks of stereo
amps in both the generic parser and the proc output codes.
The commit [63e51fd708f5: ALSA: hda - Don't take unresponsive D3
transition too serious] introduced a conditional fallback behavior to
the HD-audio controller depending on the flag set. However, it
introduced a silly bug, too, that the flag was evaluated in a reverse
way. This resulted in a regression of HD-audio controller driver
where it can't go to the fallback mode at communication errors.
Unfortunately (or fortunately?) this didn't come up until recently
because the affected code path is an error handling that happens only
on an unstable hardware chip. Most of recent chips work stably, thus
they didn't hit this problem. Now, we've got a regression report with
a VIA chip, and this seems indeed requiring the fallback to the
polling mode, and finally the bug was revealed.
The fix is a oneliner to remove the wrong logical NOT in the check.
(Lesson learned - be careful about double negation.)
The bug should be backported to stable, but the patch won't be
applicable to 3.13 or earlier because of the code splits. The stable
fix patches for earlier kernels will be posted later manually.
MacBook Air 5,2 has the same problem as MacBook Pro 8,1 where the
built-in mic records only the right channel. Apply the same
workaround as MBP8,1 to spread the mono channel via a Cirrus codec
vendor-specific COEF setup.
CS420x codecs seem to deal only the single amps of ADC nodes even
though the nodes receive multiple inputs. This leads to the
inconsistent amp value after S3/S4 resume, for example.
The fix is just to set codec->single_adc_amp flag. Then the driver
handles these ADC amps as if single connections.
The current HDA generic parser initializes / modifies the amp values
always in stereo, but this seems causing the problem on ALC3229 codec
that has a few mono channel widgets: namely, these mono widgets react
to actions for both channels equally.
In the driver code, we do care the mono channel and create a control
only for the left channel (as defined in HD-audio spec) for such a
node. When the control is updated, only the left channel value is
changed. However, in the resume, the right channel value is also
restored from the initial value we took as stereo, and this overwrites
the left channel value. This ends up being the silent output as the
right channel has been never touched and remains muted.
This patch covers the places where unconditional stereo amp accesses
are done and converts to the conditional accesses.
Compaq Presario CQ60 laptop with CX20561 gives a wrong pin for the
built-in mic NID 0x17 instead of NID 0x1d, and it results in the
non-working mic. This patch just remaps the pin correctly via fixup.
There was no check about the id string of user control elements, so we
accepted even a control element with an empty string, which is
obviously bogus. This patch adds more sanity checks of id strings.
The device complies to the UAC1 standard but hides that fact with
proprietary descriptors. The autodetect quirk for Roland devices
catches the audio interface but misses the MIDI part, so a specific
quirk is needed.
Commit fd316941c ("spi/pl022: disable port when unused") introduced a race,
which leads to possible driver lock up (easily reproducible on SMP).
The problem happens in giveback() function where the completion of the transfer
is signalled to SPI subsystem and then the HW SPI controller is disabled. Another
transfer might be setup in between, which brings driver in locked-up state.
Lockup! SPI controller is disabled and the data will never be received. Whole
SPI subsystem is waiting for transfer ACK and blocked.
So, only signal transfer completion after disabling the controller.
Fixes: fd316941c (spi/pl022: disable port when unused) Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Additionally to the current DMA transfer the PDC allows to set up a next DMA
transfer. This is useful for larger SPI transfers.
The driver currently waits for ENDRX as end of the transfer. But ENDRX is set
when the current DMA transfer is done (RCR = 0), i.e. it doesn't include the
next DMA transfer.
Thus a subsequent SPI transfer could be started although there is currently a
transfer in progress. This can cause invalid accesses to the SPI slave devices
and to SPI transfer errors.
This issue has been observed on a hardware with a M25P128 SPI NOR flash.
So instead of ENDRX we should wait for RXBUFF. This flag is set if there is
no more DMA transfer in progress (RCR = RNCR = 0).
Signed-off-by: Torsten Fleischer <torfl6749@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Problem: When IMA and VTPM are both enabled in kernel config,
kernel hangs during bootup on LE OS.
Why?: IMA calls tpm_pcr_read() which results in tpm_ibmvtpm_send
and tpm_ibmtpm_recv getting called. A trace showed that
tpm_ibmtpm_recv was hanging.
Resolution: tpm_ibmtpm_recv was hanging because tpm_ibmvtpm_send
was sending CRQ message that probably did not make much sense
to phype because of Endianness. The fix below sends correctly
converted CRQ for LE. This was not caught before because it
seems IMA is not enabled by default in kernel config and
IMA exercises this particular code path in vtpm.
Tested with IMA and VTPM enabled in kernel config and VTPM
enabled on both a BE OS and a LE OS ppc64 lpar. This exercised
CRQ and TPM command code paths in vtpm.
Patch is against Peter's tpmdd tree on github which included
Vicky's previous vtpm le patches.
Signed-off-by: Joy Latten <jmlatten@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> # eb71f8a5e33f: "Added Little Endian support to vtpm module" Cc: <stable@vger.kernel.org> Reviewed-by: Ashley Lai <ashley@ahsleylai.com> Signed-off-by: Peter Huewe <peterhuewe@gmx.de> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The cpuset.sched_relax_domain_level can control how far we do
immediate load balancing on a system. However, it was found on recent
kernels that echo'ing a value into cpuset.sched_relax_domain_level
did not reduce any immediate load balancing.
The reason this occurred was because the update_domain_attr_tree() traversal
did not update for the "top_cpuset". This resulted in nothing being changed
when modifying the sched_relax_domain_level parameter.
This patch is able to address that problem by having update_domain_attr_tree()
allow updates for the root in the cpuset traversal.
cancel[_delayed]_work_sync() are implemented using
__cancel_work_timer() which grabs the PENDING bit using
try_to_grab_pending() and then flushes the work item with PENDING set
to prevent the on-going execution of the work item from requeueing
itself.
try_to_grab_pending() can always grab PENDING bit without blocking
except when someone else is doing the above flushing during
cancelation. In that case, try_to_grab_pending() returns -ENOENT. In
this case, __cancel_work_timer() currently invokes flush_work(). The
assumption is that the completion of the work item is what the other
canceling task would be waiting for too and thus waiting for the same
condition and retrying should allow forward progress without excessive
busy looping
Unfortunately, this doesn't work if preemption is disabled or the
latter task has real time priority. Let's say task A just got woken
up from flush_work() by the completion of the target work item. If,
before task A starts executing, task B gets scheduled and invokes
__cancel_work_timer() on the same work item, its try_to_grab_pending()
will return -ENOENT as the work item is still being canceled by task A
and flush_work() will also immediately return false as the work item
is no longer executing. This puts task B in a busy loop possibly
preventing task A from executing and clearing the canceling state on
the work item leading to a hang.
task A task B worker
executing work
__cancel_work_timer()
try_to_grab_pending()
set work CANCELING
flush_work()
block for work completion
completion, wakes up A
__cancel_work_timer()
while (forever) {
try_to_grab_pending()
-ENOENT as work is being canceled
flush_work()
false as work is no longer executing
}
This patch removes the possible hang by updating __cancel_work_timer()
to explicitly wait for clearing of CANCELING rather than invoking
flush_work() after try_to_grab_pending() fails with -ENOENT.
Link: http://lkml.kernel.org/g/20150206171156.GA8942@axis.com
v3: bit_waitqueue() can't be used for work items defined in vmalloc
area. Switched to custom wake function which matches the target
work item and exclusive wait and wakeup.
v2: v1 used wake_up() on bit_waitqueue() which leads to NULL deref if
the target bit waitqueue has wait_bit_queue's on it. Use
DEFINE_WAIT_BIT() and __wake_up_bit() instead. Reported by Tomeu
Vizoso.
The Kvaser firmware can only read and write messages that are
not crossing the USB endpoint's wMaxPacketSize boundary. While
receiving commands from the CAN device, if the next command in
the same URB buffer crossed that max packet size boundary, the
firmware puts a zero-length placeholder command in its place
then moves the real command to the next boundary mark.
The driver did not recognize such behavior, leading to missing
a good number of rx events during a heavy rx load session.
Moreover, a tx URB context only gets freed upon receiving its
respective tx ACK event. Over time, the free tx URB contexts
pool gets depleted due to the missing ACK events. Consequently,
the netif transmission queue gets __permanently__ stopped; no
frames could be sent again except after restarting the CAN
newtwork interface.
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Some archs (specifically PowerPC), are sensitive with the ordering of
the enabling of the calls to function tracing and setting of the
function to use to be traced.
That is, update_ftrace_function() sets what function the ftrace_caller
trampoline should call. Some archs require this to be set before
calling ftrace_run_update_code().
Another bug was discovered, that ftrace_startup_sysctl() called
ftrace_run_update_code() directly. If the function the ftrace_caller
trampoline changes, then it will not be updated. Instead a call
to ftrace_startup_enable() should be called because it tests to see
if the callback changed since the code was disabled, and will
tell the arch to update appropriately. Most archs do not need this
notification, but PowerPC does.
The problem could be seen by the following commands:
When ftrace is enabled globally through the proc interface, we must check if
ftrace_graph_active is set. If it is set, then we should also pass the
FTRACE_START_FUNC_RET command to ftrace_run_update_code(). Similarly, when
ftrace is disabled globally through the proc interface, we must check if
ftrace_graph_active is set. If it is set, then we should also pass the
FTRACE_STOP_FUNC_RET command to ftrace_run_update_code().
Since ftrace_enabled = 0, ftrace_enable_ftrace_graph_caller() is never
called.
# echo 1 > /proc/sys/kernel/ftrace_enabled
Now ftrace_enabled will be set to true, but still
ftrace_enable_ftrace_graph_caller() will not be called, which is not
desired.
Further if we execute the following after this:
# echo nop > /sys/kernel/debug/tracing/current_tracer
Now since ftrace_enabled is set it will call
ftrace_disable_ftrace_graph_caller(), which causes a kernel warning on
the ARM platform.
On the ARM platform, when ftrace_enable_ftrace_graph_caller() is called,
it checks whether the old instruction is a nop or not. If it's not a nop,
then it returns an error. If it is a nop then it replaces instruction at
that address with a branch to ftrace_graph_caller.
ftrace_disable_ftrace_graph_caller() behaves just the opposite. Therefore,
if generic ftrace code ever calls either ftrace_enable_ftrace_graph_caller()
or ftrace_disable_ftrace_graph_caller() consecutively two times in a row,
then it will return an error, which will cause the generic ftrace code to
raise a warning.
Note, x86 does not have an issue with this because the architecture
specific code for ftrace_enable_ftrace_graph_caller() and
ftrace_disable_ftrace_graph_caller() does not check the previous state,
and calling either of these functions twice in a row has no ill effect.
Link: http://lkml.kernel.org/r/e4fbe64cdac0dd0e86a3bf914b0f83c0b419f146.1425666454.git.panand@redhat.com Cc: stable@vger.kernel.org # 2.6.31+ Signed-off-by: Pratyush Anand <panand@redhat.com>
[
removed extra if (ftrace_start_up) and defined ftrace_graph_active as 0
if CONFIG_FUNCTION_GRAPH_TRACER is not set.
] Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
When /proc/sys/kernel/ftrace_enabled is set to zero, all function
tracing is disabled. But the records that represent the functions
still hold information about the ftrace_ops that are hooked to them.
ftrace_ops may request "REGS" (have a full set of pt_regs passed to
the callback), or "TRAMP" (the ops has its own trampoline to use).
When the record is updated to represent the state of the ops hooked
to it, it sets "REGS_EN" and/or "TRAMP_EN" to state that the callback
points to the correct trampoline (REGS has its own trampoline).
When ftrace_enabled is set to zero, all ftrace locations are a nop,
so they do not point to any trampoline. But the _EN flags are still
set. This can cause the accounting to go wrong when ftrace_enabled
is cleared and an ops that has a trampoline is registered or unregistered.
For example, the following will cause ftrace to crash:
As function_graph uses a trampoline, when ftrace_enabled is set to zero
the updates to the record are not done. When enabling function_graph
again, the record will still have the TRAMP_EN flag set, and it will
look for an op that has a trampoline other than the function_graph
ops, and fail to find one.
when multiport is off, virtio console invokes config access from irq
context, config access is blocking on s390.
Fix this up by scheduling work from config irq - similar to what we do
for multiport configs.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: stable@kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
when multiport is off, we don't initialize config work,
but we then cancel uninitialized control_work on freeze.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: stable@kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
commit 6ae9200f2cab7 ("enlarge console.name") increased the storage
for the console name to 16 bytes, but not the corresponding
struct console_cmdline::name storage. Console names longer than
8 bytes cause read beyond end-of-string and failure to match
console; I'm not sure if there are other unexpected consequences.
Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit 2cf6258c282608633c9477e5c8e319e948565122)
The 8250_dw i/o accessors try to write a console error message if the
LCR workaround was unsuccessful. When the port->lock is already held
(eg., when called from serial8250_set_termios()), this deadlocks.
Make the error message a FIXME until a general solution is devised.
Cc: Tim Kryger <tim.kryger@gmail.com> Reported-by: Zhang Zhen <zhenzhang.zhang@huawei.com> Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
If the part of the compression data are corrupted, or the compression
data is totally fake, the memory access over the limit is possible.
This is the log from my system usning lz4 decompression.
[6502]data abort, halting
[6503]r0 0x00000000 r1 0x00000000 r2 0xdcea0ffc r3 0xdcea0ffc
[6509]r4 0xb9ab0bfd r5 0xdcea0ffc r6 0xdcea0ff8 r7 0xdce80000
[6515]r8 0x00000000 r9 0x00000000 r10 0x00000000 r11 0xb9a98000
[6522]r12 0xdcea1000 usp 0x00000000 ulr 0x00000000 pc 0x820149bc
[6528]spsr 0x400001f3
and the memory addresses of some variables at the moment are
ref:0xdcea0ffc, op:0xdcea0ffc, oend:0xdcea1000
As you can see, COPYLENGH is 8bytes, so @ref and @op can access the momory
over @oend.
radeon_bo_create() calls radeon_ttm_placement_from_domain()
before ttm_bo_init() is called. radeon_ttm_placement_from_domain()
uses the ttm bo size to determine when to select top down
allocation but since the ttm bo is not initialized yet the
check is always false. It only took effect when buffers
were validated later. It also seemed to regress suspend
and resume on some systems possibly due to it not
taking effect in radeon_bo_create().
radeon_bo_create() and radeon_ttm_placement_from_domain()
need to be reworked substantially for this to be optimally
effective. Re-enable it at that point.
Noticed-by: Oded Gabbay <oded.gabbay@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
A normal wait adds to the front of the tail. By doing something
similar to fence_default_wait the fence code can run without racing.
This is a complete fix for "panic on suspend from KDE with radeon",
and a partial fix for "Radeon: System pauses on TAHITI". On tahiti
si_irq_set needs to be fixed too, to completely flush the writes
before radeon_fence_activity is called in radeon_fence_enable_signaling.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=90741
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=90861 Signed-off-by: Maarten Lankhorst <maarten.lankhorst@ubuntu.com> Reported-by: Jon Arne Jørgensen <jonjon.arnearne@gmail.com> Reported-and-tested-by: Gustaw Smolarczyk <wielkiegie@gmail.com> Cc: stable@vger.kernel.org (v3.18+) Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The ARM architecture allows the caching of intermediate page table
levels and page table freeing requires a sequence like:
pmd_clear()
TLB invalidation
pte page freeing
With commit 5e5f6dc10546 (arm64: mm: enable HAVE_RCU_TABLE_FREE logic),
the page table freeing batching was moved from tlb_remove_page() to
tlb_remove_table(). The former takes care of TLB invalidation as this is
also shared with pte clearing and page cache page freeing. The latter,
however, does not invalidate the TLBs for intermediate page table levels
as it probably relies on the architecture code to do it if required.
When the mm->mm_users < 2, tlb_remove_table() does not do any batching
and page table pages are freed before tlb_finish_mmu() which performs
the actual TLB invalidation.
This patch introduces __tlb_flush_pgtable() for arm64 and calls it from
the {pte,pmd,pud}_free_tlb() directly without relying on deferred page
table freeing.
Fixes: 5e5f6dc10546 arm64: mm: enable HAVE_RCU_TABLE_FREE logic Reported-by: Jon Masters <jcm@redhat.com> Tested-by: Jon Masters <jcm@redhat.com> Tested-by: Steve Capper <steve.capper@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
On architectures with hardware broadcasting of TLB invalidation messages
, it makes sense to reduce the range of the mmu_gather structure when
unmapping page ranges based on the dirty address information passed to
tlb_remove_tlb_entry.
arm64 already does this by directly manipulating the start/end fields
of the gather structure, but this confuses the generic code which
does not expect these fields to change and can end up calculating
invalid, negative ranges when forcing a flush in zap_pte_range.
This patch moves the minimal range calculation out of the arm64 code
and into the generic implementation, simplifying zap_pte_range in the
process (which no longer needs to care about start/end, since they will
point to the appropriate ranges already). With the range being tracked
by core code, the need_flush flag is dropped in favour of checking that
the end of the range has actually been set.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King - ARM Linux <linux@arm.linux.org.uk> Cc: Michal Simek <monstr@monstr.eu> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
While working on sk_forward_alloc problems reported by Denys
Fedoryshchenko, we found that tcp connect() (and fastopen) do not call
sk_wmem_schedule() for SYN packet (and/or SYN/DATA packet), so
sk_forward_alloc is negative while connect is in progress.
We can fix this by calling regular sk_stream_alloc_skb() both for the
SYN packet (in tcp_connect()) and the syn_data packet in
tcp_send_syn_data()
Then, tcp_send_syn_data() can avoid copying syn_data as we simply
can manipulate syn_data->cb[] to remove SYN flag (and increment seq)
Instead of open coding memcpy_fromiovecend(), simply use this helper.
This leaves in socket write queue clean fast clone skbs.
This was tested against our fastopen packetdrill tests.
Reported-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Commit db31c55a6fb2 (net: clamp ->msg_namelen instead of returning an
error) introduced the clamping of msg_namelen when the unsigned value
was larger than sizeof(struct sockaddr_storage). This caused a
msg_namelen of -1 to be valid. The native code was subsequently fixed by
commit dbb490b96584 (net: socket: error on a negative msg_namelen).
In addition, the native code sets msg_namelen to 0 when msg_name is
NULL. This was done in commit (6a2a2b3ae075 net:socket: set msg_namelen
to 0 if msg_name is passed as NULL in msghdr struct from userland) and
subsequently updated by 08adb7dabd48 (fold verify_iovec() into
copy_msghdr_from_user()).
This patch brings the get_compat_msghdr() in line with
copy_msghdr_from_user().
Fixes: db31c55a6fb2 (net: clamp ->msg_namelen instead of returning an error) Cc: David S. Miller <davem@davemloft.net> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
tcp_send_fin() does not account for the memory it allocates properly, so
sk_forward_alloc can be negative in cases where we've sent a FIN:
ss example output (ss -amn | grep -B1 f4294):
tcp FIN-WAIT-1 0 1 192.168.0.1:45520 192.0.2.1:8080
skmem:(r0,rb87380,t0,tb87380,f4294966016,w1280,o0,bl0) Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
for throw routes to trigger evaluation of other policy rules
EAGAIN needs to be propagated up to fib_rules_lookup
similar to how its done for IPv4
A simple testcase for verification is:
ip -6 rule add lookup 33333 priority 33333
ip -6 route add throw 2001:db8::1
ip -6 route add 2001:db8::1 via fe80::1 dev wlan0 table 33333
ip route get 2001:db8::1
Signed-off-by: Steven Barth <cyrus@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The custom USB_DEVICE_CLASS macro matches
bDeviceClass, bDeviceSubClass and bDeviceProtocol
but the common USB_DEVICE_AND_INTERFACE_INFO matches
bInterfaceClass, bInterfaceSubClass and bInterfaceProtocol instead, which are
not specified.
Signed-off-by: Ondrej Zary <linux@rainbow-software.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
[I would really like an ACK on that one from dhowells; it appears to be
quite straightforward, but...]
MSG_PEEK isn't passed to ->recvmsg() via msg->msg_flags; as the matter of
fact, neither the kernel users of rxrpc, nor the syscalls ever set that bit
in there. It gets passed via flags; in fact, another such check in the same
function is done correctly - as flags & MSG_PEEK.
It had been that way (effectively disabled) for 8 years, though, so the patch
needs beating up - that case had never been tested. If it is correct, it's
-stable fodder.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
It should be checking flags, not msg->msg_flags. It's ->sendmsg()
instances that need to look for that in ->msg_flags, ->recvmsg() ones
(including the other ->recvmsg() instance in that file, as well as
unix_dgram_recvmsg() this one claims to be imitating) check in flags.
Braino had been introduced in commit dcda13 ("caif: Bugfix - use MSG_TRUNC
in receive") back in 2010, so it goes quite a while back.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
inet_diag_dump_one_icsk() allocates too small skb.
Add inet_sk_attr_size() helper right before inet_sk_diag_fill()
so that it can be updated if/when new attributes are added.
iproute2/ss currently does not use this dump_one() interface,
this might explain nobody noticed this problem yet.
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
This patch fixes this by doing this in virtnet_free_queues(). And also
don't delete napi in virtnet_freeze() since it will call
virtnet_free_queues() which has already did this.
Fixes 91815639d880 ("virtio-net: rx busy polling support") Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The rds_iw_update_cm_id function stores a large 'struct rds_sock' object
on the stack in order to pass a pair of addresses. This happens to just
fit withint the 1024 byte stack size warning limit on x86, but just
exceed that limit on ARM, which gives us this warning:
net/rds/iw_rdma.c:200:1: warning: the frame size of 1056 bytes is larger than 1024 bytes [-Wframe-larger-than=]
As the use of this large variable is basically bogus, we can rearrange
the code to not do that. Instead of passing an rds socket into
rds_iw_get_device, we now just pass the two addresses that we have
available in rds_iw_update_cm_id, and we change rds_iw_get_mr accordingly,
to create two address structures on the stack there.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
sysctl has sysctl.net.core.rmem_*/wmem_* parameters which can be
set to incorrect values. Given that 'struct sk_buff' allocates from
rcvbuf, incorrectly set buffer length could result to memory
allocation failures. For example, set them as follows:
Moreover, the possible minimum is 1, so we can get another kernel panic:
...
BUG: unable to handle kernel paging request at ffff88013caee5c0
IP: [<ffffffff815604cf>] __alloc_skb+0x12f/0x1f0
...
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The current driver support receive VLAN CTAG HW acceleration feature
(NETIF_F_HW_VLAN_CTAG_RX) through software simulation. There calls the
api .skb_copy_to_linear_data_offset() to skip the VLAN tag, but there
have overlap between the two memory data point range. The patch just fix
the issue.
V2:
Michael Grzeschik suggest to use memmove() instead of skb_copy_to_linear_data_offset().
Reported-by: Michael Grzeschik <m.grzeschik@pengutronix.de> Fixes: 1b7bde6d659d ("net: fec: implement rx_copybreak to improve rx performance") Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
So ->ht is supposed to be the last field of this struct, however
this is broken, since an rcu head is appended after it.
Fixes: 1ce87720d456 ("net: sched: make cls_u32 lockless") Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Firstly, handle zero length calls properly. Believe it or not there
are a few of these happening during early boot.
Next, we can't just drop to a memcpy() call in the forward copy case
where dst <= src. The reason is that the cache initializing stores
used in the Niagara memcpy() implementations can end up clearing out
cache lines before we've sourced their original contents completely.
For example, considering NG4memcpy, the main unrolled loop begins like
this:
Assume dst is 64 byte aligned and let's say that dst is src - 8 for
this memcpy() call. That store at the end there is the one to the
first line in the cache line, thus clearing the whole line, which thus
clobbers "src + 0x28" before it even gets loaded.
To avoid this, just fall through to a simple copy only mildly
optimized for the case where src and dst are 8 byte aligned and the
length is a multiple of 8 as well. We could get fancy and call
GENmemcpy() but this is good enough for how this thing is actually
used.
Reported-by: David Ahern <david.ahern@oracle.com> Reported-by: Bob Picco <bpicco@meloft.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
With the increase in number of CPUs calls to functions that dump
output to console (e.g., arch_trigger_all_cpu_backtrace) can take
a long time to complete. If IRQs are disabled eventually the NMI
watchdog kicks in and creates more havoc. Avoid by telling the NMI
watchdog everything is ok.
Signed-off-by: David Ahern <david.ahern@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
The reason is that state is never reset (stays with PERF_HES_UPTODATE set).
Add a call to sparc_pmu_enable_event during the added_event handling.
Clean up the encoding since pmu_start calls sparc_pmu_enable_event which
does the same. Passing PERF_EF_RELOAD to sparc_pmu_start means the call
to sparc_perf_event_set_period can be removed as well.
With this patch:
$ perf stat ls
...
Performance counter stats for 'ls':
Signed-off-by: David Ahern <david.ahern@oracle.com> Acked-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
perf_pmu_disable is called by core perf code before pmu->del and the
enable function is called by core perf code afterwards. No need to
call again within sparc_pmu_del.
Ditto for pmu->add and sparc_pmu_add.
Signed-off-by: David Ahern <david.ahern@oracle.com> Acked-by: Bob Picco <bob.picco@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
A bug was reported that the semtimedop() system call was always
failing eith ENOSYS.
Since SEMCTL is defined as 3, and SEMTIMEDOP is defined as 4,
the comparison "call <= SEMCTL" will always prevent SEMTIMEDOP
from getting through to the semaphore ops switch statement.
This is corrected by changing the comparison to "call <= SEMTIMEDOP".
Signed-off-by: Rob Gardner <rob.gardner@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Load balancing can be triggered in the critical sections protected by
srmmu_context_spinlock in destroy_context() and switch_mm() and can hang
the cpu waiting for the rq lock of another cpu that in turn has called
switch_mm hangning on srmmu_context_spinlock leading to deadlock.
So, disable interrupt while taking srmmu_context_spinlock in
destroy_context() and switch_mm() so we don't deadlock.
See also commit 77b838fa1ef0 ("[SPARC64]: destroy_context() needs to disable
interrupts.")
Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
We were missing a return statement in the PSL interrupt handler in the
case of an AFU error, which would trigger an "Unhandled CXL PSL IRQ"
warning. We do actually handle these type of errors (by notifying
userspace), so add the missing return IRQ_HANDLED so we don't throw
unecessary warnings.
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>