When using the PtrAuth feature in a guest, we need to save the host's
keys before allowing the guest to program them. For that, we dump
them in a per-CPU data structure (the so called host context).
But both call sites that do this are in preemptible context,
which may end up in disaster should the vcpu thread get preempted
before reentering the guest.
Instead, save the keys eagerly on each vcpu_load(). This has an
increased overhead, but is at least safe.
Cc: stable@vger.kernel.org Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On a VHE system, the EL1 state is left in the CPU most of the time,
and only syncronized back to memory when vcpu_put() is called (most
of the time on preemption).
Which means that when injecting an exception, we'd better have a way
to either:
(1) write directly to the EL1 sysregs
(2) synchronize the state back to memory, and do the changes there
For an AArch64, we already do (1), so we are safe. Unfortunately,
doing the same thing for AArch32 would be pretty invasive. Instead,
we can easily implement (2) by calling the put/load architectural
backends, and keep preemption disabled. We can then reload the
state back into EL1.
Cc: stable@vger.kernel.org Reported-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After commit 6d232b29cfce ("ACPICA: Dispatcher: always generate buffer
objects for ASL create_field() operator") ACPICA creates buffers even
when new fields are small enough to fit into an integer.
Many SNC calls counted on the old behaviour.
Since sony-laptop already handles the INTEGER/BUFFER case in
sony_nc_buffer_call, switch sony_nc_int_call to use its more generic
function instead.
Fixes: 6d232b29cfce ("ACPICA: Dispatcher: always generate buffer objects for ASL create_field() operator") Reported-by: Dominik Mierzejewski <dominik@greysector.net>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207491 Reported-by: William Bader <williambader@hotmail.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1830150 Signed-off-by: Mattia Dongili <malattia@linux.it> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since the switch of floppy driver to blk-mq, the contended (fdc_busy) case
in floppy_queue_rq() is not handled correctly.
In case we reach floppy_queue_rq() with fdc_busy set (i.e. with the floppy
locked due to another request still being in-flight), we put the request
on the list of requests and return BLK_STS_OK to the block core, without
actually scheduling delayed work / doing further processing of the
request. This means that processing of this request is postponed until
another request comes and passess uncontended.
Which in some cases might actually never happen and we keep waiting
indefinitely. The simple testcase is
for i in `seq 1 2000`; do echo -en $i '\r'; blkid --info /dev/fd0 2> /dev/null; done
run in quemu. That reliably causes blkid eventually indefinitely hanging
in __floppy_read_block_0() waiting for completion, as the BIO callback
never happens, and no further IO is ever submitted on the (non-existent)
floppy device. This was observed reliably on qemu-emulated device.
Fix that by not queuing the request in the contended case, and return
BLK_STS_RESOURCE instead, so that blk core handles the request
rescheduling and let it pass properly non-contended later.
Over the years, the code in mmc_sdio_init_card() has grown to become quite
messy. Unfortunate this has also lead to that several paths are leaking
memory in form of an allocated struct mmc_card, which includes additional
data, such as initialized struct device for example.
Unfortunate, it's a too complex task find each offending commit. Therefore,
this change fixes all memory leaks at once.
During some scenarios mmc_sdio_init_card() runs a retry path for the UHS-I
specific initialization, which leads to removal of the previously allocated
card. A new card is then re-allocated while retrying.
However, in one of the corresponding error paths we may end up to remove an
already removed card, which likely leads to a NULL pointer exception. So,
let's fix this.
Fixes: 5fc3d80ef496 ("mmc: sdio: don't use rocr to check if the card could support UHS mode") Cc: <stable@vger.kernel.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Link: https://lore.kernel.org/r/20200430091640.455-2-ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, tmio_mmc_irq() handler is registered before the host is
fully initialized by tmio_mmc_host_probe(). I did not previously notice
this problem.
The boot ROM of a new Socionext SoC unmasks interrupts (CTL_IRQ_MASK)
somehow. The handler is invoked before tmio_mmc_host_probe(), then
emits noisy call trace.
Before calling tmio_mmc_host_probe(), the caller is required to enable
clocks for its device, as to make it accessible when reading/writing
registers during probe.
Therefore, the responsibility to disable these clocks, in the error path of
->probe() and during ->remove(), is better managed outside
tmio_mmc_host_remove(). As a matter of fact, callers of
tmio_mmc_host_remove() already expects this to be the behaviour.
However, there's a problem with tmio_mmc_host_remove() when the Kconfig
option, CONFIG_PM, is set. More precisely, tmio_mmc_host_remove() may then
disable the clock via runtime PM, which leads to clock enable/disable
imbalance problems, when the caller of tmio_mmc_host_remove() also tries to
disable the same clocks.
To solve the problem, let's make sure tmio_mmc_host_remove() leaves the
device with clocks enabled, but also make sure to disable the IRQs, as we
normally do at ->runtime_suspend().
Turning on CONFIG_DMA_API_DEBUG_SG results in the following warning:
WARNING: CPU: 1 PID: 20 at kernel/dma/debug.c:500 add_dma_entry+0x16c/0x17c
DMA-API: exceeded 7 overlapping mappings of cacheline 0x031d2645
Modules linked in:
CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted 5.5.0-rc2-00021-gdeda30999c2b-dirty #49
Hardware name: STM32 (Device Tree Support)
Workqueue: events_freezable mmc_rescan
[<c03138c0>] (unwind_backtrace) from [<c030d760>] (show_stack+0x10/0x14)
[<c030d760>] (show_stack) from [<c0f2eb28>] (dump_stack+0xc0/0xd4)
[<c0f2eb28>] (dump_stack) from [<c034a14c>] (__warn+0xd0/0xf8)
[<c034a14c>] (__warn) from [<c034a530>] (warn_slowpath_fmt+0x94/0xb8)
[<c034a530>] (warn_slowpath_fmt) from [<c03bca0c>] (add_dma_entry+0x16c/0x17c)
[<c03bca0c>] (add_dma_entry) from [<c03bdf54>] (debug_dma_map_sg+0xe4/0x3d4)
[<c03bdf54>] (debug_dma_map_sg) from [<c0d09244>] (sdmmc_idma_prep_data+0x94/0xf8)
[<c0d09244>] (sdmmc_idma_prep_data) from [<c0d05a2c>] (mmci_prep_data+0x2c/0xb0)
[<c0d05a2c>] (mmci_prep_data) from [<c0d073ec>] (mmci_start_data+0x134/0x2f0)
[<c0d073ec>] (mmci_start_data) from [<c0d078d0>] (mmci_request+0xe8/0x154)
[<c0d078d0>] (mmci_request) from [<c0cecb44>] (mmc_start_request+0x94/0xbc)
DMA api debug brings to light leaking dma-mappings, dma_map_sg and
dma_unmap_sg are not correctly balanced.
If a request is prepared, the dma_map/unmap are done in asynchronous call
pre_req (prep_data) and post_req (unprep_data). In this case the
dma-mapping is right balanced.
But if the request was not prepared, the data->host_cookie is define to
zero and the dma_map/unmap must be done in the request. The dma_map is
called by mmci_dma_start (prep_data), but there is no dma_unmap in this
case.
This patch adds dma_unmap_sg when the dma is finalized and the data cookie
is zero (request not prepared).
When enabling calibration at reset, the CALCR register was completely
rewritten. This may cause certain bits being deleted unintentedly.
Fix by issuing a read-modify-write operation.
Fixes: 727d836a375a ("mmc: sdhci-of-at91: add DT property to enable calibration on full reset") Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com> Link: https://lore.kernel.org/r/20200527105659.142560-1-eugen.hristev@microchip.com Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clear tuning_done flag while executing tuning to ensure vendor
specific HS400 settings are applied properly when the controller
is re-initialized in HS400 mode.
Without this, re-initialization of the qcom SDHC in HS400 mode fails
while resuming the driver from runtime-suspend or system-suspend.
After changing the timing between GTT updates and execution on the GPU,
we started seeing sporadic failures on Ironlake. These were narrowed
down to being an insufficiently strong enough barrier/delay after
updating the GTT and scheduling execution on the GPU. By forcing the
uncached read, and adding the missing barrier for the singular
insert_page (relocation paths), the sporadic failures go away.
Fixes: 983d308cb8f6 ("agp/intel: Serialise after GTT updates") Fixes: 3497971a71d8 ("agp/intel: Flush chipset writes after updating a single PTE") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Andi Shyti <andi.shyti@intel.com> Cc: stable@vger.kernel.org # v4.0+ Link: https://patchwork.freedesktop.org/patch/msgid/20200410083535.25464-1-chris@chris-wilson.co.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Under rare circumstances, task_function_call() can repeatedly fail and
cause a soft lockup.
There is a slight race where the process is no longer running on the cpu
we targeted by the time remote_function() runs. The code will simply
try again. If we are very unlucky, this will continue to fail, until a
watchdog fires. This can happen in a heavily loaded, multi-core virtual
machine.
Fixes: 80da026a8e5d ("mm/slub: fix slab double-free in case of duplicate sysfs filename") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Link: http://lkml.kernel.org/r/20200602115033.1054-1-wanghai38@huawei.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In ath9k_hif_usb_rx_cb interface number is assumed to be 0.
usb_ifnum_to_if(urb->dev, 0)
But it isn't always true.
The case reported by syzbot:
https://lore.kernel.org/linux-usb/000000000000666c9c05a1c05d12@google.com
usb 2-1: new high-speed USB device number 2 using dummy_hcd
usb 2-1: config 1 has an invalid interface number: 2 but max is 0
usb 2-1: config 1 has no interface number 0
usb 2-1: New USB device found, idVendor=0cf3, idProduct=9271, bcdDevice=
1.08
usb 2-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
general protection fault, probably for non-canonical address
0xdffffc0000000015: 0000 [#1] SMP KASAN
KASAN: null-ptr-deref in range [0x00000000000000a8-0x00000000000000af]
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.0-rc5-syzkaller #0
Add barrier to accessing the stack array skb_pool.
The case reported by syzbot:
https://lore.kernel.org/linux-usb/0000000000003d7c1505a2168418@google.com
BUG: KASAN: stack-out-of-bounds in ath9k_hif_usb_rx_stream
drivers/net/wireless/ath/ath9k/hif_usb.c:626 [inline]
BUG: KASAN: stack-out-of-bounds in ath9k_hif_usb_rx_cb+0xdf6/0xf70
drivers/net/wireless/ath/ath9k/hif_usb.c:666
Write of size 8 at addr ffff8881db309a28 by task swapper/1/0
The case reported by syzbot:
https://lore.kernel.org/linux-usb/0000000000006ac55b05a1c05d72@google.com
BUG: KASAN: use-after-free in htc_process_conn_rsp
drivers/net/wireless/ath/ath9k/htc_hst.c:131 [inline]
BUG: KASAN: use-after-free in ath9k_htc_rx_msg+0xa25/0xaf0
drivers/net/wireless/ath/ath9k/htc_hst.c:443
Write of size 2 at addr ffff8881cea291f0 by task swapper/1/0
Free wmi later after cmd urb has been killed, as urb cb will access wmi.
the case reported by syzbot:
https://lore.kernel.org/linux-usb/0000000000000002fc05a1d61a68@google.com
BUG: KASAN: use-after-free in ath9k_wmi_ctrl_rx+0x416/0x500
drivers/net/wireless/ath/ath9k/wmi.c:215
Read of size 1 at addr ffff8881cef1417c by task swapper/1/0
MFI_BIG_ENDIAN macro used in drivers structure bitfield to check the CPU
big endianness is undefined which would break the code on big endian
machine. __BIG_ENDIAN_BITFIELD kernel macro should be used in places of
MFI_BIG_ENDIAN macro.
Link: https://lore.kernel.org/r/20200508085130.23339-1-chandrakanth.patil@broadcom.com Fixes: a7faf81d7858 ("scsi: megaraid_sas: Set no_write_same only for Virtual Disk") Cc: <stable@vger.kernel.org> # v5.6+ Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Implementation of a previous patch added a condition to an if check that
always end up with the if test being true. Execution of the else clause was
inadvertently negated. The additional condition check was incorrect and
unnecessary after the other modifications had been done in that patch.
Remove the check from the if series.
Link: https://lore.kernel.org/r/20200501214310.91713-5-jsmart2021@gmail.com Fixes: b95b21193c85 ("scsi: lpfc: Fix loss of remote port after devloss due to lack of RPIs") Cc: <stable@vger.kernel.org> # v5.4+ Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AArch32 CP1x registers are overlayed on their AArch64 counterparts
in the vcpu struct. This leads to an interesting problem as they
are stored in their CPU-local format, and thus a CP1x register
doesn't "hit" the lower 32bit portion of the AArch64 register on
a BE host.
To workaround this unfortunate situation, introduce a bias trick
in the vcpu_cp1x() accessors which picks the correct half of the
64bit register.
Cc: stable@vger.kernel.org Reported-by: James Morse <james.morse@arm.com> Tested-by: James Morse <james.morse@arm.com> Acked-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
aarch32 has pairs of registers to access the high and low parts of 64bit
registers. KVM has a union of 64bit sys_regs[] and 32bit copro[]. The
32bit accessors read the high or low part of the 64bit sys_reg[] value
through the union.
Both sys_reg_descs[] and cp15_regs[] list access_csselr() as the accessor
for CSSELR{,_EL1}. access_csselr() is only aware of the 64bit sys_regs[],
and expects r->reg to be 'CSSELR_EL1' in the enum, index 2 of the 64bit
array.
cp15_regs[] uses the 32bit copro[] alias of sys_regs[]. Here CSSELR is
c0_CSSELR which is the same location in sys_reg[]. r->reg is 'c0_CSSELR',
index 4 in the 32bit array.
access_csselr() uses the 32bit r->reg value to access the 64bit array,
so reads and write the wrong value. sys_regs[4], is ACTLR_EL1, which
is subsequently save/restored when we enter the guest.
ACTLR_EL1 is supposed to be read-only for the guest. This register
only affects execution at EL1, and the host's value is restored before
we return to host EL1.
Convert the 32bit register index back to the 64bit version.
Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200529150656.7339-2-james.morse@arm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If a CPU support more than 32bit vmbits (which is true for 64bit CPUs),
VPN2_MASK set to fixed 0xffffe000 will lead to a wrong EntryHi in some
functions such as _kvm_mips_host_tlb_inv().
The cpu_vmbits definition of 32bit CPU in cpu-features.h is 31, so we
still use the old definition.
Cc: Stable <stable@vger.kernel.org> Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Signed-off-by: Xing Li <lixing@loongson.cn>
[Huacai: Improve commit messages] Signed-off-by: Huacai Chen <chenhc@lemote.com>
Message-Id: <1590220602-3547-3-git-send-email-chenhc@lemote.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Consult only the basic exit reason, i.e. bits 15:0 of vmcs.EXIT_REASON,
when determining whether a nested VM-Exit should be reflected into L1 or
handled by KVM in L0.
For better or worse, the switch statement in nested_vmx_exit_reflected()
currently defaults to "true", i.e. reflects any nested VM-Exit without
dedicated logic. Because the case statements only contain the basic
exit reason, any VM-Exit with modifier bits set will be reflected to L1,
even if KVM intended to handle it in L0.
Practically speaking, this only affects EXIT_REASON_MCE_DURING_VMENTRY,
i.e. a #MC that occurs on nested VM-Enter would be incorrectly routed to
L1, as "failed VM-Entry" is the only modifier that KVM can currently
encounter. The SMM modifiers will never be generated as KVM doesn't
support/employ a SMI Transfer Monitor. Ditto for "exit from enclave",
as KVM doesn't yet support virtualizing SGX, i.e. it's impossible to
enter an enclave in a KVM guest (L1 or L2).
Fixes: 644d711aa0e1 ("KVM: nVMX: Deciding if L0 or L1 should handle an L2 exit") Cc: Jim Mattson <jmattson@google.com> Cc: Xiaoyao Li <xiaoyao.li@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200227174430.26371-1-sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Restoring the ASID from the hsave area on VMEXIT is wrong, because its
value depends on the handling of TLB flushes. Just skipping the field in
copy_vmcb_control_area will do.
Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Async page faults have to be trapped in the host (L1 in this case),
since the APF reason was passed from L0 to L1 and stored in the L1 APF
data page. This was completely reversed: the page faults were passed
to the guest, a L2 hypervisor.
Cc: stable@vger.kernel.org Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Skip the Indirect Branch Prediction Barrier that is triggered on a VMCS
switch when running with spectre_v2_user=on/auto if the switch is
between two VMCSes in the same guest, i.e. between vmcs01 and vmcs02.
The IBPB is intended to prevent one guest from attacking another, which
is unnecessary in the nested case as it's the same guest from KVM's
perspective.
This all but eliminates the overhead observed for nested VMX transitions
when running with CONFIG_RETPOLINE=y and spectre_v2_user=on/auto, which
can be significant, e.g. roughly 3x on current systems.
Reported-by: Alexander Graf <graf@amazon.com> Cc: KarimAllah Raslan <karahmed@amazon.de> Cc: stable@vger.kernel.org Fixes: 15d45071523d ("KVM/x86: Add IBPB support") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200501163117.4655-1-sean.j.christopherson@intel.com>
[Invert direction of bool argument. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 9495b7e92f716ab2bd6814fab5e97ab4a39adfdd ("driver core: platform:
Initialize dma_parms for platform devices") in v5.7-rc5 causes
vb2_dma_contig_clear_max_seg_size() to kfree memory that was not
allocated by vb2_dma_contig_set_max_seg_size().
The assumption in vb2_dma_contig_set_max_seg_size() seems to be that
dev->dma_parms is always NULL when the driver is probed, and the case
where dev->dma_parms has bee initialized by someone else than the driver
(by calling vb2_dma_contig_set_max_seg_size) will cause a failure.
All the current users of these functions are platform devices, which now
always have dma_parms set by the driver core. To fix the issue for v5.7,
make vb2_dma_contig_set_max_seg_size() return an error if dma_parms is
NULL to be on the safe side, and remove the kfree code from
vb2_dma_contig_clear_max_seg_size().
For v5.8 we should remove the two functions and move the
dma_set_max_seg_size() calls into the drivers.
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com> Fixes: 9495b7e92f71 ("driver core: platform: Initialize dma_parms for platform devices") Cc: stable@vger.kernel.org Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some memory is vmalloc'ed in the 'w100fb_save_vidmem' function and freed in
the 'w100fb_restore_vidmem' function. (these functions are called
respectively from the 'suspend' and the 'resume' functions)
However, it is also freed in the 'remove' function.
In order to avoid a potential double free, set the corresponding pointer
to NULL once freed in the 'w100fb_restore_vidmem' function.
Fix following warning:
vt8500lcdfb.c: In function 'vt8500lcd_blank':
vt8500lcdfb.c:229:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (info->fix.visual == FB_VISUAL_PSEUDOCOLOR ||
^
vt8500lcdfb.c:233:2: note: here
case FB_BLANK_UNBLANK:
^~~~
Adding a simple "fallthrough;" fixed the warning.
The fix was build tested.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Reported-by: kbuild test robot <lkp@intel.com> Fixes: e41f1a989408 ("fbdev: Implement simple blanking in pseudocolor modes for vt8500lcdfb") Cc: Alexey Charkov <alchark@gmail.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: <stable@vger.kernel.org> # v2.6.38+ Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200412202143.GA26948@ravnborg.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The skx_edac driver wrongly uses the mtr register to retrieve two fields
close_pg and bank_xor_enable. Fix it by using the correct mcmtr register
to get the two fields.
After commit 18c49926c4bf ("cpufreq: Add QoS requests for userspace
constraints") the return value of freq_qos_update_request(), that can
be 1, passed by cpufreq_boost_set_sw() to its caller sometimes
confuses the latter, which only expects to see 0 or negative error
codes, so notice that cpufreq_boost_set_sw() can return an error code
(which should not be -EINVAL for that matter) as soon as the first
policy without a frequency table is found (because either all policies
have a frequency table or none of them have it) and rework it to meet
its caller's expectations.
Fixes: 18c49926c4bf ("cpufreq: Add QoS requests for userspace constraints") Reported-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Reported-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 5.3+ <stable@vger.kernel.org> # 5.3+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The commit 086d08725d34 ("remoteproc: create vdev subdevice with specific
dma memory pool") has introduced a new vdev subdevice for each vdev
declared in the firmware resource table and made it as the parent for the
created virtio rpmsg devices instead of the previous remoteproc device.
This changed the overall parenting hierarchy for the rpmsg devices, which
were children of virtio devices, and does not allow the corresponding
rpmsg drivers to retrieve the parent rproc device through the
rproc_get_by_child() API.
Fix this by restoring the remoteproc device as the parent. The new vdev
subdevice can continue to inherit the DMA attributes from the remoteproc's
parent device (actual platform device).
In some cases, like with OMAP remoteproc, we are not creating dedicated
memory pool for the virtio device. Instead, we use the same memory pool
for all shared memories. The current virtio memory pool handling forces
a split between these two, as a separate device is created for it,
causing memory to be allocated from bad location if the dedicated pool
is not available. Fix this by falling back to using the parent device
memory pool if dedicated is not available.
Cc: stable@vger.kernel.org Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Acked-by: Arnaud Pouliquen <arnaud.pouliquen@st.com> Fixes: 086d08725d34 ("remoteproc: create vdev subdevice with specific dma memory pool") Signed-off-by: Tero Kristo <t-kristo@ti.com> Signed-off-by: Suman Anna <s-anna@ti.com> Link: https://lore.kernel.org/r/20200420160600.10467-2-s-anna@ti.com Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Recently syzbot reported that unmounting proc when there is an ongoing
inotify watch on the root directory of proc could result in a use
after free when the watch is removed after the unmount of proc
when the watcher exits.
Commit 69879c01a0c3 ("proc: Remove the now unnecessary internal mount
of proc") made it easier to unmount proc and allowed syzbot to see the
problem, but looking at the code it has been around for a long time.
Looking at the code the fsnotify watch should have been removed by
fsnotify_sb_delete in generic_shutdown_super. Unfortunately the inode
was allocated with new_inode_pseudo instead of new_inode so the inode
was not on the sb->s_inodes list. Which prevented
fsnotify_unmount_inodes from finding the inode and removing the watch
as well as made it so the "VFS: Busy inodes after unmount" warning
could not find the inodes to warn about them.
Make all of the inodes in proc visible to generic_shutdown_super,
and fsnotify_sb_delete by using new_inode instead of new_inode_pseudo.
The only functional difference is that new_inode places the inodes
on the sb->s_inodes list.
I wrote a small test program and I can verify that without changes it
can trigger this issue, and by replacing new_inode_pseudo with
new_inode the issues goes away.
Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/000000000000d788c905a7dfa3f4@google.com Reported-by: syzbot+7d2debdcdb3cb93c1e5e@syzkaller.appspotmail.com Fixes: 0097875bd415 ("proc: Implement /proc/thread-self to point at the directory of the current thread") Fixes: 021ada7dff22 ("procfs: switch /proc/self away from proc_dir_entry") Fixes: 51f0885e5415 ("vfs,proc: guarantee unique inodes in /proc") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In ovl_copy_xattr, if all the xattrs to be copied are overlayfs private
xattrs, the copy loop will terminate without assigning anything to the
error variable, thus returning an uninitialized value.
If ovl_copy_xattr is called from ovl_clear_empty, this uninitialized error
value is put into a pointer by ERR_PTR(), causing potential invalid memory
accesses down the line.
This commit initialize error with 0. This is the correct value because when
there's no xattr to copy, because all xattrs are private, ovl_copy_xattr
should succeed.
This bug is discovered with the help of INIT_STACK_ALL and clang.
While unregistration is in progress, user might be reloading the
interface.
This can race with unregistration in below flow which uses the
resources which are getting disabled by reload flow.
Hence, disable the devlink reloading first when removing the device.
A recent change added a disable to NAPI into macb_open, this was
intended to only happen on the error path but accidentally applies
to all paths. This causes NAPI to be disabled on the success path, which
leads to the network to no longer functioning.
Fixes: 014406babc1f ("net: cadence: macb: disable NAPI on error") Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Tested-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is due to NAPI left enabled if macb_phylink_connect() fail.
Fixes: 7897b071ac3b ("net: macb: convert to phylink") Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After an XSK is closed, the relevant structures in the channel are not
zeroed. If an XSK is opened the second time on the same channel without
recreating channels, the stray values in the structures will lead to
incorrect operation of queues, which causes CQE errors, and the new
socket doesn't work at all.
This patch fixes the issue by explicitly zeroing XSK-related structs in
the channel on XSK close. Note that those structs are zeroed on channel
creation, and usually a configuration change (XDP program is set)
happens on XSK open, which leads to recreating channels, so typical XSK
usecases don't suffer from this issue. However, if XSKs are opened and
closed on the same channel without removing the XDP program, this bug
reproduces.
Currently, in case of fatal error during mlx5_load_one(), we cannot
enter error state until mlx5_load_one() is finished, what can take
several minutes until commands will get timeouts, because these commands
can't be processed due to the fatal error.
Fix it by setting dev->state as MLX5_DEVICE_STATE_INTERNAL_ERROR before
requesting the lock.
Fixes: c1d4d2e92ad6 ("net/mlx5: Avoid calling sleeping function by the health poll thread") Signed-off-by: Shay Drory <shayd@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In case there is a work in the health WQ when we teardown the driver,
in driver load error flow, the health work will try to read dev->iseg,
which was already unmap in mlx5_pci_close().
Fix it by draining the health workqueue first thing in mlx5_pci_close().
getopt_long requires the last element to be filled with zeros.
Otherwise, passing an unrecognized option can cause a segfault.
Fixes: 16e781224198 ("selftests/net: Add a test to validate behavior of rx timestamps") Signed-off-by: Tanner Love <tannerlove@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Disable frames injection in mvneta_xdp_xmit routine during hw
re-configuration in order to avoid hardware hangs
Fixes: b0a43db9087a ("net: mvneta: add XDP_TX support") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are some memory leaks in dccp_init() and dccp_fini().
In dccp_fini() and the error handling path in dccp_init(), free lhash2
is missing. Add inet_hashinfo2_free_mod() to do it.
If inet_hashinfo2_init_mod() failed in dccp_init(),
percpu_counter_destroy() should be called to destroy dccp_orphan_count.
It need to goto out_free_percpu when inet_hashinfo2_init_mod() failed.
Fixes: c92c81df93df ("net: dccp: fix kernel crash on module load") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The header of the message to send can be changed if the
response is longer than the request:
- 1st word, the header is sent
- the remaining words of the message are sent
- the response is received asynchronously during the
execution of the loop, changing the size field in
the header
- the for loop test the termination condition using
the corrupted header
It is the case for the API build_info which has just a
header as request but 3 words in response.
This issue is fixed storing the header locally instead of
using a pointer on it.
Current imx-scu requires four TX and four RX to communicate with
SCU. This is low efficient and causes lots of mailbox interrupts.
With imx-mailbox driver could support one TX to use all four transmit
registers and one RX to use all four receive registers, imx-scu
could use one TX and one RX.
An interesting thing happened when a guest Linux instance took a machine
check. The VMM unmapped the bad page from guest physical space and
passed the machine check to the guest.
Linux took all the normal actions to offline the page from the process
that was using it. But then guest Linux crashed because it said there
was a second machine check inside the kernel with this stack trace:
This was odd, because a CLFLUSH instruction shouldn't raise a machine
check (it isn't consuming the data). Further investigation showed that
the VMM had passed in another machine check because is appeared that the
guest was accessing the bad page.
Fix is to check the scope of the poison by checking the MCi_MISC register.
If the entire page is affected, then unmap the page. If only part of the
page is affected, then mark the page as uncacheable.
This assumes that VMMs will do the logical thing and pass in the "whole
page scope" via the MCi_MISC register (since they unmapped the entire
page).
[ bp: Adjust to x86/entry changes. ]
Fixes: 284ce4011ba6 ("x86/memory_failure: Introduce {set, clear}_mce_nospec()") Reported-by: Jue Wang <juew@google.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jue Wang <juew@google.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200520163546.GA7977@agluck-desk2.amr.corp.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In io_uring_cancel_files(), after refcount_sub_and_test() leaves 0
req->refs, it calls io_put_req(), which would also put a ref. Call
io_free_req() instead.
Cc: stable@vger.kernel.org Fixes: 2ca10259b418 ("io_uring: prune request from overflow list on flush") Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The prepare_to_wait() and finish_wait() calls in io_uring_cancel_files()
are mismatched. Currently I don't see any issues related this bug, just
find it by learning codes.
The system will crash when the users insmod crypto/tcrypt.ko with mode=38
( testing "cts(cbc(aes))" ).
Usually the next entry of one sg will be @sg@ + 1, but if this sg element
is part of a chained scatterlist, it could jump to the start of a new
scatterlist array. Fix it by sg_next() on calculation of src/dst
scatterlist.
Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver") Reported-by: LABBE Corentin <clabbe@baylibre.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: virtualization@lists.linux-foundation.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200123101000.GB24255@Red Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Link: https://lore.kernel.org/r/20200602070501.2023-2-longpeng2@huawei.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The system'll crash when the users insmod crypto/tcrypto.ko with mode=155
( testing "authenc(hmac(sha1),cbc(aes))" ). It's caused by reuse the memory
of request structure.
In crypto_authenc_init_tfm(), the reqsize is set to:
[PART 1] sizeof(authenc_request_ctx) +
[PART 2] ictx->reqoff +
[PART 3] MAX(ahash part, skcipher part)
and the 'PART 3' is used by both ahash and skcipher in turn.
When the virtio_crypto driver finish skcipher req, it'll call ->complete
callback(in crypto_finalize_skcipher_request) and then free its
resources whose pointers are recorded in 'skcipher parts'.
However, the ->complete is 'crypto_authenc_encrypt_done' in this case,
it will use the 'ahash part' of the request and change its content,
so virtio_crypto driver will get the wrong pointer after ->complete
finish and mistakenly free some other's memory. So the system will crash
when these memory will be used again.
The resources which need to be cleaned up are not used any more. But the
pointers of these resources may be changed in the function
"crypto_finalize_skcipher_request". Thus release specific resources before
calling this function.
Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver") Reported-by: LABBE Corentin <clabbe@baylibre.com> Cc: Gonglei <arei.gonglei@huawei.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: virtualization@lists.linux-foundation.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200123101000.GB24255@Red Acked-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Link: https://lore.kernel.org/r/20200602070501.2023-3-longpeng2@huawei.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The src/dst length is not aligned with AES_BLOCK_SIZE(which is 16) in some
testcases in tcrypto.ko.
For example, the src/dst length of one of cts(cbc(aes))'s testcase is 17, the
crypto_virtio driver will set @src_data_len=16 but @dst_data_len=17 in this
case and get a wrong at then end.
SRC: pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp pp (17 bytes)
EXP: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc pp (17 bytes)
DST: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 00 (pollute the last bytes)
(pp: plaintext cc:ciphertext)
Fix this issue by limit the length of dest buffer.
Fixes: dbaf0624ffa5 ("crypto: add virtio-crypto driver") Cc: Gonglei <arei.gonglei@huawei.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: virtualization@lists.linux-foundation.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Link: https://lore.kernel.org/r/20200602070501.2023-4-longpeng2@huawei.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently after any algorithm is registered and tested, there's an
unnecessary request_module("cryptomgr") even if it's already loaded.
Also, CRYPTO_MSG_ALG_LOADED is sent twice, and thus if the algorithm is
"crct10dif", lib/crc-t10dif.c replaces the tfm twice rather than once.
This occurs because CRYPTO_MSG_ALG_LOADED is sent using
crypto_probing_notify(), which tries to load "cryptomgr" if the
notification is not handled (NOTIFY_DONE). This doesn't make sense
because "cryptomgr" doesn't handle this notification.
Fix this by using crypto_notify() instead of crypto_probing_notify().
Fixes: dd8b083f9a5e ("crypto: api - Introduce notifier for new crypto algorithms") Cc: <stable@vger.kernel.org> # v4.20+ Cc: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Doing a "get_user_pages()" on a copy-on-write page for reading can be
ambiguous: the page can be COW'ed at any time afterwards, and the
direction of a COW event isn't defined.
Yes, whoever writes to it will generally do the COW, but if the thread
that did the get_user_pages() unmapped the page before the write (and
that could happen due to memory pressure in addition to any outright
action), the writer could also just take over the old page instead.
End result: the get_user_pages() call might result in a page pointer
that is no longer associated with the original VM, and is associated
with - and controlled by - another VM having taken it over instead.
So when doing a get_user_pages() on a COW mapping, the only really safe
thing to do would be to break the COW when getting the page, even when
only getting it for reading.
At the same time, some users simply don't even care.
For example, the perf code wants to look up the page not because it
cares about the page, but because the code simply wants to look up the
physical address of the access for informational purposes, and doesn't
really care about races when a page might be unmapped and remapped
elsewhere.
This adds logic to force a COW event by setting FOLL_WRITE on any
copy-on-write mapping when FOLL_GET (or FOLL_PIN) is used to get a page
pointer as a result.
The current semantics end up being:
- __get_user_pages_fast(): no change. If you don't ask for a write,
you won't break COW. You'd better know what you're doing.
- get_user_pages_fast(): the fast-case "look it up in the page tables
without anything getting mmap_sem" now refuses to follow a read-only
page, since it might need COW breaking. Which happens in the slow
path - the fast path doesn't know if the memory might be COW or not.
- get_user_pages() (including the slow-path fallback for gup_fast()):
for a COW mapping, turn on FOLL_WRITE for FOLL_GET/FOLL_PIN, with
very similar semantics to FOLL_FORCE.
If it turns out that we want finer granularity (ie "only break COW when
it might actually matter" - things like the zero page are special and
don't need to be broken) we might need to push these semantics deeper
into the lookup fault path. So if people care enough, it's possible
that we might end up adding a new internal FOLL_BREAK_COW flag to go
with the internal FOLL_COW flag we already have for tracking "I had a
COW".
Alternatively, if it turns out that different callers might want to
explicitly control the forced COW break behavior, we might even want to
make such a flag visible to the users of get_user_pages() instead of
using the above default semantics.
But for now, this is mostly commentary on the issue (this commit message
being a lot bigger than the patch, and that patch in turn is almost all
comments), with that minimal "enable COW breaking early" logic using the
existing FOLL_WRITE behavior.
[ It might be worth noting that we've always had this ambiguity, and it
could arguably be seen as a user-space issue.
You only get private COW mappings that could break either way in
situations where user space is doing cooperative things (ie fork()
before an execve() etc), but it _is_ surprising and very subtle, and
fork() is supposed to give you independent address spaces.
So let's treat this as a kernel issue and make the semantics of
get_user_pages() easier to understand. Note that obviously a true
shared mapping will still get a page that can change under us, so this
does _not_ mean that get_user_pages() somehow returns any "stable"
page ]
Reported-by: Jann Horn <jannh@google.com> Tested-by: Christoph Hellwig <hch@lst.de> Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Kirill Shutemov <kirill@shutemov.name> Acked-by: Jan Kara <jack@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
clk_pm_runtime_get() assumes that the PM-runtime usage counter will
be dropped by pm_runtime_get_sync() on errors, which is not the case,
so PM-runtime references to devices acquired by the former are leaked
on errors returned by the latter.
Fix this by modifying clk_pm_runtime_get() to drop the reference if
pm_runtime_get_sync() returns an error.
Fixes: 9a34b45397e5 clk: Add support for runtime PM Cc: 4.15+ <stable@vger.kernel.org> # 4.15+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently we set the tx/rx buffer to 0xff when NULL. This causes
problems with some spi slaves where 0xff is a valid command. Looking
at other drivers, the tx/rx buffer is usually set to 0x00 when NULL.
Following this convention solves the issue.
Fixes: fa236a7ef240 ("spi: bcm-qspi: Add Broadcom MSPI driver") Signed-off-by: Justin Chen <justinpopo6@gmail.com> Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200420190853.45614-6-kdasu.kdev@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The BCM2835aux SPI driver uses devm_spi_register_master() on bind.
As a consequence, on unbind, __device_release_driver() first invokes
bcm2835aux_spi_remove() before unregistering the SPI controller via
devres_release_all().
This order is incorrect: bcm2835aux_spi_remove() turns off the SPI
controller, including its interrupts and clock. The SPI controller
is thus no longer usable.
When the SPI controller is subsequently unregistered, it unbinds all
its slave devices. If their drivers need to access the SPI bus,
e.g. to quiesce their interrupts, unbinding will fail.
As a rule, devm_spi_register_master() must not be used if the
->remove() hook performs teardown steps which shall be performed
after unbinding of slaves.
Fix by using the non-devm variant spi_register_master(). Note that the
struct spi_master as well as the driver-private data are not freed until
after bcm2835aux_spi_remove() has finished, so accessing them is safe.
The BCM2835 SPI driver uses devm_spi_register_controller() on bind.
As a consequence, on unbind, __device_release_driver() first invokes
bcm2835_spi_remove() before unregistering the SPI controller via
devres_release_all().
This order is incorrect: bcm2835_spi_remove() tears down the DMA
channels and turns off the SPI controller, including its interrupts
and clock. The SPI controller is thus no longer usable.
When the SPI controller is subsequently unregistered, it unbinds all
its slave devices. If their drivers need to access the SPI bus,
e.g. to quiesce their interrupts, unbinding will fail.
As a rule, devm_spi_register_controller() must not be used if the
->remove() hook performs teardown steps which shall be performed
after unbinding of slaves.
Fix by using the non-devm variant spi_register_controller(). Note that
the struct spi_controller as well as the driver-private data are not
freed until after bcm2835_spi_remove() has finished, so accessing them
is safe.
The PXA2xx SPI driver releases a runtime PM ref in the probe error path
even though it hasn't acquired a ref earlier.
Apparently commit e2b714afee32 ("spi: pxa2xx: Disable runtime PM if
controller registration fails") sought to copy-paste the invocation of
pm_runtime_disable() from pxa2xx_spi_remove(), but erroneously copied
the call to pm_runtime_put_noidle() as well. Drop it.
Fixes: e2b714afee32 ("spi: pxa2xx: Disable runtime PM if controller registration fails") Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: stable@vger.kernel.org # v4.17+ Cc: Jarkko Nikula <jarkko.nikula@linux.intel.com> Link: https://lore.kernel.org/r/58b2ac6942ca1f91aaeeafe512144bc5343e1d84.1590408496.git.lukas@wunner.de Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The PXA2xx SPI driver uses devm_spi_register_controller() on bind.
As a consequence, on unbind, __device_release_driver() first invokes
pxa2xx_spi_remove() before unregistering the SPI controller via
devres_release_all().
This order is incorrect: pxa2xx_spi_remove() disables the chip,
rendering the SPI bus inaccessible even though the SPI controller is
still registered. When the SPI controller is subsequently unregistered,
it unbinds all its slave devices. Because their drivers cannot access
the SPI bus, e.g. to quiesce interrupts, the slave devices may be left
in an improper state.
As a rule, devm_spi_register_controller() must not be used if the
->remove() hook performs teardown steps which shall be performed after
unregistering the controller and specifically after unbinding of slaves.
Fix by reverting to the non-devm variant of spi_register_controller().
An alternative approach would be to use device-managed functions for all
steps in pxa2xx_spi_remove(), e.g. by calling devm_add_action_or_reset()
on probe. However that approach would add more LoC to the driver and
it wouldn't lend itself as well to backporting to stable.
The improper use of devm_spi_register_controller() was introduced in 2013
by commit a807fcd090d6 ("spi: pxa2xx: use devm_spi_register_master()"),
but all earlier versions of the driver going back to 2006 were likewise
broken because they invoked spi_unregister_master() at the end of
pxa2xx_spi_remove(), rather than at the beginning.
When an SPI controller unregisters, it unbinds all its slave devices.
For this, their drivers may need to access the SPI bus, e.g. to quiesce
interrupts.
However since commit ffbbdd21329f ("spi: create a message queueing
infrastructure"), spi_destroy_queue() is executed before unbinding the
slaves. It sets ctlr->running = false, thereby preventing SPI bus
access and causing unbinding of slave devices to fail.
Fix by unbinding slaves before calling spi_destroy_queue().
Fixes: ffbbdd21329f ("spi: create a message queueing infrastructure") Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v3.4+ Cc: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/8aaf9d44c153fe233b17bc2dec4eb679898d7e7b.1589557526.git.lukas@wunner.de Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Designware SPI driver uses devm_spi_register_controller() on bind.
As a consequence, on unbind, __device_release_driver() first invokes
dw_spi_remove_host() before unregistering the SPI controller via
devres_release_all().
This order is incorrect: dw_spi_remove_host() shuts down the chip,
rendering the SPI bus inaccessible even though the SPI controller is
still registered. When the SPI controller is subsequently unregistered,
it unbinds all its slave devices. Because their drivers cannot access
the SPI bus, e.g. to quiesce interrupts, the slave devices may be left
in an improper state.
As a rule, devm_spi_register_controller() must not be used if the
->remove() hook performs teardown steps which shall be performed after
unregistering the controller and specifically after unbinding of slaves.
Fix by reverting to the non-devm variant of spi_register_controller().
An alternative approach would be to use device-managed functions for all
steps in dw_spi_remove_host(), e.g. by calling devm_add_action_or_reset()
on probe. However that approach would add more LoC to the driver and
it wouldn't lend itself as well to backporting to stable.
Fixes: 04f421e7b0b1 ("spi: dw: use managed resources") Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: stable@vger.kernel.org # v3.14+ Cc: Baruch Siach <baruch@tkos.co.il> Link: https://lore.kernel.org/r/3fff8cb8ae44a9893840d0688be15bb88c090a14.1590408496.git.lukas@wunner.de Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 2d6261583be0 ("lib: rework bitmap_parse()") does not take into
account order of halfwords on 64-bit big endian architectures. As
result (at least) Receive Packet Steering, IRQ affinity masks and
runtime kernel test "test_bitmap" get broken on s390.
[andriy.shevchenko@linux.intel.com: convert infinite while loop to a for loop] Link: http://lkml.kernel.org/r/20200609140535.87160-1-andriy.shevchenko@linux.intel.com Fixes: 2d6261583be0 ("lib: rework bitmap_parse()") Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Yury Norov <yury.norov@gmail.com> Cc: Amritha Nambiar <amritha.nambiar@intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Kees Cook <keescook@chromium.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: "Tobin C . Harding" <tobin@kernel.org> Cc: Vineet Gupta <vineet.gupta1@synopsys.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Willem de Bruijn <willemb@google.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/1591634471-17647-1-git-send-email-agordeev@linux.ibm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After commit c3aab9a0bd91 ("mm/filemap.c: don't initiate writeback if
mapping has no dirty pages"), the following null pointer dereference has
been reported on nilfs2:
This crash turned out to be caused by set_page_writeback() call for
segment summary buffers at nilfs_segctor_prepare_write().
set_page_writeback() can call inc_wb_stat(inode_to_wb(inode),
WB_WRITEBACK) where inode_to_wb(inode) is NULL if the inode of
underlying block device does not have an associated wb.
This fixes the issue by calling inode_attach_wb() in advance to ensure
to associate the bdev inode with its wb.
In some rare cases, for input data over 32 KB, lzo-rle could encode two
different inputs to the same compressed representation, so that
decompression is then ambiguous (i.e. data may be corrupted - although
zram is not affected because it operates over 4 KB pages).
This modifies the compressor without changing the decompressor or the
bitstream format, such that:
- there is no change to how data produced by the old compressor is
decompressed
- an old decompressor will correctly decode data from the updated
compressor
- performance and compression ratio are not affected
- we avoid introducing a new bitstream format
In testing over 12.8M real-world files totalling 903 GB, three files
were affected by this bug. I also constructed 37M semi-random 64 KB
files totalling 2.27 TB, and saw no affected files. Finally I tested
over files constructed to contain each of the ~1024 possible bad input
sequences; for all of these cases, updated lzo-rle worked correctly.
There is no significant impact to performance or compression ratio.
Signed-off-by: Dave Rodgman <dave.rodgman@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Dave Rodgman <dave.rodgman@arm.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Chao Yu <yuchao0@huawei.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200507100203.29785-1-dave.rodgman@arm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
UBSAN: null-ptr-deref in arch/arm64/kernel/smp.c:596:6
member access within null pointer of type 'struct acpi_madt_generic_interrupt'
CPU: 0 PID: 0 Comm: swapper Not tainted 5.7.0-rc6-00124-g96bc42ff0a82 #1
Call trace:
dump_backtrace+0x0/0x384
show_stack+0x28/0x38
dump_stack+0xec/0x174
handle_null_ptr_deref+0x134/0x174
__ubsan_handle_type_mismatch_v1+0x84/0xa4
acpi_parse_gic_cpu_interface+0x60/0xe8
acpi_parse_entries_array+0x288/0x498
acpi_table_parse_entries_array+0x178/0x1b4
acpi_table_parse_madt+0xa4/0x110
acpi_parse_and_init_cpus+0x38/0x100
smp_init_cpus+0x74/0x258
setup_arch+0x350/0x3ec
start_kernel+0x98/0x6f4
This is from the use of the ACPI_OFFSET in
arch/arm64/include/asm/acpi.h. Replace its use with offsetof from
include/linux/stddef.h which should implement the same logic using
__builtin_offsetof, so that UBSAN wont warn.
As recently reported, some platforms provide a list of power
resources for device power state D3hot, through the _PR3 object,
but they do not provide a list of power resources for device power
state D0.
Among other things, this causes acpi_device_get_power() to return
D3hot as the current state of the device in question if all of the
D3hot power resources are "on", because it sees the power_resources
flag set and calls acpi_power_get_inferred_state() which finds that
D3hot is the shallowest power state with all of the associated power
resources turned "on", so that's what it returns. Moreover, that
value takes precedence over the acpi_dev_pm_explicit_get() return
value, because it means a deeper power state. The device may very
well be in D0 physically at that point, however.
Moreover, the presence of _PR3 without _PR0 for a given device
means that only one D3-level power state can be supported by it.
Namely, because there are no power resources to turn "off" when
transitioning the device from D0 into D3cold (which should be
supported since _PR3 is present), the evaluation of _PS3 should
be sufficient to put it straight into D3cold, but this means that
the effect of turning "on" the _PR3 power resources is unclear,
so it is better to avoid doing that altogether. Consequently,
there is no practical way do distinguish D3cold from D3hot for
the device in question and the power states of it can be labeled
so that D3hot is the deepest supported one (and Linux assumes
that putting a device into D3hot via ACPI may cause power to be
removed from it anyway, for legacy reasons).
To work around the problem described above modify the ACPI
enumeration of devices so that power resources are only used
for device power management if the list of D0 power resources
is not empty and make it mart D3cold as supported only if that
is the case and the D3hot list of power resources is not empty
too.
Fixes: ef85bdbec444 ("ACPI / scan: Consolidate extraction of power resources lists") Link: https://bugzilla.kernel.org/show_bug.cgi?id=205057 Link: https://lore.kernel.org/linux-acpi/20200603194659.185757-1-hdegoede@redhat.com/ Reported-by: Hans de Goede <hdegoede@redhat.com> Tested-by: Hans de Goede <hdegoede@redhat.com> Tested-by: youling257@gmail.com Cc: 3.10+ <stable@vger.kernel.org> # 3.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Per the ACPI spec, interrupts in the range [0, 255] may be handled
in AML using individual methods whose naming is based on the format
_Exx or _Lxx, where xx is the hex representation of the interrupt
index.
Add support for this missing feature to our ACPI GED driver.
Cc: v4.9+ <stable@vger.kernel.org> # v4.9+ Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kobject_init_and_add() takes reference even when it fails.
If this function returns an error, kobject_put() must be called to
properly clean up the memory associated with the object. Previous
commit "b8eb718348b8" fixed a similar problem.
Fixes: 158c998ea44b ("ACPI / CPPC: add sysfs support to compute delivered performance") Signed-off-by: Qiushi Wu <wu000273@umn.edu> Cc: 4.10+ <stable@vger.kernel.org> # 4.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kobject_init_and_add() takes reference even when it fails.
Thus, when kobject_init_and_add() returns an error,
kobject_put() must be called to properly clean up the kobject.
Fixes: 3f8055c35836 ("ACPI / hotplug: Introduce user space interface for hotplug profiles") Signed-off-by: Qiushi Wu <wu000273@umn.edu> Cc: 3.10+ <stable@vger.kernel.org> # 3.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The HP Thunderbolt Dock has two separate USB devices, one is for speaker
and one is for headset. Add names for them so userspace can apply UCM
settings.
When a USB-audio interface gets runtime-suspended via auto-pm feature,
the driver suspends all functionality and increment
chip->num_suspended_intf. Later on, when the system gets suspended to
S3, the driver increments chip->num_suspended_intf again, skips the
device changes, and sets the card power state to
SNDRV_CTL_POWER_D3hot. In return, when the system gets resumed from
S3, the resume callback decrements chip->num_suspended_intf. Since
this refcount is still not zero (it's been runtime-suspended), the
whole resume is skipped. But there is a small pitfall here.
The problem is that the driver doesn't restore the card power state
after this resume call, leaving it as SNDRV_CTL_POWER_D3hot. So,
even after the system resume finishes, the card instance still appears
as if it were system-suspended, and this confuses many ioctl accesses
that are blocked unexpectedly.
In details, we have two issues behind the scene: one is that the card
power state is changed only when the refcount becomes zero, and
another is that the prior auto-suspend check is kept in a boolean
flag. Although the latter problem is almost negligible since the
auto-pm feature is imposed only on the primary interface, but this can
be a potential problem on the devices with multiple interfaces.
This patch addresses those issues by the following:
- Replace chip->autosuspended boolean flag with chip->system_suspend
counter
- At the first system-suspend, chip->num_suspended_intf is recorded to
chip->system_suspend
- At system-resume, the card power state is restored when the
chip->num_suspended_intf refcount reaches to chip->system_suspend,
i.e. the state returns to the auto-suspended
Also, the patch fixes yet another hidden problem by the code
refactoring along with the fixes above: namely, when some resume
procedure failed, the driver left chip->num_suspended_intf that was
already decreased, and it might lead to the refcount unbalance.
In the new code, the refcount decrement is done after the whole resume
procedure, and the problem is avoided as well.
Fixes: 0662292aec05 ("ALSA: usb-audio: Handle normal and auto-suspend equally") Reported-and-tested-by: Macpaul Lin <macpaul.lin@mediatek.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200603153709.6293-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add and use snd_pcm_stream_lock_nested() in snd_pcm_link/unlink
implementation. The code is fine, but generates a lockdep complaint:
============================================
WARNING: possible recursive locking detected
5.7.1mq+ #381 Tainted: G O
--------------------------------------------
pulseaudio/4180 is trying to acquire lock: ffff888402d6f508 (&group->lock){-...}-{2:2}, at: snd_pcm_common_ioctl+0xda8/0xee0 [snd_pcm]
but task is already holding lock: ffff8883f7a8cf18 (&group->lock){-...}-{2:2}, at: snd_pcm_common_ioctl+0xe4e/0xee0 [snd_pcm]
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&group->lock);
lock(&group->lock);
*** DEADLOCK ***
May be due to missing lock nesting notation
2 locks held by pulseaudio/4180:
#0: ffffffffa1a05190 (snd_pcm_link_rwsem){++++}-{3:3}, at: snd_pcm_common_ioctl+0xca0/0xee0 [snd_pcm]
#1: ffff8883f7a8cf18 (&group->lock){-...}-{2:2}, at: snd_pcm_common_ioctl+0xe4e/0xee0 [snd_pcm]
[...]
Prevent SNDRV_PCM_IOCTL_LINK linking stream to itself - the code
can't handle it. Fixed commit is not where bug was introduced, but
changes the context significantly.
A couple of Lenovo ThinkCentre machines all have 2 front mics and they
use the same codec alc623 and have the same pin config, so add a
pintbl entry for those machines to apply the fixup
ALC283_FIXUP_HEADSET_MIC.
In the latter models of RME Fireface series, device start to transfer
packets several dozens of milliseconds. On the other hand, ALSA fireface
driver starts IR context 2 milliseconds after the start. This results
in loss to handle incoming packets on the context.
This commit changes to start IR context immediately instead of
postponement. For Fireface 800, this affects nothing because the device
transfer packets 100 milliseconds or so after the start and this is
within wait timeout.
Some of tests in xfstests failed with cifsd kernel server since commit e80ddeb2f70e. cifsd kernel server validates credit charge from client
by calculating it base on max((InputCount + OutputCount) and
(MaxInputResponse + MaxOutputResponse)) according to specification.
MS-SMB2 specification describe credit charge calculation of smb2 ioctl :
If Connection.SupportsMultiCredit is TRUE, the server MUST validate
CreditCharge based on the maximum of (InputCount + OutputCount) and
(MaxInputResponse + MaxOutputResponse), as specified in section 3.3.5.2.5.
If the validation fails, it MUST fail the IOCTL request with
STATUS_INVALID_PARAMETER.
This patch add indatalen that can be a non-zero value to calculation of
credit charge in SMB2_ioctl_init().
Fixes: e80ddeb2f70e ("smb3: fix incorrect number of credits when ioctl
MaxOutputResponse > 64K") Cc: Stable <stable@vger.kernel.org> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Cc: Steve French <smfrench@gmail.com> Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We were not checking to see if ioctl requests asked for more than
64K (ie when CIFSMaxBufSize was > 64K) so when setting larger
CIFSMaxBufSize then ioctls would fail with invalid parameter errors.
When requests ask for more than 64K in MaxOutputResponse then we
need to ask for more than 1 credit.
Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The documentation provided by kobject_init_and_add() clearly spells out
the need to call kobject_put() on the kobject if an error is returned.
Add this missing call to the error path.