The buffer size is 2 Bytes and we expect to receive the same amount of
data. But sometimes we receive less data and run into uninit-was-stored
issue upon read. Hence modify the error check on the return value to match
with the buffer size as a prevention.
Reported-and-tested by: syzbot+a7e220df5a81d1ab400e@syzkaller.appspotmail.com Signed-off-by: Himadri Pandya <himadrispandya@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is a race between the assignment of `table->data` and write value
to the pointer of `table->data` in the __do_proc_doulongvec_minmax() on
the other thread.
Fix this by duplicating the `table`, and only update the duplicate of
it. And introduce a helper of proc_hugetlb_doulongvec_minmax() to
simplify the code.
The usage of "capture group (...)" in the immediate condition after `&&`
results in `$1` being uninitialized. This issues a warning "Use of
uninitialized value $1 in regexp compilation at ./scripts/checkpatch.pl
line 2638".
I noticed this bug while running checkpatch on the set of commits from
v5.7 to v5.8-rc1 of the kernel on the commits with a diff content in
their commit message.
This bug was introduced in the script by commit e518e9a59ec3
("checkpatch: emit an error when there's a diff in a changelog"). It
has been in the script since then.
The author intended to store the match made by capture group in variable
`$1`. This should have contained the name of the file as `[\w/]+`
matched. However, this couldn't be accomplished due to usage of capture
group and `$1` in the same regular expression.
Fix this by placing the capture group in the condition before `&&`.
Thus, `$1` can be initialized to the text that capture group matches
thereby setting it to the desired and required value.
Fixes: e518e9a59ec3 ("checkpatch: emit an error when there's a diff in a changelog") Signed-off-by: Mrinal Pandey <mrinalmni@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Reviewed-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Cc: Joe Perches <joe@perches.com> Link: https://lkml.kernel.org/r/20200714032352.f476hanaj2dlmiot@mrinalpandey Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AT instructions do a translation table walk and return the result, or
the fault in PAR_EL1. KVM uses these to find the IPA when the value is
not provided by the CPU in HPFAR_EL1.
If a translation table walk causes an external abort it is taken as an
exception, even if it was due to an AT instruction. (DDI0487F.a's D5.2.11
"Synchronous faults generated by address translation instructions")
While we previously made KVM resilient to exceptions taken due to AT
instructions, the device access causes mismatched attributes, and may
occur speculatively. Prevent this, by forbidding a walk through memory
described as device at stage2. Now such AT instructions will report a
stage2 fault.
Such a fault will cause KVM to restart the guest. If the AT instructions
always walk the page tables, but guest execution uses the translation cached
in the TLB, the guest can't make forward progress until the TLB entry is
evicted. This isn't a problem, as since commit 5dcd0fdbb492 ("KVM: arm64:
Defer guest entry when an asynchronous exception is pending"), KVM will
return to the host to process IRQs allowing the rest of the system to keep
running.
Cc: stable@vger.kernel.org # v4.14 Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
KVM doesn't expect any synchronous exceptions when executing, any such
exception leads to a panic(). AT instructions access the guest page
tables, and can cause a synchronous external abort to be taken.
The arm-arm is unclear on what should happen if the guest has configured
the hardware update of the access-flag, and a memory type in TCR_EL1 that
does not support atomic operations. B2.2.6 "Possible implementation
restrictions on using atomic instructions" from DDI0487F.a lists
synchronous external abort as a possible behaviour of atomic instructions
that target memory that isn't writeback cacheable, but the page table
walker may behave differently.
Make KVM robust to synchronous exceptions caused by AT instructions.
Add a get_user() style helper for AT instructions that returns -EFAULT
if an exception was generated.
While KVM's version of the exception table mixes synchronous and
asynchronous exceptions, only one of these can occur at each location.
Re-enter the guest when the AT instructions take an exception on the
assumption the guest will take the same exception. This isn't guaranteed
to make forward progress, as the AT instructions may always walk the page
tables, but guest execution may use the translation cached in the TLB.
This isn't a problem, as since commit 5dcd0fdbb492 ("KVM: arm64: Defer guest
entry when an asynchronous exception is pending"), KVM will return to the
host to process IRQs allowing the rest of the system to keep running.
Cc: stable@vger.kernel.org # v4.14 Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
SError that occur during world-switch's entry to the guest will be
accounted to the guest, as the exception is masked until we enter the
guest... but we want to attribute the SError as precisely as possible.
Reading DISR_EL1 before guest entry requires free registers, and using
ESB+DISR_EL1 to consume and read back the ESR would leave KVM holding
a host SError... We would rather leave the SError pending and let the
host take it once we exit world-switch. To do this, we need to defer
guest-entry if an SError is pending.
Read the ISR to see if SError (or an IRQ) is pending. If so fake an
exit. Place this check between __guest_enter()'s save of the host
registers, and restore of the guest's. SError that occur between
here and the eret into the guest must have affected the guest's
registers, which we can naturally attribute to the guest.
The dsb is needed to ensure any previous writes have been done before
we read ISR_EL1. On systems without the v8.2 RAS extensions this
doesn't give us anything as we can't contain errors, and the ESR bits
to describe the severity are all implementation-defined. Replace
this with a nop for these systems.
v4.14-backport: as this kernel version doesn't have the RAS support at
all, remove the RAS alternative.
Cc: stable@vger.kernel.org # v4.14 Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[ James: Removed v8.2 RAS related barriers ] Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
KVM has a one instruction window where it will allow an SError exception
to be consumed by the hypervisor without treating it as a hypervisor bug.
This is used to consume asynchronous external abort that were caused by
the guest.
As we are about to add another location that survives unexpected exceptions,
generalise this code to make it behave like the host's extable.
KVM's version has to be mapped to EL2 to be accessible on nVHE systems.
The SError vaxorcism code is a one instruction window, so has two entries
in the extable. Because the KVM code is copied for VHE and nVHE, we end up
with four entries, half of which correspond with code that isn't mapped.
Cc: stable@vger.kernel.org # v4.14 Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 52f23478081ae0 ("mm/slub.c: fix corrupted freechain in
deactivate_slab()") suffered an update when picked up from LKML [1].
Specifically, relocating 'freelist = NULL' into 'freelist_corrupted()'
created a no-op statement. Fix it by sticking to the behavior intended
in the original patch [1]. In addition, make freelist_corrupted()
immune to passing NULL instead of &freelist.
The issue has been spotted via static analysis and code review.
Maybe __create_persistent_data_objects() caller will use PTR_ERR as a
pointer, it will lead to some strange things.
Signed-off-by: Ye Bin <yebin10@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Maybe __create_persistent_data_objects() caller will use PTR_ERR as a
pointer, it will lead to some strange things.
Signed-off-by: Ye Bin <yebin10@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
All three generations of Sandisk SSDs lock up hard intermittently.
Experiments showed that disabling NCQ lowered the failure rate significantly
and the kernel has been disabling NCQ for some models of SD7's and 8's,
which is obviously undesirable.
Karthik worked with Sandisk to root cause the hard lockups to trim commands
larger than 128M. This patch implements ATA_HORKAGE_MAX_TRIM_128M which
limits max trim size to 128M and applies it to all three generations of
Sandisk SSDs.
It happens often while I'm preparing a patch for a block driver that
I'm wondering: is a definition of SECTOR_SIZE and/or SECTOR_SHIFT
available for this driver? Do I have to introduce definitions of these
constants before I can use these constants? To avoid this confusion,
move the existing definitions of SECTOR_SIZE and SECTOR_SHIFT into the
<linux/blkdev.h> header file such that these become available for all
block drivers. Make the SECTOR_SIZE definition in the uapi msdos_fs.h
header file conditional to avoid that including that header file after
<linux/blkdev.h> causes the compiler to complain about a SECTOR_SIZE
redefinition.
Note: the SECTOR_SIZE / SECTOR_SHIFT / SECTOR_BITS definitions have
not been removed from uapi header files nor from NAND drivers in
which these constants are used for another purpose than converting
block layer offsets and sizes into a number of sectors.
Cc: David S. Miller <davem@davemloft.net> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Block layer usually doesn't support or allow zero-length bvec. Since
commit 1bdc76aea115 ("iov_iter: use bvec iterator to implement
iterate_bvec()"), iterate_bvec() switches to bvec iterator. However,
Al mentioned that 'Zero-length segments are not disallowed' in iov_iter.
Fixes for_each_bvec() so that it can move on after seeing one zero
length bvec.
Fixes: 1bdc76aea115 ("iov_iter: use bvec iterator to implement iterate_bvec()") Reported-by: syzbot <syzbot+61acc40a49a3e46e25ea@syzkaller.appspotmail.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> Link: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2262077.html Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The basic permission bits (protection bits in AmigaOS) have been broken
in Linux' AFFS - it would only set bits, but never delete them.
Also, contrary to the documentation, the Archived bit was not handled.
Let's fix this for good, and set the bits such that Linux and classic
AmigaOS can coexist in the most peaceful manner.
Also, update the documentation to represent the current state of things.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Max Staudt <max@enpas.org> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Avid Adrenaline is reported that ALSA firewire-digi00x driver is bound to.
However, as long as he investigated, the design of this model is hardly
similar to the one of Digi 00x family. It's better to exclude the model
from modalias of ALSA firewire-digi00x driver.
This commit changes device entries so that the model is excluded.
When system is suspended with active audio playback to HDMI/DP, two
alternative sequences can happen at resume:
a) monitor is detected first and ALSA prepare follows normal
stream setup sequence, or
b) ALSA prepare is called first, but monitor is not yet detected,
so PCM is restarted without a pin,
In case of (b), on i915 systems, haswell_verify_D0() is not called at
resume and the pin power state may be incorrect. Result is lack of audio
after resume with no error reported back to user-space.
Fix the problem by always verifying converter and pin state in the
i915_pin_cvt_fixup().
The PCM OSS mulaw plugin has a check of the format of the counter part
whether it's a linear format. The check is with snd_BUG_ON() that
emits WARN_ON() when the debug config is set, and it confuses
syzkaller as if it were a serious issue. Let's drop snd_BUG_ON() for
avoiding that.
While we're at it, correct the error code to a more suitable, EINVAL.
snd_ca0106_spi_write() returns 1 on error, snd_ca0106_pcm_power_dac()
is returning the error code directly, and the caller is expecting an
negative error code
The problem is we're doing a copy_to_user() while holding tree locks,
which can deadlock if we have to do a page fault for the copy_to_user().
This exists even without my locking changes, so it needs to be fixed.
Rework the search ioctl to do the pre-fault and then
copy_to_user_nofault for the copying.
CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 3d7081822f7f ("uaccess: Add non-pagefault user-space read functions")
missed to add probe write function, therefore factor out a probe_write_common()
helper with most logic of probe_kernel_write() except setting KERNEL_DS, and
add a new probe_user_write() helper so it can be used from BPF side.
Again, on some archs, the user address space and kernel address space can
co-exist and be overlapping, so in such case, setting KERNEL_DS would mean
that the given address is treated as being in kernel address space.
Add probe_user_read(), strncpy_from_unsafe_user() and
strnlen_unsafe_user() which allows caller to access user-space
in IRQ context.
Current probe_kernel_read() and strncpy_from_unsafe() are
not available for user-space memory, because it sets
KERNEL_DS while accessing data. On some arch, user address
space and kernel address space can be co-exist, but others
can not. In that case, setting KERNEL_DS means given
address is treated as a kernel address space.
Also strnlen_user() is only available from user context since
it can sleep if pagefault is enabled.
To access user-space memory without pagefault, we need
these new functions which sets USER_DS while accessing
the data.
These are special extent buffers that get rewound in order to lookup
the state of the tree at a specific point in time. As such they do not
go through the normal initialization paths that set their lockdep class,
so handle them appropriately when they are created and before they are
locked.
CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When a rewound buffer is created it already has a ref count of 1 and the
dummy flag set. Then another ref is taken bumping the count to 2.
Finally when this buffer is released from btrfs_release_path the extra
reference is decremented by the special handling code in
free_extent_buffer.
However, this special code is in fact redundant sinca ref count of 1 is
still correct since the buffer is only accessed via btrfs_path struct.
This paves the way forward of removing the special handling in
free_extent_buffer.
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
get_old_root used used only by btrfs_search_old_slot to initialise the
path structure. The old root is always a cloned buffer (either via alloc
dummy or via btrfs_clone_extent_buffer) and its reference count is 2: 1
from allocation, 1 from extent_buffer_get call in get_old_root.
This latter explicit ref count acquire operation is in fact unnecessary
since the semantic is such that the newly allocated buffer is handed
over to the btrfs_path for lifetime management. Considering this just
remove the extra extent_buffer_get in get_old_root.
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This problem exists because we have two different rescan threads,
btrfs_uuid_scan_kthread which creates the uuid tree, and
btrfs_uuid_tree_iterate that goes through and updates or deletes any out
of date roots. The problem is they both do things in different order.
btrfs_uuid_scan_kthread() reads the tree_root, and then inserts entries
into the uuid_root. btrfs_uuid_tree_iterate() scans the uuid_root, but
then does a btrfs_get_fs_root() which can read from the tree_root.
It's actually easy enough to not be holding the path in
btrfs_uuid_scan_kthread() when we add a uuid entry, as we already drop
it further down and re-start the search when we loop. So simply move
the path release before we add our entry to the uuid tree.
This also fixes a problem where we're holding a path open after we do
btrfs_end_transaction(), which has it's own problems.
CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Otherwise gcc generates warnings if the expression is complicated.
Fixes: 312a0c170945 ("[PATCH] LOG2: Alter roundup_pow_of_two() so that it can use a ilog2() on a constant") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/0-v1-8a2697e3c003+41165-log_brackets_jgg@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
We can sometimes get bogus thermal shutdowns on omap4430 at least with
droid4 running idle with a battery charger connected:
thermal thermal_zone0: critical temperature reached (143 C), shutting down
Dumping out the register values shows we can occasionally get a 0x7f value
that is outside the TRM listed values in the ADC conversion table. And then
we get a normal value when reading again after that. Reading the register
multiple times does not seem help avoiding the bogus values as they stay
until the next sample is ready.
Looking at the TRM chapter "18.4.10.2.3 ADC Codes Versus Temperature", we
should have values from 13 to 107 listed with a total of 95 values. But
looking at the omap4430_adc_to_temp array, the values are off, and the
end values are missing. And it seems that the 4430 ADC table is similar
to omap3630 rather than omap4460.
Let's fix the issue by using values based on the omap3630 table and just
ignoring invalid values. Compared to the 4430 TRM, the omap3630 table has
the missing values added while the TRM table only shows every second
value.
Note that sometimes the ADC register values within the valid table can
also be way off for about 1 out of 10 values. But it seems that those
just show about 25 C too low values rather than too high values. So those
do not cause a bogus thermal shutdown.
Fixes: 1a31270e54d7 ("staging: omap-thermal: add OMAP4 data structures") Cc: Merlijn Wajer <merlijn@wizzup.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200706183338.25622-1-tony@atomide.com Signed-off-by: Sasha Levin <sashal@kernel.org>
The VT-d spec requires (10.4.4 Global Command Register, GCMD_REG General
Description) that:
If multiple control fields in this register need to be modified, software
must serialize the modifications through multiple writes to this register.
However, in irq_remapping.c, modifications of IRE and CFI are done in one
write. We need to do two separate writes with STS checking after each. It
also checks the status register before writing command register to avoid
unnecessary register write.
Fixes: af8d102f999a4 ("x86/intel/irq_remapping: Clean up x2apic opt-out security warning mess") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lore.kernel.org/r/20200828000615.8281-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
If tg3_reset_task() fails, the device state is left in an inconsistent
state with IFF_RUNNING still set but NAPI state not enabled. A
subsequent operation, such as ifdown or AER error can cause it to
soft lock up when it tries to disable NAPI state.
Fix it by bringing down the device to !IFF_RUNNING state when
tg3_reset_task() fails. tg3_reset_task() running from workqueue
will now call tg3_close() when the reset fails. We need to
modify tg3_reset_task_cancel() slightly to avoid tg3_close()
calling cancel_work_sync() to cancel tg3_reset_task(). Otherwise
cancel_work_sync() will wait forever for tg3_reset_task() to
finish.
Reported-by: David Christensen <drc@linux.vnet.ibm.com> Reported-by: Baptiste Covolato <baptiste@arista.com> Fixes: db2199737990 ("tg3: Schedule at most one tg3_reset_task run") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
epoll_loop_check_proc() can run into a file already committed to destruction;
we can't grab a reference on those and don't need to add them to the set for
reverse path check anyway.
Tested-by: Marc Zyngier <maz@kernel.org> Fixes: a9ed4a6560b8 ("epoll: Keep a reference on files added to the check list") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
On machines with much memory (> 2 TByte) and log_mtts_per_seg == 0, a
max_order of 31 will be passed to mlx_buddy_init(), which results in
s = BITS_TO_LONGS(1 << 31) becoming a negative value, leading to
kvmalloc_array() failure when it is converted to size_t.
mlx4_core 0000:b1:00.0: Failed to initialize memory region table, aborting
mlx4_core: probe of 0000:b1:00.0 failed with error -12
Fix this issue by changing the left shifting operand from a signed literal to
an unsigned one.
Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters") Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
perf_event.h has macros that define the field offsets in the data_src
bitmask in perf records. The SNOOPX and REMOTE offsets were both 37.
These are distinct fields, and the bitfield layout in perf_mem_data_src
confirms that SNOOPX should be at offset 38.
Committer notes:
This was extracted from a larger patch that also contained kernel
changes.
Fixes: 52839e653b5629bd ("perf tools: Add support for printing new mem_info encodings") Signed-off-by: Al Grant <al.grant@arm.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/9974f2d0-bf7f-518e-d9f7-4520e5ff1bb0@foss.arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
If firmware goes into unstable state, HWRM_NVM_GET_DIR_INFO firmware
command may return zero dir entries. Return error in such case to
avoid zero length dma buffer request.
Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.") Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
During a dump, this attribute is essential, it enables the userspace to
know on which interface the context is linked to.
Fixes: 459aa660eb1d ("gtp: add initial driver for datapath of GPRS Tunneling Protocol (GTP-U)") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Tested-by: Gabriel Ganne <gabriel.ganne@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Move the burst len fixup after setting the generic value for it. This
finally enables the fixup introduced by commit 137bd11090d8 ("dmaengine:
pl330: Align DMA memcpy operations to MFIFO width"), which otherwise was
overwritten by the generic value.
Reported-by: kernel test robot <lkp@intel.com> Fixes: 137bd11090d8 ("dmaengine: pl330: Align DMA memcpy operations to MFIFO width") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20200825064617.16193-1-m.szyprowski@samsung.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When devm_gpiod_get_optional() fails, bus should be
freed just like when of_mdiobus_register() fails.
Fixes: 1bddd96cba03d ("net: arc_emac: support the phy reset for emac driver") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
When this driver is built as a module, I cannot rmmod it after insmoding
it.
This is because that this driver calls ravb_mdio_init() at the time of
probe, and module->refcnt is incremented by alloc_mdio_bitbang() called
after that.
Therefore, even if ifup is not performed, the driver is in use and rmmod
cannot be performed.
$ lsmod
Module Size Used by
ravb 40960 1
$ rmmod ravb
rmmod: ERROR: Module ravb is in use
Call ravb_mdio_init() at open and free_mdio_bitbang() at close, thereby
rmmod is possible in the ifdown state.
Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Signed-off-by: Yuusuke Ashizuka <ashiduka@fujitsu.com> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
When devm_kcalloc() fails, dev should be freed just
like what we've done in the subsequent error paths.
Fixes: 7b78be48a8eb6 ("net: systemport: Dynamically allocate number of TX rings") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
hns_nic_dev_probe allocates ndev, but not free it on
two error handling paths, which may lead to memleak.
Fixes: 63434888aaf1b ("net: hns: net: hns: enet adds support of acpi") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Note the non-zero upper bits of the first element. It turns out that
nft_immediate doesn't zero the destination register, but this is needed
when the length isn't a multiple of 4.
Furthermore, the zeroing in nft_payload is broken. We can't use
[len / 4] = 0 -- if len is a multiple of 4, index is off by one.
Skip zeroing in this case and use a conditional instead of (len -1) / 4.
Fixes: 49499c3e6e18 ("netfilter: nf_tables: switch registers to 32 bit addressing") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Kernel sends an empty NFTA_SET_USERDATA attribute with no value if
userspace adds a set with no NFTA_SET_USERDATA attribute.
Fixes: e6d8ecac9e68 ("netfilter: nf_tables: Add new attributes into nft_set to store user data.") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The initialization done by bmips_cpu_setup() typically affects both
threads of a given core, on 7435 which supports 2 cores and 2 threads,
logical CPU number 2 and 3 would not run this initialization.
Fixes: 738a3f79027b ("MIPS: BMIPS: Add early CPU initialization code") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
When the BMIPS generic cpu-feature-overrides.h file was introduced,
cpu_has_inclusive_caches/MIPS_CPU_INCLUSIVE_CACHES was not set for
BMIPS5000 CPUs. Correct this when we have initialized the MIPS secondary
cache successfully.
batadv_bla_send_claim() gets called from worker thread context through
batadv_bla_periodic_work(), thus netif_rx_ni needs to be used in that
case. This fixes "NOHZ: local_softirq_pending 08" log messages seen
when batman-adv is enabled.
Fixes: 23721387c409 ("batman-adv: add basic bridge loop avoidance code") Signed-off-by: Jussi Kivilinna <jussi.kivilinna@haltian.com> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
The own OGM check is currently misplaced and can lead to the following
issues:
For one thing we might receive an aggregated OGM from a neighbor node
which has our own OGM in the first place. We would then not only skip
our own OGM but erroneously also any other, following OGM in the
aggregate.
For another, we might receive an OGM aggregate which has our own OGM in
a place other then the first one. Then we would wrongly not skip this
OGM, leading to populating the orginator and gateway table with ourself.
The gateway client code can try to optimize the delivery of DHCP packets to
avoid broadcasting them through the whole mesh. But also transmissions to
the client can be optimized by looking up the destination via the chaddr of
the DHCP packet.
But the chaddr is currently only done when chaddr is fully inside the
non-paged area of the skbuff. Otherwise it will not be initialized and the
unoptimized path should have been taken.
But the implementation didn't handle this correctly. It didn't retrieve the
correct chaddr but still tried to perform the TT lookup with this
uninitialized memory.
Reported-by: syzbot+ab16e463b903f5a37036@syzkaller.appspotmail.com Fixes: 6c413b1c22a2 ("batman-adv: send every DHCP packet as bat-unicast") Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Antonio Quartulli <a@unstable.cc> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
On some architectures (like ARM), virt_to_gfn cannot be used for
vmalloc'd memory because of its reliance on virt_to_phys. This patch
introduces a check for vmalloc'd addresses and obtains the PFN using
vmalloc_to_pfn in that case.
Since commit a21ee6055c30 ("lockdep: Change hardirq{s_enabled,_context}
to per-cpu variables") the lockdep code itself uses percpu variables. This
leads to recursions because the percpu macros are calling preempt_enable()
which might call trace_preempt_on().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Marco Elver <elver@google.com> Link: https://lkml.kernel.org/r/20200821085348.251340558@infradead.org Signed-off-by: Sasha Levin <sashal@kernel.org>
Leases don't currently work correctly on kcephfs, as they are not broken
when caps are revoked. They could eventually be implemented similarly to
how we did them in libcephfs, but for now don't allow them.
[ idryomov: no need for simple_nosetlease() in ceph_dir_fops and
ceph_snapdir_fops ]
clang static analysis reports this representative problem
applesmc.c:758:10: warning: 1st function call argument is an
uninitialized value
left = be16_to_cpu(*(__be16 *)(buffer + 6)) >> 2;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
buffer is filled by the earlier call
ret = applesmc_read_key(LIGHT_SENSOR_LEFT_KEY, ...
This problem is reported because a goto skips the status check.
Other similar problems use data from applesmc_read_key before checking
the status. So move the checks to before the use.
Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Henrik Rydberg <rydberg@bitmath.org> Link: https://lore.kernel.org/r/20200820131932.10590-1-trix@redhat.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Define shutdown callback for display drm driver,
so as to disable all the CRTCS when shutdown
notification is received by the driver.
This change will turn off the timing engine so
that no display transactions are requested
while mmu translations are getting disabled
during reboot sequence.
Signed-off-by: Krishna Manikandan <mkrishn@codeaurora.org>
Changes in v2:
- Remove NULL check from msm_pdev_shutdown (Stephen Boyd)
- Change commit text to reflect when this issue
was uncovered (Sai Prakash Ranjan)
Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Event modifiers are not mentioned in the perf record or perf stat
manpages. Add them to orient new users more effectively by pointing
them to the perf list manpage for details.
Fixes: 2055fdaf8703 ("perf list: Document precise event sampling for AMD IBS") Signed-off-by: Kim Phillips <kim.phillips@amd.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tony Jones <tonyj@suse.de> Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/20200901215853.276234-1-kim.phillips@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When calling into hid_map_usage(), the passed event code is
blindly stored as is, even if it doesn't fit in the associated bitmap.
This event code can come from a variety of sources, including devices
masquerading as input devices, only a bit more "programmable".
Instead of taking the event code at face value, check that it actually
fits the corresponding bitmap, and if it doesn't:
- spit out a warning so that we know which device is acting up
- NULLify the bitmap pointer so that we catch unexpected uses
Code paths that can make use of untrusted inputs can now check
that the mapping was indeed correct and bail out if not.
Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It appears that a ReportSize value of zero is legal, even if a bit
non-sensical. Most of the HID code seems to handle that gracefully,
except when computing the total size in bytes. When fed as input to
memset, this leads to some funky outcomes.
Detect the corner case and correctly compute the size.
Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The size of the buffers for storing context's and sessions can vary from
arch to arch as PAGE_SIZE can be anything between 4 kB and 256 kB (the
maximum for PPC64). Define a fixed buffer size set to 16 kB. This should be
enough for most use with three handles (that is how many we allow at the
moment). Parametrize the buffer size while doing this, so that it is easier
to revisit this later on if required.
Cc: stable@vger.kernel.org Reported-by: Stefan Berger <stefanb@linux.ibm.com> Fixes: 745b361e989a ("tpm: infrastructure for TPM spaces") Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Tested-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
With my new locking code dbench is so much faster that I tripped over a
transaction abort from ENOSPC. This turned out to be because
btrfs_del_dir_entries_in_log was checking for ret == -ENOSPC, but this
function sets err on error, and returns err. So instead of properly
marking the inode as needing a full commit, we were returning -ENOSPC
and aborting in __btrfs_unlink_inode. Fix this by checking the proper
variable so that we return the correct thing in the case of ENOSPC.
The ENOENT needs to be checked, because btrfs_lookup_dir_item_index()
can return -ENOENT if the dir item isn't in the tree log (which would
happen if we hadn't fsync'ed this guy). We actually handle that case in
__btrfs_unlink_inode, so it's an expected error to get back.
Fixes: 4a500fd178c8 ("Btrfs: Metadata ENOSPC handling for tree log") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com>
[ add note and comment about ENOENT ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The PSZ-HA* family of USB disk drives from Sony can't handle the
REPORT OPCODES command when using the UAS protocol. This patch adds
an appropriate quirks entry.
Reported-and-tested-by: Till Dörges <doerges@pre-sense.de> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> CC: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200826143229.GB400430@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cdc-acm.c:409:3: warning: Use of memory after it is freed
acm_process_notification(acm, (unsigned char *)dr);
There are three problems, the first one is that dr is not reset
The variable dr is set with
if (acm->nb_index)
dr = (struct usb_cdc_notification *)acm->notification_buffer;
But if the notification_buffer is too small it is resized with
if (acm->nb_size) {
kfree(acm->notification_buffer);
acm->nb_size = 0;
}
alloc_size = roundup_pow_of_two(expected_size);
/*
* kmalloc ensures a valid notification_buffer after a
* use of kfree in case the previous allocation was too
* small. Final freeing is done on disconnect.
*/
acm->notification_buffer =
kmalloc(alloc_size, GFP_ATOMIC);
dr should point to the new acm->notification_buffer.
The second problem is any data in the notification_buffer is lost
when the pointer is freed. In the normal case, the current data
is accumulated in the notification_buffer here.
Inadvertently the commit b1cd1b65afba ("USB: gadget: u_f: add overflow checks
to VLA macros") makes VLA macros to always return 0 due to different scope of
two variables of the same name. Obviously we need to have only one.
Fixes: b1cd1b65afba ("USB: gadget: u_f: add overflow checks to VLA macros") Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Brooke Basile <brookebasile@gmail.com> Cc: stable <stable@kernel.org> Link: https://lore.kernel.org/r/20200826192119.56450-1-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some values extracted by ncm_unwrap_ntb() could possibly lead to several
different out of bounds reads of memory. Specifically the values passed
to netdev_alloc_skb_ip_align() need to be checked so that memory is not
overflowed.
Resolve this by applying bounds checking to a number of different
indexes and lengths of the structure parsing logic.
Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com> Signed-off-by: Brooke Basile <brookebasile@gmail.com> Acked-by: Felipe Balbi <balbi@kernel.org> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
size can potentially hold an overflowed value if its assigned expression
is left unchecked, leading to a smaller than needed allocation when
vla_group_size() is used by callers to allocate memory.
To fix this, add a test for saturation before declaring variables and an
overflow check to (n) * sizeof(type).
If the expression results in overflow, vla_group_size() will return SIZE_MAX.
If the function platform_get_irq() failed, the negative value
returned will not be detected here. So fix error handling in
exynos_ohci_probe(). And when get irq failed, the function
platform_get_irq() logs an error message, so remove redundant
message here.
This device does not support UAS properly and a similar entry already
exists in drivers/usb/storage/unusual_uas.h. Without this patch,
storage_probe() defers the handling of this device to UAS, which cannot
handle it either.
Tested-by: Brice Goglin <brice.goglin@gmail.com> Fixes: bc3bdb12bbb3 ("usb-storage: Disable UAS on JMicron SATA enclosure") Acked-by: Alan Stern <stern@rowland.harvard.edu> CC: <stable@vger.kernel.org> Signed-off-by: Cyril Roelandt <tipecaml@gmail.com> Link: https://lore.kernel.org/r/20200825212231.46309-1-tipecaml@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There's another Raydium touchscreen needs the no-lpm quirk:
[ 1.339149] usb 1-9: New USB device found, idVendor=2386, idProduct=350e, bcdDevice= 0.00
[ 1.339150] usb 1-9: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 1.339151] usb 1-9: Product: Raydium Touch System
[ 1.339152] usb 1-9: Manufacturer: Raydium Corporation
...
[ 6.450497] usb 1-9: can't set config #1, error -110
PNY Pro Elite USB 3.1 Gen 2 device (SSD) doesn't respond to ATA_12
pass-through command (i.e. it just hangs). If it doesn't support this
command, it should respond properly to the host. Let's just add a quirk
to be able to move forward with other operations.
The syzbot fuzzer identified a bug in the yurex driver: It passes
GFP_KERNEL as a memory-allocation flag to usb_submit_urb() at a time
when its state is TASK_INTERRUPTIBLE, not TASK_RUNNING:
The values for "se_num" and "sh_num" come from the user in the ioctl.
They can be in the 0-255 range but if they're more than
AMDGPU_GFX_MAX_SE (4) or AMDGPU_GFX_MAX_SH_PER_SE (2) then it results in
an out of bounds read.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the primary firmware node pointer is removed from a
device (set to NULL) the secondary firmware node pointer,
when it exists, is made the primary node for the device.
However, the secondary firmware node pointer of the original
primary firmware node is never cleared (set to NULL).
To avoid situation where the secondary firmware node pointer
is pointing to a non-existing object, clearing it properly
when the primary node is removed from a device in
set_primary_fwnode().
Fixes: 97badf873ab6 ("device property: Make it possible to use secondary firmware nodes") Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It has been reported that system-wide suspend may be aborted in the
absence of any wakeup events due to unforseen interactions of it with
the runtume PM framework.
One failing scenario is when there are multiple devices sharing an
ACPI power resource and runtime-resume needs to be carried out for
one of them during system-wide suspend (for example, because it needs
to be reconfigured before the whole system goes to sleep). In that
case, the runtime-resume of that device involves turning the ACPI
power resource "on" which in turn causes runtime-resume requests
to be queued up for all of the other devices sharing it. Those
requests go to the runtime PM workqueue which is frozen during
system-wide suspend, so they are not actually taken care of until
the resume of the whole system, but the pm_runtime_barrier()
call in __device_suspend() sees them and triggers system wakeup
events for them which then cause the system-wide suspend to be
aborted if wakeup source objects are in active use.
Of course, the logic that leads to triggering those wakeup events is
questionable in the first place, because clearly there are cases in
which a pending runtime resume request for a device is not connected
to any real wakeup events in any way (like the one above). Moreover,
it is racy, because the device may be resuming already by the time
the pm_runtime_barrier() runs and so if the driver doesn't take care
of signaling the wakeup event as appropriate, it will be lost.
However, if the driver does take care of that, the extra
pm_wakeup_event() call in the core is redundant.
Accordingly, drop the conditional pm_wakeup_event() call fron
__device_suspend() and make the latter call pm_runtime_barrier()
alone. Also modify the comment next to that call to reflect the new
code and extend it to mention the need to avoid unwanted interactions
between runtime PM and system-wide device suspend callbacks.
Fixes: 1e2ef05bb8cf8 ("PM: Limit race conditions between runtime PM and system sleep (v2)") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Reported-by: Utkarsh H Patel <utkarsh.h.patel@intel.com> Tested-by: Utkarsh H Patel <utkarsh.h.patel@intel.com> Tested-by: Pengfei Xu <pengfei.xu@intel.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sometimes re-plugging a USB device during system sleep renders the device
useless:
[ 173.418345] xhci_hcd 0000:00:14.0: Get port status 2-4 read: 0x14203e2, return 0x10262
...
[ 176.496485] usb 2-4: Waited 2000ms for CONNECT
[ 176.496781] usb usb2-port4: status 0000.0262 after resume, -19
[ 176.497103] usb 2-4: can't resume, status -19
[ 176.497438] usb usb2-port4: logical disconnect
Because PLS equals to XDEV_RESUME, xHCI driver reports U3 to usbcore,
despite of CAS bit is flagged.
So proritize CAS over XDEV_RESUME to let usbcore handle warm-reset for
the port.
handler data is meant for interrupt handlers and not for storing irq chip
specific information as some devices require handler data to store internal
per interrupt information, e.g. pinctrl/GPIO chained interrupt handlers.
This obviously creates a conflict of interests and crashes the machine
because the XEN pointer is overwritten by the driver pointer.
As the XEN data is not handler specific it should be stored in
irqdesc::irq_data::chip_data instead.
A simple sed s/irq_[sg]et_handler_data/irq_[sg]et_chip_data/ cures that.
Cc: stable@vger.kernel.org Reported-by: Roman Shaposhnik <roman@zededa.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Roman Shaposhnik <roman@zededa.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/87lfi2yckt.fsf@nanos.tec.linutronix.de Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we are processing writeback for sync(2), move_expired_inodes()
didn't set any inode expiry value (older_than_this). This can result in
writeback never completing if there's steady stream of inodes added to
b_dirty_time list as writeback rechecks dirty lists after each writeback
round whether there's more work to be done. Fix the problem by using
sync(2) start time is inode expiry value when processing b_dirty_time
list similarly as for ordinarily dirtied inodes. This requires some
refactoring of older_than_this handling which simplifies the code
noticeably as a bonus.
Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option") CC: stable@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Inode's i_io_list list head is used to attach inode to several different
lists - wb->{b_dirty, b_dirty_time, b_io, b_more_io}. When flush worker
prepares a list of inodes to writeback e.g. for sync(2), it moves inodes
to b_io list. Thus it is critical for sync(2) data integrity guarantees
that inode is not requeued to any other writeback list when inode is
queued for processing by flush worker. That's the reason why
writeback_single_inode() does not touch i_io_list (unless the inode is
completely clean) and why __mark_inode_dirty() does not touch i_io_list
if I_SYNC flag is set.
However there are two flaws in the current logic:
1) When inode has only I_DIRTY_TIME set but it is already queued in b_io
list due to sync(2), concurrent __mark_inode_dirty(inode, I_DIRTY_SYNC)
can still move inode back to b_dirty list resulting in skipping
writeback of inode time stamps during sync(2).
2) When inode is on b_dirty_time list and writeback_single_inode() races
with __mark_inode_dirty() like:
writeback_single_inode() __mark_inode_dirty(inode, I_DIRTY_PAGES)
inode->i_state |= I_SYNC
__writeback_single_inode()
inode->i_state |= I_DIRTY_PAGES;
if (inode->i_state & I_SYNC)
bail
if (!(inode->i_state & I_DIRTY_ALL))
- not true so nothing done
We end up with I_DIRTY_PAGES inode on b_dirty_time list and thus
standard background writeback will not writeback this inode leading to
possible dirty throttling stalls etc. (thanks to Martijn Coenen for this
analysis).
Fix these problems by tracking whether inode is queued in b_io or
b_more_io lists in a new I_SYNC_QUEUED flag. When this flag is set, we
know flush worker has queued inode and we should not touch i_io_list.
On the other hand we also know that once flush worker is done with the
inode it will requeue the inode to appropriate dirty list. When
I_SYNC_QUEUED is not set, __mark_inode_dirty() can (and must) move inode
to appropriate dirty list.
Reported-by: Martijn Coenen <maco@android.com> Reviewed-by: Martijn Coenen <maco@android.com> Tested-by: Martijn Coenen <maco@android.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option") CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, operations on inode->i_io_list are protected by
wb->list_lock. In the following patches we'll need to maintain
consistency between inode->i_state and inode->i_io_list so change the
code so that inode->i_lock protects also all inode's i_io_list handling.
Reviewed-by: Martijn Coenen <maco@android.com> Reviewed-by: Christoph Hellwig <hch@lst.de> CC: stable@vger.kernel.org # Prerequisite for "writeback: Avoid skipping inode writeback" Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We have a number of "uart.port->desc.lock vs desc.lock->uart.port"
lockdep reports coming from 8250 driver; this causes a bit of trouble
to people, so let's fix it.
The problem is reverse lock order in two different call paths:
Fix this by changing the order of locks in serial8250_do_startup():
do disable_irq_nosync() first, which grabs desc->lock, and grab
uart->port after that, so that chain #1 and chain #2 have same lock
order.
Full lockdep splat:
======================================================
WARNING: possible circular locking dependency detected
5.4.39 #55 Not tainted
======================================================
swapper/0/0 is trying to acquire lock: ffffffffab65b6c0 (console_owner){-...}, at: console_lock_spinning_enable+0x31/0x57
but task is already holding lock: ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at: __report_bad_irq+0x5b/0xba
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
If the number of ports a card has is not explicitly specified, it defaults
to the rightmost 4 bits of the PCI device ID. This is prone to error since
not all PCI device IDs contain a number which corresponds to the number of
ports that card provides.
This particular case involves COMMTECH_4222PCIE, COMMTECH_4224PCIE and
COMMTECH_4228PCIE cards with device IDs 0x0022, 0x0020 and 0x0021.
Currently the multiport cards receive 2, 0 and 1 port instead of 2, 4 and
8 ports respectively.
To fix this, each Commtech Fastcom PCIe card is given a struct where the
number of ports is explicitly specified. This ensures 'board->num_ports'
is used instead of the default 'pcidev->device & 0x0f'.
Fixes: d0aeaa83f0b0 ("serial: exar: split out the exar code from 8250_pci") Signed-off-by: Valmer Huhn <valmer.huhn@concurrent-rt.com> Tested-by: Valmer Huhn <valmer.huhn@concurrent-rt.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200813165255.GC345440@icarus.concurrent-rt.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
pl011_probe() calls pl011_setup_port() to reserve an amba_ports[] entry,
then calls pl011_register_port() to register the uart driver with the
tty layer.
If registration of the uart driver fails, the amba_ports[] entry is not
released. If this happens 14 times (value of UART_NR macro), then all
amba_ports[] entries will have been leaked and driver probing is no
longer possible. (To be fair, that can only happen if the DeviceTree
doesn't contain alias IDs since they cause the same entry to be used for
a given port.) Fix it.
If probing of a pl011 gets deferred until after free_initmem(), an oops
ensues because pl011_console_match() is called which has been freed.
Fix by removing the __init attribute from the function and those it
calls.
Commit 10879ae5f12e ("serial: pl011: add console matching function")
introduced pl011_console_match() not just for early consoles but
regular preferred consoles, such as those added by acpi_parse_spcr().
Regular consoles may be registered after free_initmem() for various
reasons, one being deferred probing, another being dynamic enablement
of serial ports using a DeviceTree overlay.
Thus, pl011_console_match() must not be declared __init and the
functions it calls mustn't either.
Stack trace for posterity:
Unable to handle kernel paging request at virtual address 80c38b58
Internal error: Oops: 8000000d [#1] PREEMPT SMP ARM
PC is at pl011_console_match+0x0/0xfc
LR is at register_console+0x150/0x468
[<80187004>] (register_console)
[<805a8184>] (uart_add_one_port)
[<805b2b68>] (pl011_register_port)
[<805b3ce4>] (pl011_probe)
[<80569214>] (amba_probe)
[<805ca088>] (really_probe)
[<805ca2ec>] (driver_probe_device)
[<805ca5b0>] (__device_attach_driver)
[<805c8060>] (bus_for_each_drv)
[<805c9dfc>] (__device_attach)
[<805ca630>] (device_initial_probe)
[<805c90a8>] (bus_probe_device)
[<805c95a8>] (deferred_probe_work_func)
Fixes: 10879ae5f12e ("serial: pl011: add console matching function") Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v4.10+ Cc: Aleksey Makarov <amakarov@marvell.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Christopher Covington <cov@codeaurora.org> Link: https://lore.kernel.org/r/f827ff09da55b8c57d316a1b008a137677b58921.1597315557.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In few older Samsung SoCs like s3c2410, s3c2412
and s3c2440, UART IP is having 2 interrupt lines.
However, in other SoCs like s3c6400, s5pv210,
exynos5433, and exynos4210 UART is having only 1
interrupt line. Due to this, "platform_get_irq(platdev, 1)"
call in the driver gives the following false-positive error:
"IRQ index 1 not found" on newer SoC's.
This patch adds the condition to check for Tx interrupt
only for the those SoC's which have 2 interrupt lines.
Tested-by: Alim Akhtar <alim.akhtar@samsung.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Signed-off-by: Tamseel Shams <m.shams@samsung.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200810030021.45348-1-m.shams@samsung.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
vc_resize() can return with an error after failure. Change VT_RESIZEX ioctl
to save struct vc_data values that are modified and restore the original
values in case of error.
syzbot is reporting UAF bug in set_origin() from vc_do_resize() [1], for
vc_do_resize() calls kfree(vc->vc_screenbuf) before calling set_origin().
Unfortunately, in set_origin(), vc->vc_sw->con_set_origin() might access
vc->vc_pos when scroll is involved in order to manipulate cursor, but
vc->vc_pos refers already released vc->vc_screenbuf until vc->vc_pos gets
updated based on the result of vc->vc_sw->con_set_origin().
Preserving old buffer and tolerating outdated vc members until set_origin()
completes would be easier than preventing vc->vc_sw->con_set_origin() from
accessing outdated vc members.
lvs_rh_probe() can return some nonnegative value from usb_control_msg()
when it is less than "USB_DT_HUB_NONVAR_SIZE + 2" that is considered as
a failure. Make lvs_rh_probe() return -EINVAL in this case.
Found by Linux Driver Verification project (linuxtesting.org).
Add a check to fbcon_resize() to ensure that a possible change to user font
height or user font width will not allow a font data out-of-bounds access.
NOTE: must use original charcount in calculation as font charcount can
change and cannot be used to determine the font data allocated size.
If a transaction aborts it can cause a memory leak of the pages array of
a block group's io_ctl structure. The following steps explain how that can
happen:
1) Transaction N is committing, currently in state TRANS_STATE_UNBLOCKED
and it's about to start writing out dirty extent buffers;
2) Transaction N + 1 already started and another task, task A, just called
btrfs_commit_transaction() on it;
3) Block group B was dirtied (extents allocated from it) by transaction
N + 1, so when task A calls btrfs_start_dirty_block_groups(), at the
very beginning of the transaction commit, it starts writeback for the
block group's space cache by calling btrfs_write_out_cache(), which
allocates the pages array for the block group's io_ctl with a call to
io_ctl_init(). Block group A is added to the io_list of transaction
N + 1 by btrfs_start_dirty_block_groups();
4) While transaction N's commit is writing out the extent buffers, it gets
an IO error and aborts transaction N, also setting the file system to
RO mode;
5) Task A has already returned from btrfs_start_dirty_block_groups(), is at
btrfs_commit_transaction() and has set transaction N + 1 state to
TRANS_STATE_COMMIT_START. Immediately after that it checks that the
filesystem was turned to RO mode, due to transaction N's abort, and
jumps to the "cleanup_transaction" label. After that we end up at
btrfs_cleanup_one_transaction() which calls btrfs_cleanup_dirty_bgs().
That helper finds block group B in the transaction's io_list but it
never releases the pages array of the block group's io_ctl, resulting in
a memory leak.
In fact at the point when we are at btrfs_cleanup_dirty_bgs(), the pages
array points to pages that were already released by us at
__btrfs_write_out_cache() through the call to io_ctl_drop_pages(). We end
up freeing the pages array only after waiting for the ordered extent to
complete through btrfs_wait_cache_io(), which calls io_ctl_free() to do
that. But in the transaction abort case we don't wait for the space cache's
ordered extent to complete through a call to btrfs_wait_cache_io(), so
that's why we end up with a memory leak - we wait for the ordered extent
to complete indirectly by shutting down the work queues and waiting for
any jobs in them to complete before returning from close_ctree().
We can solve the leak simply by freeing the pages array right after
releasing the pages (with the call to io_ctl_drop_pages()) at
__btrfs_write_out_cache(), since we will never use it anymore after that
and the pages array points to already released pages at that point, which
is currently not a problem since no one will use it after that, but not a
good practice anyway since it can easily lead to use-after-free issues.
So fix this by freeing the pages array right after releasing the pages at
__btrfs_write_out_cache().
This issue can often be reproduced with test case generic/475 from fstests
and kmemleak can detect it and reports it with the following trace:
Before this commit i2c_hid_parse() consists of the following steps:
1. Send power on cmd
2. usleep_range(1000, 5000)
3. Send reset cmd
4. Wait for reset to complete (device interrupt, or msleep(100))
5. Send power on cmd
6. Try to read HID descriptor
Notice how there is an usleep_range(1000, 5000) after the first power-on
command, but not after the second power-on command.
Testing has shown that at least on the BMAX Y13 laptop's i2c-hid touchpad,
not having a delay after the second power-on command causes the HID
descriptor to read as all zeros.
In case we hit this on other devices too, the descriptor being all zeros
can be recognized by the following message being logged many, many times:
hid-generic 0018:0911:5288.0002: unknown main item tag 0x0
At the same time as the BMAX Y13's touchpad issue was debugged,
Kai-Heng was working on debugging some issues with Goodix i2c-hid
touchpads. It turns out that these need a delay after a PWR_ON command
too, otherwise they stop working after a suspend/resume cycle.
According to Goodix a delay of minimal 60ms is needed.
Having multiple cases where we need a delay after sending the power-on
command, seems to indicate that we should always sleep after the power-on
command.
This commit fixes the mentioned issues by moving the existing 1ms sleep to
the i2c_hid_set_power() function and changing it to a 60ms sleep.
Cc: stable@vger.kernel.org BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208247 Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Reported-and-tested-by: Andrea Borgia <andrea@borgia.bo.it> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>