The rsvp classifier has served us well for about a quarter of a century but has
has not been getting much maintenance attention due to lack of known users.
The offset is just 32bits here so this can potentially overflow if
somebody specifies a large value. Instead reduce the size to calculate
the last possible offset.
The error handling path incorrectly drops the reference to the user
fence BO resulting in potential reference count underflow.
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
v7.2 controller has different ECC level field size and shift in the acc
control register than its predecessor and successor controller. It needs
to be set specifically.
With the configuration PAGE_SIZE 64k and filesystem blocksize 64k,
a problem occurred when more than 13 million files were directly created
under a directory:
EXT4-fs error (device xx): ext4_dx_csum_set:492: inode #xxxx: comm xxxxx: dir seems corrupt? Run e2fsck -D.
EXT4-fs error (device xx): ext4_dx_csum_verify:463: inode #xxxx: comm xxxxx: dir seems corrupt? Run e2fsck -D.
EXT4-fs error (device xx): dx_probe:856: inode #xxxx: block 8188: comm xxxxx: Directory index failed checksum
When enough files are created, the fake_dirent->reclen will be 0xffff.
it doesn't equal to the blocksize 65536, i.e. 0x10000.
But it is not the same condition when blocksize equals to 4k.
when enough files are created, the fake_dirent->reclen will be 0x1000.
it equals to the blocksize 4k, i.e. 0x1000.
The problem seems to be related to the limitation of the 16-bit field
when the blocksize is set to 64k.
To address this, helpers like ext4_rec_len_{from,to}_disk has already
been introduced to complete the conversion between the encoded and the
plain form of rec_len.
So fix this one by using the helper, and all the other in this file too.
Cc: stable@kernel.org Fixes: dbe89444042a ("ext4: Calculate and verify checksums for htree nodes") Suggested-by: Andreas Dilger <adilger@dilger.ca> Suggested-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Shida Zhang <zhangshida@kylinos.cn> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Link: https://lore.kernel.org/r/20230803060938.1929759-1-zhangshida@kylinos.cn Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The following processes run into a deadlock. CPU 41 was waiting for CPU 29
to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29
was hung by that spinlock with IRQs disabled.
The lock is used to synchronize different sysfs operations, it doesn't
protect any resource that will be touched by an interrupt. Consequently
it's not required to disable IRQs. Replace the spinlock with a mutex to fix
the deadlock.
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Link: https://lore.kernel.org/r/20230828221018.19471-1-junxiao.bi@oracle.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reset the i2c controller when an i2c transfer timeout occurs.
The remaining interrupts and device should be reset to avoid
unpredictable controller behavior.
Fixes: 2e57b7cebb98 ("i2c: aspeed: Add multi-master use case support") Cc: <stable@vger.kernel.org> # v5.1+ Signed-off-by: Tommy Huang <tommy_huang@aspeedtech.com> Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function tracefs_create_dir() was missing a lockdown check and was
called by the RV code. This gave an inconsistent behavior of this function
returning success while other tracefs functions failed. This caused the
inode being freed by the wrong kmem_cache.
Link: https://lkml.kernel.org/r/20230905182711.692687042@goodmis.org Link: https://lore.kernel.org/all/202309050916.58201dc6-oliver.sang@intel.com/ Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ajay Kaher <akaher@vmware.com> Cc: Ching-lin Yu <chinglinyu@google.com> Fixes: bf8e602186ec4 ("tracing: Do not create tracefs files if tracefs lockdown is in effect") Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The option files update the options for a given trace array. For an
instance, if the file is opened and the instance is deleted, reading or
writing to the file will cause a use after free.
Up the ref count of the trace_array when an option file is opened.
The current_trace updates the trace array tracer. For an instance, if the
file is opened and the instance is deleted, reading or writing to the file
will cause a use after free.
Up the ref count of the trace array when current_trace is opened.
When running delayed items we are holding a delayed node's mutex and then
we will attempt to modify a subvolume btree to insert/update/delete the
delayed items. However if have an error during the insertions for example,
btrfs_insert_delayed_items() may return with a path that has locked extent
buffers (a leaf at the very least), and then we attempt to release the
delayed node at __btrfs_run_delayed_items(), which requires taking the
delayed node's mutex, causing an ABBA type of deadlock. This was reported
by syzbot and the lockdep splat is the following:
WARNING: possible circular locking dependency detected 6.5.0-rc7-syzkaller-00024-g93f5de5f648d #0 Not tainted
------------------------------------------------------
syz-executor.2/13257 is trying to acquire lock: ffff88801835c0c0 (&delayed_node->mutex){+.+.}-{3:3}, at: __btrfs_release_delayed_node+0x9a/0xaa0 fs/btrfs/delayed-inode.c:256
but task is already holding lock: ffff88802a5ab8e8 (btrfs-tree-00){++++}-{3:3}, at: __btrfs_tree_lock+0x3c/0x2a0 fs/btrfs/locking.c:198
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
Changing the mode of symlinks is meaningless as the vfs doesn't take the
mode of a symlink into account during path lookup permission checking.
However, the vfs doesn't block mode changes on symlinks. This however,
has lead to an untenable mess roughly classifiable into the following
two categories:
(1) Filesystems that don't implement a i_op->setattr() for symlinks.
Such filesystems may or may not know that without i_op->setattr()
defined, notify_change() falls back to simple_setattr() causing the
inode's mode in the inode cache to be changed.
That's a generic issue as this will affect all non-size changing
inode attributes including ownership changes.
Example: afs
(2) Filesystems that fail with EOPNOTSUPP but change the mode of the
symlink nonetheless.
Some filesystems will happily update the mode of a symlink but still
return EOPNOTSUPP. This is the biggest source of confusion for
userspace.
The EOPNOTSUPP in this case comes from POSIX ACLs. Specifically it
comes from filesystems that call posix_acl_chmod(), e.g., btrfs via
But for most major filesystems with POSIX ACL support such as btrfs,
ext4, ceph, tmpfs, xfs and others this will fail with EOPNOTSUPP with
the mode still updated due to the aforementioned posix_acl_chmod()
nonsense.
So, given that for all major filesystems this would fail with EOPNOTSUPP
and that both glibc (cf. [1]) and musl (cf. [2]) outright block mode
changes on symlinks we should just try and block mode changes on
symlinks directly in the vfs and have a clean break with this nonsense.
If this causes any regressions, we do the next best thing and fix up all
filesystems that do return EOPNOTSUPP with the mode updated to not call
posix_acl_chmod() on symlinks.
But as usual, let's try the clean cut solution first. It's a simple
patch that can be easily reverted. Not marking this for backport as I'll
do that manually if we're reasonably sure that this works and there are
no strong objections.
We could block this in chmod_common() but it's more appropriate to do it
notify_change() as it will also mean that we catch filesystems that
change symlink permissions explicitly or accidently.
Similar proposals were floated in the past as in [3] and [4] and again
recently in [5]. There's also a couple of bugs about this inconsistency
as in [6] and [7].
Fix to unmount the tracefs if the ftracetest mounted it for recovering
system environment. If the tracefs is already mounted, this does nothing.
Suggested-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/all/29fce076-746c-4650-8358-b4e0fa215cf7@sirena.org.uk/ Fixes: cbd965bde74c ("ftrace/selftests: Return the skip code when tracing directory not configured in kernel") Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The function btrfs_validate_super() should verify the metadata_uuid in
the provided superblock argument. Because, all its callers expect it to
do that.
In some cases, we need to read the FSID from the superblock when the
metadata_uuid is not set, and otherwise, read the metadata_uuid. So,
add a helper.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Stable-dep-of: 6bfe3959b0e7 ("btrfs: compare the correct fsid/metadata_uuid in btrfs_validate_super") Signed-off-by: Sasha Levin <sashal@kernel.org>
According to the description in Documentation/core-api/kobject.rst:
- A ktype is the type of object that embeds a kobject. Every structure
that embeds a kobject needs a corresponding ktype.
So add sanity check to make sure kset->kobj.ktype is not NULL.
The function lio_target_nacl_info_show() uses sprintf() in a loop to print
details for every iSCSI connection in a session without checking for the
buffer length. With enough iSCSI connections it's possible to overflow the
buffer provided by configfs and corrupt the memory.
This patch replaces sprintf() with sysfs_emit_at() that checks for buffer
boundries.
In az6007_i2c_xfer, msg is controlled by user. When msg[i].buf
is null and msg[i].len is zero, former checks on msg[i].buf would be
passed. Malicious data finally reach az6007_i2c_xfer. If accessing
msg[i].buf[0] without sanity check, null ptr deref would happen.
We add check on msg[i].len to prevent crash.
Similar commit:
commit 0ed554fd769a
("media: dvb-usb: az6027: fix null-ptr-deref in az6027_i2c_xfer()")
In anysee_master_xfer, msg is controlled by user. When msg[i].buf
is null and msg[i].len is zero, former checks on msg[i].buf would be
passed. Malicious data finally reach anysee_master_xfer. If accessing
msg[i].buf[0] without sanity check, null ptr deref would happen.
We add check on msg[i].len to prevent crash.
Similar commit:
commit 0ed554fd769a
("media: dvb-usb: az6027: fix null-ptr-deref in az6027_i2c_xfer()")
Signed-off-by: Zhang Shurong <zhang_shurong@foxmail.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
[hverkuil: add spaces around +] Signed-off-by: Sasha Levin <sashal@kernel.org>
In af9005_i2c_xfer, msg is controlled by user. When msg[i].buf
is null and msg[i].len is zero, former checks on msg[i].buf would be
passed. Malicious data finally reach af9005_i2c_xfer. If accessing
msg[i].buf[0] without sanity check, null ptr deref would happen.
We add check on msg[i].len to prevent crash.
Similar commit:
commit 0ed554fd769a
("media: dvb-usb: az6027: fix null-ptr-deref in az6027_i2c_xfer()")
In dw2102_i2c_transfer, msg is controlled by user. When msg[i].buf
is null and msg[i].len is zero, former checks on msg[i].buf would be
passed. Malicious data finally reach dw2102_i2c_transfer. If accessing
msg[i].buf[0] without sanity check, null ptr deref would happen.
We add check on msg[i].len to prevent crash.
Similar commit:
commit 950e252cb469
("[media] dw2102: limit messages to buffer size")
In af9035_i2c_master_xfer, msg is controlled by user. When msg[i].buf
is null and msg[i].len is zero, former checks on msg[i].buf would be
passed. Malicious data finally reach af9035_i2c_master_xfer. If accessing
msg[i].buf[0] without sanity check, null ptr deref would happen.
We add check on msg[i].len to prevent crash.
Similar commit:
commit 0ed554fd769a
("media: dvb-usb: az6027: fix null-ptr-deref in az6027_i2c_xfer()")
Signed-off-by: Zhang Shurong <zhang_shurong@foxmail.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Sasha Levin <sashal@kernel.org>
[ moved variable declaration to fix build issues in older kernels - gregkh ] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If device_register() returns error in ibmebus_bus_init(), name of kobject
which is allocated in dev_set_name() called in device_add() is leaked.
As comment of device_add() says, it should call put_device() to drop
the reference count that was set in device_initialize() when it fails,
so the name can be freed in kobject_cleanup().
BUG: KASAN: double-free in slab_free mm/slub.c:3661 [inline]
BUG: KASAN: double-free in __kmem_cache_free+0x71/0x110 mm/slub.c:3674
Free of addr ffff88806f410000 by task syz-executor131/3632
JFS_IP(ipimap)->i_imap is not setting to NULL after free in diUnmount.
If jfs_remount() free JFS_IP(ipimap)->i_imap but then failed at diMount().
JFS_IP(ipimap)->i_imap will be freed once again.
Fix this problem by setting JFS_IP(ipimap)->i_imap to NULL after free.
Reported-by: syzbot+90a11e6b1e810785c6ff@syzkaller.appspotmail.com Signed-off-by: Liu Shixin <liushixin2@huawei.com> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
JFS_SBI(ipbmap->i_sb)->bmap wasn't set to NULL after kfree() in
dbUnmount().
Syzkaller uses faultinject to reproduce this KASAN double-free
warning. The issue is triggered if either diMount() or dbMount() fail
in jfs_remount(), since diUnmount() or dbUnmount() already happened in
such a case - they will do double-free on next execution: jfs_umount
or jfs_remount.
Tested on both upstream and jfs-next by syzkaller.
Reported-and-tested-by: syzbot+6a93efb725385bc4b2e9@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/000000000000471f2d05f1ce8bad@google.com/T/ Link: https://syzkaller.appspot.com/bug?extid=6a93efb725385bc4b2e9 Signed-off-by: Andrew Kanner <andrew.kanner@gmail.com> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
I run a small server that uses external hard drives for backups. The
backup software I use uses ext2 filesystems with 4KiB block size and
the server is running SELinux and therefore relies on xattr. I recently
upgraded the hard drives from 4TB to 12TB models. I noticed that after
transferring some TBs I got a filesystem error "Freeing blocks not in
datazone - block = 18446744071529317386, count = 1" and the backup
process stopped. Trying to fix the fs with e2fsck resulted in a
completely corrupted fs. The error probably came from ext2_free_blocks(),
and because of the large number 18e19 this problem immediately looked
like some kind of integer overflow. Whereas the 4TB fs was about 1e9
blocks, the new 12TB is about 3e9 blocks. So, searching the ext2 code,
I came across the line in fs/ext2/xattr.c:745 where ext2_new_block()
is called and the resulting block number is stored in the variable block
as an int datatype. If a block with a block number greater than
INT32_MAX is returned, this variable overflows and the call to
sb_getblk() at line fs/ext2/xattr.c:750 fails, then the call to
ext2_free_blocks() produces the error.
Signed-off-by: Georg Ottinger <g.ottinger@gmx.at> Signed-off-by: Jan Kara <jack@suse.cz>
Message-Id: <20230815100340.22121-1-g.ottinger@gmx.at> Signed-off-by: Sasha Levin <sashal@kernel.org>
If rddev->raid_disk is greater than mddev->raid_disks, there will be
an out-of-bounds in raid1_remove_disk(). We have already found
similar reports as follows:
Enable the uart quirks similar to the earlier SoCs. Let's assume we are
likely going to need a k3 specific quirk mask separate from the earlier
SoCs, so let's not start changing the revision register mask at this point.
Note that SYSC_QUIRK_LEGACY_IDLE will be needed until we can remove the
need for pm_runtime_irq_safe() from 8250_omap driver.
The variable crtc->state->event is often protected by the lock
crtc->dev->event_lock when is accessed. However, it is accessed as a
condition of an if statement in exynos_drm_crtc_atomic_disable() without
holding the lock:
if (crtc->state->event && !crtc->state->active)
However, if crtc->state->event is changed to NULL by another thread right
after the conditions of the if statement is checked to be true, a
null-pointer dereference can occur in drm_crtc_send_vblank_event():
e->pipe = pipe;
To fix this possible null-pointer dereference caused by data race, the
spin lock coverage is extended to protect the if statement as well as the
function call to drm_crtc_send_vblank_event().
Reported-by: BassCheck <bass@buaa.edu.cn> Link: https://sites.google.com/view/basscheck/home Signed-off-by: Tuo Li <islituo@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Added relevant link. Signed-off-by: Inki Dae <inki.dae@samsung.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
While technically some control frames like ACK are shorter and
end after Address 1, such frames shouldn't be forwarded through
wmediumd or similar userspace, so require the full 3-address
header to avoid accessing invalid memory if shorter frames are
passed in.
Reported-by: syzbot+b2645b5bf1512b81fa22@syzkaller.appspotmail.com Reviewed-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The following message shows up when compiling with W=1:
In function ‘fortify_memcpy_chk’,
inlined from ‘alx_get_ethtool_stats’ at drivers/net/ethernet/atheros/alx/ethtool.c:297:2:
./include/linux/fortify-string.h:592:4: error: call to ‘__read_overflow2_field’
declared with attribute warning: detected read beyond size of field (2nd parameter);
maybe use struct_group()? [-Werror=attribute-warning]
592 | __read_overflow2_field(q_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In order to get alx stats altogether, alx_get_ethtool_stats() reads
beyond hw->stats.rx_ok. Fix this warning by directly copying hw->stats,
and refactor the unnecessarily complicated BUILD_BUG_ON btw.
Errata ERR010450 only shows up if voltage is 1.8V, but if the device is
supplied by 3v3 the errata can be ignored. So let's check for if quirk
SDHCI_QUIRK2_NO_1_8_V is defined or not before limiting the frequency.
Cc: Jim Reinhart <jimr@tekvox.com> Cc: James Autry <jautry@tekvox.com> Cc: Matthew Maron <matthewm@tekvox.com> Signed-off-by: Giulio Benetti <giulio.benetti@benettiengineering.com> Acked-by: Haibo Chen <haibo.chen@nxp.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20230811214853.8623-1-giulio.benetti@benettiengineering.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Similar to the transmission of TPM responses, also the transmission of TPM
commands may become corrupted. Instead of aborting when detecting such
issues, try resending the command again.
Signed-off-by: Alexander Steffen <Alexander.Steffen@infineon.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
During NVMeTCP Authentication a controller can trigger a kernel
oops by specifying the 8192 bit Diffie Hellman group and passing
a correctly sized, but zeroed Diffie Hellamn value.
mpi_cmp_ui() was detecting this if the second parameter was 0,
but 1 is passed from dh_is_pubkey_valid(). This causes the null
pointer u->d to be dereferenced towards the end of mpi_cmp_ui()
Signed-off-by: Mark O'Donovan <shiftee@posteo.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
When compiling with gcc 13.1 and CONFIG_FORTIFY_SOURCE=y,
I've noticed the following:
In function ‘fortify_memcpy_chk’,
inlined from ‘mwifiex_construct_tdls_action_frame’ at drivers/net/wireless/marvell/mwifiex/tdls.c:765:3,
inlined from ‘mwifiex_send_tdls_action_frame’ at drivers/net/wireless/marvell/mwifiex/tdls.c:856:6:
./include/linux/fortify-string.h:529:25: warning: call to ‘__read_overflow2_field’
declared with attribute warning: detected read beyond size of field (2nd parameter);
maybe use struct_group()? [-Wattribute-warning]
529 | __read_overflow2_field(q_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
and it happens because the fortification logic interprets this
as an attempt to overread 1-byte 'u.action.category' member of
'struct ieee80211_mgmt'. To silence this warning, it's enough
to pass an address of 'u.action' itself instead of an address
of its first member.
This also fixes an improper usage of 'sizeof()'. Since 'skb' is
extended with 'sizeof(mgmt->u.action.u.tdls_discover_resp) + 1'
bytes (where 1 is actually 'sizeof(mgmt->u.action.category)'),
I assume that the same number of bytes should be copied.
Suggested-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reviewed-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/20230629085115.180499-2-dmantipov@yandex.ru Signed-off-by: Sasha Levin <sashal@kernel.org>
ath_pci_probe() warn: argument 4 to %lx specifier is cast from pointer
ath_ahb_probe() warn: argument 4 to %lx specifier is cast from pointer
Fix it by modifying %lx to %p in the printk format string.
Note that with this change, the pointer address will be printed as a
hashed value by default. This is appropriate because the kernel
should not leak kernel pointers to user space in an informational
message. If someone wants to see the real address for debugging
purposes, this can be achieved with the no_hash_pointers kernel option.
The checks in question were introduced by:
commit 6b4db2e528f6 ("devlink: Fix use-after-free after a failed reload").
That fixed an issue of reload with mlxsw driver.
Back then, that was a valid fix, because there was a limitation
in place that prevented drivers from registering/unregistering params
when devlink instance was registered.
It was possible to do the fix differently by changing drivers to
register/unregister params in appropriate places making sure the ops
operate only on memory which is allocated and initialized. But that,
as a dependency, would require to remove the limitation mentioned above.
Eventually, this limitation was lifted by:
commit 1d18bb1a4ddd ("devlink: allow registering parameters after the instance")
Also, the alternative fix (which also fixed another issue) was done by:
commit 74cbc3c03c82 ("mlxsw: spectrum_acl_tcam: Move devlink param to TCAM code").
Therefore, the checks are no longer relevant. Each driver should make
sure to have the params registered only when the memory the ops
are working with is allocated and initialized.
So remove the checks.
Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Arm platforms use is_default_overflow_handler() to determine if the
hw_breakpoint code should single-step over the breakpoint trigger or
let the custom handler deal with it.
Since bpf_overflow_handler() currently isn't recognized as a default
handler, attaching a BPF program to a PERF_TYPE_BREAKPOINT event causes
it to keep firing (the instruction triggering the data abort exception
is never skipped). For example:
# bpftrace -e 'watchpoint:0x10000:4:w { print("hit") }' -c ./test
Attaching 1 probe...
hit
hit
[...]
^C
(./test performs a single 4-byte store to 0x10000)
This patch replaces the check with uses_default_overflow_handler(),
which accounts for the bpf_overflow_handler() case by also testing
if one of the perf_event_output functions gets invoked indirectly,
via orig_default_handler.
Some HiSilicon SMMU PMCG suffers the erratum 162001900 that the PMU
disable control sometimes fail to disable the counters. This will lead
to error or inaccurate data since before we enable the counters the
counter's still counting for the event used in last perf session.
This patch tries to fix this by hardening the global disable process.
Before disable the PMU, writing an invalid event type (0xffff) to
focibly stop the counters. Correspondingly restore each events on
pmu::pmu_enable().
Instead of calling __put_task_struct() directly, we defer it using
call_rcu(). A more natural approach would use a workqueue, but since
in PREEMPT_RT, we can't allocate dynamic memory from atomic context,
the code would become more complex because we would need to put the
work_struct instance in the task_struct and initialize it when we
allocate a new task_struct.
The issue is reproducible with stress-ng:
while true; do
stress-ng --sched deadline --sched-period 1000000000 \
--sched-runtime 800000000 --sched-deadline \ 1000000000 --mmapfork 23 -t 20
done
Reported-by: Hu Chunyu <chuhu@redhat.com> Suggested-by: Oleg Nesterov <oleg@redhat.com> Suggested-by: Valentin Schneider <vschneid@redhat.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Wander Lairson Costa <wander@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230614122323.37957-2-wander@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>
According to the ACPI specification 19.6.134, no argument is required to be passed for ASL Timer instruction. For taking care of no argument, AML_NO_OPERAND_RESOLVE flag is added to ASL Timer instruction opcode.
When ASL timer instruction interpreted by ACPI interpreter, getting error. After adding AML_NO_OPERAND_RESOLVE flag to ASL Timer instruction opcode, issue is not observed.
=============================================================
UBSAN: array-index-out-of-bounds in acpica/dswexec.c:401:12 index -1 is out of range for type 'union acpi_operand_object *[9]'
CPU: 37 PID: 1678 Comm: cat Not tainted
6.0.0-dev-th500-6.0.y-1+bcf8c46459e407-generic-64k
HW name: NVIDIA BIOS v1.1.1-d7acbfc-dirty 12/19/2022 Call trace:
dump_backtrace+0xe0/0x130
show_stack+0x20/0x60
dump_stack_lvl+0x68/0x84
dump_stack+0x18/0x34
ubsan_epilogue+0x10/0x50
__ubsan_handle_out_of_bounds+0x80/0x90
acpi_ds_exec_end_op+0x1bc/0x6d8
acpi_ps_parse_loop+0x57c/0x618
acpi_ps_parse_aml+0x1e0/0x4b4
acpi_ps_execute_method+0x24c/0x2b8
acpi_ns_evaluate+0x3a8/0x4bc
acpi_evaluate_object+0x15c/0x37c
acpi_evaluate_integer+0x54/0x15c
show_power+0x8c/0x12c [acpi_power_meter]
Link: https://github.com/acpica/acpica/commit/90310989 Signed-off-by: Abhishek Mainkar <abmainkar@nvidia.com> Signed-off-by: Bob Moore <robert.moore@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
As following backtrace, the struct file_lock request , in posix_lock_inode
is free before ftrace function using.
Replace the ftrace function ahead free flow could fix the use-after-free
issue.
[BUG]
Syzbot reported several warning triggered inside
lookup_inline_extent_backref().
[CAUSE]
As usual, the reproducer doesn't reliably trigger locally here, but at
least we know the WARN_ON() is triggered when an inline backref can not
be found, and it can only be triggered when @insert is true. (I.e.
inserting a new inline backref, which means the backref should already
exist)
[ENHANCEMENT]
After the WARN_ON(), dump all the parameters and the extent tree
leaf to help debug.
Link: https://syzkaller.appspot.com/bug?extid=d6f9ff86c1d804ba2bc6 Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
autofs_wait_queue structs should be freed if their wait_ctr becomes zero.
Otherwise they will be lost.
In this case an AUTOFS_IOC_EXPIRE_MULTI ioctl is done, then a new
waitqueue struct is allocated in autofs_wait(), its initial wait_ctr
equals 2. After that wait_event_killable() is interrupted (it returns
-ERESTARTSYS), so that 'wq->name.name == NULL' condition may be not
satisfied. Actually, this condition can be satisfied when
autofs_wait_release() or autofs_catatonic_mode() is called and, what is
also important, wait_ctr is decremented in those places. Upon the exit of
autofs_wait(), wait_ctr is decremented to 1. Then the unmounting process
begins: kill_sb calls autofs_catatonic_mode(), which should have freed the
waitqueues, but it only decrements its usage counter to zero which is not
a correct behaviour.
edit:imk
This description is of course not correct. The umount performed as a result
of an expire is a umount of a mount that has been automounted, it's not the
autofs mount itself. They happen independently, usually after everything
mounted within the autofs file system has been expired away. If everything
hasn't been expired away the automount daemon can still exit leaving mounts
in place. But expires done in both cases will result in a notification that
calls autofs_wait_release() with a result status. The problem case is the
summary execution of of the automount daemon. In this case any waiting
processes won't be woken up until either they are terminated or the mount
is umounted.
end edit: imk
So in catatonic mode we should free waitqueues which counter becomes zero.
edit: imk
Initially I was concerned that the calling of autofs_wait_release() and
autofs_catatonic_mode() was not mutually exclusive but that can't be the
case (obviously) because the queue entry (or entries) is removed from the
list when either of these two functions are called. Consequently the wait
entry will be freed by only one of these functions or by the woken process
in autofs_wait() depending on the order of the calls.
end edit: imk
Reported-by: syzbot+5e53f70e69ff0c0a1c0c@syzkaller.appspotmail.com Suggested-by: Takeshi Misawa <jeliantsurux@gmail.com> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: Ian Kent <raven@themaw.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Andrei Vagin <avagin@gmail.com> Cc: autofs@vger.kernel.org Cc: linux-kernel@vger.kernel.org
Message-Id: <169112719161.7590.6700123246297365841.stgit@donald.themaw.net> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[WHY]
Currently, when insert_plane is called with insert_above_mpcc
parameter that is equal to tree->opp_list, the function returns NULL.
[HOW]
Instead, the function should insert the plane at the top of the tree.
Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Jun Lei <jun.lei@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Wesley Chalmers <wesley.chalmers@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
syzkaller found a memory leak in kcm_sendmsg(), and commit c821a88bd720
("kcm: Fix memory leak in error path of kcm_sendmsg()") suppressed it by
updating kcm_tx_msg(head)->last_skb if partial data is copied so that the
following sendmsg() will resume from the skb.
However, we cannot know how many bytes were copied when we get the error.
Thus, we could mess up the MSG_MORE queue.
When kcm_sendmsg() fails for SOCK_DGRAM, we should purge the queue as we
do so for UDP by udp_flush_pending_frames().
Even without this change, when the error occurred, the following sendmsg()
resumed from a wrong skb and the queue was messed up. However, we have
yet to get such a report, and only syzkaller stumbled on it. So, this
can be changed safely.
Note this does not change SOCK_SEQPACKET behaviour.
The commit in fixes introduced flags to control the status of hardware
configuration while processing packets. At the same time another structure
is used to provide configuration of timestamper to user-space applications.
The way it was coded makes this structures go out of sync easily. The
repro is easy for 82599 chips:
The eth0 device is properly configured to timestamp any PTPv2 events.
[root@hostname ~]# hwstamp_ctl -i eth0 -r 1 -t 1
current settings:
tx_type 1
rx_filter 12
SIOCSHWTSTAMP failed: Numerical result out of range
The requested time stamping mode is not supported by the hardware.
The error is properly returned because HW doesn't support all packets
timestamping. But the adapter->flags is cleared of timestamp flags
even though no HW configuration was done. From that point no RX timestamps
are received by user-space application. But configuration shows good
values:
Fix the issue by applying new flags only when the HW was actually
configured.
Fixes: a9763f3cb54c ("ixgbe: Update PTP to support X550EM_x devices") Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
I got the below warning when do fuzzing test:
BUG: KASAN: null-ptr-deref in scatterwalk_copychunks+0x320/0x470
Read of size 4 at addr 0000000000000008 by task kworker/u8:1/9
This is because the value of rec_seq of tls_crypto_info configured by the
user program is too large, for example, 0xffffffffffffff. In addition, TLS
is asynchronously accelerated. When tls_do_encryption() returns
-EINPROGRESS and sk->sk_err is set to EBADMSG due to rec_seq overflow,
skmsg is released before the asynchronous encryption process ends. As a
result, the UAF problem occurs during the asynchronous processing of the
encryption module.
If the operation is asynchronous and the encryption module returns
EINPROGRESS, do not free the record information.
Fixes: 635d93981786 ("net/tls: free record only on encryption error") Signed-off-by: Liu Jian <liujian56@huawei.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/20230909081434.2324940-1-liujian56@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
mlxbf-tmfifo accesses config space using native types -
which works for it since the legacy virtio native types.
This will break if it ever needs to support modern virtio,
so with new tags previously introduced for virtio net config,
sparse now warns for this in drivers.
Since this is a legacy only device, fix it up using
virtio_legacy_is_little_endian for now.
No functional changes.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Stable-dep-of: fc4c65582154 ("platform/mellanox: mlxbf-tmfifo: Drop jumbo frames") Signed-off-by: Sasha Levin <sashal@kernel.org>
This commit fixes tmfifo console stuck issue when the virtual
networking interface is in down state. In such case, the network
Rx descriptors runs out and causes the Rx network packet staying
in the head of the tmfifo thus blocking the console packets. The
fix is to drop the Rx network packet when no more Rx descriptors.
Function name mlxbf_tmfifo_release_pending_pkt() is also renamed
to mlxbf_tmfifo_release_pkt() to be more approperiate.
Fixes: 1357dfd7261f ("platform/mellanox: Add TmFifo driver for Mellanox BlueField Soc") Signed-off-by: Liming Sun <limings@nvidia.com> Reviewed-by: Vadim Pasternak <vadimp@nvidia.com> Reviewed-by: David Thompson <davthompson@nvidia.com> Link: https://lore.kernel.org/r/8c0177dc938ae03f52ff7e0b62dbeee74b7bec09.1693322547.git.limings@nvidia.com Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In kcm_sendmsg(), kcm_tx_msg(head)->last_skb is used as a cursor to append
newly allocated skbs to 'head'. If some bytes are copied, an error occurred,
and jumped to out_error label, 'last_skb' is left unmodified. A later
kcm_sendmsg() will use an obsoleted 'last_skb' reference, corrupting the
'head' frag_list and causing the leak.
This patch fixes this issue by properly updating the last allocated skb in
'last_skb'.
Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Reported-and-tested-by: syzbot+6f98de741f7dbbfc4ccb@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6f98de741f7dbbfc4ccb Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
According to the document of napi, there is no rx process when the
budget is 0. Therefore, r8152_poll() has to return 0 directly when the
budget is equal to 0.
Fixes: d2187f8e4454 ("r8152: divide the tx and rx bottom functions") Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
rule_locs is allocated in ethtool_get_rxnfc and the size is determined by
rule_cnt from user space. So rule_cnt needs to be check before using
rule_locs to avoid NULL pointer dereference.
Fixes: 7aab747e5563 ("net: ethernet: mediatek: add ethtool functions to configure RX flows of HW LRO") Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
rules is allocated in ethtool_get_rxnfc and the size is determined by
rule_cnt from user space. So rule_cnt needs to be check before using
rules to avoid OOB writing or NULL pointer dereference.
Fixes: 90b509b39ac9 ("net: mvpp2: cls: Add Classification offload support") Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Reviewed-by: Marcin Wojtas <mw@semihalf.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
I got the below warning when do fuzzing test:
unregister_netdevice: waiting for bond0 to become free. Usage count = 2
It can be repoduced via:
ip link add bond0 type bond
sysctl -w net.ipv4.conf.bond0.promote_secondaries=1
ip addr add 4.117.174.103/0 scope 0x40 dev bond0
ip addr add 192.168.100.111/255.255.255.254 scope 0 dev bond0
ip addr add 0.0.0.4/0 scope 0x40 secondary dev bond0
ip addr del 4.117.174.103/0 scope 0x40 dev bond0
ip link delete bond0 type bond
In this reproduction test case, an incorrect 'last_prim' is found in
__inet_del_ifa(), as a result, the secondary address(0.0.0.4/0 scope 0x40)
is lost. The memory of the secondary address is leaked and the reference of
in_device and net_device is leaked.
Fix this problem:
Look for 'last_prim' starting at location of the deleted IP and inserting
the promoted IP into the location of 'last_prim'.
Fixes: 0ff60a45678e ("[IPV4]: Fix secondary IP addresses after promotion") Signed-off-by: Liu Jian <liujian56@huawei.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Many i.MX8M SoCs use same 1443X/1416X PLL, such as i.MX8MM,
i.MX8MN and later i.MX8M SoCs, moving these PLL definitions
to pll14xx driver can save a lot of duplicated code on each
platform.
Meanwhile, no need to define PLL clock structure for every
module which uses same type of PLL, e.g., audio/video/dram use
1443X PLL, arm/gpu/vpu/sys use 1416X PLL, define 2 PLL clock
structure for each group is enough.
Some usb hubs will negotiate DisplayPort Alt mode with the device
but will then negotiate a data role swap after entering the alt
mode. The data role swap causes the device to unregister all alt
modes, however the usb hub will still send Attention messages
even after failing to reregister the Alt Mode. type_altmode_attention
currently does not verify whether or not a device's altmode partner
exists, which results in a NULL pointer error when dereferencing
the typec_altmode and typec_altmode_ops belonging to the altmode
partner.
Verify the presence of a device's altmode partner before sending
the Attention message to the Alt Mode driver.
Fixes: 8a37d87d72f0 ("usb: typec: Bus type for alternate modes") Cc: stable@vger.kernel.org Signed-off-by: RD Babiera <rdbabiera@google.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20230814180559.923475-1-rdbabiera@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Refactor tcpm_handle_vdm_request and its tcpm_pd_svdm helper function so
that reporting the results of the vdm to the altmode-driver is separated
out into a clear separate step inside tcpm_handle_vdm_request, instead
of being scattered over various places inside the tcpm_pd_svdm helper.
This is a preparation patch for fixing an AB BA lock inversion between the
tcpm code and some altmode drivers.
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20200724174702.61754-4-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stable-dep-of: f23643306430 ("usb: typec: bus: verify partner exists in typec_altmode_attention") Signed-off-by: Sasha Levin <sashal@kernel.org>
Refactor the tcpm_handle_vdm_request payload handling by doing the
endianness conversion only once directly inside tcpm_handle_vdm_request
itself instead of doing it multiple times inside various helper functions
called by tcpm_handle_vdm_request.
This is a preparation patch for some further refactoring to fix an AB BA
lock inversion between the tcpm code and some altmode drivers.
Reviewed-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20200724174702.61754-3-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stable-dep-of: f23643306430 ("usb: typec: bus: verify partner exists in typec_altmode_attention") Signed-off-by: Sasha Levin <sashal@kernel.org>
The PERF_RECORD_ATTR is used for a pipe mode to describe an event with
attribute and IDs. The ID table comes after the attr and it calculate
size of the table using the total record size and the attr size.
This is fine for most use cases, but sometimes it saves the pipe output
in a file and then process it later. And it becomes a problem if there
is a change in attr size between the record and report.
$ perf record -o- > perf-pipe.data # old version
$ perf report -i- < perf-pipe.data # new version
For example, if the attr size is 128 and it has 4 IDs, then it would
save them in 168 byte like below:
The commit ef9ff6017e3c4593 ("perf ui browser: Move the extra title
lines from the hists browser") introduced ui_browser__gotorc_title() to
help moving non-title lines easily. But it missed to update the title
for the hierarchy mode so it won't print the header line on TUI at all.
$ perf report --hierarchy
Fixes: ef9ff6017e3c4593 ("perf ui browser: Move the extra title lines from the hists browser") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230731094934.1616495-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If system is busy during the command status polling function, the driver
may not get the chance to poll the status register till the end of time
out and return the premature status. Do a final check after time out
happens to ensure reading the correct status.
When the oob buffer length is not in multiple of words, the oob write
function does out-of-bounds read on the oob source buffer at the last
iteration. Fix that by always checking length limit on the oob buffer
read and fill with 0xff when reaching the end of the buffer to the oob
registers.
When executing a NAND command within the panic write path, wait for any
pending command instead of calling BUG_ON to avoid crashing while
already crashing.
And
check_dev_super()
btrfs_validate_super(.., sb, ..)
However, it currently verifies the fs_info::super_copy::fsid instead,
which is not correct. Fix this using the correct fsid in the superblock
argument.
CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When joining a transaction with TRANS_JOIN_NOSTART, if we don't find a
running transaction we end up creating one. This goes against the purpose
of TRANS_JOIN_NOSTART which is to join a running transaction if its state
is at or below the state TRANS_STATE_COMMIT_START, otherwise return an
-ENOENT error and don't start a new transaction. So fix this to not create
a new transaction if there's no running transaction at or below that
state.
CC: stable@vger.kernel.org # 4.14+ Fixes: a6d155d2e363 ("Btrfs: fix deadlock between fiemap and transaction commits") Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
During our debugging of glusterfs, we found an Assertion failed error:
inode_lookup >= nlookup, which was caused by the nlookup value in the
kernel being greater than that in the FUSE file system.
The issue was introduced by fuse_direntplus_link, where in the function,
fuse_iget increments nlookup, and if d_splice_alias returns failure,
fuse_direntplus_link returns failure without decrementing nlookup
https://github.com/gluster/glusterfs/pull/4081
In all these cases, the last argument to dma_declare_coherent_memory() is
the buffer end address, but the expected value should be the size of the
reserved region.
Fixes: 39fb993038e1 ("media: arch: sh: ap325rxa: Use new renesas-ceu camera driver") Fixes: c2f9b05fd5c1 ("media: arch: sh: ecovec: Use new renesas-ceu camera driver") Fixes: f3590dc32974 ("media: arch: sh: kfr2r09: Use new renesas-ceu camera driver") Fixes: 186c446f4b84 ("media: arch: sh: migor: Use new renesas-ceu camera driver") Fixes: 1a3c230b4151 ("media: arch: sh: ms7724se: Use new renesas-ceu camera driver") Signed-off-by: Petr Tesarik <petr.tesarik.ext@huawei.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Link: https://lore.kernel.org/r/20230724120742.2187-1-petrtesarik@huaweicloud.com Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
The opt_num field is controlled by user mode and is not currently
validated inside the kernel. An attacker can take advantage of this to
trigger an OOB read and potentially leak information.
BUG: KASAN: slab-out-of-bounds in nf_osf_match_one+0xbed/0xd10 net/netfilter/nfnetlink_osf.c:88
Read of size 2 at addr ffff88804bc64272 by task poc/6431
Fixes: 039f50629b7f ("ip_tunnel: Move stats update to iptunnel_xmit()") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Change the minimum value of RX/TX descriptors to 64 to enable setting the rx/tx
value between 64 and 80. All igb devices can use as low as 64 descriptors.
This change will unify igb with other drivers.
Based on commit 7b1be1987c1e ("e1000e: lower ring minimum size to 64")
Fixes: 9d5c824399de ("igb: PCI-Express 82575 Gigabit Ethernet driver") Signed-off-by: Olga Zaborska <olga.zaborska@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Change the minimum value of RX/TX descriptors to 64 to enable setting the rx/tx
value between 64 and 80. All igbvf devices can use as low as 64 descriptors.
This change will unify igbvf with other drivers.
Based on commit 7b1be1987c1e ("e1000e: lower ring minimum size to 64")
Fixes: d4e0fe01a38a ("igbvf: add new driver to support 82576 virtual functions") Signed-off-by: Olga Zaborska <olga.zaborska@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Change the minimum value of RX/TX descriptors to 64 to enable setting the rx/tx
value between 64 and 80. All igc devices can use as low as 64 descriptors.
This change will unify igc with other drivers.
Based on commit 7b1be1987c1e ("e1000e: lower ring minimum size to 64")
Fixes: 0507ef8a0372 ("igc: Add transmit and receive fastpath and interrupt handlers") Signed-off-by: Olga Zaborska <olga.zaborska@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When the plug qdisc is used as a class of the qfq qdisc it could trigger a
UAF. This issue can be reproduced with following commands:
tc qdisc add dev lo root handle 1: qfq
tc class add dev lo parent 1: classid 1:1 qfq weight 1 maxpkt 512
tc qdisc add dev lo parent 1:1 handle 2: plug
tc filter add dev lo parent 1: basic classid 1:1
ping -c1 127.0.0.1
and boom:
[ 285.353793] BUG: KASAN: slab-use-after-free in qfq_dequeue+0xa7/0x7f0
[ 285.354910] Read of size 4 at addr ffff8880bad312a8 by task ping/144
[ 285.355903]
[ 285.356165] CPU: 1 PID: 144 Comm: ping Not tainted 6.5.0-rc3+ #4
[ 285.357112] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
[ 285.358376] Call Trace:
[ 285.358773] <IRQ>
[ 285.359109] dump_stack_lvl+0x44/0x60
[ 285.359708] print_address_description.constprop.0+0x2c/0x3c0
[ 285.360611] kasan_report+0x10c/0x120
[ 285.361195] ? qfq_dequeue+0xa7/0x7f0
[ 285.361780] qfq_dequeue+0xa7/0x7f0
[ 285.362342] __qdisc_run+0xf1/0x970
[ 285.362903] net_tx_action+0x28e/0x460
[ 285.363502] __do_softirq+0x11b/0x3de
[ 285.364097] do_softirq.part.0+0x72/0x90
[ 285.364721] </IRQ>
[ 285.365072] <TASK>
[ 285.365422] __local_bh_enable_ip+0x77/0x90
[ 285.366079] __dev_queue_xmit+0x95f/0x1550
[ 285.366732] ? __pfx_csum_and_copy_from_iter+0x10/0x10
[ 285.367526] ? __pfx___dev_queue_xmit+0x10/0x10
[ 285.368259] ? __build_skb_around+0x129/0x190
[ 285.368960] ? ip_generic_getfrag+0x12c/0x170
[ 285.369653] ? __pfx_ip_generic_getfrag+0x10/0x10
[ 285.370390] ? csum_partial+0x8/0x20
[ 285.370961] ? raw_getfrag+0xe5/0x140
[ 285.371559] ip_finish_output2+0x539/0xa40
[ 285.372222] ? __pfx_ip_finish_output2+0x10/0x10
[ 285.372954] ip_output+0x113/0x1e0
[ 285.373512] ? __pfx_ip_output+0x10/0x10
[ 285.374130] ? icmp_out_count+0x49/0x60
[ 285.374739] ? __pfx_ip_finish_output+0x10/0x10
[ 285.375457] ip_push_pending_frames+0xf3/0x100
[ 285.376173] raw_sendmsg+0xef5/0x12d0
[ 285.376760] ? do_syscall_64+0x40/0x90
[ 285.377359] ? __static_call_text_end+0x136578/0x136578
[ 285.378173] ? do_syscall_64+0x40/0x90
[ 285.378772] ? kasan_enable_current+0x11/0x20
[ 285.379469] ? __pfx_raw_sendmsg+0x10/0x10
[ 285.380137] ? __sock_create+0x13e/0x270
[ 285.380673] ? __sys_socket+0xf3/0x180
[ 285.381174] ? __x64_sys_socket+0x3d/0x50
[ 285.381725] ? entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 285.382425] ? __rcu_read_unlock+0x48/0x70
[ 285.382975] ? ip4_datagram_release_cb+0xd8/0x380
[ 285.383608] ? __pfx_ip4_datagram_release_cb+0x10/0x10
[ 285.384295] ? preempt_count_sub+0x14/0xc0
[ 285.384844] ? __list_del_entry_valid+0x76/0x140
[ 285.385467] ? _raw_spin_lock_bh+0x87/0xe0
[ 285.386014] ? __pfx__raw_spin_lock_bh+0x10/0x10
[ 285.386645] ? release_sock+0xa0/0xd0
[ 285.387148] ? preempt_count_sub+0x14/0xc0
[ 285.387712] ? freeze_secondary_cpus+0x348/0x3c0
[ 285.388341] ? aa_sk_perm+0x177/0x390
[ 285.388856] ? __pfx_aa_sk_perm+0x10/0x10
[ 285.389441] ? check_stack_object+0x22/0x70
[ 285.390032] ? inet_send_prepare+0x2f/0x120
[ 285.390603] ? __pfx_inet_sendmsg+0x10/0x10
[ 285.391172] sock_sendmsg+0xcc/0xe0
[ 285.391667] __sys_sendto+0x190/0x230
[ 285.392168] ? __pfx___sys_sendto+0x10/0x10
[ 285.392727] ? kvm_clock_get_cycles+0x14/0x30
[ 285.393328] ? set_normalized_timespec64+0x57/0x70
[ 285.393980] ? _raw_spin_unlock_irq+0x1b/0x40
[ 285.394578] ? __x64_sys_clock_gettime+0x11c/0x160
[ 285.395225] ? __pfx___x64_sys_clock_gettime+0x10/0x10
[ 285.395908] ? _copy_to_user+0x3e/0x60
[ 285.396432] ? exit_to_user_mode_prepare+0x1a/0x120
[ 285.397086] ? syscall_exit_to_user_mode+0x22/0x50
[ 285.397734] ? do_syscall_64+0x71/0x90
[ 285.398258] __x64_sys_sendto+0x74/0x90
[ 285.398786] do_syscall_64+0x64/0x90
[ 285.399273] ? exit_to_user_mode_prepare+0x1a/0x120
[ 285.399949] ? syscall_exit_to_user_mode+0x22/0x50
[ 285.400605] ? do_syscall_64+0x71/0x90
[ 285.401124] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 285.401807] RIP: 0033:0x495726
[ 285.402233] Code: ff ff ff f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 11 b8 2c 00 00 00 0f 09
[ 285.404683] RSP: 002b:00007ffcc25fb618 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 285.405677] RAX: ffffffffffffffda RBX: 0000000000000040 RCX: 0000000000495726
[ 285.406628] RDX: 0000000000000040 RSI: 0000000002518750 RDI: 0000000000000000
[ 285.407565] RBP: 00000000005205ef R08: 00000000005f8838 R09: 000000000000001c
[ 285.408523] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000002517634
[ 285.409460] R13: 00007ffcc25fb6f0 R14: 0000000000000003 R15: 0000000000000000
[ 285.410403] </TASK>
[ 285.410704]
[ 285.410929] Allocated by task 144:
[ 285.411402] kasan_save_stack+0x1e/0x40
[ 285.411926] kasan_set_track+0x21/0x30
[ 285.412442] __kasan_slab_alloc+0x55/0x70
[ 285.412973] kmem_cache_alloc_node+0x187/0x3d0
[ 285.413567] __alloc_skb+0x1b4/0x230
[ 285.414060] __ip_append_data+0x17f7/0x1b60
[ 285.414633] ip_append_data+0x97/0xf0
[ 285.415144] raw_sendmsg+0x5a8/0x12d0
[ 285.415640] sock_sendmsg+0xcc/0xe0
[ 285.416117] __sys_sendto+0x190/0x230
[ 285.416626] __x64_sys_sendto+0x74/0x90
[ 285.417145] do_syscall_64+0x64/0x90
[ 285.417624] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 285.418306]
[ 285.418531] Freed by task 144:
[ 285.418960] kasan_save_stack+0x1e/0x40
[ 285.419469] kasan_set_track+0x21/0x30
[ 285.419988] kasan_save_free_info+0x27/0x40
[ 285.420556] ____kasan_slab_free+0x109/0x1a0
[ 285.421146] kmem_cache_free+0x1c2/0x450
[ 285.421680] __netif_receive_skb_core+0x2ce/0x1870
[ 285.422333] __netif_receive_skb_one_core+0x97/0x140
[ 285.423003] process_backlog+0x100/0x2f0
[ 285.423537] __napi_poll+0x5c/0x2d0
[ 285.424023] net_rx_action+0x2be/0x560
[ 285.424510] __do_softirq+0x11b/0x3de
[ 285.425034]
[ 285.425254] The buggy address belongs to the object at ffff8880bad31280
[ 285.425254] which belongs to the cache skbuff_head_cache of size 224
[ 285.426993] The buggy address is located 40 bytes inside of
[ 285.426993] freed 224-byte region [ffff8880bad31280, ffff8880bad31360)
[ 285.428572]
[ 285.428798] The buggy address belongs to the physical page:
[ 285.429540] page:00000000f4b77674 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xbad31
[ 285.430758] flags: 0x100000000000200(slab|node=0|zone=1)
[ 285.431447] page_type: 0xffffffff()
[ 285.431934] raw: 0100000000000200ffff88810094a8c0dead0000000001220000000000000000
[ 285.432757] raw: 000000000000000000000000800c000c00000001ffffffff0000000000000000
[ 285.433562] page dumped because: kasan: bad access detected
[ 285.434144]
[ 285.434320] Memory state around the buggy address:
[ 285.434828] ffff8880bad31180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 285.435580] ffff8880bad31200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 285.436264] >ffff8880bad31280: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 285.436777] ^
[ 285.437106] ffff8880bad31300: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
[ 285.437616] ffff8880bad31380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 285.438126] ==================================================================
[ 285.438662] Disabling lock debugging due to kernel taint
Fix this by:
1. Changing sch_plug's .peek handler to qdisc_peek_dequeued(), a
function compatible with non-work-conserving qdiscs
2. Checking the return value of qdisc_dequeue_peeked() in sch_qfq.
Fixes: 462dbc9101ac ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost") Reported-by: valis <sec@valis.email> Signed-off-by: valis <sec@valis.email> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://lore.kernel.org/r/20230901162237.11525-1-jhs@mojatatu.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
As with sk->sk_shutdown shown in the previous patch, sk->sk_err can be
read locklessly by unix_dgram_sendmsg().
Let's use READ_ONCE() for sk_err as well.
Note that the writer side is marked by commit cc04410af7de ("af_unix:
annotate lockless accesses to sk->sk_err").
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The existing code incorrectly casted a negative value (the result of a
subtraction) to an unsigned value without checking. For example, if
/proc/sys/net/ipv6/conf/*/temp_prefered_lft was set to 1, the preferred
lifetime would jump to 4 billion seconds. On my machine and network the
shortest lifetime that avoided underflow was 3 seconds.
Fixes: 76506a986dc3 ("IPv6: fix DESYNC_FACTOR") Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>