]> git.itanic.dy.fi Git - linux-stable/log
linux-stable
19 months agoLinux 5.10.146 v5.10.146
Greg Kroah-Hartman [Wed, 28 Sep 2022 09:10:41 +0000 (11:10 +0200)]
Linux 5.10.146

Link: https://lore.kernel.org/r/20220926100754.639112000@linuxfoundation.org
Tested-by: Pavel Machek (CIP) <pavel@denx.de>
Link: https://lore.kernel.org/r/20220926163550.904900693@linuxfoundation.org
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Tested-by: Pavel Machek (CIP) <pavel@denx.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoext4: make directory inode spreading reflect flexbg size
Jan Kara [Thu, 8 Sep 2022 09:21:26 +0000 (11:21 +0200)]
ext4: make directory inode spreading reflect flexbg size

commit 613c5a85898d1cd44e68f28d65eccf64a8ace9cf upstream.

Currently the Orlov inode allocator searches for free inodes for a
directory only in flex block groups with at most inodes_per_group/16
more directory inodes than average per flex block group. However with
growing size of flex block group this becomes unnecessarily strict.
Scale allowed difference from average directory count per flex block
group with flex block group size as we do with other metrics.

Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Tested-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
Cc: stable@kernel.org
Link: https://lore.kernel.org/all/0d81a7c2-46b7-6010-62a4-3e6cfc1628d6@i2se.com/
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220908092136.11770-3-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoext4: limit the number of retries after discarding preallocations blocks
Theodore Ts'o [Thu, 1 Sep 2022 22:03:14 +0000 (18:03 -0400)]
ext4: limit the number of retries after discarding preallocations blocks

commit 80fa46d6b9e7b1527bfd2197d75431fd9c382161 upstream.

This patch avoids threads live-locking for hours when a large number
threads are competing over the last few free extents as they blocks
getting added and removed from preallocation pools.  From our bug
reporter:

   A reliable way for triggering this has multiple writers
   continuously write() to files when the filesystem is full, while
   small amounts of space are freed (e.g. by truncating a large file
   -1MiB at a time). In the local filesystem, this can be done by
   simply not checking the return code of write (0) and/or the error
   (ENOSPACE) that is set. Over NFS with an async mount, even clients
   with proper error checking will behave this way since the linux NFS
   client implementation will not propagate the server errors [the
   write syscalls immediately return success] until the file handle is
   closed. This leads to a situation where NFS clients send a
   continuous stream of WRITE rpcs which result in ERRNOSPACE -- but
   since the client isn't seeing this, the stream of writes continues
   at maximum network speed.

   When some space does appear, multiple writers will all attempt to
   claim it for their current write. For NFS, we may see dozens to
   hundreds of threads that do this.

   The real-world scenario of this is database backup tooling (in
   particular, github.com/mdkent/percona-xtrabackup) which may write
   large files (>1TiB) to NFS for safe keeping. Some temporary files
   are written, rewound, and read back -- all before closing the file
   handle (the temp file is actually unlinked, to trigger automatic
   deletion on close/crash.) An application like this operating on an
   async NFS mount will not see an error code until TiB have been
   written/read.

   The lockup was observed when running this database backup on large
   filesystems (64 TiB in this case) with a high number of block
   groups and no free space. Fragmentation is generally not a factor
   in this filesystem (~thousands of large files, mostly contiguous
   except for the parts written while the filesystem is at capacity.)

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoext4: fix bug in extents parsing when eh_entries == 0 and eh_depth > 0
Luís Henriques [Mon, 22 Aug 2022 09:42:35 +0000 (10:42 +0100)]
ext4: fix bug in extents parsing when eh_entries == 0 and eh_depth > 0

commit 29a5b8a137ac8eb410cc823653a29ac0e7b7e1b0 upstream.

When walking through an inode extents, the ext4_ext_binsearch_idx() function
assumes that the extent header has been previously validated.  However, there
are no checks that verify that the number of entries (eh->eh_entries) is
non-zero when depth is > 0.  And this will lead to problems because the
EXT_FIRST_INDEX() and EXT_LAST_INDEX() will return garbage and result in this:

[  135.245946] ------------[ cut here ]------------
[  135.247579] kernel BUG at fs/ext4/extents.c:2258!
[  135.249045] invalid opcode: 0000 [#1] PREEMPT SMP
[  135.250320] CPU: 2 PID: 238 Comm: tmp118 Not tainted 5.19.0-rc8+ #4
[  135.252067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b-rebuilt.opensuse.org 04/01/2014
[  135.255065] RIP: 0010:ext4_ext_map_blocks+0xc20/0xcb0
[  135.256475] Code:
[  135.261433] RSP: 0018:ffffc900005939f8 EFLAGS: 00010246
[  135.262847] RAX: 0000000000000024 RBX: ffffc90000593b70 RCX: 0000000000000023
[  135.264765] RDX: ffff8880038e5f10 RSI: 0000000000000003 RDI: ffff8880046e922c
[  135.266670] RBP: ffff8880046e9348 R08: 0000000000000001 R09: ffff888002ca580c
[  135.268576] R10: 0000000000002602 R11: 0000000000000000 R12: 0000000000000024
[  135.270477] R13: 0000000000000000 R14: 0000000000000024 R15: 0000000000000000
[  135.272394] FS:  00007fdabdc56740(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
[  135.274510] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  135.276075] CR2: 00007ffc26bd4f00 CR3: 0000000006261004 CR4: 0000000000170ea0
[  135.277952] Call Trace:
[  135.278635]  <TASK>
[  135.279247]  ? preempt_count_add+0x6d/0xa0
[  135.280358]  ? percpu_counter_add_batch+0x55/0xb0
[  135.281612]  ? _raw_read_unlock+0x18/0x30
[  135.282704]  ext4_map_blocks+0x294/0x5a0
[  135.283745]  ? xa_load+0x6f/0xa0
[  135.284562]  ext4_mpage_readpages+0x3d6/0x770
[  135.285646]  read_pages+0x67/0x1d0
[  135.286492]  ? folio_add_lru+0x51/0x80
[  135.287441]  page_cache_ra_unbounded+0x124/0x170
[  135.288510]  filemap_get_pages+0x23d/0x5a0
[  135.289457]  ? path_openat+0xa72/0xdd0
[  135.290332]  filemap_read+0xbf/0x300
[  135.291158]  ? _raw_spin_lock_irqsave+0x17/0x40
[  135.292192]  new_sync_read+0x103/0x170
[  135.293014]  vfs_read+0x15d/0x180
[  135.293745]  ksys_read+0xa1/0xe0
[  135.294461]  do_syscall_64+0x3c/0x80
[  135.295284]  entry_SYSCALL_64_after_hwframe+0x46/0xb0

This patch simply adds an extra check in __ext4_ext_check(), verifying that
eh_entries is not 0 when eh_depth is > 0.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215941
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216283
Cc: Baokun Li <libaokun1@huawei.com>
Cc: stable@kernel.org
Signed-off-by: Luís Henriques <lhenriques@suse.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Baokun Li <libaokun1@huawei.com>
Link: https://lore.kernel.org/r/20220822094235.2690-1-lhenriques@suse.de
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agodevdax: Fix soft-reservation memory description
Dan Williams [Fri, 23 Sep 2022 22:05:56 +0000 (15:05 -0700)]
devdax: Fix soft-reservation memory description

commit 67feaba413ec68daf4124e9870878899b4ed9a0e upstream.

The "hmem" platform-devices that are created to represent the
platform-advertised "Soft Reserved" memory ranges end up inserting a
resource that causes the iomem_resource tree to look like this:

340000000-43fffffff : hmem.0
  340000000-43fffffff : Soft Reserved
    340000000-43fffffff : dax0.0

This is because insert_resource() reparents ranges when they completely
intersect an existing range.

This matters because code that uses region_intersects() to scan for a
given IORES_DESC will only check that top-level 'hmem.0' resource and
not the 'Soft Reserved' descendant.

So, to support EINJ (via einj_error_inject()) to inject errors into
memory hosted by a dax-device, be sure to describe the memory as
IORES_DESC_SOFT_RESERVED. This is a follow-on to:

commit b13a3e5fd40b ("ACPI: APEI: Fix _EINJ vs EFI_MEMORY_SP")

...that fixed EINJ support for "Soft Reserved" ranges in the first
instance.

Fixes: 262b45ae3ab4 ("x86/efi: EFI soft reservation to E820 enumeration")
Reported-by: Ricardo Sandoval Torres <ricardo.sandoval.torres@intel.com>
Tested-by: Ricardo Sandoval Torres <ricardo.sandoval.torres@intel.com>
Cc: <stable@vger.kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Omar Avelar <omar.avelar@intel.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mark Gross <markgross@kernel.org>
Link: https://lore.kernel.org/r/166397075670.389916.7435722208896316387.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoi2c: mlxbf: Fix frequency calculation
Asmaa Mnebhi [Tue, 20 Sep 2022 17:47:29 +0000 (13:47 -0400)]
i2c: mlxbf: Fix frequency calculation

[ Upstream commit 37f071ec327b04c83d47637c5e5c2199b39899ca ]

The i2c-mlxbf.c driver is currently broken because there is a bug
in the calculation of the frequency. core_f, core_r and core_od
are components read from hardware registers and are used to
compute the frequency used to compute different timing parameters.
The shifting mechanism used to get core_f, core_r and core_od is
wrong. Use FIELD_GET to mask and shift the bitfields properly.

Fixes: b5b5b32081cd206b (i2c: mlxbf: I2C SMBus driver for Mellanox BlueField SoC)
Reviewed-by: Khalil Blaiech <kblaiech@nvidia.com>
Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoi2c: mlxbf: prevent stack overflow in mlxbf_i2c_smbus_start_transaction()
Asmaa Mnebhi [Thu, 8 Sep 2022 17:35:39 +0000 (13:35 -0400)]
i2c: mlxbf: prevent stack overflow in mlxbf_i2c_smbus_start_transaction()

[ Upstream commit de24aceb07d426b6f1c59f33889d6a964770547b ]

memcpy() is called in a loop while 'operation->length' upper bound
is not checked and 'data_idx' also increments.

Fixes: b5b5b32081cd206b ("i2c: mlxbf: I2C SMBus driver for Mellanox BlueField SoC")
Reviewed-by: Khalil Blaiech <kblaiech@nvidia.com>
Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoi2c: mlxbf: incorrect base address passed during io write
Asmaa Mnebhi [Thu, 8 Sep 2022 17:35:38 +0000 (13:35 -0400)]
i2c: mlxbf: incorrect base address passed during io write

[ Upstream commit 2a5be6d1340c0fefcee8a6489cff7fd88a0d5b85 ]

Correct the base address used during io write.
This bug had no impact over the overall functionality of the read and write
transactions. MLXBF_I2C_CAUSE_OR_CLEAR=0x18 so writing to (smbus->io + 0x18)
instead of (mst_cause->ioi + 0x18) actually writes to the sc_low_timeout
register which just sets the timeout value before a read/write aborts.

Fixes: b5b5b32081cd206b (i2c: mlxbf: I2C SMBus driver for Mellanox BlueField SoC)
Reviewed-by: Khalil Blaiech <kblaiech@nvidia.com>
Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoi2c: imx: If pm_runtime_get_sync() returned 1 device access is possible
Uwe Kleine-König [Mon, 12 Sep 2022 13:20:40 +0000 (15:20 +0200)]
i2c: imx: If pm_runtime_get_sync() returned 1 device access is possible

[ Upstream commit 085aacaa73163f4b8a89dec24ecb32cfacd34017 ]

pm_runtime_get_sync() returning 1 also means the device is powered. So
resetting the chip registers in .remove() is possible and should be
done.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: d98bdd3a5b50 ("i2c: imx: Make sure to unregister adapter on remove()")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoworkqueue: don't skip lockdep work dependency in cancel_work_sync()
Tetsuo Handa [Fri, 29 Jul 2022 04:30:23 +0000 (13:30 +0900)]
workqueue: don't skip lockdep work dependency in cancel_work_sync()

[ Upstream commit c0feea594e058223973db94c1c32a830c9807c86 ]

Like Hillf Danton mentioned

  syzbot should have been able to catch cancel_work_sync() in work context
  by checking lockdep_map in __flush_work() for both flush and cancel.

in [1], being unable to report an obvious deadlock scenario shown below is
broken. From locking dependency perspective, sync version of cancel request
should behave as if flush request, for it waits for completion of work if
that work has already started execution.

  ----------
  #include <linux/module.h>
  #include <linux/sched.h>
  static DEFINE_MUTEX(mutex);
  static void work_fn(struct work_struct *work)
  {
    schedule_timeout_uninterruptible(HZ / 5);
    mutex_lock(&mutex);
    mutex_unlock(&mutex);
  }
  static DECLARE_WORK(work, work_fn);
  static int __init test_init(void)
  {
    schedule_work(&work);
    schedule_timeout_uninterruptible(HZ / 10);
    mutex_lock(&mutex);
    cancel_work_sync(&work);
    mutex_unlock(&mutex);
    return -EINVAL;
  }
  module_init(test_init);
  MODULE_LICENSE("GPL");
  ----------

The check this patch restores was added by commit 0976dfc1d0cd80a4
("workqueue: Catch more locking problems with flush_work()").

Then, lockdep's crossrelease feature was added by commit b09be676e0ff25bd
("locking/lockdep: Implement the 'crossrelease' feature"). As a result,
this check was once removed by commit fd1a5b04dfb899f8 ("workqueue: Remove
now redundant lock acquisitions wrt. workqueue flushes").

But lockdep's crossrelease feature was removed by commit e966eaeeb623f099
("locking/lockdep: Remove the cross-release locking checks"). At this
point, this check should have been restored.

Then, commit d6e89786bed977f3 ("workqueue: skip lockdep wq dependency in
cancel_work_sync()") introduced a boolean flag in order to distinguish
flush_work() and cancel_work_sync(), for checking "struct workqueue_struct"
dependency when called from cancel_work_sync() was causing false positives.

Then, commit 87915adc3f0acdf0 ("workqueue: re-add lockdep dependencies for
flushing") tried to restore "struct work_struct" dependency check, but by
error checked this boolean flag. Like an example shown above indicates,
"struct work_struct" dependency needs to be checked for both flush_work()
and cancel_work_sync().

Link: https://lkml.kernel.org/r/20220504044800.4966-1-hdanton@sina.com
Reported-by: Hillf Danton <hdanton@sina.com>
Suggested-by: Lai Jiangshan <jiangshanlai@gmail.com>
Fixes: 87915adc3f0acdf0 ("workqueue: re-add lockdep dependencies for flushing")
Cc: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agodrm/rockchip: Fix return type of cdn_dp_connector_mode_valid
Nathan Huckleberry [Tue, 13 Sep 2022 20:55:55 +0000 (13:55 -0700)]
drm/rockchip: Fix return type of cdn_dp_connector_mode_valid

[ Upstream commit b0b9408f132623dc88e78adb5282f74e4b64bb57 ]

The mode_valid field in drm_connector_helper_funcs is expected to be of
type:
enum drm_mode_status (* mode_valid) (struct drm_connector *connector,
     struct drm_display_mode *mode);

The mismatched return type breaks forward edge kCFI since the underlying
function definition does not match the function hook definition.

The return type of cdn_dp_connector_mode_valid should be changed from
int to enum drm_mode_status.

Reported-by: Dan Carpenter <error27@gmail.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1703
Cc: llvm@lists.linux.dev
Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20220913205555.155149-1-nhuck@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agodrm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack usage
Nathan Chancellor [Tue, 30 Aug 2022 20:34:09 +0000 (13:34 -0700)]
drm/amd/display: Mark dml30's UseMinimumDCFCLK() as noinline for stack usage

[ Upstream commit 41012d715d5d7b9751ae84b8fb255e404ac9c5d0 ]

This function consumes a lot of stack space and it blows up the size of
dml30_ModeSupportAndSystemConfigurationFull() with clang:

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3542:6: error: stack frame size (2200) exceeds limit (2048) in 'dml30_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
  void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
       ^
  1 error generated.

Commit a0f7e7f759cf ("drm/amd/display: fix i386 frame size warning")
aimed to address this for i386 but it did not help x86_64.

To reduce the amount of stack space that
dml30_ModeSupportAndSystemConfigurationFull() uses, mark
UseMinimumDCFCLK() as noinline, using the _for_stack variant for
documentation. While this will increase the total amount of stack usage
between the two functions (1632 and 1304 bytes respectively), it will
make sure both stay below the limit of 2048 bytes for these files. The
aforementioned change does help reduce UseMinimumDCFCLK()'s stack usage
so it should not be reverted in favor of this change.

Link: https://github.com/ClangBuiltLinux/linux/issues/1681
Reported-by: "Sudip Mukherjee (Codethink)" <sudipm.mukherjee@gmail.com>
Tested-by: Maíra Canal <mairacanal@riseup.net>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agodrm/amd/display: Limit user regamma to a valid value
Yao Wang1 [Mon, 22 Aug 2022 10:30:31 +0000 (18:30 +0800)]
drm/amd/display: Limit user regamma to a valid value

[ Upstream commit 3601d620f22e37740cf73f8278eabf9f2aa19eb7 ]

[Why]
For HDR mode, we get total 512 tf_point and after switching to SDR mode
we actually get 400 tf_point and the rest of points(401~512) still use
dirty value from HDR mode. We should limit the rest of the points to max
value.

[How]
Limit the value when coordinates_x.x > 1, just like what we do in
translate_from_linear_space for other re-gamma build paths.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Pavle Kotarac <Pavle.Kotarac@amd.com>
Signed-off-by: Yao Wang1 <Yao.Wang1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agodrm/amdgpu: use dirty framebuffer helper
Hamza Mahfooz [Tue, 6 Sep 2022 19:01:49 +0000 (15:01 -0400)]
drm/amdgpu: use dirty framebuffer helper

[ Upstream commit 66f99628eb24409cb8feb5061f78283c8b65f820 ]

Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use
drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs
struct.

Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agodrm/gma500: Fix BUG: sleeping function called from invalid context errors
Hans de Goede [Tue, 6 Sep 2022 20:38:50 +0000 (22:38 +0200)]
drm/gma500: Fix BUG: sleeping function called from invalid context errors

[ Upstream commit 63e37a79f7bd939314997e29c2f5a9f0ef184281 ]

gma_crtc_page_flip() was holding the event_lock spinlock while calling
crtc_funcs->mode_set_base() which takes ww_mutex.

The only reason to hold event_lock is to clear gma_crtc->page_flip_event
on mode_set_base() errors.

Instead unlock it after setting gma_crtc->page_flip_event and on
errors re-take the lock and clear gma_crtc->page_flip_event it
it is still set.

This fixes the following WARN/stacktrace:

[  512.122953] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:870
[  512.123004] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 1253, name: gnome-shell
[  512.123031] preempt_count: 1, expected: 0
[  512.123048] RCU nest depth: 0, expected: 0
[  512.123066] INFO: lockdep is turned off.
[  512.123080] irq event stamp: 0
[  512.123094] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[  512.123134] hardirqs last disabled at (0): [<ffffffff8d0ec28c>] copy_process+0x9fc/0x1de0
[  512.123176] softirqs last  enabled at (0): [<ffffffff8d0ec28c>] copy_process+0x9fc/0x1de0
[  512.123207] softirqs last disabled at (0): [<0000000000000000>] 0x0
[  512.123233] Preemption disabled at:
[  512.123241] [<0000000000000000>] 0x0
[  512.123275] CPU: 3 PID: 1253 Comm: gnome-shell Tainted: G        W         5.19.0+ #1
[  512.123304] Hardware name: Packard Bell dot s/SJE01_CT, BIOS V1.10 07/23/2013
[  512.123323] Call Trace:
[  512.123346]  <TASK>
[  512.123370]  dump_stack_lvl+0x5b/0x77
[  512.123412]  __might_resched.cold+0xff/0x13a
[  512.123458]  ww_mutex_lock+0x1e/0xa0
[  512.123495]  psb_gem_pin+0x2c/0x150 [gma500_gfx]
[  512.123601]  gma_pipe_set_base+0x76/0x240 [gma500_gfx]
[  512.123708]  gma_crtc_page_flip+0x95/0x130 [gma500_gfx]
[  512.123808]  drm_mode_page_flip_ioctl+0x57d/0x5d0
[  512.123897]  ? drm_mode_cursor2_ioctl+0x10/0x10
[  512.123936]  drm_ioctl_kernel+0xa1/0x150
[  512.123984]  drm_ioctl+0x21f/0x420
[  512.124025]  ? drm_mode_cursor2_ioctl+0x10/0x10
[  512.124070]  ? rcu_read_lock_bh_held+0xb/0x60
[  512.124104]  ? lock_release+0x1ef/0x2d0
[  512.124161]  __x64_sys_ioctl+0x8d/0xd0
[  512.124203]  do_syscall_64+0x58/0x80
[  512.124239]  ? do_syscall_64+0x67/0x80
[  512.124267]  ? trace_hardirqs_on_prepare+0x55/0xe0
[  512.124300]  ? do_syscall_64+0x67/0x80
[  512.124340]  ? rcu_read_lock_sched_held+0x10/0x80
[  512.124377]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  512.124411] RIP: 0033:0x7fcc4a70740f
[  512.124442] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
[  512.124470] RSP: 002b:00007ffda73f5390 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  512.124503] RAX: ffffffffffffffda RBX: 000055cc9e474500 RCX: 00007fcc4a70740f
[  512.124524] RDX: 00007ffda73f5420 RSI: 00000000c01864b0 RDI: 0000000000000009
[  512.124544] RBP: 00007ffda73f5420 R08: 000055cc9c0b0cb0 R09: 0000000000000034
[  512.124564] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c01864b0
[  512.124584] R13: 0000000000000009 R14: 000055cc9df484d0 R15: 000055cc9af5d0c0
[  512.124647]  </TASK>

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220906203852.527663-2-hdegoede@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoDrivers: hv: Never allocate anything besides framebuffer from framebuffer memory...
Vitaly Kuznetsov [Sat, 27 Aug 2022 13:03:45 +0000 (15:03 +0200)]
Drivers: hv: Never allocate anything besides framebuffer from framebuffer memory region

[ Upstream commit f0880e2cb7e1f8039a048fdd01ce45ab77247221 ]

Passed through PCI device sometimes misbehave on Gen1 VMs when Hyper-V
DRM driver is also loaded. Looking at IOMEM assignment, we can see e.g.

$ cat /proc/iomem
...
f8000000-fffbffff : PCI Bus 0000:00
  f8000000-fbffffff : 0000:00:08.0
    f8000000-f8001fff : bb8c4f33-2ba2-4808-9f7f-02f3b4da22fe
...
fe0000000-fffffffff : PCI Bus 0000:00
  fe0000000-fe07fffff : bb8c4f33-2ba2-4808-9f7f-02f3b4da22fe
    fe0000000-fe07fffff : 2ba2:00:02.0
      fe0000000-fe07fffff : mlx4_core

the interesting part is the 'f8000000' region as it is actually the
VM's framebuffer:

$ lspci -v
...
0000:00:08.0 VGA compatible controller: Microsoft Corporation Hyper-V virtual VGA (prog-if 00 [VGA controller])
Flags: bus master, fast devsel, latency 0, IRQ 11
Memory at f8000000 (32-bit, non-prefetchable) [size=64M]
...

 hv_vmbus: registering driver hyperv_drm
 hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] Synthvid Version major 3, minor 5
 hyperv_drm 0000:00:08.0: vgaarb: deactivate vga console
 hyperv_drm 0000:00:08.0: BAR 0: can't reserve [mem 0xf8000000-0xfbffffff]
 hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] Cannot request framebuffer, boot fb still active?

Note: "Cannot request framebuffer" is not a fatal error in
hyperv_setup_gen1() as the code assumes there's some other framebuffer
device there but we actually have some other PCI device (mlx4 in this
case) config space there!

The problem appears to be that vmbus_allocate_mmio() can use dedicated
framebuffer region to serve any MMIO request from any device. The
semantics one might assume of a parameter named "fb_overlap_ok"
aren't implemented because !fb_overlap_ok essentially has no effect.
The existing semantics are really "prefer_fb_overlap". This patch
implements the expected and needed semantics, which is to not allocate
from the frame buffer space when !fb_overlap_ok.

Note, Gen2 VMs are usually unaffected by the issue because
framebuffer region is already taken by EFI fb (in case kernel supports
it) but Gen1 VMs may have this region unclaimed by the time Hyper-V PCI
pass-through driver tries allocating MMIO space if Hyper-V DRM/FB drivers
load after it. Devices can be brought up in any sequence so let's
resolve the issue by always ignoring 'fb_mmio' region for non-FB
requests, even if the region is unclaimed.

Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/20220827130345.1320254-4-vkuznets@redhat.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agocifs: always initialize struct msghdr smb_msg completely
Stefan Metzmacher [Wed, 14 Sep 2022 03:25:47 +0000 (05:25 +0200)]
cifs: always initialize struct msghdr smb_msg completely

[ Upstream commit bedc8f76b3539ac4f952114b316bcc2251e808ce ]

So far we were just lucky because the uninitialized members
of struct msghdr are not used by default on a SOCK_STREAM tcp
socket.

But as new things like msg_ubuf and sg_from_iter where added
recently, we should play on the safe side and avoid potention
problems in future.

Signed-off-by: Stefan Metzmacher <metze@samba.org>
Cc: stable@vger.kernel.org
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agocifs: use discard iterator to discard unneeded network data more efficiently
David Howells [Thu, 4 Feb 2021 06:15:21 +0000 (00:15 -0600)]
cifs: use discard iterator to discard unneeded network data more efficiently

[ Upstream commit cf0604a686b11175d8beae60281c4ccc95aaa5c2 ]

The iterator, ITER_DISCARD, that can only be used in READ mode and
just discards any data copied to it, was added to allow a network
filesystem to discard any unwanted data sent by a server.
Convert cifs_discard_from_socket() to use this.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Stable-dep-of: bedc8f76b353 ("cifs: always initialize struct msghdr smb_msg completely")
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agodrm/amdgpu: Fix check for RAS support
Luben Tuikov [Fri, 12 Mar 2021 00:11:01 +0000 (19:11 -0500)]
drm/amdgpu: Fix check for RAS support

[ Upstream commit 084e2640e51626f413f85663e3ba7e32d4272477 ]

Use positive logic to check for RAS
support. Rename the function to actually indicate
what it is testing for. Essentially, make the
function a predicate with the correct name.

Cc: Stanley Yang <Stanley.Yang@amd.com>
Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stable-dep-of: 6c2049066355 ("drm/amdgpu: Don't enable LTR if not supported")
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agovfio/type1: fix vaddr_get_pfns() return in vfio_pin_page_external()
Daniel Jordan [Mon, 8 Mar 2021 17:24:52 +0000 (12:24 -0500)]
vfio/type1: fix vaddr_get_pfns() return in vfio_pin_page_external()

commit 4ab4fcfce5b540227d80eb32f1db45ab615f7c92 upstream.

vaddr_get_pfns() now returns the positive number of pfns successfully
gotten instead of zero.  vfio_pin_page_external() might return 1 to
vfio_iommu_type1_pin_pages(), which will treat it as an error, if
vaddr_get_pfns() is successful but vfio_pin_page_external() doesn't
reach vfio_lock_acct().

Fix it up in vfio_pin_page_external().  Found by inspection.

Fixes: be16c1fd99f4 ("vfio/type1: Change success value of vaddr_get_pfn()")
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Message-Id: <20210308172452.38864-1-daniel.m.jordan@oracle.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agousb: xhci-mtk: fix issue of out-of-bounds array access
Chunfeng Yun [Tue, 17 Aug 2021 08:36:25 +0000 (16:36 +0800)]
usb: xhci-mtk: fix issue of out-of-bounds array access

commit de5107f473190538a65aac7edea85209cd5c1a8f upstream.

Bus bandwidth array access is based on esit, increase one
will cause out-of-bounds issue; for example, when esit is
XHCI_MTK_MAX_ESIT, will overstep boundary.

Fixes: 7c986fbc16ae ("usb: xhci-mtk: get the microframe boundary for ESIT")
Cc: <stable@vger.kernel.org>
Reported-by: Stan Lu <stan.lu@mediatek.com>
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Link: https://lore.kernel.org/r/1629189389-18779-5-git-send-email-chunfeng.yun@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agos390/dasd: fix Oops in dasd_alias_get_start_dev due to missing pavgroup
Stefan Haberland [Mon, 19 Sep 2022 15:49:31 +0000 (17:49 +0200)]
s390/dasd: fix Oops in dasd_alias_get_start_dev due to missing pavgroup

commit db7ba07108a48c0f95b74fabbfd5d63e924f992d upstream.

Fix Oops in dasd_alias_get_start_dev() function caused by the pavgroup
pointer being NULL.

The pavgroup pointer is checked on the entrance of the function but
without the lcu->lock being held. Therefore there is a race window
between dasd_alias_get_start_dev() and _lcu_update() which sets
pavgroup to NULL with the lcu->lock held.

Fix by checking the pavgroup pointer with lcu->lock held.

Cc: <stable@vger.kernel.org> # 2.6.25+
Fixes: 8e09f21574ea ("[S390] dasd: add hyper PAV support to DASD device driver, part 1")
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
Link: https://lore.kernel.org/r/20220919154931.4123002-2-sth@linux.ibm.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoserial: tegra-tcu: Use uart_xmit_advance(), fixes icount.tx accounting
Ilpo Järvinen [Thu, 1 Sep 2022 14:39:34 +0000 (17:39 +0300)]
serial: tegra-tcu: Use uart_xmit_advance(), fixes icount.tx accounting

commit 1d10cd4da593bc0196a239dcc54dac24b6b0a74e upstream.

Tx'ing does not correctly account Tx'ed characters into icount.tx.
Using uart_xmit_advance() fixes the problem.

Fixes: 2d908b38d409 ("serial: Add Tegra Combined UART driver")
Cc: <stable@vger.kernel.org> # serial: Create uart_xmit_advance()
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20220901143934.8850-4-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoserial: tegra: Use uart_xmit_advance(), fixes icount.tx accounting
Ilpo Järvinen [Thu, 1 Sep 2022 14:39:33 +0000 (17:39 +0300)]
serial: tegra: Use uart_xmit_advance(), fixes icount.tx accounting

commit 754f68044c7dd6c52534ba3e0f664830285c4b15 upstream.

DMA complete & stop paths did not correctly account Tx'ed characters
into icount.tx. Using uart_xmit_advance() fixes the problem.

Fixes: e9ea096dd225 ("serial: tegra: add serial driver")
Cc: <stable@vger.kernel.org> # serial: Create uart_xmit_advance()
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20220901143934.8850-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoserial: Create uart_xmit_advance()
Ilpo Järvinen [Thu, 1 Sep 2022 14:39:32 +0000 (17:39 +0300)]
serial: Create uart_xmit_advance()

commit e77cab77f2cb3a1ca2ba8df4af45bb35617ac16d upstream.

A very common pattern in the drivers is to advance xmit tail
index and do bookkeeping of Tx'ed characters. Create
uart_xmit_advance() to handle it.

Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20220901143934.8850-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agodrm/amd/amdgpu: fixing read wrong pf2vf data in SRIOV
Jingwen Chen [Thu, 13 Jan 2022 11:06:59 +0000 (19:06 +0800)]
drm/amd/amdgpu: fixing read wrong pf2vf data in SRIOV

commit 9a458402fb69bda886aa6cbe067311b6e3d9c52a upstream.

[Why]
This fixes 892deb48269c ("drm/amdgpu: Separate vf2pf work item init from virt data exchange").
we should read pf2vf data based at mman.fw_vram_usage_va after gmc
sw_init. commit 892deb48269c breaks this logic.

[How]
calling amdgpu_virt_exchange_data in amdgpu_virt_init_data_exchange to
set the right base in the right sequence.

v2:
call amdgpu_virt_init_data_exchange after gmc sw_init to make data
exchange workqueue run

v3:
clean up the code logic

v4:
add some comment and make the code more readable

Fixes: 892deb48269c ("drm/amdgpu: Separate vf2pf work item init from virt data exchange")
Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Horace Chen <horace.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoselftests: forwarding: add shebang for sch_red.sh
Hangbin Liu [Thu, 22 Sep 2022 02:44:53 +0000 (10:44 +0800)]
selftests: forwarding: add shebang for sch_red.sh

[ Upstream commit 83e4b196838d90799a8879e5054a3beecf9ed256 ]

RHEL/Fedora RPM build checks are stricter, and complain when executable
files don't have a shebang line, e.g.

*** WARNING: ./kselftests/net/forwarding/sch_red.sh is executable but has no shebang, removing executable bit

Fix it by adding shebang line.

Fixes: 6cf0291f9517 ("selftests: forwarding: Add a RED test for SW datapath")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://lore.kernel.org/r/20220922024453.437757-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonet: sched: fix possible refcount leak in tc_new_tfilter()
Hangyu Hua [Wed, 21 Sep 2022 09:27:34 +0000 (17:27 +0800)]
net: sched: fix possible refcount leak in tc_new_tfilter()

[ Upstream commit c2e1cfefcac35e0eea229e148c8284088ce437b5 ]

tfilter_put need to be called to put the refount got by tp->ops->get to
avoid possible refcount leak when chain->tmplt_ops != NULL and
chain->tmplt_ops != tp->ops.

Fixes: 7d5509fa0d3d ("net: sched: extend proto ops with 'put' callback")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Link: https://lore.kernel.org/r/20220921092734.31700-1-hbh25y@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonet: sunhme: Fix packet reception for len < RX_COPY_THRESHOLD
Sean Anderson [Tue, 20 Sep 2022 23:50:18 +0000 (19:50 -0400)]
net: sunhme: Fix packet reception for len < RX_COPY_THRESHOLD

[ Upstream commit 878e2405710aacfeeb19364c300f38b7a9abfe8f ]

There is a separate receive path for small packets (under 256 bytes).
Instead of allocating a new dma-capable skb to be used for the next packet,
this path allocates a skb and copies the data into it (reusing the existing
sbk for the next packet). There are two bytes of junk data at the beginning
of every packet. I believe these are inserted in order to allow aligned DMA
and IP headers. We skip over them using skb_reserve. Before copying over
the data, we must use a barrier to ensure we see the whole packet. The
current code only synchronizes len bytes, starting from the beginning of
the packet, including the junk bytes. However, this leaves off the final
two bytes in the packet. Synchronize the whole packet.

To reproduce this problem, ping a HME with a payload size between 17 and
214

$ ping -s 17 <hme_address>

which will complain rather loudly about the data mismatch. Small packets
(below 60 bytes on the wire) do not have this issue. I suspect this is
related to the padding added to increase the minimum packet size.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220920235018.1675956-1-seanga2@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonet/smc: Stop the CLC flow if no link to map buffers on
Wen Gu [Tue, 20 Sep 2022 06:43:09 +0000 (14:43 +0800)]
net/smc: Stop the CLC flow if no link to map buffers on

[ Upstream commit e738455b2c6dcdab03e45d97de36476f93f557d2 ]

There might be a potential race between SMC-R buffer map and
link group termination.

smc_smcr_terminate_all()     | smc_connect_rdma()
--------------------------------------------------------------
                             | smc_conn_create()
for links in smcibdev        |
        schedule links down  |
                             | smc_buf_create()
                             |  \- smcr_buf_map_usable_links()
                             |      \- no usable links found,
                             |         (rmb->mr = NULL)
                             |
                             | smc_clc_send_confirm()
                             |  \- access conn->rmb_desc->mr[]->rkey
                             |     (panic)

During reboot and IB device module remove, all links will be set
down and no usable links remain in link groups. In such situation
smcr_buf_map_usable_links() should return an error and stop the
CLC flow accessing to uninitialized mr.

Fixes: b9247544c1bc ("net/smc: convert static link ID instances to support multiple links")
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Link: https://lore.kernel.org/r/1663656189-32090-1-git-send-email-guwen@linux.alibaba.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agodrm/mediatek: dsi: Move mtk_dsi_stop() call back to mtk_dsi_poweroff()
Nícolas F. R. A. Prado [Thu, 4 Aug 2022 19:43:25 +0000 (15:43 -0400)]
drm/mediatek: dsi: Move mtk_dsi_stop() call back to mtk_dsi_poweroff()

[ Upstream commit 90144dd8b0d137d9e78ef34b3c418e51a49299ad ]

As the comment right before the mtk_dsi_stop() call advises,
mtk_dsi_stop() should only be called after
mtk_drm_crtc_atomic_disable(). That's because that function calls
drm_crtc_wait_one_vblank(), which requires the vblank irq to be enabled.

Previously mtk_dsi_stop(), being in mtk_dsi_poweroff() and guarded by a
refcount, would only be called at the end of
mtk_drm_crtc_atomic_disable(), through the call to mtk_crtc_ddp_hw_fini().
Commit cde7e2e35c28 ("drm/mediatek: Separate poweron/poweroff from
enable/disable and define new funcs") moved the mtk_dsi_stop() call to
mtk_output_dsi_disable(), causing it to be called before
mtk_drm_crtc_atomic_disable(), and consequently generating vblank
timeout warnings during suspend.

Move the mtk_dsi_stop() call back to mtk_dsi_poweroff() so that we have
a working vblank irq during mtk_drm_crtc_atomic_disable() and stop
getting vblank timeout warnings.

Fixes: cde7e2e35c28 ("drm/mediatek: Separate poweron/poweroff from enable/disable and define new funcs")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Tested-by: Hsin-Yi Wang <hsinyi@chromium.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Tested-by: Allen-KH Cheng <allen-kh.cheng@mediatek.com>
Link: http://lists.infradead.org/pipermail/linux-mediatek/2022-August/046713.html
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoperf kcore_copy: Do not check /proc/modules is unchanged
Adrian Hunter [Wed, 14 Sep 2022 12:24:29 +0000 (15:24 +0300)]
perf kcore_copy: Do not check /proc/modules is unchanged

[ Upstream commit 5b427df27b94aec1312cace48a746782a0925c53 ]

/proc/kallsyms and /proc/modules are compared before and after the copy
in order to ensure no changes during the copy.

However /proc/modules also might change due to reference counts changing
even though that does not make any difference.

Any modules loaded or unloaded should be visible in changes to kallsyms,
so it is not necessary to check /proc/modules also anyway.

Remove the comparison checking that /proc/modules is unchanged.

Fixes: fc1b691d7651d949 ("perf buildid-cache: Add ability to add kcore to the cache")
Reported-by: Daniel Dao <dqminh@cloudflare.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Daniel Dao <dqminh@cloudflare.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220914122429.8770-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoperf jit: Include program header in ELF files
Lieven Hey [Thu, 15 Sep 2022 09:29:10 +0000 (11:29 +0200)]
perf jit: Include program header in ELF files

[ Upstream commit babd04386b1df8c364cdaa39ac0e54349502e1e5 ]

The missing header makes it hard for programs like elfutils to open
these files.

Fixes: 2d86612aacb7805f ("perf symbol: Correct address for bss symbols")
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Lieven Hey <lieven.hey@kdab.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20220915092910.711036-1-lieven.hey@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agocan: gs_usb: gs_can_open(): fix race dev->can.state condition
Marc Kleine-Budde [Tue, 20 Sep 2022 09:40:56 +0000 (11:40 +0200)]
can: gs_usb: gs_can_open(): fix race dev->can.state condition

[ Upstream commit 5440428b3da65408dba0241985acb7a05258b85e ]

The dev->can.state is set to CAN_STATE_ERROR_ACTIVE, after the device
has been started. On busy networks the CAN controller might receive
CAN frame between and go into an error state before the dev->can.state
is assigned.

Assign dev->can.state before starting the controller to close the race
window.

Fixes: d08e973a77d1 ("can: gs_usb: Added support for the GS_USB CAN devices")
Link: https://lore.kernel.org/all/20220920195216.232481-1-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonetfilter: ebtables: fix memory leak when blob is malformed
Florian Westphal [Tue, 20 Sep 2022 12:20:17 +0000 (14:20 +0200)]
netfilter: ebtables: fix memory leak when blob is malformed

[ Upstream commit 62ce44c4fff947eebdf10bb582267e686e6835c9 ]

The bug fix was incomplete, it "replaced" crash with a memory leak.
The old code had an assignment to "ret" embedded into the conditional,
restore this.

Fixes: 7997eff82828 ("netfilter: ebtables: reject blobs that don't provide all entry points")
Reported-and-tested-by: syzbot+a24c5252f3e3ab733464@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonetfilter: nf_tables: fix percpu memory leak at nf_tables_addchain()
Tetsuo Handa [Mon, 12 Sep 2022 13:58:51 +0000 (22:58 +0900)]
netfilter: nf_tables: fix percpu memory leak at nf_tables_addchain()

[ Upstream commit 9a4d6dd554b86e65581ef6b6638a39ae079b17ac ]

It seems to me that percpu memory for chain stats started leaking since
commit 3bc158f8d0330f0a ("netfilter: nf_tables: map basechain priority to
hardware priority") when nft_chain_offload_priority() returned an error.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 3bc158f8d0330f0a ("netfilter: nf_tables: map basechain priority to hardware priority")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonetfilter: nf_tables: fix nft_counters_enabled underflow at nf_tables_addchain()
Tetsuo Handa [Mon, 12 Sep 2022 12:41:00 +0000 (21:41 +0900)]
netfilter: nf_tables: fix nft_counters_enabled underflow at nf_tables_addchain()

[ Upstream commit 921ebde3c0d22c8cba74ce8eb3cc4626abff1ccd ]

syzbot is reporting underflow of nft_counters_enabled counter at
nf_tables_addchain() [1], for commit 43eb8949cfdffa76 ("netfilter:
nf_tables: do not leave chain stats enabled on error") missed that
nf_tables_chain_destroy() after nft_basechain_init() in the error path of
nf_tables_addchain() decrements the counter because nft_basechain_init()
makes nft_is_base_chain() return true by setting NFT_CHAIN_BASE flag.

Increment the counter immediately after returning from
nft_basechain_init().

Link:  https://syzkaller.appspot.com/bug?extid=b5d82a651b71cd8a75ab [1]
Reported-by: syzbot <syzbot+b5d82a651b71cd8a75ab@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: syzbot <syzbot+b5d82a651b71cd8a75ab@syzkaller.appspotmail.com>
Fixes: 43eb8949cfdffa76 ("netfilter: nf_tables: do not leave chain stats enabled on error")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonet/sched: taprio: make qdisc_leaf() see the per-netdev-queue pfifo child qdiscs
Vladimir Oltean [Thu, 15 Sep 2022 10:08:02 +0000 (13:08 +0300)]
net/sched: taprio: make qdisc_leaf() see the per-netdev-queue pfifo child qdiscs

[ Upstream commit 1461d212ab277d8bba1a753d33e9afe03d81f9d4 ]

taprio can only operate as root qdisc, and to that end, there exists the
following check in taprio_init(), just as in mqprio:

if (sch->parent != TC_H_ROOT)
return -EOPNOTSUPP;

And indeed, when we try to attach taprio to an mqprio child, it fails as
expected:

$ tc qdisc add dev swp0 root handle 1: mqprio num_tc 8 \
map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
$ tc qdisc replace dev swp0 parent 1:2 taprio num_tc 8 \
map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
base-time 0 sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
flags 0x0 clockid CLOCK_TAI
Error: sch_taprio: Can only be attached as root qdisc.

(extack message added by me)

But when we try to attach a taprio child to a taprio root qdisc,
surprisingly it doesn't fail:

$ tc qdisc replace dev swp0 root handle 1: taprio num_tc 8 \
map 0 1 2 3 4 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
base-time 0 sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
flags 0x0 clockid CLOCK_TAI
$ tc qdisc replace dev swp0 parent 1:2 taprio num_tc 8 \
map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
base-time 0 sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
flags 0x0 clockid CLOCK_TAI

This is because tc_modify_qdisc() behaves differently when mqprio is
root, vs when taprio is root.

In the mqprio case, it finds the parent qdisc through
p = qdisc_lookup(dev, TC_H_MAJ(clid)), and then the child qdisc through
q = qdisc_leaf(p, clid). This leaf qdisc q has handle 0, so it is
ignored according to the comment right below ("It may be default qdisc,
ignore it"). As a result, tc_modify_qdisc() goes through the
qdisc_create() code path, and this gives taprio_init() a chance to check
for sch_parent != TC_H_ROOT and error out.

Whereas in the taprio case, the returned q = qdisc_leaf(p, clid) is
different. It is not the default qdisc created for each netdev queue
(both taprio and mqprio call qdisc_create_dflt() and keep them in
a private q->qdiscs[], or priv->qdiscs[], respectively). Instead, taprio
makes qdisc_leaf() return the _root_ qdisc, aka itself.

When taprio does that, tc_modify_qdisc() goes through the qdisc_change()
code path, because the qdisc layer never finds out about the child qdisc
of the root. And through the ->change() ops, taprio has no reason to
check whether its parent is root or not, just through ->init(), which is
not called.

The problem is the taprio_leaf() implementation. Even though code wise,
it does the exact same thing as mqprio_leaf() which it is copied from,
it works with different input data. This is because mqprio does not
attach itself (the root) to each device TX queue, but one of the default
qdiscs from its private array.

In fact, since commit 13511704f8d7 ("net: taprio offload: enforce qdisc
to netdev queue mapping"), taprio does this too, but just for the full
offload case. So if we tried to attach a taprio child to a fully
offloaded taprio root qdisc, it would properly fail too; just not to a
software root taprio.

To fix the problem, stop looking at the Qdisc that's attached to the TX
queue, and instead, always return the default qdiscs that we've
allocated (and to which we privately enqueue and dequeue, in software
scheduling mode).

Since Qdisc_class_ops :: leaf  is only called from tc_modify_qdisc(),
the risk of unforeseen side effects introduced by this change is
minimal.

Fixes: 5a781ccbd19e ("tc: Add support for configuring the taprio scheduler")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonet/sched: taprio: avoid disabling offload when it was never enabled
Vladimir Oltean [Thu, 15 Sep 2022 10:08:01 +0000 (13:08 +0300)]
net/sched: taprio: avoid disabling offload when it was never enabled

[ Upstream commit db46e3a88a09c5cf7e505664d01da7238cd56c92 ]

In an incredibly strange API design decision, qdisc->destroy() gets
called even if qdisc->init() never succeeded, not exclusively since
commit 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation"),
but apparently also earlier (in the case of qdisc_create_dflt()).

The taprio qdisc does not fully acknowledge this when it attempts full
offload, because it starts off with q->flags = TAPRIO_FLAGS_INVALID in
taprio_init(), then it replaces q->flags with TCA_TAPRIO_ATTR_FLAGS
parsed from netlink (in taprio_change(), tail called from taprio_init()).

But in taprio_destroy(), we call taprio_disable_offload(), and this
determines what to do based on FULL_OFFLOAD_IS_ENABLED(q->flags).

But looking at the implementation of FULL_OFFLOAD_IS_ENABLED()
(a bitwise check of bit 1 in q->flags), it is invalid to call this macro
on q->flags when it contains TAPRIO_FLAGS_INVALID, because that is set
to U32_MAX, and therefore FULL_OFFLOAD_IS_ENABLED() will return true on
an invalid set of flags.

As a result, it is possible to crash the kernel if user space forces an
error between setting q->flags = TAPRIO_FLAGS_INVALID, and the calling
of taprio_enable_offload(). This is because drivers do not expect the
offload to be disabled when it was never enabled.

The error that we force here is to attach taprio as a non-root qdisc,
but instead as child of an mqprio root qdisc:

$ tc qdisc add dev swp0 root handle 1: \
mqprio num_tc 8 map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
$ tc qdisc replace dev swp0 parent 1:1 \
taprio num_tc 8 map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 0 \
sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
flags 0x0 clockid CLOCK_TAI
Unable to handle kernel paging request at virtual address fffffffffffffff8
[fffffffffffffff8] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 96000004 [#1] PREEMPT SMP
Call trace:
 taprio_dump+0x27c/0x310
 vsc9959_port_setup_tc+0x1f4/0x460
 felix_port_setup_tc+0x24/0x3c
 dsa_slave_setup_tc+0x54/0x27c
 taprio_disable_offload.isra.0+0x58/0xe0
 taprio_destroy+0x80/0x104
 qdisc_create+0x240/0x470
 tc_modify_qdisc+0x1fc/0x6b0
 rtnetlink_rcv_msg+0x12c/0x390
 netlink_rcv_skb+0x5c/0x130
 rtnetlink_rcv+0x1c/0x2c

Fix this by keeping track of the operations we made, and undo the
offload only if we actually did it.

I've added "bool offloaded" inside a 4 byte hole between "int clockid"
and "atomic64_t picos_per_byte". Now the first cache line looks like
below:

$ pahole -C taprio_sched net/sched/sch_taprio.o
struct taprio_sched {
        struct Qdisc * *           qdiscs;               /*     0     8 */
        struct Qdisc *             root;                 /*     8     8 */
        u32                        flags;                /*    16     4 */
        enum tk_offsets            tk_offset;            /*    20     4 */
        int                        clockid;              /*    24     4 */
        bool                       offloaded;            /*    28     1 */

        /* XXX 3 bytes hole, try to pack */

        atomic64_t                 picos_per_byte;       /*    32     0 */

        /* XXX 8 bytes hole, try to pack */

        spinlock_t                 current_entry_lock;   /*    40     0 */

        /* XXX 8 bytes hole, try to pack */

        struct sched_entry *       current_entry;        /*    48     8 */
        struct sched_gate_list *   oper_sched;           /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */

Fixes: 9c66d1564676 ("taprio: Add support for hardware offloading")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonet: socket: remove register_gifconf
Arnd Bergmann [Thu, 22 Jul 2021 14:29:01 +0000 (16:29 +0200)]
net: socket: remove register_gifconf

[ Upstream commit b0e99d03778b2418aec20db99d97d19d25d198b6 ]

Since dynamic registration of the gifconf() helper is only used for
IPv4, and this can not be in a loadable module, this can be simplified
noticeably by turning it into a direct function call as a preparation
for cleaning up the compat handling.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 5641c751fe2f ("net: enetc: deny offload of tc-based TSN features on VF interfaces")
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonet: enetc: move enetc_set_psfp() out of the common enetc_set_features()
Vladimir Oltean [Fri, 16 Sep 2022 13:32:08 +0000 (16:32 +0300)]
net: enetc: move enetc_set_psfp() out of the common enetc_set_features()

[ Upstream commit fed38e64d9b99d65a36c0dbadc3d3f8ddd9ea030 ]

The VF netdev driver shouldn't respond to changes in the NETIF_F_HW_TC
flag; only PFs should. Moreover, TSN-specific code should go to
enetc_qos.c, which should not be included in the VF driver.

Fixes: 79e499829f3f ("net: enetc: add hw tc hw offload features for PSPF capability")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220916133209.3351399-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agowireguard: netlink: avoid variable-sized memcpy on sockaddr
Jason A. Donenfeld [Fri, 16 Sep 2022 14:37:40 +0000 (15:37 +0100)]
wireguard: netlink: avoid variable-sized memcpy on sockaddr

[ Upstream commit 26c013108c12b94bc023bf19198a4300596c98b1 ]

Doing a variable-sized memcpy is slower, and the compiler isn't smart
enough to turn this into a constant-size assignment.

Further, Kees' latest fortified memcpy will actually bark, because the
destination pointer is type sockaddr, not explicitly sockaddr_in or
sockaddr_in6, so it thinks there's an overflow:

    memcpy: detected field-spanning write (size 28) of single field
    "&endpoint.addr" at drivers/net/wireguard/netlink.c:446 (size 16)

Fix this by just assigning by using explicit casts for each checked
case.

Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reported-by: syzbot+a448cda4dba2dac50de5@syzkaller.appspotmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agowireguard: ratelimiter: disable timings test by default
Jason A. Donenfeld [Fri, 16 Sep 2022 14:37:38 +0000 (15:37 +0100)]
wireguard: ratelimiter: disable timings test by default

[ Upstream commit 684dec3cf45da2b0848298efae4adf3b2aeafeda ]

A previous commit tried to make the ratelimiter timings test more
reliable but in the process made it less reliable on other
configurations. This is an impossible problem to solve without
increasingly ridiculous heuristics. And it's not even a problem that
actually needs to be solved in any comprehensive way, since this is only
ever used during development. So just cordon this off with a DEBUG_
ifdef, just like we do for the trie's randomized tests, so it can be
enabled while hacking on the code, and otherwise disabled in CI. In the
process we also revert 151c8e499f47.

Fixes: 151c8e499f47 ("wireguard: ratelimiter: use hrtimer in selftest")
Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonet: ipa: properly limit modem routing table use
Alex Elder [Tue, 13 Sep 2022 20:46:02 +0000 (15:46 -0500)]
net: ipa: properly limit modem routing table use

[ Upstream commit cf412ec333250cb82bafe57169204e14a9f1c2ac ]

IPA can route packets between IPA-connected entities.  The AP and
modem are currently the only such entities supported, and no routing
is required to transfer packets between them.

The number of entries in each routing table is fixed, and defined at
initialization time.  Some of these entries are designated for use
by the modem, and the rest are available for the AP to use.  The AP
sends a QMI message to the modem which describes (among other
things) information about routing table memory available for the
modem to use.

Currently the QMI initialization packet gives wrong information in
its description of routing tables.  What *should* be supplied is the
maximum index that the modem can use for the routing table memory
located at a given location.  The current code instead supplies the
total *number* of routing table entries.  Furthermore, the modem is
granted the entire table, not just the subset it's supposed to use.

This patch fixes this.  First, the ipa_mem_bounds structure is
generalized so its "end" field can be interpreted either as a final
byte offset, or a final array index.  Second, the IPv4 and IPv6
(non-hashed and hashed) table information fields in the QMI
ipa_init_modem_driver_req structure are changed to be ipa_mem_bounds
rather than ipa_mem_array structures.  Third, we set the "end" value
for each routing table to be the last index, rather than setting the
"count" to be the number of indices.  Finally, instead of allowing
the modem to use all of a routing table's memory, it is limited to
just the portion meant to be used by the modem.  In all versions of
IPA currently supported, that is IPA_ROUTE_MODEM_COUNT (8) entries.

Update a few comments for clarity.

Fixes: 530f9216a9537 ("soc: qcom: ipa: AP/modem communications")
Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20220913204602.1803004-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonet: ipa: kill IPA_TABLE_ENTRY_SIZE
Alex Elder [Sun, 28 Mar 2021 17:31:11 +0000 (12:31 -0500)]
net: ipa: kill IPA_TABLE_ENTRY_SIZE

[ Upstream commit 4ea29143ebe6c453f5fddc80ffe4ed046f44aa3a ]

Entries in an IPA route or filter table are 64-bit little-endian
addresses, each of which refers to a routing or filtering rule.

The format of these table slots are fixed, but IPA_TABLE_ENTRY_SIZE
is used to define their size.  This symbol doesn't really add value,
and I think it unnecessarily obscures what a table entry *is*.

So get rid of IPA_TABLE_ENTRY_SIZE, and just use sizeof(__le64) in
its place throughout the code.

Update the comments in "ipa_table.c" to provide a little better
explanation of these table slots.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: cf412ec33325 ("net: ipa: properly limit modem routing table use")
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonet: ipa: DMA addresses are nicely aligned
Alex Elder [Sun, 28 Mar 2021 17:31:10 +0000 (12:31 -0500)]
net: ipa: DMA addresses are nicely aligned

[ Upstream commit 19aaf72c0c7a26ab7ffc655a6d84da6a379f899b ]

A recent patch avoided doing 64-bit modulo operations by checking
the alignment of some DMA allocations using only the lower 32 bits
of the address.

David Laight pointed out (after the fix was committed) that DMA
allocations might already satisfy the alignment requirements.  And
he was right.

Remove the alignment checks that occur after DMA allocation requests,
and update comments to explain why the constraint is satisfied.  The
only place IPA_TABLE_ALIGN was used was to check the alignment; it is
therefore no longer needed, so get rid of it.

Add comments where GSI_RING_ELEMENT_SIZE and the tre_count and
event_count channel data fields are defined to make explicit they
are required to be powers of 2.

Revise a comment in gsi_trans_pool_init_dma(), taking into account
that dma_alloc_coherent() guarantees its result is aligned to a page
size (or order thereof).

Don't bother printing an error if a DMA allocation fails.

Suggested-by: David Laight <David.Laight@ACULAB.COM>
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: cf412ec33325 ("net: ipa: properly limit modem routing table use")
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonet: ipa: avoid 64-bit modulus
Alex Elder [Tue, 23 Mar 2021 01:05:05 +0000 (20:05 -0500)]
net: ipa: avoid 64-bit modulus

[ Upstream commit 437c78f976f5b39fc4b2a1c65903a229f55912dd ]

It is possible for a 32 bit x86 build to use a 64 bit DMA address.

There are two remaining spots where the IPA driver does a modulo
operation to check alignment of a DMA address, and under certain
conditions this can lead to a build error on i386 (at least).

The alignment checks we're doing are for power-of-2 values, and this
means the lower 32 bits of the DMA address can be used.  This ensures
both operands to the modulo operator are 32 bits wide.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: cf412ec33325 ("net: ipa: properly limit modem routing table use")
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonet: ipa: fix table alignment requirement
Alex Elder [Thu, 18 Mar 2021 18:59:29 +0000 (13:59 -0500)]
net: ipa: fix table alignment requirement

[ Upstream commit e5d4e96b44cf20330c970c3e30ea0a8c3a23feca ]

We currently have a build-time check to ensure that the minimum DMA
allocation alignment satisfies the constraint that IPA filter and
route tables must point to rules that are 128-byte aligned.

But what's really important is that the actual allocated DMA memory
has that alignment, even if the minimum is smaller than that.

Remove the BUILD_BUG_ON() call checking against minimim DMA alignment
and instead verify at rutime that the allocated memory is properly
aligned.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: cf412ec33325 ("net: ipa: properly limit modem routing table use")
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonet: ipa: fix assumptions about DMA address size
Alex Elder [Thu, 18 Mar 2021 18:59:27 +0000 (13:59 -0500)]
net: ipa: fix assumptions about DMA address size

[ Upstream commit d2fd2311de909a7f4e99b4bd11a19e6b671d6a6b ]

Some build time checks in ipa_table_validate_build() assume that a
DMA address is 64 bits wide.  That is more restrictive than it has
to be.  A route or filter table is 64 bits wide no matter what the
size of a DMA address is on the AP.  The code actually uses a
pointer to __le64 to access table entries, and a fixed constant
IPA_TABLE_ENTRY_SIZE to describe the size of those entries.

Loosen up two checks so they still verify some requirements, but
such that they do not assume the size of a DMA address is 64 bits.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: cf412ec33325 ("net: ipa: properly limit modem routing table use")
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoof: mdio: Add of_node_put() when breaking out of for_each_xx
Liang He [Tue, 13 Sep 2022 12:56:59 +0000 (20:56 +0800)]
of: mdio: Add of_node_put() when breaking out of for_each_xx

[ Upstream commit 1c48709e6d9d353acaaac1d8e33474756b121d78 ]

In of_mdiobus_register(), we should call of_node_put() for 'child'
escaped out of for_each_available_child_of_node().

Fixes: 66bdede495c7 ("of_mdio: Fix broken PHY IRQ in case of probe deferral")
Co-developed-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Liang He <windhl@126.com>
Link: https://lore.kernel.org/r/20220913125659.3331969-1-windhl@126.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agodrm/hisilicon: Add depends on MMU
Randy Dunlap [Tue, 31 May 2022 02:55:57 +0000 (19:55 -0700)]
drm/hisilicon: Add depends on MMU

[ Upstream commit d8a79c03054911c375a2252627a429c9bc4615b6 ]

The Kconfig symbol depended on MMU but was dropped by the commit
acad3fe650a5 ("drm/hisilicon: Removed the dependency on the mmu")
because it already had as a dependency ARM64 that already selects MMU.

But later, commit a0f25a6bb319 ("drm/hisilicon/hibmc: Allow to be built
if COMPILE_TEST is enabled") allowed the driver to be built for non-ARM64
when COMPILE_TEST is set but that could lead to unmet direct dependencies
and linking errors.

Prevent a kconfig warning when MMU is not enabled by making
DRM_HISI_HIBMC depend on MMU.

WARNING: unmet direct dependencies detected for DRM_TTM
  Depends on [n]: HAS_IOMEM [=y] && DRM [=m] && MMU [=n]
  Selected by [m]:
  - DRM_TTM_HELPER [=m] && HAS_IOMEM [=y] && DRM [=m]
  - DRM_HISI_HIBMC [=m] && HAS_IOMEM [=y] && DRM [=m] && PCI [=y] && (ARM64 || COMPILE_TEST [=y])

Fixes: acad3fe650a5 ("drm/hisilicon: Removed the dependency on the mmu")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Xinliang Liu <xinliang.liu@linaro.org>
Cc: Tian Tao <tiantao6@hisilicon.com>
Cc: John Stultz <jstultz@google.com>
Cc: Xinwei Kong <kong.kongxinwei@hisilicon.com>
Cc: Chen Feng <puck.chen@hisilicon.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220531025557.29593-1-rdunlap@infradead.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agodrm/hisilicon/hibmc: Allow to be built if COMPILE_TEST is enabled
Javier Martinez Canillas [Thu, 16 Dec 2021 21:09:36 +0000 (22:09 +0100)]
drm/hisilicon/hibmc: Allow to be built if COMPILE_TEST is enabled

[ Upstream commit a0f25a6bb319aa05e04dcf51707c97c2881b4f47 ]

The commit feeb07d0ca5a ("drm/hisilicon/hibmc: Make CONFIG_DRM_HISI_HIBMC
depend on ARM64") made the driver Kconfig symbol to depend on ARM64 since
it only supports that architecture and loading the module on others would
lead to incorrect video modes being used.

But it also prevented the driver to be built on other architectures which
is useful to have compile test coverage when doing subsystem wide changes.

Make the dependency instead to be (ARM64 || COMPILE_TEST), so the driver
is buildable when the CONFIG_COMPILE_TEST option is enabled.

Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20211216210936.3329977-1-javierm@redhat.com
Stable-dep-of: d8a79c030549 ("drm/hisilicon: Add depends on MMU")
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agosfc: fix null pointer dereference in efx_hard_start_xmit
Íñigo Huguet [Wed, 14 Sep 2022 11:11:35 +0000 (13:11 +0200)]
sfc: fix null pointer dereference in efx_hard_start_xmit

[ Upstream commit 0a242eb2913a4aa3d6fbdb86559f27628e9466f3 ]

Trying to get the channel from the tx_queue variable here is wrong
because we can only be here if tx_queue is NULL, so we shouldn't
dereference it. As the above comment in the code says, this is very
unlikely to happen, but it's wrong anyway so let's fix it.

I hit this issue because of a different bug that caused tx_queue to be
NULL. If that happens, this is the error message that we get here:
  BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
  [...]
  RIP: 0010:efx_hard_start_xmit+0x153/0x170 [sfc]

Fixes: 12804793b17c ("sfc: decouple TXQ type from label")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220914111135.21038-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agosfc: fix TX channel offset when using legacy interrupts
Íñigo Huguet [Wed, 14 Sep 2022 10:36:48 +0000 (12:36 +0200)]
sfc: fix TX channel offset when using legacy interrupts

[ Upstream commit f232af4295653afa4ade3230462b3be15ad16419 ]

In legacy interrupt mode the tx_channel_offset was hardcoded to 1, but
that's not correct if efx_sepparate_tx_channels is false. In that case,
the offset is 0 because the tx queues are in the single existing channel
at index 0, together with the rx queue.

Without this fix, as soon as you try to send any traffic, it tries to
get the tx queues from an uninitialized channel getting these errors:
  WARNING: CPU: 1 PID: 0 at drivers/net/ethernet/sfc/tx.c:540 efx_hard_start_xmit+0x12e/0x170 [sfc]
  [...]
  RIP: 0010:efx_hard_start_xmit+0x12e/0x170 [sfc]
  [...]
  Call Trace:
   <IRQ>
   dev_hard_start_xmit+0xd7/0x230
   sch_direct_xmit+0x9f/0x360
   __dev_queue_xmit+0x890/0xa40
  [...]
  BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
  [...]
  RIP: 0010:efx_hard_start_xmit+0x153/0x170 [sfc]
  [...]
  Call Trace:
   <IRQ>
   dev_hard_start_xmit+0xd7/0x230
   sch_direct_xmit+0x9f/0x360
   __dev_queue_xmit+0x890/0xa40
  [...]

Fixes: c308dfd1b43e ("sfc: fix wrong tx channel offset with efx_separate_tx_channels")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220914103648.16902-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoi40e: Fix set max_tx_rate when it is lower than 1 Mbps
Michal Jaron [Thu, 1 Sep 2022 07:49:33 +0000 (09:49 +0200)]
i40e: Fix set max_tx_rate when it is lower than 1 Mbps

[ Upstream commit 198eb7e1b81d8ba676d0f4f120c092032ae69a8e ]

While converting max_tx_rate from bytes to Mbps, this value was set to 0,
if the original value was lower than 125000 bytes (1 Mbps). This would
cause no transmission rate limiting to occur. This happened due to lack of
check of max_tx_rate against the 1 Mbps value for max_tx_rate and the
following division by 125000. Fix this issue by adding a helper
i40e_bw_bytes_to_mbits() which sets max_tx_rate to minimum usable value of
50 Mbps, if its value is less than 1 Mbps, otherwise do the required
conversion by dividing by 125000.

Fixes: 5ecae4120a6b ("i40e: Refactor VF BW rate limiting")
Signed-off-by: Michal Jaron <michalx.jaron@intel.com>
Signed-off-by: Andrii Staikov <andrii.staikov@intel.com>
Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoi40e: Fix VF set max MTU size
Michal Jaron [Tue, 13 Sep 2022 13:38:36 +0000 (15:38 +0200)]
i40e: Fix VF set max MTU size

[ Upstream commit 372539def2824c43b6afe2403045b140f65c5acc ]

Max MTU sent to VF is set to 0 during memory allocation. It cause
that max MTU on VF is changed to IAVF_MAX_RXBUFFER and does not
depend on data from HW.

Set max_mtu field in virtchnl_vf_resource struct to inform
VF in GET_VF_RESOURCES msg what size should be max frame.

Fixes: dab86afdbbd1 ("i40e/i40evf: Change the way we limit the maximum frame size for Rx")
Signed-off-by: Michal Jaron <michalx.jaron@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoiavf: Fix set max MTU size with port VLAN and jumbo frames
Michal Jaron [Tue, 13 Sep 2022 13:38:35 +0000 (15:38 +0200)]
iavf: Fix set max MTU size with port VLAN and jumbo frames

[ Upstream commit 399c98c4dc50b7eb7e9f24da7ffdda6f025676ef ]

After setting port VLAN and MTU to 9000 on VF with ice driver there
was an iavf error
"PF returned error -5 (IAVF_ERR_PARAM) to our request 6".

During queue configuration, VF's max packet size was set to
IAVF_MAX_RXBUFFER but on ice max frame size was smaller by VLAN_HLEN
due to making some space for port VLAN as VF is not aware whether it's
in a port VLAN. This mismatch in sizes caused ice to reject queue
configuration with ERR_PARAM error. Proper max_mtu is sent from ice PF
to VF with GET_VF_RESOURCES msg but VF does not look at this.

In iavf change max_frame from IAVF_MAX_RXBUFFER to max_mtu
received from pf with GET_VF_RESOURCES msg to make vf's
max_frame_size dependent from pf. Add check if received max_mtu is
not in eligible range then set it to IAVF_MAX_RXBUFFER.

Fixes: dab86afdbbd1 ("i40e/i40evf: Change the way we limit the maximum frame size for Rx")
Signed-off-by: Michal Jaron <michalx.jaron@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoiavf: Fix bad page state
Norbert Zulinski [Wed, 14 Sep 2022 13:39:13 +0000 (15:39 +0200)]
iavf: Fix bad page state

[ Upstream commit 66039eb9015eee4f7ff0c99b83c65c7ecb3c8190 ]

Fix bad page state, free inappropriate page in handling dummy
descriptor. iavf_build_skb now has to check not only if rx_buffer is
NULL but also if size is zero, same thing in iavf_clean_rx_irq.
Without this patch driver would free page that will be used
by napi_build_skb.

Fixes: a9f49e006030 ("iavf: Fix handling of dummy receive descriptors")
Signed-off-by: Norbert Zulinski <norbertx.zulinski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoMIPS: Loongson32: Fix PHY-mode being left unspecified
Serge Semin [Sun, 11 Sep 2022 16:10:09 +0000 (00:10 +0800)]
MIPS: Loongson32: Fix PHY-mode being left unspecified

[ Upstream commit e9f3f8f488005f6da3cfb66070706770ecaef747 ]

commit 0060c8783330 ("net: stmmac: implement support for passive mode
converters via dt") has changed the plat->interface field semantics from
containing the PHY-mode to specifying the MAC-PCS interface mode. Due to
that the loongson32 platform code will leave the phylink interface
uninitialized with the PHY-mode intended by the means of the actual
platform setup. The commit-author most likely has just missed the
arch-specific code to fix. Let's mend the Loongson32 platform code then by
assigning the PHY-mode to the phy_interface field of the STMMAC platform
data.

Fixes: 0060c8783330 ("net: stmmac: implement support for passive mode converters via dt")
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Signed-off-by: Keguang Zhang <keguang.zhang@gmail.com>
Tested-by: Keguang Zhang <keguang.zhang@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoMIPS: lantiq: export clk_get_io() for lantiq_wdt.ko
Randy Dunlap [Sat, 17 Sep 2022 23:25:40 +0000 (16:25 -0700)]
MIPS: lantiq: export clk_get_io() for lantiq_wdt.ko

[ Upstream commit 502550123bee6a2ffa438409b5b9aad4d6db3a8c ]

The lantiq WDT driver uses clk_get_io(), which is not exported,
so export it to fix a build error:

ERROR: modpost: "clk_get_io" [drivers/watchdog/lantiq_wdt.ko] undefined!

Fixes: 287e3f3f4e68 ("MIPS: lantiq: implement support for clkdev api")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: John Crispin <john@phrozen.org>
Cc: linux-mips@vger.kernel.org
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agodrm/panel: simple: Fix innolux_g121i1_l01 bus_format
Heiko Schocher [Fri, 26 Aug 2022 16:50:21 +0000 (13:50 -0300)]
drm/panel: simple: Fix innolux_g121i1_l01 bus_format

[ Upstream commit a7c48a0ab87ae52c087d663e83e56b8225ac4cce ]

innolux_g121i1_l01 sets bpc to 6, so use the corresponding bus format:
MEDIA_BUS_FMT_RGB666_1X7X3_SPWG.

Fixes: 4ae13e486866 ("drm/panel: simple: Add more properties to Innolux G121I1-L01")
Signed-off-by: Heiko Schocher <hs@denx.de>
Signed-off-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: Marek Vasut <marex@denx.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20220826165021.1592532-1-festevam@denx.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonet: team: Unsync device addresses on ndo_stop
Benjamin Poirier [Wed, 7 Sep 2022 07:56:41 +0000 (16:56 +0900)]
net: team: Unsync device addresses on ndo_stop

[ Upstream commit bd60234222b2fd5573526da7bcd422801f271f5f ]

Netdev drivers are expected to call dev_{uc,mc}_sync() in their
ndo_set_rx_mode method and dev_{uc,mc}_unsync() in their ndo_stop method.
This is mentioned in the kerneldoc for those dev_* functions.

The team driver calls dev_{uc,mc}_unsync() during ndo_uninit instead of
ndo_stop. This is ineffective because address lists (dev->{uc,mc}) have
already been emptied in unregister_netdevice_many() before ndo_uninit is
called. This mistake can result in addresses being leftover on former team
ports after a team device has been deleted; see test_LAG_cleanup() in the
last patch in this series.

Add unsync calls at their expected location, team_close().

v3:
* When adding or deleting a port, only sync/unsync addresses if the team
  device is up. In other cases, it is taken care of at the right time by
  ndo_open/ndo_set_rx_mode/ndo_stop.

Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device")
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonet: bonding: Unsync device addresses on ndo_stop
Benjamin Poirier [Wed, 7 Sep 2022 07:56:40 +0000 (16:56 +0900)]
net: bonding: Unsync device addresses on ndo_stop

[ Upstream commit 86247aba599e5b07d7e828e6edaaebb0ef2b1158 ]

Netdev drivers are expected to call dev_{uc,mc}_sync() in their
ndo_set_rx_mode method and dev_{uc,mc}_unsync() in their ndo_stop method.
This is mentioned in the kerneldoc for those dev_* functions.

The bonding driver calls dev_{uc,mc}_unsync() during ndo_uninit instead of
ndo_stop. This is ineffective because address lists (dev->{uc,mc}) have
already been emptied in unregister_netdevice_many() before ndo_uninit is
called. This mistake can result in addresses being leftover on former bond
slaves after a bond has been deleted; see test_LAG_cleanup() in the last
patch in this series.

Add unsync calls, via bond_hw_addr_flush(), at their expected location,
bond_close().
Add dev_mc_add() call to bond_open() to match the above change.

v3:
* When adding or deleting a slave, only sync/unsync, add/del addresses if
  the bond is up. In other cases, it is taken care of at the right time by
  ndo_open/ndo_set_rx_mode/ndo_stop.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonet: bonding: Share lacpdu_mcast_addr definition
Benjamin Poirier [Wed, 7 Sep 2022 07:56:39 +0000 (16:56 +0900)]
net: bonding: Share lacpdu_mcast_addr definition

[ Upstream commit 1d9a143ee3408349700f44a9197b7ae0e4faae5d ]

There are already a few definitions of arrays containing
MULTICAST_LACPDU_ADDR and the next patch will add one more use. These all
contain the same constant data so define one common instance for all
bonding code.

Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 86247aba599e ("net: bonding: Unsync device addresses on ndo_stop")
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoscsi: mpt3sas: Fix return value check of dma_get_required_mask()
Sreekanth Reddy [Tue, 13 Sep 2022 12:05:38 +0000 (17:35 +0530)]
scsi: mpt3sas: Fix return value check of dma_get_required_mask()

[ Upstream commit e0e0747de0ea3dd87cdbb0393311e17471a9baf1 ]

Fix the incorrect return value check of dma_get_required_mask().  Due to
this incorrect check, the driver was always setting the DMA mask to 63 bit.

Link: https://lore.kernel.org/r/20220913120538.18759-2-sreekanth.reddy@broadcom.com
Fixes: ba27c5cf286d ("scsi: mpt3sas: Don't change the DMA coherent mask after allocations")
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoscsi: mpt3sas: Force PCIe scatterlist allocations to be within same 4 GB region
Suganath Prabu S [Fri, 5 Mar 2021 10:28:58 +0000 (15:58 +0530)]
scsi: mpt3sas: Force PCIe scatterlist allocations to be within same 4 GB region

[ Upstream commit d6adc251dd2fede6aaaf6c39f7e4ad799eda3758 ]

According to the MPI specification, PCIe SGL buffers can not cross a 4 GB
boundary.

While allocating, if any buffer crosses the 4 GB boundary, then:

 - Release the already allocated memory pools; and

 - Reallocate them by changing the DMA coherent mask to 32-bit

Link: https://lore.kernel.org/r/20210305102904.7560-2-suganath-prabu.subramani@broadcom.com
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Stable-dep-of: e0e0747de0ea ("scsi: mpt3sas: Fix return value check of dma_get_required_mask()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonet: phy: aquantia: wait for the suspend/resume operations to finish
Ioana Ciornei [Tue, 6 Sep 2022 13:04:51 +0000 (16:04 +0300)]
net: phy: aquantia: wait for the suspend/resume operations to finish

[ Upstream commit ca2dccdeeb49a7e408112d681bf447984c845292 ]

The Aquantia datasheet notes that after issuing a Processor-Intensive
MDIO operation, like changing the low-power state of the device, the
driver should wait for the operation to finish before issuing a new MDIO
command.

The new aqr107_wait_processor_intensive_op() function is added which can
be used after these kind of MDIO operations. At the moment, we are only
adding it at the end of the suspend/resume calls.

The issue was identified on a board featuring the AQR113C PHY, on
which commands like 'ip link (..) up / down' issued without any delays
between them would render the link on the PHY to remain down.
The issue was easy to reproduce with a one-liner:
 $ ip link set dev ethX down; ip link set dev ethX up; \
 ip link set dev ethX down; ip link set dev ethX up;

Fixes: ac9e81c230eb ("net: phy: aquantia: add suspend / resume callbacks for AQR107 family")
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220906130451.1483448-1-ioana.ciornei@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonet: core: fix flow symmetric hash
Ludovic Cintrat [Wed, 7 Sep 2022 10:08:13 +0000 (12:08 +0200)]
net: core: fix flow symmetric hash

[ Upstream commit 64ae13ed478428135cddc2f1113dff162d8112d4 ]

__flow_hash_consistentify() wrongly swaps ipv4 addresses in few cases.
This function is indirectly used by __skb_get_hash_symmetric(), which is
used to fanout packets in AF_PACKET.
Intrusion detection systems may be impacted by this issue.

__flow_hash_consistentify() computes the addresses difference then swaps
them if the difference is negative. In few cases src - dst and dst - src
are both negative.

The following snippet mimics __flow_hash_consistentify():

```
 #include <stdio.h>
 #include <stdint.h>

 int main(int argc, char** argv) {

     int diffs_d, diffd_s;
     uint32_t dst  = 0xb225a8c0; /* 178.37.168.192 --> 192.168.37.178 */
     uint32_t src  = 0x3225a8c0; /*  50.37.168.192 --> 192.168.37.50  */
     uint32_t dst2 = 0x3325a8c0; /*  51.37.168.192 --> 192.168.37.51  */

     diffs_d = src - dst;
     diffd_s = dst - src;

     printf("src:%08x dst:%08x, diff(s-d)=%d(0x%x) diff(d-s)=%d(0x%x)\n",
             src, dst, diffs_d, diffs_d, diffd_s, diffd_s);

     diffs_d = src - dst2;
     diffd_s = dst2 - src;

     printf("src:%08x dst:%08x, diff(s-d)=%d(0x%x) diff(d-s)=%d(0x%x)\n",
             src, dst2, diffs_d, diffs_d, diffd_s, diffd_s);

     return 0;
 }
```

Results:

src:3225a8c0 dst:b225a8c0, \
    diff(s-d)=-2147483648(0x80000000) \
    diff(d-s)=-2147483648(0x80000000)

src:3225a8c0 dst:3325a8c0, \
    diff(s-d)=-16777216(0xff000000) \
    diff(d-s)=16777216(0x1000000)

In the first case the addresses differences are always < 0, therefore
__flow_hash_consistentify() always swaps, thus dst->src and src->dst
packets have differents hashes.

Fixes: c3f8324188fa8 ("net: Add full IPv6 addresses to flow_keys")
Signed-off-by: Ludovic Cintrat <ludovic.cintrat@gatewatcher.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonet: let flow have same hash in two directions
zhang kai [Wed, 28 Jul 2021 10:54:18 +0000 (18:54 +0800)]
net: let flow have same hash in two directions

[ Upstream commit 1e60cebf82948cfdc9497ea4553bab125587593c ]

using same source and destination ip/port for flow hash calculation
within the two directions.

Signed-off-by: zhang kai <zhangkaiheb@126.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 64ae13ed4784 ("net: core: fix flow symmetric hash")
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoipvlan: Fix out-of-bound bugs caused by unset skb->mac_header
Lu Wei [Wed, 7 Sep 2022 10:12:04 +0000 (18:12 +0800)]
ipvlan: Fix out-of-bound bugs caused by unset skb->mac_header

[ Upstream commit 81225b2ea161af48e093f58e8dfee6d705b16af4 ]

If an AF_PACKET socket is used to send packets through ipvlan and the
default xmit function of the AF_PACKET socket is changed from
dev_queue_xmit() to packet_direct_xmit() via setsockopt() with the option
name of PACKET_QDISC_BYPASS, the skb->mac_header may not be reset and
remains as the initial value of 65535, this may trigger slab-out-of-bounds
bugs as following:

=================================================================
UG: KASAN: slab-out-of-bounds in ipvlan_xmit_mode_l2+0xdb/0x330 [ipvlan]
PU: 2 PID: 1768 Comm: raw_send Kdump: loaded Not tainted 6.0.0-rc4+ #6
ardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33
all Trace:
print_address_description.constprop.0+0x1d/0x160
print_report.cold+0x4f/0x112
kasan_report+0xa3/0x130
ipvlan_xmit_mode_l2+0xdb/0x330 [ipvlan]
ipvlan_start_xmit+0x29/0xa0 [ipvlan]
__dev_direct_xmit+0x2e2/0x380
packet_direct_xmit+0x22/0x60
packet_snd+0x7c9/0xc40
sock_sendmsg+0x9a/0xa0
__sys_sendto+0x18a/0x230
__x64_sys_sendto+0x74/0x90
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd

The root cause is:
  1. packet_snd() only reset skb->mac_header when sock->type is SOCK_RAW
     and skb->protocol is not specified as in packet_parse_headers()

  2. packet_direct_xmit() doesn't reset skb->mac_header as dev_queue_xmit()

In this case, skb->mac_header is 65535 when ipvlan_xmit_mode_l2() is
called. So when ipvlan_xmit_mode_l2() gets mac header with eth_hdr() which
use "skb->head + skb->mac_header", out-of-bound access occurs.

This patch replaces eth_hdr() with skb_eth_hdr() in ipvlan_xmit_mode_l2()
and reset mac header in multicast to solve this out-of-bound bug.

Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.")
Signed-off-by: Lu Wei <luwei32@huawei.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoiavf: Fix cached head and tail value for iavf_get_tx_pending
Brett Creeley [Thu, 1 Sep 2022 14:34:40 +0000 (16:34 +0200)]
iavf: Fix cached head and tail value for iavf_get_tx_pending

[ Upstream commit 809f23c0423a43266e47a7dc67e95b5cb4d1cbfc ]

The underlying hardware may or may not allow reading of the head or tail
registers and it really makes no difference if we use the software
cached values. So, always used the software cached values.

Fixes: 9c6c12595b73 ("i40e: Detection and recovery of TX queue hung logic moved to service_task from tx_timeout")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Co-developed-by: Norbert Zulinski <norbertx.zulinski@intel.com>
Signed-off-by: Norbert Zulinski <norbertx.zulinski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonetfilter: nfnetlink_osf: fix possible bogus match in nf_osf_find()
Pablo Neira Ayuso [Wed, 7 Sep 2022 08:26:18 +0000 (10:26 +0200)]
netfilter: nfnetlink_osf: fix possible bogus match in nf_osf_find()

[ Upstream commit 559c36c5a8d730c49ef805a72b213d3bba155cc8 ]

nf_osf_find() incorrectly returns true on mismatch, this leads to
copying uninitialized memory area in nft_osf which can be used to leak
stale kernel stack data to userspace.

Fixes: 22c7652cdaa8 ("netfilter: nft_osf: Add version option support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonetfilter: nf_conntrack_irc: Tighten matching on DCC message
David Leadbeater [Fri, 26 Aug 2022 04:56:57 +0000 (14:56 +1000)]
netfilter: nf_conntrack_irc: Tighten matching on DCC message

[ Upstream commit e8d5dfd1d8747b56077d02664a8838c71ced948e ]

CTCP messages should only be at the start of an IRC message, not
anywhere within it.

While the helper only decodes packes in the ORIGINAL direction, its
possible to make a client send a CTCP message back by empedding one into
a PING request.  As-is, thats enough to make the helper believe that it
saw a CTCP message.

Fixes: 869f37d8e48f ("[NETFILTER]: nf_conntrack/nf_nat: add IRC helper port")
Signed-off-by: David Leadbeater <dgl@dgl.cx>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agonetfilter: nf_conntrack_sip: fix ct_sip_walk_headers
Igor Ryzhov [Wed, 5 Jun 2019 09:32:40 +0000 (12:32 +0300)]
netfilter: nf_conntrack_sip: fix ct_sip_walk_headers

[ Upstream commit 39aebedeaaa95757f5c1f2ddb5f43fdddbf478ca ]

ct_sip_next_header and ct_sip_get_header return an absolute
value of matchoff, not a shift from current dataoff.
So dataoff should be assigned matchoff, not incremented by it.

This issue can be seen in the scenario when there are multiple
Contact headers and the first one is using a hostname and other headers
use IP addresses. In this case, ct_sip_walk_headers will work as follows:

The first ct_sip_get_header call to will find the first Contact header
but will return -1 as the header uses a hostname. But matchoff will
be changed to the offset of this header. After that, dataoff should be
set to matchoff, so that the next ct_sip_get_header call find the next
Contact header. But instead of assigning dataoff to matchoff, it is
incremented by it, which is not correct, as matchoff is an absolute
value of the offset. So on the next call to the ct_sip_get_header,
dataoff will be incorrect, and the next Contact header may not be
found at all.

Fixes: 05e3ced297fe ("[NETFILTER]: nf_conntrack_sip: introduce SIP-URI parsing helper")
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoarm64: dts: rockchip: Remove 'enable-active-low' from rk3399-puma
Fabio Estevam [Sat, 27 Aug 2022 17:51:39 +0000 (14:51 -0300)]
arm64: dts: rockchip: Remove 'enable-active-low' from rk3399-puma

[ Upstream commit a994b34b9abb9c08ee09e835b4027ff2147f9d94 ]

The 'enable-active-low' property is not a valid one.

Only 'enable-active-high' is valid, and when this property is absent
the gpio regulator will act as active low by default.

Remove the invalid 'enable-active-low' property.

Fixes: 2c66fc34e945 ("arm64: dts: rockchip: add RK3399-Q7 (Puma) SoM")
Signed-off-by: Fabio Estevam <festevam@denx.de>
Link: https://lore.kernel.org/r/20220827175140.1696699-1-festevam@denx.de
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agodmaengine: ti: k3-udma-private: Fix refcount leak bug in of_xudma_dev_get()
Liang He [Wed, 20 Jul 2022 07:32:34 +0000 (15:32 +0800)]
dmaengine: ti: k3-udma-private: Fix refcount leak bug in of_xudma_dev_get()

[ Upstream commit f9fdb0b86f087c2b7f6c6168dd0985a3c1eda87e ]

We should call of_node_put() for the reference returned by
of_parse_phandle() in fail path or when it is not used anymore.
Here we only need to move the of_node_put() before the check.

Fixes: d70241913413 ("dmaengine: ti: k3-udma: Add glue layer for non DMAengine users")
Signed-off-by: Liang He <windhl@126.com>
Acked-by: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Link: https://lore.kernel.org/r/20220720073234.1255474-1-windhl@126.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoarm64: dts: rockchip: Set RK3399-Gru PCLK_EDP to 24 MHz
zain wang [Tue, 30 Aug 2022 20:16:17 +0000 (13:16 -0700)]
arm64: dts: rockchip: Set RK3399-Gru PCLK_EDP to 24 MHz

[ Upstream commit 8123437cf46ea5a0f6ca5cb3c528d8b6db97b9c2 ]

We've found the AUX channel to be less reliable with PCLK_EDP at a
higher rate (typically 25 MHz). This is especially important on systems
with PSR-enabled panels (like Gru-Kevin), since we make heavy, constant
use of AUX.

According to Rockchip, using any rate other than 24 MHz can cause
"problems between syncing the PHY an PCLK", which leads to all sorts of
unreliabilities around register operations.

Fixes: d67a38c5a623 ("arm64: dts: rockchip: move core edp from rk3399-kevin to shared chromebook")
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: zain wang <wzz@rock-chips.com>
Signed-off-by: Brian Norris <briannorris@chromium.org>
Link: https://lore.kernel.org/r/20220830131212.v2.1.I98d30623f13b785ca77094d0c0fd4339550553b6@changeid
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agodrm/mediatek: dsi: Add atomic {destroy,duplicate}_state, reset callbacks
AngeloGioacchino Del Regno [Thu, 21 Jul 2022 17:27:27 +0000 (19:27 +0200)]
drm/mediatek: dsi: Add atomic {destroy,duplicate}_state, reset callbacks

[ Upstream commit eeda05b5e92f51d9a09646ecb493f0a1e872a6ef ]

Add callbacks for atomic_destroy_state, atomic_duplicate_state and
atomic_reset to restore functionality of the DSI driver: this solves
vblank timeouts when another bridge is present in the chain.

Tested bridge chain: DSI <=> ANX7625 => aux-bus panel

Fixes: 7f6335c6a258 ("drm/mediatek: Modify dsi funcs to atomic operations")
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Tested-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Tested-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://patchwork.kernel.org/project/linux-mediatek/patch/20220721172727.14624-1-angelogioacchino.delregno@collabora.com/
Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoarm64: dts: rockchip: Pull up wlan wake# on Gru-Bob
Brian Norris [Mon, 22 Aug 2022 23:45:04 +0000 (16:45 -0700)]
arm64: dts: rockchip: Pull up wlan wake# on Gru-Bob

[ Upstream commit e5467359a725de90b6b8d0dd865500f6373828ca ]

The Gru-Bob board does not have a pull-up resistor on its
WLAN_HOST_WAKE# pin, but Kevin does. The production/vendor kernel
specified the pin configuration correctly as a pull-up, but this didn't
get ported correctly to upstream.

This means Bob's WLAN_HOST_WAKE# pin is floating, causing inconsistent
wakeup behavior.

Note that bt_host_wake_l has a similar dynamic, but apparently the
upstream choice was to redundantly configure both internal and external
pull-up on Kevin (see the "Kevin has an external pull up" comment in
rk3399-gru.dtsi). This doesn't cause any functional problem, although
it's perhaps wasteful.

Fixes: 8559bbeeb849 ("arm64: dts: rockchip: add Google Bob")
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20220822164453.1.I75c57b48b0873766ec993bdfb7bc1e63da5a1637@changeid
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
19 months agoxfs: validate inode fork size against fork format
Dave Chinner [Thu, 22 Sep 2022 15:47:28 +0000 (18:47 +0300)]
xfs: validate inode fork size against fork format

commit 1eb70f54c445fcbb25817841e774adb3d912f3e8 upstream.

[backport for 5.10.y]

xfs_repair catches fork size/format mismatches, but the in-kernel
verifier doesn't, leading to null pointer failures when attempting
to perform operations on the fork. This can occur in the
xfs_dir_is_empty() where the in-memory fork format does not match
the size and so the fork data pointer is accessed incorrectly.

Note: this causes new failures in xfs/348 which is testing mode vs
ftype mismatches. We now detect a regular file that has been changed
to a directory or symlink mode as being corrupt because the data
fork is for a symlink or directory should be in local form when
there are only 3 bytes of data in the data fork. Hence the inode
verify for the regular file now fires w/ -EFSCORRUPTED because
the inode fork format does not match the format the corrupted mode
says it should be in.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoxfs: reorder iunlink remove operation in xfs_ifree
Dave Chinner [Thu, 22 Sep 2022 15:47:27 +0000 (18:47 +0300)]
xfs: reorder iunlink remove operation in xfs_ifree

commit 9a5280b312e2e7898b6397b2ca3cfd03f67d7be1 upstream.

[backport for 5.10.y]

The O_TMPFILE creation implementation creates a specific order of
operations for inode allocation/freeing and unlinked list
modification. Currently both are serialised by the AGI, so the order
doesn't strictly matter as long as the are both in the same
transaction.

However, if we want to move the unlinked list insertions largely out
from under the AGI lock, then we have to be concerned about the
order in which we do unlinked list modification operations.
O_TMPFILE creation tells us this order is inode allocation/free,
then unlinked list modification.

Change xfs_ifree() to use this same ordering on unlinked list
removal. This way we always guarantee that when we enter the
iunlinked list removal code from this path, we already have the AGI
locked and we don't have to worry about lock nesting AGI reads
inside unlink list locks because it's already locked and attached to
the transaction.

We can do this safely as the inode freeing and unlinked list removal
are done in the same transaction and hence are atomic operations
with respect to log recovery.

Reported-by: Frank Hofmann <fhofmann@cloudflare.com>
Fixes: 298f7bec503f ("xfs: pin inode backing buffer to the inode log item")
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoxfs: fix up non-directory creation in SGID directories
Christoph Hellwig [Sat, 23 Jan 2021 00:48:18 +0000 (16:48 -0800)]
xfs: fix up non-directory creation in SGID directories

commit 01ea173e103edd5ec41acec65b9261b87e123fc2 upstream.

XFS always inherits the SGID bit if it is set on the parent inode, while
the generic inode_init_owner does not do this in a few cases where it can
create a possible security problem, see commit 0fa3ecd87848
("Fix up non-directory creation in SGID directories") for details.

Switch XFS to use the generic helper for the normal path to fix this,
just keeping the simple field inheritance open coded for the case of the
non-sgid case with the bsdgrpid mount option.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agointerconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate
Mike Tipton [Thu, 25 Nov 2021 17:47:51 +0000 (19:47 +0200)]
interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate

commit b95b668eaaa2574e8ee72f143c52075e9955177e upstream.

We're only adding BCMs to the commit list in aggregate(), but there are
cases where pre_aggregate() is called without subsequently calling
aggregate(). In particular, in icc_sync_state() when a node with initial
BW has zero requests. Since BCMs aren't added to the commit list in
these cases, we don't actually send the zero BW request to HW. So the
resources remain on unnecessarily.

Add BCMs to the commit list in pre_aggregate() instead, which is always
called even when there are no requests.

Signed-off-by: Mike Tipton <mdtipton@codeaurora.org>
[georgi: remove icc_sync_state for platforms with incomplete support]
Link: https://lore.kernel.org/r/20211125174751.25317-1-djakov@kernel.org
Signed-off-by: Georgi Djakov <djakov@kernel.org>
[dianders: dropped sm8350.c which isn't present in 5.10]
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Alex Elder <elder@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoKVM: SEV: add cache flush to solve SEV cache incoherency issues
Mingwei Zhang [Thu, 21 Apr 2022 03:14:07 +0000 (03:14 +0000)]
KVM: SEV: add cache flush to solve SEV cache incoherency issues

commit 683412ccf61294d727ead4a73d97397396e69a6b upstream.

Flush the CPU caches when memory is reclaimed from an SEV guest (where
reclaim also includes it being unmapped from KVM's memslots).  Due to lack
of coherency for SEV encrypted memory, failure to flush results in silent
data corruption if userspace is malicious/broken and doesn't ensure SEV
guest memory is properly pinned and unpinned.

Cache coherency is not enforced across the VM boundary in SEV (AMD APM
vol.2 Section 15.34.7). Confidential cachelines, generated by confidential
VM guests have to be explicitly flushed on the host side. If a memory page
containing dirty confidential cachelines was released by VM and reallocated
to another user, the cachelines may corrupt the new user at a later time.

KVM takes a shortcut by assuming all confidential memory remain pinned
until the end of VM lifetime. Therefore, KVM does not flush cache at
mmu_notifier invalidation events. Because of this incorrect assumption and
the lack of cache flushing, malicous userspace can crash the host kernel:
creating a malicious VM and continuously allocates/releases unpinned
confidential memory pages when the VM is running.

Add cache flush operations to mmu_notifier operations to ensure that any
physical memory leaving the guest VM get flushed. In particular, hook
mmu_notifier_invalidate_range_start and mmu_notifier_release events and
flush cache accordingly. The hook after releasing the mmu lock to avoid
contention with other vCPUs.

Cc: stable@vger.kernel.org
Suggested-by: Sean Christpherson <seanjc@google.com>
Reported-by: Mingwei Zhang <mizhang@google.com>
Signed-off-by: Mingwei Zhang <mizhang@google.com>
Message-Id: <20220421031407.2516575-4-mizhang@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[OP: applied kvm_arch_guest_memory_reclaimed() calls in kvm_set_memslot() and
kvm_mmu_notifier_invalidate_range_start();
OP: adjusted kvm_arch_guest_memory_reclaimed() to not use static_call_cond()]
Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agomm/slub: fix to return errno if kmalloc() fails
Chao Yu [Wed, 31 Aug 2022 14:54:54 +0000 (22:54 +0800)]
mm/slub: fix to return errno if kmalloc() fails

commit 7e9c323c52b379d261a72dc7bd38120a761a93cd upstream.

In create_unique_id(), kmalloc(, GFP_KERNEL) can fail due to
out-of-memory, if it fails, return errno correctly rather than
triggering panic via BUG_ON();

kernel BUG at mm/slub.c:5893!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP

Call trace:
 sysfs_slab_add+0x258/0x260 mm/slub.c:5973
 __kmem_cache_create+0x60/0x118 mm/slub.c:4899
 create_cache mm/slab_common.c:229 [inline]
 kmem_cache_create_usercopy+0x19c/0x31c mm/slab_common.c:335
 kmem_cache_create+0x1c/0x28 mm/slab_common.c:390
 f2fs_kmem_cache_create fs/f2fs/f2fs.h:2766 [inline]
 f2fs_init_xattr_caches+0x78/0xb4 fs/f2fs/xattr.c:808
 f2fs_fill_super+0x1050/0x1e0c fs/f2fs/super.c:4149
 mount_bdev+0x1b8/0x210 fs/super.c:1400
 f2fs_mount+0x44/0x58 fs/f2fs/super.c:4512
 legacy_get_tree+0x30/0x74 fs/fs_context.c:610
 vfs_get_tree+0x40/0x140 fs/super.c:1530
 do_new_mount+0x1dc/0x4e4 fs/namespace.c:3040
 path_mount+0x358/0x914 fs/namespace.c:3370
 do_mount fs/namespace.c:3383 [inline]
 __do_sys_mount fs/namespace.c:3591 [inline]
 __se_sys_mount fs/namespace.c:3568 [inline]
 __arm64_sys_mount+0x2f8/0x408 fs/namespace.c:3568

Cc: <stable@kernel.org>
Fixes: 81819f0fc8285 ("SLUB core")
Reported-by: syzbot+81684812ea68216e08c5@syzkaller.appspotmail.com
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Chao Yu <chao.yu@oppo.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agocan: flexcan: flexcan_mailbox_read() fix return value for drop = true
Marc Kleine-Budde [Thu, 11 Aug 2022 08:25:44 +0000 (10:25 +0200)]
can: flexcan: flexcan_mailbox_read() fix return value for drop = true

commit a09721dd47c8468b3f2fdd73f40422699ffe26dd upstream.

The following happened on an i.MX25 using flexcan with many packets on
the bus:

The rx-offload queue reached a length more than skb_queue_len_max. In
can_rx_offload_offload_one() the drop variable was set to true which
made the call to .mailbox_read() (here: flexcan_mailbox_read()) to
_always_ return ERR_PTR(-ENOBUFS) and drop the rx'ed CAN frame. So
can_rx_offload_offload_one() returned ERR_PTR(-ENOBUFS), too.

can_rx_offload_irq_offload_fifo() looks as follows:

|  while (1) {
|  skb = can_rx_offload_offload_one(offload, 0);
|  if (IS_ERR(skb))
|  continue;
|  if (!skb)
|  break;
|  ...
|  }

The flexcan driver wrongly always returns ERR_PTR(-ENOBUFS) if drop is
requested, even if there is no CAN frame pending. As the i.MX25 is a
single core CPU, while the rx-offload processing is active, there is
no thread to process packets from the offload queue. So the queue
doesn't get any shorter and this results is a tight loop.

Instead of always returning ERR_PTR(-ENOBUFS) if drop is requested,
return NULL if no CAN frame is pending.

Changes since v1: https://lore.kernel.org/all/20220810144536.389237-1-u.kleine-koenig@pengutronix.de
- don't break in can_rx_offload_irq_offload_fifo() in case of an error,
  return NULL in flexcan_mailbox_read() in case of no pending CAN frame
  instead

Fixes: 4e9c9484b085 ("can: rx-offload: Prepare for CAN FD support")
Link: https://lore.kernel.org/all/20220811094254.1864367-1-mkl@pengutronix.de
Cc: stable@vger.kernel.org # v5.5
Suggested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Tested-by: Thorsten Scherer <t.scherer@eckelmann.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoriscv: fix a nasty sigreturn bug...
Al Viro [Fri, 24 Sep 2021 01:55:27 +0000 (01:55 +0000)]
riscv: fix a nasty sigreturn bug...

commit 762df359aa5849e010ef04c3ed79d57588ce17d9 upstream.

riscv has an equivalent of arm bug fixed by 653d48b22166 ("arm: fix
really nasty sigreturn bug"); if signal gets caught by an interrupt that
hits when we have the right value in a0 (-513), *and* another signal
gets delivered upon sigreturn() (e.g. included into the blocked mask for
the first signal and posted while the handler had been running), the
syscall restart logics will see regs->cause equal to EXC_SYSCALL (we are
in a syscall, after all) and a0 already restored to its original value
(-513, which happens to be -ERESTARTNOINTR) and assume that we need to
apply the usual syscall restart logics.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Fixes: e2c0cdfba7f6 ("RISC-V: User-facing API")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/YxJEiSq%2FCGaL6Gm9@ZenIV/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agogpiolib: cdev: Set lineevent_state::irq after IRQ register successfully
Meng Li [Wed, 21 Sep 2022 03:20:20 +0000 (11:20 +0800)]
gpiolib: cdev: Set lineevent_state::irq after IRQ register successfully

commit 69bef19d6b9700e96285f4b4e28691cda3dcd0d1 upstream.

When running gpio test on nxp-ls1028 platform with below command
gpiomon --num-events=3 --rising-edge gpiochip1 25
There will be a warning trace as below:
Call trace:
free_irq+0x204/0x360
lineevent_free+0x64/0x70
gpio_ioctl+0x598/0x6a0
__arm64_sys_ioctl+0xb4/0x100
invoke_syscall+0x5c/0x130
......
el0t_64_sync+0x1a0/0x1a4
The reason of this issue is that calling request_threaded_irq()
function failed, and then lineevent_free() is invoked to release
the resource. Since the lineevent_state::irq was already set, so
the subsequent invocation of free_irq() would trigger the above
warning call trace. To fix this issue, set the lineevent_state::irq
after the IRQ register successfully.

Fixes: 468242724143 ("gpiolib: cdev: refactor lineevent cleanup into lineevent_free")
Cc: stable@vger.kernel.org
Signed-off-by: Meng Li <Meng.Li@windriver.com>
Reviewed-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agogpio: mockup: fix NULL pointer dereference when removing debugfs
Bartosz Golaszewski [Tue, 20 Sep 2022 07:18:41 +0000 (09:18 +0200)]
gpio: mockup: fix NULL pointer dereference when removing debugfs

commit b7df41a6f79dfb18ba2203f8c5f0e9c0b9b57f68 upstream.

We now remove the device's debugfs entries when unbinding the driver.
This now causes a NULL-pointer dereference on module exit because the
platform devices are unregistered *after* the global debugfs directory
has been recursively removed. Fix it by unregistering the devices first.

Fixes: 303e6da99429 ("gpio: mockup: remove gpio debugfs when remove device")
Cc: Wei Yongjun <weiyongjun1@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agowifi: mt76: fix reading current per-tid starting sequence number for aggregation
Felix Fietkau [Fri, 26 Aug 2022 18:23:29 +0000 (20:23 +0200)]
wifi: mt76: fix reading current per-tid starting sequence number for aggregation

commit c3a510e2b53785df31d882a773c4c0780b4c825f upstream.

The code was accidentally shifting register values down by tid % 32 instead of
(tid * field_size) % 32.

Cc: stable@vger.kernel.org
Fixes: a28bef561a5c ("mt76: mt7615: re-enable offloading of sequence number assignment")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220826182329.18155-1-nbd@nbd.name
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoefi: libstub: check Shim mode using MokSBStateRT
Ard Biesheuvel [Tue, 20 Sep 2022 15:08:23 +0000 (17:08 +0200)]
efi: libstub: check Shim mode using MokSBStateRT

commit 5f56a74cc0a6d9b9f8ba89cea29cd7c4774cb2b1 upstream.

We currently check the MokSBState variable to decide whether we should
treat UEFI secure boot as being disabled, even if the firmware thinks
otherwise. This is used by shim to indicate that it is not checking
signatures on boot images. In the kernel, we use this to relax lockdown
policies.

However, in cases where shim is not even being used, we don't want this
variable to interfere with lockdown, given that the variable may be
non-volatile and therefore persist across a reboot. This means setting
it once will persistently disable lockdown checks on a given system.

So switch to the mirrored version of this variable, called MokSBStateRT,
which is supposed to be volatile, and this is something we can check.

Cc: <stable@vger.kernel.org> # v4.19+
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Reviewed-by: Peter Jones <pjones@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoefi: x86: Wipe setup_data on pure EFI boot
Ard Biesheuvel [Thu, 4 Aug 2022 13:39:48 +0000 (15:39 +0200)]
efi: x86: Wipe setup_data on pure EFI boot

commit 63bf28ceb3ebbe76048c3fb2987996ca1ae64f83 upstream.

When booting the x86 kernel via EFI using the LoadImage/StartImage boot
services [as opposed to the deprecated EFI handover protocol], the setup
header is taken from the image directly, and given that EFI's LoadImage
has no Linux/x86 specific knowledge regarding struct bootparams or
struct setup_header, any absolute addresses in the setup header must
originate from the file and not from a prior loading stage.

Since we cannot generally predict where LoadImage() decides to load an
image (*), such absolute addresses must be treated as suspect: even if a
prior boot stage intended to make them point somewhere inside the
[signed] image, there is no way to validate that, and if they point at
an arbitrary location in memory, the setup_data nodes will not be
covered by any signatures or TPM measurements either, and could be made
to contain an arbitrary sequence of SETUP_xxx nodes, which could
interfere quite badly with the early x86 boot sequence.

(*) Note that, while LoadImage() does take a buffer/size tuple in
addition to a device path, which can be used to provide the image
contents directly, it will re-allocate such images, as the memory
footprint of an image is generally larger than the PE/COFF file
representation.

Cc: <stable@vger.kernel.org> # v5.10+
Link: https://lore.kernel.org/all/20220904165321.1140894-1-Jason@zx2c4.com/
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agomedia: flexcop-usb: fix endpoint type check
Johan Hovold [Mon, 22 Aug 2022 15:10:27 +0000 (17:10 +0200)]
media: flexcop-usb: fix endpoint type check

commit 763679f0eeff0185fc431498849bbc1c24460875 upstream.

Commit d725d20e81c2 ("media: flexcop-usb: sanity checking of endpoint
type") tried to add an endpoint type sanity check for the single
isochronous endpoint but instead broke the driver by checking the wrong
descriptor or random data beyond the last endpoint descriptor.

Make sure to check the right endpoint descriptor.

Fixes: d725d20e81c2 ("media: flexcop-usb: sanity checking of endpoint type")
Cc: Oliver Neukum <oneukum@suse.com>
Cc: stable@vger.kernel.org # 5.9
Reported-by: Dongliang Mu <mudongliangabcd@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20220822151027.27026-1-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoiommu/vt-d: Check correct capability for sagaw determination
Yi Liu [Wed, 21 Sep 2022 02:40:54 +0000 (10:40 +0800)]
iommu/vt-d: Check correct capability for sagaw determination

commit 154897807050c1161cb2660e502fc0470d46b986 upstream.

Check 5-level paging capability for 57 bits address width instead of
checking 1GB large page capability.

Fixes: 53fc7ad6edf2 ("iommu/vt-d: Correctly calculate sagaw value of IOMMU")
Cc: stable@vger.kernel.org
Reported-by: Raghunathan Srinivasan <raghunathan.srinivasan@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Raghunathan Srinivasan <raghunathan.srinivasan@intel.com>
Link: https://lore.kernel.org/r/20220916071212.2223869-2-yi.l.liu@intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoALSA: hda/realtek: Enable 4-speaker output Dell Precision 5530 laptop
Callum Osmotherly [Thu, 15 Sep 2022 13:06:08 +0000 (22:36 +0930)]
ALSA: hda/realtek: Enable 4-speaker output Dell Precision 5530 laptop

commit 1885ff13d4c42910b37a0e3f7c2f182520f4eed1 upstream.

Just as with the 5570 (and the other Dell laptops), this enables the two
subwoofer speakers on the Dell Precision 5530 together with the main
ones, significantly increasing the audio quality. I've tested this
myself on a 5530 and can confirm it's working as expected.

Signed-off-by: Callum Osmotherly <callum.osmotherly@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/YyMjQO3mhyXlMbCf@piranha
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoALSA: hda/realtek: Add quirk for ASUS GA503R laptop
Luke D. Jones [Thu, 15 Sep 2022 08:09:21 +0000 (20:09 +1200)]
ALSA: hda/realtek: Add quirk for ASUS GA503R laptop

commit ba1f818053b0668a1ce2fe86b840e81b592cc560 upstream.

The ASUS G15 2022 (GA503R) series laptop has the same node-to-DAC pairs
as early models and the G14, this includes bass speakers which are by
default mapped incorrectly to the 0x06 node.

Add a quirk to use the same DAC pairs as the G14.

Signed-off-by: Luke D. Jones <luke@ljones.dev>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220915080921.35563-4-luke@ljones.dev
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoALSA: hda/realtek: Add pincfg for ASUS G533Z HP jack
Luke D. Jones [Thu, 15 Sep 2022 08:09:20 +0000 (20:09 +1200)]
ALSA: hda/realtek: Add pincfg for ASUS G533Z HP jack

commit bc2c23549ccd7105eb6ff0d4f0ac519285628673 upstream.

Fixes up the pincfg for ASUS ROG Strix G15 (G533Z) headphone combo jack

[ Fixed the position in the quirk table by tiwai ]

Signed-off-by: Luke D. Jones <luke@ljones.dev>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220915080921.35563-3-luke@ljones.dev
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoALSA: hda/realtek: Add pincfg for ASUS G513 HP jack
Luke D. Jones [Thu, 15 Sep 2022 08:09:19 +0000 (20:09 +1200)]
ALSA: hda/realtek: Add pincfg for ASUS G513 HP jack

commit c611e659044168e7abcbae8ba1ea833521498fbb upstream.

Fixes up the pincfg for ASUS ROG Strix G513 headphone and mic combo jack

[ Fixed the position in the quirk table by tiwai ]

Signed-off-by: Luke D. Jones <luke@ljones.dev>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220915080921.35563-2-luke@ljones.dev
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoALSA: hda/realtek: Re-arrange quirk table entries
Takashi Iwai [Thu, 15 Sep 2022 15:47:24 +0000 (17:47 +0200)]
ALSA: hda/realtek: Re-arrange quirk table entries

commit b16c8f229a58eaddfc58aab447253464abd3c85e upstream.

A few entries have been mistakenly inserted in wrong positions without
considering the SSID ordering.  Place them at right positions.

Fixes: b7557267c233 ("ALSA: hda/realtek: Add quirk for ASUS GA402")
Fixes: 94db9cc8f8fa ("ALSA: hda/realtek: Add quirk for ASUS GU603")
Fixes: 739d0959fbed ("ALSA: hda: Add quirk for ASUS Flow x13")
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220915154724.31634-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 months agoALSA: hda/realtek: Enable 4-speaker output Dell Precision 5570 laptop
Callum Osmotherly [Wed, 14 Sep 2022 09:14:00 +0000 (18:44 +0930)]
ALSA: hda/realtek: Enable 4-speaker output Dell Precision 5570 laptop

commit bdc9b7396f7d4d6533e70fd8d5472f505b5ef58f upstream.

The Dell Precision 5570 uses the same 4-speakers-on-ALC289 just like the
previous Precision 5560. I replicated that patch onto this one, and can
confirm that the audio is much better (the woofers are now working);
I've tested it on my Dell Precision 5570.

Signed-off-by: Callum Osmotherly <callum.osmotherly@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/YyGbWM5wEoFMbW2v@piranha
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>