git.itanic.dy.fi Git - linux-stable/log

Linux 5.19.14

Link: https://lore.kernel.org/r/20221003070724.490989164@linuxfoundation.org
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Justin M. Forbes <jforbes@fedoraproject.org>
Tested-by: Slade Watkins <srw@sladewatkins.net>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Zan Aziz <zanaziz313@gmail.com>
Tested-by: Ron Economos <re@w6rz.net>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Tested-by: Fenil Jain <fkjainco@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

damon/sysfs: fix possible memleak on damon_sysfs_add_target

commit 1c8e2349f2d033f634d046063b704b2ca6c46972 upstream.

When damon_sysfs_add_target couldn't find proper task, New allocated
damon_target structure isn't registered yet, So, it's impossible to free
new allocated one by damon_sysfs_destroy_targets.

By calling damon_add_target as soon as allocating new target, Fix this
possible memory leak.

Link: https://lkml.kernel.org/r/20220926160611.48536-1-sj@kernel.org
Fixes: a61ea561c871 ("mm/damon/sysfs: link DAMON for virtual address spaces monitoring")
Signed-off-by: Levi Yun <ppbuk5246@gmail.com>
Signed-off-by: SeongJae Park <sj@kernel.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> [5.17.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/alternative: Fix race in try_get_desc()

commit efd608fa7403ba106412b437f873929e2c862e28 upstream.

I encountered some occasional crashes of poke_int3_handler() when
kprobes are set, while accessing desc->vec.

The text poke mechanism claims to have an RCU-like behavior, but it
does not appear that there is any quiescent state to ensure that
nobody holds reference to desc. As a result, the following race
appears to be possible, which can lead to memory corruption.

  CPU0 CPU1
  ---- ----
  text_poke_bp_batch()
  -> smp_store_release(&bp_desc, &desc)

  [ notice that desc is on
    the stack ]

poke_int3_handler()

[ int3 might be kprobe's
  so sync events are do not
  help ]

-> try_get_desc(descp=&bp_desc)
   desc = __READ_ONCE(bp_desc)

   if (!desc) [false, success]
  WRITE_ONCE(bp_desc, NULL);
  atomic_dec_and_test(&desc.refs)

  [ success, desc space on the stack
    is being reused and might have
    non-zero value. ]
arch_atomic_inc_not_zero(&desc->refs)

[ might succeed since desc points to
  stack memory that was freed and might
  be reused. ]

Fix this issue with small backportable patch. Instead of trying to
make RCU-like behavior for bp_desc, just eliminate the unnecessary
level of indirection of bp_desc, and hold the whole descriptor as a
global.  Anyhow, there is only a single descriptor at any given
moment.

Fixes: 1f676247f36a4 ("x86/alternatives: Implement a better poke_int3_handler() completion scheme")
Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@kernel.org
Link: https://lkml.kernel.org/r/20220920224743.3089-1-namit@vmware.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/cacheinfo: Add a cpu_llc_shared_mask() UP variant

commit df5b035b5683d6a25f077af889fb88e09827f8bc upstream.

On a CONFIG_SMP=n kernel, the LLC shared mask is 0, which prevents
__cache_amd_cpumap_setup() from doing the L3 masks setup, and more
specifically from setting up the shared_cpu_map and shared_cpu_list
files in sysfs, leading to lscpu from util-linux getting confused and
segfaulting.

Add a cpu_llc_shared_mask() UP variant which returns a mask with a
single bit set, i.e., for CPU0.

Fixes: 2b83809a5e6d ("x86/cpu/amd: Derive L3 shared_cpu_map from cpu_llc_shared_mask")
Reported-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1660148115-302-1-git-send-email-ssengar@linux.microsoft.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

KVM: x86: Hide IA32_PLATFORM_DCA_CAP[31:0] from the guest

[ Upstream commit aae2e72229cdb21f90df2dbe4244c977e5d3265b ]

The only thing reported by CPUID.9 is the value of
IA32_PLATFORM_DCA_CAP[31:0] in EAX. This MSR doesn't even exist in the
guest, since CPUID.1:ECX.DCA[bit 18] is clear in the guest.

Clear CPUID.9 in KVM_GET_SUPPORTED_CPUID.

Fixes: 24c82e576b78 ("KVM: Sanitize cpuid")
Signed-off-by: Jim Mattson <jmattson@google.com>
Message-Id: <20220922231854.249383-1-jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

perf tests record: Fail the test if the 'errs' counter is not zero

[ Upstream commit 25c5e67cdf744cbb93fd06647611d3036218debe ]

We were just checking for the 'err' variable, when we should really see
if there was some of the many checked errors that don't stop the test
right away.

Detected with clang 15.0.0:

  44    75.23 fedora:37       : FAIL clang version 15.0.0 (Fedora 15.0.0-2.fc37)

    tests/perf-record.c:68:16: error: variable 'errs' set but not used [-Werror,-Wunused-but-set-variable]
            int err = -1, errs = 0, i, wakeups = 0;
                          ^
    1 error generated.

The patch introducing this 'perf test' entry had that check:

  +       return (err < 0 || errs > 0) ? -1 : 0;

But at some point we lost that:

  -   return (err < 0 || errs > 0) ? -1 : 0;
  +   if (err == -EACCES)
  +               return TEST_SKIP;
  +   if (err < 0)
  +               return TEST_FAIL;
  +   return TEST_OK

Put it back.

Fixes: 2cf88f4614c996e5 ("perf test: Use skip in PERF_RECORD_*")
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/YzR0n5QhsH9VyYB0@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

perf test: Fix test case 87 ("perf record tests") for hybrid systems

[ Upstream commit 457c8b60267054869513ab1fb5513abb0a566dd0 ]

The test case 87 ("perf record tests") failed on hybrid systems,the event
"cpu/br_inst_retired.near_call/p" is only for non-hybrid system. Correct
the test event to support both non-hybrid and hybrid systems.

Before:

  # ./perf test 87
  87: perf record tests                                   : FAILED!

After:

  # ./perf test 87
  87: perf record tests                                   : Ok

Fixes: 24f378e66021f559 ("perf test: Add basic perf record tests")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220927051513.3768717-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

net: ethernet: mtk_eth_soc: fix mask of RX_DMA_GET_SPORT{,_V2}

[ Upstream commit c9da02bfb1112461e048d3b736afb1873f6f4ccf ]

The bitmasks applied in RX_DMA_GET_SPORT and RX_DMA_GET_SPORT_V2 macros
were swapped. Fix that.

Reported-by: Chen Minqiang <ptpt52@gmail.com>
Fixes: 160d3a9b192985 ("net: ethernet: mtk_eth_soc: introduce MTK_NETSYS_V2 support")
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Link: https://lore.kernel.org/r/YzMW+mg9UsaCdKRQ@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

net: mscc: ocelot: fix tagged VLAN refusal while under a VLAN-unaware bridge

[ Upstream commit 276d37eb449133bc22872b8f0a6f878e120deeff ]

Currently the following set of commands fails:

$ ip link add br0 type bridge # vlan_filtering 0
$ ip link set swp0 master br0
$ bridge vlan
port              vlan-id
swp0              1 PVID Egress Untagged
$ bridge vlan add dev swp0 vid 10
Error: mscc_ocelot_switch_lib: Port with more than one egress-untagged VLAN cannot have egress-tagged VLANs.

Dumping ocelot->vlans, one can see that the 2 egress-untagged VLANs on swp0 are
vid 1 (the bridge PVID) and vid 4094, a PVID used privately by the driver for
VLAN-unaware bridging. So this is why bridge vid 10 is refused, despite
'bridge vlan' showing a single egress untagged VLAN.

As mentioned in the comment added, having this private VLAN does not impose
restrictions to the hardware configuration, yet it is a bookkeeping problem.

There are 2 possible solutions.

One is to make the functions that operate on VLAN-unaware pvids:
- ocelot_add_vlan_unaware_pvid()
- ocelot_del_vlan_unaware_pvid()
- ocelot_port_setup_dsa_8021q_cpu()
- ocelot_port_teardown_dsa_8021q_cpu()
call something different than ocelot_vlan_member_(add|del)(), the latter being
the real problem, because it allocates a struct ocelot_bridge_vlan *vlan which
it adds to ocelot->vlans. We don't really *need* the private VLANs in
ocelot->vlans, it's just that we have the extra convenience of having the
vlan->portmask cached in software (whereas without these structures, we'd have
to create a raw ocelot_vlant_rmw_mask() procedure which reads back the current
port mask from hardware).

The other solution is to filter out the private VLANs from
ocelot_port_num_untagged_vlans(), since they aren't what callers care about.
We only need to do this to the mentioned function and not to
ocelot_port_num_tagged_vlans(), because private VLANs are never egress-tagged.

Nothing else seems to be broken in either solution, but the first one requires
more rework which will conflict with the net-next change  36a0bf443585 ("net:
mscc: ocelot: set up tag_8021q CPU ports independent of user port affinity"),
and I'd like to avoid that. So go with the other one.

Fixes: 54c319846086 ("net: mscc: ocelot: enforce FDB isolation when VLAN-unaware")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220927122042.1100231-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

clk: imx93: drop of_match_ptr

[ Upstream commit daaa2fbe678efdaced53d1c635f4d326751addf8 ]

There is build warning when CONFIG_OF is not selected.
>> drivers/clk/imx/clk-imx93.c:324:34: warning: 'imx93_clk_of_match'
>> defined but not used [-Wunused-const-variable=]
324 | static const struct of_device_id imx93_clk_of_match[] = {
| ^~~~~~~~~~~~~~~~~~

The driver only support DT table, no sense to use of_match_ptr.

Fixes: 24defbe194b6 ("clk: imx: add i.MX93 clk")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Link: https://lore.kernel.org/r/20220830033137.4149542-3-peng.fan@oss.nxp.com
Reviewed-by: Abel Vesa <abel.vesa@linaro.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

clk: iproc: Do not rely on node name for correct PLL setup

[ Upstream commit 1b24a132eba7a1c19475ba2510ec1c00af3ff914 ]

After commit 31fd9b79dc58 ("ARM: dts: BCM5301X: update CRU block
description") a warning from clk-iproc-pll.c was generated due to a
duplicate PLL name as well as the console stopped working. Upon closer
inspection it became clear that iproc_pll_clk_setup() used the Device
Tree node unit name as an unique identifier as well as a parent name to
parent all clocks under the PLL.

BCM5301X was the first platform on which that got noticed because of the
DT node unit name renaming but the same assumptions hold true for any
user of the iproc_pll_clk_setup() function.

The first 'clock-output-names' property is always guaranteed to be
unique as well as providing the actual desired PLL clock name, so we
utilize that to register the PLL and as a parent name of all children
clock.

Fixes: 5fe225c105fd ("clk: iproc: add initial common clock support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Rafał Miłecki <rafal@milecki.pl>
Link: https://lore.kernel.org/r/20220905161504.1526-1-f.fainelli@gmail.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/i915/gt: Perf_limit_reasons are only available for Gen11+

[ Upstream commit 7738be973fc4e2ba22154fafd3a5d7b9666f9abf ]

Register GT0_PERF_LIMIT_REASONS (0x1381a8) is available only for
Gen11+. Therefore ensure perf_limit_reasons sysfs files are created only
for Gen11+. Otherwise on Gen < 5 accessing these files results in the
following oops:

<1> [88.829420] BUG: unable to handle page fault for address: ffffc90000bb81a8
<1> [88.829438] #PF: supervisor read access in kernel mode
<1> [88.829447] #PF: error_code(0x0000) - not-present page

This patch is a backport of the drm-tip commit 0d2d201095e9
("drm/i915: Perf_limit_reasons are only available for Gen11+") to
drm-intel-fixes. The backport is not identical to the original, it only
includes the sysfs portions of if. The debugfs portion is not available
in drm-intel-fixes so has not been backported.

Bspec: 20008
Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6863
Fixes: fa68bff7cf27 ("drm/i915/gt: Add sysfs throttle frequency interfaces")
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220919162401.2077713-1-ashutosh.dixit@intel.com
(backported from commit 0d2d201095e9f141d6a9fb44320afce761f8b5c2)
Signed-off-by: Sasha Levin <sashal@kernel.org>

clk: imx: imx6sx: remove the SET_RATE_PARENT flag for QSPI clocks

[ Upstream commit b1ff1bfe81e763420afd5f3f25f0b3cbfd97055c ]

There is no dedicate parent clock for QSPI so SET_RATE_PARENT flag
should not be used. For instance, the default parent clock for QSPI is
pll2_bus, which is also the parent clock for quite a few modules, such
as MMDC, once GPMI NAND set clock rate for EDO5 mode can cause system
hang due to pll2_bus rate changed.

Fixes: f1541e15e38e ("clk: imx6sx: Switch to clk_hw based API")
Signed-off-by: Han Xu <han.xu@nxp.com>
Link: https://lore.kernel.org/r/20220915150959.3646702-1-han.xu@nxp.com
Tested-by: Fabio Estevam <festevam@denx.de>
Reviewed-by: Abel Vesa <abel.vesa@linaro.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

vdpa/mlx5: Fix MQ to support non power of two num queues

[ Upstream commit a43ae8057cc154fd26a3a23c0e8643bef104d995 ]

RQT objects require that a power of two value be configured for both
rqt_max_size and rqt_actual size.

For create_rqt, make sure to round up to the power of two the value of
given by the user who created the vdpa device and given by
ndev->rqt_size. The actual size is also rounded up to the power of two
using the current number of VQs given by ndev->cur_num_vqs.

Same goes with modify_rqt where we need to make sure act size is power
of two based on the new number of QPs.

Without this patch, attempt to create a device with non power of two QPs
would result in error from firmware.

Fixes: 52893733f2c5 ("vdpa/mlx5: Add multiqueue support")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Message-Id: <20220912125019.833708-1-elic@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

virtio-blk: Fix WARN_ON_ONCE in virtio_queue_rq()

[ Upstream commit 37fafe6b61e4f15d977982635bb785f4e605f7cd ]

If a request fails at virtio_queue_rqs(), it is inserted to requeue_list
and passed to virtio_queue_rq(). Then blk_mq_start_request() can be called
again at virtio_queue_rq() and trigger WARN_ON_ONCE like below trace because
request state was already set to MQ_RQ_IN_FLIGHT in virtio_queue_rqs()
despite the failure.

[    1.890468] ------------[ cut here ]------------
[    1.890776] WARNING: CPU: 2 PID: 122 at block/blk-mq.c:1143
blk_mq_start_request+0x8a/0xe0
[    1.891045] Modules linked in:
[    1.891250] CPU: 2 PID: 122 Comm: journal-offline Not tainted 5.19.0+ #44
[    1.891504] Hardware name: ChromiumOS crosvm, BIOS 0
[    1.891739] RIP: 0010:blk_mq_start_request+0x8a/0xe0
[    1.891961] Code: 12 80 74 22 48 8b 4b 10 8b 89 64 01 00 00 8b 53
20 83 fa ff 75 08 ba 00 00 00 80 0b 53 24 c1 e1 10 09 d1 89 48 34 5b
41 5e c3 <0f> 0b eb b8 65 8b 05 2b 39 b6 7e 89 c0 48 0f a3 05 39 77 5b
01 0f
[    1.892443] RSP: 0018:ffffc900002777b0 EFLAGS: 00010202
[    1.892673] RAX: 0000000000000000 RBX: ffff888004bc0000 RCX: 0000000000000000
[    1.892952] RDX: 0000000000000000 RSI: ffff888003d7c200 RDI: ffff888004bc0000
[    1.893228] RBP: 0000000000000000 R08: 0000000000000001 R09: ffff888004bc0100
[    1.893506] R10: ffffffffffffffff R11: ffffffff8185ca10 R12: ffff888004bc0000
[    1.893797] R13: ffffc90000277900 R14: ffff888004ab2340 R15: ffff888003d86e00
[    1.894060] FS:  00007ffa143a4640(0000) GS:ffff88807dd00000(0000)
knlGS:0000000000000000
[    1.894412] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.894682] CR2: 00005648577d9088 CR3: 00000000053da004 CR4: 0000000000170ee0
[    1.894953] Call Trace:
[    1.895139]  <TASK>
[    1.895303]  virtblk_prep_rq+0x1e5/0x280
[    1.895509]  virtio_queue_rq+0x5c/0x310
[    1.895710]  ? virtqueue_add_sgs+0x95/0xb0
[    1.895905]  ? _raw_spin_unlock_irqrestore+0x16/0x30
[    1.896133]  ? virtio_queue_rqs+0x340/0x390
[    1.896453]  ? sbitmap_get+0xfa/0x220
[    1.896678]  __blk_mq_issue_directly+0x41/0x180
[    1.896906]  blk_mq_plug_issue_direct+0xd8/0x2c0
[    1.897115]  blk_mq_flush_plug_list+0x115/0x180
[    1.897342]  blk_add_rq_to_plug+0x51/0x130
[    1.897543]  blk_mq_submit_bio+0x3a1/0x570
[    1.897750]  submit_bio_noacct_nocheck+0x418/0x520
[    1.897985]  ? submit_bio_noacct+0x1e/0x260
[    1.897989]  ext4_bio_write_page+0x222/0x420
[    1.898000]  mpage_process_page_bufs+0x178/0x1c0
[    1.899451]  mpage_prepare_extent_to_map+0x2d2/0x440
[    1.899603]  ext4_writepages+0x495/0x1020
[    1.899733]  do_writepages+0xcb/0x220
[    1.899871]  ? __seccomp_filter+0x171/0x7e0
[    1.900006]  file_write_and_wait_range+0xcd/0xf0
[    1.900167]  ext4_sync_file+0x72/0x320
[    1.900308]  __x64_sys_fsync+0x66/0xa0
[    1.900449]  do_syscall_64+0x31/0x50
[    1.900595]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[    1.900747] RIP: 0033:0x7ffa16ec96ea
[    1.900883] Code: b8 4a 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3
48 83 ec 18 89 7c 24 0c e8 e3 02 f8 ff 8b 7c 24 0c 89 c2 b8 4a 00 00
00 0f 05 <48> 3d 00 f0 ff ff 77 36 89 d7 89 44 24 0c e8 43 03 f8 ff 8b
44 24
[    1.901302] RSP: 002b:00007ffa143a3ac0 EFLAGS: 00000293 ORIG_RAX:
000000000000004a
[    1.901499] RAX: ffffffffffffffda RBX: 0000560277ec6fe0 RCX: 00007ffa16ec96ea
[    1.901696] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000016
[    1.901884] RBP: 0000560277ec5910 R08: 0000000000000000 R09: 00007ffa143a4640
[    1.902082] R10: 00007ffa16e4d39e R11: 0000000000000293 R12: 00005602773f59e0
[    1.902459] R13: 0000000000000000 R14: 00007fffbfc007ff R15: 00007ffa13ba4000
[    1.902763]  </TASK>
[    1.902877] ---[ end trace 0000000000000000 ]---

To avoid calling blk_mq_start_request() twice, This patch moves the
execution of blk_mq_start_request() to the end of virtblk_prep_rq().
And instead of requeuing failed request to plug list in the error path of
virtblk_add_req_batch(), it uses blk_mq_requeue_request() to change failed
request state to MQ_RQ_IDLE. Then virtblk can safely handle the request
on the next trial.

Fixes: 0e9911fa768f ("virtio-blk: support mq_ops->queue_rqs()")
Reported-by: Alexandre Courbot <acourbot@chromium.org>
Tested-by: Alexandre Courbot <acourbot@chromium.org>
Signed-off-by: Suwan Kim <suwan.kim027@gmail.com>
Message-Id: <20220830150153.12627-1-suwan.kim027@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Pankaj Raghav <p.raghav@samsung.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

vdpa/ifcvf: fix the calculation of queuepair

[ Upstream commit db5db1a00d0816207be3a0166fcb4f523eaf3b52 ]

The q_pair_id to address a queue pair in the lm bar should be
calculated by queue_id / 2 rather than queue_id / nr_vring.

Fixes: 2ddae773c93b ("vDPA/ifcvf: detect and use the onboard number of queues directly")
Signed-off-by: Angus Chen <angus.chen@jaguarmicro.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20220923091013.191-1-angus.chen@jaguarmicro.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

ice: xsk: drop power of 2 ring size restriction for AF_XDP

[ Upstream commit b3056ae2b57858b02b376b3fed6077040caf14b4 ]

We had multiple customers in the past months that reported commit
296f13ff3854 ("ice: xsk: Force rings to be sized to power of 2")
makes them unable to use ring size of 8160 in conjunction with AF_XDP.
Remove this restriction.

Fixes: 296f13ff3854 ("ice: xsk: Force rings to be sized to power of 2")
CC: Alasdair McWilliam <alasdair.mcwilliam@outlook.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

ice: xsk: change batched Tx descriptor cleaning

[ Upstream commit 29322791bc8b4f42fc65734840826e3ddc30921e ]

AF_XDP Tx descriptor cleaning in ice driver currently works in a "lazy"
way - descriptors are not cleaned immediately after send. We rather hold
on with cleaning until we see that free space in ring drops below
particular threshold. This was supposed to reduce the amount of
unnecessary work related to cleaning and instead of keeping the ring
empty, ring was rather saturated.

In AF_XDP realm cleaning Tx descriptors implies producing them to CQ.
This is a way of letting know user space that particular descriptor has
been sent, as John points out in [0].

We tried to implement serial descriptor cleaning which would be used in
conjunction with batched cleaning but it made code base more convoluted
and probably harder to maintain in future. Therefore we step away from
batched cleaning in a current form in favor of an approach where we set
RS bit on every last descriptor from a batch and clean always at the
beginning of ice_xmit_zc().

This means that we give up a bit of Tx performance, but this doesn't
hurt l2fwd scenario which is way more meaningful than txonly as this can
be treaten as AF_XDP based packet generator. l2fwd is not hurt due to
the fact that Tx side is much faster than Rx and Rx is the one that has
to catch Tx up.

FWIW Tx descriptors are still produced in a batched way.

[0]: https://lore.kernel.org/bpf/62b0a20232920_3573208ab@john.notmuch/

Fixes: 126cdfe1007a ("ice: xsk: Improve AF_XDP ZC Tx and use batching API")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

selftests: Fix the if conditions of in test_extra_filter()

[ Upstream commit bc7a319844891746135dc1f34ab9df78d636a3ac ]

The socket 2 bind the addr in use, bind should fail with EADDRINUSE. So
if bind success or errno != EADDRINUSE, testcase should be failed.

Fixes: 3ca8e4029969 ("soreuseport: BPF selection functional test")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Link: https://lore.kernel.org/r/1663916557-10730-1-git-send-email-wangyufen@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

net: phy: Don't WARN for PHY_UP state in mdio_bus_phy_resume()

[ Upstream commit ea64cdfad124922c931633e39287c5a31a9b14a1 ]

Commit 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume()
state") introduced a WARN() on resume from system sleep if a PHY is not
in PHY_HALTED state.

Commit 6dbe852c379f ("net: phy: Don't WARN for PHY_READY state in
mdio_bus_phy_resume()") added an exemption for PHY_READY state from
the WARN().

It turns out PHY_UP state needs to be exempted as well because the
following may happen on suspend:

  mdio_bus_phy_suspend()
    phy_stop_machine()
      phydev->state = PHY_UP  #  if (phydev->state >= PHY_UP)

Fixes: 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state")
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/netdev/2b1a1588-505e-dff3-301d-bfc1fb14d685@samsung.com/
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Cc: Xiaolei Wang <xiaolei.wang@windriver.com>
Link: https://lore.kernel.org/r/8128fdb51eeebc9efbf3776a4097363a1317aaf1.1663905575.git.lukas@wunner.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

net: stmmac: power up/down serdes in stmmac_open/release

[ Upstream commit 49725ffc15fc4e9fae68c55b691fd25168cbe5c1 ]

This commit fixes DMA engine reset timeout issue in suspend/resume
with ADLink I-Pi SMARC Plus board which dmesg shows:
...
[   54.678271] PM: suspend exit
[   54.754066] intel-eth-pci 0000:00:1d.2 enp0s29f2: PHY [stmmac-3:01] driver [Maxlinear Ethernet GPY215B] (irq=POLL)
[   54.755808] intel-eth-pci 0000:00:1d.2 enp0s29f2: Register MEM_TYPE_PAGE_POOL RxQ-0
...
[   54.780482] intel-eth-pci 0000:00:1d.2 enp0s29f2: Register MEM_TYPE_PAGE_POOL RxQ-7
[   55.784098] intel-eth-pci 0000:00:1d.2: Failed to reset the dma
[   55.784111] intel-eth-pci 0000:00:1d.2 enp0s29f2: stmmac_hw_setup: DMA engine initialization failed
[   55.784115] intel-eth-pci 0000:00:1d.2 enp0s29f2: stmmac_open: Hw setup failed
...

The issue is related with serdes which impacts clock.  There is
serdes in ADLink I-Pi SMARC board ethernet controller. Please refer to
commit b9663b7ca6ff78 ("net: stmmac: Enable SERDES power up/down sequence")
for detial. When issue is reproduced, DMA engine clock is not ready
because serdes is not powered up.

To reproduce DMA engine reset timeout issue with hardware which has
serdes in GBE controller, install Ubuntu. In Ubuntu GUI, click
"Power Off/Log Out" -> "Suspend" menu, it disables network interface,
then goes to sleep mode. When it wakes up, it enables network
interface again. Stmmac driver is called in this way:

1. stmmac_release: Stop network interface. In this function, it
   disables DMA engine and network interface;
2. stmmac_suspend: It is called in kernel suspend flow. But because
   network interface has been disabled(netif_running(ndev) is
   false), it does nothing and returns directly;
3. System goes into S3 or S0ix state. Some time later, system is
   waken up by keyboard or mouse;
4. stmmac_resume: It does nothing because network interface has
   been disabled;
5. stmmac_open: It is called to enable network interace again. DMA
   engine is initialized in this API, but serdes is not power on so
   there will be DMA engine reset timeout issue.

Similarly, serdes powerdown should be added in stmmac_release.
Network interface might be disabled by cmd "ifconfig eth0 down",
DMA engine, phy and mac have been disabled in ndo_stop callback,
serdes should be powered down as well. It doesn't make sense that
serdes is on while other components have been turned off.

If ethernet interface is in enabled state(netif_running(ndev) is true)
before suspend/resume, the issue couldn't be reproduced  because serdes
could be powered up in stmmac_resume.

Because serdes_powerup is added in stmmac_open, it doesn't need to be
called in probe function.

Fixes: b9663b7ca6ff78 ("net: stmmac: Enable SERDES power up/down sequence")
Signed-off-by: Junxiao Chang <junxiao.chang@intel.com>
Reviewed-by: Voon Weifeng <weifeng.voon@intel.com>
Tested-by: Jimmy JS Chen <jimmyjs.chen@adlinktech.com>
Tested-by: Looi, Hong Aun <hong.aun.looi@intel.com>
Link: https://lore.kernel.org/r/20220923050448.1220250-1-junxiao.chang@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mac80211: fix memory corruption in minstrel_ht_update_rates()

[ Upstream commit be92292b90bfdc31f332c962882b6d3ea0285fdf ]

During our testing of WFM200 module over SDIO on i.MX6Q-based platform,
we discovered a memory corruption on the system, tracing back to the wfx
driver. Using kfence, it was possible to trace it back to the root
cause, which is hw->max_rates set to 8 in wfx_init_common,
while the maximum defined by IEEE80211_TX_TABLE_SIZE is 4.

This causes array out-of-bounds writes during updates of the rate table,
as seen below:

BUG: KFENCE: memory corruption in kfree_rcu_work+0x320/0x36c

Corrupted memory at 0xe0a4ffe0 [ 0x03 0x03 0x03 0x03 0x01 0x00 0x00
0x02 0x02 0x02 0x09 0x00 0x21 0xbb 0xbb 0xbb ] (in kfence-#81):
kfree_rcu_work+0x320/0x36c
process_one_work+0x3ec/0x920
worker_thread+0x60/0x7a4
kthread+0x174/0x1b4
ret_from_fork+0x14/0x2c
0x0

kfence-#81: 0xe0a4ffc0-0xe0a4ffdf, size=32, cache=kmalloc-64

allocated by task 297 on cpu 0 at 631.039555s:
minstrel_ht_update_rates+0x38/0x2b0 [mac80211]
rate_control_tx_status+0xb4/0x148 [mac80211]
ieee80211_tx_status_ext+0x364/0x1030 [mac80211]
ieee80211_tx_status+0xe0/0x118 [mac80211]
ieee80211_tasklet_handler+0xb0/0xe0 [mac80211]
tasklet_action_common.constprop.0+0x11c/0x148
__do_softirq+0x1a4/0x61c
irq_exit+0xcc/0x104
call_with_stack+0x18/0x20
__irq_svc+0x80/0xb0
wq_worker_sleeping+0x10/0x100
wq_worker_sleeping+0x10/0x100
schedule+0x50/0xe0
schedule_timeout+0x2e0/0x474
wait_for_completion+0xdc/0x1ec
mmc_wait_for_req_done+0xc4/0xf8
mmc_io_rw_extended+0x3b4/0x4ec
sdio_io_rw_ext_helper+0x290/0x384
sdio_memcpy_toio+0x30/0x38
wfx_sdio_copy_to_io+0x88/0x108 [wfx]
wfx_data_write+0x88/0x1f0 [wfx]
bh_work+0x1c8/0xcc0 [wfx]
process_one_work+0x3ec/0x920
worker_thread+0x60/0x7a4
kthread+0x174/0x1b4
ret_from_fork+0x14/0x2c 0x0

After discussion on the wireless mailing list it was clarified
that the issue has been introduced by:
commit ee0e16ab756a ("mac80211: minstrel_ht: fill all requested rates")
and fix shall be in minstrel_ht_update_rates in rc80211_minstrel_ht.c.

Fixes: ee0e16ab756a ("mac80211: minstrel_ht: fill all requested rates")
Link: https://lore.kernel.org/all/12e5adcd-8aed-f0f7-70cc-4fb7b656b829@camlingroup.com/
Link: https://lore.kernel.org/linux-wireless/20220915131445.30600-1-lech.perczak@camlingroup.com/
Cc: Jérôme Pouiller <jerome.pouiller@silabs.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Peter Seiderer <ps.report@gmx.net>
Cc: Kalle Valo <kvalo@kernel.org>
Cc: Krzysztof Drobiński <krzysztof.drobinski@camlingroup.com>,
Signed-off-by: Paweł Lenkow <pawel.lenkow@camlingroup.com>
Signed-off-by: Lech Perczak <lech.perczak@camlingroup.com>
Reviewed-by: Peter Seiderer <ps.report@gmx.net>
Reviewed-by: Jérôme Pouiller <jerome.pouiller@silabs.com>
Acked-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: mac80211: fix regression with non-QoS drivers

[ Upstream commit d873697ef2b7e1b6fdd8e9d449d9354bd5d29a4a ]

Commit 10cb8e617560 ("mac80211: enable QoS support for nl80211 ctrl port")
changed ieee80211_tx_control_port() to aways call
__ieee80211_select_queue() without checking local->hw.queues.

__ieee80211_select_queue() returns a queue-id between 0 and 3, which means
that now ieee80211_tx_control_port() may end up setting the queue-mapping
for a skb to a value higher then local->hw.queues if local->hw.queues
is less then 4.

Specifically this is a problem for ralink rt2500-pci cards where
local->hw.queues is 2. There this causes rt2x00queue_get_tx_queue() to
return NULL and the following error to be logged: "ieee80211 phy0:
rt2x00mac_tx: Error - Attempt to send packet over invalid queue 2",
after which association with the AP fails.

Other callers of __ieee80211_select_queue() skip calling it when
local->hw.queues < IEEE80211_NUM_ACS, add the same check to
ieee80211_tx_control_port(). This fixes ralink rt2500-pci and
similar cards when less then 4 tx-queues no longer working.

Fixes: 10cb8e617560 ("mac80211: enable QoS support for nl80211 ctrl port")
Cc: Markus Theil <markus.theil@tu-ilmenau.de>
Suggested-by: Stanislaw Gruszka <stf_xl@wp.pl>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20220918192052.443529-1-hdegoede@redhat.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

wifi: cfg80211: fix MCS divisor value

[ Upstream commit 64e966d1e84b29c9fa916cfeaabbf4013703942e ]

The Bitrate for HE/EHT MCS6 is calculated wrongly due to the
incorrect MCS divisor value for mcs6. Fix it with the proper
value.

previous mcs_divisor value = (11769/6144) = 1.915527

fixed mcs_divisor value = (11377/6144) = 1.851725

Fixes: 9c97c88d2f4b ("cfg80211: Add support to calculate and report 4096-QAM HE rates")
Signed-off-by: Tamizh Chelvam Raja <quic_tamizhr@quicinc.com>
Link: https://lore.kernel.org/r/20220908181034.9936-1-quic_tamizhr@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

nvme: Fix IOC_PR_CLEAR and IOC_PR_RELEASE ioctls for nvme devices

[ Upstream commit c292a337d0e45a292c301e3cd51c35aa0ae91e95 ]

The IOC_PR_CLEAR and IOC_PR_RELEASE ioctls are
non-functional on NVMe devices because the nvme_pr_clear()
and nvme_pr_release() functions set the IEKEY field incorrectly.
The IEKEY field should be set only when the key is zero (i.e,
not specified). The current code does it backwards.

Furthermore, the NVMe spec describes the persistent
reservation "clear" function as an option on the reservation
release command. The current implementation of nvme_pr_clear()
erroneously uses the reservation register command.

Fix these errors. Note that NVMe version 1.3 and later specify
that setting the IEKEY field will return an error of Invalid
Field in Command. The fix will set IEKEY when the key is zero,
which is appropriate as these ioctls consider a zero key to
be "unspecified", and the intention of the spec change is
to require a valid key.

Tested on a version 1.4 PCI NVMe device in an Azure VM.

Fixes: 1673f1f08c88 ("nvme: move block_device_operations and ns/ctrl freeing to common code")
Fixes: 1d277a637a71 ("NVMe: Add persistent reservation ops")
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>

net/mlxbf_gige: Fix an IS_ERR() vs NULL bug in mlxbf_gige_mdio_probe

[ Upstream commit 4774db8dfc6a2e6649920ebb2fc8e2f062c2080d ]

The devm_ioremap() function returns NULL on error, it doesn't return
error pointers.

Fixes: 3a1a274e933f ("mlxbf_gige: compute MDIO period based on i1clk")
Signed-off-by: Peng Wu <wupeng58@huawei.com>
Link: https://lore.kernel.org/r/20220923023640.116057-1-wupeng58@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

cxgb4: fix missing unlock on ETHOFLD desc collect fail path

[ Upstream commit c635ebe8d911a93bd849a9419b01a58783de76f1 ]

The label passed to the QDESC_GET for the ETHOFLD TXQ, RXQ, and FLQ, is the
'out' one, which skips the 'out_unlock' label, and thus doesn't unlock the
'uld_mutex' before returning. Additionally, since commit 5148e5950c67
("cxgb4: add EOTID tracking and software context dump"), the access to
these ETHOFLD hardware queues should be protected by the 'mqprio_mutex'
instead.

Fixes: 2d0cb84dd973 ("cxgb4: add ETHOFLD hardware queue support")
Fixes: 5148e5950c67 ("cxgb4: add EOTID tracking and software context dump")
Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Reviewed-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Link: https://lore.kernel.org/r/20220922175109.764898-1-rafaelmendsr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

net: sched: act_ct: fix possible refcount leak in tcf_ct_init()

[ Upstream commit 6e23ec0ba92d426c77a73a9ccab16346e5e0ef49 ]

nf_ct_put need to be called to put the refcount got by tcf_ct_fill_params
to avoid possible refcount leak when tcf_ct_flow_table_get fails.

Fixes: c34b961a2492 ("net/sched: act_ct: Create nf flow table per zone")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Link: https://lore.kernel.org/r/20220923020046.8021-1-hbh25y@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

usbnet: Fix memory leak in usbnet_disconnect()

[ Upstream commit a43206156263fbaf1f2b7f96257441f331e91bb7 ]

Currently usbnet_disconnect() unanchors and frees all deferred URBs
using usb_scuttle_anchored_urbs(), which does not free urb->context,
causing a memory leak as reported by syzbot.

Use a usb_get_from_anchor() while loop instead, similar to what we did
in commit 19cfe912c37b ("Bluetooth: btusb: Fix memory leak in
play_deferred"). Also free urb->sg.

Reported-and-tested-by: syzbot+dcd3e13cf4472f2e0ba1@syzkaller.appspotmail.com
Fixes: 69ee472f2706 ("usbnet & cdc-ether: Autosuspend for online devices")
Fixes: 638c5115a794 ("USBNET: support DMA SG")
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Link: https://lore.kernel.org/r/20220923042551.2745-1-yepeilin.cs@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

perf parse-events: Remove "not supported" hybrid cache events

[ Upstream commit 71c86cda750b001100e0d6dc04a88449b7381a59 ]

By default, we create two hybrid cache events, one is for cpu_core, and
another is for cpu_atom. But Some hybrid hardware cache events are only
available on one CPU PMU. For example, the 'L1-dcache-load-misses' is only
available on cpu_core, while the 'L1-icache-loads' is only available on
cpu_atom. We need to remove "not supported" hybrid cache events. By
extending is_event_supported() to global API and using it to check if the
hybrid cache events are supported before being created, we can remove the
"not supported" hybrid cache events.

Before:

# ./perf stat -e L1-dcache-load-misses,L1-icache-loads -a sleep 1

Performance counter stats for 'system wide':

            52,570      cpu_core/L1-dcache-load-misses/
   <not supported>      cpu_atom/L1-dcache-load-misses/
   <not supported>      cpu_core/L1-icache-loads/
         1,471,817      cpu_atom/L1-icache-loads/

       1.004915229 seconds time elapsed

After:

# ./perf stat -e L1-dcache-load-misses,L1-icache-loads -a sleep 1

Performance counter stats for 'system wide':

            54,510      cpu_core/L1-dcache-load-misses/
         1,441,286      cpu_atom/L1-icache-loads/

       1.005114281 seconds time elapsed

Fixes: 30def61f64bac5f5 ("perf parse-events: Create two hybrid cache events")
Reported-by: Yi Ammy <ammy.yi@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220923030013.3726410-2-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

perf print-events: Fix "perf list" can not display the PMU prefix for some hybrid cache events

[ Upstream commit e28c07871c3f2107e316c2590d4703496bd114f4 ]

Some hybrid hardware cache events are only available on one CPU PMU. For
example, 'L1-dcache-load-misses' is only available on cpu_core.

We have supported in the perf list clearly reporting this info, the
function works fine before but recently the argument "config" in API
is_event_supported() is changed from "u64" to "unsigned int" which
caused a regression, the "perf list" then can not display the PMU prefix
for some hybrid cache events.

For the hybrid systems, the PMU type ID is stored at config[63:32],
define config to "unsigned int" will miss the PMU type ID information,
then the regression happened, the config should be defined as "u64".

Before:
# ./perf list |grep "Hardware cache event"
  L1-dcache-load-misses                              [Hardware cache event]
  L1-dcache-loads                                    [Hardware cache event]
  L1-dcache-stores                                   [Hardware cache event]
  L1-icache-load-misses                              [Hardware cache event]
  L1-icache-loads                                    [Hardware cache event]
  LLC-load-misses                                    [Hardware cache event]
  LLC-loads                                          [Hardware cache event]
  LLC-store-misses                                   [Hardware cache event]
  LLC-stores                                         [Hardware cache event]
  branch-load-misses                                 [Hardware cache event]
  branch-loads                                       [Hardware cache event]
  dTLB-load-misses                                   [Hardware cache event]
  dTLB-loads                                         [Hardware cache event]
  dTLB-store-misses                                  [Hardware cache event]
  dTLB-stores                                        [Hardware cache event]
  iTLB-load-misses                                   [Hardware cache event]
  node-load-misses                                   [Hardware cache event]
  node-loads                                         [Hardware cache event]

After:
# ./perf list |grep "Hardware cache event"
  L1-dcache-loads                                    [Hardware cache event]
  L1-dcache-stores                                   [Hardware cache event]
  L1-icache-load-misses                              [Hardware cache event]
  LLC-load-misses                                    [Hardware cache event]
  LLC-loads                                          [Hardware cache event]
  LLC-store-misses                                   [Hardware cache event]
  LLC-stores                                         [Hardware cache event]
  branch-load-misses                                 [Hardware cache event]
  branch-loads                                       [Hardware cache event]
  cpu_atom/L1-icache-loads/                          [Hardware cache event]
  cpu_core/L1-dcache-load-misses/                    [Hardware cache event]
  cpu_core/node-load-misses/                         [Hardware cache event]
  cpu_core/node-loads/                               [Hardware cache event]
  dTLB-load-misses                                   [Hardware cache event]
  dTLB-loads                                         [Hardware cache event]
  dTLB-store-misses                                  [Hardware cache event]
  dTLB-stores                                        [Hardware cache event]
  iTLB-load-misses                                   [Hardware cache event]

Fixes: 9b7c7728f4e4ba8d ("perf parse-events: Break out tracepoint and printing")
Reported-by: Yi Ammy <ammy.yi@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220923030013.3726410-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stable-dep-of: 71c86cda750b ("perf parse-events: Remove "not supported" hybrid cache events")
Signed-off-by: Sasha Levin <sashal@kernel.org>

perf parse-events: Break out tracepoint and printing

[ Upstream commit 9b7c7728f4e4ba8dd75269fb111fa187faa018c6 ]

Move print_*_events functions out of parse-events.c into a new
print-events.c. Move tracepoint code into tracepoint.c or
trace-event-info.c (sole user). This reduces the dependencies of
parse-events.c and makes it more amenable to being a library in the
future.

Remove some unnecessary definitions from parse-events.h. Fix a
checkpatch.pl warning on using unsigned rather than unsigned int. Fix
some line length warnings too.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220729204217.250166-3-irogers@google.com
[ Add include linux/stddef.h before perf_events.h for systems where __always_inline isn't pulled in before used, such as older Alpine Linux ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stable-dep-of: 71c86cda750b ("perf parse-events: Remove "not supported" hybrid cache events")
Signed-off-by: Sasha Levin <sashal@kernel.org>

gpio: mvebu: Fix check for pwm support on non-A8K platforms

[ Upstream commit 4335417da2b8d6d9b2d4411b5f9e248e5bb2d380 ]

pwm support incompatible with Armada 80x0/70x0 API is not only in
Armada 370, but also in Armada XP, 38x and 39x. So basically every non-A8K
platform. Fix check for pwm support appropriately.

Fixes: 85b7d8abfec7 ("gpio: mvebu: add pwm support for Armada 8K/7K")
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Signed-off-by: Sasha Levin <sashal@kernel.org>

Input: melfas_mip4 - fix return value check in mip4_probe()

[ Upstream commit a54dc27bd25f20ee3ea2009584b3166d25178243 ]

devm_gpiod_get_optional() may return ERR_PTR(-EPROBE_DEFER),
add a minus sign to fix it.

Fixes: 6ccb1d8f78bd ("Input: add MELFAS MIP4 Touchscreen driver")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220924030715.1653538-1-yangyingliang@huawei.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

Revert "drm: bridge: analogix/dp: add panel prepare/unprepare in suspend/resume time"

[ Upstream commit cc62d98bd56d45de4531844ca23913a15136c05b ]

This reverts commit 211f276ed3d96e964d2d1106a198c7f4a4b3f4c0.

For quite some time, core DRM helpers already ensure that any relevant
connectors/CRTCs/etc. are disabled, as well as their associated
components (e.g., bridges) when suspending the system. Thus,
analogix_dp_bridge_{enable,disable}() already get called, which in turn
call drm_panel_{prepare,unprepare}(). This makes these drm_panel_*()
calls redundant.

Besides redundancy, there are a few problems with this handling:

(1) drm_panel_{prepare,unprepare}() are *not* reference-counted APIs and
are not in general designed to be handled by multiple callers --
although some panel drivers have a coarse 'prepared' flag that mitigates
some damage, at least. So at a minimum this is redundant and confusing,
but in some cases, this could be actively harmful.

(2) The error-handling is a bit non-standard. We ignored errors in
suspend(), but handled errors in resume(). And recently, people noticed
that the clk handling is unbalanced in error paths, and getting *that*
right is not actually trivial, given the current way errors are mostly
ignored.

(3) In the particular way analogix_dp_{suspend,resume}() get used (e.g.,
in rockchip_dp_*(), as a late/early callback), we don't necessarily have
a proper PM relationship between the DP/bridge device and the panel
device. So while the DP bridge gets resumed, the panel's parent device
(e.g., platform_device) may still be suspended, and so any prepare()
calls may fail.

So remove the superfluous, possibly-harmful suspend()/resume() handling
of panel state.

Fixes: 211f276ed3d9 ("drm: bridge: analogix/dp: add panel prepare/unprepare in suspend/resume time")
Link: https://lore.kernel.org/all/Yv2CPBD3Picg%2FgVe@google.com/
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220822180729.1.I8ac5abe3a4c1c6fd5c061686c6e883c22f69022c@changeid
Signed-off-by: Sasha Levin <sashal@kernel.org>

net: macb: Fix ZynqMP SGMII non-wakeup source resume failure

[ Upstream commit f22bd29ba19a43e758b192429613e04aa7abb70d ]

When GEM is in SGMII mode and disabled as a wakeup source, the power
management controller can power down the entire full power domain(FPD)
if none of the FPD devices are in use.

Incase of FPD off, there are below ethernet link up issues on non-wakeup
suspend/resume. To fix it add phy_exit() in suspend and phy_init() in the
resume path which reinitializes PS GTR SGMII lanes.

$ echo +20 > /sys/class/rtc/rtc0/wakealarm
$ echo mem > /sys/power/state

After resume:

$ ifconfig eth0 up
xilinx-psgtr fd400000.phy: lane 0 (type 10, protocol 5): PLL lock timeout
phy phy-fd400000.phy.0: phy poweron failed --> -110
xilinx-psgtr fd400000.phy: lane 0 (type 10, protocol 5): PLL lock timeout
SIOCSIFFLAGS: Connection timed out
phy phy-fd400000.phy.0: phy poweron failed --> -110

Fixes: 8b73fa3ae02b ("net: macb: Added ZynqMP-specific initialization")
Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/bridge: lt8912b: fix corrupted image output

[ Upstream commit 051ad2788d35ca07aec8402542e5d38429f2426a ]

Correct I2C address for the register list in lt8912_write_lvds_config(),
these registers are on the first I2C address (0x48), the current
function is just writing garbage to the wrong registers and this creates
multiple issues (artifacts and output completely corrupted) on some HDMI
displays.

Correct I2C address comes from Lontium documentation and it is the one
used on other out-of-tree LT8912B drivers [1].

[1] https://github.com/boundarydevices/linux/blob/boundary-imx_5.10.x_2.0.0/drivers/video/lt8912.c#L296

Fixes: 30e2ae943c26 ("drm/bridge: Introduce LT8912B DSI to HDMI bridge")
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Signed-off-by: Philippe Schenker <philippe.schenker@toradex.com>
Acked-by: Adrien Grassein <adrien.grassein@gmail.com>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220922124306.34729-4-dev@pschenker.ch
Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/bridge: lt8912b: set hdmi or dvi mode

[ Upstream commit 6dd1de12e1243f2013e4fabf31e99e63b1a860d0 ]

The Lontium LT8912 does have a setting for DVI or HDMI. This patch reads
from EDID what the display needs and sets it accordingly.

Fixes: 30e2ae943c26 ("drm/bridge: Introduce LT8912B DSI to HDMI bridge")
Signed-off-by: Philippe Schenker <philippe.schenker@toradex.com>
Acked-by: Adrien Grassein <adrien.grassein@gmail.com>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220922124306.34729-3-dev@pschenker.ch
Signed-off-by: Sasha Levin <sashal@kernel.org>

drm/bridge: lt8912b: add vsync hsync

[ Upstream commit da73a94fa282f78d485bd0aab36c8ac15b6f792c ]

Currently the bridge driver does not take care whether or not the display
needs positive/negative vertical/horizontal syncs. Pass these two flags
to the bridge from the EDID that was read out from the display.

Fixes: 30e2ae943c26 ("drm/bridge: Introduce LT8912B DSI to HDMI bridge")
Signed-off-by: Philippe Schenker <philippe.schenker@toradex.com>
Acked-by: Adrien Grassein <adrien.grassein@gmail.com>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220922124306.34729-2-dev@pschenker.ch
Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: tas2770: Reinit regcache on reset

[ Upstream commit 0a0342ede303fc420f3a388e1ae82da3ae8ff6bd ]

On probe of the ASoC component, the device is reset but the regcache is
retained. This means the regcache gets out of sync if the codec is
rebound to a sound card for a second time. Fix it by reinitializing the
regcache to defaults after the device is reset.

Fixes: b0bcbe615756 ("ASoC: tas2770: Fix calling reset in probe")
Signed-off-by: Martin Povišer <povik+lin@cutebit.org>
Link: https://lore.kernel.org/r/20220919173453.84292-1-povik+lin@cutebit.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

arm64: dts: qcom: sm8350: fix UFS PHY serdes size

[ Upstream commit 40e9541959100e017533e18e44d07eed44f91dc5 ]

The size of the UFS PHY serdes register region is 0x1c4 and the
corresponding 'reg' property should specifically not include the
adjacent regions that are defined in the child node (e.g. tx and rx).

Fixes: 59c7cf814783 ("arm64: dts: qcom: sm8350: Add UFS nodes")
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20220916093603.24263-1-johan+linaro@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>

clk: microchip: mpfs: make the rtc's ahb clock critical

[ Upstream commit 05d27090b6dc88bce71a608d1271536e582b73d1 ]

The onboard RTC's AHB bus clock must be kept running as the RTC will
stop & lose track of time if the AHB interface clock is disabled.

Fixes: 635e5e73370e ("clk: microchip: Add driver for Microchip PolarFire SoC")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20220909123123.2699583-3-conor.dooley@microchip.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

clk: microchip: mpfs: fix clk_cfg array bounds violation

[ Upstream commit 5da39ac5d648cdbfdfa8bea0e0cde279ded5c7c2 ]

There is an array bounds violation present during clock registration,
triggered by current code by only specific toolchains. This seems to
fail gracefully in v6.0-rc1, using a toolchain build from the riscv-
gnu-toolchain repo and with clang-15, and life carries on. While
converting the driver to use standard clock structs/ops, kernel panics
were seen during boot when built with clang-15:

[    0.581754] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000b1
[    0.591520] Oops [#1]
[    0.594045] Modules linked in:
[    0.597435] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-rc1-00011-g8e1459cf4eca #1
[    0.606188] Hardware name: Microchip PolarFire-SoC Icicle Kit (DT)
[    0.613012] epc : __clk_register+0x4a6/0x85c
[    0.617759]  ra : __clk_register+0x49e/0x85c
[    0.622489] epc : ffffffff803faf7c ra : ffffffff803faf74 sp : ffffffc80400b720
[    0.630466]  gp : ffffffff810e93f8 tp : ffffffe77fe60000 t0 : ffffffe77ffb3800
[    0.638443]  t1 : 000000000000000a t2 : ffffffffffffffff s0 : ffffffc80400b7c0
[    0.646420]  s1 : 0000000000000001 a0 : 0000000000000001 a1 : 0000000000000000
[    0.654396]  a2 : 0000000000000001 a3 : 0000000000000000 a4 : 0000000000000000
[    0.662373]  a5 : ffffffff803a5810 a6 : 0000000200000022 a7 : 0000000000000006
[    0.670350]  s2 : ffffffff81099d48 s3 : ffffffff80d6e28e s4 : 0000000000000028
[    0.678327]  s5 : ffffffff810ed3c8 s6 : ffffffff810ed3d0 s7 : ffffffe77ffbc100
[    0.686304]  s8 : ffffffe77ffb1540 s9 : ffffffe77ffb1540 s10: 0000000000000008
[    0.694281]  s11: 0000000000000000 t3 : 00000000000000c6 t4 : 0000000000000007
[    0.702258]  t5 : ffffffff810c78c0 t6 : ffffffe77ff88cd0
[    0.708125] status: 0000000200000120 badaddr: 00000000000000b1 cause: 000000000000000d
[    0.716869] [<ffffffff803fb892>] devm_clk_hw_register+0x62/0xaa
[    0.723420] [<ffffffff80403412>] mpfs_clk_probe+0x1e0/0x244

In v6.0-rc1 and later, this issue is visible without the follow on
patches doing the conversion using toolchains provided by our Yocto
meta layer too.

It fails on "clk_periph_timer" - which uses a different parent, that it
tries to find using the macro:
\#define PARENT_CLK(PARENT) (&mpfs_cfg_clks[CLK_##PARENT].cfg.hw)

If parent is RTCREF, so the macro becomes: &mpfs_cfg_clks[33].cfg.hw
which is well beyond the end of the array. Amazingly, builds with GCC
11.1 see no problem here, booting correctly and hooking the parent up
etc. Builds with clang-15 do not, with the above panic.

Change the macro to use specific offsets depending on the parent rather
than the dt-binding's clock IDs.

Fixes: 1c6a7ea32b8c ("clk: microchip: mpfs: add RTCREF clock control")
CC: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20220909123123.2699583-2-conor.dooley@microchip.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

ASoC: imx-card: Fix refcount issue with of_node_put

[ Upstream commit d56ba9a04d7548d4149c46ec86a0e3cc41a70f4a ]

imx_card_parse_of will search all the node with loop,
if there is defer probe happen in the middle of loop,
the previous released codec node will be released
twice, then cause refcount issue.

Here assign NULL to pointer of released nodes to fix
the issue.

Fixes: aa736700f42f ("ASoC: imx-card: Add imx-card machine driver")
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://lore.kernel.org/r/1663059601-29259-1-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

soc: sunxi: sram: Fix debugfs info for A64 SRAM C

[ Upstream commit e3c95edb1bd8b9c2cb0caa6ae382fc8080f6a0ed ]

The labels were backward with respect to the register values. The SRAM
is mapped to the CPU when the register value is 1.

Fixes: 5e4fb6429761 ("drivers: soc: sunxi: add support for A64 and its SRAM C")
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/20220815041248.53268-7-samuel@sholland.org
Signed-off-by: Sasha Levin <sashal@kernel.org>

soc: sunxi: sram: Fix probe function ordering issues

[ Upstream commit 49fad91a7b8941979c3e9a35f9894ac45bc5d3d6 ]

Errors from debugfs are intended to be non-fatal, and should not prevent
the driver from probing.

Since debugfs file creation is treated as infallible, move it below the
parts of the probe function that can fail. This prevents an error
elsewhere in the probe function from causing the file to leak. Do the
same for the call to of_platform_populate().

Finally, checkpatch suggests an octal literal for the file permissions.

Fixes: 4af34b572a85 ("drivers: soc: sunxi: Introduce SoC driver to map SRAMs")
Fixes: 5828729bebbb ("soc: sunxi: export a regmap for EMAC clock reg on A64")
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/20220815041248.53268-6-samuel@sholland.org
Signed-off-by: Sasha Levin <sashal@kernel.org>

soc: sunxi: sram: Prevent the driver from being unbound

[ Upstream commit 90e10a1fcd9b24b4ba8c0d35136127473dcd829e ]

This driver exports a regmap tied to the platform device (as opposed to
a syscon, which exports a regmap tied to the OF node). Because of this,
the driver can never be unbound, as that would destroy the regmap. Use
builtin_platform_driver_probe() to enforce this limitation.

Fixes: 5828729bebbb ("soc: sunxi: export a regmap for EMAC clock reg on A64")
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/20220815041248.53268-5-samuel@sholland.org
Signed-off-by: Sasha Levin <sashal@kernel.org>

soc: sunxi: sram: Actually claim SRAM regions

[ Upstream commit fd362baad2e659ef0fb5652f023a606b248f1781 ]

sunxi_sram_claim() checks the sram_desc->claimed flag before updating
the register, with the intent that only one device can claim a region.
However, this was ineffective because the flag was never set.

Fixes: 4af34b572a85 ("drivers: soc: sunxi: Introduce SoC driver to map SRAMs")
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/20220815041248.53268-4-samuel@sholland.org
Signed-off-by: Sasha Levin <sashal@kernel.org>

ARM: dts: am5748: keep usb4_tm disabled

[ Upstream commit 6a6d9ecff14a2a46c1deeffa3eb3825349639bdd ]

Commit bcbb63b80284 ("ARM: dts: dra7: Separate AM57 dtsi files")
disabled usb4_tm for am5748 devices since USB4 IP is not present
in this SoC.

The commit log explained the difference between AM5 and DRA7 families:

AM5 and DRA7 SoC families have different set of modules in them so the
SoC sepecific dtsi files need to be separated.

e.g. Some of the major differences between AM576 and DRA76

DRA76x AM576x

USB3 x
USB4 x
ATL x
VCP x
MLB x
ISS x
PRU-ICSS1 x
PRU-ICSS2 x

Then commit 176f26bcd41a ("ARM: dts: Add support for dra762 abz
package") removed usb4_tm part from am5748.dtsi and introcuded new
ti-sysc errors in dmesg:

ti-sysc 48940000.target-module: clock get error for fck: -2
ti-sysc: probe of 48940000.target-module failed with error -2

Fixes: 176f26bcd41a ("ARM: dts: Add support for dra762 abz package")
Signed-off-by: Romain Naour <romain.naour@skf.com>
Signed-off-by: Romain Naour <romain.naour@smile.fr>
Message-Id: <20220823072742.351368-1-romain.naour@smile.fr>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

reset: imx7: Fix the iMX8MP PCIe PHY PERST support

[ Upstream commit 051d9eb403887bb11852b7a4f744728a6a4b1b58 ]

On i.MX7/iMX8MM/iMX8MQ, the initialized default value of PERST bit(BIT3)
of SRC_PCIEPHY_RCR is 1b'1.
But i.MX8MP has one inversed default value 1b'0 of PERST bit.

And the PERST bit should be kept 1b'1 after power and clocks are stable.
So fix the i.MX8MP PCIe PHY PERST support here.

Fixes: e08672c03981 ("reset: imx7: Add support for i.MX8MP SoC")
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Tested-by: Marek Vasut <marex@denx.de>
Tested-by: Richard Leitner <richard.leitner@skidata.com>
Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Link: https://lore.kernel.org/r/1661845564-11373-5-git-send-email-hongxing.zhu@nxp.com
Signed-off-by: Sasha Levin <sashal@kernel.org>

ARM: dts: am33xx: Fix MMCHS0 dma properties

[ Upstream commit 2eb502f496f7764027b7958d4e74356fed918059 ]

According to technical manual(table 11-24), the DMA of MMCHS0 should be
direct mapped.

Fixes: b5e509066074 ("ARM: DTS: am33xx: Use the new DT bindings for the eDMA3")
Signed-off-by: YuTong Chang <mtwget@gmail.com>
Message-Id: <20220620124146.5330-1-mtwget@gmail.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

media: v4l2-compat-ioctl32.c: zero buffer passed to v4l2_compat_get_array_args()

commit 4e768c8e34e639cff66a0f175bc4aebf472e4305 upstream.

The v4l2_compat_get_array_args() function can leave uninitialized memory in the
buffer it is passed. So zero it before copying array elements from userspace
into the buffer.

Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Reported-by: syzbot+ff18193ff05f3f87f226@syzkaller.appspotmail.com
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: mediatek: vcodec: Drop platform_get_resource(IORESOURCE_IRQ)

commit a2d2e593d39bc2f29a1cd5e3779af457fd26490c upstream.

Commit a1a2b7125e10 ("of/platform: Drop static setup of IRQ resource
from DT core") removed support for calling platform_get_resource(...,
IORESOURCE_IRQ, ...) on DT-based drivers, but the probe() function of
mtk-vcodec's encoder was still making use of it. This caused the encoder
driver to fail probe.

Since the platform_get_resource() call was only being used to check for
the presence of the interrupt (its returned resource wasn't even used)
and platform_get_irq() was already being used to get the IRQ, simply
drop the use of platform_get_resource(IORESOURCE_IRQ) and handle the
failure of platform_get_irq(), to get the driver probing again.

[hverkuil: drop unused struct resource *res]

Fixes: a1a2b7125e10 ("of/platform: Drop static setup of IRQ resource from DT core")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: rkvdec: Disable H.264 error detection

commit 3a99c4474112f49a5459933d8758614002ca0ddc upstream.

Quite often, the HW get stuck in error condition if a stream error
was detected. As documented, the HW should stop immediately and self
reset. There is likely a problem or a miss-understanding of the self
reset mechanism, as unless we make a long pause, the next command
will then report an error even if there is no error in it.

Disabling error detection fixes the issue, and let the decoder continue
after an error. This patch is safe for backport into older kernels.

Fixes: cd33c830448b ("media: rkvdec: Add the rkvdec driver")
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Tested-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: dvb_vb2: fix possible out of bound access

commit 37238699073e7e93f05517e529661151173cd458 upstream.

vb2_core_qbuf and vb2_core_querybuf don't check the range of b->index
controlled by the user.

Fix this by adding range checking code before using them.

Fixes: 57868acc369a ("media: videobuf2: Add new uAPI for DVB streaming I/O")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm,hwpoison: check mm when killing accessing process

commit 77677cdbc2aa4b5d5d839562793d3d126201d18d upstream.

The GHES code calls memory_failure_queue() from IRQ context to queue work
into workqueue and schedule it on the current CPU.  Then the work is
processed in memory_failure_work_func() by kworker and calls
memory_failure().

When a page is already poisoned, commit a3f5d80ea401 ("mm,hwpoison: send
SIGBUS with error virutal address") make memory_failure() call
kill_accessing_process() that:

    - holds mmap locking of current->mm
    - does pagetable walk to find the error virtual address
    - and sends SIGBUS to the current process with error info.

However, the mm of kworker is not valid, resulting in a null-pointer
dereference.  So check mm when killing the accessing process.

[akpm@linux-foundation.org: remove unrelated whitespace alteration]
Link: https://lkml.kernel.org/r/20220914064935.7851-1-xueshuai@linux.alibaba.com
Fixes: a3f5d80ea401 ("mm,hwpoison: send SIGBUS with error virutal address")
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Bixuan Cui <cuibixuan@linux.alibaba.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/hugetlb: correct demote page offset logic

commit 317314527d173e1f139ceaf8cb87cb1746abf240 upstream.

With gigantic pages it may not be true that struct page structures are
contiguous across the entire gigantic page.  The nth_page macro is used
here in place of direct pointer arithmetic to correct for this.

Mike said:

: This error could cause addressing exceptions.  However, this is only
: possible in configurations where CONFIG_SPARSEMEM &&
: !CONFIG_SPARSEMEM_VMEMMAP.  Such a configuration option is rare and
: unknown to be the default anywhere.

Link: https://lkml.kernel.org/r/20220914190917.3517663-1-opendmb@gmail.com
Fixes: 8531fc6f52f5 ("hugetlb: add hugetlb demote page support")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: bring back update_mmu_cache() to finish_fault()

commit 70427f6e9ecfc8c5f977b21dd9f846b3bda02500 upstream.

Running this test program on ARMv4 a few times (sometimes just once)
reproduces the bug.

int main()
{
        unsigned i;
        char paragon[SIZE];
        void* ptr;

        memset(paragon, 0xAA, SIZE);
        ptr = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
                   MAP_ANON | MAP_SHARED, -1, 0);
        if (ptr == MAP_FAILED) return 1;
        printf("ptr = %p\n", ptr);
        for (i=0;i<10000;i++){
                memset(ptr, 0xAA, SIZE);
                if (memcmp(ptr, paragon, SIZE)) {
                        printf("Unexpected bytes on iteration %u!!!\n", i);
                        break;
                }
        }
        munmap(ptr, SIZE);
}

In the "ptr" buffer there appear runs of zero bytes which are aligned
by 16 and their lengths are multiple of 16.

Linux v5.11 does not have the bug, "git bisect" finds the first bad commit:
f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")

Before the commit update_mmu_cache() was called during a call to
filemap_map_pages() as well as finish_fault(). After the commit
finish_fault() lacks it.

Bring back update_mmu_cache() to finish_fault() to fix the bug.
Also call update_mmu_tlb() only when returning VM_FAULT_NOPAGE to more
closely reproduce the code of alloc_set_pte() function that existed before
the commit.

On many platforms update_mmu_cache() is nop:
x86, see arch/x86/include/asm/pgtable
ARMv6+, see arch/arm/include/asm/tlbflush.h
So, it seems, few users ran into this bug.

Link: https://lkml.kernel.org/r/20220908204809.2012451-1-saproj@gmail.com
Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
Signed-off-by: Sergei Antonov <saproj@gmail.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: fix madivse_pageout mishandling on non-LRU page

commit 58d426a7ba92870d489686dfdb9d06b66815a2ab upstream.

MADV_PAGEOUT tries to isolate non-LRU pages and gets a warning from
isolate_lru_page below.

Fix it by checking PageLRU in advance.

------------[ cut here ]------------
trying to isolate tail page
WARNING: CPU: 0 PID: 6175 at mm/folio-compat.c:158 isolate_lru_page+0x130/0x140
Modules linked in:
CPU: 0 PID: 6175 Comm: syz-executor.0 Not tainted 5.18.12 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:isolate_lru_page+0x130/0x140

Link: https://lore.kernel.org/linux-mm/485f8c33.2471b.182d5726afb.Coremail.hantianshuo@iie.ac.cn/
Link: https://lkml.kernel.org/r/20220908151204.762596-1-minchan@kernel.org
Fixes: 1a4e58cce84e ("mm: introduce MADV_PAGEOUT")
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: 韩天ç`\95 <hantianshuo@iie.ac.cn>
Suggested-by: Yang Shi <shy828301@gmail.com>
Acked-by: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/migrate_device.c: copy pte dirty bit to page

commit fd35ca3d12cc9922d7d9a35f934e72132dbc4853 upstream.

migrate_vma_setup() has a fast path in migrate_vma_collect_pmd() that
installs migration entries directly if it can lock the migrating page.
When removing a dirty pte the dirty bit is supposed to be carried over to
the underlying page to prevent it being lost.

Currently migrate_vma_*() can only be used for private anonymous mappings.
That means loss of the dirty bit usually doesn't result in data loss
because these pages are typically not file-backed. However pages may be
backed by swap storage which can result in data loss if an attempt is made
to migrate a dirty page that doesn't yet have the PageDirty flag set.

In this case migration will fail due to unexpected references but the
dirty pte bit will be lost. If the page is subsequently reclaimed data
won't be written back to swap storage as it is considered uptodate,
resulting in data loss if the page is subsequently accessed.

Prevent this by copying the dirty bit to the page when removing the pte to
match what try_to_migrate_one() does.

Link: https://lkml.kernel.org/r/dd48e4882ce859c295c1a77612f66d198b0403f9.1662078528.git-series.apopple@nvidia.com
Fixes: 8c3328f1f36a ("mm/migrate: migrate_vma() unmap page from vma while collecting pages")
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Reported-by: "Huang, Ying" <ying.huang@intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Alex Sierra <alex.sierra@amd.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: huang ying <huang.ying.caritas@gmail.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/migrate_device.c: add missing flush_cache_page()

commit a3589e1d5fe39c3d9fdd291b111524b93d08bc32 upstream.

Currently we only call flush_cache_page() for the anon_exclusive case,
however in both cases we clear the pte so should flush the cache.

Link: https://lkml.kernel.org/r/5676f30436ab71d1a587ac73f835ed8bd2113ff5.1662078528.git-series.apopple@nvidia.com
Fixes: 8c3328f1f36a ("mm/migrate: migrate_vma() unmap page from vma while collecting pages")
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Cc: Alex Sierra <alex.sierra@amd.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: huang ying <huang.ying.caritas@gmail.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/migrate_device.c: flush TLB while holding PTL

commit 60bae73708963de4a17231077285bd9ff2f41c44 upstream.

When clearing a PTE the TLB should be flushed whilst still holding the PTL
to avoid a potential race with madvise/munmap/etc.  For example consider
the following sequence:

  CPU0                          CPU1
  ----                          ----

  migrate_vma_collect_pmd()
  pte_unmap_unlock()
                                madvise(MADV_DONTNEED)
                                -> zap_pte_range()
                                pte_offset_map_lock()
                                [ PTE not present, TLB not flushed ]
                                pte_unmap_unlock()
                                [ page is still accessible via stale TLB ]
  flush_tlb_range()

In this case the page may still be accessed via the stale TLB entry after
madvise returns.  Fix this by flushing the TLB while holding the PTL.

Fixes: 8c3328f1f36a ("mm/migrate: migrate_vma() unmap page from vma while collecting pages")
Link: https://lkml.kernel.org/r/9f801e9d8d830408f2ca27821f606e09aa856899.1662078528.git-series.apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reported-by: Nadav Amit <nadav.amit@gmail.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Cc: Alex Sierra <alex.sierra@amd.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: huang ying <huang.ying.caritas@gmail.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: fix dereferencing possible ERR_PTR

commit 4eb5bbde3ccb710d3b85bfb13466612e56393369 upstream.

Smatch checker complains that 'secretmem_mnt' dereferencing possible
ERR_PTR(). Let the function return if 'secretmem_mnt' is ERR_PTR, to
avoid deferencing it.

Link: https://lkml.kernel.org/r/20220904074647.GA64291@cloud-MacBookPro
Fixes: 1507f51255c9f ("mm: introduce memfd_secret system call to create "secret" memory areas")
Signed-off-by: Binyi Han <dantengknight@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foudation.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/page_isolation: fix isolate_single_pageblock() isolation behavior

commit 80e2b584f3abfc31c3fe5573007f0d1d10810fde upstream.

set_migratetype_isolate() does not allow isolating MIGRATE_CMA pageblocks
unless it is used for CMA allocation.  isolate_single_pageblock() did not
have the same behavior when it is used together with
set_migratetype_isolate() in start_isolate_page_range().  This allows
alloc_contig_range() with migratetype other than MIGRATE_CMA, like
MIGRATE_MOVABLE (used by alloc_contig_pages()), to isolate first and last
pageblock but fail the rest.  The failure leads to changing migratetype of
the first and last pageblock to MIGRATE_MOVABLE from MIGRATE_CMA,
corrupting the CMA region.  This can happen during gigantic page
allocations.

Like Doug said here:
https://lore.kernel.org/linux-mm/a3363a52-883b-dcd1-b77f-f2bb378d6f2d@gmail.com/T/#u,
for gigantic page allocations, the user would notice no difference,
since the allocation on CMA region will fail as well as it did before.
But it might hurt the performance of device drivers that use CMA, since
CMA region size decreases.

Fix it by passing migratetype into isolate_single_pageblock(), so that
set_migratetype_isolate() used by isolate_single_pageblock() will prevent
the isolation happening.

Link: https://lkml.kernel.org/r/20220914023913.1855924-1-zi.yan@sent.com
Fixes: b2c9e2fbba32 ("mm: make alloc_contig_range work at pageblock granularity")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Reported-by: Doug Berger <opendmb@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Doug Berger <opendmb@gmail.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: prevent page_frag_alloc() from corrupting the memory

commit dac22531bbd4af2426c4e29e05594415ccfa365d upstream.

A number of drivers call page_frag_alloc() with a fragment's size >
PAGE_SIZE.

In low memory conditions, __page_frag_cache_refill() may fail the order
3 cache allocation and fall back to order 0; In this case, the cache
will be smaller than the fragment, causing memory corruptions.

Prevent this from happening by checking if the newly allocated cache is
large enough for the fragment; if not, the allocation will fail and
page_frag_alloc() will return NULL.

Link: https://lkml.kernel.org/r/20220715125013.247085-1-mlombard@redhat.com
Fixes: b63ae8ca096d ("mm/net: Rename and move page fragment handling from net/ to mm/")
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Cc: Chen Lin <chen45464546@163.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/page_alloc: fix race condition between build_all_zonelists and page allocation

commit 3d36424b3b5850bd92f3e89b953a430d7cfc88ef upstream.

Patrick Daly reported the following problem;

NODE_DATA(nid)->node_zonelists[ZONELIST_FALLBACK] - before offline operation
[0] - ZONE_MOVABLE
[1] - ZONE_NORMAL
[2] - NULL

For a GFP_KERNEL allocation, alloc_pages_slowpath() will save the
offset of ZONE_NORMAL in ac->preferred_zoneref. If a concurrent
memory_offline operation removes the last page from ZONE_MOVABLE,
build_all_zonelists() & build_zonerefs_node() will update
node_zonelists as shown below. Only populated zones are added.

NODE_DATA(nid)->node_zonelists[ZONELIST_FALLBACK] - after offline operation
[0] - ZONE_NORMAL
[1] - NULL
[2] - NULL

The race is simple -- page allocation could be in progress when a memory
hot-remove operation triggers a zonelist rebuild that removes zones.  The
allocation request will still have a valid ac->preferred_zoneref that is
now pointing to NULL and triggers an OOM kill.

This problem probably always existed but may be slightly easier to trigger
due to 6aa303defb74 ("mm, vmscan: only allocate and reclaim from zones
with pages managed by the buddy allocator") which distinguishes between
zones that are completely unpopulated versus zones that have valid pages
not managed by the buddy allocator (e.g.  reserved, memblock, ballooning
etc).  Memory hotplug had multiple stages with timing considerations
around managed/present page updates, the zonelist rebuild and the zone
span updates.  As David Hildenbrand puts it

memory offlining adjusts managed+present pages of the zone
essentially in one go. If after the adjustments, the zone is no
longer populated (present==0), we rebuild the zone lists.

Once that's done, we try shrinking the zone (start+spanned
pages) -- which results in zone_start_pfn == 0 if there are no
more pages. That happens *after* rebuilding the zonelists via
remove_pfn_range_from_zone().

The only requirement to fix the race is that a page allocation request
identifies when a zonelist rebuild has happened since the allocation
request started and no page has yet been allocated.  Use a seqlock_t to
track zonelist updates with a lockless read-side of the zonelist and
protecting the rebuild and update of the counter with a spinlock.

[akpm@linux-foundation.org: make zonelist_update_seq static]
Link: https://lkml.kernel.org/r/20220824110900.vh674ltxmzb3proq@techsingularity.net
Fixes: 6aa303defb74 ("mm, vmscan: only allocate and reclaim from zones with pages managed by the buddy allocator")
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reported-by: Patrick Daly <quic_pdaly@quicinc.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: <stable@vger.kernel.org> [4.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: gup: fix the fast GUP race against THP collapse

commit 70cbc3cc78a997d8247b50389d37c4e1736019da upstream.

Since general RCU GUP fast was introduced in commit 2667f50e8b81 ("mm:
introduce a general RCU get_user_pages_fast()"), a TLB flush is no longer
sufficient to handle concurrent GUP-fast in all cases, it only handles
traditional IPI-based GUP-fast correctly.  On architectures that send an
IPI broadcast on TLB flush, it works as expected.  But on the
architectures that do not use IPI to broadcast TLB flush, it may have the
below race:

   CPU A                                          CPU B
THP collapse                                     fast GUP
                                              gup_pmd_range() <-- see valid pmd
                                                  gup_pte_range() <-- work on pte
pmdp_collapse_flush() <-- clear pmd and flush
__collapse_huge_page_isolate()
    check page pinned <-- before GUP bump refcount
                                                      pin the page
                                                      check PTE <-- no change
__collapse_huge_page_copy()
    copy data to huge page
    ptep_clear()
install huge pmd for the huge page
                                                      return the stale page
discard the stale page

The race can be fixed by checking whether PMD is changed or not after
taking the page pin in fast GUP, just like what it does for PTE.  If the
PMD is changed it means there may be parallel THP collapse, so GUP should
back off.

Also update the stale comment about serializing against fast GUP in
khugepaged.

Link: https://lkml.kernel.org/r/20220907180144.555485-1-shy828301@gmail.com
Fixes: 2667f50e8b81 ("mm: introduce a general RCU get_user_pages_fast()")
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: hsq: Fix data stomping during mmc recovery

commit e7afa79a3b35a27a046a2139f8b20bd6b98155c2 upstream.

The block device uses multiple queues to access emmc. There will be up to 3
requests in the hsq of the host. The current code will check whether there
is a request doing recovery before entering the queue, but it will not check
whether there is a request when the lock is issued. The request is in recovery
mode. If there is a request in recovery, then a read and write request is
initiated at this time, and the conflict between the request and the recovery
request will cause the data to be trampled.

Signed-off-by: Wenchao Chen <wenchao.chen@unisoc.com>
Fixes: 511ce378e16f ("mmc: Add MMC host software queue support")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220916090506.10662-1-wenchao.chen666@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: moxart: fix 4-bit bus width and remove 8-bit bus width

commit 35ca91d1338ae158f6dcc0de5d1e86197924ffda upstream.

According to the datasheet [1] at page 377, 4-bit bus width is turned on by
bit 2 of the Bus Width Register. Thus the current bitmask is wrong: define
BUS_WIDTH_4 BIT(1)

BIT(1) does not work but BIT(2) works. This has been verified on real MOXA
hardware with FTSDC010 controller revision 1_6_0.

The corrected value of BUS_WIDTH_4 mask collides with: define BUS_WIDTH_8
BIT(2). Additionally, 8-bit bus width mode isn't supported according to the
datasheet, so let's remove the corresponding code.

[1]
https://bitbucket.org/Kasreyn/mkrom-uc7112lx/src/master/documents/FIC8120_DS_v1.2.pdf

Fixes: 1b66e94e6b99 ("mmc: moxart: Add MOXA ART SD/MMC driver")
Signed-off-by: Sergei Antonov <saproj@gmail.com>
Cc: Jonas Jensen <jonas.jensen@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220907205753.1577434-1-saproj@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mptcp: fix unreleased socket in accept queue

commit 30e51b923e436b631e8d5b77fa5e318c6b066dc7 upstream.

The mptcp socket and its subflow sockets in accept queue can't be
released after the process exit.

While the release of a mptcp socket in listening state, the
corresponding tcp socket will be released too. Meanwhile, the tcp
socket in the unaccept queue will be released too. However, only init
subflow is in the unaccept queue, and the joined subflow is not in the
unaccept queue, which makes the joined subflow won't be released, and
therefore the corresponding unaccepted mptcp socket will not be released
to.

This can be reproduced easily with following steps:

1. create 2 namespace and veth:
   $ ip netns add mptcp-client
   $ ip netns add mptcp-server
   $ sysctl -w net.ipv4.conf.all.rp_filter=0
   $ ip netns exec mptcp-client sysctl -w net.mptcp.enabled=1
   $ ip netns exec mptcp-server sysctl -w net.mptcp.enabled=1
   $ ip link add red-client netns mptcp-client type veth peer red-server \
     netns mptcp-server
   $ ip -n mptcp-server address add 10.0.0.1/24 dev red-server
   $ ip -n mptcp-server address add 192.168.0.1/24 dev red-server
   $ ip -n mptcp-client address add 10.0.0.2/24 dev red-client
   $ ip -n mptcp-client address add 192.168.0.2/24 dev red-client
   $ ip -n mptcp-server link set red-server up
   $ ip -n mptcp-client link set red-client up

2. configure the endpoint and limit for client and server:
   $ ip -n mptcp-server mptcp endpoint flush
   $ ip -n mptcp-server mptcp limits set subflow 2 add_addr_accepted 2
   $ ip -n mptcp-client mptcp endpoint flush
   $ ip -n mptcp-client mptcp limits set subflow 2 add_addr_accepted 2
   $ ip -n mptcp-client mptcp endpoint add 192.168.0.2 dev red-client id \
     1 subflow

3. listen and accept on a port, such as 9999. The nc command we used
   here is modified, which makes it use mptcp protocol by default.
   $ ip netns exec mptcp-server nc -l -k -p 9999

4. open another *two* terminal and use each of them to connect to the
   server with the following command:
   $ ip netns exec mptcp-client nc 10.0.0.1 9999
   Input something after connect to trigger the connection of the second
   subflow. So that there are two established mptcp connections, with the
   second one still unaccepted.

5. exit all the nc command, and check the tcp socket in server namespace.
   And you will find that there is one tcp socket in CLOSE_WAIT state
   and can't release forever.

Fix this by closing all of the unaccepted mptcp socket in
mptcp_subflow_queue_clean() with __mptcp_close().

Now, we can ensure that all unaccepted mptcp sockets will be cleaned by
__mptcp_close() before they are released, so mptcp_sock_destruct(), which
is used to clean the unaccepted mptcp socket, is not needed anymore.

The selftests for mptcp is ran for this commit, and no new failures.

Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests")
Fixes: 6aeed9045071 ("mptcp: fix race on unaccepted mptcp sockets")
Cc: stable@vger.kernel.org
Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Mengen Sun <mengensun@tencent.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mptcp: factor out __mptcp_close() without socket lock

commit 26d3e21ce1aab6cb19069c510fac8e7474445b18 upstream.

Factor out __mptcp_close() from mptcp_close(). The caller of
__mptcp_close() should hold the socket lock, and cancel mptcp work when
__mptcp_close() returns true.

This function will be used in the next commit.

Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests")
Fixes: 6aeed9045071 ("mptcp: fix race on unaccepted mptcp sockets")
Cc: stable@vger.kernel.org
Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Reviewed-by: Mengen Sun <mengensun@tencent.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: fix BUG splat with kvmalloc + GFP_ATOMIC

commit 30c19366636f72515679aa10dad61a4d988d4c9a upstream.

Martin Zaharinov reports BUG with 5.19.10 kernel:
kernel BUG at mm/vmalloc.c:2437!
invalid opcode: 0000 [#1] SMP
CPU: 28 PID: 0 Comm: swapper/28 Tainted: G        W  O      5.19.9 #1
[..]
RIP: 0010:__get_vm_area_node+0x120/0x130
  __vmalloc_node_range+0x96/0x1e0
  kvmalloc_node+0x92/0xb0
  bucket_table_alloc.isra.0+0x47/0x140
  rhashtable_try_insert+0x3a4/0x440
  rhashtable_insert_slow+0x1b/0x30
[..]

bucket_table_alloc uses kvzalloc(GPF_ATOMIC).  If kmalloc fails, this now
falls through to vmalloc and hits code paths that assume GFP_KERNEL.

Link: https://lkml.kernel.org/r/20220926151650.15293-1-fw@strlen.de
Fixes: a421ef303008 ("mm: allow !GFP_KERNEL allocations for kvmalloc")
Signed-off-by: Florian Westphal <fw@strlen.de>
Suggested-by: Michal Hocko <mhocko@suse.com>
Link: https://lore.kernel.org/linux-mm/Yy3MS2uhSgjF47dy@pc636/T/#t
Acked-by: Michal Hocko <mhocko@suse.com>
Reported-by: Martin Zaharinov <micron10@gmail.com>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

libata: add ATA_HORKAGE_NOLPM for Pioneer BDR-207M and BDR-205

commit ea08aec7e77bfd6599489ec430f9f859ab84575a upstream.

Commit 1527f69204fe ("ata: ahci: Add Green Sardine vendor ID as
board_ahci_mobile") added an explicit entry for AMD Green Sardine
AHCI controller using the board_ahci_mobile configuration (this
configuration has later been renamed to board_ahci_low_power).

The board_ahci_low_power configuration enables support for low power
modes.

This explicit entry takes precedence over the generic AHCI controller
entry, which does not enable support for low power modes.

Therefore, when commit 1527f69204fe ("ata: ahci: Add Green Sardine
vendor ID as board_ahci_mobile") was backported to stable kernels,
it make some Pioneer optical drives, which was working perfectly fine
before the commit was backported, stop working.

The real problem is that the Pioneer optical drives do not handle low
power modes correctly. If these optical drives would have been tested
on another AHCI controller using the board_ahci_low_power configuration,
this issue would have been detected earlier.

Unfortunately, the board_ahci_low_power configuration is only used in
less than 15% of the total AHCI controller entries, so many devices
have never been tested with an AHCI controller with low power modes.

Fixes: 1527f69204fe ("ata: ahci: Add Green Sardine vendor ID as board_ahci_mobile")
Cc: stable@vger.kernel.org
Reported-by: Jaap Berkhout <j.j.berkhout@staalenberk.nl>
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

vduse: prevent uninitialized memory accesses

commit 46f8a29272e51b6df7393d58fc5cb8967397ef2b upstream.

If the VDUSE application provides a smaller config space
than the driver expects, the driver may use uninitialized
memory from the stack.

This patch prevents it by initializing the buffer passed by
the driver to store the config value.

This fix addresses CVE-2022-2308.

Cc: stable@vger.kernel.org # v5.15+
Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace")
Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Message-Id: <20220831154923.97809-1-maxime.coquelin@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amdgpu: Add amdgpu suspend-resume code path under SRIOV

commit 3b7329cf5a767c1be38352d43066012e220ad43c upstream.

- Under SRIOV, we need to send REQ_GPU_FINI to the hypervisor
  during the suspend time. Furthermore, we cannot request a
  mode 1 reset under SRIOV as VF. Therefore, we will skip it
  as it is called in suspend_noirq() function.

- In the resume code path, we need to send REQ_GPU_INIT to the
  hypervisor and also resume PSP IP block under SRIOV.

Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915/gt: Restrict forced preemption to the active context

commit 6ef7d362123ecb5bf6d163bb9c7fd6ba2d8c968c upstream.

When we submit a new pair of contexts to ELSP for execution, we start a
timer by which point we expect the HW to have switched execution to the
pending contexts. If the promotion to the new pair of contexts has not
occurred, we declare the executing context to have hung and force the
preemption to take place by resetting the engine and resubmitting the
new contexts.

This can lead to an unfair situation where almost all of the preemption
timeout is consumed by the first context which just switches into the
second context immediately prior to the timer firing and triggering the
preemption reset (assuming that the timer interrupts before we process
the CS events for the context switch). The second context hasn't yet had
a chance to yield to the incoming ELSP (and send the ACk for the
promotion) and so ends up being blamed for the reset.

If we see that a context switch has occurred since setting the
preemption timeout, but have not yet received the ACK for the ELSP
promotion, rearm the preemption timer and check again. This is
especially significant if the first context was not schedulable and so
we used the shortest timer possible, greatly increasing the chance of
accidentally blaming the second innocent context.

Fixes: 3a7a92aba8fb ("drm/i915/execlists: Force preemption")
Fixes: d12acee84ffb ("drm/i915/execlists: Cancel banned contexts on schedule-out")
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Tested-by: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220921135258.1714873-1-andrzej.hajda@intel.com
(cherry picked from commit 107ba1a2c705f4358f2602ec2f2fd821bb651f42)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

powerpc/64s/radix: don't need to broadcast IPI for radix pmd collapse flush

commit bedf03416913d88c796288f9dca109a53608c745 upstream.

The IPI broadcast is used to serialize against fast-GUP, but fast-GUP will
move to use RCU instead of disabling local interrupts in fast-GUP.  Using
an IPI is the old-styled way of serializing against fast-GUP although it
still works as expected now.

And fast-GUP now fixed the potential race with THP collapse by checking
whether PMD is changed or not.  So IPI broadcast in radix pmd collapse
flush is not necessary anymore.  But it is still needed for hash TLB.

Link: https://lkml.kernel.org/r/20220907180144.555485-2-shy828301@gmail.com
Suggested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Yang Shi <shy828301@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Revert "firmware: arm_scmi: Add clock management to the SCMI power domain"

commit 3c6656337852e9f1a4079d172f3fddfbf00868f9 upstream.

This reverts commit a3b884cef873 ("firmware: arm_scmi: Add clock management
to the SCMI power domain").

Using the GENPD_FLAG_PM_CLK tells genpd to gate/ungate the consumer
device's clock(s) during runtime suspend/resume through the PM clock API.
More precisely, in genpd_runtime_resume() the clock(s) for the consumer
device would become ungated prior to the driver-level ->runtime_resume()
callbacks gets invoked.

This behaviour isn't a good fit for all platforms/drivers. For example, a
driver may need to make some preparations of its device in its
->runtime_resume() callback, like calling clk_set_rate() before the
clock(s) should be ungated. In these cases, it's easier to let the clock(s)
to be managed solely by the driver, rather than at the PM domain level.

For these reasons, let's drop the use GENPD_FLAG_PM_CLK for the SCMI PM
domain, as to enable it to be more easily adopted across ARM platforms.

Fixes: a3b884cef873 ("firmware: arm_scmi: Add clock management to the SCMI power domain")
Cc: Nicolas Pitre <npitre@baylibre.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Peng Fan <peng.fan@nxp.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20220919122033.86126-1-ulf.hansson@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: mt7531: only do PLL once after the reset

commit 42bc4fafe359ed6b73602b7a2dba0dd99588f8ce upstream.

Move the PLL init of the switch out of the pad configuration of the port
6 (usally cpu port).

Fix a unidirectional 100 mbit limitation on 1 gbit or 2.5 gbit links for
outbound traffic on port 5 or port 6.

Fixes: c288575f7810 ("net: dsa: mt7530: Add the support of MT7531 switch")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Couzens <lynxis@fe80.eu>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/damon/dbgfs: fix memory leak when using debugfs_lookup()

commit 1552fd3ef7dbe07208b8ae84a0a6566adf7dfc9d upstream.

When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time. Fix this up by properly calling
dput().

Link: https://lkml.kernel.org/r/20220902191149.112434-1-sj@kernel.org
Fixes: 75c1c2b53c78b ("mm/damon/dbgfs: support multiple contexts")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/uaccess: avoid check_object_size() in copy_from_user_nmi()

commit 59298997df89e19aad426d4ae0a7e5037074da5a upstream.

The check_object_size() helper under CONFIG_HARDENED_USERCOPY is designed
to skip any checks where the length is known at compile time as a
reasonable heuristic to avoid "likely known-good" cases.  However, it can
only do this when the copy_*_user() helpers are, themselves, inline too.

Using find_vmap_area() requires taking a spinlock.  The
check_object_size() helper can call find_vmap_area() when the destination
is in vmap memory.  If show_regs() is called in interrupt context, it will
attempt a call to copy_from_user_nmi(), which may call check_object_size()
and then find_vmap_area().  If something in normal context happens to be
in the middle of calling find_vmap_area() (with the spinlock held), the
interrupt handler will hang forever.

The copy_from_user_nmi() call is actually being called with a fixed-size
length, so check_object_size() should never have been called in the first
place.  Given the narrow constraints, just replace the
__copy_from_user_inatomic() call with an open-coded version that calls
only into the sanitizers and not check_object_size(), followed by a call
to raw_copy_from_user().

[akpm@linux-foundation.org: no instrument_copy_from_user() in my tree...]
Link: https://lkml.kernel.org/r/20220919201648.2250764-1-keescook@chromium.org
Link: https://lore.kernel.org/all/CAOUHufaPshtKrTWOz7T7QFYUNVGFm0JBjvM700Nhf9qEL9b3EQ@mail.gmail.com
Fixes: 0aef499f3172 ("mm/usercopy: Detect vmalloc overruns")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Yu Zhao <yuzhao@google.com>
Reported-by: Florian Lehner <dev@der-flo.net>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Florian Lehner <dev@der-flo.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ntfs: fix BUG_ON in ntfs_lookup_inode_by_name()

commit 1b513f613731e2afc05550e8070d79fac80c661e upstream.

Syzkaller reported BUG_ON as follows:

------------[ cut here ]------------
kernel BUG at fs/ntfs/dir.c:86!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 3 PID: 758 Comm: a.out Not tainted 5.19.0-next-20220808 #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:ntfs_lookup_inode_by_name+0xd11/0x2d10
Code: ff e9 b9 01 00 00 e8 1e fe d6 fe 48 8b 7d 98 49 8d 5d 07 e8 91 85 29 ff 48 c7 45 98 00 00 00 00 e9 5a fb ff ff e8 ff fd d6 fe <0f> 0b e8 f8 fd d6 fe 0f 0b e8 f1 fd d6 fe 48 8b b5 50 ff ff ff 4c
RSP: 0018:ffff888079607978 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000008000 RCX: 0000000000000000
RDX: ffff88807cf10000 RSI: ffffffff82a4a081 RDI: 0000000000000003
RBP: ffff888079607a70 R08: 0000000000000001 R09: ffff88807a6d01d7
R10: ffffed100f4da03a R11: 0000000000000000 R12: ffff88800f0fb110
R13: ffff88800f0ee000 R14: ffff88800f0fb000 R15: 0000000000000001
FS: 00007f33b63c7540(0000) GS:ffff888108580000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f33b635c090 CR3: 000000000f39e005 CR4: 0000000000770ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
load_system_files+0x1f7f/0x3620
ntfs_fill_super+0xa01/0x1be0
mount_bdev+0x36a/0x440
ntfs_mount+0x3a/0x50
legacy_get_tree+0xfb/0x210
vfs_get_tree+0x8f/0x2f0
do_new_mount+0x30a/0x760
path_mount+0x4de/0x1880
__x64_sys_mount+0x2b3/0x340
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f33b62ff9ea
Code: 48 8b 0d a9 f4 0b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 76 f4 0b 00 f7 d8 64 89 01 48
RSP: 002b:00007ffd0c471aa8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f33b62ff9ea
RDX: 0000000020000000 RSI: 0000000020000100 RDI: 00007ffd0c471be0
RBP: 00007ffd0c471c60 R08: 00007ffd0c471ae0 R09: 00007ffd0c471c24
R10: 0000000000000000 R11: 0000000000000202 R12: 000055bac5afc160
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---

Fix this by adding sanity check on extended system files' directory inode
to ensure that it is directory, just like ntfs_extend_init() when mounting
ntfs3.

Link: https://lkml.kernel.org/r/20220809064730.2316892-1-chenxiaosong2@huawei.com
Signed-off-by: ChenXiaoSong <chenxiaosong2@huawei.com>
Cc: Anton Altaparmakov <anton@tuxera.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: dts: integrator: Tag PCI host with device_type

commit 4952aa696a9f221c5e34e5961e02fca41ef67ad6 upstream.

The DT parser is dependent on the PCI device being tagged as
device_type = "pci" in order to parse memory ranges properly.
Fix this up.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220919092608.813511-1-linus.walleij@linaro.org'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

frontswap: don't call ->init if no ops are registered

commit 37dcc673d065d9823576cd9f2484a72531e1cba6 upstream.

If no frontswap module (i.e.  zswap) was registered, frontswap_ops will be
NULL.  In such situation, swapon crashes with the following stack trace:

  Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000000
  Mem abort info:
    ESR = 0x0000000096000004
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    FSC = 0x04: level 0 translation fault
  Data abort info:
    ISV = 0, ISS = 0x00000004
    CM = 0, WnR = 0
  user pgtable: 4k pages, 48-bit VAs, pgdp=00000020a4fab000
  [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
  Internal error: Oops: 96000004 [#1] SMP
  Modules linked in: zram fsl_dpaa2_eth pcs_lynx phylink ahci_qoriq crct10dif_ce ghash_ce sbsa_gwdt fsl_mc_dpio nvme lm90 nvme_core at803x xhci_plat_hcd rtc_fsl_ftm_alarm xgmac_mdio ahci_platform i2c_imx ip6_tables ip_tables fuse
  Unloaded tainted modules: cppc_cpufreq():1
  CPU: 10 PID: 761 Comm: swapon Not tainted 6.0.0-rc2-00454-g22100432cf14 #1
  Hardware name: SolidRun Ltd. SolidRun CEX7 Platform, BIOS EDK II Jun 21 2022
  pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : frontswap_init+0x38/0x60
  lr : __do_sys_swapon+0x8a8/0x9f4
  sp : ffff80000969bcf0
  x29: ffff80000969bcf0 x28: ffff37bee0d8fc00 x27: ffff80000a7f5000
  x26: fffffcdefb971e80 x25: ffffaba797453b90 x24: 0000000000000064
  x23: ffff37c1f209d1a8 x22: ffff37bee880e000 x21: ffffaba797748560
  x20: ffff37bee0d8fce4 x19: ffffaba797748488 x18: 0000000000000014
  x17: 0000000030ec029a x16: ffffaba795a479b0 x15: 0000000000000000
  x14: 0000000000000000 x13: 0000000000000030 x12: 0000000000000001
  x11: ffff37c63c0aba18 x10: 0000000000000000 x9 : ffffaba7956b8c88
  x8 : ffff80000969bcd0 x7 : 0000000000000000 x6 : 0000000000000000
  x5 : 0000000000000001 x4 : 0000000000000000 x3 : ffffaba79730f000
  x2 : ffff37bee0d8fc00 x1 : 0000000000000000 x0 : 0000000000000000
  Call trace:
  frontswap_init+0x38/0x60
  __do_sys_swapon+0x8a8/0x9f4
  __arm64_sys_swapon+0x28/0x3c
  invoke_syscall+0x78/0x100
  el0_svc_common.constprop.0+0xd4/0xf4
  do_el0_svc+0x38/0x4c
  el0_svc+0x34/0x10c
  el0t_64_sync_handler+0x11c/0x150
  el0t_64_sync+0x190/0x194
  Code: d000e283 910003fd f9006c41 f946d461 (f9400021)
  ---[ end trace 0000000000000000 ]---

Link: https://lkml.kernel.org/r/20220909130829.3262926-1-hch@lst.de
Fixes: 1da0d94a3ec8 ("frontswap: remove support for multiple ops")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/sgx: Do not fail on incomplete sanitization on premature stop of ksgxd

commit 133e049a3f8c91b175029fb6a59b6039d5e79cba upstream.

Unsanitized pages trigger WARN_ON() unconditionally, which can panic the
whole computer, if /proc/sys/kernel/panic_on_warn is set.

In sgx_init(), if misc_register() fails or misc_register() succeeds but
neither sgx_drv_init() nor sgx_vepc_init() succeeds, then ksgxd will be
prematurely stopped. This may leave unsanitized pages, which will result a
false warning.

Refine __sgx_sanitize_pages() to return:

1. Zero when the sanitization process is complete or ksgxd has been
requested to stop.
2. The number of unsanitized pages otherwise.

Fixes: 51ab30eb2ad4 ("x86/sgx: Replace section->init_laundry_list with sgx_dirty_page_list")
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/linux-sgx/20220825051827.246698-1-jarkko@kernel.org/T/#u
Link: https://lkml.kernel.org/r/20220906000221.34286-2-jarkko@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

wifi: mac80211: ensure vif queues are operational after start

commit 527008e5e87600a389feb8a57042c928ecca195d upstream.

Make sure local->queue_stop_reasons and vif.txqs_stopped stay in sync.

When a new vif is created the queues may end up in an inconsistent state
and be inoperable:
Communication not using iTXQ will work, allowing to e.g. complete the
association. But the 4-way handshake will time out. The sta will not
send out any skbs queued in iTXQs.

All normal attempts to start the queues will fail when reaching this
state.
local->queue_stop_reasons will have marked all queues as operational but
vif.txqs_stopped will still be set, creating an inconsistent internal
state.

In reality this seems to be race between the mac80211 function
ieee80211_do_open() setting SDATA_STATE_RUNNING and the wake_txqs_tasklet:
Depending on the driver and the timing the queues may end up to be
operational or not.

Cc: stable@vger.kernel.org
Fixes: f856373e2f31 ("wifi: mac80211: do not wake queues on a vif that is being stopped")
Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de>
Acked-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20220915130946.302803-1-alexander@wetzel-home.de
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk: ingenic-tcu: Properly enable registers before accessing timers

commit 6726d552a6912e88cf63fe2bda87b2efa0efc7d0 upstream.

Access to registers is guarded by ingenic_tcu_{enable,disable}_regs()
so the stop bit can be cleared before accessing a timer channel, but
those functions did not clear the stop bit on SoCs with a global TCU
clock gate.

Testing on the X1000 has revealed that the stop bits must be cleared
_and_ the global TCU clock must be ungated to access timer registers.
This appears to be the norm on Ingenic SoCs, and is specified in the
documentation for the X1000 and numerous JZ47xx SoCs.

If the stop bit isn't cleared, register writes don't take effect and
the system can be left in a broken state, eg. the watchdog timer may
not run.

The bug probably went unnoticed because stop bits are zeroed when
the SoC is reset, and the kernel does not set them unless a timer
gets disabled at runtime. However, it is possible that a bootloader
or a previous kernel (if using kexec) leaves the stop bits set and
we should not rely on them being cleared.

Fixing this is easy: have ingenic_tcu_{enable,disable}_regs() always
clear the stop bit, regardless of the presence of a global TCU gate.

Reviewed-by: Paul Cercueil <paul@crapouillou.net>
Tested-by: Paul Cercueil <paul@crapouillou.net>
Fixes: 4f89e4b8f121 ("clk: ingenic: Add driver for the TCU clocks")
Cc: stable@vger.kernel.org
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220617122254.738900-1-aidanmacdonald.0x0@gmail.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

can: c_can: don't cache TX messages for C_CAN cores

commit 81d192c2ce74157e717e1fc4b68791f82f7499d4 upstream.

As Jacob noticed, the optimization introduced in 387da6bc7a82 ("can:
c_can: cache frames to operate as a true FIFO") doesn't properly work
on C_CAN, but on D_CAN IP cores. The exact reasons are still unknown.

For now disable caching if CAN frames in the TX path for C_CAN cores.

Fixes: 387da6bc7a82 ("can: c_can: cache frames to operate as a true FIFO")
Link: https://lore.kernel.org/all/20220928083354.1062321-1-mkl@pengutronix.de
Link: https://lore.kernel.org/all/15a8084b-9617-2da1-6704-d7e39d60643b@gmail.com
Reported-by: Jacob Kroon <jacob.kroon@gmail.com>
Tested-by: Jacob Kroon <jacob.kroon@gmail.com>
Cc: stable@vger.kernel.org # v5.15
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Input: snvs_pwrkey - fix SNVS_HPVIDR1 register address

commit e62563db857f81d75c5726a35bc0180bed6d1540 upstream.

Both i.MX6 and i.MX8 reference manuals list 0xBF8 as SNVS_HPVIDR1
(chapters 57.9 and 6.4.5 respectively).

Without this, trying to read the revision number results in 0 on
all revisions, causing the i.MX6 quirk to apply on all platforms,
which in turn causes the driver to synthesise power button release
events instead of passing the real one as they happen even on
platforms like i.MX8 where that's not wanted.

Fixes: 1a26c920717a ("Input: snvs_pwrkey - send key events for i.MX6 S, DL and Q")
Tested-by: Martin Kepplinger <martin.kepplinger@puri.sm>
Signed-off-by: Sebastian Krzyszkowiak <sebastian.krzyszkowiak@puri.sm>
Reviewed-by: Mattijs Korpershoek <mkorpershoek@baylibre.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/4599101.ElGaqSPkdT@pliszka
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: usb: qmi_wwan: Add new usb-id for Dell branded EM7455

commit 797666cd5af041ffb66642fff62f7389f08566a2 upstream.

Add support for Dell 5811e (EM7455) with USB-id 0x413c:0x81c2.

Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Cc: stable@vger.kernel.org
Acked-by: Bjørn Mork <bjorn@mork.no>
Link: https://lore.kernel.org/r/20220926150740.6684-3-linux@fw-web.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

thunderbolt: Explicitly reset plug events delay back to USB4 spec value

commit 31f87f705b3c1635345d8e8a493697099b43e508 upstream.

If any software has interacted with the USB4 registers before the Linux
USB4 CM runs, it may have modified the plug events delay. It has been
observed that if this value too large, it's possible that hotplugged
devices will negotiate a fallback mode instead in Linux.

To prevent this, explicitly align the plug events delay with the USB4
spec value of 10ms.

Cc: stable@vger.kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: typec: ucsi: Remove incorrect warning

commit 415ba26cb73f7d22a892043301b91b57ae54db02 upstream.

Sink only devices do not have any source capabilities, so
the driver should not warn about that. Also DRP (Dual Role
Power) capable devices, such as USB Type-C docking stations,
do not return any source capabilities unless they are
plugged to a power supply themselves.

Fixes: 1f4642b72be7 ("usb: typec: ucsi: Retrieve all the PDOs instead of just the first 4")
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20220922145924.80667-1-heikki.krogerus@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

uas: ignore UAS for Thinkplus chips

commit 0fb9703a3eade0bb84c635705d9c795345e55053 upstream.

The UAS mode of Thinkplus(0x17ef, 0x3899) is reported to influence
performance and trigger kernel panic on several platforms with the
following error message:

[   39.702439] xhci_hcd 0000:0c:00.3: ERROR Transfer event for disabled
               endpoint or incorrect stream ring
[   39.702442] xhci_hcd 0000:0c:00.3: @000000026c61f810 00000000 00000000
               1b000000 05038000

[  720.545894][13] Workqueue: usb_hub_wq hub_event
[  720.550971][13]  ffff88026c143c38 0000000000016300 ffff8802755bb900 ffff880
                    26cb80000
[  720.559673][13]  ffff88026c144000 ffff88026ca88100 0000000000000000 ffff880
                    26cb80000
[  720.568374][13]  ffff88026cb80000 ffff88026c143c50 ffffffff8186ae25 ffff880
                    26ca880f8
[  720.577076][13] Call Trace:
[  720.580201][13]  [<ffffffff8186ae25>] schedule+0x35/0x80
[  720.586137][13]  [<ffffffff8186b0ce>] schedule_preempt_disabled+0xe/0x10
[  720.593623][13]  [<ffffffff8186cb94>] __mutex_lock_slowpath+0x164/0x1e0
[  720.601012][13]  [<ffffffff8186cc3f>] mutex_lock+0x2f/0x40
[  720.607141][13]  [<ffffffff8162b8e9>] usb_disconnect+0x59/0x290

Falling back to USB mass storage can solve this problem, so ignore UAS
function of this chip.

Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@kernel.org>
Signed-off-by: Hongling Zeng <zenghongling@kylinos.cn>
Link: https://lore.kernel.org/r/1663902249837086.19.seg@mailgw
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb-storage: Add Hiksemi USB3-FW to IGNORE_UAS

commit e00b488e813f0f1ad9f778e771b7cd2fe2877023 upstream.

The UAS mode of Hiksemi USB_HDD is reported to fail to work on several
platforms with the following error message, then after re-connecting the
device will be offlined and not working at all.

[  592.518442][ 2] sd 8:0:0:0: [sda] tag#17 uas_eh_abort_handler 0 uas-tag 18
                   inflight: CMD
[  592.527575][ 2] sd 8:0:0:0: [sda] tag#17 CDB: Write(10) 2a 00 03 6f 88 00 00
                   04 00 00
[  592.536330][ 2] sd 8:0:0:0: [sda] tag#0 uas_eh_abort_handler 0 uas-tag 1
                   inflight: CMD
[  592.545266][ 2] sd 8:0:0:0: [sda] tag#0 CDB: Write(10) 2a 00 07 44 1a 88 00
                   00 08 00

These disks have a broken uas implementation, the tag field of the status
iu-s is not set properly,so we need to fall-back to usb-storage.

Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@kernel.org>
Signed-off-by: Hongling Zeng <zenghongling@kylinos.cn>
Link: https://lore.kernel.org/r/1663901185-21067-1-git-send-email-zenghongling@kylinos.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

uas: add no-uas quirk for Hiksemi usb_disk

commit a625a4b8806cc1e928b7dd2cca1fee709c9de56e upstream.

The UAS mode of Hiksemi is reported to fail to work on several platforms
with the following error message, then after re-connecting the device will
be offlined and not working at all.

[  592.518442][ 2] sd 8:0:0:0: [sda] tag#17 uas_eh_abort_handler 0 uas-tag 18
                   inflight: CMD
[  592.527575][ 2] sd 8:0:0:0: [sda] tag#17 CDB: Write(10) 2a 00 03 6f 88 00 00
                   04 00 00
[  592.536330][ 2] sd 8:0:0:0: [sda] tag#0 uas_eh_abort_handler 0 uas-tag 1
                   inflight: CMD
[  592.545266][ 2] sd 8:0:0:0: [sda] tag#0 CDB: Write(10) 2a 00 07 44 1a 88 00
                   00 08 00

These disks have a broken uas implementation, the tag field of the status
iu-s is not set properly,so we need to fall-back to usb-storage.

Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@kernel.org>
Signed-off-by: Hongling Zeng <zenghongling@kylinos.cn>
Link: https://lore.kernel.org/r/1663901173-21020-1-git-send-email-zenghongling@kylinos.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

counter: 104-quad-8: Fix skipped IRQ lines during events configuration

[ Upstream commit 2bc54aaa65d2126ae629919175708a28ce7ef06e ]

IRQ trigger configuration is skipped if it has already been set before;
however, the IRQ line still needs to be OR'd to irq_enabled because
irq_enabled is reset for every events_configure call. This patch moves
the irq_enabled OR operation update to before the irq_trigger check so
that IRQ line enablement is not skipped.

Fixes: c95cc0d95702 ("counter: 104-quad-8: Fix persistent enabled events bug")
Cc: stable <stable@kernel.org>
Link: https://lore.kernel.org/r/20220815122301.2750-1-william.gray@linaro.org/
Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/179eed11eaf225dbd908993b510df0c8f67b1230.1663844776.git.william.gray@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>

counter: 104-quad-8: Implement and utilize register structures

[ Upstream commit daae1ee572d1f99daddef26afe6c6fc7aeea741d ]

Reduce magic numbers and improve code readability by implementing and
utilizing named register data structures.

Link: https://lore.kernel.org/r/20220707171709.36010-1-william.gray@linaro.org/
Cc: Syed Nayyar Waris <syednwaris@gmail.com>
Tested-by: Fred Eckert <Frede@cmslaser.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/285fdc7c03892251f50bdbf2c28c19998243a6a3.1657813472.git.william.gray@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stable-dep-of: 2bc54aaa65d2 ("counter: 104-quad-8: Fix skipped IRQ lines during events configuration")
Signed-off-by: Sasha Levin <sashal@kernel.org>

counter: 104-quad-8: Utilize iomap interface

[ Upstream commit b6e9cded90d46b1066a1ca260d8b5ecf67787aba ]

This driver doesn't need to access I/O ports directly via inb()/outb()
and friends. This patch abstracts such access by calling ioport_map()
to enable the use of more typical ioread8()/iowrite8() I/O memory
accessor calls.

Link: https://lore.kernel.org/r/861c003318dce3d2bef4061711643bb04f5ec14f.1652201921.git.william.gray@linaro.org
Cc: Syed Nayyar Waris <syednwaris@gmail.com>
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/e971b897cacfac4cb2eca478f5533d2875f5cadd.1657813472.git.william.gray@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stable-dep-of: 2bc54aaa65d2 ("counter: 104-quad-8: Fix skipped IRQ lines during events configuration")
Signed-off-by: Sasha Levin <sashal@kernel.org>

perf record: Fix cpu mask bit setting for mixed mmaps

[ Upstream commit ca76d7d2812b46124291f99c9b50aaf63a936f23 ]

With mixed per-thread and (system-wide) per-cpu maps, the "any cpu" value
-1 must be skipped when setting CPU mask bits.

Prior to commit cbd7bfc7fd99acdd ("tools/perf: Fix out of bound access
to cpu mask array") the invalid setting went unnoticed, but since then
it causes perf record to fail with an error.

Example:

Before:

   $ perf record -e intel_pt// --per-thread uname
   Failed to initialize parallel data streaming masks

After:

   $ perf record -e intel_pt// --per-thread uname
   Linux
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.068 MB perf.data ]

Fixes: ae4f8ae16a078964 ("libperf evlist: Allow mixing per-thread and per-cpu mmaps")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220915122612.81738-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

tools/perf: Fix out of bound access to cpu mask array

[ Upstream commit cbd7bfc7fd99acdde58ec2b0bce990158fba1654 ]

The cpu mask init code in "record__mmap_cpu_mask_init" function access
"bits" array part of "struct mmap_cpu_mask".  The size of this array is
the value from cpu__max_cpu().cpu.  This array is used to contain the
cpumask value for each cpu. While setting bit for each cpu, it calls
"set_bit" function which access index in "bits" array.

If we provide a command line option to -C which is greater than the
number of CPU's present in the system, the set_bit could access an array
member which is out-of the array size. This is because currently, there
is no boundary check for the CPU. This will result in seg fault:

<<>>
  ./perf record -C 12341234 ls
  Perf can support 2048 CPUs. Consider raising MAX_NR_CPUS
  Segmentation fault (core dumped)
<<>>

Debugging with gdb, points to function flow as below:

<<>>
  set_bit
  record__mmap_cpu_mask_init
  record__init_thread_default_masks
  record__init_thread_masks
  cmd_record
<<>>

Fix this by adding boundary check for the array.

After the patch:

<<>>
./perf record -C 12341234 ls
  Perf can support 2048 CPUs. Consider raising MAX_NR_CPUS
  Failed to initialize parallel data streaming masks
<<>>

With this fix, if -C is given a non-exsiting CPU, perf
record will fail with:

<<>>
  ./perf record -C 50 ls
  Failed to initialize parallel data streaming masks
<<>>

Reported-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220905141929.7171-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stable-dep-of: ca76d7d2812b ("perf record: Fix cpu mask bit setting for mixed mmaps")
Signed-off-by: Sasha Levin <sashal@kernel.org>