EMIT6_PCREL() macro assumes that the previous pass generated 6 bytes
of code, which is not the case if branch shortening took place. Fix by
using jit->prg, like all the other EMIT6_PCREL_*() macros.
Reported-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Fixes: 4e9b4a6883dd ("s390/bpf: Use relative long branches") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The JIT uses agfi for subtracting constants, but -(-0x80000000) cannot
be represented as a 32-bit signed binary integer. Fix by using algfi in
this particular case.
Currently the JIT completely removes things like `reg32 += 0`,
however, the BPF_ALU semantics requires the target register to be
zero-extended in such cases.
Fix by optimizing out only the arithmetic operation, but not the
subsequent zero-extension.
After d12e1c464988 ("net: dsa: b53: Set correct number of ports in the
DSA struct") we stopped setting dsa_switch::num_ports to DSA_MAX_PORTS,
which created an off by one error between the statically allocated
bcm_sf2_priv::port_sts array (of size DSA_MAX_PORTS). When
dsa_is_cpu_port() is used, we end-up accessing an out of bounds member
and causing a NPD.
Fix this by iterating with the appropriate port count using
ds->num_ports.
Fixes: d12e1c464988 ("net: dsa: b53: Set correct number of ports in the DSA struct") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The recent patch has introduced a regression by not reading the reset
count in the ERROR_RECOVERY async event handler. We may have just
gone through a reset and the reset count has just incremented. If
we don't update the reset count in the ERROR_RECOVERY event handler,
the health check timer will see that the reset count has changed and
will initiate an unintended reset.
Restore the unconditional update of the reset count in
bnxt_async_event_process() if error recovery watchdog is enabled.
Also, update the reset count at the end of the reset sequence to
make it even more robust.
Fixes: 1b2b91831983 ("bnxt_en: Fix possible unintended driver initiated error recovery") Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The cur_tx counter must be incremented after TACT bit of
txdesc->status was set. However, a CPU is possible to reorder
instructions and/or memory accesses between cur_tx and
txdesc->status. And then, if TX interrupt happened at such a
timing, the sh_eth_tx_free() may free the descriptor wrongly.
So, add wmb() before cur_tx++.
Otherwise NETDEV WATCHDOG timeout is possible to happen.
Fixes: 86a74ff21a7a ("net: sh_eth: add support for Renesas SuperH Ethernet") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
When removing the driver module w/o bringing an interface up before
the error below occurs. Reason seems to be that cancel_work_sync() is
called in t3_sge_stop() for a queue that hasn't been initialized yet.
Fixes: 5e0b8928927f ("net:cxgb3: replace tasklets with works") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
One MIPS platform (mach-rc32434) defines GPIOBASE. This macro
conflicts with one of the same name in lpc_sch.c. Rename the latter one
to prevent the build error.
../drivers/mfd/lpc_sch.c:25: error: "GPIOBASE" redefined [-Werror]
25 | #define GPIOBASE 0x44
../arch/mips/include/asm/mach-rc32434/rb.h:32: note: this is the location of the previous definition
32 | #define GPIOBASE 0x050000
Cc: Denis Turischev <denis@compulab.co.il> Fixes: e82c60ae7d3a ("mfd: Introduce lpc_sch for Intel SCH LPC bridge") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
This commit was added for equivalence with a similar fix to ip_gre.
That fix proved to have a bug. Upon closer inspection, ip6_gre is not
susceptible to the original bug.
So revert the unnecessary extra check.
In short, ipgre_xmit calls skb_pull to remove ipv4 headers previously
inserted by dev_hard_header. ip6gre_tunnel_xmit does not.
If error recovery is already enabled, bnxt_timer() will periodically
check the heartbeat register and the reset counter. If we get an
error recovery async. notification from the firmware (e.g. change in
primary/secondary role), we will immediately read and update the
heartbeat register and the reset counter. If the timer for the next
health check expires soon after this, we may read the heartbeat register
again in quick succession and find that it hasn't changed. This will
trigger error recovery unintentionally.
The likelihood is small because we also reset fw_health->tmr_counter
which will reset the interval for the next health check. But the
update is not protected and bnxt_timer() can miss the update and
perform the health check without waiting for the full interval.
Fix it by only reading the heartbeat register and reset counter in
bnxt_async_event_process() if error recovery is trasitioning to the
enabled state. Also add proper memory barriers so that when enabling
for the first time, bnxt_timer() will see the tmr_counter interval and
perform the health check after the full interval has elapsed.
Fixes: 7e914027f757 ("bnxt_en: Enable health monitoring.") Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The current asic.rev is incomplete and does not include the metal
revision. Add the metal revision and decode the complete asic
revision into the more common and readable form (A0, B0, etc).
Fixes: 7154917a12b2 ("bnxt_en: Refactor bnxt_dl_info_get().") Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Broadcom's b53 switches have one IMP (Inband Management Port) that needs
to be programmed using its own designed register. IMP port may be
different than CPU port - especially on devices with multiple CPU ports.
For that reason it's required to explicitly note IMP port index and
check for it when choosing a register to use.
This commit fixes BCM5301x support. Those switches use CPU port 5 while
their IMP port is 8. Before this patch b53 was trying to program port 5
with B53_PORT_OVERRIDE_CTRL instead of B53_GMII_PORT_OVERRIDE_CTRL(5).
It may be possible to also replace "cpu_port" usages with
dsa_is_cpu_port() but that is out of the scope of thix BCM5301x fix.
Fixes: 967dd82ffc52 ("net: dsa: b53: Add support for Broadcom RoboSwitch") Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The GRE tunnel device can pull existing outer headers in ipge_xmit.
This is a rare path, apparently unique to this device. The below
commit ensured that pulling does not move skb->data beyond csum_start.
But it has a false positive if ip_summed is not CHECKSUM_PARTIAL and
thus csum_start is irrelevant.
Refine to exclude this. At the same time simplify and strengthen the
test.
Simplify, by moving the check next to the offending pull, making it
more self documenting and removing an unnecessary branch from other
code paths.
Strengthen, by also ensuring that the transport header is correct and
therefore the inner headers will be after skb_reset_inner_headers.
The transport header is set to csum_start in skb_partial_csum_set.
Link: https://lore.kernel.org/netdev/YS+h%2FtqCJJiQei+W@shredder/ Fixes: 1d011c4803c7 ("ip_gre: add validation for csum_start") Reported-by: Ido Schimmel <idosch@idosch.org> Suggested-by: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
A firmware is requested but never released in this function. This leads to
a memory leak in the normal execution path.
Add the missing 'release_firmware()' call.
Also introduce a temp variable (new_len) in order to keep the value of
'pnvm->size' after the firmware has been released.
Fixes: cdda18fbbefa ("iwlwifi: pnvm: move file loading code to a separate function") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Luca Coelho <luca@coelho.fi> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Link: https://lore.kernel.org/r/1b5d80f54c1dbf85710fd285243932943b498fe7.1630614969.git.christophe.jaillet@wanadoo.fr Signed-off-by: Sasha Levin <sashal@kernel.org>
Previous commit 68233c583ab4 removes the qlcnic_rom_lock()
in qlcnic_pinit_from_rom(), but remains its corresponding
unlock function, which is odd. I'm not very sure whether the
lock is missing, or the unlock is redundant. This bug is
suggested by a static analysis tool, please advise.
Fixes: 68233c583ab4 ("qlcnic: updated reset sequence") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
It seems that this bug has already been fixed by Eric Dumazet in the
past in:
commit 78296c97ca1f ("netfilter: xt_socket: fix a stack corruption bug")
But a variant of the same issue has been introduced in
commit d64d80a2cde9 ("netfilter: x_tables: don't extract flow keys on early demuxed sks in socket match")
`daddr` and `saddr` potentially hold a reference to ipv6_var that is no
longer in scope when the call to `nf_socket_get_sock_v6` is made.
Fixes: d64d80a2cde9 ("netfilter: x_tables: don't extract flow keys on early demuxed sks in socket match") Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Benjamin Hesmans <benjamin.hesmans@tessares.net> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Recent changes exposed a bug where specifically-timed requests to the
path manager netlink API could trigger a divide-by-zero in
__tcp_select_window(), as syzkaller does:
mptcp_pm_nl_addr_send_ack() was attempting to send a TCP ACK on the
first subflow in the MPTCP socket's connection list without validating
that the subflow was in a suitable connection state. To address this,
always validate subflow state when sending extra ACKs on subflows
for address advertisement or subflow priority change.
Fixes: 84dfe3677a6f ("mptcp: send out dedicated ADD_ADDR packet") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/229 Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Acked-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Setting DSA_MAX_PORTS caused DSA to call b53 callbacks (e.g.
b53_disable_port() during dsa_register_switch()) for invalid
(non-existent) ports. That made b53 modify unrelated registers and is
one of reasons for a broken BCM5301x support.
This problem exists for years but DSA_MAX_PORTS usage has changed few
times. It seems the most accurate to reference commit dropping
dsa_switch_alloc() in the Fixes tag.
Fixes: 7e99e3470172 ("net: dsa: remove dsa_switch_alloc helper") Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
It isn't true that CPU port is always the last one. Switches BCM5301x
have 9 ports (port 6 being inactive) and they use port 5 as CPU by
default (depending on design some other may be CPU ports too).
A more reliable way of determining number of ports is to check for the
last set bit in the "enabled_ports" bitfield.
This fixes b53 internal state, it will allow providing accurate info to
the DSA and is required to fix BCM5301x support.
Fixes: 967dd82ffc52 ("net: dsa: b53: Add support for Broadcom RoboSwitch") Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
If the network devices connected to the system beyond
HSO_MAX_NET_DEVICES. add_net_device() in hso_create_net_device()
will be failed for the network_table is full. It will lead to
business failure which rely on network_table, for example,
hso_suspend() and hso_resume(). It will also lead to memory leak
because resource release process can not search the hso_device
object from network_table in hso_free_interface().
Add failure handler for add_net_device() in hso_create_net_device()
to solve the above problems.
Fixes: 72dc1c096c70 ("HSO: add option hso driver") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Florian noted that if mptcp_alloc_tx_skb() allocation fails
in __mptcp_push_pending(), we can end-up invoking
mptcp_push_release()/tcp_push() with a zero mss, causing
a divide by 0 error.
This change addresses the issue refactoring the skb allocation
code checking if skb collapsing will happen for sure and doing
the skb allocation only after such check. Skb allocation will
now happen only after the call to tcp_send_mss() which
correctly initializes mss_now.
As side bonuses we now fill the skb tx cache only when needed,
and this also clean-up a bit the output path.
v1 -> v2:
- use lockdep_assert_held_once() - Jakub
- fix indentation - Jakub
Reported-by: Florian Westphal <fw@strlen.de> Fixes: 724cfd2ee8aa ("mptcp: allocate TX skbs in msk context") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently the clean target when using O= isn't cleaning the feature
detect output. This is because O= and OUTPUT= are set to canonical
paths. For example in tools/perf/Makefile:
This means that OUTPUT ends in a / and most usages prepend it to a file
without adding an extra /. This line that was changed adds an extra /
before the 'feature' folder but not to the end, resulting in a clean
command like this:
rm -f /tmp/build//featuretest-all.bin ...
After the change the clean command looks like this:
rm -f /tmp/build/feature/test-all.bin ...
Fixes: 762323eb39a257c3 ("perf build: Move feature cleanup under tools/build") Signed-off-by: James Clark <james.clark@arm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20210816130705.1331868-1-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
I noticed that only port 0 worked on the RTL8366RB since we
started to use custom tags.
It turns out that the format of egress custom tags is actually
different from ingress custom tags. While the lower bits just
contain the port number in ingress tags, egress tags need to
indicate destination port by setting the bit for the
corresponding port.
It was working on port 0 because port 0 added 0x00 as port
number in the lower bits, and if you do this the packet appears
at all ports, including the intended port. Ooops.
Fix this and all ports work again. Use the define for shifting
the "type A" into place while we're at it.
Tested on the D-Link DIR-685 by sending traffic to each of
the ports in turn. It works.
Fixes: 86dd9868b878 ("net: dsa: tag_rtl4_a: Support also egress tags") Cc: DENG Qingfang <dqfext@gmail.com> Cc: Mauri Sandberg <sandberg@mailfence.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
In case of buffered reading from block device, when short read happens,
we should retry to read more, otherwise the IO will be completed
partially, for example, the following fio expects to read 2MB, but it
can only read 1M or less bytes:
Commit 76c47d1449fc ("gpio: mpc8xxx: Add ACPI support") has switched to a
managed version when dealing with 'mpc8xxx_gc->regs'. So the corresponding
'iounmap()' call in the error handling path and in the remove should be
removed to avoid a double unmap.
This also allows some simplification in the probe. All the error handling
paths related to managed resources can be direct returns and a NULL check
in what remains in the error handling path can be removed.
Commit 698b8eeaed72 ("gpio/mpc8xxx: change irq handler from chained to normal")
has introduced a new 'goto err;' at the very end of the function, but has
not updated the error handling path accordingly.
Add the now missing 'irq_domain_remove()' call which balances a previous
'irq_domain_create_linear() call.
The build on fedora:35 and fedora:rawhide with clang is failing with:
49 41.00 fedora:35 : FAIL clang version 13.0.0 (Fedora 13.0.0~rc1-1.fc35)
bench/inject-buildid.c:351:6: error: variable 'len' set but not used [-Werror,-Wunused-but-set-variable]
u64 len = 0;
^
1 error generated.
make[3]: *** [/git/perf-5.14.0-rc7/tools/build/Makefile.build:139: bench] Error 2
50 41.11 fedora:rawhide : FAIL clang version 13.0.0 (Fedora 13.0.0~rc1-1.fc35)
bench/inject-buildid.c:351:6: error: variable 'len' set but not used [-Werror,-Wunused-but-set-variable]
u64 len = 0;
^
1 error generated.
make[3]: *** [/git/perf-5.14.0-rc7/tools/build/Makefile.build:139: bench] Error 2
That 'len' variable is not used at all, so just make sure all the
synthesize_RECORD() routines return ssize_t to propagate the writen()
return, as it may fail, ditch the 'ret' var and bail out if those
routines fail.
ifdef LIBUNWIND_DIR
LIBUNWIND_CFLAGS = -I$(LIBUNWIND_DIR)/include
LIBUNWIND_LDFLAGS = -L$(LIBUNWIND_DIR)/lib
LIBUNWIND_ARCHS = x86 x86_64 arm aarch64 debug-frame-arm debug-frame-aarch64
$(foreach libunwind_arch,$(LIBUNWIND_ARCHS),$(call libunwind_arch_set_flags,$(libunwind_arch)))
endif
Look at that 'foreach' on all the LIBUNWIND_ARCHS.
</>
After commit 5c4d7c82c0dc ("perf unwind: Do not put libunwind-{x86,aarch64}
in FEATURE_TESTS_BASIC"), FEATURE_CHECK_LDFLAGS-libunwind-{x86,aarch64} is
overwritten. As a result, the remote libunwind libraries cannot be searched
from $(LIBUNWIND_DIR)/lib directory during feature check tests. Fix it with
variable appending.
Before this patch:
perf$ make VF=1 LIBUNWIND_DIR=/opt/libunwind_aarch64
BUILD: Doing 'make -j16' parallel build
<SNIP>
...
... libopencsd: [ OFF ]
... libunwind-x86: [ OFF ]
... libunwind-x86_64: [ OFF ]
... libunwind-arm: [ OFF ]
... libunwind-aarch64: [ OFF ]
... libunwind-debug-frame: [ OFF ]
... libunwind-debug-frame-arm: [ OFF ]
... libunwind-debug-frame-aarch64: [ OFF ]
... cxx: [ OFF ]
<SNIP>
perf$ cat ../build/feature/test-libunwind-aarch64.make.output
/usr/bin/ld: cannot find -lunwind-aarch64
/usr/bin/ld: cannot find -lunwind-aarch64
collect2: error: ld returned 1 exit status
After this patch:
perf$ make VF=1 LIBUNWIND_DIR=/opt/libunwind_aarch64
BUILD: Doing 'make -j16' parallel build
<SNIP>
... libopencsd: [ OFF ]
... libunwind-x86: [ OFF ]
... libunwind-x86_64: [ OFF ]
... libunwind-arm: [ OFF ]
... libunwind-aarch64: [ on ]
... libunwind-debug-frame: [ OFF ]
... libunwind-debug-frame-arm: [ OFF ]
... libunwind-debug-frame-aarch64: [ OFF ]
... cxx: [ OFF ]
<SNIP>
Fixes: 5c4d7c82c0dceccf ("perf unwind: Do not put libunwind-{x86,aarch64} in FEATURE_TESTS_BASIC") Signed-off-by: Li Huafei <lihuafei1@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhang Jinhao <zhangjinhao2@huawei.com> Link: http://lore.kernel.org/lkml/20210823134340.60955-1-lihuafei1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Acaict, perf_home_perfconfig() is supposed to cache the result of
home_perfconfig, which returns the default location of perfconfig for
the user, given the HOME environment variable.
However, the current implementation calls home_perfconfig every time
perf_home_perfconfig() is called (so no caching is actually performed),
replacing the previous pointer, thus also causing a memory leak.
This patch adds a check of whether either config or failed is set and,
in that case, directly returns config without calling home_perfconfig at
each invocation.
Fixes: f5f03e19ce14fc31 ("perf config: Add perf_home_perfconfig function") Signed-off-by: Riccardo Mancini <rickyman7@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Link: http://lore.kernel.org/lkml/20210820130817.740536-1-rickyman7@gmail.com
[ Removed needless double check for the 'failed' variable ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
0day bot reports a build error:
ERROR: modpost: "clear_user_page" [drivers/media/v4l2-core/videobuf-dma-sg.ko] undefined!
so export it in arch/arc/ to fix the build error.
In most ARCHes, clear_user_page() is a macro. OTOH, in a few
ARCHes it is a function and needs to be exported.
PowerPC exported it in 2004. It looks like nds32 and nios2
still need to have it exported.
A successful 'init_rs_non_canonical()' call should be balanced by a
corresponding 'free_rs()' call in the error handling path of the probe, as
already done in the remove function.
watchdog_hrtimer_pretimeout_stop needs the watchdog device to have a
valid pointer to the watchdog core data to stop the pretimeout hrtimer.
Therefore it needs to be called before the pointers are cleared in
watchdog_cdev_unregister.
Fixes: 7b7d2fdc8c3e ("watchdog: Add hrtimer-based pretimeout feature") Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Curtis Klein <curtis.klein@hpe.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/1624429583-5720-1-git-send-email-curtis.klein@hpe.com Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Since 39850ed51062 ("PCI/PTM: Save/restore Precision Time Measurement
Capability for suspend/resume"), devices that have PTM capability but
don't enable it see this message on calls to pci_save_state():
no suspend buffer for PTM
Drop the message, it's perfectly fine not to use a capability.
Fixes: 39850ed51062 ("PCI/PTM: Save/restore Precision Time Measurement Capability for suspend/resume") Link: https://lore.kernel.org/r/20210811185955.3112534-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The CPU_ON PSCI call takes a payload that KVM uses to configure a
destination vCPU to run. This payload is non-architectural state and not
exposed through any existing UAPI. Effectively, we have a race between
CPU_ON and userspace saving/restoring a guest: if the target vCPU isn't
ran again before the VMM saves its state, the requested PC and context
ID are lost. When restored, the target vCPU will be runnable and start
executing at its old PC.
We can avoid this race by making sure the reset payload is serviced
before userspace can access a vCPU's state.
Fixes: 358b28f09f0a ("arm/arm64: KVM: Allow a VCPU to fully reset itself") Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210818202133.1106786-3-oupton@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
KVM correctly serializes writes to a vCPU's reset state, however since
we do not take the KVM lock on the read side it is entirely possible to
read state from two different reset requests.
Cure the race for now by taking the KVM lock when reading the
reset_state structure.
Fixes: 358b28f09f0a ("arm/arm64: KVM: Allow a VCPU to fully reset itself") Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210818202133.1106786-2-oupton@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Since 2431c4f5b46c3 ("mtd: Implement mtd_{read,write}() as wrappers
around mtd_{read,write}_oob()") don't allow _write|_read and
_write_oob|_read_oob existing at the same time, we should check the
existence of callbacks "_read and _write" from subdev's master device
(We can trust master device since it has been registered) before
assigning, otherwise following warning occurs while making
concatenated device:
WARNING: CPU: 2 PID: 6728 at drivers/mtd/mtdcore.c:595
add_mtd_device+0x7f/0x7b0
Since commit 46b5889cc2c5("mtd: implement proper partition handling")
applied, mtd partition device won't hold some callback functions, such
as _block_isbad, _block_markbad, etc. Besides, function mtd_block_isbad()
will get mtd device's master mtd device, then invokes master mtd device's
callback function. So, following process may result mtd_block_isbad()
always return 0, even though mtd device has bad blocks:
1. Split a mtd device into 3 partitions: PA, PB, PC
[ Each mtd partition device won't has callback function _block_isbad(). ]
2. Concatenate PA and PB as a new mtd device PN
[ mtd_concat_create() finds out each subdev has no callback function
_block_isbad(), so PN won't be assigned callback function
concat_block_isbad(). ]
Then, mtd_block_isbad() checks "!master->_block_isbad" is true, will
always return 0.
Fixes a build error when CONFIG_HIST_TRIGGERS=n with boot-time
tracing. Since the trigger_process_regex() is defined only
when CONFIG_HIST_TRIGGERS=y, if it is disabled, the 'actions'
event option also must be disabled.
Fixes: 45db33709ccc ("PCI: Allow specifying devices using a base bus and path of devfns") Link: https://lore.kernel.org/r/20210812070004.GC31863@kili Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Even though ID_AA64MMFR0.PARANGE reports 52 bit PA size support, it cannot
be enabled as guest IPA size on 4K or 16K page size configurations. Hence
kvm_ipa_limit must be restricted to 48 bits. This change achieves required
IPA capping.
Before the commit c9b69a0cf0b4 ("KVM: arm64: Don't constrain maximum IPA
size based on host configuration"), the problem here would have been just
latent via PHYS_MASK_SHIFT (which earlier in turn capped kvm_ipa_limit),
which remains capped at 48 bits on 4K and 16K configs.
Cc: Marc Zyngier <maz@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: kvmarm@lists.cs.columbia.edu Cc: linux-kernel@vger.kernel.org Fixes: c9b69a0cf0b4 ("KVM: arm64: Don't constrain maximum IPA size based on host configuration") Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/1628680275-16578-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Syzbot hit use-after-free in nf_tables_dump_sets. The problem was in
missing lock protection for nft_ct_pcpu_template_refcnt.
Before commit f102d66b335a ("netfilter: nf_tables: use dedicated
mutex to guard transactions") all transactions were serialized by global
mutex, but then global mutex was changed to local per netnamespace
commit_mutex.
This change causes use-after-free bug, when 2 netnamespaces concurently
changing nft_ct_pcpu_template_refcnt without proper locking. Fix it by
adding nft_ct_pcpu_mutex and protect all nft_ct_pcpu_template_refcnt
changes with it.
Fixes: f102d66b335a ("netfilter: nf_tables: use dedicated mutex to guard transactions") Reported-and-tested-by: syzbot+649e339fa6658ee623d3@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In commit 7ef1c871da16 ("PCI: iproc: Use
pci_parse_request_of_pci_ranges()"), calling
devm_request_pci_bus_resources() was dropped from the common iProc
probe code, but is still needed for BCMA bus probing. Without it, there
will be lots of warnings like this:
pci 0000:00:00.0: BAR 8: no space for [mem size 0x00c00000]
pci 0000:00:00.0: BAR 8: failed to assign [mem size 0x00c00000]
Add back calling devm_request_pci_bus_resources() and adding the
resources to pci_host_bridge.windows for BCMA bus probe.
Link: https://lore.kernel.org/r/20210803215656.3803204-2-robh@kernel.org Fixes: 7ef1c871da16 ("PCI: iproc: Use pci_parse_request_of_pci_ranges()") Reported-by: Rafał Miłecki <zajec5@gmail.com> Tested-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Srinath Mannam <srinath.mannam@broadcom.com> Cc: Roman Bacik <roman.bacik@broadcom.com> Cc: Bharat Gooty <bharat.gooty@broadcom.com> Cc: Abhishek Shah <abhishek.shah@broadcom.com> Cc: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Cc: Ray Jui <ray.jui@broadcom.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: BCM Kernel Feedback <bcm-kernel-feedback-list@broadcom.com> Cc: Scott Branden <sbranden@broadcom.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: "Krzysztof Wilczyński" <kw@linux.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 669cbc708122 ("PCI: Move DT resource setup into
devm_pci_alloc_host_bridge()") made devm_pci_alloc_host_bridge() fail on
any DT resource parsing errors, but Broadcom iProc uses
devm_pci_alloc_host_bridge() on BCMA bus devices that don't have DT
resources. In particular, there is no 'ranges' property. Fix iProc by
making 'ranges' optional.
If 'ranges' is required by a platform, there's going to be more errors
latter on if it is missing.
Link: https://lore.kernel.org/r/20210803215656.3803204-1-robh@kernel.org Fixes: 669cbc708122 ("PCI: Move DT resource setup into devm_pci_alloc_host_bridge()") Reported-by: Rafał Miłecki <zajec5@gmail.com> Tested-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Srinath Mannam <srinath.mannam@broadcom.com> Cc: Roman Bacik <roman.bacik@broadcom.com> Cc: Bharat Gooty <bharat.gooty@broadcom.com> Cc: Abhishek Shah <abhishek.shah@broadcom.com> Cc: Jitendra Bhivare <jitendra.bhivare@broadcom.com> Cc: Ray Jui <ray.jui@broadcom.com> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: BCM Kernel Feedback <bcm-kernel-feedback-list@broadcom.com> Cc: Scott Branden <sbranden@broadcom.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The Intel IXP4xx PCI controller is only present on Intel IXP4xx
XScale-based network processor SoCs.
Add a dependency on ARCH_IXP4XX, to prevent asking the user about this
driver when configuring a kernel without support for the XScale
processor family.
Link: https://lore.kernel.org/r/6a88e55fe58fc280f4ff1ca83c154e4895b6dcbf.1624972789.git.geert+renesas@glider.be Fixes: f7821b4934584824 ("PCI: ixp4xx: Add a new driver for IXP4xx") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
[lorenzo.pieralisi@arm.com: commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Remove interrupt disablement during backlight setting. It is
way to dangerous and makes platforms instable by having it
miss vblank IRQs leading to the graphics derailing.
The code is using ndelay() which is not available on
platforms such as ARM and will result in 32 * udelay(1)
which is substantial.
Add some code to detect if an interrupt occurs during the
tight loop and in that case just redo it from the top.
Fixes: 5317f37e48b9 ("backlight: Add Kinetic KTD253 backlight driver") Cc: Stephan Gerhold <stephan@gerhold.net> Reported-by: newbyte@disroot.org Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
On Cherry Trail devices with an AXP288 PMIC the external SD-card slot
used the AXP's DLDO2 as card-voltage and either DLDO3 or GPIO1LDO
(GPIO1 pin in low noise LDO mode) as signal-voltage.
These regulators are turned on/off and in case of the signal-voltage
also have their output-voltage changed by the _PS0 and _PS3 power-
management ACPI methods on the MMC-controllers ACPI fwnode as well as
by the _DSM ACPI method for changing the signal voltage.
The AML code implementing these methods is directly accessing the
PMIC through ACPI I2C OpRegion accesses, instead of using the special
PMIC OpRegion handled by drivers/acpi/pmic/intel_pmic_xpower.c .
This means that the contents of the involved PMIC registers can change
without the change being made through the regmap interface, so regmap
should not cache the contents of these registers.
Mark the regulator power on/off, the regulator voltage control and the
GPIO1 control registers as volatile, to avoid regmap caching them.
Specifically this fixes an issue on some models where the i915 driver
toggles another LDO using the same on/off register on/off through
MIPI sequences (through intel_soc_pmic_exec_mipi_pmic_seq_element())
which then writes back a cached on/off register-value where the
card-voltage is off causing the external sdcard slot to stop working
when the screen goes blank, or comes back on again.
The regulator register-range now marked volatile also includes the
buck regulator control registers. This is done on purpose these are
normally not touched by the AML code, but they are updated directly
by the SoC's PUNIT which means that they may also change without going
through regmap.
Note the AXP288 PMIC is only used on Bay- and Cherry-Trail platforms,
so even though this is an ACPI specific problem there is no need to
make the new volatile ranges conditional since these platforms always
use ACPI.
Fixes: dc91c3b6fe66 ("mfd: axp20x: Mark AXP20X_VBUS_IPSOUT_MGMT as volatile") Fixes: cd53216625a0 ("mfd: axp20x: Fix axp288 volatile ranges") Reported-and-tested-by: Clamshell <clamfly@163.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Joakim Zhang reports that Wake-on-Lan with the stmmac ethernet driver broke
when moving the incorrect handling of mac link state out of mac_config().
This reason this breaks is because the stmmac's WoL is handled by the MAC
rather than the PHY, and phylink doesn't cater for that scenario.
This patch adds the necessary phylink code to handle suspend/resume events
according to whether the MAC still needs a valid link or not. This is the
barest minimum for this support.
Reported-by: Joakim Zhang <qiangqing.zhang@nxp.com> Tested-by: Joakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
syzbot is reporting circular locking problem at __loop_clr_fd() [1], for
commit a160c6159d4a0cf8 ("block: add an optional probe callback to
major_names") is calling the module's probe function with major_names_lock
held.
Fortunately, since commit 990e78116d38059c ("block: loop: fix deadlock
between open and remove") stopped holding loop_ctl_mutex in lo_open(),
current role of loop_ctl_mutex is to serialize access to loop_index_idr
and loop_add()/loop_remove(); in other words, management of id for IDR.
To avoid holding loop_ctl_mutex during whole add/remove operation, use
a bool flag to indicate whether the loop device is ready for use.
loop_unregister_transfer() which is called from cleanup_cryptoloop()
currently has possibility of use-after-free problem due to lack of
serialization between kfree() from loop_remove() from loop_control_remove()
and mutex_lock() from unregister_transfer_cb(). But since lo->lo_encryption
should be already NULL when this function is called due to module unload,
and commit 222013f9ac30b9ce ("cryptoloop: add a deprecation warning")
indicates that we will remove this function shortly, this patch updates
this function to emit warning instead of checking lo->lo_encryption.
Holding loop_ctl_mutex in loop_exit() is pointless, for all users must
close /dev/loop-control and /dev/loop$num (in order to drop module's
refcount to 0) before loop_exit() starts, and nobody can open
/dev/loop-control or /dev/loop$num afterwards.
The function bfq_setup_merge prepares the merging between two
bfq_queues, say bfqq and new_bfqq. To this goal, it assigns
bfqq->new_bfqq = new_bfqq. Then, each time some I/O for bfqq arrives,
the process that generated that I/O is disassociated from bfqq and
associated with new_bfqq (merging is actually a redirection). In this
respect, bfq_setup_merge increases new_bfqq->ref in advance, adding
the number of processes that are expected to be associated with
new_bfqq.
Unfortunately, the stable-merging mechanism interferes with this
setup. After bfqq->new_bfqq has been set by bfq_setup_merge, and
before all the expected processes have been associated with
bfqq->new_bfqq, bfqq may happen to be stably merged with a different
queue than the current bfqq->new_bfqq. In this case, bfqq->new_bfqq
gets changed. So, some of the processes that have been already
accounted for in the ref counter of the previous new_bfqq will not be
associated with that queue. This creates an unbalance, because those
references will never be decremented.
This commit fixes this issue by reestablishing the previous, natural
behaviour: once bfqq->new_bfqq has been set, it will not be changed
until all expected redirections have occurred.
Commit 3df98d79215ace13 ("lsm,selinux: pass flowi_common instead of flowi
to the LSM hooks") introduced flowi{4,6}_to_flowi_common() functions which
cause UBSAN warning when building with LLVM 11.0.1 on Ubuntu 21.04.
which should point to the same address without access to "struct flowi".
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
This patch fixes kernel NULL pointer dereference when creating nexthop
which is bound with SRv6 decapsulation. In the creation of nexthop,
__seg6_end_dt_vrf_build is called. __seg6_end_dt_vrf_build expects
fc_lninfo in fib6_config is set correctly, but it isn't set in
nh_create_ipv6, which causes kernel crash.
Here is steps to reproduce kernel crash:
1. modprobe vrf
2. ip -6 nexthop add encap seg6local action End.DT4 vrftable 1 dev eth0
Check one more time before exiting the API with an error.
Fix API to poll at least twice, in case there are other high priority
tasks and this API doesn't get CPU cycles for multiple jiffies update.
In addition, increase timeout from usecs_to_jiffies(10000) to
usecs_to_jiffies(20000), to prevent the case that for CONFIG_100HZ
timeout will be a single jiffies.
A single jiffies results actual timeout that can be any time between
1usec and 10msec. To solve this, a value of usecs_to_jiffies(20000)
ensures that timeout is 2 jiffies.
Signed-off-by: Smadar Fuks <smadarf@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
We must not pet a running watchdog when handle_boot_enabled is off
because this will kick off automatic triggering before userland is
running, defeating the purpose of the handle_boot_enabled control.
Furthermore, don't ping in case watchdog_set_last_hw_keepalive was
called incorrectly when the hardware watchdog is actually not running.
Fixed: cef9572e9af3 ("watchdog: add support for adjusting last known HW keepalive time") Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/93d56386-6e37-060b-55ce-84de8cde535f@web.de Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
AM64 has the same PCIe IP as in J7200 with certain erratas not
applicable (quirk_detect_quiet_flag). Add support for "ti,am64-pcie-host"
compatible and "ti,am64-pcie-ep" compatible that is specific to AM64.
J7200 has the same PCIe IP as in J721E with minor changes in the
wrapper. J7200 allows byte access of bridge configuration space
registers and the register field for LINK_DOWN interrupt is different.
J7200 also requires "quirk_detect_quiet_flag" to be set. Configure these
changes as part of driver data applicable only to J7200.
PCIe fails to link up if SERDES lanes not used by PCIe are assigned to
another protocol. For example, link training fails if lanes 2 and 3 are
assigned to another protocol while lanes 0 and 1 are used for PCIe to
form a two lane link. This failure is due to an incorrect tie-off on an
internal status signal indicating electrical idle.
Status signals going from SERDES to PCIe Controller are tied-off when a
lane is not assigned to PCIe. Signal indicating electrical idle is
incorrectly tied-off to a state that indicates non-idle. As a result,
PCIe sees unused lanes to be out of electrical idle and this causes
LTSSM to exit Detect.Quiet state without waiting for 12ms timeout to
occur. If a receiver is not detected on the first receiver detection
attempt in Detect.Active state, LTSSM goes back to Detect.Quiet and
again moves forward to Detect.Active state without waiting for 12ms as
required by PCIe base specification. Since wait time in Detect.Quiet is
skipped, multiple receiver detect operations are performed back-to-back
without allowing time for capacitance on the transmit lines to
discharge. This causes subsequent receiver detection to always fail even
if a receiver gets connected eventually.
Add a quirk flag "quirk_detect_quiet_flag" to program the minimum
time the LTSSM should wait on entering Detect.Quiet state here.
This has to be set for J7200 as it has an incorrect tie-off on unused
lanes.
Link: https://lore.kernel.org/r/20210811123336.31357-3-kishon@ti.com Signed-off-by: Nadeem Athani <nadeem@cadence.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
No functional change. As we are intending to add additional 1-bit
members in struct j721e_pcie_data/struct cdns_pcie_rc, use bitfields
instead of bool since it takes less space. As discussed in [1],
the preference is to use bitfileds instead of bool inside structures.
Since kprobe_events and uprobe_events only check whether the
other same-type probe event has the same name or not, if the
user gives the same name of the existing tracepoint event (or
the other type of probe events), it silently fails to create
the tracefs entry (but registered.) as below.
/sys/kernel/tracing # ls events/task/task_rename
enable filter format hist id trigger
/sys/kernel/tracing # echo p:task/task_rename vfs_read >> kprobe_events
[ 113.048508] Could not create tracefs 'task_rename' directory
/sys/kernel/tracing # cat kprobe_events
p:task/task_rename vfs_read
To fix this issue, check whether the existing events have the
same name or not in trace_probe_register_event_call(). If exists,
it rejects to register the new event.
When protected mode is enabled, the host is unable to access most parts
of the EL2 hypervisor image, including 'hyp_physvirt_offset' and the
contents of the hypervisor's '.rodata.str' section. Unfortunately,
nvhe_hyp_panic_handler() tries to read from both of these locations when
handling a BUG() triggered at EL2; the former for converting the ELR to
a physical address and the latter for displaying the name of the source
file where the BUG() occurred.
Hack the EL2 panic asm to pass both physical and virtual ELR values to
the host and utilise the newly introduced CONFIG_NVHE_EL2_DEBUG so that
we disable stage-2 protection for the host before returning to the EL1
panic handler. If the debug option is not enabled, display the address
instead of the source file:line information.
Cc: Andrew Scull <ascull@google.com> Cc: Quentin Perret <qperret@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210813130336.8139-1-will@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
RISCV uses a global variable pfn_base for page/pfn translation. But this
is a common name and will be used elsewhere. In those cases, the
page-pfn macros which refer to this name will be referred to the
local/input variable instead. (such as in vfio_pin_pages_remote). This
make everything wrong.
This patch changes the name from pfn_base to riscv_pfn_base to fix
this problem.
pm_runtime_get_sync() will increase the runtime PM counter
even it returns an error. Thus a pairing decrement is needed
to prevent refcount leak. Fix this by replacing this API with
pm_runtime_resume_and_get(), which will not change the runtime
PM counter on error.
Link: https://lore.kernel.org/r/20210408072402.15069-1-dinghao.liu@zju.edu.cn Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Sasha Levin <sashal@kernel.org>
Although irq_create_mapping() is able to deal with duplicate
mappings, it really isn't supposed to be a substitute for
irq_find_mapping(), and can result in allocations that take place
in atomic context if the mapping didn't exist.
Fix the handful of MFD drivers that use irq_create_mapping() in
interrupt context by using irq_find_mapping() instead.
Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Lee Jones <lee.jones@linaro.org> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Lower order MSI-X address is programmed in MSIX_ADDR_MATCH_HIGH_OFF
DBI register instead of higher order address. This patch fixes this
programming mistake.
In tegra_pcie_ep_hard_irq(), APPL_INTR_STATUS_L0 is stored in val and again
APPL_INTR_STATUS_L1_0_0 is also stored in val. So when execution reaches
"if (val & APPL_INTR_STATUS_L0_PCI_CMD_EN_INT)", val is not correct.
There is a potential race between fuse_read_interrupt() and
fuse_request_end().
TASK1
in fuse_read_interrupt(): delete req->intr_entry (while holding
fiq->lock)
TASK2
in fuse_request_end(): req->intr_entry is empty -> skip fiq->lock
wake up TASK3
TASK3
request is freed
TASK1
in fuse_read_interrupt(): dereference req->in.h.unique ***BAM***
Fix by always grabbing fiq->lock if the request was ever interrupted
(FR_INTERRUPTED set) thereby serializing with concurrent
fuse_read_interrupt() calls.
FR_INTERRUPTED is set before the request is queued on fiq->interrupts.
Dequeing the request is done with list_del_init() but FR_INTERRUPTED is not
cleared in this case.
Root Ports in NXP LX2xx0 and LX2xx2, where each Root Port is a Root Complex
with unique segment numbers, do provide isolation features to disable peer
transactions and validate bus numbers in requests, but do not provide an
actual PCIe ACS capability.
Add ACS quirks for NXP LX2xx0 A/C/E/N and LX2xx2 A/C/E/N platforms.
These are the actual frequencies reported by the PLL, so let's
report these. The roundoffs are inappropriate, we should round
to the frequency that the clock will later report.
Drop some whitespace at the same time.
Cc: phone-devel@vger.kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The remoteproc driver is split between the responsibilities of getting
the SoC-internal ARM core up and running and the external RF (aka
"Iris") part configured.
In order to satisfy the regulator framework's need of a struct device *
to look up supplies this was implemented as two different drivers, using
of_platform_populate() in the remoteproc part to probe the iris part.
Unfortunately it's possible that the iris part probe defers on yet not
available regulators and an attempt to start the remoteproc will have to
be rejected, until this has been resolved. But there's no useful
mechanism of knowing when this would be.
Instead replace the of_platform_populate() and the iris probe with a
function that rolls its own struct device, with the relevant of_node
associated that is enough to acquire regulators and clocks specified in
the DT node and that may propagate the EPROBE_DEFER back to the wcnss
device's probe.
"PAGESIZE / 512" is the number of ECC chunks.
"ECC_BYTES" is the number of bytes needed to store a single ECC code.
"2" is the space reserved by the bad block marker.
"2 + (PAGESIZE / 512) * ECC_BYTES" should of course be lower or equal
than the total number of OOB bytes, otherwise it won't fit.
The network interface managed by the mlxbf_gige driver can
get into a problem state where traffic does not flow.
In this state, the interface will be up and enabled, but
will stop processing received packets. This problem state
will happen if three specific conditions occur:
1) driver has received more than (N * RxRingSize) packets but
less than (N+1 * RxRingSize) packets, where N is an odd number
Note: the command "ethtool -g <interface>" will display the
current receive ring size, which currently defaults to 128
2) the driver's interface was disabled via "ifconfig oob_net0 down"
during the window described in #1.
3) the driver's interface is re-enabled via "ifconfig oob_net0 up"
This patch ensures that the driver's "valid_polarity" field is
cleared during the open() method so that it always matches the
receive polarity used by hardware. Without this fix, the driver
needs to be unloaded and reloaded to correct this problem state.
Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver") Reviewed-by: Asmaa Mnebhi <asmaa@nvidia.com> Signed-off-by: David Thompson <davthompson@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Sometimes when unbinding the mv88e6xxx driver on Turris MOX, these error
messages appear:
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete be:79:b4:9e:9e:96 vid 1 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete be:79:b4:9e:9e:96 vid 0 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete d8:58:d7:00:ca:6d vid 100 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete d8:58:d7:00:ca:6d vid 1 from fdb: -2
mv88e6085 d0032004.mdio-mii:12: port 1 failed to delete d8:58:d7:00:ca:6d vid 0 from fdb: -2
(and similarly for other ports)
What happens is that DSA has a policy "even if there are bugs, let's at
least not leak memory" and dsa_port_teardown() clears the dp->fdbs and
dp->mdbs lists, which are supposed to be empty.
But deleting that cleanup code, the warnings go away.
=> the FDB and MDB lists (used for refcounting on shared ports, aka CPU
and DSA ports) will eventually be empty, but are not empty by the time
we tear down those ports. Aka we are deleting them too soon.
The addresses that DSA complains about are host-trapped addresses: the
local addresses of the ports, and the MAC address of the bridge device.
The problem is that offloading those entries happens from a deferred
work item scheduled by the SWITCHDEV_FDB_DEL_TO_DEVICE handler, and this
races with the teardown of the CPU and DSA ports where the refcounting
is kept.
In fact, not only it races, but fundamentally speaking, if we iterate
through the port list linearly, we might end up tearing down the shared
ports even before we delete a DSA user port which has a bridge upper.
So as it turns out, we need to first tear down the user ports (and the
unused ones, for no better place of doing that), then the shared ports
(the CPU and DSA ports). In between, we need to ensure that all work
items scheduled by our switchdev handlers (which only run for user
ports, hence the reason why we tear them down first) have finished.
Fixes: 161ca59d39e9 ("net: dsa: reference count the MDB entries at the cross-chip notifier level") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20210914134726.2305133-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Once blk_throtl_init() queue init failed, blkcg_iolatency_exit() will
not be invoked for cleanup. That leads a memory leak. Swap the
blk_throtl_init() and blk_iolatency_init() calls can solve this.
When we remove the siblings entry, we update ns->head->list, hence we
can't separate the removal and test for being empty. They have to be
in the same critical section to avoid a race.
To avoid breaking the refcounting imbalance again, add a list empty
check to nvme_find_ns_head.
Fixes: 5396fdac56d8 ("nvme: fix refcounting imbalance when all paths are down") Signed-off-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
When the command for querying imp info is issued to the firmware,
if the firmware does not support the command, the returned value
of bd num is 0.
Add protection mechanism before alloc memory to prevent apply for
0-length memory.
Fixes: 0b198b0d80ea ("net: hns3: refactor dump m7 info of debugfs") Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The delay is especially needed by the xRX300 and xRX330 SoCs. Without
this patch, some phys are sometimes not properly detected.
The patch was tested on BT Home Hub 5A and D-Link DWR-966.
Fixes: a09d042b0862 ("net: dsa: lantiq: allow to use all GPHYs on xRX300 and xRX330") Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
When the mdio legacy mapping is used the mii_bus priv registered by DSA
refer to the dsa switch struct instead of the qca8k_priv struct and
causes a kernel panic. Create dedicated function when the internal
dedicated mdio driver is used to properly handle the 2 different
implementation.
Fixes: 759bafb8a322 ("net: dsa: qca8k: add support for internal phy and internal mdio") Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
There are two cases where the current PF does not support RDMA
functionality. The first is if the NVM loaded on the device is set
to not support RDMA (common_caps.rdma is false). The second is if
the kernel bonding driver has included the current PF in an active
link aggregate.
When the driver has determined that this PF does not support RDMA, then
auxiliary devices should not be created on the auxiliary bus. Without
a device on the auxiliary bus, even if the irdma driver is present, there
will be no RDMA activity attempted on this PF.
Currently, in the reset flow, an attempt to create auxiliary devices is
performed without regard to the ability of the PF. There needs to be a
check in ice_aux_plug_dev (as the central point that creates auxiliary
devices) to see if the PF is in a state to support the functionality.
When disabling and re-enabling RDMA due to the inclusion/removal of the PF
in a link aggregate, we also need to set/clear the bit which controls
auxiliary device creation so that a reset recovery in a link aggregate
situation doesn't try to create auxiliary devices when it shouldn't.
Fixes: f9f5301e7e2d ("ice: Register auxiliary device to provide RDMA") Reported-by: Yongxin Liu <yongxin.liu@windriver.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Some profiles of the driver don't support a dedicated PTP-RQ, hence can't
support HW TS and CQE compression simultaneously. When HW TS is enabled
the COE compression is disabled, and should be restored when the HW TS
is turned off. Add rx_filter as an input to modifying CQE compression to
enforce this restriction.
Fixes: 256f79d13c1d ("net/mlx5e: Fix HW TS with CQE compression according to profile") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The problem appears to be that we free 'ring_info->pkt_buffer' twice:
first, when the device is unbound from in-kernel driver (netvsc in this
case) and second from hv_uio_remove(). Normally, ring buffer is supposed
to be re-initialized from hv_uio_open() but this happens when UIO device
is being opened and this is not guaranteed to happen.
Generally, it is OK to call hv_ringbuffer_cleanup() twice for the same
channel (which is being handed over between in-kernel drivers and UIO) even
if we didn't call hv_ringbuffer_init() in between. We, however, need to
avoid kfree() call for an already freed pointer.
Fixes: adae1e931acd ("Drivers: hv: vmbus: Copy packets sent by Hyper-V out of the ring buffer") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Andrea Parri <parri.andrea@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20210831143916.144983-1-vkuznets@redhat.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Use __maybe_unused for noirq_suspend()/noirq_resume() hooks to avoid
build warning with !CONFIG_PM_SLEEP:
>> drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:796:12: error: 'stmmac_pltfr_noirq_resume' defined but not used [-Werror=unused-function]
796 | static int stmmac_pltfr_noirq_resume(struct device *dev)
| ^~~~~~~~~~~~~~~~~~~~~~~~~
>> drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c:775:12: error: 'stmmac_pltfr_noirq_suspend' defined but not used [-Werror=unused-function]
775 | static int stmmac_pltfr_noirq_suspend(struct device *dev)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
Fixes: 276aae377206 ("net: stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, the VF does not clear the interrupt source immediately after
receiving the interrupt. As a result, if the second interrupt task is
triggered when processing the first interrupt task, clearing the
interrupt source before exiting will clear the interrupt sources of the
two tasks at the same time. As a result, no interrupt is triggered for
the second task. The VF detects the missed message only when the next
interrupt is generated.
Clearing it immediately after executing check_evt_cause ensures that:
1. Even if two interrupt tasks are triggered at the same time, they can
be processed.
2. If the second task is triggered during the processing of the first
task and the interrupt source is not cleared, the interrupt is reported
after vector0 is enabled.
Fixes: b90fcc5bd904 ("net: hns3: add reset handling for VF when doing Core/Global/IMP reset") Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The firmware will not disable mac in flr process. Therefore, the driver
needs to proactively disable mac during flr, which is the same as the
function reset.
Fixes: 35d93a30040c ("net: hns3: adjust the process of PF reset") Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, affinity_mask is set to a single cpu. As a result,
irqbalance becomes invalid in SUBSET or EXACT mode. To solve
this problem, change affinity_mask to numa node range. In this
way, irqbalance can be performed on the cpu of the numa node.
Fixes: 0812545487ec ("net: hns3: add interrupt affinity support for misc interrupt") Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The hardware cannot handle short tunnel frames below 65 bytes,
and will cause vlan tag missing problem. So pads packet size to
65 bytes for tunnel frames to fix this bug.
Fixes: 3db084d28dc0("net: hns3: Fix for vxlan tx checksum bug") Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>