When ld detects unaligned relocations, it emits R_PPC64_UADDR64
relocations instead of R_PPC64_RELATIVE. Currently R_PPC64_UADDR64 are
detected by arch/powerpc/tools/relocs_check.sh and expected not to work.
Below is a simple chunk to trigger this behaviour (this disables
optimization for the demonstration purposes only, this also happens with
-O1/-O2 when CONFIG_PRINTK_INDEX=y, for example):
Elf{32,64}_Rela::r_addend is of type: Elf{32,64}_Sword, that means
that our reloc::addend needs to be long or face tuncation issues when
we do elf_rebuild_reloc_section():
Occasionally objtool driven code patching (think .static_call_sites
.retpoline_sites etc..) goes sideways and it tries to patch an
instruction that doesn't match.
Much head-scatching and cursing later the problem is as outlined below
and affects every section that objtool generates for us, very much
including the ORC data. The below uses .static_call_sites because it's
convenient for demonstration purposes, but as mentioned the ORC
sections, .retpoline_sites and __mount_loc are all similarly affected.
Noting that ld preserves the weak function text, but strips the symbol
off of it (hence objdump doing that funny negative offset thing). This
does lead to 'interesting' unused code issues with objtool when ran on
linked objects, but that seems to be working (fingers crossed).
So far so good.. Now lets consider the objtool static_call output
section (readelf output, old binutils):
And now we can see how that foos.o .static_call_sites goes side-ways, we
now have _two_ patch sites in foo. One for the weak symbol at foo+0
(which is no longer a static_call site!) and one at foo+d which is in
fact the right location.
This seems to happen when objtool cannot find a section symbol, in which
case it falls back to any other symbol to key off of, however in this
case that goes terribly wrong!
As such, teach objtool to create a section symbol when there isn't
one.
Fixes: 44f6a7c0755d ("objtool: Fix seg fault with Clang non-section symbols") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lkml.kernel.org/r/20220419203807.655552918@infradead.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
'size' may be used uninitialized in gsm_dlci_modem_output() if called with
an adaption that is neither 1 nor 2. The function is currently only called
by gsm_modem_upd_via_data() and only for adaption 2.
Properly handle every invalid case by returning -EINVAL to silence the
compiler warning and avoid future regressions.
Fixes: c19ffe00fed6 ("tty: n_gsm: fix invalid use of MSC in advanced option") Cc: stable@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Daniel Starke <daniel.starke@siemens.com> Link: https://lore.kernel.org/r/20220425104726.7986-1-daniel.starke@siemens.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
n_gsm is based on the 3GPP 07.010 and its newer version is the 3GPP 27.010.
See https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1516
The changes from 07.010 to 27.010 are non-functional. Therefore, I refer to
the newer 27.010 here. Chapter 5.4.8.1 states that XON/XOFF characters
shall be used instead of Fcon/Fcoff command in advanced option mode to
handle flow control. Chapter 5.4.8.2 describes how XON/XOFF characters
shall be handled. Basic option mode only used Fcon/Fcoff commands and no
XON/XOFF characters. These are treated as data bytes here.
The current implementation uses the gsm_mux field 'constipated' to handle
flow control from the remote peer and the gsm_dlci field 'constipated' to
handle flow control from each DLCI. The later is unrelated to this patch.
The gsm_mux field is correctly set for Fcon/Fcoff commands in
gsm_control_message(). However, the same is not true for XON/XOFF
characters in gsm1_receive().
Disable software flow control handling in the tty to allow explicit
handling by n_gsm.
Add the missing handling in advanced option mode for gsm_mux in
gsm1_receive() to comply with the standard.
This patch depends on the following commit:
Commit 8838b2af23ca ("tty: n_gsm: fix SW flow control encoding/handling")
n_gsm is based on the 3GPP 07.010 and its newer version is the 3GPP 27.010.
See https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1516
The changes from 07.010 to 27.010 are non-functional. Therefore, I refer to
the newer 27.010 here. Chapter 5.4.6.3.7 states that the Modem Status
Command (MSC) shall only be used if the basic option was chosen.
The current implementation uses MSC frames even if advanced option was
chosen to inform the peer about modem line state updates. A standard
conform peer may choose to discard these frames in advanced option mode.
Furthermore, gsmtty_modem_update() is not part of the 'tty_operations'
functions despite its name.
Rename gsmtty_modem_update() to gsm_modem_update() to clarify this. Split
its function into gsm_modem_upd_via_data() and gsm_modem_upd_via_msc()
depending on the encoding and adaption. Introduce gsm_dlci_modem_output()
as adaption of gsm_dlci_data_output() to encode and queue empty frames in
advanced option mode. Use it in gsm_modem_upd_via_data().
gsm_modem_upd_via_msc() is based on the initial gsmtty_modem_update()
function which used only MSC frames to update modem states.
Dynamic virtual tty registration was introduced to allow the user to handle
these cases with uevent rules. The following commits relate to this:
Commit 5b87686e3203 ("tty: n_gsm: Modify gsmtty driver register method when config requester")
Commit 0b91b5332368 ("tty: n_gsm: Save dlci address open status when config requester")
Commit 46292622ad73 ("tty: n_gsm: clean up indenting in gsm_queue()")
However, the following behavior can be seen with this implementation:
- n_gsm ldisc is activated via ioctl
- all configuration parameters are set to their default value (initiator=0)
- the mux gets activated and attached and gsmtty0 is being registered in
in gsm_dlci_open() after DLCI 0 was established (DLCI 0 is the control
channel)
- the user configures n_gsm via ioctl GSMIOC_SETCONF as initiator
- this re-attaches the n_gsm mux
- no new gsmtty devices are registered in gsmld_attach_gsm() because the
mux is already active
- the initiator side registered only the control channel as gsmtty0
(which should never happen) and no user channel tty
The commits above make it impossible to operate the initiator side as no
user channel tty is or will be available.
On the other hand, this behavior will make it also impossible to allow DLCI
parameter negotiation on responder side in the future. The responder side
first needs to provide a device for the application before the application
can set its parameters of the associated DLCI via ioctl.
Note that the user application is still able to detect a link establishment
without relaying to uevent by waiting for DTR open on responder side. This
is the same behavior as on a physical serial interface. And on initiator
side a tty hangup can be detected if a link establishment request failed.
Revert the commits above completely to always register all user channels
and no control channel after mux attachment. No other changes are made.
Currently the peer is not informed about the initial state of the modem
control lines after a new DLCI has been opened.
Fix this by sending the initial modem control line states after DLCI open.
n_gsm is based on the 3GPP 07.010 and its newer version is the 3GPP 27.010.
See https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1516
The changes from 07.010 to 27.010 are non-functional. Therefore, I refer to
the newer 27.010 here. Chapter 5.4.4.2 states that any received unnumbered
acknowledgment (UA) with its poll/final (PF) bit set to 0 shall be
discarded. Currently, all UA frame are handled in the same way regardless
of the PF bit. This does not comply with the standard.
Remove the UA case in gsm_queue() to process only UA frames with PF bit set
to 1 to abide the standard.
gsmtty_write() and gsm_dlci_data_output() properly guard the fifo access.
However, gsm_dlci_close() and gsmtty_flush_buffer() modifies the fifo but
do not guard this.
Add a guard here to prevent race conditions on parallel writes to the fifo.
gsm_control_modem() informs the virtual tty that more data can be written
after receiving a control signal octet via modem status command (MSC).
However, gsm_dlci_data() fails to do the same after receiving a control
signal octet from the convergence layer type 2 header.
Add tty_wakeup() in gsm_dlci_data() for convergence layer type 2 to fix
this.
n_gsm is based on the 3GPP 07.010 and its newer version is the 3GPP 27.010.
See https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1516
The changes from 07.010 to 27.010 are non-functional. Therefore, I refer to
the newer 27.010 here. The value of the modem status command (MSC) frame
contains an address field, control signal and optional break signal octet.
The address field is encoded as described in chapter 5.2.1.2 with only one
octet (may be extended to more in future versions of the standard). Whereas
the control signal and break signal octet are always one byte each. This is
strange at first glance as it makes the EA bit redundant. However, the same
two octets are also encoded as header in convergence layer type 2 as
described in chapter 5.5.2. No header length field is given and the only
way to test if there is an optional break signal octet is via the EA flag
which extends the control signal octet with a break signal octet. Now it
becomes obvious how the EA bit for those two octets shall be encoded in the
MSC frame. The current implementation treats the signal octet different for
MSC frame and convergence layer type 2 header even though the standard
describes it for both in the same way.
Use the EA bit to encode the signal octets not only in the convergence
layer type 2 header but also in the MSC frame in the same way with either
1 or 2 bytes in case of an optional break signal. Adjust the receiving path
accordingly in gsm_control_modem().
Fixes: 3ac06b905655 ("tty: n_gsm: Fix for modems with brk in modem status control") Cc: stable@vger.kernel.org Signed-off-by: Daniel Starke <daniel.starke@siemens.com> Link: https://lore.kernel.org/r/20220414094225.4527-13-daniel.starke@siemens.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
n_gsm is based on the 3GPP 07.010 and its newer version is the 3GPP 27.010.
See https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1516
The changes from 07.010 to 27.010 are non-functional. Therefore, I refer to
the newer 27.010 here. Chapter 5.4.6.1 states that each command frame shall
be made up from type, length and value. Looking for example in chapter
5.4.6.3.5 at the description for the encoding of a flow control on command
it becomes obvious, that the type and length field is always present
whereas the value may be zero bytes long. The current implementation omits
the length field if the value is not present. This is wrong.
Correct this by always sending the length in gsm_control_transmit().
So far only the modem status command (MSC) has included a value and encoded
its length directly. Therefore, also change gsmtty_modem_update().
n_gsm is based on the 3GPP 07.010 and its newer version is the 3GPP 27.010.
See https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1516
The changes from 07.010 to 27.010 are non-functional. Therefore, I refer to
the newer 27.010 here. Chapter 5.7.3 states that the valid range for the
maximum number of retransmissions (N2) is from 0 to 255 (both including).
gsm_config() fails to limit this range correctly. Furthermore,
gsm_control_retransmit() handles this number incorrectly by performing
N2 - 1 retransmission attempts. Setting N2 to zero results in more than 255
retransmission attempts.
Fix the range check in gsm_config() and the value handling in
gsm_control_send() and gsm_control_retransmit() to comply with 3GPP 27.010.
In gsm_cleanup_mux() the muxer is closed down and all queues are removed.
However, removing the queues is done without explicit control of the
underlying buffers. Flush those before freeing up our queues to ensure
that all outgoing queues are cleared consistently. Otherwise, a new mux
connection establishment attempt may time out while the underlying tty is
still busy sending out the remaining data from the previous connection.
The current DLCI release order starts with the control channel followed by
the user channels. Reverse this order to keep the control channel open
until all user channels have been released.
n_gsm is based on the 3GPP 07.010 and its newer version is the 3GPP 27.010.
See https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1516
The changes from 07.010 to 27.010 are non-functional. Therefore, I refer to
the newer 27.010 here. Chapter 5.7.2 states that the maximum frame size
(N1) refers to the length of the information field (i.e. user payload).
However, 'txframe' stores the whole frame including frame header, checksum
and start/end flags. We also need to consider the byte stuffing overhead.
Define constant for the protocol overhead and adjust the 'txframe' size
calculation accordingly to reserve enough space for a complete mux frame
including byte stuffing for advanced option mode. Note that no byte
stuffing is applied to the start and end flag.
Also use MAX_MTU instead of MAX_MRU as this buffer is used for data
transmission.
Check if the incoming interface is available and NFT_BREAK
in case neither skb->sk nor input device are set.
Because nf_sk_lookup_slow*() assume packet headers are in the
'in' direction, use in postrouting is not going to yield a meaningful
result. Same is true for the forward chain, so restrict the use
to prerouting, input and output.
Use in output work if a socket is already attached to the skb.
Fixes: 554ced0a6e29 ("netfilter: nf_tables: add support for native socket matching") Reported-and-tested-by: Topi Miettinen <toiwoton@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The gsm_mux field 'malformed' represents the number of malformed frames
received. However, gsm1_receive() also increases this counter for any out
of frame byte.
Fix this by ignoring out of frame data for the malformed counter.
The frame checksum (FCS) is currently handled in gsm_queue() after
reception of a frame. However, this breaks layering. A workaround with
'received_fcs' was implemented so far.
Furthermore, frames are handled as such even if no end flag was received.
Move FCS calculation from gsm_queue() to gsm0_receive() and gsm1_receive().
Also delay gsm_queue() call there until a full frame was received to fix
both points.
n_gsm is based on the 3GPP 07.010 and its newer version is the 3GPP 27.010.
See https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1516
The changes from 07.010 to 27.010 are non-functional. Therefore, I refer to
the newer 27.010 here. Chapter 5.5.2 describes that the signal octet in
convergence layer type 2 can be either one or two bytes. The length is
encoded in the EA bit. This is set 1 for the last byte in the sequence.
gsmtty_modem_update() handles this correctly but gsm_dlci_data_output()
fails to set EA to 1. There is no case in which we encode two signal octets
as there is no case in which we send out a break signal.
Therefore, always set the EA bit to 1 for the signal octet to fix this.
Internally, we manage the alive state of the mux channels and mux itself
with the field member 'dead'. This makes it possible to notify the user
if the accessed underlying link is already gone. On the other hand,
however, removing the virtual ttys before terminating the channels may
result in peer messages being received without any internal target. Move
the mux cleanup procedure from gsmld_detach_gsm() to gsmld_close() to fix
this by keeping the virtual ttys open until the mux has been cleaned up.
The active mux instances are managed in the gsm_mux array and via mux_get()
and mux_put() functions separately. This gives a very loose coupling
between the actual instance and the gsm_mux array which manages it. It also
results in unnecessary lockings which makes it prone to failures. And it
creates a race condition if more than the maximum number of mux instances
are requested while the user changes the parameters of an active instance.
The user may loose ownership of the current mux instance in this case.
Fix this by moving the gsm_mux array handling to the mux allocation and
deallocation functions.
n_gsm is based on the 3GPP 07.010 and its newer version is the 3GPP 27.010.
See https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=1516
The changes from 07.010 to 27.010 are non-functional. Therefore, I refer to
the newer 27.010 here. Chapter 5.8.2 states that both sides will revert to
the non-multiplexed mode via a close-down message (CLD). The usual program
flow is as following:
- start multiplex mode by sending AT+CMUX to the mobile
- establish the control channel (DLCI 0)
- establish user channels (DLCI >0)
- terminate user channels
- send close-down message (CLD)
- revert to AT protocol (i.e. leave multiplexed mode)
The AT protocol is out of scope of the n_gsm driver. However,
gsm_disconnect() sends CLD if gsm_config() detects that the requested
parameters require the mux protocol to restart. The next immediate action
is to start the mux protocol by opening DLCI 0 again. Any responder side
which handles CLD commands correctly forces us to fail at this point
because AT+CMUX needs to be sent to the mobile to start the mux again.
Therefore, remove the CLD command in this phase and keep both sides in
multiplexed mode.
Remove the gsm_disconnect() function as it become unnecessary and merge the
remaining parts into gsm_cleanup_mux() to handle the termination order and
locking correctly.
Fixes: 71e077915396 ("tty: n_gsm: do not send/receive in ldisc close path") Cc: stable@vger.kernel.org Signed-off-by: Daniel Starke <daniel.starke@siemens.com> Link: https://lore.kernel.org/r/20220414094225.4527-2-daniel.starke@siemens.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, only the initiator resets the mux protocol if the user requests
new parameters that are incompatible to those of the current connection.
The responder also needs to reset the multiplexer if the new parameter set
requires this. Otherwise, we end up with an inconsistent parameter set
between initiator and responder.
Revert the old behavior to inform the peer upon an incompatible parameter
set change from the user on the responder side by re-establishing the mux
protocol in such case.
Now arch-specific functions all do the same thing. When it fixes the
symbol address it needs to check the boundary between the kernel image
and modules. For the last symbol in the previous region, it cannot
know the exact size as it's discarded already. Thus it just uses a
small page size (4096) and rounds it up like the last symbol.
Fixes: 3cf6a32f3f2a4594 ("perf symbols: Fix symbol size calculation condition") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220416004048.1514900-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The symbol fixup is necessary for symbols in kallsyms since they don't
have size info. So we use the next symbol's address to calculate the
size. Now it's also used for user binaries because sometimes they miss
size for hand-written asm functions.
There's a arch-specific function to handle kallsyms differently but
currently it cannot distinguish kallsyms from others. Pass this
information explicitly to handle it properly. Note that those arch
functions will be moved to the generic function so I didn't added it to
the arch-functions.
Fixes: 3cf6a32f3f2a4594 ("perf symbols: Fix symbol size calculation condition") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220416004048.1514900-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The GW71xx, GW72xx and GW73xx boards have USB1 routed to a USB OTG
connectors and USB2 routed to a USB hub.
The OTG connector has a over-currently protection with an active-low
pin and the USB1 to HUB connection has no over-current protection (as
the HUB itself implements this for its downstream ports).
Add proper dt nodes to specify the over-current pin polarity for USB1
and disable over-current protection for USB2.
Fixes: 6f30b27c5ef5 ("arm64: dts: imx8mm: Add Gateworks i.MX 8M Mini Development Kits") Cc: stable@vger.kernel.org Signed-off-by: Tim Harvey <tharvey@gateworks.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Flexcom3 is used as board console serial. There are no pull-ups on these
lines on the board. This means that if a cable is not connected (that has
pull-ups included), stray characters could appear on the console as the
floating pins voltage levels are interpreted as incoming characters.
To avoid this problem, enable the internal pull-ups on these lines.
That assertion is relatively new, introduced with commit d04fbe19aefd2
("btrfs: scrub: cleanup the argument list of scrub_chunk()").
The block group we get at scrub_enumerate_chunks() can actually have a
start address that is smaller then the chunk offset we extracted from a
device extent item we got from the commit root of the device tree.
This is very rare, but it can happen due to a race with block group
removal and allocation. For example, the following steps show how this
can happen:
1) We are at transaction T, and we have the following blocks groups,
sorted by their logical start address:
--> logical address space hole of 256M,
there used to be a 256M metadata block group here
[ bg Y, start address Y, length 256M (metadata) ]
--> Y matches W's end offset + 256M
Block group Y is the block group with the highest logical address in
the whole filesystem;
2) Block group Y is deleted and its extent mapping is removed by the call
to remove_extent_mapping() made from btrfs_remove_block_group().
So after this point, the last element of the mapping red black tree,
its rightmost node, is the mapping for block group W;
3) While still at transaction T, a new data block group is allocated,
with a length of 1G. When creating the block group we do a call to
find_next_chunk(), which returns the logical start address for the
new block group. This calls returns X, which corresponds to the
end offset of the last block group, the rightmost node in the mapping
red black tree (fs_info->mapping_tree), plus one.
So we get a new block group that starts at logical address X and with
a length of 1G. It spans over the whole logical range of the old block
group Y, that was previously removed in the same transaction.
However the device extent allocated to block group X is not the same
device extent that was used by block group Y, and it also does not
overlap that extent, which must be always the case because we allocate
extents by searching through the commit root of the device tree
(otherwise it could corrupt a filesystem after a power failure or
an unclean shutdown in general), so the extent allocator is behaving
as expected;
4) We have a task running scrub, currently at scrub_enumerate_chunks().
There it searches for device extent items in the device tree, using
its commit root. It finds a device extent item that was used by
block group Y, and it extracts the value Y from that item into the
local variable 'chunk_offset', using btrfs_dev_extent_chunk_offset();
It then calls btrfs_lookup_block_group() to find block group for
the logical address Y - since there's currently no block group that
starts at that logical address, it returns block group X, because
its range contains Y.
This results in triggering the assertion:
ASSERT(cache->start == chunk_offset);
right before calling scrub_chunk(), as cache->start is X and
chunk_offset is Y.
This is more likely to happen of filesystems not larger than 50G, because
for these filesystems we use a 256M size for metadata block groups and
a 1G size for data block groups, while for filesystems larger than 50G,
we use a 1G size for both data and metadata block groups (except for
zoned filesystems). It could also happen on any filesystem size due to
the fact that system block groups are always smaller (32M) than both
data and metadata block groups, but these are not frequently deleted, so
much less likely to trigger the race.
So make scrub skip any block group with a start offset that is less than
the value we expect, as that means it's a new block group that was created
in the current transaction. It's pointless to continue and try to scrub
its extents, because scrub searches for extents using the commit root, so
it won't find any. For a device replace, skip it as well for the same
reasons, and we don't need to worry about the possibility of extents of
the new block group not being to the new device, because we have the write
duplication setup done through btrfs_map_block().
Fixes: d04fbe19aefd ("btrfs: scrub: cleanup the argument list of scrub_chunk()") CC: stable@vger.kernel.org # 5.17 Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, we use btrfs_inode_{lock,unlock}() to grant an exclusive
writeback of the relocation data inode in
btrfs_zoned_data_reloc_{lock,unlock}(). However, that can cause a deadlock
in the following path.
Thread A takes btrfs_inode_lock() and waits for metadata reservation by
e.g, waiting for writeback:
The deadlock is caused by relying on the vfs_inode's lock. By using it, we
introduced unnecessary exclusion of writeback and
btrfs_prealloc_file_range(). Also, the lock at this point is useless as we
don't have any dirty pages in the inode yet.
Introduce fs_info->zoned_data_reloc_io_lock and use it for the exclusive
writeback.
Fixes: 35156d852762 ("btrfs: zoned: only allow one process to add pages to a relocation inode") CC: stable@vger.kernel.org # 5.16.x: 869f4cdc73f9: btrfs: zoned: encapsulate inode locking for zoned relocation CC: stable@vger.kernel.org # 5.16.x CC: stable@vger.kernel.org # 5.17 Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On a zoned filesystem, if we fail to allocate the root node for the log
root tree while syncing the log, we end up returning without finishing
the IO plug we started before, resulting in leaking resources as we
have started writeback for extent buffers of a log tree before. That
allocation failure, which typically is either -ENOMEM or -ENOSPC, is not
fatal and the fsync can safely fallback to a full transaction commit.
So release the IO plug if we fail to allocate the extent buffer for the
root of the log root tree when syncing the log on a zoned filesystem.
Fixes: 3ddebf27fcd3a9 ("btrfs: zoned: reorder log node allocation on zoned filesystem") CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a bio is split in btrfs_submit_direct, dip->file_offset contains
the file offset for the first bio. But this means the start value used
in btrfs_end_dio_bio to record the write location for zone devices is
incorrect for subsequent bios.
When a bio is split in btrfs_submit_direct, dip->file_offset contains
the file offset for the first bio. But this means the start value used
in btrfs_check_read_dio_bio is incorrect for subsequent bios. Add
a file_offset field to struct btrfs_bio to pass along the correct offset.
Given that check_data_csum only uses start of an error message this
means problems with this miscalculation will only show up when I/O fails
or checksums mismatch.
The logic was removed in f4f39fc5dc30 ("btrfs: remove btrfs_bio::logical
member") but we need it due to the bio splitting.
Control Flow Integrity (CFI) instrumentation of the kernel noticed that
the caller, dev_attr_show(), and the callback, odvp_show(), did not have
matching function prototypes, which would cause a CFI exception to be
raised. Correct the prototype by using struct device_attribute instead
of struct kobj_attribute.
The "safe state" index is used by acpi_idle_enter_bm() to avoid
entering a C-state that may require bus mastering to be disabled
on entry in the cases when this is not going to happen. For this
reason, it should not be set to point to C3 type of C-states, because
they may require bus mastering to be disabled on entry in principle.
This was broken by commit d6b88ce2eb9d ("ACPI: processor idle: Allow
playing dead in C3 state") which inadvertently allowed the "safe
state" index to point to C3 type of C-states.
This results in a machine that won't boot past the point when it first
enters C3. Restore the correct behaviour (either demote to C1/C2, or
use C3 but also set ARB_DIS=1).
I hit this on a Fujitsu Siemens Lifebook S6010 (P3) machine.
Fixes: d6b88ce2eb9d ("ACPI: processor idle: Allow playing dead in C3 state") Cc: 5.16+ <stable@vger.kernel.org> # 5.16+ Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Woody Suwalski <wsuwalski@gmail.com>
[ rjw: Subject and changelog adjustments ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
I made a mistake with the commit a6aaa0032424 ("net: ethernet: stmmac:
fix altr_tse_pcs function when using a fixed-link"). I should have
tested against both scenario of having a SGMII interface and one
without.
Without the SGMII PCS TSE adpater, the sgmii_adapter_base address is
NULL, thus a write to this address will fail.
Cc: stable@vger.kernel.org Fixes: a6aaa0032424 ("net: ethernet: stmmac: fix altr_tse_pcs function when using a fixed-link") Signed-off-by: Dinh Nguyen <dinguyen@kernel.org> Link: https://lore.kernel.org/r/20220420152345.27415-1-dinguyen@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We have now seen panel (XMG Core 15 e21 laptop) advertizing support
for Intel proprietary eDP backlight control via DPCD registers, but
actually working only with legacy pwm control.
This patch adds panel EDID check for possible HDR static metadata and
Intel proprietary eDP backlight control is used only if that exists.
Missing HDR static metadata is ignored if user specifically asks for
Intel proprietary eDP backlight control via enable_dpcd_backlight
parameter.
v2 :
- Ignore missing HDR static metadata if Intel proprietary eDP
backlight control is forced via i915.enable_dpcd_backlight
- Printout info message if panel is missing HDR static metadata and
support for Intel proprietary eDP backlight control is detected
Fixes: 4a8d79901d5b ("drm/i915/dp: Enable Intel's HDR backlight interface (only SDR for now)") Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5284 Cc: Lyude Paul <lyude@redhat.com> Cc: Mika Kahola <mika.kahola@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Filippo Falezza <filippo.falezza@outlook.it> Cc: stable@vger.kernel.org Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220413082826.120634-1-jouni.hogander@intel.com Reviewed-by: Lyude Paul <lyude@redhat.com>
(cherry picked from commit b4b157577cb1de13bee8bebc3576f1de6799a921) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We normally runtime suspend when there are displays attached if they
are in the DPMS off state, however, if something wakes the GPU
we send a hotplug event on resume (in case any displays were connected
while the GPU was in suspend) which can cause userspace to light
up the displays again soon after they were turned off.
Prior to
commit 087451f372bf76 ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's."),
the driver took a runtime pm reference when the fbdev emulation was
enabled because we didn't implement proper shadowing support for
vram access when the device was off so the device never runtime
suspended when there was a console bound. Once that commit landed,
we now utilize the core fb helper implementation which properly
handles the emulation, so runtime pm now suspends in cases where it did
not before. Ultimately, we need to sort out why runtime suspend in not
working in this case for some users, but this should restore similar
behavior to before.
v2: move check into runtime_suspend
v3: wake ups -> wakeups in comment, retain pm_runtime behavior in
runtime_idle callback
Fixes: 087451f372bf76 ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.") Link: https://lore.kernel.org/r/20220403132322.51c90903@darkstar.example.org/ Tested-by: Michele Ballabio <ballabio.m@gmail.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The commit referenced below fixed packet re-routing if Netfilter mangles
a routing key property of a packet and the packet is routed in a VRF L3
domain. The fix, however, addressed IPv4 re-routing, only.
This commit applies the same behavior for IPv6. While at it, untangle
the nested ternary operator to make the code more readable.
Fixes: 6d8b49c3a3a3 ("netfilter: Update ip_route_me_harder to consider L3 domain") Cc: stable@vger.kernel.org Signed-off-by: Martin Willi <martin@strongswan.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes a memory corruption that occurred in the
nand_scan() path for Hynix nand device.
On boot, for Hynix nand device will panic at a weird place:
| Unable to handle kernel NULL pointer dereference at virtual
address 00000070
| [00000070] *pgd=00000000
| Internal error: Oops: 5 [#1] PREEMPT SMP ARM
| Modules linked in:
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.17.0-01473-g13ae1769cfb0
#38
| Hardware name: Generic DT based system
| PC is at nandc_set_reg+0x8/0x1c
| LR is at qcom_nandc_command+0x20c/0x5d0
| pc : [<c088b74c>] lr : [<c088d9c8>] psr: 00000113
| sp : c14adc50 ip : c14ee208 fp : c0cc970c
| r10: 000000a3 r9 : 00000000 r8 : 00000040
| r7 : c16f6a00 r6 : 00000090 r5 : 00000004 r4 :c14ee040
| r3 : 00000000 r2 : 0000000b r1 : 00000000 r0 :c14ee040
| Flags: nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
| Control: 10c5387d Table: 8020406a DAC: 00000051
| Register r0 information: slab kmalloc-2k start c14ee000 pointer offset
64 size 2048
| Process swapper/0 (pid: 1, stack limit = 0x(ptrval))
| nandc_set_reg from qcom_nandc_command+0x20c/0x5d0
| qcom_nandc_command from nand_readid_op+0x198/0x1e8
| nand_readid_op from hynix_nand_has_valid_jedecid+0x30/0x78
| hynix_nand_has_valid_jedecid from hynix_nand_init+0xb8/0x454
| hynix_nand_init from nand_scan_with_ids+0xa30/0x14a8
| nand_scan_with_ids from qcom_nandc_probe+0x648/0x7b0
| qcom_nandc_probe from platform_probe+0x58/0xac
The problem is that the nand_scan()'s qcom_nand_attach_chip callback
is updating the nandc->max_cwperpage from 1 to 4 or 8 based on page size.
This causes the sg_init_table of clear_bam_transaction() in the driver's
qcom_nandc_command() to memset much more than what was initially
allocated by alloc_bam_transaction().
This patch will update nandc->max_cwperpage 1 to 4 or 8 based on page
size in qcom_nand_attach_chip call back after freeing the previously
allocated memory for bam txn as per nandc->max_cwperpage = 1 and then
again allocating bam txn as per nandc->max_cwperpage = 4 or 8 based on
page size in qcom_nand_attach_chip call back itself.
Cc: stable@vger.kernel.org Fixes: 6a3cec64f18c ("mtd: rawnand: qcom: convert driver to nand_scan()") Reported-by: Konrad Dybcio <konrad.dybcio@somainline.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Co-developed-by: Sricharan R <quic_srichara@quicinc.com> Signed-off-by: Sricharan R <quic_srichara@quicinc.com> Signed-off-by: Md Sadre Alam <quic_mdalam@quicinc.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/1650268107-5363-1-git-send-email-quic_mdalam@quicinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kasan_quarantine_remove_cache() is called in kmem_cache_shrink()/
destroy(). The kasan_quarantine_remove_cache() call is protected by
cpuslock in kmem_cache_destroy() to ensure serialization with
kasan_cpu_offline().
However the kasan_quarantine_remove_cache() call is not protected by
cpuslock in kmem_cache_shrink(). When a CPU is going offline and cache
shrink occurs at same time, the cpu_quarantine may be corrupted by
interrupt (per_cpu_remove_cache operation).
So add a cpu_quarantine offline flags check in per_cpu_remove_cache().
[akpm@linux-foundation.org: add comment, per Zqiang]
Ensure that the i_flags field of struct zonefs_inode_info is cleared to
0 when initializing a zone file inode, avoiding seeing the flag
ZONEFS_ZONE_OPEN being incorrectly set.
Fixes: b5c00e975779 ("zonefs: open/close zone on file open/close") Cc: <stable@vger.kernel.org> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The mount option "explicit_open" manages the device open zone
resources to ensure that if an application opens a sequential file for
writing, the file zone can always be written by explicitly opening
the zone and accounting for that state with the s_open_zones counter.
However, if some zones are already open when mounting, the device open
zone resource usage status will be larger than the initial s_open_zones
value of 0. Ensure that this inconsistency does not happen by closing
any sequential zone that is open when mounting.
Furthermore, with ZNS drives, closing an explicitly open zone that has
not been written will change the zone state to "closed", that is, the
zone will remain in an active state. Since this can then cause failures
of explicit open operations on other zones if the drive active zone
resources are exceeded, we need to make sure that the zone is not
active anymore by resetting it instead of closing it. To address this,
zonefs_zone_mgmt() is modified to change a REQ_OP_ZONE_CLOSE request
into a REQ_OP_ZONE_RESET for sequential zones that have not been
written.
Fixes: b5c00e975779 ("zonefs: open/close zone on file open/close") Cc: <stable@vger.kernel.org> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* bio_issue_time() is stored in bio->bi_issue truncated to 51 bits. This
overflows in slightly over 26 days. Setting rq->io_start_time_ns with it
means that io duration calculation would yield >26days after 26 days of
uptime. This, for example, confuses kyber making it cause high IO
latencies.
* rq->io_start_time_ns should record the time that the IO is issued to the
device so that on-device latency can be measured. However,
bio_issue_time() is set before the bio goes through the rq-qos controllers
(wbt, iolatency, iocost), so when the bio gets throttled in any of the
mechanisms, the measured latencies make no sense - on-device latencies end
up higher than request-alloc-to-completion latencies.
We'll need a smarter way to avoid calling ktime_get_ns() repeatedly
back-to-back. For now, let's revert the commit.
People are occasionally reporting a warning bfqq_request_over_limit()
triggering reporting that BFQ's idea of cgroup hierarchy (and its depth)
does not match what generic blkcg code thinks. This can actually happen
when bfqq gets moved between BFQ groups while bfqq_request_over_limit()
is running. Make sure the code is safe against BFQ queue being moved to
a different BFQ group.
Because mremap does not have a MAP_FIXED_NOREPLACE flag, it can destroy
existing mappings. This causes a segfault when regions such as text are
remapped and the permissions are changed.
Verify the requested mremap destination address does not overlap any
existing mappings by using mmap's MAP_FIXED_NOREPLACE flag. Keep
incrementing the destination address until a valid mapping is found or
fail the current test once the max address is reached.
Avoid calling mmap with requested addresses that are less than the
system's mmap_min_addr. When run as root, mmap returns EACCES when
trying to map addresses < mmap_min_addr. This is not one of the error
codes for the condition to retry the mmap in the test.
Rather than arbitrarily retrying on EACCES, don't attempt an mmap until
addr > vm.mmap_min_addr.
Add a munmap call after an alignment check as the mappings are retained
after the retry and can reach the vm.max_map_count sysctl.
The "read_bhrb" global symbol is only called under CONFIG_PPC64 of
arch/powerpc/perf/core-book3s.c but it is compiled for both 32 and 64 bit
anyway (and LLVM fails to link this on 32bit).
We hold rrpriv->lock in position (1) of thread 1 and
use del_timer_sync() to wait timer to stop, but timer handler
also need rrpriv->lock in position (2) of thread 2.
As a result, rr_close() will block forever.
This patch extracts del_timer_sync() from the protection of
spin_lock_irqsave(), which could let timer handler to obtain
the needed lock.
because the copychunk_write might cover a region of the file that has not yet
been sent to the server and thus fail.
A simple way to reproduce this is:
truncate -s 0 /mnt/testfile; strace -f -o x -ttT xfs_io -i -f -c 'pwrite 0k 128k' -c 'fcollapse 16k 24k' /mnt/testfile
the issue is that the 'pwrite 0k 128k' becomes rearranged on the wire with
the 'fcollapse 16k 24k' due to write-back caching.
fcollapse is implemented in cifs.ko as a SMB2 IOCTL(COPYCHUNK_WRITE) call
and it will fail serverside since the file is still 0b in size serverside
until the writes have been destaged.
To avoid this we must ensure that we destage any unwritten data to the
server before calling COPYCHUNK_WRITE.
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1997373 Reported-by: Xiaoli Feng <xifeng@redhat.com> Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The first "if" condition in __memcpy_flushcache is supposed to align the
"dest" variable to 8 bytes and copy data up to this alignment. However,
this condition may misbehave if "size" is greater than 4GiB.
The statement min_t(unsigned, size, ALIGN(dest, 8) - dest); casts both
arguments to unsigned int and selects the smaller one. However, the
cast truncates high bits in "size" and it results in misbehavior.
For example:
suppose that size == 0x100000001, dest == 0x200000002
min_t(unsigned, size, ALIGN(dest, 8) - dest) == min_t(0x1, 0xe) == 0x1;
...
dest += 0x1;
so we copy just one byte "and" dest remains unaligned.
This patch fixes the bug by replacing unsigned with size_t.
The root cause is the race as follows:
Thread #1 Thread #2(irq ctx)
z_erofs_runqueue()
struct z_erofs_decompressqueue io_A[];
submit bio A
z_erofs_decompress_kickoff(,,1)
z_erofs_decompressqueue_endio(bio A)
z_erofs_decompress_kickoff(,,-1)
spin_lock_irqsave()
atomic_add_return()
io_wait_event() -> pending_bios is already 0
[end of function]
wake_up_locked(io_A[]) // crash
Referenced backtrace in kernel 5.4:
[ 10.129422] Unable to handle kernel paging request at virtual address eb0454a4
[ 10.364157] CPU: 0 PID: 709 Comm: getprop Tainted: G WC O 5.4.147-ab09225 #1
[ 11.556325] [<c01b33b8>] (__wake_up_common) from [<c01b3300>] (__wake_up_locked+0x40/0x48)
[ 11.565487] [<c01b3300>] (__wake_up_locked) from [<c044c8d0>] (z_erofs_vle_unzip_kickoff+0x6c/0xc0)
[ 11.575438] [<c044c8d0>] (z_erofs_vle_unzip_kickoff) from [<c044c854>] (z_erofs_vle_read_endio+0x16c/0x17c)
[ 11.586082] [<c044c854>] (z_erofs_vle_read_endio) from [<c06a80e8>] (clone_endio+0xb4/0x1d0)
[ 11.595428] [<c06a80e8>] (clone_endio) from [<c04a1280>] (blk_update_request+0x150/0x4dc)
[ 11.604516] [<c04a1280>] (blk_update_request) from [<c06dea28>] (mmc_blk_cqe_complete_rq+0x144/0x15c)
[ 11.614640] [<c06dea28>] (mmc_blk_cqe_complete_rq) from [<c04a5d90>] (blk_done_softirq+0xb0/0xcc)
[ 11.624419] [<c04a5d90>] (blk_done_softirq) from [<c010242c>] (__do_softirq+0x184/0x56c)
[ 11.633419] [<c010242c>] (__do_softirq) from [<c01051e8>] (irq_exit+0xd4/0x138)
[ 11.641640] [<c01051e8>] (irq_exit) from [<c010c314>] (__handle_domain_irq+0x94/0xd0)
[ 11.650381] [<c010c314>] (__handle_domain_irq) from [<c04fde70>] (gic_handle_irq+0x50/0xd4)
[ 11.659641] [<c04fde70>] (gic_handle_irq) from [<c0101b70>] (__irq_svc+0x70/0xb0)
Currently ksmbd is using ->f_bsize from vfs_statfs() as sector size.
If fat/exfat is a local share, ->f_bsize is a cluster size that is too
large to be used as a sector size. Sector sizes larger than 4K cause
problem occurs when mounting an iso file through windows client.
The error message can be obtained using Mount-DiskImage command,
the error is:
"Mount-DiskImage : The sector size of the physical disk on which the
virtual disk resides is not supported."
This patch reports fixed 4KB sector size if ->s_blocksize is bigger
than 4KB.
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
We hold timer_lock in position (1) of thread 1 and
use del_timer_sync() to wait timer to stop, but timer handler
also need timer_lock in position (2) of thread 2.
As a result, rs_close() will block forever.
This patch deletes the redundant timer_lock in order to
prevent the deadlock. Because there is no race condition
between rs_close, rs_open and rs_poll.
This patch takes a defensive programming and paranoid approach in case
the parent device (SoundWire) is pm_runtime resumed but the rt711
device is not. In that case, during the attachment and initialization,
a jack detection workqueue can be scheduled. Since the pm_runtime
suspend routines will not be invoked, the sequence to cancel all
deferred work is not executed, and the jack detection could happen
after the bus stops operating, leading to a timeout.
This patch applies the same solution to rt5682, based on the
similarities between codec drivers. The race condition with rt5682 was
not detected experimentally though.
We enabled UBSAN in the ubuntu kernel, and the cs35l41 driver triggers
a warning calltrace like below:
cs35l41-hda i2c-CSC3551:00-cs35l41-hda.0: bitoffset= 8, word_offset=23, bit_sum mod 32=0, otp_map[i].size = 24
cs35l41-hda i2c-CSC3551:00-cs35l41-hda.0: bitoffset= 0, word_offset=24, bit_sum mod 32=24, otp_map[i].size = 0
================================================================================
UBSAN: shift-out-of-bounds in linux-kernel-src/sound/soc/codecs/cs35l41-lib.c:836:8
shift exponent 64 is too large for 64-bit type 'long unsigned int'
CPU: 10 PID: 595 Comm: systemd-udevd Not tainted 5.15.0-23-generic #23
Hardware name: LENOVO \x02MFG_IN_GO/\x02MFG_IN_GO, BIOS N3GET19W (1.00 ) 03/11/2022
Call Trace:
<TASK>
show_stack+0x52/0x58
dump_stack_lvl+0x4a/0x5f
dump_stack+0x10/0x12
ubsan_epilogue+0x9/0x45
__ubsan_handle_shift_out_of_bounds.cold+0x61/0xef
? regmap_unlock_mutex+0xe/0x10
cs35l41_otp_unpack.cold+0x1c6/0x2b2 [snd_soc_cs35l41_lib]
cs35l41_hda_probe+0x24f/0x33a [snd_hda_scodec_cs35l41]
cs35l41_hda_i2c_probe+0x65/0x90 [snd_hda_scodec_cs35l41_i2c]
When both bitoffset and otp_map[i].size are 0, the line 836 will
result in GENMASK(-1, 0), this triggers the shift-out-of-bounds
calltrace.
Here add a checking, if both bitoffset and otp_map[i].size are 0,
do not run GENMASK() and directly set otp_val to 0, this will not
bring any function change on the driver but could avoid the calltrace.
At the kzalloc() call in dpcm_be_connect(), there is no spin lock involved.
It's merely protected by card->pcm_mutex, instead. The spinlock is applied
at the later call with snd_soc_pcm_stream_lock_irq() only for the list
manipulations. (See it's *_irq(), not *_irqsave(); that means the context
being sleepable at that point.) So, we can use GFP_KERNEL safely there.
This patch revert commit d8a9c6e1f676 ("ASoC: soc-pcm: use GFP_ATOMIC for
dpcm structure") which is no longer needed since commit b7898396f4bb
("ASoC: soc-pcm: Fix and cleanup DPCM locking").
// Send 2 data segments
+0 write(4, ..., 2000) = 2000
+0 > P. 1:2001(2000) ack 1
// TLP
+.022 > P. 1001:2001(1000) ack 1
// Continue to send 8 data segments
+0 write(4, ..., 10000) = 10000
+0 > P. 2001:10001(8000) ack 1
// RTO
+.188 > . 1:1001(1000) ack 1
// The original data is acked and new data is sent(F-RTO step 2.b)
+0 < . 1:1(0) ack 2001 win 257
+0 > P. 10001:12001(2000) ack 1
// D-SACK caused by TLP is regarded as a dupack, this results in
// the incorrect judgment of "loss was real"(F-RTO step 3.a)
+.022 < . 1:1(0) ack 2001 win 257 <sack 1001:2001,nop,nop>
// Never-retransmitted data(3001:4001) are acked and
// expect to switch to open state(F-RTO step 3.b)
+0 < . 1:1(0) ack 4001 win 257
+0 %{ assert tcpi_ca_state == 0, tcpi_ca_state }%
When client requests channel or ring size larger than what the server
can support the server will cap the request to the supported max. So,
the client would not be able to successfully request resources that
exceed the server limit.
Fixes: 723ad9161347 ("ibmvnic: Add ethtool private flag for driver-defined queue limits") Signed-off-by: Dany Madden <drt@linux.ibm.com> Link: https://lore.kernel.org/r/20220427235146.23189-1-drt@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The Time-Specified Departure feature is indeed mutually exclusive with
TX IP checksumming in ENETC, but TX checksumming in itself is broken and
was removed from this driver in commit 82728b91f124 ("enetc: Remove Tx
checksumming offload code").
The blamed commit declared NETIF_F_HW_CSUM in dev->features to comply
with software TSO's expectations, and still did the checksumming in
software by calling skb_checksum_help(). So there isn't any restriction
for the Time-Specified Departure feature.
However, enetc_setup_tc_txtime() doesn't understand that, and blindly
looks for NETIF_F_CSUM_MASK.
Instead of checking for things which can literally never happen in the
current code base, just remove the check and let the driver offload
tc-etf qdiscs.
Fixes: acede3c5dad5 ("net: enetc: declare NETIF_F_HW_CSUM and do it in software") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220427203017.1291634-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
This patch corrects a bug whereby synthesized events from SPE
samples are missing virtual addresses.
Fixes: 54f7815efef7fad9 ("perf arm-spe: Fill address info for samples") Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: bpf@vger.kernel.org Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: linux-arm-kernel@lists.infradead.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: netdev@vger.kernel.org Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220421165205.117662-2-timothy.hayes@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 00bfe02f4796 ("gfs2: Fix mmap + page fault deadlocks for buffered
I/O") changed gfs2_file_read_iter() and gfs2_file_buffered_write() to
allow dropping the inode glock while faulting in user buffers. When the
lock was dropped, a short result was returned to indicate that the
operation was interrupted.
As pointed out by Linus (see the link below), this behavior is broken
and the operations should always re-acquire the inode glock and resume
the operation instead.
When direct writes fail with -ENOTBLK because we're writing into a
hole (gfs2_iomap_begin()) or because of a page invalidation failure
(iomap_dio_rw()), we're falling back to buffered writes. In that case,
when we lose the inode glock in gfs2_file_buffered_write(), we want to
re-acquire it instead of returning a short write.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
While handling PCI errors (AER flow) driver tries to
disable NAPI [napi_disable()] after NAPI is deleted
[__netif_napi_del()] which causes unexpected system
hang/crash.
EEH calls into bnx2x twice based on the system log above, first through
bnx2x_io_error_detected() and then bnx2x_io_slot_reset(), and executes
the following call chains:
Calling tls_append_frag when max_open_record_len == record->len might
add an empty fragment to the TLS record if the call happens to be on the
page boundary. Normally tls_append_frag coalesces the zero-sized
fragment to the previous one, but not if it's on page boundary.
If a resync happens then, the mlx5 driver posts dump WQEs in
tx_post_resync_dump, and the empty fragment may become a data segment
with byte_count == 0, which will confuse the NIC and lead to a CQE
error.
This commit fixes the described issue by skipping tls_append_frag on
zero size to avoid adding empty fragments. The fix is not in the driver,
because an empty fragment is hardly the desired behavior.
Before this commit fan_curve_check_present() was trying to not cause
the probe to fail on devices without fan curve control by testing for
known error codes returned by asus_wmi_evaluate_method_buf().
Checking for ENODATA or ENODEV, with the latter being returned by this
function when an ACPI integer with a value of ASUS_WMI_UNSUPPORTED_METHOD
is returned. But for other ACPI integer returns this function just returns
them as is, including the ASUS_WMI_DSTS_UNKNOWN_BIT value of 2.
On the Asus U36SD ASUS_WMI_DSTS_UNKNOWN_BIT gets returned, leading to:
asus-nb-wmi: probe of asus-nb-wmi failed with error 2
Instead of playing whack a mole with error codes here, simply treat all
errors as there not being any fan curves, fixing the driver no longer
loading on the Asus U36SD laptop.
This code tests for if the obj->buffer.length is larger than the buffer
but then it just does the memcpy() anyway.
Fixes: 0f0ac158d28f ("platform/x86: asus-wmi: Add support for custom fan curves") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20220413073744.GB8812@kili Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
`nf_flowtable_udp_timeout` sysctl option is available only
if CONFIG_NFT_FLOW_OFFLOAD enabled. But infra for this flow
offload UDP timeout was added under CONFIG_NF_FLOW_TABLE
config option. So, if you have CONFIG_NFT_FLOW_OFFLOAD
disabled and CONFIG_NF_FLOW_TABLE enabled, the
`nf_flowtable_udp_timeout` is not present in sysfs.
Please note, that TCP flow offload timeout sysctl option
is present even CONFIG_NFT_FLOW_OFFLOAD is disabled.
I suppose it was a typo in commit that adds UDP flow offload
timeout and CONFIG_NF_FLOW_TABLE should be used instead.
Jaco Kroon reported tcp problems that Eric Dumazet and Neal Cardwell
pinpointed to nf_conntrack tcp_in_window() bug.
tcp trace shows following sequence:
I > R Flags [S], seq 3451342529, win 62580, options [.. tfo [|tcp]>
R > I Flags [S.], seq 2699962254, ack 3451342530, win 65535, options [..]
R > I Flags [P.], seq 1:89, ack 1, [..]
Note 3rd ACK is from responder to initiator so following branch is taken:
} else if (((state->state == TCP_CONNTRACK_SYN_SENT
&& dir == IP_CT_DIR_ORIGINAL)
|| (state->state == TCP_CONNTRACK_SYN_RECV
&& dir == IP_CT_DIR_REPLY))
&& after(end, sender->td_end)) {
... because state == TCP_CONNTRACK_SYN_RECV and dir is REPLY.
This causes the scaling factor to be reset to 0: window scale option
is only present in syn(ack) packets. This in turn makes nf_conntrack
mark valid packets as out-of-window.
This was always broken, it exists even in original commit where
window tracking was added to ip_conntrack (nf_conntrack predecessor)
in 2.6.9-rc1 kernel.
Restrict to 'tcph->syn', just like the 3rd condtional added in
commit 82b72cb94666 ("netfilter: conntrack: re-init state for retransmitted syn-ack").
Upon closer look, those conditionals/branches can be merged:
Because earlier checks prevent syn-ack from showing up in
original direction, the 'dir' checks in the conditional quoted above are
redundant, remove them. Return early for pure syn retransmitted in reply
direction (simultaneous open).
Commit 4b5923249b8fa4 ("net: dsa: lantiq_gswip: Configure all remaining
GSWIP_MII_CFG bits") added all known bits in the GSWIP_MII_CFGp
register. It helped bring this register into a well-defined state so the
driver has to rely less on the bootloader to do things right.
Unfortunately it also sets the GSWIP_MII_CFG_RMII_CLK bit without any
possibility to configure it. Upon further testing it turns out that all
boards which are supported by the GSWIP driver in OpenWrt which use an
RMII PHY have a dedicated oscillator on the board which provides the
50MHz RMII reference clock.
Don't set the GSWIP_MII_CFG_RMII_CLK bit (but keep the code which always
clears it) to fix support for the Fritz!Box 7362 SL in OpenWrt. This is
a board with two Atheros AR8030 RMII PHYs. With the "RMII clock" bit set
the MAC also generates the RMII reference clock whose signal then
conflicts with the signal from the oscillator on the board. This results
in a constant cycle of the PHY detecting link up/down (and as a result
of that: the two ports using the AR8030 PHYs are not working).
At the time of writing this patch there's no known board where the MAC
(GSWIP) has to generate the RMII reference clock. If needed this can be
implemented in future by providing a device-tree flag so the
GSWIP_MII_CFG_RMII_CLK bit can be toggled per port.
Fixes: 4b5923249b8fa4 ("net: dsa: lantiq_gswip: Configure all remaining GSWIP_MII_CFG bits") Tested-by: Jan Hoffmann <jan@3e8.eu> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Link: https://lore.kernel.org/r/20220425152027.2220750-1-martin.blumenstingl@googlemail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit d5ebaa7c5f6f6 introduces checks for handle range
(e.g HCI_CONN_HANDLE_MAX) but controllers like Intel AX200 don't seem
to respect the valid range int case of error status:
Because of it is impossible to cleanup the connections properly since
the stack would attempt to cancel the connection which is no longer in
progress causing the following trace:
We need to wait 5 s for EMP reset after firmware flash. Code was extracted
from OOT driver (ice v1.8.3 downloaded from sourceforge). Without this
wait, fw_activate let card in inconsistent state and recoverable only
by second flash/activate. Flash was tested on these fw's:
From -> To
3.00 -> 3.10/3.20
3.10 -> 3.00/3.20
3.20 -> 3.00/3.10
Reproducer:
[root@host ~]# devlink dev flash pci/0000:ca:00.0 file E810_XXVDA4_FH_O_SEC_FW_1p6p1p9_NVM_3p10_PLDMoMCTP_0.11_8000AD7B.bin
Preparing to flash
[fw.mgmt] Erasing
[fw.mgmt] Erasing done
[fw.mgmt] Flashing 100%
[fw.mgmt] Flashing done 100%
[fw.undi] Erasing
[fw.undi] Erasing done
[fw.undi] Flashing 100%
[fw.undi] Flashing done 100%
[fw.netlist] Erasing
[fw.netlist] Erasing done
[fw.netlist] Flashing 100%
[fw.netlist] Flashing done 100%
Activate new firmware by devlink reload
[root@host ~]# devlink dev reload pci/0000:ca:00.0 action fw_activate
reload_actions_performed:
fw_activate
[root@host ~]# ip link show ens7f0
71: ens7f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN mode DEFAULT group default qlen 1000
link/ether b4:96:91:dc:72:e0 brd ff:ff:ff:ff:ff:ff
altname enp202s0f0
dmesg after flash:
[ 55.120788] ice: Copyright (c) 2018, Intel Corporation.
[ 55.274734] ice 0000:ca:00.0: Get PHY capabilities failed status = -5, continuing anyway
[ 55.569797] ice 0000:ca:00.0: The DDP package was successfully loaded: ICE OS Default Package version 1.3.28.0
[ 55.603629] ice 0000:ca:00.0: Get PHY capability failed.
[ 55.608951] ice 0000:ca:00.0: ice_init_nvm_phy_type failed: -5
[ 55.647348] ice 0000:ca:00.0: PTP init successful
[ 55.675536] ice 0000:ca:00.0: DCB is enabled in the hardware, max number of TCs supported on this port are 8
[ 55.685365] ice 0000:ca:00.0: FW LLDP is disabled, DCBx/LLDP in SW mode.
[ 55.692179] ice 0000:ca:00.0: Commit DCB Configuration to the hardware
[ 55.701382] ice 0000:ca:00.0: 126.024 Gb/s available PCIe bandwidth, limited by 16.0 GT/s PCIe x8 link at 0000:c9:02.0 (capable of 252.048 Gb/s with 16.0 GT/s PCIe x16 link)
Reboot doesn’t help, only second flash/activate with OOT or patched
driver put card back in consistent state.
After patch:
[root@host ~]# devlink dev flash pci/0000:ca:00.0 file E810_XXVDA4_FH_O_SEC_FW_1p6p1p9_NVM_3p10_PLDMoMCTP_0.11_8000AD7B.bin
Preparing to flash
[fw.mgmt] Erasing
[fw.mgmt] Erasing done
[fw.mgmt] Flashing 100%
[fw.mgmt] Flashing done 100%
[fw.undi] Erasing
[fw.undi] Erasing done
[fw.undi] Flashing 100%
[fw.undi] Flashing done 100%
[fw.netlist] Erasing
[fw.netlist] Erasing done
[fw.netlist] Flashing 100%
[fw.netlist] Flashing done 100%
Activate new firmware by devlink reload
[root@host ~]# devlink dev reload pci/0000:ca:00.0 action fw_activate
reload_actions_performed:
fw_activate
[root@host ~]# ip link show ens7f0
19: ens7f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
link/ether b4:96:91:dc:72:e0 brd ff:ff:ff:ff:ff:ff
altname enp202s0f0
Fixes: 399e27dbbd9e94 ("ice: support immediate firmware activation via devlink reload") Signed-off-by: Petr Oros <poros@redhat.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
commit b4bdc4fbf8d0 ("soc: sunxi: Deal with the MBUS DMA offsets in a
central place") added a platform device notifier that sets the DMA
offset for all of the display engine frontend and backend devices.
The code applying the offset to DMA buffer physical addresses was then
removed from the backend driver in commit 756668ba682e ("drm/sun4i:
backend: Remove the MBUS quirks"), but the code subtracting PHYS_OFFSET
was left in the frontend driver.
As a result, the offset was applied twice in the frontend driver. This
likely went unnoticed because it only affects specific configurations
(scaling or certain pixel formats) where the frontend is used, on boards
with both one of these older SoCs and more than 1 GB of DRAM.
In addition, the references to PHYS_OFFSET prevent compiling the driver
on architectures where PHYS_OFFSET is not defined.
Fixes: b4bdc4fbf8d0 ("soc: sunxi: Deal with the MBUS DMA offsets in a central place") Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com> Signed-off-by: Samuel Holland <samuel@sholland.org> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20220424162633.12369-4-samuel@sholland.org Signed-off-by: Sasha Levin <sashal@kernel.org>
The other port_hidden functions rely on the port_read/port_write
functions to access the hidden control port. These functions apply the
offset for port_base_addr where applicable. Update port_hidden_wait to
use the port_wait_bit so that port_base_addr offsets are accounted for
when waiting for the busy bit to change.
Without the offset the port_hidden_wait function would timeout on
devices that have a non-zero port_base_addr (e.g. MV88E6141), however
devices that have a zero port_base_addr would operate correctly (e.g.
MV88E6390).
Fixes: 609070133aff ("net: dsa: mv88e6xxx: update code operating on hidden registers") Signed-off-by: Nathan Rossi <nathan@nathanrossi.com> Reviewed-by: Marek Behún <kabel@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220425070454.348584-1-nathan@nathanrossi.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The hardware checksum offloading requires use of a transmit
status block inserted before the outgoing frame data, this was
updated in '9a9ba2a4aaaa ("net: bcmgenet: always enable status blocks")'
However, skb_tx_timestamp() assumes that it is passed a raw frame
and PTP parsing chokes on this status block.
Fix this by calling __skb_pull(), which hides the TSB before calling
skb_tx_timestamp(), so an outgoing PTP packet is parsed correctly.
As the data in the skb has already been set up for DMA, and the
dma_unmap_* calls use a separately stored address, there is no
no effective change in the data transmission.
The function mctp_unregister() reclaims the device's relevant resource
when a netcard detaches. However, a running routine may be unaware of
this and cause the use-after-free of the mdev->addrs object.
An attacker can adopt the (recent provided) mtcpserial driver with pty
to fake the device detaching and use the userfaultfd to increase the
race success chance (in mctp_sendmsg). The KASan report for such a POC
is shown below:
[ 86.051955] ==================================================================
[ 86.051955] BUG: KASAN: use-after-free in mctp_local_output+0x4e9/0xb7d
[ 86.051955] Read of size 1 at addr ffff888005f298c0 by task poc/295
[ 86.051955]
[ 86.051955] Call Trace:
[ 86.051955] <TASK>
[ 86.051955] dump_stack_lvl+0x33/0x42
[ 86.051955] print_report.cold.13+0xb2/0x6b3
[ 86.051955] ? preempt_schedule_irq+0x57/0x80
[ 86.051955] ? mctp_local_output+0x4e9/0xb7d
[ 86.051955] kasan_report+0xa5/0x120
[ 86.051955] ? mctp_local_output+0x4e9/0xb7d
[ 86.051955] mctp_local_output+0x4e9/0xb7d
[ 86.051955] ? mctp_dev_set_key+0x79/0x79
[ 86.051955] ? copyin+0x38/0x50
[ 86.051955] ? _copy_from_iter+0x1b6/0xf20
[ 86.051955] ? sysvec_apic_timer_interrupt+0x97/0xb0
[ 86.051955] ? asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 86.051955] ? mctp_local_output+0x1/0xb7d
[ 86.051955] mctp_sendmsg+0x64d/0xdb0
[ 86.051955] ? mctp_sk_close+0x20/0x20
[ 86.051955] ? __fget_light+0x2fd/0x4f0
[ 86.051955] ? mctp_sk_close+0x20/0x20
[ 86.051955] sock_sendmsg+0xdd/0x110
[ 86.051955] __sys_sendto+0x1cc/0x2a0
[ 86.051955] ? __ia32_sys_getpeername+0xa0/0xa0
[ 86.051955] ? new_sync_write+0x335/0x550
[ 86.051955] ? alloc_file+0x22f/0x500
[ 86.051955] ? __ip_do_redirect+0x820/0x1820
[ 86.051955] ? vfs_write+0x44d/0x7b0
[ 86.051955] ? vfs_write+0x44d/0x7b0
[ 86.051955] ? fput_many+0x15/0x120
[ 86.051955] ? ksys_write+0x155/0x1b0
[ 86.051955] ? __ia32_sys_read+0xa0/0xa0
[ 86.051955] __x64_sys_sendto+0xd8/0x1b0
[ 86.051955] ? exit_to_user_mode_prepare+0x2f/0x120
[ 86.051955] ? syscall_exit_to_user_mode+0x12/0x20
[ 86.051955] do_syscall_64+0x3a/0x80
[ 86.051955] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 86.051955] RIP: 0033:0x7f82118a56b3
[ 86.051955] RSP: 002b:00007ffdb154b110 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
[ 86.051955] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f82118a56b3
[ 86.051955] RDX: 0000000000000010 RSI: 00007f8211cd4000 RDI: 0000000000000007
[ 86.051955] RBP: 00007ffdb154c1d0 R08: 00007ffdb154b164 R09: 000000000000000c
[ 86.051955] R10: 0000000000000000 R11: 0000000000000293 R12: 000055d779800db0
[ 86.051955] R13: 00007ffdb154c2b0 R14: 0000000000000000 R15: 0000000000000000
[ 86.051955] </TASK>
[ 86.051955]
[ 86.051955] Allocated by task 295:
[ 86.051955] kasan_save_stack+0x1c/0x40
[ 86.051955] __kasan_kmalloc+0x84/0xa0
[ 86.051955] mctp_rtm_newaddr+0x242/0x610
[ 86.051955] rtnetlink_rcv_msg+0x2fd/0x8b0
[ 86.051955] netlink_rcv_skb+0x11c/0x340
[ 86.051955] netlink_unicast+0x439/0x630
[ 86.051955] netlink_sendmsg+0x752/0xc00
[ 86.051955] sock_sendmsg+0xdd/0x110
[ 86.051955] __sys_sendto+0x1cc/0x2a0
[ 86.051955] __x64_sys_sendto+0xd8/0x1b0
[ 86.051955] do_syscall_64+0x3a/0x80
[ 86.051955] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 86.051955]
[ 86.051955] Freed by task 301:
[ 86.051955] kasan_save_stack+0x1c/0x40
[ 86.051955] kasan_set_track+0x21/0x30
[ 86.051955] kasan_set_free_info+0x20/0x30
[ 86.051955] __kasan_slab_free+0x104/0x170
[ 86.051955] kfree+0x8c/0x290
[ 86.051955] mctp_dev_notify+0x161/0x2c0
[ 86.051955] raw_notifier_call_chain+0x8b/0xc0
[ 86.051955] unregister_netdevice_many+0x299/0x1180
[ 86.051955] unregister_netdevice_queue+0x210/0x2f0
[ 86.051955] unregister_netdev+0x13/0x20
[ 86.051955] mctp_serial_close+0x6d/0xa0
[ 86.051955] tty_ldisc_kill+0x31/0xa0
[ 86.051955] tty_ldisc_hangup+0x24f/0x560
[ 86.051955] __tty_hangup.part.28+0x2ce/0x6b0
[ 86.051955] tty_release+0x327/0xc70
[ 86.051955] __fput+0x1df/0x8b0
[ 86.051955] task_work_run+0xca/0x150
[ 86.051955] exit_to_user_mode_prepare+0x114/0x120
[ 86.051955] syscall_exit_to_user_mode+0x12/0x20
[ 86.051955] do_syscall_64+0x46/0x80
[ 86.051955] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 86.051955]
[ 86.051955] The buggy address belongs to the object at ffff888005f298c0
[ 86.051955] which belongs to the cache kmalloc-8 of size 8
[ 86.051955] The buggy address is located 0 bytes inside of
[ 86.051955] 8-byte region [ffff888005f298c0, ffff888005f298c8)
[ 86.051955]
[ 86.051955] The buggy address belongs to the physical page:
[ 86.051955] flags: 0x100000000000200(slab|node=0|zone=1)
[ 86.051955] raw: 0100000000000200dead000000000100dead000000000122ffff888005c42280
[ 86.051955] raw: 0000000000000000000000008066006600000001ffffffff0000000000000000
[ 86.051955] page dumped because: kasan: bad access detected
[ 86.051955]
[ 86.051955] Memory state around the buggy address:
[ 86.051955] ffff888005f29780: 00 fc fc fc fc 00 fc fc fc fc 00 fc fc fc fc 00
[ 86.051955] ffff888005f29800: fc fc fc fc 00 fc fc fc fc 00 fc fc fc fc 00 fc
[ 86.051955] >ffff888005f29880: fc fc fc fb fc fc fc fc fa fc fc fc fc fa fc fc
[ 86.051955] ^
[ 86.051955] ffff888005f29900: fc fc 00 fc fc fc fc 00 fc fc fc fc 00 fc fc fc
[ 86.051955] ffff888005f29980: fc 00 fc fc fc fc 00 fc fc fc fc 00 fc fc fc fc
[ 86.051955] ==================================================================
To this end, just like the commit e04480920d1e ("Bluetooth: defer
cleanup of resources in hci_unregister_dev()") this patch defers the
destructive kfree(mdev->addrs) in mctp_unregister to the mctp_dev_put,
where the refcount of mdev is zero and the entire device is reclaimed.
This prevents the use-after-free because the sendmsg thread holds the
reference of mdev in the mctp_route object.
Fixes: 583be982d934 (mctp: Add device handling and netlink interface) Signed-off-by: Lin Ma <linma@zju.edu.cn> Acked-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://lore.kernel.org/r/20220422114340.32346-1-linma@zju.edu.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
It's noted that dcvs interrupts are not self-clearing, thus an interrupt
handler runs constantly, which leads to a severe regression in runtime.
To fix the problem an explicit write to clear interrupt register is
required, note that on OSM platforms the register may not be present.
This code is really spurious.
It always returns an ERR_PTR, even when err is known to be 0 and calls
put_device() after a successful device_register() call.
It is likely that the return statement in the normal path is missing.
Add 'return rdev;' to fix it.
Fixes: d787dcdb9c8f ("bus: sunxi-rsb: Add driver for Allwinner Reduced Serial Bus") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Samuel Holland <samuel@sholland.org> Tested-by: Samuel Holland <samuel@sholland.org> Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://lore.kernel.org/r/ef2b9576350bba4c8e05e669e9535e9e2a415763.1650551719.git.christophe.jaillet@wanadoo.fr Signed-off-by: Sasha Levin <sashal@kernel.org>