If the call to uart_add_one_port() in serial8250_register_8250_port()
fails, a half-initialized entry in the serial_8250ports[] array is left
behind.
A subsequent reprobe of the same serial port causes that entry to be
reused. Because uart->port.dev is set, uart_remove_one_port() is called
for the half-initialized entry and bails out with an error message:
The same happens on failure of mctrl_gpio_init() since commit 4a96895f74c9 ("tty/serial/8250: use mctrl_gpio helpers").
Fix by zeroing the uart->port.dev pointer in the probe error path.
The bug was introduced in v2.6.10 by historical commit befff6f5bf5f
("[SERIAL] Add new port registration/unregistration functions."):
https://git.kernel.org/tglx/history/c/befff6f5bf5f
The commit added an unconditional call to uart_remove_one_port() in
serial8250_register_port(). In v3.7, commit 835d844d1a28 ("8250_pnp:
do pnp probe before legacy probe") made that call conditional on
uart->port.dev which allows me to fix the issue by zeroing that pointer
in the error path. Thus, the present commit will fix the problem as far
back as v3.7 whereas still older versions need to also cherry-pick 835d844d1a28.
Fixes: 835d844d1a28 ("8250_pnp: do pnp probe before legacy probe") Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v2.6.10 Cc: stable@vger.kernel.org # v2.6.10: 835d844d1a28: 8250_pnp: do pnp probe before legacy Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/b4a072013ee1a1d13ee06b4325afb19bda57ca1b.1589285873.git.lukas@wunner.de
[iwamatsu: Backported to 4.14, 4.19: adjust context] Signed-off-by: Nobuhiro Iwamatsu (CIP) <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Aggregation effects are extremely common with wifi, cellular, and cable
modem link technologies, ACK decimation in middleboxes, and LRO and GRO
in receiving hosts. The aggregation can happen in either direction,
data or ACKs, but in either case the aggregation effect is visible
to the sender in the ACK stream.
Previously BBR's sending was often limited by cwnd under severe ACK
aggregation/decimation because BBR sized the cwnd at 2*BDP. If packets
were acked in bursts after long delays (e.g. one ACK acking 5*BDP after
5*RTT), BBR's sending was halted after sending 2*BDP over 2*RTT, leaving
the bottleneck idle for potentially long periods. Note that loss-based
congestion control does not have this issue because when facing
aggregation it continues increasing cwnd after bursts of ACKs, growing
cwnd until the buffer is full.
To achieve good throughput in the presence of aggregation effects, this
algorithm allows the BBR sender to put extra data in flight to keep the
bottleneck utilized during silences in the ACK stream that it has evidence
to suggest were caused by aggregation.
A summary of the algorithm: when a burst of packets are acked by a
stretched ACK or a burst of ACKs or both, BBR first estimates the expected
amount of data that should have been acked, based on its estimated
bandwidth. Then the surplus ("extra_acked") is recorded in a windowed-max
filter to estimate the recent level of observed ACK aggregation. Then cwnd
is increased by the ACK aggregation estimate. The larger cwnd avoids BBR
being cwnd-limited in the face of ACK silences that recent history suggests
were caused by aggregation. As a sanity check, the ACK aggregation degree
is upper-bounded by the cwnd (at the time of measurement) and a global max
of BW * 100ms. The algorithm is further described by the following
presentation:
https://datatracker.ietf.org/meeting/101/materials/slides-101-iccrg-an-update-on-bbr-work-at-google-00
In our internal testing, we observed a significant increase in BBR
throughput (measured using netperf), in a basic wifi setup.
- Host1 (sender on ethernet) -> AP -> Host2 (receiver on wifi)
- 2.4 GHz -> BBR before: ~73 Mbps; BBR after: ~102 Mbps; CUBIC: ~100 Mbps
- 5.0 GHz -> BBR before: ~362 Mbps; BBR after: ~593 Mbps; CUBIC: ~601 Mbps
Also, this code is running globally on YouTube TCP connections and produced
significant bandwidth increases for YouTube traffic.
This is based on Ian Swett's max_ack_height_ algorithm from the
QUIC BBR implementation.
Signed-off-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Because bbr_target_cwnd() is really a general-purpose BBR helper for
computing some volume of inflight data as a function of the estimated
BDP, refactor it into following helper functions:
- bbr_bdp()
- bbr_quantization_budget()
- bbr_inflight()
Signed-off-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It only happens on our 1-vcpu instances, because there's no chance for
oom reaper to run to reclaim the to-be-killed process.
Add a cond_resched() at the upper shrink_node_memcgs() to solve this
issue, this will mean that we will get a scheduling point for each memcg
in the reclaimed hierarchy without any dependency on the reclaimable
memory in that memcg thus making it more predictable.
Suggested-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Xunlei Pang <xlpang@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Chris Down <chris@chrisdown.name> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Link: http://lkml.kernel.org/r/1598495549-67324-1-git-send-email-xlpang@linux.alibaba.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Julius Hemanth Pitti <jpitti@cisco.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As Documentation/kbuild/llvm.rst implies, building the kernel with a
full set of LLVM tools gets very verbose and unwieldy.
Provide a single switch LLVM=1 to use Clang and LLVM tools instead
of GCC and Binutils. You can pass it from the command line or as an
environment variable.
Please note LLVM=1 does not turn on the integrated assembler. You need
to pass LLVM_IAS=1 to use it. When the upstream kernel is ready for the
integrated assembler, I think we can make it default.
We discussed what we need, and we agreed to go with a simple boolean
flag that switches both target and host tools:
When multiple versions of LLVM are installed, I just thought supporting
LLVM_DIR=/path/to/my/llvm/bin/ might be useful.
CC = $(LLVM_DIR)clang
LD = $(LLVM_DIR)ld.lld
...
However, we can handle this by modifying PATH. So, we decided to not do
this.
- LLVM_SUFFIX
Some distributions (e.g. Debian) package specific versions of LLVM with
naming conventions that use the version as a suffix.
CC = clang$(LLVM_SUFFIX)
LD = ld.lld(LLVM_SUFFIX)
...
will allow a user to pass LLVM_SUFFIX=-11 to use clang-11 etc.,
but the suffixed versions in /usr/bin/ are symlinks to binaries in
/usr/lib/llvm-#/bin/, so this can also be handled by PATH.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> # build Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
[nd: conflict in exported vars list from not backporting commit e83b9f55448a ("kbuild: add ability to generate BTF type info for vmlinux")]
[nd: hunk against Documentation/kbuild/kbuild.rst dropped due to not backporting
commit cd238effefa2 ("docs: kbuild: convert docs to ReST and rename to *.rst")] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The 'AS' variable is unused for building the kernel. Only the remaining
usage is to turn on the integrated assembler. A boolean flag is a better
fit for this purpose.
AS=clang was added for experts. So, I replaced it with LLVM_IAS=1,
breaking the backward compatibility.
Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As commit 5ef872636ca7 ("kbuild: get rid of misleading $(AS) from
documents") noted, we rarely use $(AS) directly in the kernel build.
Now that the only/last user of $(AS) in drivers/net/wan/Makefile was
converted to $(CC), $(AS) is no longer used in the build process.
You can still pass in AS=clang, which is just a switch to turn on
the LLVM integrated assembler.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
[nd: conflict in exported vars list from not backporting commit e83b9f55448a ("kbuild: add ability to generate BTF type info for vmlinux")] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Introduce a new READELF variable to top-level Makefile, so the name of
readelf binary can be specified.
Before this change the name of the binary was hardcoded to
"$(CROSS_COMPILE)readelf" which might not be present for every
toolchain.
This allows to build with LLVM Object Reader by using make parameter
READELF=llvm-readelf.
Link: https://github.com/ClangBuiltLinux/linux/issues/771 Signed-off-by: Dmitry Golovin <dima@golovin.in> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
[nd: conflict in exported vars list from not backporting commit e83b9f55448a ("kbuild: add ability to generate BTF type info for vmlinux")] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The firmware source, wanxlfw.S, is currently compiled by the combo of
$(CPP) and $(M68KAS). This is not what we usually do for compiling *.S
files. In fact, this Makefile is the only user of $(AS) in the kernel
build.
Instead of combining $(CPP) and (AS) from different tool sets, using
$(M68KCC) as an assembler driver is simpler, and saner.
As far as I understood from the Kconfig help text, this build rule is
used to rebuild the driver firmware, which runs on an old m68k-based
chip. So, you need m68k tools for the firmware rebuild.
wanxl.c is a PCI driver, but CONFIG_M68K does not select CONFIG_HAVE_PCI.
So, you cannot enable CONFIG_WANXL_BUILD_FIRMWARE for ARCH=m68k. In other
words, ifeq ($(ARCH),m68k) is false here.
I am keeping the dead code for now, but rebuilding the firmware requires
'as68k' and 'ld68k', which I do not have in hand.
Instead, the kernel.org m68k GCC [1] successfully built it.
Allowing a user to pass in CROSS_COMPILE_M68K= is handier.
Define and export OBJSIZE variable for "size" tool from binutils to be
used in architecture specific Makefiles (naming the variable just "SIZE"
would be too risky). In particular this tool is useful to perform checks
that early boot code is not using bss section (which might have not been
zeroed yet or intersects with initrd or other files boot loader might
have put right after the linux kernel).
Add keyword support so that our mailing list gets cc'ed for clang/llvm
patches. We're pretty active on our mailing list so far as code review.
There are numerous Googlers like myself that are paid to support
building the Linux kernel with Clang and LLVM.
Link: http://lkml.kernel.org/r/20190620001907.255803-1-ndesaulniers@google.com Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kfir reported that pmtu exceptions are not created properly for
deployments where multipath routes use the same device.
After some digging I see 2 compounding problems:
1. ip_route_output_key_hash_rcu is updating the flowi4_oif *after*
the route lookup. This is the second use case where this has
been a problem (the first is related to use of vti devices with
VRF). I can not find any reason for the oif to be changed after the
lookup; the code goes back to the start of git. It does not seem
logical so remove it.
2. fib_lookups for exceptions do not call fib_select_path to handle
multipath route selection based on the hash.
The end result is that the fib_lookup used to add the exception
always creates it based using the first leg of the route.
for h in host1 host2 host3; do
ip netns add ${h}
ip -netns ${h} link set lo up
ip netns exec ${h} sysctl -wq net.ipv4.ip_forward=1
done
ip netns add switch
ip -netns switch li set lo up
ip -netns switch link add br0 type bridge stp 0
ip -netns switch link set br0 up
for n in 1 2 3; do
ip -netns switch link add eth-sw type veth peer name eth-h${n}
ip -netns switch li set eth-h${n} master br0 up
ip -netns switch li set eth-sw netns host${n} name eth0
done
ip -netns host1 addr add 192.168.252.209/24 dev eth0
ip -netns host1 link set dev eth0 up
ip -netns host1 route add 192.168.247.0/24 \
nexthop via 192.168.252.250 dev eth0 nexthop via 192.168.252.252 dev eth0
ip -netns host2 addr add 192.168.252.250/24 dev eth0
ip -netns host2 link set dev eth0 up
ip -netns host2 addr add 192.168.252.252/24 dev eth0
ip -netns host3 link set dev eth0 up
ip netns add tunnel
ip -netns tunnel li set lo up
ip -netns tunnel li add br0 type bridge
ip -netns tunnel li set br0 up
for n in $(seq 11 20); do
ip -netns tunnel addr add dev br0 192.168.247.${n}/24
done
for n in 2 3
do
ip -netns tunnel link add vti${n} type veth peer name eth${n}
ip -netns tunnel link set eth${n} mtu 1360 master br0 up
ip -netns tunnel link set vti${n} netns host${n} mtu 1360 up
ip -netns host${n} addr add dev vti${n} 192.168.247.${n}/24
done
ip -netns tunnel ro add default nexthop via 192.168.247.2 nexthop via 192.168.247.3
ip netns exec host1 ping -M do -s 1400 -c3 -I 192.168.252.209 192.168.247.11
ip netns exec host1 ping -M do -s 1400 -c3 -I 192.168.252.209 192.168.247.15
ip -netns host1 ro ls cache
Before this patch the cache always shows exceptions against the first
leg in the multipath route; 192.168.252.250 per this example. Since the
hash has an initial random seed, you may need to vary the final octet
more than what is listed. In my tests, using addresses between 11 and 19
usually found 1 that used both legs.
With this patch, the cache will have exceptions for both legs.
Fixes: 4895c771c7f0 ("ipv4: Add FIB nexthop exceptions") Reported-by: Kfir Itzhak <mastertheknife@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
skb_put_padto() and __skb_put_padto() callers
must check return values or risk use-after-free.
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If skb_put_padto() returns an error, skb has been freed.
Better not touch it anymore, as reported by syzbot [1]
Note to qrtr maintainers : this suggests qrtr_sendmsg()
should adjust sock_alloc_send_skb() second parameter
to account for the potential added alignment to avoid
reallocation.
[1]
BUG: KASAN: use-after-free in __skb_insert include/linux/skbuff.h:1907 [inline]
BUG: KASAN: use-after-free in __skb_queue_before include/linux/skbuff.h:2016 [inline]
BUG: KASAN: use-after-free in __skb_queue_tail include/linux/skbuff.h:2049 [inline]
BUG: KASAN: use-after-free in skb_queue_tail+0x6b/0x120 net/core/skbuff.c:3146
Write of size 8 at addr ffff88804d8ab3c0 by task syz-executor.4/4316
If we have unbound the PHY driver prior to calling phy_detach() (often
via phy_disconnect()) then we can cause a NULL pointer de-reference
accessing the driver owner member. The steps to reproduce are:
echo unimac-mdio-0:01 > /sys/class/net/eth0/phydev/driver/unbind
ip link set eth0 down
Fixes: cafe8df8b9bc ("net: phy: Fix lack of reference count on PHY driver") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Returning "unknown" as a temperature value violates the hwmon interface
rules. Appropriate error codes should be returned via device_attribute
show instead. These will ultimately be propagated to the user via the
file system interface.
In addition to the corrected error handling, it is an even better idea to
not present the sensor in sysfs at all if it is known that the read will
definitely fail. Given that temp1_input is currently the only sensor
reported, ensure no hwmon registration if TEMP_MONITOR_QUERY is not
supported or if it will fail due to access permissions. Something smarter
may be needed if and when other sensors are added.
Fixes: 12cce90b934b ("bnxt_en: fix HWRM error when querying VF temperature") Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In tipc_buf_append() it may change skb's frag_list, and it causes
problems when this skb is cloned. skb_unclone() doesn't really
make this skb's flag_list available to change.
Shuang Li has reported an use-after-free issue because of this
when creating quite a few macvlan dev over the same dev, where
the broadcast packets will be cloned and go up to the stack:
So fix it by using skb_unshare() instead, which would create a new
skb for the cloned frag and it'll be safe to change its frag_list.
The similar things were also done in sctp_make_reassembled_event(),
which is using skb_copy().
Reported-by: Shuang Li <shuali@redhat.com> Fixes: 37e22164a8a3 ("tipc: rename and move message reassembly function") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
int main(int argc, char *argv[])
{
int fds[2] = { -1, -1 };
socketpair(PF_TIPC, SOCK_STREAM /* or SOCK_DGRAM */, 0, fds);
if (fork() == 0)
_exit(read(fds[0], NULL, 1));
shutdown(fds[0], SHUT_RDWR); /* This must make read() return. */
wait(NULL); /* To be woken up by _exit(). */
return 0;
}
----------
Since shutdown(SHUT_RDWR) should affect all processes sharing that socket,
unconditionally setting sk->sk_shutdown to SHUTDOWN_MASK will be the right
behavior.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tipc_group_add_to_tree() returns silently if `key` matches `nkey` of an
existing node, causing tipc_group_create_member() to leak memory. Let
tipc_group_add_to_tree() return an error in such a case, so that
tipc_group_create_member() can handle it properly.
Fixes: 75da2163dbb6 ("tipc: introduce communication groups") Reported-and-tested-by: syzbot+f95d90c454864b3b5bc9@syzkaller.appspotmail.com Cc: Hillf Danton <hdanton@sina.com> Link: https://syzkaller.appspot.com/bug?id=048390604fe1b60df34150265479202f10e13aff Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently there is concurrent reset and enqueue operation for the
same lockless qdisc when there is no lock to synchronize the
q->enqueue() in __dev_xmit_skb() with the qdisc reset operation in
qdisc_deactivate() called by dev_deactivate_queue(), which may cause
out-of-bounds access for priv->ring[] in hns3 driver if user has
requested a smaller queue num when __dev_xmit_skb() still enqueue a
skb with a larger queue_mapping after the corresponding qdisc is
reset, and call hns3_nic_net_xmit() with that skb later.
Reused the existing synchronize_net() in dev_deactivate_many() to
make sure skb with larger queue_mapping enqueued to old qdisc(which
is saved in dev_queue->qdisc_sleeping) will always be reset when
dev_reset_queue() is called.
Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When IPV6_SEG6_HMAC is enabled and CRYPTO is disabled, it results in the
following Kbuild warning:
WARNING: unmet direct dependencies detected for CRYPTO_HMAC
Depends on [n]: CRYPTO [=n]
Selected by [y]:
- IPV6_SEG6_HMAC [=y] && NET [=y] && INET [=y] && IPV6 [=y]
WARNING: unmet direct dependencies detected for CRYPTO_SHA1
Depends on [n]: CRYPTO [=n]
Selected by [y]:
- IPV6_SEG6_HMAC [=y] && NET [=y] && INET [=y] && IPV6 [=y]
WARNING: unmet direct dependencies detected for CRYPTO_SHA256
Depends on [n]: CRYPTO [=n]
Selected by [y]:
- IPV6_SEG6_HMAC [=y] && NET [=y] && INET [=y] && IPV6 [=y]
The reason is that IPV6_SEG6_HMAC selects CRYPTO_HMAC, CRYPTO_SHA1, and
CRYPTO_SHA256 without depending on or selecting CRYPTO while those configs
are subordinate to CRYPTO.
Honor the kconfig menu hierarchy to remove kconfig dependency warnings.
Fixes: bf355b8d2c30 ("ipv6: sr: add core files for SR HMAC support") Signed-off-by: Necip Fazil Yildiran <fazilyildiran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The parameter passed via DCB_ATTR_DCB_BUFFER is a struct dcbnl_buffer. The
field prio2buffer is an array of IEEE_8021Q_MAX_PRIORITIES bytes, where
each value is a number of a buffer to direct that priority's traffic to.
That value is however never validated to lie within the bounds set by
DCBX_MAX_BUFFERS. The only driver that currently implements the callback is
mlx5 (maintainers CCd), and that does not do any validation either, in
particual allowing incorrect configuration if the prio2buffer value does
not fit into 4 bits.
Instead of offloading the need to validate the buffer index to drivers, do
it right there in core, and bounce the request if the value is too large.
CC: Parav Pandit <parav@nvidia.com> CC: Saeed Mahameed <saeedm@nvidia.com> Fixes: e549f6f9c098 ("net/dcb: Add dcbnl buffer attribute") Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes: 421842edeaf6 ("net/ipv6: Add fib6_null_entry") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Ahern <dsahern@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, in tcp_v4_reqsk_send_ack() and tcp_v4_send_reset(), we
echo the TOS value of the received packets in the response.
However, we do not want to echo the lower 2 ECN bits in accordance
with RFC 3168 6.1.5 robustness principles.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are a couple bugs here:
1) If opt[1] is zero then this results in a forever loop. If the value
is less than 2 then it is invalid.
2) It assumes that "len" is more than sizeof(valid_accm) or 6 which can
result in memory corruption.
In the case of LCP_OPTION_ACCM, then we should check "opt[1]" instead
of "len" because, if "opt[1]" is less than sizeof(valid_accm) then
"nak_len" gets out of sync and it can lead to memory corruption in the
next iterations through the loop. In case of LCP_OPTION_MAGIC, the
only valid value for opt[1] is 6, but the code is trying to log invalid
data so we should only discard the data when "len" is less than 6
because that leads to a read overflow.
Reported-by: ChenNan Of Chaitin Security Research Lab <whutchennan@gmail.com> Fixes: e022c2f07ae5 ("WAN: new synchronous PPP implementation for generic HDLC.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds transport ports information for route lookup so that
IPsec can select Geneve tunnel traffic to do encryption. This is
needed for OVS/OVN IPsec with encrypted Geneve tunnels.
This can be tested by configuring a host-host VPN using an IKE
daemon and specifying port numbers. For example, for an
Openswan-type configuration, the following parameters should be
configured on both hosts and IPsec set up as-per normal:
$ cat /etc/ipsec.conf
conn in
...
left=$IP1
right=$IP2
...
leftprotoport=udp/6081
rightprotoport=udp
...
conn out
...
left=$IP1
right=$IP2
...
leftprotoport=udp
rightprotoport=udp/6081
...
The tunnel can then be setup using "ip" on both hosts (but
changing the relevant IP addresses):
$ ip link add tun type geneve id 1000 remote $IP2
$ ip addr add 192.168.0.1/24 dev tun
$ ip link set tun up
This can then be tested by pinging from $IP1:
$ ping 192.168.0.2
Without this patch the traffic is unencrypted on the wire.
Fixes: 2d07dc79fe04 ("geneve: add initial netdev driver for GENEVE tunnels") Signed-off-by: Qiuyu Xiao <qiuyu.xiao.qyx@gmail.com> Signed-off-by: Mark Gray <mark.d.gray@redhat.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pass the correct offset to clear the stale filter hit
bytes counter. Otherwise, the counter starts incrementing
from the stale information, instead of 0.
Fixes: 12b276fbf6e0 ("cxgb4: add support to create hash filters") Signed-off-by: Ganji Aravind <ganji.aravind@chelsio.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A migrating transparent huge page has to already be unmapped. Otherwise,
the page could be modified while it is being copied to a new page and data
could be lost. The function __split_huge_pmd() checks for a PMD migration
entry before calling __split_huge_pmd_locked() leading one to think that
__split_huge_pmd_locked() can handle splitting a migrating PMD.
However, the code always increments the page->_mapcount and adjusts the
memory control group accounting assuming the page is mapped.
Also, if the PMD entry is a migration PMD entry, the call to
is_huge_zero_pmd(*pmd) is incorrect because it calls pmd_pfn(pmd) instead
of migration_entry_to_pfn(pmd_to_swp_entry(pmd)). Fix these problems by
checking for a PMD migration entry.
Fixes: 84c3fc4e9c56 ("mm: thp: check pmd migration entry in common path") Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Yang Shi <shy828301@gmail.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Bharata B Rao <bharata@linux.ibm.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> [4.14+] Link: https://lkml.kernel.org/r/20200903183140.19055-1-rcampbell@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
If a kprobe is marked as gone, we should not kill it again. Otherwise, we
can disarm the kprobe more than once. In that case, the statistics of
kprobe_ftrace_enabled can unbalance which can lead to that kprobe do not
work.
Fixes: e8386a0cb22f ("kprobes: support probing module __exit function") Co-developed-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: "Naveen N . Rao" <naveen.n.rao@linux.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Song Liu <songliubraving@fb.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200822030055.32383-1-songmuchun@bytedance.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
when kmalloc() fails in kvm_io_bus_unregister_dev(), before removing
the bus, we should iterate over all other devices linked to it and call
kvm_iodevice_destructor() for them
In pfkey_dump() dplen and splen can both be specified to access the
xfrm_address_t structure out of bounds in__xfrm_state_filter_match()
when it calls addr_match() with the indexes. Return EINVAL if either
are out of range.
Signed-off-by: Mark Salyzyn <salyzyn@android.com> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: kernel-team@android.com Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A spanking new machine I just got has all but one USB ports wired as 3.0.
Booting defconfig resulted in no keyboard or mouse, which was pretty
uncool. Let's enable that -- USB3 is ubiquitous rather than an oddity.
As 'y' not 'm' -- recovering from initrd problems needs a keyboard.
Also add it to the 32-bit defconfig.
Signed-off-by: Adam Borowski <kilobyte@angband.pl> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-usb@vger.kernel.org Link: http://lkml.kernel.org/r/20181009062803.4332-1-kilobyte@angband.pl Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are 2 problems with it:
1. "<" vs expected "<<"
2. the shift number is an IOMMU page number mask, not an address
mask as the IOMMU page shift is missing.
This did not hit us before f1565c24b596 ("powerpc: use the generic
dma_ops_bypass mode") because we had additional code to handle bypass
mask so this chunk (almost?) never executed.However there were
reports that aacraid does not work with "iommu=nobypass".
After f1565c24b596, aacraid (and probably others which call
dma_get_required_mask() before setting the mask) was unable to enable
64bit DMA and fall back to using IOMMU which was known not to work,
one of the problems is double free of an IOMMU page.
This fixes DMA for aacraid, both with and without "iommu=nobypass" in
the kernel command line. Verified with "stress-ng -d 4".
Fixes: 6a5c7be5e484 ("powerpc: Override dma_get_required_mask by platform hook and ops") Cc: stable@vger.kernel.org # v3.2+ Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200908015106.79661-1-aik@ozlabs.ru Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The CRC calculation done by genksyms is triggered when the parser hits
EXPORT_SYMBOL*() macros. At this point, genksyms recursively expands the
types of the function parameters, and uses that as the input for the CRC
calculation. In the case of forward-declared structs, the type expands
to 'UNKNOWN'. Following this, it appears that the result of the
expansion of each type is cached somewhere, and seems to be re-used
when/if the same type is seen again for another exported symbol in the
same C file.
Unfortunately, this can cause CRC 'stability' issues when a struct
definition becomes visible in the middle of a C file. For example, let's
assume code with the following pattern:
struct foo;
int bar(struct foo *arg)
{
/* Do work ... */
}
EXPORT_SYMBOL_GPL(bar);
/* This contains struct foo's definition */
#include "foo.h"
int baz(struct foo *arg)
{
/* Do more work ... */
}
EXPORT_SYMBOL_GPL(baz);
Here, baz's CRC will be computed using the expansion of struct foo that
was cached after bar's CRC calculation ('UNKOWN' here). But if
EXPORT_SYMBOL_GPL(bar) is removed from the file (because of e.g. symbol
trimming using CONFIG_TRIM_UNUSED_KSYMS), struct foo will be expanded
late, during baz's CRC calculation, which now has visibility over the
full struct definition, hence resulting in a different CRC for baz.
The proper fix for this certainly is in genksyms, but that will take me
some time to get right. In the meantime, we have seen one occurrence of
this in the ehci-hcd code which hits this problem because of the way it
includes C files halfway through the code together with an unlucky mix
of symbol trimming.
In order to workaround this, move the include done in ehci-hub.c early
in ehci-hcd.c, hence making sure the struct definitions are visible to
the entire file. This improves CRC stability of the ehci-hcd exports
even when symbol trimming is enabled.
The x86-64 psABI [0] specifies special relocation types
(R_X86_64_[REX_]GOTPCRELX) for indirection through the Global Offset
Table, semantically equivalent to R_X86_64_GOTPCREL, which the linker
can take advantage of for optimization (relaxation) at link time. This
is supported by LLD and binutils versions 2.26 onwards.
The compressed kernel is position-independent code, however, when using
LLD or binutils versions before 2.27, it must be linked without the -pie
option. In this case, the linker may optimize certain instructions into
a non-position-independent form, by converting foo@GOTPCREL(%rip) to $foo.
This potential issue has been present with LLD and binutils-2.26 for a
long time, but it has never manifested itself before now:
- LLD and binutils-2.26 only relax
movq foo@GOTPCREL(%rip), %reg
to
leaq foo(%rip), %reg
which is still position-independent, rather than
mov $foo, %reg
which is permitted by the psABI when -pie is not enabled.
- GCC happens to only generate GOTPCREL relocations on mov instructions.
- CLang does generate GOTPCREL relocations on non-mov instructions, but
when building the compressed kernel, it uses its integrated assembler
(due to the redefinition of KBUILD_CFLAGS dropping -no-integrated-as),
which has so far defaulted to not generating the GOTPCRELX
relocations.
Nick Desaulniers reports [1,2]:
"A recent change [3] to a default value of configuration variable
(ENABLE_X86_RELAX_RELOCATIONS OFF -> ON) in LLVM now causes Clang's
integrated assembler to emit R_X86_64_GOTPCRELX/R_X86_64_REX_GOTPCRELX
relocations. LLD will relax instructions with these relocations based
on whether the image is being linked as position independent or not.
When not, then LLD will relax these instructions to use absolute
addressing mode (R_RELAX_GOT_PC_NOPIC). This causes kernels built with
Clang and linked with LLD to fail to boot."
Patch series [4] is a solution to allow the compressed kernel to be
linked with -pie unconditionally, but even if merged is unlikely to be
backported. As a simple solution that can be applied to stable as well,
prevent the assembler from generating the relaxed relocation types using
the -mrelax-relocations=no option. For ease of backporting, do this
unconditionally.
These serial ports are exposed by the OOB-management-engine on
RealManage-enabled network cards (e.g. AMD DASH enabled systems using
Realtek cards).
Because these have 3 BARs, they fail the "num_iomem <= 1" check in
serial_pci_guess_board.
I've manually checked the two IOMEM regions and BAR 2 doesn't seem to
respond to reads, but BAR 4 seems to be an MMIO version of the IO ports
(untested).
With this change, the ports are detected:
0000:02:00.1: ttyS0 at I/O 0x2200 (irq = 82, base_baud = 115200) is a 16550A
0000:02:00.2: ttyS1 at I/O 0x2100 (irq = 55, base_baud = 115200) is a 16550A
Variable populated, which is a member of struct pcpu_chunk, is used as a
unit of size of unsigned long.
However, size of populated is miscounted. So, I fix this minor part.
Fixes: 8ab16c43ea79 ("percpu: change the number of pages marked in the first_chunk pop bitmap") Cc: <stable@vger.kernel.org> # 4.14+ Signed-off-by: Sunghyun Jin <mcsmonk@gmail.com> Signed-off-by: Dennis Zhou <dennis@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On suspend the original host configuration gets restored. The
resume routine has to undo this, otherwise the SMBus master
may be left in disabled state or in i2c mode.
[JD: Rebased on v5.8, moved the write into i801_setup_hstcfg.]
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de> Signed-off-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Wolfram Sang <wsa@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The SCSI layer can go into an ugly loop if you ignore that a device is
gone. You need to report an error in the command rather than in the
return value of the queue method.
We need to specifically check for ENODEV. The issue goes back to the
introduction of the driver.
Add a USB_QUIRK_IGNORE_REMOTE_WAKEUP quirk for the BYD zhaoxin notebook.
This notebook come with usb touchpad. And we would like to disable
touchpad wakeup on this notebook by default.
if of_find_device_by_node() succeed, mtk_drm_kms_init() doesn't have
a corresponding put_device(). Thus add jump target to fix the exception
handling for this function implementation.
On A20R machines the interrupt pending bits in cause register need to be
updated by requesting the chipset to do it. This needs to be done to
find the interrupt cause and after interrupt service. In
commit 0b888c7f3a03 ("MIPS: SNI: Convert to new irq_chip functions") the
function to do after service update got lost, which caused spurious
interrupts.
Fixes: 0b888c7f3a03 ("MIPS: SNI: Convert to new irq_chip functions") Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
syzbot is reporting OOB read at fbcon_resize() [1], for
commit 39b3cffb8cf31117 ("fbcon: prevent user font height or width change
from causing potential out-of-bounds access") is by error using
registered_fb[con2fb_map[vc->vc_num]]->fbcon_par->p->userfont (which was
set to non-zero) instead of fb_display[vc->vc_num].userfont (which remains
zero for that display).
We could remove tricky userfont flag [2], for we can determine it by
comparing address of the font data and addresses of built-in font data.
But since that commit is failing to fix the original OOB read [3], this
patch keeps the change minimal in case we decide to revert altogether.
Indirect leak of 360 byte(s) in 9 object(s) allocated from:
#0 0x7fecc305180e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e)
#1 0x560578f6dce5 in perf_pmu__new_format util/pmu.c:1333
#2 0x560578f752fc in perf_pmu_parse util/pmu.y:59
#3 0x560578f6a8b7 in perf_pmu__format_parse util/pmu.c:73
#4 0x560578e07045 in test__pmu tests/pmu.c:155
#5 0x560578de109b in run_test tests/builtin-test.c:410
#6 0x560578de109b in test_and_print tests/builtin-test.c:440
#7 0x560578de401a in __cmd_test tests/builtin-test.c:661
#8 0x560578de401a in cmd_test tests/builtin-test.c:807
#9 0x560578e49354 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
#10 0x560578ce71a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
#11 0x560578ce71a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
#12 0x560578ce71a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
#13 0x7fecc2b7acc9 in __libc_start_main ../csu/libc-start.c:308
Fixes: cff7f956ec4a1 ("perf tests: Move pmu tests into separate object") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200915031819.386559-12-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 930beb5ac09a ("MIPS: introduce MIPS_L1_CACHE_SHIFT_<N>") forgot
to select the correct MIPS_L1_CACHE_SHIFT for SNI RM. This breaks non
coherent DMA because of a wrong allocation alignment.
When compiling with DEBUG=1 on Fedora 32 I'm getting crash for 'perf
test signal':
Program received signal SIGSEGV, Segmentation fault.
0x0000000000c68548 in __test_function ()
(gdb) bt
#0 0x0000000000c68548 in __test_function ()
#1 0x00000000004d62e9 in test_function () at tests/bp_signal.c:61
#2 0x00000000004d689a in test__bp_signal (test=0xa8e280 <generic_ ...
#3 0x00000000004b7d49 in run_test (test=0xa8e280 <generic_tests+1 ...
#4 0x00000000004b7e7f in test_and_print (t=0xa8e280 <generic_test ...
#5 0x00000000004b8927 in __cmd_test (argc=1, argv=0x7fffffffdce0, ...
...
It's caused by the symbol __test_function being in the ".bss" section:
$ nm perf | grep __test_function 0000000000c68548 B __test_function
I guess most of the time we're just lucky the inline asm ended up in the
".text" section, so making it specific explicit with push and pop
section clauses.
vmbus_wait_for_unload() looks for a CHANNELMSG_UNLOAD_RESPONSE message
coming from Hyper-V. But if the message isn't found for some reason,
the panic path gets hung forever. Add a timeout of 10 seconds to prevent
this.
Fixes: 415719160de3 ("Drivers: hv: vmbus: avoid scheduling in interrupt context in vmbus_initiate_unload()") Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/1600026449-23651-1-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
This warning was introduced in
commit 81033c6b584b ("ALSA: core: Warn on empty module").
It looks like we are supposed to set card->owner to THIS_MODULE.
A new warning in Clang points out that the initialization of
mux_pll_src_4plls_p appears incorrect:
../drivers/clk/rockchip/clk-rk3228.c:140:58: warning: suspicious
concatenation of string literals in an array initialization; did you
mean to separate the elements with a comma? [-Wstring-concatenation]
PNAME(mux_pll_src_4plls_p) = { "cpll", "gpll", "hdmiphy" "usb480m" };
^
,
../drivers/clk/rockchip/clk-rk3228.c:140:48: note: place parentheses
around the string literal to silence warning
PNAME(mux_pll_src_4plls_p) = { "cpll", "gpll", "hdmiphy" "usb480m" };
^
1 warning generated.
Given the name of the variable and the same variable name in rv1108, it
seems that this should have been four distinct elements. Fix it up by
adding the comma as suggested.
In Documentation/virt/kvm/api.rst it is said that "You probably want to
use 0 as machine type", which implies that type 0 be the "automatic" or
"default" type. And, in user-space libvirt use the null-machine (with
type 0) to detect the kvm capability, which returns "KVM not supported"
on a VZ platform.
I try to fix it in QEMU but it is ugly:
https://lists.nongnu.org/archive/html/qemu-devel/2020-08/msg05629.html
And Thomas Huth suggests me to change the definition of kvm type:
https://lists.nongnu.org/archive/html/qemu-devel/2020-09/msg03281.html
Since VZ and TE cannot co-exists, using type 0 on a TE platform will
still return success (so old user-space tools have no problems on new
kernels); the advantage is that using type 0 on a VZ platform will not
return failure. So, the only problem is "new user-space tools use type
2 on old kernels", but if we treat this as a kernel bug, we can backport
this patch to old stable kernels.
In the prepare_message callback the bus driver has the
opportunity to split a transfer into smaller chunks.
spi_map_msg is done after prepare_message.
Function spi_res_release releases the splited transfers
in the message. Therefore spi_res_release should be called
after spi_map_msg.
The previous try at this was commit c9ba7a16d0f1
which released the splited transfers after
spi_finalize_current_message had been called.
This introduced a race since the message struct could be
out of scope because the spi_sync call got completed.
Fixes this leak on spi bus driver spi-bcm2835.c when transfer
size is greater than 65532:
If something goes wrong (such as the SCL being stuck low) then we need
to reset the PCA chip. The issue with this is that on reset we lose all
config settings and the chip ends up in a disabled state which results
in a lock up/high CPU usage. We need to re-apply any configuration that
had previously been set and re-enable the chip.
Signed-off-by: Evan Nimmo <evan.nimmo@alliedtelesis.co.nz> Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Reading past end of file returns EOF for aligned reads but -EINVAL for
unaligned reads on f2fs. While documentation is not strict about this
corner case, most filesystem returns EOF on this case, like iomap
filesystems. This patch consolidates the behavior for f2fs, by making
it return EOF(0).
it can be verified by a read loop on a file that does a partial read
before EOF (A file that doesn't end at an aligned address). The
following code fails on an unaligned file on f2fs, but not on
btrfs, ext4, and xfs.
while (done < total) {
ssize_t delta = pread(fd, buf + done, total - done, off + done);
if (!delta)
break;
...
}
It is arguable whether filesystems should actually return EOF or
-EINVAL, but since iomap filesystems support it, and so does the
original DIO code, it seems reasonable to consolidate on that.
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
If the sbi->ckpt->next_free_nid is not NAT block aligned and if there
are free nids in that NAT block between the start of the block and
next_free_nid, then those free nids will not be scanned in scan_nat_page().
This results into mismatch between nm_i->available_nids and the sum of
nm_i->free_nid_count of all NAT blocks scanned. And nm_i->available_nids
will always be greater than the sum of free nids in all the blocks.
Under this condition, if we use all the currently scanned free nids,
then it will loop forever in f2fs_alloc_nid() as nm_i->available_nids
is still not zero but nm_i->free_nid_count of that partially scanned
NAT block is zero.
Fix this to align the nm_i->next_scan_nid to the first nid of the
corresponding NAT block.
I found this when compiling a kbuild random config with GCC 11. The
config enables CONFIG_DEBUG_SECTION_MISMATCH, which sets CFLAGS
-fno-inline-functions-called-once. This causes the call to cache_loop in
cache.c to not be inlined causing the below compile error.
In file included from arch/openrisc/mm/cache.c:13:
arch/openrisc/mm/cache.c: In function 'cache_loop':
./arch/openrisc/include/asm/spr.h:16:27: warning: 'asm' operand 0 probably does not match constraints
16 | #define mtspr(_spr, _val) __asm__ __volatile__ ( \
| ^~~~~~~
arch/openrisc/mm/cache.c:25:3: note: in expansion of macro 'mtspr'
25 | mtspr(reg, line);
| ^~~~~
./arch/openrisc/include/asm/spr.h:16:27: error: impossible constraint in 'asm'
16 | #define mtspr(_spr, _val) __asm__ __volatile__ ( \
| ^~~~~~~
arch/openrisc/mm/cache.c:25:3: note: in expansion of macro 'mtspr'
25 | mtspr(reg, line);
| ^~~~~
make[1]: *** [scripts/Makefile.build:283: arch/openrisc/mm/cache.o] Error 1
The asm constraint "K" requires a immediate constant argument to mtspr,
however because of no inlining a register argument is passed causing a
failure. Fix this by using __always_inline.
Enabling a whole subsystem from a single driver 'select' is frowned
upon and won't be accepted in new drivers, that need to use 'depends on'
instead. Existing selection of DMAENGINES will then cause circular
dependencies. Replace them with a dependency.
Since p points at raw xdr data, there's no guarantee that it's NULL
terminated, so we should give a length. And probably escape any special
characters too.
Reported-by: Zhi Li <yieli@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
If a write delegation isn't available, the Linux NFS client uses
a zero-stateid when performing a SETATTR.
NFSv4.0 provides no mechanism for an NFS server to match such a
request to a particular client. It recalls all delegations for that
file, even delegations held by the client issuing the request. If
that client happens to hold a read delegation, the server will
recall it immediately, resulting in an NFS4ERR_DELAY/CB_RECALL/
DELEGRETURN sequence.
Optimize out this pipeline bubble by having the client return any
delegations it may hold on a file before it issues a
SETATTR(zero-stateid) on that file.
The "tx/rx-transfer - crossing PAGE_SIZE" test always fails when
len=131071 and rx_offset >= 5:
spi-loopback-test spi0.0: Running test tx/rx-transfer - crossing PAGE_SIZE
...
with iteration values: len = 131071, tx_off = 0, rx_off = 3
with iteration values: len = 131071, tx_off = 0, rx_off = 4
with iteration values: len = 131071, tx_off = 0, rx_off = 5
loopback strangeness - rx changed outside of allowed range at: ...a4321000
spi_msg@ffffffd5a4157690
frame_length: 131071
actual_length: 131071
spi_transfer@ffffffd5a41576f8
len: 131071
tx_buf: ffffffd5a4340ffc
Note that rx_offset > 3 can only occur if the SPI controller driver sets
->dma_alignment to a higher value than 4, so most SPI controller drivers
are not affect.
The allocated Rx buffer is of size SPI_TEST_MAX_SIZE_PLUS, which is 132
KiB (assuming 4 KiB pages). This test uses an initial offset into the
rx_buf of PAGE_SIZE - 4, and a len of 131071, so the range expected to
be written in this transfer ends at (4096 - 4) + 5 + 131071 == 132 KiB,
which is also the end of the allocated buffer. But the code which
verifies the content of the buffer reads a byte beyond the allocated
buffer and spuriously fails because this out-of-bounds read doesn't
return the expected value.
Fix this by using ITERATE_LEN instead of ITERATE_MAX_LEN to avoid
testing sizes which cause out-of-bounds reads.
If the zero duty cycle doesn't correspond to any voltage in the voltage
table, the PWM regulator returns an -EINVAL from get_voltage_sel() which
results in the core erroring out with a "failed to get the current
voltage" and ending up not applying the machine constraints.
Instead, return -ENOTRECOVERABLE which makes the core set the voltage
since it's at an unknown value.
The driver is unable to successfully login with remote device. During pt2pt
login, the driver completes its FLOGI request with the remote device having
WWN precedence. The remote device issues its own (delayed) FLOGI after
accepting the driver's and, upon transmitting the FLOGI, immediately
recognizes it has already processed the driver's FLOGI thus it transitions
to sending a PLOGI before waiting for an ACC to its FLOGI.
In the driver, the FLOGI is received and an ACC sent, followed by the PLOGI
being received and an ACC sent. The issue is that the PLOGI reception
occurs before the response from the adapter from the FLOGI ACC is
received. Processing of the PLOGI sets state flags to perform the REG_RPI
mailbox command and proceed with the rest of discovery on the port. The
same completion routine used by both FLOGI and PLOGI is generic in
nature. One of the things it does is clear flags, and those flags happen to
drive the rest of discovery. So what happened was the PLOGI processing set
the flags, the FLOGI ACC completion cleared them, thus when the PLOGI ACC
completes it doesn't see the flags and stops.
Fix by modifying the generic completion routine to not clear the rest of
discovery flag (NLP_ACC_REGLOGIN) unless the completion is also associated
with performing a mailbox command as part of its handling. For things such
as FLOGI ACC, there isn't a subsequent action to perform with the adapter,
thus there is no mailbox cmd ptr. PLOGI ACC though will perform REG_RPI
upon completion, thus there is a mailbox cmd ptr.
Link: https://lore.kernel.org/r/20200828175332.130300-3-james.smart@broadcom.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Link:
https://lore.kernel.org/r/20200825093940.19612-1-jhasan@marvell.com Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Javed Hasan <jhasan@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When pm8001_tag_alloc() fails, task should be freed just like it is done in
the subsequent error paths.
Link: https://lore.kernel.org/r/20200823091453.4782-1-dinghao.liu@zju.edu.cn Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
A client should be able to handle getting an ERR_DELAY error
while doing a LOCK call to reclaim state due to delegation being
recalled. This is a transient error that can happen due to server
moving its volumes and invalidating its file location cache and
upon reference to it during the LOCK call needing to do an
expensive lookup (leading to an ERR_DELAY error on a PUTFH).
When using vf_ops->ndo_select_queue, the number of queues of VF is
usually bigger than the synthetic NIC. This condition may happen
often.
Remove "unlikely" from the comparison of ndev->real_num_tx_queues.
Fixes: b3bf5666a510 ("hv_netvsc: defer queue selection to VF") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
pskb_carve_frag_list() may return -ENOMEM in pskb_carve_inside_nonlinear().
we should handle this correctly or we would get wrong sk_buff.
Fixes: 6fa01ccd8830 ("skbuff: Add pskb_extract() helper function") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since transactions may be freed shortly after they're created, before
a log_flush occurs, we need to initialize their ail1 and ail2 lists
earlier. Before this patch, the ail1 list was initialized in gfs2_log_flush().
This moves the initialization to the point when the transaction is first
created.
Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Cc: Salvatore Bonaccorso <carnil@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- Reduce sess_lock holding to prevent CPU Lock up. sess_lock was held across
fc_port registration and deletion. These calls can be blocked by upper
layer. Sess_lock is also being accessed by interrupt thread.
- Reduce number of loops in processing work_list to prevent kernel complaint
of CPU lockup or holding sess_lock.
Reported-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Tested-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Fixes: 9ba1cb25c151 ("scsi: qla2xxx: Remove all rports if fabric scan retry fails") Link: https://lore.kernel.org/linux-scsi/D01377DD-2E86-427B-BA0C-8D7649E37870@oracle.com/T/#t Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, the rport registration is being called from a single work element
that is used to process QLA internal "work_list". This work_list is meant for
quick and simple task (ie no sleep). The Rport registration process sometime
can be delayed by upper layer. This causes back pressure with the internal
queue where other jobs are unable to move forward.
This patch will schedule the registration process with a new work element
(fc_port.reg_work). While the RPort is being registered, the current state of
the fcport will not move forward until the registration is done. If the state
of the fabric has changed, a new field/next_disc_state will record the next
action on whether to 'DELETE' or 'Reverify the session/ADISC'.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Zhengyuan Liu <liuzhengyuan@kylinos.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The driver for Marvell switches puts all ports in IGMP snooping mode
which results in all IGMP/MLD frames that ingress on the ports to be
forwarded to the CPU only.
The bridge code in the kernel can then interpret these frames and act
upon them, for instance by updating the mdb in the switch to reflect
multicast memberships of stations connected to the ports. However,
the IGMP/MLD frames must then also be forwarded to other ports of the
bridge so external IGMP queriers can track membership reports, and
external multicast clients can receive query reports from foreign IGMP
queriers.
Currently, this is impossible as the EDSA tagger sets offload_fwd_mark
on the skb when it unwraps the tagged frames, and that will make the
switchdev layer prevent the skb from egressing on any other port of
the same switch.
To fix that, look at the To_CPU code in the DSA header and make
forwarding of the frame possible for trapped IGMP packets.
Introduce some #defines for the frame types to make the code a bit more
comprehensive.
Signed-off-by: Daniel Mack <daniel@zonque.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: DENG Qingfang <dqfext@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Using gcov to collect coverage data for kernels compiled with GCC 10.1
causes random malfunctions and kernel crashes. This is the result of a
changed GCOV_COUNTERS value in GCC 10.1 that causes a mismatch between
the layout of the gcov_info structure created by GCC profiling code and
the related structure used by the kernel.
Fix this by updating the in-kernel GCOV_COUNTERS value. Also re-enable
config GCOV_KERNEL for use with GCC 10.
Reported-by: Colin Ian King <colin.king@canonical.com> Reported-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Tested-by: Leon Romanovsky <leonro@nvidia.com> Tested-and-Acked-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Failing probe with -EPROBE_DEFER until all dependencies
listed in the _DEP (Operation Region Dependencies) object
have been met.
This will fix an issue where on some platforms UCSI ACPI
driver fails to probe because the address space handler for
the operation region that the UCSI ACPI interface uses has
not been loaded yet.
Userspace drivers that use a SetConfiguration() request to "lightweight"
reset an already configured usb device might cause data toggles to get out
of sync between the device and host, and the device becomes unusable.
The xHCI host requires endpoints to be dropped and added back to reset the
toggle. If USB core notices the new configuration is the same as the
current active configuration it will avoid these extra steps by calling
usb_reset_configuration() instead of usb_set_configuration().
A SetConfiguration() request will reset the device side data toggles.
Make sure usb_reset_configuration() function also drops and adds back the
endpoints to ensure data toggles are in sync.
To avoid code duplication split the current usb_disable_device() function
and reuse the endpoint specific part.
The purpose of each TTY is as follows:
* ttyUSB0: DIAG/QCDM port.
* ttyUSB1: GNSS data.
* ttyUSB2: AT-capable port (control).
* ttyUSB3: AT-capable port (data).
In the secondary layout with PID=0x9206 (AT+CUSBSELNV=86) the module
exposes 6 TTY ports:
The purpose of each TTY is as follows:
* ttyUSB0: DIAG/QCDM port.
* ttyUSB1: GNSS data.
* ttyUSB2: AT-capable port (control).
* ttyUSB3: QFLOG interface.
* ttyUSB4: DAM interface.
* ttyUSB5: AT-capable port (data).
Signed-off-by: Aleksander Morgado <aleksander@aleksander.es> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The USB composition, defining the set of exported functions, is dynamic
in newer Quectel modems. Default functions can be disabled and
alternative functions can be enabled instead. The alternatives
includes class functions using interface pairs, which should be
handled by the respective class drivers.
Active interfaces are numbered consecutively, so static
blacklisting based on interface numbers will fail when the
composition changes. An example of such an error, where the
option driver has bound to the CDC ECM data interface,
preventing cdc_ether from handling this function:
Change rules for EC21, EC25, BG96 and EG95 to match vendor specific
serial functions only, to prevent binding to class functions. Require
2 endpoints on ff/ff/ff functions, avoiding the 3 endpoint QMI/RMNET
network functions.
Cc: AceLan Kao <acelan.kao@canonical.com> Cc: Sebastian Sjoholm <ssjoholm@mac.com> Cc: Dan Williams <dcbw@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The USB device descriptor may get changed between two consecutive
enumerations on the same device for some reason, such as DFU or
malicius device.
In that case, we may access the changing descriptor if we don't take
the device lock here.
There were some problem in ipq8074 Gen2 PCIe phy init sequence.
1. Few register values were wrongly updated in the phy init sequence.
2. The register QSERDES_RX_SIGDET_CNTRL is a RX tuning parameter
register which is added in serdes table causing the wrong register
was getting updated.
3. Clocks and resets were not added in the phy init.
Fix these to make Gen2 PCIe port on ipq8074 devices to work.
The current implementation for gbcodec_mixer_dapm_ctl_put() uses
uninitialized gbvalue for comparison with updated value. This was found
using static analysis with coverity.
Uninitialized scalar variable (UNINIT)
11. uninit_use: Using uninitialized value
gbvalue.value.integer_value[0].
460 if (gbvalue.value.integer_value[0] != val) {
This patch fixes the issue with fetching the gbvalue before using it for
comparision.