During testing on Armada 388 platforms, it was found with a certain
module configuration that it was possible to trigger a kernel oops
during the module load process, caused by the phylink resolver being
triggered for a currently disabled interface.
This problem was introduced by changing the way the SFP registration
works, which now can result in the sfp link down notification being
called during phylink_create().
Fixes: b5bfc21af5cb ("net: sfp: do not probe SFP module before we're attached") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clang warns when one enumerated type is implicitly converted to another:
drivers/pinctrl/pinctrl-max77620.c:56:12: warning: implicit conversion
from enumeration type 'enum max77620_pinconf_param' to different
enumeration type 'enum pin_config_param' [-Wenum-conversion]
.param = MAX77620_ACTIVE_FPS_SOURCE,
^~~~~~~~~~~~~~~~~~~~~~~~~~
It is expected that pinctrl drivers can extend pin_config_param because
of the gap between PIN_CONFIG_END and PIN_CONFIG_MAX so this conversion
isn't an issue. Most drivers that take advantage of this define the
PIN_CONFIG variables as constants, rather than enumerated values. Do the
same thing here so that Clang no longer warns.
The udlfb driver maintained an open count and cleaned up itself when the
count reached zero. But the console is also counted in the reference count
- so, if the user unplugged the device, the open count would not drop to
zero and the driver stayed loaded with console attached. If the user
re-plugged the adapter, it would create a device /dev/fb1, show green
screen and the access to the console would be lost.
The framebuffer subsystem has reference counting on its own - in order to
fix the unplug bug, we rely the framebuffer reference counting. When the
user unplugs the adapter, we call unregister_framebuffer unconditionally.
unregister_framebuffer will unbind the console, wait until all users stop
using the framebuffer and then call the fb_destroy method. The fb_destroy
cleans up the USB driver.
This patch makes the following changes:
* Drop dlfb->kref and rely on implicit framebuffer reference counting
instead.
* dlfb_usb_disconnect calls unregister_framebuffer, the rest of driver
cleanup is done in the function dlfb_ops_destroy. dlfb_ops_destroy will
be called by the framebuffer subsystem when no processes have the
framebuffer open or mapped.
* We don't use workqueue during initialization, but initialize directly
from dlfb_usb_probe. The workqueue could race with dlfb_usb_disconnect
and this racing would produce various kinds of memory corruption.
* We use usb_get_dev and usb_put_dev to make sure that the USB subsystem
doesn't free the device under us.
A proc_remove() can sleep. so that it can't be inside of spin_lock.
Hence proc_remove() is moved to outside of spin_lock. and it also
adds mutex to sync create and remove of proc entry(config->pde).
When we check the tcp options of a packet and it doesn't match the current
fingerprint, the tcp packet option pointer must be restored to its initial
value in order to do the proper tcp options check for the next fingerprint.
Here we can see an example.
Assumming the following fingerprint base with two lines:
Where TCP options are the last field in the OS signature, all of them overlap
except by the last one, ie. 'W6' versus 'W7'.
In case a packet for Linux 4.19 kicks in, the osf finds no matching because the
TCP options pointer is updated after checking for the TCP options in the first
line.
Therefore, reset pointer back to where it should be.
Fixes: 11eeef41d5f6 ("netfilter: passive OS fingerprint xtables match") Signed-off-by: Fernando Fernandez Mancera <ffmancera@riseup.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 508b09046c0f ("netfilter: ipv6: Preserve link scope traffic
original oif") made ip6_route_me_harder() keep the original oif for
link-local and multicast packets. However, it also affected packets
for the loopback address because it used rt6_need_strict().
REDIRECT rules in the OUTPUT chain rewrite the destination to loopback
address; thus its oif should not be preserved. This commit fixes the bug
that redirected local packets are being dropped. Actually the packet was
not exactly dropped; Instead it was sent out to the original oif rather
than lo. When a packet with daddr ::1 is sent to the router, it is
effectively dropped.
Fixes: 508b09046c0f ("netfilter: ipv6: Preserve link scope traffic original oif") Signed-off-by: Eli Cooper <elicooper@gmx.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Flush after rule deletion bogusly hits -ENOENT. Skip rules that have
been already from nft_delrule_by_chain() which is always called from the
flush path.
Fixes: cf9dc09d0949 ("netfilter: nf_tables: fix missing rules flushing per table") Reported-by: Phil Sutter <phil@nwl.cc> Acked-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 5a2de63fd1a5 ("bridge: do not add port to router list
when receives query with source 0.0.0.0") and commit 0fe5119e267f ("net:
bridge: remove ipv6 zero address check in mcast queries")
The reason is RFC 4541 is not a standard but suggestive. Currently we
will elect 0.0.0.0 as Querier if there is no ip address configured on
bridge. If we do not add the port which recives query with source
0.0.0.0 to router list, the IGMP reports will not be about to forward
to Querier, IGMP data will also not be able to forward to dest.
As Nikolay suggested, revert this change first and add a boolopt api
to disable none-zero election in future if needed.
Reported-by: Linus Lüssing <linus.luessing@c0d3.blue> Reported-by: Sebastian Gottschall <s.gottschall@newmedia-net.de> Fixes: 5a2de63fd1a5 ("bridge: do not add port to router list when receives query with source 0.0.0.0") Fixes: 0fe5119e267f ("net: bridge: remove ipv6 zero address check in mcast queries") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are two minor issues in the current freeze interface:
1) Freeze interfaces have not related with CONFIG_DEBUG_SPINLOCK,
therefore fix the incorrect conditions;
2) For SMP platforms, it should also disable preemption before
doing atomic_cmpxchg in case that some high priority tasks
preempt between atomic_cmpxchg and disable_preempt, then spin
on the locked refcount later.
It's better to use atomic_cond_read_relaxed, which is implemented
in hardware instructions to monitor a variable changes currently
for ARM64, instead of open-coded busy waiting.
Multiref support means that a compressed page could have
more than one reference, which is designed for on-disk data
deduplication. However, mkfs doesn't support this mode
at this moment, and the kernel implementation is also broken.
Let's drop multiref support. If it is fully implemented
in the future, it can be reverted later.
This patch completes error handing code of z_erofs_do_read_page.
PG_error will be set when some read error happens, therefore
z_erofs_onlinepage_endio will unlock this page without setting
PG_uptodate.
As described in Kconfig, the last compressed pack should be cached
for further reading for either `EROFS_FS_ZIP_CACHE_UNIPOLAR' or
`EROFS_FS_ZIP_CACHE_BIPOLAR' by design.
However, there is a bug in z_erofs_do_read_page, it will
switch `initial' to `false' at the very beginning before it decides
to cache the last compressed pack.
caching strategy should work properly after appling this patch.
GSO packets with vnet_hdr must conform to a small set of gso_types.
The below commit uses flow dissection to drop packets that do not.
But it has false positives when the skb is not fully initialized.
Dissection needs skb->protocol and skb->network_header.
Infer skb->protocol from gso_type as the two must agree.
SKB_GSO_UDP can use both ipv4 and ipv6, so try both.
Exclude callers for which network header offset is not known.
Fixes: d5be7f632bad ("net: validate untrusted gso packets without csum offload") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Syzkaller again found a path to a kernel crash through bad gso input.
By building an excessively large packet to cause an skb field to wrap.
If VIRTIO_NET_HDR_F_NEEDS_CSUM was set this would have been dropped in
skb_partial_csum_set.
GSO packets that do not set checksum offload are suspicious and rare.
Most callers of virtio_net_hdr_to_skb already pass them to
skb_probe_transport_header.
Move that test forward, change it to detect parse failure and drop
packets on failure as those cleary are not one of the legitimate
VIRTIO_NET_HDR_GSO types.
Fixes: bfd5f4a3d605 ("packet: Add GSO/csum offload support.") Fixes: f43798c27684 ("tun: Allow GSO using virtio_net_hdr") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Previously, 'commit 372fddf70904 ("x86/mm: Introduce the 'no5lvl' kernel
parameter")' cleared X86_FEATURE_LA57 in boot_cpu_data, if Linux chooses
to not run in 5-level paging mode. Yet boot_cpu_data is queried by
do_cpuid_ent() as the host capability later when creating vcpus, and Qemu
will not be able to detect this feature and create VMs with LA57 feature.
As discussed earlier, VMs can still benefit from extended linear address
width, e.g. to enhance features like ASLR. So we would like to fix this,
by return the true hardware capability when Qemu queries.
Forwarded packets enter the tx path through ieee80211_add_pending_skb,
which skips the ieee80211_skb_resize call.
Fixes WARN_ON in ccmp_encrypt_skb and resulting packet loss.
Cc: stable@vger.kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drm_dp_mst_topology_mgr_suspend() is added into the new reboot
sequence, which disables the UP request at the beginning.
Therefore sideband messages are blocked.
[How]
Finish MST sideband message transaction before UP request is
suppressed.
Signed-off-by: Leo (Hanghong) Ma <hanghong.ma@amd.com> Reviewed-by: Roman Li <Roman.Li@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If we skipped all the connectors that were not part of a tile, we would
leave conn_seq=0 and conn_configured=0, convincing ourselves that we
had stagnated in our configuration attempts. Avoid this situation by
starting conn_seq=ALL_CONNECTORS, and repeating until we find no more
connectors to configure.
Fixes: 754a76591b12 ("drm/i915/fbdev: Stop repeating tile configuration on stagnation") Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190215123019.32283-1-chris@chris-wilson.co.uk Cc: <stable@vger.kernel.org> # v3.19+
(cherry picked from commit d9b308b1f8a1acc0c3279f443d4fe0f9f663252e) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On HP ProBook 4540s, if PM-runtime is enabled in the radeon driver
and the direct-complete optimization is used for the radeon device
during system-wide suspend, the system doesn't resume.
Preventing direct-complete from being used with the radeon device by
setting the DPM_FLAG_NEVER_SKIP driver flag for it makes the problem
go away, which indicates that direct-complete is not safe for the
radeon driver in general and should not be used with it (at least
for now).
This fixes a regression introduced by commit c62ec4610c40
("PM / core: Fix direct_complete handling for devices with no
callbacks") which allowed direct-complete to be applied to
devices without PM callbacks (again) which in turn unlocked
direct-complete for radeon on HP ProBook 4540s.
Fixes: c62ec4610c40 ("PM / core: Fix direct_complete handling for devices with no callbacks") Link: https://bugzilla.kernel.org/show_bug.cgi?id=201519 Reported-by: Ярослав Семченко <ukrkyi@gmail.com> Tested-by: Ярослав Семченко <ukrkyi@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When using ATPX to control dGPU power, the state is not retained
across suspend and resume cycles by default. This can probably
be loosened for Hybrid Graphics (_PR3) laptops where I think the
state is properly retained.
Fixes: c62ec4610c40 ("PM / core: Fix direct_complete handling for devices with no callbacks") Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The default value of ARCH_SLAB_MINALIGN in "include/linux/slab.h" is
"__alignof__(unsigned long long)" which for ARC unexpectedly turns out
to be 4. This is not a compiler bug, but as defined by ARC ABI [1]
Thus slab allocator would allocate a struct which is 32-bit aligned,
which is generally OK even if struct has long long members.
There was however potetial problem when it had any atomic64_t which
use LLOCKD/SCONDD instructions which are required by ISA to take
64-bit addresses. This is the problem we ran into
The fix is to make sure slab allocations are 64-bit aligned.
Do note that atomic64_t is __attribute__((aligned(8)) which means gcc
does generate 64-bit aligned references, relative to beginning of
container struct. However the issue is if the container itself is not
64-bit aligned, atomic64_t ends up unaligned which is what this patch
ensures.
Handle U-boot arguments paranoidly:
* don't allow to pass unknown tag.
* try to use external device tree blob only if corresponding tag
(TAG_DTB) is set.
* don't check uboot_tag if kernel build with no ARC_UBOOT_SUPPORT.
NOTE:
If U-boot args are invalid we skip them and try to use embedded device
tree blob. We can't panic on invalid U-boot args as we really pass
invalid args due to bug in U-boot code.
This happens if we don't provide external DTB to U-boot and
don't set 'bootargs' U-boot environment variable (which is default
case at least for HSDK board) In that case we will pass
{r0 = 1 (bootargs in r2); r1 = 0; r2 = 0;} to linux which is invalid.
While I'm at it refactor U-boot arguments handling code.
It is currently done in arc_init_IRQ() which might be too late
considering gcc 7.3.1 onwards (GNU 2018.03) generates unaligned
memory accesses by default
Commit 910cd32e552e ("parisc: Fix and enable seccomp filter support")
introduced a regression in ptrace-based syscall tampering: when tracer
changes syscall number to -1, the kernel fails to initialize %r28 with
-ENOSYS and subsequently fails to return the error code of the failed
syscall to userspace.
This erroneous behaviour could be observed with a simple strace syscall
fault injection command which is expected to print something like this:
$ strace -a0 -ewrite -einject=write:error=enospc echo hello
write(1, "hello\n", 6) = -1 ENOSPC (No space left on device) (INJECTED)
write(2, "echo: ", 6) = -1 ENOSPC (No space left on device) (INJECTED)
write(2, "write error", 11) = -1 ENOSPC (No space left on device) (INJECTED)
write(2, "\n", 1) = -1 ENOSPC (No space left on device) (INJECTED)
+++ exited with 1 +++
syzbot hit the 'BUG_ON(index_key->desc_len == 0);' in __key_link_begin()
called from construct_alloc_key() during sys_request_key(), because the
length of the key description was never calculated.
The problem is that we rely on ->desc_len being initialized by
search_process_keyrings(), specifically by search_nested_keyrings().
But, if the process isn't subscribed to any keyrings that never happens.
Fix it by always initializing keyring_index_key::desc_len as soon as the
description is set, like we already do in some places.
The following program reproduces the BUG_ON() when it's run as root and
no session keyring has been installed. If it doesn't work, try removing
pam_keyinit.so from /etc/pam.d/login and rebooting.
Reported-by: syzbot+ec24e95ea483de0a24da@syzkaller.appspotmail.com Fixes: b2a4df200d57 ("KEYS: Expand the capacity of a keyring") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: James Morris <james.morris@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Align the payload of "user" and "logon" keys so that users of the
keyrings service can access it as a struct that requires more than
2-byte alignment. fscrypt currently does this which results in the read
of fscrypt_key::size being misaligned as it needs 4-byte alignment.
Align to __alignof__(u64) rather than __alignof__(long) since in the
future it's conceivable that people would use structs beginning with
u64, which on some platforms would require more than 'long' alignment.
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi> Fixes: 2aa349f6e37c ("[PATCH] Keys: Export user-defined keyring operations") Fixes: 88bd6ccdcdd6 ("ext4 crypto: add encryption key management facilities") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.morris@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since .scsi_done() must only be called after scsi_queue_rq() has
finished, make sure that the SRP initiator driver does not call
.scsi_done() while scsi_queue_rq() is in progress. Although
invoking sg_reset -d while I/O is in progress works fine with kernel
v4.20 and before, that is not the case with kernel v5.0-rc1. This
patch avoids that the following crash is triggered with kernel
v5.0-rc1:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000138
CPU: 0 PID: 360 Comm: kworker/0:1H Tainted: G B 5.0.0-rc1-dbg+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Workqueue: kblockd blk_mq_run_work_fn
RIP: 0010:blk_mq_dispatch_rq_list+0x116/0xb10
Call Trace:
blk_mq_sched_dispatch_requests+0x2f7/0x300
__blk_mq_run_hw_queue+0xd6/0x180
blk_mq_run_work_fn+0x27/0x30
process_one_work+0x4f1/0xa20
worker_thread+0x67/0x5b0
kthread+0x1cf/0x1f0
ret_from_fork+0x24/0x30
Cc: <stable@vger.kernel.org> Fixes: 94a9174c630c ("IB/srp: reduce lock coverage of command completion") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently mlx5 driver creates xdp redirect hw queues unconditionally on
netdevice open, This is great until someone starts redirecting XDP traffic
via ndo_xdp_xmit on mlx5 device and changes the device configuration at
the same time, this might cause crashes, since the other device's napi
is not aware of the mlx5 state change (resources un-availability).
To fix this we must synchronize with other devices napi's on the system.
Added a new flag under mlx5e_priv to determine XDP TX resources are
available, set/clear it up when necessary and use synchronize_rcu()
when the flag is turned off, so other napi's are in-sync with it, before
we actually cleanup the hw resources.
The flag is tested prior to committing to transmit on mlx5e_xdp_xmit, and
it is sufficient to determine if it safe to transmit or not. The other
two internal flags (MLX5E_STATE_OPENED and MLX5E_SQ_STATE_ENABLED) become
unnecessary. Thus, they are removed from data path.
Fixes: 58b99ee3e3eb ("net/mlx5e: Add support for XDP_REDIRECT in device-out side") Reported-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
struct tcindex_filter_result contains two parts:
struct tcf_exts and struct tcf_result.
For the local variable 'cr', its exts part is never used but
initialized without being released properly on success path. So
just completely remove the exts part to fix this leak.
For the local variable 'new_filter_result', it is never properly
released if not used by 'r' on success path.
Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When tcindex_destroy() destroys all the filter results in
the perfect hash table, it invokes the walker to delete
each of them. However, results with class==0 are skipped
in either tcindex_walk() or tcindex_delete(), which causes
a memory leak reported by kmemleak.
This patch fixes it by skipping the walker and directly
deleting these filter results so we don't miss any filter
result.
As a result of this change, we have to initialize exts->net
properly in tcindex_alloc_perfect_hash(). For net-next, we
need to consider whether we should initialize ->net in
tcf_exts_init() instead, before that just directly test
CONFIG_NET_CLS_ACT=y.
Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tcindex_destroy() invokes tcindex_destroy_element() via
a walker to delete each filter result in its perfect hash
table, and tcindex_destroy_element() calls tcindex_delete()
which schedules tcf RCU works to do the final deletion work.
Unfortunately this races with the RCU callback
__tcindex_destroy(), which could lead to use-after-free as
reported by Adrian.
Fix this by migrating this RCU callback to tcf RCU work too,
as that workqueue is ordered, we will not have use-after-free.
Note, we don't need to hold netns refcnt because we don't call
tcf_exts_destroy() here.
Fixes: 27ce4f05e2ab ("net_sched: use tcf_queue_work() in tcindex filter") Reported-by: Adrian <bugs@abtelecom.ro> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If we disabled IPv6 from the kernel command line (ipv6.disable=1), we should
not call ip6_err_gen_icmpv6_unreach(). This:
ip link add sit1 type sit local 192.0.2.1 remote 192.0.2.2 ttl 1
ip link set sit1 up
ip addr add 198.51.100.1/24 dev sit1
ping 198.51.100.2
if IPv6 is disabled at boot time, will crash the kernel.
v2: there's no need to use in6_dev_get(), use __in6_dev_get() instead,
as we only need to check that idev exists and we are under
rcu_read_lock() (from netif_receive_skb_internal()).
Reported-by: Jianlin Shi <jishi@redhat.com> Fixes: ca15a078bd90 ("sit: generate icmpv6 error when receiving icmpv4 error") Cc: Oussama Ghorbel <ghorbel@pivasoftware.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we add a new GENEVE device with IPv6 remote, checking only for
IS_ENABLED(CONFIG_IPV6) is not enough as we may disable IPv6 in the
kernel command line (ipv6.disable=1), and calling rt6_lookup() would
cause a NULL pointer dereference.
v2:
- don't mix declarations and code (reported by Stefano Brivio, Eric Dumazet)
- there's no need to use in6_dev_get() as we only need to check that
idev exists (reported by David Ahern). This is under RTNL, so we can
simply use __in6_dev_get() instead (Stefano, Eric).
Reported-by: Jianlin Shi <jishi@redhat.com> Fixes: c40e89fd358e9 ("geneve: configure MTU based on a lower device") Cc: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Same story as before, these use struct ifreq and thus need
to be read with the shorter version to not cause faults.
Cc: stable@vger.kernel.org Fixes: f92d4fc95341 ("kill bond_ioctl()") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As reported by Robert O'Callahan in
https://bugzilla.kernel.org/show_bug.cgi?id=202273
reverting the previous changes in this area broke
the SIOCGIFNAME ioctl in compat again (I'd previously
fixed it after his previous report of breakage in
https://bugzilla.kernel.org/show_bug.cgi?id=199469).
This is obviously because I fixed SIOCGIFNAME more or
less by accident.
Fix it explicitly now by making it pass through the
restored compat translation code.
Cc: stable@vger.kernel.org Fixes: 4cf808e7ac32 ("kill dev_ifname32()") Reported-by: Robert O'Callahan <robert@ocallahan.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit bf4405737f9f ("kill dev_ifsioc()").
This wasn't really unused as implied by the original commit,
it still handles the copy to/from user differently, and the
commit thus caused issues such as
https://bugzilla.kernel.org/show_bug.cgi?id=199469
and
https://bugzilla.kernel.org/show_bug.cgi?id=202273
However, deviating from a strict revert, rename dev_ifsioc()
to compat_ifreq_ioctl() to be clearer as to its purpose and
add a comment.
Cc: stable@vger.kernel.org Fixes: bf4405737f9f ("kill dev_ifsioc()") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 1cebf8f143c2 ("socket: fix struct ifreq
size in compat ioctl"), it's a bugfix for another commit that
I'll revert next.
This is not a 'perfect' revert, I'm keeping some coding style
intact rather than revert to the state with indentation errors.
Cc: stable@vger.kernel.org Fixes: 1cebf8f143c2 ("socket: fix struct ifreq size in compat ioctl") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
as while we retrieve 'opt_inst' from team->option_inst_list, it could
be added to the local 'opt_inst_list' for multiple times. The
__team_option_inst_tmp_find() doesn't work, as the setter
team_mode_option_set() still calls team->ops.exit() which uses
->tmp_list too in __team_options_change_check().
Simplify the list operations by moving the 'opt_inst_list' and
team_nl_send_event_options_get() into the nla_for_each_nested() loop so
that it can be guranteed that we won't insert a same list entry for
multiple times. Therefore, __team_option_inst_tmp_find() can be removed
too.
Fixes: 4fb0534fb7bb ("team: avoid adding twice the same option to the event list") Fixes: 2fcdb2c9e659 ("team: allow to send multiple set events in one message") Reported-by: syzbot+4d4af685432dc0e56c91@syzkaller.appspotmail.com Reported-by: syzbot+68ee510075cf64260cc4@syzkaller.appspotmail.com Cc: Jiri Pirko <jiri@resnulli.us> Cc: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In sctp_stream_init(), after sctp_stream_outq_migrate() freed the
surplus streams' ext, but sctp_stream_alloc_out() returns -ENOMEM,
stream->outcnt will not be set to 'outcnt'.
With the bigger value on stream->outcnt, when closing the assoc and
freeing its streams, the ext of those surplus streams will be freed
again since those stream exts were not set to NULL after freeing in
sctp_stream_outq_migrate(). Then the invalid-free issue reported by
syzbot would be triggered.
We fix it by simply setting them to NULL after freeing.
Fixes: 5bbbbe32a431 ("sctp: introduce stream scheduler foundations") Reported-by: syzbot+58e480e7b28f2d890bfd@syzkaller.appspotmail.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It was caused by SKB_GSO_CB(skb)->csum_start not set in sctp_gso_segment.
sctp_gso_segment() calls skb_segment() with 'feature | NETIF_F_HW_CSUM',
which causes SKB_GSO_CB(skb)->csum_start not to be set in skb_segment().
For TCP/UDP, when feature supports HW_CSUM, CHECKSUM_PARTIAL will be set
and gso_reset_checksum will be called to set SKB_GSO_CB(skb)->csum_start.
So SCTP should do the same as TCP/UDP, to call gso_reset_checksum() when
computing checksum in sctp_gso_segment.
Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we probe a SFP module, we expect to be able to call the upstream
device's module_insert() function so that the upstream link can be
configured. However, when the upstream device is delayed, we currently
may end up probing the module before the upstream device is available,
and lose the module_insert() call.
Avoid this by holding off probing the module until the SFP bus is
properly connected to both the SFP socket driver and the upstream
driver.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When calculating rb->frames_per_block * req->tp_block_nr the result
can overflow. Check it for overflow without limiting the total buffer
size to UINT_MAX.
This change fixes support for packet ring buffers >= UINT_MAX.
Fixes: 8f8d28e4d6d8 ("net/packet: fix overflow in check for tp_frame_nr") Signed-off-by: Kal Conley <kal.conley@dectris.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In some case, we may use multiple pedit actions to modify packets.
The command shown as below: the last pedit action is effective.
$ tc filter add dev netdev_rep parent ffff: protocol ip prio 1 \
flower skip_sw ip_proto icmp dst_ip 3.3.3.3 \
action pedit ex munge ip dst set 192.168.1.100 pipe \
action pedit ex munge eth src set 00:00:00:00:00:01 pipe \
action pedit ex munge eth dst set 00:00:00:00:00:02 pipe \
action csum ip pipe \
action tunnel_key set src_ip 1.1.1.100 dst_ip 1.1.1.200 dst_port 4789 id 100 \
action mirred egress redirect dev vxlan0
To fix it, we add max_mod_hdr_actions to mlx5e_tc_flow_parse_attr struction,
max_mod_hdr_actions will store the max pedit action number we support and
num_mod_hdr_actions indicates how many pedit action we used, and store all
pedit action to mod_hdr_actions.
Fixes: d79b6df6b10a ("net/mlx5e: Add parsing of TC pedit actions to HW format") Cc: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When an ethernet frame is padded to meet the minimum ethernet frame
size, the padding octets are not covered by the hardware checksum.
Fortunately the padding octets are usually zero's, which don't affect
checksum. However, it is not guaranteed. For example, switches might
choose to make other use of these octets.
This repeatedly causes kernel hardware checksum fault.
Prior to the cited commit below, skb checksum was forced to be
CHECKSUM_NONE when padding is detected. After it, we need to keep
skb->csum updated. However, fixing up CHECKSUM_COMPLETE requires to
verify and parse IP headers, it does not worth the effort as the packets
are so small that CHECKSUM_COMPLETE has no significant advantage.
Future work: when reporting checksum complete is not an option for
IP non-TCP/UDP packets, we can actually fallback to report checksum
unnecessary, by looking at cqe IPOK bit.
Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends") Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix race condition between ena_update_on_link_change() and
ena_restore_device().
This race can occur if link notification arrives while the driver
is performing a reset sequence. In this case link can be set up,
enabling the device, before it is fully restored. If packets are
sent at this time, the driver might access uninitialized data
structures, causing kernel crash.
Move the clearing of ENA_FLAG_ONGOING_RESET and netif_carrier_on()
after ena_up() to ensure the device is ready when link is set up.
Fixes: d18e4f683445 ("net: ena: fix race condition between device reset and link up setup") Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
genlmsg_reply can fail, so propagate its return code
Fixes: 915d7e5e593 ("ipv6: sr: add code base for control plane support of SR-IPv6") Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Field idiag_ext in struct inet_diag_req_v2 used as bitmap of requested
extensions has only 8 bits. Thus extensions starting from DCTCPINFO
cannot be requested directly. Some of them included into response
unconditionally or hook into some of lower 8 bits.
Extension INET_DIAG_CLASS_ID has not way to request from the beginning.
This patch bundle it with INET_DIAG_TCLASS (ipv6 tos), fixes space
reservation, and documents behavior for other extensions.
Also this patch adds fallback to reporting socket priority. This filed
is more widely used for traffic classification because ipv4 sockets
automatically maps TOS to priority and default qdisc pfifo_fast knows
about that. But priority could be changed via setsockopt SO_PRIORITY so
INET_DIAG_TOS isn't enough for predicting class.
Also cgroup2 obsoletes net_cls classid (it always zero), but we cannot
reuse this field for reporting cgroup2 id because it is 64-bit (ino+gen).
So, after this patch INET_DIAG_CLASS_ID will report socket priority
for most common setup when net_cls isn't set and/or cgroup2 in use.
Fixes: 0888e372c37f ("net: inet: diag: expose sockets cgroup classid") Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes: c6c8fea29769 ("net: Add batman-adv meshing protocol") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Marek Lindner <mareklindner@neomailbox.ch>
Cc: Simon Wunderlich <sw@simonwunderlich.de>
Cc: Antonio Quartulli <a@unstable.cc> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A recent commit in Clang expanded the -Wstring-plus-int warning, showing
some odd behavior in this file.
drivers/isdn/hardware/avm/b1.c:426:30: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int]
cinfo->version[j] = "\0\0" + 1;
~~~~~~~^~~
drivers/isdn/hardware/avm/b1.c:426:30: note: use array indexing to silence this warning
cinfo->version[j] = "\0\0" + 1;
^
& [ ]
1 warning generated.
This is equivalent to just "\0". Nick pointed out that it is smarter to
use "" instead of "\0" because "" is used elsewhere in the kernel and
can be deduplicated at the linking stage.
Link: https://github.com/ClangBuiltLinux/linux/issues/309 Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Test that externally learned FDB entries can roam, but not age out.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The driver currently treats static FDB entries as both static and
sticky. This is incorrect and prevents such entries from being roamed to
a different port via learning.
Fix this by configuring static entries with ageing disabled and roaming
enabled.
In net-next we can add proper support for the newly introduced 'sticky'
flag.
Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Alexander Petrovskiy <alexpe@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Externally learned entries can be added by a user or by a switch driver
that is notifying the bridge driver about entries that were learned in
hardware.
In the first case, the entries are not marked with the 'added_by_user'
flag, which causes switch drivers to ignore them and not offload them.
The 'added_by_user' flag can be set on externally learned FDB entries
based on the 'swdev_notify' parameter in br_fdb_external_learn_add(),
which effectively means if the created / updated FDB entry was added by
a user or not.
Fixes: 816a3bed9549 ("switchdev: Add fdb.added_by_user to switchdev notifications") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Alexander Petrovskiy <alexpe@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Cc: Roopa Prabhu <roopa@cumulusnetworks.com> Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Cc: bridge@lists.linux-foundation.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
As txq_trans_update() only updates trans_start when the lock is held,
trans_start does not get updated if NETIF_F_LLTX is declared.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
In sock_setsockopt() (net/core/sock.h), when SO_MARK option is used
to change sk_mark, sk_dst_reset(sk) is called. The same should be
done in bpf_setsockopt().
Fixes: 8c4b4c7e9ff0 ("bpf: Add setsockopt helper function to bpf") Reported-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Peter Oskolkov <posk@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Reviewed-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Shifting the 1 by exp by an int can lead to sign-extension overlow when
exp is 31 since 1 is an signed int and sign-extending this result to an
unsigned long long will set the upper 32 bits. Fix this by shifting an
unsigned long.
Detected by cppcheck:
(warning) Shifting signed 32-bit value by 31 bits is undefined behaviour
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
While running test_progs in a loop I found out that I'm sometimes hitting
"Didn't find expected build ID from the map" error.
Looking at stack_map_get_build_id_offset() it seems that it is racy (by
design) and can sometimes return BPF_STACK_BUILD_ID_IP (i.e. can't trylock
current->mm->mmap_sem).
Let's retry this test a single time.
Fixes: 13790d1cc72c ("bpf: add selftest for stackmap with build_id in NMI context") Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
When returning BPF_STACK_BUILD_ID_IP from stack_map_get_build_id_offset,
make sure that build_id field is empty. Since we are using percpu
free list, there is a possibility that we might reuse some previous
bpf_stack_build_id with non-zero build_id.
Fixes: 615755a77b24 ("bpf: extend stackmap to save binary_build_id+offset instead of address") Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Build-id length is not fixed to 20, it can be (`man ld` /--build-id):
* 128-bit (uuid)
* 160-bit (sha1)
* any length specified in ld --build-id=0xhexstring
To fix the issue of missing BPF_STACK_BUILD_ID_VALID for shorter build-ids,
assume that build-id is somewhere in the range of 1 .. 20.
Set the remaining bytes to zero.
v2:
* don't introduce new "len = min(BPF_BUILD_ID_SIZE, nhdr->n_descsz)",
we already know that nhdr->n_descsz <= BPF_BUILD_ID_SIZE if we enter
this 'if' condition
Fixes: 615755a77b24 ("bpf: extend stackmap to save binary_build_id+offset instead of address") Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fix the refcounting of the authentication keys in the file locking code.
The vnode->lock_key member points to a key on which it expects to be
holding a ref, but it isn't always given an extra ref, however.
A cb_interest record is not necessarily attached to the vnode on entry to
afs_validate(), which can cause an oops when we try to bring the vnode's
cb_s_break up to date in the default case (ie. no current callback promise
and the vnode has not been deleted).
Fix this by simply removing the line, as vnode->cb_s_break will be set when
needed by afs_register_server_cb_interest() when we next get a callback
promise from RPC call.
In iproute2 commit 90c5c969f0b9 ("fix print_0xhex on 32 bit"), the format
specifier for the ife type changed from 0x%X to %#llX, causing systematic
failures in the following TDC test cases:
7682 - Create valid ife encode action with mark and pass control
ef47 - Create valid ife encode action with mark and pipe control
df43 - Create valid ife encode action with mark and continue control
e4cf - Create valid ife encode action with mark and drop control
ccba - Create valid ife encode action with mark and reclassify control
a1cf - Create valid ife encode action with mark and jump control
cb3d - Create valid ife encode action with mark value at 32-bit maximum
95ed - Create valid ife encode action with prio and pass control
aa17 - Create valid ife encode action with prio and pipe control
74c7 - Create valid ife encode action with prio and continue control
7a97 - Create valid ife encode action with prio and drop control
f66b - Create valid ife encode action with prio and reclassify control
3056 - Create valid ife encode action with prio and jump control
7dd3 - Create valid ife encode action with prio value at 32-bit maximum
05bb - Create valid ife encode action with tcindex and pass control
ce65 - Create valid ife encode action with tcindex and pipe control
09cd - Create valid ife encode action with tcindex and continue control
8eb5 - Create valid ife encode action with tcindex and continue control
451a - Create valid ife encode action with tcindex and drop control
d76c - Create valid ife encode action with tcindex and reclassify control
e731 - Create valid ife encode action with tcindex and jump control
b7b8 - Create valid ife encode action with tcindex value at 16-bit maximum
2a9c - Create valid ife encode action with mac src parameter
cf5c - Create valid ife encode action with mac dst parameter
2353 - Create valid ife encode action with mac src and mac dst parameters
552c - Create valid ife encode action with mark and type parameters
0421 - Create valid ife encode action with prio and type parameters
4017 - Create valid ife encode action with tcindex and type parameters
fac3 - Create valid ife encode action with index at 32-bit maximnum
7c25 - Create valid ife decode action with pass control
dccb - Create valid ife decode action with pipe control
7bb9 - Create valid ife decode action with continue control
d9ad - Create valid ife decode action with drop control
219f - Create valid ife decode action with reclassify control
8f44 - Create valid ife decode action with jump control
b330 - Create ife encode action with cookie
Change 'matchPattern' values, allowing '0' and '0x0' if ife type is equal
to 0, and accepting both '0x' and '0X' otherwise, to let these tests pass
both with old and new tc binaries.
While at it, fix a small typo in test case fac3 ('maximnum'->'maximum').
Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
After commit 1c25324caf82 ("net/sched: act_tunnel_key: Don't dump dst port
if it wasn't set"), act_tunnel_key doesn't dump anymore the destination
port, unless it was explicitly configured. This caused systematic failures
in the following TDC test case:
7a88 - Add tunnel_key action with cookie parameter
Avoid matching zero values of TCA_TUNNEL_KEY_ENC_DST_PORT to let the test
pass again.
Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
After merge of commit 80ef0f22ceda ("net/sched: act_tunnel_key: Allow
key-less tunnels"), act_tunnel_key does not reject anymore requests to
install 'set' rules where the key id is missing. Therefore, drop the
following TDC testcase:
ba4e - Add tunnel_key set action with missing mandatory id parameter
because it's going to become a systematic fail as soon as userspace
iproute2 will start supporting key-less tunnels.
Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
static checker warning:
drivers/xen/pvcalls-front.c:373 alloc_active_ring()
error: we previously assumed 'map->active.ring' could be null
(see line 357)
The device node iterators perform an of_node_get on each
iteration, so a jump out of the loop requires an of_node_put.
Remote and port also have augmented reference counts, so drop them
on each iteration and at the end of the function, respectively.
Remote is only used for the address it contains, not for the
contents of that address, so the reference count can be dropped
immediately.
The semantic patch that fixes the first part of this problem is
as follows (http://coccinelle.lip6.fr):
// <smpl>
@@
expression root,e;
local idexpression child;
iterator name for_each_child_of_node;
@@
for_each_available_child_of_node(root, child) {
... when != of_node_put(child)
when != e = child
+ of_node_put(child);
? break;
...
}
... when != child
// </smpl>
We've failed to copy and process vhost_iotlb_msg so let userspace at
least know about it. For instance before these patch the code below runs
without any error:
int main()
{
struct vhost_msg msg;
struct iovec iov;
int fd;
if (writev(fd, &iov,1) == -1) {
perror("writev");
return 1;
}
return 0;
}
Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Charlene Liu <charlene.liu@amd.com> Reviewed-by: Chris Park <Chris.Park@amd.com> Acked-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Add an of_node_put when the result of of_graph_get_remote_port_parent is
not available.
An of_node_put is also needed when meson_probe_remote completes. This was
present at the recursive call, but not in the call from meson_drv_probe.
The semantic match that finds this problem is as follows
(http://coccinelle.lip6.fr):
// <smpl>
@r exists@
local idexpression e;
expression x;
@@
e = of_graph_get_remote_port_parent(...);
... when != x = e
when != true e == NULL
when != of_node_put(e)
when != of_fwnode_handle(e)
(
return e;
|
*return ...;
)
// </smpl>
Commit e657fcc clears cpu capability bit instead of using fake cpuid
value, the EXTD should always be off for PV guest without depending
on cpuid value. So remove the cpuid check in xen_read_msr_safe() to
always clear the X2APIC_ENABLE bit.
Signed-off-by: Talons Lee <xin.li@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
This patch uses nfct_help() to detect whether an established connection
needs conntrack helper instead of using test_bit(IPS_HELPER_BIT,
&ct->status).
The reason is that IPS_HELPER_BIT is only set when using explicit CT
target.
However, in the case that a device enables conntrack helper via command
"echo 1 > /proc/sys/net/netfilter/nf_conntrack_helper", the status of
IPS_HELPER_BIT will not present any change, and consequently it loses
the checking ability in the context.
Signed-off-by: Henry Yen <henry.yen@mediatek.com> Reviewed-by: Ryder Lee <ryder.lee@mediatek.com> Tested-by: John Crispin <john@phrozen.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In case of ->set_param() and ->bind_conn() cxgb4i driver does not wait for
cmd completion, this can create race conditions, to avoid this add
wait_for_completion().
Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Albeit we no longer rely on those hard-coded descriptor sizes, we still use
them as our defaults, so better get it right. While adding its sysfs
entries, we forgot to update the geometry descriptor size. It is 0x48
according to UFS2.1, and wasn't changed in UFS3.0.
[mkp: typo]
Fixes: c720c091222e (scsi: ufs: sysfs: geometry descriptor) Signed-off-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When the driver finds invalid destination MAC for the first un-reachable
target, and before completes the PATH_REQ operation, set new ep_state to
OFFLDCONN_NONE so that as part of driver ep_poll mechanism, the upper
open-iscsi layer is notified to complete the login process on the first
un-reachable target and thus proceed login to other reachable targets.
Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
hba->is_sys_suspended is set after successful system suspend but
not clear after successful system resume.
According to current behavior, hba->is_sys_suspended will not be set if
host is runtime-suspended but not system-suspended. Thus we shall aligh the
same policy: clear this flag even if host remains runtime-suspended after
ufshcd_system_resume is successfully returned.
Simply fix this flag to correct host status logs.
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently there is one cmd timeout timer and one qfull timer for each udev,
and whenever any new command is coming in we will update the cmd timer or
qfull timer. For some corner cases the timers are always working only for
the ringbuffer's and full queue's newest cmd. That's to say the timer won't
be fired even if one cmd has been stuck for a very long time and the
deadline is reached.
This fix will keep the cmd/qfull timers to be pended for the oldest cmd in
ringbuffer and full queue, and will update them with the next cmd's
deadline only when the old cmd's deadline is reached or removed from the
ringbuffer and full queue.
Signed-off-by: Xiubo Li <xiubli@redhat.com> Acked-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The functions isdn_tty_tiocmset() and isdn_tty_set_termios() may be
concurrently executed.
isdn_tty_tiocmset
isdn_tty_modem_hup
line 719: kfree(info->dtmf_state);
line 721: kfree(info->silence_state);
line 723: kfree(info->adpcms);
line 725: kfree(info->adpcmr);
isdn_tty_set_termios
isdn_tty_modem_hup
line 719: kfree(info->dtmf_state);
line 721: kfree(info->silence_state);
line 723: kfree(info->adpcms);
line 725: kfree(info->adpcmr);
Thus, some concurrency double-free bugs may occur.
These possible bugs are found by a static tool written by myself and
my manual code review.
To fix these possible bugs, the mutex lock "modem_info_mutex" used in
isdn_tty_tiocmset() is added in isdn_tty_set_termios().
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently, TX is given a budget which is consumed by stmmac_tx_clean()
and stmmac_rx() is given the remaining non-consumed budget.
This is wrong and in case we are sending a large number of packets this
can starve RX because remaining budget will be low.
Let's give always the same budget for RX and TX clean.
While at it, check if we missed any interrupts while we were in NAPI
callback by looking at DMA interrupt status.
Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
RX Watchdog can be disabled by platform definitions but currently we are
initializing the descriptors before checking if Watchdog must be
disabled or not.
Fix this by checking earlier if user wants Watchdog disabled or not.
Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Check if CBS is currently supported before trying to configure it in HW.
Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
In DMA interrupt handler we were clearing all interrupts status, even
the ones that were not active. Fix this and only clear the active
interrupts.
Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Since commit b7d0f08e9129, the enable / disable of PCI device is not
managed which will result in IO regions not being automatically unmapped.
As regions continue mapped it is currently not possible to remove and
then probe again the PCI module of stmmac.
Fix this by manually unmapping regions on remove callback.
Changes from v1:
- Fix build error
Cc: Joao Pinto <jpinto@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Fixes: b7d0f08e9129 ("net: stmmac: Fix WoL for PCI-based setups") Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Reported-by: Jane Chu <jane.chu@oracle.com> Fixes: 23222f8f8dce ("acpi, nfit: Add function to look up nvdimm...") Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 8c8c10b90d88 ("powerpc/8xx: fix handling of early NULL pointer
dereference") moved the loading of r6 earlier in the code. As some
functions are called inbetween, r6 needs to be loaded again with the
address of swapper_pg_dir in order to set PTE pointers for
the Abatron BDI.
Fixes: 8c8c10b90d88 ("powerpc/8xx: fix handling of early NULL pointer dereference") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
As part of audit process to update drivers to use rdma_restrack_add()
ensure that QP objects is cleared before access. Such change fixes the
crash observed with uninitialized non zero sgid attr accessed by
ib_destroy_qp().
Reported-by: Alexander Murashkin <AlexanderMurashkin@msn.com> Tested-by: Alexander Murashkin <AlexanderMurashkin@msn.com> Fixes: 1a1f460ff151 ("RDMA: Hold the sgid_attr inside the struct ib_ah/qp") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the forward chain, the iif is changed from slave device to master vrf
device. Thus, flow offload does not find a match on the lower slave
device.
This patch uses the cached route, ie. dst->dev, to update the iif and
oif fields in the flow entry.
After this patch, the following example works fine:
# ip addr add dev eth0 1.1.1.1/24
# ip addr add dev eth1 10.0.0.1/24
# ip link add user1 type vrf table 1
# ip l set user1 up
# ip l set dev eth0 master user1
# ip l set dev eth1 master user1
# nft add table firewall
# nft add flowtable f fb1 { hook ingress priority 0 \; devices = { eth0, eth1 } \; }
# nft add chain f ftb-all {type filter hook forward priority 0 \; policy accept \; }
# nft add rule f ftb-all ct zone 1 ip protocol tcp flow offload @fb1
# nft add rule f ftb-all ct zone 1 ip protocol udp flow offload @fb1
As Naresh reported, test_stacktrace_build_id() causes panic on i386 and
arm32 systems. This is caused by page_address() returns NULL in certain
cases.
This patch fixes this error by using kmap_atomic/kunmap_atomic instead
of page_address.
Fixes: 615755a77b24 (" bpf: extend stackmap to save binary_build_id+offset instead of address") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>