A packet received on a trunk will have bit 2 set in Forward DSA tagged
frame. Bit 1 can be either 0 or 1 and is otherwise undefined and bit 0
indicates the frame CFI. Masking with 7 thus results in frames as
being identified as being from a trunk when in fact they are not. Fix
the mask to just look at bit 2.
Fixes: 5b60dadb71db ("net: dsa: tag_dsa: Support reception of packets from LAG devices") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On recent Intel systems the HPET stops working when the system reaches PC10
idle state.
The approach of adding PCI ids to the early quirks to disable HPET on
these systems is a whack a mole game which makes no sense.
Check for PC10 instead and force disable HPET if supported. The check is
overbroad as it does not take ACPI, intel_idle enablement and command
line parameters into account. That's fine as long as there is at least
PMTIMER available to calibrate the TSC frequency. The decision can be
overruled by adding "hpet=force" on the kernel command line.
Remove the related early PCI quirks for affected Ice Cake and Coffin Lake
systems as they are not longer required. That should also cover all
other systems, i.e. Tiger Rag and newer generations, which are most
likely affected by this as well.
Fixes: Yet another hardware trainwreck Reported-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Rafael J. Wysocki <rafael@kernel.org> Cc: stable@vger.kernel.org Cc: Kai-Heng Feng <kai.heng.feng@canonical.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
changed the warning to only fire if the CPU supports SMAP.
However, the warning can still trigger on a machine that supports SMAP
but where it's disabled in the kernel config and when running the
syscall_nt selftest, for example:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 49 at irqentry_enter_from_user_mode
CPU: 0 PID: 49 Comm: init Tainted: G T 5.15.0-rc4+ #98 e6202628ee053b4f310759978284bd8bb0ce6905
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
RIP: 0010:irqentry_enter_from_user_mode
...
Call Trace:
? irqentry_enter
? exc_general_protection
? asm_exc_general_protection
? asm_exc_general_protectio
IS_ENABLED(CONFIG_X86_SMAP) could be added to the warning condition, but
even this would not be enough in case SMAP is disabled at boot time with
the "nosmap" parameter.
To be consistent with "nosmap" behaviour, clear X86_FEATURE_SMAP when
!CONFIG_X86_SMAP.
init[1] bad frame in sigreturn frame:(ptrval) ip:b7c9fbe6 sp:bf933310 orax:ffffffff \
in libc-2.33.so[b7bed000+156000]
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
CPU: 0 PID: 1 Comm: init Tainted: G W 5.14.9 #1
Hardware name: Hewlett-Packard HP PC/HP Board, BIOS JD.00.06 12/06/2001
Call Trace:
dump_stack_lvl
dump_stack
panic
do_exit.cold
do_group_exit
get_signal
arch_do_signal_or_restart
? force_sig_info_to_task
? force_sig
exit_to_user_mode_prepare
syscall_exit_to_user_mode
do_int80_syscall_32
entry_INT80_32
on an old 32-bit Intel CPU:
vendor_id : GenuineIntel
cpu family : 6
model : 6
model name : Celeron (Mendocino)
stepping : 5
microcode : 0x3
Ser bisected the problem to the commit in Fixes.
tglx suggested reverting the rejection of invalid MXCSR values which
this commit introduced and replacing it with what the old code did -
simply masking them out to zero.
so restore the original behavior only for 32-bit kernels where you have
ancient machines with buggy hardware. For 32-bit programs on 64-bit
kernels, user space which supplies wrong MXCSR values is considered
malicious so fail the sigframe restoration there.
Fixes: 6f9866a166cd ("x86/fpu/signal: Let xrstor handle the features to init") Reported-by: Ser Olmy <ser.olmy@protonmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Tested-by: Ser Olmy <ser.olmy@protonmail.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/YVtA67jImg3KlBTw@zn.tnic Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After returning from a VMGEXIT NAE event, SW_EXITINFO1[31:0] is checked
for a value of 1, which indicates an error and that SW_EXITINFO2
contains exception information. However, future versions of the GHCB
specification may define new values for SW_EXITINFO1[31:0], so really
any non-zero value should be treated as an error.
On pseries LPAR when an empty slot is assigned to partition OR in single
LPAR mode, kdump kernel crashes during issuing PHB reset.
In the kdump scenario, we traverse all PHBs and issue reset using the
pe_config_addr of the first child device present under each PHB. However
the code assumes that none of the PHB slots can be empty and uses
list_first_entry() to get the first child device under the PHB. Since
list_first_entry() expects the list to be non-empty, it returns an
invalid pci_dn entry and ends up accessing NULL phb pointer under
pci_dn->phb causing kdump kernel crash.
This patch fixes the below kdump kernel crash by skipping empty slots:
audit: initializing netlink subsys (disabled)
thermal_sys: Registered thermal governor 'fair_share'
thermal_sys: Registered thermal governor 'step_wise'
cpuidle: using governor menu
pstore: Registered nvram as persistent store backend
Issue PHB reset ...
audit: type=2000 audit(1631267818.000:1): state=initialized audit_enabled=0 res=1
BUG: Kernel NULL pointer dereference on read at 0x00000268
Faulting instruction address: 0xc000000008101fb0
Oops: Kernel access of bad area, sig: 7 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
Modules linked in:
CPU: 7 PID: 1 Comm: swapper/7 Not tainted 5.14.0 #1
NIP: c000000008101fb0 LR: c000000009284ccc CTR: c000000008029d70
REGS: c00000001161b840 TRAP: 0300 Not tainted (5.14.0)
MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 28000224 XER: 20040002
CFAR: c000000008101f0c DAR: 0000000000000268 DSISR: 00080000 IRQMASK: 0
...
NIP pseries_eeh_get_pe_config_addr+0x100/0x1b0
LR __machine_initcall_pseries_eeh_pseries_init+0x2cc/0x350
Call Trace:
0xc00000001161bb80 (unreliable)
__machine_initcall_pseries_eeh_pseries_init+0x2cc/0x350
do_one_initcall+0x60/0x2d0
kernel_init_freeable+0x350/0x3f8
kernel_init+0x3c/0x17c
ret_from_kernel_thread+0x5c/0x64
Fixes: 5a090f7c363fd ("powerpc/pseries: PCIE PHB reset") Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
[mpe: Tweak wording and trim oops] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/163215558252.413351.8600189949820258982.stgit@jupiter Signed-off-by: Sasha Levin <sashal@kernel.org>
At interrupt exit, kuap_kernel_restore() calls kuap_unlock() with the
value contained in regs->kuap. However, when regs->kuap contains
0xffffffff it means that KUAP was not unlocked so calling kuap_unlock()
is unrelevant and results in jeopardising the contents of kernel space
segment registers.
So check that regs->kuap doesn't contain KUAP_NONE before calling
kuap_unlock(). In the meantime it also means that if KUAP has not
been correcly locked back at interrupt exit, it must be locked
before continuing. This is done by checking the content of
current->thread.kuap which was returned by kuap_get_and_assert_locked()
The machine check handler is not considered NMI on 64s. The early
handler is the true NMI handler, and then it schedules the
machine_check_exception handler to run when interrupts are enabled.
This works fine except the case of an unrecoverable MCE, where the true
NMI is taken when MSR[RI] is clear, it can not recover, so it calls
machine_check_exception directly so something might be done about it.
Calling an async handler from NMI context can result in irq state and
other things getting corrupted. This can also trigger the BUG at
arch/powerpc/include/asm/interrupt.h:168
BUG_ON(!arch_irq_disabled_regs(regs) && !(regs->msr & MSR_EE));
Fix this by making an _async version of the handler which is called
in the normal case, and a NMI version that is called for unrecoverable
interrupts.
Fixes: 2b43dd7653cc ("powerpc/64: enable MSR[EE] in irq replay pt_regs") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211004145642.1331214-6-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
_exception can be called by machine check handlers when the MCE hits
user code (e.g., pseries and powernv). This will enable local irqs
because, which is a dicey thing to do in NMI or hard irq context.
This seemed to worked out okay because a userspace MCE can basically be
treated like a synchronous interrupt (after async / imprecise MCEs are
filtered out). Since NMI and hard irq handlers have started growing
nmi_enter / irq_enter, and more irq state sanity checks, this has
started to cause problems (or at least trigger warnings).
The Fixes tag to the commit which introduced this rather than try to
work out exactly which commit was the first that could possibly cause a
problem because that may be difficult to prove.
Fixes: 9f2f79e3a3c1 ("powerpc: Disable interrupts in 64-bit kernel FP and vector faults") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211004145642.1331214-3-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
Emergency stack path was jumping into a 3: label inside the
__GEN_COMMON_BODY macro for the normal path after it had finished,
rather than jumping over it. By a small miracle this is the correct
place to build up a new interrupt frame with the existing stack
pointer, so things basically worked okay with an added weird looking
700 trap frame on top (which had the wrong ->nip so it didn't decode
bug messages either).
Fix this by avoiding using numeric labels when jumping over non-trivial
macros.
Commit e31694e0a7a7 ("objtool: Don't make .altinstructions writable")
aligned objtool-created and kernel-created .altinstructions section
flags, but there remains a minor discrepency in their use of a section
entry size: objtool sets one while the kernel build does not.
While sh_entsize of sizeof(struct alt_instr) seems intuitive, this small
deviation can cause failures with external tooling (kpatch-build).
Fix this by creating new .altinstructions sections with sh_entsize of 0
and then later updating sec->sh_size as alternatives are added to the
section. An added benefit is avoiding the data descriptor and buffer
created by elf_create_section(), but previously unused by
elf_add_alternative().
Fixes: 9bc0bb50727c ("objtool/x86: Rewrite retpoline thunk calls") Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com> Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/20210822225037.54620-2-joe.lawrence@redhat.com Cc: Andy Lavr <andy.lavr@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: x86@kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
Converting a special section's relocation reference to a symbol is
straightforward. No need for objtool to complain that it doesn't know
how to handle it. Just handle it.
This fixes the following warning:
arch/x86/kvm/emulate.o: warning: objtool: __ex_table+0x4: don't know how to handle reloc symbol type: kvm_fastop_exception
Fixes: 24ff65257375 ("objtool: Teach get_alt_entry() about more relocation types") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/feadbc3dfb3440d973580fad8d3db873cbfe1694.1633367242.git.jpoimboe@redhat.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: x86@kernel.org Cc: Miroslav Benes <mbenes@suse.cz> Cc: linux-kernel@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit d39df158518c ("scsi: iscsi: Have abort handler get ref to conn")
added iscsi_get_conn()/iscsi_put_conn() calls during abort handling but
then also changed the handling of the case where we detect an already
completed task where we now end up doing a goto to the common put/cleanup
code. This results in a iscsi_task use after free, because the common
cleanup code will do a put on the iscsi_task.
This reverts the goto and moves the iscsi_get_conn() to after we've checked
if the iscsi_task is valid.
Link: https://lore.kernel.org/r/20211004210608.9962-1-michael.christie@oracle.com Fixes: d39df158518c ("scsi: iscsi: Have abort handler get ref to conn") Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
As far as I can tell this should be enabled on rv32 as well, I'm not
sure why it's rv64-only. checksyscalls is complaining about our lack of
clone3() on rv32.
Change setting for 400KHz frequency support by more accurate value.
Fixes: 66b0c2846ba8 ("i2c: mlxcpld: Add support for I2C bus frequency setting") Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Value for getting frequency capability wrongly has been taken from
register offset instead of register value.
Fixes: 66b0c2846ba8 ("i2c: mlxcpld: Add support for I2C bus frequency setting") Signed-off-by: Vadim Pasternak <vadimp@nvidia.com> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
riscv architectures relying on mmap_sem for write in their
arch_setup_additional_pages. If the waiting task gets killed by the oom
killer it would block oom_reaper from asynchronous address space reclaim
and reduce the chances of timely OOM resolving. Wait for the lock in
the killable mode and return with EINTR if the task got killed while
waiting.
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Fixes: 76d2a0493a17 ("RISC-V: Init and Halt Code") Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
As commit 601255ae3c98 ("arm64: vdso: move data page before code pages"), the
same issue exists on riscv, testcase is shown below, make sure that vdso.so is
bigger than page size,
without this patch, test result : tv_sec: 0, tv_nsec: 0
with this patch, test result : tv_sec: 1629271537, tv_nsec: 748000000
Move the vdso data page in front of the VDSO area to fix the issue.
Fixes: ad5d1122b82fb ("riscv: use vDSO common flow to reduce the latency of the time-related functions") Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The asm/vdso.h will be included in vdso.lds.S in the next patch, the
following cleanup is needed to avoid syntax error:
1.the declaration of sys_riscv_flush_icache() is moved into asm/syscall.h.
2.the definition of struct vdso_data is moved into kernel/vdso.c.
2.the definition of VDSO_SYMBOL is placed under "#ifndef __ASSEMBLY__".
The current implementation of the `__rt_sigaction` reference computed an
absolute offset relative to the mapped base of the VDSO. While this can
be handled in the medlow model, the medany model cannot handle this as
it is meant to be position independent. The current implementation
relied on the BFD linker relaxing the PC-relative relocation into an
absolute relocation as it was a near-zero address allowing it to be
referenced relative to `zero`.
We now extract the offsets and create a generated header allowing the
build with LLVM and lld to succeed as we no longer depend on the linker
rewriting address references near zero. This change was largely
modelled after the ARM64 target which does something similar.
In the commit be5ce0e97cc7 ("i2c: mediatek: Add i2c ac-timing adjust
support"), we miss setting OFFSET_EXT_CONF register if
i2c->dev_comp->timing_adjust is false, now add it back.
acpi_i2c_find_adapter_by_handle() calls bus_find_device() which takes a
reference on the adapter which is never released which will result in a
reference count leak and render the adapter unremovable. Make sure to
put the adapter after creating the client in the same manner that we do
for OF.
Fixes: 525e6fabeae2 ("i2c / ACPI: add support for ACPI reconfigure notifications") Signed-off-by: Jamie Iles <quic_jiles@quicinc.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
[wsa: fixed title] Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
According to dma-api.rst, the dma_get_required_mask() helper should return
"the mask that the platform requires to operate efficiently". Which in
the case of PPC64 means the bypass mask and not a mask from an IOMMU table
which is shorter and slower to use due to map/unmap operations (especially
expensive on "pseries").
However the existing implementation ignores the possibility of bypassing
and returns the IOMMU table mask on the pseries platform which makes some
drivers (mpt3sas is one example) choose 32bit DMA even though bypass is
supported. The powernv platform sort of handles it by having a bigger
default window with a mask >=40 but it only works as drivers choose
63/64bit if the required mask is >32 which is rather pointless.
This reintroduces the bypass capability check to let drivers make
a better choice of the DMA mask.
Fixes: f1565c24b596 ("powerpc: use the generic dma_ops_bypass mode") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210930034454.95794-1-aik@ozlabs.ru Signed-off-by: Sasha Levin <sashal@kernel.org>
The commit 6da5b0f027a8 ("net: ensure unbound datagram socket to be
chosen when not in a VRF") modified compute_score() so that a device
match is always made, not just in the case of an l3mdev skb, then
increments the score also for unbound sockets. This ensures that
sockets bound to an l3mdev are never selected when not in a VRF.
But as unbound and bound sockets are now scored equally, this results
in the last opened socket being selected if there are matches in the
default VRF for an unbound socket and a socket bound to a dev that is
not an l3mdev. However, handling prior to this commit was to always
select the bound socket in this case. Reinstate this handling by
incrementing the score only for bound sockets. The required isolation
due to choosing between an unbound socket and a socket bound to an
l3mdev remains in place due to the device match always being made.
The same approach is taken for compute_score() for stream sockets.
Fixes: 6da5b0f027a8 ("net: ensure unbound datagram socket to be chosen when not in a VRF") Fixes: e78190581aff ("net: ensure unbound stream socket to be chosen when not in a VRF") Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/cf0a8523-b362-1edf-ee78-eef63cbbb428@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The crit_lock mutex could be unlocked twice as reported here
https://lists.osuosl.org/pipermail/intel-wired-lan/Week-of-Mon-20210823/025525.html
Remove the superfluous unlock. Technically the problem was already
present before 5ac49f3c2702 as that commit only replaced the locking
primitive, but no functional change.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: 5ac49f3c2702 ("iavf: use mutexes for locking of critical sections") Fixes: bac8486116b0 ("iavf: Refactor the watchdog state machine") Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When VSI set up failed in i40e_probe() as part of PF switch set up
driver was trying to free misc IRQ vectors in
i40e_clear_interrupt_scheme and produced a kernel Oops:
The problem is that at that point misc IRQ vectors
were not allocated yet and we get a call trace
that driver is trying to free already free IRQ vectors.
Add a check in i40e_clear_interrupt_scheme for __I40E_MISC_IRQ_REQUESTED
PF state before calling i40e_free_misc_vector. This state is set only if
misc IRQ vectors were properly initialized.
Fixes: c17401a1dd21 ("i40e: use separate state bit for miscellaneous IRQ setup") Reported-by: PJ Waskiewicz <pwaskiewicz@jumptrading.com> Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The loop in i40e_get_capabilities can never end. The problem is that
although i40e_aq_discover_capabilities returns with an error if there's
a firmware problem, the returned error is not checked. There is a check for
pf->hw.aq.asq_last_status but that value is set to I40E_AQ_RC_OK on most
firmware problems.
When i40e_aq_discover_capabilities encounters a firmware problem, it will
encounter the same problem on its next invocation. As the result, the loop
becomes endless. We hit this with I40E_ERR_ADMIN_QUEUE_TIMEOUT but looking
at the code, it can happen with a range of other firmware errors.
I don't know what the correct behavior should be: whether the firmware
should be retried a few times, or whether pf->hw.aq.asq_last_status should
be always set to the encountered firmware error (but then it would be
pointless and can be just replaced by the i40e_aq_discover_capabilities
return value). However, the current behavior with an endless loop under the
rtnl mutex(!) is unacceptable and Intel has not submitted a fix, although we
explained the bug to them 7 months ago.
This may not be the best possible fix but it's better than hanging the whole
system on a firmware bug.
Fixes: 56a62fc86895 ("i40e: init code and hardware support") Tested-by: Stefan Assmann <sassmann@redhat.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Each tx queue maintains a 64bit counter for bytes, there is
no reason to truncate this to 32bit (or this has not been
documented)
Fixes: 24aeb56f2d38 ("gve: Add Gvnic stats AQ command and ethtool show/set-priv-flags.") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yangchun Fu <yangchun@google.com> Cc: Kuo Zhao <kuozhao@google.com> Cc: David Awogbemila <awogbemila@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
gve_get_stats() can report wrong numbers if/when u64_stats_fetch_retry()
returns true.
What is needed here is to sample values in temporary variables,
and only use them after each loop is ended.
Fixes: f5cedc84a30d ("gve: Add transmit and receive support") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Catherine Sullivan <csully@google.com> Cc: Sagi Shahar <sagis@google.com> Cc: Jon Olson <jonolson@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Luigi Rizzo <lrizzo@google.com> Cc: Jeroen de Borst <jeroendb@google.com> Cc: Tao Liu <xliutaox@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
But if_nlmsg_stats_size() never considered the needed storage.
This bug did not show up because alloc_skb(X) allocates skb with
extra tailroom, because of added alignments. This could very well
be changed in the future to have deterministic behavior.
Fixes: 10c9ead9f3c6 ("rtnetlink: add new RTM_GETSTATS message to dump link stats") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Roopa Prabhu <roopa@nvidia.com> Acked-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fixes: ede3fcf5ec67f ("gve: Add support for raw addressing to the rx path") Signed-off-by: Catherine Sullivan <csully@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Prevent possible crashes when cleaning up after unsuccessful
initializations.
Fixes: 893ce44df5658 ("gve: Add basic driver framework for Compute Engine Virtual NIC") Signed-off-by: Tao Liu <xliutaox@google.com> Signed-off-by: Catherine Sully <csully@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The current implementation enable PCS EEE feature in the event of link
up, but PCS EEE feature is not disabled on link down.
This patch makes sure PCE EEE feature is disabled on link down.
Fixes: 656ed8b015f1 ("net: stmmac: fix EEE init issue when paired with EEE capable PHYs") Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
When Energy-Efficient Ethernet(EEE) is disable from the MAC side,
we need to clear the DW_VR_MII_EEE_TRN_LPI bit of DW_VR_MII_EEE_MCTRL1
register.
Fixes: 7617af3d1a5e ("net: pcs: Introducing support for DWC xpcs Energy Efficient Ethernet") Cc: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com> Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
nouveau_bo_init() is backed by ttm_bo_init() and ferries its return code
back to the caller. On failures, ttm_bo_init() invokes the provided
destructor which should de-initialize and free the memory.
Thus, when nouveau_bo_init() returns an error the gem object has already
been released and the memory freed by nouveau_bo_del_ttm().
Fixes: 019cbd4a4feb ("drm/nouveau: Initialize GEM object before TTM object") Cc: Thierry Reding <treding@nvidia.com> Signed-off-by: Jeremy Cline <jcline@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201203000220.18238-1-jcline@redhat.com Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The gbefb driver not only registers a driver but also the device for that
driver. This is all well and good when run on the IP32 machines that are
supported by the driver but since the driver supports building with
COMPILE_TEST we might also be building on other platforms which do not have
this hardware and will crash instantiating the driver. Add an IS_ENABLED()
check so we compile out the device registration if we don't have the Kconfig
option for the machine enabled.
Fixes: 552ccf6b259d290c0c ("video: fbdev: gbefb: add COMPILE_TEST support") Signed-off-by: Mark Brown <broonie@kernel.org> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210921212102.30803-1-broonie@kernel.org Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Recent rework, which made HDMI PHY driver a platform device, inadvertely
reversed clock setup order. HW is very touchy about it. Proper way is to
handle controllers resets and clocks first and HDMI PHYs second.
Currently, without this fix, first mode set completely fails (nothing on
HDMI monitor) on H3 era PHYs. On H6, it still somehow work.
Move HDMI PHY reset & clocks handling to sun8i_hdmi_phy_init() which
will assure that code is executed after controllers reset & clocks are
handled. Additionally, add sun8i_hdmi_phy_deinit() which will deinit
them at controllers driver unload.
Tested on A64, H3, H6 and R40.
Fixes: 9bf3797796f5 ("drm/sun4i: dw-hdmi: Make HDMI PHY into a platform device") Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20210915175836.3158839-1-jernej.skrabec@gmail.com Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit 94f6345712b3 ("bus: ti-sysc: Implement quirk handling for
CLKDM_NOAUTO") should have also added the quirk for dra7 dcan1 in
addition to dcan2 for errata i893 handling.
Let's also pass the quirk flag for legacy mode booting for if "ti,hwmods"
dts property is used with related dcan hwmod data. This should be only
needed if anybody needs to git bisect earlier stable trees though.
Fixes: 94f6345712b3 ("bus: ti-sysc: Implement quirk handling for CLKDM_NOAUTO") Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The compiler reports that free_sys_event_tables() is dead code.
But according to the semantics, the "LIST_HEAD(sys_event_tables)" should
also be released, just like we do with 'arch_std_events' in main().
Fixes: e9d32c1bf0cd7a98 ("perf vendor events: Add support for arch standard events") Signed-off-by: Like Xu <likexu@tencent.com> Reviewed-by: John Garry <john.garry@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210928102938.69681-1-likexu@tencent.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In current code, when a PCI error state pci_channel_io_normal is detectd,
it will report PCI_ERS_RESULT_CAN_RECOVER status to PCI driver, and PCI
driver will continue the execution of PCI resume callback report_resume by
pci_walk_bridge, and the callback will go into amdgpu_pci_resume
finally, where write lock is releasd unconditionally without acquiring
such lock first. In this case, a deadlock will happen when other threads
start to acquire the read lock.
To fix this, add a member in amdgpu_device strucutre to cache
pci_channel_state, and only continue the execution in amdgpu_pci_resume
when it's pci_channel_io_frozen.
While existing code is correct, KCSAN is reporting
a data-race in netlink_insert / netlink_sendmsg [1]
It is correct to read nlk->bound without a lock, as netlink_autobind()
will acquire all needed locks.
[1]
BUG: KCSAN: data-race in netlink_insert / netlink_sendmsg
write to 0xffff8881031c8b30 of 1 bytes by task 18752 on cpu 0:
netlink_insert+0x5cc/0x7f0 net/netlink/af_netlink.c:597
netlink_autobind+0xa9/0x150 net/netlink/af_netlink.c:842
netlink_sendmsg+0x479/0x7c0 net/netlink/af_netlink.c:1892
sock_sendmsg_nosec net/socket.c:703 [inline]
sock_sendmsg net/socket.c:723 [inline]
____sys_sendmsg+0x360/0x4d0 net/socket.c:2392
___sys_sendmsg net/socket.c:2446 [inline]
__sys_sendmsg+0x1ed/0x270 net/socket.c:2475
__do_sys_sendmsg net/socket.c:2484 [inline]
__se_sys_sendmsg net/socket.c:2482 [inline]
__x64_sys_sendmsg+0x42/0x50 net/socket.c:2482
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
read to 0xffff8881031c8b30 of 1 bytes by task 18751 on cpu 1:
netlink_sendmsg+0x270/0x7c0 net/netlink/af_netlink.c:1891
sock_sendmsg_nosec net/socket.c:703 [inline]
sock_sendmsg net/socket.c:723 [inline]
__sys_sendto+0x2a8/0x370 net/socket.c:2019
__do_sys_sendto net/socket.c:2031 [inline]
__se_sys_sendto net/socket.c:2027 [inline]
__x64_sys_sendto+0x74/0x90 net/socket.c:2027
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
value changed: 0x00 -> 0x01
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 18751 Comm: syz-executor.0 Not tainted 5.14.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: da314c9923fe ("netlink: Replace rhash_portid with bound") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
According to Synopsys DesignWare Cores Ethernet PCS databook, it is
required to disable Clause 37 auto-negotiation by programming bit-12
(AN_ENABLE) to 0 if it is already enabled, before programming various
fields of VR_MII_AN_CTRL registers.
After all these programming are done, it is then required to enable
Clause 37 auto-negotiation by programming bit-12 (AN_ENABLE) to 1.
Fixes: b97b5331b8ab ("net: pcs: add C37 SGMII AN support for intel mGbE controller") Cc: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
The string should be "tx_disable" to match the state enum.
Fixes: 4005a7cb4f55 ("net: phy: sftp: print debug message with text, not numbers") Signed-off-by: Sean Anderson <sean.anderson@seco.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Commit de1799667b00 ("net: bridge: add STP xstats")
added an additional nla_reserve_64bit() in br_fill_linkxstats(),
but forgot to update br_get_linkxstats_size() accordingly.
This can trigger the following in rtnl_stats_get()
WARN_ON(err == -EMSGSIZE);
Fixes: de1799667b00 ("net: bridge: add STP xstats") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Vivien Didelot <vivien.didelot@gmail.com> Cc: Nikolay Aleksandrov <nikolay@nvidia.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
bridge_fill_linkxstats() is using nla_reserve_64bit().
We must use nla_total_size_64bit() instead of nla_total_size()
for corresponding data structure.
Fixes: 1080ab95e3c7 ("net: bridge: add support for IGMP/MLD stats and export them via netlink") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Nikolay Aleksandrov <nikolay@nvidia.com> Cc: Vivien Didelot <vivien.didelot@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fix afs_launder_page() to set the starting position of the StoreData RPC at
the offset into the page at which the modified data starts instead of at
the beginning of the page (the iov_iter is correctly offset).
The offset got lost during the conversion to passing an iov_iter into
afs_store_data().
Changes:
ver #2:
- Use page_offset() rather than manually calculating it[1].
Fix netfs_clear_unread() to pass READ to iov_iter_xarray() instead of WRITE
(the flag is about the operation accessing the buffer, not what sort of
access it is doing to the buffer).
With patch "drm/i915/vbt: Fix backlight parsing for VBT 234+"
the size of bdb_lfp_backlight_data structure has been increased,
causing if-statement in the parse_lfp_backlight function
that comapres this structure size to the one retrieved from BDB,
always to fail for older revisions.
This patch calculates expected size of the structure for a given
BDB version and compares it with the value gathered from BDB.
Tested on Chromebook Pixelbook (Nocturne) (reports bdb->version = 221)
Fixes: d381baad29b4 ("drm/i915/vbt: Fix backlight parsing for VBT 234+") Tested-by: Lukasz Majczak <lma@semihalf.com> Signed-off-by: Lukasz Majczak <lma@semihalf.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210930134606.227234-1-lma@semihalf.com
(cherry picked from commit 4378daf5d04eed59724e6d0e74755e17dce2e105) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Atm during driver loading and system resume TypeC ports are accessed
before their HW/SW state is synced. Move the TypeC port sanitization to
the encoder's sync_state hook to fix this.
v2: Handle the encoder disabled case in gen11_dsi_sync_state() as well
(Jose, Jani)
Fixes: f9e76a6e68d3 ("drm/i915: Add an encoder hook to sanitize its state during init/resume") Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210929132833.2253961-1-imre.deak@intel.com
(cherry picked from commit 7194dc998dfffca096c30b3cd39625158608992d) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When pipe A is disabled and MIPI DSI is enabled on pipe B,
the AMT KVMR feature will incorrectly see pipe A as enabled.
Set 0x42080 bit 23=1 before enabling DSI on pipe B and leave
it set while DSI is enabled on pipe B. No impact to setting
it all the time.
Changes since V5:
- Added reviewed-by
- Removed redundant braces and debug message format - Imre
Changes since V4:
- Modified function comment Wa_<number>:icl,jsl,ehl - Lucas
- Modified debug message in sync state - Imre
Changes since V3:
- More meaningful name to workaround - Imre
- Remove boolean check clear flag
- Add WA_verify hook in dsi sync_state
Changes since V2:
- Used REG_BIT, ignored pipe A and used sw state check - Jani
- Made function wrapper - Jani
Changes since V1:
- ./dim checkpatch errors addressed
On the LS1028A this instance of the eSDHC controller is intended for
either an eMMC or eSDIO card. It doesn't provide a card detect pin and
its IO voltage is fixed at 1.8V.
Remove the bogus broken-cd property, instead add the non-removable
property. Fix the voltage-ranges property and set it to 1.8V only.
The buck2 output of the PMIC is the VDD core voltage of the cpu.
Switching off this will poweroff the CPU. Add the 'regulator-always-on'
property to avoid this.
Fixes: 8668d8b2e67f ("arm64: dts: Add the Kontron i.MX8M Mini SoMs and baseboards") Signed-off-by: Heiko Thiery <heiko.thiery@gmail.com> Reviewed-by: Frieder Schrempf <frieder.schrempf@kontron.de> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Before commit 0e30f47232ab5 ("mtd: spi-nor: add support for DTR protocol"),
for all PP command, it only support 1-1-1 mode, no matter the tx setting
in dts. But after the upper commit, the logic change. It will choose
the best mode(fastest mode) which flash device and spi-nor host controller
both support.
qspi and fspi host controller do not support read 1-4-4 mode. so need to
set the tx to 1, let the common code finally select read 1-1-4 mode.
Before commit 0e30f47232ab5 ("mtd: spi-nor: add support for DTR protocol"),
for all PP command, it only support 1-1-1 mode, no matter the tx setting
in dts. But after the upper commit, the logic change. It will choose
the best mode(fastest mode) which flash device and spi-nor host controller
both support.
Though the spi-nor device on imx6sx-sdb/imx6ul(l/z)-14x14-evk board
do not support PP-1-4-4/PP-1-1-4, but if tx is 4 in dts file, it will also
impact the read mode selection. For the spi-nor device on the upper mentioned
boards, they support read 1-4-4 mode and read 1-1-4 mode according to the
device internal sfdp register. But qspi host controller do not support
read 1-4-4 mode. so need to set the tx to 1, let the common code finally
select read 1-1-4 mode, PP-1-1-1 mode.
The driver can't be loaded automatically because it misses
module alias to be provided. Add corresponding MODULE_DEVICE_TABLE()
call to the driver.
Fixes: 863d08ece9bf ("supports eg20t ptp clock") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Property phy-connection-type contains invalid value "sgmii-2500" per scheme
defined in file ethernet-controller.yaml.
Correct phy-connection-type value should be "2500base-x".
Signed-off-by: Pali Rohár <pali@kernel.org> Fixes: 84e0f1c13806 ("powerpc/mpc85xx: Add MDIO bus muxing support to the board device tree(s)") Acked-by: Scott Wood <oss@buserror.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
This reverts commit 6decd1aad15f56b169217789630a0098b496de0e. CPULAUNCH
register is not set properly by some bootloaders, causing a regression
until a bootloader change is made, which is hard if not impossible on
some embedded devices. Revert the change until a more robust core
detection mechanism that works on MT7621S routers such as Netgear R6220
as well as platforms like Digi EX15 can be made.
Commit 2d26f6e39afb ("net: stmmac: dwmac-rk: fix unbalanced pm_runtime_enable warnings")
while getting rid of a runtime PM warning ended up breaking ethernet
on rk3399 based devices. By dropping an extra reference to the device,
the commit ends up enabling suspend / resume of the ethernet device -
which appears to be broken.
While the issue with runtime pm is being investigated, partially
revert commit 2d26f6e39afb to restore the network on rk3399.
When ocelot_flower.c calls ocelot_vcap_filter_add(), the filter has a
given filter->id.cookie. This filter is added to the block->rules list.
However, when ocelot_flower.c calls ocelot_vcap_block_find_filter_by_id()
which passes the cookie as argument, the filter is never found by
filter->id.cookie when searching through the block->rules list.
This is unsurprising, since the filter->id.cookie is an unsigned long,
but the cookie argument provided to ocelot_vcap_block_find_filter_by_id()
is a signed int, and the comparison fails.
Fixes: 50c6cc5b9283 ("net: mscc: ocelot: store a namespaced VCAP filter ID") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20210930125330.2078625-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Syzbot reported memory leak in MDIO bus interface, the problem was in
wrong state logic.
MDIOBUS_ALLOCATED indicates 2 states:
1. Bus is only allocated
2. Bus allocated and __mdiobus_register() fails, but
device_register() was called
In case of device_register() has been called we should call put_device()
to correctly free the memory allocated for this device, but mdiobus_free()
calls just kfree(dev) in case of MDIOBUS_ALLOCATED state
To avoid this behaviour we need to set bus->state to MDIOBUS_UNREGISTERED
_before_ calling device_register(), because put_device() should be
called even in case of device_register() failure.
When fed an empty BPF object, bpftool gen skeleton -L crashes at
btf__set_fd() since it assumes presence of obj->btf, however for
the sequence below clang adds no .BTF section (hence no BTF).
Reproducer:
$ touch a.bpf.c
$ clang -O2 -g -target bpf -c a.bpf.c
$ bpftool gen skeleton -L a.bpf.o
/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
/* THIS FILE IS AUTOGENERATED! */
PTP-RQ counters title format contains PTP-RQ identifier, which is
mistakenly not passed to sprinft().
This leads to unexpected garbage values instead.
This patch fixes it.
When setting number of completion EQs of the SF, consider number of
online CPUs.
Without this consideration, when number of online cpus are less than 8,
unnecessary 8 completion EQs are allocated.
When in Real-time mode, HW clock is synced with the PTP daemon. Hence
driver should not re-calibrate the next pulse (via MTPPSE repetitive
events mechanism).
This patch arms repetitive events only in free-running mode.
Fixes: 432119de33d9 ("net/mlx5: Add cyc2time HW translation mode support") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Allow configuration of 1PPS start time only with time-stamp representing
a round second. Prior to this patch driver allowed setting of a
non-round-second which is not supported by the device. Avoid unexpected
behavior by restricting start-time configuration to a round-second.
Fixes: 4272f9b88db9 ("net/mlx5e: Change 1PPS out scheme") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Flow counter is allocated in eswitch legacy acl setting functions
without checking if already allocated by previous setting. Add a check
to avoid such double allocation.
The value for maximum number of channels is first calculated based
on the netdev's profile and current function resources (specifically,
number of MSIX vectors, which depends among other things on the number
of online cores in the system).
This value is then used to calculate the netdev's number of rxqs/txqs.
Once created (by alloc_etherdev_mqs), the number of netdev's rxqs/txqs
is constant and we must not exceed it.
To achieve this, keep the maximum number of channels in sync upon any
netdevice re-attach.
Use mlx5e_get_max_num_channels() for calculating the number of netdev's
rxqs/txqs. After netdev is created, use mlx5e_calc_max_nch() (which
coinsiders core device resources, profile, and netdev) to init or
update priv->max_nch.
Before this patch, the value of priv->max_nch might get out of sync,
mistakenly allowing accesses to out-of-bounds objects, which would
crash the system.
Track the number of channels stats structures used in a separate
field, as they are persistent to suspend/resume operations. All the
collected stats of every channel index that ever existed should be
preserved. They are reset only when struct mlx5e_priv is,
in mlx5e_priv_cleanup(), which is part of the profile changing flow.
There is no point anymore in blocking a profile change due to max_nch
mismatch in mlx5e_netdev_change_profile(). Remove the limitation.
Fixes: a1f240f18017 ("net/mlx5e: Adjust to max number of channles when re-attaching") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently in Rx data path IPsec crypto offloaded packets uses
csum_none flag, so checksum is handled by the stack, this naturally
have some performance/cpu utilization impact on such flows. As Nvidia
NIC starting from ConnectX6DX provides checksum complete value out of
the box also for such flows there is no sense in taking csum_none path,
furthermore the stack (xfrm) have the method to handle checksum complete
corrections for such flows i.e. IPsec trailer removal and consequently
checksum value adjustment.
Because of the above and in addition the ConnectX6DX is the first HW
which supports IPsec crypto offload then it is safe to report csum
complete for IPsec offloaded traffic.
In prealloc_elems_and_freelist(), the multiplication to calculate the
size passed to bpf_map_area_alloc() could lead to an integer overflow.
As a result, out-of-bounds write could occur in pcpu_freelist_populate()
as reported by KASAN:
Starting with v5.15-rc1, we may now see some am335x beaglebone black
device produce the following error on pruss probe:
Unhandled fault: external abort on non-linefetch (0x1008) at 0xe0326000
This has started with the enabling of pruss for am335x in the dts files.
Turns out the is caused by the PRM reset handling not waiting for the
reset bit to clear. To fix the issue, let's always wait for the reset
bit to clear, even if there is a separate reset status register.
We attempted to fix a similar issue for dra7 iva with a udelay() in
commit effe89e40037 ("soc: ti: omap-prm: Fix occasional abort on reset
deassert for dra7 iva"). There is no longer a need for the udelay()
for dra7 iva reset either with the check added for reset bit clearing.
Cc: Drew Fustini <pdp7pdp7@gmail.com> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: "H. Nikolaus Schaller" <hns@goldelico.com> Cc: Robert Nelson <robertcnelson@gmail.com> Cc: Yongqin Liu <yongqin.liu@linaro.org> Fixes: effe89e40037 ("soc: ti: omap-prm: Fix occasional abort on reset deassert for dra7 iva") Reported-by: Matti Vaittinen <mazziesaccount@gmail.com> Tested-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
On ARM CPUs that lack div/mod instructions, ALU32 BPF_DIV and BPF_MOD are
implemented using a call to a helper function. Before, the emitted code
for those function calls failed to preserve caller-saved ARM registers.
Since some of those registers happen to be mapped to BPF registers, it
resulted in eBPF register values being overwritten.
This patch emits code to push and pop the remaining caller-saved ARM
registers r2-r3 into the stack during the div/mod function call. ARM
registers r0-r1 are used as arguments and return value, and those were
already saved and restored correctly.
Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler") Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Deactivate old rule first, then append the new rule, so rule replacement
notification via netlink first reports the deletion of the old rule with
handle X in first place, then it adds the new rule (reusing the handle X
of the replaced old rule).
Note that the abort path releases the transaction that has been created
by nft_delrule() on error.
Fixes: ca08987885a1 ("netfilter: nf_tables: deactivate expressions in rule replecement routine") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Add position handle to allow to identify the rule location from netlink
events. Otherwise, userspace cannot incrementally update a userspace
cache through monitoring events.
Skip handle dump if the rule has been either inserted (at the beginning
of the ruleset) or appended (at the end of the ruleset), the
NLM_F_APPEND netlink flag is sufficient in these two cases.
Handle NLM_F_REPLACE as NLM_F_APPEND since the rule replacement
expansion appends it after the specified rule handle.