If 'iptables-legacy' is available, 'ip6tables-legacy' command will be
used instead of 'ip6tables'. So no need to look if 'ip6tables' is
available in this case.
Cc: stable@vger.kernel.org Fixes: 0c4cd3f86a40 ("selftests: mptcp: join: use 'iptables-legacy' if available") Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20230725-send-net-20230725-v1-1-6f60fe7137a9@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In commit 2cb1e0259f50 ("ASoC: cs42l51: re-hook of_match_table
pointer"), 9 years ago, some random guy fixed the cs42l51 after it was
split into a core part and an I2C part to properly match based on a
Device Tree compatible string.
However, the fix in this commit is wrong: the MODULE_DEVICE_TABLE(of,
....) is in the core part of the driver, not the I2C part. Therefore,
automatic module loading based on module.alias, based on matching with
the DT compatible string, loads the core part of the driver, but not
the I2C part. And threfore, the i2c_driver is not registered, and the
codec is not known to the system, nor matched with a DT node with the
corresponding compatible string.
In order to fix that, we move the MODULE_DEVICE_TABLE(of, ...) into
the I2C part of the driver. The cs42l51_of_match[] array is also moved
as well, as it is not possible to have this definition in one file,
and the MODULE_DEVICE_TABLE(of, ...) invocation in another file, due
to how MODULE_DEVICE_TABLE works.
Thanks to this commit, the I2C part of the driver now properly
autoloads, and thanks to its dependency on the core part, the core
part gets autoloaded as well, resulting in a functional sound card
without having to manually load kernel modules.
Fixes: 2cb1e0259f50 ("ASoC: cs42l51: re-hook of_match_table pointer") Cc: stable@vger.kernel.org Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Link: https://lore.kernel.org/r/20230713112112.778576-1-thomas.petazzoni@bootlin.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
io-wq assumes that an issue is blocking, but it may not be if the
request type has asked for a non-blocking attempt. If we get
-EAGAIN for that case, then we need to treat it as a final result
and not retry or arm poll for it.
IPTables commands using 'iptables-nft' fail on old kernels, at least
on v5.15 because it doesn't see the default IPTables chains:
$ iptables -L
iptables/1.8.2 Failed to initialize nft: Protocol not supported
As a first step before switching to NFTables, we can use iptables-legacy
if available.
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: dc65fe82fb07 ("selftests: mptcp: add packet mark test case") Cc: stable@vger.kernel.org Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After making acpi_processor_get_platform_limit() use the "no limit"
value for its frequency QoS request when _PPC returns 0, it is not
necessary to replace the frequency corresponding to the first _PSS
return package entry with the maximum turbo frequency of the given
CPU in intel_pstate_init_acpi_perf_limits() any more, so drop the
code doing that along with the comment explaining it.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Hagar Hemdan <hagarhem@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Modify acpi_processor_get_platform_limit() to avoid updating its
frequency QoS request when the _PPC return value has not changed
by comparing that value to the previous _PPC return value stored in
the performance_platform_limit field of the struct acpi_processor
corresponding to the given CPU.
While at it, do the _PPC return value check against the state count
earlier, to avoid setting performance_platform_limit to an invalid
value, and make acpi_processor_ppc_init() use FREQ_QOS_MAX_DEFAULT_VALUE
as the "no limit" frequency QoS for consistency.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Hagar Hemdan <hagarhem@amazon.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When _PPC returns 0, it means that the CPU frequency is not limited by
the platform firmware, so make acpi_processor_get_platform_limit()
update the frequency QoS request used by it to "no limit" in that case.
This addresses a problem with limiting CPU frequency artificially on
some systems after CPU offline/online to the frequency that corresponds
to the first entry in the _PSS return package.
The test to check if the field is a stack is to be done if it is not a
string. But the code had:
} if (event->fields[i]->is_stack) {
and not
} else if (event->fields[i]->is_stack) {
which would cause it to always be tested. Worse yet, this also included an
"else" statement that was only to be called if the field was not a string
and a stack, but this code allows it to be called if it was a string (and
not a stack).
An attempt to acquire exclusive lock can race with the current lock
owner closing the image:
1. lock is held by client123, rbd_lock() returns -EBUSY
2. get_lock_owner_info() returns client123 instance details
3. client123 closes the image, lock is released
4. find_watcher() returns 0 as there is no matching watcher anymore
5. client123 instance gets erroneously blocklisted
Particularly impacted is mirror snapshot scheduler in snapshot-based
mirroring since it happens to open and close images a lot (images are
opened only for as long as it takes to take the next mirror snapshot,
the same client instance is used for all images).
To reduce the potential for erroneous blocklisting, retrieve the lock
owner again after find_watcher() returns 0. If it's still there, make
sure it matches the previously detected lock owner.
Cc: stable@vger.kernel.org # f38cb9d9c204: rbd: make get_lock_owner_info() return a single locker or NULL Cc: stable@vger.kernel.org # 8ff2c64c9765: rbd: harden get_lock_owner_info() a bit Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- we want the exclusive lock type, so test for it directly
- use sscanf() to actually parse the lock cookie and avoid admitting
invalid handles
- bail if locker has a blank address
Make the "num_lockers can be only 0 or 1" assumption explicit and
simplify the API by getting rid of output parameters in preparation
for calling get_lock_owner_info() twice before blocklisting.
When using the cleaner policy to decommission the cache, there is
never any writeback started from the cache as it is constantly delayed
due to normal I/O keeping the device busy. Meaning @idle=false was
always being passed to clean_target_met()
Fix this by adding a specific 'cleaner' flag that is set when the
cleaner policy is configured. This flag serves to always allow the
cleaner's writeback work to be queued until the cache is
decommissioned (even if the cache isn't idle).
Reported-by: David Jeffery <djeffery@redhat.com> Fixes: b29d4986d0da ("dm cache: significant rework to leverage dm-bio-prison-v2") Cc: stable@vger.kernel.org Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The WM8904_ADC_TEST_0 register is modified as part of updating the OSR
controls but does not have a cache default, leading to errors when we try
to modify these controls in cache only mode with no prior read:
wm8904 3-001a: ASoC: error at snd_soc_component_update_bits on wm8904.3-001a for register: [0x000000c6] -16
Add a read of the register to probe() to fill the cache and avoid both the
error messages and the misconfiguration of the chip which will result.
Quiesce and resume are functions that tell the DASD driver to stop/resume
issuing I/Os to a specific DASD.
On resume dasd_schedule_block_bh() is called to kick handling of IO
requests again. This does unfortunately not cover internal requests which
are used for path verification for example.
This could lead to a hanging device when a path event or anything else
that triggers internal requests occurs on a quiesced device.
Fix by also calling dasd_schedule_device_bh() which triggers handling of
internal requests on resume.
Fixes: 8e09f21574ea ("[S390] dasd: add hyper PAV support to DASD device driver, part 1") Cc: stable@vger.kernel.org Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com> Link: https://lore.kernel.org/r/20230721193647.3889634-2-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A race were found where set_channels could be called after registering
but before virtnet_set_queues() in virtnet_probe(). Fixing this by
moving the virtnet_set_queues() before netdevice registering. While at
it, use _virtnet_set_queues() to avoid holding rtnl as the device is
not even registered at that time.
Cc: stable@vger.kernel.org Fixes: a220871be66f ("virtio-net: correctly enable multiqueue") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://lore.kernel.org/r/20230725072049.617289-1-jasowang@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reject KVM_SET_SREGS{2} with -EINVAL if the incoming CR0 is invalid,
e.g. due to setting bits 63:32, illegal combinations, or to a value that
isn't allowed in VMX (non-)root mode. The VMX checks in particular are
"fun" as failure to disallow Real Mode for an L2 that is configured with
unrestricted guest disabled, when KVM itself has unrestricted guest
enabled, will result in KVM forcing VM86 mode to virtual Real Mode for
L2, but then fail to unwind the related metadata when synthesizing a
nested VM-Exit back to L1 (which has unrestricted guest enabled).
Opportunistically fix a benign typo in the prototype for is_valid_cr4().
Cc: stable@vger.kernel.org Reported-by: syzbot+5feef0b9ee9c8e9e5689@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/000000000000f316b705fdf6e2b4@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230613203037.1968489-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Henry reported that rt_mutex_adjust_prio_check() has an ordering
problem and puts the lie to the comment in [7]. Sharing the sort key
between lock->waiters and owner->pi_waiters *does* create problems,
since unlike what the comment claims, holding [L] is insufficient.
Notably, consider:
A
/ \
M1 M2
| |
B C
That is, task A owns both M1 and M2, B and C block on them. In this
case a concurrent chain walk (B & C) will modify their resp. sort keys
in [7] while holding M1->wait_lock and M2->wait_lock. So holding [L]
is meaningless, they're different Ls.
This then gives rise to a race condition between [7] and [11], where
the requeue of pi_waiters will observe an inconsistent tree order.
[11]
rt_mutex_enqueue_pi();
// observes inconsistent A->pi_waiters
// tree order
Fixing this means either extending the range of the owner lock from
[10-13] to [6-13], with the immediate problem that this means [6-8]
hold both blocked and owner locks, or duplicating the sort key.
Since the locking in chain walk is horrible enough without having to
consider pi_lock nesting rules, duplicate the sort key instead.
By giving each tree their own sort key, the above race becomes
harmless, if C sees B at the old location, then B will correct things
(if they need correcting) when it walks up the chain and reaches A.
Fixes: fb00aca47440 ("rtmutex: Turn the plist into an rb-tree") Reported-by: Henry Wu <triangletrap12@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Henry Wu <triangletrap12@gmail.com> Link: https://lkml.kernel.org/r/20230707161052.GF2883469%40hirez.programming.kicks-ass.net Signed-off-by: Sasha Levin <sashal@kernel.org>
The irq to block mapping is fixed, and interrupts from the first block
will always be routed to the first parent IRQ. But the parent interrupts
themselves can be routed to any available CPU.
This is used by the bootloader to map the first parent interrupt to the
boot CPU, regardless wether the boot CPU is the first one or the second
one.
When booting from the second CPU, the assumption that the first block's
IRQ is mapped to the first CPU breaks, and the system hangs because
interrupts do not get routed correctly.
Fix this by passing the appropriate bcm6434_l1_cpu to the interrupt
handler instead of the chip itself, so the handler always has the right
block.
Fixes: c7c42ec2baa1 ("irqchips/bmips: Add bcm6345-l1 interrupt controller") Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230629072620.62527-1-jonas.gorski@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
recv_data either returns the number of received bytes, or a negative value
representing an error code. Adding the return value directly to the total
number of received bytes therefore looks a little weird, since it might add
a negative error code to a sum of bytes.
The following check for size < expected usually makes the function return
ETIME in that case, so it does not cause too many problems in practice. But
to make the code look cleaner and because the caller might still be
interested in the original error code, explicitly check for the presence of
an error code and pass that through.
Cc: stable@vger.kernel.org Fixes: cb5354253af2 ("[PATCH] tpm: spacing cleanups 2") Signed-off-by: Alexander Steffen <Alexander.Steffen@infineon.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the client is calling TEST_STATEID, then it is because some event
occurred that requires it to check all the stateids for validity and
call FREE_STATEID on the ones that have been revoked. In this case,
either the stateid exists in the list of stateids associated with that
nfs4_client, in which case it should be tested, or it does not. There
are no additional conditions to be considered.
Reported-by: "Frank Ch. Eigler" <fche@redhat.com> Fixes: 7df302f75ee2 ("NFSD: TEST_STATEID should not return NFS4ERR_STALE_STATEID") Cc: stable@vger.kernel.org # v5.7+ Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The pidfd_getfd() system call allows a caller with ptrace_may_access()
abilities on another process to steal a file descriptor from this
process. This system call is used by debuggers, container runtimes,
system call supervisors, networking proxies etc. So while it is a
special interest system call it is used in common tools.
That ability ends up breaking our long-time optimization in fdget_pos(),
which "knew" that if we had exclusive access to the file descriptor
nobody else could access it, and we didn't need the lock for the file
position.
That check for file_count(file) was always fairly subtle - it depended
on __fdget() not incrementing the file count for single-threaded
processes and thus included that as part of the rule - but it did mean
that we didn't need to take the lock in all those traditional unix
process contexts.
So it's sad to see this go, and I'd love to have some way to re-instate
the optimization. At the same time, the lock obviously isn't ever
contended in the case we optimized, so all we were optimizing away is
the atomics and the cacheline dirtying. Let's see if anybody even
notices that the optimization is gone.
btrfs_attach_transaction_barrier() is used to get a handle pointing to the
current running transaction if the transaction has not started its commit
yet (its state is < TRANS_STATE_COMMIT_START). If the transaction commit
has started, then we wait for the transaction to commit and finish before
returning - however we completely ignore if the transaction was aborted
due to some error during its commit, we simply return ERR_PT(-ENOENT),
which makes the caller assume everything is fine and no errors happened.
This could make an fsync return success (0) to user space when in fact we
had a transaction abort and the target inode changes were therefore not
persisted.
Fix this by checking for the return value from btrfs_wait_for_commit(),
and if it returned an error, return it back to the caller.
Fixes: d4edf39bd5db ("Btrfs: fix uncompleted transaction") CC: stable@vger.kernel.org # 4.19+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
At btrfs_wait_for_commit() we wait for a transaction to finish and then
always return 0 (success) without checking if it was aborted, in which
case the transaction didn't happen due to some critical error. Fix this
by checking if the transaction was aborted.
Because of hex value 0x46 used instead of decimal 46, the temp6
(PECI1) temperature is always declared visible and then displayed
even if disabled in the chip
Industrial processor i3255 supports temperatures -40 deg celcius
to 105 deg Celcius. The current implementation of k10temp_read_temp
rounds off any negative temperatures to '0'. To fix this,
the following changes have been made.
A flag 'disp_negative' is added to struct k10temp_data to support
AMD i3255 processors. Flag 'disp_negative' is set if 3255 processor
is found during k10temp_probe. Flag 'disp_negative' is used to
determine whether to round off negative temperatures to '0' in
k10temp_read_temp.
Signed-off-by: Baskaran Kannan <Baski.Kannan@amd.com> Link: https://lore.kernel.org/r/20230727162159.1056136-1-Baski.Kannan@amd.com Fixes: aef17ca12719 ("hwmon: (k10temp) Only apply temperature offset if result is positive") Cc: stable@vger.kernel.org
[groeck: Fixed multi-line comment] Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ 0.754373] pci 0000:0b:00.0: xHCI HW did not halt within 32000 usec status = 0x1000
[ 0.754419] pci 0000:0b:00.0: quirk_usb_early_handoff+0x0/0x7a0 took 31459 usecs
[ 2.228048] xhci_hcd 0000:0b:00.0: xHCI Host Controller
[ 2.228053] xhci_hcd 0000:0b:00.0: new USB bus registered, assigned bus number 7
[ 2.260073] xhci_hcd 0000:0b:00.0: Host halt failed, -110
[ 2.260079] xhci_hcd 0000:0b:00.0: can't setup: -110
[ 2.260551] xhci_hcd 0000:0b:00.0: USB bus 7 deregistered
[ 2.260624] xhci_hcd 0000:0b:00.0: init 0000:0b:00.0 fail, -110
[ 2.260639] xhci_hcd: probe of 0000:0b:00.0 failed with error -110
The hardware in question is an external PCIe card. It looks to me like the quirk
needs to be narrowed down. But this needs information about the hardware showing
the issue this quirk is to fix. So for now a clean revert.
Signed-off-by: Oliver Neukum <oneukum@suse.com> Fixes: 5255660b208a ("xhci: add quirk for host controllers that don't update endpoint DCS") Cc: stable <stable@kernel.org> Link: https://lore.kernel.org/r/20230713112830.21773-1-oneukum@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In gsm_cleanup_mux() the 'gsm->dlci' pointer was not cleaned properly,
leaving it a dangling pointer after gsm_dlci_release.
This leads to use-after-free where 'gsm->dlci[0]' are freed and accessed
by the subsequent gsm_cleanup_mux().
In the above mentioned routine, memory is allocated in several places.
If the first succeeds and a later one fails, the routine will leak memory.
This patch fixes commit 2865d42c78a9 ("staging: r8712u: Add the new driver
to the mainline kernel"). A potential memory leak in
r8712_xmit_resource_alloc() is also addressed.
Fixes: 2865d42c78a9 ("staging: r8712u: Add the new driver to the mainline kernel") Reported-by: syzbot+cf71097ffb6755df8251@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/x/log.txt?x=11ac3fa0a80000 Cc: stable@vger.kernel.org Cc: Nam Cao <namcaov@gmail.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Reviewed-by: Nam Cao <namcaov@gmail.com> Link: https://lore.kernel.org/r/20230714175417.18578-1-Larry.Finger@lwfinger.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The kernel security team does NOT assign CVEs, so document that properly
and provide the "if you want one, ask MITRE for it" response that we
give on a weekly basis in the document, so we don't have to constantly
say it to everyone who asks.
Because the linux-distros group forces reporters to release information
about reported bugs, and they impose arbitrary deadlines in having those
bugs fixed despite not actually being kernel developers, the kernel
security team recommends not interacting with them at all as this just
causes confusion and the early-release of reported security problems.
The reverted commit was based on static analysis and a misunderstanding
of how PTR_ERR() and NULLs are supposed to work. When a function
returns both pointer errors and NULL then normally the NULL means
"continue operating without a feature because it was deliberately
turned off". The NULL should not be treated as a failure. If a driver
cannot work when that feature is disabled then the KConfig should
enforce that the function cannot return NULL. We should not need to
test for it.
In this code, the patch means that certain tegra_xusb_probe() will
fail if the firmware supports power-domains but CONFIG_PM is disabled.
Previously, the cdns3_gadget_check_config() function in the cdns3 driver
mistakenly calculated the ep_buf_size by considering only one
configuration's endpoint information because "claimed" will be clear after
call usb_gadget_check_config().
The fix involves checking the private flags EP_CLAIMED instead of relying
on the "claimed" flag.
Fixes: dce49449e04f ("usb: cdns3: allocate TX FIFO size according to composite EP number") Cc: stable <stable@kernel.org> Reported-by: Ravi Gunasekaran <r-gunasekaran@ti.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Acked-by: Peter Chen <peter.chen@kernel.org> Tested-by: Ravi Gunasekaran <r-gunasekaran@ti.com> Link: https://lore.kernel.org/r/20230707230015.494999-2-Frank.Li@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To fix the condition a user has to unplug and plug the device again.
With USB_QUIRK_RESET_RESUME applied ("usbcore.quirks=1235:8211:b")
for the Scarlett audio device the issue still reproduces.
Applying USB_QUIRK_DISCONNECT_SUSPEND ("usbcore.quirks=1235:8211:m")
fixed the issue and the Scarlett audio device didn't drop off the USB
bus for ~5000 suspend/resume cycles where originally issue reproduced in
~100 or less suspend/resume cycles.
The ohci_hcd_at91_drv_suspend() sets ohci->rh_state to OHCI_RH_HALTED when
suspend which will let the ohci_irq() skip the interrupt after resume. And
nobody to handle this interrupt.
According to the comment in ohci_hcd_at91_drv_suspend(), it need to reset
when resume from suspend(MEM) to fix by setting "hibernated" argument of
ohci_resume().
Commit c4a5153e87fd ("usb: dwc3: core: Power-off core/PHYs on
system_suspend in host mode") replaces check for HOST only dr_mode with
current_dr_role. But during booting, the current_dr_role isn't
initialized, thus the device side reset is always issued even if dwc3
was configured as host-only. What's more, on some platforms with host
only dwc3, aways issuing device side reset by accessing device register
block can cause kernel panic.
Hardware based on the Bay Trail / BYT SoCs require an external ULPI phy for
USB device-mode. The phy chip usually has its 'reset' and 'chip select'
lines connected to GPIOs described by ACPI fwnodes in the DSDT table.
Because of hardware with missing ACPI resources for the 'reset' and 'chip
select' GPIOs commit 5741022cbdf3 ("usb: dwc3: pci: Add GPIO lookup table
on platforms without ACPI GPIO resources") introduced a fallback
gpiod_lookup_table with hard-coded mappings for Bay Trail devices.
However there are existing Bay Trail based devices, like the National
Instruments cRIO-903x series, where the phy chip has its 'reset' and
'chip-select' lines always asserted in hardware via resistor pull-ups. On
this hardware the phy chip is always enabled and the ACPI dsdt table is
missing information not only for the 'chip-select' and 'reset' lines but
also for the BYT GPIO controller itself "INT33FC".
With the introduction of the gpiod_lookup_table initializing the USB
device-mode on these hardware now errors out. The error comes from the
gpiod_get_optional() calls in dwc3_pci_quirks() which will now return an
-ENOENT error due to the missing ACPI entry for the INT33FC gpio controller
used in the aforementioned table.
This hardware used to work before because gpiod_get_optional() will return
NULL instead of -ENOENT if no GPIO has been assigned to the requested
function. The dwc3_pci_quirks() code for setting the 'cs' and 'reset' GPIOs
was then skipped (due to the NULL return). This is the correct behavior in
cases where the phy chip is hardwired and there are no GPIOs to control.
Since the gpiod_lookup_table relies on the presence of INT33FC fwnode
in ACPI tables only add the table if we know the entry for the INT33FC
gpio controller is present. This allows Bay Trail based devices with
hardwired dwc3 ULPI phys to continue working.
Fixes: 5741022cbdf3 ("usb: dwc3: pci: Add GPIO lookup table on platforms without ACPI GPIO resources") Cc: stable <stable@kernel.org> Signed-off-by: Gratian Crisan <gratian.crisan@ni.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20230726184555.218091-2-gratian.crisan@ni.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
AutoRetry has been found to sometimes cause controller freezes when
communicating with buggy USB devices.
This controller feature allows the controller in host mode to send
non-terminating/burst retry ACKs instead of terminating retry ACKs
to devices when a transaction error (CRC error or overflow) occurs.
Unfortunately, if the USB device continues to respond with a CRC error,
the controller will not complete endpoint-related commands while it
keeps trying to auto-retry. [3] The xHCI driver will notice this once
it tries to abort the transfer using a Stop Endpoint command and
does not receive a completion in time. [1]
This situation is reported to dmesg:
[sda] tag#29 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN
[sda] tag#29 CDB: opcode=0x28 28 00 00 69 42 80 00 00 48 00
xhci-hcd: xHCI host not responding to stop endpoint command
xhci-hcd: xHCI host controller not responding, assume dead
xhci-hcd: HC died; cleaning up
Some users observed this problem on an Odroid HC2 with the JMS578
USB3-to-SATA bridge. The issue can be triggered by starting
a read-heavy workload on an attached SSD. After a while, the host
controller would die and the SSD would disappear from the system. [1]
Further analysis by Synopsys determined that controller revisions
other than the one in Odroid HC2 are also affected by this.
The recommended solution was to disable AutoRetry altogether.
This change does not have a noticeable performance impact. [2]
Revert the enablement commit. This will keep the AutoRetry bit in
the default state configured during SoC design [2].
After an initial link up the CAN device is in ERROR-ACTIVE mode. Due
to a missing CAN_STATE_STOPPED in gs_can_close() it doesn't change to
STOPPED after a link down:
| ip link set dev can0 up
| ip link set dev can0 down
| ip --details link show can0
| 13: can0: <NOARP,ECHO> mtu 16 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 10
| link/can promiscuity 0 allmulti 0 minmtu 0 maxmtu 0
| can state ERROR-ACTIVE restart-ms 1000
Add missing assignment of CAN_STATE_STOPPED in gs_can_close().
Add the device and product ID for this CAN bus interface / license
dongle. The device is usable either directly from user space or can be
attached to a kernel CAN interface with slcan_attach.
Reported-by: Kaufmann Automotive GmbH <info@kaufmann-automotive.ch> Tested-by: Kaufmann Automotive GmbH <info@kaufmann-automotive.ch> Signed-off-by: Oliver Neukum <oneukum@suse.com>
[ johan: amend commit message and move entries in sort order ] Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This function is called indirectly from the platform driver probe
function. Even if the driver is built in, it may be probed after
free_initmem() due to deferral or unbinding/binding via sysfs.
Thus the function cannot be marked as __init.
Preserve the original value of the Divisor Latch Fraction (DLF) register.
When the DLF register is modified without preservation, it can disrupt
the baudrate settings established by firmware or bootloader, leading to
data corruption and the generation of unreadable or distorted characters.
The runtime PM state should not be changed by drivers that do not
implement runtime PM even if it happens to work around a bug in PM core.
With the wake irq arming now fixed, drop the bogus runtime PM state
update which left the device in active state (and could potentially
prevent a parent device from suspending).
Fixes: f3974413cf02 ("tty: serial: qcom_geni_serial: Wakeup IRQ cleanup") Cc: 5.6+ <stable@vger.kernel.org> # 5.6+ Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stuff CR0 and/or CR4 to be compliant with a restricted guest if and only
if KVM itself is not configured to utilize unrestricted guests, i.e. don't
stuff CR0/CR4 for a restricted L2 that is running as the guest of an
unrestricted L1. Any attempt to VM-Enter a restricted guest with invalid
CR0/CR4 values should fail, i.e. in a nested scenario, KVM (as L0) should
never observe a restricted L2 with incompatible CR0/CR4, since nested
VM-Enter from L1 should have failed.
And if KVM does observe an active, restricted L2 with incompatible state,
e.g. due to a KVM bug, fudging CR0/CR4 instead of letting VM-Enter fail
does more harm than good, as KVM will often neglect to undo the side
effects, e.g. won't clear rmode.vm86_active on nested VM-Exit, and thus
the damage can easily spill over to L1. On the other hand, letting
VM-Enter fail due to bad guest state is more likely to contain the damage
to L2 as KVM relies on hardware to perform most guest state consistency
checks, i.e. KVM needs to be able to reflect a failed nested VM-Enter into
L1 irrespective of (un)restricted guest behavior.
Cc: Jim Mattson <jmattson@google.com> Cc: stable@vger.kernel.org Fixes: bddd82d19e2e ("KVM: nVMX: KVM needs to unset "unrestricted guest" VM-execution control in vmcs02 if vmcs12 doesn't set it") Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230613203037.1968489-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Grab a reference to KVM prior to installing VM and vCPU stats file
descriptors to ensure the underlying VM and vCPU objects are not freed
until the last reference to any and all stats fds are dropped.
Note, the stats paths manually invoke fd_install() and so don't need to
grab a reference before creating the file.
Fixes: ce55c049459c ("KVM: stats: Support binary stats retrieval for a VCPU") Fixes: fcfe1baeddbf ("KVM: stats: Support binary stats retrieval for a VM") Reported-by: Zheng Zhang <zheng.zhang@email.ucr.edu> Closes: https://lore.kernel.org/all/CAC_GQSr3xzZaeZt85k_RCBd5kfiOve8qXo7a81Cq53LuVQ5r=Q@mail.gmail.com Cc: stable@vger.kernel.org Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Kees Cook <keescook@chromium.org>
Message-Id: <20230711230131.648752-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, increasing raw_dev->count happens before invoke the
raw_queue_event(), if the raw_queue_event() return error, invoke
raw_release() will not trigger the dev_free() to be called.
[ 268.905865][ T5067] raw-gadget.0 gadget.0: failed to queue event
[ 268.912053][ T5067] udc dummy_udc.0: failed to start USB Raw Gadget: -12
[ 268.918885][ T5067] raw-gadget.0: probe of gadget.0 failed with error -12
[ 268.925956][ T5067] UDC core: USB Raw Gadget: couldn't find an available UDC or it's busy
[ 268.934657][ T5067] misc raw-gadget: fail, usb_gadget_register_driver returned -16
The legacy gadget driver omitted calling usb_gadget_check_config()
to ensure that the USB device controller (UDC) has adequate resources,
including sufficient endpoint numbers and types, to support the given
configuration.
Previously, usb_add_config() was solely invoked by the legacy gadget
driver. Adds the necessary usb_gadget_check_config() after the bind()
operation to fix the issue.
Fixes: dce49449e04f ("usb: cdns3: allocate TX FIFO size according to composite EP number") Cc: stable <stable@kernel.org> Reported-by: Ravi Gunasekaran <r-gunasekaran@ti.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20230707230015.494999-1-Frank.Li@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The reverted commit was based on static analysis and a misunderstanding
of how PTR_ERR() and NULLs are supposed to work. When a function
returns both pointer errors and NULL then normally the NULL means
"continue operating without a feature because it was deliberately
turned off". The NULL should not be treated as a failure. If a driver
cannot work when that feature is disabled then the KConfig should
enforce that the function cannot return NULL. We should not need to
test for it.
In this driver, the bug means that probe cannot succeed when CONFIG_PM
is disabled.
The cause of the warning is in __ftrace_event_enable_disable(),
trace_buffered_event_enable() was called once while
trace_buffered_event_disable() was called twice.
Reproduction script show as below, for analysis, see the comments:
```
#!/bin/bash
cd /sys/kernel/tracing/
# 1. Register a 'disable_event' command, then:
# 1) SOFT_DISABLED_BIT was set;
# 2) trace_buffered_event_enable() was called first time;
echo 'cmdline_proc_show:disable_event:initcall:initcall_finish' > \
set_ftrace_filter
# 2. Enable the event registered, then:
# 1) SOFT_DISABLED_BIT was cleared;
# 2) trace_buffered_event_disable() was called first time;
echo 1 > events/initcall/initcall_finish/enable
# 3. Try to call into cmdline_proc_show(), then SOFT_DISABLED_BIT was
# set again!!!
cat /proc/cmdline
# 4. Unregister the 'disable_event' command, then:
# 1) SOFT_DISABLED_BIT was cleared again;
# 2) trace_buffered_event_disable() was called second time!!!
echo '!cmdline_proc_show:disable_event:initcall:initcall_finish' > \
set_ftrace_filter
```
To fix it, IIUC, we can change to call trace_buffered_event_enable() at
fist time soft-mode enabled, and call trace_buffered_event_disable() at
last time soft-mode disabled.
When pages are removed in rb_remove_pages(), 'cpu_buffer->read' is set
to 0 in order to make sure any read iterators reset themselves. However,
this will mess 'entries' stating, see following steps:
# cd /sys/kernel/tracing/
# 1. Enlarge ring buffer prepare for later reducing:
# echo 20 > per_cpu/cpu0/buffer_size_kb
# 2. Write a log into ring buffer of cpu0:
# taskset -c 0 echo "hello1" > trace_marker
# 3. Read the log:
# cat per_cpu/cpu0/trace_pipe
<...>-332 [000] ..... 62.406844: tracing_mark_write: hello1
# 4. Stop reading and see the stats, now 0 entries, and 1 event readed:
# cat per_cpu/cpu0/stats
entries: 0
[...]
read events: 1
# 5. Reduce the ring buffer
# echo 7 > per_cpu/cpu0/buffer_size_kb
# 6. Now entries became unexpected 1 because actually no entries!!!
# cat per_cpu/cpu0/stats
entries: 1
[...]
read events: 0
To fix it, introduce 'page_removed' field to count total removed pages
since last reset, then use it to let read iterators reset themselves
instead of changing the 'read' pointer.
Link: https://lore.kernel.org/linux-trace-kernel/20230724054040.3489499-1-zhengyejian1@huawei.com Cc: <mhiramat@kernel.org> Cc: <vnagarnaik@google.com> Fixes: 83f40318dab0 ("ring-buffer: Make removal of ring buffer pages atomic") Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
If the second call to amdgpu_bo_create_kernel() fails, the memory
allocated from the first call should be cleared. If the third call
fails, the memory from the second call should be cleared.
Fixes: b95b5391684b ("drm/amdgpu/psp: move PSP memory alloc from hw_init to sw_init") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
__md_stop_writes() and __md_stop() will modify many fields that are
protected by 'reconfig_mutex', and all the callers will grab
'reconfig_mutex' except for md_stop().
Also, update md_stop() to make certain 'reconfig_mutex' is held using
lockdep_assert_held().
Fix the symbolic names for zone conditions in the blkzoned.h header
file.
Cc: Hannes Reinecke <hare@suse.de> Cc: Damien Le Moal <dlemoal@kernel.org> Fixes: 6a0cb1bc106f ("block: Implement support for zoned block devices") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20230706201422.3987341-1-bvanassche@acm.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
HW may generate completions that indicates QP is destroyed.
Driver should not be scheduling any more completion handlers
for this QP, after the QP is destroyed. Since CQs are active
during the QP destroy, driver may still schedule completion
handlers. This can cause a race where the destroy_cq and poll_cq
running simultaneously.
Snippet of kernel panic while doing bnxt_re driver load unload in loop.
This indicates a poll after the CQ is freed.
To avoid this, complete all completion handlers before returning the
destroy QP. If free_cq is called soon after destroy_qp, IB stack
will cancel the CQ work before invoking the destroy_cq verb and
this will prevent any race mentioned.
Commit 21c2fe94abb2 ("RDMA/mthca: Combine special QP struct with mthca QP")
introduced a new struct mthca_sqp which doesn't contain struct mthca_qp
any longer. Placing a pointer of this new struct into qptable leads
to crashes, because mthca_poll_one() expects a qp pointer. Fix this
by putting the correct pointer into qptable.
Fixes: 21c2fe94abb2 ("RDMA/mthca: Combine special QP struct with mthca QP") Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Link: https://lore.kernel.org/r/20230713141658.9426-1-tbogendoerfer@suse.de Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
KCSAN detects a data race on cqp_request->request_done memory location
which is accessed locklessly in irdma_handle_cqp_op while being
updated in irdma_cqp_ce_handler.
Annotate lockless intent with READ_ONCE/WRITE_ONCE to avoid any
compiler optimizations like load fusing and/or KCSAN warning.
[222808.417128] BUG: KCSAN: data-race in irdma_cqp_ce_handler [irdma] / irdma_wait_event [irdma]
[222808.417532] write to 0xffff8e44107019dc of 1 bytes by task 29658 on cpu 5:
[222808.417610] irdma_cqp_ce_handler+0x21e/0x270 [irdma]
[222808.417725] cqp_compl_worker+0x1b/0x20 [irdma]
[222808.417827] process_one_work+0x4d1/0xa40
[222808.417835] worker_thread+0x319/0x700
[222808.417842] kthread+0x180/0x1b0
[222808.417852] ret_from_fork+0x22/0x30
CQP completion statistics is read lockesly in irdma_wait_event and
irdma_check_cqp_progress while it can be updated in the completion
thread irdma_sc_ccq_get_cqe_info on another CPU as KCSAN reports.
Make completion statistics an atomic variable to reflect coherent updates
to it. This will also avoid load/store tearing logic bug potentially
possible by compiler optimizations.
[77346.170861] BUG: KCSAN: data-race in irdma_handle_cqp_op [irdma] / irdma_sc_ccq_get_cqe_info [irdma]
[77346.171383] write to 0xffff8a3250b108e0 of 8 bytes by task 9544 on cpu 4:
[77346.171483] irdma_sc_ccq_get_cqe_info+0x27a/0x370 [irdma]
[77346.171658] irdma_cqp_ce_handler+0x164/0x270 [irdma]
[77346.171835] cqp_compl_worker+0x1b/0x20 [irdma]
[77346.172009] process_one_work+0x4d1/0xa40
[77346.172024] worker_thread+0x319/0x700
[77346.172037] kthread+0x180/0x1b0
[77346.172054] ret_from_fork+0x22/0x30
On code inspection, there are many instances in the driver where
CEQE and AEQE fields written to by HW are read without guaranteeing
that the polarity bit has been read and checked first.
Add a read barrier to avoid reordering of loads on the CEQE/AEQE fields
prior to checking the polarity bit.
This code is trying to ensure that only the flags specified in the list
are allowed. The problem is that ucmd->rx_hash_fields_mask is a u64 and
the flags are an enum which is treated as a u32 in this context. That
means the test doesn't check whether the highest 32 bits are zero.
If tipc_link_bc_create() fails inside tipc_node_create() for a newly
allocated tipc node then we should stop its tipc crypto and free the
resources allocated with a call to tipc_crypto_start().
As the node ref is initialized to one to that point, just put the ref on
tipc_link_bc_create() error case that would lead to tipc_node_free() be
eventually executed and properly clean the node and its crypto resources.
Found by Linux Verification Center (linuxtesting.org).
Fixes: cb8092d70a6f ("tipc: move bc link creation back to tipc_node_create") Suggested-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Reviewed-by: Xin Long <lucien.xin@gmail.com> Link: https://lore.kernel.org/r/20230725214628.25246-1-pchelkin@ispras.ru Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
in be_lancer_xmit_workarounds(), it should go to label 'tx_drop'
if an unexpected value is returned by pskb_trim().
Fixes: 93040ae5cc8d ("be2net: Fix to trim skb for padded vlan packets to workaround an ASIC Bug") Signed-off-by: Yuanjun Gong <ruc_gongyuanjun@163.com> Link: https://lore.kernel.org/r/20230725032726.15002-1-ruc_gongyuanjun@163.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The nla_for_each_nested parsing in function mqprio_parse_nlattr() does
not check the length of the nested attribute. This can lead to an
out-of-attribute read and allow a malformed nlattr (e.g., length 0) to
be viewed as 8 byte integer and passed to priv->max_rate/min_rate.
This patch adds the check based on nla_len() when check the nla_type(),
which ensures that the length of these two attribute must equals
sizeof(u64).
Fixes: 4e8b86c06269 ("mqprio: Introduce new hardware offload mode and shaper in mqprio") Reviewed-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Lin Ma <linma@zju.edu.cn> Link: https://lore.kernel.org/r/20230725024227.426561-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Netlink attribute parsing in mqprio is a minesweeper game, with many
options having the possibility of being passed incorrectly and the user
being none the wiser.
Try to make errors less sour by giving user space some information
regarding what went wrong.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ferenc Fejes <fejes@inf.elte.hu> Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 6c58c8816abb ("net/sched: mqprio: Add length check for TCA_MQPRIO_{MAX/MIN}_RATE64") Signed-off-by: Sasha Levin <sashal@kernel.org>
mqprio_init() is quite large and unwieldy to add more code to.
Split the netlink attribute parsing to a dedicated function.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Stable-dep-of: 6c58c8816abb ("net/sched: mqprio: Add length check for TCA_MQPRIO_{MAX/MIN}_RATE64") Signed-off-by: Sasha Levin <sashal@kernel.org>
Bail out with EOPNOTSUPP when adding rule to bound chain via
NFTA_RULE_CHAIN_ID. The following warning splat is shown when
adding a rule to a deleted bound chain:
The lazy gc on insert that should remove timed-out entries fails to release
the other half of the interval, if any.
Can be reproduced with tests/shell/testcases/sets/0044interval_overlap_0
in nftables.git and kmemleak enabled kernel.
Second bug is the use of rbe_prev vs. prev pointer.
If rbe_prev() returns NULL after at least one iteration, rbe_prev points
to element that is not an end interval, hence it should not be removed.
Lastly, check the genmask of the end interval if this is active in the
current generation.
Fixes: c9e6978e2725 ("netfilter: nft_set_rbtree: Switch to node list walk for overlap detection") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
The Xeon validation group has been carrying out some loaded tests
with various HW configurations, and they have seen some transmit
queue time out happening during the test. This will cause the
reset adapter function to be called by igc_tx_timeout().
Similar race conditions may arise when the interface is being brought
down and up in igc_reinit_locked(), an interrupt being generated, and
igc_clean_tx_irq() being called to complete the TX.
When the igc_tx_timeout() function is invoked, this patch will turn
off all TX ring HW queues during igc_down() process. TX ring HW queues
will be activated again during the igc_configure_tx_ring() process
when performing the igc_up() procedure later.
This patch also moved existing igc_disable_tx_ring_hw() to avoid using
forward declaration.
Only the HW rfkill state is toggled on laptops with quirks->ec_read_only
(so far only MSI Wind U90/U100). There are, however, a few issues with
the implementation:
1. The initial HW state is always unblocked, regardless of the actual
state on boot, because msi_init_rfkill only sets the SW state,
regardless of ec_read_only.
2. The initial SW state corresponds to the actual state on boot, but it
can't be changed afterwards, because set_device_state returns
-EOPNOTSUPP. It confuses the userspace, making Wi-Fi and/or Bluetooth
unusable if it was blocked on boot, and breaking the airplane mode if
the rfkill was unblocked on boot.
Address the above issues by properly initializing the HW state on
ec_read_only laptops and by allowing the userspace to toggle the SW
state. Don't set the SW state ourselves and let the userspace fully
control it. Toggling the SW state is a no-op, however, it allows the
userspace to properly toggle the airplane mode. The actual SW radio
disablement is handled by the corresponding rtl818x_pci and btusb
drivers that have their own rfkills.
Tested on MSI Wind U100 Plus, BIOS ver 1.0G, EC ver 130.
Fixes: 0816392b97d4 ("msi-laptop: merge quirk tables to one") Fixes: 0de6575ad0a8 ("msi-laptop: Add MSI Wind U90/U100 support") Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com> Link: https://lore.kernel.org/r/20230721145423.161057-1-maxtram95@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
commit a3a57bf07de23fe1ff779e0fdf710aa581c3ff73 ("net: stmmac: work
around sporadic tx issue on link-up") worked around a problem with TX
sometimes not working after a link-up by avoiding a redundant write to
MAC_CTRL_REG (aka GMAC_CONFIG), since the IP appeared to have problems
with handling multiple writes to that register in some cases.
That commit however only added the work around to dwmac_lib.c (apart
from the common code in stmmac_main.c), but my systems with version
4.21a of the IP exhibit the same problem, so add the work around to
dwmac4_lib.c too.
Fixes: a3a57bf07de2 ("net: stmmac: work around sporadic tx issue on link-up") Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230721-stmmac-tx-workaround-v1-1-9411cbd5ee07@axis.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When adding a point to point downlink to team device, we neglected to reset
the team's flags, which were still using flags like BROADCAST and
MULTICAST. Consequently, this would initiate ARP/DAD for P2P downlink
interfaces, such as when adding a GRE device to team device. Fix this by
remove multicast/broadcast flags and add p2p and noarp flags.
After removing the none ethernet interface and adding an ethernet interface
to team, we need to reset team interface flags. Unlike bonding interface,
team do not need restore IFF_MASTER, IFF_SLAVE flags.
Reported-by: Liang Li <liali@redhat.com> Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2221438 Fixes: 1d76efe1577b ("team: add support for non-ethernet devices") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
When adding a point to point downlink to the bond, we neglected to reset
the bond's flags, which were still using flags like BROADCAST and
MULTICAST. Consequently, this would initiate ARP/DAD for P2P downlink
interfaces, such as when adding a GRE device to the bonding.
To address this issue, let's reset the bond's flags for P2P interfaces.
Before fix:
7: gre0@NONE: <POINTOPOINT,NOARP,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UNKNOWN group default qlen 1000
link/gre6 2006:70:10::1 peer 2006:70:10::2 permaddr 167f:18:f188::
8: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/gre6 2006:70:10::1 brd 2006:70:10::2
inet6 fe80::200:ff:fe00:0/64 scope link
valid_lft forever preferred_lft forever
After fix:
7: gre0@NONE: <POINTOPOINT,NOARP,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond2 state UNKNOWN group default qlen 1000
link/gre6 2006:70:10::1 peer 2006:70:10::2 permaddr c29e:557a:e9d9::
8: bond0: <POINTOPOINT,NOARP,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/gre6 2006:70:10::1 peer 2006:70:10::2
inet6 fe80::1/64 scope link
valid_lft forever preferred_lft forever
Reported-by: Liang Li <liali@redhat.com> Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2221438 Fixes: 872254dd6b1f ("net/bonding: Enable bonding to enslave non ARPHRD_ETHER") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fix ethtool FDIR logic to not use memory after its release.
In the ice_ethtool_fdir.c file there are 2 spots where code can
refer to pointers which may be missing.
In the ice_cfg_fdir_xtrct_seq() function seg may be freed but
even then may be still used by memcpy(&tun_seg[1], seg, sizeof(*seg)).
In the ice_add_fdir_ethtool() function struct ice_fdir_fltr *input
may first fail to be added via ice_fdir_update_list_entry() but then
may be deleted by ice_fdir_update_list_entry.
Terminate in both cases when the returned value of the previous
operation is other than 0, free memory and don't use it anymore.
Reported-by: Michal Schmidt <mschmidt@redhat.com> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2208423 Fixes: cac2a27cd9ab ("ice: Support IPv4 Flow Director filters") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20230721155854.1292805-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
For both IPv4 and IPv6 incoming TCP connections are tracked in a hash
table with a hash over the source & destination addresses and ports.
However, the IPv6 hash is insufficient and can lead to a high rate of
collisions.
The IPv6 hash used an XOR to fit everything into the 96 bits for the
fast jenkins hash, meaning it is possible for an external entity to
ensure the hash collides, thus falling back to a linear search in the
bucket, which is slow.
We take the approach of hash the full length of IPv6 address in
__ipv6_addr_jhash() so that all users can benefit from a more secure
version.
While this may look like it adds overhead, the reality of modern CPUs
means that this is unmeasurable in real world scenarios.
In simulating with llvm-mca, the increase in cycles for the hashing
code was ~16 cycles on Skylake (from a base of ~155), and an extra ~9
on Nehalem (base of ~173).
In commit dd6d2910c5e0 ("netfilter: conntrack: switch to siphash")
netfilter switched from a jenkins hash to a siphash, but even the faster
hsiphash is a more significant overhead (~20-30%) in some preliminary
testing. So, in this patch, we keep to the more conservative approach to
ensure we don't add much overhead per SYN.
In testing, this results in a consistently even spread across the
connection buckets. In both testing and real-world scenarios, we have
not found any measurable performance impact.
Fixes: 08dcdbf6a7b9 ("ipv6: use a stronger hash for tcp") Signed-off-by: Stewart Smith <trawets@amazon.com> Signed-off-by: Samuel Mendoza-Jonas <samjonas@amazon.com> Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230721222410.17914-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
# ip link add dummy1 type dummy
# echo 1 > /proc/sys/net/ipv6/conf/dummy1/use_tempaddr
# ip link set dummy1 up
# ip -6 addr add 2000::1/64 mngtmpaddr dev dummy1
# ip -6 addr show dev dummy1
11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
inet6 2000::44f3:581c:8ca:3983/64 scope global temporary dynamic
valid_lft 604800sec preferred_lft 86172sec
inet6 2000::1/64 scope global mngtmpaddr
valid_lft forever preferred_lft forever
inet6 fe80::e8a8:a6ff:fed5:56d4/64 scope link
valid_lft forever preferred_lft forever
# ip -6 addr del 2000::44f3:581c:8ca:3983/64 dev dummy1
(can wait a few seconds if you want to, the above delete isn't [directly] the problem)
# ip -6 addr show dev dummy1
11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
inet6 2000::1/64 scope global mngtmpaddr
valid_lft forever preferred_lft forever
inet6 fe80::e8a8:a6ff:fed5:56d4/64 scope link
valid_lft forever preferred_lft forever
# ip -6 addr del 2000::1/64 mngtmpaddr dev dummy1
# ip -6 addr show dev dummy1
11: dummy1: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
inet6 2000::81c9:56b7:f51a:b98f/64 scope global temporary dynamic
valid_lft 604797sec preferred_lft 86169sec
inet6 fe80::e8a8:a6ff:fed5:56d4/64 scope link
valid_lft forever preferred_lft forever
This patch prevents this new 'global temporary dynamic' address from being
created by the deletion of the related (same subnet prefix) 'mngtmpaddr'
(which is triggered by there already being no temporary addresses).
Cc: Jiri Pirko <jiri@resnulli.us> Fixes: 53bd67491537 ("ipv6 addrconf: introduce IFA_F_MANAGETEMPADDR to tell kernel to manage temporary addresses") Reported-by: Xiao Ma <xiaom@google.com> Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230720160022.1887942-1-maze@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The size of array 'priv->ports[]' is INNO_PHY_PORT_NUM.
In the for loop, 'i' is used as the index for array 'priv->ports[]'
with a check (i > INNO_PHY_PORT_NUM) which indicates that
INNO_PHY_PORT_NUM is allowed value for 'i' in the same loop.
This > comparison needs to be changed to >=, otherwise it potentially leads
to an out of bounds write on the next iteration through the loop
VXLAN-GPE does not add an extra inner Ethernet header. Take that into
account when calculating header length.
This causes problems in skb_tunnel_check_pmtu, where incorrect PMTU is
cached.
In the collect_md mode (which is the only mode that VXLAN-GPE
supports), there's no magic auto-setting of the tunnel interface MTU.
It can't be, since the destination and thus the underlying interface
may be different for each packet.
So, the administrator is responsible for setting the correct tunnel
interface MTU. Apparently, the administrators are capable enough to
calculate that the maximum MTU for VXLAN-GPE is (their_lower_MTU - 36).
They set the tunnel interface MTU to 1464. If you run a TCP stream over
such interface, it's then segmented according to the MTU 1464, i.e.
producing 1514 bytes frames. Which is okay, this still fits the lower
MTU.
However, skb_tunnel_check_pmtu (called from vxlan_xmit_one) uses 50 as
the header size and thus incorrectly calculates the frame size to be
1528. This leads to ICMP too big message being generated (locally),
PMTU of 1450 to be cached and the TCP stream to be resegmented.
The fix is to use the correct actual header size, especially for
skb_tunnel_check_pmtu calculation.
Fixes: e1e5314de08ba ("vxlan: implement GPE") Signed-off-by: Jiri Benc <jbenc@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
In dwrr mode, the default bandwidth weight of disabled tc is set to 0.
If the bandwidth weight is 0, the mode will change to sp.
Therefore, disabled tc default bandwidth weight need changed to 1,
and 0 is returned when query the bandwidth weight of disabled tc.
In addition, driver need stop configure bandwidth weight if tc is disabled.
Fixes: 848440544b41 ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver") Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently, the weight saved by the driver is used as the query result,
which may be different from the actual weight in the register.
Therefore, the register value read from the firmware is used
as the query result
Fixes: 0e32038dc856 ("net: hns3: refactor dump tc of debugfs") Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Clear MV_V2_PORT_CTRL_PWRDOWN bit to set power up for 88x3310 PHY,
it sometimes does not take effect immediately. And a read of this
register causes the bit not to clear. This will cause mv3310_reset()
to time out, which will fail the config initialization. So add a delay
before the next access.
Fixes: c9cc1c815d36 ("net: phy: marvell10g: place in powersave mode at probe") Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
In iavf_adminq_task(), if the function can't acquire the
adapter->crit_lock, it checks if the driver is removing. If so, it simply
exits without re-enabling the interrupt. This is done to ensure that the
task stops processing as soon as possible once the driver is being removed.
However, if the IAVF_FLAG_PF_COMMS_FAILED is set, the function checks this
before attempting to acquire the lock. In this case, the function exits
early and re-enables the interrupt. This will happen even if the driver is
already removing.
Avoid this, by moving the check to after the adapter->crit_lock is
acquired. This way, if the driver is removing, we will not re-enable the
interrupt.
Fixes: fc2e6b3b132a ("iavf: Rework mutexes for better synchronisation") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>