The discovering of sas port is driven by workqueue in libsas. When libsas
is processing port events or phy events in workqueue, new events may rise
up and change the state of some structures such as asd_sas_phy. This may
cause some problems such as follows:
==>thread 1 ==>thread 2
==>phy up
==>phy_up_v3_hw()
==>oob_mode = SATA_OOB_MODE;
==>phy down quickly
==>hisi_sas_phy_down()
==>sas_ha->notify_phy_event()
==>sas_phy_disconnected()
==>oob_mode = OOB_NOT_CONNECTED
==>workqueue wakeup
==>sas_form_port()
==>sas_discover_domain()
==>sas_get_port_device()
==>oob_mode is OOB_NOT_CONNECTED and device
is wrongly taken as expander
This at last lead to the panic when libsas trying to issue a command to
discover the device.
Fixes: 2908d778ab3e ("[SCSI] aic94xx: new driver") Link: https://lore.kernel.org/r/20191206011118.46909-1-yanaijie@huawei.com Reported-by: Gao Chuan <gaochuan4@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
mwifiex_process_country_ie() function parse elements of bss
descriptor in beacon packet. When processing WLAN_EID_COUNTRY
element, there is no upper limit check for country_ie_len before
calling memcpy. The destination buffer domain_info->triplet is an
array of length MWIFIEX_MAX_TRIPLET_802_11D(83). The remote
attacker can build a fake AP with the same ssid as real AP, and
send malicous beacon packet with long WLAN_EID_COUNTRY elemen
(country_ie_len > 83). Attacker can force STA connect to fake AP
on a different channel. When the victim STA connects to fake AP,
will trigger the heap buffer overflow. Fix this by checking for
length and if found invalid, don not connect to the AP.
This fix addresses CVE-2019-14895.
Reported-by: huangwen <huangwenabc@gmail.com> Signed-off-by: Ganapathi Bhat <gbhat@marvell.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
[bwh: Backported to 3.16:
- Use wiphy_dbg() instead of mwifiex_dbg()
- Adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
If device has already received country information from
EEPROM, we won't parse AP's country IE and download it to
firmware. We will also set regulatory flags to disable beacon
hints and ignore country IE.
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: Cathy Luo <cluo@marvell.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
[bwh: Backported to 3.16: adjust filenames, context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Memory state around the buggy address: ffff8881f59a6a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8881f59a6a80: 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc
>ffff8881f59a6b00: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
^ ffff8881f59a6b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8881f59a6c00: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
cpia2_init does not check return value of cpia2_init, if it failed
in usb_register_driver, there is already cleanup using driver_unregister.
No need call cpia2_usb_cleanup on module exit.
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
It's possible to specify a non-zero s_want_extra_isize via debugging
option, and this can cause bad things(tm) to happen when using a file
system with an inode size of 128 bytes.
Add better checking when the file system is mounted, as well as when
we are actually doing the trying to do the inode expansion.
Link: https://lore.kernel.org/r/20191110121510.GH23325@mit.edu Reported-by: syzbot+f8d6f8386ceacdbfff57@syzkaller.appspotmail.com Reported-by: syzbot+33d7ea72e47de3bdf4e1@syzkaller.appspotmail.com Reported-by: syzbot+44b6763edfc17144296f@syzkaller.appspotmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
[bwh: Backported to 3.16:
- Use EIO instead of EFSCORRUPTED
- Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Ben Hutchings [Tue, 7 Jan 2020 20:49:33 +0000 (20:49 +0000)]
ext4: Introduce ext4_clamp_want_extra_isize()
Based on commit 7bc04c5c2cc4 "ext4: fix use-after-free race with
debug_want_extra_isize". We don't have that bug but this will make it
easier to backport commit 4ea99936a163 "ext4: add more paranoia
checking in ext4_expand_extra_isize handling".
This was always meant to be a temporary thing, just for testing and to
see if it actually ever triggered.
The only thing that reported it was syzbot doing disk image fuzzing, and
then that warning is expected. So let's just remove it before -rc4,
because the extra sanity testing should probably go to -stable, but we
don't want the warning to do so.
Reported-by: syzbot+3031f712c7ad5dd4d926@syzkaller.appspotmail.com Fixes: 8a23eb804ca4 ("Make filldir[64]() verify the directory entry filename is valid") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Siddharth Chandrasekaran <csiddharth@vmware.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
This has been discussed several times, and now filesystem people are
talking about doing it individually at the filesystem layer, so head
that off at the pass and just do it in getdents{64}().
This is partially based on a patch by Jann Horn, but checks for NUL
bytes as well, and somewhat simplified.
There's also commentary about how it might be better if invalid names
due to filesystem corruption don't cause an immediate failure, but only
an error at the end of the readdir(), so that people can still see the
filenames that are ok.
There's also been discussion about just how much POSIX strictly speaking
requires this since it's about filesystem corruption. It's really more
"protect user space from bad behavior" as pointed out by Jann. But
since Eric Biederman looked up the POSIX wording, here it is for context:
"From readdir:
The readdir() function shall return a pointer to a structure
representing the directory entry at the current position in the
directory stream specified by the argument dirp, and position the
directory stream at the next entry. It shall return a null pointer
upon reaching the end of the directory stream. The structure dirent
defined in the <dirent.h> header describes a directory entry.
From definitions:
3.129 Directory Entry (or Link)
An object that associates a filename with a file. Several directory
entries can associate names with the same file.
...
3.169 Filename
A name consisting of 1 to {NAME_MAX} bytes used to name a file. The
characters composing the name may be selected from the set of all
character values excluding the slash character and the null byte. The
filenames dot and dot-dot have special meaning. A filename is
sometimes referred to as a 'pathname component'."
Note that I didn't bother adding the checks to any legacy interfaces
that nobody uses.
Also note that if this ends up being noticeable as a performance
regression, we can fix that to do a much more optimized model that
checks for both NUL and '/' at the same time one word at a time.
We haven't really tended to optimize 'memchr()', and it only checks for
one pattern at a time anyway, and we really _should_ check for NUL too
(but see the comment about "soft errors" in the code about why it
currently only checks for '/')
See the CONFIG_DCACHE_WORD_ACCESS case of hash_name() for how the name
lookup code looks for pathname terminating characters in parallel.
Link: https://lore.kernel.org/lkml/20190118161440.220134-2-jannh@google.com/ Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jann Horn <jannh@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Siddharth Chandrasekaran <csiddharth@vmware.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
A race in xhci USB3 remote wake handling may force device back to suspend
after it initiated resume siganaling, causing a missed resume event or warm
reset of device.
When a USB3 link completes resume signaling and goes to enabled (UO)
state a interrupt is issued and the interrupt handler will clear the
bus_state->port_remote_wakeup resume flag, allowing bus suspend.
If the USB3 roothub thread just finished reading port status before
the interrupt, finding ports still in suspended (U3) state, but hasn't
yet started suspending the hub, then the xhci interrupt handler will clear
the flag that prevented roothub suspend and allow bus to suspend, forcing
all port links back to suspended (U3) state.
Example case:
usb_runtime_suspend() # because all ports still show suspended U3
usb_suspend_both()
hub_suspend(); # successful as hub->wakeup_bits not set yet
==> INTERRUPT
xhci_irq()
handle_port_status()
clear bus_state->port_remote_wakeup
usb_wakeup_notification()
sets hub->wakeup_bits;
kick_hub_wq()
<== END INTERRUPT
hcd_bus_suspend()
xhci_bus_suspend() # success as port_remote_wakeup bits cleared
Fix this by increasing roothub usage count during port resume to prevent
roothub autosuspend, and by making sure bus_state->port_remote_wakeup
flag is only cleared after resume completion is visible, i.e.
after xhci roothub returned U0 or other non-U3 link state link on a
get port status request.
Issue rootcaused by Chiasheng Lee
Cc: Lee, Hou-hsun <hou-hsun.lee@intel.com> Reported-by: Lee, Chiasheng <chiasheng.lee@intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20191211142007.8847-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[Mathias Nyman: Backport for 4.9 and 4.4 stable kernels]
[bwh: Backported to 3.16: USB 3.0 SS is the highest speed we handle] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
NL80211_TX_POWER_LIMITED was treated as NL80211_TX_POWER_AUTOMATIC,
which is the opposite of what should happen and can cause nasty
regulatory problems.
if/else converted to a switch without default to make gcc warn
on unhandled enum values.
Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
[bwh: Backported to 3.16: adjust filenames] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
bam_dma_terminate_all() will leak resources if any of the transactions are
committed to the hardware (present in the desc fifo), and not complete.
Since bam_dma_terminate_all() does not cause the hardware to be updated,
the hardware will still operate on any previously committed transactions.
This can cause memory corruption if the memory for the transaction has been
reassigned, and will cause a sync issue between the BAM and its client(s).
Fix this by properly updating the hardware in bam_dma_terminate_all().
Fixes: e7c0fe2a5c84 ("dmaengine: add Qualcomm BAM dma driver") Signed-off-by: Jeffrey Hugo <jeffrey.l.hugo@gmail.com> Link: https://lore.kernel.org/r/20191017152606.34120-1-jeffrey.l.hugo@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
[Jeffrey Hugo: Backported to 4.4 which is lacking 6b4faeac05bc
("dmaengine: qcom-bam: Process multiple pending descriptors")] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
It is completely unused and implemented only on x86.
Remove it.
Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170526172900.91058-1-dvyukov@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.16 because this function is broken after
"x86/atomic: Fix smp_mb__{before,after}_atomic()":
- Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20140508135851.768177189@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bwh: Backported to 3.16 because this function is broken after
"x86/atomic: Fix smp_mb__{before,after}_atomic()"] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Recent probing at the Linux Kernel Memory Model uncovered a
'surprise'. Strongly ordered architectures where the atomic RmW
primitive implies full memory ordering and
smp_mb__{before,after}_atomic() are a simple barrier() (such as x86)
fail for:
Because, while the atomic_inc() implies memory order, it
(surprisingly) does not provide a compiler barrier. This then allows
the compiler to re-order like so:
Debug exception handlers may be called for exceptions generated both by
user and kernel code. In many cases, this is checked explicitly, but
in other cases things either happen to work by happy accident or they
go slightly wrong. For example, executing 'brk #4' from userspace will
enter the kprobes code and be ignored, but the instruction will be
retried forever in userspace instead of delivering a SIGTRAP.
Fix this issue in the most stable-friendly fashion by simply adding
explicit checks of the triggering exception level to all of our debug
exception handlers.
Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
FAR_EL1 is UNKNOWN for all debug exceptions other than those caused by
taking a hardware watchpoint. Unfortunately, if a debug handler returns
a non-zero value, then we will propagate the UNKNOWN FAR value to
userspace via the si_addr field of the SIGTRAP siginfo_t.
Instead, let's set si_addr to take on the PC of the faulting instruction,
which we have available in the current pt_regs.
Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently stmmac driver not copying the valid ethernet
MAC address to MAC registers. This patch takes care
of updating the MAC register with MAC address.
Signed-off-by: Bhadram Varka <vbhadram@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Arnd Bergmann <arnd@arndb.de>
[bwh: Backported to 3.16:
- Pass priv->ioaddr as first argument to set_umac_addr operation
- Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
As is the case for a number of other architectures that have a 32-bit
compat mode, enable KEYS_COMPAT if both COMPAT and KEYS are enabled.
This allows AArch32 programs to use the keyctl() system call when
running on an AArch64 kernel.
Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Ilan reported that sometimes nl80211 messages weren't working if
the frames being transported got very large, which was really a
problem for userspace-to-kernel messages, but prompted me to look
at the code.
Upon review, I found various places where variable-length data is
transported in an nl80211 message but the message isn't allocated
taking that into account. This shouldn't cause any problems since
the frames aren't really that long, apart in one place where two
(possibly very long frames) might not fit.
Fix all the places (that I found) that get variable length data
from the driver and put it into a message to take the length of
the variable data into account. The 100 there is just a safe
constant for the remaining message overhead (it's usually around
50 for most messages.)
Signed-off-by: Johannes Berg <johannes.berg@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
with CONFIG_HZ=100, the precision of jiffies is 10ms, and the
generic_cmd6_time of some card is also 10ms. then, may be current
time is only 5ms, but already timed out caused by jiffies precision.
The rtc-lib dependency is not required, and seems it was just
copy-pasted from ARM's Kconfig. If platform requires rtc-lib,
they should select it individually.
Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
User space Android code identifies pixclock == 0 as a sign for emulation
and will set the frame rate to 60 fps when reading this value, which is
the desired outcome.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Roman Kiryanov <rkir@google.com> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When rndis data transfer is in progress, some Windows7 Host PC is not
sending the GET_ENCAPSULATED_RESPONSE command for receiving the response
for the previous SEND_ENCAPSULATED_COMMAND processed.
The rndis function driver appends each response for the
SEND_ENCAPSULATED_COMMAND in a queue. As the above process got corrupted,
the Host sends a REMOTE_NDIS_RESET_MSG command to do a soft-reset.
As the rndis response queue is not freed, the previous response is sent
as a part of this REMOTE_NDIS_RESET_MSG's reset response and the Host
block any more Rndis transfers.
Hence free the rndis response queue as a part of this soft-reset so that
the correct response for REMOTE_NDIS_RESET_MSG is sent properly during the
response command.
Signed-off-by: Rajkumar Raghupathy <raghup@codeaurora.org> Signed-off-by: Xerox Lin <xerox_lin@htc.com>
[AmitP: Cherry-picked this patch and folded other relevant
fixes from Android common kernel android-4.4] Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de>
[bwh: Backported to 3.16:
- Pass configNr instead of params as first argument to
rndis_{get_next,free}_response()
- Adjust filename, context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
There may be a race condition if f_fs calls unregister_gadget_item in
ffs_closed() when unregister_gadget is called by UDC store at the same time.
this leads to a kernel NULL pointer dereference:
[ 310.644928] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[ 310.645053] init: Service 'adbd' is being killed...
[ 310.658938] pgd = c9528000
[ 310.662515] [00000004] *pgd=19451831, *pte=00000000, *ppte=00000000
[ 310.669702] Internal error: Oops: 817 [#1] PREEMPT SMP ARM
[ 310.675211] Modules linked in:
[ 310.678294] CPU: 0 PID: 1537 Comm: ->transport Not tainted 4.1.15-03725-g793404c #2
[ 310.685958] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[ 310.692493] task: c8e24200 ti: c945e000 task.ti: c945e000
[ 310.697911] PC is at usb_gadget_unregister_driver+0xb4/0xd0
[ 310.703502] LR is at __mutex_lock_slowpath+0x10c/0x16c
[ 310.708648] pc : [<c075efc0>] lr : [<c0bfb0bc>] psr: 600f0113
<snip..>
[ 311.565585] [<c075efc0>] (usb_gadget_unregister_driver) from [<c075e2b8>] (unregister_gadget_item+0x1c/0x34)
[ 311.575426] [<c075e2b8>] (unregister_gadget_item) from [<c076fcc8>] (ffs_closed+0x8c/0x9c)
[ 311.583702] [<c076fcc8>] (ffs_closed) from [<c07736b8>] (ffs_data_reset+0xc/0xa0)
[ 311.591194] [<c07736b8>] (ffs_data_reset) from [<c07738ac>] (ffs_data_closed+0x90/0xd0)
[ 311.599208] [<c07738ac>] (ffs_data_closed) from [<c07738f8>] (ffs_ep0_release+0xc/0x14)
[ 311.607224] [<c07738f8>] (ffs_ep0_release) from [<c023e030>] (__fput+0x80/0x1d0)
[ 311.614635] [<c023e030>] (__fput) from [<c014e688>] (task_work_run+0xb0/0xe8)
[ 311.621788] [<c014e688>] (task_work_run) from [<c010afdc>] (do_work_pending+0x7c/0xa4)
[ 311.629718] [<c010afdc>] (do_work_pending) from [<c010770c>] (work_pending+0xc/0x20)
for functions using functionFS, i.e. android adbd will close /dev/usb-ffs/adb/ep0
when usb IO thread fails, but switch adb from on to off also triggers write
"none" > UDC. These 2 operations both call unregister_gadget, which will lead
to the panic above.
add a mutex before calling unregister_gadget for api used in f_fs.
Signed-off-by: Winter Wang <wente.wang@nxp.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Some architectures require code written to memory as if it were data to be
'cleaned' from any data caches before the processor can fetch them as new
instructions.
During resume from hibernate, the snapshot code copies some pages directly,
meaning these architectures do not get a chance to perform their cache
maintenance. Modify the read and decompress code to call
flush_icache_range() on all pages that are restored, so that the restored
in-place pages are guaranteed to be executable on these architectures.
Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
[will: make clean_pages_on_* static and remove initialisers] Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Stop abusing struct page functionality and the swap end_io handler, and
instead add a modified version of the blk-lib.c bio_batch helpers.
Also move the block I/O code into swap.c as they are directly tied into
each other.
Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Pavel Machek <pavel@ucw.cz> Tested-by: Ming Lin <mlin@kernel.org> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net> Signed-off-by: Jens Axboe <axboe@fb.com>
[bwh: Backported to 3.16 as dependency of commit f6cf0545ec69
"PM / Hibernate: Call flush_icache_range() on pages restored in-place":
- Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
page.h uses '_AC' in the definition of PAGE_SIZE, but doesn't include
linux/const.h where this is defined. This produces build warnings when only
asm/page.h is included by asm code.
Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The implementation of macro inv_entry refers to its 'el' argument without
the required leading backslash, which results in an undefined symbol
'el' to be passed into the kernel_entry macro rather than the index of
the exception level as intended.
This undefined symbol strangely enough does not result in build failures,
although it is visible in vmlinux:
However, it does result in incorrect code being generated for invalid
exceptions taken from EL0, since the argument check in kernel_entry
assumes EL1 if its argument does not equal '0'.
The code enabled by the ARM_CPU_SUSPEND config option is used by
kernel subsystems for purposes that go beyond system suspend so its
config entry should be augmented to take more default options into
account and avoid forcing its selection to prevent dependencies
override.
To achieve this goal, this patch reworks the ARM_CPU_SUSPEND config
entry and updates its default config value (by adding the BL_SWITCHER
option to it) and its dependencies (ARCH_SUSPEND_POSSIBLE), so that the
symbol is still selected by default by the subsystems requiring it and
at the same time enforcing the dependencies correctly.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
In carveout heap, change minimum allocation order from 12 to
PAGE_SHIFT. After this change each bit in bitmap (genalloc -
General purpose special memory pool) represents one page size
memory.
Cc: sprd-ind-kernel-group@googlegroups.com Cc: sanjeev.yadav@spreadtrum.com Cc: Colin Cross <ccross@android.com> Cc: Android Kernel Team <kernel-team@android.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Signed-off-by: Rajmal Menariya <rajmal.menariya@spreadtrum.com>
[jstultz: Reworked commit message] Signed-off-by: John Stultz <john.stultz@linaro.org> Acked-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently __set_fixmap_offset is a macro function which has a local
variable called 'addr'. If a caller passes a 'phys' parameter which is
derived from a variable also called 'addr', the local variable will
shadow this, and the compiler will complain about the use of an
uninitialized variable. To avoid the issue with namespace clashes,
'addr' is prefixed with a liberal sprinkling of underscores.
Turning __set_fixmap_offset into a static inline breaks the build for
several architectures. Fixing this properly requires updates to a number
of architectures to make them agree on the prototype of __set_fixmap (it
could be done as a subsequent patch series).
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Arnd Bergmann <arnd@arndb.de>
[catalin.marinas@arm.com: squashed the original function patch and macro fixup] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Lorenzo reported that we could not properly find v4mapped sockets
in inet_diag_find_one_icsk(). This patch fixes the issue.
Reported-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Arnd Bergmann <arnd@arndb.de>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
It is not possible to build the bL_switcher code if the GIC
driver is disabled, because it relies on calling into some
gic specific interfaces, and that would result in this build
error:
arch/arm/common/built-in.o: In function `bL_switch_to':
:(.text+0x1230): undefined reference to `gic_get_sgir_physaddr'
:(.text+0x1244): undefined reference to `gic_send_sgi'
:(.text+0x1268): undefined reference to `gic_migrate_target'
arch/arm/common/built-in.o: In function `bL_switcher_enable.part.4':
:(.text.unlikely+0x2f8): undefined reference to `gic_get_cpu_id'
This adds a Kconfig dependency to ensure we only build the big-little
switcher if the GIC driver is present as well.
Almost all ARMv7 platforms come with a GIC anyway, but it is possible
to build a kernel that disables all platforms.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
ARM glibc uses (4 * __getpagesize()) for SHMLBA, which is correct for
4KB pages and works fine for 64KB pages, but the kernel uses a hardcoded
16KB that is too small for 64KB page based kernels. This changes the
definition to what user space sees when using 64KB pages.
Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Yury Norov <ynorov@caviumnetworks.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
It is quite common for Android devices to utilize more
then 8 partitions on internal eMMC storage.
The vanilla kernel can support this via
CONFIG_MMC_BLOCK_MINORS, however that solution caps the
system to 256 minors total, which limits the number of
mmc cards the system can support.
This patch, which has been carried for quite awhile in
the AOSP common tree, provides an alternative solution
that doesn't seem to limit the total card count. So I
wanted to submit it for consideration upstream.
This patch sets the GENHD_FL_EXT_DEVT flag, which will
allocate minor number in major 259 for partitions past
disk->minors.
It also removes the use of disk_devt to determine devidx
from md->disk. md->disk->first_minor is always initialized
from devidx and can always be used to recover it.
Cc: Ulf Hansson <ulf.hansson@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Chuanxiao Dong <chuanxiao.dong@intel.com> Cc: Shawn Lin <shawn.lin@rock-chips.com> Cc: Austin S Hemmelgarn <ahferroin7@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Android Kernel Team <kernel-team@android.com> Cc: linux-mmc@vger.kernel.org Signed-off-by: Colin Cross <ccross@android.com>
[jstultz: Added context to commit message] Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
A suspended SS port in U3 link state will go to U0 when resumed, but
can almost immediately after that enter U1 or U2 link power save
states before host controller driver reads the port status.
Host controller driver only checks for U0 state, and might miss
the finished resume, leaving flags unclear and skip notifying usb
code of the wake.
Add U1 and U2 to the possible link states when checking for finished
port resume.
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[Mathias Nyman: backport to 3.18 stable.] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The commit b8b9c974afee ("usb: renesas_usbhs: gadget: disable all eps
when the driver stops") causes the unused-but-set-variable warning.
But, if the usbhsg_ep_disable() will return non-zero value, udc/core.c
doesn't clear the ep->enabled flag. So, this driver should not return
non-zero value, if the pipe is zero because this means the pipe is
already disabled. Otherwise, the ep->enabled flag is never cleared
when the usbhsg_ep_disable() is called by the renesas_usbhs driver first.
Fixes: b8b9c974afee ("usb: renesas_usbhs: gadget: disable all eps when the driver stops") Fixes: 11432050f070 ("usb: renesas_usbhs: gadget: fix NULL pointer dereference in ep_disable()") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
In current die(), the irq is disabled for __die() handle, not
including the possible panic() handling. Since the log in __die()
can take several hundreds ms, new irq might come and interrupt
current die().
If the process calling die() holds some critical resource, and some
other process scheduled later also needs it, then it would deadlock.
The first panic will not be executed.
So here disable irq for the whole flow of die().
Signed-off-by: Qiao Zhou <qiaozhou@asrmicro.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Per listen(fd, backlog) rules, there is really no point accepting a SYN,
sending a SYNACK, and dropping the following ACK packet if accept queue
is full, because application is not draining accept queue fast enough.
This behavior is fooling TCP clients that believe they established a
flow, while there is nothing at server side. They might then send about
10 MSS (if using IW10) that will be dropped anyway while server is under
stress.
Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[bwh: Backported to 3.16: Apply TCP changes in both tcp_ipv4.c and tcp_ipv6.c] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When usb gadget is set gadget serial function, it will be crash in below
situation.
It will clean the 'port->port_usb' pointer in gserial_disconnect() function
when usb link is inactive, but it will release lock for disabling the endpoints
in this function. Druing the lock release period, it maybe complete one request
to issue gs_write_complete()--->gs_start_tx() function, but the 'port->port_usb'
pointer had been set NULL, thus it will be crash in gs_start_tx() function.
This patch adds the 'port->port_usb' pointer checking in gs_start_tx() function
to avoid this situation.
Signed-off-by: Baolin Wang <baolin.wang@linaro.org> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de>
[bwh: Backported to 3.16: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When a single thread is sending out data over the gadget serial port,
gs_start_tx() will be called both from the sender context and from the
write completion. Since the port lock is released before the packet is
queued, the order in which the URBs are submitted is not guaranteed.
E.g.
I.e., req2 is submitted before req1 but it contains the data that
comes after req1.
To reproduce, use SMP with sending thread and completion pinned to
different CPUs, or use PREEMPT_RT, and add the following delay just
before the call to usb_ep_queue():
if (port->write_started > 0 && !list_empty(pool))
udelay(1000);
To work around this problem, make sure that only one thread is running
through the gs_start_tx() loop with an extra flag write_busy. Since
gs_start_tx() is always called with the port lock held, no further
synchronisation is needed. The original caller will continue through
the loop when the request was successfully submitted.
Signed-off-by: Philip Oberstaller <Philip.Oberstaller@septentrio.com> Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be> Signed-off-by: Felipe Balbi <balbi@ti.com>
[bwh: Backported to 3.16: adjust filename] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
I tried to use 'make O=...' from an unclean source tree. This triggered
the error path of setlocalversion. But by printing to STDOUT, it created
a broken localversion which then caused another (unrelated) error:
"4.7.0-rc2Error: kernelrelease not valid - run make prepare to update it" exceeds 64 characters
After printing to STDERR, the true build error gets displayed later:
/home/wsa/Kernel/linux is not clean, please run 'make mrproper'
in the '/home/wsa/Kernel/linux' directory.
Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Michal Marek <mmarek@suse.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
This check effectively catches anon vma hierarchy inconsistence and some
vma corruptions. It was effective for catching corner cases in anon vma
reusing logic. For now this code seems stable so check could be hidden
under CONFIG_DEBUG_VM and replaced with WARN because it's not so fatal.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Suggested-by: Vasily Averin <vvs@virtuozzo.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently MMC core will keep going if HS200/HS timing switch failed
with -EBADMSG error by the assumption that the old timing is still valid.
However, for mmc_select_hs200 case, the signal voltage may have already
been switched. If the timing switch failed, we should fall back to
the old voltage in case the card is continue run with legacy timing.
If fall back signal voltage failed, we explicitly report an EIO error
to force retry during the next power cycle.
With CONFIG_PROVE_LOCKING, CONFIG_DEBUG_LOCKDEP and CONFIG_TRACE_IRQFLAGS
enabled, lockdep will compare current->hardirqs_enabled with the flags from
local_irq_save().
When a debug exception occurs, interrupts are disabled in entry.S, but
lockdep isn't told, resulting in:
DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled)
------------[ cut here ]------------
WARNING: at ../kernel/locking/lockdep.c:3523
Modules linked in:
CPU: 3 PID: 1752 Comm: perf Not tainted 4.5.0-rc4+ #2204
Hardware name: ARM Juno development board (r1) (DT)
task: ffffffc974868000 ti: ffffffc975f40000 task.ti: ffffffc975f40000
PC is at check_flags.part.35+0x17c/0x184
LR is at check_flags.part.35+0x17c/0x184
pc : [<ffffff80080fc93c>] lr : [<ffffff80080fc93c>] pstate: 600003c5
[...]
---[ end trace 74631f9305ef5020 ]---
Call trace:
[<ffffff80080fc93c>] check_flags.part.35+0x17c/0x184
[<ffffff80080ffe30>] lock_acquire+0xa8/0xc4
[<ffffff8008093038>] breakpoint_handler+0x118/0x288
[<ffffff8008082434>] do_debug_exception+0x3c/0xa8
[<ffffff80080854b4>] el1_dbg+0x18/0x6c
[<ffffff80081e82f4>] do_filp_open+0x64/0xdc
[<ffffff80081d6e60>] do_sys_open+0x140/0x204
[<ffffff80081d6f58>] SyS_openat+0x10/0x18
[<ffffff8008085d30>] el0_svc_naked+0x24/0x28
possible reason: unannotated irqs-off.
irq event stamp: 65857
hardirqs last enabled at (65857): [<ffffff80081fb1c0>] lookup_mnt+0xf4/0x1b4
hardirqs last disabled at (65856): [<ffffff80081fb188>] lookup_mnt+0xbc/0x1b4
softirqs last enabled at (65790): [<ffffff80080bdca4>] __do_softirq+0x1f8/0x290
softirqs last disabled at (65757): [<ffffff80080be038>] irq_exit+0x9c/0xd0
This patch adds the annotations to do_debug_exception(), while trying not
to call trace_hardirqs_off() if el1_dbg() interrupted a task that already
had irqs disabled.
Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Gadget controller might not be always active during system
suspend/resume as gadget driver might not have yet been loaded or
might have been unloaded prior to system suspend.
Check if we're active and only then perform
necessary actions during suspend/resume.
Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Cc: Arnd Bergmann <arnd@arndb.de>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Each time a driver such as sdhci-esdhc-imx is probed, we get a info
printk complaining that the DT voltage-ranges property has not been
specified.
However, the DT binding specifically says that the voltage-ranges
property is optional. That means we should not be complaining that
DT hasn't specified this property: by indicating that it's optional,
it is valid not to have the property in DT.
Silence the warning if the property is missing.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The bus width is sometimes the actual bus width, and sometimes indices
to different arrays encoding the bus width. In my debugging case "2"
could mean 8-bit as well as 4-bit, which was extremly confusing. Let's
use the human-readable actual bus width in all places.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Compress offload does not support ioctl calls from a 32bit userspace
in a 64 bit kernel. This patch adds support for ioctls from a 32bit
userspace in a 64bit kernel
The authorize reply can be empty, for example when the ticket used to
build the authorizer is too old and TAG_BADAUTHORIZER is returned from
the service. Calling ->verify_authorizer_reply() results in an attempt
to decrypt and validate (somewhat) random data in au->buf (most likely
the signature block from calc_signature()), which fails and ends up in
con_fault_finish() with !con->auth_retry. The ticket isn't invalidated
and the connection is retried again and again until a new ticket is
obtained from the monitor:
libceph: osd2 192.168.122.1:6809 bad authorize reply
libceph: osd2 192.168.122.1:6809 bad authorize reply
libceph: osd2 192.168.122.1:6809 bad authorize reply
libceph: osd2 192.168.122.1:6809 bad authorize reply
Let TAG_BADAUTHORIZER handler kick in and increment con->auth_retry.
Fixes: 5c056fdc5b47 ("libceph: verify authorize reply on connect") Link: https://tracker.ceph.com/issues/20164 Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com>
[idryomov@gmail.com: backport to 4.4: extra arg, no CEPHX_V2] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When printing multiple uprobe arguments as strings the output for the
earlier arguments would also include all later string arguments.
This is best explained in an example:
Consider adding a uprobe to a function receiving two strings as
parameters which is at offset 0xa0 in strlib.so and we want to print
both parameters when the uprobe is hit (on x86_64):
Note the extra "bar" printed as part of arg1. This behaviour stacks up
for additional string arguments.
The strings are stored in a dynamically growing part of the uprobe
buffer by fetch_store_string() after copying them from userspace via
strncpy_from_user(). The return value of strncpy_from_user() is then
directly used as the required size for the string. However, this does
not take the terminating null byte into account as the documentation
for strncpy_from_user() cleary states that it "[...] returns the
length of the string (not including the trailing NUL)" even though the
null byte will be copied to the destination.
Therefore, subsequent calls to fetch_store_string() will overwrite
the terminating null byte of the most recently fetched string with
the first character of the current string, leading to the
"accumulation" of strings in earlier arguments in the output.
Fix this by incrementing the return value of strncpy_from_user() by
one if we did not hit the maximum buffer size.
Link: http://lkml.kernel.org/r/20190116141629.5752-1-andreas.ziegler@fau.de Cc: Ingo Molnar <mingo@redhat.com> Fixes: 5baaa59ef09e ("tracing/probes: Implement 'memory' fetch method for uprobes") Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Andreas Ziegler <andreas.ziegler@fau.de> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Eric Biggers [Mon, 14 Jan 2019 23:21:45 +0000 (15:21 -0800)]
crypto: cts - fix crash on short inputs
In the CTS template, when the input length is <= one block cipher block
(e.g. <= 16 bytes for AES) pass the correct length to the underlying CBC
transform rather than one block. This matches the upstream behavior and
makes the encryption/decryption operation correctly return -EINVAL when
1 <= nbytes < bsize or succeed when nbytes == 0, rather than crashing.
This was fixed upstream incidentally by a large refactoring,
commit 0605c41cc53c ("crypto: cts - Convert to skcipher"). But
syzkaller easily trips over this when running on older kernels, as it's
easily reachable via AF_ALG. Therefore, this patch makes the minimal
fix for older kernels.
Cc: linux-crypto@vger.kernel.org Fixes: 76cb9521795a ("[CRYPTO] cts: Add CTS mode required for Kerberos AES support") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
ql_alloc_large_buffers() has the usual RX buffer allocation
loop where it allocates skbs and maps them for DMA. It also
treats failure as a fatal error.
There are (at least) three bugs in the error paths:
1. ql_free_large_buffers() assumes that the lrg_buf[] entry for the
first buffer that couldn't be allocated will have .skb == NULL.
But the qla_buf[] array is not zero-initialised.
2. ql_free_large_buffers() DMA-unmaps all skbs in lrg_buf[]. This is
incorrect for the last allocated skb, if DMA mapping failed.
3. Commit 1acb8f2a7a9f ("net: qlogic: Fix memory leak in
ql_alloc_large_buffers") added a direct call to dev_kfree_skb_any()
after the skb is recorded in lrg_buf[], so ql_free_large_buffers()
will double-free it.
The bugs are somewhat inter-twined, so fix them all at once:
* Clear each entry in qla_buf[] before attempting to allocate
an skb for it. This goes half-way to fixing bug 1.
* Set the .skb field only after the skb is DMA-mapped. This
fixes the rest.
Fixes: 1357bfcf7106 ("qla3xxx: Dynamically size the rx buffer queue ...") Fixes: 0f8ab89e825f ("qla3xxx: Check return code from pci_map_single() ...") Fixes: 1acb8f2a7a9f ("net: qlogic: Fix memory leak in ql_alloc_large_buffers") Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
In ql_alloc_large_buffers, a new skb is allocated via netdev_alloc_skb.
This skb should be released if pci_dma_mapping_error fails.
Fixes: 0f8ab89e825f ("qla3xxx: Check return code from pci_map_single() in ql_release_to_lrg_buf_free_list(), ql_populate_free_queue(), ql_alloc_large_buffers(), and ql3xxx_send()") Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
zhangyi (F) [Wed, 6 Nov 2019 09:43:52 +0000 (17:43 +0800)]
fs/dcache: move security_d_instantiate() behind attaching dentry to inode
During backport 1e2e547a93a "do d_instantiate/unlock_new_inode
combinations safely", there was a error instantiating sequence of
attaching dentry to inode and calling security_d_instantiate().
Before commit ce23e640133 "->getxattr(): pass dentry and inode as
separate arguments" and b96809173e9 "security_d_instantiate(): move to
the point prior to attaching dentry to inode", security_d_instantiate()
should be called beind __d_instantiate(), otherwise it will trigger
below problem when CONFIG_SECURITY_SMACK on ext4 was enabled because
d_inode(dentry) used by ->getxattr() is NULL before __d_instantiate()
instantiate inode.
It's possible to hit the WARN_ON_ONCE(page_mapped(page)) in
remove_stable_node() when it races with __mmput() and squeezes in
between ksm_exit() and exit_mmap().
WARNING: CPU: 0 PID: 3295 at mm/ksm.c:888 remove_stable_node+0x10c/0x150
The workqueue only exists for the primary PF. For other functions
we hit a WARN_ON in kernel/workqueue.c.
Fixes: 7c236c43b838 ("sfc: Add support for IEEE-1588 PTP") Signed-off-by: Martin Habets <mhabets@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
when configuring act_pedit rules, the number of keys is validated only on
addition of a new entry. This is not sufficient to avoid hitting a WARN()
in the traffic path: for example, it is possible to replace a valid entry
with a new one having 0 extended keys, thus causing splats in dmesg like:
Fix this moving the check on 'nkeys' earlier in tcf_pedit_init(), so that
attempts to install rules having 0 keys are always rejected with -EINVAL.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.16:
- Drop change in tcf_pedit_keys_ex_parse()
- netlink doesn't support error messages
- Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
This happens because buffers for the in_vq are allocated when the port is
added but are not released when the port is unplugged.
They are only released when virtconsole is removed (see a7a69ec0d8e4)
To avoid the problem and to be symmetric, we could allocate all the buffers
in init_vqs() as they are released in remove_vqs(), but it sounds like
a waste of memory.
Rather than that, this patch changes add_port() logic to ignore ENOSPC
error in fill_queue(), which means queue has already been filled.
Fixes: a7a69ec0d8e4 ("virtio_console: free buffers after reset") Cc: mst@redhat.com Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
An exiting task might belong to an offline cgroup. In this case an
attempt to grab a cgroup reference from the task can end up with an
infinite loop in hugetlb_cgroup_charge_cgroup(), because neither the
cgroup will become online, neither the task will be migrated to a live
cgroup.
Fix this by switching over to css_tryget(). As css_tryget_online()
can't guarantee that the cgroup won't go offline, in most cases the
check doesn't make sense. In this particular case users of
hugetlb_cgroup_charge_cgroup() are not affected by this change.
A similar problem is described by commit 18fa84a2db0e ("cgroup: Use
css_tryget() instead of css_tryget_online() in task_get_css()").
Link: http://lkml.kernel.org/r/20191106225131.3543616-2-guro@fb.com Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
We've encountered a rcu stall in get_mem_cgroup_from_mm():
rcu: INFO: rcu_sched self-detected stall on CPU
rcu: 33-....: (21000 ticks this GP) idle=6c6/1/0x4000000000000002 softirq=35441/35441 fqs=5017
(t=21031 jiffies g=324821 q=95837) NMI backtrace for cpu 33
<...>
RIP: 0010:get_mem_cgroup_from_mm+0x2f/0x90
<...>
__memcg_kmem_charge+0x55/0x140
__alloc_pages_nodemask+0x267/0x320
pipe_write+0x1ad/0x400
new_sync_write+0x127/0x1c0
__kernel_write+0x4f/0xf0
dump_emit+0x91/0xc0
writenote+0xa0/0xc0
elf_core_dump+0x11af/0x1430
do_coredump+0xc65/0xee0
get_signal+0x132/0x7c0
do_signal+0x36/0x640
exit_to_usermode_loop+0x61/0xd0
do_syscall_64+0xd4/0x100
entry_SYSCALL_64_after_hwframe+0x44/0xa9
The problem is caused by an exiting task which is associated with an
offline memcg. We're iterating over and over in the do {} while
(!css_tryget_online()) loop, but obviously the memcg won't become online
and the exiting task won't be migrated to a live memcg.
Let's fix it by switching from css_tryget_online() to css_tryget().
As css_tryget_online() cannot guarantee that the memcg won't go offline,
the check is usually useless, except some rare cases when for example it
determines if something should be presented to a user.
A similar problem is described by commit 18fa84a2db0e ("cgroup: Use
css_tryget() instead of css_tryget_online() in task_get_css()").
Johannes:
: The bug aside, it doesn't matter whether the cgroup is online for the
: callers. It used to matter when offlining needed to evacuate all charges
: from the memcg, and so needed to prevent new ones from showing up, but we
: don't care now.
Link: http://lkml.kernel.org/r/20191106225131.3543616-1-guro@fb.com Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Shakeel Butt <shakeeb@google.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michal Koutn <mkoutny@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
While output urb's snd_complete_urb() is executing, calling
prepare_outbound_urb() may cause endpoint stopped before
prepare_outbound_urb() returns and result in next urb submitted
to stopped endpoint. usb-audio driver cannot re-use it afterwards as
the urb is still hold by usb stack.
This change checks EP_FLAG_RUNNING flag after prepare_outbound_urb() again
to let snd_complete_urb() know the endpoint already stopped and does not
submit next urb. Below kind of error will be fixed:
[ 213.153103] usb 1-2: timeout: still 1 active urbs on EP #1
[ 213.164121] usb 1-2: cannot submit urb 0, error -16: unknown error
Some Coffee Lake platforms have a skewed HPET timer once the SoCs entered
PC10, which in consequence marks TSC as unstable because HPET is used as
watchdog clocksource for TSC.
Harry Pan tried to work around it in the clocksource watchdog code [1]
thereby creating a circular dependency between HPET and TSC. This also
ignores the fact, that HPET is not only unsuitable as watchdog clocksource
on these systems, it becomes unusable in general.
We need to get the underlying dentry of parent; sure, absent the races
it is the parent of underlying dentry, but there's nothing to prevent
losing a timeslice to preemtion in the middle of evaluation of
lower_dentry->d_parent->d_inode, having another process move lower_dentry
around and have its (ex)parent not pinned anymore and freed on memory
pressure. Then we regain CPU and try to fetch ->d_inode from memory
that is freed by that point.
dentry->d_parent *is* stable here - it's an argument of ->lookup() and
we are guaranteed that it won't be moved anywhere until we feed it
to d_add/d_splice_alias. So we safely go that way to get to its
underlying dentry.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[bwh: Backported to 3.16:
- Open-code d_inode()
- Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
lower_dentry can't go from positive to negative (we have it pinned),
but it *can* go from negative to positive. So fetching ->d_inode
into a local variable, doing a blocking allocation, checking that
now ->d_inode is non-NULL and feeding the value we'd fetched
earlier to a function that won't accept NULL is not a good idea.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[bwh: Backported to 3.16:
- Use ACCESS_ONCE() instead of READ_ONCE()
- Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
A check of the return value from get_cur_mix_raw() is missing at the
resolution test code in get_min_max_with_quirks(), which may leave the
variable untouched, leading to a random uninitialized value, as
detected by syzkaller fuzzer.
Add the missing return error check for fixing that.
There are two callers of this function and they both unlock the mutex so
this ends up being a double unlock.
Fixes: 44ed167da748 ("drbd: rcu_read_lock() and rcu_dereference() for tconn->net_conf") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
In the current code, we use the atomic_cmpxchg() to serialize the output
of the dump_stack(), but this implementation suffers the thundering herd
problem. We have observed such kind of livelock on a Marvell cn96xx
board(24 cpus) when heavily using the dump_stack() in a kprobe handler.
Actually we can let the competitors to wait for the releasing of the
lock before jumping to atomic_cmpxchg(). This will definitely mitigate
the thundering herd problem. Thanks Linus for the suggestion.
[akpm@linux-foundation.org: fix comment] Link: http://lkml.kernel.org/r/20191030031637.6025-1-haokexin@gmail.com Fixes: b58d977432c8 ("dump_stack: serialize the output from dump_stack()") Signed-off-by: Kevin Hao <haokexin@gmail.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
/proc/pagetypeinfo is a debugging tool to examine internal page
allocator state wrt to fragmentation. It is not very useful for any
other use so normal users really do not need to read this file.
Waiman Long has noticed that reading this file can have negative side
effects because zone->lock is necessary for gathering data and that a)
interferes with the page allocator and its users and b) can lead to hard
lockups on large machines which have very long free_list.
Reduce both issues by simply not exporting the file to regular users.
Link: http://lkml.kernel.org/r/20191025072610.18526-2-mhocko@kernel.org Fixes: 467c996c1e19 ("Print out statistics in relation to fragmentation avoidance to /proc/pagetypeinfo") Signed-off-by: Michal Hocko <mhocko@suse.com> Reported-by: Waiman Long <longman@redhat.com> Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Waiman Long <longman@redhat.com> Acked-by: Rafael Aquini <aquini@redhat.com> Acked-by: David Rientjes <rientjes@google.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: David Hildenbrand <david@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Roman Gushchin <guro@fb.com> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Cc: Jann Horn <jannh@google.com> Cc: Song Liu <songliubraving@fb.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The final sort might get confused when the comparison is done over
bigger numbers than int like for -s time.
Check the following report for longer workloads:
$ perf report -s time -F time,overhead --stdio
Fix hist_entry__sort() to properly return int64_t and not possible cut
int.
Fixes: 043ca389a318 ("perf tools: Use hpp formats to sort final output") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20191104232711.16055-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When the status register is read without the status IRQ pending, the
chip may not raise the interrupt line for an upcoming status interrupt
and the driver may miss a status interrupt.
It is critical that the BUSOFF status interrupt is forwarded to the
higher layers, since no more interrupts will follow without
intervention.
Thanks to Wolfgang and Joe for bringing up the first idea.
Signed-off-by: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be> Cc: Wolfgang Grandegger <wg@grandegger.com> Cc: Joe Burmeister <joe.burmeister@devtank.co.uk> Fixes: fa39b54ccf28 ("can: c_can: Get rid of pointless interrupts") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When decoding a buffer received from PCAN-USB, the first timestamp read in
a packet is a 16-bit coded time base, and the next ones are an 8-bit
offset to this base, regardless of the type of packet read.
This patch corrects a potential loss of synchronization by using a
timestamp index read from the buffer, rather than an index of received
data packets, to determine on the sizeof the timestamp to be read from the
packet being decoded.
Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Fixes: 46be265d3388 ("can: usb: PEAK-System Technik PCAN-USB specific part") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The driver was accessing its driver data after having freed it.
Fixes: 0024d8ad1639 ("can: usb_8dev: Add support for USB2CAN interface from 8 devices") Cc: Bernd Krumboeck <b.krumboeck@gmail.com> Cc: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Invoking the following commands on a 32-bit architecture with strict
alignment requirements (such as an ARMv7-based Raspberry Pi) results
in an alignment exception:
# nft add table ip test-ip4
# nft add chain ip test-ip4 output { type filter hook output priority 0; }
# nft add rule ip test-ip4 output quota 1025 bytes
The reason is that nft_quota_do_init() calls atomic64_set() on an
atomic64_t which is only aligned to 32-bit, not 64-bit, because it
succeeds struct nft_expr in memory which only contains a 32-bit pointer.
Fix by aligning the nft_expr private data to 64-bit.
Fixes: 96518518cc41 ("netfilter: add nftables") Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The copy_to_user() function returns the number of bytes remaining to be
copied. In this code, that positive return is checked at the end of the
function and we return zero/success. What we should do instead is
return -EFAULT.
Fixes: a7b4f989a629 ("netfilter: ipset: IP set core support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
For some reason I missed the case of DCCP passive
flows in my previous patch.
Fixes: a904a0693c18 ("inet: stop leaking jiffies on the wire") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Thiemo Nagel <tnagel@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
For Focusrite Saffire Pro i/o, the lowest 8 bits of register represents
configured source of sampling clock. The next lowest 8 bits represents
whether the configured source is actually detected or not just after
the register is changed for the source.
Current implementation evaluates whole the register to detect configured
source. This results in failure due to the next lowest 8 bits when the
source is connected in advance.
This commit fixes the bug.
Fixes: 25784ec2d034 ("ALSA: bebob: Add support for Focusrite Saffire/SaffirePro series") Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Link: https://lore.kernel.org/r/20191102150920.20367-1-o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Historically linux tried to stick to RFC 791, 1122, 2003
for IPv4 ID field generation.
RFC 6864 made clear that no matter how hard we try,
we can not ensure unicity of IP ID within maximum
lifetime for all datagrams with a given source
address/destination address/protocol tuple.
Linux uses a per socket inet generator (inet_id), initialized
at connection startup with a XOR of 'jiffies' and other
fields that appear clear on the wire.
Thiemo Nagel pointed that this strategy is a privacy
concern as this provides 16 bits of entropy to fingerprint
devices.
Let's switch to a random starting point, this is just as
good as far as RFC 6864 is concerned and does not leak
anything critical.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Thiemo Nagel <tnagel@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.16: drop changes in chelsio] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The problem is that we were putting the NUL terminator too far:
buf[sizeof(buf) - 1] = '\0';
If the user input isn't NUL terminated and they haven't initialized the
whole buffer then it leads to an info leak. The NUL terminator should
be:
buf[len - 1] = '\0';
Signed-off-by: Yihui Zeng <yzeng56@asu.edu> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
[heiko.carstens@de.ibm.com: keep semantics of how *lenp and *ppos are handled] Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
When a card is disconnected while in use, the system waits until all
opened files are closed then releases the card. This is done via
put_device() of the card device in each device release code.
The recently reported mutex deadlock bug happens in this code path;
snd_timer_close() for the timer device deals with the global
register_mutex and it calls put_device() there. When this timer
device is the last one, the card gets freed and it eventually calls
snd_timer_free(), which has again the protection with the global
register_mutex -- boom.
Basically put_device() call itself is race-free, so a relative simple
workaround is to move this put_device() call out of the mutex. For
achieving that, in this patch, snd_timer_close_locked() got a new
argument to store the card device pointer in return, and each caller
invokes put_device() with the returned object after the mutex unlock.
Reported-and-tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The clean up commit 41672c0c24a6 ("ALSA: timer: Simplify error path in
snd_timer_open()") unified the error handling code paths with the
standard goto, but it introduced a subtle bug: the timer instance is
stored in snd_timer_open() incorrectly even if it returns an error.
This may eventually lead to UAF, as spotted by fuzzer.
The culprit is the snd_timer_open() code checks the
SNDRV_TIMER_IFLG_EXCLUSIVE flag with the common variable timeri.
This variable is supposed to be the newly created instance, but we
(ab-)used it for a temporary check before the actual creation of a
timer instance. After that point, there is another check for the max
number of instances, and it bails out if over the threshold. Before
the refactoring above, it worked fine because the code returned
directly from that point. After the refactoring, however, it jumps to
the unified error path that stores the timeri variable in return --
even if it returns an error. Unfortunately this stored value is kept
in the caller side (snd_timer_user_tselect()) in tu->timeri. This
causes inconsistency later, as if the timer was successfully
assigned.
In this patch, we fix it by not re-using timeri variable but a
temporary variable for testing the exclusive connection, so timeri
remains NULL at that point.
Just a minor refactoring to use the standard goto for error paths in
snd_timer_open() instead of open code. The first mutex_lock() is
moved to the beginning of the function to make the code clearer.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
[bwh: Backported to 3.16 as dependency of commit a39331867335
"ALSA: timer: Fix mutex deadlock at releasing card"] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Mesh path nexthop should be a ethernet address, but current validation
checks against 4 byte integers.
Fixes: 2ec600d672e74 ("nl80211/cfg80211: support for mesh, sta dumping") Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de> Link: https://lore.kernel.org/r/20191029093003.10355-1-markus.theil@tu-ilmenau.de Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
This happens because __ceph_remove_cap() may queue a cap release
(__ceph_queue_cap_release) which can be scheduled before that cap is
removed from the inode list with
rb_erase(&cap->ci_node, &ci->i_caps);
And, when this finally happens, the use-after-free will occur.
This can be fixed by removing the cap from the inode list before being
removed from the session list, and thus eliminating the risk of an UAF.
Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Endpoints with a maxpacket length of 0 are probably useless. They
can't transfer any data, and it's not at all unlikely that a UDC will
crash or hang when trying to handle a non-zero-length usb_request for
such an endpoint. Indeed, dummy-hcd gets a divide error when trying
to calculate the remainder of a transfer length by the maxpacket
value, as discovered by the syzbot fuzzer.
Currently the gadget core does not check for endpoints having a
maxpacket value of 0. This patch adds a check to usb_ep_enable(),
preventing such endpoints from being used.
As far as I know, none of the gadget drivers in the kernel tries to
create an endpoint with maxpacket = 0, but until now there has been
nothing to prevent userspace programs under gadgetfs or configfs from
doing it.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: syzbot+8ab8bf161038a8768553@syzkaller.appspotmail.com Acked-by: Felipe Balbi <balbi@kernel.org> Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.1910281052370.1485-100000@iolanthe.rowland.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk>