Ben writes:
This fixes a bug introduced in 4.6 by commit 465893e18878 "tty:
goldfish: support platform_device with id -1". For earlier
kernel versions, it *introduces* a bug.
In case of unsolicited data for the first sequence
seq_end_offset must be set to minimum of total data length
and FirstBurstLength, so do not add cmd->write_data_done
to the min of total data length and FirstBurstLength.
This patch avoids that with ImmediateData=Yes, InitialR2T=No,
MaxXmitDataSegmentLength < FirstBurstLength that a WRITE command
with IO size above FirstBurstLength triggers sequence error
messages, for example
Set following parameters on target (linux-4.8.12)
ImmediateData = Yes
InitialR2T = No
MaxXmitDataSegmentLength = 8k
FirstBurstLength = 64k
Log in from Open iSCSI initiator and execute
dd if=/dev/zero of=/dev/sdb bs=128k count=1 oflag=direct
Error messages on target
Command ITT: 0x00000035 with Offset: 65536, Length: 8192 outside
of Sequence 73728:131072 while DataSequenceInOrder=Yes.
Command ITT: 0x00000035, received DataSN: 0x00000001 higher than
expected 0x00000000.
Unable to perform within-command recovery while ERL=0.
Signed-off-by: Varun Prakash <varun@chelsio.com>
[ bvanassche: Use min() instead of open-coding it / edited patch description ] Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Include <linux/in6.h> to fix the following linux/mroute6.h userspace
compilation errors:
/usr/include/linux/mroute6.h:80:22: error: field 'mf6cc_origin' has incomplete type
struct sockaddr_in6 mf6cc_origin; /* Origin of mcast */
/usr/include/linux/mroute6.h:81:22: error: field 'mf6cc_mcastgrp' has incomplete type
struct sockaddr_in6 mf6cc_mcastgrp; /* Group in question */
/usr/include/linux/mroute6.h:91:22: error: field 'src' has incomplete type
struct sockaddr_in6 src;
/usr/include/linux/mroute6.h:92:22: error: field 'grp' has incomplete type
struct sockaddr_in6 grp;
/usr/include/linux/mroute6.h:132:18: error: field 'im6_src' has incomplete type
struct in6_addr im6_src, im6_dst;
/usr/include/linux/mroute6.h:132:27: error: field 'im6_dst' has incomplete type
struct in6_addr im6_src, im6_dst;
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Consistently use types from linux/types.h to fix the following
linux/rds.h userspace compilation errors:
/usr/include/linux/rds.h:106:2: error: unknown type name 'uint8_t'
uint8_t name[32];
/usr/include/linux/rds.h:107:2: error: unknown type name 'uint64_t'
uint64_t value;
/usr/include/linux/rds.h:117:2: error: unknown type name 'uint64_t'
uint64_t next_tx_seq;
/usr/include/linux/rds.h:118:2: error: unknown type name 'uint64_t'
uint64_t next_rx_seq;
/usr/include/linux/rds.h:121:2: error: unknown type name 'uint8_t'
uint8_t transport[TRANSNAMSIZ]; /* null term ascii */
/usr/include/linux/rds.h:122:2: error: unknown type name 'uint8_t'
uint8_t flags;
/usr/include/linux/rds.h:129:2: error: unknown type name 'uint64_t'
uint64_t seq;
/usr/include/linux/rds.h:130:2: error: unknown type name 'uint32_t'
uint32_t len;
/usr/include/linux/rds.h:135:2: error: unknown type name 'uint8_t'
uint8_t flags;
/usr/include/linux/rds.h:139:2: error: unknown type name 'uint32_t'
uint32_t sndbuf;
/usr/include/linux/rds.h:144:2: error: unknown type name 'uint32_t'
uint32_t rcvbuf;
/usr/include/linux/rds.h:145:2: error: unknown type name 'uint64_t'
uint64_t inum;
/usr/include/linux/rds.h:153:2: error: unknown type name 'uint64_t'
uint64_t hdr_rem;
/usr/include/linux/rds.h:154:2: error: unknown type name 'uint64_t'
uint64_t data_rem;
/usr/include/linux/rds.h:155:2: error: unknown type name 'uint32_t'
uint32_t last_sent_nxt;
/usr/include/linux/rds.h:156:2: error: unknown type name 'uint32_t'
uint32_t last_expected_una;
/usr/include/linux/rds.h:157:2: error: unknown type name 'uint32_t'
uint32_t last_seen_una;
/usr/include/linux/rds.h:164:2: error: unknown type name 'uint8_t'
uint8_t src_gid[RDS_IB_GID_LEN];
/usr/include/linux/rds.h:165:2: error: unknown type name 'uint8_t'
uint8_t dst_gid[RDS_IB_GID_LEN];
/usr/include/linux/rds.h:167:2: error: unknown type name 'uint32_t'
uint32_t max_send_wr;
/usr/include/linux/rds.h:168:2: error: unknown type name 'uint32_t'
uint32_t max_recv_wr;
/usr/include/linux/rds.h:169:2: error: unknown type name 'uint32_t'
uint32_t max_send_sge;
/usr/include/linux/rds.h:170:2: error: unknown type name 'uint32_t'
uint32_t rdma_mr_max;
/usr/include/linux/rds.h:171:2: error: unknown type name 'uint32_t'
uint32_t rdma_mr_size;
/usr/include/linux/rds.h:212:9: error: unknown type name 'uint64_t'
typedef uint64_t rds_rdma_cookie_t;
/usr/include/linux/rds.h:215:2: error: unknown type name 'uint64_t'
uint64_t addr;
/usr/include/linux/rds.h:216:2: error: unknown type name 'uint64_t'
uint64_t bytes;
/usr/include/linux/rds.h:221:2: error: unknown type name 'uint64_t'
uint64_t cookie_addr;
/usr/include/linux/rds.h:222:2: error: unknown type name 'uint64_t'
uint64_t flags;
/usr/include/linux/rds.h:228:2: error: unknown type name 'uint64_t'
uint64_t cookie_addr;
/usr/include/linux/rds.h:229:2: error: unknown type name 'uint64_t'
uint64_t flags;
/usr/include/linux/rds.h:234:2: error: unknown type name 'uint64_t'
uint64_t flags;
/usr/include/linux/rds.h:240:2: error: unknown type name 'uint64_t'
uint64_t local_vec_addr;
/usr/include/linux/rds.h:241:2: error: unknown type name 'uint64_t'
uint64_t nr_local;
/usr/include/linux/rds.h:242:2: error: unknown type name 'uint64_t'
uint64_t flags;
/usr/include/linux/rds.h:243:2: error: unknown type name 'uint64_t'
uint64_t user_token;
/usr/include/linux/rds.h:248:2: error: unknown type name 'uint64_t'
uint64_t local_addr;
/usr/include/linux/rds.h:249:2: error: unknown type name 'uint64_t'
uint64_t remote_addr;
/usr/include/linux/rds.h:252:4: error: unknown type name 'uint64_t'
uint64_t compare;
/usr/include/linux/rds.h:253:4: error: unknown type name 'uint64_t'
uint64_t swap;
/usr/include/linux/rds.h:256:4: error: unknown type name 'uint64_t'
uint64_t add;
/usr/include/linux/rds.h:259:4: error: unknown type name 'uint64_t'
uint64_t compare;
/usr/include/linux/rds.h:260:4: error: unknown type name 'uint64_t'
uint64_t swap;
/usr/include/linux/rds.h:261:4: error: unknown type name 'uint64_t'
uint64_t compare_mask;
/usr/include/linux/rds.h:262:4: error: unknown type name 'uint64_t'
uint64_t swap_mask;
/usr/include/linux/rds.h:265:4: error: unknown type name 'uint64_t'
uint64_t add;
/usr/include/linux/rds.h:266:4: error: unknown type name 'uint64_t'
uint64_t nocarry_mask;
/usr/include/linux/rds.h:269:2: error: unknown type name 'uint64_t'
uint64_t flags;
/usr/include/linux/rds.h:270:2: error: unknown type name 'uint64_t'
uint64_t user_token;
/usr/include/linux/rds.h:274:2: error: unknown type name 'uint64_t'
uint64_t user_token;
/usr/include/linux/rds.h:275:2: error: unknown type name 'int32_t'
int32_t status;
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While we hold a reference to the dentry when build_dentry_path is
called, we could end up racing with a rename that changes d_parent.
Handle that situation correctly, by using the rcu_read_lock to
ensure that the parent dentry and inode stick around long enough
to safely check ceph_snap and ceph_ino.
The mvpp2_txq_bufs_free() function is called upon TX completion to DMA
unmap TX buffers, and free the corresponding SKBs. It gets the
references to the SKB to free and the DMA buffer to unmap from a per-CPU
txq_pcpu data structure.
However, the code currently increments the pointer to the next entry
before doing the DMA unmap and freeing the SKB. It does not cause any
visible problem because for a given SKB the TX completion is guaranteed
to take place on the CPU where the TX was started. However, it is much
more logical to increment the pointer to the next entry once the current
entry has been completely unmapped/released.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In case CONFIG_SLUB_DEBUG_ON=n, find_mergeable() gets debug features from
commandline but never checks if there are features from the
SLAB_NEVER_MERGE set.
As a result selected by slub_debug caches are always mergeable if they
have been created without a custom constructor set or without one of the
SLAB_* debug features on.
This moves the SLAB_NEVER_MERGE check below the flags update from
commandline to make sure it won't merge the slab cache if one of the debug
features is on.
Link: http://lkml.kernel.org/r/20170101124451.GA4740@lp-laptop-d Signed-off-by: Grygorii Maistrenko <grygoriimkd@gmail.com> Reviewed-by: Pekka Enberg <penberg@kernel.org> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Christoph Lameter <cl@linux.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We are in the situation that we have to avoid recursive cluster locking,
but there is no way to check if a cluster lock has been taken by a precess
already.
Mostly, we can avoid recursive locking by writing code carefully.
However, we found that it's very hard to handle the routines that are
invoked directly by vfs code. For instance:
Both ocfs2_permission() and ocfs2_iop_get_acl() call ocfs2_inode_lock(PR):
do_sys_open
may_open
inode_permission
ocfs2_permission
ocfs2_inode_lock() <=== first time
generic_permission
get_acl
ocfs2_iop_get_acl
ocfs2_inode_lock() <=== recursive one
A deadlock will occur if a remote EX request comes in between two of
ocfs2_inode_lock(). Briefly describe how the deadlock is formed:
On one hand, OCFS2_LOCK_BLOCKED flag of this lockres is set in
BAST(ocfs2_generic_handle_bast) when downconvert is started on behalf of
the remote EX lock request. Another hand, the recursive cluster lock
(the second one) will be blocked in in __ocfs2_cluster_lock() because of
OCFS2_LOCK_BLOCKED. But, the downconvert never complete, why? because
there is no chance for the first cluster lock on this node to be
unlocked - we block ourselves in the code path.
The idea to fix this issue is mostly taken from gfs2 code.
1. introduce a new field: struct ocfs2_lock_res.l_holders, to keep track
of the processes' pid who has taken the cluster lock of this lock
resource;
2. introduce a new flag for ocfs2_inode_lock_full:
OCFS2_META_LOCK_GETBH; it means just getting back disk inode bh for
us if we've got cluster lock.
3. export a helper: ocfs2_is_locked_by_me() is used to check if we have
got the cluster lock in the upper code path.
The tracking logic should be used by some of the ocfs2 vfs's callbacks,
to solve the recursive locking issue cuased by the fact that vfs
routines can call into each other.
The performance penalty of processing the holder list should only be
seen at a few cases where the tracking logic is used, such as get/set
acl.
You may ask what if the first time we got a PR lock, and the second time
we want a EX lock? fortunately, this case never happens in the real
world, as far as I can see, including permission check,
(get|set)_(acl|attr), and the gfs2 code also do so.
[sfr@canb.auug.org.au remove some inlines] Link: http://lkml.kernel.org/r/20170117100948.11657-2-zren@suse.com Signed-off-by: Eric Ren <zren@suse.com> Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since the
commit f1c131b45410a202eb45cc55980a7a9e4e4b4f40
crypto: xts - Convert to skcipher
the XTS mode is based on ECB, so the mode must select
ECB otherwise it can fail to initialize.
Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the VF driver, module parameter mlx4_log_num_mgm_entry_size was
mistakenly overwritten -- and in a manner which overrode the
device-managed flow steering option encoded in the parameter.
log_num_mgm_entry_size is a global module parameter which
affects all ConnectX-3 PFs installed on that host.
If a VF changes log_num_mgm_entry_size, this will affect all PFs
which are probed subsequent to the change (by disabling DMFS for
those PFs).
Fixes: 3c439b5586e9 ("mlx4_core: Allow choosing flow steering mode") Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Under certain situations, an incremental send operation can fail due to a
premature attempt to create a new top level inode (a direct child of the
subvolume/snapshot root) whose name collides with another inode that was
removed from the send snapshot.
Consider the following example scenario.
Parent snapshot:
. (ino 256, gen 8)
|---- a1/ (ino 257, gen 9)
|---- a2/ (ino 258, gen 9)
Send snapshot:
. (ino 256, gen 3)
|---- a2/ (ino 257, gen 7)
In this scenario, when receiving the incremental send stream, the btrfs
receive command fails like this (ran in verbose mode, -vv argument):
rmdir a1
mkfile o257-7-0
rename o257-7-0 -> a2
ERROR: rename o257-7-0 -> a2 failed: Is a directory
What happens when computing the incremental send stream is:
1) An operation to remove the directory with inode number 257 and
generation 9 is issued.
2) An operation to create the inode with number 257 and generation 7 is
issued. This creates the inode with an orphanized name of "o257-7-0".
3) An operation rename the new inode 257 to its final name, "a2", is
issued. This is incorrect because inode 258, which has the same name
and it's a child of the same parent (root inode 256), was not yet
processed and therefore no rmdir operation for it was yet issued.
The rename operation is issued because we fail to detect that the
name of the new inode 257 collides with inode 258, because their
parent, a subvolume/snapshot root (inode 256) has a different
generation in both snapshots.
So fix this by ignoring the generation value of a parent directory that
matches a root inode (number 256) when we are checking if the name of the
inode currently being processed collides with the name of some other
inode that was not yet processed.
We can achieve this scenario of different inodes with the same number but
different generation values either by mounting a filesystem with the inode
cache option (-o inode_cache) or by creating and sending snapshots across
different filesystems, like in the following example:
$ mkfs.btrfs -f /dev/sdc
$ mount /dev/sdc /mnt
$ touch /mnt/a2
$ btrfs subvolume snapshot -r /mnt /mnt/snap2
$ btrfs receive /mnt -f /tmp/1.snap
# Take note that once the filesystem is created, its current
# generation has value 7 so the inode from the second snapshot has
# a generation value of 7. And after receiving the first snapshot
# the filesystem is at a generation value of 10, because the call to
# create the second snapshot bumps the generation to 8 (the snapshot
# creation ioctl does a transaction commit), the receive command calls
# the snapshot creation ioctl to create the first snapshot, which bumps
# the filesystem's generation to 9, and finally when the receive
# operation finishes it calls an ioctl to transition the first snapshot
# (snap1) from RW mode to RO mode, which does another transaction commit
# and bumps the filesystem's generation to 10.
$ rm -f /tmp/1.snap
$ btrfs send /mnt/snap1 -f /tmp/1.snap
$ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/2.snap
$ umount /mnt
$ mkfs.btrfs -f /dev/sdd
$ mount /dev/sdd /mnt
$ btrfs receive /mnt /tmp/1.snap
# Receive of snapshot snap2 used to fail.
$ btrfs receive /mnt /tmp/2.snap
Signed-off-by: Robbie Ko <robbieko@synology.com> Reviewed-by: Filipe Manana <fdmanana@suse.com>
[Rewrote changelog to be more precise and clear] Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 4dee62b1b9b4 ("netfilter: nf_ct_expect: nf_ct_expect_insert()
returns void") inadvertently changed the successful return value of
nf_ct_expect_related_report() from 0 to 1 due to
__nf_ct_expect_check() returning 1 on success. Prevent this
regression in the future by changing the return value of
__nf_ct_expect_check() to 0 on success.
The cited commit makes a great job of finding optimal shift/multiplier
values assuming a 10 seconds wrap around, but forgot to change the
overflow_period computation.
It overflows in cyclecounter_cyc2ns(), and the final result is 804 ms,
which is silly.
Lets simply use 5 seconds, no need to recompute this, given how it is
supposed to work.
Later, we will use a timer instead of a work queue, since the new RX
allocation schem will no longer need mlx4_en_recover_from_oom() and the
service_task firing every 250 ms.
Fixes: 31c128b66e5b ("net/mlx4_en: Choose time-stamping shift value according to HW frequency") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Eugenia Emantayev <eugenia@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
iwlwifi now supports RSS and can't let mac80211 track the
PS state based on the Rx frames since they can come out of
order. iwlwifi is now advertising AP_LINK_PS, and uses
explicit notifications to teach mac80211 about the PS state
of the stations and the PS poll / uAPSD trigger frames
coming our way from the peers.
Because of that, the TIM stopped being maintained in
mac80211. I tried to fix this in commit c68df2e7be0c
("mac80211: allow using AP_LINK_PS with mac80211-generated TIM IE")
but that was later reverted by Felix in commit 6c18a6b4e799
("Revert "mac80211: allow using AP_LINK_PS with mac80211-generated TIM IE")
since it broke drivers that do not implement set_tim.
Since none of the drivers that set AP_LINK_PS have the
set_tim() handler set besides iwlwifi, I can bail out in
__sta_info_recalc_tim if AP_LINK_PS AND .set_tim is not
implemented.
Unfortunately, the nla policy was defined to have HWSIM_ATTR_RADIO_NAME
as an NLA_STRING, rather than NLA_NUL_STRING, so we can't use it as a
NUL-terminated string in the kernel.
Rather than break the API, kasprintf() the string to a new buffer to
guarantee NUL termination.
Reported-by: Andrew Zaborowski <andrew.zaborowski@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The max and entry variables are unsigned according to the dt-bindings.
Fix following 3 sparse issues (-Wtypesign):
drivers/irqchip/irq-crossbar.c:222:52: warning: incorrect type in argument 3 (different signedness)
drivers/irqchip/irq-crossbar.c:222:52: expected unsigned int [usertype] *out_value
drivers/irqchip/irq-crossbar.c:222:52: got int *<noident>
drivers/irqchip/irq-crossbar.c:245:56: warning: incorrect type in argument 4 (different signedness)
drivers/irqchip/irq-crossbar.c:245:56: expected unsigned int [usertype] *out_value
drivers/irqchip/irq-crossbar.c:245:56: got int *<noident>
drivers/irqchip/irq-crossbar.c:263:56: warning: incorrect type in argument 4 (different signedness)
drivers/irqchip/irq-crossbar.c:263:56: expected unsigned int [usertype] *out_value
drivers/irqchip/irq-crossbar.c:263:56: got int *<noident>
gcc-4.3 can't decide whether the constant value in
kempld_prescaler[PRESCALER_21] is built-time constant or
not, and gets confused by the logic in do_div():
drivers/watchdog/kempld_wdt.o: In function `kempld_wdt_set_stage_timeout':
kempld_wdt.c:(.text.kempld_wdt_set_stage_timeout+0x130): undefined reference to `__aeabi_uldivmod'
This adds a call to ACCESS_ONCE() to force it to not consider
it to be constant, and leaves the more efficient normal case
in place for modern compilers, using an #ifdef to annotate
why we do this hack.
Ben reports:
That function doesn't exist here (it was introduced in 4.13).
Instead, this backport has modified bsg_create_job(), creating a
leak. Please revert this on the 3.18, 4.4 and 4.9 stable
branches.
So I'm dropping it from here.
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
In commit e3a77561e7d32 ("tipc: split up function tipc_msg_eval()"),
we have updated the function tipc_msg_lookup_dest() to set the error
codes to negative values at destination lookup failures. Thus when
the function sets the error code to -TIPC_ERR_NO_NAME, its inserted
into the 4 bit error field of the message header as 0xf instead of
TIPC_ERR_NO_NAME (1). The value 0xf is an unknown error code.
In this commit, we set only positive error code.
Fixes: e3a77561e7d32 ("tipc: split up function tipc_msg_eval()") Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
sk->sk_prot and sk->sk_prot_creator can differ when the app uses
IPV6_ADDRFORM (transforming an IPv6-socket to an IPv4-one).
Which is why sk_prot_creator is there to make sure that sk_prot_free()
does the kmem_cache_free() on the right kmem_cache slab.
Now, if such a socket gets transformed back to a listening socket (using
connect() with AF_UNSPEC) we will allocate an IPv4 tcp_sock through
sk_clone_lock() when a new connection comes in. But sk_prot_creator will
still point to the IPv6 kmem_cache (as everything got copied in
sk_clone_lock()). When freeing, we will thus put this
memory back into the IPv6 kmem_cache although it was allocated in the
IPv4 cache. I have seen memory corruption happening because of this.
With slub-debugging and MEMCG_KMEM enabled this gives the warning
"cache_from_obj: Wrong slab cache. TCPv6 but object is from TCP"
A C-program to trigger this:
void main(void)
{
int fd = socket(AF_INET6, SOCK_STREAM, IPPROTO_TCP);
int new_fd, newest_fd, client_fd;
struct sockaddr_in6 bind_addr;
struct sockaddr_in bind_addr4, client_addr1, client_addr2;
struct sockaddr unsp;
int val;
As far as I can see, this bug has been there since the beginning of the
git-days.
Signed-off-by: Christoph Paasch <cpaasch@apple.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Packet socket option po->has_vnet_hdr can be updated concurrently with
other operations if no ring is attached.
Do not test the option twice in packet_snd, as the value may change in
between calls. A race on setsockopt disable may cause a packet > mtu
to be sent without having GSO options set.
Fixes: bfd5f4a3d605 ("packet: Add GSO/csum offload support.") Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Once a socket has po->fanout set, it remains a member of the group
until it is destroyed. The prot_hook must be constant and identical
across sockets in the group.
If fanout_add races with packet_do_bind between the test of po->fanout
and taking the lock, the bind call may make type or dev inconsistent
with that of the fanout group.
Hold po->bind_lock when testing po->fanout to avoid this race.
I had to introduce artificial delay (local_bh_enable) to actually
observe the race.
Fixes: dc99f600698d ("packet: Add fanout support.") Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If we try to delete the same tunnel twice, the first delete operation
does a lookup (l2tp_tunnel_get), finds the tunnel, calls
l2tp_tunnel_delete, which queues it for deletion by
l2tp_tunnel_del_work.
The second delete operation also finds the tunnel and calls
l2tp_tunnel_delete. If the workqueue has already fired and started
running l2tp_tunnel_del_work, then l2tp_tunnel_delete will queue the
same tunnel a second time, and try to free the socket again.
Add a dead flag to prevent firing the workqueue twice. Then we can
remove the check of queue_work's result that was meant to prevent that
race but doesn't.
Reproducer:
ip l2tp add tunnel tunnel_id 3000 peer_tunnel_id 4000 local 192.168.0.2 remote 192.168.0.1 encap udp udp_sport 5000 udp_dport 6000
ip l2tp add session name l2tp1 tunnel_id 3000 session_id 1000 peer_session_id 2000
ip link set l2tp1 up
ip l2tp del tunnel tunnel_id 3000
ip l2tp del tunnel tunnel_id 3000
Fixes: f8ccac0e4493 ("l2tp: put tunnel socket release on a workqueue") Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While destroying a network namespace that contains a L2TP tunnel a
"BUG: scheduling while atomic" can be observed.
Enabling lockdep shows that this is happening because l2tp_exit_net()
is calling l2tp_tunnel_closeall() (via l2tp_tunnel_delete()) from
within an RCU critical section.
This bug can easily be reproduced with a few steps:
$ sudo unshare -n bash # Create a shell in a new namespace
# ip link set lo up
# ip addr add 127.0.0.1 dev lo
# ip l2tp add tunnel remote 127.0.0.1 local 127.0.0.1 tunnel_id 1 \
peer_tunnel_id 1 udp_sport 50000 udp_dport 50000
# ip l2tp add session name foo tunnel_id 1 session_id 1 \
peer_session_id 1
# ip link set foo up
# exit # Exit the shell, in turn exiting the namespace
$ dmesg
...
[942121.089216] BUG: scheduling while atomic: kworker/u16:3/13872/0x00000200
...
To fix this, move the call to l2tp_tunnel_closeall() out of the RCU
critical section, and instead call it from l2tp_tunnel_del_work(), which
is running from the l2tp_wq workqueue.
Fixes: 2b551c6e7d5b ("l2tp: close sessions before initiating tunnel delete") Signed-off-by: Ridge Kennedy <ridge.kennedy@alliedtelesis.co.nz> Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Can be fixed if we get skb->len before dst_output().
Fixes: b9959fd3b0fa ("vti: switch to new ip tunnel code") Fixes: 22e1b23dafa8 ("vti6: Support inter address family tunneling.") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In isdn_ppp_write(), the header (i.e., protobuf) of the buffer is
fetched twice from userspace. The first fetch is used to peek at the
protocol of the message and reset the huptimer if necessary; while the
second fetch copies in the whole buffer. However, given that buf resides
in userspace memory, a user process can race to change its memory content
across fetches. By doing so, we can either avoid resetting the huptimer
for any type of packets (by first setting proto to PPP_LCP and later
change to the actual type) or force resetting the huptimer for LCP
packets.
This patch changes this double-fetch behavior into two single fetches
decided by condition (lp->isdn_device < 0 || lp->isdn_channel <0).
A more detailed discussion can be found at
https://marc.info/?l=linux-kernel&m=150586376926123&w=2
Signed-off-by: Meng Xu <mengxu.gatech@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch fixes a bug exhibited by the following scenario:
1. fd1 = perf_event_open with attr.config = ID1
2. attach bpf program prog1 to fd1
3. fd2 = perf_event_open with attr.config = ID1
<this will be successful>
4. user program closes fd2 and prog1 is detached from the tracepoint.
5. user program with fd1 does not work properly as tracepoint
no output any more.
The issue happens at step 4. Multiple perf_event_open can be called
successfully, but only one bpf prog pointer in the tp_event. In the
current logic, any fd release for the same tp_event will free
the tp_event->prog.
The fix is to free tp_event->prog only when the closing fd
corresponds to the one which registered the program.
Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Packet socket bind operations must hold the po->bind_lock. This keeps
po->running consistent with whether the socket is actually on a ptype
list to receive packets.
fanout_add unbinds a socket and its packet_rcv/tpacket_rcv call, then
binds the fanout object to receive through packet_rcv_fanout.
Make it hold the po->bind_lock when testing po->running and rebinding.
Else, it can race with other rebind operations, such as that in
packet_set_ring from packet_rcv to tpacket_rcv. Concurrent updates
can result in a socket being added to a fanout group twice, causing
use-after-free KASAN bug reports, among others.
Reported independently by both trinity and syzkaller.
Verified that the syzkaller reproducer passes after this patch.
Fixes: dc99f600698d ("packet: Add fanout support.") Reported-by: nixioaming <nixiaoming@huawei.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch is pretty much a carbon copy of
commit 3079c652141f ("caif: Fix napi poll list corruption")
with "caif" replaced by "emac".
The commit d75b1ade567f ("net: less interrupt masking in NAPI")
breaks emac.
It is now required that if the entire budget is consumed when poll
returns, the napi poll_list must remain empty. However, like some
other drivers emac tries to do a last-ditch check and if there is
more work it will call napi_reschedule and then immediately process
some of this new work. Should the entire budget be consumed while
processing such new work then we will violate the new caller
contract.
This patch fixes this by not touching any work when we reschedule
in emac.
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now in ip6gre_header before packing the ipv6 header, it skb_push t->hlen
which only includes encap_hlen + tun_hlen. It means greh and inner header
would be over written by ipv6 stuff and ipv6h might have no chance to set
up.
Jianlin found this issue when using remote any on ip6_gre, the packets he
captured on gre dev are truncated:
It should also skb_push ipv6hdr so that ipv6h points to the right position
to set ipv6 stuff up.
This patch is to skb_push hlen + sizeof(*ipv6h) and also fix some indents
in ip6gre_header.
Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While trying an ESP transport mode encryption for UDPv6 packets of
datagram size 1436 with MTU 1500, checksum error was observed in
the secondary fragment.
This error occurs due to the UDP payload checksum being missed out
when computing the full checksum for these packets in
udp6_hwcsum_outgoing().
Fixes: d39d938c8228 ("ipv6: Introduce udpv6_send_skb()") Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This code causes a static checker warning because Smatch doesn't trust
anything that comes from skb->data. I've reviewed this code and I do
think skb->data can be controlled by the user here.
The sctp_event_subscribe struct has 13 __u8 fields and we want to see
if ours is non-zero. sn_type can be any value in the 0-USHRT_MAX range.
We're subtracting SCTP_SN_TYPE_BASE which is 1 << 15 so we could read
either before the start of the struct or after the end.
This is a very old bug and it's surprising that it would go undetected
for so long but my theory is that it just doesn't have a big impact so
it would be hard to notice.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit db8466c581cc ("MIPS: IRQ Stack: Unwind IRQ stack onto task
stack") erroneously set the initial stack pointer of the IRQ stack to a
value with a 4 byte alignment. The MIPS32 ABI requires that the minimum
stack alignment is 8 byte, and the MIPS64 ABIs(n32/n64) require 16 byte
minimum alignment. Fix IRQ_STACK_START such that it leaves space for the
dummy stack frame (containing interrupted task kernel stack pointer)
while also meeting minimum alignment requirements.
Fixes: db8466c581cc ("MIPS: IRQ Stack: Unwind IRQ stack onto task stack") Reported-by: Darius Ivanauskas <dasilt@yahoo.com> Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Petr Mladek <pmladek@suse.com> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: Jason A. Donenfeld <jason@zx2c4.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/16760/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As raw_cpu_generic_read() is a plain read from a raw_cpu_ptr() address,
it's possible (albeit unlikely) that the compiler will split the access
across multiple instructions.
In this_cpu_generic_read() we disable preemption but not interrupts
before calling raw_cpu_generic_read(). Thus, an interrupt could be taken
in the middle of the split load instructions. If a this_cpu_write() or
RMW this_cpu_*() op is made to the same variable in the interrupt
handling path, this_cpu_read() will return a torn value.
For native word types, we can avoid tearing using READ_ONCE(), but this
won't work in all cases (e.g. 64-bit types on most 32-bit platforms).
This patch reworks this_cpu_generic_read() to use READ_ONCE() where
possible, otherwise falling back to disabling interrupts.
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Lameter <cl@linux.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pranith Kumar <bobby.prani@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Signed-off-by: Tejun Heo <tj@kernel.org>
[Mark: backport to v4.4.y] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The alt_max_short() macro in asm/alternative.h does not work as
intended, leading to nasty bugs. E.g. alt_max_short("1", "3")
evaluates to 3, but alt_max_short("3", "1") evaluates to 1 -- not
exactly the maximum of 1 and 3.
In fact, I had to learn it the hard way by crashing my kernel in not
so funny ways by attempting to make use of the ALTENATIVE_2 macro
with alternatives where the first one was larger than the second
one.
According to [1] and commit dbe4058a6a44 ("x86/alternatives: Fix
ALTERNATIVE_2 padding generation properly") the right handed side
should read "-(-(a < b))" not "-(-(a - b))". Fix that, to make the
macro work as intended.
While at it, fix up the comments regarding the additional "-", too.
It's not about gas' usage of s32 but brain dead logic of having a
"true" value of -1 for the < operator ... *sigh*
Btw., the one in asm/alternative-asm.h is correct. And, apparently,
all current users of ALTERNATIVE_2() pass same sized alternatives,
avoiding to hit the bug.
Make sure to reset the USB-console port pointer when console setup fails
in order to avoid having the struct usb_serial be prematurely freed by
the console code when the device is later disconnected.
Fixes: 73e487fdb75f ("[PATCH] USB console: fix disconnection issues") Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dell Wireless 5819/5818 devices are re-branded Sierra Wireless MC74
series which will by default boot with vid 0x413c and pid's 0x81cf,
0x81d0, 0x81d1, 0x81d2.
Add the USB device id for the ELV TFD500 data logger.
Signed-off-by: Andreas Engel <anen-nospam@gmx.net> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
bio_map_user_iov and bio_unmap_user do unbalanced pages refcounting if
IO vector has small consecutive buffers belonging to the same page.
bio_add_pc_page merges them into one, but the page reference is never
dropped.
In the code added to function submit_page_section by commit b1058b981,
sdio->bio can currently be NULL when calling dio_bio_submit. This then
leads to a NULL pointer access in dio_bio_submit, so check for a NULL
bio in submit_page_section before trying to submit it instead.
Fixes xfstest generic/250 on gfs2.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
BUG: KASAN: use-after-free in usb_composite_overwrite_options+...
[libcomposite] at addr ...
Read of size 1 by task ...
when some driver is un-bound and then bound again.
For example, this happens with FunctionFS driver when "ffs-test"
test application is run several times in a row.
If the driver has empty manufacturer ID string in initial static data,
it is then replaced with generated string. After driver unbinding
the generated string is freed, but the driver data still keep that
pointer. And if the driver is then bound again, that pointer
is re-used for string emptiness check.
The fix is to clean up the driver string data upon its unbinding
to drop the pointer to freed memory.
Fixes: cc2683c318a5 ("usb: gadget: Provide a default implementation of default manufacturer string") Signed-off-by: Andrew Gabbasov <andrew_gabbasov@mentor.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While line6_probe() may kick off URB for a control MIDI endpoint, the
function doesn't clean up it properly at its error path. This results
in a leftover URB action that is eventually triggered later and causes
an Oops like:
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 0 Comm: swapper/1 Not tainted
RIP: 0010:usb_fill_bulk_urb ./include/linux/usb.h:1619
RIP: 0010:line6_start_listen+0x3fe/0x9e0 sound/usb/line6/driver.c:76
Call Trace:
<IRQ>
line6_data_received+0x1f7/0x470 sound/usb/line6/driver.c:326
__usb_hcd_giveback_urb+0x2e0/0x650 drivers/usb/core/hcd.c:1779
usb_hcd_giveback_urb+0x337/0x420 drivers/usb/core/hcd.c:1845
dummy_timer+0xba9/0x39f0 drivers/usb/gadget/udc/dummy_hcd.c:1965
call_timer_fn+0x2a2/0x940 kernel/time/timer.c:1281
....
Since the whole clean-up procedure is done in line6_disconnect()
callback, we can simply call it in the error path instead of
open-coding the whole again. It'll fix such an issue automagically.
caiaq driver doesn't kill the URB properly at its error path during
the probe, which may lead to a use-after-free error later. This patch
addresses it.
Reported-by: Johan Hovold <johan@kernel.org> Reviewed-by: Johan Hovold <johan@kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The event handler in the virmidi sequencer code takes a read-lock for
the linked list traverse, while it's calling snd_seq_dump_var_event()
in the loop. The latter function may expand the user-space data
depending on the event type. It eventually invokes copy_from_user(),
which might be a potential dead-lock.
The sequencer core guarantees that the user-space data is passed only
with atomic=0 argument, but snd_virmidi_dev_receive_event() ignores it
and always takes read-lock(). For avoiding the problem above, this
patch introduces rwsem for non-atomic case, while keeping rwlock for
atomic case.
Also while we're at it: the superfluous irq flags is dropped in
snd_virmidi_input_open().
There is a potential race window opened at creating and deleting a
port via ioctl, as spotted by fuzzing. snd_seq_create_port() creates
a port object and returns its pointer, but it doesn't take the
refcount, thus it can be deleted immediately by another thread.
Meanwhile, snd_seq_ioctl_create_port() still calls the function
snd_seq_system_client_ev_port_start() with the created port object
that is being deleted, and this triggers use-after-free like:
We may fix this in a few different ways, and in this patch, it's fixed
simply by taking the refcount properly at snd_seq_create_port() and
letting the caller unref the object after use. Also, there is another
potential use-after-free by sprintf() call in snd_seq_create_port(),
and this is moved inside the lock.
USB-audio driver may leave a stray URB for the mixer interrupt when it
exits by some error during probe. This leads to a use-after-free
error as spotted by syzkaller like:
==================================================================
BUG: KASAN: use-after-free in snd_usb_mixer_interrupt+0x604/0x6f0
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:16
dump_stack+0x292/0x395 lib/dump_stack.c:52
print_address_description+0x78/0x280 mm/kasan/report.c:252
kasan_report_error mm/kasan/report.c:351
kasan_report+0x23d/0x350 mm/kasan/report.c:409
__asan_report_load8_noabort+0x19/0x20 mm/kasan/report.c:430
snd_usb_mixer_interrupt+0x604/0x6f0 sound/usb/mixer.c:2490
__usb_hcd_giveback_urb+0x2e0/0x650 drivers/usb/core/hcd.c:1779
....
Actually such a URB is killed properly at disconnection when the
device gets probed successfully, and what we need is to apply it for
the error-path, too.
In this patch, we apply snd_usb_mixer_disconnect() at releasing.
Also introduce a new flag, disconnected, to struct usb_mixer_interface
for not performing the disconnection procedure twice.
The DREQE bit of the DnFIFOSEL should be set to 1 after the DE bit of
USB-DMAC on R-Car SoCs is set to 1 after the USB-DMAC received a
zero-length packet. Otherwise, a transfer completion interruption
of USB-DMAC doesn't happen. Even if the driver changes the sequence,
normal operations (transmit/receive without zero-length packet) will
not cause any side-effects. So, this patch fixes the sequence anyway.
When KVM emulates an exit from L2 to L1, it loads L1 CR4 into the
guest CR4. Before this CR4 loading, the guest CR4 refers to L2
CR4. Because these two CR4's are in different levels of guest, we
should vmx_set_cr4() rather than kvm_set_cr4() here. The latter, which
is used to handle guest writes to its CR4, checks the guest change to
CR4 and may fail if the change is invalid.
The failure may cause trouble. Consider we start
a L1 guest with non-zero L1 PCID in use,
(i.e. L1 CR4.PCIDE == 1 && L1 CR3.PCID != 0)
and
a L2 guest with L2 PCID disabled,
(i.e. L2 CR4.PCIDE == 0)
and following events may happen:
1. If kvm_set_cr4() is used in load_vmcs12_host_state() to load L1 CR4
into guest CR4 (in VMCS01) for L2 to L1 exit, it will fail because
of PCID check. As a result, the guest CR4 recorded in L0 KVM (i.e.
vcpu->arch.cr4) is left to the value of L2 CR4.
2. Later, if L1 attempts to change its CR4, e.g., clearing VMXE bit,
kvm_set_cr4() in L0 KVM will think L1 also wants to enable PCID,
because the wrong L2 CR4 is used by L0 KVM as L1 CR4. As L1
CR3.PCID != 0, L0 KVM will inject GP to L1 guest.
Fixes: 4704d0befb072 ("KVM: nVMX: Exiting from L2 to L1") Cc: qemu-stable@nongnu.org Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The shash ahash digest adaptor function may crash if given a
zero-length input together with a null SG list. This is because
it tries to read the SG list before looking at the length.
This patch fixes it by checking the length first.
Reported-by: Stephan Müller<smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Stephan Müller <smueller@chronox.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The hid descriptor identifies the length and type of subordinate
descriptors for a device. If the received hid descriptor is smaller than
the size of the struct hid_descriptor, it is possible to cause
out-of-bounds.
In addition, if bNumDescriptors of the hid descriptor have an incorrect
value, this can also cause out-of-bounds while approaching hdesc->desc[n].
So check the size of hid descriptor and bNumDescriptors.
BUG: KASAN: slab-out-of-bounds in usbhid_parse+0x9b1/0xa20
Read of size 1 at addr ffff88006c5f8edf by task kworker/1:2/1261
Memory to Memory transfers does not have any special alignment needs
regarding to acnt array size, but if one of the areas are in memory mapped
regions (like PCIe memory), we need to make sure that the acnt array size
is aligned with the mem copy parameters.
Before "dmaengine: edma: Optimize memcpy operation" change the memcpy was set
up in a different way: acnt == number of bytes in a word based on
__ffs((src | dest | len), bcnt and ccnt for looping the necessary number of
words to comlete the trasnfer.
Instead of reverting the commit we can fix it to make sure that the ACNT size
is aligned to the traswnfer.
The FPU emulator includes 2 calls to pr_err() which are triggered by
invalid instruction encodings for MIPSr6 cmp.cond.fmt instructions.
These cases are not kernel errors, merely invalid instructions which are
already handled by delivering a SIGILL which will provide notification
that something failed in cases where that makes sense.
In cases where that SIGILL is somewhat expected & being handled, for
example when crashme happens to generate one of the affected bad
encodings, the message is printed with no useful context about what
triggered it & spams the kernel log for no good reason.
Remove the pr_err() calls to make crashme run silently & treat the bad
encodings the same way we do others, with a SIGILL & no further kernel
log output.
Signed-off-by: Paul Burton <paul.burton@imgtec.com> Fixes: f8c3c6717a71 ("MIPS: math-emu: Add support for the CMP.condn.fmt R6 instruction") Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/17253/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The dummy-hcd driver calls the gadget driver's disconnect callback
under the wrong conditions. It should invoke the callback when Vbus
power is turned off, but instead it does so when the D+ pullup is
turned off.
This can cause a deadlock in the composite core when a gadget driver
is unregistered:
A number of architecture invoke rcu_irq_enter() on exception entry in
order to allow RCU read-side critical sections in the exception handler
when the exception is from an idle or nohz_full CPU. This works, at
least unless the exception happens in an NMI handler. In that case,
rcu_nmi_enter() would already have exited the extended quiescent state,
which would mean that rcu_irq_enter() would (incorrectly) cause RCU
to think that it is again in an extended quiescent state. This will
in turn result in lockdep splats in response to later RCU read-side
critical sections.
This commit therefore causes rcu_irq_enter() and rcu_irq_exit() to
take no action if there is an rcu_nmi_enter() in effect, thus avoiding
the unscheduled return to RCU quiescent state. This in turn should
make the kernel safe for on-demand RCU voyeurism.
Link: http://lkml.kernel.org/r/20170922211022.GA18084@linux.vnet.ibm.com Cc: stable@vger.kernel.org Fixes: 0be964be0 ("module: Sanitize RCU usage and locking") Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The MCAST_FILTER_CMD can get quite large when we have many mcast
addresses to set (we support up to 255). So the command should be
send as NOCOPY to prevent a warning caused by too-long commands:
WARNING: CPU: 0 PID: 9700 at /root/iwlwifi/stack-dev/drivers/net/wireless/intel/iwlwifi/pcie/tx.c:1550 iwl_pcie_enqueue_hcmd+0x8c7/0xb40 [iwlwifi]
Command MCAST_FILTER_CMD (0x1d0) is too large (328 bytes)
This fixes: https://bugzilla.kernel.org/show_bug.cgi?id=196743
Define a policy for packet pattern attributes in order to fix a
potential read over the end of the buffer during nla_get_u32()
of the NL80211_PKTPAT_OFFSET attribute.
Note that the data there can always be read due to SKB allocation
(with alignment and struct skb_shared_info at the end), but the
data might be uninitialized. This could be used to leak some data
from uninitialized vmalloc() memory, but most drivers don't allow
an offset (so you'd just get -EINVAL if the data is non-zero) or
just allow it with a fixed value - 100 or 128 bytes, so anything
above that would get -EINVAL. With brcmfmac the limit is 1500 so
(at least) one byte could be obtained.
Cc: stable@kernel.org Signed-off-by: Peng Xu <pxu@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
[rewrite description based on SKB allocation knowledge] Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to the MS-SMB2 spec (3.2.5.1.6) once the client receives
STATUS_NETWORK_SESSION_EXPIRED error code from a server it should
reconnect the current SMB session. Currently the client doesn't do
that. This can result in subsequent client requests failing by
the server. The patch adds an additional logic to the demultiplex
thread to identify expired sessions and reconnect them.
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the ext4 implementations of SEEK_HOLE and SEEK_DATA, make sure we
return -ENXIO for negative offsets instead of banging around inside
the extent code and returning -EFSCORRUPTED.
Reported-by: Mateusz S <muttdini@gmail.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Upon handling the firmware notification for scans the length was
checked properly and may result in corrupting kernel heap memory
due to buffer overruns. This fix addresses CVE-2017-0786.
Cc: Kevin Cernekee <cernekee@chromium.org> Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0'). However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.
Fix the problem by moving posix_acl_update_mode() out of
__ext4_set_acl() into ext4_set_acl(). That way the function will not be
called when inheriting ACLs which is what we want as it prevents SGID
bit clearing and the mode has been properly set by posix_acl_create()
anyway.
Fixes: 073931017b49d9458aa351605b43a7e34598caef Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mpage_submit_page() can race with another process growing i_size and
writing data via mmap to the written-back page. As mpage_submit_page()
samples i_size too early, it may happen that ext4_bio_write_page()
zeroes out too large tail of the page and thus corrupts user data.
Fix the problem by sampling i_size only after the page has been
write-protected in page tables by clear_page_dirty_for_io() call.
Reported-by: Michael Zimmer <michael@swarm64.com> CC: stable@vger.kernel.org Fixes: cb20d5188366f04d96d2e07b1240cc92170ade40 Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cpusets vs. suspend-resume is _completely_ broken. And it got noticed
because it now resulted in non-cpuset usage breaking too.
On suspend cpuset_cpu_inactive() doesn't call into
cpuset_update_active_cpus() because it doesn't want to move tasks about,
there is no need, all tasks are frozen and won't run again until after
we've resumed everything.
But this means that when we finally do call into
cpuset_update_active_cpus() after resuming the last frozen cpu in
cpuset_cpu_active(), the top_cpuset will not have any difference with
the cpu_active_mask and this it will not in fact do _anything_.
So the cpuset configuration will not be restored. This was largely
hidden because we would unconditionally create identity domains and
mobile users would not in fact use cpusets much. And servers what do use
cpusets tend to not suspend-resume much.
An addition problem is that we'd not in fact wait for the cpuset work to
finish before resuming the tasks, allowing spurious migrations outside
of the specified domains.
Fix the rebuild by introducing cpuset_force_rebuild() and fix the
ordering with cpuset_wait_for_hotplug().
Reported-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: deb7aa308ea2 ("cpuset: reorganize CPU / memory hotplug handling") Link: http://lkml.kernel.org/r/20170907091338.orwxrqkbfkki3c24@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The hardware state readout oopses after several warnings when trying to
use HDMI on port A, if such a combination is configured in VBT. Filter
the combo out already at the VBT parsing phase.
v2: also ignore DVI (Ville)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102889 Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Tested-by: Daniel Drake <dan@reactivated.net> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170921141920.18172-1-jani.nikula@intel.com
(cherry picked from commit d27ffc1d00327c29b3aa97f941b42f0949f9e99f) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The driver was not properly configuring firmware with regard to the
type of scan. It always performed an active scan even when user-space
was requesting for passive scan, ie. the scan request was done without
any SSIDs specified.
Reported-by: Huang, Jiangyang <Jiangyang.Huang@itron.com> Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
hwarc_neep_init() assumes that endpoint 0 is interrupt, but there's no
check for that, which results in a WARNING in USB core code, when a bad
USB descriptor is provided from a device:
uwbd_start() calls kthread_run() and checks that the return value is
not NULL. But the return value is not NULL in case kthread_run() fails,
it takes the form of ERR_PTR(-EINTR).
Commit f686a36b4b79 ("iio: adc: mcp320x: Add support for mcp3301")
returns a signed voltage from mcp320x_adc_conversion() but neglects that
the caller interprets a negative return value as failure. Only mcp3301
(and the upcoming mcp3550/1/3) is affected as the other chips are
incapable of measuring negative voltages.
Fix and while at it, add mcp3301 to the list of supported chips at the
top of the file.
Fixes: f686a36b4b79 ("iio: adc: mcp320x: Add support for mcp3301") Cc: Andrea Galbusera <gizero@gmail.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The serial interface can be reset by writing 32 consecutive 1s to the device.
'ret' was initialized correctly but its value was overwritten when
ad7793_check_platform_data() was called. Since a dedicated reset function
is present now, it should be used instead.
Fixes: 2edb769d246e ("iio:ad7793: Add support for the ad7798 and ad7799") Signed-off-by: Dragos Bogdan <dragos.bogdan@analog.com> Acked-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If an IIO device returns an error code for a read access via debugfs, it
is currently ignored by the IIO core (other than emitting an error
message). Instead, return this error code to user space, so upper layers
can detect it correctly.
Signed-off-by: Matt Fornero <matt.fornero@mathworks.com> Signed-off-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since most of the SD ADCs have the option of reseting the serial
interface by sending a number of SCLKs with CS = 0 and DIN = 1,
a dedicated function that can do this is usefull.
Needed for the patch: iio: ad7793: Fix the serial interface reset Signed-off-by: Dragos Bogdan <dragos.bogdan@analog.com> Acked-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 7cc97d77ee8a has introduced a call to 'regulator_disable()' in the
.remove function.
So we should also have such a call in the .probe function in case of
error after a successful 'regulator_enable()' call.
If 'devm_regulator_get()' fails, we should go through the existing error
handling path instead of returning directly, as done is all the other
error handling paths in this function.
Andrey Konovalov reported a possible out-of-bounds problem for a USB interface
association descriptor. He writes:
It seems there's no proper size check of a USB_DT_INTERFACE_ASSOCIATION
descriptor. It's only checked that the size is >= 2 in
usb_parse_configuration(), so find_iad() might do out-of-bounds access
to intf_assoc->bInterfaceCount.
And he's right, we don't check for crazy descriptors of this type very well, so
resolve this problem. Yet another issue found by syzkaller...
Commit e0429362ab15
("usb: Add device quirk for Logitech HD Pro Webcams C920 and C930e")
introduced quirk to workaround an issue with some Logitech webcams.
The workaround is introducing delay for some USB operations.
According to our testing, delay introduced by original commit
is not long enough and in rare cases we still see issues described
by the aforementioned commit.
This patch increases delays introduced by original commit.
Having this patch applied we do not see those problems anymore.
Andrey Konovalov reported a possible out-of-bounds problem for the
cdc_parse_cdc_header function. He writes:
It looks like cdc_parse_cdc_header() doesn't validate buflen
before accessing buffer[1], buffer[2] and so on. The only check
present is while (buflen > 0).
So fix this issue up by properly validating the buffer length matches
what the descriptor says it is.
The uas driver has a subtle bug in the way it handles alternate
settings. The uas_find_uas_alt_setting() routine returns an
altsetting value (the bAlternateSetting number in the descriptor), but
uas_use_uas_driver() then treats that value as an index to the
intf->altsetting array, which it isn't.
Normally this doesn't cause any problems because the various
alternate settings have bAlternateSetting values 0, 1, 2, ..., so the
value is equal to the index in the array. But this is not guaranteed,
and Andrey Konovalov used the syzkaller fuzzer with KASAN to get a
slab-out-of-bounds error by violating this assumption.
This patch fixes the bug by making uas_find_uas_alt_setting() return a
pointer to the altsetting entry rather than either the value or the
index. Pointers are less subject to misinterpretation.
A user may lower the max_sectors_kb setting in sysfs to accommodate
certain workloads. Previously we would always set the max I/O size to
either the block layer default or the optional preferred I/O size
reported by the device.
Keep the current heuristics for the initial setting of max_sectors_kb.
For subsequent invocations, only update the current queue limit if it
exceeds the capabilities of the hardware.
Reported-by: Don Brace <don.brace@microsemi.com> Reviewed-by: Martin Wilck <mwilck@suse.com> Tested-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>