]> git.itanic.dy.fi Git - linux-stable/log
linux-stable
5 years agoLinux 4.19.16 v4.19.16
Greg Kroah-Hartman [Wed, 16 Jan 2019 21:04:38 +0000 (22:04 +0100)]
Linux 4.19.16

5 years agoBtrfs: use nofs context when initializing security xattrs to avoid deadlock
Filipe Manana [Mon, 10 Dec 2018 17:53:35 +0000 (17:53 +0000)]
Btrfs: use nofs context when initializing security xattrs to avoid deadlock

commit 827aa18e7b903c5ff3b3cd8fec328a99b1dbd411 upstream.

When initializing the security xattrs, we are holding a transaction handle
therefore we need to use a GFP_NOFS context in order to avoid a deadlock
with reclaim in case it's triggered.

Fixes: 39a27ec1004e8 ("btrfs: use GFP_KERNEL for xattr and acl allocations")
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoBtrfs: fix deadlock when enabling quotas due to concurrent snapshot creation
Filipe Manana [Mon, 19 Nov 2018 14:15:36 +0000 (14:15 +0000)]
Btrfs: fix deadlock when enabling quotas due to concurrent snapshot creation

commit 9a6f209e36500efac51528132a3e3083586eda5f upstream.

If the quota enable and snapshot creation ioctls are called concurrently
we can get into a deadlock where the task enabling quotas will deadlock
on the fs_info->qgroup_ioctl_lock mutex because it attempts to lock it
twice, or the task creating a snapshot tries to commit the transaction
while the task enabling quota waits for the former task to commit the
transaction while holding the mutex. The following time diagrams show how
both cases happen.

First scenario:

           CPU 0                                    CPU 1

 btrfs_ioctl()
  btrfs_ioctl_quota_ctl()
   btrfs_quota_enable()
    mutex_lock(fs_info->qgroup_ioctl_lock)
    btrfs_start_transaction()

                                             btrfs_ioctl()
                                              btrfs_ioctl_snap_create_v2
                                               create_snapshot()
                                                --> adds snapshot to the
                                                    list pending_snapshots
                                                    of the current
                                                    transaction

    btrfs_commit_transaction()
     create_pending_snapshots()
       create_pending_snapshot()
        qgroup_account_snapshot()
         btrfs_qgroup_inherit()
   mutex_lock(fs_info->qgroup_ioctl_lock)
    --> deadlock, mutex already locked
        by this task at
btrfs_quota_enable()

Second scenario:

           CPU 0                                    CPU 1

 btrfs_ioctl()
  btrfs_ioctl_quota_ctl()
   btrfs_quota_enable()
    mutex_lock(fs_info->qgroup_ioctl_lock)
    btrfs_start_transaction()

                                             btrfs_ioctl()
                                              btrfs_ioctl_snap_create_v2
                                               create_snapshot()
                                                --> adds snapshot to the
                                                    list pending_snapshots
                                                    of the current
                                                    transaction

                                                btrfs_commit_transaction()
                                                 --> waits for task at
                                                     CPU 0 to release
                                                     its transaction
                                                     handle

    btrfs_commit_transaction()
     --> sees another task started
         the transaction commit first
     --> releases its transaction
         handle
     --> waits for the transaction
         commit to be completed by
         the task at CPU 1

                                                 create_pending_snapshot()
                                                  qgroup_account_snapshot()
                                                   btrfs_qgroup_inherit()
                                                    mutex_lock(fs_info->qgroup_ioctl_lock)
                                                     --> deadlock, task at CPU 0
                                                         has the mutex locked but
                                                         it is waiting for us to
                                                         finish the transaction
                                                         commit

So fix this by setting the quota enabled flag in fs_info after committing
the transaction at btrfs_quota_enable(). This ends up serializing quota
enable and snapshot creation as if the snapshot creation happened just
before the quota enable request. The quota rescan task, scheduled after
committing the transaction in btrfs_quote_enable(), will do the accounting.

Fixes: 6426c7ad697d ("btrfs: qgroup: Fix qgroup accounting when creating snapshot")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoBtrfs: fix access to available allocation bits when starting balance
Filipe Manana [Mon, 19 Nov 2018 09:48:12 +0000 (09:48 +0000)]
Btrfs: fix access to available allocation bits when starting balance

commit 5a8067c0d17feb7579db0476191417b441a8996e upstream.

The available allocation bits members from struct btrfs_fs_info are
protected by a sequence lock, and when starting balance we access them
incorrectly in two different ways:

1) In the read sequence lock loop at btrfs_balance() we use the values we
   read from fs_info->avail_*_alloc_bits and we can immediately do actions
   that have side effects and can not be undone (printing a message and
   jumping to a label). This is wrong because a retry might be needed, so
   our actions must not have side effects and must be repeatable as long
   as read_seqretry() returns a non-zero value. In other words, we were
   essentially ignoring the sequence lock;

2) Right below the read sequence lock loop, we were reading the values
   from avail_metadata_alloc_bits and avail_data_alloc_bits without any
   protection from concurrent writers, that is, reading them outside of
   the read sequence lock critical section.

So fix this by making sure we only read the available allocation bits
while in a read sequence lock critical section and that what we do in the
critical section is repeatable (has nothing that can not be undone) so
that any eventual retry that is needed is handled properly.

Fixes: de98ced9e743 ("Btrfs: use seqlock to protect fs_info->avail_{data, metadata, system}_alloc_bits")
Fixes: 14506127979a ("btrfs: fix a bogus warning when converting only data or metadata")
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoarm64: compat: Don't pull syscall number from regs in arm_compat_syscall
Will Deacon [Thu, 3 Jan 2019 18:00:39 +0000 (18:00 +0000)]
arm64: compat: Don't pull syscall number from regs in arm_compat_syscall

commit 53290432145a8eb143fe29e06e9c1465d43dc723 upstream.

The syscall number may have been changed by a tracer, so we should pass
the actual number in from the caller instead of pulling it from the
saved r7 value directly.

Cc: <stable@vger.kernel.org>
Cc: Pi-Hsun Shih <pihsun@chromium.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoKVM: arm/arm64: Fix VMID alloc race by reverting to lock-less
Christoffer Dall [Tue, 11 Dec 2018 12:23:57 +0000 (13:23 +0100)]
KVM: arm/arm64: Fix VMID alloc race by reverting to lock-less

commit fb544d1ca65a89f7a3895f7531221ceeed74ada7 upstream.

We recently addressed a VMID generation race by introducing a read/write
lock around accesses and updates to the vmid generation values.

However, kvm_arch_vcpu_ioctl_run() also calls need_new_vmid_gen() but
does so without taking the read lock.

As far as I can tell, this can lead to the same kind of race:

  VM 0, VCPU 0 VM 0, VCPU 1
  ------------ ------------
  update_vttbr (vmid 254)
   update_vttbr (vmid 1) // roll over
read_lock(kvm_vmid_lock);
force_vm_exit()
  local_irq_disable
  need_new_vmid_gen == false //because vmid gen matches

  enter_guest (vmid 254)
   kvm_arch.vttbr = <PGD>:<VMID 1>
read_unlock(kvm_vmid_lock);

   enter_guest (vmid 1)

Which results in running two VCPUs in the same VM with different VMIDs
and (even worse) other VCPUs from other VMs could now allocate clashing
VMID 254 from the new generation as long as VCPU 0 is not exiting.

Attempt to solve this by making sure vttbr is updated before another CPU
can observe the updated VMID generation.

Cc: stable@vger.kernel.org
Fixes: f0cf47d939d0 "KVM: arm/arm64: Close VMID generation race"
Reviewed-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agosunrpc: use-after-free in svc_process_common()
Vasily Averin [Mon, 24 Dec 2018 11:44:52 +0000 (14:44 +0300)]
sunrpc: use-after-free in svc_process_common()

commit d4b09acf924b84bae77cad090a9d108e70b43643 upstream.

if node have NFSv41+ mounts inside several net namespaces
it can lead to use-after-free in svc_process_common()

svc_process_common()
        /* Setup reply header */
        rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr(rqstp); <<< HERE

svc_process_common() can use incorrect rqstp->rq_xprt,
its caller function bc_svc_process() takes it from serv->sv_bc_xprt.
The problem is that serv is global structure but sv_bc_xprt
is assigned per-netnamespace.

According to Trond, the whole "let's set up rqstp->rq_xprt
for the back channel" is nothing but a giant hack in order
to work around the fact that svc_process_common() uses it
to find the xpt_ops, and perform a couple of (meaningless
for the back channel) tests of xpt_flags.

All we really need in svc_process_common() is to be able to run
rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr()

Bruce J Fields points that this xpo_prep_reply_hdr() call
is an awfully roundabout way just to do "svc_putnl(resv, 0);"
in the tcp case.

This patch does not initialiuze rqstp->rq_xprt in bc_svc_process(),
now it calls svc_process_common() with rqstp->rq_xprt = NULL.

To adjust reply header svc_process_common() just check
rqstp->rq_prot and calls svc_tcp_prep_reply_hdr() for tcp case.

To handle rqstp->rq_xprt = NULL case in functions called from
svc_process_common() patch intruduces net namespace pointer
svc_rqst->rq_bc_net and adjust SVC_NET() definition.
Some other function was also adopted to properly handle described case.

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Cc: stable@vger.kernel.org
Fixes: 23c20ecd4475 ("NFS: callback up - users counting cleanup")
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
v2: added lost extern svc_tcp_prep_reply_hdr()
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agomm: page_mapped: don't assume compound page is huge or THP
Jan Stancek [Tue, 8 Jan 2019 23:23:28 +0000 (15:23 -0800)]
mm: page_mapped: don't assume compound page is huge or THP

commit 8ab88c7169b7fba98812ead6524b9d05bc76cf00 upstream.

LTP proc01 testcase has been observed to rarely trigger crashes
on arm64:
    page_mapped+0x78/0xb4
    stable_page_flags+0x27c/0x338
    kpageflags_read+0xfc/0x164
    proc_reg_read+0x7c/0xb8
    __vfs_read+0x58/0x178
    vfs_read+0x90/0x14c
    SyS_read+0x60/0xc0

The issue is that page_mapped() assumes that if compound page is not
huge, then it must be THP.  But if this is 'normal' compound page
(COMPOUND_PAGE_DTOR), then following loop can keep running (for
HPAGE_PMD_NR iterations) until it tries to read from memory that isn't
mapped and triggers a panic:

        for (i = 0; i < hpage_nr_pages(page); i++) {
                if (atomic_read(&page[i]._mapcount) >= 0)
                        return true;
}

I could replicate this on x86 (v4.20-rc4-98-g60b548237fed) only
with a custom kernel module [1] which:
 - allocates compound page (PAGEC) of order 1
 - allocates 2 normal pages (COPY), which are initialized to 0xff (to
   satisfy _mapcount >= 0)
 - 2 PAGEC page structs are copied to address of first COPY page
 - second page of COPY is marked as not present
 - call to page_mapped(COPY) now triggers fault on access to 2nd COPY
   page at offset 0x30 (_mapcount)

[1] https://github.com/jstancek/reproducers/blob/master/kernel/page_mapped_crash/repro.c

Fix the loop to iterate for "1 << compound_order" pages.

Kirrill said "IIRC, sound subsystem can producuce custom mapped compound
pages".

Link: http://lkml.kernel.org/r/c440d69879e34209feba21e12d236d06bc0a25db.1543577156.git.jstancek@redhat.com
Fixes: e1534ae95004 ("mm: differentiate page_mapped() from page_mapcount() for compound pages")
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Debugged-by: Laszlo Ersek <lersek@redhat.com>
Suggested-by: "Kirill A. Shutemov" <kirill@shutemov.name>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoext4: fix special inode number checks in __ext4_iget()
Theodore Ts'o [Tue, 1 Jan 2019 03:34:31 +0000 (22:34 -0500)]
ext4: fix special inode number checks in __ext4_iget()

commit 191ce17876c9367819c4b0a25b503c0f6d9054d8 upstream.

The check for special (reserved) inode number checks in __ext4_iget()
was broken by commit 8a363970d1dc: ("ext4: avoid declaring fs
inconsistent due to invalid file handles").  This was caused by a
botched reversal of the sense of the flag now known as
EXT4_IGET_SPECIAL (when it was previously named EXT4_IGET_NORMAL).
Fix the logic appropriately.

Fixes: 8a363970d1dc ("ext4: avoid declaring fs inconsistent...")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoext4: track writeback errors using the generic tracking infrastructure
Theodore Ts'o [Mon, 31 Dec 2018 05:11:07 +0000 (00:11 -0500)]
ext4: track writeback errors using the generic tracking infrastructure

commit 95cb67138746451cc84cf8e516e14989746e93b0 upstream.

We already using mapping_set_error() in fs/ext4/page_io.c, so all we
need to do is to use file_check_and_advance_wb_err() when handling
fsync() requests in ext4_sync_file().

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoext4: use ext4_write_inode() when fsyncing w/o a journal
Theodore Ts'o [Mon, 31 Dec 2018 05:10:48 +0000 (00:10 -0500)]
ext4: use ext4_write_inode() when fsyncing w/o a journal

commit ad211f3e94b314a910d4af03178a0b52a7d1ee0a upstream.

In no-journal mode, we previously used __generic_file_fsync() in
no-journal mode.  This triggers a lockdep warning, and in addition,
it's not safe to depend on the inode writeback mechanism in the case
ext4.  We can solve both problems by calling ext4_write_inode()
directly.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoext4: avoid kernel warning when writing the superblock to a dead device
Theodore Ts'o [Mon, 31 Dec 2018 04:20:39 +0000 (23:20 -0500)]
ext4: avoid kernel warning when writing the superblock to a dead device

commit e86807862e6880809f191c4cea7f88a489f0ed34 upstream.

The xfstests generic/475 test switches the underlying device with
dm-error while running a stress test.  This results in a large number
of file system errors, and since we can't lock the buffer head when
marking the superblock dirty in the ext4_grp_locked_error() case, it's
possible the superblock to be !buffer_uptodate() without
buffer_write_io_error() being true.

We need to set buffer_uptodate() before we call mark_buffer_dirty() or
this will trigger a WARN_ON.  It's safe to do this since the
superblock must have been properly read into memory or the mount would
have been successful.  So if buffer_uptodate() is not set, we can
safely assume that this happened due to a failed attempt to write the
superblock.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoext4: fix a potential fiemap/page fault deadlock w/ inline_data
Theodore Ts'o [Tue, 25 Dec 2018 05:56:33 +0000 (00:56 -0500)]
ext4: fix a potential fiemap/page fault deadlock w/ inline_data

commit 2b08b1f12cd664dc7d5c84ead9ff25ae97ad5491 upstream.

The ext4_inline_data_fiemap() function calls fiemap_fill_next_extent()
while still holding the xattr semaphore.  This is not necessary and it
triggers a circular lockdep warning.  This is because
fiemap_fill_next_extent() could trigger a page fault when it writes
into page which triggers a page fault.  If that page is mmaped from
the inline file in question, this could very well result in a
deadlock.

This problem can be reproduced using generic/519 with a file system
configuration which has the inline_data feature enabled.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoext4: make sure enough credits are reserved for dioread_nolock writes
Theodore Ts'o [Tue, 25 Dec 2018 01:27:08 +0000 (20:27 -0500)]
ext4: make sure enough credits are reserved for dioread_nolock writes

commit 812c0cab2c0dfad977605dbadf9148490ca5d93f upstream.

There are enough credits reserved for most dioread_nolock writes;
however, if the extent tree is sufficiently deep, and/or quota is
enabled, the code was not allowing for all eventualities when
reserving journal credits for the unwritten extent conversion.

This problem can be seen using xfstests ext4/034:

   WARNING: CPU: 1 PID: 257 at fs/ext4/ext4_jbd2.c:271 __ext4_handle_dirty_metadata+0x10c/0x180
   Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work
   RIP: 0010:__ext4_handle_dirty_metadata+0x10c/0x180
    ...
   EXT4-fs: ext4_free_blocks:4938: aborting transaction: error 28 in __ext4_handle_dirty_metadata
   EXT4: jbd2_journal_dirty_metadata failed: handle type 11 started at line 4921, credits 4/0, errcode -28
   EXT4-fs error (device dm-1) in ext4_free_blocks:4950: error 28

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agorbd: don't return 0 on unmap if RBD_DEV_FLAG_REMOVING is set
Ilya Dryomov [Tue, 8 Jan 2019 18:47:38 +0000 (19:47 +0100)]
rbd: don't return 0 on unmap if RBD_DEV_FLAG_REMOVING is set

commit 85f5a4d666fd9be73856ed16bb36c5af5b406b29 upstream.

There is a window between when RBD_DEV_FLAG_REMOVING is set and when
the device is removed from rbd_dev_list.  During this window, we set
"already" and return 0.

Returning 0 from write(2) can confuse userspace tools because
0 indicates that nothing was written.  In particular, "rbd unmap"
will retry the write multiple times a second:

  10:28:05.463299 write(4, "0", 1)        = 0
  10:28:05.463509 write(4, "0", 1)        = 0
  10:28:05.463720 write(4, "0", 1)        = 0
  10:28:05.463942 write(4, "0", 1)        = 0
  10:28:05.464155 write(4, "0", 1)        = 0

Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Tested-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodrm/amdgpu: Don't fail resume process if resuming atomic state fails
Lyude Paul [Tue, 8 Jan 2019 21:11:28 +0000 (16:11 -0500)]
drm/amdgpu: Don't fail resume process if resuming atomic state fails

commit 2d1af6a11cb9d88e0e3dd10258904c437fe1b315 upstream.

This is an ugly one unfortunately. Currently, all DRM drivers supporting
atomic modesetting will save the state that userspace had set before
suspending, then attempt to restore that state on resume. This probably
worked very well at one point, like many other things, until DP MST came
into the picture. While it's easy to restore state on normal display
connectors that were disconnected during suspend regardless of their
state post-resume, this can't really be done with MST because of the
fact that setting up a downstream sink requires performing sideband
transactions between the source and the MST hub, sending out the ACT
packets, etc.

Because of this, there isn't really a guarantee that we can restore the
atomic state we had before suspend once we've resumed. This sucks pretty
bad, but so far I haven't run into any compositors that this actually
causes serious issues with. Most compositors will notice the hotplug we
send afterwards, and then reprobe state.

Since nouveau and i915 also don't fail the suspend/resume process due to
failing to restore the atomic state, let's make amdgpu match this
behavior. Better to resume the GPU properly, then to stop the process
half way because of a potentially unavoidable atomic commit failure.

Eventually, we'll have a real fix for this problem on the DRM level. But
we've got some more important low-hanging fruit to deal with first.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: <stable@vger.kernel.org> # v4.15+
Link: https://patchwork.freedesktop.org/patch/msgid/20190108211133.32564-3-lyude@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodrm/amdgpu: Don't ignore rc from drm_dp_mst_topology_mgr_resume()
Lyude Paul [Tue, 8 Jan 2019 21:11:27 +0000 (16:11 -0500)]
drm/amdgpu: Don't ignore rc from drm_dp_mst_topology_mgr_resume()

commit fe7553bef8d676d1d8b40666868b33ec39b9df5d upstream.

drm_dp_mst_topology_mgr_resume() returns whether or not it managed to
find the topology in question after a suspend resume cycle, and the
driver is supposed to check this value and disable MST accordingly if
it's gone-in addition to sending a hotplug in order to notify userspace
that something changed during suspend.

Currently, amdgpu just makes the mistake of ignoring the return code
from drm_dp_mst_topology_mgr_resume() which means that if a topology was
removed in suspend, amdgpu never notices and assumes it's still
connected which leads to all sorts of problems.

So, fix this by actually checking the rc from
drm_dp_mst_topology_mgr_resume(). Also, reformat the rest of the
function while we're at it to fix the over-indenting.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: <stable@vger.kernel.org> # v4.15+
Link: https://patchwork.freedesktop.org/patch/msgid/20190108211133.32564-2-lyude@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodrm/i915: Unwind failure on pinning the gen7 ppgtt
Chris Wilson [Sat, 22 Dec 2018 03:06:23 +0000 (03:06 +0000)]
drm/i915: Unwind failure on pinning the gen7 ppgtt

commit 280d479b310298dfeb1d6f9a1617eca37beb6ce4 upstream.

If we fail to pin the ggtt vma slot for the ppgtt page tables, we need
to unwind the locals before reporting the error. Or else on subsequent
attempts to bind the page tables into the ggtt, we will already believe
that the vma has been pinned and continue on blithely. If something else
should happen to be at that location, choas ensues.

Fixes: a2bbf7148342 ("drm/i915/gtt: Only keep gen6 page directories pinned while active")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: <stable@vger.kernel.org> # v4.19+
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181222030623.21710-1-chris@chris-wilson.co.uk
(cherry picked from commit d4de753526f4d99f541f1b6ed1d963005c09700c)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodrm/fb-helper: Partially bring back workaround for bugs of SDL 1.2
Ivan Mironov [Tue, 8 Jan 2019 07:23:52 +0000 (12:23 +0500)]
drm/fb-helper: Partially bring back workaround for bugs of SDL 1.2

commit 62d85b3bf9d978ed4b6b2aeef5cf0ccf1423906e upstream.

SDL 1.2 sets all fields related to the pixel format to zero in some
cases[1]. Prior to commit db05c48197759 ("drm: fb-helper: Reject all
pixel format changing requests"), there was an unintentional workaround
for this that existed for more than a decade. First in device-specific DRM
drivers, then here in drm_fb_helper.c.

Previous code containing this workaround just ignores pixel format fields
from userspace code. Not a good thing either, as this way, driver may
silently use pixel format different from what client actually requested,
and this in turn will lead to displaying garbage on the screen. I think
that returning EINVAL to userspace in this particular case is the right
option, so I decided to left code from problematic commit untouched
instead of just reverting it entirely.

Here is the steps required to reproduce this problem exactly:
1) Compile fceux[2] with SDL 1.2.15 and without GTK or OpenGL
   support. SDL should be compiled with fbdev support (which is
   on by default).
2) Create /etc/fb.modes with following contents (values seems
   not used, and just required to trigger problematic code in
   SDL):

mode "test"
    geometry 1 1 1 1 1
    timings 1 1 1 1 1 1 1
endmode

3) Create ~/.fceux/fceux.cfg with following contents:

SDL.Hotkeys.Quit = 27
SDL.DoubleBuffering = 1

4) Ensure that screen resolution is at least 1280x960 (e.g.
   append "video=Virtual-1:1280x960-32" to the kernel cmdline
   for qemu/QXL).

5) Try to run fceux on VT with some ROM file[3]:

# ./fceux color_test.nes

[1] SDL 1.2.15 source code, src/video/fbcon/SDL_fbvideo.c,
    FB_SetVideoMode()
[2] http://www.fceux.com
[3] Example ROM: https://github.com/bokuweb/rustynes/blob/master/roms/color_test.nes

Reported-by: saahriktu <mail@saahriktu.org>
Suggested-by: saahriktu <mail@saahriktu.org>
Cc: stable@vger.kernel.org
Fixes: db05c48197759 ("drm: fb-helper: Reject all pixel format changing requests")
Signed-off-by: Ivan Mironov <mironov.ivan@gmail.com>
[danvet: Delete misleading comment.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190108072353.28078-2-mironov.ivan@gmail.com
Link: https://patchwork.freedesktop.org/patch/msgid/20190108072353.28078-2-mironov.ivan@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodrm/fb_helper: Allow leaking fbdev smem_start
Neil Armstrong [Fri, 28 Sep 2018 12:05:55 +0000 (14:05 +0200)]
drm/fb_helper: Allow leaking fbdev smem_start

commit 4be9bd10e22dfc7fc101c5cf5969ef2d3a042d8a upstream.

Since "drm/fb: Stop leaking physical address", the default behaviour of
the DRM fbdev emulation is to set the smem_base to 0 and pass the new
FBINFO_HIDE_SMEM_START flag.

The main reason is to avoid leaking physical addresse to user-space, and
it follows a general move over the kernel code to avoid user-space to
manipulate physical addresses and then use some other mechanisms like
dma-buf to transfer physical buffer handles over multiple subsystems.

But, a lot of devices depends on closed sources binaries to enable
OpenGL hardware acceleration that uses this smem_start value to
pass physical addresses to out-of-tree modules in order to render
into these physical adresses. These should use dma-buf buffers allocated
from the DRM display device instead and stop relying on fbdev overallocation
to gather DMA memory (some HW vendors delivers GBM and Wayland capable
binaries, but older unsupported devices won't have these new binaries
and are doomed until an Open Source solution like Lima finalizes).

Since these devices heavily depends on this kind of software and because
the smem_start population was available for years, it's a breakage to
stop leaking smem_start without any alternative solutions.

This patch adds a Kconfig depending on the EXPERT config and an unsafe
kernel module parameter tainting the kernel when enabled.

A clear comment and Kconfig help text was added to clarify why and when
this patch should be reverted, but in the meantime it's a necessary
feature to keep.

Cc: Dave Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Noralf Trønnes <noralf@tronnes.org>
Cc: Maxime Ripard <maxime.ripard@bootlin.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Ben Skeggs <skeggsb@gmail.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Maxime Ripard <maxime.ripard@bootlin.com>
Tested-by: Maxime Ripard <maxime.ripard@bootlin.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1538136355-15383-1-git-send-email-narmstrong@baylibre.com
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodrm/amd/display: Fix MST dp_blank REG_WAIT timeout
Jerry (Fangzhi) Zuo [Mon, 17 Dec 2018 15:32:22 +0000 (10:32 -0500)]
drm/amd/display: Fix MST dp_blank REG_WAIT timeout

commit 8c9d90eebd23b6d40ddf4ce5df5ca2b932336a06 upstream.

Need to blank stream before deallocate MST payload.

[drm:generic_reg_wait [amdgpu]] *ERROR* REG_WAIT timeout 10us * 3000 tries - dce110_stream_encoder_dp_blank line:944
------------[ cut here ]------------
WARNING: CPU: 0 PID: 2201 at /var/lib/dkms/amdgpu/18.50-690240/build/amd/amdgpu/../display/dc/dc_helper.c:249 generic_reg_wait+0xe7/0x160 [amdgpu]
Call Trace:
 dce110_stream_encoder_dp_blank+0x11c/0x180 [amdgpu]
 core_link_disable_stream+0x40/0x230 [amdgpu]
 ? generic_reg_update_ex+0xdb/0x130 [amdgpu]
 dce110_reset_hw_ctx_wrap+0xb7/0x1f0 [amdgpu]
 dce110_apply_ctx_to_hw+0x30/0x430 [amdgpu]
 ? dce110_apply_ctx_for_surface+0x206/0x260 [amdgpu]
 dc_commit_state+0x2ba/0x4d0 [amdgpu]
 amdgpu_dm_atomic_commit_tail+0x297/0xd70 [amdgpu]
 ? amdgpu_bo_pin_restricted+0x58/0x260 [amdgpu]
 ? wait_for_completion_timeout+0x1f/0x120
 ? wait_for_completion_interruptible+0x1c/0x160
 commit_tail+0x3d/0x60 [drm_kms_helper]
 drm_atomic_helper_commit+0xf6/0x100 [drm_kms_helper]
 drm_atomic_connector_commit_dpms+0xe5/0xf0 [drm]
 drm_mode_obj_set_property_ioctl+0x14f/0x250 [drm]
 drm_mode_connector_property_set_ioctl+0x2e/0x40 [drm]
 drm_ioctl+0x1e0/0x430 [drm]
 ? drm_mode_connector_set_obj_prop+0x70/0x70 [drm]
 ? ep_read_events_proc+0xb0/0xb0
 ? ep_scan_ready_list.constprop.18+0x1e6/0x1f0
 ? timerqueue_add+0x52/0x80
 amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
 do_vfs_ioctl+0x90/0x5f0
 SyS_ioctl+0x74/0x80
 do_syscall_64+0x74/0x140
 entry_SYSCALL_64_after_hwframe+0x3d/0xa2
---[ end trace 3ed7b77a97d60f72 ]---

Signed-off-by: Jerry (Fangzhi) Zuo <Jerry.Zuo@amd.com>
Reviewed-by: Hersen Wu <hersenxs.wu@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoPCI: dwc: Move interrupt acking into the proper callback
Marc Zyngier [Tue, 13 Nov 2018 22:57:34 +0000 (22:57 +0000)]
PCI: dwc: Move interrupt acking into the proper callback

commit 3f7bb2ec20ce07c02b2002349d256c91a463fcc5 upstream.

The write to the status register is really an ACK for the HW,
and should be treated as such by the driver. Let's move it to the
irq_ack() callback, which will prevent people from moving it around
in order to paper over other bugs.

Fixes: 8c934095fa2f ("PCI: dwc: Clear MSI interrupt status after it is handled,
not before")
Fixes: 7c5925afbc58 ("PCI: dwc: Move MSI IRQs allocation to IRQ domains
hierarchical API")
Link: https://lore.kernel.org/linux-pci/20181113225734.8026-1-marc.zyngier@arm.com/
Reported-by: Trent Piepho <tpiepho@impinj.com>
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Tested-by: Stanimir Varbanov <svarbanov@mm-sol.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoPCI: dwc: Take lock when ACKing an interrupt
Marc Zyngier [Tue, 13 Nov 2018 22:57:33 +0000 (22:57 +0000)]
PCI: dwc: Take lock when ACKing an interrupt

commit fce5423e4f431c71933d6c1f850b540a314aa6ee upstream.

Bizarrely, there is no lock taken in the irq_ack() helper. This
puts the ACK callback provided by a specific platform in a awkward
situation where there is no synchronization that would be expected
on other callback.

Introduce the required lock, giving some level of uniformity among
callbacks.

Fixes: 7c5925afbc58 ("PCI: dwc: Move MSI IRQs allocation to IRQ domains
hierarchical API")
Link: https://lore.kernel.org/linux-pci/20181113225734.8026-1-marc.zyngier@arm.com/
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Tested-by: Stanimir Varbanov <svarbanov@mm-sol.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoPCI: dwc: Use interrupt masking instead of disabling
Marc Zyngier [Tue, 13 Nov 2018 22:57:32 +0000 (22:57 +0000)]
PCI: dwc: Use interrupt masking instead of disabling

commit 830920e065e90db318a0da98bf13a02b641eae7f upstream.

The dwc driver is showing an interesting level of brokeness, as it
insists on using the enable/disable set of registers to mask/unmask
MSIs, meaning that an MSIs being generated while the interrupt is in
that "disabled" state will simply be lost.

Let's move to the mask/unmask set of registers, which offers the
expected semantics.

Fixes: 7c5925afbc58 ("PCI: dwc: Move MSI IRQs allocation to IRQ domains
hierarchical API")
Link: https://lore.kernel.org/linux-pci/20181113225734.8026-1-marc.zyngier@arm.com/
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Tested-by: Stanimir Varbanov <svarbanov@mm-sol.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodrm/amdgpu: Add new VegaM pci id
Alex Deucher [Thu, 20 Dec 2018 15:08:46 +0000 (10:08 -0500)]
drm/amdgpu: Add new VegaM pci id

commit f6653a0e0877572c87f6dab5351e7bd6b6b7100c upstream.

Add a new pci id.

Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agovfio/type1: Fix unmap overflow off-by-one
Alex Williamson [Tue, 8 Jan 2019 05:13:22 +0000 (22:13 -0700)]
vfio/type1: Fix unmap overflow off-by-one

commit 58fec830fc19208354895d9832785505046d6c01 upstream.

The below referenced commit adds a test for integer overflow, but in
doing so prevents the unmap ioctl from ever including the last page of
the address space.  Subtract one to compare to the last address of the
unmap to avoid the overflow and wrap-around.

Fixes: 71a7d3d78e3c ("vfio/type1: silence integer overflow warning")
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1662291
Cc: stable@vger.kernel.org # v4.15+
Reported-by: Pei Zhang <pezhang@redhat.com>
Debugged-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agomtd: rawnand: qcom: fix memory corruption that causes panic
Christian Lamparter [Sun, 23 Dec 2018 00:31:26 +0000 (01:31 +0100)]
mtd: rawnand: qcom: fix memory corruption that causes panic

commit 81d9bdf59092e4755fc4307c93c4589ef0fe2e0f upstream.

This patch fixes a memory corruption that occurred in the
qcom-nandc driver since it was converted to nand_scan().

On boot, an affected device will panic from a NPE at a weird place:
| Unable to handle kernel NULL pointer dereference at virtual address 0
| pgd = (ptrval)
| [00000000] *pgd=00000000
| Internal error: Oops: 80000005 [#1] SMP ARM
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.9 #0
| Hardware name: Generic DT based system
| PC is at   (null)
| LR is at nand_block_isbad+0x90/0xa4
| pc : [<00000000>]    lr : [<c0592240>]    psr: 80000013
| sp : cf839d40  ip : 00000000  fp : cfae9e20
| r10: cf815810  r9 : 00000000  r8 : 00000000
| r7 : 00000000  r6 : 00000000  r5 : 00000001  r4 : cf815810
| r3 : 00000000  r2 : cfae9810  r1 : ffffffff  r0 : cf815810
| Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
| Control: 10c5387d  Table: 8020406a  DAC: 00000051
| Process swapper/0 (pid: 1, stack limit = 0x(ptrval))
| [<c0592240>] (nand_block_isbad) from [<c0580a94>]
| [<c0580a94>] (allocate_partition) from [<c05811e4>]
| [<c05811e4>] (add_mtd_partitions) from [<c0581164>]
| [<c0581164>] (parse_mtd_partitions) from [<c057def4>]
| [<c057def4>] (mtd_device_parse_register) from [<c059d274>]
| [<c059d274>] (qcom_nandc_probe) from [<c0567f00>]

The problem is that the nand_scan()'s qcom_nand_attach_chip callback
is updating the nandc->max_cwperpage from 1 to 4. This causes the
sg_init_table of clear_bam_transaction() in the driver's
qcom_nandc_block_bad() to memset much more than what was initially
allocated by alloc_bam_transaction().

This patch restores the old behavior by reallocating the shared bam
transaction alloc_bam_transaction() after the chip was identified,
but before mtd_device_parse_register() (which is an alias for
mtd_device_register() - see panic) gets called. This fixes the
corruption and the driver is working again.

Cc: stable@vger.kernel.org
Fixes: 6a3cec64f18c ("mtd: rawnand: qcom: convert driver to nand_scan()")
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Acked-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Boris Brezillon <bbrezillon@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoi2c: dev: prevent adapter retries and timeout being set as minus value
Yi Zeng [Wed, 9 Jan 2019 07:33:07 +0000 (15:33 +0800)]
i2c: dev: prevent adapter retries and timeout being set as minus value

commit 6ebec961d59bccf65d08b13fc1ad4e6272a89338 upstream.

If adapter->retries is set to a minus value from user space via ioctl,
it will make __i2c_transfer and __i2c_smbus_xfer skip the calling to
adapter->algo->master_xfer and adapter->algo->smbus_xfer that is
registered by the underlying bus drivers, and return value 0 to all the
callers. The bus driver will never be accessed anymore by all users,
besides, the users may still get successful return value without any
error or information log print out.

If adapter->timeout is set to minus value from user space via ioctl,
it will make the retrying loop in __i2c_transfer and __i2c_smbus_xfer
always break after the the first try, due to the time_after always
returns true.

Signed-off-by: Yi Zeng <yizeng@asrmicro.com>
[wsa: minor grammar updates to commit message]
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoACPI/IORT: Fix rc_dma_get_range()
Jean-Philippe Brucker [Thu, 10 Jan 2019 18:41:51 +0000 (18:41 +0000)]
ACPI/IORT: Fix rc_dma_get_range()

commit c7777236dd8f587f6a8d6800c03df318fd4d2627 upstream.

When executed for a PCI_ROOT_COMPLEX type, iort_match_node_callback()
expects the opaque pointer argument to be a PCI bus device. At the
moment rc_dma_get_range() passes the PCI endpoint instead of the bus,
and we've been lucky to have pci_domain_nr(ptr) return 0 instead of
crashing. Pass the bus device to iort_scan_node().

Fixes: 5ac65e8c8941 ("ACPI/IORT: Support address size limit for root complexes")
Reported-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Cc: stable@vger.kernel.org
Cc: Will Deacon <will.deacon@arm.com>
Cc: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoACPI / PMIC: xpower: Fix TS-pin current-source handling
Hans de Goede [Fri, 4 Jan 2019 22:10:54 +0000 (23:10 +0100)]
ACPI / PMIC: xpower: Fix TS-pin current-source handling

commit 2b531d71595d2b5b12782a49b23c335869e2621e upstream.

The current-source used for the battery temp-sensor (TS) is shared with the
GPADC. For proper fuel-gauge and charger operation the TS current-source
needs to be permanently on. But to read the GPADC we need to temporary
switch the TS current-source to ondemand, so that the GPADC can use it,
otherwise we will always read an all 0 value.

The switching from on to on-ondemand is not necessary when the TS
current-source is off (this happens on devices which do not have a TS).

Prior to this commit there were 2 issues with our handling of the TS
current-source switching:

 1) We were writing hardcoded values to the ADC TS pin-ctrl register,
 overwriting various other unrelated bits. Specifically we were overwriting
 the current-source setting for the TS and GPIO0 pins, forcing it to 80ųA
 independent of its original setting. On a Chuwi Vi10 tablet this was
 causing us to get a too high adc value (due to a too high current-source)
 resulting in acpi_lpat_raw_to_temp() returning -ENOENT, resulting in:

ACPI Error: AE_ERROR, Returned by Handler for [UserDefinedRegion]
ACPI Error: Method parse/execution failed \_SB.SXP1._TMP, AE_ERROR

This commit fixes this by using regmap_update_bits to change only the
relevant bits.

 2) At the end of intel_xpower_pmic_get_raw_temp() we were unconditionally
 enabling the TS current-source even on devices where the TS-pin is not used
 and the current-source thus was off on entry of the function.

This commit fixes this by checking if the TS current-source is off when
entering intel_xpower_pmic_get_raw_temp() and if so it is left as is.

Fixes: 58eefe2f3f53 (ACPI / PMIC: xpower: Do pinswitch ... reading GPADC)
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: 4.14+ <stable@vger.kernel.org> # 4.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoACPI: power: Skip duplicate power resource references in _PRx
Hans de Goede [Sun, 30 Dec 2018 17:25:00 +0000 (18:25 +0100)]
ACPI: power: Skip duplicate power resource references in _PRx

commit 7d7b467cb95bf29597b417d4990160d4ea6d69b9 upstream.

Some ACPI tables contain duplicate power resource references like this:

        Name (_PR0, Package (0x04)  // _PR0: Power Resources for D0
        {
            P28P,
            P18P,
            P18P,
            CLK4
        })

This causes a WARN_ON in sysfs_add_link_to_group() because we end up
adding a link to the same acpi_device twice:

sysfs: cannot create duplicate filename '/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/808622C1:00/OVTI2680:00/power_resources_D0/LNXPOWER:0a'
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.12-301.fc29.x86_64 #1
Hardware name: Insyde CherryTrail/Type2 - Board Product Name, BIOS jumperx.T87.KFBNEEA02 04/13/2016
Call Trace:
 dump_stack+0x5c/0x80
 sysfs_warn_dup.cold.3+0x17/0x2a
 sysfs_do_create_link_sd.isra.2+0xa9/0xb0
 sysfs_add_link_to_group+0x30/0x50
 acpi_power_expose_list+0x74/0xa0
 acpi_power_add_remove_device+0x50/0xa0
 acpi_add_single_object+0x26b/0x5f0
 acpi_bus_check_add+0xc4/0x250
 ...

To address this issue, make acpi_extract_power_resources() check for
duplicates and simply skip them when found.

Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
[ rjw: Subject & changelog, comments ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agomm, memcg: fix reclaim deadlock with writeback
Michal Hocko [Tue, 8 Jan 2019 23:23:07 +0000 (15:23 -0800)]
mm, memcg: fix reclaim deadlock with writeback

commit 63f3655f950186752236bb88a22f8252c11ce394 upstream.

Liu Bo has experienced a deadlock between memcg (legacy) reclaim and the
ext4 writeback

  task1:
    wait_on_page_bit+0x82/0xa0
    shrink_page_list+0x907/0x960
    shrink_inactive_list+0x2c7/0x680
    shrink_node_memcg+0x404/0x830
    shrink_node+0xd8/0x300
    do_try_to_free_pages+0x10d/0x330
    try_to_free_mem_cgroup_pages+0xd5/0x1b0
    try_charge+0x14d/0x720
    memcg_kmem_charge_memcg+0x3c/0xa0
    memcg_kmem_charge+0x7e/0xd0
    __alloc_pages_nodemask+0x178/0x260
    alloc_pages_current+0x95/0x140
    pte_alloc_one+0x17/0x40
    __pte_alloc+0x1e/0x110
    alloc_set_pte+0x5fe/0xc20
    do_fault+0x103/0x970
    handle_mm_fault+0x61e/0xd10
    __do_page_fault+0x252/0x4d0
    do_page_fault+0x30/0x80
    page_fault+0x28/0x30

  task2:
    __lock_page+0x86/0xa0
    mpage_prepare_extent_to_map+0x2e7/0x310 [ext4]
    ext4_writepages+0x479/0xd60
    do_writepages+0x1e/0x30
    __writeback_single_inode+0x45/0x320
    writeback_sb_inodes+0x272/0x600
    __writeback_inodes_wb+0x92/0xc0
    wb_writeback+0x268/0x300
    wb_workfn+0xb4/0x390
    process_one_work+0x189/0x420
    worker_thread+0x4e/0x4b0
    kthread+0xe6/0x100
    ret_from_fork+0x41/0x50

He adds
 "task1 is waiting for the PageWriteback bit of the page that task2 has
  collected in mpd->io_submit->io_bio, and tasks2 is waiting for the
  LOCKED bit the page which tasks1 has locked"

More precisely task1 is handling a page fault and it has a page locked
while it charges a new page table to a memcg.  That in turn hits a
memory limit reclaim and the memcg reclaim for legacy controller is
waiting on the writeback but that is never going to finish because the
writeback itself is waiting for the page locked in the #PF path.  So
this is essentially ABBA deadlock:

                                        lock_page(A)
                                        SetPageWriteback(A)
                                        unlock_page(A)
  lock_page(B)
                                        lock_page(B)
  pte_alloc_pne
    shrink_page_list
      wait_on_page_writeback(A)
                                        SetPageWriteback(B)
                                        unlock_page(B)

                                        # flush A, B to clear the writeback

This accumulating of more pages to flush is used by several filesystems
to generate a more optimal IO patterns.

Waiting for the writeback in legacy memcg controller is a workaround for
pre-mature OOM killer invocations because there is no dirty IO
throttling available for the controller.  There is no easy way around
that unfortunately.  Therefore fix this specific issue by pre-allocating
the page table outside of the page lock.  We have that handy
infrastructure for that already so simply reuse the fault-around pattern
which already does this.

There are probably other hidden __GFP_ACCOUNT | GFP_KERNEL allocations
from under a fs page locked but they should be really rare.  I am not
aware of a better solution unfortunately.

[akpm@linux-foundation.org: fix mm/memory.c:__do_fault()]
[akpm@linux-foundation.org: coding-style fixes]
[mhocko@kernel.org: enhance comment, per Johannes]
Link: http://lkml.kernel.org/r/20181214084948.GA5624@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/20181213092221.27270-1-mhocko@kernel.org
Fixes: c3b94f44fcb0 ("memcg: further prevent OOM with too many dirty pages")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Liu Bo <bo.liu@linux.alibaba.com>
Debugged-by: Liu Bo <bo.liu@linux.alibaba.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agomm/usercopy.c: no check page span for stack objects
Qian Cai [Tue, 8 Jan 2019 23:23:04 +0000 (15:23 -0800)]
mm/usercopy.c: no check page span for stack objects

commit 7bff3c06997374fb9b9991536a547b840549a813 upstream.

It is easy to trigger this with CONFIG_HARDENED_USERCOPY_PAGESPAN=y,

  usercopy: Kernel memory overwrite attempt detected to spans multiple pages (offset 0, size 23)!
  kernel BUG at mm/usercopy.c:102!

For example,

print_worker_info
char name[WQ_NAME_LEN] = { };
char desc[WORKER_DESC_LEN] = { };
  probe_kernel_read(name, wq->name, sizeof(name) - 1);
  probe_kernel_read(desc, worker->desc, sizeof(desc) - 1);
    __copy_from_user_inatomic
      check_object_size
        check_heap_object
          check_page_span

This is because on-stack variables could cross PAGE_SIZE boundary, and
failed this check,

if (likely(((unsigned long)ptr & (unsigned long)PAGE_MASK) ==
   ((unsigned long)end & (unsigned long)PAGE_MASK)))

ptr = FFFF889007D7EFF8
end = FFFF889007D7F00E

Hence, fix it by checking if it is a stack object first.

[keescook@chromium.org: improve comments after reorder]
Link: http://lkml.kernel.org/r/20190103165151.GA32845@beast
Link: http://lkml.kernel.org/r/20181231030254.99441-1-cai@lca.pw
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoslab: alien caches must not be initialized if the allocation of the alien cache failed
Christoph Lameter [Tue, 8 Jan 2019 23:23:00 +0000 (15:23 -0800)]
slab: alien caches must not be initialized if the allocation of the alien cache failed

commit 09c2e76ed734a1d36470d257a778aaba28e86531 upstream.

Callers of __alloc_alien() check for NULL.  We must do the same check in
__alloc_alien_cache to avoid NULL pointer dereferences on allocation
failures.

Link: http://lkml.kernel.org/r/010001680f42f192-82b4e12e-1565-4ee0-ae1f-1e98974906aa-000000@email.amazonses.com
Fixes: 49dfc304ba241 ("slab: use the lock on alien_cache, instead of the lock on array_cache")
Fixes: c8522a3a5832b ("Slab: introduce alloc_alien")
Signed-off-by: Christoph Lameter <cl@linux.com>
Reported-by: syzbot+d6ed4ec679652b4fd4e4@syzkaller.appspotmail.com
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoUSB: Add USB_QUIRK_DELAY_CTRL_MSG quirk for Corsair K70 RGB
Jack Stocker [Thu, 3 Jan 2019 21:56:53 +0000 (21:56 +0000)]
USB: Add USB_QUIRK_DELAY_CTRL_MSG quirk for Corsair K70 RGB

commit 3483254b89438e60f719937376c5e0ce2bc46761 upstream.

To match the Corsair Strafe RGB, the Corsair K70 RGB also requires
USB_QUIRK_DELAY_CTRL_MSG to completely resolve boot connection issues
discussed here: https://github.com/ckb-next/ckb-next/issues/42.
Otherwise roughly 1 in 10 boots the keyboard will fail to be detected.

Patch that applied delay control quirk for Corsair Strafe RGB:
cb88a0588717 ("usb: quirks: add control message delay for 1b1c:1b20")

Previous K70 RGB patch to add delay-init quirk:
7a1646d92257 ("Add delay-init quirk for Corsair K70 RGB keyboards")

Signed-off-by: Jack Stocker <jackstocker.93@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoUSB: storage: add quirk for SMI SM3350
Icenowy Zheng [Thu, 3 Jan 2019 03:26:18 +0000 (11:26 +0800)]
USB: storage: add quirk for SMI SM3350

commit 0a99cc4b8ee83885ab9f097a3737d1ab28455ac0 upstream.

The SMI SM3350 USB-UFS bridge controller cannot handle long sense request
correctly and will make the chip refuse to do read/write when requested
long sense.

Add a bad sense quirk for it.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Cc: stable <stable@vger.kernel.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoUSB: storage: don't insert sane sense for SPC3+ when bad sense specified
Icenowy Zheng [Thu, 3 Jan 2019 03:26:17 +0000 (11:26 +0800)]
USB: storage: don't insert sane sense for SPC3+ when bad sense specified

commit c5603d2fdb424849360fe7e3f8c1befc97571b8c upstream.

Currently the code will set US_FL_SANE_SENSE flag unconditionally if
device claims SPC3+, however we should allow US_FL_BAD_SENSE flag to
prevent this behavior, because SMI SM3350 UFS-USB bridge controller,
which claims SPC4, will show strange behavior with 96-byte sense
(put the chip into a wrong state that cannot read/write anything).

Check the presence of US_FL_BAD_SENSE when assuming US_FL_SANE_SENSE on
SPC4+ devices.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Cc: stable <stable@vger.kernel.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agousb: cdc-acm: send ZLP for Telit 3G Intel based modems
Daniele Palmas [Fri, 28 Dec 2018 15:15:41 +0000 (16:15 +0100)]
usb: cdc-acm: send ZLP for Telit 3G Intel based modems

commit 34aabf918717dd14e05051896aaecd3b16b53d95 upstream.

Telit 3G Intel based modems require zero packet to be sent if
out data size is equal to the endpoint max packet size.

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocifs: Fix potential OOB access of lock element array
Ross Lagerwall [Tue, 8 Jan 2019 18:30:57 +0000 (18:30 +0000)]
cifs: Fix potential OOB access of lock element array

commit b9a74cde94957d82003fb9f7ab4777938ca851cd upstream.

If maxBuf is small but non-zero, it could result in a zero sized lock
element array which we would then try and access OOB.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoCIFS: Fix credit computation for compounded requests
Pavel Shilovsky [Sat, 22 Dec 2018 20:40:05 +0000 (12:40 -0800)]
CIFS: Fix credit computation for compounded requests

commit 8544f4aa9dd19a04d1244dae10feecc813ccf175 upstream.

In SMB3 protocol every part of the compound chain consumes credits
individually, so we need to call wait_for_free_credits() for each
of the PDUs in the chain. If an operation is interrupted, we must
ensure we return all credits taken from the server structure back.

Without this patch server can sometimes disconnect the session
due to credit mismatches, especially when first operation(s)
are large writes.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoCIFS: Do not hide EINTR after sending network packets
Pavel Shilovsky [Thu, 10 Jan 2019 19:27:28 +0000 (11:27 -0800)]
CIFS: Do not hide EINTR after sending network packets

commit ee13919c2e8d1f904e035ad4b4239029a8994131 upstream.

Currently we hide EINTR code returned from sock_sendmsg()
and return 0 instead. This makes a caller think that we
successfully completed the network operation which is not
true. Fix this by properly returning EINTR to callers.

Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoCIFS: Do not set credits to 1 if the server didn't grant anything
Pavel Shilovsky [Fri, 4 Jan 2019 00:45:13 +0000 (16:45 -0800)]
CIFS: Do not set credits to 1 if the server didn't grant anything

commit 33fa5c8b8a7dbe6353a56eaa654b790348890d42 upstream.

Currently we reset the number of total credits granted by the server
to 1 if the server didn't grant us anything int the response. This
violates the SMB3 protocol - we need to trust the server and use
the credit values from the response. Fix this by removing the
corresponding code.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoCIFS: Fix adjustment of credits for MTU requests
Pavel Shilovsky [Wed, 19 Dec 2018 22:49:09 +0000 (22:49 +0000)]
CIFS: Fix adjustment of credits for MTU requests

commit b983f7e92348d7e7d091db1b78b7915e9dd3d63a upstream.

Currently for MTU requests we allocate maximum possible credits
in advance and then adjust them according to the request size.
While we were adjusting the number of credits belonging to the
server, we were skipping adjustment of credits belonging to the
request. This patch fixes it by setting request credits to
CreditCharge field value of SMB2 packet header.

Also ask 1 credit more for async read and write operations to
increase parallelism and match the behavior of other operations.

Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoALSA: hda/realtek - Disable headset Mic VREF for headset mode of ALC225
Kailang Yang [Wed, 9 Jan 2019 09:05:24 +0000 (17:05 +0800)]
ALSA: hda/realtek - Disable headset Mic VREF for headset mode of ALC225

commit d1dd42110d2727e81b9265841a62bc84c454c3a2 upstream.

Disable Headset Mic VREF for headset mode of ALC225.
This will be controlled by coef bits of headset mode functions.

[ Fixed a compile warning and code simplification -- tiwai ]

Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoALSA: hda/realtek - Add unplug function into unplug state of Headset Mode for ALC225
Kailang Yang [Wed, 9 Jan 2019 08:23:37 +0000 (16:23 +0800)]
ALSA: hda/realtek - Add unplug function into unplug state of Headset Mode for ALC225

commit 4d4b0c52bde470c379f5d168d5c139ad866cb808 upstream.

Forgot to add unplug function to unplug state of headset mode
for ALC225.

Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoALSA: hda/realtek - Support Dell headset mode for New AIO platform
Kailang Yang [Thu, 3 Jan 2019 07:53:39 +0000 (15:53 +0800)]
ALSA: hda/realtek - Support Dell headset mode for New AIO platform

commit c2a7c55a04065c3b0c32d23b099db7ea1dbf6250 upstream.

Dell has new platform for ALC274.
This will support to enable headset mode.

Signed-off-by: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86, modpost: Replace last remnants of RETPOLINE with CONFIG_RETPOLINE
WANG Chao [Mon, 10 Dec 2018 16:37:25 +0000 (00:37 +0800)]
x86, modpost: Replace last remnants of RETPOLINE with CONFIG_RETPOLINE

commit e4f358916d528d479c3c12bd2fd03f2d5a576380 upstream.

Commit

  4cd24de3a098 ("x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support")

replaced the RETPOLINE define with CONFIG_RETPOLINE checks. Remove the
remaining pieces.

 [ bp: Massage commit message. ]

Fixes: 4cd24de3a098 ("x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support")
Signed-off-by: WANG Chao <chao.wang@ucloud.cn>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Kees Cook <keescook@chromium.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Michal Marek <michal.lkml@markovi.net>
Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: linux-kbuild@vger.kernel.org
Cc: srinivas.eeda@oracle.com
Cc: stable <stable@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20181210163725.95977-1-chao.wang@ucloud.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocpufreq: scmi: Fix frequency invariance in slow path
Quentin Perret [Wed, 9 Jan 2019 10:42:36 +0000 (10:42 +0000)]
cpufreq: scmi: Fix frequency invariance in slow path

commit 0e141d1c65c1dd31c914eb2e11651adcc1a15912 upstream.

The scmi-cpufreq driver calls the arch_set_freq_scale() callback on
frequency changes to provide scale-invariant load-tracking signals to
the scheduler. However, in the slow path, it does so while specifying
the current and max frequencies in different units, hence resulting in a
broken freq_scale factor.

Fix this by passing all frequencies in KHz, as stored in the CPUFreq
frequency table.

Fixes: 99d6bdf33877 (cpufreq: add support for CPU DVFS based on SCMI message protocol)
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: 4.17+ <stable@vger.kernel.org> # v4.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agostaging: rtl8188eu: Fix module loading from tasklet for WEP encryption
Larry Finger [Thu, 3 Jan 2019 02:12:47 +0000 (20:12 -0600)]
staging: rtl8188eu: Fix module loading from tasklet for WEP encryption

commit 7775665aadc48a562051834a73519129bf717d73 upstream.

Commit 2b2ea09e74a5 ("staging:r8188eu: Use lib80211 to decrypt WEP-frames")
causes scheduling while atomic bugs followed by a hard freeze whenever
the driver tries to connect to a WEP-encrypted network. Experimentation
showed that the freezes were eliminated when module lib80211 was
preloaded, which can be forced by calling lib80211_get_crypto_ops()
directly rather than indirectly through try_then_request_module().
With this change, no BUG messages are logged.

Fixes: 2b2ea09e74a5 ("staging:r8188eu: Use lib80211 to decrypt WEP-frames")
Cc: Stable <stable@vger.kernel.org> # v4.17+
Cc: Michael Straube <straube.linux@gmail.com>
Cc: Ivan Safonov <insafonov@gmail.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agostaging: rtl8188eu: Fix module loading from tasklet for CCMP encryption
Larry Finger [Thu, 3 Jan 2019 02:12:46 +0000 (20:12 -0600)]
staging: rtl8188eu: Fix module loading from tasklet for CCMP encryption

commit 84cad97a717f5749a0236abd5ce68da582ea074f upstream.

Commit 6bd082af7e36 ("staging:r8188eu: use lib80211 CCMP decrypt")
causes scheduling while atomic bugs followed by a hard freeze whenever
the driver tries to connect to a CCMP-encrypted network. Experimentation
showed that the freezes were eliminated when module lib80211 was
preloaded, which can be forced by calling lib80211_get_crypto_ops()
directly rather than indirectly through try_then_request_module().
With this change, no BUG messages are logged.

Fixes: 6bd082af7e36 ("staging:r8188eu: use lib80211 CCMP decrypt")
Cc: Stable <stable@vger.kernel.org> # v4.17+
Reported-and-tested-by: Michael Straube <straube.linux@gmail.com>
Cc: Ivan Safonov <insafonov@gmail.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoBtrfs: fix deadlock when using free space tree due to block group creation
Filipe Manana [Tue, 8 Jan 2019 11:44:41 +0000 (11:44 +0000)]
Btrfs: fix deadlock when using free space tree due to block group creation

commit a6d8654d885d7d79a3fb82da64eaa489ca332a82 upstream.

When modifying the free space tree we can end up COWing one of its extent
buffers which in turn might result in allocating a new chunk, which in
turn can result in flushing (finish creation) of pending block groups. If
that happens we can deadlock because creating a pending block group needs
to update the free space tree, and if any of the updates tries to modify
the same extent buffer that we are COWing, we end up in a deadlock since
we try to write lock twice the same extent buffer.

So fix this by skipping pending block group creation if we are COWing an
extent buffer from the free space tree. This is a case missed by commit
5ce555578e091 ("Btrfs: fix deadlock when writing out free space caches").

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202173
Fixes: 5ce555578e091 ("Btrfs: fix deadlock when writing out free space caches")
CC: stable@vger.kernel.org # 4.18+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoLinux 4.19.15 v4.19.15
Greg Kroah-Hartman [Sun, 13 Jan 2019 08:51:11 +0000 (09:51 +0100)]
Linux 4.19.15

5 years agobnx2x: Fix NULL pointer dereference in bnx2x_del_all_vlans() on some hw
Ivan Mironov [Mon, 24 Dec 2018 15:13:05 +0000 (20:13 +0500)]
bnx2x: Fix NULL pointer dereference in bnx2x_del_all_vlans() on some hw

commit 38355a5f9a22bfa5bd5b1bb79805aca39fa53729 upstream.

This happened when I tried to boot normal Fedora 29 system with latest
available kernel (from fedora rawhide, plus some unrelated custom
patches):

BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 0 P4D 0
Oops: 0010 [#1] SMP PTI
CPU: 6 PID: 1422 Comm: libvirtd Tainted: G          I       4.20.0-0.rc7.git3.hpsa2.1.fc29.x86_64 #1
Hardware name: HP ProLiant BL460c G6, BIOS I24 05/21/2018
RIP: 0010:          (null)
Code: Bad RIP value.
RSP: 0018:ffffa47ccdc9fbe0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 00000000000003e8 RCX: ffffa47ccdc9fbf8
RDX: ffffa47ccdc9fc00 RSI: ffff97d9ee7b01f8 RDI: ffff97d9f0150b80
RBP: ffff97d9f0150b80 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000003
R13: ffff97d9ef1e53e8 R14: 0000000000000009 R15: ffff97d9f0ac6730
FS:  00007f4d224ef700(0000) GS:ffff97d9fa200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 00000011ece52006 CR4: 00000000000206e0
Call Trace:
 ? bnx2x_chip_cleanup+0x195/0x610 [bnx2x]
 ? bnx2x_nic_unload+0x1e2/0x8f0 [bnx2x]
 ? bnx2x_reload_if_running+0x24/0x40 [bnx2x]
 ? bnx2x_set_features+0x79/0xa0 [bnx2x]
 ? __netdev_update_features+0x244/0x9e0
 ? netlink_broadcast_filtered+0x136/0x4b0
 ? netdev_update_features+0x22/0x60
 ? dev_disable_lro+0x1c/0xe0
 ? devinet_sysctl_forward+0x1c6/0x211
 ? proc_sys_call_handler+0xab/0x100
 ? __vfs_write+0x36/0x1a0
 ? rcu_read_lock_sched_held+0x79/0x80
 ? rcu_sync_lockdep_assert+0x2e/0x60
 ? __sb_start_write+0x14c/0x1b0
 ? vfs_write+0x159/0x1c0
 ? vfs_write+0xba/0x1c0
 ? ksys_write+0x52/0xc0
 ? do_syscall_64+0x60/0x1f0
 ? entry_SYSCALL_64_after_hwframe+0x49/0xbe

After some investigation I figured out that recently added cleanup code
tries to call VLAN filtering de-initialization function which exist only
for newer hardware. Corresponding function pointer is not
set (== 0) for older hardware, namely these chips:

#define CHIP_NUM_57710 0x164e
#define CHIP_NUM_57711 0x164f
#define CHIP_NUM_57711E 0x1650

And I have one of those in my test system:

Broadcom Inc. and subsidiaries NetXtreme II BCM57711E 10-Gigabit PCIe [14e4:1650]

Function bnx2x_init_vlan_mac_fp_objs() from
drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.h decides whether to
initialize relevant pointers in bnx2x_sp_objs.vlan_obj or not.

This regression was introduced after v4.20-rc7, and still exists in v4.20
release.

Fixes: 04f05230c5c13 ("bnx2x: Remove configured vlans as part of unload sequence.")
Signed-off-by: Ivan Mironov <mironov.ivan@gmail.com>
Signed-off-by: Ivan Mironov <mironov.ivan@gmail.com>
Acked-by: Sudarsana Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodrm/amd/display: Fix unintialized max_bpc state values
Nicholas Kazlauskas [Wed, 28 Nov 2018 21:17:50 +0000 (16:17 -0500)]
drm/amd/display: Fix unintialized max_bpc state values

commit 49f1c44b581b08e3289127ffe58bd208c3166701 upstream.

[Why]
If the "max bpc" isn't explicitly set in the atomic state then it
have a value of 0. This has the correct behavior of limiting a panel
to 8bpc in the case where the panel supports 8bpc. In the case of eDP
panels this isn't a true assumption - there are panels that can only
do 6bpc.

Banding occurs for these displays.

[How]
Initialize the max_bpc when the connector resets to 8bpc. Also carry
over the value when the state is duplicated.

Bugzilla: https://bugs.freedesktop.org/108825
Fixes: 307638884f72 ("drm/amd/display: Support amdgpu "max bpc" connector property")
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodrm/rockchip: psr: do not dereference encoder before it is null checked.
Enric Balletbo i Serra [Sat, 13 Oct 2018 10:56:54 +0000 (12:56 +0200)]
drm/rockchip: psr: do not dereference encoder before it is null checked.

commit 4eda776c3cefcb1f01b2d85bd8753f67606282b5 upstream.

'encoder' is dereferenced before it is null sanity checked, hence we
potentially have a null pointer dereference bug. Instead, initialise
drm_drv from encoder->dev->dev_private after we are sure 'encoder' is
not null.

Fixes: 5182c1a556d7f ("drm/rockchip: add an common abstracted PSR driver")
Cc: stable@vger.kernel.org
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20181013105654.11827-1-enric.balletbo@collabora.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodrm/vc4: Set ->is_yuv to false when num_planes == 1
Boris Brezillon [Tue, 9 Oct 2018 13:24:46 +0000 (15:24 +0200)]
drm/vc4: Set ->is_yuv to false when num_planes == 1

commit 2b02a05bdc3a62d36e0d0b015351897109e25991 upstream.

When vc4_plane_state is duplicated ->is_yuv is left assigned to its
previous value, and we never set it back to false when switching to
a non-YUV format.

Fix that by setting ->is_yuv to false in the 'num_planes == 1' branch
of the vc4_plane_setup_clipping_and_scaling() function.

Fixes: fc04023fafecf ("drm/vc4: Add support for YUV planes.")
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20181009132446.21960-1-boris.brezillon@bootlin.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodrm/nouveau/drm/nouveau: Check rc from drm_dp_mst_topology_mgr_resume()
Lyude Paul [Thu, 15 Nov 2018 01:39:51 +0000 (20:39 -0500)]
drm/nouveau/drm/nouveau: Check rc from drm_dp_mst_topology_mgr_resume()

commit b89fdf7ae8500feae1100d8b283176a44d31d698 upstream.

We need to actually make sure we check this on resume since otherwise we
won't know whether or not the topology is still there once we've
resumed, which will cause us to still think the topology is connected
even after it's been removed if the removal happens mid-suspend.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agolib: fix build failure in CONFIG_DEBUG_VIRTUAL test
Christophe Leroy [Mon, 10 Dec 2018 08:08:28 +0000 (08:08 +0000)]
lib: fix build failure in CONFIG_DEBUG_VIRTUAL test

commit 10fdf838e5f540beca466e9d1325999c072e5d3f upstream.

On several arches, virt_to_phys() is in io.h

Build fails without it:

  CC      lib/test_debug_virtual.o
lib/test_debug_virtual.c: In function 'test_debug_virtual_init':
lib/test_debug_virtual.c:26:7: error: implicit declaration of function 'virt_to_phys' [-Werror=implicit-function-declaration]
  pa = virt_to_phys(va);
       ^

Fixes: e4dace361552 ("lib: add test module for CONFIG_DEBUG_VIRTUAL")
CC: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoof: __of_detach_node() - remove node from phandle cache
Frank Rowand [Tue, 18 Dec 2018 19:40:03 +0000 (11:40 -0800)]
of: __of_detach_node() - remove node from phandle cache

commit 5801169a2ed20003f771acecf3ac00574cf10a38 upstream.

Non-overlay dynamic devicetree node removal may leave the node in
the phandle cache.  Subsequent calls to of_find_node_by_phandle()
will incorrectly find the stale entry.  Remove the node from the
cache.

Add paranoia checks in of_find_node_by_phandle() as a second level
of defense (do not return cached node if detached, do not add node
to cache if detached).

Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
Reported-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoof: of_node_get()/of_node_put() nodes held in phandle cache
Frank Rowand [Tue, 18 Dec 2018 19:40:02 +0000 (11:40 -0800)]
of: of_node_get()/of_node_put() nodes held in phandle cache

commit b8a9ac1a5b99a2fcbed19fd29d2d59270c281a31 upstream.

The phandle cache contains struct device_node pointers.  The refcount
of the pointers was not incremented while in the cache, allowing use
after free error after kfree() of the node.  Add the proper increment
and decrement of the use count.

Fixes: 0b3ce78e90fc ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agopower: supply: olpc_battery: correct the temperature units
Lubomir Rintel [Fri, 16 Nov 2018 16:23:47 +0000 (17:23 +0100)]
power: supply: olpc_battery: correct the temperature units

commit ed54ffbe554f0902689fd6d1712bbacbacd11376 upstream.

According to [1] and [2], the temperature values are in tenths of degree
Celsius. Exposing the Celsius value makes the battery appear on fire:

  $ upower -i /org/freedesktop/UPower/devices/battery_olpc_battery
  ...
      temperature:         236.9 degrees C

Tested on OLPC XO-1 and OLPC XO-1.75 laptops.

[1] include/linux/power_supply.h
[2] Documentation/power/power_supply_class.txt

Fixes: fb972873a767 ("[BATTERY] One Laptop Per Child power/battery driver")
Cc: stable@vger.kernel.org
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agointel_th: msu: Fix an off-by-one in attribute store
Alexander Shishkin [Wed, 19 Dec 2018 15:19:22 +0000 (17:19 +0200)]
intel_th: msu: Fix an off-by-one in attribute store

commit ec5b5ad6e272d8d6b92d1007f79574919862a2d2 upstream.

The 'nr_pages' attribute of the 'msc' subdevices parses a comma-separated
list of window sizes, passed from userspace. However, there is a bug in
the string parsing logic wherein it doesn't exclude the comma character
from the range of characters as it consumes them. This leads to an
out-of-bounds access given a sufficiently long list. For example:

> # echo 8,8,8,8 > /sys/bus/intel_th/devices/0-msc0/nr_pages
> ==================================================================
> BUG: KASAN: slab-out-of-bounds in memchr+0x1e/0x40
> Read of size 1 at addr ffff8803ffcebcd1 by task sh/825
>
> CPU: 3 PID: 825 Comm: npktest.sh Tainted: G        W         4.20.0-rc1+
> Call Trace:
>  dump_stack+0x7c/0xc0
>  print_address_description+0x6c/0x23c
>  ? memchr+0x1e/0x40
>  kasan_report.cold.5+0x241/0x308
>  memchr+0x1e/0x40
>  nr_pages_store+0x203/0xd00 [intel_th_msu]

Fix this by accounting for the comma character.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Fixes: ba82664c134ef ("intel_th: Add Memory Storage Unit driver")
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agogenwqe: Fix size check
Christian Borntraeger [Wed, 12 Dec 2018 13:45:18 +0000 (14:45 +0100)]
genwqe: Fix size check

commit fdd669684655c07dacbdb0d753fd13833de69a33 upstream.

Calling the test program genwqe_cksum with the default buffer size of
2MB triggers the following kernel warning on s390:

WARNING: CPU: 30 PID: 9311 at mm/page_alloc.c:3189 __alloc_pages_nodemask+0x45c/0xbe0
CPU: 30 PID: 9311 Comm: genwqe_cksum Kdump: loaded Not tainted 3.10.0-957.el7.s390x #1
task: 00000005e5d13980 ti: 00000005e7c6c000 task.ti: 00000005e7c6c000
Krnl PSW : 0704c00180000000 00000000002780ac (__alloc_pages_nodemask+0x45c/0xbe0)
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3
Krnl GPRS: 00000000002932b8 0000000000b73d7c 0000000000000010 0000000000000009
           0000000000000041 00000005e7c6f9b8 0000000000000001 00000000000080d0
           0000000000000000 0000000000b70500 0000000000000001 0000000000000000
           0000000000b70528 00000000007682c0 0000000000277df2 00000005e7c6f9a0
Krnl Code: 000000000027809ede7195001000 ed 1280(114,%r9),0(%r1)
   00000000002780a4a774fead brc 7,277dfe
  #00000000002780a8a7f40001 brc 15,2780aa
  >00000000002780ac92011000 mvi 0(%r1),1
   00000000002780b0a7f4fea7 brc 15,277dfe
   00000000002780b49101c6b6 tm 1718(%r12),1
   00000000002780b8a784ff3a brc 8,277f2c
   00000000002780bca7f4fe2e brc 15,277d18
Call Trace:
([<0000000000277df2>] __alloc_pages_nodemask+0x1a2/0xbe0)
 [<000000000013afae>] s390_dma_alloc+0xfe/0x310
 [<000003ff8065f362>] __genwqe_alloc_consistent+0xfa/0x148 [genwqe_card]
 [<000003ff80658f7a>] genwqe_mmap+0xca/0x248 [genwqe_card]
 [<00000000002b2712>] mmap_region+0x4e2/0x778
 [<00000000002b2c54>] do_mmap+0x2ac/0x3e0
 [<0000000000292d7e>] vm_mmap_pgoff+0xd6/0x118
 [<00000000002b081c>] SyS_mmap_pgoff+0xdc/0x268
 [<00000000002b0a34>] SyS_old_mmap+0x8c/0xb0
 [<000000000074e518>] sysc_tracego+0x14/0x1e
 [<000003ffacf87dc6>] 0x3ffacf87dc6

turns out the check in __genwqe_alloc_consistent uses "> MAX_ORDER"
while the mm code uses ">= MAX_ORDER". Fix genwqe.

Cc: stable@vger.kernel.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Frank Haverkamp <haver@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodrivers/perf: hisi: Fixup one DDRC PMU register offset
Shaokun Zhang [Fri, 4 Jan 2019 06:21:34 +0000 (14:21 +0800)]
drivers/perf: hisi: Fixup one DDRC PMU register offset

commit eb4f5213251833567570df1a09803f895653274d upstream.

For DDRC PMU, each PMU counter is fixed-purpose. There is a mismatch
between perf list and driver definition on rw_chg event.
# perf list | grep chg
  hisi_sccl1_ddrc0/rnk_chg/                          [Kernel PMU event]
  hisi_sccl1_ddrc0/rw_chg/                           [Kernel PMU event]
But the register offset of rw_chg event is not defined in the driver,
meanwhile bnk_chg register offset is mis-defined, let's fixup it.

Fixes: 904dcf03f086 ("perf: hisi: Add support for HiSilicon SoC DDRC PMU driver")
Cc: stable@vger.kernel.org
Cc: John Garry <john.garry@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Reported-by: Weijian Huang <huangweijian4@hisilicon.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agovideo: fbdev: pxafb: Fix "WARNING: invalid free of devm_ allocated data"
YueHaibing [Thu, 20 Dec 2018 18:13:08 +0000 (19:13 +0100)]
video: fbdev: pxafb: Fix "WARNING: invalid free of devm_ allocated data"

commit 2607391882fca37463187e7f2a9c76dec286947e upstream.

'info->modes' got allocated with devm_kcalloc in of_get_pxafb_display.

This gives this error message:
  ./drivers/video/fbdev/pxafb.c:2238:2-7: WARNING: invalid free of devm_ allocated data

Fixes: c8f96304ec8b4 ("video: fbdev: pxafb: switch to devm_* API")
Cc: stable@kernel.org [v4.19+]
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Daniel Mack <daniel@zonque.org>
Cc: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoceph: don't update importing cap's mseq when handing cap export
Yan, Zheng [Thu, 29 Nov 2018 03:22:50 +0000 (11:22 +0800)]
ceph: don't update importing cap's mseq when handing cap export

commit 3c1392d4c49962a31874af14ae9ff289cb2b3851 upstream.

Updating mseq makes client think importer mds has accepted all prior
cap messages and importer mds knows what caps client wants. Actually
some cap messages may have been dropped because of mseq mismatch.

If mseq is left untouched, importing cap's mds_wanted later will get
reset by cap import message.

Cc: stable@vger.kernel.org
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agosched/fair: Fix infinite loop in update_blocked_averages() by reverting a9e7f6544b9c
Linus Torvalds [Thu, 27 Dec 2018 21:46:17 +0000 (13:46 -0800)]
sched/fair: Fix infinite loop in update_blocked_averages() by reverting a9e7f6544b9c

commit c40f7d74c741a907cfaeb73a7697081881c497d0 upstream.

Zhipeng Xie, Xie XiuQi and Sargun Dhillon reported lockups in the
scheduler under high loads, starting at around the v4.18 time frame,
and Zhipeng Xie tracked it down to bugs in the rq->leaf_cfs_rq_list
manipulation.

Do a (manual) revert of:

  a9e7f6544b9c ("sched/fair: Fix O(nr_cgroups) in load balance path")

It turns out that the list_del_leaf_cfs_rq() introduced by this commit
is a surprising property that was not considered in followup commits
such as:

  9c2791f936ef ("sched/fair: Fix hierarchical order in rq->leaf_cfs_rq_list")

As Vincent Guittot explains:

 "I think that there is a bigger problem with commit a9e7f6544b9c and
  cfs_rq throttling:

  Let take the example of the following topology TG2 --> TG1 --> root:

   1) The 1st time a task is enqueued, we will add TG2 cfs_rq then TG1
      cfs_rq to leaf_cfs_rq_list and we are sure to do the whole branch in
      one path because it has never been used and can't be throttled so
      tmp_alone_branch will point to leaf_cfs_rq_list at the end.

   2) Then TG1 is throttled

   3) and we add TG3 as a new child of TG1.

   4) The 1st enqueue of a task on TG3 will add TG3 cfs_rq just before TG1
      cfs_rq and tmp_alone_branch will stay  on rq->leaf_cfs_rq_list.

  With commit a9e7f6544b9c, we can del a cfs_rq from rq->leaf_cfs_rq_list.
  So if the load of TG1 cfs_rq becomes NULL before step 2) above, TG1
  cfs_rq is removed from the list.
  Then at step 4), TG3 cfs_rq is added at the beginning of rq->leaf_cfs_rq_list
  but tmp_alone_branch still points to TG3 cfs_rq because its throttled
  parent can't be enqueued when the lock is released.
  tmp_alone_branch doesn't point to rq->leaf_cfs_rq_list whereas it should.

  So if TG3 cfs_rq is removed or destroyed before tmp_alone_branch
  points on another TG cfs_rq, the next TG cfs_rq that will be added,
  will be linked outside rq->leaf_cfs_rq_list - which is bad.

  In addition, we can break the ordering of the cfs_rq in
  rq->leaf_cfs_rq_list but this ordering is used to update and
  propagate the update from leaf down to root."

Instead of trying to work through all these cases and trying to reproduce
the very high loads that produced the lockup to begin with, simplify
the code temporarily by reverting a9e7f6544b9c - which change was clearly
not thought through completely.

This (hopefully) gives us a kernel that doesn't lock up so people
can continue to enjoy their holidays without worrying about regressions. ;-)

[ mingo: Wrote changelog, fixed weird spelling in code comment while at it. ]

Analyzed-by: Xie XiuQi <xiexiuqi@huawei.com>
Analyzed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reported-by: Zhipeng Xie <xiezhipeng1@huawei.com>
Reported-by: Sargun Dhillon <sargun@sargun.me>
Reported-by: Xie XiuQi <xiexiuqi@huawei.com>
Tested-by: Zhipeng Xie <xiezhipeng1@huawei.com>
Tested-by: Sargun Dhillon <sargun@sargun.me>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Cc: <stable@vger.kernel.org> # v4.13+
Cc: Bin Li <huawei.libin@huawei.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: a9e7f6544b9c ("sched/fair: Fix O(nr_cgroups) in load balance path")
Link: http://lkml.kernel.org/r/1545879866-27809-1-git-send-email-xiexiuqi@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoiommu/vt-d: Handle domain agaw being less than iommu agaw
Sohil Mehta [Wed, 21 Nov 2018 23:29:33 +0000 (15:29 -0800)]
iommu/vt-d: Handle domain agaw being less than iommu agaw

commit 3569dd07aaad71920c5ea4da2d5cc9a167c1ffd4 upstream.

The Intel IOMMU driver opportunistically skips a few top level page
tables from the domain paging directory while programming the IOMMU
context entry. However there is an implicit assumption in the code that
domain's adjusted guest address width (agaw) would always be greater
than IOMMU's agaw.

The IOMMU capabilities in an upcoming platform cause the domain's agaw
to be lower than IOMMU's agaw. The issue is seen when the IOMMU supports
both 4-level and 5-level paging. The domain builds a 4-level page table
based on agaw of 2. However the IOMMU's agaw is set as 3 (5-level). In
this case the code incorrectly tries to skip page page table levels.
This causes the IOMMU driver to avoid programming the context entry. The
fix handles this case and programs the context entry accordingly.

Fixes: de24e55395698 ("iommu/vt-d: Simplify domain_context_mapping_one")
Cc: <stable@vger.kernel.org>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reported-by: Ramos Falcon, Ernesto R <ernesto.r.ramos.falcon@intel.com>
Tested-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoRDMA/srpt: Fix a use-after-free in the channel release code
Bart Van Assche [Mon, 17 Dec 2018 21:20:40 +0000 (13:20 -0800)]
RDMA/srpt: Fix a use-after-free in the channel release code

commit ed041919f0d23c109d52cde8da6ddc211c52d67e upstream.

This patch avoids that KASAN sporadically reports the following:

BUG: KASAN: use-after-free in rxe_run_task+0x1e/0x60 [rdma_rxe]
Read of size 1 at addr ffff88801c50d8f4 by task check/24830

CPU: 4 PID: 24830 Comm: check Not tainted 4.20.0-rc6-dbg+ #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
 dump_stack+0x86/0xca
 print_address_description+0x71/0x239
 kasan_report.cold.5+0x242/0x301
 __asan_load1+0x47/0x50
 rxe_run_task+0x1e/0x60 [rdma_rxe]
 rxe_post_send+0x4bd/0x8d0 [rdma_rxe]
 srpt_zerolength_write+0xe1/0x160 [ib_srpt]
 srpt_close_ch+0x8b/0xe0 [ib_srpt]
 srpt_set_enabled+0xe7/0x150 [ib_srpt]
 srpt_tpg_enable_store+0xc0/0x100 [ib_srpt]
 configfs_write_file+0x157/0x1d0
 __vfs_write+0xd7/0x3d0
 vfs_write+0x102/0x290
 ksys_write+0xab/0x130
 __x64_sys_write+0x43/0x50
 do_syscall_64+0x71/0x210
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Allocated by task 13856:
 save_stack+0x43/0xd0
 kasan_kmalloc+0xc7/0xe0
 kasan_slab_alloc+0x11/0x20
 kmem_cache_alloc+0x105/0x320
 rxe_alloc+0xff/0x1f0 [rdma_rxe]
 rxe_create_qp+0x9f/0x160 [rdma_rxe]
 ib_create_qp+0xf5/0x690 [ib_core]
 rdma_create_qp+0x6a/0x140 [rdma_cm]
 srpt_cm_req_recv.cold.59+0x1588/0x237b [ib_srpt]
 srpt_rdma_cm_req_recv.isra.35+0x1d5/0x220 [ib_srpt]
 srpt_rdma_cm_handler+0x6f/0x100 [ib_srpt]
 cma_listen_handler+0x59/0x60 [rdma_cm]
 cma_ib_req_handler+0xd5b/0x2570 [rdma_cm]
 cm_process_work+0x2e/0x110 [ib_cm]
 cm_work_handler+0x2aae/0x502b [ib_cm]
 process_one_work+0x481/0x9e0
 worker_thread+0x67/0x5b0
 kthread+0x1cf/0x1f0
 ret_from_fork+0x24/0x30

Freed by task 3440:
 save_stack+0x43/0xd0
 __kasan_slab_free+0x139/0x190
 kasan_slab_free+0xe/0x10
 kmem_cache_free+0xbc/0x330
 rxe_elem_release+0x66/0xe0 [rdma_rxe]
 rxe_destroy_qp+0x3f/0x50 [rdma_rxe]
 ib_destroy_qp+0x140/0x360 [ib_core]
 srpt_release_channel_work+0xdc/0x310 [ib_srpt]
 process_one_work+0x481/0x9e0
 worker_thread+0x67/0x5b0
 kthread+0x1cf/0x1f0
 ret_from_fork+0x24/0x30

Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agorxe: fix error completion wr_id and qp_num
Sagi Grimberg [Thu, 25 Oct 2018 19:40:57 +0000 (12:40 -0700)]
rxe: fix error completion wr_id and qp_num

commit e48d8ed9c6193502d849b35767fd18e20bbd7ba2 upstream.

Error completions must still contain a valid wr_id and
qp_num such that the consumer can rely on. Correctly
fill these fields in receive error completions.

Reported-by: Walker Benjamin <benjamin.walker@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Tested-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years ago9p/net: put a lower bound on msize
Dominique Martinet [Mon, 5 Nov 2018 08:52:48 +0000 (09:52 +0100)]
9p/net: put a lower bound on msize

commit 574d356b7a02c7e1b01a1d9cba8a26b3c2888f45 upstream.

If the requested msize is too small (either from command line argument
or from the server version reply), we won't get any work done.
If it's *really* too small, nothing will work, and this got caught by
syzbot recently (on a new kmem_cache_create_usercopy() call)

Just set a minimum msize to 4k in both code paths, until someone
complains they have a use-case for a smaller msize.

We need to check in both mount option and server reply individually
because the msize for the first version request would be unchecked
with just a global check on clnt->msize.

Link: http://lkml.kernel.org/r/1541407968-31350-1-git-send-email-asmadeus@codewreck.org
Reported-by: syzbot+0c1d61e4db7db94102ca@syzkaller.appspotmail.com
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Cc: Eric Van Hensbergen <ericvh@gmail.com>
Cc: Latchesar Ionkov <lucho@ionkov.net>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoiio: dac: ad5686: fix bit shift read register
Mircea Caprioru [Thu, 6 Dec 2018 13:53:15 +0000 (15:53 +0200)]
iio: dac: ad5686: fix bit shift read register

commit 0e76df5c978338f3051e5126fc0c4245c57a307a upstream.

This patch solves the register readback issue with the bit shift. When the
dac resolution was lower than the register size (ex. 12 bits out of 16
bits) the readback value was not shifted with the difference in bits and
the value was higher. Also a mask is applied on the read value in order to
get the value relative to the actual bit size.

Fixes: 0357e488b8 ("iio:dac:ad5686: Refactor the driver")
Signed-off-by: Mircea Caprioru <mircea.caprioru@analog.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agopowerpc/tm: Set MSR[TS] just prior to recheckpoint
Breno Leitao [Wed, 21 Nov 2018 19:21:09 +0000 (17:21 -0200)]
powerpc/tm: Set MSR[TS] just prior to recheckpoint

commit e1c3743e1a20647c53b719dbf28b48f45d23f2cd upstream.

On a signal handler return, the user could set a context with MSR[TS] bits
set, and these bits would be copied to task regs->msr.

At restore_tm_sigcontexts(), after current task regs->msr[TS] bits are set,
several __get_user() are called and then a recheckpoint is executed.

This is a problem since a page fault (in kernel space) could happen when
calling __get_user(). If it happens, the process MSR[TS] bits were
already set, but recheckpoint was not executed, and SPRs are still invalid.

The page fault can cause the current process to be de-scheduled, with
MSR[TS] active and without tm_recheckpoint() being called.  More
importantly, without TEXASR[FS] bit set also.

Since TEXASR might not have the FS bit set, and when the process is
scheduled back, it will try to reclaim, which will be aborted because of
the CPU is not in the suspended state, and, then, recheckpoint. This
recheckpoint will restore thread->texasr into TEXASR SPR, which might be
zero, hitting a BUG_ON().

kernel BUG at /build/linux-sf3Co9/linux-4.9.30/arch/powerpc/kernel/tm.S:434!
cpu 0xb: Vector: 700 (Program Check) at [c00000041f1576d0]
    pc: c000000000054550: restore_gprs+0xb0/0x180
    lr: 0000000000000000
    sp: c00000041f157950
   msr: 8000000100021033
  current = 0xc00000041f143000
  paca    = 0xc00000000fb86300  softe: 0  irq_happened: 0x01
    pid   = 1021, comm = kworker/11:1
kernel BUG at /build/linux-sf3Co9/linux-4.9.30/arch/powerpc/kernel/tm.S:434!
Linux version 4.9.0-3-powerpc64le (debian-kernel@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18) ) #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26)
enter ? for help
[c00000041f157b30c00000000001bc3c tm_recheckpoint.part.11+0x6c/0xa0
[c00000041f157b70c00000000001d184 __switch_to+0x1e4/0x4c0
[c00000041f157bd0c00000000082eeb8 __schedule+0x2f8/0x990
[c00000041f157cb0c00000000082f598 schedule+0x48/0xc0
[c00000041f157ce0c0000000000f0d28 worker_thread+0x148/0x610
[c00000041f157d80c0000000000f96b0 kthread+0x120/0x140
[c00000041f157e30c00000000000c0e0 ret_from_kernel_thread+0x5c/0x7c

This patch simply delays the MSR[TS] set, so, if there is any page fault in
the __get_user() section, it does not have regs->msr[TS] set, since the TM
structures are still invalid, thus avoiding doing TM operations for
in-kernel exceptions and possible process reschedule.

With this patch, the MSR[TS] will only be set just before recheckpointing
and setting TEXASR[FS] = 1, thus avoiding an interrupt with TM registers in
invalid state.

Other than that, if CONFIG_PREEMPT is set, there might be a preemption just
after setting MSR[TS] and before tm_recheckpoint(), thus, this block must
be atomic from a preemption perspective, thus, calling
preempt_disable/enable() on this code.

It is not possible to move tm_recheckpoint to happen earlier, because it is
required to get the checkpointed registers from userspace, with
__get_user(), thus, the only way to avoid this undesired behavior is
delaying the MSR[TS] set.

The 32-bits signal handler seems to be safe this current issue, but, it
might be exposed to the preemption issue, thus, disabling preemption in
this chunk of code.

Changes from v2:
 * Run the critical section with preempt_disable.

Fixes: 87b4e5393af7 ("powerpc/tm: Fix return of active 64bit signals")
Cc: stable@vger.kernel.org (v3.9+)
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoRevert "powerpc/tm: Unset MSR[TS] if not recheckpointing"
Greg Kroah-Hartman [Fri, 11 Jan 2019 07:05:32 +0000 (08:05 +0100)]
Revert "powerpc/tm: Unset MSR[TS] if not recheckpointing"

This reverts commit a9935a12768851762089fda8e5a9daaf0231808e which is
commit 6f5b9f018f4c7686fd944d920209d1382d320e4e upstream.

It breaks the powerpc build, so drop it from the tree until a fix goes
upstream.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Breno Leitao <leitao@debian.org>
Cc: Michal Suchánek <msuchanek@suse.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoleds: pwm: silently error out on EPROBE_DEFER
Jerome Brunet [Thu, 6 Sep 2018 13:59:04 +0000 (15:59 +0200)]
leds: pwm: silently error out on EPROBE_DEFER

commit 9aec30371fb095a0c9415f3f0146ae269c3713d8 upstream.

When probing, if we fail to get the pwm due to probe deferal, we shouldn't
print an error message. Just be silent in this case.

Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Cc: Benjamin Drung <bdrung@debian.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoarm64: relocatable: fix inconsistencies in linker script and options
Ard Biesheuvel [Mon, 3 Dec 2018 19:58:05 +0000 (20:58 +0100)]
arm64: relocatable: fix inconsistencies in linker script and options

commit 3bbd3db86470c701091fb1d67f1fab6621debf50 upstream.

readelf complains about the section layout of vmlinux when building
with CONFIG_RELOCATABLE=y (for KASLR):

  readelf: Warning: [21]: Link field (0) should index a symtab section.
  readelf: Warning: [21]: Info field (0) should index a relocatable section.

Also, it seems that our use of '-pie -shared' is contradictory, and
thus ambiguous. In general, the way KASLR is wired up at the moment
is highly tailored to how ld.bfd happens to implement (and conflate)
PIE executables and shared libraries, so given the current effort to
support other toolchains, let's fix some of these issues as well.

- Drop the -pie linker argument and just leave -shared. In ld.bfd,
  the differences between them are unclear (except for the ELF type
  of the produced image [0]) but lld chokes on seeing both at the
  same time.

- Rename the .rela output section to .rela.dyn, as is customary for
  shared libraries and PIE executables, so that it is not misidentified
  by readelf as a static relocation section (producing the warnings
  above).

- Pass the -z notext and -z norelro options to explicitly instruct the
  linker to permit text relocations, and to omit the RELRO program
  header (which requires a certain section layout that we don't adhere
  to in the kernel). These are the defaults for current versions of
  ld.bfd.

- Discard .eh_frame and .gnu.hash sections to avoid them from being
  emitted between .head.text and .text, screwing up the section layout.

These changes only affect the ELF image, and produce the same binary
image.

[0] b9dce7f1ba01 ("arm64: kernel: force ET_DYN ELF type for ...")

Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Smith <peter.smith@linaro.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoarm64: drop linker script hack to hide __efistub_ symbols
Ard Biesheuvel [Fri, 30 Nov 2018 11:35:58 +0000 (12:35 +0100)]
arm64: drop linker script hack to hide __efistub_ symbols

commit dd6846d774693bfa27d7db4dae5ea67dfe373fa1 upstream.

Commit 1212f7a16af4 ("scripts/kallsyms: filter arm64's __efistub_
symbols") updated the kallsyms code to filter out symbols with
the __efistub_ prefix explicitly, so we no longer require the
hack in our linker script to emit them as absolute symbols.

Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonfsd4: zero-length WRITE should succeed
J. Bruce Fields [Thu, 15 Nov 2018 16:21:40 +0000 (11:21 -0500)]
nfsd4: zero-length WRITE should succeed

commit fdec6114ee1f0f43b1ad081ad8d46b23ba126d70 upstream.

Zero-length writes are legal; from 5661 section 18.32.3: "If the count
is zero, the WRITE will succeed and return a count of zero subject to
permissions checking".

This check is unnecessary and is causing zero-length reads to return
EINVAL.

Cc: stable@vger.kernel.org
Fixes: 3fd9557aec91 "NFSD: Refactor the generic write vector fill helper"
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agolockd: Show pid of lockd for remote locks
Benjamin Coddington [Thu, 1 Nov 2018 17:39:49 +0000 (13:39 -0400)]
lockd: Show pid of lockd for remote locks

commit b8eee0e90f9797b747113638bc75e739b192ad38 upstream.

Commit 9d5b86ac13c5 ("fs/locks: Remove fl_nspid and use fs-specific l_pid
for remote locks") specified that the l_pid returned for F_GETLK on a local
file that has a remote lock should be the pid of the lock manager process.
That commit, while updating other filesystems, failed to update lockd, such
that locks created by lockd had their fl_pid set to that of the remote
process holding the lock.  Fix that here to be the pid of lockd.

Also, fix the client case so that the returned lock pid is negative, which
indicates a remote lock on a remote file.

Fixes: 9d5b86ac13c5 ("fs/locks: Remove fl_nspid and use fs-specific...")
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoPCI / PM: Allow runtime PM without callback functions
Jarkko Nikula [Tue, 23 Oct 2018 11:45:52 +0000 (14:45 +0300)]
PCI / PM: Allow runtime PM without callback functions

commit c5eb1190074cfb14c5d9cac692f1912eecf1a5e4 upstream.

a9c8088c7988 ("i2c: i801: Don't restore config registers on runtime PM")
nullified the runtime PM suspend/resume callback pointers while keeping the
runtime PM enabled.

This caused the SMBus PCI device to stay in D0 with
/sys/devices/.../power/runtime_status showing "error" when the runtime PM
framework attempted to autosuspend the device.  This is due to PCI bus
runtime PM, which checks for driver runtime PM callbacks and returns
-ENOSYS if they are not set.

Since i2c-i801.c doesn't need to do anything device-specific for runtime
PM, Jean Delvare proposed this be fixed in the PCI core rather than adding
dummy runtime PM callback functions in the PCI drivers.

Change pci_pm_runtime_suspend()/pci_pm_runtime_resume() so they allow
changing the PCI device power state during runtime PM transitions even if
the driver supplies no runtime PM callbacks.

This fixes the runtime PM regression on i2c-i801.c.

It is not obvious why the code previously required the runtime PM
callbacks.  The test has been there since the code was introduced by
6cbf82148ff2 ("PCI PM: Run-time callbacks for PCI bus type").

On the other hand, a similar change was done to generic runtime PM
callbacks in 05aa55dddb9e ("PM / Runtime: Lenient generic runtime pm
callbacks").

Fixes: a9c8088c7988 ("i2c: i801: Don't restore config registers on runtime PM")
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org # v4.18+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoselinux: policydb - fix byte order and alignment issues
Ondrej Mosnacek [Tue, 23 Oct 2018 07:02:17 +0000 (09:02 +0200)]
selinux: policydb - fix byte order and alignment issues

commit 5df275cd4cf51c86d49009f1397132f284ba515e upstream.

Do the LE conversions before doing the Infiniband-related range checks.
The incorrect checks are otherwise causing a failure to load any policy
with an ibendportcon rule on BE systems. This can be reproduced by
running (on e.g. ppc64):

cat >my_module.cil <<EOF
(type test_ibendport_t)
(roletype object_r test_ibendport_t)
(ibendportcon mlx4_0 1 (system_u object_r test_ibendport_t ((s0) (s0))))
EOF
semodule -i my_module.cil

Also, fix loading/storing the 64-bit subnet prefix for OCON_IBPKEY to
use a correctly aligned buffer.

Finally, do not use the 'nodebuf' (u32) buffer where 'buf' (__le32)
should be used instead.

Tested internally on a ppc64 machine with a RHEL 7 kernel with this
patch applied.

Cc: Daniel Jurgens <danielj@mellanox.com>
Cc: Eli Cohen <eli@mellanox.com>
Cc: James Morris <jmorris@namei.org>
Cc: Doug Ledford <dledford@redhat.com>
Cc: <stable@vger.kernel.org> # 4.13+
Fixes: a806f7a1616f ("selinux: Create policydb version for Infiniband support")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agob43: Fix error in cordic routine
Larry Finger [Mon, 19 Nov 2018 18:01:24 +0000 (20:01 +0200)]
b43: Fix error in cordic routine

commit 8ea3819c0bbef57a51d8abe579e211033e861677 upstream.

The cordic routine for calculating sines and cosines that was added in
commit 6f98e62a9f1b ("b43: update cordic code to match current specs")
contains an error whereby a quantity declared u32 can in fact go negative.

This problem was detected by Priit Laes who is switching b43 to use the
routine in the library functions of the kernel.

Fixes: 986504540306 ("b43: make cordic common (LP-PHY and N-PHY need it)")
Reported-by: Priit Laes <plaes@plaes.org>
Cc: Rafał Miłecki <zajec5@gmail.com>
Cc: Stable <stable@vger.kernel.org> # 2.6.34
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Priit Laes <plaes@plaes.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agogfs2: Fix loop in gfs2_rbm_find
Andreas Gruenbacher [Tue, 4 Dec 2018 14:06:27 +0000 (15:06 +0100)]
gfs2: Fix loop in gfs2_rbm_find

commit 2d29f6b96d8f80322ed2dd895bca590491c38d34 upstream.

Fix the resource group wrap-around logic in gfs2_rbm_find that commit
e579ed4f44 broke.  The bug can lead to unnecessary repeated scanning of the
same bitmaps; there is a risk that future changes will turn this into an
endless loop.

Fixes: e579ed4f44 ("GFS2: Introduce rbm field bii")
Cc: stable@vger.kernel.org # v3.13+
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agogfs2: Get rid of potential double-freeing in gfs2_create_inode
Andreas Gruenbacher [Mon, 26 Nov 2018 17:45:35 +0000 (18:45 +0100)]
gfs2: Get rid of potential double-freeing in gfs2_create_inode

commit 6ff9b09e00a441599f3aacdf577254455a048bc9 upstream.

In gfs2_create_inode, after setting and releasing the acl / default_acl, the
acl / default_acl pointers are not set to NULL as they should be.  In that
state, when the function reaches label fail_free_acls, gfs2_create_inode will
try to release the same acls again.

Fix that by setting the pointers to NULL after releasing the acls.  Slightly
simplify the logic.  Also, posix_acl_release checks for NULL already, so
there is no need to duplicate those checks here.

Fixes: e01580bf9e4d ("gfs2: use generic posix ACL infrastructure")
Reported-by: Pan Bian <bianpan2016@163.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodlm: memory leaks on error path in dlm_user_request()
Vasily Averin [Thu, 15 Nov 2018 10:18:56 +0000 (13:18 +0300)]
dlm: memory leaks on error path in dlm_user_request()

commit d47b41aceeadc6b58abc9c7c6485bef7cfb75636 upstream.

According to comment in dlm_user_request() ua should be freed
in dlm_free_lkb() after successful attach to lkb.

However ua is attached to lkb not in set_lock_args() but later,
inside request_lock().

Fixes 597d0cae0f99 ("[DLM] dlm: user locks")
Cc: stable@kernel.org # 2.6.19
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodlm: lost put_lkb on error path in receive_convert() and receive_unlock()
Vasily Averin [Thu, 15 Nov 2018 10:18:24 +0000 (13:18 +0300)]
dlm: lost put_lkb on error path in receive_convert() and receive_unlock()

commit c0174726c3976e67da8649ac62cae43220ae173a upstream.

Fixes 6d40c4a708e0 ("dlm: improve error and debug messages")
Cc: stable@kernel.org # 3.5
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodlm: possible memory leak on error path in create_lkb()
Vasily Averin [Thu, 15 Nov 2018 10:18:18 +0000 (13:18 +0300)]
dlm: possible memory leak on error path in create_lkb()

commit 23851e978f31eda8b2d01bd410d3026659ca06c7 upstream.

Fixes 3d6aa675fff9 ("dlm: keep lkbs in idr")
Cc: stable@kernel.org # 3.1
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodlm: fixed memory leaks after failed ls_remove_names allocation
Vasily Averin [Thu, 15 Nov 2018 10:15:05 +0000 (13:15 +0300)]
dlm: fixed memory leaks after failed ls_remove_names allocation

commit b982896cdb6e6a6b89d86dfb39df489d9df51e14 upstream.

If allocation fails on last elements of array need to free already
allocated elements.

v2: just move existing out_rsbtbl label to right place

Fixes 789924ba635f ("dlm: fix race between remove and lookup")
Cc: stable@kernel.org # 3.6
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoblock: mq-deadline: Fix write completion handling
Damien Le Moal [Mon, 17 Dec 2018 06:14:05 +0000 (15:14 +0900)]
block: mq-deadline: Fix write completion handling

commit 7211aef86f79583e59b88a0aba0bc830566f7e8e upstream.

For a zoned block device using mq-deadline, if a write request for a
zone is received while another write was already dispatched for the same
zone, dd_dispatch_request() will return NULL and the newly inserted
write request is kept in the scheduler queue waiting for the ongoing
zone write to complete. With this behavior, when no other request has
been dispatched, rq_list in blk_mq_sched_dispatch_requests() is empty
and blk_mq_sched_mark_restart_hctx() not called. This in turn leads to
__blk_mq_free_request() call of blk_mq_sched_restart() to not run the
queue when the already dispatched write request completes. The newly
dispatched request stays stuck in the scheduler queue until eventually
another request is submitted.

This problem does not affect SCSI disk as the SCSI stack handles queue
restart on request completion. However, this problem is can be triggered
the nullblk driver with zoned mode enabled.

Fix this by always requesting a queue restart in dd_dispatch_request()
if no request was dispatched while WRITE requests are queued.

Fixes: 5700f69178e9 ("mq-deadline: Introduce zone locking support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add missing export of blk_mq_sched_restart()

Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agoblock: deactivate blk_stat timer in wbt_disable_default()
Ming Lei [Wed, 12 Dec 2018 11:44:34 +0000 (19:44 +0800)]
block: deactivate blk_stat timer in wbt_disable_default()

commit 544fbd16a461a318cd80537d1331c0df5c6cf930 upstream.

rwb_enabled() can't be changed when there is any inflight IO.

wbt_disable_default() may set rwb->wb_normal as zero, however the
blk_stat timer may still be pending, and the timer function will update
wrb->wb_normal again.

This patch introduces blk_stat_deactivate() and applies it in
wbt_disable_default(), then the following IO hang triggered when running
parted & switching io scheduler can be fixed:

[  369.937806] INFO: task parted:3645 blocked for more than 120 seconds.
[  369.938941]       Not tainted 4.20.0-rc6-00284-g906c801e5248 #498
[  369.939797] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  369.940768] parted          D    0  3645   3239 0x00000000
[  369.941500] Call Trace:
[  369.941874]  ? __schedule+0x6d9/0x74c
[  369.942392]  ? wbt_done+0x5e/0x5e
[  369.942864]  ? wbt_cleanup_cb+0x16/0x16
[  369.943404]  ? wbt_done+0x5e/0x5e
[  369.943874]  schedule+0x67/0x78
[  369.944298]  io_schedule+0x12/0x33
[  369.944771]  rq_qos_wait+0xb5/0x119
[  369.945193]  ? karma_partition+0x1c2/0x1c2
[  369.945691]  ? wbt_cleanup_cb+0x16/0x16
[  369.946151]  wbt_wait+0x85/0xb6
[  369.946540]  __rq_qos_throttle+0x23/0x2f
[  369.947014]  blk_mq_make_request+0xe6/0x40a
[  369.947518]  generic_make_request+0x192/0x2fe
[  369.948042]  ? submit_bio+0x103/0x11f
[  369.948486]  ? __radix_tree_lookup+0x35/0xb5
[  369.949011]  submit_bio+0x103/0x11f
[  369.949436]  ? blkg_lookup_slowpath+0x25/0x44
[  369.949962]  submit_bio_wait+0x53/0x7f
[  369.950469]  blkdev_issue_flush+0x8a/0xae
[  369.951032]  blkdev_fsync+0x2f/0x3a
[  369.951502]  do_fsync+0x2e/0x47
[  369.951887]  __x64_sys_fsync+0x10/0x13
[  369.952374]  do_syscall_64+0x89/0x149
[  369.952819]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  369.953492] RIP: 0033:0x7f95a1e729d4
[  369.953996] Code: Bad RIP value.
[  369.954456] RSP: 002b:00007ffdb570dd48 EFLAGS: 00000246 ORIG_RAX: 000000000000004a
[  369.955506] RAX: ffffffffffffffda RBX: 000055c2139c6be0 RCX: 00007f95a1e729d4
[  369.956389] RDX: 0000000000000001 RSI: 0000000000001261 RDI: 0000000000000004
[  369.957325] RBP: 0000000000000002 R08: 0000000000000000 R09: 000055c2139c6ce0
[  369.958199] R10: 0000000000000000 R11: 0000000000000246 R12: 000055c2139c0380
[  369.959143] R13: 0000000000000004 R14: 0000000000000100 R15: 0000000000000008

Cc: stable@vger.kernel.org
Cc: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoFix failure path in alloc_pid()
Matthew Wilcox [Fri, 28 Dec 2018 15:22:26 +0000 (07:22 -0800)]
Fix failure path in alloc_pid()

commit 1a80dade010c7a7f4885a4c4c2a7ac22cc7b34df upstream.

The failure path removes the allocated PIDs from the wrong namespace.
This could lead to us inadvertently reusing PIDs in the leaf namespace
and leaking PIDs in parent namespaces.

Fixes: 95846ecf9dac ("pid: replace pid bitmap implementation with IDR API")
Cc: <stable@vger.kernel.org>
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agodriver core: Add missing dev->bus->need_parent_lock checks
Rafael J. Wysocki [Thu, 13 Dec 2018 18:27:47 +0000 (19:27 +0100)]
driver core: Add missing dev->bus->need_parent_lock checks

commit e121a833745b4708b660e3fe6776129c2956b041 upstream.

__device_release_driver() has to check dev->bus->need_parent_lock
before dropping the parent lock and acquiring it again as it may
attempt to drop a lock that hasn't been acquired or lock a device
that shouldn't be locked and create a lock imbalance.

Fixes: 8c97a46af04b (driver core: hold dev's parent lock when needed)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agosrcu: Lock srcu_data structure in srcu_gp_start()
Dennis Krein [Fri, 26 Oct 2018 14:38:24 +0000 (07:38 -0700)]
srcu: Lock srcu_data structure in srcu_gp_start()

commit eb4c2382272ae7ae5d81fdfa5b7a6c86146eaaa4 upstream.

The srcu_gp_start() function is called with the srcu_struct structure's
->lock held, but not with the srcu_data structure's ->lock.  This is
problematic because this function accesses and updates the srcu_data
structure's ->srcu_cblist, which is protected by that lock.  Failing to
hold this lock can result in corruption of the SRCU callback lists,
which in turn can result in arbitrarily bad results.

This commit therefore makes srcu_gp_start() acquire the srcu_data
structure's ->lock across the calls to rcu_segcblist_advance() and
rcu_segcblist_accelerate(), thus preventing this corruption.

Reported-by: Bart Van Assche <bvanassche@acm.org>
Reported-by: Christoph Hellwig <hch@infradead.org>
Reported-by: Sebastian Kuzminsky <seb.kuzminsky@gmail.com>
Signed-off-by: Dennis Krein <Dennis.Krein@netapp.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Tested-by: Dennis Krein <Dennis.Krein@netapp.com>
Cc: <stable@vger.kernel.org> # 4.16.x
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoALSA: usb-audio: Always check descriptor sizes in parser code
Takashi Iwai [Wed, 2 Jan 2019 16:12:21 +0000 (17:12 +0100)]
ALSA: usb-audio: Always check descriptor sizes in parser code

commit 3e96d7280f16e2f787307f695a31296b9e4a1cd7 upstream.

There are a few places where we access the data without checking the
actual object size from the USB audio descriptor.  This may result in
OOB access, as recently reported.

This patch addresses these missing checks.  Most of added codes are
simple bLength checks in the caller side.  For the input and output
terminal parsers, we put the length check in the parser functions.
For the input terminal, a new argument is added to distinguish between
UAC1 and the rest, as they treat different objects.

Reported-by: Mathias Payer <mathias.payer@nebelwelt.net>
Reported-by: Hui Peng <benquike@163.com>
Tested-by: Hui Peng <benquike@163.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoALSA: usb-audio: Fix an out-of-bound read in create_composite_quirks
Hui Peng [Tue, 25 Dec 2018 23:11:52 +0000 (18:11 -0500)]
ALSA: usb-audio: Fix an out-of-bound read in create_composite_quirks

commit cbb2ebf70daf7f7d97d3811a2ff8e39655b8c184 upstream.

In `create_composite_quirk`, the terminating condition of for loops is
`quirk->ifnum < 0`. So any composite quirks should end with `struct
snd_usb_audio_quirk` object with ifnum < 0.

    for (quirk = quirk_comp->data; quirk->ifnum >= 0; ++quirk) {

     .....
    }

the data field of Bower's & Wilkins PX headphones usb device device quirks
do not end with {.ifnum = -1}, wihch may result in out-of-bound read.

This Patch fix the bug by adding an ending quirk object.

Fixes: 240a8af929c7 ("ALSA: usb-audio: Add a quirck for B&W PX headphones")
Signed-off-by: Hui Peng <benquike@163.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoALSA: usb-audio: Check mixer unit descriptors more strictly
Takashi Iwai [Wed, 19 Dec 2018 13:04:47 +0000 (14:04 +0100)]
ALSA: usb-audio: Check mixer unit descriptors more strictly

commit 0bfe5e434e6665b3590575ec3c5e4f86a1ce51c9 upstream.

We've had some sanity checks of the mixer unit descriptors but they
are too loose and some corner cases are overlooked.  Add more strict
checks in uac_mixer_unit_get_channels() for avoiding possible OOB
accesses by malformed descriptors.

This also changes the semantics of uac_mixer_unit_get_channels()
slightly.  Now it returns zero for the cases where the descriptor
lacks of bmControls instead of -EINVAL.  Then the caller side skips
the mixer creation for such unit while it keeps parsing it.
This corresponds to the case like Maya44.

Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoALSA: usb-audio: Avoid access before bLength check in build_audio_procunit()
Takashi Iwai [Wed, 19 Dec 2018 11:36:27 +0000 (12:36 +0100)]
ALSA: usb-audio: Avoid access before bLength check in build_audio_procunit()

commit f4351a199cc120ff9d59e06d02e8657d08e6cc46 upstream.

The parser for the processing unit reads bNrInPins field before the
bLength sanity check, which may lead to an out-of-bound access when a
malformed descriptor is given.  Fix it by assignment after the bLength
check.

Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoALSA: cs46xx: Potential NULL dereference in probe
Dan Carpenter [Tue, 8 Jan 2019 07:43:30 +0000 (10:43 +0300)]
ALSA: cs46xx: Potential NULL dereference in probe

commit 1524f4e47f90b27a3ac84efbdd94c63172246a6f upstream.

The "chip->dsp_spos_instance" can be NULL on some of the ealier error
paths in snd_cs46xx_create().

Reported-by: "Yavuz, Tuba" <tuba@ece.ufl.edu>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agomedia: cx23885: only reset DMA on problematic CPUs
Brad Love [Wed, 19 Dec 2018 17:07:01 +0000 (12:07 -0500)]
media: cx23885: only reset DMA on problematic CPUs

commit 4bd46aa0353e022c2401a258e93b107880a66533 upstream.

It is reported that commit 95f408bbc4e4 ("media: cx23885: Ryzen DMA
related RiSC engine stall fixes") caused regresssions with other CPUs.

Ensure that the quirk will be applied only for the CPUs that
are known to cause problems.

A module option is added for explicit control of the behaviour.

Fixes: 95f408bbc4e4 ("media: cx23885: Ryzen DMA related RiSC engine stall fixes")
Signed-off-by: Brad Love <brad@nextdimension.cc>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agomt76x0: init hw capabilities
Lorenzo Bianconi [Thu, 6 Sep 2018 09:18:41 +0000 (11:18 +0200)]
mt76x0: init hw capabilities

commit 0ae976a11b4fb5704b597e103b5189237641c1a1 upstream.

Enable hw capabilities supported by mt76-usb layer
- fast_xmit
- tx/rx amsdu
- MFP
- non-linear tx skbs

[This is one line hw feature backport from 0ae976a11b4f ("mt76x0: init
hw capabilities"), which add also other different features, however
those are not supported in 4.19.

802.11w is supported by mac80211 and mt76x0u driver in 4.19 correctly
fall-back to software encryption when 802.11w ciphers are used.

Without the patch we fail to associate with WPA3 APs, so this is
considered as fix.]

Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
[remove marking non-working features on 4.19, make topic correspond the change]
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>