]> git.itanic.dy.fi Git - linux-stable/log
linux-stable
9 years agoLinux 3.14.11 v3.14.11
Greg Kroah-Hartman [Mon, 7 Jul 2014 01:57:39 +0000 (18:57 -0700)]
Linux 3.14.11

9 years agoALSA: hda - hdmi: call overridden init on resume
Pierre Ossman [Wed, 18 Jun 2014 19:48:09 +0000 (21:48 +0200)]
ALSA: hda - hdmi: call overridden init on resume

commit a283368382c50345dff61525f493ea307f21ec9b upstream.

We need to call the proper init function in case it has been
overridden, as it might restore things that the generic routing
doesn't know anything about. E.g. AMD cards have special verbs
that need resetting.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=77901
Fixes: 5a61358433b1 ('ALSA: hda - hdmi: Add ATI/AMD multi-channel audio support')
Signed-off-by: Pierre Ossman <pierre@ossman.eu>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoALSA: usb-audio: Fix races at disconnection and PCM closing
Takashi Iwai [Wed, 25 Jun 2014 12:24:47 +0000 (14:24 +0200)]
ALSA: usb-audio: Fix races at disconnection and PCM closing

commit 92a586bdc06de6629dae1b357dac221253f55ff8 upstream.

When a USB-audio device is disconnected while PCM is still running, we
still see some race: the disconnect callback calls
snd_usb_endpoint_free() that calls release_urbs() and then kfree()
while a PCM stream would be closed at the same time and calls
stop_endpoints() that leads to wait_clear_urbs().  That is, the EP
object might be deallocated while a PCM stream is syncing with
wait_clear_urbs() with the same EP.

Basically calling multiple wait_clear_urbs() would work fine, also
calling wait_clear_urbs() and release_urbs() would work, too, as
wait_clear_urbs() just reads some fields in ep.  The problem is the
succeeding kfree() in snd_pcm_endpoint_free().

This patch moves out the EP deallocation into the later point, the
destructor callback.  At this stage, all PCMs must have been already
closed, so it's safe to free the objects.

Reported-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agotracing: Fix syscall_*regfunc() vs copy_process() race
Oleg Nesterov [Sun, 13 Apr 2014 18:58:54 +0000 (20:58 +0200)]
tracing: Fix syscall_*regfunc() vs copy_process() race

commit 4af4206be2bd1933cae20c2b6fb2058dbc887f7c upstream.

syscall_regfunc() and syscall_unregfunc() should set/clear
TIF_SYSCALL_TRACEPOINT system-wide, but do_each_thread() can race
with copy_process() and miss the new child which was not added to
the process/thread lists yet.

Change copy_process() to update the child's TIF_SYSCALL_TRACEPOINT
under tasklist.

Link: http://lkml.kernel.org/p/20140413185854.GB20668@redhat.com
Fixes: a871bd33a6c0 "tracing: Add syscall tracepoints"
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agotracing: Try again for saved cmdline if failed due to locking
Steven Rostedt (Red Hat) [Fri, 30 May 2014 13:42:39 +0000 (09:42 -0400)]
tracing: Try again for saved cmdline if failed due to locking

commit 379cfdac37923653c9d4242d10052378b7563005 upstream.

In order to prevent the saved cmdline cache from being filled when
tracing is not active, the comms are only recorded after a trace event
is recorded.

The problem is, a comm can fail to be recorded if the trace_cmdline_lock
is held. That lock is taken via a trylock to allow it to happen from
any context (including NMI). If the lock fails to be taken, the comm
is skipped. No big deal, as we will try again later.

But! Because of the code that was added to only record after an event,
we may not try again later as the recording is made as a oneshot per
event per CPU.

Only disable the recording of the comm if the comm is actually recorded.

Fixes: 7ffbd48d5cab "tracing: Cache comms only after an event occurred"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoDocumentation/SubmittingPatches: describe the Fixes: tag
Jacob Keller [Fri, 6 Jun 2014 21:36:39 +0000 (14:36 -0700)]
Documentation/SubmittingPatches: describe the Fixes: tag

commit 8401aa1f59975c03eeebd3ac6d264cbdfe9af5de upstream.

Update the SubmittingPatches process to include howto about the new
'Fixes:' tag to be used when a patch fixes an issue in a previous commit
(found by git-bisect for example).

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agolz4: add overrun checks to lz4_uncompress_unknownoutputsize()
Greg Kroah-Hartman [Thu, 3 Jul 2014 23:06:57 +0000 (16:06 -0700)]
lz4: add overrun checks to lz4_uncompress_unknownoutputsize()

commit 4a3a99045177369700c60d074c0e525e8093b0fc upstream.

Jan points out that I forgot to make the needed fixes to the
lz4_uncompress_unknownoutputsize() function to mirror the changes done
in lz4_decompress() with regards to potential pointer overflows.

The only in-kernel user of this function is the zram code, which only
takes data from a valid compressed buffer that it made itself, so it's
not a big issue.  But due to external kernel modules using this
function, it's better to be safe here.

Reported-by: Jan Beulich <JBeulich@suse.com>
Cc: "Don A. Bailey" <donb@securitymouse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoptrace,x86: force IRET path after a ptrace_stop()
Tejun Heo [Thu, 3 Jul 2014 19:43:15 +0000 (15:43 -0400)]
ptrace,x86: force IRET path after a ptrace_stop()

commit b9cd18de4db3c9ffa7e17b0dc0ca99ed5aa4d43a upstream.

The 'sysret' fastpath does not correctly restore even all regular
registers, much less any segment registers or reflags values.  That is
very much part of why it's faster than 'iret'.

Normally that isn't a problem, because the normal ptrace() interface
catches the process using the signal handler infrastructure, which
always returns with an iret.

However, some paths can get caught using ptrace_event() instead of the
signal path, and for those we need to make sure that we aren't going to
return to user space using 'sysret'.  Otherwise the modifications that
may have been done to the register set by the tracer wouldn't
necessarily take effect.

Fix it by forcing IRET path by setting TIF_NOTIFY_RESUME from
arch_ptrace_stop_needed() which is invoked from ptrace_stop().

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoipvs: Fix panic due to non-linear skb
Peter Christensen [Sat, 24 May 2014 19:40:12 +0000 (21:40 +0200)]
ipvs: Fix panic due to non-linear skb

commit f44a5f45f544561302e855e7bd104e5f506ec01b upstream.

Receiving a ICMP response to an IPIP packet in a non-linear skb could
cause a kernel panic in __skb_pull.

The problem was introduced in
commit f2edb9f7706dcb2c0d9a362b2ba849efe3a97f5e ("ipvs: implement
passive PMTUD for IPIP packets").

Signed-off-by: Peter Christensen <pch@ordbogen.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoMIPS: KVM: Fix memory leak on VCPU
Deng-Cheng Zhu [Tue, 24 Jun 2014 17:31:08 +0000 (10:31 -0700)]
MIPS: KVM: Fix memory leak on VCPU

commit 8c9eb041cf76038eb3b62ee259607eec9b89f48d upstream.

kvm_arch_vcpu_free() is called in 2 code paths:

1) kvm_vm_ioctl()
       kvm_vm_ioctl_create_vcpu()
           kvm_arch_vcpu_destroy()
               kvm_arch_vcpu_free()
2) kvm_put_kvm()
       kvm_destroy_vm()
           kvm_arch_destroy_vm()
               kvm_mips_free_vcpus()
                   kvm_arch_vcpu_free()

Neither of the paths handles VCPU free. We need to do it in
kvm_arch_vcpu_free() corresponding to the memory allocation in
kvm_arch_vcpu_create().

Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoMIPS: KVM: Remove redundant NULL checks before kfree()
James Hogan [Thu, 29 May 2014 09:16:44 +0000 (10:16 +0100)]
MIPS: KVM: Remove redundant NULL checks before kfree()

commit c6c0a6637f9da54f9472144d44f71cf847f92e20 upstream.

The kfree() function already NULL checks the parameter so remove the
redundant NULL checks before kfree() calls in arch/mips/kvm/.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoreiserfs: call truncate_setsize under tailpack mutex
Jeff Mahoney [Wed, 21 May 2014 17:28:07 +0000 (13:28 -0400)]
reiserfs: call truncate_setsize under tailpack mutex

commit 22e7478ddbcb670e33fab72d0bbe7c394c3a2c84 upstream.

Prior to commit 0e4f6a791b1e (Fix reiserfs_file_release()), reiserfs
truncates serialized on i_mutex. They mostly still do, with the exception
of reiserfs_file_release. That blocks out other writers via the tailpack
mutex and the inode openers counter adjusted in reiserfs_file_open.

However, NFS will call reiserfs_setattr without having called ->open, so
we end up with a race when nfs is calling ->setattr while another
process is releasing the file. Ultimately, it triggers the
BUG_ON(inode->i_size != new_file_size) check in maybe_indirect_to_direct.

The solution is to pull the lock into reiserfs_setattr to encompass the
truncate_setsize call as well.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoxfs: xfs_readsb needs to check for magic numbers
Dave Chinner [Fri, 6 Jun 2014 06:00:43 +0000 (16:00 +1000)]
xfs: xfs_readsb needs to check for magic numbers

commit 556b8883cfac3d3203557e161ea8005f8b5479b2 upstream.

Commit daba542 ("xfs: skip verification on initial "guess"
superblock read") dropped the use of a verifier for the initial
superblock read so we can probe the sector size of the filesystem
stored in the superblock. It, however, now fails to validate that
what was read initially is actually an XFS superblock and hence will
fail the sector size check and return ENOSYS.

This causes probe-based mounts to fail because it expects XFS to
return EINVAL when it doesn't recognise the superblock format.

Reported-by: Plamen Petrov <plamen.sisi@gmail.com>
Tested-by: Plamen Petrov <plamen.sisi@gmail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agopowerpc: Don't skip ePAPR spin-table CPUs
Scott Wood [Wed, 25 Jun 2014 01:15:51 +0000 (20:15 -0500)]
powerpc: Don't skip ePAPR spin-table CPUs

commit 6663a4fa6711050036562ddfd2086edf735fae21 upstream.

Commit 59a53afe70fd530040bdc69581f03d880157f15a "powerpc: Don't setup
CPUs with bad status" broke ePAPR SMP booting.  ePAPR says that CPUs
that aren't presently running shall have status of disabled, with
enable-method being used to determine whether the CPU can be enabled.

Fix by checking for spin-table, which is currently the only supported
enable-method.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agopowerpc: Add AT_HWCAP2 to indicate V.CRYPTO category support
Benjamin Herrenschmidt [Tue, 10 Jun 2014 05:04:40 +0000 (15:04 +1000)]
powerpc: Add AT_HWCAP2 to indicate V.CRYPTO category support

commit dd58a092c4202f2bd490adab7285b3ff77f8e467 upstream.

The Vector Crypto category instructions are supported by current POWER8
chips, advertise them to userspace using a specific bit to properly
differentiate with chips of the same architecture level that might not
have them.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agopowerpc: Don't setup CPUs with bad status
Michael Neuling [Fri, 6 Jun 2014 04:28:51 +0000 (14:28 +1000)]
powerpc: Don't setup CPUs with bad status

commit 59a53afe70fd530040bdc69581f03d880157f15a upstream.

OPAL will mark a CPU that is guarded as "bad" in the status property of the CPU
node.

Unfortunatley Linux doesn't check this property and will put the bad CPU in the
present map.  This has caused hangs on booting when we try to unsplit the core.

This patch checks the CPU is avaliable via this status property before putting
it in the present map.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Tested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agopowerpc: fix typo 'CONFIG_PPC_CPU'
Paul Bolle [Tue, 20 May 2014 19:59:42 +0000 (21:59 +0200)]
powerpc: fix typo 'CONFIG_PPC_CPU'

commit b69a1da94f3d1589d1942b5d1b384d8cfaac4500 upstream.

Commit cd64d1697cf0 ("powerpc: mtmsrd not defined") added a check for
CONFIG_PPC_CPU were a check for CONFIG_PPC_FPU was clearly intended.

Fixes: cd64d1697cf0 ("powerpc: mtmsrd not defined")
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agopowerpc/perf: Ensure all EBB register state is cleared on fork()
Michael Ellerman [Tue, 10 Jun 2014 06:46:21 +0000 (16:46 +1000)]
powerpc/perf: Ensure all EBB register state is cleared on fork()

commit 3df48c981d5a9610e02e9270b1bc4274fb536710 upstream.

In commit 330a1eb "Core EBB support for 64-bit book3s" I messed up
clear_task_ebb(). It clears some but not all of the task's Event Based
Branch (EBB) registers when we duplicate a task struct.

That allows a child task to observe the EBBHR & EBBRR of its parent,
which it should not be able to do.

Fix it by clearing EBBHR & EBBRR.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agopowerpc: fix typo 'CONFIG_PMAC'
Paul Bolle [Tue, 20 May 2014 20:24:58 +0000 (22:24 +0200)]
powerpc: fix typo 'CONFIG_PMAC'

commit 6e0fdf9af216887e0032c19d276889aad41cad00 upstream.

Commit b0d278b7d3ae ("powerpc/perf_event: Reduce latency of calling
perf_event_do_pending") added a check for CONFIG_PMAC were a check for
CONFIG_PPC_PMAC was clearly intended.

Fixes: b0d278b7d3ae ("powerpc/perf_event: Reduce latency of calling perf_event_do_pending")
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agopowerpc: 64bit sendfile is capped at 2GB
Anton Blanchard [Wed, 4 Jun 2014 00:48:48 +0000 (10:48 +1000)]
powerpc: 64bit sendfile is capped at 2GB

commit 5d73320a96fcce80286f1447864c481b5f0b96fa upstream.

commit 8f9c0119d7ba (compat: fs: Generic compat_sys_sendfile
implementation) changed the PowerPC 64bit sendfile call from
sys_sendile64 to sys_sendfile.

Unfortunately this broke sendfile of lengths greater than 2G because
sys_sendfile caps at MAX_NON_LFS. Restore what we had previously which
fixes the bug.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agopowerpc/serial: Use saner flags when creating legacy ports
Benjamin Herrenschmidt [Tue, 3 Jun 2014 07:33:41 +0000 (17:33 +1000)]
powerpc/serial: Use saner flags when creating legacy ports

commit c4cad90f9e9dcb85afc5e75a02ae3522ed077296 upstream.

We had a mix & match of flags used when creating legacy ports
depending on where we found them in the device-tree. Among others
we were missing UPF_SKIP_TEST for some kind of ISA ports which is
a problem as quite a few UARTs out there don't support the loopback
test (such as a lot of BMCs).

Let's pick the set of flags used by the SoC code and generalize it
which means autoconf, no loopback test, irq maybe shared and fixed
port.

Sending to stable as the lack of UPF_SKIP_TEST is breaking
serial on some machines so I want this back into distros

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agopowerpc/mm: Check paca psize is up to date for huge mappings
Michael Ellerman [Wed, 28 May 2014 08:21:17 +0000 (18:21 +1000)]
powerpc/mm: Check paca psize is up to date for huge mappings

commit 09567e7fd44291bfc08accfdd67ad8f467842332 upstream.

We have a bug in our hugepage handling which exhibits as an infinite
loop of hash faults. If the fault is being taken in the kernel it will
typically trigger the softlockup detector, or the RCU stall detector.

The bug is as follows:

 1. mmap(0xa0000000, ..., MAP_FIXED | MAP_HUGE_TLB | MAP_ANONYMOUS ..)
 2. Slice code converts the slice psize to 16M.
 3. The code on lines 539-540 of slice.c in slice_get_unmapped_area()
    synchronises the mm->context with the paca->context. So the paca slice
    mask is updated to include the 16M slice.
 3. Either:
    * mmap() fails because there are no huge pages available.
    * mmap() succeeds and the mapping is then munmapped.
    In both cases the slice psize remains at 16M in both the paca & mm.
 4. mmap(0xa0000000, ..., MAP_FIXED | MAP_ANONYMOUS ..)
 5. The slice psize is converted back to 64K. Because of the check on line 539
    of slice.c we DO NOT update the paca->context. The paca slice mask is now
    out of sync with the mm slice mask.
 6. User/kernel accesses 0xa0000000.
 7. The SLB miss handler slb_allocate_realmode() **uses the paca slice mask**
    to create an SLB entry and inserts it in the SLB.
18. With the 16M SLB entry in place the hardware does a hash lookup, no entry
    is found so a data access exception is generated.
19. The data access handler calls do_page_fault() -> handle_mm_fault().
10. __handle_mm_fault() creates a THP mapping with do_huge_pmd_anonymous_page().
11. The hardware retries the access, there is still nothing in the hash table
    so once again a data access exception is generated.
12. hash_page() calls into __hash_page_thp() and inserts a mapping in the
    hash. Although the THP mapping maps 16M the hashing is done using 64K
    as the segment page size.
13. hash_page() returns immediately after calling __hash_page_thp(), skipping
    over the code at line 1125. Resulting in the mismatch between the
    paca->context and mm->context not being detected.
14. The hardware retries the access, the hash it generates using the 16M
    SLB entry does NOT match the hash we inserted.
15. We take another data access and go into __hash_page_thp().
16. We see a valid entry in the hpte_slot_array and so we call updatepp()
    which succeeds.
17. Goto 14.

We could fix this in two ways. The first would be to remove or modify
the check on line 539 of slice.c.

The second option is to cause the check of paca psize in hash_page() on
line 1125 to also be done for THP pages.

We prefer the latter, because the check & update of the paca psize is
not done until we know it's necessary. It's also done only on the
current cpu, so we don't need to IPI all other cpus.

Without further rearranging the code, the simplest fix is to pull out
the code that checks paca psize and call it in two places. Firstly for
THP/hugetlb, and secondly for other mappings as before.

Thanks to Dave Jones for trinity, which originally found this bug.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agopowerpc/pseries: Fix overwritten PE state
Gavin Shan [Thu, 24 Apr 2014 08:00:21 +0000 (18:00 +1000)]
powerpc/pseries: Fix overwritten PE state

commit 54f112a3837d4e7532bbedbbbf27c0de277be510 upstream.

In pseries_eeh_get_state(), EEH_STATE_UNAVAILABLE is always
overwritten by EEH_STATE_NOT_SUPPORT because of the missed
"break" there. The patch fixes the issue.

Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agonfs: Fix cache_validity check in nfs_write_pageuptodate()
Scott Mayhew [Fri, 20 Jun 2014 12:44:42 +0000 (08:44 -0400)]
nfs: Fix cache_validity check in nfs_write_pageuptodate()

commit 18dd78c427513fb0f89365138be66e6ee8700d1b upstream.

NFS_INO_INVALID_DATA cannot be ignored, even if we have a delegation.

We're still having some problems with data corruption when multiple
clients are appending to a file and those clients are being granted
write delegations on open.

To reproduce:

Client A:
vi /mnt/`hostname -s`
while :; do echo "XXXXXXXXXXXXXXX" >>/mnt/file; sleep $(( $RANDOM % 5 )); done

Client B:
vi /mnt/`hostname -s`
while :; do echo "YYYYYYYYYYYYYYY" >>/mnt/file; sleep $(( $RANDOM % 5 )); done

What's happening is that in nfs_update_inode() we're recognizing that
the file size has changed and we're setting NFS_INO_INVALID_DATA
accordingly, but then we ignore the cache_validity flags in
nfs_write_pageuptodate() because we have a delegation.  As a result,
in nfs_updatepage() we're extending the write to cover the full page
even though we've not read in the data to begin with.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoNFS: populate ->net in mount data when remounting
Mateusz Guzik [Tue, 10 Jun 2014 10:44:12 +0000 (12:44 +0200)]
NFS: populate ->net in mount data when remounting

commit a914722f333b3359d2f4f12919380a334176bb89 upstream.

Otherwise the kernel oopses when remounting with IPv6 server because
net is dereferenced in dev_get_by_name.

Use net ns of current thread so that dev_get_by_name does not operate on
foreign ns. Changing the address is prohibited anyway so this should not
affect anything.

Signed-off-by: Mateusz Guzik <mguzik@redhat.com>
Cc: linux-nfs@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoNFS: Use raw_write_seqcount_begin/end int nfs4_reclaim_open_state
Trond Myklebust [Thu, 5 Jun 2014 14:42:37 +0000 (10:42 -0400)]
NFS: Use raw_write_seqcount_begin/end int nfs4_reclaim_open_state

commit abbec2da13f0e4c5d9b78b7e2c025a3e617228ba upstream.

The addition of lockdep code to write_seqcount_begin/end has lead to
a bunch of false positive claims of ABBA deadlocks with the so_lock
spinlock. Audits show that this simply cannot happen because the
read side code does not spin while holding so_lock.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoNFS: Don't declare inode uptodate unless all attributes were checked
Trond Myklebust [Tue, 15 Apr 2014 14:07:57 +0000 (10:07 -0400)]
NFS: Don't declare inode uptodate unless all attributes were checked

commit 43b6535e717d2f656f71d9bd16022136b781c934 upstream.

Fix a bug, whereby nfs_update_inode() was declaring the inode to be
up to date despite not having checked all the attributes.
The bug occurs because the temporary variable in which we cache
the validity information is 'sanitised' before reapplying to
nfsi->cache_validity.

Reported-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agonfsd: getattr for FATTR4_WORD0_FILES_AVAIL needs the statfs buffer
Christoph Hellwig [Wed, 28 May 2014 08:46:13 +0000 (10:46 +0200)]
nfsd: getattr for FATTR4_WORD0_FILES_AVAIL needs the statfs buffer

commit 12337901d654415d9f764b5f5ba50052e9700f37 upstream.

Note nobody's ever noticed because the typical client probably never
requests FILES_AVAIL without also requesting something else on the list.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agonfsd4: fix FREE_STATEID lockowner leak
J. Bruce Fields [Tue, 27 May 2014 15:14:26 +0000 (11:14 -0400)]
nfsd4: fix FREE_STATEID lockowner leak

commit 48385408b45523d9a432c66292d47ef43efcbb94 upstream.

27b11428b7de ("nfsd4: remove lockowner when removing lock stateid")
introduced a memory leak.

Reported-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agopNFS: Handle allocation errors correctly in filelayout_alloc_layout_hdr()
Trond Myklebust [Fri, 30 May 2014 00:06:55 +0000 (20:06 -0400)]
pNFS: Handle allocation errors correctly in filelayout_alloc_layout_hdr()

commit 6df200f5d5191bdde4d2e408215383890f956781 upstream.

Return the NULL pointer when the allocation fails.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoSUNRPC: Fix a module reference leak in svc_handle_xprt
Trond Myklebust [Sun, 18 May 2014 18:05:22 +0000 (14:05 -0400)]
SUNRPC: Fix a module reference leak in svc_handle_xprt

commit c789102c20bbbdda6831a273e046715be9d6af79 upstream.

If the accept() call fails, we need to put the module reference.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoIB/umad: Fix use-after-free on close
Bart Van Assche [Fri, 6 Jun 2014 16:25:04 +0000 (18:25 +0200)]
IB/umad: Fix use-after-free on close

commit 60e1751cb52cc6d1ae04b6bd3c2b96e770b5823f upstream.

Avoid that closing /dev/infiniband/umad<n> or /dev/infiniband/issm<n>
triggers a use-after-free.  __fput() invokes f_op->release() before it
invokes cdev_put().  Make sure that the ib_umad_device structure is
freed by the cdev_put() call instead of f_op->release().  This avoids
that changing the port mode from IB into Ethernet and back to IB
followed by restarting opensmd triggers the following kernel oops:

    general protection fault: 0000 [#1] PREEMPT SMP
    RIP: 0010:[<ffffffff810cc65c>]  [<ffffffff810cc65c>] module_put+0x2c/0x170
    Call Trace:
     [<ffffffff81190f20>] cdev_put+0x20/0x30
     [<ffffffff8118e2ce>] __fput+0x1ae/0x1f0
     [<ffffffff8118e35e>] ____fput+0xe/0x10
     [<ffffffff810723bc>] task_work_run+0xac/0xe0
     [<ffffffff81002a9f>] do_notify_resume+0x9f/0xc0
     [<ffffffff814b8398>] int_signal+0x12/0x17

Reference: https://bugzilla.kernel.org/show_bug.cgi?id=75051
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoIB/umad: Fix error handling
Bart Van Assche [Tue, 20 May 2014 08:33:41 +0000 (10:33 +0200)]
IB/umad: Fix error handling

commit 8ec0a0e6b58218bdc1db91dd70ebfcd6ad8dd6cd upstream.

Avoid leaking a kref count in ib_umad_open() if port->ib_dev == NULL
or if nonseekable_open() fails.

Avoid leaking a kref count, that sm_sem is kept down and also that the
IB_PORT_SM capability mask is not cleared in ib_umad_sm_open() if
nonseekable_open() fails.

Since container_of() never returns NULL, remove the code that tests
whether container_of() returns NULL.

Moving the kref_get() call from the start of ib_umad_*open() to the
end is safe since it is the responsibility of the caller of these
functions to ensure that the cdev pointer remains valid until at least
when these functions return.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
[ydroneaud@opteya.com: rework a bit to reduce the amount of code changed]

Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
[ nonseekable_open() can't actually fail, but....  - Roland ]

Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoIB/srp: Fix a sporadic crash triggered by cable pulling
Bart Van Assche [Tue, 20 May 2014 13:03:49 +0000 (15:03 +0200)]
IB/srp: Fix a sporadic crash triggered by cable pulling

commit 024ca90151f5e4296d30f72c13ff9a075e23c9ec upstream.

Avoid that the loops that iterate over the request ring can encounter
a pointer to a SCSI command in req->scmnd that is no longer associated
with that request. If the function srp_unmap_data() is invoked twice
for a SCSI command that is not in flight then that would cause
ib_fmr_pool_unmap() to be invoked with an invalid pointer as argument,
resulting in a kernel oops.

Reported-by: Sagi Grimberg <sagig@mellanox.com>
Reference: http://thread.gmane.org/gmane.linux.drivers.rdma/19068/focus=19069
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoIB/ipath: Translate legacy diagpkt into newer extended diagpkt
Dennis Dalessandro [Fri, 2 May 2014 15:40:17 +0000 (11:40 -0400)]
IB/ipath: Translate legacy diagpkt into newer extended diagpkt

commit 7e6d3e5c70f13874fb06e6b67696ed90ce79bd48 upstream.

This patch addresses an issue where the legacy diagpacket is sent in
from the user, but the driver operates on only the extended
diagpkt. This patch specifically initializes the extended diagpkt
based on the legacy packet.

Reported-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoIB/qib: Fix port in pkey change event
Mike Marciniszyn [Fri, 2 May 2014 15:28:04 +0000 (11:28 -0400)]
IB/qib: Fix port in pkey change event

commit 911eccd284d13d78c92ec4f1f1092c03457d732a upstream.

The code used a literal 1 in dispatching an IB_EVENT_PKEY_CHANGE.

As of the dual port qib QDR card, this is not necessarily correct.

Change to use the port as specified in the call.

Reported-by: Alex Estrin <alex.estrin@intel.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoIB/mlx5: add missing padding at end of struct mlx5_ib_create_srq
Yann Droneaud [Mon, 5 May 2014 17:33:22 +0000 (19:33 +0200)]
IB/mlx5: add missing padding at end of struct mlx5_ib_create_srq

commit 43bc889380c2ad9aa230eccc03a15cc52cf710d4 upstream.

The i386 ABI disagrees with most other ABIs regarding alignment of
data type larger than 4 bytes: on most ABIs a padding must be added at
end of the structures, while it is not required on i386.

So for most ABIs struct mlx5_ib_create_srq gets implicitly padded to be
aligned on a 8 bytes multiple, while for i386, such padding is not
added.

Tool pahole could be used to find such implicit padding:

  $ pahole --anon_include \
           --nested_anon_include \
           --recursive \
           --class_name mlx5_ib_create_srq \
           drivers/infiniband/hw/mlx5/mlx5_ib.o

Then, structure layout can be compared between i386 and x86_64:

#  +++ obj-i386/drivers/infiniband/hw/mlx5/mlx5_ib.o.pahole.txt    2014-03-28 11:43:07.386413682 +0100
#  --- obj-x86_64/drivers/infiniband/hw/mlx5/mlx5_ib.o.pahole.txt  2014-03-27 13:06:17.788472721 +0100
#  @@ -69,7 +68,6 @@ struct mlx5_ib_create_srq {
#          __u64                      db_addr;              /*     8     8 */
#          __u32                      flags;                /*    16     4 */
#
#  -       /* size: 20, cachelines: 1, members: 3 */
#  -       /* last cacheline: 20 bytes */
#  +       /* size: 24, cachelines: 1, members: 3 */
#  +       /* padding: 4 */
#  +       /* last cacheline: 24 bytes */
#   };

ABI disagreement will make an x86_64 kernel try to read past
the buffer provided by an i386 binary.

When boundary check will be implemented, the x86_64 kernel will
refuse to read past the i386 userspace provided buffer and the
uverb will fail.

Anyway, if the structure lay in memory on a page boundary and
next page is not mapped, ib_copy_from_udata() will fail and the
uverb will fail.

This patch makes create_srq_user() takes care of the input
data size to handle the case where no padding was provided.

This way, x86_64 kernel will be able to handle struct mlx5_ib_create_srq
as sent by unpatched and patched i386 libmlx5.

Link: http://marc.info/?i=cover.1399309513.git.ydroneaud@opteya.com
Fixes: e126ba97dba9e ("mlx5: Add driver for Mellanox Connect-IB adapter")
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoIB/mlx5: add missing padding at end of struct mlx5_ib_create_cq
Yann Droneaud [Mon, 5 May 2014 17:33:21 +0000 (19:33 +0200)]
IB/mlx5: add missing padding at end of struct mlx5_ib_create_cq

commit a8237b32a3faab155a5dc8f886452147ce73da3e upstream.

The i386 ABI disagrees with most other ABIs regarding alignment of
data type larger than 4 bytes: on most ABIs a padding must be added at
end of the structures, while it is not required on i386.

So for most ABI struct mlx5_ib_create_cq get padded to be aligned on a
8 bytes multiple, while for i386, such padding is not added.

The tool pahole can be used to find such implicit padding:

  $ pahole --anon_include \
    --nested_anon_include \
    --recursive \
    --class_name mlx5_ib_create_cq \
    drivers/infiniband/hw/mlx5/mlx5_ib.o

Then, structure layout can be compared between i386 and x86_64:

#  +++ obj-i386/drivers/infiniband/hw/mlx5/mlx5_ib.o.pahole.txt    2014-03-28 11:43:07.386413682 +0100
#  --- obj-x86_64/drivers/infiniband/hw/mlx5/mlx5_ib.o.pahole.txt  2014-03-27 13:06:17.788472721 +0100
#  @@ -34,9 +34,8 @@ struct mlx5_ib_create_cq {
#          __u64                      db_addr;              /*     8     8 */
#          __u32                      cqe_size;             /*    16     4 */
#
#  -       /* size: 20, cachelines: 1, members: 3 */
#  -       /* last cacheline: 20 bytes */
#  +       /* size: 24, cachelines: 1, members: 3 */
#  +       /* padding: 4 */
#  +       /* last cacheline: 24 bytes */
#   };

This ABI disagreement will make an x86_64 kernel try to read past the
buffer provided by an i386 binary.

When boundary check will be implemented, a x86_64 kernel will refuse
to read past the i386 userspace provided buffer and the uverb will
fail.

Anyway, if the structure lies in memory on a page boundary and next
page is not mapped, ib_copy_from_udata() will fail when trying to read
the 4 bytes of padding and the uverb will fail.

This patch makes create_cq_user() takes care of the input data size to
handle the case where no padding is provided.

This way, x86_64 kernel will be able to handle struct
mlx5_ib_create_cq as sent by unpatched and patched i386 libmlx5.

Link: http://marc.info/?i=cover.1399309513.git.ydroneaud@opteya.com
Fixes: e126ba97dba9e ("mlx5: Add driver for Mellanox Connect-IB adapter")
Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agokernel/watchdog.c: remove preemption restrictions when restarting lockup detector
Don Zickus [Mon, 23 Jun 2014 20:22:03 +0000 (13:22 -0700)]
kernel/watchdog.c: remove preemption restrictions when restarting lockup detector

commit bde92cf455a03a91badb7046855592d8c008e929 upstream.

Peter Wu noticed the following splat on his machine when updating
/proc/sys/kernel/watchdog_thresh:

  BUG: sleeping function called from invalid context at mm/slub.c:965
  in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: init
  3 locks held by init/1:
   #0:  (sb_writers#3){.+.+.+}, at: [<ffffffff8117b663>] vfs_write+0x143/0x180
   #1:  (watchdog_proc_mutex){+.+.+.}, at: [<ffffffff810e02d3>] proc_dowatchdog+0x33/0x110
   #2:  (cpu_hotplug.lock){.+.+.+}, at: [<ffffffff810589c2>] get_online_cpus+0x32/0x80
  Preemption disabled at:[<ffffffff810e0384>] proc_dowatchdog+0xe4/0x110

  CPU: 0 PID: 1 Comm: init Not tainted 3.16.0-rc1-testing #34
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  Call Trace:
    dump_stack+0x4e/0x7a
    __might_sleep+0x11d/0x190
    kmem_cache_alloc_trace+0x4e/0x1e0
    perf_event_alloc+0x55/0x440
    perf_event_create_kernel_counter+0x26/0xe0
    watchdog_nmi_enable+0x75/0x140
    update_timers_all_cpus+0x53/0xa0
    proc_dowatchdog+0xe4/0x110
    proc_sys_call_handler+0xb3/0xc0
    proc_sys_write+0x14/0x20
    vfs_write+0xad/0x180
    SyS_write+0x49/0xb0
    system_call_fastpath+0x16/0x1b
  NMI watchdog: disabled (cpu0): hardware events not enabled

What happened is after updating the watchdog_thresh, the lockup detector
is restarted to utilize the new value.  Part of this process involved
disabling preemption.  Once preemption was disabled, perf tried to
allocate a new event (as part of the restart).  This caused the above
BUG_ON as you can't sleep with preemption disabled.

The preemption restriction seemed agressive as we are not doing anything
on that particular cpu, but with all the online cpus (which are
protected by the get_online_cpus lock).  Remove the restriction and the
BUG_ON goes away.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Reported-by: Peter Wu <peter@lekensteyn.nl>
Tested-by: Peter Wu <peter@lekensteyn.nl>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agowatchdog: kempld-wdt: Use the correct value when configuring the prescaler with the...
gundberg [Thu, 24 Apr 2014 13:49:19 +0000 (15:49 +0200)]
watchdog: kempld-wdt: Use the correct value when configuring the prescaler with the watchdog

commit a9e0436b303e94ba57d3bd4b1fcbeaa744b7ebeb upstream.

Use the prescaler index, rather than its value, to configure the watchdog.
This will prevent a mismatch with the prescaler used to calculate the cycles.

Signed-off-by: Per Gundberg <per.gundberg@icomera.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Michael Brunner <michael.brunner@kontron.com>
Tested-by: Michael Brunner <michael.brunner@kontron.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agowatchdog: ath79_wdt: avoid spurious restarts on AR934x
Gabor Juhos [Wed, 16 Apr 2014 09:34:41 +0000 (11:34 +0200)]
watchdog: ath79_wdt: avoid spurious restarts on AR934x

commit 23afeb613ec0e10aecfae7838a14d485db62ac52 upstream.

On some AR934x based systems, where the frequency of
the AHB bus is relatively high, the built-in watchdog
causes a spurious restart when it gets enabled.

The possible cause of these restarts is that the timeout
value written into the TIMER register does not reaches
the hardware in time.

Add an explicit delay into the ath79_wdt_enable function
to avoid the spurious restarts.

Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agowatchdog: sp805: Set watchdog_device->timeout from ->set_timeout()
Viresh Kumar [Thu, 15 May 2014 04:31:59 +0000 (10:01 +0530)]
watchdog: sp805: Set watchdog_device->timeout from ->set_timeout()

commit 938626d96a3ffb9eb54552bb0d3a4f2b30ffdeb0 upstream.

Implementation of ->set_timeout() is supposed to set 'timeout' field of 'struct
watchdog_device' passed to it. sp805 was rather setting this in a local
variable. Fix it.

Reported-by: Arun Ramamurthy <arun.ramamurthy@broadcom.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoUBIFS: Remove incorrect assertion in shrink_tnc()
hujianyang [Sat, 31 May 2014 03:39:32 +0000 (11:39 +0800)]
UBIFS: Remove incorrect assertion in shrink_tnc()

commit 72abc8f4b4e8574318189886de627a2bfe6cd0da upstream.

I hit the same assert failed as Dolev Raviv reported in Kernel v3.10
shows like this:

[ 9641.164028] UBIFS assert failed in shrink_tnc at 131 (pid 13297)
[ 9641.234078] CPU: 1 PID: 13297 Comm: mmap.test Tainted: G           O 3.10.40 #1
[ 9641.234116] [<c0011a6c>] (unwind_backtrace+0x0/0x12c) from [<c000d0b0>] (show_stack+0x20/0x24)
[ 9641.234137] [<c000d0b0>] (show_stack+0x20/0x24) from [<c0311134>] (dump_stack+0x20/0x28)
[ 9641.234188] [<c0311134>] (dump_stack+0x20/0x28) from [<bf22425c>] (shrink_tnc_trees+0x25c/0x350 [ubifs])
[ 9641.234265] [<bf22425c>] (shrink_tnc_trees+0x25c/0x350 [ubifs]) from [<bf2245ac>] (ubifs_shrinker+0x25c/0x310 [ubifs])
[ 9641.234307] [<bf2245ac>] (ubifs_shrinker+0x25c/0x310 [ubifs]) from [<c00cdad8>] (shrink_slab+0x1d4/0x2f8)
[ 9641.234327] [<c00cdad8>] (shrink_slab+0x1d4/0x2f8) from [<c00d03d0>] (do_try_to_free_pages+0x300/0x544)
[ 9641.234344] [<c00d03d0>] (do_try_to_free_pages+0x300/0x544) from [<c00d0a44>] (try_to_free_pages+0x2d0/0x398)
[ 9641.234363] [<c00d0a44>] (try_to_free_pages+0x2d0/0x398) from [<c00c6a60>] (__alloc_pages_nodemask+0x494/0x7e8)
[ 9641.234382] [<c00c6a60>] (__alloc_pages_nodemask+0x494/0x7e8) from [<c00f62d8>] (new_slab+0x78/0x238)
[ 9641.234400] [<c00f62d8>] (new_slab+0x78/0x238) from [<c031081c>] (__slab_alloc.constprop.42+0x1a4/0x50c)
[ 9641.234419] [<c031081c>] (__slab_alloc.constprop.42+0x1a4/0x50c) from [<c00f80e8>] (kmem_cache_alloc_trace+0x54/0x188)
[ 9641.234459] [<c00f80e8>] (kmem_cache_alloc_trace+0x54/0x188) from [<bf227908>] (do_readpage+0x168/0x468 [ubifs])
[ 9641.234553] [<bf227908>] (do_readpage+0x168/0x468 [ubifs]) from [<bf2296a0>] (ubifs_readpage+0x424/0x464 [ubifs])
[ 9641.234606] [<bf2296a0>] (ubifs_readpage+0x424/0x464 [ubifs]) from [<c00c17c0>] (filemap_fault+0x304/0x418)
[ 9641.234638] [<c00c17c0>] (filemap_fault+0x304/0x418) from [<c00de694>] (__do_fault+0xd4/0x530)
[ 9641.234665] [<c00de694>] (__do_fault+0xd4/0x530) from [<c00e10c0>] (handle_pte_fault+0x480/0xf54)
[ 9641.234690] [<c00e10c0>] (handle_pte_fault+0x480/0xf54) from [<c00e2bf8>] (handle_mm_fault+0x140/0x184)
[ 9641.234716] [<c00e2bf8>] (handle_mm_fault+0x140/0x184) from [<c0316688>] (do_page_fault+0x150/0x3ac)
[ 9641.234737] [<c0316688>] (do_page_fault+0x150/0x3ac) from [<c000842c>] (do_DataAbort+0x3c/0xa0)
[ 9641.234759] [<c000842c>] (do_DataAbort+0x3c/0xa0) from [<c0314e38>] (__dabt_usr+0x38/0x40)

After analyzing the code, I found a condition that may cause this failed
in correct operations. Thus, I think this assertion is wrong and should be
removed.

Suppose there are two clean znodes and one dirty znode in TNC. So the
per-filesystem atomic_t @clean_zn_cnt is (2). If commit start, dirty_znode
is set to COW_ZNODE in get_znodes_to_commit() in case of potentially ops
on this znode. We clear COW bit and DIRTY bit in write_index() without
@tnc_mutex locked. We don't increase @clean_zn_cnt in this place. As the
comments in write_index() shows, if another process hold @tnc_mutex and
dirty this znode after we clean it, @clean_zn_cnt would be decreased to (1).
We will increase @clean_zn_cnt to (2) with @tnc_mutex locked in
free_obsolete_znodes() to keep it right.

If shrink_tnc() performs between decrease and increase, it will release
other 2 clean znodes it holds and found @clean_zn_cnt is less than zero
(1 - 2 = -1), then hit the assertion. Because free_obsolete_znodes() will
soon correct @clean_zn_cnt and no harm to fs in this case, I think this
assertion could be removed.

2 clean zondes and 1 dirty znode, @clean_zn_cnt == 2

Thread A (commit)         Thread B (write or others)       Thread C (shrinker)
->write_index
   ->clear_bit(DIRTY_NODE)
   ->clear_bit(COW_ZNODE)

            @clean_zn_cnt == 2
                          ->mutex_locked(&tnc_mutex)
                          ->dirty_cow_znode
                              ->!ubifs_zn_cow(znode)
                              ->!test_and_set_bit(DIRTY_NODE)
                              ->atomic_dec(&clean_zn_cnt)
                          ->mutex_unlocked(&tnc_mutex)

            @clean_zn_cnt == 1
                                                           ->mutex_locked(&tnc_mutex)
                                                           ->shrink_tnc
                                                             ->destroy_tnc_subtree
                                                             ->atomic_sub(&clean_zn_cnt, 2)
                                                             ->ubifs_assert  <- hit
                                                           ->mutex_unlocked(&tnc_mutex)

            @clean_zn_cnt == -1
->mutex_lock(&tnc_mutex)
->free_obsolete_znodes
   ->atomic_inc(&clean_zn_cnt)
->mutux_unlock(&tnc_mutex)

            @clean_zn_cnt == 0 (correct after shrink)

Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoUBIFS: fix an mmap and fsync race condition
hujianyang [Wed, 30 Apr 2014 06:06:06 +0000 (14:06 +0800)]
UBIFS: fix an mmap and fsync race condition

commit 691a7c6f28ac90cccd0dbcf81348ea90b211bdd0 upstream.

There is a race condition in UBIFS:

Thread A (mmap)                        Thread B (fsync)

->__do_fault                           ->write_cache_pages
   -> ubifs_vm_page_mkwrite
       -> budget_space
       -> lock_page
       -> release/convert_page_budget
       -> SetPagePrivate
       -> TestSetPageDirty
       -> unlock_page
                                       -> lock_page
                                           -> TestClearPageDirty
                                           -> ubifs_writepage
                                               -> do_writepage
                                                   -> release_budget
                                                   -> ClearPagePrivate
                                                   -> unlock_page
   -> !(ret & VM_FAULT_LOCKED)
   -> lock_page
   -> set_page_dirty
       -> ubifs_set_page_dirty
           -> TestSetPageDirty (set page dirty without budgeting)
   -> unlock_page

This leads to situation where we have a diry page but no budget allocated for
this page, so further write-back may fail with -ENOSPC.

In this fix we return from page_mkwrite without performing unlock_page. We
return VM_FAULT_LOCKED instead. After doing this, the race above will not
happen.

Signed-off-by: hujianyang <hujianyang@huawei.com>
Tested-by: Laurence Withers <lwithers@guralp.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoMIPS: MSC: Prevent out-of-bounds writes to MIPS SC ioremap'd region
Markos Chandras [Mon, 23 Jun 2014 08:48:51 +0000 (09:48 +0100)]
MIPS: MSC: Prevent out-of-bounds writes to MIPS SC ioremap'd region

commit ab6c15bc6620ebe220970cc040b29bcb2757f373 upstream.

Previously, the lower limit for the MIPS SC initialization loop was
set incorrectly allowing one extra loop leading to writes
beyond the MSC ioremap'd space. More precisely, the value of the 'imp'
in the last loop increased beyond the msc_irqmap_t boundaries and
as a result of which, the 'n' variable was loaded with an incorrect
value. This value was used later on to calculate the offset in the
MSC01_IC_SUP which led to random crashes like the following one:

CPU 0 Unable to handle kernel paging request at virtual address e75c0200,
epc == 8058dba4, ra == 8058db90
[...]
Call Trace:
[<8058dba4>] init_msc_irqs+0x104/0x154
[<8058b5bc>] arch_init_irq+0xd8/0x154
[<805897b0>] start_kernel+0x220/0x36c

Kernel panic - not syncing: Attempted to kill the idle task!

This patch fixes the problem

Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7118/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agorecordmcount/MIPS: Fix possible incorrect mcount_loc table entries in modules
Alex Smith [Tue, 17 Jun 2014 09:39:53 +0000 (10:39 +0100)]
recordmcount/MIPS: Fix possible incorrect mcount_loc table entries in modules

commit 91ad11d7cc6f4472ebf177a6252fbf0fd100d798 upstream.

On MIPS calls to _mcount in modules generate 2 instructions to load
the _mcount address (and therefore 2 relocations). The mcount_loc
table should only reference the first of these, so the second is
filtered out by checking the relocation offset and ignoring ones that
immediately follow the previous one seen.

However if a module has an _mcount call at offset 0, the second
relocation would not be filtered out due to old_r_offset == 0
being taken to mean that the current relocation is the first one
seen, and both would end up in the mcount_loc table.

This results in ftrace_make_nop() patching both (adjacent)
instructions to branches over the _mcount call sequence like so:

  0xffffffffc08a8000:  04 00 00 10     b       0xffffffffc08a8014
  0xffffffffc08a8004:  04 00 00 10     b       0xffffffffc08a8018
  0xffffffffc08a8008:  2d 08 e0 03     move    at,ra
  ...

The second branch is in the delay slot of the first, which is
defined to be unpredictable - on the platform on which this bug was
encountered, it triggers a reserved instruction exception.

Fix by initializing old_r_offset to ~0 and using that instead of 0
to determine whether the current relocation is the first seen.

Signed-off-by: Alex Smith <alex.smith@imgtec.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7098/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agomtip32xx: Remove dfs_parent after pci unregister
Asai Thambi S P [Thu, 17 Apr 2014 03:30:16 +0000 (20:30 -0700)]
mtip32xx: Remove dfs_parent after pci unregister

commit af5ded8ccf21627f9614afc03b356712666ed225 upstream.

In module exit, dfs_parent and it's subtree were removed before
unregistering with pci. When debugfs entry for each device is attempted
to remove in pci_remove() context, they don't exist, as dfs_parent and
its children were already ripped apart.

Modified to first unregister with pci and then remove dfs_parent.

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agomtip32xx: Increase timeout for STANDBY IMMEDIATE command
Asai Thambi S P [Thu, 17 Apr 2014 03:27:54 +0000 (20:27 -0700)]
mtip32xx: Increase timeout for STANDBY IMMEDIATE command

commit 670a641420a3d9586eebe7429dfeec4e7ed447aa upstream.

Increased timeout for STANDBY IMMEDIATE command to 2 minutes.

Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agomtip32xx: Fix ERO and NoSnoop values in PCIe upstream on AMD systems
Asai Thambi S P [Fri, 14 Mar 2014 01:45:15 +0000 (18:45 -0700)]
mtip32xx: Fix ERO and NoSnoop values in PCIe upstream on AMD systems

commit d1e714db8129a1d3670e449b87719c78e2c76f9f upstream.

A hardware quirk in P320h/P420m interfere with PCIe transactions on some
AMD chipsets, making P320h/P420m unusable. This workaround is to disable
ERO and NoSnoop bits in the parent and root complex for normal
functioning of these devices

NOTE: This workaround is specific to AMD chipset with a PCIe upstream
device with device id 0x5aXX

Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Signed-off-by: Sam Bradshaw <sbradshaw@micron.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoPCI: Fix incorrect vgaarb conditional in WARN_ON()
Bjorn Helgaas [Sat, 5 Apr 2014 21:14:22 +0000 (15:14 -0600)]
PCI: Fix incorrect vgaarb conditional in WARN_ON()

commit 67ebd8140dc8923c65451fa0f6a8eee003c4dcd3 upstream.

3448a19da479 "vgaarb: use bridges to control VGA routing where possible"
added the "flags & PCI_VGA_STATE_CHANGE_DECODES" condition to an existing
WARN_ON(), but used bitwise AND (&) instead of logical AND (&&), so the
condition is never true.  Replace with logical AND.

Found by Coverity (CID 142811).

Fixes: 3448a19da479 "vgaarb: use bridges to control VGA routing where possible"
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: David Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoPCI: Add new ID for Intel GPU "spurious interrupt" quirk
Thomas Jarosch [Mon, 7 Apr 2014 13:10:32 +0000 (15:10 +0200)]
PCI: Add new ID for Intel GPU "spurious interrupt" quirk

commit 7c82126a94e69bbbac586f0249e7ef11e681246c upstream.

After a CPU upgrade while keeping the same mainboard, we faced "spurious
interrupt" problems again.

It turned out that the new CPU also featured a new GPU with a different PCI
ID.

Add this PCI ID to the quirk table.  Probably all other Intel GPU PCI IDs
are affected, too, but I don't want to add them without a test system.

See f67fd55fa96f ("PCI: Add quirk for still enabled interrupts on Intel
Sandy Bridge GPUs") for some history.

[bhelgaas: add f67fd55fa96f reference, stable tag]
Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoInput: elantech - don't set bit 1 of reg_10 when the no_hw_res quirk is set
Hans de Goede [Sun, 8 Jun 2014 06:07:13 +0000 (23:07 -0700)]
Input: elantech - don't set bit 1 of reg_10 when the no_hw_res quirk is set

commit fb4f8f568a9def02240ef9bf7aabd246dc63a081 upstream.

The touchpad on the GIGABYTE U2442 not only stops communicating when we try
to set bit 3 (enable real hardware resolution) of reg_10, but on some BIOS
versions also when we set bit 1 (enable two finger mode auto correct).

I've asked the original reporter of:
https://bugzilla.kernel.org/show_bug.cgi?id=61151

To check that not setting bit 1 does not lead to any adverse effects on his
model / BIOS revision, and it does not, so this commit fixes the touchpad
not working on these versions by simply never setting bit 1 for laptop
models with the no_hw_res quirk.

Reported-and-tested-by: James Lademann <jwlademann@gmail.com>
Tested-by: Philipp Wolfer <ph.wolfer@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoInput: elantech - deal with clickpads reporting right button events
Hans de Goede [Sun, 8 Jun 2014 05:35:07 +0000 (22:35 -0700)]
Input: elantech - deal with clickpads reporting right button events

commit cd9e83e2754465856097f31c7ab933ce74c473f8 upstream.

At least the Dell Vostro 5470 elantech *clickpad* reports right button
clicks when clicked in the right bottom area:

https://bugzilla.redhat.com/show_bug.cgi?id=1103528

This is different from how (elantech) clickpads normally operate, normally
no matter where the user clicks on the pad the pad always reports a left
button event, since there is only 1 hardware button beneath the path.

It looks like Dell has put 2 buttons under the pad, one under each bottom
corner, causing this.

Since this however still clearly is a real clickpad hardware-wise, we still
want to report it as such to userspace, so that things like finger movement
in the bottom area can be properly ignored as it should be on clickpads.

So deal with this weirdness by simply mapping a right click to a left click
on elantech clickpads. As an added advantage this is something which we can
simply do on all elantech clickpads, so no need to add special quirks for
this weird model.

Reported-and-tested-by: Elder Marco <eldermarco@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoInput: synaptics - fix resolution for manually provided min/max
Benjamin Tissoires [Sun, 8 Jun 2014 05:37:47 +0000 (22:37 -0700)]
Input: synaptics - fix resolution for manually provided min/max

commit d49cb7aeebb974713f9f7ab2991352d3050b095b upstream.

commit 421e08c41fda fixed the reported min/max for the X and Y axis,
but unfortunately, it broke the resolution of those same axis.

On the t540p, the resolution is the same regarding X and Y. It is not
a problem for xf86-input-synaptics because this driver is only interested
in the ratio between X and Y.
Unfortunately, xf86-input-cmt uses directly the resolution, and having a
null resolution leads to some divide by 0 errors, which are translated by
-infinity in the resulting coordinates.

Reported-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoiscsi-target: fix iscsit_del_np deadlock on unload
Mikulas Patocka [Mon, 23 Jun 2014 17:42:37 +0000 (13:42 -0400)]
iscsi-target: fix iscsit_del_np deadlock on unload

commit 81a9c5e72bdf7109a65102ca61d8cbd722cf4021 upstream.

On uniprocessor preemptible kernel, target core deadlocks on unload. The
following events happen:
* iscsit_del_np is called
* it calls send_sig(SIGINT, np->np_thread, 1);
* the scheduler switches to the np_thread
* the np_thread is woken up, it sees that kthread_should_stop() returns
  false, so it doesn't terminate
* the np_thread clears signals with flush_signals(current); and goes back
  to sleep in iscsit_accept_np
* the scheduler switches back to iscsit_del_np
* iscsit_del_np calls kthread_stop(np->np_thread);
* the np_thread is waiting in iscsit_accept_np and it doesn't respond to
  kthread_stop

The deadlock could be resolved if the administrator sends SIGINT signal to
the np_thread with killall -INT iscsi_np

The reproducible deadlock was introduced in commit
db6077fd0b7dd41dc6ff18329cec979379071f87, but the thread-stopping code was
racy even before.

This patch fixes the problem. Using kthread_should_stop to stop the
np_thread is unreliable, so we test np_thread_state instead. If
np_thread_state equals ISCSI_NP_THREAD_SHUTDOWN, the thread exits.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoiscsi-target: Explicily clear login response PDU in exception path
Nicholas Bellinger [Tue, 17 Jun 2014 21:54:38 +0000 (21:54 +0000)]
iscsi-target: Explicily clear login response PDU in exception path

commit 683497566d48f86e04d026de1ee658dd74fc1077 upstream.

This patch adds a explicit memset to the login response PDU
exception path in iscsit_tx_login_rsp().

This addresses a regression bug introduced in commit baa4d64b
where the initiator would end up not receiving the login
response and associated status class + detail, before closing
the login connection.

Reported-by: Christophe Vu-Brugier <cvubrugier@yahoo.fr>
Tested-by: Christophe Vu-Brugier <cvubrugier@yahoo.fr>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoiscsi-target: Avoid rejecting incorrect ITT for Data-Out
Nicholas Bellinger [Fri, 20 Jun 2014 17:59:57 +0000 (10:59 -0700)]
iscsi-target: Avoid rejecting incorrect ITT for Data-Out

commit 97c99b47ac58bacb7c09e1f47d5d184434f6b06a upstream.

This patch changes iscsit_check_dataout_hdr() to dump the incoming
Data-Out payload when the received ITT is not associated with a
WRITE, instead of calling iscsit_reject_cmd() for the non WRITE
ITT descriptor.

This addresses a bug where an initiator sending an Data-Out for
an ITT associated with a READ would end up generating a reject
for the READ, eventually resulting in list corruption.

Reported-by: Santosh Kulkarni <santosh.kulkarni@calsoftinc.com>
Reported-by: Arshad Hussain <arshad.hussain@calsoftinc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agotarget: Fix left-over se_lun->lun_sep pointer OOPs
Nicholas Bellinger [Mon, 16 Jun 2014 20:25:54 +0000 (20:25 +0000)]
target: Fix left-over se_lun->lun_sep pointer OOPs

commit 83ff42fcce070801a3aa1cd6a3269d7426271a8d upstream.

This patch fixes a left-over se_lun->lun_sep pointer OOPs when one
of the /sys/kernel/config/target/$FABRIC/$WWPN/$TPGT/lun/$LUN/alua*
attributes is accessed after the $DEVICE symlink has been removed.

To address this bug, go ahead and clear se_lun->lun_sep memory in
core_dev_unexport(), so that the existing checks for show/store
ALUA attributes in target_core_fabric_configfs.c work as expected.

Reported-by: Sebastian Herbszt <herbszt@gmx.de>
Tested-by: Sebastian Herbszt <herbszt@gmx.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoARM: dts: disable MDMA1 node for exynos5420
Seungwon Jeon [Thu, 8 May 2014 22:02:33 +0000 (07:02 +0900)]
ARM: dts: disable MDMA1 node for exynos5420

commit e6015c1f8a9032c2aecb78d23edf49582563bd47 upstream.

This change places MDMA1 in disabled node for Exynos5420.
If MDMA1 region is configured with secure mode, it makes
the boot failure with the following on smdk5420 board.
("Unhandled fault: imprecise external abort (0x1406) at 0x00000000")
Thus, arndale-octa board don't need to do the same thing anymore.

Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Tested-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Tushar Behera <tushar.b@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoLinux 3.14.10 v3.14.10
Greg Kroah-Hartman [Tue, 1 Jul 2014 03:12:08 +0000 (20:12 -0700)]
Linux 3.14.10

9 years agoefi-pstore: Fix an overflow on 32-bit builds
Andrzej Zaborowski [Mon, 9 Jun 2014 14:50:40 +0000 (16:50 +0200)]
efi-pstore: Fix an overflow on 32-bit builds

commit 783ee43118dc773bc8b0342c5b230e017d5a04d0 upstream.

In generic_id the long int timestamp is multiplied by 100000 and needs
an explicit cast to u64.

Without that the id in the resulting pstore filename is wrong and
userspace may have problems parsing it, but more importantly files in
pstore can never be deleted and may fill the EFI flash (brick device?).
This happens because when generic pstore code wants to delete a file,
it passes the id to the EFI backend which reinterpretes it and a wrong
variable name is attempted to be deleted.  There's no error message but
after remounting pstore, deleted files would reappear.

Signed-off-by: Andrew Zaborowski <andrew.zaborowski@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agobuilddeb: use $OBJCOPY variable instead of objcopy
Fathi Boudra [Sat, 12 Apr 2014 10:13:24 +0000 (13:13 +0300)]
builddeb: use $OBJCOPY variable instead of objcopy

commit 6b4a144a92ab81a1f45fb9b12aebaaaee0d08120 upstream.

In cross-build environment, we expect to use the cross-compiler objcopy
instead of the host objcopy.

It fixes following build failures:
objcopy --only-keep-debug lib/modules/3.14/kernel/net/ipv6/xfrm6_mode_tunnel.ko /srv/build/linux/debian/dbgtmp/usr/lib/debug/lib/modules/3.14/kernel/net/ipv6/xfrm6_mode_tunnel.ko
objcopy: Unable to recognise the format of the input file `lib/modules/3.14/kernel/net/ipv6/xfrm6_mode_tunnel.ko'

Signed-off-by: Fathi Boudra <fathi.boudra@linaro.org>
Fixes: 810e843746b7 ('deb-pkg: split debug symbols in their own package')
Reviewed-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoepoll: fix use-after-free in eventpoll_release_file
Konstantin Khlebnikov [Tue, 17 Jun 2014 02:58:05 +0000 (06:58 +0400)]
epoll: fix use-after-free in eventpoll_release_file

commit ebe06187bf2aec10d537ce4595e416035367d703 upstream.

This fixes use-after-free of epi->fllink.next inside list loop macro.
This loop actually releases elements in the body.  The list is
rcu-protected but here we cannot hold rcu_read_lock because we need to
lock mutex inside.

The obvious solution is to use list_for_each_entry_safe().  RCU-ness
isn't essential because nobody can change this list under us, it's final
fput for this file.

The bug was introduced by ae10b2b4eb01 ("epoll: optimize EPOLL_CTL_DEL
using rcu")

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Jason Baron <jbaron@akamai.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agox86_32, entry: Do syscall exit work on badsys (CVE-2014-4508)
Andy Lutomirski [Mon, 23 Jun 2014 21:22:15 +0000 (14:22 -0700)]
x86_32, entry: Do syscall exit work on badsys (CVE-2014-4508)

commit 554086d85e71f30abe46fc014fea31929a7c6a8a upstream.

The bad syscall nr paths are their own incomprehensible route
through the entry control flow.  Rearrange them to work just like
syscalls that return -ENOSYS.

This fixes an OOPS in the audit code when fast-path auditing is
enabled and sysenter gets a bad syscall nr (CVE-2014-4508).

This has probably been broken since Linux 2.6.27:
af0575bba0 i386 syscall audit fast-path

Cc: Roland McGrath <roland@redhat.com>
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/e09c499eade6fc321266dd6b54da7beb28d6991c.1403558229.git.luto@amacapital.net
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agolz4: fix another possible overrun
Greg Kroah-Hartman [Tue, 24 Jun 2014 20:59:01 +0000 (16:59 -0400)]
lz4: fix another possible overrun

commit 4148c1f67abf823099b2d7db6851e4aea407f5ee upstream.

There is one other possible overrun in the lz4 code as implemented by
Linux at this point in time (which differs from the upstream lz4
codebase, but will get synced at in a future kernel release.)  As
pointed out by Don, we also need to check the overflow in the data
itself.

While we are at it, replace the odd error return value with just a
"simple" -1 value as the return value is never used for anything other
than a basic "did this work or not" check.

Reported-by: "Don A. Bailey" <donb@securitymouse.com>
Reported-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agobtrfs: allocate raid type kobjects dynamically
Jeff Mahoney [Tue, 27 May 2014 16:59:57 +0000 (12:59 -0400)]
btrfs: allocate raid type kobjects dynamically

commit c1895442be01c58449e3bf9272f22062a670e08f upstream.

We are currently allocating space_info objects in an array when we
allocate space_info. When a user does something like:

# btrfs balance start -mconvert=raid1 -dconvert=raid1 /mnt
# btrfs balance start -mconvert=single -dconvert=single /mnt -f
# btrfs balance start -mconvert=raid1 -dconvert=raid1 /

We can end up with memory corruption since the kobject hasn't
been reinitialized properly and the name pointer was left set.

The rationale behind allocating them statically was to avoid
creating a separate kobject container that just contained the
raid type. It used the index in the array to determine the index.

Ultimately, though, this wastes more memory than it saves in all
but the most complex scenarios and introduces kobject lifetime
questions.

This patch allocates the kobjects dynamically instead. Note that
we also remove the kobject_get/put of the parent kobject since
kobject_add and kobject_del do that internally.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reported-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agobtrfs: fix lockdep warning with reclaim lock inversion
Jeff Mahoney [Wed, 26 Mar 2014 18:11:26 +0000 (14:11 -0400)]
btrfs: fix lockdep warning with reclaim lock inversion

commit ed55b6ac077fe7f9c6490ff55172c4b563562d7c upstream.

When encountering memory pressure, testers have run into the following
lockdep warning. It was caused by __link_block_group calling kobject_add
with the groups_sem held. kobject_add calls kvasprintf with GFP_KERNEL,
which gets us into reclaim context. The kobject doesn't actually need
to be added under the lock -- it just needs to ensure that it's only
added for the first block group to be linked.

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
3.14.0-rc8-default #1 Not tainted
---------------------------------------------------------
kswapd0/169 just changed the state of lock:
 (&delayed_node->mutex){+.+.-.}, at: [<ffffffffa018baea>] __btrfs_release_delayed_node+0x3a/0x200 [btrfs]
but this lock took another, RECLAIM_FS-unsafe lock in the past:
 (&found->groups_sem){+++++.}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
 Possible interrupt unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&found->groups_sem);
                               local_irq_disable();
                               lock(&delayed_node->mutex);
                               lock(&found->groups_sem);
  <Interrupt>
    lock(&delayed_node->mutex);

 *** DEADLOCK ***
2 locks held by kswapd0/169:
 #0:  (shrinker_rwsem){++++..}, at: [<ffffffff81159e8a>] shrink_slab+0x3a/0x160
 #1:  (&type->s_umount_key#27){++++..}, at: [<ffffffff811bac6f>] grab_super_passive+0x3f/0x90

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agobtrfs: fix use of uninit "ret" in end_extent_writepage()
Eric Sandeen [Thu, 12 Jun 2014 05:39:58 +0000 (00:39 -0500)]
btrfs: fix use of uninit "ret" in end_extent_writepage()

commit 3e2426bd0eb980648449e7a2f5a23e3cd3c7725c upstream.

If this condition in end_extent_writepage() is false:

if (tree->ops && tree->ops->writepage_end_io_hook)

we will then test an uninitialized "ret" at:

ret = ret < 0 ? ret : -EIO;

The test for ret is for the case where ->writepage_end_io_hook
failed, and we'd choose that ret as the error; but if
there is no ->writepage_end_io_hook, nothing sets ret.

Initializing ret to 0 should be sufficient; if
writepage_end_io_hook wasn't set, (!uptodate) means
non-zero err was passed in, so we choose -EIO in that case.

Signed-of-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoBtrfs: fix scrub_print_warning to handle skinny metadata extents
Liu Bo [Mon, 9 Jun 2014 02:54:07 +0000 (10:54 +0800)]
Btrfs: fix scrub_print_warning to handle skinny metadata extents

commit 6eda71d0c030af0fc2f68aaa676e6d445600855b upstream.

The skinny extents are intepreted incorrectly in scrub_print_warning(),
and end up hitting the BUG() in btrfs_extent_inline_ref_size.

Reported-by: Konstantinos Skarlatos <k.skarlatos@gmail.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoBtrfs: use right type to get real comparison
Liu Bo [Sun, 8 Jun 2014 11:04:13 +0000 (19:04 +0800)]
Btrfs: use right type to get real comparison

commit cd857dd6bc2ae9ecea14e75a34e8a8fdc158e307 upstream.

We want to make sure the point is still within the extent item, not to verify
the memory it's pointing to.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoBtrfs: don't check nodes for extent items
Josef Bacik [Thu, 5 Jun 2014 20:08:45 +0000 (16:08 -0400)]
Btrfs: don't check nodes for extent items

commit 8a56457f5f8fa7c2698ffae8545214c5b96a2cb5 upstream.

The backref code was looking at nodes as well as leaves when we tried to
populate extent item entries.  This is not good, and although we go away with it
for the most part because we'd skip where disk_bytenr != random_memory,
sometimes random_memory would match and suddenly boom.  This fixes that problem.
Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agofs: btrfs: volumes.c: Fix for possible null pointer dereference
Rickard Strandqvist [Thu, 22 May 2014 20:43:43 +0000 (22:43 +0200)]
fs: btrfs: volumes.c: Fix for possible null pointer dereference

commit 8321cf2596d283821acc466377c2b85bcd3422b7 upstream.

There is otherwise a risk of a possible null pointer dereference.

Was largely found by using a static code analysis program called cppcheck.

Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoBtrfs: send, don't error in the presence of subvols/snapshots
Filipe Manana [Sun, 25 May 2014 03:49:24 +0000 (04:49 +0100)]
Btrfs: send, don't error in the presence of subvols/snapshots

commit 1af56070e3ef9477dbc7eba3b9ad7446979c7974 upstream.

If we are doing an incremental send and the base snapshot has a
directory with name X that doesn't exist anymore in the second
snapshot and a new subvolume/snapshot exists in the second snapshot
that has the same name as the directory (name X), the incremental
send would fail with -ENOENT error. This is because it attempts
to lookup for an inode with a number matching the objectid of a
root, which doesn't exist.

Steps to reproduce:

    mkfs.btrfs -f /dev/sdd
    mount /dev/sdd /mnt

    mkdir /mnt/testdir
    btrfs subvolume snapshot -r /mnt /mnt/mysnap1

    rmdir /mnt/testdir
    btrfs subvolume create /mnt/testdir
    btrfs subvolume snapshot -r /mnt /mnt/mysnap2

    btrfs send -p /mnt/mysnap1 /mnt/mysnap2 -f /tmp/send.data

A test case for xfstests follows.

Reported-by: Robert White <rwhite@pobox.com>
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoBtrfs: set right total device count for seeding support
Wang Shilong [Tue, 13 May 2014 09:05:06 +0000 (17:05 +0800)]
Btrfs: set right total device count for seeding support

commit 298658414a2f0bea1f05a81876a45c1cd96aa2e0 upstream.

Seeding device support allows us to create a new filesystem
based on existed filesystem.

However newly created filesystem's @total_devices should include seed
devices. This patch fix the following problem:

 # mkfs.btrfs -f /dev/sdb
 # btrfstune -S 1 /dev/sdb
 # mount /dev/sdb /mnt
 # btrfs device add -f /dev/sdc /mnt --->fs_devices->total_devices = 1
 # umount /mnt
 # mount /dev/sdc /mnt               --->fs_devices->total_devices = 2

This is because we record right @total_devices in superblock, but
@fs_devices->total_devices is reset to be 0 in btrfs_prepare_sprout().

Fix this problem by not resetting @fs_devices->total_devices.

Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoBtrfs: mark mapping with error flag to report errors to userspace
Liu Bo [Mon, 12 May 2014 04:47:36 +0000 (12:47 +0800)]
Btrfs: mark mapping with error flag to report errors to userspace

commit 5dca6eea91653e9949ce6eb9e9acab6277e2f2c4 upstream.

According to commit 865ffef3797da2cac85b3354b5b6050dc9660978
(fs: fix fsync() error reporting),
it's not stable to just check error pages because pages can be
truncated or invalidated, we should also mark mapping with error
flag so that a later fsync can catch the error.

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoBtrfs: fix NULL pointer crash of deleting a seed device
Liu Bo [Sun, 11 May 2014 15:14:59 +0000 (23:14 +0800)]
Btrfs: fix NULL pointer crash of deleting a seed device

commit 29cc83f69c8338ff8fd1383c9be263d4bdf52d73 upstream.

Same as normal devices, seed devices should be initialized with
fs_info->dev_root as well, otherwise we'll get a NULL pointer crash.

Cc: Chris Murphy <lists@colorremedies.com>
Reported-by: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoBtrfs: make sure there are not any read requests before stopping workers
Wang Shilong [Wed, 9 Apr 2014 11:23:22 +0000 (19:23 +0800)]
Btrfs: make sure there are not any read requests before stopping workers

commit de348ee022175401e77d7662b7ca6e231a94e3fd upstream.

In close_ctree(), after we have stopped all workers,there maybe still
some read requests(for example readahead) to submit and this *maybe* trigger
an oops that user reported before:

kernel BUG at fs/btrfs/async-thread.c:619!

By hacking codes, i can reproduce this problem with one cpu available.
We fix this potential problem by invalidating all btree inode pages before
stopping all workers.

Thanks to Miao for pointing out this problem.

Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoBtrfs: output warning instead of error when loading free space cache failed
Miao Xie [Thu, 24 Apr 2014 05:31:55 +0000 (13:31 +0800)]
Btrfs: output warning instead of error when loading free space cache failed

commit 32d6b47fe6fc1714d5f1bba1b9f38e0ab0ad58a8 upstream.

If we fail to load a free space cache, we can rebuild it from the extent tree,
so it is not a serious error, we should not output a error message that
would make the users uncomfortable. This patch uses warning message instead
of it.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agobtrfs: Add ctime/mtime update for btrfs device add/remove.
Qu Wenruo [Wed, 16 Apr 2014 09:02:32 +0000 (17:02 +0800)]
btrfs: Add ctime/mtime update for btrfs device add/remove.

commit 5a1972bd9fd4b2fb1bac8b7a0b636d633d8717e3 upstream.

Btrfs will send uevent to udev inform the device change,
but ctime/mtime for the block device inode is not udpated, which cause
libblkid used by btrfs-progs unable to detect device change and use old
cache, causing 'btrfs dev scan; btrfs dev rmove; btrfs dev scan' give an
error message.

Reported-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Cc: Karel Zak <kzak@redhat.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoBtrfs: fix double free in find_lock_delalloc_range
Chris Mason [Wed, 21 May 2014 12:49:54 +0000 (05:49 -0700)]
Btrfs: fix double free in find_lock_delalloc_range

commit 7d78874273463a784759916fc3e0b4e2eb141c70 upstream.

We need to NULL the cached_state after freeing it, otherwise
we might free it again if find_delalloc_range doesn't find anything.

Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoCIFS: Fix memory leaks in SMB2_open
Pavel Shilovsky [Sat, 24 May 2014 12:42:02 +0000 (16:42 +0400)]
CIFS: Fix memory leaks in SMB2_open

commit 663a962151593c69374776e8651238d0da072459 upstream.

Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoaio: fix kernel memory disclosure in io_getevents() introduced in v3.10
Benjamin LaHaise [Tue, 24 Jun 2014 17:32:51 +0000 (13:32 -0400)]
aio: fix kernel memory disclosure in io_getevents() introduced in v3.10

commit edfbbf388f293d70bf4b7c0bc38774d05e6f711a upstream.

A kernel memory disclosure was introduced in aio_read_events_ring() in v3.10
by commit a31ad380bed817aa25f8830ad23e1a0480fef797.  The changes made to
aio_read_events_ring() failed to correctly limit the index into
ctx->ring_pages[], allowing an attacked to cause the subsequent kmap() of
an arbitrary page with a copy_to_user() to copy the contents into userspace.
This vulnerability has been assigned CVE-2014-0206.  Thanks to Mateusz and
Petr for disclosing this issue.

This patch applies to v3.12+.  A separate backport is needed for 3.10/3.11.

Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoaio: fix aio request leak when events are reaped by userspace
Benjamin LaHaise [Tue, 24 Jun 2014 17:12:55 +0000 (13:12 -0400)]
aio: fix aio request leak when events are reaped by userspace

commit f8567a3845ac05bb28f3c1b478ef752762bd39ef upstream.

The aio cleanups and optimizations by kmo that were merged into the 3.10
tree added a regression for userspace event reaping.  Specifically, the
reference counts are not decremented if the event is reaped in userspace,
leading to the application being unable to submit further aio requests.
This patch applies to 3.12+.  A separate backport is required for 3.10/3.11.
This issue was uncovered as part of CVE-2014-0206.

Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agogenirq: Sanitize spurious interrupt detection of threaded irqs
Thomas Gleixner [Thu, 7 Mar 2013 13:53:45 +0000 (14:53 +0100)]
genirq: Sanitize spurious interrupt detection of threaded irqs

commit 1e77d0a1ed7417d2a5a52a7b8d32aea1833faa6c upstream.

Till reported that the spurious interrupt detection of threaded
interrupts is broken in two ways:

- note_interrupt() is called for each action thread of a shared
  interrupt line. That's wrong as we are only interested whether none
  of the device drivers felt responsible for the interrupt, but by
  calling multiple times for a single interrupt line we account
  IRQ_NONE even if one of the drivers felt responsible.

- note_interrupt() when called from the thread handler is not
  serialized. That leaves the members of irq_desc which are used for
  the spurious detection unprotected.

To solve this we need to defer the spurious detection of a threaded
interrupt to the next hardware interrupt context where we have
implicit serialization.

If note_interrupt is called with action_ret == IRQ_WAKE_THREAD, we
check whether the previous interrupt requested a deferred check. If
not, we request a deferred check for the next hardware interrupt and
return.

If set, we check whether one of the interrupt threads signaled
success. Depending on this information we feed the result into the
spurious detector.

If one primary handler of a shared interrupt returns IRQ_HANDLED we
disable the deferred check of irq threads on the same line, as we have
found at least one device driver who cared.

Reported-by: Till Straumann <strauman@slac.stanford.edu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Austin Schuh <austin@peloton-tech.com>
Cc: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: Wolfgang Grandegger <wg@grandegger.com>
Cc: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: linux-can@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1303071450130.22263@ionos
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agox86, x32: Use compat shims for io_{setup,submit}
Mike Frysinger [Mon, 5 May 2014 00:43:15 +0000 (20:43 -0400)]
x86, x32: Use compat shims for io_{setup,submit}

commit 7fd44dacdd803c0bbf38bf478d51d280902bb0f1 upstream.

The io_setup takes a pointer to a context id of type aio_context_t.
This in turn is typed to a __kernel_ulong_t.  We could tweak the
exported headers to define this as a 64bit quantity for specific
ABIs, but since we already have a 32bit compat shim for the x86 ABI,
let's just re-use that logic.  The libaio package is also written to
expect this as a pointer type, so a compat shim would simplify that.

The io_submit func operates on an array of pointers to iocb structs.
Padding out the array to be 64bit aligned is a huge pain, so convert
it over to the existing compat shim too.

We don't convert io_getevents to the compat func as its only purpose
is to handle the timespec struct, and the x32 ABI uses 64bit times.

With this change, the libaio package can now pass its testsuite when
built for the x32 ABI.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Link: http://lkml.kernel.org/r/1399250595-5005-1-git-send-email-vapier@gentoo.org
Cc: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agox86-32, espfix: Remove filter for espfix32 due to race
H. Peter Anvin [Wed, 30 Apr 2014 21:03:25 +0000 (14:03 -0700)]
x86-32, espfix: Remove filter for espfix32 due to race

commit 246f2d2ee1d715e1077fc47d61c394569c8ee692 upstream.

It is not safe to use LAR to filter when to go down the espfix path,
because the LDT is per-process (rather than per-thread) and another
thread might change the descriptors behind our back.  Fortunately it
is always *safe* (if a bit slow) to go down the espfix path, and a
32-bit LDT stack segment is extremely rare.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoarm64/dma: Removing ARCH_HAS_DMA_GET_REQUIRED_MASK macro
Suravee Suthikulpanit [Fri, 6 Jun 2014 22:07:16 +0000 (23:07 +0100)]
arm64/dma: Removing ARCH_HAS_DMA_GET_REQUIRED_MASK macro

commit f3a183cb422574014538017b5b291a416396f97e upstream.

Arm64 does not define dma_get_required_mask() function.
Therefore, it should not define the ARCH_HAS_DMA_GET_REQUIRED_MASK.
This causes build errors in some device drivers (e.g. mpt2sas)

Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoARM: mvebu: DT: fix OpenBlocks AX3-4 RAM size
Jason Cooper [Wed, 4 Jun 2014 13:41:20 +0000 (13:41 +0000)]
ARM: mvebu: DT: fix OpenBlocks AX3-4 RAM size

commit e47043aea3853a74a9aa5726a1faa916d7462ab7 upstream.

The OpenBlocks AX3-4 has a non-DT bootloader.  It also comes with 1GB of
soldered on RAM, and a DIMM slot for expansion.

Unfortunately, atags_to_fdt() doesn't work in big-endian mode, so we see
the following failure when attempting to boot a big-endian kernel:

  686 slab pages
  17 pages shared
  0 pages swap cached
  [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
  Kernel panic - not syncing: Out of memory and no killable processes...

  CPU: 1 PID: 351 Comm: kworker/u4:0 Not tainted 3.15.0-rc8-next-20140603 #1
  [<c0215a54>] (unwind_backtrace) from [<c021160c>] (show_stack+0x10/0x14)
  [<c021160c>] (show_stack) from [<c0802500>] (dump_stack+0x78/0x94)
  [<c0802500>] (dump_stack) from [<c0800068>] (panic+0x90/0x21c)
  [<c0800068>] (panic) from [<c02b5704>] (out_of_memory+0x320/0x340)
  [<c02b5704>] (out_of_memory) from [<c02b93a0>] (__alloc_pages_nodemask+0x874/0x930)
  [<c02b93a0>] (__alloc_pages_nodemask) from [<c02d446c>] (handle_mm_fault+0x744/0x96c)
  [<c02d446c>] (handle_mm_fault) from [<c02cf250>] (__get_user_pages+0xd0/0x4c0)
  [<c02cf250>] (__get_user_pages) from [<c02f3598>] (get_arg_page+0x54/0xbc)
  [<c02f3598>] (get_arg_page) from [<c02f3878>] (copy_strings+0x278/0x29c)
  [<c02f3878>] (copy_strings) from [<c02f38bc>] (copy_strings_kernel+0x20/0x28)
  [<c02f38bc>] (copy_strings_kernel) from [<c02f4f1c>] (do_execve+0x3a8/0x4c8)
  [<c02f4f1c>] (do_execve) from [<c025ac10>] (____call_usermodehelper+0x15c/0x194)
  [<c025ac10>] (____call_usermodehelper) from [<c020e9b8>] (ret_from_fork+0x14/0x3c)
  CPU0: stopping
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.15.0-rc8-next-20140603 #1
  [<c0215a54>] (unwind_backtrace) from [<c021160c>] (show_stack+0x10/0x14)
  [<c021160c>] (show_stack) from [<c0802500>] (dump_stack+0x78/0x94)
  [<c0802500>] (dump_stack) from [<c021429c>] (handle_IPI+0x138/0x174)
  [<c021429c>] (handle_IPI) from [<c02087f0>] (armada_370_xp_handle_irq+0xb0/0xcc)
  [<c02087f0>] (armada_370_xp_handle_irq) from [<c0212100>] (__irq_svc+0x40/0x50)
  Exception stack(0xc0b6bf68 to 0xc0b6bfb0)
  bf60:                   e9fad598 00000000 00f509a3 00000000 c0b6a000 c0b724c4
  bf80: c0b72458 c0b6a000 00000000 00000000 c0b66da0 c0b6a000 00000000 c0b6bfb0
  bfa0: c027bb94 c027bb24 60000313 ffffffff
  [<c0212100>] (__irq_svc) from [<c027bb24>] (cpu_startup_entry+0x54/0x214)
  [<c027bb24>] (cpu_startup_entry) from [<c0ac5b30>] (start_kernel+0x318/0x37c)
  [<c0ac5b30>] (start_kernel) from [<00208078>] (0x208078)
  ---[ end Kernel panic - not syncing: Out of memory and no killable processes...

A similar failure will also occur if ARM_ATAG_DTB_COMPAT isn't selected.

Fix this by setting a sane default (1 GB) in the dts file.

Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoSCSI: Fix spurious request sense in error handling
James Bottomley [Fri, 28 Mar 2014 17:50:17 +0000 (10:50 -0700)]
SCSI: Fix spurious request sense in error handling

commit d555a2abf3481f81303d835046a5ec2c4fb3ca8e upstream.

We unconditionally execute scsi_eh_get_sense() to make sure all failed
commands that should have sense attached, do.  However, the routine forgets
that some commands, because of the way they fail, will not have any sense code
... we should not bother them with a REQUEST_SENSE command.  Fix this by
testing to see if we actually got a CHECK_CONDITION return and skip asking for
sense if we don't.

Tested-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agotarget: Explicitly clear ramdisk_mcp backend pages
Nicholas A. Bellinger [Mon, 16 Jun 2014 20:59:52 +0000 (20:59 +0000)]
target: Explicitly clear ramdisk_mcp backend pages

[Note that a different patch to address the same issue went in during
v3.15-rc1 (commit 4442dc8a), but includes a bunch of other changes that
don't strictly apply to fixing the bug]

This patch changes rd_allocate_sgl_table() to explicitly clear
ramdisk_mcp backend memory pages by passing __GFP_ZERO into
alloc_pages().

This addresses a potential security issue where reading from a
ramdisk_mcp could return sensitive information, and follows what
>= v3.15 does to explicitly clear ramdisk_mcp memory at backend
device initialization time.

Reported-by: Jorge Daniel Sequeira Matias <jdsm@tecnico.ulisboa.pt>
Cc: Jorge Daniel Sequeira Matias <jdsm@tecnico.ulisboa.pt>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agotarget: Report correct response length for some commands
Roland Dreier [Tue, 10 Jun 2014 18:07:47 +0000 (11:07 -0700)]
target: Report correct response length for some commands

commit 2426bd456a61407388b6e61fc5f98dbcbebc50e2 upstream.

When an initiator sends an allocation length bigger than what its
command consumes, the target should only return the actual response data
and set the residual length to the unused part of the allocation length.

Add a helper function that command handlers (INQUIRY, READ CAPACITY,
etc) can use to do this correctly, and use this code to get the correct
residual for commands that don't use the full initiator allocation in the
handlers for READ CAPACITY, READ CAPACITY(16), INQUIRY, MODE SENSE and
REPORT LUNS.

This addresses a handful of failures as reported by Christophe with
the Windows Certification Kit:

  http://permalink.gmane.org/gmane.linux.scsi.target.devel/6515

Signed-off-by: Roland Dreier <roland@purestorage.com>
Tested-by: Christophe Vu-Brugier <cvubrugier@yahoo.fr>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoTarget/iscsi: Fix sendtargets response pdu for iser transport
Sagi Grimberg [Tue, 10 Jun 2014 15:27:59 +0000 (18:27 +0300)]
Target/iscsi: Fix sendtargets response pdu for iser transport

commit 22c7aaa57e80853b4904a46c18f97db0036a3b97 upstream.

In case the transport is iser we should not include the
iscsi target info in the sendtargets text response pdu.
This causes sendtargets response to include the target
info twice.

Modify iscsit_build_sendtargets_response to filter
transport types that don't match.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reported-by: Slava Shwartsman <valyushash@gmail.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoiscsi-target: Fix ABORT_TASK + connection reset iscsi_queue_req memory leak
Nicholas Bellinger [Tue, 10 Jun 2014 04:03:54 +0000 (04:03 +0000)]
iscsi-target: Fix ABORT_TASK + connection reset iscsi_queue_req memory leak

commit bbc050488525e1ab1194c27355f63c66814385b8 upstream.

This patch fixes a iscsi_queue_req memory leak when ABORT_TASK response
has been queued by TFO->queue_tm_rsp() -> lio_queue_tm_rsp() after a
long standing I/O completes, but the connection has already reset and
waiting for cleanup to complete in iscsit_release_commands_from_conn()
-> transport_generic_free_cmd() -> transport_wait_for_tasks() code.

It moves iscsit_free_queue_reqs_for_conn() after the per-connection command
list has been released, so that the associated se_cmd tag can be completed +
released by target-core before freeing any remaining iscsi_queue_req memory
for the connection generated by lio_queue_tm_rsp().

Cc: Thomas Glanzmann <thomas@glanzmann.de>
Cc: Charalampos Pournaris <charpour@gmail.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agotarget: Use complete_all for se_cmd->t_transport_stop_comp
Nicholas Bellinger [Mon, 9 Jun 2014 23:36:51 +0000 (23:36 +0000)]
target: Use complete_all for se_cmd->t_transport_stop_comp

commit a95d6511303b848da45ee27b35018bb58087bdc6 upstream.

This patch fixes a bug where multiple waiters on ->t_transport_stop_comp
occurs due to a concurrent ABORT_TASK and session reset both invoking
transport_wait_for_tasks(), while waiting for the associated se_cmd
descriptor backend processing to complete.

For this case, complete_all() should be invoked in order to wake up
both waiters in core_tmr_abort_task() + transport_generic_free_cmd()
process contexts.

Cc: Thomas Glanzmann <thomas@glanzmann.de>
Cc: Charalampos Pournaris <charpour@gmail.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agotarget: Set CMD_T_ACTIVE bit for Task Management Requests
Nicholas Bellinger [Mon, 9 Jun 2014 23:13:20 +0000 (23:13 +0000)]
target: Set CMD_T_ACTIVE bit for Task Management Requests

commit f15e9cd910c4d9da7de43f2181f362082fc45f0f upstream.

This patch fixes a bug where se_cmd descriptors associated with a
Task Management Request (TMR) where not setting CMD_T_ACTIVE before
being dispatched into target_tmr_work() process context.

This is required in order for transport_generic_free_cmd() ->
transport_wait_for_tasks() to wait on se_cmd->t_transport_stop_comp
if a session reset event occurs while an ABORT_TASK is outstanding
waiting for another I/O to complete.

Cc: Thomas Glanzmann <thomas@glanzmann.de>
Cc: Charalampos Pournaris <charpour@gmail.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoTarget/iser: Wait for proper cleanup before unloading
Sagi Grimberg [Mon, 19 May 2014 14:44:25 +0000 (17:44 +0300)]
Target/iser: Wait for proper cleanup before unloading

commit f5ebec9629cf78eeeea4b8258882a9f439ab2404 upstream.

disconnected_handler works are scheduled on system_wq.
When attempting to unload, first make sure all works
have cleaned up.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoTarget/iser: Improve cm events handling
Sagi Grimberg [Mon, 19 May 2014 14:44:24 +0000 (17:44 +0300)]
Target/iser: Improve cm events handling

commit 88c4015fda6d014392f76d3b1688347950d7a12d upstream.

There are 4 RDMA_CM events that all basically mean that
the user should teardown the IB connection:
- DISCONNECTED
- ADDR_CHANGE
- DEVICE_REMOVAL
- TIMEWAIT_EXIT

Only in DISCONNECTED/ADDR_CHANGE it makes sense to
call rdma_disconnect (send DREQ/DREP to our initiator).
So we keep the same teardown handler for all of them
but only indicate calling rdma_disconnect for the relevant
events.

This patch also removes redundant debug prints for each single
event.

v2 changes:
 - Call isert_disconnected_handler() for DEVICE_REMOVAL (Or + Sag)

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoTarget/iser: Fix hangs in connection teardown
Sagi Grimberg [Mon, 19 May 2014 14:44:23 +0000 (17:44 +0300)]
Target/iser: Fix hangs in connection teardown

commit 9d49f5e284e700576f3b65f1e28dea8539da6661 upstream.

In ungraceful teardowns isert close flows seem racy such that
isert_wait_conn hangs as RDMA_CM_EVENT_DISCONNECTED never
gets invoked (no one called rdma_disconnect).

Both graceful and ungraceful teardowns will have rx flush errors
(isert posts a batch once connection is established). Once all
flush errors are consumed we invoke isert_wait_conn and it will
be responsible for calling rdma_disconnect. This way it can be
sure that rdma_disconnect was called and it won't wait forever.

This patch also removes the logout_posted indicator. either the
logout completion was consumed and no problem decrementing the
post_send_buf_count, or it was consumed as a flush error. no point
of keeping it for isert_wait_conn as there is no danger that
isert_conn will be accidentally removed while it is running.

(Drop unnecessary sleep_on_conn_wait_comp check in
 isert_cq_rx_comp_err - nab)

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoTarget/iser: Bail from accept_np if np_thread is trying to close
Sagi Grimberg [Mon, 19 May 2014 14:44:22 +0000 (17:44 +0300)]
Target/iser: Bail from accept_np if np_thread is trying to close

commit e346ab343f4f58c12a96725c7b13df9cc2ad56f6 upstream.

In case np_thread state is in RESET/SHUTDOWN/EXIT states,
no point for isert to stall there as we may get a hang in
case no one will wake it up later.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9 years agoBluetooth: Fix L2CAP deadlock
Jukka Taimisto [Thu, 22 May 2014 10:02:39 +0000 (10:02 +0000)]
Bluetooth: Fix L2CAP deadlock

commit 8a96f3cd22878fc0bb564a8478a6e17c0b8dca73 upstream.

-[0x01 Introduction

We have found a programming error causing a deadlock in Bluetooth subsystem
of Linux kernel. The problem is caused by missing release_sock() call when
L2CAP connection creation fails due full accept queue.

The issue can be reproduced with 3.15-rc5 kernel and is also present in
earlier kernels.

-[0x02 Details

The problem occurs when multiple L2CAP connections are created to a PSM which
contains listening socket (like SDP) and left pending, for example,
configuration (the underlying ACL link is not disconnected between
connections).

When L2CAP connection request is received and listening socket is found the
l2cap_sock_new_connection_cb() function (net/bluetooth/l2cap_sock.c) is called.
This function locks the 'parent' socket and then checks if the accept queue
is full.

1178         lock_sock(parent);
1179
1180         /* Check for backlog size */
1181         if (sk_acceptq_is_full(parent)) {
1182                 BT_DBG("backlog full %d", parent->sk_ack_backlog);
1183                 return NULL;
1184         }

If case the accept queue is full NULL is returned, but the 'parent' socket
is not released. Thus when next L2CAP connection request is received the code
blocks on lock_sock() since the parent is still locked.

Also note that for connections already established and waiting for
configuration to complete a timeout will occur and l2cap_chan_timeout()
(net/bluetooth/l2cap_core.c) will be called. All threads calling this
function will also be blocked waiting for the channel mutex since the thread
which is waiting on lock_sock() alread holds the channel mutex.

We were able to reproduce this by sending continuously L2CAP connection
request followed by disconnection request containing invalid CID. This left
the created connections pending configuration.

After the deadlock occurs it is impossible to kill bluetoothd, btmon will not
get any more data etc. requiring reboot to recover.

-[0x03 Fix

Releasing the 'parent' socket when l2cap_sock_new_connection_cb() returns NULL
seems to fix the issue.

Signed-off-by: Jukka Taimisto <jtt@codenomicon.com>
Reported-by: Tommi Mäkilä <tmakila@codenomicon.com>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>