]> git.itanic.dy.fi Git - linux-stable/log
linux-stable
10 years agoLinux 3.13.9 v3.13.9
Greg Kroah-Hartman [Thu, 3 Apr 2014 19:02:51 +0000 (12:02 -0700)]
Linux 3.13.9

10 years agonetfilter: nf_conntrack_dccp: fix skb_header_pointer API usages
Daniel Borkmann [Sun, 5 Jan 2014 23:57:54 +0000 (00:57 +0100)]
netfilter: nf_conntrack_dccp: fix skb_header_pointer API usages

commit b22f5126a24b3b2f15448c3f2a254fc10cbc2b92 upstream.

Some occurences in the netfilter tree use skb_header_pointer() in
the following way ...

  struct dccp_hdr _dh, *dh;
  ...
  skb_header_pointer(skb, dataoff, sizeof(_dh), &dh);

... where dh itself is a pointer that is being passed as the copy
buffer. Instead, we need to use &_dh as the forth argument so that
we're copying the data into an actual buffer that sits on the stack.

Currently, we probably could overwrite memory on the stack (e.g.
with a possibly mal-formed DCCP packet), but unintentionally, as
we only want the buffer to be placed into _dh variable.

Fixes: 2bc780499aa3 ("[NETFILTER]: nf_conntrack: add DCCP protocol support")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agocgroup: protect modifications to cgroup_idr with cgroup_mutex
Li Zefan [Tue, 11 Feb 2014 08:05:46 +0000 (16:05 +0800)]
cgroup: protect modifications to cgroup_idr with cgroup_mutex

commit 0ab02ca8f887908152d1a96db5130fc661d36a1e upstream.

Setup cgroupfs like this:
  # mount -t cgroup -o cpuacct xxx /cgroup
  # mkdir /cgroup/sub1
  # mkdir /cgroup/sub2

Then run these two commands:
  # for ((; ;)) { mkdir /cgroup/sub1/tmp && rmdir /mnt/sub1/tmp; } &
  # for ((; ;)) { mkdir /cgroup/sub2/tmp && rmdir /mnt/sub2/tmp; } &

After seconds you may see this warning:

------------[ cut here ]------------
WARNING: CPU: 1 PID: 25243 at lib/idr.c:527 sub_remove+0x87/0x1b0()
idr_remove called for id=6 which is not allocated.
...
Call Trace:
 [<ffffffff8156063c>] dump_stack+0x7a/0x96
 [<ffffffff810591ac>] warn_slowpath_common+0x8c/0xc0
 [<ffffffff81059296>] warn_slowpath_fmt+0x46/0x50
 [<ffffffff81300aa7>] sub_remove+0x87/0x1b0
 [<ffffffff810f3f02>] ? css_killed_work_fn+0x32/0x1b0
 [<ffffffff81300bf5>] idr_remove+0x25/0xd0
 [<ffffffff810f2bab>] cgroup_destroy_css_killed+0x5b/0xc0
 [<ffffffff810f4000>] css_killed_work_fn+0x130/0x1b0
 [<ffffffff8107cdbc>] process_one_work+0x26c/0x550
 [<ffffffff8107eefe>] worker_thread+0x12e/0x3b0
 [<ffffffff81085f96>] kthread+0xe6/0xf0
 [<ffffffff81570bac>] ret_from_fork+0x7c/0xb0
---[ end trace 2d1577ec10cf80d0 ]---

It's because allocating/removing cgroup ID is not properly synchronized.

The bug was introduced when we converted cgroup_ida to cgroup_idr.
While synchronization is already done inside ida_simple_{get,remove}(),
users are responsible for concurrent calls to idr_{alloc,remove}().

tj: Refreshed on top of b58c89986a77 ("cgroup: fix error return from
cgroup_create()").

[mhocko@suse.cz: ported to 3.12]
Fixes: 4e96ee8e981b ("cgroup: convert cgroup_ida to cgroup_idr")
Cc: <stable@vger.kernel.org> #3.12+
Reported-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agomm: close PageTail race
David Rientjes [Mon, 3 Mar 2014 23:38:18 +0000 (15:38 -0800)]
mm: close PageTail race

commit 668f9abbd4334e6c29fa8acd71635c4f9101caa7 upstream.

Commit bf6bddf1924e ("mm: introduce compaction and migration for
ballooned pages") introduces page_count(page) into memory compaction
which dereferences page->first_page if PageTail(page).

This results in a very rare NULL pointer dereference on the
aforementioned page_count(page).  Indeed, anything that does
compound_head(), including page_count() is susceptible to racing with
prep_compound_page() and seeing a NULL or dangling page->first_page
pointer.

This patch uses Andrea's implementation of compound_trans_head() that
deals with such a race and makes it the default compound_head()
implementation.  This includes a read memory barrier that ensures that
if PageTail(head) is true that we return a head page that is neither
NULL nor dangling.  The patch then adds a store memory barrier to
prep_compound_page() to ensure page->first_page is set.

This is the safest way to ensure we see the head page that we are
expecting, PageTail(page) is already in the unlikely() path and the
memory barriers are unfortunately required.

Hugetlbfs is the exception, we don't enforce a store memory barrier
during init since no race is possible.

Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Holger Kiehl <Holger.Kiehl@dwd.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoswitch mnt_hash to hlist
Al Viro [Fri, 21 Mar 2014 01:10:51 +0000 (21:10 -0400)]
switch mnt_hash to hlist

commit 38129a13e6e71f666e0468e99fdd932a687b4d7e upstream.

fixes RCU bug - walking through hlist is safe in face of element moves,
since it's self-terminating.  Cyclic lists are not - if we end up jumping
to another hash chain, we'll loop infinitely without ever hitting the
original list head.

[fix for dumb braino folded]

Spotted by: Max Kellermann <mk@cm4all.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodon't bother with propagate_mnt() unless the target is shared
Al Viro [Fri, 21 Mar 2014 14:14:08 +0000 (10:14 -0400)]
don't bother with propagate_mnt() unless the target is shared

commit 0b1b901b5a98bb36943d10820efc796f7cd45ff3 upstream.

If the dest_mnt is not shared, propagate_mnt() does nothing -
there's no mounts to propagate to and thus no copies to create.
Might as well don't bother calling it in that case.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agokeep shadowed vfsmounts together
Al Viro [Fri, 21 Mar 2014 00:34:43 +0000 (20:34 -0400)]
keep shadowed vfsmounts together

commit 1d6a32acd70ab18499829c0a9a5dbe2bace72a13 upstream.

preparation to switching mnt_hash to hlist

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoresizable namespace.c hashes
Al Viro [Fri, 28 Feb 2014 18:46:44 +0000 (13:46 -0500)]
resizable namespace.c hashes

commit 0818bf27c05b2de56c5b2bd08cfae2a939bd5f52 upstream.

* switch allocation to alloc_large_system_hash()
* make sizes overridable by boot parameters (mhash_entries=, mphash_entries=)
* switch mountpoint_hashtable from list_head to hlist_head

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agorandom32: avoid attempt to late reseed if in the middle of seeding
Sasha Levin [Fri, 28 Mar 2014 16:38:42 +0000 (17:38 +0100)]
random32: avoid attempt to late reseed if in the middle of seeding

commit 05efa8c943b1d5d90fa8c8147571837573338bb6 upstream.

Commit 4af712e8df ("random32: add prandom_reseed_late() and call when
nonblocking pool becomes initialized") has added a late reseed stage
that happens as soon as the nonblocking pool is marked as initialized.

This fails in the case that the nonblocking pool gets initialized
during __prandom_reseed()'s call to get_random_bytes(). In that case
we'd double back into __prandom_reseed() in an attempt to do a late
reseed - deadlocking on 'lock' early on in the boot process.

Instead, just avoid even waiting to do a reseed if a reseed is already
occuring.

Fixes: 4af712e8df99 ("random32: add prandom_reseed_late() and call when nonblocking pool becomes initialized")
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agonet: mvneta: fix usage as a module on RGMII configurations
Thomas Petazzoni [Tue, 25 Mar 2014 23:25:42 +0000 (00:25 +0100)]
net: mvneta: fix usage as a module on RGMII configurations

commit e3a8786c10e75903f1269474e21fe8cb49c3a670 upstream.

Commit 5445eaf309ff ('mvneta: Try to fix mvneta when compiled as
module') fixed the mvneta driver to make it work properly when loaded
as a module in SGMII configuration, which was tested successful by the
author on the Armada XP OpenBlocks AX3, which uses SGMII.

However, it turns out that the Armada XP GP, which uses RGMII, is
affected by a similar problem: its SERDES configuration is lost when
mvneta is loaded as a module, because this configuration is set by the
bootloader, and then lost because the clock is gated by the clock
framework until the mvneta driver is loaded again and the clock is
re-enabled.

However, it turns out that for the RGMII case, setting the SERDES
configuration is not sufficient: the PCS enable bit in the
MVNETA_GMAC_CTRL_2 register must also be set, like in the SGMII
configuration.

Therefore, this commit reworks the SGMII/RGMII initialization: the
only difference between the two now is a different SERDES
configuration, all the rest is identical.

In detail, to achieve this, the commit:

 * Renames MVNETA_SGMII_SERDES_CFG to MVNETA_SERDES_CFG because it is
   not specific to SGMII, but also used on RGMII configurations.

 * Adds a MVNETA_RGMII_SERDES_PROTO definition, that must be used as
   the MVNETA_SERDES_CFG value in RGMII configurations.

 * Removes the mvneta_gmac_rgmii_set() and mvneta_port_sgmii_config()
   functions, and instead directly do the SGMII/RGMII configuration in
   mvneta_port_up(), from where those functions where called. It is
   worth mentioning that mvneta_gmac_rgmii_set() had an 'enable'
   parameter that was always passed as '1', so it was pretty useless.

 * Reworks the mvneta_port_up() function to set the MVNETA_SERDES_CFG
   register to the appropriate value depending on the RGMII vs. SGMII
   configuration. It also unconditionally set the PCS_ENABLE bit (was
   already done for SGMII, but is now also needed for RGMII), and sets
   the PORT_RGMII bit (which was already done for both SGMII and
   RGMII).

This commit was successfully tested with mvneta compiled as a module,
on both the OpenBlocks AX3 (SGMII configuration) and the Armada XP GP
(RGMII configuration).

Reported-by: Steve McIntyre <steve@einval.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agonet: mvneta: rename MVNETA_GMAC2_PSC_ENABLE to MVNETA_GMAC2_PCS_ENABLE
Thomas Petazzoni [Tue, 25 Mar 2014 23:25:41 +0000 (00:25 +0100)]
net: mvneta: rename MVNETA_GMAC2_PSC_ENABLE to MVNETA_GMAC2_PCS_ENABLE

commit a79121d3b57e7ad61f0b5d23eae05214054f3ccd upstream.

Bit 3 of the MVNETA_GMAC_CTRL_2 is actually used to enable the PCS,
not the PSC: there was a typo in the name of the define, which this
commit fixes.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agomake prepend_name() work correctly when called with negative *buflen
Al Viro [Sun, 23 Mar 2014 04:28:40 +0000 (00:28 -0400)]
make prepend_name() work correctly when called with negative *buflen

commit e825196d48d2b89a6ec3a8eff280098d2a78207e upstream.

In all callchains leading to prepend_name(), the value left in *buflen
is eventually discarded unused if prepend_name() has returned a negative.
So we are free to do what prepend() does, and subtract from *buflen
*before* checking for underflow (which turns into checking the sign
of subtraction result, of course).

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agox86: fix boot on uniprocessor systems
Artem Fetishev [Fri, 28 Mar 2014 20:33:39 +0000 (13:33 -0700)]
x86: fix boot on uniprocessor systems

commit 825600c0f20e595daaa7a6dd8970f84fa2a2ee57 upstream.

On x86 uniprocessor systems topology_physical_package_id() returns -1
which causes rapl_cpu_prepare() to leave rapl_pmu variable uninitialized
which leads to GPF in rapl_pmu_init().

See arch/x86/kernel/cpu/perf_event_intel_rapl.c.

It turns out that physical_package_id and core_id can actually be
retreived for uniprocessor systems too.  Enabling them also fixes
rapl_pmu code.

Signed-off-by: Artem Fetishev <artem_fetishev@epam.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/i915: Undo gtt scratch pte unmapping again
Daniel Vetter [Wed, 26 Mar 2014 19:10:09 +0000 (20:10 +0100)]
drm/i915: Undo gtt scratch pte unmapping again

commit 8ee661b505613ef2747b350ca2871a31b3781bee upstream.

It apparently blows up on some machines. This functionally reverts

commit 828c79087cec61eaf4c76bb32c222fbe35ac3930
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Wed Oct 16 09:21:30 2013 -0700

    drm/i915: Disable GGTT PTEs on GEN6+ suspend

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64841
Reported-and-Tested-by: Brad Jackson <bjackson0971@gmail.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Todd Previte <tprevite@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoi2c: cpm: Fix build by adding of_address.h and of_irq.h
Scott Wood [Tue, 18 Mar 2014 21:10:24 +0000 (16:10 -0500)]
i2c: cpm: Fix build by adding of_address.h and of_irq.h

commit 5f12c5eca6e6b7aeb4b2028d579f614b4fe7a81f upstream.

Fixes a build break due to the undeclared use of irq_of_parse_and_map()
and of_iomap().  This build break was apparently introduced while the
driver was unbuildable due to the bug fixed by
62c19c9d29e65086e5ae76df371ed2e6b23f00cd ("i2c: Remove usage of
orphaned symbol OF_I2C").  When 62c19c was added in v3.14-rc7,
the driver was enabled again, breaking the powerpc mpc85xx_defconfig
and mpc85xx_smp_defconfig.

62c19c is marked for stable, so this should go there as well.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoRevert "xen: properly account for _PAGE_NUMA during xen pte translations"
David Vrabel [Tue, 25 Mar 2014 10:38:37 +0000 (10:38 +0000)]
Revert "xen: properly account for _PAGE_NUMA during xen pte translations"

commit 5926f87fdaad4be3ed10cec563bf357915e55a86 upstream.

This reverts commit a9c8e4beeeb64c22b84c803747487857fe424b68.

PTEs in Xen PV guests must contain machine addresses if _PAGE_PRESENT
is set and pseudo-physical addresses is _PAGE_PRESENT is clear.

This is because during a domain save/restore (migration) the page
table entries are "canonicalised" and uncanonicalised". i.e., MFNs are
converted to PFNs during domain save so that on a restore the page
table entries may be rewritten with the new MFNs on the destination.
This canonicalisation is only done for PTEs that are present.

This change resulted in writing PTEs with MFNs if _PAGE_PROTNONE (or
_PAGE_NUMA) was set but _PAGE_PRESENT was clear.  These PTEs would be
migrated as-is which would result in unexpected behaviour in the
destination domain.  Either a) the MFN would be translated to the
wrong PFN/page; b) setting the _PAGE_PRESENT bit would clear the PTE
because the MFN is no longer owned by the domain; or c) the present
bit would not get set.

Symptoms include "Bad page" reports when munmapping after migrating a
domain.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoxen/balloon: flush persistent kmaps in correct position
Wei Liu [Sat, 15 Mar 2014 16:11:47 +0000 (16:11 +0000)]
xen/balloon: flush persistent kmaps in correct position

commit 09ed3d5ba06137913960f9c9385f71fc384193ab upstream.

Xen balloon driver will update ballooned out pages' P2M entries to point
to scratch page for PV guests. In 24f69373e2 ("xen/balloon: don't alloc
page while non-preemptible", kmap_flush_unused was moved after updating
P2M table. In that case for 32 bit PV guest we might end up with

  P2M    X -----> S  (S is mfn of balloon scratch page)
  M2P    Y -----> X  (Y is mfn in persistent kmap entry)

kmap_flush_unused() iterates through all the PTEs in the kmap address
space, using pte_to_page() to obtain the page. If the p2m and the m2p
are inconsistent the incorrect page is returned.  This will clear
page->address on the wrong page which may cause subsequent oopses if
that page is currently kmap'ed.

Move the flush back between get_page and __set_phys_to_machine to fix
this.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoInput: cypress_ps2 - don't report as a button pads
Hans de Goede [Wed, 26 Mar 2014 20:30:52 +0000 (13:30 -0700)]
Input: cypress_ps2 - don't report as a button pads

commit 6797b39e6f6f34c74177736e146406e894b9482b upstream.

The cypress PS/2 trackpad models supported by the cypress_ps2 driver
emulate BTN_RIGHT events in firmware based on the finger position, as part
of this no motion events are sent when the finger is in the button area.

The INPUT_PROP_BUTTONPAD property is there to indicate to userspace that
BTN_RIGHT events should be emulated in userspace, which is not necessary
in this case.

When INPUT_PROP_BUTTONPAD is advertised userspace will wait for a motion
event before propagating the button event higher up the stack, as it needs
current abs x + y data for its BTN_RIGHT emulation. Since in the
cypress_ps2 pads don't report motion events in the button area, this means
that clicks in the button area end up being ignored, so
INPUT_PROP_BUTTONPAD actually causes problems for these touchpads, and
removing it fixes:

https://bugs.freedesktop.org/show_bug.cgi?id=76341

Reported-by: Adam Williamson <awilliam@redhat.com>
Tested-by: Adam Williamson <awilliam@redhat.com>
Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoInput: synaptics - add manual min/max quirk for ThinkPad X240
Hans de Goede [Fri, 28 Mar 2014 08:01:38 +0000 (01:01 -0700)]
Input: synaptics - add manual min/max quirk for ThinkPad X240

commit 8a0435d958fb36d93b8df610124a0e91e5675c82 upstream.

This extends Benjamin Tissoires manual min/max quirk table with support for
the ThinkPad X240.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoInput: synaptics - add manual min/max quirk
Benjamin Tissoires [Fri, 28 Mar 2014 07:43:00 +0000 (00:43 -0700)]
Input: synaptics - add manual min/max quirk

commit 421e08c41fda1f0c2ff6af81a67b491389b653a5 upstream.

The new Lenovo Haswell series (-40's) contains a new Synaptics touchpad.
However, these new Synaptics devices report bad axis ranges.
Under Windows, it is not a problem because the Windows driver uses RMI4
over SMBus to talk to the device. Under Linux, we are using the PS/2
fallback interface and it occurs the reported ranges are wrong.

Of course, it would be too easy to have only one range for the whole
series, each touchpad seems to be calibrated in a different way.

We can not use SMBus to get the actual range because I suspect the firmware
will switch into the SMBus mode and stop talking through PS/2 (this is the
case for hybrid HID over I2C / PS/2 Synaptics touchpads).

So as a temporary solution (until RMI4 land into upstream), start a new
list of quirks with the min/max manually set.

Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoInput: mousedev - fix race when creating mixed device
Dmitry Torokhov [Thu, 6 Mar 2014 20:57:24 +0000 (12:57 -0800)]
Input: mousedev - fix race when creating mixed device

commit e4dbedc7eac7da9db363a36f2bd4366962eeefcc upstream.

We should not be using static variable mousedev_mix in methods that can be
called before that singleton gets assigned. While at it let's add open and
close methods to mousedev structure so that we do not need to test if we
are dealing with multiplexor or normal device and simply call appropriate
method directly.

This fixes: https://bugzilla.kernel.org/show_bug.cgi?id=71551

Reported-by: GiulioDP <depasquale.giulio@gmail.com>
Tested-by: GiulioDP <depasquale.giulio@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agorcuwalk: recheck mount_lock after mountpoint crossing attempts
Al Viro [Thu, 20 Mar 2014 19:18:22 +0000 (15:18 -0400)]
rcuwalk: recheck mount_lock after mountpoint crossing attempts

commit b37199e626b31e1175fb06764c5d1d687723aac2 upstream.

We can get false negative from __lookup_mnt() if an unrelated vfsmount
gets moved.  In that case legitimize_mnt() is guaranteed to fail,
and we will fall back to non-RCU walk... unless we end up running
into a hard error on a filesystem object we wouldn't have reached
if not for that false negative.  IOW, delaying that check until
the end of pathname resolution is wrong - we should recheck right
after we attempt to cross the mountpoint.  We don't need to recheck
unless we see d_mountpoint() being true - in that case even if
we have just raced with mount/umount, we can simply go on as if
we'd come at the moment when the sucker wasn't a mountpoint; if we
run into a hard error as the result, it was a legitimate outcome.
__lookup_mnt() returning NULL is different in that respect, since
it might've happened due to operation on completely unrelated
mountpoint.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoext4: atomically set inode->i_flags in ext4_set_inode_flags()
Theodore Ts'o [Sun, 30 Mar 2014 14:20:01 +0000 (10:20 -0400)]
ext4: atomically set inode->i_flags in ext4_set_inode_flags()

commit 00a1a053ebe5febcfc2ec498bd894f035ad2aa06 upstream.

Use cmpxchg() to atomically set i_flags instead of clearing out the
S_IMMUTABLE, S_APPEND, etc. flags and then setting them from the
EXT4_IMMUTABLE_FL, EXT4_APPEND_FL flags, since this opens up a race
where an immutable file has the immutable flag cleared for a brief
window of time.

Reported-by: John Sullivan <jsrhbz@kanargh.force9.co.uk>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoLinux 3.13.8 v3.13.8
Greg Kroah-Hartman [Mon, 31 Mar 2014 17:05:38 +0000 (10:05 -0700)]
Linux 3.13.8

10 years agolibceph: fix preallocation check in get_reply()
Ilya Dryomov [Thu, 9 Jan 2014 18:08:21 +0000 (20:08 +0200)]
libceph: fix preallocation check in get_reply()

commit f2be82b0058e90b5d9ac2cb896b4914276fb50ef upstream.

The check that makes sure that we have enough memory allocated to read
in the entire header of the message in question is currently busted.
It compares front_len of the incoming message with iov_len field of
ceph_msg::front structure, which is used primarily to indicate the
amount of data already read in, and not the size of the allocated
buffer.  Under certain conditions (e.g. a short read from a socket
followed by that socket's shutdown and owning ceph_connection reset)
this results in a warning similar to

[85688.975866] libceph: get_reply front 198 > preallocated 122 (4#0)

and, through another bug, leads to forever hung tasks and forced
reboots.  Fix this by comparing front_len with front_alloc_len field of
struct ceph_msg, which stores the actual size of the buffer.

Fixes: http://tracker.ceph.com/issues/5425
Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agolibceph: rename front to front_len in get_reply()
Ilya Dryomov [Thu, 9 Jan 2014 18:08:21 +0000 (20:08 +0200)]
libceph: rename front to front_len in get_reply()

commit 3f0a4ac55fe036902e3666be740da63528ad8639 upstream.

Rename front local variable to front_len in get_reply() to make its
purpose more clear.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agolibceph: rename ceph_msg::front_max to front_alloc_len
Ilya Dryomov [Thu, 9 Jan 2014 18:08:21 +0000 (20:08 +0200)]
libceph: rename ceph_msg::front_max to front_alloc_len

commit 3cea4c3071d4e55e9d7356efe9d0ebf92f0c2204 upstream.

Rename front_max field of struct ceph_msg to front_alloc_len to make
its purpose more clear.

Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoe100: Fix "disabling already-disabled device" warning
Michele Baldessari [Thu, 30 Jan 2014 10:51:04 +0000 (10:51 +0000)]
e100: Fix "disabling already-disabled device" warning

commit 2b6e0ca175fe4a20f21ba82b1e7ccc71029c4dd4 upstream.

In https://bugzilla.redhat.com/show_bug.cgi?id=994438 and
https://bugzilla.redhat.com/show_bug.cgi?id=970480  we
received different reports of e100 throwing the following
warning:

 [<c06a0ba5>] ? pci_disable_device+0x85/0x90
 [<c044a153>] warn_slowpath_fmt+0x33/0x40
 [<c06a0ba5>] pci_disable_device+0x85/0x90
 [<f7fdf7e0>] __e100_shutdown+0x80/0x120 [e100]
 [<c0476ca5>] ? check_preempt_curr+0x65/0x90
 [<f7fdf8d6>] e100_suspend+0x16/0x30 [e100]
 [<c06a1ebb>] pci_legacy_suspend+0x2b/0xb0
 [<c098fc0f>] ? wait_for_completion+0x1f/0xd0
 [<c06a2d50>] ? pci_pm_poweroff+0xb0/0xb0
 [<c06a2de4>] pci_pm_freeze+0x94/0xa0
 [<c0767bb7>] dpm_run_callback+0x37/0x80
 [<c076a204>] ? pm_wakeup_pending+0xc4/0x140
 [<c0767f12>] __device_suspend+0xb2/0x1f0
 [<c076806f>] async_suspend+0x1f/0x90
 [<c04706e5>] async_run_entry_fn+0x35/0x140
 [<c0478aef>] ? wake_up_process+0x1f/0x40
 [<c0464495>] process_one_work+0x115/0x370
 [<c0462645>] ? start_worker+0x25/0x30
 [<c0464dc5>] ? manage_workers.isra.27+0x1a5/0x250
 [<c0464f6e>] worker_thread+0xfe/0x330
 [<c0464e70>] ? manage_workers.isra.27+0x250/0x250
 [<c046a224>] kthread+0x94/0xa0
 [<c0997f37>] ret_from_kernel_thread+0x1b/0x28
 [<c046a190>] ? insert_kthread_work+0x30/0x30

This patch removes pci_disable_device() from __e100_shutdown().
pci_clear_master() is enough.

Signed-off-by: Michele Baldessari <michele@acksyn.org>
Tested-by: Mark Harig <idirectscm@aim.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoxhci: Fix resume issues on Renesas chips in Samsung laptops
Sarah Sharp [Fri, 17 Jan 2014 23:38:12 +0000 (15:38 -0800)]
xhci: Fix resume issues on Renesas chips in Samsung laptops

commit 1aa9578c1a9450fb21501c4f549f5b1edb557e6d upstream.

Don Zickus <dzickus@redhat.com> writes:

Some co-workers of mine bought Samsung laptops that had mostly usb3 ports.
Those ports did not resume correctly (the driver would timeout communicating
and fail).  This led to frustration as suspend/resume is a common use for
laptops.

Poking around, I applied the reset on resume quirk to this chipset and the
resume started working.  Reloading the xhci_hcd module had been the temporary
workaround.

Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Reported-by: Don Zickus <dzickus@redhat.com>
Tested-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoInput: wacom - add reporting of SW_MUTE_DEVICE events
Ping Cheng [Thu, 5 Dec 2013 20:54:53 +0000 (12:54 -0800)]
Input: wacom - add reporting of SW_MUTE_DEVICE events

commit 961794a00eab03f4344b7d5e825e8e789e55da87 upstream.

New Intuos series models added a hardware switch to turn touch
data on/off. The state of the switch is reported periodically
from the tablet. To report the state the driver will emit SW_MUTE_DEVICE
events.

Reviewed_by: Chris Bagwell <chris@cnpbagwell.com>
Acked-by: Peter Hutterer <peter.hutterer@who-t.net>
Tested-by: Jason Gerecke <killertofu@gmail.com>
Signed-off-by: Ping Cheng <pingc@wacom.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoInput: wacom - add support for three new Intuos devices
Ping Cheng [Tue, 26 Nov 2013 02:44:55 +0000 (18:44 -0800)]
Input: wacom - add support for three new Intuos devices

commit b5fd2a3e92ca5c8c1f3c20d31ac5daed3ec4d604 upstream.

Two tablets in this series support both pen and touch. One (Intuos S)
only supports pen. This patch also updates the driver to process wireless
devices that do not support touch interface.

Tested-by: Jason Gerecke <killertofu@gmail.com>
Reviewed-by: Chris Bagwell <chris@cnpbagwell.com>
Signed-off-by: Ping Cheng <pingc@wacom.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoInput: wacom - make sure touch_max is set for touch devices
Ping Cheng [Tue, 26 Nov 2013 02:43:45 +0000 (18:43 -0800)]
Input: wacom - make sure touch_max is set for touch devices

commit 1d0d6df02750b4a6f466768cbfbf860e24f4c8d4 upstream.

Old single touch Tablet PCs do not have touch_max set at
wacom_features. Since touch device at lease supports one
finger, assign touch_max to 1 when touch usage is defined
in its HID Descriptor and touch_max is not pre-defined.

Tested-by: Jason Gerecke <killertofu@gmail.com>
Signed-off-by: Ping Cheng <pingc@wacom.com>
Reviewed-by: Chris Bagwell <chris@cnpbagwell.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoKVM: VMX: fix use after free of vmx->loaded_vmcs
Marcelo Tosatti [Fri, 3 Jan 2014 19:00:51 +0000 (17:00 -0200)]
KVM: VMX: fix use after free of vmx->loaded_vmcs

commit 26a865f4aa8e66a6d94958de7656f7f1b03c6c56 upstream.

After free_loaded_vmcs executes, the "loaded_vmcs" structure
is kfreed, and now vmx->loaded_vmcs points to a kfreed area.
Subsequent free_loaded_vmcs then attempts to manipulate
vmx->loaded_vmcs.

Switch the order to avoid the problem.

https://bugzilla.redhat.com/show_bug.cgi?id=1047892

Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoKVM: x86: handle invalid root_hpa everywhere
Marcelo Tosatti [Fri, 3 Jan 2014 19:09:32 +0000 (17:09 -0200)]
KVM: x86: handle invalid root_hpa everywhere

commit 37f6a4e237303549c8676dfe1fd1991ceab512eb upstream.

Rom Freiman <rom@stratoscale.com> notes other code paths vulnerable to
bug fixed by 989c6b34f6a9480e397b.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoKVM: MMU: handle invalid root_hpa at __direct_map
Marcelo Tosatti [Thu, 19 Dec 2013 17:28:51 +0000 (15:28 -0200)]
KVM: MMU: handle invalid root_hpa at __direct_map

commit 989c6b34f6a9480e397b170cc62237e89bf4fdb9 upstream.

It is possible for __direct_map to be called on invalid root_hpa
(-1), two examples:

1) try_async_pf -> can_do_async_pf
    -> vmx_interrupt_allowed -> nested_vmx_vmexit
2) vmx_handle_exit -> vmx_interrupt_allowed -> nested_vmx_vmexit

Then to load_vmcs12_host_state and kvm_mmu_reset_context.

Check for this possibility, let fault exception be regenerated.

BZ: https://bugzilla.redhat.com/show_bug.cgi?id=924916

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoInput: elantech - improve clickpad detection
Hans de Goede [Mon, 16 Dec 2013 15:09:25 +0000 (07:09 -0800)]
Input: elantech - improve clickpad detection

commit c15bdfd5b9831e4cab8cfc118243956e267dd30e upstream.

The current assumption in the elantech driver that hw version 3 touchpads
are never clickpads and hw version 4 touchpads are always clickpads is
wrong.

There are several bug reports for this, ie:
https://bugzilla.redhat.com/show_bug.cgi?id=1030802
http://superuser.com/questions/619582/right-elantech-touchpad-button-not-working-in-linux

I've spend a couple of hours wading through various bugzillas, launchpads
and forum posts to create a list of fw-versions and capabilities for
different laptop models to find a good method to differentiate between
clickpads and versions with separate hardware buttons.

Which shows that a device being a clickpad is reliable indicated by bit 12
being set in the fw_version. I've included the gathered list inside the
driver, so that we've this info at hand if we need to revisit this later.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agofs/proc/proc_devtree.c: remove empty /proc/device-tree when no openfirmware exists.
Dave Jones [Thu, 23 Jan 2014 23:55:43 +0000 (15:55 -0800)]
fs/proc/proc_devtree.c: remove empty /proc/device-tree when no openfirmware exists.

commit c1d867a54d426b45da017fbe8e585f8a3064ce8d upstream.

Distribution kernels might want to build in support for /proc/device-tree
for kernels that might end up running on hardware that doesn't support
openfirmware.  This results in an empty /proc/device-tree existing.
Remove it if the OFW root node doesn't exist.

This situation actually confuses grub2, resulting in install failures.
grub2 sees the /proc/device-tree and picks the wrong install target cf.
http://bzr.savannah.gnu.org/lh/grub/trunk/grub/annotate/4300/util/grub-install.in#L311
grub should be more robust, but still, leaving an empty proc dir seems
pointless.

Addresses https://bugzilla.redhat.com/show_bug.cgi?id=818378.

Signed-off-by: Dave Jones <davej@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agopowerpc/powernv: Refactor PHB diag-data dump
Gavin Shan [Tue, 25 Feb 2014 07:28:38 +0000 (15:28 +0800)]
powerpc/powernv: Refactor PHB diag-data dump

commit af87d2fe95444d107e0c0cf0ba7e20e6716a7bfd upstream.

As Ben suggested, the patch prints PHB diag-data with multiple
fields in one line and omits the line if the fields of that
line are all zero.

With the patch applied, the PHB3 diag-data dump looks like:

PHB3 PHB#3 Diag-data (Version: 1)

  brdgCtl:     00000002
  RootSts:     0000000f 00400000 b0830008 00100147 00002000
  nFir:        0000000000000000 0030006e00000000 0000000000000000
  PhbSts:      0000001c00000000 0000000000000000
  Lem:         0000000000100000 42498e327f502eae 0000000000000000
  InAErr:      8000000000000000 8000000000000000 0402030000000000 0000000000000000
  PE[  8] A/B: 8480002b00000000 8000000000000000

[ The current diag data is so big that it overflows the printk
  buffer pretty quickly in cases when we get a handful of errors
  at once which can happen. --BenH
]

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agopowerpc/powernv: Dump PHB diag-data immediately
Gavin Shan [Tue, 25 Feb 2014 07:28:37 +0000 (15:28 +0800)]
powerpc/powernv: Dump PHB diag-data immediately

commit 947166043732b69878123bf31f51933ad0316080 upstream.

The PHB diag-data is important to help locating the root cause for
EEH errors such as frozen PE or fenced PHB. However, the EEH core
enables IO path by clearing part of HW registers before collecting
this data causing it to be corrupted.

This patch fixes this by dumping the PHB diag-data immediately when
frozen/fenced state on PE or PHB is detected for the first time in
eeh_ops::get_state() or next_error() backend.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agopowerpc/eeh: Handle multiple EEH errors
Gavin Shan [Wed, 15 Jan 2014 05:16:11 +0000 (13:16 +0800)]
powerpc/eeh: Handle multiple EEH errors

commit 7e4e7867b1e551b7b8f326da3604c47332972bc6 upstream.

For one PCI error relevant OPAL event, we possibly have multiple
EEH errors for that. For example, multiple frozen PEs detected on
different PHBs. Unfortunately, we didn't cover the case. The patch
enumarates the return value from eeh_ops::next_error() and change
eeh_handle_special_event() and eeh_ops::next_error() to handle all
existing EEH errors.

As Ben pointed out, we needn't list_for_each_entry_safe() since we
are not deleting any PHB from the hose_list and the EEH serialized
lock should be held while purging EEH events. The patch covers those
suggestions as well.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agopowerpc/powernv: Move PHB-diag dump functions around
Gavin Shan [Fri, 22 Nov 2013 08:28:45 +0000 (16:28 +0800)]
powerpc/powernv: Move PHB-diag dump functions around

commit 93aef2a789778e7ec787179fc9b34ca4885a5ef3 upstream.

Prior to the completion of PCI enumeration, we actively detects
EEH errors on PCI config cycles and dump PHB diag-data if necessary.
The EEH backend also dumps PHB diag-data in case of frozen PE or
fenced PHB. However, we are using different functions to dump the
PHB diag-data for those 2 cases.

The patch merges the functions for dumping PHB diag-data to one so
that we can avoid duplicate code. Also, we never dump PHB3 diag-data
during PCI config cycles with frozen PE. The patch fixes it as well.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoregulator: core: Replace direct ops->disable usage
Markus Pargmann [Thu, 20 Feb 2014 16:36:04 +0000 (17:36 +0100)]
regulator: core: Replace direct ops->disable usage

commit 66fda75f47dc583f1c187556e9a2c082dd64f8c6 upstream.

There are many places where ops->disable is called directly. Instead we
should use _regulator_do_disable() which also handles gpio regulators.

To be able to use the wrapper function from _regulator_force_disable(),
I moved the _notifier_call_chain() call from _regulator_do_disable() to
_regulator_disable(). This way, _regulator_force_disable() can use
different flags for _notifier_call_chain() without calling it twice.

Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agop54: clamp properly instead of just truncating
Dan Carpenter [Mon, 13 Jan 2014 19:05:23 +0000 (22:05 +0300)]
p54: clamp properly instead of just truncating

commit 608cfbe4abaf76e9d732efd7ed1cfa3998163d91 upstream.

The call to clamp_t() first truncates the variable signed 8 bit and as a
result, the actual clamp is a no-op.

Fixes: 0d78156eef1d ('p54: improve site survey')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoMIPS: Fix build error seen in some configurations
Guenter Roeck [Mon, 25 Nov 2013 23:21:00 +0000 (15:21 -0800)]
MIPS: Fix build error seen in some configurations

commit 63238f2cc5518e1c45a3418fc0ac0f560dafe7ef upstream.

The following build error is seen if CONFIG_32BIT is undefined,
CONFIG_64BIT is defined, and CONFIG_MIPS32_O32 is undefined.

asm/syscall.h: In function 'mips_get_syscall_arg':
arch/mips/include/asm/syscall.h:32:16: error: unused variable 'usp' [-Werror=unused-variable]
cc1: all warnings being treated as errors

Fixes: c0ff3c53d4f9 ('MIPS: Enable HAVE_ARCH_TRACEHOOK')
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: David Daney <david.daney@cavium.com>
Signed-off-by: John Crispin <blogic@openwrt.org>
Patchwork: http://patchwork.linux-mips.org/patch/6160/
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodeb-pkg: Fix cross-building linux-headers package
Ben Hutchings [Thu, 5 Dec 2013 14:37:35 +0000 (14:37 +0000)]
deb-pkg: Fix cross-building linux-headers package

commit f8ce239dfc7ba9add41d9ecdc5e7810738f839fa upstream.

builddeb generates a control file that says the linux-headers package
can only be built for the build system primary architecture.  This
breaks cross-building configurations.  We should use $debarch for this
instead.

Since $debarch is not yet set when generating the control file, set
Architecture: any and use control file variables to fill in the
description.

Fixes: cd8d60a20a45 ('kbuild: create linux-headers package in deb-pkg')
Reported-and-tested-by: "Niew, Sh." <shniew@gmail.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodeb-pkg: Fix building for MIPS big-endian or ARM OABI
Ben Hutchings [Thu, 5 Dec 2013 14:39:11 +0000 (14:39 +0000)]
deb-pkg: Fix building for MIPS big-endian or ARM OABI

commit c5e318f67eebbad491615a752c51dbfde7dc3d78 upstream.

These commands will mysteriously fail:

$ make ARCH=arm versatile_defconfig
[...]
$ make ARCH=arm deb-pkg
[...]
make[1]: *** [deb-pkg] Error 1
make: *** [deb-pkg] Error 2

The Debian architecture selection for these kernel architectures does
'grep FOO=y $KCONFIG_CONFIG && echo bar', and after 'set -e' this
aborts the script if grep does not find the given config symbol.

Fixes: 10f26fa64200 ('build, deb-pkg: select userland architecture based on UTS_MACHINE')
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoperf tools: Fix AAAAARGH64 memory barriers
Peter Zijlstra [Fri, 24 Jan 2014 15:40:02 +0000 (16:40 +0100)]
perf tools: Fix AAAAARGH64 memory barriers

commit f428ebd184c82a7914b2aa7e9f868918aaf7ea78 upstream.

Someone got the load and store barriers mixed up for AAAAARGH64.  Turn
them the right side up.

Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Fixes: a94d342b9cb0 ("tools/perf: Add required memory barriers")
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20140124154002.GF31570@twins.programming.kicks-ass.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoFix uses of dma_max_pfn() when converting to a limiting address
Russell King [Tue, 11 Feb 2014 17:11:04 +0000 (17:11 +0000)]
Fix uses of dma_max_pfn() when converting to a limiting address

commit e83b366487b5582274374f8226e489cb214ae5a6 upstream.

We must use a 64-bit for this, otherwise overflowed bits get lost, and
that can result in a lower than intended value set.

Fixes: 8e0cb8a1f6ac ("ARM: 7797/1: mmc: Use dma_max_pfn(dev) helper for bounce_limit calculations")
Fixes: 7d35496dd982 ("ARM: 7796/1: scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations")
Tested-Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoprintk: fix syslog() overflowing user buffer
Linus Torvalds [Mon, 17 Feb 2014 20:24:45 +0000 (12:24 -0800)]
printk: fix syslog() overflowing user buffer

commit e4178d809fdaee32a56833fff1f5056c99e90a1a upstream.

This is not a buffer overflow in the traditional sense: we don't
overflow any *kernel* buffers, but we do mis-count the amount of data we
copy back to user space for the SYSLOG_ACTION_READ_ALL case.

In particular, if the user buffer is too small to hold everything, and
*if* there is a continuation line at just the right place, we can end up
giving the user more data than he asked for.

The reason is that we first count up the number of bytes all the log
records contains, then we walk the records again until we've skipped the
records at the beginning that won't fit, and then we walk the rest of
the records and copy them to the user space buffer.

And in between that "skip the initial records that won't fit" and the
"copy the records that *will* fit to user space", we reset the 'prev'
variable that contained the record information for the last record not
copied.  That meant that when we started copying to user space, we now
had a different character count than what we had originally calculated
in the first record walk-through.

The fix is to simply not clear the 'prev' flags value (in both cases
where we had the same logic: syslog_print_all and kmsg_dump_get_buffer:
the latter is used for pstore-like dumping)

Reported-and-tested-by: Debabrata Banerjee <dbanerje@akamai.com>
Acked-by: Kay Sievers <kay@vrfy.org>
Cc: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Josh Hunt <joshhunt00@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agox86: bpf_jit: support negative offsets
Alexei Starovoitov [Mon, 10 Mar 2014 22:56:51 +0000 (15:56 -0700)]
x86: bpf_jit: support negative offsets

commit fdfaf64e75397567257e1051931f9a3377360665 upstream.

Commit a998d4342337 claimed to introduce negative offset support to x86 jit,
but it couldn't be working, since at the time of the execution
of LD+ABS or LD+IND instructions via call into
bpf_internal_load_pointer_neg_helper() the %edx (3rd argument of this func)
had junk value instead of access size in bytes (1 or 2 or 4).

Store size into %edx instead of %ecx (what original commit intended to do)

Fixes: a998d4342337 ("bpf jit: Let the x86 jit handle negative offsets")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jan Seiffert <kaffeemonster@googlemail.com>
Cc: Eric Dumazet <edumazet@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoSUNRPC: Fix a pipe_version reference leak
Trond Myklebust [Sun, 16 Feb 2014 18:28:01 +0000 (13:28 -0500)]
SUNRPC: Fix a pipe_version reference leak

commit e9776d0f4adee8877145672f6416b06b57f2dc27 upstream.

In gss_alloc_msg(), if the call to gss_encode_v1_msg() fails, we
want to release the reference to the pipe_version that was obtained
earlier in the function.

Fixes: 9d3a2260f0f4b (SUNRPC: Fix buffer overflow checking in...)
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoARM: 7941/2: Fix incorrect FDT initrd parameter override
Ben Peddell [Mon, 13 Jan 2014 22:25:18 +0000 (23:25 +0100)]
ARM: 7941/2: Fix incorrect FDT initrd parameter override

commit 4c235cb9e35407bdb4a2debeef4dc8721e8f91f2 upstream.

Commit 65939301acdb (arm: set initrd_start/initrd_end for fdt scan)
caused the FDT initrd_start and initrd_end to override the
phys_initrd_start and phys_initrd_size set by the initrd= kernel
parameter.  With this patch initrd_start and initrd_end will be
overridden if phys_initrd_start and phys_initrd_size are set by the
kernel initrd= parameter.

Fixes: 65939301acdb (arm: set initrd_start/initrd_end for fdt scan)
Signed-off-by: Ben Peddell <klightspeed@killerwolves.net>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agosfc: Use the correct maximum TX DMA ring size for SFC9100
Ben Hutchings [Thu, 23 Jan 2014 14:35:48 +0000 (14:35 +0000)]
sfc: Use the correct maximum TX DMA ring size for SFC9100

commit d9317aea16ecec7694271ef11fb7791a0f0d9cc5 upstream.

As part of a workaround for a hardware erratum in the SFC9100 family
(SF bug 35388), the TX_DESC_UPD_DWORD register address is also used
for communicating with the event block, and only descriptor pointer
values < 2048 are valid.

If the TX DMA ring size is increased to 4096 descriptors (which the
firmware still allows) then we may write a descriptor pointer
value >= 2048, which has entirely different and undesirable effects!

Limit the TX DMA ring size correctly when this workaround is in
effect.

Fixes: 8127d661e77f ('sfc: Add support for Solarflare SFC9100 family')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agostop_machine: Fix^2 race between stop_two_cpus() and stop_cpus()
Peter Zijlstra [Fri, 28 Feb 2014 12:39:05 +0000 (13:39 +0100)]
stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus()

commit 177c53d943368fc97644ebc0a250dc8e2d124250 upstream.

We must use smp_call_function_single(.wait=1) for the
irq_cpu_stop_queue_work() to ensure the queueing is actually done under
stop_cpus_lock. Without this we could have dropped the lock by the time
we do the queueing and get the race we tried to fix.

Fixes: 7053ea1a34fa ("stop_machine: Fix race between stop_two_cpus() and stop_cpus()")
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140228123905.GK3104@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoASoC: max98090: make REVISION_ID readable
Stephen Warren [Thu, 13 Feb 2014 23:54:24 +0000 (16:54 -0700)]
ASoC: max98090: make REVISION_ID readable

commit e126a646f77fdd66978785cb0a3a5e46b07aee2e upstream.

The REVISION_ID register is not currently marked readable. snd_soc_read()
refuses to read the register, and hence probe() fails.

Fixes: d4807ad2c4c0 ("regmap: Check readable regs in _regmap_read")
[exposed the bug, by checking for readability]
Fixes: 685e42154dcf ("ASoC: Replace max98090 Device Driver")
[left out this register from the readable list]
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agolibceph: resend all writes after the osdmap loses the full flag
Josh Durgin [Tue, 10 Dec 2013 17:35:13 +0000 (09:35 -0800)]
libceph: resend all writes after the osdmap loses the full flag

commit 9a1ea2dbff11547a8e664f143c1ffefc586a577a upstream.

With the current full handling, there is a race between osds and
clients getting the first map marked full. If the osd wins, it will
return -ENOSPC to any writes, but the client may already have writes
in flight. This results in the client getting the error and
propagating it up the stack. For rbd, the block layer turns this into
EIO, which can cause corruption in filesystems above it.

To avoid this race, osds are being changed to drop writes that came
from clients with an osdmap older than the last osdmap marked full.
In order for this to work, clients must resend all writes after they
encounter a full -> not full transition in the osdmap. osds will wait
for an updated map instead of processing a request from a client with
a newer map, so resent writes will not be dropped by the osd unless
there is another not full -> full transition.

This approach requires both osds and clients to be fixed to avoid the
race. Old clients talking to osds with this fix may hang instead of
returning EIO and potentially corrupting an fs. New clients talking to
old osds have the same behavior as before if they encounter this race.

Fixes: http://tracker.ceph.com/issues/6938
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agolibceph: block I/O when PAUSE or FULL osd map flags are set
Josh Durgin [Tue, 3 Dec 2013 03:11:48 +0000 (19:11 -0800)]
libceph: block I/O when PAUSE or FULL osd map flags are set

commit d29adb34a94715174c88ca93e8aba955850c9bde upstream.

The PAUSEWR and PAUSERD flags are meant to stop the cluster from
processing writes and reads, respectively. The FULL flag is set when
the cluster determines that it is out of space, and will no longer
process writes.  PAUSEWR and PAUSERD are purely client-side settings
already implemented in userspace clients. The osd does nothing special
with these flags.

When the FULL flag is set, however, the osd responds to all writes
with -ENOSPC. For cephfs, this makes sense, but for rbd the block
layer translates this into EIO.  If a cluster goes from full to
non-full quickly, a filesystem on top of rbd will not behave well,
since some writes succeed while others get EIO.

Fix this by blocking any writes when the FULL flag is set in the osd
client. This is the same strategy used by userspace, so apply it by
default.  A follow-on patch makes this configurable.

__map_request() is called to re-target osd requests in case the
available osds changed.  Add a paused field to a ceph_osd_request, and
set it whenever an appropriate osd map flag is set.  Avoid queueing
paused requests in __map_request(), but force them to be resent if
they become unpaused.

Also subscribe to the next osd map from the monitor if any of these
flags are set, so paused requests can be unblocked as soon as
possible.

Fixes: http://tracker.ceph.com/issues/6079
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agomedia: cx18: check for allocation failure in cx18_read_eeprom()
Dan Carpenter [Fri, 22 Nov 2013 07:51:47 +0000 (04:51 -0300)]
media: cx18: check for allocation failure in cx18_read_eeprom()

commit e351bf25fa373a3de0be2141b962c5c3c27006a2 upstream.

It upsets static checkers when we don't check for allocation failure.  I
moved the memset() of "tv" earlier so we don't use uninitialized data on
error.
Fixes: 1d212cf0c2d8 ('[media] cx18: struct i2c_client is too big for stack')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Andy Walls <awalls@md.metrocast.net>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agomedia: dw2102: some missing unlocks on error
Dan Carpenter [Fri, 22 Nov 2013 07:56:33 +0000 (04:56 -0300)]
media: dw2102: some missing unlocks on error

commit 324ed533bf0b23c309b805272c4ffcc5d51493a6 upstream.

We recently introduced some new error paths but the unlocks are missing.
Fixes: 0065a79a8698 ('[media] dw2102: Don't use dynamic static allocation')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agomedia: cxusb: unlock on error in cxusb_i2c_xfer()
Dan Carpenter [Fri, 22 Nov 2013 07:55:43 +0000 (04:55 -0300)]
media: cxusb: unlock on error in cxusb_i2c_xfer()

commit 1cdbcc5db4e6d51ce9bb1313195167cada9aa6e9 upstream.

We recently introduced some new error paths which are missing their
unlocks.
Fixes: 64f7ef8afbf8 ('[media] cxusb: Don't use dynamic static allocation')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoNFSv4: Use the correct net namespace in nfs4_update_server
Trond Myklebust [Mon, 17 Feb 2014 02:42:56 +0000 (21:42 -0500)]
NFSv4: Use the correct net namespace in nfs4_update_server

commit 292f503cade2b1d966239ef56a851e6897d1ba92 upstream.

We need to use the same net namespace that was used to resolve
the hostname and sockaddr arguments.

Fixes: 32e62b7c3ef09 (NFS: Add nfs4_update_server)
Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agonet: davinci_emac: Replace devm_request_irq with request_irq
Christian Riesch [Mon, 24 Mar 2014 12:46:26 +0000 (13:46 +0100)]
net: davinci_emac: Replace devm_request_irq with request_irq

commit 33b7107f59a61236d94ecd6b45e20283cd5abcc8 upstream.

In commit 6892b41d9701283085b655c6086fb57a5d63fa47

Author: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Date:   Tue Jun 25 21:24:51 2013 +0530
net: davinci: emac: Convert to devm_* api

the call of request_irq is replaced by devm_request_irq and the call
of free_irq is removed. But since interrupts are requested in
emac_dev_open, doing ifconfig up/down on the board requests the
interrupts again each time, causing devm_request_irq to fail. The
interface is dead until the device is rebooted.

This patch reverts said commit partially: It changes the driver back
to use request_irq instead of devm_request_irq, puts free_irq back in
place, but keeps the remaining changes of the original patch.

Reported-by: Jon Ringle <jon@ringle.org>
Signed-off-by: Christian Riesch <christian.riesch@omicron.at>
Cc: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agopartly revert commit 8a10bc9: parisc/sti_console: prefer Linux fonts over built-in...
Helge Deller [Sun, 23 Mar 2014 15:39:52 +0000 (16:39 +0100)]
partly revert commit 8a10bc9: parisc/sti_console: prefer Linux fonts over built-in ROM fonts

commit a2fb4d782c61f77480e586578eeb4dfd27d134ea upstream.

STI console is used on parisc and m68k HP machines. This patch partly reverts
my previous commit and as such restores the fonts for the m68k machines.

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agotracing: Fix array size mismatch in format string
Vaibhav Nagarnaik [Fri, 14 Feb 2014 03:51:48 +0000 (19:51 -0800)]
tracing: Fix array size mismatch in format string

commit 87291347c49dc40aa339f587b209618201c2e527 upstream.

In event format strings, the array size is reported in two locations.
One in array subscript and then via the "size:" attribute. The values
reported there have a mismatch.

For e.g., in sched:sched_switch the prev_comm and next_comm character
arrays have subscript values as [32] where as the actual field size is
16.

name: sched_switch
ID: 301
format:
        field:unsigned short common_type;       offset:0;       size:2; signed:0;
        field:unsigned char common_flags;       offset:2;       size:1; signed:0;
        field:unsigned char common_preempt_count;       offset:3;       size:1;signed:0;
        field:int common_pid;   offset:4;       size:4; signed:1;

        field:char prev_comm[32];       offset:8;       size:16;        signed:1;
        field:pid_t prev_pid;   offset:24;      size:4; signed:1;
        field:int prev_prio;    offset:28;      size:4; signed:1;
        field:long prev_state;  offset:32;      size:8; signed:1;
        field:char next_comm[32];       offset:40;      size:16;        signed:1;
        field:pid_t next_pid;   offset:56;      size:4; signed:1;
        field:int next_prio;    offset:60;      size:4; signed:1;

After bisection, the following commit was blamed:
92edca0 tracing: Use direct field, type and system names

This commit removes the duplication of strings for field->name and
field->type assuming that all the strings passed in
__trace_define_field() are immutable. This is not true for arrays, where
the type string is created in event_storage variable and field->type for
all array fields points to event_storage.

Use __stringify() to create a string constant for the type string.

Also, get rid of event_storage and event_storage_mutex that are not
needed anymore.

also, an added benefit is that this reduces the overhead of events a bit more:

   text    data     bss     dec     hex filename
8424787 2036472 1302528 11763787         b3804b vmlinux
8420814 2036408 1302528 11759750         b37086 vmlinux.patched

Link: http://lkml.kernel.org/r/1392349908-29685-1-git-send-email-vnagarnaik@google.com
Cc: Laurent Chavey <chavey@google.com>
Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/i915: Disable stolen memory when DMAR is active
Chris Wilson [Tue, 18 Mar 2014 12:50:50 +0000 (14:50 +0200)]
drm/i915: Disable stolen memory when DMAR is active

commit 0f4706d2740f2a221cd502922b22e522009041d9 upstream.

We have reports of heavy screen corruption if we try to use the stolen
memory reserved by the BIOS whilst the DMA-Remapper is active. This
quirk may be only specific to a few machines or BIOSes, but first lets
apply the big hammer and always disable use of stolen memory when DMAR
is active.

v2 by Jani: Rebase on -fixes, only look at intel_iommu_gfx_mapped.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68535
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/i915: Don't enable display error interrupts from the start
Daniel Vetter [Fri, 7 Mar 2014 19:34:46 +0000 (20:34 +0100)]
drm/i915: Don't enable display error interrupts from the start

commit 5c673b60a9b3b23486f4eda75c72e91d31d26a2b upstream.

We need to enable interrupt processing before all the modeset
state is set up. But that means we can fall over when we get a pipe
underrun. This shouldn't happen as long as the bios works correctly
but as usual this turns out to be wishful thinking.

So disable error interrupts at irq install time and rely on the
re-enabling code in the modeset functions to take care of this.

Note that due to the SDE interrupt handling race we must
uncondtionally enable all interrupt sources in SDEIER, hence no need
to enable the SERR bit specifically.

On gmch platforms we don't have an explicit enable/mask bit for fifo
underruns. Fixing this up would require a bit of software tracking,
hence is material for a separate patch. To make this possible we need
to switch all gmch platforms to the new pipestat interrupt handling
scheme Imre implemented for vlv, and then also add a safe form of sw
state checking to __cpu_fifo_underrun_reporting_enabled a bit.

v2: Also handle the ilk/snb cpu fifo underrun bits accordingly.
Spotted by Ville.

v3: Also handle the south interrupt underrun bits on ibx. Again
spotted by Ville.

Reported-by: Rob Clark <robdclark@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/i915: Fix PSR programming
Ben Widawsky [Wed, 5 Mar 2014 06:38:10 +0000 (22:38 -0800)]
drm/i915: Fix PSR programming

commit 24bd9bf54d45d28089251cdf62bf14323d1aa827 upstream.

| has a higher precedence than ?. Therefore, the calculation doesn't do
at all what you would expect. Thanks to Ken for convincing me that this
was indeed the issue. Send me back to C programmer school, please.

I'm sort of surprised PSR was continuing to work for people. It should
be broken IMO (and it was broken for me, but I had assumed it never
worked).

Regression from:
commit ed8546ac1f99b850879f07b1e9b06b42fb0a36d9
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Mon Nov 4 22:45:05 2013 -0800

    drm/i915/bdw: Support eDP PSR

Cc: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Cc: Kenneth Graunke <kenneth.w.graunke@intel.com>
Cc: Art Runyan <arthur.j.runyan@intel.com>
Reported-by: "Kumar, Kiran S" <kiran.s.kumar@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoclocksource: vf_pit_timer: use complement for sched_clock reading
Stefan Agner [Wed, 5 Mar 2014 22:11:08 +0000 (23:11 +0100)]
clocksource: vf_pit_timer: use complement for sched_clock reading

commit 224aa3ed45c8735ae02bb2ecca002409fa6aa772 upstream.

Vybrids PIT register is monitonic decreasing. However, sched_clock
reading needs to be monitonic increasing. Use bitwise not to get
the complement of the clock register. This fixes the clock going
backward. Also, the clock now starts at 0 since we load the
register with the maximum value at start.

Signed-off-by: Stefan Agner <stefan@agner.ch>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Cc: daniel.lezcano@linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux@arm.linux.org.uk
Link: http://lkml.kernel.org/r/d25af915993aec1b486be653eb86f748ddef54fe.1394057313.git.stefan@agner.ch
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoALSA: compress: Pass through return value of open ops callback
Charles Keepax [Wed, 19 Mar 2014 12:59:39 +0000 (12:59 +0000)]
ALSA: compress: Pass through return value of open ops callback

commit 749d32237bf39e6576dd95bfdf24e4378e51716c upstream.

The snd_compr_open function would always return 0 even if the compressed
ops open function failed, obviously this is incorrect. Looks like this
was introduced by a small typo in:

commit a0830dbd4e42b38aefdf3fb61ba5019a1a99ea85
ALSA: Add a reference counter to card instance

This patch returns the value from the compressed op as it should.

Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoHID: hidraw: fix warning destroying hidraw device files after parent
Fernando Luis Vazquez Cao [Wed, 26 Feb 2014 07:51:24 +0000 (16:51 +0900)]
HID: hidraw: fix warning destroying hidraw device files after parent

commit 47587fc098451c2100dc1fb618855fc2e2d937af upstream.

I noticed that after hot unplugging a Logitech unifying receiver
(drivers/hid/hid-logitech-dj.c) the kernel would occasionally spew a
stack trace similar to this:

usb 1-1.1.2: USB disconnect, device number 7
WARNING: CPU: 0 PID: 2865 at fs/sysfs/group.c:216 device_del+0x40/0x1b0()
sysfs group ffffffff8187fa20 not found for kobject 'hidraw0'
[...]
CPU: 0 PID: 2865 Comm: upowerd Tainted: G        W 3.14.0-rc4 #7
Hardware name: LENOVO 7783PN4/        , BIOS 9HKT43AUS 07/11/2011
 0000000000000009 ffffffff814cd684 ffff880427ccfdf8 ffffffff810616e7
 ffff88041ec61800 ffff880427ccfe48 ffff88041e444d80 ffff880426fab8e8
 ffff880429359960 ffffffff8106174c ffffffff81714b98 0000000000000028
Call Trace:
 [<ffffffff814cd684>] ? dump_stack+0x41/0x51
 [<ffffffff810616e7>] ? warn_slowpath_common+0x77/0x90
 [<ffffffff8106174c>] ? warn_slowpath_fmt+0x4c/0x50
 [<ffffffff81374fd0>] ? device_del+0x40/0x1b0
 [<ffffffff8137516f>] ? device_unregister+0x2f/0x50
 [<ffffffff813751fa>] ? device_destroy+0x3a/0x40
 [<ffffffffa03ca245>] ? drop_ref+0x55/0x120 [hid]
 [<ffffffffa03ca3e6>] ? hidraw_release+0x96/0xb0 [hid]
 [<ffffffff811929da>] ? __fput+0xca/0x210
 [<ffffffff8107fe17>] ? task_work_run+0x97/0xd0
 [<ffffffff810139a9>] ? do_notify_resume+0x69/0xa0
 [<ffffffff814dbd22>] ? int_signal+0x12/0x17
---[ end trace 63f4a46f6566d737 ]---

During device removal hid_disconnect() is called via hid_hw_stop() to
stop the device and free all its resources, including the sysfs
files. The problem is that if a user space process, such as upowerd,
holds a reference to a hidraw file the corresponding sysfs files will
be kept around (drop_ref() does not call device_destroy() if the open
counter is not 0) and it will be usb_disconnect() who, by calling
device_del() for the USB device, will indirectly remove the sysfs
files of the hidraw device (sysfs_remove_dir() is recursive these
days). Because of this, by the time user space releases the last
reference to the hidraw file and drop_ref() tries to destroy the
device the sysfs files are already gone and the kernel will print
the warning above.

Fix this by calling device_destroy() at USB disconnect time.

Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoLinux 3.13.7 v3.13.7
Greg Kroah-Hartman [Mon, 24 Mar 2014 04:45:42 +0000 (21:45 -0700)]
Linux 3.13.7

10 years agoPNP / ACPI: proper handling of ACPI IO/Memory resource parsing failures
Zhang Rui [Tue, 11 Mar 2014 14:40:27 +0000 (22:40 +0800)]
PNP / ACPI: proper handling of ACPI IO/Memory resource parsing failures

commit 89935315f192abf7068d0044cefc84f162c3c81f upstream.

Before commit b355cee88e3b (ACPI / resources: ignore invalid ACPI
device resources), if acpi_dev_resource_memory()/acpi_dev_resource_io()
returns false, it means the the resource is not a memeory/IO resource.

But after commit b355cee88e3b, those functions return false if the
given memory/IO resource entry is invalid (the length of the resource
is zero).

This breaks pnpacpi_allocated_resource(), because it now recognizes
the invalid memory/io resources as resources of unknown type.  Thus
users see confusing warning messages on machines with zero length
ACPI memory/IO resources.

Fix the problem by rearranging pnpacpi_allocated_resource() so that
it calls acpi_dev_resource_memory() for memory type and IO type
resources only, respectively.

Fixes: b355cee88e3b (ACPI / resources: ignore invalid ACPI device resources)
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reported-and-tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Reported-and-tested-by: Julian Wollrath <jwollrath@web.de>
Reported-and-tested-by: Paul Bolle <pebolle@tiscali.nl>
[rjw: Changelog]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agomemcg: reparent charges of children before processing parent
Filipe Brandenburger [Mon, 3 Mar 2014 23:38:25 +0000 (15:38 -0800)]
memcg: reparent charges of children before processing parent

commit 4fb1a86fb5e4209a7d4426d4e586c58e9edc74ac upstream.

Sometimes the cleanup after memcg hierarchy testing gets stuck in
mem_cgroup_reparent_charges(), unable to bring non-kmem usage down to 0.

There may turn out to be several causes, but a major cause is this: the
workitem to offline parent can get run before workitem to offline child;
parent's mem_cgroup_reparent_charges() circles around waiting for the
child's pages to be reparented to its lrus, but it's holding
cgroup_mutex which prevents the child from reaching its
mem_cgroup_reparent_charges().

Further testing showed that an ordered workqueue for cgroup_destroy_wq
is not always good enough: percpu_ref_kill_and_confirm's call_rcu_sched
stage on the way can mess up the order before reaching the workqueue.

Instead, when offlining a memcg, call mem_cgroup_reparent_charges() on
all its children (and grandchildren, in the correct order) to have their
charges reparented first.

Fixes: e5fca243abae ("cgroup: use a dedicated workqueue for cgroup destruction")
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org> [v3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoarm64: mm: Add double logical invert to pte accessors
Steve Capper [Tue, 25 Feb 2014 11:38:53 +0000 (11:38 +0000)]
arm64: mm: Add double logical invert to pte accessors

commit 84fe6826c28f69d8708bd575faed7f75e6b6f57f upstream.

Page table entries on ARM64 are 64 bits, and some pte functions such as
pte_dirty return a bitwise-and of a flag with the pte value. If the
flag to be tested resides in the upper 32 bits of the pte, then we run
into the danger of the result being dropped if downcast.

For example:
gather_stats(page, md, pte_dirty(*pte), 1);
where pte_dirty(*pte) is downcast to an int.

This patch adds a double logical invert to all the pte_ accessors to
ensure predictable downcasting.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
[steve.capper@linaro.org: rebased patch to leave pte_write alone to
allow for merge with 3.13 stable]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agobio-integrity: Fix bio_integrity_verify segment start bug
Nicholas Bellinger [Wed, 22 Jan 2014 04:32:05 +0000 (20:32 -0800)]
bio-integrity: Fix bio_integrity_verify segment start bug

commit 5837c80e870bc3b12ac6a98cdc9ce7a9522a8fb6 upstream.

This patch addresses a bug in bio_integrity_verify() code that has
been causing DIF READ verify operations to be silently skipped.

The issue is that bio->bi_idx will have been incremented within
bio_advance() code in the normal blk_update_request() ->
req_bio_endio() completion path, and bio_integrity_verify() is
using bio_for_each_segment() which starts the bio segment walk
at the current bio->bi_idx.

So instead use bio_for_each_segment_all() to always start the bio
segment walk from zero, regardless of the current bio->bi_idx
value after bio_advance() has been called.

(Context change for v3.10.y -> v3.13.y code - nab)

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoMIPS: include linux/types.h
Qais Yousef [Mon, 9 Dec 2013 09:49:45 +0000 (09:49 +0000)]
MIPS: include linux/types.h

commit 87c99203fea897fbdd84b681ad9fced2517dcf98 upstream.

The file uses u16 type but doesn't include its definition explicitly

I was getting this error when including this header in my driver:

  arch/mips/include/asm/mipsregs.h:644:33: error: unknown type name â€˜u16’

Signed-off-by: Qais Yousef <qais.yousef@imgtec.com>
Reviewed-by: Steven J. Hill <Steven.Hill@imgtec.com>
Acked-by: David Daney <david.daney@cavium.com>
Signed-off-by: John Crispin <blogic@openwrt.org>
Patchwork: http://patchwork.linux-mips.org/patch/6212/
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoFix mountpoint reference leakage in linkat
Oleg Drokin [Fri, 31 Jan 2014 20:41:58 +0000 (15:41 -0500)]
Fix mountpoint reference leakage in linkat

commit d22e6338db7f613dd4f6095c190682fcc519e4b7 upstream.

Recent changes to retry on ESTALE in linkat
(commit 442e31ca5a49e398351b2954b51f578353fdf210)
introduced a mountpoint reference leak and a small memory
leak in case a filesystem link operation returns ESTALE
which is pretty normal for distributed filesystems like
lustre, nfs and so on.
Free old_path in such a case.

[AV: there was another missing path_put() nearby - on the previous
goto retry]

Signed-off-by: Oleg Drokin: <green@linuxhacker.ru>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoregulator: core: Change dummy supplies error message to a warning
Shuah Khan [Thu, 20 Feb 2014 19:58:15 +0000 (12:58 -0700)]
regulator: core: Change dummy supplies error message to a warning

commit acc3d5cec84f82ebea535fa0bd9500ac3df2aee9 upstream.

Change "dummy supplies not allowed" error message to warning instead, as this
is a just warning message with no change to the behavior.

[Added a CC to stable since some other bug fixes cause this to come up
more frequently on PCs which is how it was noticed -- broonie]

Signed-off-by: Shuah Khan <shuah.kh@samsung.com>
Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoALSA: oxygen: modify adjust_dg_dac_routing function
Roman Volkov [Fri, 24 Jan 2014 12:18:11 +0000 (16:18 +0400)]
ALSA: oxygen: modify adjust_dg_dac_routing function

commit 1f91ecc14deea9461aca93273d78871ec4d98fcd upstream.

When selecting the audio output destinations (headphones,
FP headphones, multichannel output), the channel routing
should be changed depending on what destination selected.
Also unnecessary I2S channels are digitally muted. This
function called when the user selects the destination
in the ALSA mixer.

Signed-off-by: Roman Volkov <v1ron@mail.ru>
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agointel_pstate: Add support for Baytrail turbo P states
Dirk Brandewie [Wed, 12 Feb 2014 18:01:07 +0000 (10:01 -0800)]
intel_pstate: Add support for Baytrail turbo P states

commit 61d8d2abc15e9232c3914c55502b73e559366583 upstream.

A documentation update exposed the existance of the turbo ratio
register. Update baytrail support to use the turbo range.

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agointel_pstate: Add setting voltage value for baytrail P states.
Dirk Brandewie [Wed, 18 Dec 2013 18:32:39 +0000 (10:32 -0800)]
intel_pstate: Add setting voltage value for baytrail P states.

commit 007bea098b869945a462420a1f9d442ff169f722 upstream.

Baytrail requires setting P state and voltage pairs when adjusting the
requested P state.  Add function for retrieving the valid voltage
values and modify *_set_pstate() functions to caluclate the
appropriate voltage for the requested P state.

Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoaudit: don't generate loginuid log when audit disabled
Gao feng [Fri, 1 Nov 2013 11:34:45 +0000 (19:34 +0800)]
audit: don't generate loginuid log when audit disabled

commit c2412d91c68426e22add16550f97ae5cd988a159 upstream.

If audit is disabled, we shouldn't generate loginuid audit
log.

Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoBtrfs: fix data corruption when reading/updating compressed extents
Filipe David Borba Manana [Sat, 8 Feb 2014 15:47:46 +0000 (15:47 +0000)]
Btrfs: fix data corruption when reading/updating compressed extents

commit a2aa75e18a21b21952dc6daa9bac7c9f4426f81f upstream.

When using a mix of compressed file extents and prealloc extents, it
is possible to fill a page of a file with random, garbage data from
some unrelated previous use of the page, instead of a sequence of zeroes.

A simple sequence of steps to get into such case, taken from the test
case I made for xfstests, is:

   _scratch_mkfs
   _scratch_mount "-o compress-force=lzo"
   $XFS_IO_PROG -f -c "pwrite -S 0x06 -b 18670 266978 18670" $SCRATCH_MNT/foobar
   $XFS_IO_PROG -c "falloc 26450 665194" $SCRATCH_MNT/foobar
   $XFS_IO_PROG -c "truncate 542872" $SCRATCH_MNT/foobar
   $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar

This results in the following file items in the fs tree:

   item 4 key (257 INODE_ITEM 0) itemoff 15879 itemsize 160
       inode generation 6 transid 6 size 542872 block group 0 mode 100600
   item 5 key (257 INODE_REF 256) itemoff 15863 itemsize 16
       inode ref index 2 namelen 6 name: foobar
   item 6 key (257 EXTENT_DATA 0) itemoff 15810 itemsize 53
       extent data disk byte 0 nr 0 gen 6
       extent data offset 0 nr 24576 ram 266240
       extent compression 0
   item 7 key (257 EXTENT_DATA 24576) itemoff 15757 itemsize 53
       prealloc data disk byte 12849152 nr 241664 gen 6
       prealloc data offset 0 nr 241664
   item 8 key (257 EXTENT_DATA 266240) itemoff 15704 itemsize 53
       extent data disk byte 12845056 nr 4096 gen 6
       extent data offset 0 nr 20480 ram 20480
       extent compression 2
   item 9 key (257 EXTENT_DATA 286720) itemoff 15651 itemsize 53
       prealloc data disk byte 13090816 nr 405504 gen 6
       prealloc data offset 0 nr 258048

The on disk extent at offset 266240 (which corresponds to 1 single disk block),
contains 5 compressed chunks of file data. Each of the first 4 compress 4096
bytes of file data, while the last one only compresses 3024 bytes of file data.
Therefore a read into the file region [285648 ; 286720[ (length = 4096 - 3024 =
1072 bytes) should always return zeroes (our next extent is a prealloc one).

The solution here is the compression code path to zero the remaining (untouched)
bytes of the last page it uncompressed data into, as the information about how
much space the file data consumes in the last page is not known in the upper layer
fs/btrfs/extent_io.c:__do_readpage(). In __do_readpage we were correctly zeroing
the remainder of the page but only if it corresponds to the last page of the inode
and if the inode's size is not a multiple of the page size.

This would cause not only returning random data on reads, but also permanently
storing random data when updating parts of the region that should be zeroed.
For the example above, it means updating a single byte in the region [285648 ; 286720[
would store that byte correctly but also store random data on disk.

A test case for xfstests follows soon.

Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoBtrfs: fix tree mod logging
Filipe David Borba Manana [Fri, 20 Dec 2013 15:17:46 +0000 (15:17 +0000)]
Btrfs: fix tree mod logging

commit 5de865eebb8330eee19c37b31fb6f315a09d4273 upstream.

While running the test btrfs/004 from xfstests in a loop, it failed
about 1 time out of 20 runs in my desktop. The failure happened in
the backref walking part of the test, and the test's error message was
like this:

#  btrfs/004 93s ... [failed, exit status 1] - output mismatch (see /home/fdmanana/git/hub/xfstests_2/results//btrfs/004.out.bad)
#      --- tests/btrfs/004.out 2013-11-26 18:25:29.263333714 +0000
#      +++ /home/fdmanana/git/hub/xfstests_2/results//btrfs/004.out.bad 2013-12-10 15:25:10.327518516 +0000
#      @@ -1,3 +1,8 @@
#       QA output created by 004
#       *** test backref walking
#      -*** done
#      +unexpected output from
#      + /home/fdmanana/git/hub/btrfs-progs/btrfs inspect-internal logical-resolve -P 141512704 /home/fdmanana/btrfs-tests/scratch_1
#      +expected inum: 405, expected address: 454656, file: /home/fdmanana/btrfs-tests/scratch_1/snap1/p0/d6/d3d/d156/fce, got:
#      +
       ...
       (Run 'diff -u tests/btrfs/004.out /home/fdmanana/git/hub/xfstests_2/results//btrfs/004.out.bad' to see the entire diff)
  Ran: btrfs/004
  Failures: btrfs/004
  Failed 1 of 1 tests

But immediately after the test finished, the btrfs inspect-internal command
returned the expected output:

  $ btrfs inspect-internal logical-resolve -P 141512704 /home/fdmanana/btrfs-tests/scratch_1
  inode 405 offset 454656 root 258
  inode 405 offset 454656 root 5

It turned out this was because the btrfs_search_old_slot() calls performed
during backref walking (backref.c:__resolve_indirect_ref) were not finding
anything. The reason for this turned out to be that the tree mod logging
code was not logging some node multi-step operations atomically, therefore
btrfs_search_old_slot() callers iterated often over an incomplete tree that
wasn't fully consistent with any tree state from the past. Besides missing
items, this often (but not always) resulted in -EIO errors during old slot
searches, reported in dmesg like this:

[ 4299.933936] ------------[ cut here ]------------
[ 4299.933949] WARNING: CPU: 0 PID: 23190 at fs/btrfs/ctree.c:1343 btrfs_search_old_slot+0x57b/0xab0 [btrfs]()
[ 4299.933950] Modules linked in: btrfs raid6_pq xor pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) bnep rfcomm bluetooth parport_pc ppdev binfmt_misc joydev snd_hda_codec_h
[ 4299.933977] CPU: 0 PID: 23190 Comm: btrfs Tainted: G        W  O 3.12.0-fdm-btrfs-next-16+ #70
[ 4299.933978] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z77 Pro4, BIOS P1.50 09/04/2012
[ 4299.933979]  000000000000053f ffff8806f3fd98f8 ffffffff8176d284 0000000000000007
[ 4299.933982]  0000000000000000 ffff8806f3fd9938 ffffffff8104a81c ffff880659c64b70
[ 4299.933984]  ffff880659c643d0 ffff8806599233d8 ffff880701e2e938 0000160000000000
[ 4299.933987] Call Trace:
[ 4299.933991]  [<ffffffff8176d284>] dump_stack+0x55/0x76
[ 4299.933994]  [<ffffffff8104a81c>] warn_slowpath_common+0x8c/0xc0
[ 4299.933997]  [<ffffffff8104a86a>] warn_slowpath_null+0x1a/0x20
[ 4299.934003]  [<ffffffffa065d3bb>] btrfs_search_old_slot+0x57b/0xab0 [btrfs]
[ 4299.934005]  [<ffffffff81775f3b>] ? _raw_read_unlock+0x2b/0x50
[ 4299.934010]  [<ffffffffa0655001>] ? __tree_mod_log_search+0x81/0xc0 [btrfs]
[ 4299.934019]  [<ffffffffa06dd9b0>] __resolve_indirect_refs+0x130/0x5f0 [btrfs]
[ 4299.934027]  [<ffffffffa06a21f1>] ? free_extent_buffer+0x61/0xc0 [btrfs]
[ 4299.934034]  [<ffffffffa06de39c>] find_parent_nodes+0x1fc/0xe40 [btrfs]
[ 4299.934042]  [<ffffffffa06b13e0>] ? defrag_lookup_extent+0xe0/0xe0 [btrfs]
[ 4299.934048]  [<ffffffffa06b13e0>] ? defrag_lookup_extent+0xe0/0xe0 [btrfs]
[ 4299.934056]  [<ffffffffa06df980>] iterate_extent_inodes+0xe0/0x250 [btrfs]
[ 4299.934058]  [<ffffffff817762db>] ? _raw_spin_unlock+0x2b/0x50
[ 4299.934065]  [<ffffffffa06dfb82>] iterate_inodes_from_logical+0x92/0xb0 [btrfs]
[ 4299.934071]  [<ffffffffa06b13e0>] ? defrag_lookup_extent+0xe0/0xe0 [btrfs]
[ 4299.934078]  [<ffffffffa06b7015>] btrfs_ioctl+0xf65/0x1f60 [btrfs]
[ 4299.934080]  [<ffffffff811658b8>] ? handle_mm_fault+0x278/0xb00
[ 4299.934083]  [<ffffffff81075563>] ? up_read+0x23/0x40
[ 4299.934085]  [<ffffffff8177a41c>] ? __do_page_fault+0x20c/0x5a0
[ 4299.934088]  [<ffffffff811b2946>] do_vfs_ioctl+0x96/0x570
[ 4299.934090]  [<ffffffff81776e23>] ? error_sti+0x5/0x6
[ 4299.934093]  [<ffffffff810b71e8>] ? trace_hardirqs_off_caller+0x28/0xd0
[ 4299.934096]  [<ffffffff81776a09>] ? retint_swapgs+0xe/0x13
[ 4299.934098]  [<ffffffff811b2eb1>] SyS_ioctl+0x91/0xb0
[ 4299.934100]  [<ffffffff813eecde>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 4299.934102]  [<ffffffff8177ef12>] system_call_fastpath+0x16/0x1b
[ 4299.934102]  [<ffffffff8177ef12>] system_call_fastpath+0x16/0x1b
[ 4299.934104] ---[ end trace 48f0cfc902491414 ]---
[ 4299.934378] btrfs bad fsid on block 0

These tree mod log operations that must be performed atomically, tree_mod_log_free_eb,
tree_mod_log_eb_copy, tree_mod_log_insert_root and tree_mod_log_insert_move, used to
be performed atomically before the following commit:

  c8cc6341653721b54760480b0d0d9b5f09b46741
  (Btrfs: stop using GFP_ATOMIC for the tree mod log allocations)

That change removed the atomicity of such operations. This patch restores the
atomicity while still not doing the GFP_ATOMIC allocations of tree_mod_elem
structures, so it has to do the allocations using GFP_NOFS before acquiring
the mod log lock.

This issue has been experienced by several users recently, such as for example:

  http://www.spinics.net/lists/linux-btrfs/msg28574.html

After running the btrfs/004 test for 679 consecutive iterations with this
patch applied, I didn't ran into the issue anymore.

Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoBtrfs: return immediately if tree log mod is not necessary
Filipe David Borba Manana [Thu, 12 Dec 2013 19:19:52 +0000 (19:19 +0000)]
Btrfs: return immediately if tree log mod is not necessary

commit 783577663507411e36e459390ef056556e93ef29 upstream.

In ctree.c:tree_mod_log_set_node_key() we were calling
__tree_mod_log_insert_key() even when the modification doesn't need
to be logged. This would allocate a tree_mod_elem structure, fill it
and pass it to  __tree_mod_log_insert(), which would just acquire
the tree mod log write lock and then free the tree_mod_elem structure
and return (that is, a no-op).

Therefore call tree_mod_log_insert() instead of __tree_mod_log_insert()
which just returns immediately if the modification doesn't need to be
logged (without allocating the structure, fill it, acquire write lock,
free structure).

Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agox86, fpu: Check tsk_used_math() in kernel_fpu_end() for eager FPU
Suresh Siddha [Mon, 3 Feb 2014 06:56:23 +0000 (22:56 -0800)]
x86, fpu: Check tsk_used_math() in kernel_fpu_end() for eager FPU

commit 731bd6a93a6e9172094a2322bd0ee964bb1f4d63 upstream.

For non-eager fpu mode, thread's fpu state is allocated during the first
fpu usage (in the context of device not available exception). This
(math_state_restore()) can be a blocking call and hence we enable
interrupts (which were originally disabled when the exception happened),
allocate memory and disable interrupts etc.

But the eager-fpu mode, call's the same math_state_restore() from
kernel_fpu_end(). The assumption being that tsk_used_math() is always
set for the eager-fpu mode and thus avoid the code path of enabling
interrupts, allocating fpu state using blocking call and disable
interrupts etc.

But the below issue was noticed by Maarten Baert, Nate Eldredge and
few others:

If a user process dumps core on an ecrypt fs while aesni-intel is loaded,
we get a BUG() in __find_get_block() complaining that it was called with
interrupts disabled; then all further accesses to our ecrypt fs hang
and we have to reboot.

The aesni-intel code (encrypting the core file that we are writing) needs
the FPU and quite properly wraps its code in kernel_fpu_{begin,end}(),
the latter of which calls math_state_restore(). So after kernel_fpu_end(),
interrupts may be disabled, which nobody seems to expect, and they stay
that way until we eventually get to __find_get_block() which barfs.

For eager fpu, most the time, tsk_used_math() is true. At few instances
during thread exit, signal return handling etc, tsk_used_math() might
be false.

In kernel_fpu_end(), for eager-fpu, call math_state_restore()
only if tsk_used_math() is set. Otherwise, don't bother. Kernel code
path which cleared tsk_used_math() knows what needs to be done
with the fpu state.

Reported-by: Maarten Baert <maarten-baert@hotmail.com>
Reported-by: Nate Eldredge <nate@thatsmathematics.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Suresh Siddha <sbsiddha@gmail.com>
Link: http://lkml.kernel.org/r/1391410583.3801.6.camel@europa
Cc: George Spelvin <linux@horizon.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoSCSI: storvsc: NULL pointer dereference fix
Ales Novak [Thu, 27 Feb 2014 10:03:30 +0000 (11:03 +0100)]
SCSI: storvsc: NULL pointer dereference fix

commit b12bb60d6c350b348a4e1460cd68f97ccae9822e upstream.

If the initialization of storvsc fails, the storvsc_device_destroy()
causes NULL pointer dereference.

storvsc_bus_scan()
  scsi_scan_target()
    __scsi_scan_target()
      scsi_probe_and_add_lun(hostdata=NULL)
        scsi_alloc_sdev(hostdata=NULL)

  sdev->hostdata = hostdata

  now the host allocation fails

          __scsi_remove_device(sdev)

  calls sdev->host->hostt->slave_destroy() ==
  storvsc_device_destroy(sdev)
    access of sdev->hostdata->request_mempool

Signed-off-by: Ales Novak <alnovak@suse.cz>
Signed-off-by: Thomas Abraham <tabraham@suse.com>
Reviewed-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoSCSI: qla2xxx: Fix multiqueue MSI-X registration.
Chad Dupuis [Wed, 26 Feb 2014 09:15:14 +0000 (04:15 -0500)]
SCSI: qla2xxx: Fix multiqueue MSI-X registration.

commit f324777ea88bab2522602671e46fc0851d7d5e35 upstream.

This fixes requesting of the MSI-X vectors for the base response queue.
The iteration in the for loop in qla24xx_enable_msix() was incorrect.
We should only iterate of the first two MSI-X vectors and not the total
number of MSI-X vectors that have given to the driver for this device
from pci_enable_msix() in this function.

Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoSCSI: qla2xxx: Poll during initialization for ISP25xx and ISP83xx
Giridhar Malavali [Wed, 26 Feb 2014 09:15:12 +0000 (04:15 -0500)]
SCSI: qla2xxx: Poll during initialization for ISP25xx and ISP83xx

commit b77ed25c9f8402e8b3e49e220edb4ef09ecfbb53 upstream.

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoSCSI: isci: correct erroneous for_each_isci_host macro
Lukasz Dorau [Thu, 6 Feb 2014 20:23:20 +0000 (12:23 -0800)]
SCSI: isci: correct erroneous for_each_isci_host macro

commit c59053a23d586675c25d789a7494adfdc02fba57 upstream.

In the first place, the loop 'for' in the macro 'for_each_isci_host'
(drivers/scsi/isci/host.h:314) is incorrect, because it accesses
the 3rd element of 2 element array. After the 2nd iteration it executes
the instruction:
        ihost = to_pci_info(pdev)->hosts[2]
(while the size of the 'hosts' array equals 2) and reads an
out of range element.

In the second place, this loop is incorrectly optimized by GCC v4.8
(see http://marc.info/?l=linux-kernel&m=138998871911336&w=2).
As a result, on platforms with two SCU controllers,
the loop is executed more times than it can be (for i=0,1 and 2).
It causes kernel panic during entering the S3 state
and the following oops after 'rmmod isci':

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff8131360b>] __list_add+0x1b/0xc0
Oops: 0000 [#1] SMP
RIP: 0010:[<ffffffff8131360b>]  [<ffffffff8131360b>] __list_add+0x1b/0xc0
Call Trace:
  [<ffffffff81661b84>] __mutex_lock_slowpath+0x114/0x1b0
  [<ffffffff81661c3f>] mutex_lock+0x1f/0x30
  [<ffffffffa03e97cb>] sas_disable_events+0x1b/0x50 [libsas]
  [<ffffffffa03e9818>] sas_unregister_ha+0x18/0x60 [libsas]
  [<ffffffffa040316e>] isci_unregister+0x1e/0x40 [isci]
  [<ffffffffa0403efd>] isci_pci_remove+0x5d/0x100 [isci]
  [<ffffffff813391cb>] pci_device_remove+0x3b/0xb0
  [<ffffffff813fbf7f>] __device_release_driver+0x7f/0xf0
  [<ffffffff813fc8f8>] driver_detach+0xa8/0xb0
  [<ffffffff813fbb8b>] bus_remove_driver+0x9b/0x120
  [<ffffffff813fcf2c>] driver_unregister+0x2c/0x50
  [<ffffffff813381f3>] pci_unregister_driver+0x23/0x80
  [<ffffffffa04152f8>] isci_exit+0x10/0x1e [isci]
  [<ffffffff810d199b>] SyS_delete_module+0x16b/0x2d0
  [<ffffffff81012a21>] ? do_notify_resume+0x61/0xa0
  [<ffffffff8166ce29>] system_call_fastpath+0x16/0x1b

The loop has been corrected.
This patch fixes kernel panic during entering the S3 state
and the above oops.

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Reviewed-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Tested-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoSCSI: isci: fix reset timeout handling
Dan Williams [Thu, 6 Feb 2014 20:23:01 +0000 (12:23 -0800)]
SCSI: isci: fix reset timeout handling

commit ddfadd7736b677de2d4ca2cd5b4b655368c85a7a upstream.

Remove an erroneous BUG_ON() in the case of a hard reset timeout.  The
reset timeout handler puts the port into the "awaiting link-up" state.
The timeout causes the device to be disconnected and we need to be in
the awaiting link-up state to re-connect the port.  The BUG_ON() made
the incorrect assumption that resets never timeout and we always
complete the reset in the "resetting" state.

Testing this patch also uncovered that libata continues to attempt to
reset the port long after the driver has torn down the context.  Once
the driver has committed to abandoning the link it must indicate to
libata that recovery ends by returning -ENODEV from
->lldd_I_T_nexus_reset().

Acked-by: Lukasz Dorau <lukasz.dorau@intel.com>
Reported-by: David Milburn <dmilburn@redhat.com>
Reported-by: Xun Ni <xun.ni@intel.com>
Tested-by: Xun Ni <xun.ni@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agocan: flexcan: flexcan_remove(): add missing netif_napi_del()
Marc Kleine-Budde [Fri, 28 Feb 2014 19:48:36 +0000 (20:48 +0100)]
can: flexcan: flexcan_remove(): add missing netif_napi_del()

commit d96e43e8fce28cf97df576a07af9d65657a41a6f upstream.

This patch adds the missing netif_napi_del() to the flexcan_remove() function.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agocan: flexcan: factor out transceiver {en,dis}able into seperate functions
Marc Kleine-Budde [Fri, 28 Feb 2014 16:18:27 +0000 (17:18 +0100)]
can: flexcan: factor out transceiver {en,dis}able into seperate functions

commit f003698e23f6f56a791774f14d0ac35d04872490 upstream.

This patch moves the transceiver enable and disable into seperate functions,
where the NULL pointer check is hidden.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agocan: flexcan: fix transition from and to low power mode in chip_{en,dis}able
Marc Kleine-Budde [Fri, 28 Feb 2014 14:30:18 +0000 (15:30 +0100)]
can: flexcan: fix transition from and to low power mode in chip_{en,dis}able

commit 9b00b300e7bce032c467c36ca47fe2a776887fc2 upstream.

In flexcan_chip_enable() and flexcan_chip_disable() fixed delays are used.
Experiments have shown that the transition from and to low power mode may take
several microseconds.

This patch adds a while loop which polls the Low Power Mode ACK bit (LPM_ACK)
that indicates a successfull mode change. If the function runs into a timeout a
error value is returned.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agocan: flexcan: flexcan_open(): fix error path if flexcan_chip_start() fails
Marc Kleine-Budde [Fri, 28 Feb 2014 13:52:01 +0000 (14:52 +0100)]
can: flexcan: flexcan_open(): fix error path if flexcan_chip_start() fails

commit 7e9e148af01ef388efb6e2490805970be4622792 upstream.

If flexcan_chip_start() in flexcan_open() fails, the interrupt is not freed,
this patch adds the missing cleanup.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agocan: flexcan: fix shutdown: first disable chip, then all interrupts
Marc Kleine-Budde [Wed, 19 Feb 2014 11:00:51 +0000 (12:00 +0100)]
can: flexcan: fix shutdown: first disable chip, then all interrupts

commit 5be93bdda64e85450598c6e97f79fb8f6acf30e0 upstream.

When shutting down the CAN interface (ifconfig canX down) during high CAN bus
loads, the CAN core might hang and freeze the whole CPU.

This patch fixes the shutdown sequence by first disabling the CAN core then
disabling all interrupts.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agonet: unix socket code abuses csum_partial
Anton Blanchard [Wed, 5 Mar 2014 03:29:58 +0000 (14:29 +1100)]
net: unix socket code abuses csum_partial

commit 0a13404dd3bf4ea870e3d96270b5a382edca85c0 upstream.

The unix socket code is using the result of csum_partial to
hash into a lookup table:

unix_hash_fold(csum_partial(sunaddr, len, 0));

csum_partial is only guaranteed to produce something that can be
folded into a checksum, as its prototype explains:

 * returns a 32-bit number suitable for feeding into itself
 * or csum_tcpudp_magic

The 32bit value should not be used directly.

Depending on the alignment, the ppc64 csum_partial will return
different 32bit partial checksums that will fold into the same
16bit checksum.

This difference causes the following testcase (courtesy of
Gustavo) to sometimes fail:

#include <sys/socket.h>
#include <stdio.h>

int main()
{
int fd = socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC, 0);

int i = 1;
setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &i, 4);

struct sockaddr addr;
addr.sa_family = AF_LOCAL;
bind(fd, &addr, 2);

listen(fd, 128);

struct sockaddr_storage ss;
socklen_t sslen = (socklen_t)sizeof(ss);
getsockname(fd, (struct sockaddr*)&ss, &sslen);

fd = socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC, 0);

if (connect(fd, (struct sockaddr*)&ss, sslen) == -1){
perror(NULL);
return 1;
}
printf("OK\n");
return 0;
}

As suggested by davem, fix this by using csum_fold to fold the
partial 32bit checksum into a 16bit checksum before using it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodm cache: fix access beyond end of origin device
Heinz Mauelshagen [Wed, 12 Mar 2014 15:13:39 +0000 (16:13 +0100)]
dm cache: fix access beyond end of origin device

commit e893fba90c09f9b57fb97daae204ea9cc2c52fa5 upstream.

In order to avoid wasting cache space a partial block at the end of the
origin device is not cached.  Unfortunately, the check for such a
partial block at the end of the origin device was flawed.

Fix accesses beyond the end of the origin device that occured due to
attempted promotion of an undetected partial block by:

- initializing the per bio data struct to allow cache_end_io to work properly
- recognizing access to the partial block at the end of the origin device
- avoiding out of bounds access to the discard bitset

Otherwise, users can experience errors like the following:

 attempt to access beyond end of device
 dm-5: rw=0, want=20971520, limit=20971456
 ...
 device-mapper: cache: promotion failed; couldn't copy block

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodm cache: fix truncation bug when copying a block to/from >2TB fast device
Heinz Mauelshagen [Tue, 11 Mar 2014 23:40:05 +0000 (00:40 +0100)]
dm cache: fix truncation bug when copying a block to/from >2TB fast device

commit 8b9d96666529a979acf4825391efcc7c8a3e9f12 upstream.

During demotion or promotion to a cache's >2TB fast device we must not
truncate the cache block's associated sector to 32bits.  The 32bit
temporary result of from_cblock() caused a 32bit multiplication when
calculating the sector of the fast device in issue_copy_real().

Use an intermediate 64bit type to store the 32bit from_cblock() to allow
for proper 64bit multiplication.

Here is an example of how this bug manifests on an ext4 filesystem:

 EXT4-fs error (device dm-0): ext4_mb_generate_buddy:756: group 17136, 32768 clusters in bitmap, 30688 in gd; block bitmap corrupt.
 JBD2: Spotted dirty metadata buffer (dev = dm-0, blocknr = 0). There's a risk of filesystem corruption in case of system crash.

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodm space map metadata: fix refcount decrement below 0 which caused corruption
Joe Thornber [Fri, 7 Mar 2014 14:57:19 +0000 (14:57 +0000)]
dm space map metadata: fix refcount decrement below 0 which caused corruption

commit cebc2de44d3bce53e46476e774126c298ca2c8a9 upstream.

This has been a relatively long-standing issue that wasn't nailed down
until Teng-Feng Yang's meticulous bug report to dm-devel on 3/7/2014,
see: http://www.redhat.com/archives/dm-devel/2014-March/msg00021.html

From that report:
  "When decreasing the reference count of a metadata block with its
  reference count equals 3, we will call dm_btree_remove() to remove
  this enrty from the B+tree which keeps the reference count info in
  metadata device.

  The B+tree will try to rebalance the entry of the child nodes in each
  node it traversed, and the rebalance process contains the following
  steps.

  (1) Finding the corresponding children in current node (shadow_current(s))
  (2) Shadow the children block (issue BOP_INC)
  (3) redistribute keys among children, and free children if necessary (issue BOP_DEC)

  Since the update of a metadata block's reference count could be
  recursive, we will stash these reference count update operations in
  smm->uncommitted and then process them in a FILO fashion.

  The problem is that step(3) could free the children which is created
  in step(2), so the BOP_DEC issued in step(3) will be carried out
  before the BOP_INC issued in step(2) since these BOPs will be
  processed in FILO fashion. Once the BOP_DEC from step(3) tries to
  decrease the reference count of newly shadow block, it will report
  failure for its reference equals 0 before decreasing. It looks like we
  can solve this issue by processing these BOPs in a FIFO fashion
  instead of FILO."

Commit 5b564d80 ("dm space map: disallow decrementing a reference count
below zero") changed the code to report an error for this temporary
refcount decrement below zero.  So what was previously a harmless
invalid refcount became a hard failure due to the new error path:

 device-mapper: space map common: unable to decrement a reference count below 0
 device-mapper: thin: 253:6: dm_thin_insert_block() failed: error = -22
 device-mapper: thin: 253:6: switching pool to read-only mode

This bug is in dm persistent-data code that is common to the DM thin and
cache targets.  So any users of those targets should apply this fix.

Fix this by applying recursive space map operations in FIFO order rather
than FILO.

Resolves: https://bugzilla.kernel.org/show_bug.cgi?id=68801

Reported-by: Apollon Oikonomopoulos <apoikos@debian.org>
Reported-by: edwillam1007@gmail.com
Reported-by: Teng-Feng Yang <shinrairis@gmail.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>