git.itanic.dy.fi Git - linux-stable/log

Linux 3.12.12

pinctrl: protect pinctrl_list add

commit 7b320cb1ed2dbd2c5f2a778197baf76fd6bf545a upstream.

We have few fedora bug reports about list corruption on pinctrl,
for example:
https://bugzilla.redhat.com/show_bug.cgi?id=1051918

Most likely corruption happen due lack of protection of pinctrl_list
when adding new nodes to it. Patch corrects that.

Fixes: 42fed7ba44e ("pinctrl: move subsystem mutex to pinctrl_dev struct")
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

pinctrl: vt8500: Change devicetree data parsing

commit f17248ed868767567298e1cdf06faf8159a81f7c upstream.

Due to an assumption in the VT8500 pinctrl driver, the value passed
from devicetree for 'wm,pull' was not explicitly translated before
being passed to pinconf.

Since v3.10, changes to 'enum pin_config_param', PIN_CONFIG_BIAS_PULL_(UP/DOWN)
no longer map 1-to-1 with the expected values in devicetree.

This patch adds a small translation between the devicetree values (0..2)
and the enum pin_config_param equivalent values.

Signed-off-by: Tony Prisk <linux@prisktech.co.nz>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

pinctrl: at91: use locked variant of irq_set_handler

commit b0dcfd87323ea86501e93d0fa2a98d2fd3579bcf upstream.

When setting the gpio irq type, use the __irq_set_handler_locked()
variant instead of the irq_set_handler() to prevent false
spinlock recursion warning.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

genirq: Generic irq chip requires IRQ_DOMAIN

commit 923fa4ea382f592dee2ba3b205befb90cbddf3af upstream.

The generic_chip.c uses interfaces from irq_domain.c which is
controlled by the IRQ_DOMAIN config option, but there is no Kconfig
dependency so the build can fail:

linux/kernel/irq/generic-chip.c:400:11: error:
'irq_domain_xlate_onetwocell' undeclared here (not in a function)

Select IRQ_DOMAIN when GENERIC_IRQ_CHIP is selected.

Signed-off-by: Nitin A Kamble <nitin.a.kamble@intel.com>
Link: http://lkml.kernel.org/r/1391129410-54548-2-git-send-email-nitin.a.kamble@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86, hweight: Fix BUG when booting with CONFIG_GCOV_PROFILE_ALL=y

commit 6583327c4dd55acbbf2a6f25e775b28b3abf9a42 upstream.

Commit d61931d89b, "x86: Add optimized popcnt variants" introduced
compile flag -fcall-saved-rdi for lib/hweight.c. When combined with
options -fprofile-arcs and -O2, this flag causes gcc to generate
broken constructor code. As a result, a 64 bit x86 kernel compiled
with CONFIG_GCOV_PROFILE_ALL=y prints message "gcov: could not create
file" and runs into sproadic BUGs during boot.

The gcc people indicate that these kinds of problems are endemic when
using ad hoc calling conventions. It is therefore best to treat any
file compiled with ad hoc calling conventions as an isolated
environment and avoid things like profiling or coverage analysis,
since those subsystems assume a "normal" calling conventions.

This patch avoids the bug by excluding lib/hweight.o from coverage
profiling.

Reported-by: Meelis Roos <mroos@linux.ee>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/52F3A30C.7050205@linux.vnet.ibm.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Revert "[media] videobuf_vm_{open,close} race fixes"

commit cca36e2eecec2b8fc869a50ffd3bd0adeed92b8b upstream.

This reverts commit a242f426108c284049a69710f871cc9f11b13e61.

That commit actually caused deadlocks, rather then fixing them.

If ext_lock is set to NULL (otherwise videobuf_queue_lock doesn't do
anything), then you get this deadlock:

The driver's mmap function calls videobuf_mmap_mapper which calls
videobuf_queue_lock on q. videobuf_mmap_mapper calls __videobuf_mmap_mapper,
__videobuf_mmap_mapper calls videobuf_vm_open and videobuf_vm_open
calls videobuf_queue_lock on q (introduced by above patch): deadlocked.

This affects drivers using dma-contig and dma-vmalloc. Only dma-sg is
not affected since it doesn't call videobuf_vm_open from __videobuf_mmap_mapper.

Most drivers these days have a non-NULL ext_lock. Those that still use
NULL there are all fairly obscure drivers, which is why this hasn't been
seen earlier.

Since everything worked perfectly fine for many years I prefer to just
revert this patch rather than trying to fix it. videobuf is quite fragile
and I rather not touch it too much. Work is (slowly) progressing to move
everything over to vb2 or at the very least use non-NULL ext_lock in
videobuf.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Reported-by: Pete Eberlein <pete@sensoray.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mxl111sf: Fix compile when CONFIG_DVB_USB_MXL111SF is unset

commit 13e1b87c986100169b0695aeb26970943665eda9 upstream.

Fix the following build error:

drivers/media/usb/dvb-usb-v2/
mxl111sf-tuner.h:72:9: error: expected ‘;’, ‘,’ or ‘)’ before ‘struct’
struct mxl111sf_tuner_config *cfg)

Signed-off-by: Dave Jones <davej@fedoraproject.org>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mxl111sf: Fix unintentional garbage stack read

commit 866e8d8a9dc1ebb4f9e67197e264ac2df81f7d4b upstream.

mxl111sf_read_reg takes an address of a variable to write to as an argument.
drivers/media/usb/dvb-usb-v2/mxl111sf-gpio.c:mxl111sf_config_pin_mux_modes
passes several uninitialized stack variables to this routine, expecting
them to be filled in. In the event that something unexpected happens when
reading from the chip, we end up doing a pr_debug of the value passed in,
revealing whatever garbage happened to be on the stack.

Change the pr_debug to match what happens in the 'success' case, where we
assign buf[1] to *data.

Spotted with Coverity (Bugs 731910 through 731917)

Signed-off-by: Dave Jones <davej@fedoraproject.org>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

af9035: add ID [2040:f900] Hauppauge WinTV-MiniStick 2

commit f2e4c5e004691dfe37d0e4b363296f28abdb9bc7 upstream.

Add USB ID [2040:f900] for Hauppauge WinTV-MiniStick 2.
Device is build upon IT9135 chipset.

Tested-by: Stefan Becker <schtefan@gmx.net>
Signed-off-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86: mm: change tlb_flushall_shift for IvyBridge

commit f98b7a772ab51b52ca4d2a14362fc0e0c8a2e0f3 upstream.

There was a large performance regression that was bisected to
commit 611ae8e3 ("x86/tlb: enable tlb flush range support for
x86").  This patch simply changes the default balance point
between a local and global flush for IvyBridge.

In the interest of allowing the tests to be reproduced, this
patch was tested using mmtests 0.15 with the following
configurations

configs/config-global-dhp__tlbflush-performance
configs/config-global-dhp__scheduler-performance
configs/config-global-dhp__network-performance

Results are from two machines

Ivybridge   4 threads:  Intel(R) Core(TM) i3-3240 CPU @ 3.40GHz
Ivybridge   8 threads:  Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz

Page fault microbenchmark showed nothing interesting.

Ebizzy was configured to run multiple iterations and threads.
Thread counts ranged from 1 to NR_CPUS*2. For each thread count,
it ran 100 iterations and each iteration lasted 10 seconds.

Ivybridge 4 threads
                    3.13.0-rc7            3.13.0-rc7
                       vanilla           altshift-v3
Mean   1     6395.44 (  0.00%)     6789.09 (  6.16%)
Mean   2     7012.85 (  0.00%)     8052.16 ( 14.82%)
Mean   3     6403.04 (  0.00%)     6973.74 (  8.91%)
Mean   4     6135.32 (  0.00%)     6582.33 (  7.29%)
Mean   5     6095.69 (  0.00%)     6526.68 (  7.07%)
Mean   6     6114.33 (  0.00%)     6416.64 (  4.94%)
Mean   7     6085.10 (  0.00%)     6448.51 (  5.97%)
Mean   8     6120.62 (  0.00%)     6462.97 (  5.59%)

Ivybridge 8 threads
                     3.13.0-rc7            3.13.0-rc7
                        vanilla           altshift-v3
Mean   1      7336.65 (  0.00%)     7787.02 (  6.14%)
Mean   2      8218.41 (  0.00%)     9484.13 ( 15.40%)
Mean   3      7973.62 (  0.00%)     8922.01 ( 11.89%)
Mean   4      7798.33 (  0.00%)     8567.03 (  9.86%)
Mean   5      7158.72 (  0.00%)     8214.23 ( 14.74%)
Mean   6      6852.27 (  0.00%)     7952.45 ( 16.06%)
Mean   7      6774.65 (  0.00%)     7536.35 ( 11.24%)
Mean   8      6510.50 (  0.00%)     6894.05 (  5.89%)
Mean   12     6182.90 (  0.00%)     6661.29 (  7.74%)
Mean   16     6100.09 (  0.00%)     6608.69 (  8.34%)

Ebizzy hits the worst case scenario for TLB range flushing every
time and it shows for these Ivybridge CPUs at least that the
default choice is a poor on.  The patch addresses the problem.

Next was a tlbflush microbenchmark written by Alex Shi at
http://marc.info/?l=linux-kernel&m=133727348217113 .  It
measures access costs while the TLB is being flushed.  The
expectation is that if there are always full TLB flushes that
the benchmark would suffer and it benefits from range flushing

There are 320 iterations of the test per thread count.  The
number of entries is randomly selected with a min of 1 and max
of 512.  To ensure a reasonably even spread of entries, the full
range is broken up into 8 sections and a random number selected
within that section.

iteration 1, random number between 0-64
iteration 2, random number between 64-128 etc

This is still a very weak methodology.  When you do not know
what are typical ranges, random is a reasonable choice but it
can be easily argued that the opimisation was for smaller ranges
and an even spread is not representative of any workload that
matters.  To improve this, we'd need to know the probability
distribution of TLB flush range sizes for a set of workloads
that are considered "common", build a synthetic trace and feed
that into this benchmark.  Even that is not perfect because it
would not account for the time between flushes but there are
limits of what can be reasonably done and still be doing
something useful.  If a representative synthetic trace is
provided then this benchmark could be revisited and the shift values retuned.

Ivybridge 4 threads
                        3.13.0-rc7            3.13.0-rc7
                           vanilla           altshift-v3
Mean       1       10.50 (  0.00%)       10.50 (  0.03%)
Mean       2       17.59 (  0.00%)       17.18 (  2.34%)
Mean       3       22.98 (  0.00%)       21.74 (  5.41%)
Mean       5       47.13 (  0.00%)       46.23 (  1.92%)
Mean       8       43.30 (  0.00%)       42.56 (  1.72%)

Ivybridge 8 threads
                         3.13.0-rc7            3.13.0-rc7
                            vanilla           altshift-v3
Mean       1         9.45 (  0.00%)        9.36 (  0.93%)
Mean       2         9.37 (  0.00%)        9.70 ( -3.54%)
Mean       3         9.36 (  0.00%)        9.29 (  0.70%)
Mean       5        14.49 (  0.00%)       15.04 ( -3.75%)
Mean       8        41.08 (  0.00%)       38.73 (  5.71%)
Mean       13       32.04 (  0.00%)       31.24 (  2.49%)
Mean       16       40.05 (  0.00%)       39.04 (  2.51%)

For both CPUs, average access time is reduced which is good as
this is the benchmark that was used to tune the shift values in
the first place albeit it is now known *how* the benchmark was
used.

The scheduler benchmarks were somewhat inconclusive.  They
showed gains and losses and makes me reconsider how stable those
benchmarks really are or if something else might be interfering
with the test results recently.

Network benchmarks were inconclusive.  Almost all results were
flat except for netperf-udp tests on the 4 thread machine.
These results were unstable and showed large variations between
reboots.  It is unknown if this is a recent problems but I've
noticed before that netperf-udp results tend to vary.

Based on these results, changing the default for Ivybridge seems
like a logical choice.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Tested-by: Davidlohr Bueso <davidlohr@hp.com>
Reviewed-by: Alex Shi <alex.shi@linaro.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-cqnadffh1tiqrshthRj3Esge@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: __set_page_dirty uses spin_lock_irqsave instead of spin_lock_irq

commit 227d53b397a32a7614667b3ecaf1d89902fb6c12 upstream.

To use spin_{un}lock_irq is dangerous if caller disabled interrupt.
During aio buffer migration, we have a possibility to see the following
call stack.

aio_migratepage  [disable interrupt]
  migrate_page_copy
    clear_page_dirty_for_io
      set_page_dirty
        __set_page_dirty_buffers
          __set_page_dirty
            spin_lock_irq

This mean, current aio migration is a deadlockable.  spin_lock_irqsave
is a safer alternative and we should use it.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reported-by: David Rientjes rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: __set_page_dirty_nobuffers() uses spin_lock_irqsave() instead of spin_lock_irq()

commit a85d9df1ea1d23682a0ed1e100e6965006595d06 upstream.

During aio stress test, we observed the following lockdep warning.  This
mean AIO+numa_balancing is currently deadlockable.

The problem is, aio_migratepage disable interrupt, but
__set_page_dirty_nobuffers unintentionally enable it again.

Generally, all helper function should use spin_lock_irqsave() instead of
spin_lock_irq() because they don't know caller at all.

   other info that might help us debug this:
    Possible unsafe locking scenario:

          CPU0
          ----
     lock(&(&ctx->completion_lock)->rlock);
     <Interrupt>
       lock(&(&ctx->completion_lock)->rlock);

    *** DEADLOCK ***

      dump_stack+0x19/0x1b
      print_usage_bug+0x1f7/0x208
      mark_lock+0x21d/0x2a0
      mark_held_locks+0xb9/0x140
      trace_hardirqs_on_caller+0x105/0x1d0
      trace_hardirqs_on+0xd/0x10
      _raw_spin_unlock_irq+0x2c/0x50
      __set_page_dirty_nobuffers+0x8c/0xf0
      migrate_page_copy+0x434/0x540
      aio_migratepage+0xb1/0x140
      move_to_new_page+0x7d/0x230
      migrate_pages+0x5e5/0x700
      migrate_misplaced_page+0xbc/0xf0
      do_numa_page+0x102/0x190
      handle_pte_fault+0x241/0x970
      handle_mm_fault+0x265/0x370
      __do_page_fault+0x172/0x5a0
      do_page_fault+0x1a/0x70
      page_fault+0x28/0x30

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/swap: fix race on swap_info reuse between swapoff and swapon

commit f893ab41e4dae2fe8991faf5d86d029068d1ef3a upstream.

swapoff clear swap_info's SWP_USED flag prematurely and free its
resources after that. A concurrent swapon will reuse this swap_info
while its previous resources are not cleared completely.

These late freed resources are:
- p->percpu_cluster
- swap_cgroup_ctrl[type]
- block_device setting
- inode->i_flags &= ~S_SWAPFILE

This patch clears the SWP_USED flag after all its resources are freed,
so that swapon can reuse this swap_info by alloc_swap_info() safely.

[akpm@linux-foundation.org: tidy up code comment]
Signed-off-by: Weijie Yang <weijie.yang@samsung.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda - Improve loopback path lookups for AD1983

commit 276ab336b4c6e483d12fd46cbf24f97f71867710 upstream.

AD1983 has flexible loopback routes and the generic parser would take
wrong path confusingly instead of taking individual paths via NID 0x0c
and 0x0d. For avoiding it, limit the connections at these widgets so
that the parser can think more straightforwardly. This fixes the
regression of the missing line-in loopback on Dell machine.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=70011
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda - Add missing mixer widget for AD1983

commit c7579fed1f1b2567529aea64ef19871337403ab3 upstream.

The mixer widget on AD1983 at NID 0x0e was missing in the commit
[f2f8be43c5c9: ALSA: hda - Add aamix NID to AD codecs].

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=70011
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda - Fix silent output on Toshiba Satellite L40

commit 4528eb19b00d9ccd65ded6f8201eec704267edd8 upstream.

Toshiba Satellite L40 with AD1986A codec requires the EAPD of NID 0x1b
to be constantly on, otherwise the output doesn't work.
Unlike most of other AD1986A machines, EAPD is correctly implemented
in HD-audio manner (that is, bit set = amp on), so we need to clear
the inv_eapd flag in the fixup, too.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=67481
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda - Fix missing VREF setup for Mac Pro 1,1

commit c20f31ec421ea4fabea5e95a6afd46c5f41e5599 upstream.

Mac Pro 1,1 with ALC889A codec needs the VREF setup on NID 0x18 to
VREF50, in order to make the speaker working. The same fixup was
already needed for MacBook Air 1,1, so we can reuse it.

Reported-by: Nicolai Beuermann <mail@nico-beuermann.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: usb-audio: Add missing kconfig dependecy

commit 4fa71c1550a857ff1dbfe9e99acc1f4cfec5f0d0 upstream.

The commit 44dcbbb1cd61 introduced the usage of bitreverse helpers but
forgot to add the dependency. This patch adds the selection for
CONFIG_BITREVERSE.

Fixes: 44dcbbb1cd61 ('ALSA: snd-usb: add support for bit-reversed byte formats')
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: add DSB after icache flush in __flush_icache_all()

commit 5044bad43ee573d0b6d90e3ccb7a40c2c7d25eb4 upstream.

Add DSB after icache flush to complete the cache maintenance operation.
The function __flush_icache_all() is used only for user space mappings
and an ISB is not required because of an exception return before executing
user instructions. An exception return would behave like an ISB.

Signed-off-by: Vinayak Kale <vkale@apm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: vdso: fix coarse clock handling

commit 069b918623e1510e58dacf178905a72c3baa3ae4 upstream.

When __kernel_clock_gettime is called with a CLOCK_MONOTONIC_COARSE or
CLOCK_REALTIME_COARSE clock id, it returns incorrectly to whatever the
caller has placed in x2 ("ret x2" to return from the fast path).  Fix
this by saving x30/LR to x2 only in code that will call
__do_get_tspec, restoring x30 afterward, and using a plain "ret" to
return from the routine.

Also: while the resulting tv_nsec value for CLOCK_REALTIME and
CLOCK_MONOTONIC must be computed using intermediate values that are
left-shifted by cs_shift (x12, set by __do_get_tspec), the results for
coarse clocks should be calculated using unshifted values
(xtime_coarse_nsec is in units of actual nanoseconds).  The current
code shifts intermediate values by x12 unconditionally, but x12 is
uninitialized when servicing a coarse clock.  Fix this by setting x12
to 0 once we know we are dealing with a coarse clock id.

Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: Invalidate the TLB when replacing pmd entries during boot

commit a55f9929a9b257f84b6cc7b2397379cabd744a22 upstream.

With the 64K page size configuration, __create_page_tables in head.S
maps enough memory to get started but using 64K pages rather than 512M
sections with a single pgd/pud/pmd entry pointing to a pte table.
create_mapping() may override the pgd/pud/pmd table entry with a block
(section) one if the RAM size is more than 512MB and aligned correctly.
For the end of this block to be accessible, the old TLB entry must be
invalidated.

Reported-by: Mark Salter <msalter@redhat.com>
Tested-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: vdso: prevent ld from aligning PT_LOAD segments to 64k

commit 40507403485fcb56b83d6ddfc954e9b08305054c upstream.

Whilst the text segment for our VDSO is marked as PT_LOAD in the ELF
headers, it is mapped by the kernel and not actually subject to
demand-paging. ld doesn't realise this, and emits a p_align field of 64k
(the maximum supported page size), which conflicts with the load address
picked by the kernel on 4k systems, which will be 4k aligned. This
causes GDB to fail with "Failed to read a valid object file image from
memory" when attempting to load the VDSO.

This patch passes the -n option to ld, which prevents it from aligning
PT_LOAD segments to the maximum page size.

Reported-by: Kyle McMartin <kyle@redhat.com>
Acked-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: vdso: update wtm fields for CLOCK_MONOTONIC_COARSE

commit d4022a335271a48cce49df35d825897914fbffe3 upstream.

Update wall-to-monotonic fields in the VDSO data page
unconditionally. These are used to service CLOCK_MONOTONIC_COARSE,
which is not guarded by use_syscall.

Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

irqchip: armada-370-xp: fix IPI race condition

commit a6f089e95b1e08cdea9633d50ad20aa5d44ba64d upstream.

In the Armada 370/XP driver, when we receive an IRQ 0, we read the
list of doorbells that caused the interrupt from register
ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS. This gives the list of IPIs that
were generated. However, instead of acknowledging only the IPIs that
were generated, we acknowledge *all* the IPIs, by writing
~IPI_DOORBELL_MASK in the ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS register.

This creates a race condition: if a new IPI that isn't part of the
ones read into the temporary "ipimask" variable is fired before we
acknowledge all IPIs, then we will simply loose it. This is causing
scheduling hangs on SMP intensive workloads.

It is important to mention that this ARMADA_370_XP_IN_DRBEL_CAUSE_OFFS
register has the following behavior: "A CPU write of 0 clears the bits
in this field. A CPU write of 1 has no effect". This is what allows us
to simply write ~ipimask to acknoledge the handled IPIs.

Notice that the same problem is present in the MSI implementation, but
it will be fixed as a separate patch, so that this IPI fix can be
pushed to older stable versions as appropriate (all the way to 3.8),
while the MSI code only appeared in 3.13.

Signed-off-by: Lior Amsalem <alior@marvell.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Fixes: 344e873e5657e8dc0 'arm: mvebu: Add IPI support via doorbells'
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFSv4: Fix memory corruption in nfs4_proc_open_confirm

commit 17ead6c85c3d0ef57a14d1373f1f1cee2ce60ea8 upstream.

nfs41_wake_and_assign_slot() relies on the task->tk_msg.rpc_argp and
task->tk_msg.rpc_resp always pointing to the session sequence arguments.

nfs4_proc_open_confirm tries to pull a fast one by reusing the open
sequence structure, thus causing corruption of the NFSv4 slot table.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFSv4.1: nfs4_destroy_session must call rpc_destroy_waitqueue

commit 20b9a9024540a775395d5d1f41eec0ec6ec41f9b upstream.

There may still be timers active on the session waitqueues. Make sure
that we kill them before freeing the memory.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

crypto: s390 - fix des and des3_ede ctr concurrency issue

commit ee97dc7db4cbda33e4241c2d85b42d1835bc8a35 upstream.

In s390 des and 3des ctr mode there is one preallocated page
used to speed up the en/decryption. This page is not protected
against concurrent usage and thus there is a potential of data
corruption with multiple threads.

The fix introduces locking/unlocking the ctr page and a slower
fallback solution at concurrency situations.

Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

crypto: s390 - fix des and des3_ede cbc concurrency issue

commit adc3fcf1552b6e406d172fd9690bbd1395053d13 upstream.

In s390 des and des3_ede cbc mode the iv value is not protected
against concurrency access and modifications from another running
en/decrypt operation which is using the very same tfm struct
instance. This fix copies the iv to the local stack before
the crypto operation and stores the value back when done.

Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

crypto: s390 - fix concurrency issue in aes-ctr mode

commit 0519e9ad89e5cd6e6b08398f57c6a71d9580564c upstream.

The aes-ctr mode uses one preallocated page without any concurrency
protection. When multiple threads run aes-ctr encryption or decryption
this can lead to data corruption.

The patch introduces locking for the page and a fallback solution with
slower en/decryption performance in concurrency situations.

Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Btrfs: disable snapshot aware defrag for now

commit 8101c8dbf6243ba517aab58d69bf1bc37d8b7b9c upstream.

It's just broken and it's taking a lot of effort to fix it, so for now just
disable it so people can defrag in peace. Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

SELinux: Fix kernel BUG on empty security contexts.

commit 2172fa709ab32ca60e86179dc67d0857be8e2c98 upstream.

Setting an empty security context (length=0) on a file will
lead to incorrectly dereferencing the type and other fields
of the security context structure, yielding a kernel BUG.
As a zero-length security context is never valid, just reject
all such security contexts whether coming from userspace
via setxattr or coming from the filesystem upon a getxattr
request by SELinux.

Setting a security context value (empty or otherwise) unknown to
SELinux in the first place is only possible for a root process
(CAP_MAC_ADMIN), and, if running SELinux in enforcing mode, only
if the corresponding SELinux mac_admin permission is also granted
to the domain by policy.  In Fedora policies, this is only allowed for
specific domains such as livecd for setting down security contexts
that are not defined in the build host policy.

Reproducer:
su
setenforce 0
touch foo
setfattr -n security.selinux foo

Caveat:
Relabeling or removing foo after doing the above may not be possible
without booting with SELinux disabled.  Any subsequent access to foo
after doing the above will also trigger the BUG.

BUG output from Matthew Thode:
[  473.893141] ------------[ cut here ]------------
[  473.962110] kernel BUG at security/selinux/ss/services.c:654!
[  473.995314] invalid opcode: 0000 [#6] SMP
[  474.027196] Modules linked in:
[  474.058118] CPU: 0 PID: 8138 Comm: ls Tainted: G      D   I
3.13.0-grsec #1
[  474.116637] Hardware name: Supermicro X8ST3/X8ST3, BIOS 2.0
07/29/10
[  474.149768] task: ffff8805f50cd010 ti: ffff8805f50cd488 task.ti:
ffff8805f50cd488
[  474.183707] RIP: 0010:[<ffffffff814681c7>]  [<ffffffff814681c7>]
context_struct_compute_av+0xce/0x308
[  474.219954] RSP: 0018:ffff8805c0ac3c38  EFLAGS: 00010246
[  474.252253] RAX: 0000000000000000 RBX: ffff8805c0ac3d94 RCX:
0000000000000100
[  474.287018] RDX: ffff8805e8aac000 RSI: 00000000ffffffff RDI:
ffff8805e8aaa000
[  474.321199] RBP: ffff8805c0ac3cb8 R08: 0000000000000010 R09:
0000000000000006
[  474.357446] R10: 0000000000000000 R11: ffff8805c567a000 R12:
0000000000000006
[  474.419191] R13: ffff8805c2b74e88 R14: 00000000000001da R15:
0000000000000000
[  474.453816] FS:  00007f2e75220800(0000) GS:ffff88061fc00000(0000)
knlGS:0000000000000000
[  474.489254] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  474.522215] CR2: 00007f2e74716090 CR3: 00000005c085e000 CR4:
00000000000207f0
[  474.556058] Stack:
[  474.584325]  ffff8805c0ac3c98 ffffffff811b549b ffff8805c0ac3c98
ffff8805f1190a40
[  474.618913]  ffff8805a6202f08 ffff8805c2b74e88 00068800d0464990
ffff8805e8aac860
[  474.653955]  ffff8805c0ac3cb8 000700068113833a ffff880606c75060
ffff8805c0ac3d94
[  474.690461] Call Trace:
[  474.723779]  [<ffffffff811b549b>] ? lookup_fast+0x1cd/0x22a
[  474.778049]  [<ffffffff81468824>] security_compute_av+0xf4/0x20b
[  474.811398]  [<ffffffff8196f419>] avc_compute_av+0x2a/0x179
[  474.843813]  [<ffffffff8145727b>] avc_has_perm+0x45/0xf4
[  474.875694]  [<ffffffff81457d0e>] inode_has_perm+0x2a/0x31
[  474.907370]  [<ffffffff81457e76>] selinux_inode_getattr+0x3c/0x3e
[  474.938726]  [<ffffffff81455cf6>] security_inode_getattr+0x1b/0x22
[  474.970036]  [<ffffffff811b057d>] vfs_getattr+0x19/0x2d
[  475.000618]  [<ffffffff811b05e5>] vfs_fstatat+0x54/0x91
[  475.030402]  [<ffffffff811b063b>] vfs_lstat+0x19/0x1b
[  475.061097]  [<ffffffff811b077e>] SyS_newlstat+0x15/0x30
[  475.094595]  [<ffffffff8113c5c1>] ? __audit_syscall_entry+0xa1/0xc3
[  475.148405]  [<ffffffff8197791e>] system_call_fastpath+0x16/0x1b
[  475.179201] Code: 00 48 85 c0 48 89 45 b8 75 02 0f 0b 48 8b 45 a0 48
8b 3d 45 d0 b6 00 8b 40 08 89 c6 ff ce e8 d1 b0 06 00 48 85 c0 49 89 c7
75 02 <0f> 0b 48 8b 45 b8 4c 8b 28 eb 1e 49 8d 7d 08 be 80 01 00 00 e8
[  475.255884] RIP  [<ffffffff814681c7>]
context_struct_compute_av+0xce/0x308
[  475.296120]  RSP <ffff8805c0ac3c38>
[  475.328734] ---[ end trace f076482e9d754adc ]---

Reported-by: Matthew Thode <mthode@mthode.org>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Linux 3.12.11

mmc: sdhci-pci: Fix possibility of chip->fixes being null

commit 945be38caa287b177b8c17ffaae7754cab6a658f upstream.

It is possible for chip->fixes to be null. Check before dereferencing it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Chris Ball <chris@printf.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mmc: sdhci-pci: Fix BYT sd card getting stuck in runtime suspend

commit 77a0122e0838663795651aa0beb2325156f98c09 upstream.

A host controller for a SD card may need a GPIO for card detect in order
to wake up from runtime suspend when a card is inserted.  If that GPIO is
not configured, then the host controller will not wake up.  Fix that for
the affected devices by not enabling runtime PM unless the GPIO is
successfully set up.

This affects BYT sd card host controller which had runtime PM enabled from
v3.11.  For completeness, the MFD sd card host controller is flagged also.

The original patch before rebasing (see link below) was tested on v3.11.10
and v3.12.4 although the patch applied with some offsets and fuzz.  The
original patch is here:

    http://marc.info/?l=linux-mmc&m=138676702327057

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Chris Ball <chris@printf.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rtc-cmos: Add an alarm disable quirk

commit d5a1c7e3fc38d9c7d629e1e47f32f863acbdec3d upstream.

41c7f7424259f ("rtc: Disable the alarm in the hardware (v2)") added the
functionality to disable the RTC wake alarm when shutting down the box.

However, there are at least two b0rked BIOSes we know about:

https://bugzilla.novell.com/show_bug.cgi?id=812592
https://bugzilla.novell.com/show_bug.cgi?id=805740

where, when wakeup alarm is enabled in the BIOS, the machine reboots
automatically right after shutdown, regardless of what wakeup time is
programmed.

Bisecting the issue lead to this patch so disable its functionality with
a DMI quirk only for those boxes.

Cc: Brecht Machiels <brecht@mos6581.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
[jstultz: Changed variable name for clarity, added extra dmi entry]
Tested-by: Brecht Machiels <brecht@mos6581.org>
Tested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

timekeeping: Fix missing timekeeping_update in suspend path

commit 330a1617b0a6268d427aa5922c94d082b1d3e96d upstream.

Since 48cdc135d4840 (Implement a shadow timekeeper), we have to
call timekeeping_update() after any adjustment to the timekeeping
structure in order to make sure that any adjustments to the structure
persist.

In the timekeeping suspend path, we udpate the timekeeper
structure, so we should be sure to update the shadow-timekeeper
before releasing the timekeeping locks. Currently this isn't done.

In most cases, the next time related code to run would be
timekeeping_resume, which does update the shadow-timekeeper, but
in an abundence of caution, this patch adds the call to
timekeeping_update() in the suspend path.

Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

timekeeping: Fix CLOCK_TAI timer/nanosleep delays

commit 04005f6011e3b504cd4d791d9769f7cb9a3b2eae upstream.

A think-o in the calculation of the monotonic -> tai time offset
results in CLOCK_TAI timers and nanosleeps to expire late (the
latency is ~2x the tai offset).

Fix this by adding the tai offset from the realtime offset instead
of subtracting.

Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

3.13.y: timekeeping: Fix clock_set/clock_was_set think-o

In backporting 6fdda9a9c5db367130cf32df5d6618d08b89f46a
(timekeeping: Avoid possible deadlock from clock_was_set_delayed),
I ralized the patch had a think-o where instead of checking
clock_set I accidentally typed clock_was_set (which is a function
- so the conditional always is true).

Upstream this was resolved in the immediately following patch
47a1b796306356f358e515149d86baf0cc6bf007 (tick/timekeeping: Call
update_wall_time outside the jiffies lock). But since that patch
really isn't -stable material, so this patch only pulls
the name change.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

timekeeping: Avoid possible deadlock from clock_was_set_delayed

commit 6fdda9a9c5db367130cf32df5d6618d08b89f46a upstream.

As part of normal operaions, the hrtimer subsystem frequently calls
into the timekeeping code, creating a locking order of
  hrtimer locks -> timekeeping locks

clock_was_set_delayed() was suppoed to allow us to avoid deadlocks
between the timekeeping the hrtimer subsystem, so that we could
notify the hrtimer subsytem the time had changed while holding
the timekeeping locks. This was done by scheduling delayed work
that would run later once we were out of the timekeeing code.

But unfortunately the lock chains are complex enoguh that in
scheduling delayed work, we end up eventually trying to grab
an hrtimer lock.

Sasha Levin noticed this in testing when the new seqlock lockdep
enablement triggered the following (somewhat abrieviated) message:

[  251.100221] ======================================================
[  251.100221] [ INFO: possible circular locking dependency detected ]
[  251.100221] 3.13.0-rc2-next-20131206-sasha-00005-g8be2375-dirty #4053 Not tainted
[  251.101967] -------------------------------------------------------
[  251.101967] kworker/10:1/4506 is trying to acquire lock:
[  251.101967]  (timekeeper_seq){----..}, at: [<ffffffff81160e96>] retrigger_next_event+0x56/0x70
[  251.101967]
[  251.101967] but task is already holding lock:
[  251.101967]  (hrtimer_bases.lock#11){-.-...}, at: [<ffffffff81160e7c>] retrigger_next_event+0x3c/0x70
[  251.101967]
[  251.101967] which lock already depends on the new lock.
[  251.101967]
[  251.101967]
[  251.101967] the existing dependency chain (in reverse order) is:
[  251.101967]
-> #5 (hrtimer_bases.lock#11){-.-...}:
[snipped]
-> #4 (&rt_b->rt_runtime_lock){-.-...}:
[snipped]
-> #3 (&rq->lock){-.-.-.}:
[snipped]
-> #2 (&p->pi_lock){-.-.-.}:
[snipped]
-> #1 (&(&pool->lock)->rlock){-.-...}:
[  251.101967]        [<ffffffff81194803>] validate_chain+0x6c3/0x7b0
[  251.101967]        [<ffffffff81194d9d>] __lock_acquire+0x4ad/0x580
[  251.101967]        [<ffffffff81194ff2>] lock_acquire+0x182/0x1d0
[  251.101967]        [<ffffffff84398500>] _raw_spin_lock+0x40/0x80
[  251.101967]        [<ffffffff81153e69>] __queue_work+0x1a9/0x3f0
[  251.101967]        [<ffffffff81154168>] queue_work_on+0x98/0x120
[  251.101967]        [<ffffffff81161351>] clock_was_set_delayed+0x21/0x30
[  251.101967]        [<ffffffff811c4bd1>] do_adjtimex+0x111/0x160
[  251.101967]        [<ffffffff811e2711>] compat_sys_adjtimex+0x41/0x70
[  251.101967]        [<ffffffff843a4b49>] ia32_sysret+0x0/0x5
[  251.101967]
-> #0 (timekeeper_seq){----..}:
[snipped]
[  251.101967] other info that might help us debug this:
[  251.101967]
[  251.101967] Chain exists of:
  timekeeper_seq --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock#11

[  251.101967]  Possible unsafe locking scenario:
[  251.101967]
[  251.101967]        CPU0                    CPU1
[  251.101967]        ----                    ----
[  251.101967]   lock(hrtimer_bases.lock#11);
[  251.101967]                                lock(&rt_b->rt_runtime_lock);
[  251.101967]                                lock(hrtimer_bases.lock#11);
[  251.101967]   lock(timekeeper_seq);
[  251.101967]
[  251.101967]  *** DEADLOCK ***
[  251.101967]
[  251.101967] 3 locks held by kworker/10:1/4506:
[  251.101967]  #0:  (events){.+.+.+}, at: [<ffffffff81154960>] process_one_work+0x200/0x530
[  251.101967]  #1:  (hrtimer_work){+.+...}, at: [<ffffffff81154960>] process_one_work+0x200/0x530
[  251.101967]  #2:  (hrtimer_bases.lock#11){-.-...}, at: [<ffffffff81160e7c>] retrigger_next_event+0x3c/0x70
[  251.101967]
[  251.101967] stack backtrace:
[  251.101967] CPU: 10 PID: 4506 Comm: kworker/10:1 Not tainted 3.13.0-rc2-next-20131206-sasha-00005-g8be2375-dirty #4053
[  251.101967] Workqueue: events clock_was_set_work

So the best solution is to avoid calling clock_was_set_delayed() while
holding the timekeeping lock, and instead using a flag variable to
decide if we should call clock_was_set() once we've released the locks.

This works for the case here, where the do_adjtimex() was the deadlock
trigger point. Unfortuantely, in update_wall_time() we still hold
the jiffies lock, which would deadlock with the ipi triggered by
clock_was_set(), preventing us from calling it even after we drop the
timekeeping lock. So instead call clock_was_set_delayed() at that point.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

timekeeping: Fix potential lost pv notification of time change

commit 5258d3f25c76f6ab86e9333abf97a55a877d3870 upstream.

In 780427f0e11 (Indicate that clock was set in the pvclock
gtod notifier), logic was added to pass a CLOCK_WAS_SET
notification to the pvclock notifier chain.

While that patch added a action flag returned from
accumulate_nsecs_to_secs(), it only uses the returned value
in one location, and not in the logarithmic accumulation.

This means if a leap second triggered during the logarithmic
accumulation (which is most likely where it would happen),
the notification that the clock was set would not make it to
the pv notifiers.

This patch extends the logarithmic_accumulation pass down
that action flag so proper notification will occur.

This patch also changes the varialbe action -> clock_set
per Ingo's suggestion.

Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: <xen-devel@lists.xen.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

timekeeping: Fix lost updates to tai adjustment

commit f55c07607a38f84b5c7e6066ee1cfe433fa5643c upstream.

Since 48cdc135d4840 (Implement a shadow timekeeper), we have to
call timekeeping_update() after any adjustment to the timekeeping
structure in order to make sure that any adjustments to the structure
persist.

Unfortunately, the updates to the tai offset via adjtimex do not
trigger this update, causing adjustments to the tai offset to be
made and then over-written by the previous value at the next
update_wall_time() call.

This patch resovles the issue by calling timekeeping_update()
right after setting the tai offset.

Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ftrace: Have function graph only trace based on global_ops filters

commit 23a8e8441a0a74dd612edf81dc89d1600bc0a3d1 upstream.

Doing some different tests, I discovered that function graph tracing, when
filtered via the set_ftrace_filter and set_ftrace_notrace files, does
not always keep with them if another function ftrace_ops is registered
to trace functions.

The reason is that function graph just happens to trace all functions
that the function tracer enables. When there was only one user of
function tracing, the function graph tracer did not need to worry about
being called by functions that it did not want to trace. But now that there
are other users, this becomes a problem.

For example, one just needs to do the following:

# cd /sys/kernel/debug/tracing
# echo schedule > set_ftrace_filter
# echo function_graph > current_tracer
# cat trace
[..]
0)               |  schedule() {
------------------------------------------
0)    <idle>-0    =>   rcu_pre-7
------------------------------------------

0) ! 2980.314 us |  }
0)               |  schedule() {
------------------------------------------
0)   rcu_pre-7    =>    <idle>-0
------------------------------------------

0) + 20.701 us   |  }

# echo 1 > /proc/sys/kernel/stack_tracer_enabled
# cat trace
[..]
1) + 20.825 us   |      }
1) + 21.651 us   |    }
1) + 30.924 us   |  } /* SyS_ioctl */
1)               |  do_page_fault() {
1)               |    __do_page_fault() {
1)   0.274 us    |      down_read_trylock();
1)   0.098 us    |      find_vma();
1)               |      handle_mm_fault() {
1)               |        _raw_spin_lock() {
1)   0.102 us    |          preempt_count_add();
1)   0.097 us    |          do_raw_spin_lock();
1)   2.173 us    |        }
1)               |        do_wp_page() {
1)   0.079 us    |          vm_normal_page();
1)   0.086 us    |          reuse_swap_page();
1)   0.076 us    |          page_move_anon_rmap();
1)               |          unlock_page() {
1)   0.082 us    |            page_waitqueue();
1)   0.086 us    |            __wake_up_bit();
1)   1.801 us    |          }
1)   0.075 us    |          ptep_set_access_flags();
1)               |          _raw_spin_unlock() {
1)   0.098 us    |            do_raw_spin_unlock();
1)   0.105 us    |            preempt_count_sub();
1)   1.884 us    |          }
1)   9.149 us    |        }
1) + 13.083 us   |      }
1)   0.146 us    |      up_read();

When the stack tracer was enabled, it enabled all functions to be traced, which
now the function graph tracer also traces. This is a side effect that should
not occur.

To fix this a test is added when the function tracing is changed, as well as when
the graph tracer is enabled, to see if anything other than the ftrace global_ops
function tracer is enabled. If so, then the graph tracer calls a test trampoline
that will look at the function that is being traced and compare it with the
filters defined by the global_ops.

As an optimization, if there's no other function tracers registered, or if
the only registered function tracers also use the global ops, the function
graph infrastructure will call the registered function graph callback directly
and not go through the test trampoline.

Fixes: d2d45c7a03a2 "tracing: Have stack_tracer use a separate list of functions"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ftrace: Fix synchronization location disabling and freeing ftrace_ops

commit a4c35ed241129dd142be4cadb1e5a474a56d5464 upstream.

The synchronization needed after ftrace_ops are unregistered must happen
after the callback is disabled from becing called by functions.

The current location happens after the function is being removed from the
internal lists, but not after the function callbacks were disabled, leaving
the functions susceptible of being called after their callbacks are freed.

This affects perf and any externel users of function tracing (LTTng and
SystemTap).

Fixes: cdbe61bfe704 "ftrace: Allow dynamically allocated function tracers"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ftrace: Synchronize setting function_trace_op with ftrace_trace_function

commit 405e1d834807e51b2ebd3dea81cb51e53fb61504 upstream.

ftrace_trace_function is a variable that holds what function will be called
directly by the assembly code (mcount). If just a single function is
registered and it handles recursion itself, then the assembly will call that
function directly without any helper function. It also passes in the
ftrace_op that was registered with the callback. The ftrace_op to send is
stored in the function_trace_op variable.

The ftrace_trace_function and function_trace_op needs to be coordinated such
that the called callback wont be called with the wrong ftrace_op, otherwise
bad things can happen if it expected a different op. Luckily, there's no
callback that doesn't use the helper functions that requires this. But
there soon will be and this needs to be fixed.

Use a set_function_trace_op to store the ftrace_op to set the
function_trace_op to when it is safe to do so (during the update function
within the breakpoint or stop machine calls). Or if dynamic ftrace is not
being used (static tracing) then we have to do a bit more synchronization
when the ftrace_trace_function is set as that takes affect immediately
(as oppose to dynamic ftrace doing it with the modification of the trampoline).

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/mgag200,ast,cirrus: fix regression with drm_can_sleep conversion

commit 8b7ad1bb3d440da888f2a939dc870eba429b9192 upstream.

I totally sign inverted my way out of this one.

Reported-by: "Sabrina Dubroca" <sd@queasysnail.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/mgag200: fix typo causing bw limits to be ignored on some chips

commit ec22b4aa993abbd18f5bbbcb20a1c56be3b1d38b upstream.

mode->mdev otherwise the bw limits never kick in.

Reported in RHEL testing.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/mgag200: fix oops in cursor code.

commit 53dac830537b51df555ba5e7ebb236705b7eaa7c upstream.

In some cases we enter the cursor code with file_priv = NULL causing an oops,
we also can try to unpin something that isn't pinned, and this is a good fix for it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/vmwgfx: Fix regression caused by "drm/ttm: make ttm reservation calls behave like reservation calls"

commit cf5e3413337309050c05e13dcebe85b7194a21e5 upstream.

The call to ttm_eu_backoff_reservation() as part of an error path would cause
a lock imbalance if the reservation ticket was not initialized. This error is
easily triggered from user-space by submitting a bogus command stream.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm: ast,cirrus,mgag200: use drm_can_sleep

commit f4b4718b61d1d5a7442a4fd6863ea80c3a10e508 upstream.

these 3 were checking in_interrupt but we have situations where
calling vunmap under this could cause a BUG to be hit in
smp_call_function_many. Use the drm_can_sleep macro instead,
which should stop this path from been taken in this case.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/gma500: Lock struct_mutex around cursor updates

commit 631794b44bd3dbfba37074954d5c584c9e8725f0 upstream.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64361
Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/rcar-du: Update plane pitch in .mode_set_base() operation

commit eb86301f293da3c362db729a9f40ddb25755902b upstream.

When setting a new frame buffer with the mode set base operation the
pitch value might change. Set the hardware plane pitch register at the
same time as the plane base address in the rcar_du_plane_update_base()
function to make sure the pitch value always matches the frame buffer.

Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/gem: Always initialize the gem object in object_init

commit 6ab11a2635ce988ebc2e798947beb72cf7324119 upstream.

At least drm/i915 expects that the obj->dev pointer is set even in
failure paths. Specifically when the shmem initialization fails we
call i915_gem_object_free which needs to deref obj->base.dev to get at
the slab pointer in the device private structure. And the shmem
allocation can easily fail when userspace is hitting open file limits.

Doing the structure init even when the shmem file allocation fails
prevents this Oops.

This is a regression from

commit 89c8233f82d9c8af5b20e72e4a185a38a7d3c50b
Author: David Herrmann <dh.herrmann@gmail.com>
Date: Thu Jul 11 11:56:32 2013 +0200

drm/gem: simplify object initialization

v2: Add regression note which Chris supplied.

Testcase: igt/gem_fd_exhaustion
Reported-and-Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
References: http://lists.freedesktop.org/archives/intel-gfx/2014-January/038433.html
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Cc: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/cirrus: correct register values for 16bpp

commit 2510538fa000dd13a3e57b79bf073ffb1748976c upstream.

When the mode is set with 16bpp on QEMU, the output gets totally broken.
The culprit is the bogus register values set for 16bpp, which was likely
copied from from a wrong place.

Addresses https://bugzilla.novell.com/show_bug.cgi?id=799216

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915: Decouple GPU error reporting from ring initialisation

commit 372fbb8e3927fc76b0f842d8eb8a798a71d8960f upstream.

Currently we report through our error state only the rings that have
been initialised (as detected by ring->obj). This check is done after
the GPU reset and ring re-initialisation, which means that the software
state may not be the same as when we captured the hardware error and we
may not print out any of the vital information for debugging the hang.

This (and the implied object leak) is a regression from

commit 3d57e5bd1284f44e325f3a52d966259ed42f9e05
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Mon Oct 14 10:01:36 2013 -0700

    drm/i915: Do a fuller init after reset

Note that we are already starting to get bug reports with incomplete
error states from 3.13, which also hampers debugging userspace driver
issues.

v2: Prevent a NULL dereference on 830gm/845g after a GPU reset where
    the scratch obj may be NULL.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=74094
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: Add a bit of fluff to make it clear we need this expedited in
stable.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i915: remove pm_qos request on error

commit 22accca01713b13dac386ca90b787aadf88f6551 upstream.

Not removing pm qos request and free memory for it can cause crash,
when some other driver use pm qos. For example, this oops:

BUG: unable to handle kernel paging request at fffffffffffffff8
IP: [<ffffffff81307a6b>] plist_add+0x5b/0xd0
Call Trace:
[<ffffffff810acf25>] pm_qos_update_target+0x125/0x1e0
[<ffffffff810ad071>] pm_qos_add_request+0x91/0x100
[<ffffffffa053ec14>] e1000_open+0xe4/0x5b0 [e1000e]

was caused by earlier i915 probe failure:

[drm:i915_report_and_clear_eir] *ERROR* EIR stuck: 0x00000010, masking
[drm:init_ring_common] *ERROR* render ring initialization failed ctl 0001f001 head 00003004 tail 00000000 start 00003000
[drm:i915_driver_load] *ERROR* failed to init modeset
i915: probe of 0000:00:02.0 failed with error -5

Bug report:
http://bugzilla.redhat.com/show_bug.cgi?id=1057533

Reported-by: Giandomenico De Tullio <ghisha@gmail.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
[danvet: Drop unnecessary code movement.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915: VLV2 - Fix hotplug detect bits

commit 232a6ee9af8adb185640f67fcaaa9014a9aa0573 upstream.

Add new definitions for hotplug live status bits for VLV2 since they're
in reverse order from the gen4x ones.

Changelog:
- Restored gen4 bit definitions
- Added new definitions for VLV2
- Added platform check for IS_VALLEYVIEW() in dp_detect to use the correct
bit defintions
- Replaced a lost trailing brace for the added switch()

Signed-off-by: Todd Previte <tprevite@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73951
[danvet: Switch to _VLV postfix instead of prefix and regroupg
comments again so that the g4x warning is right next to those defines.
Also add a _G4X suffix for those special ones. Also cc stable.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915: Fix the offset issue for the stolen GEM objects

commit ec14ba47791965d2c08e0a681ff44eacbf3c4553 upstream.

The 'offset' field of the 'scatterlist' structure was wrongly
programmed with the offset value from the base of stolen area,
whereas this field indicates the offset from where the interested
data starts within the first PAGE pointed to by 'scattterlist'
structure. As a result when a new GEM object allocated from stolen
area is mapped to GTT, it could lead to an overwrite of GTT entries
as the page count calculation will go wrong, refer the function
'sg_page_count'.

v2: Modified the commit message. (Chris)

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71908
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69104
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915: Flush outstanding requests before allocating new seqno

commit 304d695c3dc8eb65206b9eaf16f8d1a41510d1cf upstream.

In very rare cases (such as a memory failure stress test) it is possible
to fill the entire ring without emitting a request. Under this
circumstance, the outstanding request is flushed and waited upon. After
space on the ring is cleared, we return to emitting the new command -
except that we just cleared the seqno allocated for this operation and
trigger the sanity check that a request is only ever emitted with a
valid seqno. The fix is to rearrange the code to make sure the
allocation of the seqno for this operation is after any required flushes
of outstanding operations.

The bug exists since the preallocation was introduced in
commit 9d7730914f4cd496e356acfab95b41075aa8eae8
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Nov 27 16:22:52 2012 +0000

drm/i915: Preallocate next seqno before touching the ring

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/nouveau/falcon: use vmalloc to create firwmare copies

commit 90d6db1635d5e225623af2e2e859feb607345287 upstream.

Some firmware images may be large (64K), so using kmalloc memory is
inappropriate for them. Use vmalloc instead, to avoid high-order
allocation failures.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/nouveau: fix m2mf copy to tiled gart

commit ce8f7699f2b6ffe4aa8368b8d9d370875accaa5f upstream.

Commit de7b7d59d54852c introduced tiled GART, but a linear copy is
still performed. This may result in errors on eviction, fix it by
checking tiling from memtype.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm sysfs: fix a module unload race

commit 2995fa78e423d7193f3b57835f6c1c75006a0315 upstream.

This reverts commit be35f48610 ("dm: wait until embedded kobject is
released before destroying a device") and provides an improved fix.

The kobject release code that calls the completion must be placed in a
non-module file, otherwise there is a module unload race (if the process
calling dm_kobject_release is preempted and the DM module unloaded after
the completion is triggered, but before dm_kobject_release returns).

To fix this race, this patch moves the completion code to dm-builtin.c
which is always compiled directly into the kernel if BLK_DEV_DM is
selected.

The patch introduces a new dm_kobject_holder structure, its purpose is
to keep the completion and kobject in one place, so that it can be
accessed from non-module code without the need to export the layout of
struct mapped_device to that code.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon/dce8: workaround for atom BlankCrtc table

commit 78fe9e545ce6d510b979dc2d8e14096a279fc519 upstream.

Some DCE8 boards have a funky BlankCrtc table that results
in a timeout when trying to blank the display. The
timeout is harmless (all operations needed from the table
are complete), but wastes time and is confusing to users so
work around it.

bug:
https://bugs.freedesktop.org/show_bug.cgi?id=73420

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon/DCE4+: clear bios scratch dpms bit (v2)

commit 6802d4bad83f50081b2788698570218aaff8d10e upstream.

The BlankCrtc table in some DCE8 boards has some
logic shortcuts for the vbios when this bit is set.
Clear it for driver use.

v2: fix typo

Bug:
https://bugs.freedesktop.org/show_bug.cgi?id=73420

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon: fix DAC interrupt handling on DCE5+

commit e9a321c6b2ac954a7dbf235f419c255a424a1273 upstream.

DCE5 and newer hardware only has 1 DAC. Use the correct
offset. This may fix display problems on certain board
configurations.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon: add UVD support for OLAND

commit 5d029339bb8ce69aeb68280c3de67d3cea456146 upstream.

It seems this got dropped when we merged UVD support
last year. Add this back now.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon: set the full cache bit for fences on r7xx+

commit d45b964a22cad962d3ede1eba8d24f5cee7b2a92 upstream.

Needed to properly flush the read caches for fences.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon: fix surface sync in fence on cayman (v2)

commit 10e9ffae463396c5a25fdfe8a48d7c98a87f6b85 upstream.

We need to set the engine bit to select the ME and
also set the full cache bit. Should help stability
on TN and cayman.

V2: fix up surface sync in ib execute as well

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon: disable ss on DP for DCE3.x

commit d8e24525094200601236fa64a54cf73e3d682f2e upstream.

Seems to cause problems with certain DP monitors.

Bug:
https://bugs.freedesktop.org/show_bug.cgi?id=40699

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon: skip colorbuffer checking if COLOR_INFO.FORMAT is set to INVALID

commit 56492e0fac2dbaf7735ffd66b206a90624917789 upstream.

This fixes a bug which was causing rejections of valid GPU commands
from userspace.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

m88rs2000: set symbol rate accurately

commit dd4491dfb9eb4fa3bfa7dc73ba989e69fbce2e10 upstream.

Current setting of symbol rate is not very actuate causing
loss of lock.

Covert temp to u64 and use mclk to calculate from big number.

Calculate symbol rate by dividing symbol rate by 1000 times
1 << 24 and dividing sum by mclk.

Add other symbol rate settings to function registers 0xa0-0xa3.

In set_frontend add changes to register 0xf1 this must be done
prior call to fe_reset. Register 0x00 doesn't need a second
write of 0x1

Applied after patch
m88rs2000: add m88rs2000_set_carrieroffset

Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

m88rs2000: add m88rs2000_set_carrieroffset

commit 06af15d1b6f45c60358feab88004472e5428f01c upstream.

Set the carrier offset correctly using the default mclk values.

Add function m88rs2000_get_mclk to calculate the mclk value
against crystal frequency which will later be used for
other functions.

Add function m88rs2000_set_carrieroffset to calculate
and set the offset value.

variable offset becomes a signed value.

Register 0x86 is set the appropriate value according to
remainder value of frequency % 192857 calculation as
shown.

Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dib8000: fix regression with dib807x

commit d67350f8c4e67f5eba627e1fd111f16257ca9c95 upstream.

Commit 173a64cb3fcf broke support for some dib807x versions.

Fix it by providing backward compatibility with the older versions.

[mkrufky@linuxtv.org: conflict handling and CodingStyle fixes]

Signed-off-by: Olivier Grenie <olivier.grenie@parrot.com>
Acked-by: Patrick Boettcher <pboettcher@kernellabs.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nxt200x: increase write buffer size

commit fa1e1de6bb679f2c86da3311bbafee7eaf78f125 upstream.

The buffer size on nxt200x is not enough:

...
> Dec 20 10:52:04 rich kernel: [ 31.747949] nxt200x: nxt200x_writebytes: i2c wr reg=002c: len=255 is too big!
...

Increase it to 256 bytes.

Reported-by: Rich Freeman <rich0@gentoo.org>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

it913x: Add support for Avermedia H335 id 0x0335

commit 17f335c304ac19d9b11814238fe8a7519d80e2ff upstream.

Trivial USB ID addition for Avermedia H335.

Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: s5p_mfc: remove s5p_mfc_get_node_type() function

commit b80cb8dc4162bc954cc71efec192ed89f2061573 upstream.

s5p_mfc_get_node_type() relies on get_index() helper function, which in
turn relies on video_device index numbers assigned on driver
registration. All this code is not really needed, because there is
already access to respective video_device structures via common
s5p_mfc_dev structure. This fixes the issues introduced by patch
1056e4388b0454917a512618c8416a98628fc9ce ("v4l2-dev: Fix race condition
on __video_register_device"), which has been merged in v3.12-rc1.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kamil Debski <k.debski@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dib8000: make 32 bits read atomic

commit 5ac64ba12aca3bef18e61c866583155a3bbf81c4 upstream.

As the dvb-frontend kthread can be called anytime, it can race
with some get status ioctl. So, it seems better to avoid one to
race with the other while reading a 32 bits register.
I can't see any other reason for having a mutex there at I2C, except
to provide such kind of protection, as the I2C core already has a
mutex to protect I2C transfers.

Note: instead of this approach, it could eventually remove the dib8000
specific mutex for it, and either group the 4 ops into one xfer or
to manually control the I2C mutex. The main advantage of the current
approach is that the changes are smaller and more puntual.

Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Acked-by: Patrick Boettcher <pboettcher@kernellabs.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: anysee: fix non-working E30 Combo Plus DVB-T

commit c57f87e62368c33ebda11a4993380c8e5a19a5c5 upstream.

PLL was attached twice to frontend0 leaving frontend1 without a tuner.
frontend0 is DVB-C and frontend1 is DVB-T.

Signed-off-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: media: v4l2-dev: fix video device index assignment

commit 6c3df5da67f1f53df78c7e20cd53a481dc28eade upstream.

The side effect of commit 1056e4388b045 ("v4l2-dev: Fix race condition on
__video_register_device") is the increased number of index value assigned
on video_device registration. Before that commit video_devices were
numbered from 0, after it, the indexes starts from 1, because get_index()
always count the device, which is being registered. Some device drivers
rely on video_device index number for internal purposes, i.e. s5p-mfc
driver stopped working after that patch. This patch restores the old method
of numbering the video_device indexes.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Ricardo Ribalda <ricardo.ribalda@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm, oom: base root bonus on current usage

commit 778c14affaf94a9e4953179d3e13a544ccce7707 upstream.

A 3% of system memory bonus is sometimes too excessive in comparison to
other processes.

With commit a63d83f427fb ("oom: badness heuristic rewrite"), the OOM
killer tries to avoid killing privileged tasks by subtracting 3% of
overall memory (system or cgroup) from their per-task consumption.  But
as a result, all root tasks that consume less than 3% of overall memory
are considered equal, and so it only takes 33+ privileged tasks pushing
the system out of memory for the OOM killer to do something stupid and
kill dhclient or other root-owned processes.  For example, on a 32G
machine it can't tell the difference between the 1M agetty and the 10G
fork bomb member.

The changelog describes this 3% boost as the equivalent to the global
overcommit limit being 3% higher for privileged tasks, but this is not
the same as discounting 3% of overall memory from _every privileged task
individually_ during OOM selection.

Replace the 3% of system memory bonus with a 3% of current memory usage
bonus.

By giving root tasks a bonus that is proportional to their actual size,
they remain comparable even when relatively small.  In the example
above, the OOM killer will discount the 1M agetty's 256 badness points
down to 179, and the 10G fork bomb's 262144 points down to 183500 points
and make the right choice, instead of discounting both to 0 and killing
agetty because it's first in the task list.

Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iscsi-target: Fix connection reset hang with percpu_ida_alloc

commit 555b270e25b0279b98083518a85f4b1da144a181 upstream.

This patch addresses a bug where connection reset would hang
indefinately once percpu_ida_alloc() was starved for tags, due
to the fact that it always assumed uninterruptible sleep mode.

So now make percpu_ida_alloc() check for signal_pending_state() for
making interruptible sleep optional, and convert iscsit_allocate_cmd()
to set TASK_INTERRUPTIBLE for GFP_KERNEL, or TASK_RUNNING for
GFP_ATOMIC.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Kent Overstreet <kmo@daterainc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

percpu_ida: Make percpu_ida_alloc + callers accept task state bitmask

commit 6f6b5d1ec56acdeab0503d2b823f6f88a0af493e upstream.

This patch changes percpu_ida_alloc() + callers to accept task state
bitmask for prepare_to_wait() for code like target/iscsi that needs
it for interruptible sleep, that is provided in a subsequent patch.

It now expects TASK_UNINTERRUPTIBLE when the caller is able to sleep
waiting for a new tag, or TASK_RUNNING when the caller cannot sleep,
and is forced to return a negative value when no tags are available.

v2 changes:
  - Include blk-mq + tcm_fc + vhost/scsi + target/iscsi changes
  - Drop signal_pending_state() call
v3 changes:
  - Only call prepare_to_wait() + finish_wait() when != TASK_RUNNING
    (PeterZ)

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mei: mei_hbm_dispatch() returns void

Building hbm.o for v3.13.2 triggers a GCC warning:
    drivers/misc/mei/hbm.c: In function 'mei_hbm_dispatch':
    drivers/misc/mei/hbm.c:596:3: warning: 'return' with a value, in function returning void [enabled by default]
       return 0;
       ^

GCC is correct, obviously. So let's return void instead of zero here.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Cc: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

radeon/pm: Guard access to rdev->pm.power_state array

commit 370169516e736edad3b3c5aa49858058f8b55195 upstream.

It's never allocated on systems without an ATOMBIOS or COMBIOS ROM.

Should fix an oops I encountered while resetting the GPU after a lockup
on my PowerBook with an RV350.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon/dpm: disable mclk switching on desktop RV770

commit 8097d94116d0c17e774ba4c8256e774018dc2a46 upstream.

Mclk switching doesn't seem to work reliably on these
cards. Most RV770 boards specify the same mclk for all
performance levels anyway so in most cases, this has
no affect.

Bug:
https://bugs.freedesktop.org/show_bug.cgi?id=73067

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon: warn users when hw_i2c is enabled (v2)

commit d195178297de9a91246519dbfa98952b70f9a9b6 upstream.

The hw i2c engines are disabled by default as the
current implementation is still experimental. Print
a warning when users enable it so that it's obvious
when the option is enabled.

v2: check for non-0 rather than 1

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm space map metadata: fix bug in resizing of thin metadata

commit fca028438fb903852beaf7c3fe1cd326651af57d upstream.

This bug was introduced in commit 7e664b3dec431e ("dm space map metadata:
fix extending the space map").

When extending a dm-thin metadata volume we:

- Switch the space map into a simple bootstrap mode, which allocates
  all space linearly from the newly added space.
- Add new bitmap entries for the new space
- Increment the reference counts for those newly allocated bitmap
  entries
- Commit changes to disk
- Switch back out of bootstrap mode.

But, the disk commit may allocate space itself, if so this fact will be
lost when switching out of bootstrap mode.

The bug exhibited itself as an error when the bitmap_root, with an
erroneous ref count of 0, was subsequently decremented as part of a
later disk commit.  This would cause the disk commit to fail, and thinp
to enter read_only mode.  The metadata was not damaged (thin_check
passed).

The fix is to put the increments + commit into a loop, running until
the commit has not allocated extra space.  In practise this loop only
runs twice.

With this fix the following device mapper testsuite test passes:
dmtest run --suite thin-provisioning -n thin_remove_works_after_resize

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm space map metadata: fix extending the space map

commit 7e664b3dec431eebf0c5df5ff704d6197634cf35 upstream.

When extending a metadata space map we should do the first commit whilst
still in bootstrap mode -- a mode where all blocks get allocated in the
new area.

That way the commit overhead is allocated from the newly added space.
Otherwise we risk running out of space.

With this fix, and the previous commit "dm space map common: make sure
new space is used during extend", the following device mapper testsuite
test passes:
dmtest run --suite thin-provisioning -n /resize_metadata_no_io/

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm space map common: make sure new space is used during extend

commit 12c91a5c2d2a8e8cc40a9552313e1e7b0a2d9ee3 upstream.

When extending a low level space map we should update nr_blocks at
the start so the new space is used for the index entries.

Otherwise extend can fail, e.g.: sm_metadata_extend call sequence
that fails:
-> sm_ll_extend
-> dm_tm_new_block -> dm_sm_new_block -> sm_bootstrap_new_block
=> returns -ENOSPC because smm->begin == smm->ll.nr_blocks

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm: wait until embedded kobject is released before destroying a device

commit be35f486108227e10fe5d96fd42fb2b344c59983 upstream.

There may be other parts of the kernel holding a reference on the dm
kobject.  We must wait until all references are dropped before
deallocating the mapped_device structure.

The dm_kobject_release method signals that all references are dropped
via completion.  But dm_kobject_release doesn't free the kobject (which
is embedded in the mapped_device structure).

This is the sequence of operations:
* when destroying a DM device, call kobject_put from dm_sysfs_exit
* wait until all users stop using the kobject, when it happens the
  release method is called
* the release method signals the completion and should return without
  delay
* the dm device removal code that waits on the completion continues
* the dm device removal code drops the dm_mod reference the device had
* the dm device removal code frees the mapped_device structure that
  contains the kobject

Using kobject this way should avoid the module unload race that was
mentioned at the beginning of this thread:
https://lkml.org/lkml/2014/1/4/83

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm thin: fix set_pool_mode exposed pool operation races

commit 8b64e881eb40ac8b9bfcbce068a97eef819044ee upstream.

The pool mode must not be switched until after the corresponding pool
process_* methods have been established.  Otherwise, because
set_pool_mode() isn't interlocked with the IO path for performance
reasons, the IO path can end up executing process_* operations that
don't match the mode.  This patch eliminates problems like the following
(as seen on really fast PCIe SSD storage when transitioning the pool's
mode from PM_READ_ONLY to PM_WRITE):

kernel: device-mapper: thin: 253:2: reached low water mark for data device: sending event.
kernel: device-mapper: thin: 253:2: no free data space available.
kernel: device-mapper: thin: 253:2: switching pool to read-only mode
kernel: device-mapper: thin: 253:2: switching pool to write mode
kernel: ------------[ cut here ]------------
kernel: WARNING: CPU: 11 PID: 7564 at drivers/md/dm-thin.c:995 handle_unserviceable_bio+0x146/0x160 [dm_thin_pool]()
...
kernel: Workqueue: dm-thin do_worker [dm_thin_pool]
kernel: 00000000000003e3 ffff880308831cc8 ffffffff8152ebcb 00000000000003e3
kernel: 0000000000000000 ffff880308831d08 ffffffff8104c46c ffff88032502a800
kernel: ffff880036409000 ffff88030ec7ce00 0000000000000001 00000000ffffffc3
kernel: Call Trace:
kernel: [<ffffffff8152ebcb>] dump_stack+0x49/0x5e
kernel: [<ffffffff8104c46c>] warn_slowpath_common+0x8c/0xc0
kernel: [<ffffffff8104c4ba>] warn_slowpath_null+0x1a/0x20
kernel: [<ffffffffa001e2c6>] handle_unserviceable_bio+0x146/0x160 [dm_thin_pool]
kernel: [<ffffffffa001f276>] process_bio_read_only+0x136/0x180 [dm_thin_pool]
kernel: [<ffffffffa0020b75>] process_deferred_bios+0xc5/0x230 [dm_thin_pool]
kernel: [<ffffffffa0020d31>] do_worker+0x51/0x60 [dm_thin_pool]
kernel: [<ffffffff81067823>] process_one_work+0x183/0x490
kernel: [<ffffffff81068c70>] worker_thread+0x120/0x3a0
kernel: [<ffffffff81068b50>] ? manage_workers+0x160/0x160
kernel: [<ffffffff8106e86e>] kthread+0xce/0xf0
kernel: [<ffffffff8106e7a0>] ? kthread_freezable_should_stop+0x70/0x70
kernel: [<ffffffff8153b3ec>] ret_from_fork+0x7c/0xb0
kernel: [<ffffffff8106e7a0>] ? kthread_freezable_should_stop+0x70/0x70
kernel: ---[ end trace 3f00528e08ffa55c ]---
kernel: device-mapper: thin: pool mode is PM_WRITE not PM_READ_ONLY like expected!?

dm-thin.c:995 was the WARN_ON_ONCE(get_pool_mode(pool) != PM_READ_ONLY);
at the top of handle_unserviceable_bio().  And as the additional
debugging I had conveys: the pool mode was _not_ PM_READ_ONLY like
expected, it was already PM_WRITE, yet pool->process_bio was still set
to process_bio_read_only().

Also, while fixing this up, reduce logging of redundant pool mode
transitions by checking new_mode is different from old_mode.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm thin: initialize dm_thin_new_mapping returned by get_next_mapping

commit 16961b042db8cc5cf75d782b4255193ad56e1d4f upstream.

As additional members are added to the dm_thin_new_mapping structure
care should be taken to make sure they get initialized before use.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm thin: fix discard support to a previously shared block

commit 19fa1a6756ed9e92daa9537c03b47d6b55cc2316 upstream.

If a snapshot is created and later deleted the origin dm_thin_device's
snapshotted_time will have been updated to reflect the snapshot's
creation time.  The 'shared' flag in the dm_thin_lookup_result struct
returned from dm_thin_find_block() is an approximation based on
snapshotted_time -- this is done to avoid 0(n), or worse, time
complexity.  In this case, the shared flag would be true.

But because the 'shared' flag reflects an approximation a block can be
incorrectly assumed to be shared (e.g. false positive for 'shared'
because the snapshot no longer exists).  This could result in discards
issued to a thin device not being passed down to the pool's underlying
data device.

To fix this we double check that a thin block is really still in-use
after a mapping is removed using dm_pool_block_is_used().  If the
reference count for a block is now zero the discard is allowed to be
passed down.

Also add a 'definitely_not_shared' member to the dm_thin_new_mapping
structure -- reflects that the 'shared' flag in the response from
dm_thin_find_block() can only be held as definitive if false is
returned.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1043527

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sunrpc: don't wait for write before allowing reads from use-gss-proxy file

commit 1654a04cd702fd19c297c36300a6ab834cf8c072 upstream.

It doesn't make much sense to make reads from this procfile hang. As
far as I can tell, only gssproxy itself will open this file and it
never reads from it. Change it to just give the present setting of
sn->use_gss_proxy without waiting for anything.

Note that we do not want to call use_gss_proxy() in this codepath
since an inopportune read of this file could cause it to be disabled
prematurely.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sunrpc: Fix infinite loop in RPC state machine

commit 6ff33b7dd0228b7d7ed44791bbbc98b03fd15d9d upstream.

When a task enters call_refreshresult with status 0 from call_refresh and
!rpcauth_uptodatecred(task) it enters call_refresh again with no rate-limiting
or max number of retries.

Instead of trying forever, make use of the retry path that other errors use.

This only seems to be possible when the crrefresh callback is gss_refresh_null,
which only happens when destroying the context.

To reproduce:

1) mount with sec=krb5 (or sec=sys with krb5 negotiated for non FSID specific
   operations).

2) reboot - the client will be stuck and will need to be hard rebooted

BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:2:46]
Modules linked in: rpcsec_gss_krb5 nfsv4 nfs fscache ppdev crc32c_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd serio_raw i2c_piix4 i2c_core e1000 parport_pc parport shpchp nfsd auth_rpcgss oid_registry exportfs nfs_acl lockd sunrpc autofs4 mptspi scsi_transport_spi mptscsih mptbase ata_generic floppy
irq event stamp: 195724
hardirqs last  enabled at (195723): [<ffffffff814a925c>] restore_args+0x0/0x30
hardirqs last disabled at (195724): [<ffffffff814b0a6a>] apic_timer_interrupt+0x6a/0x80
softirqs last  enabled at (195722): [<ffffffff8103f583>] __do_softirq+0x1df/0x276
softirqs last disabled at (195717): [<ffffffff8103f852>] irq_exit+0x53/0x9a
CPU: 0 PID: 46 Comm: kworker/0:2 Not tainted 3.13.0-rc3-branch-dros_testing+ #4
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
Workqueue: rpciod rpc_async_schedule [sunrpc]
task: ffff8800799c4260 ti: ffff880079002000 task.ti: ffff880079002000
RIP: 0010:[<ffffffffa0064fd4>]  [<ffffffffa0064fd4>] __rpc_execute+0x8a/0x362 [sunrpc]
RSP: 0018:ffff880079003d18  EFLAGS: 00000246
RAX: 0000000000000005 RBX: 0000000000000007 RCX: 0000000000000007
RDX: 0000000000000007 RSI: ffff88007aecbae8 RDI: ffff8800783d8900
RBP: ffff880079003d78 R08: ffff88006e30e9f8 R09: ffffffffa005a3d7
R10: ffff88006e30e7b0 R11: ffff8800783d8900 R12: ffffffffa006675e
R13: ffff880079003ce8 R14: ffff88006e30e7b0 R15: ffff8800783d8900
FS:  0000000000000000(0000) GS:ffff88007f200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f3072333000 CR3: 0000000001a0b000 CR4: 00000000001407f0
Stack:
ffff880079003d98 0000000000000246 0000000000000000 ffff88007a9a4830
ffff880000000000 ffffffff81073f47 ffff88007f212b00 ffff8800799c4260
ffff8800783d8988 ffff88007f212b00 ffffe8ffff604800 0000000000000000
Call Trace:
[<ffffffff81073f47>] ? trace_hardirqs_on_caller+0x145/0x1a1
[<ffffffffa00652d3>] rpc_async_schedule+0x27/0x32 [sunrpc]
[<ffffffff81052974>] process_one_work+0x211/0x3a5
[<ffffffff810528d5>] ? process_one_work+0x172/0x3a5
[<ffffffff81052eeb>] worker_thread+0x134/0x202
[<ffffffff81052db7>] ? rescuer_thread+0x280/0x280
[<ffffffff81052db7>] ? rescuer_thread+0x280/0x280
[<ffffffff810584a0>] kthread+0xc9/0xd1
[<ffffffff810583d7>] ? __kthread_parkme+0x61/0x61
[<ffffffff814afd6c>] ret_from_fork+0x7c/0xb0
[<ffffffff810583d7>] ? __kthread_parkme+0x61/0x61
Code: e8 87 63 fd e0 c6 05 10 dd 01 00 01 48 8b 43 70 4c 8d 6b 70 45 31 e4 a8 02 0f 85 d5 02 00 00 4c 8b 7b 48 48 c7 43 48 00 00 00 00 <4c> 8b 4b 50 4d 85 ff 75 0c 4d 85 c9 4d 89 cf 0f 84 32 01 00 00

And the output of "rpcdebug -m rpc -s all":

RPC:    61 call_refresh (status 0)
RPC:    61 call_refresh (status 0)
RPC:    61 refreshing RPCSEC_GSS cred ffff88007a413cf0
RPC:    61 refreshing RPCSEC_GSS cred ffff88007a413cf0
RPC:    61 call_refreshresult (status 0)
RPC:    61 refreshing RPCSEC_GSS cred ffff88007a413cf0
RPC:    61 call_refreshresult (status 0)
RPC:    61 refreshing RPCSEC_GSS cred ffff88007a413cf0
RPC:    61 call_refresh (status 0)
RPC:    61 call_refreshresult (status 0)
RPC:    61 call_refresh (status 0)
RPC:    61 call_refresh (status 0)
RPC:    61 refreshing RPCSEC_GSS cred ffff88007a413cf0
RPC:    61 call_refreshresult (status 0)
RPC:    61 call_refresh (status 0)
RPC:    61 refreshing RPCSEC_GSS cred ffff88007a413cf0
RPC:    61 call_refresh (status 0)
RPC:    61 refreshing RPCSEC_GSS cred ffff88007a413cf0
RPC:    61 refreshing RPCSEC_GSS cred ffff88007a413cf0
RPC:    61 call_refreshresult (status 0)
RPC:    61 call_refresh (status 0)
RPC:    61 call_refresh (status 0)
RPC:    61 call_refresh (status 0)
RPC:    61 call_refresh (status 0)
RPC:    61 call_refreshresult (status 0)
RPC:    61 refreshing RPCSEC_GSS cred ffff88007a413cf0

Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFSv4: Fix a slot leak in nfs40_sequence_done

commit cab92c19821a814ecf5a5279e2699bf28e66caee upstream.

The check for whether or not we sent an RPC call in nfs40_sequence_done
is insufficient to decide whether or not we are holding a session slot,
and thus should not be used to decide when to free that slot.

This patch replaces the RPC_WAS_SENT() test with the correct test for
whether or not slot == NULL.

Cc: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

pnfs: Proper delay for NFS4ERR_RECALLCONFLICT in layout_get_done

commit ed7e5423014ad89720fcf315c0b73f2c5d0c7bd2 upstream.

An NFS4ERR_RECALLCONFLICT is returned by server from a GET_LAYOUT
only when a Server Sent a RECALL do to that GET_LAYOUT, or
the RECALL and GET_LAYOUT crossed on the wire.
In any way this means we want to wait at most until in-flight IO
is finished and the RECALL can be satisfied.

So a proper wait here is more like 1/10 of a second, not 15 seconds
like we have now. In case of a server bug we delay exponentially
longer on each retry.

Current code totally craps out performance of very large files on
most pnfs-objects layouts, because of how the map changes when the
file has grown into the next raid group.

[Stable: This will patch back to 3.9. If there are earlier still
maintained trees, please tell me I'll send a patch]

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nfs4: fix discover_server_trunking use after free

commit abad2fa5ba67725a3f9c376c8cfe76fbe94a3041 upstream.

If clp is new (cl_count = 1) and it matches another client in
nfs4_discover_server_trunking, the nfs_put_client will free clp before
->cl_preserve_clid is set.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFSv4.1: Handle errors correctly in nfs41_walk_client_list

commit 64590daa9e0dfb3aad89e3ab9230683b76211d5b upstream.

Both nfs41_walk_client_list and nfs40_walk_client_list expect the
'status' variable to be set to the value -NFS4ERR_STALE_CLIENTID
if the loop fails to find a match.
The problem is that the 'pos->cl_cons_state > NFS_CS_READY' changes
the value of 'status', and sets it either to the value '0' (which
indicates success), or to the value EINTR.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nfs: always make sure page is up-to-date before extending a write to cover the entire page

commit 263b4509ec4d47e0da3e753f85a39ea12d1eff24 upstream.

We should always make sure the cached page is up-to-date when we're
determining whether we can extend a write to cover the full page -- even
if we've received a write delegation from the server.

Commit c7559663 added logic to skip this check if we have a write
delegation, which can lead to data corruption such as the following
scenario if client B receives a write delegation from the NFS server:

Client A:
    # echo 123456789 > /mnt/file

Client B:
    # echo abcdefghi >> /mnt/file
    # cat /mnt/file
    0�D0�abcdefghi

Just because we hold a write delegation doesn't mean that we've read in
the entire page contents.

Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>