]> git.itanic.dy.fi Git - linux-stable/log
linux-stable
10 years agoLinux 3.11.3 v3.11.3
Greg Kroah-Hartman [Tue, 1 Oct 2013 16:41:12 +0000 (09:41 -0700)]
Linux 3.11.3

10 years agonetfilter: ipset: Fix serious failure in CIDR tracking
Oliver Smith [Mon, 16 Sep 2013 18:30:57 +0000 (20:30 +0200)]
netfilter: ipset: Fix serious failure in CIDR tracking

commit 2cf55125c64d64cc106e204d53b107094762dfdf upstream.

This fixes a serious bug affecting all hash types with a net element -
specifically, if a CIDR value is deleted such that none of the same size
exist any more, all larger (less-specific) values will then fail to
match. Adding back any prefix with a CIDR equal to or more specific than
the one deleted will fix it.

Steps to reproduce:
ipset -N test hash:net
ipset -A test 1.1.0.0/16
ipset -A test 2.2.2.0/24
ipset -T test 1.1.1.1           #1.1.1.1 IS in set
ipset -D test 2.2.2.0/24
ipset -T test 1.1.1.1           #1.1.1.1 IS NOT in set

This is due to the fact that the nets counter was unconditionally
decremented prior to the iteration that shifts up the entries. Now, we
first check if there is a proceeding entry and if not, decrement it and
return. Otherwise, we proceed to iterate and then zero the last element,
which, in most cases, will already be zero.

Signed-off-by: Oliver Smith <oliver@8.c.9.b.0.7.4.0.1.0.0.2.ip6.arpa>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agocw1200: Don't perform SPI transfers in interrupt context
Solomon Peachy [Wed, 28 Aug 2013 00:29:46 +0000 (20:29 -0400)]
cw1200: Don't perform SPI transfers in interrupt context

commit aec8e88c947b7017e2b4bbcb68a4bfc4a1f8ad35 upstream.

When we get an interrupt from the hardware, the first thing the driver does
is tell the device to mask off the interrupt line.  Unfortunately this
involves a SPI transaction in interrupt context.  Some (most?) SPI
controllers perform the transfer asynchronously and try to sleep.
This is bad, and triggers a BUG().

So, work around this by using adding a hwbus hook for the cw1200 driver
core to call.  The cw1200_spi driver translates this into
irq_disable()/irq_enable() calls instead, which can safely be called in
interrupt context.

Apparently the platforms I used to develop the cw1200_spi driver used
synchronous spi_sync() implementations, which is why this didn't surface
until now.

Many thanks to Dave Sizeburns for the inital bug report and his services
as a tester.

Signed-off-by: Solomon Peachy <pizza@shaftnet.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agocw1200: Prevent a lock-related hang in the cw1200_spi driver
Solomon Peachy [Wed, 28 Aug 2013 00:29:47 +0000 (20:29 -0400)]
cw1200: Prevent a lock-related hang in the cw1200_spi driver

commit 85ba8f529c57ac6e2fca9be0d9e17920a1afb2e8 upstream.

The cw1200_spi driver tries to mirror the cw1200_sdio driver's lock
API, which relies on sdio_claim_host/sdio_release_host to serialize
hardware operations across multiple threads.

Unfortunately the implementation was flawed, as it lacked a way to wake
up the lock requestor when there was contention, often resulting in a
hang.

This problem was uncovered while trying to fix the
spi-transfers-in-interrupt-context BUG() corrected in the previous
patch.  Many thanks to Dave Sizeburns for his assistance in fixing this.

Signed-off-by: Solomon Peachy <pizza@shaftnet.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agorpc: let xdr layer allocate gssproxy receieve pages
J. Bruce Fields [Fri, 23 Aug 2013 21:26:28 +0000 (17:26 -0400)]
rpc: let xdr layer allocate gssproxy receieve pages

commit d4a516560fc96a9d486a9939bcb567e3fdce8f49 upstream.

In theory the linux cred in a gssproxy reply can include up to
NGROUPS_MAX data, 256K of data.  In the common case we expect it to be
shorter.  So do as the nfsv3 ACL code does and let the xdr code allocate
the pages as they come in, instead of allocating a lot of pages that
won't typically be used.

Tested-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agorpc: fix huge kmalloc's in gss-proxy
J. Bruce Fields [Tue, 20 Aug 2013 22:13:27 +0000 (18:13 -0400)]
rpc: fix huge kmalloc's in gss-proxy

commit 9dfd87da1aeb0fd364167ad199f40fe96a6a87be upstream.

The reply to a gssproxy can include up to NGROUPS_MAX gid's, which will
take up more than a page.  We therefore need to allocate an array of
pages to hold the reply instead of trying to allocate a single huge
buffer.

Tested-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agorpc: comment on linux_cred encoding, treat all as unsigned
J. Bruce Fields [Fri, 23 Aug 2013 15:17:53 +0000 (11:17 -0400)]
rpc: comment on linux_cred encoding, treat all as unsigned

commit 6a36978e6931e6601be586eb313375335f2cfaa3 upstream.

The encoding of linux creds is a bit confusing.

Also: I think in practice it doesn't really matter whether we treat any
of these things as signed or unsigned, but unsigned seems more
straightforward: uid_t/gid_t are unsigned and it simplifies the ngroups
overflow check.

Tested-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agorpc: clean up decoding of gssproxy linux creds
J. Bruce Fields [Wed, 21 Aug 2013 14:32:52 +0000 (10:32 -0400)]
rpc: clean up decoding of gssproxy linux creds

commit 778e512bb1d3315c6b55832248cd30c566c081d7 upstream.

We can use the normal coding infrastructure here.

Two minor behavior changes:

- we're assuming no wasted space at the end of the linux cred.
  That seems to match gss-proxy's behavior, and I can't see why
  it would need to do differently in the future.

- NGROUPS_MAX check added: note groups_alloc doesn't do this,
  this is the caller's responsibility.

Tested-by: Simo Sorce <simo@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agocfq: explicitly use 64bit divide operation for 64bit arguments
Anatol Pomozov [Sun, 22 Sep 2013 18:43:47 +0000 (12:43 -0600)]
cfq: explicitly use 64bit divide operation for 64bit arguments

commit f3cff25f05f2ac29b2ee355e611b0657482f6f1d upstream.

'samples' is 64bit operant, but do_div() second parameter is 32.
do_div silently truncates high 32 bits and calculated result
is invalid.

In case if low 32bit of 'samples' are zeros then do_div() produces
kernel crash.

Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Jonghwan Choi <jhbird.choi@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agobio-integrity: Fix use of bs->bio_integrity_pool after free
Bjorn Helgaas [Wed, 29 May 2013 22:29:55 +0000 (16:29 -0600)]
bio-integrity: Fix use of bs->bio_integrity_pool after free

commit adbe6991efd36104ac9eaf751993d35eaa7f493a upstream.

This fixes a copy and paste error introduced by 9f060e2231
("block: Convert integrity to bvec_alloc_bs()").

Found by Coverity (CID 1020654).

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Kent Overstreet <koverstreet@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Jonghwan Choi <jhbird.choi@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agomm: fix aio performance regression for database caused by THP
Khalid Aziz [Wed, 11 Sep 2013 21:22:20 +0000 (14:22 -0700)]
mm: fix aio performance regression for database caused by THP

commit 7cb2ef56e6a8b7b368b2e883a0a47d02fed66911 upstream.

I am working with a tool that simulates oracle database I/O workload.
This tool (orion to be specific -
<http://docs.oracle.com/cd/E11882_01/server.112/e16638/iodesign.htm#autoId24>)
allocates hugetlbfs pages using shmget() with SHM_HUGETLB flag.  It then
does aio into these pages from flash disks using various common block
sizes used by database.  I am looking at performance with two of the most
common block sizes - 1M and 64K.  aio performance with these two block
sizes plunged after Transparent HugePages was introduced in the kernel.
Here are performance numbers:

pre-THP 2.6.39 3.11-rc5
1M read 8384 MB/s 5629 MB/s 6501 MB/s
64K read 7867 MB/s 4576 MB/s 4251 MB/s

I have narrowed the performance impact down to the overheads introduced by
THP in __get_page_tail() and put_compound_page() routines.  perf top shows
>40% of cycles being spent in these two routines.  Every time direct I/O
to hugetlbfs pages starts, kernel calls get_page() to grab a reference to
the pages and calls put_page() when I/O completes to put the reference
away.  THP introduced significant amount of locking overhead to get_page()
and put_page() when dealing with compound pages because hugepages can be
split underneath get_page() and put_page().  It added this overhead
irrespective of whether it is dealing with hugetlbfs pages or transparent
hugepages.  This resulted in 20%-45% drop in aio performance when using
hugetlbfs pages.

Since hugetlbfs pages can not be split, there is no reason to go through
all the locking overhead for these pages from what I can see.  I added
code to __get_page_tail() and put_compound_page() to bypass all the
locking code when working with hugetlbfs pages.  This improved performance
significantly.  Performance numbers with this patch:

pre-THP 3.11-rc5 3.11-rc5 + Patch
1M read 8384 MB/s 6501 MB/s 8371 MB/s
64K read 7867 MB/s 4251 MB/s 6510 MB/s

Performance with 64K read is still lower than what it was before THP, but
still a 53% improvement.  It does mean there is more work to be done but I
will take a 53% improvement for now.

Please take a look at the following patch and let me know if it looks
reasonable.

[akpm@linux-foundation.org: tweak comments]
Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Pravin B Shelar <pshelar@nicira.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoaudit: fix endless wait in audit_log_start()
Konstantin Khlebnikov [Tue, 24 Sep 2013 22:27:42 +0000 (15:27 -0700)]
audit: fix endless wait in audit_log_start()

commit 8ac1c8d5deba65513b6a82c35e89e73996c8e0d6 upstream.

After commit 829199197a43 ("kernel/audit.c: avoid negative sleep
durations") audit emitters will block forever if userspace daemon cannot
handle backlog.

After the timeout the waiting loop turns into busy loop and runs until
daemon dies or returns back to work.  This is a minimal patch for that
bug.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Richard Guy Briggs <rgb@redhat.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: Chuck Anderson <chuck.anderson@oracle.com>
Cc: Dan Duval <dan.duval@oracle.com>
Cc: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jonghwan Choi <jhbird.choi@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoudf: Refuse RW mount of the filesystem instead of making it RO
Jan Kara [Thu, 25 Jul 2013 14:15:16 +0000 (16:15 +0200)]
udf: Refuse RW mount of the filesystem instead of making it RO

commit e729eac6f65e11c5f03b09adcc84bd5bcb230467 upstream.

Refuse RW mount of udf filesystem. So far we just silently changed it
to RO mount but when the media is writeable, block layer won't notice
this change and thus will think device is used RW and will block eject
button of the drive. That is unexpected by users because for
non-writeable media eject button works just fine.

Userspace mount(8) command handles this just fine and retries mounting
with MS_RDONLY set so userspace shouldn't see any regression.  Plus any
tool mounting udf is likely confronted with the case of read-only
media where block layer already refuses to mount the filesystem without
MS_RDONLY set so our behavior shouldn't be anything new for it.

Reported-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoudf: Standardize return values in mount sequence
Jan Kara [Thu, 25 Jul 2013 17:10:59 +0000 (19:10 +0200)]
udf: Standardize return values in mount sequence

commit d759bfa4e7919b89357de50a2e23817079889195 upstream.

Change all function used in filesystem discovery during mount to user
standard kernel return values - -errno on error, 0 on success instead
of 1 on failure and 0 on success. This allows us to pass error number
(not just failure / success) so we can abort device scanning earlier
in case of errors like EIO or ENOMEM . Also we will be able to return
EROFS in case writeable mount is requested but writing isn't supported.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoskge: fix broken driver
Mikulas Patocka [Thu, 19 Sep 2013 18:13:17 +0000 (14:13 -0400)]
skge: fix broken driver

commit c194992cbe71c20bb3623a566af8d11b0bfaa721 upstream.

The patch 136d8f377e1575463b47840bc5f1b22d94bf8f63 broke the skge driver.
Note this part of the patch:
+               if (skge_rx_setup(skge, e, nskb, skge->rx_buf_size) < 0) {
+                       dev_kfree_skb(nskb);
+                       goto resubmit;
+               }
+
                pci_unmap_single(skge->hw->pdev,
                                 dma_unmap_addr(e, mapaddr),
                                 dma_unmap_len(e, maplen),
                                 PCI_DMA_FROMDEVICE);
                skb = e->skb;
                prefetch(skb->data);
-               skge_rx_setup(skge, e, nskb, skge->rx_buf_size);

The function skge_rx_setup modifies e->skb to point to the new skb. Thus,
after this change, the new buffer, not the old, is returned to the
networking stack.

This bug is present in kernels 3.11, 3.11.1 and 3.12-rc1. The patch should
be queued for 3.11-stable.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Vasiliy Glazov <vascom2@gmail.com>
Tested-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon: avoid UVD corruptions on AGP cards
Christian König [Sun, 15 Sep 2013 11:31:28 +0000 (13:31 +0200)]
drm/radeon: avoid UVD corruptions on AGP cards

commit 4f66c59922cbcda14c9e103e6c7f4ee616360d43 upstream.

Putting everything into VRAM seems to help.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon: fix panel scaling with eDP and LVDS bridges
Alex Deucher [Fri, 13 Sep 2013 22:33:16 +0000 (18:33 -0400)]
drm/radeon: fix panel scaling with eDP and LVDS bridges

commit 855f5f1d882a34e4e9dd27b299737cd3508a5624 upstream.

We were using the wrong set_properly callback so we always
ended up with Full scaling even if something else (Center or
Full aspect) was selected.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon/dpm/rs780: don't enable sclk scaling if not required
Alex Deucher [Fri, 13 Sep 2013 14:55:10 +0000 (10:55 -0400)]
drm/radeon/dpm/rs780: don't enable sclk scaling if not required

commit e40210cca98068835acd5a4fe760bf96b3a1aa48 upstream.

If the low and high sclks are the same, there is no need to
enable sclk scaling.  This causes display stability issues on
certain boards.

Fixes:
https://bugzilla.kernel.org/show_bug.cgi?id=60857

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon/dpm: fix fallback for empty UVD clocks
Alex Deucher [Tue, 10 Sep 2013 13:40:37 +0000 (09:40 -0400)]
drm/radeon/dpm: fix fallback for empty UVD clocks

commit 84f3d9f7b4781dea6e11dcaf7f81367c1b39fef0 upstream.

Some older 6xx-7xx boards didn't always fill in the
UVD clocks properly in the UVD power states.  This
leads to the driver trying to set a 0 clock which
results in slow or broken UVD playback.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=69120

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon/dpm: handle bapm on trinity
Alex Deucher [Mon, 9 Sep 2013 22:56:50 +0000 (18:56 -0400)]
drm/radeon/dpm: handle bapm on trinity

commit ef4e03658420bbf91365647615460668c2510e79 upstream.

bapm is a power management feature for handling the
power budget between the CPU and GPU on APUs.  This
patch adds support for enabling or disabling it.
For now disable it by default.  Enabling it properly
requires quite a bit more work and will be addressed
in a separate patch.

This patch fixes hangs on boot on certain trinity
laptops when the system is on battery power.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon/atom: workaround vbios bug in transmitter table on rs880 (v2)
Alex Deucher [Mon, 9 Sep 2013 14:54:22 +0000 (10:54 -0400)]
drm/radeon/atom: workaround vbios bug in transmitter table on rs880 (v2)

commit 91f3a6aaf280294b07c05dfe606e6c27b7ba3c72 upstream.

The OUTPUT_ENABLE action jumps past the point in the coder where
the data_offset is set on certain rs780 cards.  This worked
previously because the OUTPUT_ENABLE action is always called
immediately after the ENABLE action so the data_offset remained
set.  In 6f8bbaf568c7f2c497558bfd04654c0b9841ad57
(drm/radeon/atom: initialize more atom interpretor elements to 0),
we explictly reset data_offset to 0 between atom calls which then
caused this to fail.  The fix is to just skip calling the
OUTPUT_ENABLE action on the problematic chipsets.  The ENABLE
action does the same thing and more.  Ultimately, we could
probably drop the OUTPUT_ENABLE action all together on DCE3
asics.

fixes:
https://bugzilla.kernel.org/show_bug.cgi?id=60791

v2: only rs880 seems to be affected

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon/r6xx: add a stubbed out set_uvd_clocks callback
Alex Deucher [Thu, 5 Sep 2013 13:52:37 +0000 (09:52 -0400)]
drm/radeon/r6xx: add a stubbed out set_uvd_clocks callback

commit 1b9ba70a49ba92e910d8e5df702edf8c1858cecf upstream.

Certain r6xx boards use the same power state for both UVD
and other things.  Since we don't support UVD on r6xx boards
at the moment, there was no callback installed for setting
the UVD clocks, however, on systems that use the same power
state, this leads to a NULL pointer dereference.  Fill
in a stubbed out implementation for now to avoid the crash.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=66963

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon: add some additional berlin pci ids
Alex Deucher [Wed, 4 Sep 2013 20:48:40 +0000 (16:48 -0400)]
drm/radeon: add some additional berlin pci ids

commit 9a71677874d200865433647e9282fcf9fa6b05dd upstream.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agotg3: Expand led off fix to include 5720
Nithin Sujir [Thu, 12 Sep 2013 21:01:31 +0000 (14:01 -0700)]
tg3: Expand led off fix to include 5720

commit 300cf9b93f74c3d969a0ad50bdac65416107c44c upstream.

Commit 989038e217e94161862a959e82f9a1ecf8dda152 ("tg3: Don't turn off
led on 5719 serdes port 0") added code to skip turning led off on port
0 of the 5719 since it powered down other ports. This workaround needs
to be enabled on the 5720 as well.

Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agotg3: Don't turn off led on 5719 serdes port 0
Nithin Sujir [Sat, 31 Aug 2013 00:01:36 +0000 (17:01 -0700)]
tg3: Don't turn off led on 5719 serdes port 0

commit 989038e217e94161862a959e82f9a1ecf8dda152 upstream.

Turning off led on port 0 of the 5719 serdes causes all other ports to
lose power and stop functioning. Add tg3_phy_led_bug() function to check
for this condition. We use a switch() in tg3_phy_led_bug() for
consistency with the tg3_phy_power_bug() function.

Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon/dpm: make sure dc performance level limits are valid (BTC-SI) (v2)
Alex Deucher [Fri, 30 Aug 2013 20:18:35 +0000 (16:18 -0400)]
drm/radeon/dpm: make sure dc performance level limits are valid (BTC-SI) (v2)

commit 1ff60ddb84bb9ff6fa182710c4e08b66badf918c upstream.

Check to make sure the dc limits are valid before using them.
Some systems may not have a dc limits table.  In that case just
use the ac limits.  This fixes hangs on systems when the power
state is changed when on battery (dc) due to invalid performance
state parameters.

Should fix:
https://bugs.freedesktop.org/show_bug.cgi?id=68708

v2: fix up limits in dpm_init()

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon: fix handling of variable sized arrays for router objects
Alex Deucher [Tue, 27 Aug 2013 16:36:01 +0000 (12:36 -0400)]
drm/radeon: fix handling of variable sized arrays for router objects

commit fb93df1c2d8b3b1fb16d6ee9e32554e0c038815d upstream.

The table has the following format:

typedef struct _ATOM_SRC_DST_TABLE_FOR_ONE_OBJECT         //usSrcDstTableOffset pointing to this structure
{
  UCHAR               ucNumberOfSrc;
  USHORT              usSrcObjectID[1];
  UCHAR               ucNumberOfDst;
  USHORT              usDstObjectID[1];
}ATOM_SRC_DST_TABLE_FOR_ONE_OBJECT;

usSrcObjectID[] and usDstObjectID[] are variably sized, so we
can't access them directly.  Use pointers and update the offset
appropriately when accessing the Dst members.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon: fix resume on some rs4xx boards (v2)
Alex Deucher [Mon, 26 Aug 2013 21:52:12 +0000 (17:52 -0400)]
drm/radeon: fix resume on some rs4xx boards (v2)

commit acf88deb8ddbb73acd1c3fa32fde51af9153227f upstream.

Setting MC_MISC_CNTL.GART_INDEX_REG_EN causes hangs on
some boards on resume.  The systems seem to work fine
without touching this bit so leave it as is.

v2: read-modify-write the GART_INDEX_REG_EN bit.
I suspect the problem is that we are losing the other
settings in the register.

fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=52952

Reported-by: Ondrej Zary <linux@rainbow-software.org>
Tested-by: Daniel Tobias <dan.g.tob@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon: update line buffer allocation for dce6
Alex Deucher [Mon, 19 Aug 2013 15:15:43 +0000 (11:15 -0400)]
drm/radeon: update line buffer allocation for dce6

commit 290d24576ccf1aa0373d2185cedfe262d0d4952a upstream.

We need to allocate line buffer to each display when
setting up the watermarks.  Failure to do so can lead
to a blank screen.  This fixes blank screen problems
on dce6 asics.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=64850

Based on an initial fix from:
Jay Cornwall <jay.cornwall@amd.com>

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon: update line buffer allocation for dce4.1/5
Alex Deucher [Mon, 19 Aug 2013 15:06:50 +0000 (11:06 -0400)]
drm/radeon: update line buffer allocation for dce4.1/5

commit 0b31e02363b0db4e7931561bc6c141436e729d9f upstream.

We need to allocate line buffer to each display when
setting up the watermarks.  Failure to do so can lead
to a blank screen.  This fixes blank screen problems
on dce4.1/5 asics.

Based on an initial fix from:
Jay Cornwall <jay.cornwall@amd.com>

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon/si: Add support for CP DMA to CS checker for compute v2
Tom Stellard [Fri, 16 Aug 2013 21:47:39 +0000 (17:47 -0400)]
drm/radeon/si: Add support for CP DMA to CS checker for compute v2

commit e5b9e7503eb1f4884efa3b321d3cc47806779202 upstream.

Also add a new RADEON_INFO query to check that CP DMA packets are
supported on the compute ring.

CP DMA has been supported since the 3.8 kernel, but due to an oversight
we forgot to teach the CS checker that the CP DMA packet was legal for
the compute ring on Southern Islands GPUs.

This patch fixes a bug where the radeon driver will incorrectly reject a legal
CP DMA packet from user space.  I would like to have the patch
backported to stable so that we don't have to require Mesa users to use a
bleeding edge kernel in order to take advantage of this feature which
is already present in the stable kernels (3.8 and newer).

v2:
  - Don't bump kms version, so this patch can be backported to stable
    kernels.

Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon: add berlin pci ids
Alex Deucher [Mon, 10 Jun 2013 19:51:21 +0000 (15:51 -0400)]
drm/radeon: add berlin pci ids

commit 0431b2742f8e7755f3bbf5924900d12973412e94 upstream.

This adds the pci ids for the berlin GPU core.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon/cik: update gpu_init for an additional berlin gpu
Alex Deucher [Wed, 4 Sep 2013 20:46:07 +0000 (16:46 -0400)]
drm/radeon/cik: update gpu_init for an additional berlin gpu

commit 7c4622d5415038a74964480844de885e7253a0f4 upstream.

Sets the right paramters for the new pci id.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon: fix init ordering for r600+
Alex Deucher [Fri, 30 Aug 2013 12:58:20 +0000 (08:58 -0400)]
drm/radeon: fix init ordering for r600+

commit e5903d399a7b0e5c14673c1206f4aeec2859c730 upstream.

The vram scratch buffer needs to be initialized
before the mc is programmed otherwise we program
0 as the GPU address of the default GPU fault
page.  In most cases we put vram at zero anyway and
reserve a page for the legacy vga buffer so in practice
this shouldn't cause any problems, but better to make
it correct.

Was changed in:
6fab3febf6d949b0a12b1e4e73db38e4a177a79e

Reported-by: FrankR Huang <FrankR.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon: update line buffer allocation for dce8
Alex Deucher [Mon, 19 Aug 2013 15:39:27 +0000 (11:39 -0400)]
drm/radeon: update line buffer allocation for dce8

commit bc01a8c7a24169f8b111b7dda6f5d8e7088309af upstream.

We need to allocate line buffer to each display when
setting up the watermarks.  Failure to do so can lead
to a blank screen.  This fixes blank screen problems
on dce8 asics.

Based on an initial fix from:
Jay Cornwall <jay.cornwall@amd.com>

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon: fill in gpu_init for berlin GPU cores
Alex Deucher [Mon, 10 Jun 2013 19:18:26 +0000 (15:18 -0400)]
drm/radeon: fill in gpu_init for berlin GPU cores

commit b2e4c70a9747ecb618d563b004ba746869dde5aa upstream.

This fills in the GPU specific details for berlin
GPU cores so that the driver will work with them.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon: enable UVD interrupts on CIK
Christian König [Fri, 30 Aug 2013 09:10:33 +0000 (11:10 +0200)]
drm/radeon: enable UVD interrupts on CIK

commit 6a3808b8233eb91b57c230cf1161ac116a189ffd upstream.

The same as on evergreen.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reported-by: FrankR Huang <FrankR.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon: fix endian bugs in hw i2c atom routines
Alex Deucher [Wed, 7 Aug 2013 23:34:53 +0000 (19:34 -0400)]
drm/radeon: fix endian bugs in hw i2c atom routines

commit 4543eda52113d1e2cc0e9bf416f79597e6ef1ec7 upstream.

Need to swap the data fetched over i2c properly.  This
is the same fix as the endian fix for aux channel
transactions.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon/dpm: add reclocking quirk for ASUS K70AF
Alex Deucher [Mon, 12 Aug 2013 15:24:05 +0000 (11:24 -0400)]
drm/radeon/dpm: add reclocking quirk for ASUS K70AF

commit f75195cac32bfd2ef07764bd370d3b788bd8b003 upstream.

The LCD has a relatively short vblank time (216us), but
the card is able to reclock memory fine in that time.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reported-by: normalrawr@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/radeon: fix LCD record parsing
Alex Deucher [Tue, 20 Aug 2013 18:59:01 +0000 (14:59 -0400)]
drm/radeon: fix LCD record parsing

commit 95663948ba22a4be8b99acd67fbf83e86ddffba4 upstream.

If the LCD table contains an EDID record, properly account
for the edid size when walking through the records.

This should fix error messages about unknown LCD records.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/nv50/disp: prevent false output detection on the original nv50
Emil Velikov [Fri, 23 Aug 2013 17:43:42 +0000 (18:43 +0100)]
drm/nv50/disp: prevent false output detection on the original nv50

commit 5087f51da805f53cba7366f70d596e7bde2a5486 upstream.

Commit ea9197cc323839ef3d5280c0453b2c622caa6bc7 effectively enabled the
use of an improved DAC detection code, but introduced a regression on
the original nv50 chipset, causing a ghost monitor to be detected.

v2 (Ben Skeggs): the offending line was likely a thinko, removed it for
all chipsets (tested nv50 and nve6 to cover entire range) and added
some additional debugging.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67382
Tested-by: Martin Peres <martin.peres@labri.fr>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoradeon kms: fix uninitialised hotplug work usage in r100_irq_process()
Sergey Senozhatsky [Thu, 29 Aug 2013 09:29:35 +0000 (12:29 +0300)]
radeon kms: fix uninitialised hotplug work usage in r100_irq_process()

commit 27c505ca84e164ec66ad55dcf3f5befaac83f10a upstream.

Commit a01c34f72e7cd2624570818f579b5ab464f93de2 (radeon kms: do not
flush uninitialized hotplug work) moved work initialisation phase to
the last step of radeon_irq_kms_init(). Meelis Roos reported that this
causes problems on his machine because drm_irq_install() uses hotplug
work on r100.

hotplug work flushed in radeon_irq_kms_fini(), with two possible cases:
-- radeon_irq_kms_fini() call after successful radeon_irq_kms_init()
-- radeon_irq_kms_fini() call after unsuccessful (or not called at all)
   radeon_irq_kms_init()

The latter one causes flush work on uninitialised hotplug work. Move
work initialisation before drm_irq_install(), but keep existing agreement
to flush hotplug work in radeon_irq_kms_fini() only for `irq.installed'
(successful radeon_irq_kms_init()) case.

WARNING: CPU: 0 PID: 243 at kernel/workqueue.c:1378 __queue_work+0x132/0x16d()
Call Trace:
[<c12319b3>] ? dump_stack+0xa/0x13
[<c1022600>] ? warn_slowpath_common+0x75/0x8a
[<c1031010>] ? __queue_work+0x132/0x16d
[<c1031010>] ? __queue_work+0x132/0x16d
[<c102269e>] ? warn_slowpath_null+0x1b/0x1f
[<c1031010>] ? __queue_work+0x132/0x16d
[<c103107b>] ? queue_work_on+0x30/0x40
[<f8aed3f3>] ? r100_irq_process+0x16d/0x1e6 [radeon]
[<f8ae77cf>] ? radeon_driver_irq_preinstall_kms+0xc2/0xc5 [radeon]
[<f8974d77>] ? drm_irq_install+0xb2/0x1ac [drm]
[<f897604d>] ? drm_vblank_init+0x196/0x1d2 [drm]
[<f8ae78d3>] ? radeon_irq_kms_init+0x33/0xc6 [radeon]
[<f8aef35a>] ? r100_startup+0x1a3/0x1d6 [radeon]
[<f8ad77c8>] ? radeon_ttm_init+0x26e/0x287 [radeon]
[<f8aef752>] ? r100_init+0x2b3/0x309 [radeon]
[<c118082e>] ? vga_client_register+0x39/0x40
[<f8ac535f>] ? radeon_device_init+0x54b/0x61b [radeon]
[<f8ac40fd>] ? cail_mc_write+0x13/0x13 [radeon]
[<f8ac6864>] ? radeon_driver_load_kms+0x82/0xda [radeon]
[<f8978bbd>] ? drm_get_pci_dev+0x136/0x22d [drm]
[<f8ac409b>] ? radeon_pci_probe+0x6c/0x86 [radeon]
[<c112acf6>] ? pci_device_probe+0x4c/0x83
[<c11846c7>] ? driver_probe_device+0x80/0x184
[<c112a848>] ? pci_match_id+0x18/0x36
[<c1184837>] ? __driver_attach+0x44/0x5f
[<c11833f4>] ? bus_for_each_dev+0x50/0x5a
[<c118433e>] ? driver_attach+0x14/0x16
[<c11847f3>] ? __device_attach+0x28/0x28
[<c1184045>] ? bus_add_driver+0xd6/0x1bf
[<c1184c22>] ? driver_register+0x78/0xcf
[<f8ba8000>] ? 0xf8ba7fff
[<c10003bf>] ? do_one_initcall+0x8b/0x121
[<c101e668>] ? change_page_attr_clear+0x2e/0x33
[<f8ba8000>] ? 0xf8ba7fff
[<c101e689>] ? set_memory_ro+0x1c/0x20
[<c104de94>] ? set_page_attributes+0x11/0x12
[<c104f6e1>] ? load_module+0x12fa/0x17e8
[<c107483b>] ? map_vm_area+0x22/0x31
[<c104fc36>] ? SyS_init_module+0x67/0x7d
[<c1234245>] ? sysenter_do_call+0x12/0x26

Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/ttm: fix the tt_populated check in ttm_tt_destroy()
Ben Skeggs [Tue, 17 Sep 2013 04:21:15 +0000 (14:21 +1000)]
drm/ttm: fix the tt_populated check in ttm_tt_destroy()

commit 182b17c8dc4e83aab000ce86587b6810e515da87 upstream.

After a vmalloc failure in ttm_dma_tt_alloc_page_directory(),
ttm_dma_tt_init() will call ttm_tt_destroy() to cleanup, and end up
inside the driver's unpopulate() hook when populate() has never yet
been called.

On nouveau, the first issue to be hit because of this is that
dma_address[] may be a NULL pointer.  After working around this,
ttm_pool_unpopulate() may potentially hit the same issue with
the pages[] array.

It seems to make more sense to avoid calling unpopulate on already
unpopulated TTMs than to add checks to all the implementations.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/ast: fix the ast open key function
Dave Airlie [Thu, 12 Sep 2013 05:31:04 +0000 (15:31 +1000)]
drm/ast: fix the ast open key function

commit 2e8378136f28bea960cec643d3fa5d843c9049ec upstream.

When porting from UMS I mistyped this from the wrong place, AST noticed
and pointed it out, so we should fix it to be like the X.org driver.

Reported-by: Y.C. Chen <yc_chen@aspeedtech.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm: fix DRM_IOCTL_MODE_GETFB handle-leak
David Herrmann [Mon, 26 Aug 2013 13:16:49 +0000 (15:16 +0200)]
drm: fix DRM_IOCTL_MODE_GETFB handle-leak

commit 101b96f32956ee99bf1468afaf572b88cda9f88b upstream.

DRM_IOCTL_MODE_GETFB is used to retrieve information about a given
framebuffer ID. It is a read-only helper and was thus declassified for
unprivileged access in:

  commit a14b1b42477c5ef089fcda88cbaae50d979eb8f9
  Author: Mandeep Singh Baines <mandeep.baines@gmail.com>
  Date:   Fri Jan 20 12:11:16 2012 -0800

      drm: remove master fd restriction on mode setting getters

However, alongside width, height and stride information,
DRM_IOCTL_MODE_GETFB also passes back a handle to the underlying buffer of
the framebuffer. This handle allows users to mmap() it and read or write
into it. Obviously, this should be restricted to DRM-Master.

With the current setup, *any* process with access to /dev/dri/card0 (which
means any process with access to hardware-accelerated rendering) can
access the current screen framebuffer and modify it ad libitum.

For backwards-compatibility reasons we want to keep the
DRM_IOCTL_MODE_GETFB call unprivileged. Besides, it provides quite useful
information regarding screen setup. So we simply test whether the caller
is the current DRM-Master and if not, we return 0 as handle, which is
always invalid. A following DRM_IOCTL_GEM_CLOSE on this handle will fail
with EINVAL, but we accept this. Users shouldn't test for errors during
GEM_CLOSE, anyway. And it is still better as a failing MODE_GETFB call.

v2: add capable(CAP_SYS_ADMIN) check for compatibility with i-g-t

Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/i915: Don't enable the cursor on a disable pipe
Ville Syrjälä [Tue, 17 Sep 2013 15:33:44 +0000 (18:33 +0300)]
drm/i915: Don't enable the cursor on a disable pipe

commit f2f5f771c5fc0fa252cde3d0d0452dcc785cc17a upstream.

On HSW enabling a plane on a disabled pipe may hang the entire system.
And there's no good reason for doing it ever, so just don't.

v2: Move the crtc active checks to intel_crtc_cursor_{set,move} to
    avoid confusing people during modeset

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/i915: do not update cursor in crtc mode set
Jani Nikula [Tue, 17 Sep 2013 15:33:43 +0000 (18:33 +0300)]
drm/i915: do not update cursor in crtc mode set

commit cc173961a68034c1171a421f0dbed39edfb60880 upstream.

The cursor is disabled before crtc mode set in crtc disable (and we
assert this is the case), and enabled afterwards in crtc enable. Do not
update it in crtc mode set.

On HSW enabling a plane on a disabled pipe may hang the entire system.
And there's no good reason for doing it ever, so just don't.

v2: Add note about HSW hangs - vsyrjala

Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/i915: fix wait_for_pending_flips vs gpu hang deadlock
Daniel Vetter [Sun, 8 Sep 2013 19:57:13 +0000 (21:57 +0200)]
drm/i915: fix wait_for_pending_flips vs gpu hang deadlock

commit 17e1df07df0fbc77696a1e1b6ccf9f2e5af70e40 upstream.

My g33 here seems to be shockingly good at hitting them all. This time
around kms_flip/flip-vs-panning-vs-hang blows up:

intel_crtc_wait_for_pending_flips correctly checks for gpu hangs and
if a gpu hang is pending aborts the wait for outstanding flips so that
the setcrtc call will succeed and release the crtc mutex. And the gpu
hang handler needs that lock in intel_display_handle_reset to be able
to complete outstanding flips.

The problem is that we can race in two ways:
- Waiters on the dev_priv->pending_flip_queue aren't woken up after
  we've the reset as pending, but before we actually start the reset
  work. This means that the waiter doesn't notice the pending reset
  and hence will keep on hogging the locks.

  Like with dev->struct_mutex and the ring->irq_queue wait queues we
  there need to wake up everyone that potentially holds a lock which
  the reset handler needs.

- intel_display_handle_reset was called _after_ we've already
  signalled the completion of the reset work. Which means a waiter
  could sneak in, grab the lock and never release it (since the
  pageflips won't ever get released).

  Similar to resetting the gem state all the reset work must complete
  before we update the reset counter. Contrary to the gem reset we
  don't need to have a second explicit wake up call since that will
  have happened already when completing the pageflips. We also don't
  have any issues that the completion happens while the reset state is
  still pending - wait_for_pending_flips is only there to ensure we
  display the right frame. After a gpu hang&reset events such
  guarantees are out the window anyway. This is in contrast to the gem
  code where too-early wake-up would result in unnecessary restarting
  of ioctls.

Also, since we've gotten these various deadlocks and ordering
constraints wrong so often throw copious amounts of comments at the
code.

This deadlock regression has been introduced in the commit which added
the pageflip reset logic to the gpu hang work:

commit 96a02917a0131e52efefde49c2784c0421d6c439
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Mon Feb 18 19:08:49 2013 +0200

    drm/i915: Finish page flips and update primary planes after a GPU reset

v2:
- Add comments to explain how the wake_up serves as memory barriers
  for the atomic_t reset counter.
- Improve the comments a bit as suggested by Chris Wilson.
- Extract the wake_up calls before/after the reset into a little
  i915_error_wake_up and unconditionally wake up the
  pending_flip_queue waiters, again as suggested by Chris Wilson.

v3: Throw copious amounts of comments at i915_error_wake_up as
suggested by Chris Wilson.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/i915: fix gpu hang vs. flip stall deadlocks
Daniel Vetter [Wed, 4 Sep 2013 15:36:14 +0000 (17:36 +0200)]
drm/i915: fix gpu hang vs. flip stall deadlocks

commit 122f46badaafbe651f05c2c0f24cadee692f761b upstream.

Since we've started to clean up pending flips when the gpu hangs in

commit 96a02917a0131e52efefde49c2784c0421d6c439
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Mon Feb 18 19:08:49 2013 +0200

    drm/i915: Finish page flips and update primary planes after a GPU reset

the gpu reset work now also grabs modeset locks. But since work items
on our private work queue are not allowed to do that due to the
flush_workqueue from the pageflip code this results in a neat
deadlock:

INFO: task kms_flip:14676 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kms_flip        D ffff88019283a5c0     0 14676  13344 0x00000004
 ffff88018e62dbf8 0000000000000046 ffff88013bdb12e0 ffff88018e62dfd8
 ffff88018e62dfd8 00000000001d3b00 ffff88019283a5c0 ffff88018ec21000
 ffff88018f693f00 ffff88018eece000 ffff88018e62dd60 ffff88018eece898
Call Trace:
 [<ffffffff8138ee7b>] schedule+0x60/0x62
 [<ffffffffa046c0dd>] intel_crtc_wait_for_pending_flips+0xb2/0x114 [i915]
 [<ffffffff81050ff4>] ? finish_wait+0x60/0x60
 [<ffffffffa0478041>] intel_crtc_set_config+0x7f3/0x81e [i915]
 [<ffffffffa031780a>] drm_mode_set_config_internal+0x4f/0xc6 [drm]
 [<ffffffffa0319cf3>] drm_mode_setcrtc+0x44d/0x4f9 [drm]
 [<ffffffff810e44da>] ? might_fault+0x38/0x86
 [<ffffffffa030d51f>] drm_ioctl+0x2f9/0x447 [drm]
 [<ffffffff8107a722>] ? trace_hardirqs_off+0xd/0xf
 [<ffffffffa03198a6>] ? drm_mode_setplane+0x343/0x343 [drm]
 [<ffffffff8112222f>] ? mntput_no_expire+0x3e/0x13d
 [<ffffffff81117f33>] vfs_ioctl+0x18/0x34
 [<ffffffff81118776>] do_vfs_ioctl+0x396/0x454
 [<ffffffff81396b37>] ? sysret_check+0x1b/0x56
 [<ffffffff81118886>] SyS_ioctl+0x52/0x7d
 [<ffffffff81396b12>] system_call_fastpath+0x16/0x1b
2 locks held by kms_flip/14676:
 #0:  (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa0316545>] drm_modeset_lock_all+0x22/0x59 [drm]
 #1:  (&crtc->mutex){+.+.+.}, at: [<ffffffffa031656b>] drm_modeset_lock_all+0x48/0x59 [drm]
INFO: task kworker/u8:4:175 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u8:4    D ffff88018de9a5c0     0   175      2 0x00000000
Workqueue: i915 i915_error_work_func [i915]
 ffff88018e37dc30 0000000000000046 ffff8801938ab8a0 ffff88018e37dfd8
 ffff88018e37dfd8 00000000001d3b00 ffff88018de9a5c0 ffff88018ec21018
 0000000000000246 ffff88018e37dca0 000000005a865a86 ffff88018de9a5c0
Call Trace:
 [<ffffffff8138ee7b>] schedule+0x60/0x62
 [<ffffffff8138f23d>] schedule_preempt_disabled+0x9/0xb
 [<ffffffff8138d0cd>] mutex_lock_nested+0x205/0x3b1
 [<ffffffffa0477094>] ? intel_display_handle_reset+0x7e/0xbd [i915]
 [<ffffffffa0477094>] ? intel_display_handle_reset+0x7e/0xbd [i915]
 [<ffffffffa0477094>] intel_display_handle_reset+0x7e/0xbd [i915]
 [<ffffffffa044e0a2>] i915_error_work_func+0x128/0x147 [i915]
 [<ffffffff8104a89a>] process_one_work+0x1d4/0x35a
 [<ffffffff8104a821>] ? process_one_work+0x15b/0x35a
 [<ffffffff8104b4a5>] worker_thread+0x144/0x1f0
 [<ffffffff8104b361>] ? rescuer_thread+0x275/0x275
 [<ffffffff8105076d>] kthread+0xac/0xb4
 [<ffffffff81059d30>] ? finish_task_switch+0x3b/0xc0
 [<ffffffff810506c1>] ? __kthread_parkme+0x60/0x60
 [<ffffffff81396a6c>] ret_from_fork+0x7c/0xb0
 [<ffffffff810506c1>] ? __kthread_parkme+0x60/0x60
3 locks held by kworker/u8:4/175:
 #0:  (i915){.+.+.+}, at: [<ffffffff8104a821>] process_one_work+0x15b/0x35a
 #1:  ((&dev_priv->gpu_error.work)){+.+.+.}, at: [<ffffffff8104a821>] process_one_work+0x15b/0x35a
 #2:  (&crtc->mutex){+.+.+.}, at: [<ffffffffa0477094>] intel_display_handle_reset+0x7e/0xbd [i915]

This blew up while running kms_flip/flip-vs-panning-vs-hang-interruptible
on one of my older machines.

Unfortunately (despite the proper lockdep annotations for
flush_workqueue) lockdep still doesn't detect this correctly, so we
need to rely on chance to discover these bugs.

Apply the usual bugfix and schedule the reset work on the system
workqueue to keep our own driver workqueue free of any modeset lock
grabbing.

Note that this is not a terribly serious regression since before the
offending commit we'd simply have stalled userspace forever due to
failing to abort all outstanding pageflips.

v2: Add a comment as requested by Chris.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/i915: fix hpd work vs. flush_work in the pageflip code deadlock
Daniel Vetter [Mon, 2 Sep 2013 14:22:25 +0000 (16:22 +0200)]
drm/i915: fix hpd work vs. flush_work in the pageflip code deadlock

commit 645416f5adc87c8fae44289cdba7562f3ade8f5c upstream.

Historically we've run our own driver hotplug handling in our own
work-queue, which then launched the drm core hotplug handling in the
system workqueue. This is important since we flush our own driver
workqueue in the pageflip code while hodling modeset locks, and only
the drm hotplug code grabbed these locks. But with

commit 69787f7da6b2adc4054357a661aaa1701a9ca76f
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Tue Oct 23 18:23:34 2012 +0000

    drm: run the hpd irq event code directly

this was changed and now we could deadlock in our flip handler if
there's a hotplug work blocking the progress of the crucial unpin
works. So this broke the careful deadlock avoidance implemented in

commit b4a98e57fc27854b5938fc8b08b68e5e68b91e1f
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Nov 1 09:26:26 2012 +0000

    drm/i915: Flush outstanding unpin tasks before pageflipping

Since the rule thus far has been that work items on our own workqueue
may never grab modeset locks simply restore that rule again.

v2: Add a comment to the declaration of dev_priv->wq to warn readers
about the tricky implications of using it. Suggested by Chris Wilson.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Stuart Abercrombie <sabercrombie@chromium.org>
Reported-by: Stuart Abercrombie <sabercrombie@chromium.org>
References: http://permalink.gmane.org/gmane.comp.freedesktop.xorg.drivers.intel/26239
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Squash in a comment at the place where we schedule the work.
Requested after-the-fact by Chris on irc since the hpd work isn't the
only place we botch this.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/i915: try not to lose backlight CBLV precision
Jani Nikula [Fri, 23 Aug 2013 07:50:39 +0000 (10:50 +0300)]
drm/i915: try not to lose backlight CBLV precision

commit cac6a5ae0118832936eb162ec4cedb30f2422bcc upstream.

ACPI has _BCM and _BQC methods to set and query the backlight
brightness, respectively. The ACPI opregion has variables BCLP and CBLV
to hold the requested and current backlight brightness, respectively.

The BCLP variable has range 0..255 while the others have range
0..100. This means the _BCM method has to scale the brightness for BCLP,
and the gfx driver has to scale the requested value back for CBLV. If
the _BQC method uses the CBLV variable (apparently some implementations
do, some don't) for current backlight level reporting, there's room for
rounding errors.

Use DIV_ROUND_UP for scaling back to CBLV to get back to the same values
that were passed to _BCM, presuming the _BCM simply uses bclp = (in *
255) / 100 for scaling to BCLP.

Reference: https://gist.github.com/aaronlu/6314920
Reported-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agousb: gadget: fix a bug and a WARN_ON in dummy-hcd
Alan Stern [Tue, 30 Jul 2013 19:18:15 +0000 (15:18 -0400)]
usb: gadget: fix a bug and a WARN_ON in dummy-hcd

commit 5f5610f69be3a925b1f79af27150bb7377bc9ad6 upstream.

This patch fixes a NULL pointer dereference and a WARN_ON in
dummy-hcd.  These things were the result of moving to the UDC core
framework, and possibly of changes to that framework.

Now unloading a gadget driver causes the UDC to be stopped after the
gadget driver is unbound, not before.  Therefore the "driver" argument
to dummy_udc_stop() can be NULL, so we must not try to print the
driver's name without checking first.

Also, the UDC framework automatically unregisters the gadget when the
UDC is deleted.  Therefore a sysfs attribute file attached to the
gadget must be removed before the UDC is deleted, not after.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoHID: logitech-dj: validate output report details
Kees Cook [Wed, 11 Sep 2013 19:56:56 +0000 (21:56 +0200)]
HID: logitech-dj: validate output report details

commit 297502abb32e225fb23801fcdb0e4f6f8e17099a upstream.

A HID device could send a malicious output report that would cause the
logitech-dj HID driver to leak kernel memory contents to the device, or
trigger a NULL dereference during initialization:

[  304.424553] usb 1-1: New USB device found, idVendor=046d, idProduct=c52b
...
[  304.780467] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[  304.781409] IP: [<ffffffff815d50aa>] logi_dj_recv_send_report.isra.11+0x1a/0x90

CVE-2013-2895

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoHID: lenovo-tpkbd: validate output report details
Kees Cook [Wed, 11 Sep 2013 19:56:55 +0000 (21:56 +0200)]
HID: lenovo-tpkbd: validate output report details

commit 0a9cd0a80ac559357c6a90d26c55270ed752aa26 upstream.

A HID device could send a malicious output report that would cause the
lenovo-tpkbd HID driver to write just beyond the output report allocation
during initialization, causing a heap overflow:

[   76.109807] usb 1-1: New USB device found, idVendor=17ef, idProduct=6009
...
[   80.462540] BUG kmalloc-192 (Tainted: G        W   ): Redzone overwritten

CVE-2013-2894

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoHID: sony: validate HID output report details
Kees Cook [Wed, 11 Sep 2013 19:56:52 +0000 (21:56 +0200)]
HID: sony: validate HID output report details

commit 9446edb9a1740989cf6c20daf7510fb9a23be14a upstream.

This driver must validate the availability of the HID output report and
its size before it can write LED states via buzz_set_leds(). This stops
a heap overflow that is possible if a device provides a malicious HID
output report:

[  108.171280] usb 1-1: New USB device found, idVendor=054c, idProduct=0002
...
[  117.507877] BUG kmalloc-192 (Not tainted): Redzone overwritten

CVE-2013-2890

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoHID: steelseries: validate output report details
Kees Cook [Wed, 11 Sep 2013 19:56:53 +0000 (21:56 +0200)]
HID: steelseries: validate output report details

commit 41df7f6d43723deb7364340b44bc5d94bf717456 upstream.

A HID device could send a malicious output report that would cause the
steelseries HID driver to write beyond the output report allocation
during initialization, causing a heap overflow:

[  167.981534] usb 1-1: New USB device found, idVendor=1038, idProduct=1410
...
[  182.050547] BUG kmalloc-256 (Tainted: G        W   ): Redzone overwritten

CVE-2013-2891

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoHID: lenovo-tpkbd: fix leak if tpkbd_probe_tp fails
Benjamin Tissoires [Wed, 11 Sep 2013 19:56:59 +0000 (21:56 +0200)]
HID: lenovo-tpkbd: fix leak if tpkbd_probe_tp fails

commit 0ccdd9e7476680c16113131264ad6597bd10299d upstream.

If tpkbd_probe_tp() bails out, the probe() function return an error,
but hid_hw_stop() is never called.

fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=1003998

Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoHID: zeroplus: validate output report details
Kees Cook [Wed, 11 Sep 2013 19:56:51 +0000 (21:56 +0200)]
HID: zeroplus: validate output report details

commit 78214e81a1bf43740ce89bb5efda78eac2f8ef83 upstream.

The zeroplus HID driver was not checking the size of allocated values
in fields it used. A HID device could send a malicious output report
that would cause the driver to write beyond the output report allocation
during initialization, causing a heap overflow:

[ 1442.728680] usb 1-1: New USB device found, idVendor=0c12, idProduct=0005
...
[ 1466.243173] BUG kmalloc-192 (Tainted: G        W   ): Redzone overwritten

CVE-2013-2889

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoHID: LG: validate HID output report details
Kees Cook [Wed, 11 Sep 2013 19:56:54 +0000 (21:56 +0200)]
HID: LG: validate HID output report details

commit 0fb6bd06e06792469acc15bbe427361b56ada528 upstream.

A HID device could send a malicious output report that would cause the
lg, lg3, and lg4 HID drivers to write beyond the output report allocation
during an event, causing a heap overflow:

[  325.245240] usb 1-1: New USB device found, idVendor=046d, idProduct=c287
...
[  414.518960] BUG kmalloc-4096 (Not tainted): Redzone overwritten

Additionally, while lg2 did correctly validate the report details, it was
cleaned up and shortened.

CVE-2013-2893

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoHID: multitouch: validate indexes details
Benjamin Tissoires [Wed, 11 Sep 2013 19:56:58 +0000 (21:56 +0200)]
HID: multitouch: validate indexes details

commit 8821f5dc187bdf16cfb32ef5aa8c3035273fa79a upstream.

When working on report indexes, always validate that they are in bounds.
Without this, a HID device could report a malicious feature report that
could trick the driver into a heap overflow:

[  634.885003] usb 1-1: New USB device found, idVendor=0596, idProduct=0500
...
[  676.469629] BUG kmalloc-192 (Tainted: G        W   ): Redzone overwritten

Note that we need to change the indexes from s8 to s16 as they can
be between -1 and 255.

CVE-2013-2897

Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoHID: validate feature and input report details
Benjamin Tissoires [Wed, 11 Sep 2013 19:56:57 +0000 (21:56 +0200)]
HID: validate feature and input report details

commit cc6b54aa54bf40b762cab45a9fc8aa81653146eb upstream.

When dealing with usage_index, be sure to properly use unsigned instead of
int to avoid overflows.

When working on report fields, always validate that their report_counts are
in bounds.
Without this, a HID device could report a malicious feature report that
could trick the driver into a heap overflow:

[  634.885003] usb 1-1: New USB device found, idVendor=0596, idProduct=0500
...
[  676.469629] BUG kmalloc-192 (Tainted: G        W   ): Redzone overwritten

CVE-2013-2897

Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoHID: provide a helper for validating hid reports
Kees Cook [Wed, 11 Sep 2013 19:56:50 +0000 (21:56 +0200)]
HID: provide a helper for validating hid reports

commit 331415ff16a12147d57d5c953f3a961b7ede348b upstream.

Many drivers need to validate the characteristics of their HID report
during initialization to avoid misusing the reports. This adds a common
helper to perform validation of the report exisitng, the field existing,
and the expected number of values within the field.

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agosched/fair: Fix small race where child->se.parent,cfs_rq might point to invalid ones
Daisuke Nishimura [Tue, 10 Sep 2013 09:16:36 +0000 (18:16 +0900)]
sched/fair: Fix small race where child->se.parent,cfs_rq might point to invalid ones

commit 6c9a27f5da9609fca46cb2b183724531b48f71ad upstream.

There is a small race between copy_process() and cgroup_attach_task()
where child->se.parent,cfs_rq points to invalid (old) ones.

        parent doing fork()      | someone moving the parent to another cgroup
  -------------------------------+---------------------------------------------
    copy_process()
      + dup_task_struct()
        -> parent->se is copied to child->se.
           se.parent,cfs_rq of them point to old ones.

                                     cgroup_attach_task()
                                       + cgroup_task_migrate()
                                         -> parent->cgroup is updated.
                                       + cpu_cgroup_attach()
                                         + sched_move_task()
                                           + task_move_group_fair()
                                             +- set_task_rq()
                                                -> se.parent,cfs_rq of parent
                                                   are updated.

      + cgroup_fork()
        -> parent->cgroup is copied to child->cgroup. (*1)
      + sched_fork()
        + task_fork_fair()
          -> se.parent,cfs_rq of child are accessed
             while they point to old ones. (*2)

In the worst case, this bug can lead to "use-after-free" and cause a panic,
because it's new cgroup's refcount that is incremented at (*1),
so the old cgroup(and related data) can be freed before (*2).

In fact, a panic caused by this bug was originally caught in RHEL6.4.

    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [<ffffffff81051e3e>] sched_slice+0x6e/0xa0
    [...]
    Call Trace:
     [<ffffffff81051f25>] place_entity+0x75/0xa0
     [<ffffffff81056a3a>] task_fork_fair+0xaa/0x160
     [<ffffffff81063c0b>] sched_fork+0x6b/0x140
     [<ffffffff8106c3c2>] copy_process+0x5b2/0x1450
     [<ffffffff81063b49>] ? wake_up_new_task+0xd9/0x130
     [<ffffffff8106d2f4>] do_fork+0x94/0x460
     [<ffffffff81072a9e>] ? sys_wait4+0xae/0x100
     [<ffffffff81009598>] sys_clone+0x28/0x30
     [<ffffffff8100b393>] stub_clone+0x13/0x20
     [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b

Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/039601ceae06$733d3130$59b79390$@mxp.nes.nec.co.jp
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agosched/cputime: Do not scale when utime == 0
Stanislaw Gruszka [Wed, 4 Sep 2013 13:16:03 +0000 (15:16 +0200)]
sched/cputime: Do not scale when utime == 0

commit 5a8e01f8fa51f5cbce8f37acc050eb2319d12956 upstream.

scale_stime() silently assumes that stime < rtime, otherwise
when stime == rtime and both values are big enough (operations
on them do not fit in 32 bits), the resulting scaling stime can
be bigger than rtime. In consequence utime = rtime - stime
results in negative value.

User space visible symptoms of the bug are overflowed TIME
values on ps/top, for example:

 $ ps aux | grep rcu
 root         8  0.0  0.0      0     0 ?        S    12:42   0:00 [rcuc/0]
 root         9  0.0  0.0      0     0 ?        S    12:42   0:00 [rcub/0]
 root        10 62422329  0.0  0     0 ?        R    12:42 21114581:37 [rcu_preempt]
 root        11  0.1  0.0      0     0 ?        S    12:42   0:02 [rcuop/0]
 root        12 62422329  0.0  0     0 ?        S    12:42 21114581:35 [rcuop/1]
 root        10 62422329  0.0  0     0 ?        R    12:42 21114581:37 [rcu_preempt]

or overflowed utime values read directly from /proc/$PID/stat

Reference:

  https://lkml.org/lkml/2013/8/20/259

Reported-and-tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: stable@vger.kernel.org
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/20130904131602.GC2564@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agotimekeeping: Fix HRTICK related deadlock from ntp lock changes
John Stultz [Wed, 11 Sep 2013 23:50:56 +0000 (16:50 -0700)]
timekeeping: Fix HRTICK related deadlock from ntp lock changes

commit 7bd36014460f793c19e7d6c94dab67b0afcfcb7f upstream.

Gerlando Falauto reported that when HRTICK is enabled, it is
possible to trigger system deadlocks. These were hard to
reproduce, as HRTICK has been broken in the past, but seemed
to be connected to the timekeeping_seq lock.

Since seqlock/seqcount's aren't supported w/ lockdep, I added
some extra spinlock based locking and triggered the following
lockdep output:

[   15.849182] ntpd/4062 is trying to acquire lock:
[   15.849765]  (&(&pool->lock)->rlock){..-...}, at: [<ffffffff810aa9b5>] __queue_work+0x145/0x480
[   15.850051]
[   15.850051] but task is already holding lock:
[   15.850051]  (timekeeper_lock){-.-.-.}, at: [<ffffffff810df6df>] do_adjtimex+0x7f/0x100

<snip>

[   15.850051] Chain exists of: &(&pool->lock)->rlock --> &p->pi_lock --> timekeeper_lock
[   15.850051]  Possible unsafe locking scenario:
[   15.850051]
[   15.850051]        CPU0                    CPU1
[   15.850051]        ----                    ----
[   15.850051]   lock(timekeeper_lock);
[   15.850051]                                lock(&p->pi_lock);
[   15.850051] lock(timekeeper_lock);
[   15.850051] lock(&(&pool->lock)->rlock);
[   15.850051]
[   15.850051]  *** DEADLOCK ***

The deadlock was introduced by 06c017fdd4dc48451a ("timekeeping:
Hold timekeepering locks in do_adjtimex and hardpps") in 3.10

This patch avoids this deadlock, by moving the call to
schedule_delayed_work() outside of the timekeeper lock
critical section.

Reported-by: Gerlando Falauto <gerlando.falauto@keymile.com>
Tested-by: Lin Ming <minggr@gmail.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: http://lkml.kernel.org/r/1378943457-27314-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agort2800: fix wrong TX power compensation
Stanislaw Gruszka [Mon, 26 Aug 2013 13:18:53 +0000 (15:18 +0200)]
rt2800: fix wrong TX power compensation

commit 6e956da2027c767859128b9bfef085cf2a8e233b upstream.

We should not do temperature compensation on devices without
EXTERNAL_TX_ALC bit set (called DynamicTxAgcControl on vendor driver).
Such devices can have totally bogus TSSI parameters on the EEPROM,
but still threaded by us as valid and result doing wrong TX power
calculations.

This fix inability to connect to AP on slightly longer distance on
some Ralink chips/devices.

Reported-and-tested-by: Fabien ADAM <id2ndr@crocobox.org>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agort2800: change initialization sequence to fix system freeze
Stanislaw Gruszka [Mon, 9 Sep 2013 10:37:37 +0000 (12:37 +0200)]
rt2800: change initialization sequence to fix system freeze

commit f4e1a4d3ecbb9e42bdf8e7869ee8a4ebfa27fb20 upstream.

My commit

commit c630ccf1a127578421a928489d51e99c05037054
Author: Stanislaw Gruszka <stf_xl@wp.pl>
Date:   Sat Mar 16 19:19:46 2013 +0100

    rt2800: rearrange bbp/rfcsr initialization

make Maxim machine freeze when try to start wireless device.

Initialization order and sending MCU_BOOT_SIGNAL request, changed in
above commit, is important. Doing things incorrectly make PCIe bus
problems, which can froze the machine.

This patch change initialization sequence like vendor driver do:
function NICInitializeAsic() from
2011_1007_RT5390_RT5392_Linux_STA_V2.5.0.3_DPO (PCI devices) and
DPO_RT5572_LinuxSTA_2.6.1.3_20121022 (according Mediatek, latest driver
for RT8070/RT3070/RT3370/RT3572/RT5370/RT5372/RT5572 USB devices).
It fixes freezes on Maxim system.

Resolve:
https://bugzilla.redhat.com/show_bug.cgi?id=1000679

Reported-and-tested-by: Maxim Polyakov <polyakov@dexmalabs.com>
Bisected-by: Igor Gnatenko <i.gnatenko.brain@gmail.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agobgmac: fix internal switch initialization
Rafał Miłecki [Sat, 14 Sep 2013 22:22:47 +0000 (00:22 +0200)]
bgmac: fix internal switch initialization

commit 6a391e7bf26c04a6df5f77290e1146941d210d49 upstream.

Some devices (BCM4749, BCM5357, BCM53572) have internal switch that
requires initialization. We already have code for this, but because
of the typo in code it was never working. This resulted in network not
working for some routers and possibility of soft-bricking them.

Use correct bit for switch initialization and fix typo in the define.

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agocifs: fix filp leak in cifs_atomic_open()
Miklos Szeredi [Mon, 16 Sep 2013 12:51:59 +0000 (14:51 +0200)]
cifs: fix filp leak in cifs_atomic_open()

commit dfb1d61b0e9f9e2c542e9adc8d970689f4114ff6 upstream.

If an error occurs after having called finish_open() then fput() needs to
be called on the already opened file.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Cc: Steve French <sfrench@samba.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agonet: usb: cdc_ether: Use wwan interface for Telit modules
Fabio Porcedda [Mon, 16 Sep 2013 09:47:50 +0000 (11:47 +0200)]
net: usb: cdc_ether: Use wwan interface for Telit modules

commit 0092820407901a0b2c4e343e85f96bb7abfcded1 upstream.

Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com>
Acked-by: Oliver Neukum <oliver@neukum.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoARM: OMAP4: Fix clock_get error for GPMC during boot
Tony Lindgren [Wed, 18 Sep 2013 19:01:58 +0000 (12:01 -0700)]
ARM: OMAP4: Fix clock_get error for GPMC during boot

commit 2cfeed314207f808077edb2f1ba41ba1ebbe3e69 upstream.

Looks like we still have the legacy clock alias name for
omap4 GPMC (General Purpose Memory Controller), so let's
fix it for the device tree naming. There's no need to keep
the legacy naming as omap4 is DT only nowadays.

Without this fix we get the following error while booting:

[    0.440399] omap-gpmc 50000000.gpmc: error: clk_get

Reported-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoPCI / ACPI / PM: Clear pme_poll for devices in D3cold on wakeup
Rafael J. Wysocki [Sat, 14 Sep 2013 01:38:20 +0000 (03:38 +0200)]
PCI / ACPI / PM: Clear pme_poll for devices in D3cold on wakeup

commit 834145156bedadfb50121f0bc5e9d9f9f942bcca upstream.

Commit 448bd85 (PCI/PM: add PCIe runtime D3cold support) added a
piece of code to pci_acpi_wake_dev() causing that function to behave
in a special way for devices in D3cold (so that their configuration
registers are not accessed before those devices are resumed).
However, it didn't take the clearing of the pme_poll flag into
account.  That has to be done for all devices, even if they are in
D3cold, or pci_pme_list_scan() will not know that wakeup has been
signaled for the device and will poll its PME Status bit
unnecessarily.

Fix the problem by moving the clearing of the pme_poll flag in
pci_acpi_wake_dev() before the code introduced by commit 448bd85.

Reported-and-tested-by: David E. Box <david.e.box@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoLinux 3.11.2 v3.11.2
Greg Kroah-Hartman [Fri, 27 Sep 2013 00:22:44 +0000 (17:22 -0700)]
Linux 3.11.2

10 years agofuse: readdir: check for slash in names
Miklos Szeredi [Tue, 3 Sep 2013 12:28:38 +0000 (14:28 +0200)]
fuse: readdir: check for slash in names

commit efeb9e60d48f7778fdcad4a0f3ad9ea9b19e5dfd upstream.

Userspace can add names containing a slash character to the directory
listing.  Don't allow this as it could cause all sorts of trouble.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agofuse: hotfix truncate_pagecache() issue
Maxim Patlasov [Fri, 30 Aug 2013 13:06:04 +0000 (17:06 +0400)]
fuse: hotfix truncate_pagecache() issue

commit 06a7c3c2781409af95000c60a5df743fd4e2f8b4 upstream.

The way how fuse calls truncate_pagecache() from fuse_change_attributes()
is completely wrong. Because, w/o i_mutex held, we never sure whether
'oldsize' and 'attr->size' are valid by the time of execution of
truncate_pagecache(inode, oldsize, attr->size). In fact, as soon as we
released fc->lock in the middle of fuse_change_attributes(), we completely
loose control of actions which may happen with given inode until we reach
truncate_pagecache. The list of potentially dangerous actions includes
mmap-ed reads and writes, ftruncate(2) and write(2) extending file size.

The typical outcome of doing truncate_pagecache() with outdated arguments
is data corruption from user point of view. This is (in some sense)
acceptable in cases when the issue is triggered by a change of the file on
the server (i.e. externally wrt fuse operation), but it is absolutely
intolerable in scenarios when a single fuse client modifies a file without
any external intervention. A real life case I discovered by fsx-linux
looked like this:

1. Shrinking ftruncate(2) comes to fuse_do_setattr(). The latter sends
FUSE_SETATTR to the server synchronously, but before getting fc->lock ...
2. fuse_dentry_revalidate() is asynchronously called. It sends FUSE_LOOKUP
to the server synchronously, then calls fuse_change_attributes(). The
latter updates i_size, releases fc->lock, but before comparing oldsize vs
attr->size..
3. fuse_do_setattr() from the first step proceeds by acquiring fc->lock and
updating attributes and i_size, but now oldsize is equal to
outarg.attr.size because i_size has just been updated (step 2). Hence,
fuse_do_setattr() returns w/o calling truncate_pagecache().
4. As soon as ftruncate(2) completes, the user extends file size by
write(2) making a hole in the middle of file, then reads data from the hole
either by read(2) or mmap-ed read. The user expects to get zero data from
the hole, but gets stale data because truncate_pagecache() is not executed
yet.

The scenario above illustrates one side of the problem: not truncating the
page cache even though we should. Another side corresponds to truncating
page cache too late, when the state of inode changed significantly.
Theoretically, the following is possible:

1. As in the previous scenario fuse_dentry_revalidate() discovered that
i_size changed (due to our own fuse_do_setattr()) and is going to call
truncate_pagecache() for some 'new_size' it believes valid right now. But
by the time that particular truncate_pagecache() is called ...
2. fuse_do_setattr() returns (either having called truncate_pagecache() or
not -- it doesn't matter).
3. The file is extended either by write(2) or ftruncate(2) or fallocate(2).
4. mmap-ed write makes a page in the extended region dirty.

The result will be the lost of data user wrote on the fourth step.

The patch is a hotfix resolving the issue in a simplistic way: let's skip
dangerous i_size update and truncate_pagecache if an operation changing
file size is in progress. This simplistic approach looks correct for the
cases w/o external changes. And to handle them properly, more sophisticated
and intrusive techniques (e.g. NFS-like one) would be required. I'd like to
postpone it until the issue is well discussed on the mailing list(s).

Changed in v2:
 - improved patch description to cover both sides of the issue.

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agofuse: invalidate inode attributes on xattr modification
Anand Avati [Tue, 20 Aug 2013 06:21:07 +0000 (02:21 -0400)]
fuse: invalidate inode attributes on xattr modification

commit d331a415aef98717393dda0be69b7947da08eba3 upstream.

Calls like setxattr and removexattr result in updation of ctime.
Therefore invalidate inode attributes to force a refresh.

Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agofuse: postpone end_page_writeback() in fuse_writepage_locked()
Maxim Patlasov [Mon, 12 Aug 2013 16:39:30 +0000 (20:39 +0400)]
fuse: postpone end_page_writeback() in fuse_writepage_locked()

commit 4a4ac4eba1010ef9a804569058ab29e3450c0315 upstream.

The patch fixes a race between ftruncate(2), mmap-ed write and write(2):

1) An user makes a page dirty via mmap-ed write.
2) The user performs shrinking truncate(2) intended to purge the page.
3) Before fuse_do_setattr calls truncate_pagecache, the page goes to
   writeback. fuse_writepage_locked fills FUSE_WRITE request and releases
   the original page by end_page_writeback.
4) fuse_do_setattr() completes and successfully returns. Since now, i_mutex
   is free.
5) Ordinary write(2) extends i_size back to cover the page. Note that
   fuse_send_write_pages do wait for fuse writeback, but for another
   page->index.
6) fuse_writepage_locked proceeds by queueing FUSE_WRITE request.
   fuse_send_writepage is supposed to crop inarg->size of the request,
   but it doesn't because i_size has already been extended back.

Moving end_page_writeback to the end of fuse_writepage_locked fixes the
race because now the fact that truncate_pagecache is successfully returned
infers that fuse_writepage_locked has already called end_page_writeback.
And this, in turn, infers that fuse_flush_writepages has already called
fuse_send_writepage, and the latter used valid (shrunk) i_size. write(2)
could not extend it because of i_mutex held by ftruncate(2).

Signed-off-by: Maxim Patlasov <mpatlasov@parallels.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoclk: wm831x: Initialise wm831x pointer on init
Mark Brown [Thu, 29 Aug 2013 11:21:01 +0000 (12:21 +0100)]
clk: wm831x: Initialise wm831x pointer on init

commit 08442ce993deeb15a070c14cc3f3459e87d111e0 upstream.

Otherwise any attempt to interact with the hardware will crash. This is
what happens when drivers get written blind.

Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agomtd: nand: fix NAND_BUSWIDTH_AUTO for x16 devices
Brian Norris [Thu, 18 Jul 2013 08:17:02 +0000 (01:17 -0700)]
mtd: nand: fix NAND_BUSWIDTH_AUTO for x16 devices

commit 68e8078072e802e77134664f11d2ffbfbd2f8fbe upstream.

The code for NAND_BUSWIDTH_AUTO is broken. According to Alexander:

  "I have a problem with attach NAND UBI in 16 bit mode.
   NAND works fine if I specify NAND_BUSWIDTH_16 option, but not
   working with NAND_BUSWIDTH_AUTO option. In second case NAND
   chip is identifyed with ONFI."

See his report for the rest of the details:

  http://lists.infradead.org/pipermail/linux-mtd/2013-July/047515.html

Anyway, the problem is that nand_set_defaults() is called twice, we
intend it to reset the chip functions to their x16 buswidth verions
if the buswidth changed from x8 to x16; however, nand_set_defaults()
does exactly nothing if called a second time.

Fix this by hacking nand_set_defaults() to reset the buswidth-dependent
functions if they were set to the x8 version the first time. Note that
this does not do anything to reset from x16 to x8, but that's not the
supported use case for NAND_BUSWIDTH_AUTO anyway.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Reported-by: Alexander Shiyan <shc_work@mail.ru>
Tested-by: Alexander Shiyan <shc_work@mail.ru>
Cc: Matthieu Castet <matthieu.castet@parrot.com>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoof: Fix missing memory initialization on FDT unflattening
Grant Likely [Wed, 28 Aug 2013 20:24:17 +0000 (21:24 +0100)]
of: Fix missing memory initialization on FDT unflattening

commit 0640332e073be9207f0784df43595c0c39716e42 upstream.

Any calls to dt_alloc() need to be zeroed. This is a temporary fix, but
the allocation function itself needs to zero memory before returning
it. This is a follow up to patch 9e4012752, "of: fdt: fix memory
initialization for expanded DT" which fixed one call site but missed
another.

Signed-off-by: Grant Likely <grant.likely@linaro.org>
Acked-by: Wladislav Wiebe <wladislav.kw@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agommc: tmio_mmc_dma: fix PIO fallback on SDHI
Sergei Shtylyov [Sun, 25 Aug 2013 03:38:15 +0000 (23:38 -0400)]
mmc: tmio_mmc_dma: fix PIO fallback on SDHI

commit f936f9b67b7f8c2eae01dd303a0e90bd777c4679 upstream.

I'm testing SH-Mobile SDHI driver in DMA mode with  a new DMA controller  using
'bonnie++' and getting DMA error after which the tmio_mmc_dma.c code falls back
to PIO but all commands time out after that.  It turned out that the fallback
code calls tmio_mmc_enable_dma() with RX/TX channels already freed and pointers
to them cleared, so that the function bails out early instead  of clearing the
DMA bit in the CTL_DMA_ENABLE register. The regression was introduced by commit
162f43e31c5a376ec16336e5d0ac973373d54c89 (mmc: tmio: fix a deadlock).
Moving tmio_mmc_enable_dma() calls to the top of the PIO fallback code in
tmio_mmc_start_dma_{rx|tx}() helps.

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agorbd: fix I/O error propagation for reads
Josh Durgin [Tue, 27 Aug 2013 00:55:38 +0000 (17:55 -0700)]
rbd: fix I/O error propagation for reads

commit 17c1cc1d9293a568a00545469078e29555cc7f39 upstream.

When a request returns an error, the driver needs to report the entire
extent of the request as completed.  Writes already did this, since
they always set xferred = length, but reads were skipping that step if
an error other than -ENOENT occurred.  Instead, rbd would end up
passing 0 xferred to blk_end_request(), which would always report
needing more data.  This resulted in an assert failing when more data
was required by the block layer, but all the object requests were
done:

[ 1868.719077] rbd: obj_request read result -108 xferred 0
[ 1868.719077]
[ 1868.719518] end_request: I/O error, dev rbd1, sector 0
[ 1868.719739]
[ 1868.719739] Assertion failure in rbd_img_obj_callback() at line 1736:
[ 1868.719739]
[ 1868.719739]   rbd_assert(more ^ (which == img_request->obj_request_count));

Without this assert, reads that hit errors would hang forever, since
the block layer considered them incomplete.

Fixes: http://tracker.ceph.com/issues/5647
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Reviewed-by: Alex Elder <alex.elder@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoceph: Don't forget the 'up_read(&osdc->map_sem)' if met error.
majianpeng [Tue, 16 Jul 2013 11:36:21 +0000 (19:36 +0800)]
ceph: Don't forget the 'up_read(&osdc->map_sem)' if met error.

commit 494ddd11be3e2621096bb425eed2886f8e8446d4 upstream.

Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agolibceph: use pg_num_mask instead of pgp_num_mask for pg.seed calc
Sage Weil [Thu, 29 Aug 2013 00:17:29 +0000 (17:17 -0700)]
libceph: use pg_num_mask instead of pgp_num_mask for pg.seed calc

commit 9542cf0bf9b1a3adcc2ef271edbcbdba03abf345 upstream.

Fix a typo that used the wrong bitmask for the pg.seed calculation.  This
is normally unnoticed because in most cases pg_num == pgp_num.  It is, however,
a bug that is easily corrected.

Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <alex.elder@linary.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agolibceph: unregister request in __map_request failed and nofail == false
majianpeng [Tue, 16 Jul 2013 07:45:48 +0000 (15:45 +0800)]
libceph: unregister request in __map_request failed and nofail == false

commit 73d9f7eef3d98c3920e144797cc1894c6b005a1e upstream.

For nofail == false request, if __map_request failed, the caller does
cleanup work, like releasing the relative pages.  It doesn't make any sense
to retry this request.

Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoum: Implement probe_kernel_read()
Richard Weinberger [Sat, 17 Aug 2013 16:46:00 +0000 (18:46 +0200)]
um: Implement probe_kernel_read()

commit f75b1b1bedfb498cc43a992ce4d7ed8df3b1e770 upstream.

UML needs it's own probe_kernel_read() to handle kernel
mode faults correctly.
The implementation uses mincore() on the host side to detect
whether a page is owned by the UML kernel process.

This fixes also a possible crash when sysrq-t is used.
Starting with 3.10 sysrq-t calls probe_kernel_read() to
read details from the kernel workers. As kernel worker are
completely async pointers may turn NULL while reading them.

Signed-off-by: Richard Weinberger <richard@nod.at>
Cc: <stian@nixia.no>
Cc: <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agodrm/edid: add quirk for Medion MD30217PG
Alex Deucher [Mon, 12 Aug 2013 15:04:29 +0000 (11:04 -0400)]
drm/edid: add quirk for Medion MD30217PG

commit 118bdbd86b39dbb843155054021d2c59058f1e05 upstream.

This LCD monitor (1280x1024 native) has a completely
bogus detailed timing (640x350@70hz).  User reports that
1280x1024@60 has waves so prefer 1280x1024@75.

Manufacturer: MED  Model: 7b8  Serial#: 99188
Year: 2005  Week: 5
EDID Version: 1.3
Analog Display Input,  Input Voltage Level: 0.700/0.700 V
Sync:  Separate
Max Image Size [cm]: horiz.: 34  vert.: 27
Gamma: 2.50
DPMS capabilities: Off; RGB/Color Display
First detailed timing is preferred mode
redX: 0.645 redY: 0.348   greenX: 0.280 greenY: 0.605
blueX: 0.142 blueY: 0.071   whiteX: 0.313 whiteY: 0.329
Supported established timings:
720x400@70Hz
640x480@60Hz
640x480@72Hz
640x480@75Hz
800x600@56Hz
800x600@60Hz
800x600@72Hz
800x600@75Hz
1024x768@60Hz
1024x768@70Hz
1024x768@75Hz
1280x1024@75Hz
Manufacturer's mask: 0
Supported standard timings:
Supported detailed timing:
clock: 25.2 MHz   Image Size:  337 x 270 mm
h_active: 640  h_sync: 688  h_sync_end 784 h_blank_end 800 h_border: 0
v_active: 350  v_sync: 350  v_sync_end 352 v_blanking: 449 v_border: 0
Monitor name: MD30217PG
Ranges: V min: 56 V max: 76 Hz, H min: 30 H max: 83 kHz, PixClock max 145 MHz
Serial No: 501099188
EDID (in hex):
          00ffffffffffff0034a4b80774830100
          050f010368221b962a0c55a559479b24
          125054afcf00310a0101010101018180
          000000000000d60980a0205e63103060
          0200510e1100001e000000fc004d4433
          3032313750470a202020000000fd0038
          4c1e530e000a202020202020000000ff
          003530313039393138380a2020200078

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reported-by: friedrich@mailstation.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoamd64_edac: Fix single-channel setups
Borislav Petkov [Tue, 23 Jul 2013 18:01:23 +0000 (20:01 +0200)]
amd64_edac: Fix single-channel setups

commit f0a56c480196a98479760862468cc95879df3de0 upstream.

It can happen that configurations are running in a single-channel mode
even with a dual-channel memory controller, by, say, putting the DIMMs
only on the one channel and leaving the other empty. This causes a
problem in init_csrows which implicitly assumes that when the second
channel is enabled, i.e. channel 1, the struct dimm hierarchy will be
present. Which is not.

So always allocate two channels unconditionally.

This provides for the nice side effect that the data structures are
initialized so some day, when memory hotplug is supported, it should
just work out of the box when all of a sudden a second channel appears.

Reported-and-tested-by: Roger Leigh <rleigh@debian.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoisofs: Refuse RW mount of the filesystem instead of making it RO
Jan Kara [Thu, 25 Jul 2013 09:49:11 +0000 (11:49 +0200)]
isofs: Refuse RW mount of the filesystem instead of making it RO

commit 17b7f7cf58926844e1dd40f5eb5348d481deca6a upstream.

Refuse RW mount of isofs filesystem. So far we just silently changed it
to RO mount but when the media is writeable, block layer won't notice
this change and thus will think device is used RW and will block eject
button of the drive. That is unexpected by users because for
non-writeable media eject button works just fine.

Userspace mount(8) command handles this just fine and retries mounting
with MS_RDONLY set so userspace shouldn't see any regression.  Plus any
tool mounting isofs is likely confronted with the case of read-only
media where block layer already refuses to mount the filesystem without
MS_RDONLY set so our behavior shouldn't be anything new for it.

Reported-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoproc: Restrict mounting the proc filesystem
Eric W. Biederman [Tue, 26 Mar 2013 02:57:10 +0000 (19:57 -0700)]
proc: Restrict mounting the proc filesystem

commit aee1c13dd0f6c2fc56e0e492b349ee8ac655880f upstream.

Don't allow mounting the proc filesystem unless the caller has
CAP_SYS_ADMIN rights over the pid namespace.  The principle here is if
you create or have capabilities over it you can mount it, otherwise
you get to live with what other people have mounted.

Andy pointed out that this is needed to prevent users in a user
namespace from remounting proc and specifying different hidepid and gid
options on already existing proc mounts.

Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agomm/huge_memory.c: fix potential NULL pointer dereference
Libin [Wed, 11 Sep 2013 21:20:38 +0000 (14:20 -0700)]
mm/huge_memory.c: fix potential NULL pointer dereference

commit a8f531ebc33052642b4bd7b812eedf397108ce64 upstream.

In collapse_huge_page() there is a race window between releasing the
mmap_sem read lock and taking the mmap_sem write lock, so find_vma() may
return NULL.  So check the return value to avoid NULL pointer dereference.

collapse_huge_page
khugepaged_alloc_page
up_read(&mm->mmap_sem)
down_write(&mm->mmap_sem)
vma = find_vma(mm, address)

Signed-off-by: Libin <huawei.libin@huawei.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agomemcg: fix multiple large threshold notifications
Greg Thelen [Wed, 11 Sep 2013 21:23:08 +0000 (14:23 -0700)]
memcg: fix multiple large threshold notifications

commit 2bff24a3707093c435ab3241c47dcdb5f16e432b upstream.

A memory cgroup with (1) multiple threshold notifications and (2) at least
one threshold >=2G was not reliable.  Specifically the notifications would
either not fire or would not fire in the proper order.

The __mem_cgroup_threshold() signaling logic depends on keeping 64 bit
thresholds in sorted order.  mem_cgroup_usage_register_event() sorts them
with compare_thresholds(), which returns the difference of two 64 bit
thresholds as an int.  If the difference is positive but has bit[31] set,
then sort() treats the difference as negative and breaks sort order.

This fix compares the two arbitrary 64 bit thresholds returning the
classic -1, 0, 1 result.

The test below sets two notifications (at 0x1000 and 0x81001000):
  cd /sys/fs/cgroup/memory
  mkdir x
  for x in 4096 2164264960; do
    cgroup_event_listener x/memory.usage_in_bytes $x | sed "s/^/$x listener:/" &
  done
  echo $$ > x/cgroup.procs
  anon_leaker 500M

v3.11-rc7 fails to signal the 4096 event listener:
  Leaking...
  Done leaking pages.

Patched v3.11-rc7 properly notifies:
  Leaking...
  4096 listener:2013:8:31:14:13:36
  Done leaking pages.

The fixed bug is old.  It appears to date back to the introduction of
memcg threshold notifications in v2.6.34-rc1-116-g2e72b6347c94 "memcg:
implement memory thresholds"

Signed-off-by: Greg Thelen <gthelen@google.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoocfs2: fix the end cluster offset of FIEMAP
Jie Liu [Wed, 11 Sep 2013 21:20:05 +0000 (14:20 -0700)]
ocfs2: fix the end cluster offset of FIEMAP

commit 28e8be31803b19d0d8f76216cb11b480b8a98bec upstream.

Call fiemap ioctl(2) with given start offset as well as an desired mapping
range should show extents if possible.  However, we somehow figure out the
end offset of mapping via 'mapping_end -= cpos' before iterating the
extent records which would cause problems if the given fiemap length is
too small to a cluster size, e.g,

Cluster size 4096:
debugfs.ocfs2 1.6.3
        Block Size Bits: 12   Cluster Size Bits: 12

The extended fiemap test utility From David:
https://gist.github.com/anonymous/6172331

# dd if=/dev/urandom of=/ocfs2/test_file bs=1M count=1000
# ./fiemap /ocfs2/test_file 4096 10
start: 4096, length: 10
File /ocfs2/test_file has 0 extents:
# Logical          Physical         Length           Flags
^^^^^ <-- No extent is shown

In this case, at ocfs2_fiemap(): cpos == mapping_end == 1. Hence the
loop of searching extent records was not executed at all.

This patch remove the in question 'mapping_end -= cpos', and loops
until the cpos is larger than the mapping_end as usual.

# ./fiemap /ocfs2/test_file 4096 10
start: 4096, length: 10
File /ocfs2/test_file has 1 extents:
# Logical          Physical         Length           Flags
0: 0000000000000000 0000000056a01000 0000000006a00000 0000

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reported-by: David Weber <wb@munzinger.de>
Tested-by: David Weber <wb@munzinger.de>
Cc: Sunil Mushran <sunil.mushran@gmail.com>
Cc: Mark Fashen <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agopidns: fix vfork() after unshare(CLONE_NEWPID)
Oleg Nesterov [Wed, 11 Sep 2013 21:19:38 +0000 (14:19 -0700)]
pidns: fix vfork() after unshare(CLONE_NEWPID)

commit e79f525e99b04390ca4d2366309545a836c03bf1 upstream.

Commit 8382fcac1b81 ("pidns: Outlaw thread creation after
unshare(CLONE_NEWPID)") nacks CLONE_VM if the forking process unshared
pid_ns, this obviously breaks vfork:

int main(void)
{
assert(unshare(CLONE_NEWUSER | CLONE_NEWPID) == 0);
assert(vfork() >= 0);
_exit(0);
return 0;
}

fails without this patch.

Change this check to use CLONE_SIGHAND instead.  This also forbids
CLONE_THREAD automatically, and this is what the comment implies.

We could probably even drop CLONE_SIGHAND and use CLONE_THREAD, but it
would be safer to not do this.  The current check denies CLONE_SIGHAND
implicitely and there is no reason to change this.

Eric said "CLONE_SIGHAND is fine.  CLONE_THREAD would be even better.
Having shared signal handling between two different pid namespaces is
the case that we are fundamentally guarding against."

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reported-by: Colin Walters <walters@redhat.com>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agopidns: Fix hang in zap_pid_ns_processes by sending a potentially extra wakeup
Eric W. Biederman [Thu, 29 Aug 2013 20:56:50 +0000 (13:56 -0700)]
pidns: Fix hang in zap_pid_ns_processes by sending a potentially extra wakeup

commit a606488513543312805fab2b93070cefe6a3016c upstream.

Serge Hallyn <serge.hallyn@ubuntu.com> writes:

> Since commit af4b8a83add95ef40716401395b44a1b579965f4 it's been
> possible to get into a situation where a pidns reaper is
> <defunct>, reparented to host pid 1, but never reaped.  How to
> reproduce this is documented at
>
> https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1168526
> (and see
> https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1168526/comments/13)
> In short, run repeated starts of a container whose init is
>
> Process.exit(0);
>
> sysrq-t when such a task is playing zombie shows:
>
> [  131.132978] init            x ffff88011fc14580     0  2084   2039 0x00000000
> [  131.132978]  ffff880116e89ea8 0000000000000002 ffff880116e89fd8 0000000000014580
> [  131.132978]  ffff880116e89fd8 0000000000014580 ffff8801172a0000 ffff8801172a0000
> [  131.132978]  ffff8801172a0630 ffff88011729fff0 ffff880116e14650 ffff88011729fff0
> [  131.132978] Call Trace:
> [  131.132978]  [<ffffffff816f6159>] schedule+0x29/0x70
> [  131.132978]  [<ffffffff81064591>] do_exit+0x6e1/0xa40
> [  131.132978]  [<ffffffff81071eae>] ? signal_wake_up_state+0x1e/0x30
> [  131.132978]  [<ffffffff8106496f>] do_group_exit+0x3f/0xa0
> [  131.132978]  [<ffffffff810649e4>] SyS_exit_group+0x14/0x20
> [  131.132978]  [<ffffffff8170102f>] tracesys+0xe1/0xe6
>
> Further debugging showed that every time this happened, zap_pid_ns_processes()
> started with nr_hashed being 3, while we were expecting it to drop to 2.
> Any time it didn't happen, nr_hashed was 1 or 2.  So the reaper was
> waiting for nr_hashed to become 2, but free_pid() only wakes the reaper
> if nr_hashed hits 1.

The issue is that when the task group leader of an init process exits
before other tasks of the init process when the init process finally
exits it will be a secondary task sleeping in zap_pid_ns_processes and
waiting to wake up when the number of hashed pids drops to two.  This
case waits forever as free_pid only sends a wake up when the number of
hashed pids drops to 1.

To correct this the simple strategy of sending a possibly unncessary
wake up when the number of hashed pids drops to 2 is adopted.

Sending one extraneous wake up is relatively harmless, at worst we
waste a little cpu time in the rare case when a pid namespace
appropaches exiting.

We can detect the case when the pid namespace drops to just two pids
hashed race free in free_pid.

Dereferencing pid_ns->child_reaper with the pidmap_lock held is safe
without out the tasklist_lock because it is guaranteed that the
detach_pid will be called on the child_reaper before it is freed and
detach_pid calls __change_pid which calls free_pid which takes the
pidmap_lock.  __change_pid only calls free_pid if this is the
last use of the pid.  For a thread that is not the thread group leader
the threads pid will only ever have one user because a threads pid
is not allowed to be the pid of a process, of a process group or
a session.  For a thread that is a thread group leader all of
the other threads of that process will be reaped before it is allowed
for the thread group leader to be reaped ensuring there will only
be one user of the threads pid as a process pid.  Furthermore
because the thread is the init process of a pid namespace all of the
other processes in the pid namespace will have also been already freed
leading to the fact that the pid will not be used as a session pid or
a process group pid for any other running process.

Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Tested-by: Serge Hallyn <serge.hallyn@canonical.com>
Reported-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agointel-iommu: Fix leaks in pagetable freeing
Alex Williamson [Sat, 15 Jun 2013 16:27:19 +0000 (10:27 -0600)]
intel-iommu: Fix leaks in pagetable freeing

commit 3269ee0bd6686baf86630300d528500ac5b516d7 upstream.

At best the current code only seems to free the leaf pagetables and
the root.  If you're unlucky enough to have a large gap (like any
QEMU guest with more than 3G of memory), only the first chunk of leaf
pagetables are freed (plus the root).  This is a massive memory leak.
This patch re-writes the pagetable freeing function to use a
recursive algorithm and manages to not only free all the pagetables,
but does it without any apparent performance loss versus the current
broken version.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agotarget: Fix >= v3.9+ regression in PR APTPL + ALUA metadata write-out
Gera Kazakov [Mon, 9 Sep 2013 22:47:06 +0000 (15:47 -0700)]
target: Fix >= v3.9+ regression in PR APTPL + ALUA metadata write-out

commit f730f9158f6ee7b5c4d892af6b51a72194445ea4 upstream.

This patch fixes a >= v3.9+ regression in __core_scsi3_write_aptpl_to_file()
+ core_alua_write_tpg_metadata() write-out, where a return value of -EIO was
incorrectly being returned upon success.

This bug was originally introduced in:

commit 0e9b10a90f1c30f25dd6f130130240745ab14010
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Sat Feb 23 15:22:43 2013 -0500

    target: writev() on single-element vector is pointless

However, given that the return of core_scsi3_update_and_write_aptpl()
was not used to determine if a command should be returned with non GOOD
status, this bug was not being triggered in PR logic until v3.11-rc1 by
commit:

commit 459f213ba162bd13e113d6f92a8fa6c780fd67ed
Author: Andy Grover <agrover@redhat.com>
Date:   Thu May 16 10:41:02 2013 -0700

    target: Allocate aptpl_buf inside update_and_write_aptpl()

So, go ahead and only return -EIO if kernel_write() returned a
negative value.

Reported-by: Gera Kazakov <gkazakov@msn.com>
Signed-off-by: Gera Kazakov <gkazakov@msn.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoMIPS: ath79: Fix ar933x watchdog clock
Felix Fietkau [Wed, 28 Aug 2013 08:41:42 +0000 (10:41 +0200)]
MIPS: ath79: Fix ar933x watchdog clock

commit a1191927ace7e6f827132aa9e062779eb3f11fa5 upstream.

The watchdog device on the AR933x is connected to
the AHB clock, however the current code uses the
reference clock. Due to the wrong rate, the watchdog
driver can't calculate correct register values for
a given timeout value and the watchdog unexpectedly
restarts the system.

The code uses the wrong value since the initial
commit 04225e1d227c8e68d685936ecf42ac175fec0e54
(MIPS: ath79: add AR933X specific clock init)

The patch fixes the code to use the correct clock
rate to avoid the problem.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5777/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agoleds: wm831x-status: Request a REG resource
Mark Brown [Thu, 29 Aug 2013 14:18:14 +0000 (07:18 -0700)]
leds: wm831x-status: Request a REG resource

commit 61abeba5222895d6900b13115f5d8eba7988d7d6 upstream.

The wm831x-status driver was not converted to use a REG resource when they
were introduced and the rest of the wm831x drivers converted, causing it
to fail to probe due to requesting the wrong resource type.

Signed-off-by: Mark Brown <broonie@linaro.org>
Signed-off-by: Bryan Wu <cooloney@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
10 years agouprobes: Fix utask->depth accounting in handle_trampoline()
Oleg Nesterov [Wed, 11 Sep 2013 15:47:26 +0000 (17:47 +0200)]
uprobes: Fix utask->depth accounting in handle_trampoline()

commit 878b5a6efd38030c7a90895dc8346e8fb1e09b4c upstream.

Currently utask->depth is simply the number of allocated/pending
return_instance's in uprobe_task->return_instances list.

handle_trampoline() should decrement this counter every time we
handle/free an instance, but due to typo it does this only if
->chained == T. This means that in the likely case this counter
is never decremented and the probed task can't report more than
MAX_URETPROBE_DEPTH events.

Reported-by: Mikhail Kulemin <Mikhail.Kulemin@ru.ibm.com>
Reported-by: Hemant Kumar Shaw <hkshaw@linux.vnet.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Anton Arapov <anton@redhat.com>
Cc: masami.hiramatsu.pt@hitachi.com
Cc: srikar@linux.vnet.ibm.com
Cc: systemtap@sourceware.org
Link: http://lkml.kernel.org/r/20130911154726.GA8093@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>