]> git.itanic.dy.fi Git - linux-stable/log
linux-stable
15 years agoLinux 2.6.28.3 v2.6.28.3
Greg Kroah-Hartman [Mon, 2 Feb 2009 18:12:10 +0000 (10:12 -0800)]
Linux 2.6.28.3

15 years agorelay: fix lock imbalance in relay_late_setup_files
Jiri Slaby [Sat, 17 Jan 2009 11:04:36 +0000 (12:04 +0100)]
relay: fix lock imbalance in relay_late_setup_files

commit b786c6a98ef6fa81114ba7b9fbfc0d67060775e3 upstream.

One fail path in relay_late_setup_files() omits
mutex_unlock(&relay_channels_mutex);
Add it.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoNET: net_namespace, fix lock imbalance
Jiri Slaby [Sat, 17 Jan 2009 06:47:12 +0000 (06:47 +0000)]
NET: net_namespace, fix lock imbalance

commit 357f5b0b91054ae23385ea4b0634bb8b43736e83 upstream.

register_pernet_gen_subsys omits mutex_unlock in one fail path.
Fix it.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agodmaengine: fix dependency chaining
Yuri Tikhonov [Thu, 29 Jan 2009 12:37:13 +0000 (15:37 +0300)]
dmaengine: fix dependency chaining

commit dd59b8537f6cb53ab863fafad86a5828f1e889a2 upstream

 ASYNC_TX: fix dependency chaining

 In ASYNC_TX we track the dependencies between the descriptors
using the 'next' pointers of the structures. These pointers are
set to NULL as soon as the corresponding descriptor has been
submitted to the channel (in async_tx_run_dependencies()).
 But, the first 'next' in chain still remains set, regardless
the fact, that tx->next is already submitted. This may lead to
multiple submisions of the same descriptor. This patch fixes this.

Signed-off-by: Yuri Tikhonov <yur@emcraft.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoPCI hotplug: fix lock imbalance in pciehp
Jiri Slaby [Sat, 17 Jan 2009 15:23:55 +0000 (16:23 +0100)]
PCI hotplug: fix lock imbalance in pciehp

commit c2fdd36b550659f5ac2240d1f5a83ffa1a092289 upstream.

set_lock_status omits mutex_unlock in fail path. Add the omitted
unlock.

As a result a lockup caused by this can be triggered from userspace
by writing 1 to /sys/bus/pci/slots/.../lock often enough.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agox86, pat: fix PTE corruption issue while mapping RAM using /dev/mem
Suresh Siddha [Thu, 29 Jan 2009 00:51:53 +0000 (16:51 -0800)]
x86, pat: fix PTE corruption issue while mapping RAM using /dev/mem

commit 9597134218300c045cf219be3664615e97cb239c upstream.

Beschorner Daniel reported:
> hwinfo problem since 2.6.28, showing this in the oops:
>   Corrupted page table at address 7fd04de3ec00

Also, PaX Team reported a regression with this commit:

>   commit 9542ada803198e6eba29d3289abb39ea82047b92
>   Author: Suresh Siddha <suresh.b.siddha@intel.com>
>   Date:   Wed Sep 24 08:53:33 2008 -0700
>
>       x86: track memtype for RAM in page struct

This commit breaks mapping any RAM page through /dev/mem, as the
reserve_memtype() was not initializing the return attribute type and as such
corrupting the PTE entry that was setup with the return attribute type.

Because of this bug, application mapping this RAM page through /dev/mem
will die with "Corrupted page table at address xxxx" message in the kernel
log and also the kernel identity mapping which maps the underlying RAM
page gets converted to UC.

Fix this by initializing the return attribute type before calling
reserve_ram_pages_type()

Reported-by: PaX Team <pageexec@freemail.hu>
Reported-and-tested-by: Beschorner Daniel <Daniel.Beschorner@facton.com>
Tested-and-Acked-by: PaX Team <pageexec@freemail.hu>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agox86, pat: fix reserve_memtype() for legacy 1MB range
Suresh Siddha [Thu, 29 Jan 2009 00:51:52 +0000 (16:51 -0800)]
x86, pat: fix reserve_memtype() for legacy 1MB range

commit 5cca0cf15a94417f49625ce52e23589eed0a1675 upstream

Thierry Vignaud reported:
> http://bugzilla.kernel.org/show_bug.cgi?id=12372
>
> On P4 with an SiS motherboard (video card is a SiS 651)
> X server fails to start with error:
> xf86MapVidMem: Could not mmap framebuffer (0x00000000,0x2000) (Invalid
> argument)

Here X is trying to map first 8KB of memory using /dev/mem. Existing
code treats first 0-4KB of memory as non-RAM and 4KB-8KB as RAM. Recent
code changes don't allow to map memory with different attributes
at the same time.

Fix this by treating the first 1MB legacy region as special and always
track the attribute requests with in this region using linear linked
list (and don't bother if the range is RAM or non-RAM or mixed)

Reported-and-tested-by: Thierry Vignaud <tvignaud@mandriva.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agocrypto: ccm - Fix handling of null assoc data
Jarod Wilson [Thu, 22 Jan 2009 08:58:15 +0000 (19:58 +1100)]
crypto: ccm - Fix handling of null assoc data

commit 516280e735b034216de97eb7ba080ec6acbfc58f upstream.

Its a valid use case to have null associated data in a ccm vector, but
this case isn't being handled properly right now.

The following ccm decryption/verification test vector, using the
rfc4309 implementation regularly triggers a panic, as will any
other vector with null assoc data:

* key: ab2f8a74b71cd2b1ff802e487d82f8b9
* iv: c6fb7d800d13abd8a6b2d8
* Associated Data: [NULL]
* Tag Length: 8
* input: d5e8939fc7892e2b

The resulting panic looks like so:

Unable to handle kernel paging request at ffff810064ddaec0 RIP:
 [<ffffffff8864c4d7>] :ccm:get_data_to_compute+0x1a6/0x1d6
PGD 8063 PUD 0
Oops: 0002 [1] SMP
last sysfs file: /module/libata/version
CPU 0
Modules linked in: crypto_tester_kmod(U) seqiv krng ansi_cprng chainiv rng ctr aes_generic aes_x86_64 ccm cryptomgr testmgr_cipher testmgr aead crypto_blkcipher crypto_a
lgapi des ipv6 xfrm_nalgo crypto_api autofs4 hidp l2cap bluetooth nfs lockd fscache nfs_acl sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink xt_
tcpudp iptable_filter ip_tables x_tables dm_mirror dm_log dm_multipath scsi_dh dm_mod video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac lp sg
snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss joydev snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss ide_cd snd_pcm floppy parport_p
c shpchp e752x_edac snd_timer e1000 i2c_i801 edac_mc snd soundcore snd_page_alloc i2c_core cdrom parport serio_raw pcspkr ata_piix libata sd_mod scsi_mod ext3 jbd uhci_h
cd ohci_hcd ehci_hcd
Pid: 12844, comm: crypto-tester Tainted: G      2.6.18-128.el5.fips1 #1
RIP: 0010:[<ffffffff8864c4d7>]  [<ffffffff8864c4d7>] :ccm:get_data_to_compute+0x1a6/0x1d6
RSP: 0018:ffff8100134434e8  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8100104898b0 RCX: ffffffffab6aea10
RDX: 0000000000000010 RSI: ffff8100104898c0 RDI: ffff810064ddaec0
RBP: 0000000000000000 R08: ffff8100104898b0 R09: 0000000000000000
R10: ffff8100103bac84 R11: ffff8100104898b0 R12: ffff810010489858
R13: ffff8100104898b0 R14: ffff8100103bac00 R15: 0000000000000000
FS:  00002ab881adfd30(0000) GS:ffffffff803ac000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffff810064ddaec0 CR3: 0000000012a88000 CR4: 00000000000006e0
Process crypto-tester (pid: 12844, threadinfo ffff810013442000, task ffff81003d165860)
Stack:  ffff8100103bac00 ffff8100104898e8 ffff8100134436f8 ffffffff00000000
 0000000000000000 ffff8100104898b0 0000000000000000 ffff810010489858
 0000000000000000 ffff8100103bac00 ffff8100134436f8 ffffffff8864c634
Call Trace:
 [<ffffffff8864c634>] :ccm:crypto_ccm_auth+0x12d/0x140
 [<ffffffff8864cf73>] :ccm:crypto_ccm_decrypt+0x161/0x23a
 [<ffffffff88633643>] :crypto_tester_kmod:cavs_test_rfc4309_ccm+0x4a5/0x559
[...]

The above is from a RHEL5-based kernel, but upstream is susceptible too.

The fix is trivial: in crypto/ccm.c:crypto_ccm_auth(), pctx->ilen contains
whatever was in memory when pctx was allocated if assoclen is 0. The tested
fix is to simply add an else clause setting pctx->ilen to 0 for the
assoclen == 0 case, so that get_data_to_compute() doesn't try doing
things its not supposed to.

Signed-off-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agocrypto: authenc - Fix zero-length IV crash
Herbert Xu [Tue, 13 Jan 2009 00:26:18 +0000 (11:26 +1100)]
crypto: authenc - Fix zero-length IV crash

commit 29b37f42127f7da511560a40ea74f5047da40c13 upstream.

As it is if an algorithm with a zero-length IV is used (e.g.,
NULL encryption) with authenc, authenc may generate an SG entry
of length zero, which will trigger a BUG check in the hash layer.

This patch fixes it by skipping the IV SG generation if the IV
size is zero.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoALSA: hda - Add quirk for HP DV6700 laptop
Joerg Schirottke [Tue, 27 Jan 2009 10:01:34 +0000 (11:01 +0100)]
ALSA: hda - Add quirk for HP DV6700 laptop

commit aa9d823bb347fb66cb07f98c686be8bb85cb6a74 upstream.

Added the matching model=laptop for HP DV6700 laptop.

Signed-off-by: Joerg Schirottke <master@kanotix.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoALSA: hda - add another MacBook Pro 4, 1 subsystem ID
Luke Yelavich [Wed, 28 Jan 2009 04:58:38 +0000 (15:58 +1100)]
ALSA: hda - add another MacBook Pro 4, 1 subsystem ID

commit 2a88464ceb1bda2571f88902fd8068a6168e3f7b upstream.

Add another MacBook Pro 4,1 SSID (106b:3800). It seems that latter revisions,
(at least mine), have different IDs to earlier revisions.

Signed-off-by: Luke Yelavich <themuso@ubuntu.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoALSA: hda - Fix PCM reference NID for STAC/IDT analog outputs
Takashi Iwai [Fri, 23 Jan 2009 10:55:42 +0000 (11:55 +0100)]
ALSA: hda - Fix PCM reference NID for STAC/IDT analog outputs

commit 00a602db1ce9d61319d6f769dee206ec85f19bda upstream.

The reference NID for the analog outputs of STAC/IDT codecs is set
to a fixed number 0x02.  But this isn't always correct and in many
codecs it points to a non-existing NID.

This patch fixes the initialization of the PCM reference NID taken
from the actually probed DAC list.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoinclude/linux: Add bsg.h to the Kernel exported headers
Boaz Harrosh [Mon, 19 Jan 2009 09:37:38 +0000 (10:37 +0100)]
include/linux: Add bsg.h to the Kernel exported headers

commit a229fc61ef0ee3c30fd193beee0eeb87410227f1 upstream.

bsg.h in current form is perfectly suitable for user-mode
consumption. It is needed together with scsi/sg.h for applications
that want to interface with the bsg driver.

Currently the few projects that use it would copy it over into
the projects. But that is not acceptable for projects that need
to provide source and devel packages for distros.

This should also be submitted to stable 2.6.28 and 2.6.27 since bsg had
a stable API since these Kernels and distro users will need the header
for these kernels a swell

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosgi-xpc: ensure flags are updated before bte_copy
Robin Holt [Thu, 29 Jan 2009 22:25:06 +0000 (14:25 -0800)]
sgi-xpc: ensure flags are updated before bte_copy

commit 69b3bb65fa97a1e8563518dbbc35cd57beefb2d4 upstream.

The clearing of the msg->flags needs a barrier between it and the notify
of the channel threads that the messages are cleaned and ready for use.

Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosgi-xpc: Remove NULL pointer dereference.
Robin Holt [Thu, 29 Jan 2009 22:25:07 +0000 (14:25 -0800)]
sgi-xpc: Remove NULL pointer dereference.

commit 17e2161654da4e6bdfd8d53d4f52e820ee93f423 upstream.

If the bte copy fails, the attempt to retrieve payloads merely returns a
null pointer deref and not NULL as was expected.

Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agogpiolib: fix request related issue
Magnus Damm [Thu, 29 Jan 2009 22:25:12 +0000 (14:25 -0800)]
gpiolib: fix request related issue

commit 7460db567bbca76bf087d1694d792a1a96bdaa26 upstream.

Fix request-already-requested handling in gpio_request().

Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoepoll: drop max_user_instances and rely only on max_user_watches
Davide Libenzi [Thu, 29 Jan 2009 22:25:26 +0000 (14:25 -0800)]
epoll: drop max_user_instances and rely only on max_user_watches

commit 9df04e1f25effde823a600e755b51475d438f56b upstream.

Linus suggested to put limits where the money is, and max_user_watches
already does that w/out the need of max_user_instances.  That has the
advantage to mitigate the potential DoS while allowing pretty generous
default behavior.

Allowing top 4% of low memory (per user) to be allocated in epoll watches,
we have:

LOMEM    MAX_WATCHES (per user)
512MB    ~178000
1GB      ~356000
2GB      ~712000

A box with 512MB of lomem, will meet some challenge in hitting 180K
watches, socket buffers math teaches us.  No more max_user_instances
limits then.

Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: Bron Gondwana <brong@fastmail.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agortl8187: Fix error in setting OFDM power settings for RTL8187L
Larry Finger [Tue, 27 Jan 2009 18:31:23 +0000 (12:31 -0600)]
rtl8187: Fix error in setting OFDM power settings for RTL8187L

commit eb83bbf57429ab80f49b413e3e44d3b19c3fdc5a upstream.

After reports of poor performance, a review of the latest vendor driver
(rtl8187_linux_26.1025.0328.2007) for RTL8187L devices was undertaken.

A difference was found in the code used to index the OFDM power tables. When
the Linux driver was changed, my unit works at a much greater range than
before. I think this fixes Bugzilla #12380 and has been tested by at least
two other users.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Tested-by: Martín Ernesto Barreyro <barreyromartin@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoext3: Add sanity check to make_indexed_dir
Theodore Ts'o [Fri, 16 Jan 2009 16:13:47 +0000 (11:13 -0500)]
ext3: Add sanity check to make_indexed_dir

commit a21102b55c4f8dfd3adb4a15a34cd62237b46039 upstream.

Make sure the rec_len field in the '..' entry is sane, lest we overrun
the directory block and cause a kernel oops on a purposefully
corrupted filesystem.

This fixes a bug related to a bug originally reported by Sami Liedes
for ext4 at:

http://bugzilla.kernel.org/show_bug.cgi?id=12430

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agobnx2x: Block nvram access when the device is inactive
Eilon Greenstein [Wed, 14 Jan 2009 06:44:07 +0000 (06:44 +0000)]
bnx2x: Block nvram access when the device is inactive

commit 2add3acb11a26cc14b54669433ae6ace6406cbf2 upstream.

Don't dump eeprom when bnx2x adapter is down.  Running ethtool -e causes an eeh
without it when the device is down

Signed-off-by: Paul Larson <pl@linux.vnet.ibm.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoFix OOPS in mmap_region() when merging adjacent VM_LOCKED file segments
Andrew Morton [Wed, 28 Jan 2009 21:43:50 +0000 (13:43 -0800)]
Fix OOPS in mmap_region() when merging adjacent VM_LOCKED file segments

This patch differs from the upstream commit
de33c8db5910cda599899dd431cc30d7c1018cbf written by Linus, as it aims to
only prevent the oops from happening, not attempt to change anything
else.

The problem was introduced by commit
ba470de43188cdbff795b5da43a1474523c6c2fb

which added new references to *vma after we've potentially freed it.

From: Andrew Morton <akpm@linux-foundation.org>
Reported-by: Maksim Yevmenkin <maksim.yevmenkin@gmail.com>
Tested-by: Maksim Yevmenkin <maksim.yevmenkin@gmail.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agodrm: stash AGP include under the do-we-have-AGP ifdef
Eric Anholt [Thu, 15 Jan 2009 09:16:25 +0000 (01:16 -0800)]
drm: stash AGP include under the do-we-have-AGP ifdef

commit 1bb88edb7a3769992026f34fd648bb459b0469aa upstream.

This fixes the MIPS with DRM build.

Signed-off-by: Eric Anholt <eric@anholt.net>
Tested-by: Martin Michlmayr <tbm@cyrius.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoserial_8250: support for Sealevel Systems Model 7803 COMM+8
Flavio Leitner [Fri, 2 Jan 2009 13:50:43 +0000 (13:50 +0000)]
serial_8250: support for Sealevel Systems Model 7803 COMM+8

commit e65f0f8271b1b0452334e5da37fd35413a000de4 upstream.

Add support for Sealevel Systems Model 7803 COMM+8

Signed-off-by: Flavio Leitner <fleitner@redhat.com>
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agolibata: pata_via: support VX855, future chips whose IDE controller use 0x0571
JosephChan@via.com.tw [Fri, 23 Jan 2009 07:37:39 +0000 (15:37 +0800)]
libata: pata_via: support VX855, future chips whose IDE controller use 0x0571

commit e4d866cdea24543ee16ce6d07d80c513e86ba983 upstream.

It supports VX855 and future chips whose IDE controller uses PCI ID 0x0571.

Signed-off-by: Joseph Chan <josephchan@via.com.tw>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoit821x: Add ultra_mask quirk for Vortex86SX
Brandon Philips [Wed, 14 Jan 2009 18:19:02 +0000 (19:19 +0100)]
it821x: Add ultra_mask quirk for Vortex86SX

commit b94b898f3107046b5c97c556e23529283ea5eadd upstream.

On Vortex86SX with IDE controller revision 0x11 ultra DMA must be
disabled. This patch was tested by DMP and seems to work.

It is a cleaned up version of their older Kernel patch:
 http://www.dmp.com.tw/tech/vortex86sx/patch-2.6.24-DMP.gz

Tested-by: Shawn Lin <shawn@dmp.com.tw>
Signed-off-by: Brandon Philips <bphilips@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agortl8187: Add termination packet to prevent stall
Larry Finger [Fri, 23 Jan 2009 17:46:32 +0000 (11:46 -0600)]
rtl8187: Add termination packet to prevent stall

commit 2fcbab044a3faf4d4a6e269148dd1f188303b206 upstream.

The RTL8187 and RTL8187B devices can stall unless an explicit termination
packet is sent.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoresources: skip sanity check of busy resources
Arjan van de Ven [Sat, 13 Dec 2008 17:15:27 +0000 (09:15 -0800)]
resources: skip sanity check of busy resources

commit 3ac52669c7a24b93663acfcab606d1065ed1accd upstream.

Impact: reduce false positives in iomem_map_sanity_check()

Some drivers (vesafb) only map/reserve a portion of a resource.
If then some other driver comes in and maps the whole resource,
the current code WARN_ON's. This is not the intent of the checks
in iomem_map_sanity_check(); rather these checks want to
warn when crossing *hardware* resources only.

This patch skips BUSY resources as suggested by Linus.

Note: having two drivers talk to the same hardware at the same
time is obviously not optimal behavior, but that's a separate story.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoalpha: fix vmalloc breakage
Ivan Kokshaysky [Thu, 15 Jan 2009 21:50:48 +0000 (13:50 -0800)]
alpha: fix vmalloc breakage

commit 822c18f2e38cbc775792ab65ace4f9198678dec9 upstream.

On alpha, we have to map some stuff in the VMALLOC space very early in the
boot process (to make SRM console callbacks work and so on, see
arch/alpha/mm/init.c).  For old VM allocator, we just manually placed a
vm_struct onto the global vmlist and this worked for ages.

Unfortunately, the new allocator isn't aware of this, so it constantly
tries to allocate the VM space which is already in use, making vmalloc on
alpha defunct.

This patch forces KVA to import vmlist entries on init.

[akpm@linux-foundation.org: remove unneeded check (per Johannes)]
Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoalpha: nautilus - fix compile failure with gcc-4.3
Ivan Kokshaysky [Thu, 15 Jan 2009 21:51:17 +0000 (13:51 -0800)]
alpha: nautilus - fix compile failure with gcc-4.3

commit 70b66cbfd3316b792a855cb9a2574e85f1a63d0f upstream.

init_srm_irq() deals with irq's #16 and above, but size of irq_desc
array on nautilus and some other system types is 16. So gcc-4.3
complains that "array subscript is above array bounds", even though
this function is never called on those systems.

This adds a check for NR_IRQS <= 16, which effectively optimizes
init_srm_irq() code away on problematic platforms.

Thanks to Daniel Drake <dsd@gentoo.org> for detailed analysis
of the problem.

Signed-off-by: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tobias Klausmann <klausman@schwarzvogel.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoUSB: storage: add unusual devs entry
Oliver Neukum [Wed, 14 Jan 2009 15:17:19 +0000 (16:17 +0100)]
USB: storage: add unusual devs entry

commit b90de8aea36ae6fe8050a6e91b031369c4f251b2 upstream.

This adds an unusual devs entry for 2116:0320

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoUSB: fix char-device disconnect handling
Alan Stern [Tue, 13 Jan 2009 16:33:42 +0000 (11:33 -0500)]
USB: fix char-device disconnect handling

commit 501950d846218ed80a776d2aae5aed9c8b92e778 upstream.

This patch (as1198) fixes a conceptual bug: Somewhere along the line
we managed to confuse USB class devices with USB char devices.  As a
result, the code to send a disconnect signal to userspace would not be
built if both CONFIG_USB_DEVICE_CLASS and CONFIG_USB_DEVICEFS were
disabled.

The usb_fs_classdev_common_remove() routine has been renamed to
usbdev_remove() and it is now called whenever any USB device is
removed, not just when a class device is unregistered.  The notifier
registration and unregistration calls are no longer conditionally
compiled.  And since the common removal code will always be called as
part of the char device interface, there's no need to call it again as
part of the usbfs interface; thus the invocation of
usb_fs_classdev_common_remove() has been taken out of
usbfs_remove_device().

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Alon Bar-Lev <alon.barlev@gmail.com>
Tested-by: Alon Bar-Lev <alon.barlev@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoUSB: usbmon: Implement compat_ioctl
Pete Zaitcev [Sat, 20 Dec 2008 19:56:08 +0000 (12:56 -0700)]
USB: usbmon: Implement compat_ioctl

commit 7abce6bedc118eb39fe177c2c26be5d008505c14 upstream.

Running a 32-bit usbmon(8) on 2.6.28-rc9 produces the following:
ioctl32(usbmon:28563): Unknown cmd fd(3) cmd(400c9206){t:ffffff92;sz:12} arg(ffd3f458) on /dev/usbmon0

It happens because the compatibility mode was implemented for 2.6.18
and not updated for the fsops.compat_ioctl API.

This patch relocates the pieces from under #ifdef CONFIG_COMPAT into
compat_ioctl with no other changes except one new whitespace.

Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosound: virtuoso: enable UART on Xonar HDAV1.3
Clemens Ladisch [Mon, 19 Jan 2009 09:07:21 +0000 (10:07 +0100)]
sound: virtuoso: enable UART on Xonar HDAV1.3

commit 22c733788bbd4b75c00279119a83da5cd74b987a upstream.

This hardware has a better chance of working correctly if we don't
forget to enable it.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoUSB: fix toggle mismatch in disable_endpoint paths
Alan Stern [Thu, 15 Jan 2009 22:03:33 +0000 (17:03 -0500)]
USB: fix toggle mismatch in disable_endpoint paths

commit ddeac4e75f2527a340f9dc655bde49bb2429b39b upstream.

This patch (as1200) finishes some fixes that were left incomplete by
an earlier patch.

Although nobody has addressed this issue in the past, it turns out
that we need to distinguish between two different modes of disabling
and enabling endpoints.  In one mode only the data structures in
usbcore are affected, and in the other mode the host controller and
device hardware states are affected as well.

The earlier patch added an extra argument to the routines in the
enable_endpoint pathways to reflect this difference.  This patch adds
corresponding arguments to the disable_endpoint pathways.  Without
this change, the endpoint toggle state can get out of sync between
the host and the device.  The exact mechanism depends on the details
of the host controller (whether or not it stores its own copy of the
toggle values).

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Dan Streetman <ddstreet@ieee.org>
Tested-by: Dan Streetman <ddstreet@ieee.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agox86: fix page attribute corruption with cpa()
Suresh Siddha [Tue, 20 Jan 2009 22:20:21 +0000 (14:20 -0800)]
x86: fix page attribute corruption with cpa()

commit a1e46212a410793d575718818e81ddc442a65283 upstream.

Impact: fix sporadic slowdowns and warning messages

This patch fixes a performance issue reported by Linus on his
Nehalem system. While Linus reverted the PAT patch (commit
58dab916dfb57328d50deb0aa9b3fc92efa248ff) which exposed the issue,
existing cpa() code can potentially still cause wrong(page attribute
corruption) behavior.

This patch also fixes the "WARNING: at arch/x86/mm/pageattr.c:560" that
various people reported.

In 64bit kernel, kernel identity mapping might have holes depending
on the available memory and how e820 reports the address range
covering the RAM, ACPI, PCI reserved regions. If there is a 2MB/1GB hole
in the address range that is not listed by e820 entries, kernel identity
mapping will have a corresponding hole in its 1-1 identity mapping.

If cpa() happens on the kernel identity mapping which falls into these holes,
existing code fails like this:

__change_page_attr_set_clr()
__change_page_attr()
returns 0 because of if (!kpte). But doesn't
set cpa->numpages and cpa->pfn.
cpa_process_alias()
uses uninitialized cpa->pfn (random value)
which can potentially lead to changing the page
attribute of kernel text/data, kernel identity
mapping of RAM pages etc. oops!

This bug was easily exposed by another PAT patch which was doing
cpa() more often on kernel identity mapping holes (physical range between
max_low_pfn_mapped and 4GB), where in here it was setting the
cache disable attribute(PCD) for kernel identity mappings aswell.

Fix cpa() to handle the kernel identity mapping holes. Retain
the WARN() for cpa() calls to other not present address ranges
(kernel-text/data, ioremap() addresses)

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosysfs: fix problems with binary files
Greg Kroah-Hartman [Tue, 20 Jan 2009 23:51:16 +0000 (15:51 -0800)]
sysfs: fix problems with binary files

commit 4503efd0891c40e30928afb4b23dc3f99c62a6b2 upstream.

Some sysfs binary files don't like having 0 passed to them as a size.
Fix this up at the root by just returning to the vfs if userspace asks
us for a zero sized buffer.

Thanks to Pavel Roskin for pointing this out.

Reported-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoklist.c: bit 0 in pointer can't be used as flag
Jesper Nilsson [Wed, 14 Jan 2009 10:19:08 +0000 (11:19 +0100)]
klist.c: bit 0 in pointer can't be used as flag

commit c0e69a5bbc6fc74184aa043aadb9a53bc58f953b upstream.

The commit a1ed5b0cffe4b16a93a6a3390e8cee0fbef94f86
(klist: don't iterate over deleted entries) introduces use of the
low bit in a pointer to indicate if the knode is dead or not,
assuming that this bit is always free.

This is not true for all architectures, CRIS for example may align data
on byte borders.

The result is a bunch of warnings on bootup, devices not being
added correctly etc, reported by Hinko Kocevar <hinko.kocevar@cetrtapot.si>:

------------[ cut here ]------------
WARNING: at lib/klist.c:62 ()
Modules linked in:

Stack from c1fe1cf0:
       c01cc7f4 c1fe1d11 c000eb4e c000e4de 00000000 00000000 c1f4f78f c1f50c2d
       c01d008c c1fdd1a0 c1fdd1a0 c1fe1d38 c0192954 c1fe0000 00000000 c1fe1dc0
       00000002 7fffffff c1fe1da8 c0192d50 c1fe1dc0 00000002 7fffffff c1ff9fcc
Call Trace: [<c000eb4e>] [<c000e4de>] [<c0192954>] [<c0192d50>] [<c001d49e>] [<c000b688>] [<c0192a3c>]
       [<c000b63e>] [<c000b63e>] [<c001a542>] [<c00b55b0>] [<c00411c0>] [<c00b559c>] [<c01918e6>] [<c0191988>]
       [<c01919d0>] [<c00cd9c8>] [<c00cdd6a>] [<c0034178>] [<c000409a>] [<c0015576>] [<c0029130>] [<c0029078>]
       [<c0029170>] [<c0012336>] [<c00b4076>] [<c00b4770>] [<c006d6e4>] [<c006d974>] [<c006dca0>] [<c0028d6c>]
       [<c0028e12>] [<c0006424>] <4>---[ end trace 4eaa2a86a8e2da22 ]---
------------[ cut here ]------------
Repeat ad nauseam.

Wed, Jan 14, 2009 at 12:11:32AM +0100, Bastien ROUCARIES wrote:
> Perhaps using a pointerhackalign trick on this structure where
> #define pointerhackalign(x) __attribute__ ((aligned (x)))
> and declare
> struct klist_node {
> ...
> }  pointerhackalign(2);
>
> Because  __attribute__ ((aligned (x))) could only increase alignment
> it will safe to do that and serve as documentation purpose :)

That works, but we need to do it not for the struct klist_node,
but for the struct we insert into the void * in klist_node,
which is struct klist.

Reported-by: Hinko Kocevar <hinko.kocevar@cetrtapot.si
Cc: Bastien ROUCARIES <roucaries.bastien@gmail.com>
Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agox86, mm: fix pte_free()
Peter Zijlstra [Fri, 23 Jan 2009 16:37:49 +0000 (17:37 +0100)]
x86, mm: fix pte_free()

commit 42ef73fe134732b2e91c0326df5fd568da17c4b2 upstream.

On -rt we were seeing spurious bad page states like:

Bad page state in process 'firefox'
page:c1bc2380 flags:0x40000000 mapping:c1bc2390 mapcount:0 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
Pid: 503, comm: firefox Not tainted 2.6.26.8-rt13 #3
[<c043d0f3>] ? printk+0x14/0x19
[<c0272d4e>] bad_page+0x4e/0x79
[<c0273831>] free_hot_cold_page+0x5b/0x1d3
[<c02739f6>] free_hot_page+0xf/0x11
[<c0273a18>] __free_pages+0x20/0x2b
[<c027d170>] __pte_alloc+0x87/0x91
[<c027d25e>] handle_mm_fault+0xe4/0x733
[<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
[<c043f680>] ? rt_mutex_down_read_trylock+0x57/0x63
[<c0218875>] do_page_fault+0x36f/0x88a

This is the case where a concurrent fault already installed the PTE and
we get to free the newly allocated one.

This is due to pgtable_page_ctor() doing the spin_lock_init(&page->ptl)
which is overlaid with the {private, mapping} struct.

union {
    struct {
        unsigned long private;
        struct address_space *mapping;
    };
    spinlock_t ptl;
    struct kmem_cache *slab;
    struct page *first_page;
};

Normally the spinlock is small enough to not stomp on page->mapping, but
PREEMPT_RT=y has huge 'spin'locks.

But lockdep kernels should also be able to trigger this splat, as the
lock tracking code grows the spinlock to cover page->mapping.

The obvious fix is calling pgtable_page_dtor() like the regular pte free
path __pte_free_tlb() does.

It seems all architectures except x86 and nm10300 already do this, and
nm10300 doesn't seem to use pgtable_page_ctor(), which suggests it
doesn't do SMP or simply doesnt do MMU at all or something.

Signed-off-by: Peter Zijlstra <a.p.zijlsta@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agofuse: fix NULL deref in fuse_file_alloc()
Dan Carpenter [Mon, 26 Jan 2009 14:00:58 +0000 (15:00 +0100)]
fuse: fix NULL deref in fuse_file_alloc()

commit bb875b38dc5e343bdb696b2eab8233e4d195e208 upstream.

ff is set to NULL and then dereferenced on line 65.  Compile tested only.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agofuse: fix missing fput on error
Miklos Szeredi [Mon, 26 Jan 2009 14:00:58 +0000 (15:00 +0100)]
fuse: fix missing fput on error

commit 3ddf1e7f57237ac7c5d5bfb7058f1ea4f970b661 upstream.

Fix the leaking file reference if allocation or initialization of
fuse_conn failed.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agofuse: destroy bdi on umount
Miklos Szeredi [Mon, 26 Jan 2009 14:00:59 +0000 (15:00 +0100)]
fuse: destroy bdi on umount

commit 26c3679101dbccc054dcf370143941844ba70531 upstream.

If a fuse filesystem is unmounted but the device file descriptor
remains open and a new mount reuses the old device number, then the
mount fails with EEXIST and the following warning is printed in the
kernel log:

  WARNING: at fs/sysfs/dir.c:462 sysfs_add_one+0x35/0x3d()
  sysfs: duplicate filename '0:15' can not be created

The cause is that the bdi belonging to the fuse filesystem was
destoryed only after the device file was released.  Fix this by
calling bdi_destroy() from fuse_put_super() instead.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoinotify: clean up inotify_read and fix locking problems
Vegard Nossum [Thu, 22 Jan 2009 14:29:45 +0000 (15:29 +0100)]
inotify: clean up inotify_read and fix locking problems

commit 3632dee2f8b8a9720329f29eeaa4ec4669a3aff8 upstream.

If userspace supplies an invalid pointer to a read() of an inotify
instance, the inotify device's event list mutex is unlocked twice.
This causes an unbalance which effectively leaves the data structure
unprotected, and we can trigger oopses by accessing the inotify
instance from different tasks concurrently.

The best fix (contributed largely by Linus) is a total rewrite
of the function in question:

On Thu, Jan 22, 2009 at 7:05 AM, Linus Torvalds wrote:
> The thing to notice is that:
>
>  - locking is done in just one place, and there is no question about it
>   not having an unlock.
>
>  - that whole double-while(1)-loop thing is gone.
>
>  - use multiple functions to make nesting and error handling sane
>
>  - do error testing after doing the things you always need to do, ie do
>   this:
>
>        mutex_lock(..)
>        ret = function_call();
>        mutex_unlock(..)
>
>        .. test ret here ..
>
>   instead of doing conditional exits with unlocking or freeing.
>
> So if the code is written in this way, it may still be buggy, but at least
> it's not buggy because of subtle "forgot to unlock" or "forgot to free"
> issues.
>
> This _always_ unlocks if it locked, and it always frees if it got a
> non-error kevent.

Cc: John McCutchan <ttb@tentacle.dhs.org>
Cc: Robert Love <rlove@google.com>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomac80211: decrement ref count to netdev after launching mesh discovery
Brian Cavagnolo [Sat, 17 Jan 2009 03:04:49 +0000 (19:04 -0800)]
mac80211: decrement ref count to netdev after launching mesh discovery

commit 5dc306f3bd1d4cfdf79df39221b3036eab1ddcf3 upstream.

After launching mesh discovery in tx path, reference count was not being
decremented.  This was preventing module unload.

Signed-off-by: Brian Cavagnolo <brian@cozybit.com>
Signed-off-by: Andrey Yurovsky <andrey@cozybit.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoath5k: fix mesh point operation
Andrey Yurovsky [Tue, 14 Oct 2008 01:23:07 +0000 (18:23 -0700)]
ath5k: fix mesh point operation

commit b706e65b40417e03c2451bb3f92488f3736843fa upstream.

This patch fixes mesh point operation (thanks to YanBo for pointing
out the problem): make mesh point interfaces start beaconing when
they come up and configure the RX filter in mesh mode so that mesh
beacons and action frames are received.  Add mesh point to the check
in ath5k_add_interface.  Tested with multiple AR5211 cards.

Signed-off-by: Andrey Yurovsky <andrey@cozybit.com>
Acked-by: Nick Kossifidis <mickflemm@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Cc: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoLinux 2.6.28.2 v2.6.28.2
Greg Kroah-Hartman [Sun, 25 Jan 2009 00:42:07 +0000 (16:42 -0800)]
Linux 2.6.28.2

15 years agofs: sys_sync fix
Nick Piggin [Tue, 6 Jan 2009 22:40:26 +0000 (14:40 -0800)]
fs: sys_sync fix

commit 856bf4d717feb8c55d4e2f817b71ebb70cfbc67b upstream.

s_syncing livelock avoidance was breaking data integrity guarantee of
sys_sync, by allowing sys_sync to skip writing or waiting for superblocks
if there is a concurrent sys_sync happening.

This livelock avoidance is much less important now that we don't have the
get_super_to_sync() call after every sb that we sync.  This was replaced
by __put_super_and_need_restart.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agofs: sync_sb_inodes fix
Nick Piggin [Tue, 6 Jan 2009 22:40:25 +0000 (14:40 -0800)]
fs: sync_sb_inodes fix

commit 38f21977663126fef53f5585e7f1653d8ebe55c4 upstream.

Fix data integrity semantics required by sys_sync, by iterating over all
inodes and waiting for any writeback pages after the initial writeout.
Comments explain the exact problem.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agofs: remove WB_SYNC_HOLD
Nick Piggin [Tue, 6 Jan 2009 22:40:25 +0000 (14:40 -0800)]
fs: remove WB_SYNC_HOLD

commit 4f5a99d64c17470a784a6c68064207d82e3e74a5 upstream.

Remove WB_SYNC_HOLD.  The primary motiviation is the design of my
anti-starvation code for fsync.  It requires taking an inode lock over the
sync operation, so we could run into lock ordering problems with multiple
inodes.  It is possible to take a single global lock to solve the ordering
problem, but then that would prevent a future nice implementation of "sync
multiple inodes" based on lock order via inode address.

Seems like a backward step to remove this, but actually it is busted
anyway: we can't use the inode lists for data integrity wait: an inode can
be taken off the dirty lists but still be under writeback.  In order to
satisfy data integrity semantics, we should wait for it to finish
writeback, but if we only search the dirty lists, we'll miss it.

It would be possible to have a "writeback" list, for sys_sync, I suppose.
But why complicate things by prematurely optimise?  For unmounting, we
could avoid the "livelock avoidance" code, which would be easier, but
again premature IMO.

Fixing the existing data integrity problem will come next.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomm: direct IO starvation improvement
Nick Piggin [Tue, 6 Jan 2009 22:40:22 +0000 (14:40 -0800)]
mm: direct IO starvation improvement

commit 48b47c561e41525061b5bc0cfd67d6367fd11dc4 upstream.

Direct IO can invalidate and sync a lot of pagecache pages in the mapping.
 A 4K direct IO will actually try to sync and/or invalidate the pagecache
of the entire file, for example (which might be many GB or TB large).

Improve this by doing range syncs.  Also, memory no longer has to be
unmapped to catch the dirty bits for syncing, as dirty bits would remain
coherent due to dirty mmap accounting.

This fixes the immediate DM deadlocks when doing direct IO reads to block
device with a mounted filesystem, if only by papering over the problem
somewhat rather than addressing the fsync starvation cases.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomm: do_sync_mapping_range integrity fix
Nick Piggin [Tue, 6 Jan 2009 22:39:12 +0000 (14:39 -0800)]
mm: do_sync_mapping_range integrity fix

commit ee53a891f47444c53318b98dac947ede963db400 upstream.

Chris Mason notices do_sync_mapping_range didn't actually ask for data
integrity writeout.  Unfortunately, it is advertised as being usable for
data integrity operations.

This is a data integrity bug.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomm: write_cache_pages more terminate quickly
Andrew Morton [Tue, 6 Jan 2009 22:39:11 +0000 (14:39 -0800)]
mm: write_cache_pages more terminate quickly

commit 82fd1a9a8ced9607312b54859572bcc6211e8919 upstream.

Now that we have the early-termination logic in place, it makes sense to
bail out early in all other cases where done is set to 1.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomm: write_cache_pages terminate quickly
Nick Piggin [Tue, 6 Jan 2009 22:39:11 +0000 (14:39 -0800)]
mm: write_cache_pages terminate quickly

commit d5482cdf8a0aacb1e6468a97d5544f5829c8d8c4 upstream.

Terminate the write_cache_pages loop upon encountering the first page past
end, without locking the page.  Pages cannot have their index change when
we have a reference on them (truncate, eg truncate_inode_pages_range
performs the same check without the page lock).

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomm: write_cache_pages optimise page cleaning
Nick Piggin [Tue, 6 Jan 2009 22:39:10 +0000 (14:39 -0800)]
mm: write_cache_pages optimise page cleaning

commit 515f4a037fb9ab736f8bad733fcd2ffd350cf265 upstream.

In write_cache_pages, if we get stuck behind another process that is
cleaning pages, we will be forced to wait for them to finish, then perform
our own writeout (if it was redirtied during the long wait), then wait for
that.

If a page under writeout is still clean, we can skip waiting for it (if
we're part of a data integrity sync, we'll be waiting for all writeout
pages afterwards, so we'll still be waiting for the other guy's write
that's cleaned the page).

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomm: write_cache_pages cleanups
Nick Piggin [Tue, 6 Jan 2009 22:39:09 +0000 (14:39 -0800)]
mm: write_cache_pages cleanups

commit 5a3d5c9813db56a75934eb1015367fda23a8b0b4 upstream.

Get rid of some complex expressions from flow control statements, add a
comment, remove some duplicate code.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomm: write_cache_pages integrity fix
Nick Piggin [Tue, 6 Jan 2009 22:39:08 +0000 (14:39 -0800)]
mm: write_cache_pages integrity fix

commit 05fe478dd04e02fa230c305ab9b5616669821dd3 upstream.

In write_cache_pages, nr_to_write is heeded even for data-integrity syncs,
so the function will return success after writing out nr_to_write pages,
even if that was not sufficient to guarantee data integrity.

The callers tend to set it to values that could break data interity
semantics easily in practice.  For example, nr_to_write can be set to
mapping->nr_pages * 2, however if a file has a single, dirty page, then
fsync is called, subsequent pages might be concurrently added and dirtied,
then write_cache_pages might writeout two of these newly dirty pages,
while not writing out the old page that should have been written out.

Fix this by ignoring nr_to_write if it is a data integrity sync.

This is a data integrity bug.

The reason this has been done in the past is to avoid stalling sync
operations behind page dirtiers.

 "If a file has one dirty page at offset 1000000000000000 then someone
  does an fsync() and someone else gets in first and starts madly writing
  pages at offset 0, we want to write that page at 1000000000000000.
  Somehow."

What we do today is return success after an arbitrary amount of pages are
written, whether or not we have provided the data-integrity semantics that
the caller has asked for.  Even this doesn't actually fix all stall cases
completely: in the above situation, if the file has a huge number of pages
in pagecache (but not dirty), then mapping->nrpages is going to be huge,
even if pages are being dirtied.

This change does indeed make the possibility of long stalls lager, and
that's not a good thing, but lying about data integrity is even worse.  We
have to either perform the sync, or return -ELINUXISLAME so at least the
caller knows what has happened.

There are subsequent competing approaches in the works to solve the stall
problems properly, without compromising data integrity.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomm: write_cache_pages writepage error fix
Nick Piggin [Tue, 6 Jan 2009 22:39:06 +0000 (14:39 -0800)]
mm: write_cache_pages writepage error fix

commit 00266770b8b3a6a77f896ca501a0613739086832 upstream.

In write_cache_pages, if ret signals a real error, but we still have some
pages left in the pagevec, done would be set to 1, but the remaining pages
would continue to be processed and ret will be overwritten in the process.

It could easily be overwritten with success, and thus success will be
returned even if there is an error.  Thus the caller is told all writes
succeeded, wheras in reality some did not.

Fix this by bailing immediately if there is an error, and retaining the
first error code.

This is a data integrity bug.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomm: write_cache_pages early loop termination
Nick Piggin [Tue, 6 Jan 2009 22:39:06 +0000 (14:39 -0800)]
mm: write_cache_pages early loop termination

commit bd19e012f6fd3b7309689165ea865cbb7bb88c1e upstream.

We'd like to break out of the loop early in many situations, however the
existing code has been setting mapping->writeback_index past the final
page in the pagevec lookup for cyclic writeback.  This is a problem if we
don't process all pages up to the final page.

Currently the code mostly keeps writeback_index reasonable and hacked
around this by not breaking out of the loop or writing pages outside the
range in these cases.  Keep track of a real "done index" that enables us
to terminate the loop in a much more flexible manner.

Needed by the subsequent patch to preserve writepage errors, and then
further patches to break out of the loop early for other reasons.  However
there are no functional changes with this patch alone.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomm: write_cache_pages cyclic fix
Nick Piggin [Tue, 6 Jan 2009 22:39:04 +0000 (14:39 -0800)]
mm: write_cache_pages cyclic fix

commit 31a12666d8f0c22235297e1c1575f82061480029 upstream.

In write_cache_pages, scanned == 1 is supposed to mean that cyclic
writeback has circled through zero, thus we should not circle again.
However it gets set to 1 after the first successful pagevec lookup.  This
leads to cases where not enough data gets written.

Counterexample: file with first 10 pages dirty, writeback_index == 5,
nr_to_write == 10.  Then the 5 last pages will be found, and scanned will
be set to 1, after writing those out, we will not cycle back to get the
first 5.

Rework this logic, now we'll always cycle unless we started off from index
0.  When cycling, only write out as far as 1 page before the start page
from the first cycle (so we don't write parts of the file twice).

Signed-off-by: Nick Piggin <npiggin@suse.de>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agohwmon: (abituguru3) Fix CONFIG_DMI=n fallback to probe
Alistair John Strachan [Thu, 15 Jan 2009 21:27:48 +0000 (22:27 +0100)]
hwmon: (abituguru3) Fix CONFIG_DMI=n fallback to probe

commit 46a5f173fc88ffc22651162033696d8a9fbcdc5c upstream.

When CONFIG_DMI is not enabled, dmi detection should flag that no board
could be detected (err=1) rather than another error condition (err<0).

This fixes the fallback to manual probing for all motherboards, even
those without DMI strings, when CONFIG_DMI=n.

Signed-off-by: Alistair John Strachan <alistair@devzero.co.uk>
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agopowerpc: is_hugepage_only_range() must account for both 4kB and 64kB slices
Dave Kleikamp [Wed, 14 Jan 2009 09:09:34 +0000 (09:09 +0000)]
powerpc: is_hugepage_only_range() must account for both 4kB and 64kB slices

commit 9ba0fdbfaed2e74005d87fab948c5522b86ff733 upstream.

powerpc: is_hugepage_only_range() must account for both 4kB and 64kB slices

The subpage_prot syscall fails on second and subsequent calls for a given
region, because is_hugepage_only_range() is mis-identifying the 4 kB
slices when the process has a 64 kB page size.

Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agodell_rbu: use scnprintf() instead of less secure sprintf()
Pavel Roskin [Sat, 17 Jan 2009 18:33:03 +0000 (13:33 -0500)]
dell_rbu: use scnprintf() instead of less secure sprintf()

commit 81156928f8fe31621e467490b9d441c0285998c3 upstream.

Reading 0 bytes from /sys/devices/platform/dell_rbu/image_type or
/sys/devices/platform/dell_rbu/packet_size by an ordinary user causes an
oops.

Signed-off-by: Pavel Roskin <proski@gnu.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agonetfilter: nf_conntrack: fix ICMP/ICMPv6 timeout sysctls on big-endian
Patrick McHardy [Mon, 19 Jan 2009 14:19:39 +0000 (15:19 +0100)]
netfilter: nf_conntrack: fix ICMP/ICMPv6 timeout sysctls on big-endian

Upstream commit 71320af:

An old bug crept back into the ICMP/ICMPv6 conntrack protocols: the timeout
values are defined as unsigned longs, the sysctl's maxsize is set to
sizeof(unsigned int). Use unsigned int for the timeout values as in the
other conntrack protocols.

Reported-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agonetfilter: ebtables: fix inversion in match code
Patrick McHardy [Mon, 19 Jan 2009 14:19:37 +0000 (15:19 +0100)]
netfilter: ebtables: fix inversion in match code

Upstream commit d61ba9f:

Commit 8cc784ee (netfilter: change return types of match functions
for ebtables extensions) broke ebtables matches by inverting the
sense of match/nomatch.

Reported-by: Matt Cross <matthltc@us.ibm.com>
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agonetfilter: x_tables: fix match/target revision lookup
Patrick McHardy [Mon, 19 Jan 2009 14:19:36 +0000 (15:19 +0100)]
netfilter: x_tables: fix match/target revision lookup

Upstream commit 656caff:

Commit 55b69e91 (netfilter: implement NFPROTO_UNSPEC as a wildcard
for extensions) broke revision probing for matches and targets that
are registered with NFPROTO_UNSPEC.

Fix by continuing the search on the NFPROTO_UNSPEC list if nothing
is found on the af-specific lists.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agop54usb: fix traffic stalls / packet drop
Christian Lamparter [Tue, 20 Jan 2009 22:11:11 +0000 (23:11 +0100)]
p54usb: fix traffic stalls / packet drop

commit 00627f229c9807e4cb825a7ce36b886e2adf2229 upstream.

All p54usb devices need a explicit termination packet, in oder to finish the pending transfer properly.
Else, the firmware could freeze, or simply drop the frame.

Signed-off-by: Christian Lamparter <chunkeey@web.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoUSB: re-enable interface after driver unbinds
Alan Stern [Wed, 31 Dec 2008 16:31:33 +0000 (11:31 -0500)]
USB: re-enable interface after driver unbinds

commit 2caf7fcdb8532045680f06b67b9e63f0c9613aaa upstream.

This patch (as1197) fixes an error introduced recently.  Since a
significant number of devices can't handle Set-Interface requests, we
no longer call usb_set_interface() when a driver unbinds from an
interface, provided the interface is already in altsetting 0.  However
the interface still does get disabled, and the call to
usb_set_interface() was the only thing re-enabling it.  Since the
interface doesn't get re-enabled, further attempts to use it fail.

So the patch adds a call to usb_enable_interface() when a driver
unbinds and the interface is in altsetting 0.  For this to work
right, the interface's endpoints have to be re-enabled but their
toggles have to be left alone.  Therefore an additional argument is
added to usb_enable_endpoint() and usb_enable_interface(), a flag
indicating whether or not the endpoint toggles should be reset.

This is a forward-ported version of a patch which fixes Bugzilla
#12301.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: David Roka <roka@dawid.hu>
Reported-by: Erik Ekman <erik@kryo.se>
Tested-by: Erik Ekman <erik@kryo.se>
Tested-by: Alon Bar-Lev <alon.barlev@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agotcp: don't mask EOF and socket errors on nonblocking splice receive
Lennert Buytenhek [Tue, 20 Jan 2009 23:25:21 +0000 (15:25 -0800)]
tcp: don't mask EOF and socket errors on nonblocking splice receive

[ Upstream commit: 4f7d54f59bc470f0aaa932f747a95232d7ebf8b1 ]

Currently, setting SPLICE_F_NONBLOCK on splice from a TCP socket
results in masking of EOF (RDHUP) and error conditions on the socket
by an -EAGAIN return.  Move the NONBLOCK check in tcp_splice_read()
to be after the EOF and error checks to fix this.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agor6040: bump release number to 0.19
Florian Fainelli [Tue, 23 Dec 2008 03:40:38 +0000 (19:40 -0800)]
r6040: bump release number to 0.19

[ Upstream commit: 4707470ae7441733822efcd680b0ef3971921c4d ]

This patch bumps the release number of the driver.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agor6040: save and restore MIER correctly in the interrupt routine
Joe Chou [Tue, 23 Dec 2008 03:40:02 +0000 (19:40 -0800)]
r6040: save and restore MIER correctly in the interrupt routine

[ Upstream commit: 3e7c469f07ff14cbf9a814739e1fc99a863e0943 ]

This patch saves the MIER register contents before treating
interrupts, then restores them correcty at the end of the
interrupt routine.

Signed-off-by: Joe Chou <Joe.Chou@rdc.com.tw>
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agor6040: fix wrong logic in mdio code
Joe Chou [Tue, 23 Dec 2008 03:38:17 +0000 (19:38 -0800)]
r6040: fix wrong logic in mdio code

[ Upstream commit: 11e5e8f5d14a1229706576184d2cf4c4556ed94c ]

This patch fixes a reverse logic in the MDIO code.

Signed-off-by: Joe Chou <Joe.Chou@rdc.com.tw>
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agopkt_sched: cls_u32: Fix locking in u32_change()
Jarek Poplawski [Tue, 20 Jan 2009 22:08:23 +0000 (14:08 -0800)]
pkt_sched: cls_u32: Fix locking in u32_change()

[ Upstream commit: 6f57321422e0d359e83c978c2b03db77b967b7d5 ]

New nodes are inserted in u32_change() under rtnl_lock() with wmb(),
so without tcf_tree_lock() like in other classifiers (e.g. cls_fw).
This isn't enough without rmb() on the read side, but on the other
hand adding such barriers doesn't give any savings, so the lock is
added instead.

Reported-by: m0sia <m0sia@plotinka.ru>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosctp: Avoid memory overflow while FWD-TSN chunk is received with bad stream ID
Wei Yongjun [Tue, 20 Jan 2009 22:08:01 +0000 (14:08 -0800)]
sctp: Avoid memory overflow while FWD-TSN chunk is received with bad stream ID

[ Upstream commit: 9fcb95a105758b81ef0131cd18e2db5149f13e95 ]

If FWD-TSN chunk is received with bad stream ID, the sctp will not do the
validity check, this may cause memory overflow when overwrite the TSN of
the stream ID.

The FORWARD-TSN chunk is like this:

FORWARD-TSN chunk
  Type                       = 192
  Flags                      = 0
  Length                     = 172
  NewTSN                     = 99
  Stream                     = 10000
  StreamSequence             = 0xFFFF

This patch fix this problem by discard the chunk if stream ID is not
less than MIS.

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoipv6: Fix fib6_dump_table walker leak
Herbert Xu [Tue, 20 Jan 2009 22:06:49 +0000 (14:06 -0800)]
ipv6: Fix fib6_dump_table walker leak

[ Upstream commit: 7891cc818967e186be68caac32d84bfd0a3f0bd2 ]

When a fib6 table dump is prematurely ended, we won't unlink
its walker from the list.  This causes all sorts of grief for
other users of the list later.

Reported-by: Chris Caputo <ccaputo@alt.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agopkt_sched: sch_htb: Fix deadlock in hrtimers triggered by HTB
Jarek Poplawski [Tue, 20 Jan 2009 22:06:26 +0000 (14:06 -0800)]
pkt_sched: sch_htb: Fix deadlock in hrtimers triggered by HTB

[ Upstream commit: none

  This is a quick fix for -stable purposes.  Upstream fixes these
  problems via a large set of invasive hrtimer changes.  ]

Most probably there is a (still unproven) race in hrtimers (before
2.6.29 kernels), which causes a corruption of hrtimers rbtree. This
patch doesn't fix it, but should let HTB avoid triggering the bug.

Reported-by: Denys Fedoryschenko <denys@visp.net.lb>
Reported-by: Badalian Vyacheslav <slavon@bigtelecom.ru>
Reported-by: Chris Caputo <ccaputo@alt.net>
Tested-by: Badalian Vyacheslav <slavon@bigtelecom.ru>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agousb-storage: set CAPACITY_HEURISTICS flag for bad vendors
Alan Stern [Tue, 20 Jan 2009 20:33:34 +0000 (15:33 -0500)]
usb-storage: set CAPACITY_HEURISTICS flag for bad vendors

commit a81a81a25d3ecdab777abca87c5ddf484056103d upstream.

This patch (as1194b) makes usb-storage set the CAPACITY_HEURISTICS flag
for all devices made by Nokia, Nikon, or Motorola.  These companies
seem to include the READ CAPACITY bug in all of their devices.

Since cell phones and digital cameras rely on flash storage, which
always has an even number of sectors, setting CAPACITY_HEURISTICS
shouldn't cause any problems.  Not even if the companies wise up and
start making devices without the bug.

A large number of unusual_devs entries are now unnecessary, so the
patch removes them.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agousb-storage: add last-sector hacks
Alan Stern [Tue, 20 Jan 2009 20:33:26 +0000 (15:33 -0500)]
usb-storage: add last-sector hacks

commit 25ff1c316f6a763f1eefe7f8984b2d8c03888432 upstream.

This patch (as1189c) adds some hacks to usb-storage for dealing with
the growing problems involving bad capacity values and last-sector
accesses:

A new flag, US_FL_CAPACITY_OK, is created to indicate that
the device is known to report its capacity correctly.  An
unusual_devs entry for Linux's own File-backed Storage Gadget
is added with this flag set, since g_file_storage always
reports the correct capacity and since the capacity need
not be even (it is determined by the size of the backing
file).

An entry in unusual_devs.h which has only the CAPACITY_OK
flag set shouldn't prejudice libusual, since the device will
work perfectly well with either usb-storage or ub.  So a
new macro, COMPLIANT_DEV, is added to let libusual know
about these entries.

When a last-sector access fails three times in a row and
neither the FIX_CAPACITY nor the CAPACITY_OK flag is set,
we assume the last-sector bug is present.  We replace the
existing status and sense data with values that will cause
the SCSI core to fail the access immediately rather than
retry indefinitely.  This should fix the difficulties
people have been having with Nokia phones.

This version of the patch differs from the version accepted into the
mainline only in that it does not trigger a WARN() when an
odd-numbered last-sector access succeeds.  In a stable kernel series
we don't want to go around spamming users' logs and consoles for no
good reason.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agodrivers/net/irda/irda-usb.c: fix buffer overflow
Jos-Vicente Gilabert [Thu, 15 Jan 2009 04:55:00 +0000 (20:55 -0800)]
drivers/net/irda/irda-usb.c: fix buffer overflow

commit 2950e952920811be465ec95c6b56f03dc66a05c0 upstream.

Taken from http://bugzilla.kernel.org/show_bug.cgi?id=12397

We're doing an sprintf of an 11-char string into an 11-char buffer.
Whoops.  It breaks firmware uploading.

Reported-by: Jos-Vicente Gilabert <josevteg@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoALSA: hda - make laptop-eapd model back for AD1986A
Takashi Iwai [Fri, 21 Nov 2008 01:25:48 +0000 (02:25 +0100)]
ALSA: hda - make laptop-eapd model back for AD1986A

commit 1725b82a6e2721612a3572d0336f51f1f1c3cf54 upstream.

The changes specific for Samsung laptops seem unapplicable to other
hardware models like ASUS.  The mic inputs are lost on such hardware
by the change 5d5d5f43f1b835c375de9bd270cce030d16e2871.

This patch adds back the old laptop-eapd model, and create a new
model "samsung" for the new one specific to Samsung laptops with
automatic mic selection feature.

Reference: kernel bugzilla #12070
http://bugzilla.kernel.org/show_bug.cgi?id=12070

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Cc: Daniel Drake <dsd@gentoo.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoALSA: hda - Don't reset HP pinctl in patch_sigmatel.c
Takashi Iwai [Wed, 14 Jan 2009 06:56:51 +0000 (07:56 +0100)]
ALSA: hda - Don't reset HP pinctl in patch_sigmatel.c

commit 8317e0b0c2234f5f1f5d54804e4093d11bc0dffa upstream.

Resetting HP pinctl at the unplugged state may cause a sort of regression
on some devices because of their wrong pin configuration.

A simple workaround is to disable the pin reset.  This is ugly and may be
not good from the power-saving POV (if any), but damn simple.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoALSA: hda - Add automatic model setting for Samsung Q45
Luke Yelavich [Tue, 16 Dec 2008 01:37:47 +0000 (12:37 +1100)]
ALSA: hda - Add automatic model setting for Samsung Q45

commit 3e420e78ece6f9d2accc1568e80dfd0501e13df1 upstream.

Have the Samsung Q45 (144d:c510) select ALC262_HIPPO by default

Reference: Ubuntu bug 200210
http://launchpad.net/bugs/200210

Signed-off-by: Luke Yelavich <themuso@ubuntu.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoALSA: hda - Fix HP dv5 mic input
Takashi Iwai [Wed, 14 Jan 2009 07:27:35 +0000 (08:27 +0100)]
ALSA: hda - Fix HP dv5 mic input

commit 1b0652eb588e57c3ab230e0291e7da99c7e665e0 upstream.

Fix HP dv5 (103c:3603) built-in mic input.

Reference: kernel bug 12440
http://bugzilla.kernel.org/show_bug.cgi?id=12440

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoALSA: hda - Add quirk for another HP dv5
Giuseppe Bilotta [Tue, 13 Jan 2009 13:58:49 +0000 (08:58 -0500)]
ALSA: hda - Add quirk for another HP dv5

commit dafb70ce1026d4d6ef1b16ad6996c9589bb11cce upstream.

Add the model=hp-m4 quirk for another HP dv5 (103c:3603)
Reference: kernel bug#12440
http://bugzilla.kernel.org/show_bug.cgi?id=12440

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosound: virtuoso: do not overwrite EEPROM on Xonar D2/D2X
Clemens Ladisch [Thu, 15 Jan 2009 09:21:23 +0000 (10:21 +0100)]
sound: virtuoso: do not overwrite EEPROM on Xonar D2/D2X

commit 7e86c0e6850504ec9516b953f316a47277825e33 upstream.

On the Asus Xonar D2 and D2X models, the SPI chip select signal for the
fourth DAC shares its pin with the serial clock for the EEPROM that
contains the PCI subdevice ID values.  It appears that when DAC
registers are written and some other unknown conditions occur (probably
noise on the EEPROM's chip select line), the EEPROM gets overwritten
with garbage, which makes it impossible to properly detect the card
later.

Therefore, we better avoid DAC register writes and make sure that the
driver works with the DAC's registers' default values.  Consequently,
the sample format is now I2S instead of left-justified (no user-visible
change), and the DAC's volume/mute registers cannot be used anymore
(volume changes are now done by the software volume plugin).

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoIA64: Turn on CONFIG_HAVE_UNSTABLE_CLOCK
Tony Luck [Thu, 15 Jan 2009 18:29:17 +0000 (10:29 -0800)]
IA64: Turn on CONFIG_HAVE_UNSTABLE_CLOCK

commit 0773a6cf673316440999752e23f8c3d4f85e48b9 upstream.

sched_clock() on ia64 is based on ar.itc, so is never
completely synchronized between cpus. On some platforms
(e.g. certain models of SGI Altix) it may be running at
radically different frequencies.

Based on a patch from Dimitri Sivanich which set this
just for SN2 && GENERIC kernels ... it is needed for
all ia64 machines.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosched: fix update_min_vruntime
Peter Zijlstra [Thu, 15 Jan 2009 13:53:39 +0000 (14:53 +0100)]
sched: fix update_min_vruntime

commit e17036dac189dd034c092a91df56aa740db7146d upstream.

Impact: fix SCHED_IDLE latency problems

OK, so we have 1 running task A (which is obviously curr and the tree is
equally obviously empty).

'A' nicely chugs along, doing its thing, carrying min_vruntime along as it
goes.

Then some whacko speed freak SCHED_IDLE task gets inserted due to SMP
balancing, which is very likely far right, in that case

update_curr
  update_min_vruntime
    cfs_rq->rb_leftmost := true (the crazy task sitting in a tree)
      vruntime = se->vruntime

and voila, min_vruntime is waaay right of where it ought to be.

OK, so why did I write it like that to begin with...

Aah, yes.

Say we've just dequeued current

schedule
  deactivate_task(prev)
    dequeue_entity
      update_min_vruntime

Then we'll set

  vruntime = cfs_rq->min_vruntime;

we find !cfs_rq->curr, but do find someone in the tree. Then we _must_
do vruntime = se->vruntime, because

 vruntime = min_vruntime(vruntime := cfs_rq->min_vruntime, se->vruntime)

will not advance vruntime, and cause lags the other way around (which we
fixed with that initial patch: 1af5f730fc1bf7c62ec9fb2d307206e18bf40a69
(sched: more accurate min_vruntime accounting).

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Mike Galbraith <efault@gmx.de>
Acked-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosgi-xp: eliminate false detection of no heartbeat
Dean Nelson [Thu, 15 Jan 2009 21:50:57 +0000 (13:50 -0800)]
sgi-xp: eliminate false detection of no heartbeat

commit 158bc69effbf96f59c01cdeb20f8d4c184e59f8e upstream.

After XPC has been up and running on multiple partitions for any length of
time, if XPC on one of the partitions is stopped and restarted (either by
a rmmod/insmod or a system restart), it is possible for the XPCs running
on the other partitions to falsely detect a lack of heartbeat from the XPC
that was just restarted.  This false detection will occur if the restarted
XPC comes up within the five-seconds preceding one of the other XPC's
heartbeat check (which occurs once every twenty seconds).

The detection of no heartbeat results in the detecting XPC deactivating
from the just restarted XPC.  The only remedy is to restart one of the
XPCs and hope that one doesn't hit this five-second window on any of the
other partitions.

Signed-off-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agolib/idr.c: use kmem_cache_zalloc() for the idr_layer cache
Andrew Morton [Thu, 15 Jan 2009 21:51:21 +0000 (13:51 -0800)]
lib/idr.c: use kmem_cache_zalloc() for the idr_layer cache

commit 5b019e99016f3a692ba45bf68fba73a402d7c01a upstream.

David points out that the idr_remove_all() function returns unused slabs
to the kmem cache, but needs to zero them first or else they will be
uninitialized upon next use.  This causes crashes which have been observed
in the firewire subsystem.

He fixed this by zeroing the object before freeing it in idr_remove_all().

But we agree that simply removing the constructor and zeroing the object
at allocation time is simpler than relying upon slab constructor machinery
and might even be faster.

This problem was introduced by "idr: make idr_remove rcu-safe" (commit
cf481c20c476ad2c0febdace9ce23f5a4db19582), which was first released in
2.6.27.

There are no known codesites which trigger this bug in 2.6.27 or 2.6.28.
The post-2.6.28 firewire changes are the only known triggerer.

There might of course be not-yet-discovered triggerers in 2.6.27 and
2.6.28, and there might be out-of-tree triggerers which are added to those
kernel versions.  I'll let the -stable guys decide whether they want to
backport this fix.

Reported-by: David Moore <dcm@acm.org>
Cc: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: Nadia Derbey <Nadia.Derbey@bull.net>
Cc: Paul E. McKenney <paulmck@us.ibm.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Kristian Hgsberg <krh@redhat.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agop54usb: Add USB ID for Thomson Speedtouch 121g
Michiel [Sun, 4 Jan 2009 23:22:28 +0000 (17:22 -0600)]
p54usb: Add USB ID for Thomson Speedtouch 121g

commit 878e6a432f85690a2c0d88d96f177e54ff1d4a57 upstream.

Add the USB ID for Thomson Speedtouch 121g to p54usb.

Signed-off-by: Michiel <michiel@ettema.net>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agort2x00: add USB ID for the Linksys WUSB200.
Stefan Lippers-Hollmann [Sun, 4 Jan 2009 00:10:49 +0000 (01:10 +0100)]
rt2x00: add USB ID for the Linksys WUSB200.

commit 3be36ae223271f9c2cfbe7406846c8fdcd2f50c3 upstream.

add USB ID for the Linksys WUSB200 Wireless-G Business USB Adapter to
rt73usb.

Signed-off-by: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agosecurity: introduce missing kfree
Vegard Nossum [Sat, 17 Jan 2009 16:45:45 +0000 (17:45 +0100)]
security: introduce missing kfree

commit 0d54ee1c7850a954026deec4cd4885f331da35cc upstream.

Plug this leak.

Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoPCI: keep ASPM link state consistent throughout PCIe hierarchy
Shaohua Li [Fri, 19 Dec 2008 01:27:42 +0000 (09:27 +0800)]
PCI: keep ASPM link state consistent throughout PCIe hierarchy

commit 46bbdfa44cfc0d352148a0dc33ba9f6db02ccdf0 upstream.

In a PCIe hierarchy with a switch present, if the link state of an
endpoint device is changed, we must check the whole hierarchy from the
endpoint device to root port, and for each link in the hierarchy, the new
link state should be configured. Previously, the implementation checked
the state but forgot to configure the links between root port to switch.
Fixes Novell bz #448987.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Tested-by: Andrew Patterson <andrew.patterson@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoLinux 2.6.28.1 v2.6.28.1
Greg Kroah-Hartman [Sun, 18 Jan 2009 18:45:37 +0000 (10:45 -0800)]
Linux 2.6.28.1

15 years agoXFS: truncate readdir offsets to signed 32 bit values
Christoph Hellwig [Thu, 8 Jan 2009 19:00:00 +0000 (14:00 -0500)]
XFS: truncate readdir offsets to signed 32 bit values

commit 15440319767942a363f282d6585303d3d75088ba upstream.

John Stanley reported EOVERFLOW errors in readdir from his self-build
glibc.  I traced this down to glibc enabling d_off overflow checks
in one of the about five million different getdents implementations.

In 2.6.28 Dave Woodhouse moved our readdir double buffering required
for NFS4 readdirplus into nfsd and at that point we lost the capping
of the directory offsets to 32 bit signed values.  Johns glibc used
getdents64 to even implement readdir for normal 32 bit offset dirents,
and failed with EOVERFLOW only if this happens on the first dirent in
a getdents call.  I managed to come up with a testcase that uses
raw getdents and does the EOVERFLOW check manually.  We always hit
it with our last entry due to the special end of directory marker.

The patch below is a dumb version of just putting back the masking,
to make sure we have the same behavior as in 2.6.27 and earlier.

I will work on a better and cleaner fix for 2.6.30.

Reported-by: John Stanley <jpsinthemix@verizon.net>
Tested-by: John Stanley <jpsinthemix@verizon.net>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomm: fix assertion
Nick Piggin [Wed, 14 Jan 2009 06:28:16 +0000 (07:28 +0100)]
mm: fix assertion

commit 18e6959c385f3edf3991fa6662a53dac4eb10d5b upstream.

This assertion is incorrect for lockless pagecache.  By definition if we
have an unpinned page that we are trying to take a speculative reference
to, it may become the tail of a compound page at any time (if it is
freed, then reallocated as a compound page).

It was still a valid assertion for the vmscan.c LRU isolation case, but
it doesn't seem incredibly helpful...  if somebody wants it, they can
put it back directly where it applies in the vmscan code.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoath5k: ignore the return value of ath5k_hw_noise_floor_calibration
Felix Fietkau [Mon, 3 Nov 2008 10:27:38 +0000 (11:27 +0100)]
ath5k: ignore the return value of ath5k_hw_noise_floor_calibration

commit 8b0162a3dc5c30e862b7a73da29e32de3170f5e4 upstream.

Noise floor calibration occasionally fails on Atheros hardware.
This is not fatal and can happen if there's simply too much
noise on the air. Ignoring the calibration error is the right
thing to do here, because when the error is ignored, the hardware
will still work, whereas if the error causes the driver to bail out
of a bigger configuration function and does not configure the tx
queues or the IMR (as is the case in reset.c), the hw no longer
works properly until the next reset.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Cc: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agogetrusage: RUSAGE_THREAD should return ru_utime and ru_stime
KOSAKI Motohiro [Sat, 3 Jan 2009 20:40:37 +0000 (05:40 +0900)]
getrusage: RUSAGE_THREAD should return ru_utime and ru_stime

commit 8916edef5888c5d8fe283714416a9ca95b4c3431 upstream.

Impact: task stats regression fix

Original getrusage(RUSAGE_THREAD) implementation can return ru_utime and
ru_stime. But commit "f06febc: timers: fix itimer/many thread hang" broke it.

this patch restores it.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoibmvfc: Improve async event handling
Brian King [Thu, 18 Dec 2008 15:26:51 +0000 (09:26 -0600)]
ibmvfc: Improve async event handling

commit d2131b33c7e07c2905ee2f2321cc4dae1928c483 upstream.

While doing various error injection testing, such as cable
pulls and target moves, some issues were observed in handling
these events. This patch improves the way these events are handled
by increasing the delay waiting for the fabric to settle and also
changes the behavior of Link Up to break the CRQ to ensure everything
gets cleaned up properly on the VIOS.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agoibmvfc: Delay NPIV login retry and add retries
Brian King [Wed, 3 Dec 2008 17:02:54 +0000 (11:02 -0600)]
ibmvfc: Delay NPIV login retry and add retries

commit 1c41fa8288277e76785acb50f52bb2f39509f903 upstream.

Adds a delay prior to retrying a failed NPIV login. This fixes
a scenario if the backing fibre channel adapter is getting reset
due to an EEH event, NPIV login will fail. Currently, ibmvfc
retries three times very quickly, resets the CRQ and tries one
more time. If the adapter is getting reset due to EEH, this isn't
enough time. This adds a delay prior to retrying a failed NPIV
login and also increments the number of retries.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agopowerpc: Disable Collaborative Memory Manager for kdump
Brian King [Thu, 18 Dec 2008 11:13:46 +0000 (11:13 +0000)]
powerpc: Disable Collaborative Memory Manager for kdump

commit 2218108e182fd8a6d9106077833ed7ad05fc8e75 upstream.

When running Active Memory Sharing, the Collaborative Memory Manager
(CMM) may mark some pages as "loaned" with the hypervisor.
Periodically, the CMM will query the hypervisor for a loan request,
which is a single signed value.  When kexec'ing into a kdump kernel,
the CMM driver in the kdump kernel is not aware of the pages the
previous kernel had marked as "loaned", so the hypervisor and the CMM
driver are out of sync.  This results in the CMM driver getting a
negative loan request, which can then get treated as a large unsigned
value and can cause kdump to hang due to the CMM driver inflating too
large.  Since there really is no clean way for the CMM driver in the
kdump kernel to clean this up, simply disable CMM in the kdump kernel.
This fixes hangs we were seeing doing kdump with AMS.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
15 years agomm lockless pagecache barrier fix
Nick Piggin [Tue, 6 Jan 2009 02:05:50 +0000 (03:05 +0100)]
mm lockless pagecache barrier fix

commit e8c82c2e23e3527e0c9dc195e432c16784d270fa upstream.

An XFS workload showed up a bug in the lockless pagecache patch. Basically it
would go into an "infinite" loop, although it would sometimes be able to break
out of the loop! The reason is a missing compiler barrier in the "increment
reference count unless it was zero" case of the lockless pagecache protocol in
the gang lookup functions.

This would cause the compiler to use a cached value of struct page pointer to
retry the operation with, rather than reload it. So the page might have been
removed from pagecache and freed (refcount==0) but the lookup would not correctly
notice the page is no longer in pagecache, and keep attempting to increment the
refcount and failing, until the page gets reallocated for something else. This
isn't a data corruption because the condition will be detected if the page has
been reallocated. However it can result in a lockup.

Linus points out that ACCESS_ONCE is also required in that pointer load, even
if it's absence is not causing a bug on our particular build. The most general
way to solve this is just to put an rcu_dereference in radix_tree_deref_slot.

Assembly of find_get_pages,
before:
.L220:
        movq    (%rbx), %rax    #* ivtmp.1162, tmp82
        movq    (%rax), %rdi    #, prephitmp.1149
.L218:
        testb   $1, %dil        #, prephitmp.1149
        jne     .L217   #,
        testq   %rdi, %rdi      # prephitmp.1149
        je      .L203   #,
        cmpq    $-1, %rdi       #, prephitmp.1149
        je      .L217   #,
        movl    8(%rdi), %esi   # <variable>._count.counter, c
        testl   %esi, %esi      # c
        je      .L218   #,

after:
.L212:
        movq    (%rbx), %rax    #* ivtmp.1109, tmp81
        movq    (%rax), %rdi    #, ret
        testb   $1, %dil        #, ret
        jne     .L211   #,
        testq   %rdi, %rdi      # ret
        je      .L197   #,
        cmpq    $-1, %rdi       #, ret
        je      .L211   #,
        movl    8(%rdi), %esi   # <variable>._count.counter, c
        testl   %esi, %esi      # c
        je      .L212   #,

(notice the obvious infinite loop in the first example, if page->count remains 0)

Signed-off-by: Nick Piggin <npiggin@suse.de>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>