]> git.itanic.dy.fi Git - linux-stable/log
linux-stable
14 years agoLinux 2.6.30.8 v2.6.30.8
Greg Kroah-Hartman [Thu, 24 Sep 2009 15:28:02 +0000 (08:28 -0700)]
Linux 2.6.30.8

14 years agopowerpc/pseries: Fix to handle slb resize across migration
Brian King [Fri, 28 Aug 2009 12:06:29 +0000 (12:06 +0000)]
powerpc/pseries: Fix to handle slb resize across migration

commit 46db2f86a3b2a94e0b33e0b4548fb7b7b6bdff66 upstream.

The SLB can change sizes across a live migration, which was not
being handled, resulting in possible machine crashes during
migration if migrating to a machine which has a smaller max SLB
size than the source machine. Fix this by first reducing the
SLB size to the minimum possible value, which is 32, prior to
migration. Then during the device tree update which occurs after
migration, we make the call to ensure the SLB gets updated. Also
add the slb_size to the lparcfg output so that the migration
tools can check to make sure the kernel has this capability
before allowing migration in scenarios where the SLB size will change.

BenH: Fixed #include <asm/mmu-hash64.h> -> <asm/mmu.h> to avoid
      breaking ppc32 build

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoPCI: Unhide the SMBus on the Compaq Evo D510 USDT
Jean Delvare [Tue, 28 Jul 2009 09:49:19 +0000 (11:49 +0200)]
PCI: Unhide the SMBus on the Compaq Evo D510 USDT

commit 6b5096e4d4496e185cd1ada5d1b8e1d941c805ed upstream.

One more form factor for Compaq Evo D510, which needs the same quirk
as the other form factors. Apparently there's no hardware monitoring
chip on that one, but SPD EEPROMs, so it's still worth unhiding the
SMBus.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Tested-by: Nuzhna Pomoshch <nuzhna_pomoshch@yahoo.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agolibata: fix off-by-one error in ata_tf_read_block()
Tejun Heo [Sun, 16 Aug 2009 12:21:21 +0000 (21:21 +0900)]
libata: fix off-by-one error in ata_tf_read_block()

commit ac8672ea922bde59acf50eaa1eaa1640a6395fd2 upstream.

ata_tf_read_block() has off-by-one error when converting CHS address
to LBA.  The bug isn't very visible because ata_tf_read_block() is
used only when generating sense data for a failed RW command and CHS
addressing isn't used too often these days.

This problem was spotted by Atsushi Nemoto.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agovirtio_blk: don't bounce highmem requests
Christoph Hellwig [Fri, 11 Sep 2009 22:49:19 +0000 (18:49 -0400)]
virtio_blk: don't bounce highmem requests

commit 4eff3cae9c9809720c636e64bc72f212258e0bd5 upstream

virtio_blk: don't bounce highmem requests

By default a block driver bounces highmem requests, but virtio-blk is
perfectly fine with any request that fit into it's 64 bit addressing scheme,
mapped in the kernel virtual space or not.

Besides improving performance on highmem systems this also makes the
reproducible oops in __bounce_end_io go away (but hiding the real cause).

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoV4L: em28xx: set up tda9887_conf in em28xx_card_setup()
Franklin Meng [Sat, 12 Sep 2009 14:31:05 +0000 (10:31 -0400)]
V4L: em28xx: set up tda9887_conf in em28xx_card_setup()

V4L: em28xx: set up tda9887_conf in em28xx_card_setup()

(cherry picked from commit ae3340cbf59ea362c2016eea762456cc0969fd9e)

Added tda9887_conf set up into em28xx_card_setup()

Signed-off-by: Franklin Meng <fmeng2002@yahoo.com>
Signed-off-by: Douglas Schilling Landgraf <dougsland@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agox86, pat: Fix cacheflush address in change_page_attr_set_clr()
Jack Steiner [Thu, 3 Sep 2009 17:56:02 +0000 (12:56 -0500)]
x86, pat: Fix cacheflush address in change_page_attr_set_clr()

commit fa526d0d641b5365676a1fb821ce359e217c9b85 upstream.

Fix address passed to cpa_flush_range() when changing page
attributes from WB to UC. The address (*addr) is
modified by __change_page_attr_set_clr(). The result is that
the pages being flushed start at the _end_ of the changed range
instead of the beginning.

This should be considered for 2.6.30-stable and 2.6.31-stable.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agox86/i386: Make sure stack-protector segment base is cache aligned
Jeremy Fitzhardinge [Thu, 3 Sep 2009 19:27:15 +0000 (12:27 -0700)]
x86/i386: Make sure stack-protector segment base is cache aligned

commit 1ea0d14e480c245683927eecc03a70faf06e80c8 upstream.

The Intel Optimization Reference Guide says:

In Intel Atom microarchitecture, the address generation unit
assumes that the segment base will be 0 by default. Non-zero
segment base will cause load and store operations to experience
a delay.
- If the segment base isn't aligned to a cache line
  boundary, the max throughput of memory operations is
  reduced to one [e]very 9 cycles.
[...]
Assembly/Compiler Coding Rule 15. (H impact, ML generality)
For Intel Atom processors, use segments with base set to 0
whenever possible; avoid non-zero segment base address that is
not aligned to cache line boundary at all cost.

We can't avoid having a non-zero base for the stack-protector
segment, but we can make it cache-aligned.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
LKML-Reference: <4AA01893.6000507@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agox86: Fix x86_model test in es7000_apic_is_cluster()
Roel Kluin [Tue, 25 Aug 2009 13:35:12 +0000 (15:35 +0200)]
x86: Fix x86_model test in es7000_apic_is_cluster()

commit 005155b1f626d2b2d7932e4afdf4fead168c6888 upstream.

For the x86_model to be greater than 6 or less than 12 is
logically always true.

Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosound: oxygen: work around MCE when changing volume
Clemens Ladisch [Mon, 7 Sep 2009 08:18:54 +0000 (10:18 +0200)]
sound: oxygen: work around MCE when changing volume

commit f1bc07af9a9edc5c1d4bdd971f7099316ed2e405 upstream.

When the volume is changed continuously (e.g., when the user drags a
volume slider with the mouse), the driver does lots of I2C writes.
Apparently, the sound chip can get confused when we poll the I2C status
register too much, and fails to complete a read from it.  On the PCI-E
models, the PCI-E/PCI bridge gets upset by this and generates a machine
check exception.

To avoid this, this patch replaces the polling with an unconditional
wait that is guaranteed to be long enough.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Tested-by: Johann Messner <johann.messner at jku.at>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoPCI: apply nv_msi_ht_cap_quirk on resume too
Tejun Heo [Tue, 21 Jul 2009 23:08:43 +0000 (16:08 -0700)]
PCI: apply nv_msi_ht_cap_quirk on resume too

commit 6dab62ee5a3bf4f71b8320c09db2e6022a19f40e upstream.

http://bugzilla.kernel.org/show_bug.cgi?id=12542 reports that with the
quirk not applied on resume, msi stops working after resuming and mcp78s
ahci fails due to IRQ mis-delivery.  Apply it on resume too.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Peer Chen <pchen@nvidia.com>
Cc: Tj <linux@tjworld.net>
Reported-by: Nicolas Derive <kalon33@ubuntu.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agomlx4_core: Allocate and map sufficient ICM memory for EQ context
Roland Dreier [Sun, 6 Sep 2009 03:24:49 +0000 (20:24 -0700)]
mlx4_core: Allocate and map sufficient ICM memory for EQ context

commit fa0681d2129732027355d6b7083dd8932b9b799d upstream.

The current implementation allocates a single host page for EQ context
memory, which was OK when we only allocated a few EQs.  However, since
we now allocate an EQ for each CPU core, this patch removes the
hard-coded limit (which we exceed with 4 KB pages and 128 byte EQ
context entries with 32 CPUs) and uses the same ICM table code as all
other context tables, which ends up simplifying the code quite a bit
while fixing the problem.

This problem was actually hit in practice on a dual-socket Nehalem box
with 16 real hardware threads and sufficiently odd ACPI tables that it
shows on boot

    SMP: Allowing 32 CPUs, 16 hotplug CPUs

so num_possible_cpus() ends up 32, and mlx4 ends up creating 33 MSI-X
interrupts and 33 EQs.  This mlx4 bug means that mlx4 can't even
initialize at all on this quite mainstream system.

Reported-by: Eli Cohen <eli@mellanox.co.il>
Tested-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoASoC: Fix WM835x Out4 capture enumeration
Mark Brown [Mon, 7 Sep 2009 17:09:58 +0000 (18:09 +0100)]
ASoC: Fix WM835x Out4 capture enumeration

commit 87831cb660954356d68cebdb1406f3be09e784e9 upstream.

It's the 8th enum of a zero indexed array. This is why I don't let
new drivers use these arrays of enums...

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoARM: 5691/1: fix cache aliasing issues between kmap() and kmap_atomic() with highmem
Nicolas Pitre [Thu, 3 Sep 2009 20:45:59 +0000 (21:45 +0100)]
ARM: 5691/1: fix cache aliasing issues between kmap() and kmap_atomic() with highmem

commit 7929eb9cf643ae416e5081b2a6fa558d37b9854c upstream.

Let's suppose a highmem page is kmap'd with kmap().  A pkmap entry is
used, the page mapped to it, and the virtual cache is dirtied.  Then
kunmap() is used which does virtually nothing except for decrementing a
usage count.

Then, let's suppose the _same_ page gets mapped using kmap_atomic().
It is therefore mapped onto a fixmap entry instead, which has a
different virtual address unaware of the dirty cache data for that page
sitting in the pkmap mapping.

Fortunately it is easy to know if a pkmap mapping still exists for that
page and use it directly with kmap_atomic(), thanks to kmap_high_get().

And actual testing with a printk in the added code path shows that this
condition is actually met *extremely* frequently.  Seems that we've been
quite lucky that things have worked so well with highmem so far.

Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoALSA: cs46xx - Fix minimum period size
Sophie Hamilton [Tue, 8 Sep 2009 08:58:42 +0000 (10:58 +0200)]
ALSA: cs46xx - Fix minimum period size

commit 6148b130eb84edc76e4fa88da1877b27be6c2f06 upstream.

Fix minimum period size for cs46xx cards. This fixes a problem in the
case where neither a period size nor a buffer size is passed to ALSA;
this is the case in Audacious, OpenAL, and others.

Signed-off-by: Sophie Hamilton <kernel@theblob.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoudf: Use device size when drive reported bogus number of written blocks
Jan Kara [Thu, 18 Jun 2009 10:33:16 +0000 (12:33 +0200)]
udf: Use device size when drive reported bogus number of written blocks

commit 24a5d59f3477bcff4c069ff4d0ca9a3e037d0235 upstream.

Some drives report 0 as the number of written blocks when there are some blocks
recorded. Use device size in such case so that we can automagically mount such
media.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoTPM: Fixup boot probe timeout for tpm_tis driver
Jason Gunthorpe [Wed, 9 Sep 2009 23:22:18 +0000 (17:22 -0600)]
TPM: Fixup boot probe timeout for tpm_tis driver

commit ec57935837a78f9661125b08a5d08b697568e040 upstream.

When probing the device in tpm_tis_init the call request_locality
uses timeout_a, which wasn't being initalized until after
request_locality. This results in request_locality falsely timing
out if the chip is still starting. Move the initialization to before
request_locality.

This probably only matters for embedded cases (ie mine), a BIOS likely
gets the TPM into a state where this code path isn't necessary.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: Rajiv Andrade <srajiv@linux.vnet.ibm.com>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agopowerpc/ps3: Workaround for flash memory I/O error
Geoff Levand [Wed, 9 Sep 2009 13:28:05 +0000 (13:28 +0000)]
powerpc/ps3: Workaround for flash memory I/O error

commit bc00351edd5c1b84d48c3fdca740fedfce4ae6ce upstream.

A workaround for flash memory I/O errors when the PS3 internal
hard disk has not been formatted for OtherOS use.

This error condition mainly effects 'Live CD' users who have not
formatted the PS3's internal hard disk for OtherOS.

Fixes errors similar to these when using the ps3-flash-util
or ps3-boot-game-os programs:

  ps3flash read failed 0x2050000
  os_area_header_read: read error: os_area_header: Input/output error
  main:627: os_area_read_hp error.
  ERROR: can't change boot flag

Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agofix undefined reference to user_shm_unlock
Hugh Dickins [Sat, 12 Sep 2009 11:21:27 +0000 (12:21 +0100)]
fix undefined reference to user_shm_unlock

commit 2195d2818c37bdf263865f1e9effccdd9fc5f9d4 upstream.

My 353d5c30c666580347515da609dd74a2b8e9b828 "mm: fix hugetlb bug due to
user_shm_unlock call" broke the CONFIG_SYSVIPC !CONFIG_MMU build of both
2.6.31 and 2.6.30.6: "undefined reference to `user_shm_unlock'".

gcc didn't understand my comment! so couldn't figure out to optimize
away user_shm_unlock() from the error path in the hugetlb-less case, as
it does elsewhere.  Help it to do so, in a language it understands.

Reported-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agocfg80211: fix looping soft lockup in find_ie()
Bob Copeland [Tue, 1 Sep 2009 22:12:11 +0000 (18:12 -0400)]
cfg80211: fix looping soft lockup in find_ie()

commit fcc6cb0c13555e78c2d47257b6d1b5e59b0c419a upstream.

The find_ie() function uses a size_t for the len parameter, and
directly uses len as a loop variable.  If any received packets
are malformed, it is possible for the decrease of len to overflow,
and since the result is unsigned, the loop will not terminate.
Change it to a signed int so the loop conditional works for
negative values.

This fixes the following soft lockup:

[38573.102007] BUG: soft lockup - CPU#0 stuck for 61s! [phy0:2230]
[38573.102007] Modules linked in: aes_i586 aes_generic fuse af_packet ipt_REJECT xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state iptable_filter ip_tables x_tables acpi_cpufreq binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath dm_mod kvm_intel kvm uinput i915 arc4 ecb drm snd_hda_codec_idt ath5k snd_hda_intel hid_apple mac80211 usbhid appletouch snd_hda_codec snd_pcm ath cfg80211 snd_timer i2c_algo_bit ohci1394 video snd processor ieee1394 rfkill ehci_hcd sg sky2 backlight snd_page_alloc uhci_hcd joydev output ac thermal button battery sr_mod applesmc cdrom input_polldev evdev unix [last unloaded: scsi_wait_scan]
[38573.102007] irq event stamp: 2547724535
[38573.102007] hardirqs last  enabled at (2547724534): [<c1002ffc>] restore_all_notrace+0x0/0x18
[38573.102007] hardirqs last disabled at (2547724535): [<c10038f4>] apic_timer_interrupt+0x28/0x34
[38573.102007] softirqs last  enabled at (92950144): [<c103ab48>] __do_softirq+0x108/0x210
[38573.102007] softirqs last disabled at (92950274): [<c1348e74>] _spin_lock_bh+0x14/0x80
[38573.102007]
[38573.102007] Pid: 2230, comm: phy0 Tainted: G        W  (2.6.31-rc7-wl #8) MacBook1,1
[38573.102007] EIP: 0060:[<f8ea2d50>] EFLAGS: 00010292 CPU: 0
[38573.102007] EIP is at cmp_ies+0x30/0x180 [cfg80211]
[38573.102007] EAX: 00000082 EBX: 00000000 ECX: ffffffc1 EDX: d8efd014
[38573.102007] ESI: ffffff7c EDI: 0000004d EBP: eee2dc50 ESP: eee2dc3c
[38573.102007]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[38573.102007] CR0: 8005003b CR2: d8efd014 CR3: 01694000 CR4: 000026d0
[38573.102007] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[38573.102007] DR6: ffff0ff0 DR7: 00000400
[38573.102007] Call Trace:
[38573.102007]  [<f8ea2f8d>] cmp_bss+0xed/0x100 [cfg80211]
[38573.102007]  [<f8ea33e4>] cfg80211_bss_update+0x84/0x410 [cfg80211]
[38573.102007]  [<f8ea3884>] cfg80211_inform_bss_frame+0x114/0x180 [cfg80211]
[38573.102007]  [<f97255ff>] ieee80211_bss_info_update+0x4f/0x180 [mac80211]
[38573.102007]  [<f972b118>] ieee80211_rx_bss_info+0x88/0xf0 [mac80211]
[38573.102007]  [<f9739297>] ? ieee802_11_parse_elems+0x27/0x30 [mac80211]
[38573.102007]  [<f972b224>] ieee80211_rx_mgmt_probe_resp+0xa4/0x1c0 [mac80211]
[38573.102007]  [<f972bc59>] ieee80211_sta_rx_queued_mgmt+0x919/0xc50 [mac80211]
[38573.102007]  [<c1009707>] ? sched_clock+0x27/0xa0
[38573.102007]  [<c1009707>] ? sched_clock+0x27/0xa0
[38573.102007]  [<c105ffd0>] ? mark_held_locks+0x60/0x80
[38573.102007]  [<c1348be5>] ? _spin_unlock_irqrestore+0x55/0x70
[38573.102007]  [<c134baa5>] ? sub_preempt_count+0x85/0xc0
[38573.102007]  [<c1348bce>] ? _spin_unlock_irqrestore+0x3e/0x70
[38573.102007]  [<c12c1c0f>] ? skb_dequeue+0x4f/0x70
[38573.102007]  [<f972c021>] ieee80211_sta_work+0x91/0xb80 [mac80211]
[38573.102007]  [<c1009707>] ? sched_clock+0x27/0xa0
[38573.102007]  [<c134baa5>] ? sub_preempt_count+0x85/0xc0
[38573.102007]  [<c10479af>] worker_thread+0x18f/0x320
[38573.102007]  [<c104794e>] ? worker_thread+0x12e/0x320
[38573.102007]  [<c1348be5>] ? _spin_unlock_irqrestore+0x55/0x70
[38573.102007]  [<f972bf90>] ? ieee80211_sta_work+0x0/0xb80 [mac80211]
[38573.102007]  [<c104cbb0>] ? autoremove_wake_function+0x0/0x50
[38573.102007]  [<c1047820>] ? worker_thread+0x0/0x320
[38573.102007]  [<c104c854>] kthread+0x84/0x90
[38573.102007]  [<c104c7d0>] ? kthread+0x0/0x90
[38573.102007]  [<c1003ab7>] kernel_thread_helper+0x7/0x10

Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agobinfmt_elf: fix PT_INTERP bss handling
Roland McGrath [Wed, 9 Sep 2009 02:49:40 +0000 (19:49 -0700)]
binfmt_elf: fix PT_INTERP bss handling

commit 9f0ab4a3f0fdb1ff404d150618ace2fa069bb2e1 upstream.

In fs/binfmt_elf.c, load_elf_interp() calls padzero() for .bss even if
the PT_LOAD has no PROT_WRITE and no .bss.  This generates EFAULT.

Here is a small test case.  (Yes, there are other, useful PT_INTERP
which have only .text and no .data/.bss.)

----- ptinterp.S
_start: .globl _start
 nop
 int3
-----
$ gcc -m32 -nostartfiles -nostdlib -o ptinterp ptinterp.S
$ gcc -m32 -Wl,--dynamic-linker=ptinterp -o hello hello.c
$ ./hello
Segmentation fault  # during execve() itself

After applying the patch:
$ ./hello
Trace trap  # user-mode execution after execve() finishes

If the ELF headers are actually self-inconsistent, then dying is fine.
But having no PROT_WRITE segment is perfectly normal and correct if
there is no segment with p_memsz > p_filesz (i.e. bss).  John Reiser
suggested checking for PROT_WRITE in the bss logic.  I think it makes
most sense to simply apply the bss logic only when there is bss.

This patch looks less trivial than it is due to some reindentation.
It just moves the "if (last_bss > elf_bss) {" test up to include the
partial-page bss logic as well as the more-pages bss logic.

Reported-by: John Reiser <jreiser@bitwagon.com>
Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoath5k: write PCU registers on initial reset
Bob Copeland [Sun, 5 Jul 2009 01:03:13 +0000 (21:03 -0400)]
ath5k: write PCU registers on initial reset

commit 3355443ad7601991affa5992b0d53870335af765 upstream.

"Ath5k: unify resets"
introduced a regression into 2.6.28 where the PCU registers are never
initialized, due to ath5k_reset() always passing true for change_channel.
We subsequently program a lot of these registers but several may start
in an unknown state.

Reported-by: Forrest Zhang <forrest@hifulltech.com>
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoagp/intel: remove restore in resume
Zhenyu Wang [Mon, 14 Sep 2009 02:47:06 +0000 (10:47 +0800)]
agp/intel: remove restore in resume

commit 121264827656f5f06328b17983c796af17dc5949 upstream.

As early pci resume has already restored config for host
bridge and graphics device, don't need to restore it again,
This removes an original order hack for graphics device restore.

This fixed the resume hang issue found by Alan Stern on 845G,
caused by extra config restore on graphics device.

Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@linux.ie>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosg: fix oops in the error path in sg_build_indirect()
Michal Schmidt [Thu, 3 Sep 2009 12:27:08 +0000 (14:27 +0200)]
sg: fix oops in the error path in sg_build_indirect()

commit e71044ee2efa4792e21d243b03d49006db66aec9 upstream.

When the allocation fails in sg_build_indirect(), an oops happens in
the error path. It's caused by an obvious typo.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reported-by: Bob Tracy <rct@gherkin.frus.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoInput: joydev - decouple axis and button map ioctls from input constants
Stephen Kitt [Wed, 12 Aug 2009 08:12:08 +0000 (01:12 -0700)]
Input: joydev - decouple axis and button map ioctls from input constants

commit ec8b4b7085605e801a7740a2c3c33256aebe249c upstream.

The KEY_MAX change in 2.6.28 changed the values of the JSIOCSBTNMAP and
JSIOCGBTNMAP constants; software compiled with the old values no longer
works with kernels following 2.6.28, because the ioctl switch statement
no longer matches the values given by the software. This patch handles
these ioctls independently of the length of data specified, and applies the
same treatment to JSIOCSAXMAP and JSIOCGAXMAP which currently depend on
ABS_MAX.

Signed-off-by: Stephen Kitt <steve@sk2.org>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoLinux 2.6.30.7 v2.6.30.7
Greg Kroah-Hartman [Tue, 15 Sep 2009 17:46:05 +0000 (10:46 -0700)]
Linux 2.6.30.7

14 years agodm snapshot: fix on disk chunk size validation
Mikulas Patocka [Fri, 4 Sep 2009 19:40:43 +0000 (20:40 +0100)]
dm snapshot: fix on disk chunk size validation

commit ae0b7448e91353ea5f821601a055aca6b58042cd upstream.

Fix some problems seen in the chunk size processing when activating a
pre-existing snapshot.

For a new snapshot, the chunk size can either be supplied by the creator
or a default value can be used.  For an existing snapshot, the
chunk size in the snapshot header on disk should always be used.

If someone attempts to load an existing snapshot and has the 'default
chunk size' option set, the kernel uses its default value even when it
is incorrect for the snapshot being loaded.  This patch ensures the
correct on-disk value is always used.

Secondly, when the code does use the chunk size stored on the disk it is
prudent to revalidate it, so the code can exit cleanly if it got
corrupted as happened in
https://bugzilla.redhat.com/show_bug.cgi?id=461506 .

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agodm exception store: split set_chunk_size
Mikulas Patocka [Fri, 4 Sep 2009 19:40:41 +0000 (20:40 +0100)]
dm exception store: split set_chunk_size

commit 2defcc3fb4661e7351cb2ac48d843efc4c64db13 upstream.

Break the function set_chunk_size to two functions in preparation for
the fix in the following patch.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agodm snapshot: fix header corruption race on invalidation
Mikulas Patocka [Fri, 4 Sep 2009 19:40:39 +0000 (20:40 +0100)]
dm snapshot: fix header corruption race on invalidation

commit 61578dcd3fafe6babd72e8db32110cc0b630a432 upstream.

If a persistent snapshot fills up, a race can corrupt the on-disk header
which causes a crash on any future attempt to activate the snapshot
(typically while booting).  This patch fixes the race.

When the snapshot overflows, __invalidate_snapshot is called, which calls
snapshot store method drop_snapshot. It goes to persistent_drop_snapshot that
calls write_header. write_header constructs the new header in the "area"
location.

Concurrently, an existing kcopyd job may finish, call copy_callback
and commit_exception method, that goes to persistent_commit_exception.
persistent_commit_exception doesn't do locking, relying on the fact that
callbacks are single-threaded, but it can race with snapshot invalidation and
overwrite the header that is just being written while the snapshot is being
invalidated.

The result of this race is a corrupted header being written that can
lead to a crash on further reactivation (if chunk_size is zero in the
corrupted header).

The fix is to use separate memory areas for each.

See the bug: https://bugzilla.redhat.com/show_bug.cgi?id=461506

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agodm snapshot: refactor zero_disk_area to use chunk_io
Mikulas Patocka [Fri, 4 Sep 2009 19:40:37 +0000 (20:40 +0100)]
dm snapshot: refactor zero_disk_area to use chunk_io

commit 02d2fd31defce6ff77146ad0fef4f19006055d86 upstream.

Refactor chunk_io to prepare for the fix in the following patch.

Pass an area pointer to chunk_io and simplify zero_disk_area to use
chunk_io.  No functional change.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agodm raid1: do not allow log_failure variable to unset after being set
Jonathan Brassow [Fri, 4 Sep 2009 19:40:32 +0000 (20:40 +0100)]
dm raid1: do not allow log_failure variable to unset after being set

commit d2b698644c97cb033261536a4f2010924a00eac9 upstream.

This patch fixes a bug which was triggering a case where the primary leg
could not be changed on failure even when the mirror was in-sync.

The case involves the failure of the primary device along with
the transient failure of the log device.  The problem is that
bios can be put on the 'failures' list (due to log failure)
before 'fail_mirror' is called due to the primary device failure.
Normally, this is fine, but if the log device failure is transient,
a subsequent iteration of the work thread, 'do_mirror', will
reset 'log_failure'.  The 'do_failures' function then resets
the 'in_sync' variable when processing bios on the failures list.
The 'in_sync' variable is what is used to determine if the
primary device can be switched in the event of a failure.  Since
this has been reset, the primary device is incorrectly assumed
to be not switchable.

The case has been seen in the cluster mirror context, where one
machine realizes the log device is dead before the other machines.
As the responsibilities of the server migrate from one node to
another (because the mirror is being reconfigured due to the failure),
the new server may think for a moment that the log device is fine -
thus resetting the 'log_failure' variable.

In any case, it is inappropiate for us to reset the 'log_failure'
variable.  The above bug simply illustrates that it can actually
hurt us.

Signed-off-by: Jonathan Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosound: oxygen: fix MCLK rate for 192 kHz playback
Clemens Ladisch [Tue, 1 Sep 2009 06:23:58 +0000 (08:23 +0200)]
sound: oxygen: fix MCLK rate for 192 kHz playback

commit b91ab72b830e1494c2c7f8de05ccb2ab2c9cfb26 upstream.

Do not forget to program the MCLK ratio for the I2S output.
Otherwise, the master clock frequency can be too high for
the DACs at sample frequencies above 96 kHz.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosound: oxygen: handle cards with missing EEPROM
Clemens Ladisch [Wed, 2 Sep 2009 16:25:39 +0000 (18:25 +0200)]
sound: oxygen: handle cards with missing EEPROM

commit 92653453c3015c083b9fe0ad48261c6b2267d482 upstream.

The card model detection code introduced in 2.6.30 that tries to work
around partially broken EEPROM contents by reading the EEPROM directly
does not handle cards where the EEPROM has been omitted.  In this case,
we have to use the default ID to allow the driver to load.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Reported-and-tested-by: Ozan Çağlayan <ozan@pardus.org.tr>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoSCSI: sd: fix bug in SCSI async probing
James Bottomley [Tue, 26 May 2009 20:35:48 +0000 (20:35 +0000)]
SCSI: sd: fix bug in SCSI async probing

commit 601e7638254c118fca135af9b1a9f35061420f62 upstream.

The async split up of probing in sd.c created a potential failure case where
something goes wrong with device_add(), but which we don't recover properly.
Since, in general, asynchronous error handling is hard, move the device_add()
into the asynchronous path (it should be fast) and make sure all the deferred
processing cannot fail.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoPCI SR-IOV: correct broken resource alignment calculations
Chris Wright [Fri, 28 Aug 2009 20:00:06 +0000 (13:00 -0700)]
PCI SR-IOV: correct broken resource alignment calculations

commit 6faf17f6f1ffc586d16efc2f9fa2083a7785ee74 upstream.

An SR-IOV capable device includes an SR-IOV PCIe capability which
describes the Virtual Function (VF) BAR requirements.  A typical SR-IOV
device can support multiple VFs whose BARs must be in a contiguous region,
effectively an array of VF BARs.  The BAR reports the size requirement
for a single VF.  We calculate the full range needed by simply multiplying
the VF BAR size with the number of possible VFs and create a resource
spanning the full range.

This all seems sane enough except it artificially inflates the alignment
requirement for the VF BAR.  The VF BAR need only be aligned to the size
of a single BAR not the contiguous range of VF BARs.  This can cause us
to fail to allocate resources for the BAR despite the fact that we
actually have enough space.

This patch adds a thin PCI specific layer over the generic
resource_alignment() function which is aware of the special nature of
VF BARs and does sorting and allocation based on the smaller alignment
requirement.

I recognize that while resource_alignment is generic, it's basically a
PCI helper.  An alternative to this patch is to add PCI VF BAR specific
information to struct resource.  I opted for the extra layer rather than
adding such PCI specific information to struct resource.  This does
have the slight downside that we don't cache the BAR size and re-read
for each alignment query (happens a small handful of times during boot
for each VF BAR).

Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agonilfs2: fix preempt count underflow in nilfs_btnode_prepare_change_key
Ryusuke Konishi [Sat, 29 Aug 2009 19:21:41 +0000 (04:21 +0900)]
nilfs2: fix preempt count underflow in nilfs_btnode_prepare_change_key

commit b1f1b8ce0a1d71cbc72f7540134d52b79bd8f5ac upstream.

This will fix the following preempt count underflow reported from
users with the title "[NILFS users] segctord problem" (Message-ID:
<949415.6494.qm@web58808.mail.re1.yahoo.com> and Message-ID:
<debc30fc0908270825v747c1734xa59126623cfd5b05@mail.gmail.com>):

 WARNING: at kernel/sched.c:4890 sub_preempt_count+0x95/0xa0()
 Hardware name: HP Compaq 6530b (KR980UT#ABC)
 Modules linked in: bridge stp llc bnep rfcomm l2cap xfs exportfs nilfs2 cowloop loop vboxnetadp vboxnetflt vboxdrv btusb bluetooth uvcvideo videodev v4l1_compat v4l2_compat_ioctl32 arc4 snd_hda_codec_analog ecb iwlagn iwlcore rfkill lib80211 mac80211 snd_hda_intel snd_hda_codec ehci_hcd uhci_hcd usbcore snd_hwdep snd_pcm tg3 cfg80211 psmouse snd_timer joydev libphy ohci1394 snd_page_alloc hp_accel lis3lv02d ieee1394 led_class i915 drm i2c_algo_bit video backlight output i2c_core dm_crypt dm_mod
 Pid: 4197, comm: segctord Not tainted 2.6.30-gentoo-r4-64 #7
 Call Trace:
  [<ffffffff8023fa05>] ? sub_preempt_count+0x95/0xa0
  [<ffffffff802470f8>] warn_slowpath_common+0x78/0xd0
  [<ffffffff8024715f>] warn_slowpath_null+0xf/0x20
  [<ffffffff8023fa05>] sub_preempt_count+0x95/0xa0
  [<ffffffffa04ce4db>] nilfs_btnode_prepare_change_key+0x11b/0x190 [nilfs2]
  [<ffffffffa04d01ad>] nilfs_btree_assign_p+0x19d/0x1e0 [nilfs2]
  [<ffffffffa04d10ad>] nilfs_btree_assign+0xbd/0x130 [nilfs2]
  [<ffffffffa04cead7>] nilfs_bmap_assign+0x47/0x70 [nilfs2]
  [<ffffffffa04d9bc6>] nilfs_segctor_do_construct+0x956/0x20f0 [nilfs2]
  [<ffffffff805ac8e2>] ? _spin_unlock_irqrestore+0x12/0x40
  [<ffffffff803c06e0>] ? __up_write+0xe0/0x150
  [<ffffffff80262959>] ? up_write+0x9/0x10
  [<ffffffffa04ce9f3>] ? nilfs_bmap_test_and_clear_dirty+0x43/0x60 [nilfs2]
  [<ffffffffa04cd627>] ? nilfs_mdt_fetch_dirty+0x27/0x60 [nilfs2]
  [<ffffffffa04db5fc>] nilfs_segctor_construct+0x8c/0xd0 [nilfs2]
  [<ffffffffa04dc3dc>] nilfs_segctor_thread+0x15c/0x3a0 [nilfs2]
  [<ffffffffa04dbe20>] ? nilfs_construction_timeout+0x0/0x10 [nilfs2]
  [<ffffffff80252633>] ? add_timer+0x13/0x20
  [<ffffffff802370da>] ? __wake_up_common+0x5a/0x90
  [<ffffffff8025e960>] ? autoremove_wake_function+0x0/0x40
  [<ffffffffa04dc280>] ? nilfs_segctor_thread+0x0/0x3a0 [nilfs2]
  [<ffffffffa04dc280>] ? nilfs_segctor_thread+0x0/0x3a0 [nilfs2]
  [<ffffffff8025e556>] kthread+0x56/0x90
  [<ffffffff8020cdea>] child_rip+0xa/0x20
  [<ffffffff8025e500>] ? kthread+0x0/0x90
  [<ffffffff8020cde0>] ? child_rip+0x0/0x20

This problem was caused due to a missing radix_tree_preload() call in
the retry path of nilfs_btnode_prepare_change_key() function.

Reported-by: Eric A <eric225125@yahoo.com>
Reported-by: Jerome Poulin <jeromepoulin@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Jerome Poulin <jeromepoulin@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoslub: Fix kmem_cache_destroy() with SLAB_DESTROY_BY_RCU
Eric Dumazet [Thu, 3 Sep 2009 19:38:59 +0000 (22:38 +0300)]
slub: Fix kmem_cache_destroy() with SLAB_DESTROY_BY_RCU

commit d76b1590e06a63a3d8697168cd0aabf1c4b3cb3a upstream.

kmem_cache_destroy() should call rcu_barrier() *after* kmem_cache_close() and
*before* sysfs_slab_remove() or risk rcu_free_slab() being called after
kmem_cache is deleted (kfreed).

rmmod nf_conntrack can crash the machine because it has to kmem_cache_destroy()
a SLAB_DESTROY_BY_RCU enabled cache.

Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoJFFS2: add missing verify buffer allocation/deallocation
Massimo Cirillo [Thu, 27 Aug 2009 08:44:09 +0000 (10:44 +0200)]
JFFS2: add missing verify buffer allocation/deallocation

commit bc8cec0dff072f1a45ce7f6b2c5234bb3411ac51 upstream.

The function jffs2_nor_wbuf_flash_setup() doesn't allocate the verify buffer
if CONFIG_JFFS2_FS_WBUF_VERIFY is defined, so causing a kernel panic when
that macro is enabled and the verify function is called. Similarly the
jffs2_nor_wbuf_flash_cleanup() must free the buffer if
CONFIG_JFFS2_FS_WBUF_VERIFY is enabled.
The following patch fixes the problem.
The following patch applies to 2.6.30 kernel.

Signed-off-by: Massimo Cirillo <maxcir@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosparc: sys32.S incorrect compat-layer splice() system call
Mathieu Desnoyers [Wed, 19 Aug 2009 03:16:55 +0000 (20:16 -0700)]
sparc: sys32.S incorrect compat-layer splice() system call

[ Upstream commit e2c6cbd9ace61039d3de39e717195e38f1492aee ]

I think arch/sparc/kernel/sys32.S has an incorrect splice definition:

SIGN2(sys32_splice, sys_splice, %o0, %o1)

The splice() prototype looks like :

       long splice(int fd_in, loff_t *off_in, int fd_out,
                   loff_t *off_out, size_t len, unsigned int flags);

So I think we should have :

SIGN2(sys32_splice, sys_splice, %o0, %o2)

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosparc64: Fix bootup with mcount in some configs.
David S. Miller [Fri, 4 Sep 2009 10:38:54 +0000 (03:38 -0700)]
sparc64: Fix bootup with mcount in some configs.

[ Upstream commit bd4352cadfacb9084c97c853b025fac010266c26 ]

Functions invoked early when booting up a cpu can't use
tracing because mcount requires a valid 'current_thread_info()'
and TLB mappings to be setup.

The code path of sun4v_register_mondo_queues --> register_one_mondo
is one such case.  sun4v_register_mondo_queues already has the
necessary 'notrace' annotation, but register_one_mondo does not.

Normally register_one_mondo is inlined so the bug doesn't trigger,
but with some config/compiler combinations, it won't be so we
must properly mark it notrace.

While we're here, add 'notrace' annoations to prom_printf and
prom_halt so that early error handling won't have the same problem.

Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Reported-by: Leif Sawyer <lsawyer@gci.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosparc64: Validate linear D-TLB misses.
David S. Miller [Tue, 25 Aug 2009 23:47:46 +0000 (16:47 -0700)]
sparc64: Validate linear D-TLB misses.

[ Upstream commit d8ed1d43e17898761c7221014a15a4c7501d2ff3 ]

When page alloc debugging is not enabled, we essentially accept any
virtual address for linear kernel TLB misses.  But with kgdb, kernel
address probing, and other facilities we can try to access arbitrary
crap.

So, make sure the address we miss on will translate to physical memory
that actually exists.

In order to make this work we have to embed the valid address bitmap
into the kernel image.  And in order to make that less expensive we
make an adjustment, in that the max physical memory address is
decreased to "1 << 41", even on the chips that support a 42-bit
physical address space.  We can do this because bit 41 indicates
"I/O space" and thus covers non-memory ranges.

The result of this is that:

1) kpte_linear_bitmap shrinks from 2K to 1K in size

2) we need 64K more for the valid address bitmap

We can't let the valid address bitmap be dynamically allocated
once we start using it to validate TLB misses, otherwise we have
crazy issues to deal with wrt. recursive TLB misses and such.

If we're in a TLB miss it could be the deepest trap level that's legal
inside of the cpu.  So if we TLB miss referencing the bitmap, the cpu
will be out of trap levels and enter RED state.

To guard against out-of-range accesses to the bitmap, we have to check
to make sure no bits in the physical address above bit 40 are set.  We
could export and use last_valid_pfn for this check, but that's just an
unnecessary extra memory reference.

On the plus side of all this, since we load all of these translations
into the special 4MB mapping TSB, and we check the TSB first for TLB
misses, there should be absolutely no real cost for these new checks
in the TLB miss path.

Reported-by: heyongli@gmail.com
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agosparc64: Kill spurious NMI watchdog triggers by increasing limit to 30 seconds.
David S. Miller [Thu, 3 Sep 2009 09:35:20 +0000 (02:35 -0700)]
sparc64: Kill spurious NMI watchdog triggers by increasing limit to 30 seconds.

[ Upstream commit e6617c6ec28a17cf2f90262b835ec05b9b861400 ]

This is a compromise and a temporary workaround for bootup NMI
watchdog triggers some people see with qla2xxx devices present.

This happens when, for example:

CPU 0 is in the driver init and looping submitting mailbox commands to
load the firmware, then waiting for completion.

CPU 1 is receiving the device interrupts.  CPU 1 is where the NMI
watchdog triggers.

CPU 0 is submitting mailbox commands fast enough that by the time CPU
1 returns from the device interrupt handler, a new one is pending.
This sequence runs for more than 5 seconds.

The problematic case is CPU 1's timer interrupt running when the
barrage of device interrupts begin.  Then we have:

timer interrupt
return for softirq checking
pending, thus enable interrupts

 qla2xxx interrupt
 return
 qla2xxx interrupt
 return
 ... 5+ seconds pass
 final qla2xxx interrupt for fw load
 return

run timer softirq
return

At some point in the multi-second qla2xxx interrupt storm we trigger
the NMI watchdog on CPU 1 from the NMI interrupt handler.

The timer softirq, once we get back to running it, is smart enough to
run the timer work enough times to make up for the missed timer
interrupts.

However, the NMI watchdogs (both x86 and sparc) use the timer
interrupt count to notice the cpu is wedged.  But in the above
scenerio we'll receive only one such timer interrupt even if we last
all the way back to running the timer softirq.

The default watchdog trigger point is only 5 seconds, which is pretty
low (the softwatchdog triggers at 60 seconds).  So increase it to 30
seconds for now.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agonet: net_assign_generic() fix
Eric Dumazet [Tue, 28 Jul 2009 02:36:15 +0000 (02:36 +0000)]
net: net_assign_generic() fix

[ Upstream commit 144586301f6af5ae5943a002f030d8c626fa4fdd ]

memcpy() should take into account size of pointers,
not only number of pointers to copy.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agopppol2tp: calls unregister_pernet_gen_device() at unload time
Eric Dumazet [Tue, 28 Jul 2009 03:47:39 +0000 (03:47 +0000)]
pppol2tp: calls unregister_pernet_gen_device() at unload time

[ Upstream commit 446e72f30eca76d6f9a1a54adf84d2c6ba2831f8 ]

Failure to call unregister_pernet_gen_device() can exhaust memory
if module is loaded/unloaded many times.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoppp: fix lost fragments in ppp_mp_explode() (resubmit)
Ben McKeegan [Tue, 28 Jul 2009 07:43:57 +0000 (07:43 +0000)]
ppp: fix lost fragments in ppp_mp_explode() (resubmit)

[ Upstream commit a53a8b56827cc429c6d9f861ad558beeb5f6103f ]

This patch fixes the corner cases where the sum of MTU of the free
channels (adjusted for fragmentation overheads) is less than the MTU
of PPP link.  There are at least 3 situations where this case might
arise:

- some of the channels are busy

- the multilink session is running in a degraded state (i.e. with less
than its full complement of active channels)

- by design, where multilink protocol is being used to artificially
increase the effective link MTU of a single link.

Without this patch, at most 1 fragment is ever sent per free channel
for a given PPP frame and any remaining part of the PPP frame that
does not fit into those fragments is silently discarded.

This patch restores the original behaviour which was broken by commit
9c705260feea6ae329bc6b6d5f6d2ef0227eda0a 'ppp:ppp_mp_explode()
redesign'.  Once all 'free' channels have been given a fragment, an
additional fragment is queued to each available channel in turn, as many
times as necessary, until the entire PPP frame has been consumed.

Signed-off-by: Ben McKeegan <ben@netservers.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agogre: Fix MTU calculation for bound GRE tunnels
Tom Goff [Fri, 14 Aug 2009 23:33:56 +0000 (16:33 -0700)]
gre: Fix MTU calculation for bound GRE tunnels

[ Upstream commit 8cdb045632e5ee22854538619ac6f150eb0a4894 ]

The GRE header length should be subtracted when the tunnel MTU is
calculated.  This just corrects for the associativity change
introduced by commit 42aa916265d740d66ac1f17290366e9494c884c2
("gre: Move MTU setting out of ipgre_tunnel_bind_dev").

Signed-off-by: Tom Goff <thomas.goff@boeing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoE100: fix interaction with swiotlb on X86.
Krzysztof Hałasa [Mon, 24 Aug 2009 02:02:13 +0000 (19:02 -0700)]
E100: fix interaction with swiotlb on X86.

[ Upstream commit 6ff9c2e7fa8ca63a575792534b63c5092099c286 ]

E100 places it's RX packet descriptors inside skb->data and uses them
with bidirectional streaming DMA mapping. Data in descriptors is
accessed simultaneously by the chip (writing status and size when
a packet is received) and CPU (reading to check if the packet was
received). This isn't a valid usage of PCI DMA API, which requires use
of the coherent (consistent) memory for such purpose. Unfortunately e100
chips working in "simplified" RX mode have to store received data
directly after the descriptor. Fixing the driver to conform to the API
would require using unsupported "flexible" RX mode or receiving data
into a coherent memory and using CPU to copy it to network buffers.

This patch, while not yet making the driver conform to the PCI DMA API,
allows it to work correctly on X86 with swiotlb (while not breaking
other architectures).

Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agodccp: missing destroy of percpu counter variable while unload module
Wei Yongjun [Tue, 4 Aug 2009 21:44:39 +0000 (21:44 +0000)]
dccp: missing destroy of percpu counter variable while unload module

[ Upstream commit 476181cb05c6a3aea3ef42309388e255c934a06f ]

percpu counter dccp_orphan_count is init in dccp_init() by
percpu_counter_init() while dccp module is loaded, but the
destroy of it is missing while dccp module is unloaded. We
can get the kernel WARNING about this. Reproduct by the
following commands:

  $ modprobe dccp
  $ rmmod dccp
  $ modprobe dccp

WARNING: at lib/list_debug.c:26 __list_add+0x27/0x5c()
Hardware name: VMware Virtual Platform
list_add corruption. next->prev should be prev (c080c0c4), but was (null). (next
=ca7188cc).
Modules linked in: dccp(+) nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc
Pid: 1956, comm: modprobe Not tainted 2.6.31-rc5 #55
Call Trace:
 [<c042f8fa>] warn_slowpath_common+0x6a/0x81
 [<c053a6cb>] ? __list_add+0x27/0x5c
 [<c042f94f>] warn_slowpath_fmt+0x29/0x2c
 [<c053a6cb>] __list_add+0x27/0x5c
 [<c053c9b3>] __percpu_counter_init+0x4d/0x5d
 [<ca9c90c7>] dccp_init+0x19/0x2ed [dccp]
 [<c0401141>] do_one_initcall+0x4f/0x111
 [<ca9c90ae>] ? dccp_init+0x0/0x2ed [dccp]
 [<c06971b5>] ? notifier_call_chain+0x26/0x48
 [<c0444943>] ? __blocking_notifier_call_chain+0x45/0x51
 [<c04516f7>] sys_init_module+0xac/0x1bd
 [<c04028e4>] sysenter_do_call+0x12/0x22

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoLinux 2.6.30.6 v2.6.30.6
Greg Kroah-Hartman [Wed, 9 Sep 2009 03:39:34 +0000 (20:39 -0700)]
Linux 2.6.30.6

14 years agoocfs2: ocfs2_write_begin_nolock() should handle len=0
Sunil Mushran [Fri, 4 Sep 2009 18:12:01 +0000 (11:12 -0700)]
ocfs2: ocfs2_write_begin_nolock() should handle len=0

commit 8379e7c46cc48f51197dd663fc6676f47f2a1e71 upstream.

Bug introduced by mainline commit e7432675f8ca868a4af365759a8d4c3779a3d922
The bug causes ocfs2_write_begin_nolock() to oops when len=0.

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoNET: llc, zero sockaddr_llc struct
Jiri Slaby [Mon, 24 Aug 2009 05:55:51 +0000 (22:55 -0700)]
NET: llc, zero sockaddr_llc struct

commit 28e9fc592cb8c7a43e4d3147b38be6032a0e81bc upstream.

sllc_arphrd member of sockaddr_llc might not be changed. Zero sllc
before copying to the above layer's structure.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agorose: Fix rose_getname() leak
Eric Dumazet [Thu, 6 Aug 2009 03:34:06 +0000 (03:34 +0000)]
rose: Fix rose_getname() leak

commit 17ac2e9c58b69a1e25460a568eae1b0dc0188c25 upstream.

rose_getname() can leak kernel memory to user.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoeconet: Fix econet_getname() leak
Eric Dumazet [Thu, 6 Aug 2009 03:48:36 +0000 (03:48 +0000)]
econet: Fix econet_getname() leak

commit 80922bbb12a105f858a8f0abb879cb4302d0ecaa upstream.

econet_getname() can leak kernel memory to user.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agonetrom: Fix nr_getname() leak
Eric Dumazet [Thu, 6 Aug 2009 03:31:07 +0000 (03:31 +0000)]
netrom: Fix nr_getname() leak

commit f6b97b29513950bfbf621a83d85b6f86b39ec8db upstream.

nr_getname() can leak kernel memory to user.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoappletalk: fix atalk_getname() leak
Eric Dumazet [Thu, 6 Aug 2009 02:27:43 +0000 (02:27 +0000)]
appletalk: fix atalk_getname() leak

commit 3d392475c873c10c10d6d96b94d092a34ebd4791 upstream.

atalk_getname() can leak 8 bytes of kernel memory to user

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoirda: Fix irda_getname() leak
Eric Dumazet [Thu, 6 Aug 2009 03:55:04 +0000 (03:55 +0000)]
irda: Fix irda_getname() leak

commit 09384dfc76e526c3993c09c42e016372dc9dd22c upstream.

irda_getname() can leak kernel memory to user.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agocan: Fix raw_getname() leak
Eric Dumazet [Thu, 6 Aug 2009 20:27:04 +0000 (20:27 +0000)]
can: Fix raw_getname() leak

commit e84b90ae5eb3c112d1f208964df1d8156a538289 upstream.

raw_getname() can leak 10 bytes of kernel memory to user

(two bytes hole between can_family and can_ifindex,
8 bytes at the end of sockaddr_can structure)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoxenfb: connect to backend before registering fb
Michal Schmidt [Thu, 27 Aug 2009 19:22:43 +0000 (12:22 -0700)]
xenfb: connect to backend before registering fb

commit 0a80fb10239b04c45e5e80aad8d4b2ca5ac407b2 upstream.

As soon as the framebuffer is registered, our methods may be called by the
kernel. This leads to a crash as xenfb_refresh() gets called before we have
the irq.

Connect to the backend before registering our framebuffer with the kernel.

[ Fixes bug http://bugzilla.kernel.org/show_bug.cgi?id=14059 ]

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoar9170: fix read & write outside array bounds
Dan Carpenter [Sun, 9 Aug 2009 12:24:09 +0000 (14:24 +0200)]
ar9170: fix read & write outside array bounds

commit e9d126cdfa60b575f1b5b02024c4faee27dccf07 upstream.

Backport done by Christian Lamparter <chunkeey@googlemail.com>

queue == __AR9170_NUM_TXQ would cause a bug on the next line.

found by Smatch ( http://repo.or.cz/w/smatch.git ).

Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Christian Lamparter <chunkeey@web.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoipv4: make ip_append_data() handle NULL routing table
Julien TINNES [Thu, 27 Aug 2009 13:26:58 +0000 (15:26 +0200)]
ipv4: make ip_append_data() handle NULL routing table

commit 788d908f2879a17e5f80924f3da2e23f1034482d upstream.

Add a check in ip_append_data() for NULL *rtp to prevent future bugs in
callers from being exploitable.

Signed-off-by: Julien Tinnes <julien@cr0.org>
Signed-off-by: Tavis Ormandy <taviso@sdf.lonestar.org>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agopowerpc/ps3: Add missing check for PS3 to rtc-ps3 platform device registration
Geert Uytterhoeven [Sun, 23 Aug 2009 22:54:32 +0000 (22:54 +0000)]
powerpc/ps3: Add missing check for PS3 to rtc-ps3 platform device registration

commit 7b6a09f3d6aedeaac923824af2a5df30300b56e9 upstream.

On non-PS3, we get:

| kernel BUG at drivers/rtc/rtc-ps3.c:36!

because the rtc-ps3 platform device is registered unconditionally in a kernel
with builtin support for PS3.

Reported-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Acked-by: Geoff Levand <geoffrey.levand@am.sony.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: EHCI: fix two new bugs related to Clear-TT-Buffer
Alan Stern [Fri, 31 Jul 2009 14:40:22 +0000 (10:40 -0400)]
USB: EHCI: fix two new bugs related to Clear-TT-Buffer

commit 7a0f0d951273eee889c2441846842348ebc00a2a upstream.

This patch (as1273) fixes two(!) bugs introduced by the new
Clear-TT-Buffer implementation in ehci-hcd.

It is now possible for an idle QH to have some URBs on its
queue -- this will happen if a Clear-TT-Buffer is pending for
the QH's endpoint.  Consequently we should not issue a warning
when someone tries to unlink an URB from an idle QH; instead
we should process the request immediately.

The refcounts for QHs could get messed up, because
submit_async() would increment the refcount when calling
qh_link_async() and qh_link_async() would then refuse to link
the QH into the schedule if a Clear-TT-Buffer was pending.
Instead we should increment the refcount only when the QH
actually is added to the schedule.  The current code tries to
be clever by leaving the refcount alone if an unlink is
immediately followed by a relink; the patch changes this to an
unconditional decrement and increment (although they occur in
the opposite order).

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: David Brownell <david-b@pacbell.net>
Tested-by: Manuel Lauss <manuel.lauss@gmail.com>
Tested-by: Matthijs Kooijman <matthijs@stdin.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: EHCI: use the new clear_tt_buffer interface
Alan Stern [Mon, 29 Jun 2009 14:47:30 +0000 (10:47 -0400)]
USB: EHCI: use the new clear_tt_buffer interface

commit 914b701280a76f96890ad63eb0fa99bf204b961c upstream.

This patch (as1256) changes ehci-hcd and all the other drivers in the
EHCI family to make use of the new clear_tt_buffer callbacks.  When a
Clear-TT-Buffer request is in progress for a QH, the QH is not allowed
to be linked into the async schedule until the request is finished.
At that time, if there are any URBs queued for the QH, it is linked
into the async schedule.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoUSB: fix the clear_tt_buffer interface
Alan Stern [Mon, 29 Jun 2009 14:43:32 +0000 (10:43 -0400)]
USB: fix the clear_tt_buffer interface

commit cb88a1b887bb8908f6e00ce29e893ea52b074940 upstream.

This patch (as1255) updates the interface for calling
usb_hub_clear_tt_buffer().  Even the name of the function is changed!

When an async URB (i.e., Control or Bulk) going through a high-speed
hub to a non-high-speed device is cancelled or fails, the hub's
Transaction Translator buffer may be left busy still trying to
complete the transaction.  The buffer has to be cleared; that's what
usb_hub_clear_tt_buffer() does.

It isn't safe to send any more URBs to the same endpoint until the TT
buffer is fully clear.  Therefore the HCD needs to be told when the
Clear-TT-Buffer request has finished.  This patch adds a callback
method to struct hc_driver for that purpose, and makes the hub driver
invoke the callback at the proper time.

The patch also changes a couple of names; "hub_tt_kevent" and
"tt.kevent" now look rather antiquated.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoipv6: Fix commit 63d9950b08184e6531adceb65f64b429909cc101 (ipv6: Make v4-mapped bindi...
Bruno Prémont [Mon, 24 Aug 2009 02:06:28 +0000 (19:06 -0700)]
ipv6: Fix commit 63d9950b08184e6531adceb65f64b429909cc101 (ipv6: Make v4-mapped bindings consistent with IPv4)

commit ca6982b858e1d08010c1d29d8e8255b2ac2ad70a upstream.

Commit 63d9950b08184e6531adceb65f64b429909cc101
  (ipv6: Make v4-mapped bindings consistent with IPv4)
changes behavior of inet6_bind() for v4-mapped addresses so it should
behave the same way as inet_bind().

During this change setting of err to -EADDRNOTAVAIL got lost:

af_inet.c:469 inet_bind()
err = -EADDRNOTAVAIL;
if (!sysctl_ip_nonlocal_bind &&
    !(inet->freebind || inet->transparent) &&
    addr->sin_addr.s_addr != htonl(INADDR_ANY) &&
    chk_addr_ret != RTN_LOCAL &&
    chk_addr_ret != RTN_MULTICAST &&
    chk_addr_ret != RTN_BROADCAST)
goto out;

af_inet6.c:463 inet6_bind()
if (addr_type == IPV6_ADDR_MAPPED) {
int chk_addr_ret;

/* Binding to v4-mapped address on a v6-only socket
 * makes no sense
 */
if (np->ipv6only) {
err = -EINVAL;
goto out;
}

/* Reproduce AF_INET checks to make the bindings consitant */
v4addr = addr->sin6_addr.s6_addr32[3];
chk_addr_ret = inet_addr_type(net, v4addr);
if (!sysctl_ip_nonlocal_bind &&
    !(inet->freebind || inet->transparent) &&
    v4addr != htonl(INADDR_ANY) &&
    chk_addr_ret != RTN_LOCAL &&
    chk_addr_ret != RTN_MULTICAST &&
    chk_addr_ret != RTN_BROADCAST)
goto out;
} else {

Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agokthreads: fix kthread_create() vs kthread_stop() race
Oleg Nesterov [Mon, 24 Aug 2009 10:45:29 +0000 (12:45 +0200)]
kthreads: fix kthread_create() vs kthread_stop() race

The bug should be "accidently" fixed by recent changes in 2.6.31,
all kernels <= 2.6.30 need the fix. The problem was never noticed before,
it was found because it causes mysterious failures with GFS mount/umount.

Credits to Robert Peterson. He blaimed kthread.c from the very beginning.
But, despite my promise, I forgot to inspect the old implementation until
he did a lot of testing and reminded me. This led to huge delay in fixing
this bug.

kthread_stop() does put_task_struct(k) before it clears kthread_stop_info.k.
This means another kthread_create() can re-use this task_struct, but the
new kthread can still see kthread_should_stop() == T and exit even without
calling threadfn().

Reported-by: Robert Peterson <rpeterso@redhat.com>
Tested-by: Robert Peterson <rpeterso@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agogspca - ov534: Fix ov772x
Jean-Francois Moine [Wed, 19 Aug 2009 21:46:18 +0000 (17:46 -0400)]
gspca - ov534: Fix ov772x

The scan of the image packets of the sensor ov772x was broken when
the sensor ov965x was added.

[ Based on upstream c874f3aa, modified slightly for v2.6.30.5 ]

Signed-off-by: Jim Paris <jim@jtan.com>
Acked-by: Jean-Francois Moine <moinejf@free.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoxfs: fix spin_is_locked assert on uni-processor builds
Christoph Hellwig [Wed, 19 Aug 2009 18:43:02 +0000 (14:43 -0400)]
xfs: fix spin_is_locked assert on uni-processor builds

upstream commit a8914f3a6d72c97328597a556a99daaf5cc288ae

Without SMP or preemption spin_is_locked always returns false,
so we can't do an assert with it.  Instead use assert_spin_locked,
which does the right thing on all builds.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reported-by: Johannes Engel <jcnengel@googlemail.com>
Tested-by: Johannes Engel <jcnengel@googlemail.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoxfs: fix freeing of inodes not yet added to the inode cache
Christoph Hellwig [Wed, 19 Aug 2009 18:43:01 +0000 (14:43 -0400)]
xfs: fix freeing of inodes not yet added to the inode cache

upstream commit b36ec0428a06fcbdb67d61e9e664154e5dd9a8c7

When freeing an inode that lost race getting added to the inode cache we
must not call into ->destroy_inode, because that would delete the inode
that won the race from the inode cache radix tree.

This patch uses splits a new xfs_inode_free helper out of xfs_ireclaim
and uses that plus __destroy_inode to make sure we really only free
the memory allocted for the inode that lost the race, and not mess with
the inode cache state.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Reported-by: Alex Samad <alex@samad.com.au>
Reported-by: Andrew Randrianasulu <randrik@mail.ru>
Reported-by: Stephane <sharnois@max-t.com>
Reported-by: Tommy <tommy@news-service.com>
Reported-by: Miah Gregory <mace@darksilence.net>
Reported-by: Gabriel Barazer <gabriel@oxeva.fr>
Reported-by: Leandro Lucarella <llucax@gmail.com>
Reported-by: Daniel Burr <dburr@fami.com.au>
Reported-by: Nickolay <newmail@spaces.ru>
Reported-by: Michael Guntsche <mike@it-loops.com>
Reported-by: Dan Carley <dan.carley+linuxkern-bugs@gmail.com>
Reported-by: Michael Ole Olsen <gnu@gmx.net>
Reported-by: Michael Weissenbacher <mw@dermichi.com>
Reported-by: Martin Spott <Martin.Spott@mgras.net>
Reported-by: Christian Kujau <lists@nerdbynature.de>
Tested-by: Michael Guntsche <mike@it-loops.com>
Tested-by: Dan Carley <dan.carley+linuxkern-bugs@gmail.com>
Tested-by: Christian Kujau <lists@nerdbynature.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agovfs: add __destroy_inode
Christoph Hellwig [Wed, 19 Aug 2009 18:43:00 +0000 (14:43 -0400)]
vfs: add __destroy_inode

backport of upstream commit 2e00c97e2c1d2ffc9e26252ca26b237678b0b772

When we want to tear down an inode that lost the add to the cache race
in XFS we must not call into ->destroy_inode because that would delete
the inode that won the race from the inode cache radix tree.

This patch provides the __destroy_inode helper needed to fix this,
the actual fix will be in th next patch.  As XFS was the only reason
destroy_inode was exported we shift the export to the new __destroy_inode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agovfs: fix inode_init_always calling convention
Christoph Hellwig [Wed, 19 Aug 2009 18:42:59 +0000 (14:42 -0400)]
vfs: fix inode_init_always calling convention

backport of upstream commit 54e346215e4fe2ca8c94c54e546cc61902060510

Currently inode_init_always calls into ->destroy_inode if the additional
initialization fails.  That's not only counter-intuitive because
inode_init_always did not allocate the inode structure, but in case of
XFS it's actively harmful as ->destroy_inode might delete the inode from
a radix-tree that has never been added.  This in turn might end up
deleting the inode for the same inum that has been instanciated by
another process and cause lots of cause subtile problems.

Also in the case of re-initializing a reclaimable inode in XFS it would
free an inode we still want to keep alive.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoACPI processor: force throttling state when BIOS returns incorrect value
Frans Pop [Wed, 26 Aug 2009 21:29:29 +0000 (14:29 -0700)]
ACPI processor: force throttling state when BIOS returns incorrect value

commit 2a908002c7b1b666616103e9df2419b38d7c6f1f upstream.

If the BIOS reports an invalid throttling state (which seems to be
fairly common after system boot), a reset is done to state T0.
Because of a check in acpi_processor_get_throttling_ptc(), the reset
never actually gets executed, which results in the error reoccurring
on every access of for example /proc/acpi/processor/CPU0/throttling.

Add a 'force' option to acpi_processor_set_throttling() to ensure
the reset really takes effect.

Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13389

This patch, together with the next one, fixes a regression introduced in
2.6.30, listed on the regression list. They have been available for 2.5
months now in bugzilla, but have not been picked up, despite various
reminders and without any reason given.

Google shows that numerous people are hitting this issue. The issue is in
itself relatively minor, but the bug in the code is clear.

The patches have been in all my kernels and today testing has shown that
throttling works correctly with the patches applied when the system
overheats (http://bugzilla.kernel.org/show_bug.cgi?id=13918#c14).

Signed-off-by: Frans Pop <elendil@planet.nl>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoocfs2: Initialize the cluster we're writing to in a non-sparse extend
Sunil Mushran [Thu, 6 Aug 2009 23:12:58 +0000 (16:12 -0700)]
ocfs2: Initialize the cluster we're writing to in a non-sparse extend

commit e7432675f8ca868a4af365759a8d4c3779a3d922 upstream.

In a non-sparse extend, we correctly allocate (and zero) the clusters between
the old_i_size and pos, but we don't zero the portions of the cluster we're
writing to outside of pos<->len.

It handles clustersize > pagesize and blocksize < pagesize.

[Cleaned up by Joel Becker.]

Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agox86, amd: Don't probe for extended APIC ID if APICs are disabled
Jeremy Fitzhardinge [Wed, 22 Jul 2009 16:59:35 +0000 (09:59 -0700)]
x86, amd: Don't probe for extended APIC ID if APICs are disabled

commit 2cb078603abb612e3bcd428fb8122c3d39e08832 upstream.

If we've logically disabled apics, don't probe the PCI space for the
AMD extended APIC ID.

[ Impact: prevent boot crash under Xen. ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Reported-by: Bastian Blank <bastian@waldi.eu.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoBug Fix arch/ia64/kernel/pci-dma.c: fix recursive dma_supported() call in iommu_dma_s...
Fenghua Yu [Tue, 11 Aug 2009 21:52:10 +0000 (14:52 -0700)]
Bug Fix arch/ia64/kernel/pci-dma.c: fix recursive dma_supported() call in iommu_dma_supported()

commit 51b89f7a6615eca184aa0b85db5781d931e9c8d1 upstream.

In commit 160c1d8e40866edfeae7d68816b7005d70acf391,
dma_ops->dma_supported = iommu_dma_supported;

This dma_ops->dma_supported is first called in platform_dma_init() during kernel
boot. Then dma_ops->dma_supported will be called recursively in
iommu_dma_supported.

Kernel can not boot because kernel can not get out of iommu_dma_supported until
it runs out of stack memory.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agodo_sigaltstack: avoid copying 'stack_t' as a structure to user space
Linus Torvalds [Sat, 1 Aug 2009 17:34:56 +0000 (10:34 -0700)]
do_sigaltstack: avoid copying 'stack_t' as a structure to user space

commit 0083fc2c50e6c5127c2802ad323adf8143ab7856 upstream.

Ulrich Drepper correctly points out that there is generally padding in
the structure on 64-bit hosts, and that copying the structure from
kernel to user space can leak information from the kernel stack in those
padding bytes.

Avoid the whole issue by just copying the three members one by one
instead, which also means that the function also can avoid the need for
a stack frame.  This also happens to match how we copy the new structure
from user space, so it all even makes sense.

[ The obvious solution of adding a memset() generates horrid code, gcc
  does really stupid things. ]

Reported-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agompt2sas: fix config request and diag reset deadlock
Kashyap, Desai [Fri, 14 Aug 2009 09:31:35 +0000 (15:01 +0530)]
mpt2sas: fix config request and diag reset deadlock

commit 388ce4beb7135722c584b0af18f215e3ec657adf upstream.

Moving the setting and clearing of the mutex's to
_config_request. There was a mutex deadlock when diag reset is called from
inside _config_request, so diag reset was moved to outside the mutexs.

Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: Eric Moore <Eric.moore@lsi.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoSCSI: mpt2sas: fix oops because drv data points to NULL on resume from hibernate
Kashyap, Desai [Fri, 7 Aug 2009 14:08:48 +0000 (19:38 +0530)]
SCSI: mpt2sas: fix oops because drv data points to NULL on resume from hibernate

commit fcfe6392d18283df3c561b5ef59c330d485ff8ca upstream.

Fix another ocurring when the system resumes.  This oops was due to driver
setting the pci drvdata to NULL on the prior hibernation.  Becuase it was
set to NULL, upon resmume we assume the pci drvdata is non-zero, and we oops.
To fix the ooops, we don't set pci drvdata to NULL at hibernation time.

Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoSCSI: mpt2sas: fix crash due to Watchdog is active while OS in standby mode
Kashyap, Desai [Fri, 7 Aug 2009 14:07:59 +0000 (19:37 +0530)]
SCSI: mpt2sas: fix crash due to Watchdog is active while OS in standby mode

commit e4750c989f732555fca86dd73d488c79972362db upstream.

Fix oops ocurring at hibernation time.  This oops was due to the firmware fault
watchdog timer still running after we freed resources. To fix the issue we need
to terminate the watchdog timer at hibernation time.

Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoSCSI: mpt2sas: fix infinite loop inside config request
Kashyap, Desai [Fri, 7 Aug 2009 14:07:14 +0000 (19:37 +0530)]
SCSI: mpt2sas: fix infinite loop inside config request

commit 6bd4e1e4d6023f4da069fd68729c502cc4e6dfd0 upstream.

This restriction is introduced just to avoid loop of
config_request. Retry must be limited so we have restricted
config request to maximum 2 times.

Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoSCSI: mpt2sas: Excessive log info causes sas iounit page time out
Kashyap, Desai [Fri, 7 Aug 2009 14:06:43 +0000 (19:36 +0530)]
SCSI: mpt2sas: Excessive log info causes sas iounit page time out

commit be9e8cd75ce8d94ae4aab721fdcc337fa78a9090 upstream.

Inhibit 0x3117 loginfos - during cable pull, there are too many printks going
to the syslog, this is have impact on how fast the interrupt routine can handle
keeping up with command completions; this was the root cause to the config
pages timeouts.

Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoSCSI: mpt2sas: Raid 10 Value is showing as Raid 1E in /va/log/messages
Kashyap, Desai [Fri, 7 Aug 2009 14:05:18 +0000 (19:35 +0530)]
SCSI: mpt2sas: Raid 10 Value is showing as Raid 1E in /va/log/messages

commit 62727a7ba43c0abf2673e3877079c136a9721792 upstream.

When a volume is activated, the driver will recieve a pair of ir config change
events to remove the foreign volume, then add the native.
In the process of the removal event, the hidden raid componet is removed from
the parent.When the disks is added back, the adding of the port fails becuase
there is no instance of the device in its parent.
To fix this issue, the driver needs to call mpt2sas_transport_update_links()
prior to calling _scsih_add_device. In addition, we added sanity checks on
volume add and removal to ignore events for foreign volumes.

Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoSCSI: mpt2sas: Expander fix oops saying "Already part of another port"
Kashyap, Desai [Fri, 7 Aug 2009 14:04:26 +0000 (19:34 +0530)]
SCSI: mpt2sas: Expander fix oops saying "Already part of another port"

commit 20f5895d55d9281830bfb7819c5c5b70b05297a6 upstream.

Kernel panic is seen because driver did not tear down the port which should
be dnoe using mpt2sas_transport_port_remove(). without this fix When expander
is added back we would oops inside sas_port_add_phy.

Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoSCSI: mpt2sas: Introduced check for enclosure_handle to avoid crash
Kashyap, Desai [Fri, 7 Aug 2009 14:03:17 +0000 (19:33 +0530)]
SCSI: mpt2sas: Introduced check for enclosure_handle to avoid crash

commit 15052c9e85bf0cdadcb69eb89623bf12bad8b4f8 upstream.

Kernel panic is seen because of enclosure_handle received from FW is zero.
Check is introduced before calling mpt2sas_config_get_enclosure_pg0.

Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agolibata: OCZ Vertex can't do HPA
Tejun Heo [Thu, 6 Aug 2009 16:59:15 +0000 (01:59 +0900)]
libata: OCZ Vertex can't do HPA

commit 7831387bda72af3059be48d39846d3eb6d8ce2f6 upstream.

OCZ Vertex SSD can't do HPA and not in a usual way.  It reports HPA,
allows unlocking but then fails all IOs which fall in the unlocked
area.  Quirk it so that HPA unlocking is not used for the device.

Reported by Daniel Perup in bnc#522414.

 https://bugzilla.novell.com/show_bug.cgi?id=522414

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Daniel Perup <probe@spray.se>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoiwlagn: do not send key clear commands when rfkill enabled
Reinette Chatre [Mon, 3 Aug 2009 19:10:16 +0000 (12:10 -0700)]
iwlagn: do not send key clear commands when rfkill enabled

commit 99f1b01562b7dcae75b043114f76163fbf84fcab upstream.

Do all key clearing except sending sommands to device when rfkill
enabled. When rfkill enabled the interface is brought down and will
be brought back up correctly after rfkill is enabled again.

Same change is not needed for iwl3945 as it ignores return code when
sending key clearing command to device.

This fixes http://bugzilla.kernel.org/show_bug.cgi?id=13742

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Frans Pop <elendil@planet.nl>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoiwl3945: fix rfkill switch
Stanislaw Gruszka [Thu, 13 Aug 2009 12:29:08 +0000 (14:29 +0200)]
iwl3945: fix rfkill switch

(Not needed upstream, due to the major rewrite in 2.6.31)

Due to rfkill and iwlwifi mishmash of SW / HW killswitch representation,
we have race conditions which make unable turn wifi radio on, after enable
and disable again killswitch. I can observe this problem on my laptop
with iwl3945 device.

In rfkill core HW switch and SW switch are separate 'states'. Device can
be only in one of 3 states: RFKILL_STATE_SOFT_BLOCKED, RFKILL_STATE_UNBLOCKED,
RFKILL_STATE_HARD_BLOCKED. Whereas in iwlwifi driver we have separate bits
STATUS_RF_KILL_HW and STATUS_RF_KILL_SW for HW and SW switches - radio can be
turned on, only if both bits are cleared.

In this particular race conditions, radio can not be turned on if in driver
STATUS_RF_KILL_SW bit is set, and rfkill core is in state
RFKILL_STATE_HARD_BLOCKED, because rfkill core is unable to call
rfkill->toggle_radio(). This situation can be entered in case:

- killswitch is turned on
- rfkill core 'see' button change first and move to RFKILL_STATE_SOFT_BLOCKED
  also call ->toggle_radio() and STATE_RF_KILL_SW in driver is set
- iwl3945 get info about button from hardware to set STATUS_RF_KILL_HW bit and
  force rfkill to move to RFKILL_STATE_HARD_BLOCKED
- killsiwtch is turend off
- driver clear STATUS_RF_KILL_HW
- rfkill core is unable to clear STATUS_RF_KILL_SW in driver

Additionally call to rfkill_epo() when STATUS_RF_KILL_HW in driver is set
cause move to the same situation.

In 2.6.31 this problem is fixed due to _total_ rewrite of rfkill subsystem.
This is a quite small fix for 2.6.30.x in iwlwifi driver. We are changing
internal rfkill state to always have below relations true:

STATUS_RF_KILL_HW=1 STATUS_RF_KILL_SW=1 <-> RFKILL_STATUS_SOFT_BLOCKED
STATUS_RF_KILL_HW=0 STATUS_RF_KILL_SW=1 <-> RFKILL_STATUS_SOFT_BLOCKED
STATUS_RF_KILL_HW=1 STATUS_RF_KILL_SW=0 <-> RFKILL_STATUS_HARD_BLOCKED
STATUS_RF_KILL_HW=0 STATUS_RF_KILL_SW=0 <-> RFKILL_STATUS_UNBLOCKED

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Reinette Chatre <reinette.chatre@intel.com>
Acked-by: John W. Linville <linville@tuxdriver.com>
14 years agoKVM: Fix KVM_GET_MSR_INDEX_LIST
Jan Kiszka [Thu, 2 Jul 2009 19:45:47 +0000 (21:45 +0200)]
KVM: Fix KVM_GET_MSR_INDEX_LIST

commit e125e7b6944898831b56739a5448e705578bf7e2 upstream.

So far, KVM copied the emulated_msrs (only MSR_IA32_MISC_ENABLE) to a
wrong address in user space due to broken pointer arithmetic. This
caused subtle corruption up there (missing MSR_IA32_MISC_ENABLE had
probably no practical relevance). Moreover, the size check for the
user-provided kvm_msr_list forgot about emulated MSRs.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: fix ack not being delivered when msi present
Michael S. Tsirkin [Tue, 1 Sep 2009 15:15:14 +0000 (12:15 -0300)]
KVM: fix ack not being delivered when msi present

(cherry picked from commit 5116d8f6b977970ebefc1932c0f313163a6ec91f)

kvm_notify_acked_irq does not check irq type, so that it sometimes
interprets msi vector as irq.  As a result, ack notifiers are not
called, which typially hangs the guest.  The fix is to track and
check irq type.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: MMU: limit rmap chain length
Marcelo Tosatti [Tue, 1 Sep 2009 15:15:13 +0000 (12:15 -0300)]
KVM: MMU: limit rmap chain length

(cherry picked from commit 53a27b39ff4d2492f84b1fdc2f0047175f0b0b93)

Otherwise the host can spend too long traversing an rmap chain, which
happens under a spinlock.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: MMU: handle n_free_mmu_pages > n_alloc_mmu_pages in kvm_mmu_change_mmu_pages
Marcelo Tosatti [Tue, 1 Sep 2009 15:15:12 +0000 (12:15 -0300)]
KVM: MMU: handle n_free_mmu_pages > n_alloc_mmu_pages in kvm_mmu_change_mmu_pages

(cherry picked from commit 025dbbf36a7680bffe54d9dcbf0a8bc01a7cbd10)

kvm_mmu_change_mmu_pages mishandles the case where n_alloc_mmu_pages is
smaller then n_free_mmu_pages, by not checking if the result of
the subtraction is negative.

Its a valid condition which can happen if a large number of pages has
been recently freed.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: SVM: force new asid on vcpu migration
Marcelo Tosatti [Tue, 1 Sep 2009 15:15:11 +0000 (12:15 -0300)]
KVM: SVM: force new asid on vcpu migration

(cherry picked from commit 4b656b1202498184a0ecef86b3b89ff613b9c6ab)

If a migrated vcpu matches the asid_generation value of the target pcpu,
there will be no TLB flush via TLB_CONTROL_FLUSH_ALL_ASID.

The check for vcpu.cpu in pre_svm_run is meaningless since svm_vcpu_load
already updated it on schedule in.

Such vcpu will VMRUN with stale TLB entries.

Based on original patch from Joerg Roedel (http://patchwork.kernel.org/patch/10021/)

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: x86: verify MTRR/PAT validity
Marcelo Tosatti [Tue, 1 Sep 2009 15:15:10 +0000 (12:15 -0300)]
KVM: x86: verify MTRR/PAT validity

(cherry picked from commit d6289b9365c3f622a8cfe62c4fb054bb70b5061a)

Do not allow invalid memory types in MTRR/PAT (generating a #GP
otherwise).

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: Fix cpuid feature misreporting
Avi Kivity [Mon, 3 Aug 2009 17:57:57 +0000 (14:57 -0300)]
KVM: Fix cpuid feature misreporting

(cherry picked from commit 8d753f369bd28fff1706ffe9fb9fea4fd88cf85b)

MTRR, PAT, MCE, and MCA are all supported (to some extent) but not reported.
Vista requires these features, so if userspace relies on kernel cpuid
reporting, it loses support for Vista.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: Ignore reads to K7 EVNTSEL MSRs
Amit Shah [Mon, 3 Aug 2009 17:57:56 +0000 (14:57 -0300)]
KVM: Ignore reads to K7 EVNTSEL MSRs

(cherry picked from commit 9e6996240afcbe61682eab8eeaeb65c34333164d)

In commit 7fe29e0faacb650d31b9e9f538203a157bec821d we ignored the
reads to the P6 EVNTSEL MSRs. That fixed crashes on Intel machines.

Ignore the reads to K7 EVNTSEL MSRs as well to fix this on AMD
hosts.

This fixes Kaspersky antivirus crashing Windows guests on AMD hosts.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: x86: Ignore reads to EVNTSEL MSRs
Amit Shah [Mon, 3 Aug 2009 17:57:55 +0000 (14:57 -0300)]
KVM: x86: Ignore reads to EVNTSEL MSRs

(cherry picked from commit 7fe29e0faacb650d31b9e9f538203a157bec821d)

We ignore writes to the performance counters and performance event
selector registers already. Kaspersky antivirus reads the eventsel
MSR causing it to crash with the current behaviour.

Return 0 as data when the eventsel registers are read to stop the
crash.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: MMU: Use different shadows when EFER.NXE changes
Avi Kivity [Mon, 3 Aug 2009 17:57:54 +0000 (14:57 -0300)]
KVM: MMU: Use different shadows when EFER.NXE changes

(cherry picked from commit 9645bb56b31a1b70ab9e470387b5264cafc04aa9)

A pte that is shadowed when the guest EFER.NXE=1 is not valid when
EFER.NXE=0; if bit 63 is set, the pte should cause a fault, and since the
shadow EFER always has NX enabled, this won't happen.

Fix by using a different shadow page table for different EFER.NXE bits.  This
allows vcpus to run correctly with different values of EFER.NXE, and for
transitions on this bit to be handled correctly without requiring a full
flush.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: Deal with interrupt shadow state for emulated instructions
Glauber Costa [Mon, 3 Aug 2009 17:57:53 +0000 (14:57 -0300)]
KVM: Deal with interrupt shadow state for emulated instructions

(cherry picked from commit 310b5d306c1aee7ebe32f702c0e33e7988d50646)

We currently unblock shadow interrupt state when we skip an instruction,
but failing to do so when we actually emulate one. This blocks interrupts
in key instruction blocks, in particular sti; hlt; sequences

If the instruction emulated is an sti, we have to block shadow interrupts.
The same goes for mov ss. pop ss also needs it, but we don't currently
emulate it.

Without this patch, I cannot boot gpxe option roms at vmx machines.
This is described at https://bugzilla.redhat.com/show_bug.cgi?id=494469

Signed-off-by: Glauber Costa <glommer@redhat.com>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: Introduce {set/get}_interrupt_shadow()
Glauber Costa [Mon, 3 Aug 2009 17:57:52 +0000 (14:57 -0300)]
KVM: Introduce {set/get}_interrupt_shadow()

This patch introduces set/get_interrupt_shadow(), that does exactly
what the name suggests. It also replaces open code that explicitly does
it with the now existent functions. It differs slightly from upstream,
because upstream merged it after gleb's interrupt rework, that we don't
ship.

Just for reference, upstream changelog is
(2809f5d2c4cfad171167b131bb2a21ab65eba40f):

This patch replaces drop_interrupt_shadow with the more
general set_interrupt_shadow, that can either drop or raise
it, depending on its parameter.  It also adds ->get_interrupt_shadow()
for future use.

Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
14 years agoKVM: MMU: do not free active mmu pages in free_mmu_pages()
Gleb Natapov [Mon, 3 Aug 2009 17:57:51 +0000 (14:57 -0300)]
KVM: MMU: do not free active mmu pages in free_mmu_pages()

(cherry picked from commit f00be0cae4e6ad0a8c7be381c6d9be3586800b3e)

free_mmu_pages() should only undo what alloc_mmu_pages() does.
Free mmu pages from the generic VM destruction function, kvm_destroy_vm().

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>