Mathias Krause [Tue, 10 Feb 2015 00:14:07 +0000 (01:14 +0100)]
crypto: crc32c - add missing crypto module alias
The backport of commit 5d26a105b5a7 ("crypto: prefix module autoloading
with "crypto-"") lost the MODULE_ALIAS_CRYPTO() annotation of crc32c.c.
Add it to fix the reported filesystem related regressions.
Signed-off-by: Mathias Krause <minipli@googlemail.com> Reported-by: Philip Müller <philm@manjaro.org> Cc: Kees Cook <keescook@chromium.org> Cc: Rob McCathie <rob@manjaro.org> Cc: Luis Henriques <luis.henriques@canonical.com> Cc: Kamal Mostafa <kamal@canonical.com> Cc: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CR4 isn't constant; at least the TSD and PCE bits can vary.
TBH, treating CR0 and CR3 as constant scares me a bit, too, but it looks
like it's correct.
This adds a branch and a read from cr4 to each vm entry. Because it is
extremely likely that consecutive entries into the same vcpu will have
the same host cr4 value, this fixes up the vmcs instead of restoring cr4
after the fact. A subsequent patch will add a kernel-wide cr4 shadow,
reducing the overhead in the common case to just two memory reads and a
branch.
Signed-off-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Petr Matousek <pmatouse@redhat.com> Cc: Gleb Natapov <gleb@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[wangkai: Backport to 3.10: adjust context] Signed-off-by: Wang Kai <morgan.wang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On systems with invvpid instruction support (corresponding bit in
IA32_VMX_EPT_VPID_CAP MSR is set) guest invocation of invvpid
causes vm exit, which is currently not handled and results in
propagation of unknown exit to userspace.
Fix this by installing an invvpid vm exit handler.
This is CVE-2014-3646.
Cc: stable@vger.kernel.org Signed-off-by: Petr Matousek <pmatouse@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[wangkai: Backport to 3.10: adjust context] Signed-off-by: Wang Kai <morgan.wang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This results in a missing per cpu thread for the newly onlined cpu2 and
in a NULL pointer dereference on a consecutive offline of that cpu.
Proctect smpboot_register_percpu_thread() with get_online_cpus() to
prevent that.
[ tglx: Massaged changelog and removed the change in
smpboot_unregister_percpu_thread() because that's an
optimization and therefor not stable material. ]
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Link: http://lkml.kernel.org/r/1406777421-12830-1-git-send-email-laijs@cn.fujitsu.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When ak4114 work calls its callback and the callback invokes
ak4114_reinit(), it stalls due to flush_delayed_work(). For avoiding
this, control the reentrance by introducing a refcount. Also
flush_delayed_work() is replaced with cancel_delayed_work_sync().
The exactly same bug is present in ak4113.c and fixed as well.
Reported-by: Pavel Hofman <pavel.hofman@ivitera.com> Acked-by: Jaroslav Kysela <perex@perex.cz> Tested-by: Pavel Hofman <pavel.hofman@ivitera.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To quote from section 1.3.1 of the data sheet:
The SGTL5000 has an internal reset that is deasserted
8 SYS_MCLK cycles after all power rails have been brought
up. After this time, communication can start
...
1.0us represents 8 SYS_MCLK cycles at the minimum 8.0 MHz SYS_MCLK.
Signed-off-by: Eric Nelson <eric.nelson@boundarydevices.com> Reviewed-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to the I2S specification information as following:
- WS = 0, channel 1 (left)
- WS = 1, channel 2 (right)
So, the start event should be TF/RF falling edge.
Reported-by: Songjun Wu <songjun.wu@atmel.com> Signed-off-by: Bo Shen <voice.shen@atmel.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixed commit added from64to32 under _#ifndef do_csum_ but used it
under _#ifndef csum_tcpudp_nofold_, breaking some builds (Fengguang's
robot reported TILEGX's). Move from64to32 under the latter.
Fixes: 150ae0e94634 ("lib/checksum.c: fix carry in csum_tcpudp_nofold") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com> Cc: Eric Dumazet <edumazet@google.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
O_DIRECT flags can be toggeled via fcntl(F_SETFL). But this value checked
twice inside ext4_file_write_iter() and __generic_file_write() which
result in BUG_ON inside ext4_direct_IO.
Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
[hujianyang: Backported to 3.10
- Move initialization of iocb->private to ext4_file_write() as we don't
have ext4_file_write_iter(), which is introduced by commit 9b884164.
- Adjust context to make 'overwrite' changes apply to ext4_file_dio_write()
as ext4_file_dio_write() is not move into ext4_file_write()] Signed-off-by: hujianyang <hujianyang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit d7a49086f263164a (arm64: cpuinfo: print info for all CPUs)
attempted to clean up /proc/cpuinfo, but due to concerns regarding
further changes was reverted in commit 5e39977edf6500fd (Revert "arm64:
cpuinfo: print info for all CPUs").
There are two major issues with the arm64 /proc/cpuinfo format
currently:
* The "Features" line describes (only) the 64-bit hwcaps, which is
problematic for some 32-bit applications which attempt to parse it. As
the same names are used for analogous ISA features (e.g. aes) despite
these generally being architecturally unrelated, it is not possible to
simply append the 64-bit and 32-bit hwcaps in a manner that might not
be misleading to some applications.
Various potential solutions have appeared in vendor kernels. Typically
the format of the Features line varies depending on whether the task
is 32-bit.
* Information is only printed regarding a single CPU. This does not
match the ARM format, and does not provide sufficient information in
big.LITTLE systems where CPUs are heterogeneous. The CPU information
printed is queried from the current CPU's registers, which is racy
w.r.t. cross-cpu migration.
This patch attempts to solve these issues. The following changes are
made:
* When a task with a LINUX32 personality attempts to read /proc/cpuinfo,
the "Features" line contains the decoded 32-bit hwcaps, as with the
arm port. Otherwise, the decoded 64-bit hwcaps are shown. This aligns
with the behaviour of COMPAT_UTS_MACHINE and COMPAT_ELF_PLATFORM. In
the absense of compat support, the Features line is empty.
The set of hwcaps injected into a task's auxval are unaffected.
* Properties are printed per-cpu, as with the ARM port. The per-cpu
information is queried from pre-recorded cpu information (as used by
the sanity checks).
* As with the previous attempt at fixing up /proc/cpuinfo, the hardware
field is removed. The only users so far are 32-bit applications tied
to particular boards, so no portable applications should be affected,
and this should prevent future tying to particular boards.
The following differences remain:
* No model_name is printed, as this cannot be queried from the hardware
and cannot be provided in a stable fashion. Use of the CPU
{implementor,variant,part,revision} fields is sufficient to identify a
CPU and is portable across arm and arm64.
* The following system-wide properties are not provided, as they are not
possible to provide generally. Programs relying on these are already
tied to particular (32-bit only) boards:
- Hardware
- Revision
- Serial
No software has yet been identified for which these remaining
differences are problematic.
Cc: Greg Hackmann <ghackmann@google.com> Cc: Ian Campbell <ijc@hellion.org.uk> Cc: Serban Constantinescu <serban.constantinescu@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: cross-distro@lists.linaro.org Cc: linux-api@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Acked-by: Catalin Marinas <catalin.marinas@arm.com>
[Mark: backport to v3.10.x] Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nilfs2 eventually hangs in a stress test with fsstress program. This
issue was caused by the following deadlock over I_SYNC flag between
nilfs_segctor_thread() and writeback_sb_inodes():
nilfs_segctor_thread()
nilfs_segctor_thread_construct()
nilfs_segctor_unlock()
nilfs_dispose_list()
iput()
iput_final()
evict()
inode_wait_for_writeback() * wait for I_SYNC flag
writeback_sb_inodes()
* set I_SYNC flag on inode->i_state
__writeback_single_inode()
do_writepages()
nilfs_writepages()
nilfs_construct_dsync_segment()
nilfs_segctor_sync()
* wait for completion of segment constructor
inode_sync_complete()
* clear I_SYNC flag after __writeback_single_inode() completed
writeback_sb_inodes() calls do_writepages() for dirty inodes after
setting I_SYNC flag on inode->i_state. do_writepages() in turn calls
nilfs_writepages(), which can run segment constructor and wait for its
completion. On the other hand, segment constructor calls iput(), which
can call evict() and wait for the I_SYNC flag on
inode_wait_for_writeback().
Since segment constructor doesn't know when I_SYNC will be set, it
cannot know whether iput() will block or not unless inode->i_nlink has a
non-zero count. We can prevent evict() from being called in iput() by
implementing sop->drop_inode(), but it's not preferable to leave inodes
with i_nlink == 0 for long periods because it even defers file
truncation and inode deallocation. So, this instead resolves the
deadlock by calling iput() asynchronously with a workqueue for inodes
with i_nlink == 0.
The carry from the 64->32bits folding was dropped, e.g with:
saddr=0xFFFFFFFF daddr=0xFF0000FF len=0xFFFF proto=0 sum=1,
csum_tcpudp_nofold returned 0 instead of 1.
Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Mike Frysinger <vapier@gentoo.org> Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
walk_page_range() silently skips vma having VM_PFNMAP set, which leads
to undesirable behaviour at client end (who called walk_page_range).
Userspace applications get the wrong data, so the effect is like just
confusing users (if the applications just display the data) or sometimes
killing the processes (if the applications do something with
misunderstanding virtual addresses due to the wrong data.)
For example for pagemap_read, when no callbacks are called against
VM_PFNMAP vma, pagemap_read may prepare pagemap data for next virtual
address range at wrong index.
Eventually userspace may get wrong pagemap data for a task.
Corresponding to a VM_PFNMAP marked vma region, kernel may report
mappings from subsequent vma regions. User space in turn may account
more pages (than really are) to the task.
In my case I was using procmem, procrack (Android utility) which uses
pagemap interface to account RSS pages of a task. Due to this bug it
was giving a wrong picture for vmas (with VM_PFNMAP set).
As printk() invocation can cause e.g. a TLB miss, printk() cannot be
called before the exception handlers have been properly initialized.
This can happen e.g. when netconsole has been loaded as a kernel module
and the TLB table has been cleared when a CPU was offline.
Call cpu_report() in start_secondary() only after the exception handlers
have been initialized to fix this.
Without the patch the kernel will randomly either lockup or crash
after a CPU is onlined and the console driver is a module.
If the irq_chip does not define .irq_disable, any call to disable_irq
will defer disabling the IRQ until it fires while marked as disabled.
This assumes that the handler function checks for this condition, which
handle_percpu_irq does not. In this case, calling disable_irq leads to
an IRQ storm, if the interrupt fires while disabled.
This optimization is only useful when disabling the IRQ is slow, which
is not true for the MIPS CPU IRQ.
Disable this optimization by implementing .irq_disable and .irq_enable
Fix memory leak in the gpio sysfs interface due to failure to drop
reference to device returned by class_find_device when setting the
gpio-line polarity.
Fixes: 0769746183ca ("gpiolib: add support for changing value polarity in sysfs") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix memory leak in the gpio sysfs interface due to failure to drop
reference to device returned by class_find_device when creating a link.
Fixes: a4177ee7f1a8 ("gpiolib: allow exported GPIO nodes to be named using sysfs links") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch drops the arbitrary maximum I/O size limit in sbc_parse_cdb(),
which currently for fabric_max_sectors is hardcoded to 8192 (4 MB for 512
byte sector devices), and for hw_max_sectors is a backend driver dependent
value.
This limit is problematic because Linux initiators have only recently
started to honor block limits MAXIMUM TRANSFER LENGTH, and other non-Linux
based initiators (eg: MSFT Fibre Channel) can also generate I/Os larger
than 4 MB in size.
Currently when this happens, the following message will appear on the
target resulting in I/Os being returned with non recoverable status:
SCSI OP 28h with too big sectors 16384 exceeds fabric_max_sectors: 8192
Instead, drop both [fabric,hw]_max_sector checks in sbc_parse_cdb(),
and convert the existing hw_max_sectors into a purely informational
attribute used to represent the granuality that backend driver and/or
subsystem code is splitting I/Os upon.
Also, update FILEIO with an explicit FD_MAX_BYTES check in fd_execute_rw()
to deal with the one special iovec limitiation case.
v2 changes:
- Drop hw_max_sectors check in sbc_parse_cdb()
Reported-by: Lance Gropper <lance.gropper@qosserver.com> Reported-by: Stefan Priebe <s.priebe@profihost.ag> Cc: Christoph Hellwig <hch@lst.de> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In situations such as bond failover, The new session establishment
implicitly invokes the termination of the old connection.
So, we don't want to wait for the old connection wait_conn to completely
terminate before we accept the new connection and post a login response.
The solution is to deffer the comp_wait completion and the conn_put to
a work so wait_conn will effectively be non-blocking (flush errors are
assumed to come very fast).
We allocate isert_release_wq with WQ_UNBOUND and WQ_UNBOUND_MAX_ACTIVE
to spread the concurrency of release works.
The np listener cm_id will also get ADDR_CHANGE event
upcall (in case it is bound to a specific IP). Handle
it correctly by creating a new cm_id and implicitly
destroy the old one.
Since this is the second event a listener np cm_id may
encounter, we move the np cm_id event handling to a
routine.
Take isert_conn pointer from cm_id->qp->qp_context. This
will allow us to know that the cm_id context is always
the network portal. This will make the cm_id event check
(connection or network portal) more reliable.
In order to avoid a NULL dereference in cma_id->qp->qp_context
we destroy the qp after we destroy the cm_id (and make the
dereference safe). session stablishment/teardown sequences
can happen in parallel, we should take into account that
connected_handler might race with connection teardown flow.
Also, protect isert_conn->conn_device->active_qps decrement
within the error patch during QP creation failure and the
normal teardown path in isert_connect_release().
Squashed:
iser-target: Decrement completion context active_qps in error flow
There is no point in accepting a new CM request only
when we are completely done with the last iscsi login.
Instead we accept immediately, this will also cause the
CM connection to reach connected state and the initiator
is allowed to send the first login. We mark that we got
the initial login and let iscsi layer pick it up when it
gets there.
This reduces the parallel login sequence by a factor of
more then 4 (and more for multi-login) and also prevents
the initiator (who does all logins in parallel) from
giving up on login timeout expiration.
In order to support multiple login requests sequence (CHAP)
we call isert_rx_login_req from isert_rx_completion insead
of letting isert_get_login_rx call it.
Squashed:
iser-target: Use kref_get_unless_zero in connected_handler
iser-target: Acquire conn_mutex when changing connection state
iser-target: Reject connect request in failure path
ISER_CONN_UP state is not sufficient to know if
we should wait for completion of flush errors and
disconnected_handler event.
Instead, split it to 2 states:
- ISER_CONN_UP: Got to CM connected phase, This state
indicates that we need to wait for a CM disconnect
event before going to teardown.
- ISER_CONN_FULL_FEATURE: Got to full feature phase
after we posted login response, This state indicates
that we posted recv buffers and we need to wait for
flush completions before going to teardown.
Also avoid deffering disconnected handler to a work,
and handle it within disconnected handler.
More work here is needed to handle DEVICE_REMOVAL event
correctly (cleanup all resources).
Squashed:
iser-target: Don't deffer disconnected handler to a work
Since commit 0fc4ea701fcf ("Target/iser: Don't put isert_conn inside
disconnected handler") we put the conn kref in isert_wait_conn, so we
need .wait_conn to be invoked also in the error path.
Introduce call to isert_conn_terminate (called under lock)
which transitions the connection state to TERMINATING and calls
rdma_disconnect. If the state is already teminating, just bail
out back (temination started).
Also, make sure to destroy the connection when getting a connect
error event if didn't get to connected (state UP). Same for the
handling of REJECTED and UNREACHABLE cma events.
Squashed:
iscsi-target: Add call to wait_conn in establishment error flow
While looking at hch's recent conversion to drop the MSG_*_TAG
definitions, I noticed a long standing bug in vhost-scsi where
the VIRTIO_SCSI_S_* attribute definitions where incorrectly
being passed directly into target_submit_cmd_map_sgls().
This patch adds the missing virtio-scsi to TCM/SAM task attribute
conversion.
Cc: Christoph Hellwig <hch@lst.de> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
tcm_loop has the I_T nexus associated with the HBA. This causes
commands to become misdirected if the HBA has more than one
target portal group; any command is then being sent to the
first target portal group instead of the correct one.
The nexus needs to be associated with the target portal group
instead.
Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch addresses a bug where individual vhost-scsi configfs endpoint
groups can be removed from below while active exports to QEMU userspace
still exist, resulting in an OOPs.
It adds a configfs_depend_item() in vhost_scsi_set_endpoint() to obtain
an explicit dependency on se_tpg->tpg_group in order to prevent individual
vhost-scsi WWPN endpoints from being released via normal configfs methods
while an QEMU ioctl reference still exists.
Also, add matching configfs_undepend_item() in vhost_scsi_clear_endpoint()
to release the dependency, once QEMU's reference to the individual group
at /sys/kernel/config/target/vhost/$WWPN/$TPGT is released.
(Fix up vhost_scsi_clear_endpoint() error path - DanC)
Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch adds a max_send_sge=2 minimum in isert_conn_setup_qp()
to ensure outgoing control PDU responses with tx_desc->num_sge=2
are able to function correctly.
This addresses a bug with RDMA hardware using dev_attr.max_sge=3,
that in the original code with the ConnectX-2 work-around would
result in isert_conn->max_sge=1 being negotiated.
Originally reported by Chris with ocrdma driver.
Reported-by: Chris Moore <Chris.Moore@emulex.com> Tested-by: Chris Moore <Chris.Moore@emulex.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
isert has an issue of trying to create a CQ with more CQEs than are
supported by the hardware, that currently results in failures during
isert_device creation during first session login.
This is the isert version of the patch that Minh Tran submitted for
iser, and is simple a workaround required to function with existing
ocrdma hardware.
Signed-off-by: Chris Moore <chris.moore@emulex.com> Reviewied-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A worker_pool's forward progress is guaranteed by the fact that the
last idle worker assumes the manager role to create more workers and
summon the rescuers if creating workers doesn't succeed in timely
manner before proceeding to execute work items.
This manager role is implemented in manage_workers(), which indicates
whether the worker may proceed to work item execution with its return
value. This is necessary because multiple workers may contend for the
manager role, and, if there already is a manager, others should
proceed to work item execution.
Unfortunately, the function also indicates that the worker may proceed
to work item execution if need_to_create_worker() is false at the head
of the function. need_to_create_worker() tests the following
conditions.
pending work items && !nr_running && !nr_idle
The first and third conditions are protected by pool->lock and thus
won't change while holding pool->lock; however, nr_running can change
asynchronously as other workers block and resume and while it's likely
to be zero, as someone woke this worker up in the first place, some
other workers could have become runnable inbetween making it non-zero.
If this happens, manage_worker() could return false even with zero
nr_idle making the worker, the last idle one, proceed to execute work
items. If then all workers of the pool end up blocking on a resource
which can only be released by a work item which is pending on that
pool, the whole pool can deadlock as there's no one to create more
workers or summon the rescuers.
This patch fixes the problem by removing the early exit condition from
maybe_create_worker() and making manage_workers() return false iff
there's already another manager, which ensures that the last worker
doesn't start executing work items.
We can leave the early exit condition alone and just ignore the return
value but the only reason it was put there is because the
manage_workers() used to perform both creations and destructions of
workers and thus the function may be invoked while the pool is trying
to reduce the number of workers. Now that manage_workers() is called
only when more workers are needed, the only case this early exit
condition is triggered is rare race conditions rendering it pointless.
Tested with simulated workload and modified workqueue code which
trigger the pool deadlock reliably without this patch.
Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Eric Sandeen <sandeen@sandeen.net> Link: http://lkml.kernel.org/g/54B019F4.8030009@sandeen.net Cc: Dave Chinner <david@fromorbit.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Martin Kaiser [Fri, 30 Jan 2015 14:01:29 +0000 (15:01 +0100)]
gpio: squelch a compiler warning
drivers/gpio/gpiolib-of.c: In function 'of_gpiochip_find_and_xlate':
drivers/gpio/gpiolib-of.c:51:21: warning: assignment makes integer from
pointer without a cast [enabled by default]
gg_data->out_gpio = ERR_PTR(ret);
^
this was introduced in d1c3449160df60fac4abb56f0ba0a3784305e43e
the upstream kernel changed the type of out_gpio from int to struct gpio_desc *
as part of a larger refactoring that wasn't backported
Signed-off-by: Martin Kaiser <martin@kaiser.cx> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pstore fs expects that backends provide a unique id which could avoid
pstore making entries as duplication or denominating entries the same
name. So I combine the timestamp, part and count into id.
Signed-off-by: Madper Xie <cxie@redhat.com> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Cc: stable@vger.kernel.org Signed-off-by: Matt Fleming <matt.fleming@intel.com>
[hkp: Backported to 3.10: adjust context] Signed-off-by: Hu Keping <hukeping@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For persistent RAM outside of main memory, the memory may have limitations
on supported accesses. For internal RAM on highbank platform exclusive
accesses are not supported and will hang the system. So atomic_cmpxchg
cannot be used. This commit uses spinlock protection for buffer size and
start updates on ioremapped regions instead.
Signed-off-by: Rob Herring <rob.herring@calxeda.com> Acked-by: Anton Vorontsov <anton@enomsg.org> Signed-off-by: Tony Luck <tony.luck@intel.com>
[hkp: Backported to 3.10: adjust context] Signed-off-by: HuKeping <hukeping@huawei.com>
ramoops_get_next_prz get the prz according the paramters. If it get a
uninitialized prz, access its members by following persistent_ram_old_size(prz)
will cause a NULL pointer crash.
Ex: if ftrace_size is 0, fprz will be NULL.
Fix it by return NULL in advance.
Signed-off-by: Liu ShuoX <shuox.liu@intel.com> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: HuKeping <hukeping@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*_read_cnt in ramoops_context need to be cleared during pstore ->open to
support mutli times getting the records. The patch added missed
ftrace_read_cnt clearing and removed duplicate clearing in ramoops_probe.
Signed-off-by: Liu ShuoX <shuox.liu@intel.com> Cc: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Cc: Colin Cross <ccross@android.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com> Cc: HuKeping <hukeping@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
pstore_erase is used to erase the record from the persistent store.
So if a driver has not defined pstore_erase callback return
-EPERM instead of unlinking a file as deleting the file without
erasing its record in persistent store will give a wrong impression
to customers.
For LPAE, we have the following means for encoding writable or dirty
ptes:
L_PTE_DIRTY L_PTE_RDONLY
!pte_dirty && !pte_write 0 1
!pte_dirty && pte_write 0 1
pte_dirty && !pte_write 1 1
pte_dirty && pte_write 1 0
So we can't distinguish between writeable clean ptes and read only
ptes. This can cause problems with ptes being incorrectly flagged as
read only when they are writeable but not dirty.
This patch renumbers L_PTE_RDONLY from AP[2] to a software bit #58,
and adds additional logic to set AP[2] whenever the pte is read only
or not dirty. That way we can distinguish between clean writeable ptes
and read only ptes.
HugeTLB pages will use this new logic automatically.
We need to add some logic to Transparent HugePages to ensure that they
correctly interpret the revised pgprot permissions (L_PTE_RDONLY has
moved and no longer matches PMD_SECT_AP2). In the process of revising
THP, the names of the PMD software bits have been prefixed with L_ to
make them easier to distinguish from their hardware bit counterparts.
Signed-off-by: Steve Capper <steve.capper@linaro.org> Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[hpy: Backported to 3.10
- adjust the context
- ignore change related to pmd, because 3.10 does not support HugePage ] Signed-off-by: Hou Pengyang <houpengyang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Long descriptors on ARM are 64 bits, and some pte functions such as
pte_dirty return a bitwise-and of a flag with the pte value. If the
flag to be tested resides in the upper 32 bits of the pte, then we run
into the danger of the result being dropped if downcast.
For example:
gather_stats(page, md, pte_dirty(*pte), 1);
where pte_dirty(*pte) is downcast to an int.
This patch introduces a new macro pte_isset which performs the bitwise
and, then performs a double logical invert (where needed) to ensure
predictable downcasting. The logical inverse pte_isclear is also
introduced.
Equivalent pmd functions for Transparent HugePages have also been
added.
Signed-off-by: Steve Capper <steve.capper@linaro.org> Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[hpy: Backported to 3.10:
- adjust the context
- ignore change to pmd, because 3.10 does not support HugePage.] Signed-off-by: Hou Pengyang <houpengyang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When setting up the CMA region, we must ensure that the old section
mappings are flushed from the TLB before replacing them with page
tables, otherwise we can suffer from mismatched aliases if the CPU
speculatively prefetches from these mappings at an inopportune time.
A mismatched alias can occur when the TLB contains a section mapping,
but a subsequent prefetch causes it to load a page table mapping,
resulting in the possibility of the TLB containing two matching
mappings for the same virtual address region.
Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Hou Pengyang <houpengyang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The definition of virt_addr_valid is that virt_addr_valid should
return true if and only if virt_to_page returns a valid pointer.
The current definition of virt_addr_valid only checks against the
virtual address range. There's no guarantee that just because a
virtual address falls bewteen PAGE_OFFSET and high_memory the
associated physical memory has a valid backing struct page. Follow
the example of other architectures and convert to pfn_valid to
verify that the virtual address is actually valid. The check for
an address between PAGE_OFFSET and high_memory is still necessary
as vmalloc/highmem addresses are not valid with virt_to_page.
Cc: Will Deacon <will.deacon@arm.com> Cc: Nicolas Pitre <nico@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Laura Abbott <lauraa@codeaurora.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Hou Pengyang <houpengyang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jason Gunthorpe reports a build failure when ARM_PATCH_PHYS_VIRT is
not defined:
In file included from arch/arm/include/asm/page.h:163:0,
from include/linux/mm_types.h:16,
from include/linux/sched.h:24,
from arch/arm/kernel/asm-offsets.c:13:
arch/arm/include/asm/memory.h: In function '__virt_to_phys':
arch/arm/include/asm/memory.h:244:40: error: 'PHYS_OFFSET' undeclared (first use in this function)
arch/arm/include/asm/memory.h:244:40: note: each undeclared identifier is reported only once for each function it appears in
arch/arm/include/asm/memory.h: In function '__phys_to_virt':
arch/arm/include/asm/memory.h:249:13: error: 'PHYS_OFFSET' undeclared (first use in this function)
Fixes: ca5a45c06cd4 ("ARM: mm: use phys_addr_t appropriately in p2v and v2p conversions") Tested-By: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[hpy: Backported to 3.10:
- adjust the context
- MPU is not supported by 3.10, so ignore fix to MPU compared with the original patch.] Signed-off-by: Hou Pengyang <houpengyang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For atomic_cmpxchg(), the type of 'oldval' need be 'int' to match the
type of "*ptr" (used by 'ldrex' instruction) and 'old' (used by 'teq'
instruction).
Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Hou Pengyang <houpengyang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
atomic* value is signed value, and atomic* functions need also process
signed value (parameter value, and return value), so 32-bit arm need
use 'long long' instead of 'u64'.
After replacement, it will also fix a bug for atomic64_add_negative():
"u64 is never less than 0".
The modifications are:
in vim, use "1,% s/\<u64\>/long long/g" command.
remove '__aligned(8)' which is useless for 64-bit.
be sure of 80 column limitation after replacement.
Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Hou Pengyang <houpengyang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For 2-level page tables, PTE_HWTABLE_PTRS describes the offset between
Linux PTEs and hardware PTEs. On LPAE, there is no distinction (since
we have 64-bit descriptors with plenty of space) so PTE_HWTABLE_PTRS
should be 0. Unfortunately, it is wrongly defined as PTRS_PER_PTE,
meaning that current pte table flushing is off by a page. Luckily,
all current LPAE implementations are SMP, so the hardware walker can
snoop L1.
On LPAE machines, PHYS_OFFSET evaluates to a phys_addr_t and this type is
inherited by the PHYS_PFN_OFFSET definition as well. Consequently, the kernel
build emits warnings of the form:
init/main.c: In function 'start_kernel':
init/main.c:588:7: warning: format '%lx' expects argument of type 'long unsigned int', but argument 2 has type 'phys_addr_t' [-Wformat]
This patch fixes this warning by pinning down the PFN type to unsigned long.
This patch fixes the alloc_init_pud() function to use phys_addr_t instead of
unsigned long when passing in the phys argument.
This is an extension to commit 97092e0c56830457af0639f6bd904537a150ea4a (ARM:
pgtable: use phys_addr_t for physical addresses), which applied similar changes
elsewhere in the ARM memory management code.
This patch applies to PAGE_MASK, PMD_MASK, and PGDIR_MASK, where forcing
unsigned long math truncates the mask at the 32-bits. This clearly does bad
things on PAE systems.
This patch fixes this problem by defining these masks as signed quantities.
We then rely on sign extension to do the right thing.
It appears that gcc may put some code in ".text.unlikely" or
".text.hot" sections. Right now those aren't accounted for in unwind
tables. Add them.
I found some docs about this at:
http://gcc.gnu.org/onlinedocs/gcc-4.6.2/gcc.pdf
Without this, if you have slub_debug turned on, you can get messages
that look like this:
unwind: Index not found 7f008c50
Signed-off-by: Doug Anderson <dianders@chromium.org> Acked-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[wangkai: backport to 3.10
- adjust context
] Signed-off-by: Wang Kai <morgan.wang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In Dual EMAC, the default VLANs are used to segregate Rx packets between
the ports, so adding the same default VLAN to the switch will affect the
normal packet transfers. So returning error on addition of dual EMAC
default VLANs.
Even if EMAC 0 default port VLAN is added to EMAC 1, it will lead to
break dual EMAC port separations.
Fixes: d9ba8f9e6298 (driver: net: ethernet: cpsw: dual emac interface implementation) Reported-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The regulator framework maintains a list of consumer regulators
for a regulator device and protects it from concurrent access using
the regulator device's mutex lock.
In the case of regulator_put() the consumer is removed and regulator
device's parameters are updated without holding the regulator device's
mutex. This would lead to a race condition between the regulator_put()
and any function which traverses the consumer list or modifies regulator
device's parameters.
Fix this race condition by holding the regulator device's mutex in case
of regulator_put.
Signed-off-by: Ashay Jaiswal <ashayj@codeaurora.org> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Once the current message is finished, the driver notifies SPI core about
this by calling spi_finalize_current_message(). This function queues next
message to be transferred. If there are more messages in the queue, it is
possible that the driver is asked to transfer the next message at this
point.
When spi_finalize_current_message() returns the driver clears the
drv_data->cur_chip pointer to NULL. The problem is that if the driver
already started the next message clearing drv_data->cur_chip will cause
NULL pointer dereference which crashes the kernel like:
Fix this by clearing drv_data->cur_chip before we call spi_finalize_current_message().
Reported-by: Martin Oldfield <m@mjoldfield.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Acked-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 9b1cc9f251 ("dm cache: share cache-metadata object across
inactive and active DM tables") mistakenly ignored the use of ERR_PTR
returns. Restore missing IS_ERR checks and ERR_PTR returns where
appropriate.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
You can't modify the metadata in these modes. It's better to fail these
messages immediately than let the block-manager deny write locks on
metadata blocks. Otherwise these failed metadata changes will trigger
'needs_check' to get set in the metadata superblock -- requiring repair
using the thin_check utility.
Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In case userspace attempts to obtain key information for or delete a
unicast key, this is currently erroneously rejected unless the driver
sets the WIPHY_FLAG_IBSS_RSN flag. Apparently enough drivers do so it
was never noticed.
Fix that, and while at it fix a potential memory leak: the error path
in the get_key() function was placed after allocating a message but
didn't free it - move it to a better place. Luckily admin permissions
are needed to call this operation.
We only support swap file calling nfs_direct_IO. However, application
might be able to get to nfs_direct_IO if it toggles O_DIRECT flag
during IO and it can deadlock because we grab inode->i_mutex in
nfs_file_direct_write(). So return 0 for such case. Then the generic
layer will fall back to buffer IO.
When the last subscriber to a "Through" port has been removed, the
subscribed destination ports might still be active, so it would be
wrong to send "all sounds off" and "reset controller" events to them.
The proper place for such a shutdown would be the closing of the actual
MIDI port (and close_substream() in rawmidi.c already can do this).
This also fixes a deadlock when dummy_unuse() tries to send events to
its own port that is already locked because it is being freed.
Reported-by: Peter Billam <peter@www.pjb.com.au> Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The commit 3b8a3c010969 ("powerpc/pseries: Fix endiannes issue in RTAS
call from xmon") was fixing an endianness issue in the call made from
xmon to RTAS.
However, as Michael Ellerman noticed, this fix was not complete, the
token value was not byte swapped. This lead to call an unexpected and
most of the time unexisting RTAS function, which is silently ignored by
RTAS.
This fix addresses this hole.
Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While being in an ERROR_WARNING state, and receiving further
bus error events with error counters still in the ERROR_WARNING
range of 97-127 inclusive, the state handling code erroneously
reverts back to ERROR_ACTIVE.
Per the CAN standard, only revert to ERROR_ACTIVE when the
error counters are less than 96.
Moreover, in certain Kvaser models, the BUS_ERROR flag is
always set along with undefined bits in the M16C status
register. Thus use bitwise operators instead of full equality
for checking that register against bus errors.
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On some x86 laptops, plugging a Kvaser device again after an
unplug makes the firmware always ignore the very first command.
For such a case, provide some room for retries instead of
completely exiting the driver init code.
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Send expected argument to the URB completion hander: a CAN
netdevice instead of the network interface private context
`kvaser_usb_net_priv'.
This was discovered by having some garbage in the kernel
log in place of the netdevice names: can0 and can1.
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Upon receiving a hardware event with the BUS_RESET flag set,
the driver kills all of its anchored URBs and resets all of
its transmit URB contexts.
Unfortunately it does so under the context of URB completion
handler `kvaser_usb_read_bulk_callback()', which is often
called in an atomic context.
While the device is flooded with many received error packets,
usb_kill_urb() typically sleeps/reschedules till the transfer
request of each killed URB in question completes, leading to
the sleep in atomic bug. [3]
In v2 submission of the original driver patch [1], it was
stated that the URBs kill and tx contexts reset was needed
since we don't receive any tx acknowledgments later and thus
such resources will be locked down forever. Fortunately this
is no longer needed since an earlier bugfix in this patch
series is now applied: all tx URB contexts are reset upon CAN
channel close. [2]
Moreover, a BUS_RESET is now treated _exactly_ like a BUS_OFF
event, which is the recommended handling method advised by
the device manufacturer.
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
wm8960 codec can't support sample rate 11250, it must be 11025.
Signed-off-by: Zidan Wang <b50113@freescale.com> Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The FIFO size is 40 accordingly to the specifications, but this means 0x40,
i.e. 64 bytes. This patch fixes the typo and enables FIFO size autodetection
for Intel MID devices.
Fixes: 7063c0d942a1 (spi/dw_spi: add DMA support) Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It is critical that fetch_block() and handle_stripe_dirtying()
are consistent in their analysis of what needs to be loaded.
Otherwise raid5 can wait forever for a block that won't be loaded.
Currently when writing to a RAID5 that is resyncing, to a location
beyond the resync offset, handle_stripe_dirtying chooses a
reconstruct-write cycle, but fetch_block() assumes a
read-modify-write, and a lockup can happen.
So treat that case just like RAID6, just as we do in
handle_stripe_dirtying. RAID6 always does reconstruct-write.
This bug was introduced when the behaviour of handle_stripe_dirtying
was changed in 3.7, so the patch is suitable for any kernel since,
though it will need careful merging for some versions.
Cc: stable@vger.kernel.org (v3.7+) Fixes: a7854487cd7128a30a7f4f5259de9f67d5efb95f Reported-by: Henry Cai <henryplusplus@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
reaim workfile.dbase test easily triggers warning in
ext4_da_update_reserve_space():
EXT4-fs warning (device ram0): ext4_da_update_reserve_space:365:
ino 12, allocated 1 with only 0 reserved metadata blocks (releasing 1
blocks with reserved 9 data blocks)
The problem is that (one of) tests creates file and then randomly writes
to it with O_SYNC. That results in writing back pages of the file in
random order so we create extents for written blocks say 0, 2, 4, 6, 8
- this last allocation also allocates new block for extents. Then we
writeout block 1 so we have extents 0-2, 4, 6, 8 and we release
indirect extent block because extents fit in the inode again. Then we
writeout block 10 and we need to allocate indirect extent block again
which triggers the warning because we don't have the reservation
anymore.
Fix the problem by giving back freed metadata blocks resulting from
extent merging into inode's reservation pool.
Signed-off-by: Jan Kara <jack@suse.cz> Cc: Josh Hunt <johunt@akamai.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ext4 needs to convert allocated (metadata) blocks back into blocks
reserved for delayed allocation. Add functions into quota code for
supporting such operation.
Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Josh Hunt <johunt@akamai.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 5d26a105b5a7 ("crypto: prefix module autoloading with "crypto-"")
changed the automatic module loading when requesting crypto algorithms
to prefix all module requests with "crypto-". This requires all crypto
modules to have a crypto specific module alias even if their file name
would otherwise match the requested crypto algorithm.
Even though commit 5d26a105b5a7 added those aliases for a vast amount of
modules, it was missing a few. Add the required MODULE_ALIAS_CRYPTO
annotations to those files to make them get loaded automatically, again.
This fixes, e.g., requesting 'ecb(blowfish-generic)', which used to work
with kernels v3.18 and below.
Also change MODULE_ALIAS() lines to MODULE_ALIAS_CRYPTO(). The former
won't work for crypto modules any more.
This prefixes all crypto module loading with "crypto-" so we never run
the risk of exposing module auto-loading to userspace via a crypto API,
as demonstrated by Mathias Krause:
This commit broke on x86 PV because entries in the generic SWIOTLB are
indexed using (pseudo-)physical address not DMA address and these are
not the same in a x86 PV guest.
When a key is being garbage collected, it's key->user would get put before
the ->destroy() callback is called, where the key is removed from it's
respective tracking structures.
This leaves a key hanging in a semi-invalid state which leaves a window open
for a different task to try an access key->user. An example is
find_keyring_by_name() which would dereference key->user for a key that is
in the process of being garbage collected (where key->user was freed but
->destroy() wasn't called yet - so it's still present in the linked list).
This would cause either a panic, or corrupt memory.
Strictly speaking, this code was never correct. It should have set
read_exec_only and seg_not_present to 1 to indicate that it wanted
to find a free slot without putting anything there, or it should
have put something sensible in the TLS slot if it wanted to allocate
a TLS entry for real. The actual effect of this code was to
allocate a bogus segment that could be used to exploit espfix.
The set_thread_area hardening patches changed the behavior, causing
set_thread_area to return -EINVAL and crashing the game.
This changes set_thread_area to interpret this as a request to find
a free slot and to leave it empty, which isn't *quite* what the game
expects but should be close enough to keep it working. In
particular, using the code above to allocate two segments will
allocate the same segment both times.
According to FrostbittenKing on Github, this fixes The Witcher 2.
If this somehow still causes problems, we could instead allocate
a limit==0 32-bit data segment, but that seems rather ugly to me.
32-bit programs don't have an lm bit in their ABI, so they can't
reliably cause LDT_empty to return true without resorting to memset.
They shouldn't need to do this.
This should fix a longstanding, if minor, issue in all 64-bit kernels
as well as a potential regression in the TLS hardening code.
Many users see this message when booting without knowning that it is
of no importance and that TSC calibration may have succeeded by
another way.
As explained by Paul Bolle in
http://lkml.kernel.org/r/1348488259.1436.22.camel@x61.thuisdomein
"Fast TSC calibration failed" should not be considered as an error
since other calibration methods are being tried afterward. At most,
those send a warning if they fail (not an error). So let's change
the message from error to warning.
[ tglx: Make if pr_info. It's really not important at all ]
Fixes: c767a54ba065 x86/debug: Add KERN_<LEVEL> to bare printks, convert printks to pr_<level> Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Link: http://lkml.kernel.org/r/1418106470-6906-1-git-send-email-alexandre.f.demers@gmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
EXYNOS4_MCT_L_MASK is defined as 0xffffff00, so applying this bitmask
produces a number outside the range 0x00 to 0xff, which always results
in execution of the default switch statement.
Obviously this is wrong and git history shows that the bitmask inversion
was incorrectly set during a refactoring of the MCT code.
Fix this by putting the inversion at the correct position again.
Acked-by: Kukjin Kim <kgene.kim@samsung.com> Reported-by: GP Orcullo <kinsamanka@gmail.com> Reviewed-by: Doug Anderson <dianders@chromium.org> Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When changing flags in the CAN drivers ctrlmode the provided new content has to
be checked whether the bits are allowed to be changed. The bits that are to be
changed are given as a bitfield in cm->mask. Therefore checking against
cm->flags is wrong as the content can hold any kind of values.
The iproute2 tool sets the bits in cm->mask and cm->flags depending on the
detected command line options. To be robust against bogus user space
applications additionally sanitize the provided flags with the provided mask.
Cc: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On Armada XP, 375 and 38x the MBus window 13 has the remap capability,
like windows 0 to 7. However, the mvebu-mbus driver isn't currently
taking into account this special case, which means that when window 13
is actually used, the remap registers are left to 0, making the device
using this MBus window unavailable.
As a minimal fix for stable, don't use window 13. A full fix will
follow later.
Fixes: fddddb52a6c ("bus: introduce an Marvell EBU MBus driver") Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Verify that the frequency value from userspace is valid and makes sense.
Unverified values can cause overflows later on.
Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
[jstultz: Fix up bug for negative values and drop redunent cap check] Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
An unvalidated user input is multiplied by a constant, which can result in
an undefined behaviour for large values. While this is validated later,
we should avoid triggering undefined behaviour.
Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
[jstultz: include trivial milisecond->microsecond correction noticed
by Andy] Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If a DM table is reloaded with an inactive table when the device is not
suspended (normal procedure for LVM2), then there will be two dm-bufio
objects that can diverge. This can lead to a situation where the
inactive table uses bufio to read metadata at the same time the active
table writes metadata -- resulting in the inactive table having stale
metadata buffers once it is promoted to the active table slot.
Fix this by using reference counting and a global list of cache metadata
objects to ensure there is only one metadata object per metadata device.
Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes a race condition in abort handling that was injected
when multiple interrupt support was added. When only a single
interrupt is present, the adapter guarantees it will send
responses for aborted commands prior to the response for the
abort command itself. With multiple interrupts, these responses
generally come back on different interrupts, so we need to
ensure the abort thread waits until the aborted command is
complete so we don't perform a double completion. This race
condition was being hit frequently in environments which
were triggering command timeouts, which was resulting in
a double completion causing a kernel oops.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Reviewed-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com> Tested-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If CONFIG_DEBUG_MUTEXES is set, the mutex->owner field is only cleared
if the mutex debugging is enabled which introduces a race in our
mutex_is_locked_by() - i.e. we may inspect the old owner value before it
is acquired by the new task.
This is the root cause of this error:
# diff --git a/kernel/locking/mutex-debug.c b/kernel/locking/mutex-debug.c
# index 5cf6731..3ef3736 100644
# --- a/kernel/locking/mutex-debug.c
# +++ b/kernel/locking/mutex-debug.c
# @@ -80,13 +80,13 @@ void debug_mutex_unlock(struct mutex *lock)
# DEBUG_LOCKS_WARN_ON(lock->owner != current);
#
# DEBUG_LOCKS_WARN_ON(!lock->wait_list.prev && !lock->wait_list.next);
# - mutex_clear_owner(lock);
# }
#
# /*
# * __mutex_slowpath_needs_to_unlock() is explicitly 0 for debug
# * mutexes so that we can do it here after we've verified state.
# */
# + mutex_clear_owner(lock);
# atomic_set(&lock->count, 1);
# }
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87955 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Compiling SH with gcc-4.8 fails due to the -m32 option not being
supported.
From http://buildd.debian-ports.org/status/fetch.php?pkg=linux&arch=sh4&ver=3.16.7-ckt4-1&stamp=1421425783
CC init/main.o
gcc-4.8: error: unrecognized command line option '-m32'
ld: cannot find init/.tmp_mc_main.o: No such file or directory
objcopy: 'init/.tmp_mx_main.o': No such file
rm: cannot remove 'init/.tmp_mx_main.o': No such file or directory
rm: cannot remove 'init/.tmp_mc_main.o': No such file or directory
It is possible for ata_sff_flush_pio_task() to set ap->hsm_task_state to
HSM_ST_IDLE in between the time __ata_sff_port_intr() checks for HSM_ST_IDLE
and before it calls ata_sff_hsm_move() causing ata_sff_hsm_move() to BUG().
This problem is hard to reproduce making this patch hard to verify, but this
fix will prevent the race.
I have not been able to reproduce the problem, but here is a crash dump from
a 2.6.32 kernel.
On examining the ata port's state, its hsm_task_state field has a value of HSM_ST_IDLE:
Normally, this should not be possible as ata_sff_hsm_move() was called from ata_sff_host_intr(),
which checks hsm_task_state and won't call ata_sff_hsm_move() if it has a HSM_ST_IDLE value.