git.itanic.dy.fi Git - linux-stable/log

Linux 3.12.4

drm/radeon/audio: correct ACR table

commit 3e71985f2439d8c4090dc2820e497e6f3d72dcff upstream.

The values were taken from the HDMI spec, but they assumed
exact x/1.001 clocks. Since we round the clocks, we also need
to calculate different N and CTS values.

Note that the N for 25.2/1.001 MHz at 44.1 kHz audio is out of
spec. Hopefully this mode is rarely used and/or HDMI sinks
tolerate overly large values of N.

bug:
https://bugs.freedesktop.org/show_bug.cgi?id=69675

Signed-off-by: Pierre Ossman <pierre@ossman.eu>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon/audio: improve ACR calculation

commit a2098250fbda149cfad9e626afe80abe3b21e574 upstream.

In order to have any realistic chance of calculating proper
ACR values, we need to be able to calculate both N and CTS,
not just CTS. We still aim for the ideal N as specified in
the HDMI spec though.

bug:
https://bugs.freedesktop.org/show_bug.cgi?id=69675

Signed-off-by: Pierre Ossman <pierre@ossman.eu>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

aio: clean up aio ring in the fail path

commit d1b9432712a25eeb06114fb4b587133525a47de5 upstream.

Clean up the aio ring file in the fail path of aio_setup_ring
and ioctx_alloc. And maybe it can fix the GPF issue reported by
Dave Jones:
https://lkml.org/lkml/2013/11/25/898

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

aio: nullify aio->ring_pages after freeing it

commit ddb8c45ba15149ebd41d7586261c05f7ca37f9a1 upstream.

After freeing ring_pages we leave it as is causing a dangling pointer. This
has already caused an issue so to help catching any issues in the future
NULL it out.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

aio: prevent double free in ioctx_alloc

commit d558023207e008a4476a3b7bb8706b2a2bf5d84f upstream.

ioctx_alloc() calls aio_setup_ring() to allocate a ring. If aio_setup_ring()
fails to do so it would call aio_free_ring() before returning, but
ioctx_alloc() would call aio_free_ring() again causing a double free of
the ring.

This is easily reproducible from userspace.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

aio: checking for NULL instead of IS_ERR

commit 7f62656be8a8ef14c168db2d98021fb9c8cc1076 upstream.

alloc_anon_inode() returns an ERR_PTR(), it doesn't return NULL.

Fixes: 71ad7490c1f3 ('rework aio migrate pages to use aio fs')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rework aio migrate pages to use aio fs

commit 71ad7490c1f32bd7829df76360f9fa17829868f3 upstream.

Don't abuse anon_inodes.c to host private files needed by aio;
we can bloody well declare a mini-fs of our own instead of
patching up what anon_inodes can create for us.

Tested-by: Benjamin LaHaise <bcrl@kvack.org>
Acked-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

take anon inode allocation to libfs.c

commit 6987843ff7e836ea65b554905aec34d2fad05c94 upstream.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

aio: Fix a trinity splat

commit e34ecee2ae791df674dfb466ce40692ca6218e43 upstream.

aio kiocb refcounting was broken - it was relying on keeping track of
the number of available ring buffer entries, which it needs to do
anyways; then at shutdown time it'd wait for completions to be delivered
until the # of available ring buffer entries equalled what it was
initialized to.

Problem with that is that the ring buffer is mapped writable into
userspace, so userspace could futz with the head and tail pointers to
cause the kernel to see extra completions, and cause free_ioctx() to
return while there were still outstanding kiocbs. Which would be bad.

Fix is just to directly refcount the kiocbs - which is more
straightforward, and with the new percpu refcounting code doesn't cost
us any cacheline bouncing which was the whole point of the original
scheme.

Also clean up ioctx_alloc()'s error path and fix a bug where it wasn't
subtracting from aio_nr if ioctx_add_table() failed.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ntp: Make periodic RTC update more reliable

commit a97ad0c4b447a132a322cedc3a5f7fa4cab4b304 upstream.

The current code requires that the scheduled update of the RTC happens
in the closest tick to the half of the second. This seems to be
difficult to achieve reliably. The scheduled work may be missing the
target time by a tick or two and be constantly rescheduled every second.

Relax the limit to 10 ticks. As a typical RTC drifts in the 11-minute
update interval by several milliseconds, this shouldn't affect the
overall accuracy of the RTC much.

Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

elevator: acquire q->sysfs_lock in elevator_change()

commit 7c8a3679e3d8e9d92d58f282161760a0e247df97 upstream.

Add locking of q->sysfs_lock into elevator_change() (an exported function)
to ensure it is held to protect q->elevator from elevator_init(), even if
elevator_change() is called from non-sysfs paths.
sysfs path (elv_iosched_store) uses __elevator_change(), non-locking
version, as the lock is already taken by elv_iosched_store().

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

elevator: Fix a race in elevator switching and md device initialization

commit eb1c160b22655fd4ec44be732d6594fd1b1e44f4 upstream.

The soft lockup below happens at the boot time of the system using dm
multipath and the udev rules to switch scheduler.

[  356.127001] BUG: soft lockup - CPU#3 stuck for 22s! [sh:483]
[  356.127001] RIP: 0010:[<ffffffff81072a7d>]  [<ffffffff81072a7d>] lock_timer_base.isra.35+0x1d/0x50
...
[  356.127001] Call Trace:
[  356.127001]  [<ffffffff81073810>] try_to_del_timer_sync+0x20/0x70
[  356.127001]  [<ffffffff8118b08a>] ? kmem_cache_alloc_node_trace+0x20a/0x230
[  356.127001]  [<ffffffff810738b2>] del_timer_sync+0x52/0x60
[  356.127001]  [<ffffffff812ece22>] cfq_exit_queue+0x32/0xf0
[  356.127001]  [<ffffffff812c98df>] elevator_exit+0x2f/0x50
[  356.127001]  [<ffffffff812c9f21>] elevator_change+0xf1/0x1c0
[  356.127001]  [<ffffffff812caa50>] elv_iosched_store+0x20/0x50
[  356.127001]  [<ffffffff812d1d09>] queue_attr_store+0x59/0xb0
[  356.127001]  [<ffffffff812143f6>] sysfs_write_file+0xc6/0x140
[  356.127001]  [<ffffffff811a326d>] vfs_write+0xbd/0x1e0
[  356.127001]  [<ffffffff811a3ca9>] SyS_write+0x49/0xa0
[  356.127001]  [<ffffffff8164e899>] system_call_fastpath+0x16/0x1b

This is caused by a race between md device initialization by multipathd and
shell script to switch the scheduler using sysfs.

- multipathd:
   SyS_ioctl -> do_vfs_ioctl -> dm_ctl_ioctl -> ctl_ioctl -> table_load
   -> dm_setup_md_queue -> blk_init_allocated_queue -> elevator_init
    q->elevator = elevator_alloc(q, e); // not yet initialized

- sh -c 'echo deadline > /sys/$DEVPATH/queue/scheduler':
   elevator_switch (in the call trace above)
    struct elevator_queue *old = q->elevator;
    q->elevator = elevator_alloc(q, new_e);
    elevator_exit(old);                 // lockup! (*)

- multipathd: (cont.)
    err = e->ops.elevator_init_fn(q);   // init fails; q->elevator is modified

(*) When del_timer_sync() is called, lock_timer_base() will loop infinitely
while timer->base == NULL. In this case, as timer will never initialized,
it results in lockup.

This patch introduces acquisition of q->sysfs_lock around elevator_init()
into blk_init_allocated_queue(), to provide mutual exclusion between
initialization of the q->scheduler and switching of the scheduler.

This should fix this bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=902012

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

rt2800: add support for radio chip RF3070

commit 3b9b74baa1af2952d719735b4a4a34706a593948 upstream.

Add support for new RF chip ID: 3070. It seems to be the same as 5370,
maybe vendor just put wrong value on the eeprom, but add this id anyway
since devices with it showed on the marked.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iommu: Remove stack trace from broken irq remapping warning

commit 05104a4e8713b27291c7bb49c1e7e68b4e243571 upstream.

The warning for the irq remapping broken check in intel_irq_remapping.c is
pretty pointless.  We need the warning, but we know where its comming from, the
stack trace will always be the same, and it needlessly triggers things like
Abrt.  This changes the warning to just print a text warning about BIOS being
broken, without the stack trace, then sets the appropriate taint bit.  Since we
automatically disable irq remapping, theres no need to contiue making Abrt jump
at this problem

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Joerg Roedel <joro@8bytes.org>
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Andy Lutomirski <luto@amacapital.net>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iommu/vt-d: Fixed interaction of VFIO_IOMMU_MAP_DMA with IOMMU address limits

commit f9423606ade08653dd8a43334f0a7fb45504c5cc upstream.

The BUG_ON in drivers/iommu/intel-iommu.c:785 can be triggered from userspace via
VFIO by calling the VFIO_IOMMU_MAP_DMA ioctl on a vfio device with any address
beyond the addressing capabilities of the IOMMU. The problem is that the ioctl code
calls iommu_iova_to_phys before it calls iommu_map. iommu_map handles the case that
it gets addresses beyond the addressing capabilities of its IOMMU.
intel_iommu_iova_to_phys does not.

This patch fixes iommu_iova_to_phys to return NULL for addresses beyond what the
IOMMU can handle. This in turn causes the ioctl call to fail in iommu_map and
(correctly) return EFAULT to the user with a helpful warning message in the kernel
log.

Signed-off-by: Julian Stecklina <jsteckli@os.inf.tu-dresden.de>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

HID: hid-elo: some systems cannot stomach work around

commit 403cfb53fb450d53751fdc7ee0cd6652419612cf upstream.

Some systems although they have firmware class 'M', which usually
needs a work around to not crash, must not be subjected to the
work around because the work around crashes them. They cannot be
told apart by their own device descriptor, but as they are part
of compound devices, can be identified by looking at their siblings.

Signed-off-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

HID: lg: fix Report Descriptor for Logitech MOMO Force (Black)

commit 348cbaa800f8161168b20f85f72abb541c145132 upstream.

By default the Logitech MOMO Force (Black) presents a combined accel/brake
axis ('Y'). This patch modifies the HID descriptor to present seperate
accel/brake axes ('Y' and 'Z').

Signed-off-by: Simon Wood <simon@mungewell.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

video: kyro: fix incorrect sizes when copying to userspace

commit 2ab68ec927310dc488f3403bb48f9e4ad00a9491 upstream.

kyro would copy u32s and specify sizeof(unsigned long) as the size to copy.

This would copy more data than intended and cause memory corruption and might
leak kernel memory.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: wusbcore: change WA_SEGS_MAX to a legal value

commit f74b75e7f920c700636cccca669c7d16d12e9202 upstream.

change WA_SEGS_MAX to a number that is legal according to the WUSB
spec.

Signed-off-by: Thomas Pugliese <thomas.pugliese@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: musb: davinci: fix resources passed to MUSB driver for DM6467

commit ea78201e2e08f8a91e30100c4c4a908b5cf295fc upstream.

After commit 09fc7d22b024692b2fe8a943b246de1af307132b (usb: musb: fix incorrect
usage of  resource pointer), CPPI DMA driver on DaVinci DM6467 can't detect its
dedicated IRQ and so the MUSB IRQ  is erroneously used instead. This is because
only 2 resources are passed to the MUSB driver from the DaVinci glue layer,  so
fix  this by always copying 3 resources (it's  safe since a placeholder for the
3rd resource is always  there) and passing 'pdev->num_resources' instead of the
size of musb_resources[] to platform_device_add_resources().

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

md/raid5: Use conf->device_lock protect changing of multi-thread resources.

commit 60aaf933854511630e16be4efe0f96485e132de4 upstream.
and commit 0c775d5208284700de423e6746259da54a42e1f5

When we change group_thread_cnt from sysfs entry, it can OOPS.

The kernel messages are:
[  135.299021] BUG: unable to handle kernel NULL pointer dereference at           (null)
[  135.299073] IP: [<ffffffff815188ab>] handle_active_stripes+0x32b/0x440
[  135.299107] PGD 0
[  135.299122] Oops: 0000 [#1] SMP
[  135.299144] Modules linked in: netconsole e1000e ptp pps_core
[  135.299188] CPU: 3 PID: 2225 Comm: md0_raid5 Not tainted 3.12.0+ #24
[  135.299214] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080015  11/09/2011
[  135.299255] task: ffff8800b9638f80 ti: ffff8800b77a4000 task.ti: ffff8800b77a4000
[  135.299283] RIP: 0010:[<ffffffff815188ab>]  [<ffffffff815188ab>] handle_active_stripes+0x32b/0x440
[  135.299323] RSP: 0018:ffff8800b77a5c48  EFLAGS: 00010002
[  135.299344] RAX: ffff880037bb5c70 RBX: 0000000000000000 RCX: 0000000000000008
[  135.299371] RDX: ffff880037bb5cb8 RSI: 0000000000000001 RDI: ffff880037bb5c00
[  135.299398] RBP: ffff8800b77a5d08 R08: 0000000000000001 R09: 0000000000000000
[  135.299425] R10: ffff8800b77a5c98 R11: 00000000ffffffff R12: ffff880037bb5c00
[  135.299452] R13: 0000000000000000 R14: 0000000000000000 R15: ffff880037bb5c70
[  135.299479] FS:  0000000000000000(0000) GS:ffff88013fd80000(0000) knlGS:0000000000000000
[  135.299510] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  135.299532] CR2: 0000000000000000 CR3: 0000000001c0b000 CR4: 00000000000407e0
[  135.299559] Stack:
[  135.299570]  ffff8800b77a5c88 ffffffff8107383e ffff8800b77a5c88 ffff880037a64300
[  135.299611]  000000000000ec08 ffff880037bb5cb8 ffff8800b77a5c98 ffffffffffffffd8
[  135.299654]  000000000000ec08 ffff880037bb5c60 ffff8800b77a5c98 ffff8800b77a5c98
[  135.299696] Call Trace:
[  135.299711]  [<ffffffff8107383e>] ? __wake_up+0x4e/0x70
[  135.299733]  [<ffffffff81518f88>] raid5d+0x4c8/0x680
[  135.299756]  [<ffffffff817174ed>] ? schedule_timeout+0x15d/0x1f0
[  135.299781]  [<ffffffff81524c9f>] md_thread+0x11f/0x170
[  135.299804]  [<ffffffff81069cd0>] ? wake_up_bit+0x40/0x40
[  135.299826]  [<ffffffff81524b80>] ? md_rdev_init+0x110/0x110
[  135.299850]  [<ffffffff81069656>] kthread+0xc6/0xd0
[  135.299871]  [<ffffffff81069590>] ? kthread_freezable_should_stop+0x70/0x70
[  135.299899]  [<ffffffff81722ffc>] ret_from_fork+0x7c/0xb0
[  135.299923]  [<ffffffff81069590>] ? kthread_freezable_should_stop+0x70/0x70
[  135.299951] Code: ff ff ff 0f 84 d7 fe ff ff e9 5c fe ff ff 66 90 41 8b b4 24 d8 01 00 00 45 31 ed 85 f6 0f 8e 7b fd ff ff 49 8b 9c 24 d0 01 00 00 <48> 3b 1b 49 89 dd 0f 85 67 fd ff ff 48 8d 43 28 31 d2 eb 17 90
[  135.300005] RIP  [<ffffffff815188ab>] handle_active_stripes+0x32b/0x440
[  135.300005]  RSP <ffff8800b77a5c48>
[  135.300005] CR2: 0000000000000000
[  135.300005] ---[ end trace 504854e5bb7562ed ]---
[  135.300005] Kernel panic - not syncing: Fatal exception

This is because raid5d() can be running when the multi-thread
resources are changed via system. We see need to provide locking.

mddev->device_lock is suitable, but we cannot simple call
alloc_thread_groups under this lock as we cannot allocate memory
while holding a spinlock.
So change alloc_thread_groups() to allocate and return the data
structures, then raid5_store_group_thread_cnt() can take the lock
while updating the pointers to the data structures.

This fixes a bug introduced in 3.12 and so is suitable for the 3.12.x
stable series.

Fixes: b721420e8719131896b009b11edbbd27
Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Shaohua Li <shli@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm: numa: return the number of base pages altered by protection changes

commit 72403b4a0fbdf433c1fe0127e49864658f6f6468 upstream.

Commit 0255d4918480 ("mm: Account for a THP NUMA hinting update as one
PTE update") was added to account for the number of PTE updates when
marking pages prot_numa.  task_numa_work was using the old return value
to track how much address space had been updated.  Altering the return
value causes the scanner to do more work than it is configured or
documented to in a single unit of work.

This patch reverts that commit and accounts for the number of THP
updates separately in vmstat.  It is up to the administrator to
interpret the pair of values correctly.  This is a straight-forward
operation and likely to only be of interest when actively debugging NUMA
balancing problems.

The impact of this patch is that the NUMA PTE scanner will scan slower
when THP is enabled and workloads may converge slower as a result.  On
the flip size system CPU usage should be lower than recent tests
reported.  This is an illustrative example of a short single JVM specjbb
test

specjbb
                       3.12.0                3.12.0
                      vanilla      acctupdates
TPut 1      26143.00 (  0.00%)     25747.00 ( -1.51%)
TPut 7     185257.00 (  0.00%)    183202.00 ( -1.11%)
TPut 13    329760.00 (  0.00%)    346577.00 (  5.10%)
TPut 19    442502.00 (  0.00%)    460146.00 (  3.99%)
TPut 25    540634.00 (  0.00%)    549053.00 (  1.56%)
TPut 31    512098.00 (  0.00%)    519611.00 (  1.47%)
TPut 37    461276.00 (  0.00%)    474973.00 (  2.97%)
TPut 43    403089.00 (  0.00%)    414172.00 (  2.75%)

              3.12.0      3.12.0
             vanillaacctupdates
User         5169.64     5184.14
System        100.45       80.02
Elapsed       252.75      251.85

Performance is similar but note the reduction in system CPU time.  While
this showed a performance gain, it will not be universal but at least
it'll be behaving as documented.  The vmstats are obviously different but
here is an obvious interpretation of them from mmtests.

                                3.12.0      3.12.0
                               vanillaacctupdates
NUMA page range updates        1408326    11043064
NUMA huge PMD updates                0       21040
NUMA PTE updates               1408326      291624

"NUMA page range updates" == nr_pte_updates and is the value returned to
the NUMA pte scanner.  NUMA huge PMD updates were the number of THP
updates which in combination can be used to calculate how many ptes were
updated from userspace.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reported-by: Alex Thorlton <athorlton@sgi.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xfs: add capability check to free eofblocks ioctl

commit 8c567a7fab6e086a0284eee2db82348521e7120c upstream.

Check for CAP_SYS_ADMIN since the caller can truncate preallocated
blocks from files they do not own nor have write access to. A more
fine grained access check was considered: require the caller to
specify their own uid/gid and to use inode_permission to check for
write, but this would not catch the case of an inode not reachable
via path traversal from the callers mount namespace.

Add check for read-only filesystem to free eofblocks ioctl.

Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Cc: Kees Cook <keescook@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xfrm: Fix null pointer dereference when decoding sessions

[ Upstream commit 84502b5ef9849a9694673b15c31bd3ac693010ae ]

On some codepaths the skb does not have a dst entry
when xfrm_decode_session() is called. So check for
a valid skb_dst() before dereferencing the device
interface index. We use 0 as the device index if
there is no valid skb_dst(), or at reverse decoding
we use skb_iif as device interface index.

Bug was introduced with git commit bafd4bd4dc
("xfrm: Decode sessions with output interface.").

Reported-by: Meelis Roos <mroos@linux.ee>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

{pktgen, xfrm} Update IPv4 header total len and checksum after tranformation

[ Upstream commit 3868204d6b89ea373a273e760609cb08020beb1a ]

commit a553e4a6317b2cfc7659542c10fe43184ffe53da ("[PKTGEN]: IPSEC support")
tried to support IPsec ESP transport transformation for pktgen, but acctually
this doesn't work at all for two reasons(The orignal transformed packet has
bad IPv4 checksum value, as well as wrong auth value, reported by wireshark)

- After transpormation, IPv4 header total length needs update,
because encrypted payload's length is NOT same as that of plain text.

- After transformation, IPv4 checksum needs re-caculate because of payload
has been changed.

With this patch, armmed pktgen with below cofiguration, Wireshark is able to
decrypted ESP packet generated by pktgen without any IPv4 checksum error or
auth value error.

pgset "flag IPSEC"
pgset "flows 1"

Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ipv6: fix possible seqlock deadlock in ip6_finish_output2

[ Upstream commit 7f88c6b23afbd31545c676dea77ba9593a1a14bf ]

IPv6 stats are 64 bits and thus are protected with a seqlock. By not
disabling bottom-half we could deadlock here if we don't disable bh and
a softirq reentrantly updates the same mib.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

inet: fix possible seqlock deadlocks

[ Upstream commit f1d8cba61c3c4b1eb88e507249c4cb8d635d9a76 ]

In commit c9e9042994d3 ("ipv4: fix possible seqlock deadlock") I left
another places where IP_INC_STATS_BH() were improperly used.

udp_sendmsg(), ping_v4_sendmsg() and tcp_v4_connect() are called from
process context, not from softirq context.

This was detected by lockdep seqlock support.

Reported-by: jongman heo <jongman.heo@samsung.com>
Fixes: 584bdf8cbdf6 ("[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP")
Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

team: fix master carrier set when user linkup is enabled

[ Upstream commit f5e0d34382e18f396d7673a84df8e3342bea7eb6 ]

When user linkup is enabled and user sets linkup of individual port,
we need to recompute linkup (carrier) of master interface so the change
is reflected. Fix this by calling __team_carrier_check() which does the
needed work.

Please apply to all stable kernels as well. Thanks.

Reported-by: Jan Tluka <jtluka@redhat.com>
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: update consumers of MSG_MORE to recognize MSG_SENDPAGE_NOTLAST

[ Upstream commit d3f7d56a7a4671d395e8af87071068a195257bf6 ]

Commit 35f9c09fe (tcp: tcp_sendpages() should call tcp_push() once)
added an internal flag MSG_SENDPAGE_NOTLAST, similar to
MSG_MORE.

algif_hash, algif_skcipher, and udp used MSG_MORE from tcp_sendpages()
and need to see the new flag as identical to MSG_MORE.

This fixes sendfile() on AF_ALG.

v3: also fix udp

Cc: Tom Herbert <therbert@google.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: <stable@vger.kernel.org> # 3.4.x + 3.2.x
Reported-and-tested-by: Shawn Landden <shawnlandden@gmail.com>
Original-patch: Richard Weinberger <richard@nod.at>
Signed-off-by: Shawn Landden <shawn@churchofgit.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: smc91: fix crash regression on the versatile

[ Upstream commit a0c20fb02592d372e744d1d739cda3e1b3defaae ]

After commit e9e4ea74f06635f2ffc1dffe5ef40c854faa0a90
"net: smc91x: dont't use SMC_outw for fixing up halfword-aligned data"
The Versatile SMSC LAN91C111 is crashing like this:

------------[ cut here ]------------
kernel BUG at /home/linus/linux/drivers/net/ethernet/smsc/smc91x.c:599!
Internal error: Oops - BUG: 0 [#1] ARM
Modules linked in:
CPU: 0 PID: 43 Comm: udhcpc Not tainted 3.13.0-rc1+ #24
task: c6ccfaa0 ti: c6cd0000 task.ti: c6cd0000
PC is at smc_hardware_send_pkt+0x198/0x22c
LR is at smc_hardware_send_pkt+0x24/0x22c
pc : [<c01be324>]    lr : [<c01be1b0>]    psr: 20000013
sp : c6cd1d08  ip : 00000001  fp : 00000000
r10: c02adb08  r9 : 00000000  r8 : c6ced802
r7 : c786fba0  r6 : 00000146  r5 : c8800000  r4 : c78d6000
r3 : 0000000f  r2 : 00000146  r1 : 00000000  r0 : 00000031
Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 0005317f  Table: 06cf4000  DAC: 00000015
Process udhcpc (pid: 43, stack limit = 0xc6cd01c0)
Stack: (0xc6cd1d08 to 0xc6cd2000)
1d00:                   00000010 c8800000 c78d6000 c786fba0 c78d6000 c01be868
1d20: c01be7a4 00004000 00000000 c786fba0 c6c12b80 c0208554 000004d0 c780fc60
1d40: 00000220 c01fb734 00000000 00000000 00000000 c6c9a440 c6c12b80 c78d6000
1d60: c786fba0 c6c9a440 00000000 c021d1d8 00000000 00000000 c6c12b80 c78d6000
1d80: c786fba0 00000001 c6c9a440 c02087f8 c6c9a4a0 00080008 00000000 00000000
1da0: c78d6000 c786fba0 c78d6000 00000138 00000000 00000000 00000000 00000000
1dc0: 00000000 c027ba74 00000138 00000138 00000001 00000010 c6cedc00 00000000
1de0: 00000008 c7404400 c6cd1eec c6cd1f14 c067a73c c065c0b8 00000000 c067a740
1e00: 01ffffff 002040d0 00000000 00000000 00000000 00000000 00000000 ffffffff
1e20: 43004400 00110022 c6cdef20 c027ae8c c6ccfaa0 be82d65c 00000014 be82d3cc
1e40: 00000000 00000000 00000000 c01f2870 00000000 00000000 00000000 c6cd1e88
1e60: c6ccfaa0 00000000 00000000 00000000 00000000 00000000 00000000 00000000
1e80: 00000000 00000000 00000031 c7802310 c7802300 00000138 c7404400 c0771da0
1ea0: 00000000 c6cd1eec c7800340 00000138 be82d65c 00000014 be82d3cc c6cd1f08
1ec0: 00000014 00000000 c7404400 c7404400 00000138 c01f4628 c78d6000 00000000
1ee0: 00000000 be82d3cc 00000138 c6cd1f08 00000014 c6cd1ee4 00000001 00000000
1f00: 00000000 00000000 00080011 00000002 06000000 ffffffff 0000ffff 00000002
1f20: 06000000 ffffffff 0000ffff c00928c8 c065c520 c6cd1f58 00000003 c009299c
1f40: 00000003 c065c520 c7404400 00000000 c7404400 c01f2218 c78106b0 c7441cb0
1f60: 00000000 00000006 c06799fc 00000000 00000000 00000006 00000000 c01f3ee0
1f80: 00000000 00000000 be82d678 be82d65c 00000014 00000001 00000122 c00139c8
1fa0: c6cd0000 c0013840 be82d65c 00000014 00000006 be82d3cc 00000138 00000000
1fc0: be82d65c 00000014 00000001 00000122 00000000 00000000 00018cb1 00000000
1fe0: 00003801 be82d3a8 0003a0c7 b6e9af08 60000010 00000006 00000000 00000000
[<c01be324>] (smc_hardware_send_pkt+0x198/0x22c) from [<c01be868>] (smc_hard_start_xmit+0xc4/0x1e8)
[<c01be868>] (smc_hard_start_xmit+0xc4/0x1e8) from [<c0208554>] (dev_hard_start_xmit+0x460/0x4cc)
[<c0208554>] (dev_hard_start_xmit+0x460/0x4cc) from [<c021d1d8>] (sch_direct_xmit+0x94/0x18c)
[<c021d1d8>] (sch_direct_xmit+0x94/0x18c) from [<c02087f8>] (dev_queue_xmit+0x238/0x42c)
[<c02087f8>] (dev_queue_xmit+0x238/0x42c) from [<c027ba74>] (packet_sendmsg+0xbe8/0xd28)
[<c027ba74>] (packet_sendmsg+0xbe8/0xd28) from [<c01f2870>] (sock_sendmsg+0x84/0xa8)
[<c01f2870>] (sock_sendmsg+0x84/0xa8) from [<c01f4628>] (SyS_sendto+0xb8/0xdc)
[<c01f4628>] (SyS_sendto+0xb8/0xdc) from [<c0013840>] (ret_fast_syscall+0x0/0x2c)
Code: e3130002 1a000001 e3130001 0affffcd (e7f001f2)
---[ end trace 81104fe70e8da7fe ]---
Kernel panic - not syncing: Fatal exception in interrupt

This is because the macro operations in smc91x.h defined
for Versatile are missing SMC_outsw() as used in this
commit.

The Versatile needs and uses the same accessors as the other
platforms in the first if(...) clause, just switch it to using
that and we have one problem less to worry about.

This includes a hunk of a patch from Will Deacon fixin
the other 32bit platforms as well: Innokom, Ramses, PXA,
PCM027.

Checkpatch complains about spacing, but I have opted to
follow the style of this .h-file.

Cc: Russell King <linux@arm.linux.org.uk>
Cc: Nicolas Pitre <nico@fluxnic.net>
Cc: Eric Miao <eric.y.miao@gmail.com>
Cc: Jonathan Cameron <jic23@cam.ac.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: 8139cp: fix a BUG_ON triggered by wrong bytes_compl

[ Upstream commit 7fe0ee099ad5e3dea88d4ee1b6f20246b1ca57c3 ]

Using iperf to send packets(GSO mode is on), a bug is triggered:

[  212.672781] kernel BUG at lib/dynamic_queue_limits.c:26!
[  212.673396] invalid opcode: 0000 [#1] SMP
[  212.673882] Modules linked in: 8139cp(O) nls_utf8 edd fuse loop dm_mod ipv6 i2c_piix4 8139too i2c_core intel_agp joydev pcspkr hid_generic intel_gtt floppy sr_mod mii button sg cdrom ext3 jbd mbcache usbhid hid uhci_hcd ehci_hcd usbcore sd_mod usb_common crc_t10dif crct10dif_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata scsi_mod [last unloaded: 8139cp]
[  212.676084] CPU: 0 PID: 4124 Comm: iperf Tainted: G           O 3.12.0-0.7-default+ #16
[  212.676084] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  212.676084] task: ffff8800d83966c0 ti: ffff8800db4c8000 task.ti: ffff8800db4c8000
[  212.676084] RIP: 0010:[<ffffffff8122e23f>]  [<ffffffff8122e23f>] dql_completed+0x17f/0x190
[  212.676084] RSP: 0018:ffff880116e03e30  EFLAGS: 00010083
[  212.676084] RAX: 00000000000005ea RBX: 0000000000000f7c RCX: 0000000000000002
[  212.676084] RDX: ffff880111dd0dc0 RSI: 0000000000000bd4 RDI: ffff8800db6ffcc0
[  212.676084] RBP: ffff880116e03e48 R08: 0000000000000992 R09: 0000000000000000
[  212.676084] R10: ffffffff8181e400 R11: 0000000000000004 R12: 000000000000000f
[  212.676084] R13: ffff8800d94ec840 R14: ffff8800db440c80 R15: 000000000000000e
[  212.676084] FS:  00007f6685a3c700(0000) GS:ffff880116e00000(0000) knlGS:0000000000000000
[  212.676084] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  212.676084] CR2: 00007f6685ad6460 CR3: 00000000db714000 CR4: 00000000000006f0
[  212.676084] Stack:
[  212.676084]  ffff8800db6ffc00 000000000000000f ffff8800d94ec840 ffff880116e03eb8
[  212.676084]  ffffffffa041509f ffff880116e03e88 0000000f16e03e88 ffff8800d94ec000
[  212.676084]  00000bd400059858 000000050000000f ffffffff81094c36 ffff880116e03eb8
[  212.676084] Call Trace:
[  212.676084]  <IRQ>
[  212.676084]  [<ffffffffa041509f>] cp_interrupt+0x4ef/0x590 [8139cp]
[  212.676084]  [<ffffffff81094c36>] ? ktime_get+0x56/0xd0
[  212.676084]  [<ffffffff8108cf73>] handle_irq_event_percpu+0x53/0x170
[  212.676084]  [<ffffffff8108d0cc>] handle_irq_event+0x3c/0x60
[  212.676084]  [<ffffffff8108fdb5>] handle_fasteoi_irq+0x55/0xf0
[  212.676084]  [<ffffffff810045df>] handle_irq+0x1f/0x30
[  212.676084]  [<ffffffff81003c8b>] do_IRQ+0x5b/0xe0
[  212.676084]  [<ffffffff8142beaa>] common_interrupt+0x6a/0x6a
[  212.676084]  <EOI>
[  212.676084]  [<ffffffffa0416a21>] ? cp_start_xmit+0x621/0x97c [8139cp]
[  212.676084]  [<ffffffffa0416a09>] ? cp_start_xmit+0x609/0x97c [8139cp]
[  212.676084]  [<ffffffff81378ed9>] dev_hard_start_xmit+0x2c9/0x550
[  212.676084]  [<ffffffff813960a9>] sch_direct_xmit+0x179/0x1d0
[  212.676084]  [<ffffffff813793f3>] dev_queue_xmit+0x293/0x440
[  212.676084]  [<ffffffff813b0e46>] ip_finish_output+0x236/0x450
[  212.676084]  [<ffffffff810e59e7>] ? __alloc_pages_nodemask+0x187/0xb10
[  212.676084]  [<ffffffff813b10e8>] ip_output+0x88/0x90
[  212.676084]  [<ffffffff813afa64>] ip_local_out+0x24/0x30
[  212.676084]  [<ffffffff813aff0d>] ip_queue_xmit+0x14d/0x3e0
[  212.676084]  [<ffffffff813c6fd1>] tcp_transmit_skb+0x501/0x840
[  212.676084]  [<ffffffff813c8323>] tcp_write_xmit+0x1e3/0xb20
[  212.676084]  [<ffffffff81363237>] ? skb_page_frag_refill+0x87/0xd0
[  212.676084]  [<ffffffff813c8c8b>] tcp_push_one+0x2b/0x40
[  212.676084]  [<ffffffff813bb7e6>] tcp_sendmsg+0x926/0xc90
[  212.676084]  [<ffffffff813e1d21>] inet_sendmsg+0x61/0xc0
[  212.676084]  [<ffffffff8135e861>] sock_aio_write+0x101/0x120
[  212.676084]  [<ffffffff81107cf1>] ? vma_adjust+0x2e1/0x5d0
[  212.676084]  [<ffffffff812163e0>] ? timerqueue_add+0x60/0xb0
[  212.676084]  [<ffffffff81130b60>] do_sync_write+0x60/0x90
[  212.676084]  [<ffffffff81130d44>] ? rw_verify_area+0x54/0xf0
[  212.676084]  [<ffffffff81130f66>] vfs_write+0x186/0x190
[  212.676084]  [<ffffffff811317fd>] SyS_write+0x5d/0xa0
[  212.676084]  [<ffffffff814321e2>] system_call_fastpath+0x16/0x1b
[  212.676084] Code: ca 41 89 dc 41 29 cc 45 31 db 29 c2 41 89 c5 89 d0 45 29 c5 f7 d0 c1 e8 1f e9 43 ff ff ff 66 0f 1f 44 00 00 31 c0 e9 7b ff ff ff <0f> 0b eb fe 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 c7 47 40 00
[  212.676084] RIP  [<ffffffff8122e23f>] dql_completed+0x17f/0x190
------------[ cut here ]------------

When a skb has frags, bytes_compl plus skb->len nr_frags times in cp_tx().
It's not the correct value(actually, it should plus skb->len once) and it
will trigger the BUG_ON(bytes_compl > num_queued - dql->num_completed).
So only increase bytes_compl when finish sending all frags. pkts_compl also
has a wrong value, fix it too.

It's introduced by commit 871f0d4c ("8139cp: enable bql").

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

r8169: check ALDPS bit and disable it if enabled for the 8168g

[ Upstream commit 1bac1072425c86f1ac85bd5967910706677ef8b3 ]

Windows driver will enable ALDPS function, but linux driver and firmware
do not have any configuration related to ALDPS function for 8168g.
So restart system to linux and remove the NIC cable, LAN enter ALDPS,
then LAN RX will be disabled.

This issue can be easily reproduced on dual boot windows and linux
system with RTL_GIGA_MAC_VER_40 chip.

Realtek said, ALDPS function can be disabled by configuring to PHY,
switch to page 0x0A43, reg0x10 bit2=0.

Signed-off-by: David Chang <dchang@suse.com>
Acked-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

via-velocity: fix netif_receive_skb use in irq disabled section.

[ Upstream commit bc9627e7e918a85e906c1a3f6d01d9b8ef911a96 ]

2fdac010bdcf10a30711b6924612dfc40daf19b8 ("via-velocity.c: update napi
implementation") overlooked an irq disabling spinlock when the Rx part
of the NAPI poll handler was converted from netif_rx to netif_receive_skb.

NAPI Rx processing can be taken out of the locked section with a pair of
napi_{disable / enable} since it only races with the MTU change function.

An heavier rework of the NAPI locking would be able to perform NAPI Tx
before Rx where I simply removed one of velocity_tx_srv calls.

References: https://bugzilla.redhat.com/show_bug.cgi?id=1022733
Fixes: 2fdac010bdcf (via-velocity.c: update napi implementation)
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Tested-by: Alex A. Schmidt <aaschmidt1@gmail.com>
Cc: Jamie Heilman <jamie@audible.transient.net>
Cc: Michele Baldessari <michele@acksyn.org>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xen-netback: include definition of csum_ipv6_magic

[ Upstream commit ae5e8127b712313ec1b99356019ce9226fea8b88 ]

We are now using csum_ipv6_magic, include the appropriate header.
Avoids the following error:

drivers/net/xen-netback/netback.c:1313:4: error: implicit declaration of function 'csum_ipv6_magic' [-Werror=implicit-function-declaration]
tcph->check = ~csum_ipv6_magic(&ipv6h->saddr,

Signed-off-by: Andy Whitcroft <apw@canonical.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sch_tbf: handle too small burst

[ Upstream commit 4d0820cf6a55d72350cb2d24a4504f62fbde95d9 ]

If a too small burst is inadvertently set on TBF, we might trigger
a bug in tbf_segment(), as 'skb' instead of 'segs' was used in a
qdisc_reshape_fail() call.

tc qdisc add dev eth0 root handle 1: tbf latency 50ms burst 1KB rate
50mbit

Fix the bug, and add a warning, as such configuration is not
going to work anyway for non GSO packets.

(For some reason, one has to use a burst >= 1520 to get a working
configuration, even with old kernels. This is a probable iproute2/tc
bug)

Based on a report and initial patch from Yang Yingliang

Fixes: e43ac79a4bc6 ("sch_tbf: segment too big GSO packets")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

gro: Clean up tcpX_gro_receive checksum verification

[ Upstream commit b8ee93ba80b5a0b6c3c06b65c34dd1276f16c047 ]

This patch simplifies the checksum verification in tcpX_gro_receive
by reusing the CHECKSUM_COMPLETE code for CHECKSUM_NONE. All it
does for CHECKSUM_NONE is compute the partial checksum and then
treat it as if it came from the hardware (CHECKSUM_COMPLETE).

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cheers,
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

gro: Only verify TCP checksums for candidates

[ Upstream commit cc5c00bbb44c5d68b883aa5cb9d01514a2525d94 ]

In some cases we may receive IP packets that are longer than
their stated lengths. Such packets are never merged in GRO.
However, we may end up computing their checksums incorrectly
and end up allowing packets with a bogus checksum enter our
stack with the checksum status set as verified.

Since such packets are rare and not performance-critical, this
patch simply skips the checksum verification for them.

Reported-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Thanks,
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

gso: handle new frag_list of frags GRO packets

[ Upstream commit 9d8506cc2d7ea1f911c72c100193a3677f6668c3 ]

Recently GRO started generating packets with frag_lists of frags.
This was not handled by GSO, thus leading to a crash.

Thankfully these packets are of a regular form and are easy to
handle. This patch handles them in two ways. For completely
non-linear frag_list entries, we simply continue to iterate over
the frag_list frags once we exhaust the normal frags. For frag_list
entries with linear parts, we call pskb_trim on the first part
of the frag_list skb, and then process the rest of the frags in
the usual way.

This patch also kills a chunk of dead frag_list code that has
obviously never ever been run since it ends up generating a bogus
GSO-segmented packet with a frag_list entry.

Future work is planned to split super big packets into TSO
ones.

Fixes: 8a29111c7ca6 ("net: gro: allow to build full sized skb")
Reported-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Reported-by: Jerry Chu <hkchu@google.com>
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Tested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

af_packet: block BH in prb_shutdown_retire_blk_timer()

[ Upstream commit ec6f809ff6f19fafba3212f6aff0dda71dfac8e8 ]

Currently we're using plain spin_lock() in prb_shutdown_retire_blk_timer(),
however the timer might fire right in the middle and thus try to re-aquire
the same spinlock, leaving us in a endless loop.

To fix that, use the spin_lock_bh() to block it.

Fixes: f6fb8f100b80 ("af-packet: TPACKET_V3 flexible buffer implementation.")
CC: "David S. Miller" <davem@davemloft.net>
CC: Daniel Borkmann <dborkman@redhat.com>
CC: Willem de Bruijn <willemb@google.com>
CC: Phil Sutter <phil@nwl.cc>
CC: Eric Dumazet <edumazet@google.com>
Reported-by: Jan Stancek <jstancek@redhat.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

packet: fix use after free race in send path when dev is released

[ Upstream commit e40526cb20b5ee53419452e1f03d97092f144418 ]

Salam reported a use after free bug in PF_PACKET that occurs when
we're sending out frames on a socket bound device and suddenly the
net device is being unregistered. It appears that commit 827d9780
introduced a possible race condition between {t,}packet_snd() and
packet_notifier(). In the case of a bound socket, packet_notifier()
can drop the last reference to the net_device and {t,}packet_snd()
might end up suddenly sending a packet over a freed net_device.

To avoid reverting 827d9780 and thus introducing a performance
regression compared to the current state of things, we decided to
hold a cached RCU protected pointer to the net device and maintain
it on write side via bind spin_lock protected register_prot_hook()
and __unregister_prot_hook() calls.

In {t,}packet_snd() path, we access this pointer under rcu_read_lock
through packet_cached_dev_get() that holds reference to the device
to prevent it from being freed through packet_notifier() while
we're in send path. This is okay to do as dev_put()/dev_hold() are
per-cpu counters, so this should not be a performance issue. Also,
the code simplifies a bit as we don't need need_rls_dev anymore.

Fixes: 827d978037d7 ("af-packet: Use existing netdev reference for bound sockets.")
Reported-by: Salam Noureddine <noureddine@aristanetworks.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Salam Noureddine <noureddine@aristanetworks.com>
Cc: Ben Greear <greearb@candelatech.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bridge: flush br's address entry in fdb when remove the bridge dev

[ Upstream commit f873042093c0b418d2351fe142222b625c740149 ]

When the following commands are executed:

brctl addbr br0
ifconfig br0 hw ether <addr>
rmmod bridge

The calltrace will occur:

[  563.312114] device eth1 left promiscuous mode
[  563.312188] br0: port 1(eth1) entered disabled state
[  563.468190] kmem_cache_destroy bridge_fdb_cache: Slab cache still has objects
[  563.468197] CPU: 6 PID: 6982 Comm: rmmod Tainted: G           O 3.12.0-0.7-default+ #9
[  563.468199] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  563.468200]  0000000000000880 ffff88010f111e98 ffffffff814d1c92 ffff88010f111eb8
[  563.468204]  ffffffff81148efd ffff88010f111eb8 0000000000000000 ffff88010f111ec8
[  563.468206]  ffffffffa062a270 ffff88010f111ed8 ffffffffa063ac76 ffff88010f111f78
[  563.468209] Call Trace:
[  563.468218]  [<ffffffff814d1c92>] dump_stack+0x6a/0x78
[  563.468234]  [<ffffffff81148efd>] kmem_cache_destroy+0xfd/0x100
[  563.468242]  [<ffffffffa062a270>] br_fdb_fini+0x10/0x20 [bridge]
[  563.468247]  [<ffffffffa063ac76>] br_deinit+0x4e/0x50 [bridge]
[  563.468254]  [<ffffffff810c7dc9>] SyS_delete_module+0x199/0x2b0
[  563.468259]  [<ffffffff814e0922>] system_call_fastpath+0x16/0x1b
[  570.377958] Bridge firewalling registered

--------------------------- cut here -------------------------------

The reason is that when the bridge dev's address is changed, the
br_fdb_change_mac_address() will add new address in fdb, but when
the bridge was removed, the address entry in the fdb did not free,
the bridge_fdb_cache still has objects when destroy the cache, Fix
this by flushing the bridge address entry when removing the bridge.

v2: according to the Toshiaki Makita and Vlad's suggestion, I only
    delete the vlan0 entry, it still have a leak here if the vlan id
    is other number, so I need to call fdb_delete_by_port(br, NULL, 1)
    to flush all entries whose dst is NULL for the bridge.

Suggested-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Suggested-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: core: Always propagate flag changes to interfaces

[ Upstream commit d2615bf450694c1302d86b9cc8a8958edfe4c3a4 ]

The following commit:
    b6c40d68ff6498b7f63ddf97cf0aa818d748dee7
    net: only invoke dev->change_rx_flags when device is UP

tried to fix a problem with VLAN devices and promiscuouse flag setting.
The issue was that VLAN device was setting a flag on an interface that
was down, thus resulting in bad promiscuity count.
This commit blocked flag propagation to any device that is currently
down.

A later commit:
    deede2fabe24e00bd7e246eb81cd5767dc6fcfc7
    vlan: Don't propagate flag changes on down interfaces

fixed VLAN code to only propagate flags when the VLAN interface is up,
thus fixing the same issue as above, only localized to VLAN.

The problem we have now is that if we have create a complex stack
involving multiple software devices like bridges, bonds, and vlans,
then it is possible that the flags would not propagate properly to
the physical devices.  A simple examle of the scenario is the
following:

  eth0----> bond0 ----> bridge0 ---> vlan50

If bond0 or eth0 happen to be down at the time bond0 is added to
the bridge, then eth0 will never have promisc mode set which is
currently required for operation as part of the bridge.  As a
result, packets with vlan50 will be dropped by the interface.

The only 2 devices that implement the special flag handling are
VLAN and DSA and they both have required code to prevent incorrect
flag propagation.  As a result we can remove the generic solution
introduced in b6c40d68ff6498b7f63ddf97cf0aa818d748dee7 and leave
it to the individual devices to decide whether they will block
flag propagation or not.

Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Suggested-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ipv4: fix race in concurrent ip_route_input_slow()

[ Upstream commit dcdfdf56b4a6c9437fc37dbc9cee94a788f9b0c4 ]

CPUs can ask for local route via ip_route_input_noref() concurrently.
if nh_rth_input is not cached yet, CPUs will proceed to allocate
equivalent DSTs on 'lo' and then will try to cache them in nh_rth_input
via rt_cache_route()
Most of the time they succeed, but on occasion the following two lines:
orig = *p;
prev = cmpxchg(p, orig, rt);
in rt_cache_route() do race and one of the cpus fails to complete cmpxchg.
But ip_route_input_slow() doesn't check the return code of rt_cache_route(),
so dst is leaking. dst_destroy() is never called and 'lo' device
refcnt doesn't go to zero, which can be seen in the logs as:
unregister_netdevice: waiting for lo to become free. Usage count = 1
Adding mdelay() between above two lines makes it easily reproducible.
Fix it similar to nh_pcpu_rth_output case.

Fixes: d2d68ba9fe8b ("ipv4: Cache input routes in fib_info nexthops.")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tcp: don't update snd_nxt, when a socket is switched from repair mode

[ Upstream commit dbde497966804e63a38fdedc1e3815e77097efc2 ]

snd_nxt must be updated synchronously with sk_send_head.  Otherwise
tp->packets_out may be updated incorrectly, what may bring a kernel panic.

Here is a kernel panic from my host.
[  103.043194] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[  103.044025] IP: [<ffffffff815aaaaf>] tcp_rearm_rto+0xcf/0x150
...
[  146.301158] Call Trace:
[  146.301158]  [<ffffffff815ab7f0>] tcp_ack+0xcc0/0x12c0

Before this panic a tcp socket was restored. This socket had sent and
unsent data in the write queue. Sent data was restored in repair mode,
then the socket was switched from reapair mode and unsent data was
restored. After that the socket was switched back into repair mode.

In that moment we had a socket where write queue looks like this:
snd_una    snd_nxt   write_seq
   |_________|________|
             |
  sk_send_head

After a second switching from repair mode the state of socket was
changed:

snd_una          snd_nxt, write_seq
   |_________ ________|
             |
  sk_send_head

This state is inconsistent, because snd_nxt and sk_send_head are not
synchronized.

Bellow you can find a call trace, how packets_out can be incremented
twice for one skb, if snd_nxt and sk_send_head are not synchronized.
In this case packets_out will be always positive, even when
sk_write_queue is empty.

tcp_write_wakeup
skb = tcp_send_head(sk);
tcp_fragment
if (!before(tp->snd_nxt, TCP_SKB_CB(buff)->end_seq))
tcp_adjust_pcount(sk, skb, diff);
tcp_event_new_data_sent
tp->packets_out += tcp_skb_pcount(skb);

I think update of snd_nxt isn't required, when a socket is switched from
repair mode.  Because it's initialized in tcp_connect_init. Then when a
write queue is restored, snd_nxt is incremented in tcp_event_new_data_sent,
so it's always is in consistent state.

I have checked, that the bug is not reproduced with this patch and
all tests about restoring tcp connections work fine.

Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

atm: idt77252: fix dev refcnt leak

[ Upstream commit b5de4a22f157ca345cdb3575207bf46402414bc1 ]

init_card() calls dev_get_by_name() to get a network deceive. But it
doesn't decrease network device reference count after the device is
used.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xfrm: Release dst if this dst is improper for vti tunnel

[ Upstream commit 236c9f84868534c718b6889aa624de64763281f9 ]

After searching rt by the vti tunnel dst/src parameter,
if this rt has neither attached to any transformation
nor the transformation is not tunnel oriented, this rt
should be released back to ip layer.

otherwise causing dst memory leakage.

Signed-off-by: Fan Du <fan.du@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

netfilter: push reasm skb through instead of original frag skbs

[ Upstream commit 6aafeef03b9d9ecf255f3a80ed85ee070260e1ae ]

Pushing original fragments through causes several problems. For example
for matching, frags may not be matched correctly. Take following
example:

<example>
On HOSTA do:
ip6tables -I INPUT -p icmpv6 -j DROP
ip6tables -I INPUT -p icmpv6 -m icmp6 --icmpv6-type 128 -j ACCEPT

and on HOSTB you do:
ping6 HOSTA -s2000 (MTU is 1500)

Incoming echo requests will be filtered out on HOSTA. This issue does
not occur with smaller packets than MTU (where fragmentation does not happen)
</example>

As was discussed previously, the only correct solution seems to be to use
reassembled skb instead of separete frags. Doing this has positive side
effects in reducing sk_buff by one pointer (nfct_reasm) and also the reams
dances in ipvs and conntrack can be removed.

Future plan is to remove net/ipv6/netfilter/nf_conntrack_reasm.c
entirely and use code in net/ipv6/reassembly.c instead.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ip6_output: fragment outgoing reassembled skb properly

[ Upstream commit 9037c3579a277f3a23ba476664629fda8c35f7c4 ]

If reassembled packet would fit into outdev MTU, it is not fragmented
according the original frag size and it is send as single big packet.

The second case is if skb is gso. In that case fragmentation does not happen
according to the original frag size.

This patch fixes these.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ipv6: Fix inet6_init() cleanup order

Commit 6d0bfe22611602f36617bc7aa2ffa1bbb2f54c67
net: ipv6: Add IPv6 support to the ping socket

introduced a change in the cleanup logic of inet6_init and
has a bug in that ipv6_packet_cleanup() may not be called.
Fix the cleanup ordering.

CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
CC: Lorenzo Colitti <lorenzo@google.com>
CC: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ipv6: fix leaking uninitialized port number of offender sockaddr

[ Upstream commit 1fa4c710b6fe7b0aac9907240291b6fe6aafc3b8 ]

Offenders don't have port numbers, so set it to 0.

Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: clamp ->msg_namelen instead of returning an error

[ Upstream commit db31c55a6fb245fdbb752a2ca4aefec89afabb06 ]

If kmsg->msg_namelen > sizeof(struct sockaddr_storage) then in the
original code that would lead to memory corruption in the kernel if you
had audit configured. If you didn't have audit configured it was
harmless.

There are some programs such as beta versions of Ruby which use too
large of a buffer and returning an error code breaks them. We should
clamp the ->msg_namelen value instead.

Fixes: 1661bf364ae9 ("net: heap overflow in __audit_sockaddr()")
Reported-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Tested-by: Eric Wong <normalperson@yhbt.net>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

inet: fix addr_len/msg->msg_namelen assignment in recv_error and rxpmtu functions

[ Upstream commit 85fbaa75037d0b6b786ff18658ddf0b4014ce2a4 ]

Commit bceaa90240b6019ed73b49965eac7d167610be69 ("inet: prevent leakage
of uninitialized memory to user in recv syscalls") conditionally updated
addr_len if the msg_name is written to. The recv_error and rxpmtu
functions relied on the recvmsg functions to set up addr_len before.

As this does not happen any more we have to pass addr_len to those
functions as well and set it to the size of the corresponding sockaddr
length.

This broke traceroute and such.

Fixes: bceaa90240b6 ("inet: prevent leakage of uninitialized memory to user in recv syscalls")
Reported-by: Brad Spengler <spender@grsecurity.net>
Reported-by: Tom Labanowski
Cc: mpb <mpb.mail@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: add BUG_ON if kernel advertises msg_namelen > sizeof(struct sockaddr_storage)

[ Upstream commit 68c6beb373955da0886d8f4f5995b3922ceda4be ]

In that case it is probable that kernel code overwrote part of the
stack. So we should bail out loudly here.

The BUG_ON may be removed in future if we are sure all protocols are
conformant.

Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: rework recvmsg handler msg_name and msg_namelen logic

[ Upstream commit f3d3342602f8bcbf37d7c46641cb9bca7618eb1c ]

This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
set msg_namelen to the proper size <= sizeof(struct sockaddr_storage)
to return msg_name to the user.

This prevents numerous uninitialized memory leaks we had in the
recvmsg handlers and makes it harder for new code to accidentally leak
uninitialized memory.

Optimize for the case recvfrom is called with NULL as address. We don't
need to copy the address at all, so set it to NULL before invoking the
recvmsg handler. We can do so, because all the recvmsg handlers must
cope with the case a plain read() is called on them. read() also sets
msg_name to NULL.

Also document these changes in include/linux/net.h as suggested by David
Miller.

Changes since RFC:

Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
affect sendto as it would bail out earlier while trying to copy-in the
address. It also more naturally reflects the logic by the callers of
verify_iovec.

With this change in place I could remove "
if (!uaddr || msg_sys->msg_namelen == 0)
msg->msg_name = NULL
".

This change does not alter the user visible error logic as we ignore
msg_namelen as long as msg_name is NULL.

Also remove two unnecessary curly brackets in ___sys_recvmsg and change
comments to netdev style.

Cc: David Miller <davem@davemloft.net>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ping: prevent NULL pointer dereference on write to msg_name

[ Upstream commit cf970c002d270c36202bd5b9c2804d3097a52da0 ]

A plain read() on a socket does set msg->msg_name to NULL. So check for
NULL pointer first.

Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

inet: prevent leakage of uninitialized memory to user in recv syscalls

[ Upstream commit bceaa90240b6019ed73b49965eac7d167610be69 ]

Only update *addr_len when we actually fill in sockaddr, otherwise we
can return uninitialized memory from the stack to the caller in the
recvfrom, recvmmsg and recvmsg syscalls. Drop the the (addr_len == NULL)
checks because we only get called with a valid addr_len pointer either
from sock_common_recvmsg or inet_recvmsg.

If a blocking read waits on a socket which is concurrently shut down we
now return zero and set msg_msgnamelen to 0.

Reported-by: mpb <mpb.mail@gmail.com>
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

pkt_sched: fq: fix pacing for small frames

[ Upstream commit f52ed89971adbe79b6438c459814034707b8ab91 ]

For performance reasons, sch_fq tried hard to not setup timers for every
sent packet, using a quantum based heuristic : A delay is setup only if
the flow exhausted its credit.

Problem is that application limited flows can refill their credit
for every queued packet, and they can evade pacing.

This problem can also be triggered when TCP flows use small MSS values,
as TSO auto sizing builds packets that are smaller than the default fq
quantum (3028 bytes)

This patch adds a 40 ms delay to guard flow credit refill.

Fixes: afe4fd062416 ("pkt_sched: fq: Fair Queue packet scheduler")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

pkt_sched: fq: warn users using defrate

[ Upstream commit 65c5189a2b57b9aa1d89e4b79da39928257c9505 ]

Commit 7eec4174ff29 ("pkt_sched: fq: fix non TCP flows pacing")
obsoleted TCA_FQ_FLOW_DEFAULT_RATE without notice for the users.

Suggested by David Miller

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ipv4: fix possible seqlock deadlock

[ Upstream commit c9e9042994d37cbc1ee538c500e9da1bb9d1bcdf ]

ip4_datagram_connect() being called from process context,
it should use IP_INC_STATS() instead of IP_INC_STATS_BH()
otherwise we can deadlock on 32bit arches, or get corruptions of
SNMP counters.

Fixes: 584bdf8cbdf6 ("[IPV4]: Fix "ipOutNoRoutes" counter error for TCP and UDP")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

connector: improved unaligned access error fix

[ Upstream commit 1ca1a4cf59ea343a1a70084fe7cc96f37f3cf5b1 ]

In af3e095a1fb4, Erik Jacobsen fixed one type of unaligned access
bug for ia64 by converting a 64-bit write to use put_unaligned().
Unfortunately, since gcc will convert a short memset() to a series
of appropriately-aligned stores, the problem is now visible again
on tilegx, where the memset that zeros out proc_event is converted
to three 64-bit stores, causing an unaligned access panic.

A better fix for the original problem is to ensure that proc_event
is aligned to 8 bytes here. We can do that relatively easily by
arranging to start the struct cn_msg aligned to 8 bytes and then
offset by 4 bytes. Doing so means that the immediately following
proc_event structure is then correctly aligned to 8 bytes.

The result is that the memset() stores are now aligned, and as an
added benefit, we can remove the put_unaligned() calls in the code.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

pkt_sched: fq: change classification of control packets

[ Upstream commit 2abc2f070eb30ac8421554a5c32229f8332c6206 ]

Initial sch_fq implementation copied code from pfifo_fast to classify
a packet as a high prio packet.

This clashes with setups using PRIO with say 7 bands, as one of the
band could be incorrectly (mis)classified by FQ.

Packets would be queued in the 'internal' queue, and no pacing ever
happen for this special queue.

Fixes: afe4fd062416 ("pkt_sched: fq: Fair Queue packet scheduler")
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ip6tnl: fix use after free of fb_tnl_dev

[ Upstream commit 1e9f3d6f1c403dd2b6270f654b4747147aa2306f ]

Bug has been introduced by commit bb8140947a24 ("ip6tnl: allow to use rtnl ops
on fb tunnel").

When ip6_tunnel.ko is unloaded, FB device is delete by rtnl_link_unregister()
and then we try to use the pointer in ip6_tnl_destroy_tunnels().

Let's add an handler for dellink, which will never remove the FB tunnel. With
this patch it will no more be possible to remove it via 'ip link del ip6tnl0',
but it's safer.

The same fix was already proposed by Willem de Bruijn <willemb@google.com> for
sit interfaces.

CC: Willem de Bruijn <willemb@google.com>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

isdnloop: use strlcpy() instead of strcpy()

[ Upstream commit f9a23c84486ed350cce7bb1b2828abd1f6658796 ]

These strings come from a copy_from_user() and there is no way to be
sure they are NUL terminated.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sit: fix use after free of fb_tunnel_dev

[ Upstream commit 9434266f2c645d4fcf62a03a8e36ad8075e37943 ]

Bug: The fallback device is created in sit_init_net and assumed to be
freed in sit_exit_net. First, it is dereferenced in that function, in
sit_destroy_tunnels:

struct net *net = dev_net(sitn->fb_tunnel_dev);

Prior to this, rtnl_unlink_register has removed all devices that match
rtnl_link_ops == sit_link_ops.

Commit 205983c43700 added the line

+ sitn->fb_tunnel_dev->rtnl_link_ops = &sit_link_ops;

which cases the fallback device to match here and be freed before it
is last dereferenced.

Fix: This commit adds an explicit .delllink callback to sit_link_ops
that skips deallocation at rtnl_unlink_register for the fallback
device. This mechanism is comparable to the one in ip_tunnel.

It also modifies sit_destroy_tunnels and its only caller sit_exit_net
to avoid the offending dereference in the first place. That double
lookup is more complicated than required.

Test: The bug is only triggered when CONFIG_NET_NS is enabled. It
causes a GPF only when CONFIG_DEBUG_SLAB is enabled. Verified that
this bug exists at the mentioned commit, at davem-net HEAD and at
3.11.y HEAD. Verified that it went away after applying this patch.

Fixes: 205983c43700 ("sit: allow to use rtnl ops on fb tunnel")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net-tcp: fix panic in tcp_fastopen_cache_set()

[ Upstream commit dccf76ca6b626c0c4a4e09bb221adee3270ab0ef ]

We had some reports of crashes using TCP fastopen, and Dave Jones
gave a nice stack trace pointing to the error.

Issue is that tcp_get_metrics() should not be called with a NULL dst

Fixes: 1fe4c481ba637 ("net-tcp: Fast Open client - cookie cache")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dave Jones <davej@redhat.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Tested-by: Dave Jones <davej@fedoraproject.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bonding: fix two race conditions in bond_store_updelay/downdelay

[ Upstream commit b869ccfab1e324507fa3596e3e1308444fb68227 ]

This patch fixes two race conditions between bond_store_updelay/downdelay
and bond_store_miimon which could lead to division by zero as miimon can
be set to 0 while either updelay/downdelay are being set and thus miss the
zero check in the beginning, the zero div happens because updelay/downdelay
are stored as new_value / bond->params.miimon. Use rtnl to synchronize with
miimon setting.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tcp: tsq: restore minimal amount of queueing

[ Upstream commit 98e09386c0ef4dfd48af7ba60ff908f0d525cdee ]

After commit c9eeec26e32e ("tcp: TSQ can use a dynamic limit"), several
users reported throughput regressions, notably on mvneta and wifi
adapters.

802.11 AMPDU requires a fair amount of queueing to be effective.

This patch partially reverts the change done in tcp_write_xmit()
so that the minimal amount is sysctl_tcp_limit_output_bytes.

It also remove the use of this sysctl while building skb stored
in write queue, as TSO autosizing does the right thing anyway.

Users with well behaving NICS and correct qdisc (like sch_fq),
can then lower the default sysctl_tcp_limit_output_bytes value from
128KB to 8KB.

This new usage of sysctl_tcp_limit_output_bytes permits each driver
authors to check how their driver performs when/if the value is set
to a minimum of 4KB.

Normally, line rate for a single TCP flow should be possible,
but some drivers rely on timers to perform TX completion and
too long TX completion delays prevent reaching full throughput.

Fixes: c9eeec26e32e ("tcp: TSQ can use a dynamic limit")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Sujith Manoharan <sujith@msujith.org>
Reported-by: Arnaud Ebalard <arno@natisbad.org>
Tested-by: Sujith Manoharan <sujith@msujith.org>
Cc: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

macvtap: limit head length of skb allocated

[ Upstream commit 16a3fa28630331e28208872fa5341ce210b901c7 ]

We currently use hdr_len as a hint of head length which is advertised by
guest. But when guest advertise a very big value, it can lead to an 64K+
allocating of kmalloc() which has a very high possibility of failure when host
memory is fragmented or under heavy stress. The huge hdr_len also reduce the
effect of zerocopy or even disable if a gso skb is linearized in guest.

To solves those issues, this patch introduces an upper limit (PAGE_SIZE) of the
head, which guarantees an order 0 allocation each time.

Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tuntap: limit head length of skb allocated

[ Upstream commit 96f8d9ecf227638c89f98ccdcdd50b569891976c ]

We currently use hdr_len as a hint of head length which is advertised by
guest. But when guest advertise a very big value, it can lead to an 64K+
allocating of kmalloc() which has a very high possibility of failure when host
memory is fragmented or under heavy stress. The huge hdr_len also reduce the
effect of zerocopy or even disable if a gso skb is linearized in guest.

To solves those issues, this patch introduces an upper limit (PAGE_SIZE) of the
head, which guarantees an order 0 allocation each time.

Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6lowpan: Uncompression of traffic class field was incorrect

[ Upstream commit 1188f05497e7bd2f2614b99c54adfbe7413d5749 ]

If priority/traffic class field in IPv6 header is set (seen when
using ssh), the uncompression sets the TC and Flow fields incorrectly.

Example:

This is IPv6 header of a sent packet. Note the priority/TC (=1) in
the first byte.

00000000: 61 00 00 00 00 2c 06 40 fe 80 00 00 00 00 00 00
00000010: 02 02 72 ff fe c6 42 10 fe 80 00 00 00 00 00 00
00000020: 02 1e ab ff fe 4c 52 57

This gets compressed like this in the sending side

00000000: 72 31 04 06 02 1e ab ff fe 4c 52 57 ec c2 00 16
00000010: aa 2d fe 92 86 4e be c6 ....

In the receiving end, the packet gets uncompressed to this
IPv6 header

00000000: 60 06 06 02 00 2a 1e 40 fe 80 00 00 00 00 00 00
00000010: 02 02 72 ff fe c6 42 10 fe 80 00 00 00 00 00 00
00000020: ab ff fe 4c 52 57 ec c2

First four bytes are set incorrectly and we have also lost
two bytes from destination address.

The fix is to switch the case values in switch statement
when checking the TC field.

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

core/dev: do not ignore dmac in dev_forward_skb()

[ Upstream commit 81b9eab5ebbf0d5d54da4fc168cfb02c2adc76b8 ]

commit 06a23fe31ca3
("core/dev: set pkt_type after eth_type_trans() in dev_forward_skb()")
and refactoring 64261f230a91
("dev: move skb_scrub_packet() after eth_type_trans()")

are forcing pkt_type to be PACKET_HOST when skb traverses veth.

which means that ip forwarding will kick in inside netns
even if skb->eth->h_dest != dev->dev_addr

Fix order of eth_type_trans() and skb_scrub_packet() in dev_forward_skb()
and in ip_tunnel_rcv()

Fixes: 06a23fe31ca3 ("core/dev: set pkt_type after eth_type_trans() in dev_forward_skb()")
CC: Isaku Yamahata <yamahatanetdev@gmail.com>
CC: Maciej Zenczykowski <zenczykowski@gmail.com>
CC: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usbnet: fix status interrupt urb handling

[ Upstream commit 52f48d0d9aaa621ffa5e08d79da99a3f8c93b848 ]

Since commit 7b0c5f21f348a66de495868b8df0284e8dfd6bbf
"sierra_net: keep status interrupt URB active", sierra_net triggers
status interrupt polling before the net_device is opened (in order to
properly receive the sync message response).

To be able to receive further interrupts, the interrupt urb needs to be
re-submitted, so this patch removes the bogus check for netif_running().

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Tested-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bonding: don't permit to use ARP monitoring in 802.3ad mode

[ Upstream commit ec9f1d15db8185f63a2c3143dc1e90ba18541b08 ]

Currently the ARP monitoring is not supported with 802.3ad, and it's
prohibited to use it via the module params.

However we still can set it afterwards via sysfs, cause we only check for
*LB modes there.

To fix this - add a check for 802.3ad mode in bonding_store_arp_interval.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

random32: fix off-by-one in seeding requirement

[ Upstream commit 51c37a70aaa3f95773af560e6db3073520513912 ]

For properly initialising the Tausworthe generator [1], we have
a strict seeding requirement, that is, s1 > 1, s2 > 7, s3 > 15.

Commit 697f8d0348 ("random32: seeding improvement") introduced
a __seed() function that imposes boundary checks proposed by the
errata paper [2] to properly ensure above conditions.

However, we're off by one, as the function is implemented as:
"return (x < m) ? x + m : x;", and called with __seed(X, 1),
__seed(X, 7), __seed(X, 15). Thus, an unwanted seed of 1, 7, 15
would be possible, whereas the lower boundary should actually
be of at least 2, 8, 16, just as GSL does. Fix this, as otherwise
an initialization with an unwanted seed could have the effect
that Tausworthe's PRNG properties cannot not be ensured.

Note that this PRNG is *not* used for cryptography in the kernel.

[1] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme.ps
[2] http://www.iro.umontreal.ca/~lecuyer/myftp/papers/tausme2.ps

Joint work with Hannes Frederic Sowa.

Fixes: 697f8d0348a6 ("random32: seeding improvement")
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ipv6: protect for_each_sk_fl_rcu in mem_check with rcu_read_lock_bh

[ Upstream commit f8c31c8f80dd882f7eb49276989a4078d33d67a7 ]

Fixes a suspicious rcu derference warning.

Cc: Florent Fourcot <florent.fourcot@enst-bretagne.fr>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ipv6: use rt6_get_dflt_router to get default router in rt6_route_rcv

[ Upstream commit f104a567e673f382b09542a8dc3500aa689957b4 ]

As the rfc 4191 said, the Router Preference and Lifetime values in a
::/0 Route Information Option should override the preference and lifetime
values in the Router Advertisement header. But when the kernel deals with
a ::/0 Route Information Option, the rt6_get_route_info() always return
NULL, that means that overriding will not happen, because those default
routers were added without flag RTF_ROUTEINFO in rt6_add_dflt_router().

In order to deal with that condition, we should call rt6_get_dflt_router
when the prefix length is 0.

Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: Fix "ip rule delete table 256"

[ Upstream commit 13eb2ab2d33c57ebddc57437a7d341995fc9138c ]

When trying to delete a table >= 256 using iproute2 the local table
will be deleted.
The table id is specified as a netlink attribute when it needs more then
8 bits and iproute2 then sets the table field to RT_TABLE_UNSPEC (0).
Preconditions to matching the table id in the rule delete code
doesn't seem to take the "table id in netlink attribute" into condition
so the frh_get_table helper function never gets to do its job when
matching against current rule.
Use the helper function twice instead of peaking at the table value directly.

Originally reported at: http://bugs.debian.org/724783

Reported-by: Nicolas HICHER <nhicher@avencall.com>
Signed-off-by: Andreas Henriksson <andreas@fatal.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net/mlx4_en: Fixed crash when port type is changed

[ Upstream commit 1ec4864b10171b0691ee196d7006ae56d2c153f2 ]

timecounter_init() was was called only after first potential
timecounter_read().
Moved mlx4_en_init_timestamp() before mlx4_en_init_netdev()

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: x86: bpf: don't forget to free sk_filter (v2)

[ Upstream commit 98bbc06aabac5a2dcc46580d20c59baf8ebe479f ]

sk_filter isn't freed if bpf_func is equal to sk_run_filter.

This memory leak was introduced by v3.12-rc3-224-gd45ed4a4
"net: fix unsafe set_memory_rw from softirq".

Before this patch sk_filter was freed in sk_filter_release_rcu,
now it should be freed in bpf_jit_free.

Here is output of kmemleak:
unreferenced object 0xffff8800b774eab0 (size 128):
  comm "systemd", pid 1, jiffies 4294669014 (age 124.062s)
  hex dump (first 32 bytes):
    00 00 00 00 0b 00 00 00 20 63 7f b7 00 88 ff ff  ........ c......
    60 d4 55 81 ff ff ff ff 30 d9 55 81 ff ff ff ff  `.U.....0.U.....
  backtrace:
    [<ffffffff816444be>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff811845af>] __kmalloc+0xef/0x260
    [<ffffffff81534028>] sock_kmalloc+0x38/0x60
    [<ffffffff8155d4dd>] sk_attach_filter+0x5d/0x190
    [<ffffffff815378a1>] sock_setsockopt+0x991/0x9e0
    [<ffffffff81531bd6>] SyS_setsockopt+0xb6/0xd0
    [<ffffffff8165f3e9>] system_call_fastpath+0x16/0x1b
    [<ffffffffffffffff>] 0xffffffffffffffff

v2: add extra { } after else

Fixes: d45ed4a4e33a ("net: fix unsafe set_memory_rw from softirq")
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bonding: RCUify bond_set_rx_mode()

[ Upstream commit b32418705107265dfca5edfe2b547643e53a732e ]

Currently we rely on rtnl locking in bond_set_rx_mode(), however it's not
always the case:

RTNL: assertion failed at drivers/net/bonding/bond_main.c (3391)
...
[<ffffffff81651ca5>] dump_stack+0x54/0x74
[<ffffffffa029e717>] bond_set_rx_mode+0xc7/0xd0 [bonding]
[<ffffffff81553af7>] __dev_set_rx_mode+0x57/0xa0
[<ffffffff81557ff8>] __dev_mc_add+0x58/0x70
[<ffffffff81558020>] dev_mc_add+0x10/0x20
[<ffffffff8161e26e>] igmp6_group_added+0x18e/0x1d0
[<ffffffff81186f76>] ? kmem_cache_alloc_trace+0x236/0x260
[<ffffffff8161f80f>] ipv6_dev_mc_inc+0x29f/0x320
[<ffffffff8161f9e7>] ipv6_sock_mc_join+0x157/0x260
...

Fix this by using RCU primitives.

Reported-by: Joe Lawrence <joe.lawrence@stratus.com>
Tested-by: Joe Lawrence <joe.lawrence@stratus.com>
CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ipv6: fix headroom calculation in udp6_ufo_fragment

[ Upstream commit 0e033e04c2678dbbe74a46b23fffb7bb918c288e ]

Commit 1e2bd517c108816220f262d7954b697af03b5f9c ("udp6: Fix udp
fragmentation for tunnel traffic.") changed the calculation if
there is enough space to include a fragment header in the skb from a
skb->mac_header dervived one to skb_headroom. Because we already peeled
off the skb to transport_header this is wrong. Change this back to check
if we have enough room before the mac_header.

This fixes a panic Saran Neti reported. He used the tbf scheduler which
skb_gso_segments the skb. The offsets get negative and we panic in memcpy
because the skb was erroneously not expanded at the head.

Reported-by: Saran Neti <Saran.Neti@telus.com>
Cc: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: mv643xx_eth: potential NULL dereference in probe()

upstream commit 6115c11fe1a5a636ac99fc823b00df4ff3c0674e

We assume that "mp->phy" can be NULL a couple lines before the
dereference.

Fixes: 1cce16d37d0f ('net: mv643xx_eth: Add missing phy_addr_set in DT mode')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Acked-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net: mv643xx_eth: Add missing phy_addr_set in DT mode

Commit cc9d4598 'net: mv643xx_eth: use of_phy_connect if phy_node
present' made the call to phy_scan optional, if the DT has a link to
the phy node.

However phy_scan has the side effect of calling phy_addr_set, which
writes the phy MDIO address to the ethernet controller. If phy_addr_set
is not called, and the bootloader has not set the correct address then
the driver will fail to function.

Tested on Kirkwood.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Tested-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Linux 3.12.3

HID: apple: option to swap the 'Option' ("Alt") and 'Command' ("Flag") keys.

commit 43c831468b3d26dbe8f2e061ccaf1abaf9cc1b8b upstream.

Use case: people who use both Apple and PC keyboards regularly, and desire to
keep&use their PC muscle memory.

A particular use case: an Apple compact external keyboard connected to a PC
laptop. (This use case can't be covered well by X.org key remappings etc.)

Signed-off-by: Nanno Langstraat <langstr@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

HID: enable Mayflash USB Gamecube Adapter

commit e17f5d7667c5414b8f12a93ef14aae0824bd2beb upstream.

This is a patch that adds the new Mayflash Gamecube Controller to USB adapter
(ID 1a34:f705 ACRUX) to the ACRUX driver (drivers/hid/hid-axff.c) with full
force feedback support.

Signed-off-by: Tristan Rice <rice@outerearth.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

HID: add support for LEETGION Hellion Gaming Mouse

commit f1a4914bd04911fbeaee23445112adae8977144a upstream.

Added id, bindings and comments for Holtek USB ID 04d9:a072
LEETGION Hellion Gaming mouse to use the same corrections of the report
descriptor as Holtek 04d9:a067. As the mouse exceed HID_MAX_USAGES at the
same offsets in the reported descriptor.
Tested on the hardware.

Signed-off-by: Anders F. U. Kiær <ablacksheep@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

HID: roccat: add missing special driver declarations

commit e078809df5611600965f4d3420c3256260fc3e3d upstream.

Forgot two special driver declarations and sorted the list.

Signed-off-by: Stefan Achatz <erazor_de@users.sourceforge.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

HID: roccat: fix Coverity CID 141438

commit 7be63f20b00840a6f1c718dcee00855688d64acd upstream.

Add missing switch breaks.

Signed-off-by: Stefan Achatz <erazor_de@users.sourceforge.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

HID: roccat: add new device return value

commit 14fc4290df2fb94a28f39dab9ed32feaa5527bef upstream.

Ryos uses a new return value for critical errors, others have been
confirmed.

Signed-off-by: Stefan Achatz <erazor_de@users.sourceforge.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

X.509: Remove certificate date checks

commit 124df926090b32a998483f6e43ebeccdbe5b5302 upstream.

Remove the certificate date checks that are performed when a certificate is
parsed. There are two checks: a valid from and a valid to. The first check is
causing a lot of problems with system clocks that don't keep good time and the
second places an implicit expiry date upon the kernel when used for module
signing, so do we really need them?

Signed-off-by: David Howells <dhowells@redhat.com>
cc: David Woodhouse <dwmw2@infradead.org>
cc: Rusty Russell <rusty@rustcorp.com.au>
cc: Josh Boyer <jwboyer@redhat.com>
cc: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: s5h1420: Don't use dynamic static allocation

commit 9736a89dafe07359d9c86bf9c3b815a250b354bc upstream.

Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/dvb-frontends/s5h1420.c:851:1: warning: 's5h1420_tuner_i2c_tuner_xfer' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer.
In the specific case of this frontend, only ttpci uses it. The maximum
number of messages there is two, on I2C read operations. As the logic
can add an extra operation, change the size to 3.

Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Reviewed-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: dvb-frontends: Don't use dynamic static allocation

commit 8393796dfa4cf5dffcceec464c7789bec3a2f471 upstream.

Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/dvb-frontends/bcm3510.c:230:1: warning: 'bcm3510_do_hab_cmd' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/itd1000.c:69:1: warning: 'itd1000_write_regs.constprop.0' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/mt312.c:126:1: warning: 'mt312_write' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/nxt200x.c:111:1: warning: 'nxt200x_writebytes' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/stb6100.c:216:1: warning: 'stb6100_write_reg_range.constprop.3' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/stv6110.c:98:1: warning: 'stv6110_write_regs' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/stv6110x.c:85:1: warning: 'stv6110x_write_regs' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/tda18271c2dd.c:147:1: warning: 'WriteRegs' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/zl10039.c:119:1: warning: 'zl10039_write' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer. Considering that I2C
transfers are generally limited, and that devices used on USB has a
max data length of 64 bytes for the control URBs.
So, it seem safe to use 64 bytes as the hard limit for all those devices.
On most cases, the limit is a way lower than that, but this limit
is small enough to not affect the Kernel stack, and it is a no brain
limit, as using smaller ones would require to either carefully each
driver or to take a look on each datasheet.

Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Reviewed-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: dvb-frontends: Don't use dynamic static allocation

commit 37ebaf6891ee81687bb558e8375c0712d8264ed8 upstream.

Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/dvb-frontends/af9013.c:77:1: warning: 'af9013_wr_regs_i2c' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/af9033.c:188:1: warning: 'af9033_wr_reg_val_tab' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/af9033.c:68:1: warning: 'af9033_wr_regs' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/bcm3510.c:230:1: warning: 'bcm3510_do_hab_cmd' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/cxd2820r_core.c:84:1: warning: 'cxd2820r_rd_regs_i2c.isra.1' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/rtl2830.c:56:1: warning: 'rtl2830_wr' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/rtl2832.c:187:1: warning: 'rtl2832_wr' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/tda10071.c:52:1: warning: 'tda10071_wr_regs' uses dynamic stack allocation [enabled by default]
drivers/media/dvb-frontends/tda10071.c:84:1: warning: 'tda10071_rd_regs' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer. Considering that I2C
transfers are generally limited, and that devices used on USB has a
max data length of 64 bytes for the control URBs.
So, it seem safe to use 64 bytes as the hard limit for all those devices.
On most cases, the limit is a way lower than that, but this limit
is small enough to not affect the Kernel stack, and it is a no brain
limit, as using smaller ones would require to either carefully each
driver or to take a look on each datasheet.

Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Reviewed-by: Hans Verkuil <hans.verkuil@cisco.com>
Reviewed-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: stb0899_drv: Don't use dynamic static allocation

commit ba4746423488aafa435739c32bfe0758f3dd5d77 upstream.

Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/dvb-frontends/stb0899_drv.c:540:1: warning: 'stb0899_write_regs' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer. Considering that I2C
transfers are generally limited, and that devices used on USB has a
max data length of 64 bytes for the control URBs.
So, it seem safe to use 64 bytes as the hard limit for all those devices.
On most cases, the limit is a way lower than that, but this limit
is small enough to not affect the Kernel stack, and it is a no brain
limit, as using smaller ones would require to either carefully each
driver or to take a look on each datasheet.

Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Reviewed-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: stv0367: Don't use dynamic static allocation

commit 9aca4fb0571ce9cfef680ceb08d19dd008015307 upstream.

Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/dvb-frontends/stv0367.c:791:1: warning: 'stv0367_writeregs.constprop.4' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer. Considering that I2C
transfers are generally limited, and that devices used on USB has a
max data length of 64 bytes for the control URBs.
So, it seem safe to use 64 bytes as the hard limit for all those devices.
On most cases, the limit is a way lower than that, but this limit
is small enough to not affect the Kernel stack, and it is a no brain
limit, as using smaller ones would require to either carefully each
driver or to take a look on each datasheet.

Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Reviewed-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: stv090x: Don't use dynamic static allocation

commit f7a35df15b1f7de7823946aebc9164854e66ea07 upstream.

Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/dvb-frontends/stv090x.c:750:1: warning: 'stv090x_write_regs.constprop.6' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer. Considering that I2C
transfers are generally limited, and that devices used on USB has a
max data length of 64 bytes for the control URBs.
So, it seem safe to use 64 bytes as the hard limit for all those devices.
On most cases, the limit is a way lower than that, but this limit
is small enough to not affect the Kernel stack, and it is a no brain
limit, as using smaller ones would require to either carefully each
driver or to take a look on each datasheet.

Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Reviewed-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: tuners: Don't use dynamic static allocation

commit f1baab870f6e93b668af7b34d6f6ba49f1b0e982 upstream.

Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/tuners/e4000.c:50:1: warning: 'e4000_wr_regs' uses dynamic stack allocation [enabled by default]
drivers/media/tuners/e4000.c:83:1: warning: 'e4000_rd_regs' uses dynamic stack allocation [enabled by default]
drivers/media/tuners/fc2580.c:66:1: warning: 'fc2580_wr_regs.constprop.1' uses dynamic stack allocation [enabled by default]
drivers/media/tuners/fc2580.c:98:1: warning: 'fc2580_rd_regs.constprop.0' uses dynamic stack allocation [enabled by default]
drivers/media/tuners/tda18212.c:57:1: warning: 'tda18212_wr_regs' uses dynamic stack allocation [enabled by default]
drivers/media/tuners/tda18212.c:90:1: warning: 'tda18212_rd_regs.constprop.0' uses dynamic stack allocation [enabled by default]
drivers/media/tuners/tda18218.c:60:1: warning: 'tda18218_wr_regs' uses dynamic stack allocation [enabled by default]
drivers/media/tuners/tda18218.c:92:1: warning: 'tda18218_rd_regs.constprop.0' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer. Considering that I2C
transfers are generally limited, and that devices used on USB has a
max data length of 64 bytes for the control URBs.
So, it seem safe to use 64 bytes as the hard limit for all those devices.
On most cases, the limit is a way lower than that, but this limit
is small enough to not affect the Kernel stack, and it is a no brain
limit, as using smaller ones would require to either carefully each
driver or to take a look on each datasheet.

Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Reviewed-by: Hans Verkuil <hans.verkuil@cisco.com>
Reviewed-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

media: tuner-xc2028: Don't use dynamic static allocation

commit 56ac033725ec93a45170caf3979eb2b1211a59a8 upstream.

Dynamic static allocation is evil, as Kernel stack is too low, and
compilation complains about it on some archs:
drivers/media/tuners/tuner-xc2028.c:651:1: warning: 'load_firmware' uses dynamic stack allocation [enabled by default]
Instead, let's enforce a limit for the buffer.
In the specific case of this driver, the maximum limit is 80, used only
on tm6000 driver. This limit is due to the size of the USB control URBs.
Ok, it would be theoretically possible to use a bigger size on PCI
devices, but the firmware load time is already good enough. Anyway,
if some usage requires more, it is just a matter of also increasing
the buffer size at load_firmware().

Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Reviewed-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>