summaryrefslogtreecommitdiffstats
path: root/arch/sparc (unfollow)
Commit message (Collapse)AuthorFilesLines
2014-06-16Linux 3.16-rc1v3.16-rc1Linus Torvalds1-2/+2
2014-06-15net: sctp: fix permissions for rto_alpha and rto_beta knobsDaniel Borkmann1-4/+28
Commit 3fd091e73b81 ("[SCTP]: Remove multiple levels of msecs to jiffies conversions.") has silently changed permissions for rto_alpha and rto_beta knobs from 0644 to 0444. The purpose of this was to discourage users from tweaking rto_alpha and rto_beta knobs in production environments since they are key to correctly compute rtt/srtt. RFC4960 under section 6.3.1. RTO Calculation says regarding rto_alpha and rto_beta under rule C3 and C4: [...] C3) When a new RTT measurement R' is made, set RTTVAR <- (1 - RTO.Beta) * RTTVAR + RTO.Beta * |SRTT - R'| and SRTT <- (1 - RTO.Alpha) * SRTT + RTO.Alpha * R' Note: The value of SRTT used in the update to RTTVAR is its value before updating SRTT itself using the second assignment. After the computation, update RTO <- SRTT + 4 * RTTVAR. C4) When data is in flight and when allowed by rule C5 below, a new RTT measurement MUST be made each round trip. Furthermore, new RTT measurements SHOULD be made no more than once per round trip for a given destination transport address. There are two reasons for this recommendation: First, it appears that measuring more frequently often does not in practice yield any significant benefit [ALLMAN99]; second, if measurements are made more often, then the values of RTO.Alpha and RTO.Beta in rule C3 above should be adjusted so that SRTT and RTTVAR still adjust to changes at roughly the same rate (in terms of how many round trips it takes them to reflect new values) as they would if making only one measurement per round-trip and using RTO.Alpha and RTO.Beta as given in rule C3. However, the exact nature of these adjustments remains a research issue. [...] While it is discouraged to adjust rto_alpha and rto_beta and not further specified how to adjust them, the RFC also doesn't explicitly forbid it, but rather gives a RECOMMENDED default value (rto_alpha=3, rto_beta=2). We have a couple of users relying on the old permissions before they got changed. That said, if someone really has the urge to adjust them, we could allow it with a warning in the log. Fixes: 3fd091e73b81 ("[SCTP]: Remove multiple levels of msecs to jiffies conversions.") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-15vxlan: Checksum fixesTom Herbert1-9/+2
Call skb_pop_rcv_encapsulation and postpull_rcsum for the Ethernet header to work properly with checksum complete. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-15net: add skb_pop_rcv_encapsulationTom Herbert1-0/+12
This function is used by UDP encapsulation protocols in RX when crossing encapsulation boundary. If ip_summed is set to CHECKSUM_UNNECESSARY and encapsulation is not set, change to CHECKSUM_NONE since the checksum has not been validated within the encapsulation. Clears csum_valid by the same rationale. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-15udp: call __skb_checksum_complete when doing full checksumTom Herbert1-1/+3
In __udp_lib_checksum_complete check if checksum is being done over all the data (len is equal to skb->len) and if it is call __skb_checksum_complete instead of __skb_checksum_complete_head. This allows checksum to be saved in checksum complete. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-15net: Fix save software checksum completeTom Herbert2-10/+29
Geert reported issues regarding checksum complete and UDP. The logic introduced in commit 7e3cead5172927732f51fde ("net: Save software checksum complete") is not correct. This patch: 1) Restores code in __skb_checksum_complete_header except for setting CHECKSUM_UNNECESSARY. This function may be calculating checksum on something less than skb->len. 2) Adds saving checksum to __skb_checksum_complete. The full packet checksum 0..skb->len is calculated without adding in pseudo header. This value is saved in skb->csum and then the pseudo header is added to that to derive the checksum for validation. 3) In both __skb_checksum_complete_header and __skb_checksum_complete, set skb->csum_valid to whether checksum of zero was computed. This allows skb_csum_unnecessary to return true without changing to CHECKSUM_UNNECESSARY which was done previously. 4) Copy new csum related bits in __copy_skb_header. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-15net: Fix GSO constants to match NETIF flagsTom Herbert3-5/+14
Joseph Gasparakis reported that VXLAN GSO offload stopped working with i40e device after recent UDP changes. The problem is that the SKB_GSO_* bits are out of sync with the corresponding NETIF flags. This patch fixes that. Also, we add BUILD_BUG_ONs in net_gso_ok for several GSO constants that were missing to avoid the problem in the future. Reported-by: Joseph Gasparakis <joseph.gasparakis@intel.com> Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-15fix __swap_writepage() compile failure on old gcc versionsAl Viro1-1/+1
Tetsuo Handa wrote: "Commit 62a8067a7f35 ("bio_vec-backed iov_iter") introduced an unnamed union inside a struct which gcc-4.4.7 cannot handle. Name the unnamed union as u in order to fix build failure" Let's do this instead: there is only one place in the entire tree that steps into this breakage. Anon structs and unions work in older gcc versions; as the matter of fact, we have those in the tree - see e.g. struct ieee80211_tx_info in include/net/mac80211.h What doesn't work is handling their initializers: struct { int a; union { int b; char c; }; } x[2] = {{.a = 1, .c = 'a'}, {.a = 0, .b = 1}}; is the obvious syntax for initializer, perfectly fine for C11 and handled correctly by gcc-4.7 or later. Earlier versions, though, break on it - declaration is fine and so's access to fields (i.e. x[0].c = 'a'; would produce the right code), but members of the anon structs and unions are not inserted into the right namespace. Tellingly, those older versions will not barf on struct {int a; struct {int a;};}; - looks like they just have it hacked up somewhere around the handling of . and -> instead of doing the right thing. The easiest way to deal with that crap is to turn initialization of those fields (in the only place where we have such initializer of iov_iter) into plain assignment. Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-14udp: ipv4: do not waste time in __udp4_lib_mcast_demux_lookupEric Dumazet1-0/+4
Its too easy to add thousand of UDP sockets on a particular bucket, and slow down an innocent multicast receiver. Early demux is supposed to be an optimization, we should avoid spending too much time in it. It is interesting to note __udp4_lib_demux_lookup() only tries to match first socket in the chain. 10 is the threshold we already have in __udp4_lib_lookup() to switch to secondary hash. Fixes: 421b3885bf6d5 ("udp: ipv4: Add udp early demux") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: David Held <drheld@google.com> Cc: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-14vxlan: use dev->needed_headroom instead of dev->hard_header_lenCong Wang1-4/+3
When we mirror packets from a vxlan tunnel to other device, the mirror device should see the same packets (that is, without outer header). Because vxlan tunnel sets dev->hard_header_len, tcf_mirred() resets mac header back to outer mac, the mirror device actually sees packets with outer headers Vxlan tunnel should set dev->needed_headroom instead of dev->hard_header_len, like what other ip tunnels do. This fixes the above problem. Cc: "David S. Miller" <davem@davemloft.net> Cc: stephen hemminger <stephen@networkplumber.org> Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Cong Wang <cwang@twopensource.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-13MAINTAINERS: update cxgb4 maintainerDimitris Michailidis1-1/+1
Hari's been doing the patch submissions for a while now and he'll be taking over as maintainer. Signed-off-by: Dimitris Michailidis <dm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-13x86/vdso: Fix vdso_installAndy Lutomirski1-11/+11
"make vdso_install" installs unstripped versions of the vdso objects for the benefit of the debugger. This was broken by checkin: 6f121e548f83 x86, vdso: Reimplement vdso.so preparation in build-time C The filenames are different now, so update the Makefile to cope. This still installs the 64-bit vdso as vdso64.so. We believe this will be okay, as the only known user is a patched gdb which is known to use build-ids, but if it turns out to be a problem we may have to add a link. Inspired by a patch from Sam Ravnborg. Acked-by: Sam Ravnborg <sam@ravnborg.org> Reported-by: Josh Boyer <jwboyer@fedoraproject.org> Tested-by: Josh Boyer <jwboyer@fedoraproject.org> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/b10299edd8ba98d17e07dafcd895b8ecf4d99eff.1402586707.git.luto@amacapital.net Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-06-13NVMe: Fix START_STOP_UNIT Scsi->NVMe translation.Dan McLeran1-7/+6
This patch contains several fixes for Scsi START_STOP_UNIT. The previous code did not account for signed vs. unsigned arithmetic which resulted in an invalid lowest power state caculation when the device only supports 1 power state. The code for Power Condition == 2 (Idle) was not following the spec. The spec calls for setting the device to specific power states, depending upon Power Condition Modifier, without accounting for the number of power states supported by the device. The code for Power Condition == 3 (Standby) was using a hard-coded '0' which is replaced with the macro POWER_STATE_0. Signed-off-by: Dan McLeran <daniel.mcleran@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@linux.intel.com> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2014-06-13btrfs: fix error handling in create_pending_snapshotEric Sandeen1-5/+7
fcebe456 cut and pasted some code to a later point in create_pending_snapshot(), but didn't switch to the appropriate error handling for this stage of the function. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-06-13btrfs: fix use of uninit "ret" in end_extent_writepage()Eric Sandeen1-1/+1
If this condition in end_extent_writepage() is false: if (tree->ops && tree->ops->writepage_end_io_hook) we will then test an uninitialized "ret" at: ret = ret < 0 ? ret : -EIO; The test for ret is for the case where ->writepage_end_io_hook failed, and we'd choose that ret as the error; but if there is no ->writepage_end_io_hook, nothing sets ret. Initializing ret to 0 should be sufficient; if writepage_end_io_hook wasn't set, (!uptodate) means non-zero err was passed in, so we choose -EIO in that case. Signed-of-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-06-13btrfs: free ulist in qgroup_shared_accounting() error pathEric Sandeen1-1/+3
If tmp = ulist_alloc(GFP_NOFS) fails, we return without freeing the previously allocated qgroups = ulist_alloc(GFP_NOFS) and cause a memory leak. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-06-13Btrfs: fix qgroups sanity test crash or hangFilipe Manana1-0/+2
Often when running the qgroups sanity test, a crash or a hang happened. This is because the extent buffer the test uses for the root node doesn't have an header level explicitly set, making it have a random level value. This is a problem when it's not zero for the btrfs_search_slot() calls the test ends up doing, resulting in crashes or hangs such as the following: [ 6454.127192] Btrfs loaded, debug=on, assert=on, integrity-checker=on (...) [ 6454.127760] BTRFS: selftest: Running qgroup tests [ 6454.127964] BTRFS: selftest: Running test_test_no_shared_qgroup [ 6454.127966] BTRFS: selftest: Qgroup basic add [ 6480.152005] BUG: soft lockup - CPU#0 stuck for 23s! [modprobe:5383] [ 6480.152005] Modules linked in: btrfs(+) xor raid6_pq binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc i2c_piix4 i2c_core pcspkr evbug psmouse serio_raw e1000 [last unloaded: btrfs] [ 6480.152005] irq event stamp: 188448 [ 6480.152005] hardirqs last enabled at (188447): [<ffffffff8168ef5c>] restore_args+0x0/0x30 [ 6480.152005] hardirqs last disabled at (188448): [<ffffffff81698e6a>] apic_timer_interrupt+0x6a/0x80 [ 6480.152005] softirqs last enabled at (188446): [<ffffffff810516cf>] __do_softirq+0x1cf/0x450 [ 6480.152005] softirqs last disabled at (188441): [<ffffffff81051c25>] irq_exit+0xb5/0xc0 [ 6480.152005] CPU: 0 PID: 5383 Comm: modprobe Not tainted 3.15.0-rc8-fdm-btrfs-next-33+ #4 [ 6480.152005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 6480.152005] task: ffff8802146125a0 ti: ffff8800d0d00000 task.ti: ffff8800d0d00000 [ 6480.152005] RIP: 0010:[<ffffffff81349a63>] [<ffffffff81349a63>] __write_lock_failed+0x13/0x20 [ 6480.152005] RSP: 0018:ffff8800d0d038e8 EFLAGS: 00000287 [ 6480.152005] RAX: 0000000000000000 RBX: ffffffff8168ef5c RCX: 000005deb8525852 [ 6480.152005] RDX: 0000000000000000 RSI: 0000000000001d45 RDI: ffff8802105000b8 [ 6480.152005] RBP: ffff8800d0d038e8 R08: fffffe12710f63db R09: ffffffffa03196fb [ 6480.152005] R10: ffff8802146125a0 R11: ffff880214612e28 R12: ffff8800d0d03858 [ 6480.152005] R13: 0000000000000000 R14: ffff8800d0d00000 R15: ffff8802146125a0 [ 6480.152005] FS: 00007f14ff804700(0000) GS:ffff880215e00000(0000) knlGS:0000000000000000 [ 6480.152005] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 6480.152005] CR2: 00007fff4df0dac8 CR3: 00000000d1796000 CR4: 00000000000006f0 [ 6480.152005] Stack: [ 6480.152005] ffff8800d0d03908 ffffffff810ae967 0000000000000001 ffff8802105000b8 [ 6480.152005] ffff8800d0d03938 ffffffff8168e57e ffffffffa0319c16 0000000000000007 [ 6480.152005] ffff880210500000 ffff880210500100 ffff8800d0d039b8 ffffffffa0319c16 [ 6480.152005] Call Trace: [ 6480.152005] [<ffffffff810ae967>] do_raw_write_lock+0x47/0xa0 [ 6480.152005] [<ffffffff8168e57e>] _raw_write_lock+0x5e/0x80 [ 6480.152005] [<ffffffffa0319c16>] ? btrfs_tree_lock+0x116/0x270 [btrfs] [ 6480.152005] [<ffffffffa0319c16>] btrfs_tree_lock+0x116/0x270 [btrfs] [ 6480.152005] [<ffffffffa02b2acb>] btrfs_lock_root_node+0x3b/0x50 [btrfs] [ 6480.152005] [<ffffffffa02b81a6>] btrfs_search_slot+0x916/0xa20 [btrfs] [ 6480.152005] [<ffffffff811a727f>] ? create_object+0x23f/0x300 [ 6480.152005] [<ffffffffa02b9958>] btrfs_insert_empty_items+0x78/0xd0 [btrfs] [ 6480.152005] [<ffffffffa036041a>] insert_normal_tree_ref.constprop.4+0xa2/0x19a [btrfs] [ 6480.152005] [<ffffffffa03605c3>] test_no_shared_qgroup+0xb1/0x1ca [btrfs] [ 6480.152005] [<ffffffff8108cad6>] ? local_clock+0x16/0x30 [ 6480.152005] [<ffffffffa035ef8e>] btrfs_test_qgroups+0x1ae/0x1d7 [btrfs] [ 6480.152005] [<ffffffffa03a69d2>] ? ftrace_define_fields_btrfs_space_reservation+0xfd/0xfd [btrfs] [ 6480.152005] [<ffffffffa03a6a86>] init_btrfs_fs+0xb4/0x153 [btrfs] [ 6480.152005] [<ffffffff81000352>] do_one_initcall+0x102/0x150 [ 6480.152005] [<ffffffff8103d223>] ? set_memory_nx+0x43/0x50 [ 6480.152005] [<ffffffff81682668>] ? set_section_ro_nx+0x6d/0x74 [ 6480.152005] [<ffffffff810d91cc>] load_module+0x1cdc/0x2630 (...) Therefore initialize the extent buffer as an empty leaf (level 0). Issue easy to reproduce when btrfs is built as a module via: $ for ((i = 1; i <= 1000000; i++)); do rmmod btrfs; modprobe btrfs; done Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-06-13btrfs: prevent RCU warning when dereferencing radix tree slotSasha Levin1-1/+1
Mark the dereference as protected by lock. Not doing so triggers an RCU warning since the radix tree assumed that RCU is in use. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-06-13Btrfs: fix unfinished readahead thread for raid5/6 degraded mountingWang Shilong1-2/+7
Steps to reproduce: # mkfs.btrfs -f /dev/sd[b-f] -m raid5 -d raid5 # mkfs.ext4 /dev/sdc --->corrupt one of btrfs device # mount /dev/sdb /mnt -o degraded # btrfs scrub start -BRd /mnt This is because readahead would skip missing device, this is not true for RAID5/6, because REQ_GET_READ_MIRRORS return 1 for RAID5/6 block mapping. If expected data locates in missing device, readahead thread would not call __readahead_hook() which makes event @rc->elems=0 wait forever. Fix this problem by checking return value of btrfs_map_block(),we can only skip missing device safely if there are several mirrors. Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-06-13btrfs: new ioctl TREE_SEARCH_V2Gerhard Heift2-0/+51
This new ioctl call allows the user to supply a buffer of varying size in which a tree search can store its results. This is much more flexible if you want to receive items which are larger than the current fixed buffer of 3992 bytes or if you want to fetch more items at once. Items larger than this buffer are for example some of the type EXTENT_CSUM. Signed-off-by: Gerhard Heift <Gerhard@Heift.Name> Signed-off-by: Chris Mason <clm@fb.com> Acked-by: David Sterba <dsterba@suse.cz>
2014-06-13NVMe: Use Log Page constants in SCSI emulationMatthew Wilcox1-3/+2
The nvme-scsi file defined its own Log Page constant. Use the newly-defined one from the header file instead. Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2014-06-13NVMe: Define Log Page constantsMatthew Wilcox1-0/+4
Taken from the 1.1a version of the spec Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2014-06-13NVMe: Fix hot cpu notification dead lockKeith Busch2-11/+26
There is a potential dead lock if a cpu event occurs during nvme probe since it registered with hot cpu notification. This fixes the race by having the module register with notification outside of probe rather than have each device register. The actual work is done in a scheduled work queue instead of in the notifier since assigning IO queues has the potential to block if the driver creates additional queues. Signed-off-by: Keith Busch <keith.busch@intel.com> Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
2014-06-13ALSA: hda/realtek - Add more entry for enable HP mute ledKailang Yang1-0/+14
More HP machine need mute led support. Signed-off-by: Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2014-06-13ALSA: hda - Add quirk for external mic on Lifebook U904David Henningsson1-0/+9
According to the bug reporter (Данило Шеган), the external mic starts to work and has proper jack detection if only pin 0x19 is marked properly as an external headset mic. AlsaInfo at https://bugs.launchpad.net/ubuntu/+source/pulseaudio/+bug/1328587/+attachment/4128991/+files/AlsaInfo.txt Cc: stable@vger.kernel.org BugLink: https://bugs.launchpad.net/bugs/1328587 Signed-off-by: David Henningsson <david.henningsson@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2014-06-13ALSA: hda - fix a fixup value for codec alc293 in the pin_quirk tableHui Wang1-1/+1
The fixup value for codec alc293 was set to ALC269_FIXUP_DELL1_MIC_NO_PRESENCE by a mistake, if we don't fix it, the Dock mic will be overwriten by the headset mic, this will make the Dock mic can't work. Cc: David Henningsson <david.henningsson@canonical.com> Signed-off-by: Hui Wang <hui.wang@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2014-06-13x86/vdso: Hack to keep 64-bit Go programs workingAndy Lutomirski3-13/+60
The Go runtime has a buggy vDSO parser that currently segfaults. This writes an empty SHT_DYNSYM entry that causes Go's runtime to malfunction by thinking that the vDSO is empty rather than malfunctioning by running off the end and segfaulting. This affects x86-64 only as far as we know, so we do not need this for the i386 and x32 vdsos. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/d10618176c4bd39b457a5e85c497295c90cab1bc.1402620737.git.luto@amacapital.net Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-06-13x86/vdso: Add PUT_LE to store little-endian valuesAndy Lutomirski1-3/+16
Add PUT_LE() by analogy with GET_LE() to write littleendian values in addition to reading them. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/3d9b27e92745b27b6fda1b9a98f70dc9c1246c7a.1402620737.git.luto@amacapital.net Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-06-13x86/vdso/doc: Make vDSO examples more portableAndy Lutomirski3-41/+123
This adds a new vdso_test.c that's written entirely in C. It also makes all of the vDSO examples work on 32-bit x86. Cc: Stefani Seibold <stefani@seibold.net> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/62b701fc44b79f118ac2b2d64d19965fc5c291fb.1402620737.git.luto@amacapital.net Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-06-13x86/vdso/doc: Rename vdso_test.c to vdso_standalone_test_x86.cAndy Lutomirski1-1/+1
This thing is hopelessly x86_64-specific: it's an example of how to access the vDSO without any runtime support at all. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/3efc170e0e166e15f0150c9fdb37d52488b9c0a4.1402620737.git.luto@amacapital.net Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2014-06-13btrfs: tree_search, search_ioctl: direct copy to userspaceGerhard Heift1-15/+33
By copying each found item seperatly to userspace, we do not need extra buffer in the kernel. Signed-off-by: Gerhard Heift <Gerhard@Heift.Name> Signed-off-by: Chris Mason <clm@fb.com> Acked-by: David Sterba <dsterba@suse.cz>
2014-06-13btrfs: new function read_extent_buffer_to_userGerhard Heift2-0/+40
This new function reads the content of an extent directly to user memory. Signed-off-by: Gerhard Heift <Gerhard@Heift.Name> Signed-off-by: Chris Mason <clm@fb.com> Acked-by: David Sterba <dsterba@suse.cz>
2014-06-13btrfs: tree_search, copy_to_sk: return needed size on EOVERFLOWGerhard Heift1-9/+15
If an item in tree_search is too large to be stored in the given buffer, return the needed size (including the header). Signed-off-by: Gerhard Heift <Gerhard@Heift.Name> Signed-off-by: Chris Mason <clm@fb.com> Acked-by: David Sterba <dsterba@suse.cz>
2014-06-13btrfs: tree_search, copy_to_sk: return EOVERFLOW for too small bufferGerhard Heift1-2/+26
In copy_to_sk, if an item is too large for the given buffer, it now returns -EOVERFLOW instead of copying a search_header with len = 0. For backward compatibility for the first item it still copies such a header to the buffer, but not any other following items, which could have fitted. tree_search changes -EOVERFLOW back to 0 to behave similiar to the way it behaved before this patch. Signed-off-by: Gerhard Heift <Gerhard@Heift.Name> Signed-off-by: Chris Mason <clm@fb.com> Acked-by: David Sterba <dsterba@suse.cz>
2014-06-13btrfs: tree_search, search_ioctl: accept varying bufferGerhard Heift1-7/+11
rewrite search_ioctl to accept a buffer with varying size Signed-off-by: Gerhard Heift <Gerhard@Heift.Name> Signed-off-by: Chris Mason <clm@fb.com> Acked-by: David Sterba <dsterba@suse.cz>
2014-06-13btrfs: tree_search: eliminate redundant nr_items checkGerhard Heift1-5/+7
If the amount of items reached the given limit of nr_items, we can leave copy_to_sk without updating the key. Also by returning 1 we leave the loop in search_ioctl without rechecking if we reached the given limit. Signed-off-by: Gerhard Heift <Gerhard@Heift.Name> Signed-off-by: Chris Mason <clm@fb.com> Acked-by: David Sterba <dsterba@suse.cz>
2014-06-12ima: introduce ima_kernel_read()Dmitry Kasatkin1-1/+31
Commit 8aac62706 "move exit_task_namespaces() outside of exit_notify" introduced the kernel opps since the kernel v3.10, which happens when Apparmor and IMA-appraisal are enabled at the same time. ---------------------------------------------------------------------- [ 106.750167] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 [ 106.750221] IP: [<ffffffff811ec7da>] our_mnt+0x1a/0x30 [ 106.750241] PGD 0 [ 106.750254] Oops: 0000 [#1] SMP [ 106.750272] Modules linked in: cuse parport_pc ppdev bnep rfcomm bluetooth rpcsec_gss_krb5 nfsd auth_rpcgss nfs_acl nfs lockd sunrpc fscache dm_crypt intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel snd_hda_codec_hdmi kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd snd_hda_codec_realtek dcdbas snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_seq_midi snd_seq_midi_event snd_rawmidi psmouse snd_seq microcode serio_raw snd_timer snd_seq_device snd soundcore video lpc_ich coretemp mac_hid lp parport mei_me mei nbd hid_generic e1000e usbhid ahci ptp hid libahci pps_core [ 106.750658] CPU: 6 PID: 1394 Comm: mysqld Not tainted 3.13.0-rc7-kds+ #15 [ 106.750673] Hardware name: Dell Inc. OptiPlex 9010/0M9KCM, BIOS A08 09/19/2012 [ 106.750689] task: ffff8800de804920 ti: ffff880400fca000 task.ti: ffff880400fca000 [ 106.750704] RIP: 0010:[<ffffffff811ec7da>] [<ffffffff811ec7da>] our_mnt+0x1a/0x30 [ 106.750725] RSP: 0018:ffff880400fcba60 EFLAGS: 00010286 [ 106.750738] RAX: 0000000000000000 RBX: 0000000000000100 RCX: ffff8800d51523e7 [ 106.750764] RDX: ffffffffffffffea RSI: ffff880400fcba34 RDI: ffff880402d20020 [ 106.750791] RBP: ffff880400fcbae0 R08: 0000000000000000 R09: 0000000000000001 [ 106.750817] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800d5152300 [ 106.750844] R13: ffff8803eb8df510 R14: ffff880400fcbb28 R15: ffff8800d51523e7 [ 106.750871] FS: 0000000000000000(0000) GS:ffff88040d200000(0000) knlGS:0000000000000000 [ 106.750910] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 106.750935] CR2: 0000000000000018 CR3: 0000000001c0e000 CR4: 00000000001407e0 [ 106.750962] Stack: [ 106.750981] ffffffff813434eb ffff880400fcbb20 ffff880400fcbb18 0000000000000000 [ 106.751037] ffff8800de804920 ffffffff8101b9b9 0001800000000000 0000000000000100 [ 106.751093] 0000010000000000 0000000000000002 000000000000000e ffff8803eb8df500 [ 106.751149] Call Trace: [ 106.751172] [<ffffffff813434eb>] ? aa_path_name+0x2ab/0x430 [ 106.751199] [<ffffffff8101b9b9>] ? sched_clock+0x9/0x10 [ 106.751225] [<ffffffff8134a68d>] aa_path_perm+0x7d/0x170 [ 106.751250] [<ffffffff8101b945>] ? native_sched_clock+0x15/0x80 [ 106.751276] [<ffffffff8134aa73>] aa_file_perm+0x33/0x40 [ 106.751301] [<ffffffff81348c5e>] common_file_perm+0x8e/0xb0 [ 106.751327] [<ffffffff81348d78>] apparmor_file_permission+0x18/0x20 [ 106.751355] [<ffffffff8130c853>] security_file_permission+0x23/0xa0 [ 106.751382] [<ffffffff811c77a2>] rw_verify_area+0x52/0xe0 [ 106.751407] [<ffffffff811c789d>] vfs_read+0x6d/0x170 [ 106.751432] [<ffffffff811cda31>] kernel_read+0x41/0x60 [ 106.751457] [<ffffffff8134fd45>] ima_calc_file_hash+0x225/0x280 [ 106.751483] [<ffffffff8134fb52>] ? ima_calc_file_hash+0x32/0x280 [ 106.751509] [<ffffffff8135022d>] ima_collect_measurement+0x9d/0x160 [ 106.751536] [<ffffffff810b552d>] ? trace_hardirqs_on+0xd/0x10 [ 106.751562] [<ffffffff8134f07c>] ? ima_file_free+0x6c/0xd0 [ 106.751587] [<ffffffff81352824>] ima_update_xattr+0x34/0x60 [ 106.751612] [<ffffffff8134f0d0>] ima_file_free+0xc0/0xd0 [ 106.751637] [<ffffffff811c9635>] __fput+0xd5/0x300 [ 106.751662] [<ffffffff811c98ae>] ____fput+0xe/0x10 [ 106.751687] [<ffffffff81086774>] task_work_run+0xc4/0xe0 [ 106.751712] [<ffffffff81066fad>] do_exit+0x2bd/0xa90 [ 106.751738] [<ffffffff8173c958>] ? retint_swapgs+0x13/0x1b [ 106.751763] [<ffffffff8106780c>] do_group_exit+0x4c/0xc0 [ 106.751788] [<ffffffff81067894>] SyS_exit_group+0x14/0x20 [ 106.751814] [<ffffffff8174522d>] system_call_fastpath+0x1a/0x1f [ 106.751839] Code: c3 0f 1f 44 00 00 55 48 89 e5 e8 22 fe ff ff 5d c3 0f 1f 44 00 00 55 65 48 8b 04 25 c0 c9 00 00 48 8b 80 28 06 00 00 48 89 e5 5d <48> 8b 40 18 48 39 87 c0 00 00 00 0f 94 c0 c3 0f 1f 80 00 00 00 [ 106.752185] RIP [<ffffffff811ec7da>] our_mnt+0x1a/0x30 [ 106.752214] RSP <ffff880400fcba60> [ 106.752236] CR2: 0000000000000018 [ 106.752258] ---[ end trace 3c520748b4732721 ]--- ---------------------------------------------------------------------- The reason for the oops is that IMA-appraisal uses "kernel_read()" when file is closed. kernel_read() honors LSM security hook which calls Apparmor handler, which uses current->nsproxy->mnt_ns. The 'guilty' commit changed the order of cleanup code so that nsproxy->mnt_ns was not already available for Apparmor. Discussion about the issue with Al Viro and Eric W. Biederman suggested that kernel_read() is too high-level for IMA. Another issue, except security checking, that was identified is mandatory locking. kernel_read honors it as well and it might prevent IMA from calculating necessary hash. It was suggested to use simplified version of the function without security and locking checks. This patch introduces special version ima_kernel_read(), which skips security and mandatory locking checking. It prevents the kernel oops to happen. Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Suggested-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org>
2014-06-12evm: prohibit userspace writing 'security.evm' HMAC valueMimi Zohar1-2/+10
Calculating the 'security.evm' HMAC value requires access to the EVM encrypted key. Only the kernel should have access to it. This patch prevents userspace tools(eg. setfattr, cp --preserve=xattr) from setting/modifying the 'security.evm' HMAC value directly. Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org>
2014-06-12ima: check inode integrity cache in violation checkDmitry Kasatkin1-2/+7
When IMA did not support ima-appraisal, existance of the S_IMA flag clearly indicated that the file was measured. With IMA appraisal S_IMA flag indicates that file was measured and/or appraised. Because of this, when measurement is not enabled by the policy, violations are still reported. To differentiate between measurement and appraisal policies this patch checks the inode integrity cache flags. The IMA_MEASURED flag indicates whether the file was actually measured, while the IMA_MEASURE flag indicates whether the file should be measured. Unfortunately, the IMA_MEASURED flag is reset to indicate the file needs to be re-measured. Thus, this patch checks the IMA_MEASURE flag. This patch limits the false positive violation reports, but does not fix it entirely. The IMA_MEASURE/IMA_MEASURED flags are indications that, at some point in time, the file opened for read was in policy, but might not be in policy now (eg. different uid). Other changes would be needed to further limit false positive violation reports. Changelog: - expanded patch description based on conversation with Roberto (Mimi) Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12ima: prevent unnecessary policy checkingDmitry Kasatkin1-9/+4
ima_rdwr_violation_check is called for every file openning. The function checks the policy even when violation condition is not met. It causes unnecessary policy checking. This patch does policy checking only if violation condition is met. Changelog: - check writecount is greater than zero (Mimi) Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12evm: provide option to protect additional SMACK xattrsDmitry Kasatkin2-0/+22
Newer versions of SMACK introduced following security xattrs: SMACK64EXEC, SMACK64TRANSMUTE and SMACK64MMAP. To protect these xattrs, this patch includes them in the HMAC calculation. However, for backwards compatibility with existing labeled filesystems, including these xattrs needs to be configurable. Changelog: - Add SMACK dependency on new option (Mimi) Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12evm: replace HMAC version with attribute maskDmitry Kasatkin4-11/+33
Using HMAC version limits the posibility to arbitrarily add new attributes such as SMACK64EXEC to the hmac calculation. This patch replaces hmac version with attribute mask. Desired attributes can be enabled with configuration parameter. It allows to build kernels which works with previously labeled filesystems. Currently supported attribute is 'fsuuid' which is equivalent of the former version 2. Signed-off-by: Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12ima: prevent new digsig xattr from being replacedMimi Zohar1-3/+7
Even though a new xattr will only be appraised on the next access, set the DIGSIG flag to prevent a signature from being replaced with a hash on file close. Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
2014-06-12target: Fix NULL pointer dereference for XCOPY in target_put_sess_cmdNicholas Bellinger1-0/+4
This patch fixes a NULL pointer dereference regression bug that was introduced with: commit 1e1110c43b1cda9fe77fc4a04835e460550e6b3c Author: Mikulas Patocka <mpatocka@redhat.com> Date: Sat May 17 06:49:22 2014 -0400 target: fix memory leak on XCOPY Now that target_put_sess_cmd() -> kref_put_spinlock_irqsave() is called with a valid se_cmd->cmd_kref, a NULL pointer dereference is triggered because the XCOPY passthrough commands don't have an associated se_session pointer. To address this bug, go ahead and checking for a NULL se_sess pointer within target_put_sess_cmd(), and call se_cmd->se_tfo->release_cmd() to release the XCOPY's xcopy_pt_cmd memory. Reported-by: Thomas Glanzmann <thomas@glanzmann.de> Cc: Thomas Glanzmann <thomas@glanzmann.de> Cc: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org # 3.12+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2014-06-12rtnetlink: fix userspace API breakage for iproute2 < v3.9.0Michal Schmidt1-4/+18
When running RHEL6 userspace on a current upstream kernel, "ip link" fails to show VF information. The reason is a kernel<->userspace API change introduced by commit 88c5b5ce5cb57 ("rtnetlink: Call nlmsg_parse() with correct header length"), after which the kernel does not see iproute2's IFLA_EXT_MASK attribute in the netlink request. iproute2 adjusted for the API change in its commit 63338dca4513 ("libnetlink: Use ifinfomsg instead of rtgenmsg in rtnl_wilddump_req_filter"). The problem has been noticed before: http://marc.info/?l=linux-netdev&m=136692296022182&w=2 (Subject: Re: getting VF link info seems to be broken in 3.9-rc8) We can do better than tell those with old userspace to upgrade. We can recognize the old iproute2 in the kernel by checking the netlink message length. Even when including the IFLA_EXT_MASK attribute, its netlink message is shorter than struct ifinfomsg. With this patch "ip link" shows VF information in both old and new iproute2 versions. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-12tcp: fixing TLP's FIN recoveryPer Hurtig1-3/+1
Fix to a problem observed when losing a FIN segment that does not contain data. In such situations, TLP is unable to recover from *any* tail loss and instead adds at least PTO ms to the retransmission process, i.e., RTO = RTO + PTO. Signed-off-by: Per Hurtig <per.hurtig@kau.se> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Nandita Dukkipati <nanditad@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-12net: fec: Add software TSO supportNimrod Andy2-23/+238
Add software TSO support for FEC. This feature allows to improve outbound throughput performance. Tested on imx6dl sabresd board, running iperf tcp tests shows: - 16.2% improvement comparing with FEC SG patch - 82% improvement comparing with NO SG & TSO patch $ ethtool -K eth0 tso on $ iperf -c 10.192.242.167 -t 3 & [ 3] local 10.192.242.108 port 35388 connected with 10.192.242.167 port 5001 [ ID] Interval Transfer Bandwidth [ 3] 0.0- 3.0 sec 181 MBytes 506 Mbits/sec During the testing, CPU loading is 30%. Since imx6dl FEC Bandwidth is limited to SOC system bus bandwidth, the performance with SW TSO is a milestone. CC: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: David Laight <David.Laight@ACULAB.COM> CC: Li Frank <B20596@freescale.com> Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-12net: fec: Add Scatter/gather supportNimrod Andy2-62/+178
Add Scatter/gather support for FEC. This feature allows to improve outbound throughput performance. Tested on imx6dl sabresd board: Running iperf tests shows a 55.4% improvement. $ ethtool -K eth0 sg off $ iperf -c 10.192.242.167 -t 3 & [ 3] local 10.192.242.108 port 52618 connected with 10.192.242.167 port 5001 [ ID] Interval Transfer Bandwidth [ 3] 0.0- 3.0 sec 99.5 MBytes 278 Mbits/sec $ ethtool -K eth0 sg on $ iperf -c 10.192.242.167 -t 3 & [ 3] local 10.192.242.108 port 52617 connected with 10.192.242.167 port 5001 [ ID] Interval Transfer Bandwidth [ 3] 0.0- 3.0 sec 154 MBytes 432 Mbits/sec CC: Li Frank <B20596@freescale.com> Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-12net: fec: Increase buffer descriptor entry numberNimrod Andy2-16/+17
In order to support SG, software TSO, let's increase BD entry number. CC: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> CC: Eric Dumazet <eric.dumazet@gmail.com> CC: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-12net: fec: Factorize feature settingNimrod Andy1-5/+3
In order to enhance the code readable, let's factorize the feature list. Signed-off-by: Fugang Duan <B38611@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>