summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* bonding: don't use stale speed and duplex informationJay Vosburgh2016-02-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is presently a race condition between the bonding periodic link monitor and the updating of a slave's speed and duplex. The former occurs on a periodic basis, and the latter in response to a driver's calling of netif_carrier_on. It is possible for the periodic monitor to run between the driver call of netif_carrier_on and the receipt of the NETDEV_CHANGE event that causes bonding to update the slave's speed and duplex. This manifests most notably as a report that a slave is up and "0 Mbps full duplex" after enslavement, but in principle could report an incorrect speed and duplex after any link up event if the device comes up with a different speed or duplex. This affects the 802.3ad aggregator selection, as the speed and duplex are selection criteria. This is fixed by updating the speed and duplex in the periodic monitor, prior to using that information. This was done historically in bonding, but the call to bond_update_speed_duplex was removed in commit 876254ae2758 ("bonding: don't call update_speed_duplex() under spinlocks"), as it might sleep under lock. Later, the locking was changed to only hold RTNL, and so after commit 876254ae2758 ("bonding: don't call update_speed_duplex() under spinlocks") this call is again safe. Tested-by: "Tantilov, Emil S" <emil.s.tantilov@intel.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: dingtianhong <dingtianhong@huawei.com> Fixes: 876254ae2758 ("bonding: don't call update_speed_duplex() under spinlocks") Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Acked-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: am79c961a: avoid %? in inline assemblyArnd Bergmann2016-02-161-32/+32
| | | | | | | | | | | | | | | | The am79c961a.c driver fails to build with clang because of an unusual inline assembly construct: drivers/net/ethernet/amd/am79c961a.c:53:7: error: invalid % escape in inline assembly string "str%?h %1, [%2] @ NET_RAP\n\t" The same change has been done a decade ago in arch/arm as of 6a39dd6222dd ("[ARM] 3759/2: Remove uses of %?"), but apparently some drivers were missed. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: smc91x: propagate irq return codeRobert Jarzmik2016-02-161-2/+2
| | | | | | | | | | | | | The smc91x driver doesn't honor the probe deferral mechanism when the interrupt source is not yet available, such as one provided by a gpio controller not probed. Fix this by propagating the platform_get_irq() error code as the probe return value. Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'bcm7xxx-fixes'David S. Miller2016-02-161-39/+4
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Florian Fainelli says: ==================== Subject: [PATCH net v2 0/4] net: phy: bcm7xxx 40nm PHY fixes Here is a collection of fixes for the 40nm Ethernet PHY supported by the 7xxx PHY driver, please also queue these fixes for stable. Changes in v2: - dropped the cleanup patch, not appropriate - added another patch removing bogus wildcard entries ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: phy: bcm7xxx: Remove wildcard entriesFlorian Fainelli2016-02-161-31/+0
| | | | | | | | | | | | | | | | | | | | | | Remove the two wildcard entries, they serve no purpose and will match way too many devices, some of them being covered by the driver in drivers/net/phy/broadcom.c. Remove the now unused bcm7xxx_dummy_config_init() function which would produce a warning. Fixes: b560a58c45c6 ("net: phy: add Broadcom BCM7xxx internal PHY driver") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: phy: bcm7xxx: Fix bcm7xxx_config_init() checkFlorian Fainelli2016-02-161-4/+0
| | | | | | | | | | | | | | | | | | | | | | Since we were wrongly advertising gigabit features for these 10/100 only Ethernet PHYs, bcm7xxx_config_init() which is supposed to apply workaround would have not run since the check would be true, now that we have fixed the PHY features, remove that check since it has no reasoning to be there anymore. Fixes: e18556ee3bd83 ("net: phy: bcm7xxx: do not use PHY_BRCM_100MBPS_WAR") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: phy: bcm7xxx: Fix 40nm EPHY featuresFlorian Fainelli2016-02-161-3/+3
| | | | | | | | | | | | | | | | | | | | The PHY entries for BCM7425/29/35 declare the 40nm Ethernet PHY as being 10/100/1000 capable, while this is just a 10/100 capable PHY device, fix that. Fixes: d068b02cfdfc2 ("net: phy: add BCM7425 and BCM7429 PHYs") Fixes: 9458ceab4917 ("net: phy: bcm7xxx: Add entry for BCM7435") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: phy: bcm7xxx: Fix shadow mode 2 disablingFlorian Fainelli2016-02-161-1/+1
|/ | | | | | | | | | The clear and set masks in the call to phy_set_clr_bits() called from bcm7xxx_config_init() are inverted. We need to fix this by swapping the two arguments, that is, set 0 bits, but clear the shade mode 2 enable bit. Fixes: b560a58c45c66 ("net: phy: add Broadcom BCM7xxx internal PHY driver") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'ravb-fixes'David S. Miller2016-02-161-8/+8
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sergei Shtylyov says: ==================== ravb: fix the fallout of R-Car gen3 gPTP support Here's a set of 2 patches against DaveM's 'net.git' repo fixing up the incomplete commit f5d7837f96e5 ("ravb: ptp: Add CONFIG mode support"). I'm proposing these as fixes but they can be merged as cleanups as well... [1/2] ravb: kill duplicate setting of CCC.CSEL [2/2] ravb: skip gPTP start/stop on R-Car gen3 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * ravb: skip gPTP start/stop on R-Car gen3Sergei Shtylyov2016-02-161-4/+8
| | | | | | | | | | | | | | | | | | | | When adding support for the R-Car gen3 gPTP active in configuration mode, some call sites of ravb_ptp_{init|stop}() were missed due to an oversight. Add checks for the R-Car gen2 SoCs around these... Fixes: f5d7837f96e5 ("ravb: ptp: Add CONFIG mode support") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ravb: kill duplicate setting of CCC.CSELSergei Shtylyov2016-02-161-4/+0
|/ | | | | | | | | | | When adding support for the R-Car gen3 gPTP active in configuration mode, the code setting the CCC.CSEL field was duplicated due to an oversight. For R-Car gen 2 it's just redundant and for R-Car gen3 the write at this time is probably ignored due to CCC.GAC bit being already set... Fixes: f5d7837f96e5 ("ravb: ptp: Add CONFIG mode support") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller2016-02-167-16/+91
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contain a rather large batch for your net that includes accumulated bugfixes, they are: 1) Run conntrack cleanup from workqueue process context to avoid hitting soft lockup via watchdog for large tables. This is required by the IPv6 masquerading extension. From Florian Westphal. 2) Use original skbuff from nfnetlink batch when calling netlink_ack() on error since this needs to access the skb->sk pointer. 3) Incremental fix on top of recent Sasha Levin's lock fix for conntrack resizing. 4) Fix several problems in nfnetlink batch message header sanitization and error handling, from Phil Turnbull. 5) Select NF_DUP_IPV6 based on CONFIG_IPV6, from Arnd Bergmann. 6) Fix wrong signess in return values on nf_tables counter expression, from Anton Protopopov. Due to the NetDev 1.1 organization burden, I had no chance to pass up this to you any sooner in this release cycle, sorry about that. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * netfilter: nft_counter: fix erroneous return valuesAnton Protopopov2016-02-081-2/+2
| | | | | | | | | | | | | | | | The nft_counter_init() and nft_counter_clone() functions should return negative error value -ENOMEM instead of positive ENOMEM. Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: tee: select NF_DUP_IPV6 unconditionallyArnd Bergmann2016-02-082-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The NETFILTER_XT_TARGET_TEE option selects NF_DUP_IPV6 whenever IP6_NF_IPTABLES is enabled, and it ensures that it cannot be builtin itself if NF_CONNTRACK is a loadable module, as that is a dependency for NF_DUP_IPV6. However, NF_DUP_IPV6 can be enabled even if IP6_NF_IPTABLES is turned off, and it only really depends on IPV6. With the current check in tee_tg6, we call nf_dup_ipv6() whenever NF_DUP_IPV6 is enabled. This can however be a loadable module which is unreachable from a built-in xt_TEE: net/built-in.o: In function `tee_tg6': :(.text+0x67728): undefined reference to `nf_dup_ipv6' The bug was originally introduced in the split of the xt_TEE module into separate modules for ipv4 and ipv6, and two patches tried to fix it unsuccessfully afterwards. This is a revert of the the first incorrect attempt to fix it, going back to depending on IPV6 as the dependency, and we adapt the 'select' condition accordingly. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: bbde9fc1824a ("netfilter: factor out packet duplication for IPv4/IPv6") Fixes: 116984a316c3 ("netfilter: xt_TEE: use IS_ENABLED(CONFIG_NF_DUP_IPV6)") Fixes: 74ec4d55c4d2 ("netfilter: fix xt_TEE and xt_TPROXY dependencies") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: nfnetlink: correctly validate length of batch messagesPhil Turnbull2016-02-081-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If nlh->nlmsg_len is zero then an infinite loop is triggered because 'skb_pull(skb, msglen);' pulls zero bytes. The calculation in nlmsg_len() underflows if 'nlh->nlmsg_len < NLMSG_HDRLEN' which bypasses the length validation and will later trigger an out-of-bound read. If the length validation does fail then the malformed batch message is copied back to userspace. However, we cannot do this because the nlh->nlmsg_len can be invalid. This leads to an out-of-bounds read in netlink_ack: [ 41.455421] ================================================================== [ 41.456431] BUG: KASAN: slab-out-of-bounds in memcpy+0x1d/0x40 at addr ffff880119e79340 [ 41.456431] Read of size 4294967280 by task a.out/987 [ 41.456431] ============================================================================= [ 41.456431] BUG kmalloc-512 (Not tainted): kasan: bad access detected [ 41.456431] ----------------------------------------------------------------------------- ... [ 41.456431] Bytes b4 ffff880119e79310: 00 00 00 00 d5 03 00 00 b0 fb fe ff 00 00 00 00 ................ [ 41.456431] Object ffff880119e79320: 20 00 00 00 10 00 05 00 00 00 00 00 00 00 00 00 ............... [ 41.456431] Object ffff880119e79330: 14 00 0a 00 01 03 fc 40 45 56 11 22 33 10 00 05 .......@EV."3... [ 41.456431] Object ffff880119e79340: f0 ff ff ff 88 99 aa bb 00 14 00 0a 00 06 fe fb ................ ^^ start of batch nlmsg with nlmsg_len=4294967280 ... [ 41.456431] Memory state around the buggy address: [ 41.456431] ffff880119e79400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 41.456431] ffff880119e79480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 41.456431] >ffff880119e79500: 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc [ 41.456431] ^ [ 41.456431] ffff880119e79580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 41.456431] ffff880119e79600: fc fc fc fc fc fc fc fc fc fc fb fb fb fb fb fb [ 41.456431] ================================================================== Fix this with better validation of nlh->nlmsg_len and by setting NFNL_BATCH_FAILURE if any batch message fails length validation. CAP_NET_ADMIN is required to trigger the bugs. Fixes: 9ea2aa8b7dba ("netfilter: nfnetlink: validate nfnetlink header from batch") Signed-off-by: Phil Turnbull <phil.turnbull@oracle.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: cttimeout: fix deadlock due to erroneous unlock/lock conversionFlorian Westphal2016-02-011-1/+1
| | | | | | | | | | | | | | | | | | The spin_unlock call should have been left as-is, revert. Fixes: b16c29191dc89bd ("netfilter: nf_conntrack: use safer way to lock all buckets") Reported-by: kernel test robot <fengguang.wu@intel.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: nfnetlink: use original skbuff when acking batchesPablo Neira Ayuso2016-02-011-3/+3
| | | | | | | | | | | | | | | | | | | | | | Since bd678e09dc17 ("netfilter: nfnetlink: fix splat due to incorrect socket memory accounting in skbuff clones"), we don't manually attach the sk to the skbuff clone anymore, so we have to use the original skbuff from netlink_ack() which needs to access the sk pointer. Fixes: bd678e09dc17 ("netfilter: nfnetlink: fix splat due to incorrect socket memory accounting in skbuff clones") Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: conntrack: resched in nf_ct_iterate_cleanupFlorian Westphal2016-02-012-3/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ulrich reports soft lockup with following (shortened) callchain: NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! __netif_receive_skb_core+0x6e4/0x774 process_backlog+0x94/0x160 net_rx_action+0x88/0x178 call_do_softirq+0x24/0x3c do_softirq+0x54/0x6c __local_bh_enable_ip+0x7c/0xbc nf_ct_iterate_cleanup+0x11c/0x22c [nf_conntrack] masq_inet_event+0x20/0x30 [nf_nat_masquerade_ipv6] atomic_notifier_call_chain+0x1c/0x2c ipv6_del_addr+0x1bc/0x220 [ipv6] Problem is that nf_ct_iterate_cleanup can run for a very long time since it can be interrupted by softirq processing. Moreover, atomic_notifier_call_chain runs with rcu readlock held. So lets call cond_resched() in nf_ct_iterate_cleanup and defer the call to a work queue for the atomic_notifier_call_chain case. We also need another cond_resched in get_next_corpse, since we have to deal with iter() always returning false, in that case get_next_corpse will walk entire conntrack table. Reported-by: Ulrich Weber <uw@ocedo.com> Tested-by: Ulrich Weber <uw@ocedo.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | af_unix: Guard against other == sk in unix_dgram_sendmsgRainer Weikusat2016-02-161-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The unix_dgram_sendmsg routine use the following test if (unlikely(unix_peer(other) != sk && unix_recvq_full(other))) { to determine if sk and other are in an n:1 association (either established via connect or by using sendto to send messages to an unrelated socket identified by address). This isn't correct as the specified address could have been bound to the sending socket itself or because this socket could have been connected to itself by the time of the unix_peer_get but disconnected before the unix_state_lock(other). In both cases, the if-block would be entered despite other == sk which might either block the sender unintentionally or lead to trying to unlock the same spin lock twice for a non-blocking send. Add a other != sk check to guard against this. Fixes: 7d267278a9ec ("unix: avoid use-after-free in ep_remove_wait_queue") Reported-By: Philipp Hahn <pmhahn@pmhahn.de> Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com> Tested-by: Philipp Hahn <pmhahn@pmhahn.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | af_unix: Don't set err in unix_stream_read_generic unless there was an errorRainer Weikusat2016-02-161-6/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The present unix_stream_read_generic contains various code sequences of the form err = -EDISASTER; if (<test>) goto out; This has the unfortunate side effect of possibly causing the error code to bleed through to the final out: return copied ? : err; and then to be wrongly returned if no data was copied because the caller didn't supply a data buffer, as demonstrated by the program available at http://pad.lv/1540731 Change it such that err is only set if an error condition was detected. Fixes: 3822b5c2fc62 ("af_unix: Revert 'lock_interruptible' in stream receive code") Reported-by: Joseph Salisbury <joseph.salisbury@canonical.com> Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | dscc4: Undefined signed int shiftMichael McConville2016-02-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | My analysis in the below mail applies, although the second part is unnecessary because i isn't used in arithmetic operations here: https://marc.info/?l=openbsd-tech&m=145377854103866&w=2 Thanks for your time. Signed-off-by: Michael McConville <mmcco@mykolab.com> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: dsa: mv88e6xxx: do not leave reserved VLANsVivien Didelot2016-02-131-15/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BRIDGE_VLAN_FILTERING automatically adds a newly bridged port to the VLAN with the bridge's default_pvid. The mv88e6xxx driver currently reserves VLANs 4000+ for unbridged ports isolation. When a port joins a bridge, it leaves its reserved VLAN. When a port leaves a bridge, it joins again its reserved VLAN. But if the VLAN filtering is disabled, or if this hardware VLAN is already in use, the bridged port ends up with no default VLAN, and the communication with the CPU is thus broken. To fix this, make a port join its reserved VLAN once on setup, never leave it, and restore its PVID after another one was eventually used. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: dsa: mv88e6xxx: fix software VLAN deletionVivien Didelot2016-02-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current bridge code calls switchdev_port_obj_del on a VLAN port even if the corresponding switchdev_port_obj_add call returned -EOPNOTSUPP. If the DSA driver doesn't return -EOPNOTSUPP for a software port VLAN in its port_vlan_del function, the VLAN is not deleted. Unbridging the port also generates a stack trace for the same reason. This can be quickly tested on a VLAN filtering enabled system with: # brctl addbr br0 # brctl addif br0 lan0 # brctl addbr br1 # brctl addif br1 lan1 # brctl delif br1 lan1 Both bridges have a default default_pvid set to 1. lan0 uses the hardware VLAN 1 while lan1 falls back to the software VLAN 1. Unbridging lan1 does not delete its software VLAN, and thus generates the following stack trace: [ 2991.681705] device lan1 left promiscuous mode [ 2991.686237] br1: port 1(lan1) entered disabled state [ 2991.725094] ------------[ cut here ]------------ [ 2991.729761] WARNING: CPU: 0 PID: 869 at net/bridge/br_vlan.c:314 __vlan_group_free+0x4c/0x50() [ 2991.738437] Modules linked in: [ 2991.741546] CPU: 0 PID: 869 Comm: ip Not tainted 4.4.0 #16 [ 2991.747039] Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree) [ 2991.753511] Backtrace: [ 2991.756008] [<80014450>] (dump_backtrace) from [<8001469c>] (show_stack+0x20/0x24) [ 2991.763604] r6:80512644 r5:00000009 r4:00000000 r3:00000000 [ 2991.769343] [<8001467c>] (show_stack) from [<80268e44>] (dump_stack+0x24/0x28) [ 2991.776618] [<80268e20>] (dump_stack) from [<80025568>] (warn_slowpath_common+0x98/0xc4) [ 2991.784750] [<800254d0>] (warn_slowpath_common) from [<80025650>] (warn_slowpath_null+0x2c/0x34) [ 2991.793557] r8:00000000 r7:9f786a8c r6:9f76c440 r5:9f786a00 r4:9f68ac00 [ 2991.800366] [<80025624>] (warn_slowpath_null) from [<80512644>] (__vlan_group_free+0x4c/0x50) [ 2991.808946] [<805125f8>] (__vlan_group_free) from [<80514488>] (nbp_vlan_flush+0x44/0x68) [ 2991.817147] r4:9f68ac00 r3:9ec70000 [ 2991.820772] [<80514444>] (nbp_vlan_flush) from [<80506f08>] (del_nbp+0xac/0x130) [ 2991.828201] r5:9f56f800 r4:9f786a00 [ 2991.831841] [<80506e5c>] (del_nbp) from [<8050774c>] (br_del_if+0x40/0xbc) [ 2991.838724] r7:80590f68 r6:00000000 r5:9ec71c38 r4:9f76c440 [ 2991.844475] [<8050770c>] (br_del_if) from [<80503dc0>] (br_del_slave+0x1c/0x20) [ 2991.851802] r5:9ec71c38 r4:9f56f800 [ 2991.855428] [<80503da4>] (br_del_slave) from [<80484a34>] (do_setlink+0x324/0x7b8) [ 2991.863043] [<80484710>] (do_setlink) from [<80485e90>] (rtnl_newlink+0x508/0x6f4) [ 2991.870616] r10:00000000 r9:9ec71ba8 r8:00000000 r7:00000000 r6:9f6b0400 r5:9f56f800 [ 2991.878548] r4:8076278c [ 2991.881110] [<80485988>] (rtnl_newlink) from [<80484048>] (rtnetlink_rcv_msg+0x18c/0x22c) [ 2991.889315] r10:9f7d4e40 r9:00000000 r8:00000000 r7:00000000 r6:9f7d4e40 r5:9f6b0400 [ 2991.897250] r4:00000000 [ 2991.899814] [<80483ebc>] (rtnetlink_rcv_msg) from [<80497c74>] (netlink_rcv_skb+0xb0/0xcc) [ 2991.908104] r8:00000000 r7:9f7d4e40 r6:9f7d4e40 r5:80483ebc r4:9f6b0400 [ 2991.914928] [<80497bc4>] (netlink_rcv_skb) from [<80483eb4>] (rtnetlink_rcv+0x34/0x3c) [ 2991.922874] r6:9f5ea000 r5:00000028 r4:9f7d4e40 r3:80483e80 [ 2991.928622] [<80483e80>] (rtnetlink_rcv) from [<80497604>] (netlink_unicast+0x180/0x200) [ 2991.936742] r4:9f4edc00 r3:80483e80 [ 2991.940362] [<80497484>] (netlink_unicast) from [<80497a88>] (netlink_sendmsg+0x33c/0x350) [ 2991.948648] r8:00000000 r7:00000028 r6:00000000 r5:9f5ea000 r4:9ec71f4c [ 2991.955481] [<8049774c>] (netlink_sendmsg) from [<80457ff0>] (sock_sendmsg+0x24/0x34) [ 2991.963342] r10:00000000 r9:9ec71e28 r8:00000000 r7:9f1e2140 r6:00000000 r5:00000000 [ 2991.971276] r4:9ec71f4c [ 2991.973849] [<80457fcc>] (sock_sendmsg) from [<80458af0>] (___sys_sendmsg+0x1fc/0x204) [ 2991.981809] [<804588f4>] (___sys_sendmsg) from [<804598d0>] (__sys_sendmsg+0x4c/0x7c) [ 2991.989640] r10:00000000 r9:9ec70000 r8:80010824 r7:00000128 r6:7ee946c4 r5:00000000 [ 2991.997572] r4:9f1e2140 [ 2992.000128] [<80459884>] (__sys_sendmsg) from [<80459918>] (SyS_sendmsg+0x18/0x1c) [ 2992.007725] r6:00000000 r5:7ee9c7b8 r4:7ee946e0 [ 2992.012430] [<80459900>] (SyS_sendmsg) from [<80010660>] (ret_fast_syscall+0x0/0x3c) [ 2992.020182] ---[ end trace 5d4bc29f4da04280 ]--- To fix this, return -EOPNOTSUPP in _mv88e6xxx_port_vlan_del instead of -ENOENT if the hardware VLAN doesn't exist or the port is not a member. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Tested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: cavium: liquidio: fix check for in progress flagColin Ian King2016-02-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | smatch detected a suspicious looking bitop condition: drivers/net/ethernet/cavium/liquidio/lio_main.c:2529 handle_timestamp() warn: suspicious bitop condition (skb_shinfo(skb)->tx_flags | SKBTX_IN_PROGRESS is always non-zero, so the logic is definitely not correct. Use & to mask the correct bit. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | hv_netvsc: Restore needed_headroom requestVitaly Kuznetsov2016-02-131-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit c0eb454034aa ("hv_netvsc: Don't ask for additional head room in the skb") got rid of needed_headroom setting for the driver. With the change I hit the following issue trying to use ptkgen module: [ 57.522021] kernel BUG at net/core/skbuff.c:1128! [ 57.522021] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC ... [ 58.721068] Call Trace: [ 58.721068] [<ffffffffa0144e86>] netvsc_start_xmit+0x4c6/0x8e0 [hv_netvsc] ... [ 58.721068] [<ffffffffa02f87fc>] ? pktgen_finalize_skb+0x25c/0x2a0 [pktgen] [ 58.721068] [<ffffffff814f5760>] ? __netdev_alloc_skb+0xc0/0x100 [ 58.721068] [<ffffffffa02f9907>] pktgen_thread_worker+0x257/0x1920 [pktgen] Basically, we're calling skb_cow_head(skb, RNDIS_AND_PPI_SIZE) and crash on if (skb_shared(skb)) BUG(); We probably need to restore needed_headroom setting (but shrunk to RNDIS_AND_PPI_SIZE as we don't need more) to request the required headroom space. In theory, it should not give us performance penalty. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'mvneta-fixes'David S. Miller2016-02-131-83/+101
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Gregory CLEMENT says: ==================== mvneta fixes for SMP Following this bug report: http://thread.gmane.org/gmane.linux.ports.arm.kernel/468173 and the suggestions from Russell King, I reviewed all the code involving multi-CPU. It ended with this series of patches which should improve the stability of the driver. During my test I found another bug which is fixed by new patch (the second one of this new version of the series) The two first patches fix real bugs, the others fix potential issues in the driver. Changelog: v1 -> v2 Fix spinlock comment. Pointed by David Miller v2 -> v3 - Fix typos and mistake in the comments. Pointed by Sergei Shtylyov - Add a new patch fixing the CPU choice in mvneta_percpu_elect - Use lock in last patch to prevent remaining race condition. Pointed by Jisheng ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: mvneta: Fix race condition during stoppingGregory CLEMENT2016-02-131-8/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When stopping the port, the CPU notifier are still there whereas the mvneta_stop_dev function calls mvneta_percpu_disable() on each CPUs. It was possible to have a new CPU coming at this point which could be racy. This patch adds a flag preventing executing the code notifier for a new CPU when the port is stopping. It also uses the spinlock introduces previously. To avoid the deadlock, the lock has been moved outside the mvneta_percpu_elect function. Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: mvneta: The mvneta_percpu_elect function should be atomicGregory CLEMENT2016-02-131-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Electing a CPU must be done in an atomic way: it should be done after or before the removal/insertion of a CPU and this function is not reentrant. During the loop of mvneta_percpu_elect we associates the queues to the CPUs, if there is a topology change during this loop, then the mapping between the CPUs and the queues could be wrong. During this loop the interrupt mask is also updating for each CPUs, It should not be changed in the same time by other part of the driver. This patch adds spinlock to create the needed critical sections. Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: mvneta: Modify the queue related fields from each cpuGregory CLEMENT2016-02-131-54/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the MVNETA_INTR_* registers, the queues related fields are per cpu, according to the datasheet (comment in [] are added by me): "In a multi-CPU system, bits of RX[or TX] queues for which the access by the reading[or writing] CPU is disabled are read as 0, and cannot be cleared[or written]." That means that each time we want to manipulate these bits we had to do it on each cpu and not only on the current cpu. Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: mvneta: Remove unused codeGregory CLEMENT2016-02-131-8/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | Since the commit 2dcf75e2793c ("net: mvneta: Associate RX queues with each CPU") all the percpu irq are used and disabled at initialization, so there is no point to disable them first. Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: mvneta: Use on_each_cpu when possibleGregory CLEMENT2016-02-131-11/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using a for_each_* loop in which we just call the smp_call_function_single macro, it is more simple to directly use the on_each_cpu macro. Moreover, this macro ensures that the calls will be done all at once. Suggested-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: mvneta: Fix the CPU choice in mvneta_percpu_electGregory CLEMENT2016-02-131-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When passing to the management of multiple RX queue, the mvneta_percpu_elect function was broken. The use of the modulo can lead to elect the wrong cpu. For example with rxq_def=2, if the CPU 2 goes offline and then online, we ended with the third RX queue activated in the same time on CPU 0 and CPU2, which lead to a kernel crash. With this fix, we don't try to get "the closer" CPU if the default CPU is gone, now we just use CPU 0 which always be there. Thanks to this, the code becomes more readable, easier to maintain and more predicable. Cc: stable@vger.kernel.org Fixes: 2dcf75e2793c ("net: mvneta: Associate RX queues with each CPU") Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: mvneta: Fix for_each_present_cpu usageGregory CLEMENT2016-02-131-5/+3
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch convert the for_each_present in on_each_cpu, instead of applying on the present cpus it will be applied only on the online cpus. This fix a bug reported on http://thread.gmane.org/gmane.linux.ports.arm.kernel/468173. Using the macro on_each_cpu (instead of a for_each_* loop) also ensures that all the calls will be done all at once. Fixes: f86428854480 ("net: mvneta: Statically assign queues to CPUs") Reported-by: Stefan Roese <stefan.roese@gmail.com> Suggested-by: Jisheng Zhang <jszhang@marvell.com> Suggested-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vsock: Fix blocking ops call in prepare_to_waitLaura Abbott2016-02-131-13/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We receoved a bug report from someone using vmware: WARNING: CPU: 3 PID: 660 at kernel/sched/core.c:7389 __might_sleep+0x7d/0x90() do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff810fa68d>] prepare_to_wait+0x2d/0x90 Modules linked in: vmw_vsock_vmci_transport vsock snd_seq_midi snd_seq_midi_event snd_ens1371 iosf_mbi gameport snd_rawmidi snd_ac97_codec ac97_bus snd_seq coretemp snd_seq_device snd_pcm snd_timer snd soundcore ppdev crct10dif_pclmul crc32_pclmul ghash_clmulni_intel vmw_vmci vmw_balloon i2c_piix4 shpchp parport_pc parport acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc btrfs xor raid6_pq 8021q garp stp llc mrp crc32c_intel serio_raw mptspi vmwgfx drm_kms_helper ttm drm scsi_transport_spi mptscsih e1000 ata_generic mptbase pata_acpi CPU: 3 PID: 660 Comm: vmtoolsd Not tainted 4.2.0-0.rc1.git3.1.fc23.x86_64 #1 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014 0000000000000000 0000000049e617f3 ffff88006ac37ac8 ffffffff818641f5 0000000000000000 ffff88006ac37b20 ffff88006ac37b08 ffffffff810ab446 ffff880068009f40 ffffffff81c63bc0 0000000000000061 0000000000000000 Call Trace: [<ffffffff818641f5>] dump_stack+0x4c/0x65 [<ffffffff810ab446>] warn_slowpath_common+0x86/0xc0 [<ffffffff810ab4d5>] warn_slowpath_fmt+0x55/0x70 [<ffffffff8112551d>] ? debug_lockdep_rcu_enabled+0x1d/0x20 [<ffffffff810fa68d>] ? prepare_to_wait+0x2d/0x90 [<ffffffff810fa68d>] ? prepare_to_wait+0x2d/0x90 [<ffffffff810da2bd>] __might_sleep+0x7d/0x90 [<ffffffff812163b3>] __might_fault+0x43/0xa0 [<ffffffff81430477>] copy_from_iter+0x87/0x2a0 [<ffffffffa039460a>] __qp_memcpy_to_queue+0x9a/0x1b0 [vmw_vmci] [<ffffffffa0394740>] ? qp_memcpy_to_queue+0x20/0x20 [vmw_vmci] [<ffffffffa0394757>] qp_memcpy_to_queue_iov+0x17/0x20 [vmw_vmci] [<ffffffffa0394d50>] qp_enqueue_locked+0xa0/0x140 [vmw_vmci] [<ffffffffa039593f>] vmci_qpair_enquev+0x4f/0xd0 [vmw_vmci] [<ffffffffa04847bb>] vmci_transport_stream_enqueue+0x1b/0x20 [vmw_vsock_vmci_transport] [<ffffffffa047ae05>] vsock_stream_sendmsg+0x2c5/0x320 [vsock] [<ffffffff810fabd0>] ? wake_atomic_t_function+0x70/0x70 [<ffffffff81702af8>] sock_sendmsg+0x38/0x50 [<ffffffff81702ff4>] SYSC_sendto+0x104/0x190 [<ffffffff8126e25a>] ? vfs_read+0x8a/0x140 [<ffffffff817042ee>] SyS_sendto+0xe/0x10 [<ffffffff8186d9ae>] entry_SYSCALL_64_fastpath+0x12/0x76 transport->stream_enqueue may call copy_to_user so it should not be called inside a prepare_to_wait. Narrow the scope of the prepare_to_wait to avoid the bad call. This also applies to vsock_stream_recvmsg as well. Reported-by: Vinson Lee <vlee@freedesktop.org> Tested-by: Vinson Lee <vlee@freedesktop.org> Signed-off-by: Laura Abbott <labbott@fedoraproject.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | r8169:fix system hange problem.Chun-Hao Lin2016-02-131-7/+7
| | | | | | | | | | | | | | | | There are typos in setting RTL8168H hardware parameters. If system install another version driver that may cuase system hang. Signed-off-by: Chunhao Lin <hau@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ipv4: fix memory leaks in ip_cmsg_send() callersEric Dumazet2016-02-134-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | Dmitry reported memory leaks of IP options allocated in ip_cmsg_send() when/if this function returns an error. Callers are responsible for the freeing. Many thanks to Dmitry for the report and diagnostic. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: mvpp2: Return correct error codesAmitoj Kaur Chawla2016-02-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The return value of kzalloc on failure of allocation of memory should be -ENOMEM and not -1. Found using Coccinelle. A simplified version of the semantic patch used is: //<smpl> @@ expression *e; position p,q; @@ e@q = kzalloc(...); if@p (e == NULL) { ... return - -1 + -ENOMEM ; } //</smpl> This function may also return -1 after calling mpp2_prs_tcam_port_map_get. So that the function consistently returns meaningful error values on failure, the -1 is changed to -EINVAL. Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: cavium: liquidio: Return correct error codeAmitoj Kaur Chawla2016-02-132-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The return value of vmalloc on failure of allocation of memory should be -ENOMEM and not -1. Found using Coccinelle. A simplified version of the semantic patch used is: //<smpl> @@ expression *e; identifier l1; position p,q; @@ e@q = vmalloc(...); if@p (e == NULL) { ... goto l1; } l1: ... return -1 + -ENOMEM ; //</smpl The single call site of the containing function checks whether the returned value is -1, so this check is changed as well. The single call site of this call site, however, only checks whether the value is not 0, so no further change was required. Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: Fix ARP monitor validationJay Vosburgh2016-02-131-11/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current logic in bond_arp_rcv will accept an incoming ARP for validation if (a) the receiving slave is either "active" (which includes the currently active slave, or the current ARP slave) or, (b) there is a currently active slave, and it has received an ARP since it became active. For case (b), the receiving slave isn't the currently active slave, and is receiving the original broadcast ARP request, not an ARP reply from the target. This logic can fail if there is no currently active slave. In this situation, the ARP probe logic cycles through all slaves, assigning each in turn as the "current_arp_slave" for one arp_interval, then setting that one as "active," and sending an ARP probe from that slave. The current logic expects the ARP reply to arrive on the sending current_arp_slave, however, due to switch FDB updating delays, the reply may be directed to another slave. This can arise if the bonding slaves and switch are working, but the ARP target is not responding. When the ARP target recovers, a condition may result wherein the ARP target host replies faster than the switch can update its forwarding table, causing each ARP reply to be sent to the previous current_arp_slave. This will never pass the logic in bond_arp_rcv, as neither of the above conditions (a) or (b) are met. Some experimentation on a LAN shows ARP reply round trips in the 200 usec range, but my available switches never update their FDB in less than 4000 usec. This patch changes the logic in bond_arp_rcv to additionally accept an ARP reply for validation on any slave if there is a current ARP slave and it sent an ARP probe during the previous arp_interval. Fixes: aeea64ac717a ("bonding: don't trust arp requests unless active slave really works") Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge tag 'gpio-v4.5-2' of ↵Linus Torvalds2016-02-112-5/+7
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: - Probe errorpath fix for the Altera - irqchip ofnode pointer added to the DaVinci driver - controller instance number correction for DaVinci * tag 'gpio-v4.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: davinci: Fix the number of controllers allocated gpio: davinci: Add the missing of-node pointer gpio: gpio-altera: Remove gpiochip on probe failure.
| * | gpio: davinci: Fix the number of controllers allocatedLokesh Vutla2016-02-101-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Driver only needs to allocate for [ngpio / 32] controllers, as each controller handles 32 gpios. But the current driver allocates for ngpio of which the extra allocated are unused. Fix it be registering only the required number of controllers. Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com> Signed-off-by: Keerthy <j-keerthy@ti.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| * | gpio: davinci: Add the missing of-node pointerKeerthy2016-02-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the first parameter of irq_domain_add_legacy is NULL. irq_find_host function returns NULL when we do not populate the of_node and hence irq_of_parse_and_map call fails whenever we want to request a gpio irq. This fixes the request_irq failures for gpio interrupts. Signed-off-by: Keerthy <j-keerthy@ti.com> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
| * | gpio: gpio-altera: Remove gpiochip on probe failure.Phil Reid2016-01-271-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On failure to setup the irq altera_gpio_probe would return an error but not go to cleanup. This resulted in kernel fault "Unable to handle kernel paging request at virtual address xxxxxxxx" later on in of_gpiochip_find_and_xlate. Signed-off-by: Phil Reid <preid@electromag.com.au> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
* | | Merge tag 'platform-drivers-x86-v4.5-3' of ↵Linus Torvalds2016-02-112-3/+2
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.infradead.org/users/dvhart/linux-platform-drivers-x86 Pull x86 platform driver fixes from Darren Hart: "Just two small fixes for the 4.5-rc cycle: intel_scu_ipcutil: - underflow in scu_reg_access() intel-hid: - fix incorrect entries in intel_hid_keymap" * tag 'platform-drivers-x86-v4.5-3' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86: intel_scu_ipcutil: underflow in scu_reg_access() intel-hid: fix incorrect entries in intel_hid_keymap
| * | | intel_scu_ipcutil: underflow in scu_reg_access()Dan Carpenter2016-01-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "count" is controlled by the user and it can be negative. Let's prevent that by making it unsigned. You have to have CAP_SYS_RAWIO to call this function so the bug is not as serious as it could be. Fixes: 5369c02d951a ('intel_scu_ipc: Utility driver for intel scu ipc') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Darren Hart <dvhart@linux.intel.com>
| * | | intel-hid: fix incorrect entries in intel_hid_keymapAlex Hung2016-01-301-2/+1
| |/ / | | | | | | | | | | | | | | | | | | | | | intel_hid_keymap contains a duplicate entry for KEY_HOME and an incorrect HID index for KEY_PAGEDOWN Reported-by: Pavel Bludov <pbludov@gmail.com> Signed-off-by: Alex Hung <alex.hung@canonical.com>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2016-02-1130-66/+191
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) Fix BPF handling of branch offset adjustmnets on backjumps, from Daniel Borkmann. 2) Make sure selinux knows about SOCK_DESTROY netlink messages, from Lorenzo Colitti. 3) Fix openvswitch tunnel mtu regression, from David Wragg. 4) Fix ICMP handling of TCP sockets in syn_recv state, from Eric Dumazet. 5) Fix SCTP user hmacid byte ordering bug, from Xin Long. 6) Fix recursive locking in ipv6 addrconf, from Subash Abhinov Kasiviswanathan. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: bpf: fix branch offset adjustment on backjumps after patching ctx expansion vxlan, gre, geneve: Set a large MTU on ovs-created tunnel devices geneve: Relax MTU constraints vxlan: Relax MTU constraints flow_dissector: Fix unaligned access in __skb_flow_dissector when used by eth_get_headlen of: of_mdio: Add marvell, 88e1145 to whitelist of PHY compatibilities. selinux: nlmsgtab: add SOCK_DESTROY to the netlink mapping tables sctp: translate network order to host order when users get a hmacid enic: increment devcmd2 result ring in case of timeout tg3: Fix for tg3 transmit queue 0 timed out when too many gso_segs net:Add sysctl_max_skb_frags tcp: do not drop syn_recv on all icmp reports ipv6: fix a lockdep splat unix: correctly track in-flight fds in sending process user_struct update be2net maintainers' email addresses dwc_eth_qos: Reset hardware before PHY start ipv6: addrconf: Fix recursive spin lock call
| * | | bpf: fix branch offset adjustment on backjumps after patching ctx expansionDaniel Borkmann2016-02-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When ctx access is used, the kernel often needs to expand/rewrite instructions, so after that patching, branch offsets have to be adjusted for both forward and backward jumps in the new eBPF program, but for backward jumps it fails to account the delta. Meaning, for example, if the expansion happens exactly on the insn that sits at the jump target, it doesn't fix up the back jump offset. Analysis on what the check in adjust_branches() is currently doing: /* adjust offset of jmps if necessary */ if (i < pos && i + insn->off + 1 > pos) insn->off += delta; else if (i > pos && i + insn->off + 1 < pos) insn->off -= delta; First condition (forward jumps): Before: After: insns[0] insns[0] insns[1] <--- i/insn insns[1] <--- i/insn insns[2] <--- pos insns[P] <--- pos insns[3] insns[P] `------| delta insns[4] <--- target_X insns[P] `-----| insns[5] insns[3] insns[4] <--- target_X insns[5] First case is if we cross pos-boundary and the jump instruction was before pos. This is handeled correctly. I.e. if i == pos, then this would mean our jump that we currently check was the patchlet itself that we just injected. Since such patchlets are self-contained and have no awareness of any insns before or after the patched one, the delta is correctly not adjusted. Also, for the second condition in case of i + insn->off + 1 == pos, means we jump to that newly patched instruction, so no offset adjustment are needed. That part is correct. Second condition (backward jumps): Before: After: insns[0] insns[0] insns[1] <--- target_X insns[1] <--- target_X insns[2] <--- pos <-- target_Y insns[P] <--- pos <-- target_Y insns[3] insns[P] `------| delta insns[4] <--- i/insn insns[P] `-----| insns[5] insns[3] insns[4] <--- i/insn insns[5] Second interesting case is where we cross pos-boundary and the jump instruction was after pos. Backward jump with i == pos would be impossible and pose a bug somewhere in the patchlet, so the first condition checking i > pos is okay only by itself. However, i + insn->off + 1 < pos does not always work as intended to trigger the adjustment. It works when jump targets would be far off where the delta wouldn't matter. But, for example, where the fixed insn->off before pointed to pos (target_Y), it now points to pos + delta, so that additional room needs to be taken into account for the check. This means that i) both tests here need to be adjusted into pos + delta, and ii) for the second condition, the test needs to be <= as pos itself can be a target in the backjump, too. Fixes: 9bac3d6d548e ("bpf: allow extended BPF programs access skb fields") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Merge branch 'ovs-tunnel-mtu'David S. Miller2016-02-106-22/+87
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | David Wragg says: ==================== Set a large MTU on ovs-created tunnel devices Prior to 4.3, openvswitch tunnel vports (vxlan, gre and geneve) could transmit vxlan packets of any size, constrained only by the ability to send out the resulting packets. 4.3 introduced netdevs corresponding to tunnel vports. These netdevs have an MTU, which limits the size of a packet that can be successfully encapsulated. The default MTU values are low (1500 or less), which is awkwardly small in the context of physical networks supporting jumbo frames, and leads to a conspicuous change in behaviour for userspace. This patch series sets the MTU on openvswitch-created netdevs to be the relevant maximum (i.e. the maximum IP packet size minus any relevant overhead), effectively restoring the behaviour prior to 4.3. Where relevant, the limits on MTU values that can be directly set on the netdevs are also relaxed. Changes in v2: * Extend to all openvswitch tunnel types, i.e. gre and geneve as well * Use IP_MAX_MTU Changes in v3: * Fix block comment style ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | vxlan, gre, geneve: Set a large MTU on ovs-created tunnel devicesDavid Wragg2016-02-106-10/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prior to 4.3, openvswitch tunnel vports (vxlan, gre and geneve) could transmit vxlan packets of any size, constrained only by the ability to send out the resulting packets. 4.3 introduced netdevs corresponding to tunnel vports. These netdevs have an MTU, which limits the size of a packet that can be successfully encapsulated. The default MTU values are low (1500 or less), which is awkwardly small in the context of physical networks supporting jumbo frames, and leads to a conspicuous change in behaviour for userspace. Instead, set the MTU on openvswitch-created netdevs to be the relevant maximum (i.e. the maximum IP packet size minus any relevant overhead), effectively restoring the behaviour prior to 4.3. Signed-off-by: David Wragg <david@weave.works> Signed-off-by: David S. Miller <davem@davemloft.net>