linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()	Guillaume Nault	2016-11-20	2	-4/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Lock socket before checking the SOCK_ZAPPED flag in l2tp_ip6_bind(). Without lock, a concurrent call could modify the socket flags between the sock_flag(sk, SOCK_ZAPPED) test and the lock_sock() call. This way, a socket could be inserted twice in l2tp_ip6_bind_table. Releasing it would then leave a stale pointer there, generating use-after-free errors when walking through the list or modifying adjacent entries. BUG: KASAN: use-after-free in l2tp_ip6_close+0x22e/0x290 at addr ffff8800081b0ed8 Write of size 8 by task syz-executor/10987 CPU: 0 PID: 10987 Comm: syz-executor Not tainted 4.8.0+ #39 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014 ffff880031d97838 ffffffff829f835b ffff88001b5a1640 ffff8800081b0ec0 ffff8800081b15a0 ffff8800081b6d20 ffff880031d97860 ffffffff8174d3cc ffff880031d978f0 ffff8800081b0e80 ffff88001b5a1640 ffff880031d978e0 Call Trace: [<ffffffff829f835b>] dump_stack+0xb3/0x118 lib/dump_stack.c:15 [<ffffffff8174d3cc>] kasan_object_err+0x1c/0x70 mm/kasan/report.c:156 [< inline >] print_address_description mm/kasan/report.c:194 [<ffffffff8174d666>] kasan_report_error+0x1f6/0x4d0 mm/kasan/report.c:283 [< inline >] kasan_report mm/kasan/report.c:303 [<ffffffff8174db7e>] __asan_report_store8_noabort+0x3e/0x40 mm/kasan/report.c:329 [< inline >] __write_once_size ./include/linux/compiler.h:249 [< inline >] __hlist_del ./include/linux/list.h:622 [< inline >] hlist_del_init ./include/linux/list.h:637 [<ffffffff8579047e>] l2tp_ip6_close+0x22e/0x290 net/l2tp/l2tp_ip6.c:239 [<ffffffff850b2dfd>] inet_release+0xed/0x1c0 net/ipv4/af_inet.c:415 [<ffffffff851dc5a0>] inet6_release+0x50/0x70 net/ipv6/af_inet6.c:422 [<ffffffff84c4581d>] sock_release+0x8d/0x1d0 net/socket.c:570 [<ffffffff84c45976>] sock_close+0x16/0x20 net/socket.c:1017 [<ffffffff817a108c>] __fput+0x28c/0x780 fs/file_table.c:208 [<ffffffff817a1605>] ____fput+0x15/0x20 fs/file_table.c:244 [<ffffffff813774f9>] task_work_run+0xf9/0x170 [<ffffffff81324aae>] do_exit+0x85e/0x2a00 [<ffffffff81326dc8>] do_group_exit+0x108/0x330 [<ffffffff81348cf7>] get_signal+0x617/0x17a0 kernel/signal.c:2307 [<ffffffff811b49af>] do_signal+0x7f/0x18f0 [<ffffffff810039bf>] exit_to_usermode_loop+0xbf/0x150 arch/x86/entry/common.c:156 [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190 [<ffffffff81006060>] syscall_return_slowpath+0x1a0/0x1e0 arch/x86/entry/common.c:259 [<ffffffff85e4d726>] entry_SYSCALL_64_fastpath+0xc4/0xc6 Object at ffff8800081b0ec0, in cache L2TP/IPv6 size: 1448 Allocated: PID = 10987 [ 1116.897025] [<ffffffff811ddcb6>] save_stack_trace+0x16/0x20 [ 1116.897025] [<ffffffff8174c736>] save_stack+0x46/0xd0 [ 1116.897025] [<ffffffff8174c9ad>] kasan_kmalloc+0xad/0xe0 [ 1116.897025] [<ffffffff8174cee2>] kasan_slab_alloc+0x12/0x20 [ 1116.897025] [< inline >] slab_post_alloc_hook mm/slab.h:417 [ 1116.897025] [< inline >] slab_alloc_node mm/slub.c:2708 [ 1116.897025] [< inline >] slab_alloc mm/slub.c:2716 [ 1116.897025] [<ffffffff817476a8>] kmem_cache_alloc+0xc8/0x2b0 mm/slub.c:2721 [ 1116.897025] [<ffffffff84c4f6a9>] sk_prot_alloc+0x69/0x2b0 net/core/sock.c:1326 [ 1116.897025] [<ffffffff84c58ac8>] sk_alloc+0x38/0xae0 net/core/sock.c:1388 [ 1116.897025] [<ffffffff851ddf67>] inet6_create+0x2d7/0x1000 net/ipv6/af_inet6.c:182 [ 1116.897025] [<ffffffff84c4af7b>] __sock_create+0x37b/0x640 net/socket.c:1153 [ 1116.897025] [< inline >] sock_create net/socket.c:1193 [ 1116.897025] [< inline >] SYSC_socket net/socket.c:1223 [ 1116.897025] [<ffffffff84c4b46f>] SyS_socket+0xef/0x1b0 net/socket.c:1203 [ 1116.897025] [<ffffffff85e4d685>] entry_SYSCALL_64_fastpath+0x23/0xc6 Freed: PID = 10987 [ 1116.897025] [<ffffffff811ddcb6>] save_stack_trace+0x16/0x20 [ 1116.897025] [<ffffffff8174c736>] save_stack+0x46/0xd0 [ 1116.897025] [<ffffffff8174cf61>] kasan_slab_free+0x71/0xb0 [ 1116.897025] [< inline >] slab_free_hook mm/slub.c:1352 [ 1116.897025] [< inline >] slab_free_freelist_hook mm/slub.c:1374 [ 1116.897025] [< inline >] slab_free mm/slub.c:2951 [ 1116.897025] [<ffffffff81748b28>] kmem_cache_free+0xc8/0x330 mm/slub.c:2973 [ 1116.897025] [< inline >] sk_prot_free net/core/sock.c:1369 [ 1116.897025] [<ffffffff84c541eb>] __sk_destruct+0x32b/0x4f0 net/core/sock.c:1444 [ 1116.897025] [<ffffffff84c5aca4>] sk_destruct+0x44/0x80 net/core/sock.c:1452 [ 1116.897025] [<ffffffff84c5ad33>] __sk_free+0x53/0x220 net/core/sock.c:1460 [ 1116.897025] [<ffffffff84c5af23>] sk_free+0x23/0x30 net/core/sock.c:1471 [ 1116.897025] [<ffffffff84c5cb6c>] sk_common_release+0x28c/0x3e0 ./include/net/sock.h:1589 [ 1116.897025] [<ffffffff8579044e>] l2tp_ip6_close+0x1fe/0x290 net/l2tp/l2tp_ip6.c:243 [ 1116.897025] [<ffffffff850b2dfd>] inet_release+0xed/0x1c0 net/ipv4/af_inet.c:415 [ 1116.897025] [<ffffffff851dc5a0>] inet6_release+0x50/0x70 net/ipv6/af_inet6.c:422 [ 1116.897025] [<ffffffff84c4581d>] sock_release+0x8d/0x1d0 net/socket.c:570 [ 1116.897025] [<ffffffff84c45976>] sock_close+0x16/0x20 net/socket.c:1017 [ 1116.897025] [<ffffffff817a108c>] __fput+0x28c/0x780 fs/file_table.c:208 [ 1116.897025] [<ffffffff817a1605>] ____fput+0x15/0x20 fs/file_table.c:244 [ 1116.897025] [<ffffffff813774f9>] task_work_run+0xf9/0x170 [ 1116.897025] [<ffffffff81324aae>] do_exit+0x85e/0x2a00 [ 1116.897025] [<ffffffff81326dc8>] do_group_exit+0x108/0x330 [ 1116.897025] [<ffffffff81348cf7>] get_signal+0x617/0x17a0 kernel/signal.c:2307 [ 1116.897025] [<ffffffff811b49af>] do_signal+0x7f/0x18f0 [ 1116.897025] [<ffffffff810039bf>] exit_to_usermode_loop+0xbf/0x150 arch/x86/entry/common.c:156 [ 1116.897025] [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190 [ 1116.897025] [<ffffffff81006060>] syscall_return_slowpath+0x1a0/0x1e0 arch/x86/entry/common.c:259 [ 1116.897025] [<ffffffff85e4d726>] entry_SYSCALL_64_fastpath+0xc4/0xc6 Memory state around the buggy address: ffff8800081b0d80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8800081b0e00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff8800081b0e80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb ^ ffff8800081b0f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8800081b0f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== The same issue exists with l2tp_ip_bind() and l2tp_ip_bind_table. Fixes: c51ce49735c1 ("l2tp: fix oops in L2TP IP sockets for connect() AF_UNSPEC case") Reported-by: Baozeng Ding <sploving1@gmail.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Baozeng Ding <sploving1@gmail.com> Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge tag 'batadv-net-for-davem-20161119' of git://git.open-mesh.org/linux-merge	David S. Miller	2016-11-19	2	-0/+2
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Simon Wunderlich says: ==================== Here are two batman-adv bugfix patches: - Revert a splat on disabling interface which created another problem, by Sven Eckelmann - Fix error handling when the primary interface disappears during a throughput meter test, by Sven Eckelmann ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	batman-adv: Detect missing primaryif during tp_send as error	Sven Eckelmann	2016-11-04	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The throughput meter detects different situations as problems for the current test. It stops the test after these and reports it to userspace. This also has to be done when the primary interface disappeared during the test. Fixes: 33a3bb4a3345 ("batman-adv: throughput meter implementation") Reported-by: Joe Perches <joe@perches.com> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
\| *	batman-adv: Revert "fix splat on disabling an interface"	Sven Eckelmann	2016-11-04	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The commit 9799c50372b2 ("batman-adv: fix splat on disabling an interface") fixed a warning but at the same time broke the rtnl function add_slave for devices which were temporarily removed. batadv_softif_slave_add requires soft_iface of and hard_iface to be NULL before it is allowed to be enslaved. But this resetting of soft_iface to NULL in batadv_hardif_disable_interface was removed with the aforementioned commit. Reported-by: Julian Labus <julian@freifunk-rtk.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Acked-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
* \|	net: macb: add check for dma mapping error in start_xmit()	Alexey Khoroshilov	2016-11-19	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	at91ether_start_xmit() does not check for dma mapping errors. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	rtnetlink: fix FDB size computation	Sabrina Dubroca	2016-11-18	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add missing NDA_VLAN attribute's size. Fixes: 1e53d5bb8878 ("net: Pass VLAN ID to rtnl_fdb_notify.") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	netns: fix get_net_ns_by_fd(int pid) typo	Stefan Hajnoczi	2016-11-18	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The argument to get_net_ns_by_fd() is a /proc/$PID/ns/net file descriptor not a pid. Fix the typo. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Rami Rosen <roszenrami@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	Merge tag 'mac80211-for-davem-2016-11-18' of ↵	David S. Miller	2016-11-18	7	-7/+100
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== A few more bugfixes: * limit # of scan results stored in memory - this is a long-standing bug Jouni and I only noticed while discussing other things in Santa Fe * revert AP_LINK_PS patch that was causing issues (Felix) * various A-MSDU/A-MPDU fixes for TXQ code (Felix) * interoperability workaround for peers with broken VHT capabilities (Filip Matusiak) * add bitrate definition for a VHT MCS that's supposed to be invalid but gets used by some hardware anyway (Thomas Pedersen) * beacon timer fix in hwsim (Benjamin Beichler) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	cfg80211: limit scan results cache size	Johannes Berg	2016-11-18	2	-0/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's possible to make scanning consume almost arbitrary amounts of memory, e.g. by sending beacon frames with random BSSIDs at high rates while somebody is scanning. Limit the number of BSS table entries we're willing to cache to 1000, limiting maximum memory usage to maybe 4-5MB, but lower in practice - that would be the case for having both full-sized beacon and probe response frames for each entry; this seems not possible in practice, so a limit of 1000 entries will likely be closer to 0.5 MB. Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
\| * \|	mac80211_hwsim: fix beacon delta calculation	Benjamin Beichler	2016-11-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Due to the cast from uint32_t to int64_t, a wrong next beacon timing is calculated and effectively the beacon timer stops working. This is especially bad for 802.11s mesh networks, because discovery breaks without beacons. Signed-off-by: Benjamin Beichler <benjamin.beichler@uni-rostock.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
\| * \|	mac80211: fix A-MSDU aggregation with fast-xmit + txq	Felix Fietkau	2016-11-15	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A-MSDU aggregation alters the QoS header after a frame has been enqueued, so it needs to be ready before enqueue and not overwritten again afterwards Fixes: bb42f2d13ffc ("mac80211: Move reorder-sensitive TX handlers to after TXQ dequeue") Signed-off-by: Felix Fietkau <nbd@nbd.name> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
\| * \|	mac80211: remove bogus skb vif assignment	Felix Fietkau	2016-11-15	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The call to ieee80211_txq_enqueue overwrites the vif pointer with the codel enqueue time, so setting it just before that call makes no sense. Signed-off-by: Felix Fietkau <nbd@nbd.name> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
\| * \|	mac80211: update A-MPDU flag on tx dequeue	Felix Fietkau	2016-11-15	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The sequence number counter is used to derive the starting sequence number. Since that counter is updated on tx dequeue, the A-MPDU flag needs to be up to date at the tme of dequeue as well. This patch prevents sending more A-MPDU frames after the session has been terminated and also ensures that aggregation starts right after the session has been established Fixes: bb42f2d13ffc ("mac80211: Move reorder-sensitive TX handlers to after TXQ dequeue") Signed-off-by: Felix Fietkau <nbd@nbd.name> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
\| * \|	cfg80211: add bitrate for 20MHz MCS 9	Pedersen, Thomas	2016-11-15	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some drivers (ath10k) report MCS 9 @ 20MHz, which technically isn't defined. To get more meaningful value than 0 out of this however, just extrapolate a bitrate from ratio of MCS 7 and 9 in channels where it is allowed. Signed-off-by: Thomas Pedersen <twp@qca.qualcomm.com> [add a comment about it in the code] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
\| * \|	Revert "mac80211: allow using AP_LINK_PS with mac80211-generated TIM IE"	Felix Fietkau	2016-11-15	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit c68df2e7be0c1238ea3c281fd744a204ef3b15a0. __sta_info_recalc_tim turns into a no-op if local->ops->set_tim is not set. This prevents the beacon TIM bit from being set for all drivers that do not implement this op (almost all of them), thus thoroughly essential AP mode powersave functionality. Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Fixes: c68df2e7be0c ("mac80211: allow using AP_LINK_PS with mac80211-generated TIM IE") Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
\| * \|	mac80211: Ignore VHT IE from peer with wrong rx_mcs_map	Filip Matusiak	2016-11-15	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a workaround for VHT-enabled STAs which break the spec and have the VHT-MCS Rx map filled in with value 3 for all eight spacial streams, an example is AR9462 in AP mode. As per spec, in section 22.1.1 Introduction to the VHT PHY A VHT STA shall support at least single spactial stream VHT-MCSs 0 to 7 (transmit and receive) in all supported channel widths. Some devices in STA mode will get firmware assert when trying to associate, examples are QCA9377 & QCA6174. Packet example of broken VHT Cap IE of AR9462: Tag: VHT Capabilities (IEEE Std 802.11ac/D3.1) Tag Number: VHT Capabilities (IEEE Std 802.11ac/D3.1) (191) Tag length: 12 VHT Capabilities Info: 0x00000000 VHT Supported MCS Set Rx MCS Map: 0xffff .... .... .... ..11 = Rx 1 SS: Not Supported (0x0003) .... .... .... 11.. = Rx 2 SS: Not Supported (0x0003) .... .... ..11 .... = Rx 3 SS: Not Supported (0x0003) .... .... 11.. .... = Rx 4 SS: Not Supported (0x0003) .... ..11 .... .... = Rx 5 SS: Not Supported (0x0003) .... 11.. .... .... = Rx 6 SS: Not Supported (0x0003) ..11 .... .... .... = Rx 7 SS: Not Supported (0x0003) 11.. .... .... .... = Rx 8 SS: Not Supported (0x0003) ...0 0000 0000 0000 = Rx Highest Long GI Data Rate (in Mb/s, 0 = subfield not in use): 0x0000 Tx MCS Map: 0xffff ...0 0000 0000 0000 = Tx Highest Long GI Data Rate (in Mb/s, 0 = subfield not in use): 0x0000 Signed-off-by: Filip Matusiak <filip.matusiak@tieto.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
* \| \|	af_unix: conditionally use freezable blocking calls in read	WANG Cong	2016-11-18	1	-6/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 2b15af6f95 ("af_unix: use freezable blocking calls in read") converts schedule_timeout() to its freezable version, it was probably correct at that time, but later, commit 2b514574f7e8 ("net: af_unix: implement splice for stream af_unix sockets") breaks the strong requirement for a freezable sleep, according to commit 0f9548ca1091: We shouldn't try_to_freeze if locks are held. Holding a lock can cause a deadlock if the lock is later acquired in the suspend or hibernate path (e.g. by dpm). Holding a lock can also cause a deadlock in the case of cgroup_freezer if a lock is held inside a frozen cgroup that is later acquired by a process outside that group. The pipe_lock is still held at that point. So use freezable version only for the recvmsg call path, avoid impact for Android. Fixes: 2b514574f7e8 ("net: af_unix: implement splice for stream af_unix sockets") Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Colin Cross <ccross@android.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	Merge branch 'cpsw-fixes'	David S. Miller	2016-11-18	1	-21/+74
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Johan Hovold says: ==================== net: cpsw: fix leaks and probe deferral This series fixes as number of leaks and issues in the cpsw probe-error and driver-unbind paths, some which specifically prevented deferred probing. v2 - Keep platform device runtime-resumed throughout probe instead of resuming in the probe error path as suggested by Grygorii (patch 1/7). - Runtime-resume platform device before registering any children in order to make sure it is synchronously suspended after deregistering children in the error path (patch 3/7). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	net: ethernet: ti: cpsw: fix fixed-link phy probe deferral	Johan Hovold	2016-11-18	1	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make sure to propagate errors from of_phy_register_fixed_link() which can fail with -EPROBE_DEFER. Fixes: 1f71e8c96fc6 ("drivers: net: cpsw: Add support for fixed-link PHY") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	net: ethernet: ti: cpsw: add missing sanity check	Johan Hovold	2016-11-18	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make sure to check for allocation failures before dereferencing a NULL-pointer during probe. Fixes: 649a1688c960 ("net: ethernet: ti: cpsw: create common struct to hold shared driver data") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	net: ethernet: ti: cpsw: fix secondary-emac probe error path	Johan Hovold	2016-11-18	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make sure to deregister the primary device in case the secondary emac fails to probe. kernel BUG at /home/johan/work/omicron/src/linux/net/core/dev.c:7743! ... [<c05b3dec>] (free_netdev) from [<c04fe6c0>] (cpsw_probe+0x9cc/0xe50) [<c04fe6c0>] (cpsw_probe) from [<c047b28c>] (platform_drv_probe+0x5c/0xc0) Fixes: d9ba8f9e6298 ("driver: net: ethernet: cpsw: dual emac interface implementation") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	net: ethernet: ti: cpsw: fix of_node and phydev leaks	Johan Hovold	2016-11-18	1	-0/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make sure to drop references taken and deregister devices registered during probe on probe errors (including deferred probe) and driver unbind. Specifically, PHY of-node references were never released and fixed-link PHY devices were never deregistered. Fixes: 9e42f715264f ("drivers: net: cpsw: add phy-handle parsing") Fixes: 1f71e8c96fc6 ("drivers: net: cpsw: Add support for fixed-link PHY") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	net: ethernet: ti: cpsw: fix deferred probe	Johan Hovold	2016-11-18	1	-17/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make sure to deregister all child devices also on probe errors to avoid leaks and to fix probe deferral: cpsw 4a100000.ethernet: omap_device: omap_device_enable() called from invalid state 1 cpsw 4a100000.ethernet: use pm_runtime_put_sync_suspend() in driver? cpsw: probe of 4a100000.ethernet failed with error -22 Add generic helper to undo the effects of cpsw_probe_dt(), which will also be used in a follow-on patch to fix further leaks that have been introduced more recently. Note that the platform device is now runtime-resumed before registering any child devices in order to make sure that it is synchronously suspended after having deregistered the children in the error path. Fixes: 1fb19aa730e4 ("net: cpsw: Add parent<->child relation support between cpsw and mdio") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	net: ethernet: ti: cpsw: fix mdio device reference leak	Johan Hovold	2016-11-18	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make sure to drop the reference taken by of_find_device_by_node() when looking up an mdio device from a phy_id property during probe. Fixes: 549985ee9c72 ("cpsw: simplify the setup of the register pointers") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	net: ethernet: ti: cpsw: fix bad register access in probe error path	Johan Hovold	2016-11-18	1	-4/+7
\|/ / / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make sure to keep the platform device runtime-resumed throughout probe to avoid accessing the CPSW registers in the error path (e.g. for deferred probe) with clocks disabled: Unhandled fault: external abort on non-linefetch (0x1008) at 0xd0872d08 ... [<c04fabcc>] (cpsw_ale_control_set) from [<c04fb8b4>] (cpsw_ale_destroy+0x2c/0x44) [<c04fb8b4>] (cpsw_ale_destroy) from [<c04fea58>] (cpsw_probe+0xbd0/0x10c4) [<c04fea58>] (cpsw_probe) from [<c047b2a0>] (platform_drv_probe+0x5c/0xc0) Fixes: df828598a755 ("netdev: driver: ethernet: Add TI CPSW driver") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	net: sky2: Fix shutdown crash	Jeremy Linton	2016-11-18	1	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The sky2 frequently crashes during machine shutdown with: sky2_get_stats+0x60/0x3d8 [sky2] dev_get_stats+0x68/0xd8 rtnl_fill_stats+0x54/0x140 rtnl_fill_ifinfo+0x46c/0xc68 rtmsg_ifinfo_build_skb+0x7c/0xf0 rtmsg_ifinfo.part.22+0x3c/0x70 rtmsg_ifinfo+0x50/0x5c netdev_state_change+0x4c/0x58 linkwatch_do_dev+0x50/0x88 __linkwatch_run_queue+0x104/0x1a4 linkwatch_event+0x30/0x3c process_one_work+0x140/0x3e0 worker_thread+0x60/0x44c kthread+0xdc/0xf0 ret_from_fork+0x10/0x50 This is caused by the sky2 being called after it has been shutdown. A previous thread about this can be found here: https://lkml.org/lkml/2016/4/12/410 An alternative fix is to assure that IFF_UP gets cleared by calling dev_close() during shutdown. This is similar to what the bnx2/tg3/xgene and maybe others are doing to assure that the driver isn't being called following _shutdown(). Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	net sched filters: pass netlink message flags in event notification	Roman Mashak	2016-11-17	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Userland client should be able to read an event, and reflect it back to the kernel, therefore it needs to extract complete set of netlink flags. For example, this will allow "tc monitor" to distinguish Add and Replace operations. Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	ip6_tunnel: disable caching when the traffic class is inherited	Paolo Abeni	2016-11-17	1	-2/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If an ip6 tunnel is configured to inherit the traffic class from the inner header, the dst_cache must be disabled or it will foul the policy routing. The issue is apprently there since at leat Linux-2.6.12-rc2. Reported-by: Liam McBirnie <liam.mcbirnie@boeing.com> Cc: Liam McBirnie <liam.mcbirnie@boeing.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	Merge branch 'phy-dev-leaks'	David S. Miller	2016-11-17	2	-2/+6
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Johan Hovold says: ==================== net: phy: fix of_node and device leaks These patches fix a couple of of_node leaks in the fixed-link code and a device reference leak in a phy helper. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	net: phy: fixed_phy: fix of_node leak in fixed_phy_unregister	Johan Hovold	2016-11-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make sure to drop the of_node reference taken in fixed_phy_register() when deregistering a PHY. Fixes: a75951217472 ("net: phy: extend fixed driver with fixed_phy_register()") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	of_mdio: fix device reference leak in of_phy_find_device	Johan Hovold	2016-11-17	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make sure to drop the reference taken by bus_find_device() before returning NULL from of_phy_find_device() when the found device is not a PHY. Fixes: 6ed742363b9c ("of: of_mdio: Ensure mdio device is a PHY") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	of_mdio: fix node leak in of_phy_register_fixed_link error path	Johan Hovold	2016-11-17	1	-1/+4
\|/ / / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make sure to drop the of_node reference also on failure to parse the speed property in of_phy_register_fixed_link(). Fixes: 3be2a49e5c08 ("of: provide a binding for fixed link PHYs") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	net: check dead netns for peernet2id_alloc()	WANG Cong	2016-11-17	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Andrei reports we still allocate netns ID from idr after we destroy it in cleanup_net(). cleanup_net(): ... idr_destroy(&net->netns_ids); ... list_for_each_entry_reverse(ops, &pernet_list, list) ops_exit_list(ops, &net_exit_list); -> rollback_registered_many() -> rtmsg_ifinfo_build_skb() -> rtnl_fill_ifinfo() -> peernet2id_alloc() After that point we should not even access net->netns_ids, we should check the death of the current netns as early as we can in peernet2id_alloc(). For net-next we can consider to avoid sending rtmsg totally, it is a good optimization for netns teardown path. Fixes: 0c7aecd4bde4 ("netns: add rtnl cmd to add and get peer netns ids") Reported-by: Andrei Vagin <avagin@gmail.com> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Andrei Vagin <avagin@openvz.org> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	bnxt: add a missing rcu synchronization	Eric Dumazet	2016-11-17	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a missing synchronize_net() call to avoid potential use after free, since we explicitly call napi_hash_del() to factorize the RCU grace period. Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Michael Chan <michael.chan@broadcom.com> Acked-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	net: dsa: b53: Fix VLAN usage and how we treat CPU port	Florian Fainelli	2016-11-17	1	-12/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We currently have a fundamental problem in how we treat the CPU port and its VLAN membership. As soon as a second VLAN is configured to be untagged, the CPU automatically becomes untagged for that VLAN as well, and yet, we don't gracefully make sure that the CPU becomes tagged in the other VLANs it could be a member of. This results in only one VLAN being effectively usable from the CPU's perspective. Instead of having some pretty complex logic which tries to maintain the CPU port's default VLAN and its untagged properties, just do something very simple which consists in neither altering the CPU port's PVID settings, nor its untagged settings: - whenever a VLAN is added, the CPU is automatically a member of this VLAN group, as a tagged member - PVID settings for downstream ports do not alter the CPU port's PVID since it now is part of all VLANs in the system This means that a typical example where e.g: LAN ports are in VLAN1, and WAN port is in VLAN2, now require having two VLAN interfaces for the host to properly terminate and send traffic from/to. Fixes: Fixes: a2482d2ce349 ("net: dsa: b53: Plug in VLAN support") Reported-by: Hartmut Knaack <knaack.h@gmx.de> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	net/phy/vitesse: Configure RGMII skew on VSC8601, if needed	Alex	2016-11-16	1	-1/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With RGMII, we need a 1.5 to 2ns skew between clock and data lines. The VSC8601 can handle this internally. While the VSC8601 can set more fine-grained delays, the standard skew settings work out of the box. The same heuristic is used to determine when this skew should be enabled as in vsc824x_config_init(). Tested on custom board with AM3352 SOC and VSC801 PHY. Signed-off-by: Alexandru Gagniuc <alex.g@adaptrum.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	cxgb4: do not call napi_hash_del()	Eric Dumazet	2016-11-16	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Calling napi_hash_del() before netif_napi_del() is dangerous if a synchronize_rcu() is not enforced before NAPI struct freeing. Lets leave this detail to core networking stack and feel more comfortable. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Hariprasad S <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	be2net: do not call napi_hash_del()	Eric Dumazet	2016-11-16	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Calling napi_hash_del() before netif_napi_del() is dangerous if a synchronize_rcu() is not enforced before NAPI struct freeing. Lets leave this detail to core networking stack and feel more comfortable. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Sathya Perla <sathya.perla@broadcom.com> Cc: Ajit Khaparde <ajit.khaparde@broadcom.com> Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Cc: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	virtio-net: add a missing synchronize_net()	Eric Dumazet	2016-11-16	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It seems many drivers do not respect napi_hash_del() contract. When napi_hash_del() is used before netif_napi_del(), an RCU grace period is needed before freeing NAPI object. Fixes: 91815639d880 ("virtio-net: rx busy polling support") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	Merge branch 'thunderx-fixes'	David S. Miller	2016-11-16	9	-234/+274
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Sunil Goutham says: ==================== net: thunderx: Miscellaneous fixes This patchset includes fixes for incorrect LMAC credits, unreliable driver statistics, memory leak upon interface down e.t.c Changes from v1: - As suggested replaced bit shifting with BIT() macro in the patch 'Fix configuration of L3/L4 length checking'. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	net: thunderx: Fix memory leak and other issues upon interface toggle	Sunil Goutham	2016-11-16	2	-7/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes the following 1. When interface is being teardown and queues are being cleaned up, free pending SKBs that are in SQ which are either not transmitted or freed as NAPI is disabled by that time. 2. While interface initialization, delay CFG_DONE notification till the end to avoid corner cases where TXQs are enabled but CQ interrupts are not which results blocking transmission and kicking off watchdog. 3. Check for IFF_UP while re-enabling RBDR interrupts from tasklet. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	net: thunderx: Fix VF driver's interface statistics	Sunil Goutham	2016-11-16	6	-196/+197
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes multiple issues 1. Convert all driver statistics to percpu counters for accuracy. 2. To avoid multiple CQEs posted by a TSO packet appended to HW, TSO pkt's SQE has 'post_cqe' not set but a dummy SQE is added for getting HW transmit completion notification. This dummy SQE has 'dont_send' set and HW drops the pkt pointed to in this thus Tx drop counter increases. This patch fixes this by subtracting SW tx tso counter from HW Tx drop counter for actual packet drop counter. 3. Reset all individual queue's and VNIC HW stats when interface is going down. 4. Getrid off unnecessary counters in hot path. 5. Bringout all CQE error stats i.e both Rx and Tx. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	net: thunderx: Fix configuration of L3/L4 length checking	Sunil Goutham	2016-11-16	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes enabling of HW verification of L3/L4 length and TCP/UDP checksum which is currently being cleared. Also fixed VLAN stripping config which is being cleared when multiqset is enabled. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	net: thunderx: Program LMAC credits based on MTU	Sunil Goutham	2016-11-16	6	-30/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Programming LMAC credits taking 9K frame size by default is incorrect as for an interface which is one of the many on the same BGX/QLM no of credits available will be less as Tx FIFO will be divided across all interfaces. So let's say a BGX with 40G interface and another BGX with multiple 10G, bandwidth of 10G interfaces will be effected when traffic is running on both 40G and 10G interfaces simultaneously. This patch fixes this issue by programming credits based on netdev's MTU. Also fixed configuring MTU to HW and added CQE counter for pkts which exceed this value. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	net: thunderx: Introduce BGX_ID_MASK macro to extract bgx_id	Radha Mohan Chintakuntla	2016-11-16	2	-2/+4
\|/ / / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fixes the 'bgx_id' determination on 83xx where there are 4 BGX blocks instead of 2 on other platforms. Signed-off-by: Radha Mohan Chintakuntla <rchintakuntla@cavium.com> Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	Merge branch 'fib-tables-fixes'	David S. Miller	2016-11-16	3	-6/+84
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Alexander Duyck says: ==================== ipv4: Fix memory leaks and reference issues in fib This series fixes one major issue and one minor issue in the fib tables. The major issue is that we had lost the functionality that was flushing the local table entries from main after we had unmerged the two tries. In order to regain the functionality I have performed a partial revert and then moved the functionality for flushing the external entries from main into fib_unmerge. The minor issue was a memory leak that could occur in the event that we weren't able to add an alias to the local trie resulting in the fib alias being leaked. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	ipv4: Fix memory leak in exception case for splitting tries	Alexander Duyck	2016-11-16	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix a small memory leak that can occur where we leak a fib_alias in the event of us not being able to insert it into the local table. Fixes: 0ddcf43d5d4a0 ("ipv4: FIB Local/MAIN table collapse") Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	ipv4: Restore fib_trie_flush_external function and fix call ordering	Alexander Duyck	2016-11-16	3	-5/+81
\|/ / / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The patch that removed the FIB offload infrastructure was a bit too aggressive and also removed code needed to clean up us splitting the table if additional rules were added. Specifically the function fib_trie_flush_external was called at the end of a new rule being added to flush the foreign trie entries from the main trie. I updated the code so that we only call fib_trie_flush_external on the main table so that we flush the entries for local from main. This way we don't call it for every rule change which is what was happening previously. Fixes: 347e3b28c1ba2 ("switchdev: remove FIB offload infrastructure") Reported-by: Eric Dumazet <edumazet@google.com> Cc: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	bpf: fix range arithmetic for bpf map access	Josef Bacik	2016-11-16	2	-25/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I made some invalid assumptions with BPF_AND and BPF_MOD that could result in invalid accesses to bpf map entries. Fix this up by doing a few things 1) Kill BPF_MOD support. This doesn't actually get used by the compiler in real life and just adds extra complexity. 2) Fix the logic for BPF_AND, don't allow AND of negative numbers and set the minimum value to 0 for positive AND's. 3) Don't do operations on the ranges if they are set to the limits, as they are by definition undefined, and allowing arithmetic operations on those values could make them appear valid when they really aren't. This fixes the testcase provided by Jann as well as a few other theoretical problems. Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Josef Bacik <jbacik@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	rtnetlink: fix rtnl message size computation for XDP	Sabrina Dubroca	2016-11-16	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rtnl_xdp_size() only considers the size of the actual payload attribute, and misses the space taken by the attribute used for nesting (IFLA_XDP). Fixes: d1fdd9138682 ("rtnl: add option for setting link xdp prog") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Brenden Blanco <bblanco@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>