summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* net/mlx5e: Added BW check for DIM decision mechanismTal Gilboa2017-06-112-17/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | DIM (Dynamically-tuned Interrupt Moderation) is a mechanism designed for changing the channel interrupt moderation values in order to reduce CPU overhead for all traffic types. Until now only interrupt and packet rate were sampled. We found a scenario on which we get a false indication since a change in DIM caused more aggregation and reduced packet rate while increasing BW. We now regard a change as succesfull iff: current_BW > (prev_BW + threshold) or current_BW ~= prev_BW and current_PR > (prev_PR + threshold) or current_BW ~= prev_BW and current_PR ~= prev_PR and current_IR < (prev_IR - threshold) Where BW = Bandwidth, PR = Packet rate and IR = Interrupt rate Improvements (ConnectX-4Lx 25GbE, single RX queue, LRO off) -------------------------------------------------- packet size | before[Mb/s] | after[Mb/s] | gain | 2B | 343.4 | 359.4 | 4.5% | 16B | 2739.7 | 2814.8 | 2.7% | 64B | 9739 | 10185.3 | 4.5% | Fixes: cb3c7fd4f839 ("net/mlx5e: Support adaptive RX coalescing") Signed-off-by: Tal Gilboa <talgi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* net/mlx5: Remove several module events out of ethtool statsHuy Nguyen2017-06-111-9/+2
| | | | | | | | | | | | | | | | | | | Remove the following module event counters out of ethtool stats. The reason for removing these event counters is that these events do not occur without techinician's intervention. module_pwr_budget_exd module_long_range module_no_eeprom module_enforce_part module_unknown_id module_unknown_status module_plug Fixes: bedb7c909c19 ("net/mlx5e: Add port module event counters to ethtool stats") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* net/mlx5: Continue health polling until it is explicitly stoppedMohamad Haj Yahia2017-06-111-6/+5
| | | | | | | | | | The issue is that when we get an assert we will stop polling the health and thus we cant enter error state when we have a real health issue. Fixes: fd76ee4da55a ('net/mlx5_core: Fix internal error detection conditions') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* net/mlx5: Fix create vport flow table flowMohamad Haj Yahia2017-06-111-1/+1
| | | | | | | | | Send vport number to the create flow table inner method instead of ignoring the vport argument and sending always 0. Fixes: b3ba51498bdd ('net/mlx5: Refactor create flow table method to accept underlay QP') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
* Merge branch 'mvpp2-fixes'David S. Miller2017-06-111-41/+33
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Thomas Petazzoni says: ==================== net: mvpp2: driver fixes As requested, here is a series of patches containing only bug fixes for the mvpp2 driver. It is based on the latest "net" branch. Changes since v1: - Fixed a build breakage that occurred when only PATCH 1 was only, and not later patches in the series. Was reported by the kbuild report on the first submission. - Added Tested-by from Marc Zyngier on PATCH 2. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: mvpp2: use {get, put}_cpu() instead of smp_processor_id()Thomas Petazzoni2017-06-111-8/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | smp_processor_id() should not be used in migration-enabled contexts. We originally thought it was OK in the specific situation of this driver, but it was wrong, and calling smp_processor_id() in a migration-enabled context prints a big fat warning when CONFIG_DEBUG_PREEMPT=y. Therefore, this commit replaces the smp_processor_id() in migration-enabled contexts by the appropriate get_cpu/put_cpu sections. Reported-by: Marc Zyngier <marc.zyngier@arm.com> Fixes: a786841df72e ("net: mvpp2: handle register mapping and access for PPv2.2") Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Tested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: mvpp2: remove mvpp2_bm_cookie_{build,pool_get}Thomas Petazzoni2017-06-111-33/+14
|/ | | | | | | | | | | | | | | | | | | This commit removes the useless remove mvpp2_bm_cookie_{build,pool_get} functions. All what mvpp2_bm_cookie_build() was doing is compute a 32-bit value by concatenating the pool number and the CPU number... only to get the pool number re-extracted by mvpp2_bm_cookie_pool_get() later on. Instead, just get the pool number directly from RX descriptor status, and pass it to mvpp2_pool_refill() and mvpp2_rx_refill(). This has the added benefit of dropping a smp_processor_id() call in a migration-enabled context, which is wrong, and is the original motivation for making this change. Fixes: 3f518509dedc9 ("ethernet: Add new driver for Marvell Armada 375 network unit") Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: tipc: Fix a sleep-in-atomic bug in tipc_msg_reverseJia-Ju Bai2017-06-111-1/+1
| | | | | | | | | | | | | | | | | | | | | The kernel may sleep under a rcu read lock in tipc_msg_reverse, and the function call path is: tipc_l2_rcv_msg (acquire the lock by rcu_read_lock) tipc_rcv tipc_sk_rcv tipc_msg_reverse pskb_expand_head(GFP_KERNEL) --> may sleep tipc_node_broadcast tipc_node_xmit_skb tipc_node_xmit tipc_sk_rcv tipc_msg_reverse pskb_expand_head(GFP_KERNEL) --> may sleep To fix it, "GFP_KERNEL" is replaced with "GFP_ATOMIC". Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: caif: Fix a sleep-in-atomic bug in cfpkt_create_pfxJia-Ju Bai2017-06-111-5/+1
| | | | | | | | | | | | | | | | | | | | | | | The kernel may sleep under a rcu read lock in cfpkt_create_pfx, and the function call path is: cfcnfg_linkup_rsp (acquire the lock by rcu_read_lock) cfctrl_linkdown_req cfpkt_create cfpkt_create_pfx alloc_skb(GFP_KERNEL) --> may sleep cfserl_receive (acquire the lock by rcu_read_lock) cfpkt_split cfpkt_create_pfx alloc_skb(GFP_KERNEL) --> may sleep There is "in_interrupt" in cfpkt_create_pfx to decide use "GFP_KERNEL" or "GFP_ATOMIC". In this situation, "GFP_KERNEL" is used because the function is called under a rcu read lock, instead in interrupt. To fix it, only "GFP_ATOMIC" is used in cfpkt_create_pfx. Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Revert "net: fec: Add a fec_enet_clear_ethtool_stats() stub for CONFIG_M5272"David S. Miller2017-06-101-4/+0
| | | | | | | | This reverts commit bf292f1b2c813f1d6ac49b04bd1a9863d8314266. It belongs in 'net-next' not 'net'. Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: disable BH in sctp_for_each_endpointXin Long2017-06-101-2/+2
| | | | | | | | | | | | | | | | | | | Now sctp holds read_lock when foreach sctp_ep_hashtable without disabling BH. If CPU schedules to another thread A at this moment, the thread A may be trying to hold the write_lock with disabling BH. As BH is disabled and CPU cannot schedule back to the thread holding the read_lock, while the thread A keeps waiting for the read_lock. A dead lock would be triggered by this. This patch is to fix this dead lock by calling read_lock_bh instead to disable BH when holding the read_lock in sctp_for_each_endpoint. Fixes: 626d16f50f39 ("sctp: export some apis or variables for sctp_diag and reuse some for proc") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: fec: Add a fec_enet_clear_ethtool_stats() stub for CONFIG_M5272Fabio Estevam2017-06-101-0/+4
| | | | | | | | | | | | | Commit 2b30842b23b9 ("net: fec: Clear and enable MIB counters on imx51") introduced fec_enet_clear_ethtool_stats(), but missed to add a stub for the CONFIG_M5272=y case, causing build failure for the m5272c3_defconfig. Add the missing empty stub to fix the build failure. Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: cast l2tp traffic counter to unsignedDominik Heidler2017-06-101-6/+7
| | | | | | | | | | | | | | | This fixes a counter problem on 32bit systems: When the rx_bytes counter reached 2 GiB, it jumpd to (2^64 Bytes - 2GiB) Bytes. rtnl_link_stats64 has __u64 type and atomic_long_read returns atomic_long_t which is signed. Due to the conversation we get an incorrect value on 32bit systems if the MSB of the atomic_long_t value is set. CC: Tom Parkin <tparkin@katalix.com> Fixes: 7b7c0719cd7a ("l2tp: avoid deadlock in l2tp stats update") Signed-off-by: Dominik Heidler <dheidler@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: aquantia: atlantic: remove declaration of hw_atl_utils_hw_set_powerPhilippe Reynes2017-06-101-3/+0
| | | | | | | | | | This function is not defined, so no need to declare it. As I don't have the hardware, I'd be very pleased if someone may test this patch. Signed-off-by: Philippe Reynes <tremyfr@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'bnx2x-Fix-malicious-VFs-indication'David S. Miller2017-06-103-5/+28
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Yuval Mintz says: ==================== bnx2x: Fix malicious VFs indication It was discovered that for a VF there's a simple [yet uncommon] scenario which would cause device firmware to declare that VF as malicious - Add a vlan interface on top of a VF and disable txvlan offloading for that VF [causing VF to transmit packets where vlan is on payload]. Patch #1 corrects driver transmission to prevent this issue. Patch #2 is a by-product correcting PF behavior once a VF is declared malicious. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * bnx2x: Don't post statistics to malicious VFsMintz, Yuval2017-06-102-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | Once firmware indicates that a given VF is malicious and until that VF passes an FLR all bets are off - PF can't know anything is happening to the VF [since VF can't communicate anything to its PF]. But PF is currently still periodically asking device to collect statistics for the VF which might in turn fill logs by IOMMU blocking memory access done by the VF's PCI function [in the case VF has unmapped its buffers]. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bnx2x: Allow vfs to disable txvlan offloadMintz, Yuval2017-06-101-4/+15
|/ | | | | | | | | | | VF clients are configured as enforced, meaning firmware is validating the correctness of their ethertype/vid during transmission. Once txvlan is disabled, VF would start getting SKBs for transmission here vlan is on the payload - but it'll pass the packet's ethertype instead of the vid, leading to firmware declaring it as malicious. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge tag 'linux-can-fixes-for-4.12-20170609' of ↵David S. Miller2017-06-097-8/+10
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2017-06-09 this is a pull request of 6 patches for net/master. There's a patch by Stephane Grosjean that fixes an uninitialized symbol warning in the peak_canfd driver. A patch by Johan Hovold to fix the product-id endianness in an error message in the the peak_usb driver. A patch by Oliver Hartkopp to enable CAN FD for virtual CAN devices by default. Three patches by me, one makes the helper function can_change_state() robust to be called with cf == NULL. The next patch fixes a memory leak in the gs_usb driver. And the last one fixes a lockdep splat by properly initialize the per-net can_rcvlists_lock spin_lock. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * can: enable CAN FD for virtual CAN devices by defaultOliver Hartkopp2017-06-092-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | CAN FD capable CAN interfaces can handle (classic) CAN 2.0 frames too. New users usually fail at their first attempt to explore CAN FD on virtual CAN interfaces due to the current CAN_MTU default. Set the MTU to CANFD_MTU by default to reduce this confusion. If someone *really* needs a 'classic CAN'-only device this can be set with the 'ip' tool with e.g. 'ip link set vcan0 mtu 16' as before. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
| * can: af_can: namespace support: fix lockdep splat: properly initialize spin_lockMarc Kleine-Budde2017-06-091-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch uses spin_lock_init() instead of __SPIN_LOCK_UNLOCKED() to initialize the per namespace net->can.can_rcvlists_lock lock to fix this lockdep warning: | INFO: trying to register non-static key. | the code is fine but needs lockdep annotation. | turning off the locking correctness validator. | CPU: 0 PID: 186 Comm: candump Not tainted 4.12.0-rc3+ #47 | Hardware name: Marvell Kirkwood (Flattened Device Tree) | [<c0016644>] (unwind_backtrace) from [<c00139a8>] (show_stack+0x18/0x1c) | [<c00139a8>] (show_stack) from [<c0058c8c>] (register_lock_class+0x1e4/0x55c) | [<c0058c8c>] (register_lock_class) from [<c005bdfc>] (__lock_acquire+0x148/0x1990) | [<c005bdfc>] (__lock_acquire) from [<c005deec>] (lock_acquire+0x174/0x210) | [<c005deec>] (lock_acquire) from [<c04a6780>] (_raw_spin_lock+0x50/0x88) | [<c04a6780>] (_raw_spin_lock) from [<bf02116c>] (can_rx_register+0x94/0x15c [can]) | [<bf02116c>] (can_rx_register [can]) from [<bf02a868>] (raw_enable_filters+0x60/0xc0 [can_raw]) | [<bf02a868>] (raw_enable_filters [can_raw]) from [<bf02ac14>] (raw_enable_allfilters+0x2c/0xa0 [can_raw]) | [<bf02ac14>] (raw_enable_allfilters [can_raw]) from [<bf02ad38>] (raw_bind+0xb0/0x250 [can_raw]) | [<bf02ad38>] (raw_bind [can_raw]) from [<c03b5fb8>] (SyS_bind+0x70/0xac) | [<c03b5fb8>] (SyS_bind) from [<c000f8c0>] (ret_fast_syscall+0x0/0x1c) Cc: Mario Kicherer <dev@kicherer.org> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
| * can: gs_usb: fix memory leak in gs_cmd_reset()Marc Kleine-Budde2017-06-091-0/+2
| | | | | | | | | | | | | | | | | | This patch adds the missing kfree() in gs_cmd_reset() to free the memory that is not used anymore after usb_control_msg(). Cc: linux-stable <stable@vger.kernel.org> Cc: Maximilian Schneider <max@schneidersoft.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
| * can: peak_usb: fix product-id endianness in error messageJohan Hovold2017-06-091-3/+1
| | | | | | | | | | | | | | | | | | | | | | Make sure to use the USB device product-id stored in host-byte order in a probe error message. Also remove a redundant reassignment of the local usb_dev variable which had already been used to retrieve the product id. Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
| * can: peak_canfd: fix uninitialized symbol warningsStephane Grosjean2017-06-091-1/+1
| | | | | | | | | | | | | | | | | | | | This patch fixes two uninitialized symbol warnings in the new code adding support of the PEAK-System PCAN-PCI Express FD boards, in the socket-CAN network protocol family. Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
| * can: dev: make can_change_state() robust to be called with cf == NULLMarc Kleine-Budde2017-06-091-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | In OOM situations where no skb can be allocated, can_change_state() may be called with cf == NULL. As this function updates the state and error statistics it's not an option to skip the call to can_change_state() in OOM situations. This patch makes can_change_state() robust, so that it can be called with cf == NULL. Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
* | mac80211: free netdev on dev_alloc_name() errorJohannes Berg2017-06-091-0/+1
| | | | | | | | | | | | | | | | | | | | | | The change to remove free_netdev() from ieee80211_if_free() erroneously didn't add the necessary free_netdev() for when ieee80211_if_free() is called directly in one place, rather than as the priv_destructor. Add the missing call. Fixes: cf124db566e6 ("net: Fix inconsistent teardown and release of private netdev state.") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: rps: send out pending IPI's on CPU hotplugashwanth@codeaurora.org2017-06-091-9/+22
| | | | | | | | | | | | | | | | | | | | | | | | IPI's from the victim cpu are not handled in dev_cpu_callback. So these pending IPI's would be sent to the remote cpu only when NET_RX is scheduled on the victim cpu and since this trigger is unpredictable it would result in packet latencies on the remote cpu. This patch add support to send the pending ipi's of victim cpu. Signed-off-by: Ashwanth Goli <ashwanth@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | stmmac: fix for hw timestamp of GMAC3 unitMario Molitor2017-06-092-9/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | 1.) Bugfix of function stmmac_get_tx_hwtstamp. Corrected the tx timestamp available check (same as 4.8 and older) Change printout from info syslevel to debug. 2.) Bugfix of function stmmac_get_rx_hwtstamp. Corrected the rx timestamp available check (same as 4.8 and older) Change printout from info syslevel to debug. Fixes: ba1ffd74df74 ("stmmac: fix PTP support for GMAC4") Signed-off-by: Mario Molitor <mario_molitor@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | stmmac: fix ptp header for GMAC3 hw timestampMario Molitor2017-06-092-4/+14
| | | | | | | | | | | | | | | | | | | | | | According the CYCLON V documention only the bit 16 of snaptypesel should set. (more information see Table 17-20 (cv_5v4.pdf) : Timestamp Snapshot Dependency on Register Bits) Fixes: d2042052a0aa ("stmmac: update the PTP header file") Signed-off-by: Mario Molitor <mario_molitor@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Fix an intermittent pr_emerg warning about lo becoming free.Krister Johansen2017-06-091-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It looks like this: Message from syslogd@flamingo at Apr 26 00:45:00 ... kernel:unregister_netdevice: waiting for lo to become free. Usage count = 4 They seem to coincide with net namespace teardown. The message is emitted by netdev_wait_allrefs(). Forced a kdump in netdev_run_todo, but found that the refcount on the lo device was already 0 at the time we got to the panic. Used bcc to check the blocking in netdev_run_todo. The only places where we're off cpu there are in the rcu_barrier() and msleep() calls. That behavior is expected. The msleep time coincides with the amount of time we spend waiting for the refcount to reach zero; the rcu_barrier() wait times are not excessive. After looking through the list of callbacks that the netdevice notifiers invoke in this path, it appears that the dst_dev_event is the most interesting. The dst_ifdown path places a hold on the loopback_dev as part of releasing the dev associated with the original dst cache entry. Most of our notifier callbacks are straight-forward, but this one a) looks complex, and b) places a hold on the network interface in question. I constructed a new bcc script that watches various events in the liftime of a dst cache entry. Note that dst_ifdown will take a hold on the loopback device until the invalidated dst entry gets freed. [ __dst_free] on DST: ffff883ccabb7900 IF tap1008300eth0 invoked at 1282115677036183 __dst_free rcu_nocb_kthread kthread ret_from_fork Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | af_unix: Add sockaddr length checks before accessing sa_family in bind and ↵Mateusz Jurczyk2017-06-091-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | connect handlers Verify that the caller-provided sockaddr structure is large enough to contain the sa_family field, before accessing it in bind() and connect() handlers of the AF_UNIX socket. Since neither syscall enforces a minimum size of the corresponding memory region, very short sockaddrs (zero or one byte long) result in operating on uninitialized memory while referencing .sa_family. Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: phy: add missing SPEED_14000Joe Perches2017-06-091-0/+2
|/ | | | | | | Fixes: 0d7e2d2166f6 ("IB/ipoib: add get_link_ksettings in ethtool") Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: vrf: Make add_fib_rules per network namespace flagDavid Ahern2017-06-091-4/+32
| | | | | | | | | | | | | Commit 1aa6c4f6b8cd8 ("net: vrf: Add l3mdev rules on first device create") adds the l3mdev FIB rule the first time a VRF device is created. However, it only creates the rule once and only in the namespace the first device is created - which may not be init_net. Fix by using the net_generic capability to make the add_fib_rules flag per network namespace. Fixes: 1aa6c4f6b8cd8 ("net: vrf: Add l3mdev rules on first device create") Reported-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bpf, tests: fix endianness selectionDaniel Borkmann2017-06-081-11/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I noticed that test_l4lb was failing in selftests: # ./test_progs test_pkt_access:PASS:ipv4 77 nsec test_pkt_access:PASS:ipv6 44 nsec test_xdp:PASS:ipv4 2933 nsec test_xdp:PASS:ipv6 1500 nsec test_l4lb:PASS:ipv4 377 nsec test_l4lb:PASS:ipv6 544 nsec test_l4lb:FAIL:stats 6297600000 200000 test_tcp_estats:PASS: 0 nsec Summary: 7 PASSED, 1 FAILED Tracking down the issue actually revealed that endianness selection in bpf_endian.h is broken when compiled with clang with bpf target. test_pkt_access.c, test_l4lb.c is compiled with __BYTE_ORDER as __BIG_ENDIAN, test_xdp.c as __LITTLE_ENDIAN! test_l4lb noticeably fails, because the test accounts bytes via bpf_ntohs(ip6h->payload_len) and bpf_ntohs(iph->tot_len), and compares them against a defined value and given a wrong endianness, the test outcome is different, of course. Turns out that there are actually two bugs: i) when we do __BYTE_ORDER comparison with __LITTLE_ENDIAN/__BIG_ENDIAN, then depending on the include order we see different outcomes. Reason is that __BYTE_ORDER is undefined due to missing endian.h include. Before we include the asm/byteorder.h (e.g. through linux/in.h), then __BYTE_ORDER equals __LITTLE_ENDIAN since both are undefined, after the include which correctly pulls in linux/byteorder/little_endian.h, __LITTLE_ENDIAN is defined, but given __BYTE_ORDER is still undefined, we match on __BYTE_ORDER equals to __BIG_ENDIAN since __BIG_ENDIAN is also undefined at that point, sigh. ii) But even that would be wrong, since when compiling the test cases with clang, one can select between bpfeb and bpfel targets for cross compilation. Hence, we can also not rely on what the system's endian.h provides, but we need to look at the compiler's defined endianness. The compiler defines __BYTE_ORDER__, and we can match __ORDER_LITTLE_ENDIAN__ and __ORDER_BIG_ENDIAN__, which also reflects targets bpf (native), bpfel, bpfeb correctly, thus really only rely on that. After patch: # ./test_progs test_pkt_access:PASS:ipv4 74 nsec test_pkt_access:PASS:ipv6 42 nsec test_xdp:PASS:ipv4 2340 nsec test_xdp:PASS:ipv6 1461 nsec test_l4lb:PASS:ipv4 400 nsec test_l4lb:PASS:ipv6 530 nsec test_tcp_estats:PASS: 0 nsec Summary: 7 PASSED, 0 FAILED Fixes: 43bcf707ccdc ("bpf: fix _htons occurences in test_progs") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool.h: remind to update 802.3ad when adding new speedsNicolas Dichtel2017-06-081-2/+4
| | | | | | | | | Each time a new speed is added, the bonding 802.3ad isn't updated. Add a comment to remind the developer to update this driver. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: fix 802.3ad support for 14G speedNicolas Dichtel2017-06-081-0/+9
| | | | | | | | | | This patch adds 14 Gbps enum definition, and fixes aggregated bandwidth calculation based on above slave links. Fixes: 0d7e2d2166f6 ("IB/ipoib: add get_link_ksettings in ethtool") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* bonding: fix 802.3ad support for 5G and 50G speedsThibaut Collet2017-06-081-0/+18
| | | | | | | | | | | This patch adds [5|50] Gbps enum definition, and fixes aggregated bandwidth calculation based on above slave links. Fixes: c9a70d43461d ("net-next: ethtool: Added port speed macros.") Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* openvswitch: warn about missing first netlink attributeNicolas Dichtel2017-06-081-0/+1
| | | | | | | | | | | The first netlink attribute (value 0) must always be defined as none/unspec. Because we cannot change an existing UAPI, I add a comment to point the mistake and avoid to propagate it in a new ovs API in the future. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ila_xlat: add missing hash secret initializationArnd Bergmann2017-06-081-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | While discussing the possible merits of clang warning about unused initialized functions, I found one function that was clearly meant to be called but never actually is. __ila_hash_secret_init() initializes the hash value for the ila locator, apparently this is intended to prevent hash collision attacks, but this ends up being a read-only zero constant since there is no caller. I could find no indication of why it was never called, the earliest patch submission for the module already was like this. If my interpretation is right, we certainly want to backport the patch to stable kernels as well. I considered adding it to the ila_xlat_init callback, but for best effect the random data is read as late as possible, just before it is first used. The underlying net_get_random_once() is already highly optimized to avoid overhead when called frequently. Fixes: 7f00feaf1076 ("ila: Add generic ILA translation facility") Cc: stable@vger.kernel.org Link: https://www.spinics.net/lists/kernel/msg2527243.html Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Fix build regression in rtl8723bs staging driver.David S. Miller2017-06-082-3/+2
| | | | | | | | | drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c: In function ‘rtw_cfg80211_add_monitor_if’: drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c:2670:10: error: ‘struct net_device’ has no member named ‘destructor’ mon_ndev->destructor = rtw_ndev_destructor; ^ Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'netvsc-bug-fixes'David S. Miller2017-06-083-39/+50
|\ | | | | | | | | | | | | | | | | | | | | | | Stephen Hemminger says: ==================== netvsc: bug fixes These are bugfixes for netvsc driver in 4.12. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * netvsc: move filter setting to rndis_devicestephen hemminger2017-06-083-34/+34
| | | | | | | | | | | | | | | | | | The work queue and handling of network filter parameters should be in rndis_device. This gets rid of warning from RCU checks, eliminates a race and cleans up code. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * netvsc: fix net poll modestephen hemminger2017-06-081-4/+15
| | | | | | | | | | | | | | | | | | | | | | The ndo_poll_controller function needs to schedule NAPI to pick up arriving packets and send completions. Otherwise no data will ever be received. For simple case of netconsole, it also will allow send completions to happen. Without this netpoll will eventually get stuck. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * netvsc: fix rcu dereference warning from ethtoolstephen hemminger2017-06-081-1/+1
|/ | | | | | | | | | The ethtool info command calls the netvsc get_sset_count with RTNL but not with RCU. Which causes warning: drivers/net/hyperv/netvsc_drv.c:1010 suspicious rcu_dereference_check() usage! Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ipv6: Release route when device is unregisteringDavid Ahern2017-06-082-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Roopa reported attempts to delete a bond device that is referenced in a multipath route is hanging: $ ifdown bond2 # ifupdown2 command that deletes virtual devices unregister_netdevice: waiting for bond2 to become free. Usage count = 2 Steps to reproduce: echo 1 > /proc/sys/net/ipv6/conf/all/ignore_routes_with_linkdown ip link add dev bond12 type bond ip link add dev bond13 type bond ip addr add 2001:db8:2::0/64 dev bond12 ip addr add 2001:db8:3::0/64 dev bond13 ip route add 2001:db8:33::0/64 nexthop via 2001:db8:2::2 nexthop via 2001:db8:3::2 ip link del dev bond12 ip link del dev bond13 The root cause is the recent change to keep routes on a linkdown. Update the check to detect when the device is unregistering and release the route for that case. Fixes: a1a22c12060e4 ("net: ipv6: Keep nexthop of multipath route on admin down") Reported-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Zero ifla_vf_info in rtnl_fill_vfinfo()Mintz, Yuval2017-06-081-1/+2
| | | | | | | | | | | Some of the structure's fields are not initialized by the rtnetlink. If driver doesn't set those in ndo_get_vf_config(), they'd leak memory to user. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> CC: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skbMateusz Jurczyk2017-06-081-1/+3
| | | | | | | | | | | | | | | | | Verify that the length of the socket buffer is sufficient to cover the nlmsghdr structure before accessing the nlh->nlmsg_len field for further input sanitization. If the client only supplies 1-3 bytes of data in sk_buff, then nlh->nlmsg_len remains partially uninitialized and contains leftover memory from the corresponding kernel allocation. Operating on such data may result in indeterminate evaluation of the nlmsg_len < sizeof(*nlh) expression. The bug was discovered by a runtime instrumentation designed to detect use of uninitialized memory in the kernel. The patch prevents this and other similar tools (e.g. KMSAN) from flagging this behavior in the future. Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Revert "decnet: dn_rtmsg: Improve input length sanitization in ↵David S. Miller2017-06-081-3/+1
| | | | | | | | | | | dnrmg_receive_user_skb" This reverts commit 85eac2ba35a2dbfbdd5767c7447a4af07444a5b4. There is an updated version of this fix which we should use instead. Signed-off-by: David S. Miller <davem@davemloft.net>
* net: emac: fix and unify emac_mdio functionsChristian Lamparter2017-06-081-23/+18
| | | | | | | | | | | | | | | | emac_mdio_read_link() was not copying the requested phy settings back into the emac driver's own phy api. This has caused a link speed mismatch issue for the AR8035 as the emac driver kept trying to connect with 10/100MBps on a 1GBit/s link. This patch also unifies shared code between emac_setup_aneg() and emac_mdio_setup_forced(). And furthermore it removes a chunk of emac_mdio_init_phy(), that was copying the same data into itself. Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: emac: fix reset timeout with AR8035 phyChristian Lamparter2017-06-081-4/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a problem where the AR8035 PHY can't be detected on an Cisco Meraki MR24, if the ethernet cable is not connected on boot. Russell Senior provided steps to reproduce the issue: |Disconnect ethernet cable, apply power, wait until device has booted, |plug in ethernet, check for interfaces, no eth0 is listed. | |This appears to be a problem during probing of the AR8035 Phy chip. |When ethernet has no link, the phy detection fails, and eth0 is not |created. Plugging ethernet later has no effect, because there is no |interface as far as the kernel is concerned. The relevant part of |the boot log looks like this: |this is the failing case: | |[ 0.876611] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode |[ 0.882532] /plb/opb/ethernet@ef600c00: reset timeout |[ 0.888546] /plb/opb/ethernet@ef600c00: can't find PHY! |and the succeeding case: | |[ 0.876672] /plb/opb/emac-rgmii@ef601500: input 0 in RGMII mode |[ 0.883952] eth0: EMAC-0 /plb/opb/ethernet@ef600c00, MAC 00:01:.. |[ 0.890822] eth0: found Atheros 8035 Gigabit Ethernet PHY (0x01) Based on the comment and the commit message of commit 23fbb5a87c56 ("emac: Fix EMAC soft reset on 460EX/GT"). This is because the AR8035 PHY doesn't provide the TX Clock, if the ethernet cable is not attached. This causes the reset to timeout and the PHY detection code in emac_init_phy() is unable to detect the AR8035 PHY. As a result, the emac driver bails out early and the user left with no ethernet. In order to stay compatible with existing configurations, the driver tries the current reset approach at first. Only if the first attempt timed out, it does perform one more retry with the clock temporarily switched to the internal source for just the duration of the reset. LEDE-Bug: #687 <https://bugs.lede-project.org/index.php?do=details&task_id=687> Cc: Chris Blake <chrisrblake93@gmail.com> Reported-by: Russell Senior <russell@personaltelco.net> Fixes: 23fbb5a87c56e98 ("emac: Fix EMAC soft reset on 460EX/GT") Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skbMateusz Jurczyk2017-06-081-1/+3
| | | | | | | | | | | | | | | | | Verify that the length of the socket buffer is sufficient to cover the entire nlh->nlmsg_len field before accessing that field for further input sanitization. If the client only supplies 1-3 bytes of data in sk_buff, then nlh->nlmsg_len remains partially uninitialized and contains leftover memory from the corresponding kernel allocation. Operating on such data may result in indeterminate evaluation of the nlmsg_len < sizeof(*nlh) expression. The bug was discovered by a runtime instrumentation designed to detect use of uninitialized memory in the kernel. The patch prevents this and other similar tools (e.g. KMSAN) from flagging this behavior in the future. Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>