summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'ionic-driver-updates'David S. Miller2019-10-024-63/+119
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Shannon Nelson says: ==================== ionic: driver updates These patches are a few updates to clean up some code issues and add an ethtool feature. v3: drop the Fixes tags as they really aren't fixing bugs simplify ionic_lif_quiesce() as no return is necessary v2: add cover letter edit a couple of patch descriptions for clarity and add Fixes: tags ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * ionic: add lif_quiesce to wait for queue activity to stopShannon Nelson2019-10-021-0/+16
| | | | | | | | | | | | | | | | | | | | Even though we've already turned off the queue activity with the ionic_qcq_disable(), we need to wait for any device queues that are processing packets to drain down before we try to flush our packets and tear down the queues. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ionic: implement ethtool set-fecShannon Nelson2019-10-021-27/+67
| | | | | | | | | | | | | | | | Wire up the --set-fec and --show-fec features in the ethtool callbacks and pull the related code out of set_link_ksettings. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ionic: report users coalesce requestShannon Nelson2019-10-023-18/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The user's request for an interrupt coalescing value gets translated into a hardware value to be used with the NIC, and was getting reported back based on the hw value, which, due to hw tic resolution, could be reported as a different number than what the user originally asked for. This code now tracks both the user request and what was put into the hardware so we can report back to the user what they requested. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ionic: use wait_on_bit_lock() rather than open codeShannon Nelson2019-10-023-13/+13
| | | | | | | | | | | | | | | | Replace the open-coded ionic_wait_for_bit() with the kernel's wait_on_bit_lock(). Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ionic: simplify returns in devlink infoShannon Nelson2019-10-021-5/+4
|/ | | | | | | There is no need for a goto in this bit of code. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'per-netns-notifier'David S. Miller2019-10-024-32/+165
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jiri Pirko says: ==================== net: introduce per-netns netdevice notifiers and use them in mlxsw Some drivers, like mlxsw, are not interested in notifications coming in for netdevices from other network namespaces. So introduce per-netns notifiers and allow to reduce overhead by listening only for notifications from the same netns. This is also a preparation for upcoming patchset "devlink: allow devlink instances to change network namespace". This resolves deadlock during reload mlxsw into initial netns made possible by 328fbe747ad4 ("net: Close race between {un, }register_netdevice_notifier() and setup_net()/cleanup_net()"). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum: Use per-netns netdevice notifier registrationJiri Pirko2019-10-021-3/+6
| | | | | | | | | | | | | | | | | | | | The mlxsw_sp instance is not interested in events happening in other network namespaces. So use "_net" variants for netdevice notifier registration/unregistration and get only events which are happening in the net the instance is in. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: introduce per-netns netdevice notifiersJiri Pirko2019-10-023-0/+93
| | | | | | | | | | | | | | | | | | | | | | Often the code for example in drivers is interested in getting notifier call only from certain network namespace. In addition to the existing global netdevice notifier chain introduce per-netns chains and allow users to register to that. Eventually this would eliminate unnecessary overhead in case there are many netdevices in many network namespaces. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: push loops and nb calls into helper functionsJiri Pirko2019-10-023-29/+66
|/ | | | | | | | | | Push iterations over net namespaces and netdevices from register_netdevice_notifier() and unregister_netdevice_notifier() into helper functions. Along with that introduce continue_reverse macros to make the code a bit nicer allowing to get rid of "last" marks. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* r8152: Use guard clause and fix comment typosPrashant Malani2019-10-021-16/+16
| | | | | | | | | | | | | | Use a guard clause in tx_bottom() to reduce the indentation of the do-while loop. Also, fix a couple of spelling and grammatical mistakes in the r8152_csum_workaround() function comment. Change-Id: I460befde150ad92248fd85b0f189ec2df2ab8431 Signed-off-by: Prashant Malani <pmalani@chromium.org> Reviewed-by: Grant Grundler <grundler@chromium.org> Acked-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* vsock/virtio: add support for MSG_PEEKMatias Ezequiel Vara Larsen2019-10-021-3/+52
| | | | | | | | | | This patch adds support for MSG_PEEK. In such a case, packets are not removed from the rx_queue and credit updates are not sent. Signed-off-by: Matias Ezequiel Vara Larsen <matiasevara@gmail.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* hso: fix NULL-deref on tty openJohan Hovold2019-10-021-4/+8
| | | | | | | | | | | | | | | | | | Fix NULL-pointer dereference on tty open due to a failure to handle a missing interrupt-in endpoint when probing modem ports: BUG: kernel NULL pointer dereference, address: 0000000000000006 ... RIP: 0010:tiocmget_submit_urb+0x1c/0xe0 [hso] ... Call Trace: hso_start_serial_device+0xdc/0x140 [hso] hso_serial_open+0x118/0x1b0 [hso] tty_open+0xf1/0x490 Fixes: 542f54823614 ("tty: Modem functions for the HSO driver") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* dt-bindings: sh_eth convert bindings to json-schemaSimon Horman2019-10-023-70/+115
| | | | | | | | | | Convert Renesas Electronics SH EtherMAC bindings documentation to json-schema. Also name bindings documentation file according to the compat string being documented. Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: usb: ax88179_178a: allow optionally getting mac address from device treePeter Fink2019-10-021-4/+27
| | | | | | | | | | | Adopt and integrate the feature to pass the MAC address via device tree from asix_device.c (03fc5d4) also to other ax88179 based asix chips. E.g. the bootloader fills in local-mac-address and the driver will then pick up and use this MAC address. Signed-off-by: Peter Fink <pfink@christ-es.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: minor code reorg in inet6_fill_ifla6_attrs()Nicolas Dichtel2019-10-011-4/+3
| | | | | | | | Just put related code together to ease code reading: the memcpy() is related to the nla_reserve(). Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'netdev-altnames'David S. Miller2019-10-018-43/+348
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jiri Pirko says: ==================== net: introduce alternative names for network interfaces In the past, there was repeatedly discussed the IFNAMSIZ (16) limit for netdevice name length. Now when we have PF and VF representors with port names like "pfXvfY", it became quite common to hit this limit: 0123456789012345 enp131s0f1npf0vf6 enp131s0f1npf0vf22 Udev cannot rename these interfaces out-of-the-box and user needs to create custom rules to handle them. Also, udev has multiple schemes of netdev names. From udev code: * Type of names: * b<number> - BCMA bus core number * c<bus_id> - bus id of a grouped CCW or CCW device, * with all leading zeros stripped [s390] * o<index>[n<phys_port_name>|d<dev_port>] * - on-board device index number * s<slot>[f<function>][n<phys_port_name>|d<dev_port>] * - hotplug slot index number * x<MAC> - MAC address * [P<domain>]p<bus>s<slot>[f<function>][n<phys_port_name>|d<dev_port>] * - PCI geographical location * [P<domain>]p<bus>s<slot>[f<function>][u<port>][..][c<config>][i<interface>] * - USB port number chain * v<slot> - VIO slot number (IBM PowerVM) * a<vendor><model>i<instance> - Platform bus ACPI instance id * i<addr>n<phys_port_name> - Netdevsim bus address and port name One device can be often renamed by multiple patterns at the same time (e.g. pci address/mac). This patchset introduces alternative names for network interfaces. Main goal is to: 1) Overcome the IFNAMSIZ limitation (altname limitation is 128 bytes) 2) Allow to have multiple names at the same time (multiple udev patterns) 3) Allow to use alternative names as handle for commands The patchset introduces two new commands to add/delete list of properties. Currently only alternative names are implemented but the ifrastructure could be easily extended later on. This is very similar to the list of vlan and tunnels being added/deleted to/from bridge ports. See following examples. $ ip link 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 2: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether ae:67:a9:67:46:86 brd ff:ff:ff:ff:ff:ff -> Add alternative names for dummy0: $ ip link prop add dummy0 altname someothername $ ip link prop add dummy0 altname someotherveryveryveryverylongname $ ip link 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 2: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether ae:67:a9:67:46:86 brd ff:ff:ff:ff:ff:ff altname someothername altname someotherveryveryveryverylongname $ ip link show someotherveryveryveryverylongname 2: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether ae:67:a9:67:46:86 brd ff:ff:ff:ff:ff:ff altname someothername altname someotherveryveryveryverylongname -> Add bridge brx, add it's alternative name and use alternative names to do enslavement. $ ip link add name brx type bridge $ ip link prop add brx altname mypersonalsuperspecialbridge $ ip link set someotherveryveryveryverylongname master mypersonalsuperspecialbridge $ ip link 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 2: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop master brx state DOWN mode DEFAULT group default qlen 1000 link/ether ae:67:a9:67:46:86 brd ff:ff:ff:ff:ff:ff altname someothername altname someotherveryveryveryverylongname 3: brx: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether ae:67:a9:67:46:86 brd ff:ff:ff:ff:ff:ff altname mypersonalsuperspecialbridge -> Add ipv4 address to the bridge using alternative name: $ ip addr add 192.168.0.1/24 dev mypersonalsuperspecialbridge $ ip addr show mypersonalsuperspecialbridge 3: brx: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000 link/ether ae:67:a9:67:46:86 brd ff:ff:ff:ff:ff:ff altname mypersonalsuperspecialbridge inet 192.168.0.1/24 scope global brx valid_lft forever preferred_lft forever -> Delete one of dummy0 alternative names: $ ip link prop del dummy0 altname someotherveryveryveryverylongname $ ip link 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 2: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop master brx state DOWN mode DEFAULT group default qlen 1000 link/ether ae:67:a9:67:46:86 brd ff:ff:ff:ff:ff:ff altname someothername 3: brx: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether ae:67:a9:67:46:86 brd ff:ff:ff:ff:ff:ff altname mypersonalsuperspecialbridge -> Add multiple alternative names at once $ ip link prop add dummy0 altname a altname b altname c altname d $ ip link 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 2: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop master brx state DOWN mode DEFAULT group default qlen 1000 link/ether ae:67:a9:67:46:86 brd ff:ff:ff:ff:ff:ff altname someothername altname a altname b altname c altname d 3: brx: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether ae:67:a9:67:46:86 brd ff:ff:ff:ff:ff:ff altname mypersonalsuperspecialbridge ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: rtnetlink: add possibility to use alternative names as message handleJiri Pirko2019-10-011-11/+18
| | | | | | | | | | | | | | | | Extend the basic rtnetlink commands to use alternative interface names as a handle instead of ifindex and ifname. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: rtnetlink: introduce helper to get net_device instance by ifnameJiri Pirko2019-10-011-20/+25
| | | | | | | | | | | | | | | | Introduce helper function rtnl_get_dev() that gets net_device structure instance pointer according to passed ifname or ifname attribute. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: rtnetlink: unify the code in __rtnl_newlink get dev with the restJiri Pirko2019-10-011-6/+4
| | | | | | | | | | | | | | | | | | __rtnl_newlink() code flow is a bit different around tb[IFLA_IFNAME] processing comparing to the other places. Change that to be unified with the rest. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: rtnetlink: put alternative names to getlink messageJiri Pirko2019-10-011-0/+53
| | | | | | | | | | | | | | | | Extend exiting getlink info message with list of properties. Now the only ones are alternative names. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: rtnetlink: add linkprop commands to add and delete alternative ifnamesJiri Pirko2019-10-017-2/+177
| | | | | | | | | | | | | | | | | | Add two commands to add and delete list of link properties. Implement the first property type along - alternative ifnames. Each net device can have multiple alternative names. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: introduce name_node struct to be used in hashlistJiri Pirko2019-10-012-20/+87
| | | | | | | | | | | | | | | | | | | | | | Introduce name_node structure to hold name of device and put it into hashlist instead of putting there struct net_device directly. Add a necessary infrastructure to manipulate the hashlist. This prepares the code to use the same hashlist for alternative names introduced later in this set. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: procfs: use index hashlist instead of name hashlistJiri Pirko2019-10-011-2/+2
|/ | | | | | | | Name hashlist is going to be used for more than just dev->name, so use rather index hashlist for iteration over net_device instances. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: add ipv6_addr_v4mapped_loopback() helperEric Dumazet2019-10-012-4/+7
| | | | | | | | | | | tcp_twsk_unique() has a hard coded assumption about ipv4 loopback being 127/8 Lets instead use the standard ipv4_is_loopback() method, in a new ipv6_addr_v4mapped_loopback() helper. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: core: dev: replace state xoff flag comparison by netif_xmit_stopped methodJulio Faracco2019-10-011-1/+1
| | | | | | | | | | Function netif_schedule_queue() has a hardcoded comparison between queue state and any xoff flag. This comparison does the same thing as method netif_xmit_stopped(). In terms of code clarity, it is better. See other methods like: generic_xdp_tx() and dev_direct_xmit(). Signed-off-by: Julio Faracco <jcfaracco@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* r8152: Factor out OOB link list waitsPrashant Malani2019-10-011-52/+21
| | | | | | | | | | | | The same for-loop check for the LINK_LIST_READY bit of an OOB_CTRL register is used in several places. Factor these out into a single function to reduce the lines of code. Change-Id: I20e8f327045a72acc0a83e2d145ae2993ab62915 Signed-off-by: Prashant Malani <pmalani@chromium.org> Reviewed-by: Grant Grundler <grundler@chromium.org> Acked-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds2019-09-29169-1307/+1225
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) Sanity check URB networking device parameters to avoid divide by zero, from Oliver Neukum. 2) Disable global multicast filter in NCSI, otherwise LLDP and IPV6 don't work properly. Longer term this needs a better fix tho. From Vijay Khemka. 3) Small fixes to selftests (use ping when ping6 is not present, etc.) from David Ahern. 4) Bring back rt_uses_gateway member of struct rtable, it's semantics were not well understood and trying to remove it broke things. From David Ahern. 5) Move usbnet snaity checking, ignore endpoints with invalid wMaxPacketSize. From Bjørn Mork. 6) Missing Kconfig deps for sja1105 driver, from Mao Wenan. 7) Various small fixes to the mlx5 DR steering code, from Alaa Hleihel, Alex Vesker, and Yevgeny Kliteynik 8) Missing CAP_NET_RAW checks in various places, from Ori Nimron. 9) Fix crash when removing sch_cbs entry while offloading is enabled, from Vinicius Costa Gomes. 10) Signedness bug fixes, generally in looking at the result given by of_get_phy_mode() and friends. From Dan Crapenter. 11) Disable preemption around BPF_PROG_RUN() calls, from Eric Dumazet. 12) Don't create VRF ipv6 rules if ipv6 is disabled, from David Ahern. 13) Fix quantization code in tcp_bbr, from Kevin Yang. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (127 commits) net: tap: clean up an indentation issue nfp: abm: fix memory leak in nfp_abm_u32_knode_replace tcp: better handle TCP_USER_TIMEOUT in SYN_SENT state sk_buff: drop all skb extensions on free and skb scrubbing tcp_bbr: fix quantization code to not raise cwnd if not probing bandwidth mlxsw: spectrum_flower: Fail in case user specifies multiple mirror actions Documentation: Clarify trap's description mlxsw: spectrum: Clear VLAN filters during port initialization net: ena: clean up indentation issue NFC: st95hf: clean up indentation issue net: phy: micrel: add Asym Pause workaround for KSZ9021 net: socionext: ave: Avoid using netdev_err() before calling register_netdev() ptp: correctly disable flags on old ioctls lib: dimlib: fix help text typos net: dsa: microchip: Always set regmap stride to 1 nfp: flower: fix memory leak in nfp_flower_spawn_vnic_reprs nfp: flower: prevent memory leak in nfp_flower_spawn_phy_reprs net/sched: Set default of CONFIG_NET_TC_SKB_EXT to N vrf: Do not attempt to create IPv6 mcast rule if IPv6 is disabled net: sched: sch_sfb: don't call qdisc_put() while holding tree lock ...
| * net: tap: clean up an indentation issueColin Ian King2019-09-271-1/+1
| | | | | | | | | | | | | | | | There is a statement that is indented too deeply, remove the extraneous tab. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * nfp: abm: fix memory leak in nfp_abm_u32_knode_replaceNavid Emamdoost2019-09-271-4/+10
| | | | | | | | | | | | | | | | | | In nfp_abm_u32_knode_replace if the allocation for match fails it should go to the error handling instead of returning. Updated other gotos to have correct errno returned, too. Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp: better handle TCP_USER_TIMEOUT in SYN_SENT stateEric Dumazet2019-09-271-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Yuchung Cheng and Marek Majkowski independently reported a weird behavior of TCP_USER_TIMEOUT option when used at connect() time. When the TCP_USER_TIMEOUT is reached, tcp_write_timeout() believes the flow should live, and the following condition in tcp_clamp_rto_to_user_timeout() programs one jiffie timers : remaining = icsk->icsk_user_timeout - elapsed; if (remaining <= 0) return 1; /* user timeout has passed; fire ASAP */ This silly situation ends when the max syn rtx count is reached. This patch makes sure we honor both TCP_SYNCNT and TCP_USER_TIMEOUT, avoiding these spurious SYN packets. Fixes: b701a99e431d ("tcp: Add tcp_clamp_rto_to_user_timeout() helper to improve accuracy") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Yuchung Cheng <ycheng@google.com> Reported-by: Marek Majkowski <marek@cloudflare.com> Cc: Jon Maxwell <jmaxwell37@gmail.com> Link: https://marc.info/?l=linux-netdev&m=156940118307949&w=2 Acked-by: Jon Maxwell <jmaxwell37@gmail.com> Tested-by: Marek Majkowski <marek@cloudflare.com> Signed-off-by: Marek Majkowski <marek@cloudflare.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * sk_buff: drop all skb extensions on free and skb scrubbingFlorian Westphal2019-09-273-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we have a 3rd extension, add a new helper that drops the extension space and use it when we need to scrub an sk_buff. At this time, scrubbing clears secpath and bridge netfilter data, but retains the tc skb extension, after this patch all three get cleared. NAPI reuse/free assumes we can only have a secpath attached to skb, but it seems better to clear all extensions there as well. v2: add unlikely hint (Eric Dumazet) Fixes: 95a7233c452a ("net: openvswitch: Set OvS recirc_id from tc chain index") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp_bbr: fix quantization code to not raise cwnd if not probing bandwidthKevin(Yudong) Yang2019-09-271-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There was a bug in the previous logic that attempted to ensure gain cycling gets inflight above BDP even for small BDPs. This code correctly raised and lowered target inflight values during the gain cycle. And this code correctly ensured that cwnd was raised when probing bandwidth. However, it did not correspondingly ensure that cwnd was *not* raised in this way when *not* probing for bandwidth. The result was that small-BDP flows that were always cwnd-bound could go for many cycles with a fixed cwnd, and not probe or yield bandwidth at all. This meant that multiple small-BDP flows could fail to converge in their bandwidth allocations. Fixes: 3c346b233c68 ("tcp_bbr: fix bw probing to raise in-flight data for very small BDPs") Signed-off-by: Kevin(Yudong) Yang <yyd@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'mlxsw-Various-fixes'David S. Miller2019-09-274-8/+17
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Ido Schimmel says: ==================== mlxsw: Various fixes This patchset includes two small fixes for the mlxsw driver and one patch which clarifies recently introduced devlink-trap documentation. Patch #1 clears the port's VLAN filters during port initialization. This ensures that the drop reason reported to the user is consistent. The problem is explained in detail in the commit message. Patch #2 clarifies the description of one of the traps exposed via devlink-trap. Patch #3 from Danielle forbids the installation of a tc filter with multiple mirror actions since this is not supported by the device. The failure is communicated to the user via extack. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * mlxsw: spectrum_flower: Fail in case user specifies multiple mirror actionsDanielle Ratson2019-09-271-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ASIC can only mirror a packet to one port, but when user is trying to set more than one mirror action, it doesn't fail. Add a check if more than one mirror action was specified per rule and if so, fail for not being supported. Fixes: d0d13c1858a11 ("mlxsw: spectrum_acl: Add support for mirror action") Signed-off-by: Danielle Ratson <danieller@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * Documentation: Clarify trap's descriptionIdo Schimmel2019-09-271-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Alex noted that the below description might not be obvious to all users. Clarify it by adding an example. Fixes: f3047ca01f12 ("Documentation: Add devlink-trap documentation") Reported-by: Alex Kushnarov <alexanderk@mellanox.com> Reviewed-by: Alex Kushnarov <alexanderk@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * mlxsw: spectrum: Clear VLAN filters during port initializationIdo Schimmel2019-09-272-7/+9
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a port is created, its VLAN filters are not cleared by the firmware. This causes tagged packets to be later dropped by the ingress STP filters, which default to DISCARD state. The above did not matter much until commit b5ce611fd96e ("mlxsw: spectrum: Add devlink-trap support") where we exposed the drop reason to users. Without this patch, the drop reason users will see is not consistent. If a port is enslaved to a VLAN-aware bridge and a packet with an invalid VLAN tries to ingress the bridge, it will be dropped due to ingress STP filter. If the VLAN is later enabled and then disabled, the packet will be dropped by the ingress VLAN filter despite the above being a seemingly NOP operation. Fix this by clearing all the VLAN filters during port initialization. Adjust the test accordingly. Fixes: b5ce611fd96e ("mlxsw: spectrum: Add devlink-trap support") Reported-by: Alex Kushnarov <alexanderk@mellanox.com> Tested-by: Alex Kushnarov <alexanderk@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: ena: clean up indentation issueColin Ian King2019-09-271-2/+2
| | | | | | | | | | | | | | There memset is indented incorrectly, remove the extraneous tabs. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * NFC: st95hf: clean up indentation issueColin Ian King2019-09-271-1/+1
| | | | | | | | | | | | | | | | The return statement is indented incorrectly, add in a missing tab and remove an extraneous space after the return Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: phy: micrel: add Asym Pause workaround for KSZ9021Hans Andersson2019-09-271-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Micrel KSZ9031 PHY may fail to establish a link when the Asymmetric Pause capability is set. This issue is described in a Silicon Errata (DS80000691D or DS80000692D), which advises to always disable the capability. Micrel KSZ9021 has no errata, but has the same issue with Asymmetric Pause. This patch apply the same workaround as the one for KSZ9031. Fixes: 3aed3e2a143c ("net: phy: micrel: add Asym Pause workaround") Signed-off-by: Hans Andersson <hans.andersson@cellavision.se> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: socionext: ave: Avoid using netdev_err() before calling register_netdev()Kunihiko Hayashi2019-09-271-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | Until calling register_netdev(), ndev->dev_name isn't specified, and netdev_err() displays "(unnamed net_device)". ave 65000000.ethernet (unnamed net_device) (uninitialized): invalid phy-mode setting ave: probe of 65000000.ethernet failed with error -22 This replaces netdev_err() with dev_err() before calling register_netdev(). Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ptp: correctly disable flags on old ioctlsJacob Keller2019-09-272-2/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 415606588c61 ("PTP: introduce new versions of IOCTLs", 2019-09-13) introduced new versions of the PTP ioctls which actually validate that the flags are acceptable values. As part of this, it cleared the flags value using a bitwise and+negation, in an attempt to prevent the old ioctl from accidentally enabling new features. This is incorrect for a couple of reasons. First, it results in accidentally preventing previously working flags on the request ioctl. By clearing the "valid" flags, we now no longer allow setting the enable, rising edge, or falling edge flags. Second, if we add new additional flags in the future, they must not be set by the old ioctl. (Since the flag wasn't checked before, we could potentially break userspace programs which sent garbage flag data. The correct way to resolve this is to check for and clear all but the originally valid flags. Create defines indicating which flags are correctly checked and interpreted by the original ioctls. Use these to clear any bits which will not be correctly interpreted by the original ioctls. In the future, new flags must be added to the VALID_FLAGS macros, but *not* to the V1_VALID_FLAGS macros. In this way, new features may be exposed over the v2 ioctls, but without breaking previous userspace which happened to not clear the flags value properly. The old ioctl will continue to behave the same way, while the new ioctl gains the benefit of using the flags fields. Cc: Richard Cochran <richardcochran@gmail.com> Cc: Felipe Balbi <felipe.balbi@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Christopher Hall <christopher.s.hall@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * lib: dimlib: fix help text typosRandy Dunlap2019-09-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Fix help text typos for DIMLIB. Fixes: 4f75da3666c0 ("linux/dim: Move implementation to .c files") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Uwe Kleine-König <uwe@kleine-koenig.org> Cc: Tal Gilboa <talgi@mellanox.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Uwe Kleine-König <uwe@kleine-koenig.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: dsa: microchip: Always set regmap stride to 1Marek Vasut2019-09-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The regmap stride is set to 1 for regmap describing 8bit registers already. However, for 16/32/64bit registers, the stride is 2/4/8 respectively. This is not correct, as the switch protocol supports unaligned register reads and writes and the KSZ87xx even uses such unaligned register accesses to read e.g. MIB counter. This patch fixes MIB counter access on KSZ87xx. Signed-off-by: Marek Vasut <marex@denx.de> Cc: Andrew Lunn <andrew@lunn.ch> Cc: David S. Miller <davem@davemloft.net> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: George McCollister <george.mccollister@gmail.com> Cc: Tristram Ha <Tristram.Ha@microchip.com> Cc: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Cc: Woojung Huh <woojung.huh@microchip.com> Fixes: 46558d601cb6 ("net: dsa: microchip: Initial SPI regmap support") Fixes: 255b59ad0db2 ("net: dsa: microchip: Factor out regmap config generation into common header") Reviewed-by: George McCollister <george.mccollister@gmail.com> Tested-by: George McCollister <george.mccollister@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller2019-09-277-11/+51
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Add NFT_CHAIN_POLICY_UNSET to replace hardcoded -1 to specify that the chain policy is unset. The chain policy field is actually defined as an 8-bit unsigned integer. 2) Remove always true condition reported by smatch in chain policy check. 3) Fix element lookup on dynamic sets, from Florian Westphal. 4) Use __u8 in ebtables uapi header, from Masahiro Yamada. 5) Bogus EBUSY when removing flowtable after chain flush, from Laura Garcia Liebana. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * netfilter: nf_tables: bogus EBUSY when deleting flowtable after flushLaura Garcia Liebana2019-09-253-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | The deletion of a flowtable after a flush in the same transaction results in EBUSY. This patch adds an activation and deactivation of flowtables in order to update the _use_ counter. Signed-off-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * netfilter: ebtables: use __u8 instead of uint8_t in uapi headerMasahiro Yamada2019-09-252-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CONFIG_UAPI_HEADER_TEST=y, exported headers are compile-tested to make sure they can be included from user-space. Currently, linux/netfilter_bridge/ebtables.h is excluded from the test coverage. To make it join the compile-test, we need to fix the build errors attached below. For a case like this, we decided to use __u{8,16,32,64} variable types in this discussion: https://lkml.org/lkml/2019/6/5/18 Build log: CC usr/include/linux/netfilter_bridge/ebtables.h.s In file included from <command-line>:32:0: ./usr/include/linux/netfilter_bridge/ebtables.h:126:4: error: unknown type name ‘uint8_t’ uint8_t revision; ^~~~~~~ ./usr/include/linux/netfilter_bridge/ebtables.h:139:4: error: unknown type name ‘uint8_t’ uint8_t revision; ^~~~~~~ ./usr/include/linux/netfilter_bridge/ebtables.h:152:4: error: unknown type name ‘uint8_t’ uint8_t revision; ^~~~~~~ Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * netfilter: nf_tables: allow lookups in dynamic setsFlorian Westphal2019-09-202-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This un-breaks lookups in sets that have the 'dynamic' flag set. Given this active example configuration: table filter { set set1 { type ipv4_addr size 64 flags dynamic,timeout timeout 1m } chain input { type filter hook input priority 0; policy accept; } } ... this works: nft add rule ip filter input add @set1 { ip saddr } -> whenever rule is triggered, the source ip address is inserted into the set (if it did not exist). This won't work: nft add rule ip filter input ip saddr @set1 counter Error: Could not process rule: Operation not supported In other words, we can add entries to the set, but then can't make matching decision based on that set. That is just wrong -- all set backends support lookups (else they would not be very useful). The failure comes from an explicit rejection in nft_lookup.c. Looking at the history, it seems like NFT_SET_EVAL used to mean 'set contains expressions' (aka. "is a meter"), for instance something like nft add rule ip filter input meter example { ip saddr limit rate 10/second } or nft add rule ip filter input meter example { ip saddr counter } The actual meaning of NFT_SET_EVAL however, is 'set can be updated from the packet path'. 'meters' and packet-path insertions into sets, such as 'add @set { ip saddr }' use exactly the same kernel code (nft_dynset.c) and thus require a set backend that provides the ->update() function. The only set that provides this also is the only one that has the NFT_SET_EVAL feature flag. Removing the wrong check makes the above example work. While at it, also fix the flag check during set instantiation to allow supported combinations only. Fixes: 8aeff920dcc9b3f ("netfilter: nf_tables: add stateful object reference to set elements") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * netfilter: nf_tables_offload: fix always true policy is unset checkPablo Neira Ayuso2019-09-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | New smatch warnings: net/netfilter/nf_tables_offload.c:316 nft_flow_offload_chain() warn: always true condition '(policy != -1) => (0-255 != (-1))' Reported-by: kbuild test robot <lkp@intel.com> Fixes: c9626a2cbdb2 ("netfilter: nf_tables: add hardware offload support") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * netfilter: nf_tables: add NFT_CHAIN_POLICY_UNSET and use itPablo Neira Ayuso2019-09-202-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | Default policy is defined as a unsigned 8-bit field, do not use a negative value to leave it unset, use this new NFT_CHAIN_POLICY_UNSET instead. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>