diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2021-07-01 00:51:09 +0200 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2021-07-01 00:51:09 +0200 |
commit | dbe69e43372212527abf48609aba7fc39a6daa27 (patch) | |
tree | 96cfafdf70f5325ceeac1054daf7deca339c9730 /Documentation/networking/devlink | |
parent | Merge tag 'sched-urgent-2021-06-30' of git://git.kernel.org/pub/scm/linux/ker... (diff) | |
parent | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net (diff) | |
download | linux-dbe69e43372212527abf48609aba7fc39a6daa27.tar.xz linux-dbe69e43372212527abf48609aba7fc39a6daa27.zip |
Merge tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- BPF:
- add syscall program type and libbpf support for generating
instructions and bindings for in-kernel BPF loaders (BPF loaders
for BPF), this is a stepping stone for signed BPF programs
- infrastructure to migrate TCP child sockets from one listener to
another in the same reuseport group/map to improve flexibility
of service hand-off/restart
- add broadcast support to XDP redirect
- allow bypass of the lockless qdisc to improving performance (for
pktgen: +23% with one thread, +44% with 2 threads)
- add a simpler version of "DO_ONCE()" which does not require jump
labels, intended for slow-path usage
- virtio/vsock: introduce SOCK_SEQPACKET support
- add getsocketopt to retrieve netns cookie
- ip: treat lowest address of a IPv4 subnet as ordinary unicast
address allowing reclaiming of precious IPv4 addresses
- ipv6: use prandom_u32() for ID generation
- ip: add support for more flexible field selection for hashing
across multi-path routes (w/ offload to mlxsw)
- icmp: add support for extended RFC 8335 PROBE (ping)
- seg6: add support for SRv6 End.DT46 behavior
- mptcp:
- DSS checksum support (RFC 8684) to detect middlebox meddling
- support Connection-time 'C' flag
- time stamping support
- sctp: packetization Layer Path MTU Discovery (RFC 8899)
- xfrm: speed up state addition with seq set
- WiFi:
- hidden AP discovery on 6 GHz and other HE 6 GHz improvements
- aggregation handling improvements for some drivers
- minstrel improvements for no-ack frames
- deferred rate control for TXQs to improve reaction times
- switch from round robin to virtual time-based airtime scheduler
- add trace points:
- tcp checksum errors
- openvswitch - action execution, upcalls
- socket errors via sk_error_report
Device APIs:
- devlink: add rate API for hierarchical control of max egress rate
of virtual devices (VFs, SFs etc.)
- don't require RCU read lock to be held around BPF hooks in NAPI
context
- page_pool: generic buffer recycling
New hardware/drivers:
- mobile:
- iosm: PCIe Driver for Intel M.2 Modem
- support for Qualcomm MSM8998 (ipa)
- WiFi: Qualcomm QCN9074 and WCN6855 PCI devices
- sparx5: Microchip SparX-5 family of Enterprise Ethernet switches
- Mellanox BlueField Gigabit Ethernet (control NIC of the DPU)
- NXP SJA1110 Automotive Ethernet 10-port switch
- Qualcomm QCA8327 switch support (qca8k)
- Mikrotik 10/25G NIC (atl1c)
Driver changes:
- ACPI support for some MDIO, MAC and PHY devices from Marvell and
NXP (our first foray into MAC/PHY description via ACPI)
- HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx
- Mellanox/Nvidia NIC (mlx5)
- NIC VF offload of L2 bridging
- support IRQ distribution to Sub-functions
- Marvell (prestera):
- add flower and match all
- devlink trap
- link aggregation
- Netronome (nfp): connection tracking offload
- Intel 1GE (igc): add AF_XDP support
- Marvell DPU (octeontx2): ingress ratelimit offload
- Google vNIC (gve): new ring/descriptor format support
- Qualcomm mobile (rmnet & ipa): inline checksum offload support
- MediaTek WiFi (mt76)
- mt7915 MSI support
- mt7915 Tx status reporting
- mt7915 thermal sensors support
- mt7921 decapsulation offload
- mt7921 enable runtime pm and deep sleep
- Realtek WiFi (rtw88)
- beacon filter support
- Tx antenna path diversity support
- firmware crash information via devcoredump
- Qualcomm WiFi (wcn36xx)
- Wake-on-WLAN support with magic packets and GTK rekeying
- Micrel PHY (ksz886x/ksz8081): add cable test support"
* tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2168 commits)
tcp: change ICSK_CA_PRIV_SIZE definition
tcp_yeah: check struct yeah size at compile time
gve: DQO: Fix off by one in gve_rx_dqo()
stmmac: intel: set PCI_D3hot in suspend
stmmac: intel: Enable PHY WOL option in EHL
net: stmmac: option to enable PHY WOL with PMT enabled
net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del}
net: use netdev_info in ndo_dflt_fdb_{add,del}
ptp: Set lookup cookie when creating a PTP PPS source.
net: sock: add trace for socket errors
net: sock: introduce sk_error_report
net: dsa: replay the local bridge FDB entries pointing to the bridge dev too
net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev
net: dsa: include fdb entries pointing to bridge in the host fdb list
net: dsa: include bridge addresses which are local in the host fdb list
net: dsa: sync static FDB entries on foreign interfaces to hardware
net: dsa: install the host MDB and FDB entries in the master's RX filter
net: dsa: reference count the FDB addresses at the cross-chip notifier level
net: dsa: introduce a separate cross-chip notifier type for host FDBs
net: dsa: reference count the MDB entries at the cross-chip notifier level
...
Diffstat (limited to 'Documentation/networking/devlink')
-rw-r--r-- | Documentation/networking/devlink/devlink-port.rst | 35 | ||||
-rw-r--r-- | Documentation/networking/devlink/devlink-trap.rst | 1 | ||||
-rw-r--r-- | Documentation/networking/devlink/index.rst | 1 | ||||
-rw-r--r-- | Documentation/networking/devlink/netdevsim.rst | 26 | ||||
-rw-r--r-- | Documentation/networking/devlink/prestera.rst | 141 |
5 files changed, 204 insertions, 0 deletions
diff --git a/Documentation/networking/devlink/devlink-port.rst b/Documentation/networking/devlink/devlink-port.rst index ab790e7980b8..7627b1da01f2 100644 --- a/Documentation/networking/devlink/devlink-port.rst +++ b/Documentation/networking/devlink/devlink-port.rst @@ -164,6 +164,41 @@ device to instantiate the subfunction device on particular PCI function. A subfunction device is created on the :ref:`Documentation/driver-api/auxiliary_bus.rst <auxiliary_bus>`. At this point a matching subfunction driver binds to the subfunction's auxiliary device. +Rate object management +====================== + +Devlink provides API to manage tx rates of single devlink port or a group. +This is done through rate objects, which can be one of the two types: + +``leaf`` + Represents a single devlink port; created/destroyed by the driver. Since leaf + have 1to1 mapping to its devlink port, in user space it is referred as + ``pci/<bus_addr>/<port_index>``; + +``node`` + Represents a group of rate objects (leafs and/or nodes); created/deleted by + request from the userspace; initially empty (no rate objects added). In + userspace it is referred as ``pci/<bus_addr>/<node_name>``, where + ``node_name`` can be any identifier, except decimal number, to avoid + collisions with leafs. + +API allows to configure following rate object's parameters: + +``tx_share`` + Minimum TX rate value shared among all other rate objects, or rate objects + that parts of the parent group, if it is a part of the same group. + +``tx_max`` + Maximum TX rate value. + +``parent`` + Parent node name. Parent node rate limits are considered as additional limits + to all node children limits. ``tx_max`` is an upper limit for children. + ``tx_share`` is a total bandwidth distributed among children. + +Driver implementations are allowed to support both or either rate object types +and setting methods of their parameters. + Terms and Definitions ===================== diff --git a/Documentation/networking/devlink/devlink-trap.rst b/Documentation/networking/devlink/devlink-trap.rst index efa5f7f42c88..90d1381b88de 100644 --- a/Documentation/networking/devlink/devlink-trap.rst +++ b/Documentation/networking/devlink/devlink-trap.rst @@ -497,6 +497,7 @@ drivers: * Documentation/networking/devlink/netdevsim.rst * Documentation/networking/devlink/mlxsw.rst + * Documentation/networking/devlink/prestera.rst .. _Generic-Packet-Trap-Groups: diff --git a/Documentation/networking/devlink/index.rst b/Documentation/networking/devlink/index.rst index 8428a1220723..b3b9e0692088 100644 --- a/Documentation/networking/devlink/index.rst +++ b/Documentation/networking/devlink/index.rst @@ -46,3 +46,4 @@ parameters, info versions, and other features it supports. qed ti-cpsw-switch am65-nuss-cpsw-switch + prestera diff --git a/Documentation/networking/devlink/netdevsim.rst b/Documentation/networking/devlink/netdevsim.rst index 02c2d20dc673..8a292fb5aaea 100644 --- a/Documentation/networking/devlink/netdevsim.rst +++ b/Documentation/networking/devlink/netdevsim.rst @@ -57,6 +57,32 @@ entries, FIB rule entries and nexthops that the driver will allow. $ devlink resource set netdevsim/netdevsim0 path /nexthops size 16 $ devlink dev reload netdevsim/netdevsim0 +Rate objects +============ + +The ``netdevsim`` driver supports rate objects management, which includes: + +- registerging/unregistering leaf rate objects per VF devlink port; +- creation/deletion node rate objects; +- setting tx_share and tx_max rate values for any rate object type; +- setting parent node for any rate object type. + +Rate nodes and it's parameters are exposed in ``netdevsim`` debugfs in RO mode. +For example created rate node with name ``some_group``: + +.. code:: shell + + $ ls /sys/kernel/debug/netdevsim/netdevsim0/rate_groups/some_group + rate_parent tx_max tx_share + +Same parameters are exposed for leaf objects in corresponding ports directories. +For ex.: + +.. code:: shell + + $ ls /sys/kernel/debug/netdevsim/netdevsim0/ports/1 + dev ethtool rate_parent tx_max tx_share + Driver-specific Traps ===================== diff --git a/Documentation/networking/devlink/prestera.rst b/Documentation/networking/devlink/prestera.rst new file mode 100644 index 000000000000..49409d1d3081 --- /dev/null +++ b/Documentation/networking/devlink/prestera.rst @@ -0,0 +1,141 @@ +.. SPDX-License-Identifier: GPL-2.0 + +======================== +prestera devlink support +======================== + +This document describes the devlink features implemented by the ``prestera`` +device driver. + +Driver-specific Traps +===================== + +.. list-table:: List of Driver-specific Traps Registered by ``prestera`` + :widths: 5 5 90 + + * - Name + - Type + - Description +.. list-table:: List of Driver-specific Traps Registered by ``prestera`` + :widths: 5 5 90 + + * - Name + - Type + - Description + * - ``arp_bc`` + - ``trap`` + - Traps ARP broadcast packets (both requests/responses) + * - ``is_is`` + - ``trap`` + - Traps IS-IS packets + * - ``ospf`` + - ``trap`` + - Traps OSPF packets + * - ``ip_bc_mac`` + - ``trap`` + - Traps IPv4 packets with broadcast DA Mac address + * - ``stp`` + - ``trap`` + - Traps STP BPDU + * - ``lacp`` + - ``trap`` + - Traps LACP packets + * - ``lldp`` + - ``trap`` + - Traps LLDP packets + * - ``router_mc`` + - ``trap`` + - Traps multicast packets + * - ``vrrp`` + - ``trap`` + - Traps VRRP packets + * - ``dhcp`` + - ``trap`` + - Traps DHCP packets + * - ``mtu_error`` + - ``trap`` + - Traps (exception) packets that exceeded port's MTU + * - ``mac_to_me`` + - ``trap`` + - Traps packets with switch-port's DA Mac address + * - ``ttl_error`` + - ``trap`` + - Traps (exception) IPv4 packets whose TTL exceeded + * - ``ipv4_options`` + - ``trap`` + - Traps (exception) packets due to the malformed IPV4 header options + * - ``ip_default_route`` + - ``trap`` + - Traps packets that have no specific IP interface (IP to me) and no forwarding prefix + * - ``local_route`` + - ``trap`` + - Traps packets that have been send to one of switch IP interfaces addresses + * - ``ipv4_icmp_redirect`` + - ``trap`` + - Traps (exception) IPV4 ICMP redirect packets + * - ``arp_response`` + - ``trap`` + - Traps ARP replies packets that have switch-port's DA Mac address + * - ``acl_code_0`` + - ``trap`` + - Traps packets that have ACL priority set to 0 (tc pref 0) + * - ``acl_code_1`` + - ``trap`` + - Traps packets that have ACL priority set to 1 (tc pref 1) + * - ``acl_code_2`` + - ``trap`` + - Traps packets that have ACL priority set to 2 (tc pref 2) + * - ``acl_code_3`` + - ``trap`` + - Traps packets that have ACL priority set to 3 (tc pref 3) + * - ``acl_code_4`` + - ``trap`` + - Traps packets that have ACL priority set to 4 (tc pref 4) + * - ``acl_code_5`` + - ``trap`` + - Traps packets that have ACL priority set to 5 (tc pref 5) + * - ``acl_code_6`` + - ``trap`` + - Traps packets that have ACL priority set to 6 (tc pref 6) + * - ``acl_code_7`` + - ``trap`` + - Traps packets that have ACL priority set to 7 (tc pref 7) + * - ``ipv4_bgp`` + - ``trap`` + - Traps IPv4 BGP packets + * - ``ssh`` + - ``trap`` + - Traps SSH packets + * - ``telnet`` + - ``trap`` + - Traps Telnet packets + * - ``icmp`` + - ``trap`` + - Traps ICMP packets + * - ``rxdma_drop`` + - ``drop`` + - Drops packets (RxDMA) due to the lack of ingress buffers etc. + * - ``port_no_vlan`` + - ``drop`` + - Drops packets due to faulty-configured network or due to internal bug (config issue). + * - ``local_port`` + - ``drop`` + - Drops packets whose decision (FDB entry) is to bridge packet back to the incoming port/trunk. + * - ``invalid_sa`` + - ``drop`` + - Drops packets with multicast source MAC address. + * - ``illegal_ip_addr`` + - ``drop`` + - Drops packets with illegal SIP/DIP multicast/unicast addresses. + * - ``illegal_ipv4_hdr`` + - ``drop`` + - Drops packets with illegal IPV4 header. + * - ``ip_uc_dip_da_mismatch`` + - ``drop`` + - Drops packets with destination MAC being unicast, but destination IP address being multicast. + * - ``ip_sip_is_zero`` + - ``drop`` + - Drops packets with zero (0) IPV4 source address. + * - ``met_red`` + - ``drop`` + - Drops non-conforming packets (dropped by Ingress policer, metering drop), e.g. packet rate exceeded configured bandwith. |