linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	net: systemport: Enable all RX descriptors for SYSTEMPORT Lite	Florian Fainelli	2022-10-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	The original commit that added support for the SYSTEMPORT Lite variant halved the number of RX descriptors due to a confusion between the number of descriptors and the number of descriptor words. There are 512 descriptor words which means 256 descriptors total. Fixes: 44a4524c54af ("net: systemport: Add support for SYSTEMPORT Lite") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20221007034201.4126054-1-f.fainelli@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	net: prestera: span: do not unbind things things that were never bound	Maksym Glubokiy	2022-10-11	1	-1/+4
\| \| \| \| \| \| \|	Fixes: 13defa275eef ("net: marvell: prestera: Add matchall support") Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu> Link: https://lore.kernel.org/r/20221006190600.881740-1-maksym.glubokiy@plvision.eu Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	octeontx2-pf: mcs: fix possible memory leak in otx2_probe()	Yang Yingliang	2022-10-10	1	-1/+3
\| \| \| \| \| \| \| \| \|	In error path after calling cn10k_mcs_init(), cn10k_mcs_free() need be called to avoid memory leak. Fixes: c54ffc73601c ("octeontx2-pf: mcs: Introduce MACSEC hardware offloading") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	octeontx2-af: cn10k: mcs: Fix error return code in mcs_register_interrupts()	Yang Yingliang	2022-10-09	1	-1/+3
\| \| \| \| \| \| \| \| \|	If alloc_mem() fails in mcs_register_interrupts(), it should return error code. Fixes: 6c635f78c474 ("octeontx2-af: cn10k: mcs: Handle MCS block interrupts") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	octeontx2-pf: mcs: fix missing unlock in some error paths	Yang Yingliang	2022-10-09	1	-2/+3
\| \| \| \| \| \| \| \|	Add the missing unlock in some error paths. Fixes: c54ffc73601c ("octeontx2-pf: mcs: Introduce MACSEC hardware offloading") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	macvlan: enforce a consistent minimal mtu	Eric Dumazet	2022-10-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	macvlan should enforce a minimal mtu of 68, even at link creation. This patch avoids the current behavior (which could lead to crashes in ipv6 stack if the link is brought up) $ ip link add macvlan1 link eno1 mtu 8 type macvlan # This should fail ! $ ip link sh dev macvlan1 5: macvlan1@eno1: <BROADCAST,MULTICAST> mtu 8 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether 02:47:6c:24:74:82 brd ff:ff:ff:ff:ff:ff $ ip link set macvlan1 mtu 67 Error: mtu less than device minimum. $ ip link set macvlan1 mtu 68 $ ip link set macvlan1 mtu 8 Error: mtu less than device minimum. Fixes: 91572088e3fd ("net: use core MTU range checking in core net infra") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: ethernet: bgmac: Remove -Warray-bounds exception	Kees Cook	2022-10-07	1	-5/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	GCC-12 emits false positive -Warray-bounds warnings with CONFIG_UBSAN_SHIFT (-fsanitize=shift). This is fixed in GCC 13[1], and there is top-level Makefile logic to remove -Warray-bounds for known-bad GCC versions staring with commit f0be87c42cbd ("gcc-12: disable '-Warray-bounds' universally for now"). Remove the local work-around. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105679 Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: ethernet: mediatek: Remove -Warray-bounds exception	Kees Cook	2022-10-07	1	-5/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	GCC-12 emits false positive -Warray-bounds warnings with CONFIG_UBSAN_SHIFT (-fsanitize=shift). This is fixed in GCC 13[1], and there is top-level Makefile logic to remove -Warray-bounds for known-bad GCC versions staring with commit f0be87c42cbd ("gcc-12: disable '-Warray-bounds' universally for now"). Remove the local work-around. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105679 Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	prestera: matchall: do not rollback if rule exists	Serhiy Boiko	2022-10-07	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If you try to create a 'mirror' ACL rule on a port that already has a mirror rule, prestera_span_rule_add() will fail with EEXIST error. This forces rollback procedure which destroys existing mirror rule on hardware leaving it visible in linux. Add an explicit check for EEXIST to prevent the deletion of the existing rule but keep user seeing error message: $ tc filter add dev sw1p1 ... skip_sw action mirred egress mirror dev sw1p2 $ tc filter add dev sw1p1 ... skip_sw action mirred egress mirror dev sw1p3 RTNETLINK answers: File exists We have an error talking to the kernel Fixes: 13defa275eef ("net: marvell: prestera: Add matchall support") Signed-off-by: Serhiy Boiko <serhiy.boiko@plvision.eu> Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: enetc: Remove duplicated include in enetc_qos.c	Yang Li	2022-10-07	1	-1/+0
\| \| \| \| \| \| \| \| \| \|	net/pkt_sched.h is included twice in enetc_qos.c, remove one of them. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2334 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	octeontx2-pf: mcs: remove unneeded semicolon	Yang Li	2022-10-07	1	-1/+1
\| \| \| \| \| \| \| \| \|	Semicolon is not required after curly braces. Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2332 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	hv_netvsc: Fix race between VF offering and VF association message from host	Gaurav Kohli	2022-10-07	3	-1/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	During vm boot, there might be possibility that vf registration call comes before the vf association from host to vm. And this might break netvsc vf path, To prevent the same block vf registration until vf bind message comes from host. Cc: stable@vger.kernel.org Fixes: 00d7ddba11436 ("hv_netvsc: pair VF based on serial number") Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Gaurav Kohli <gauravkohli@linux.microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: ethernet: adi: adin1110: Add check in netdev_event	Alexandru Tachici	2022-10-06	1	-5/+8
\| \| \| \| \| \| \| \| \| \|	Check whether this driver actually is the intended recipient of upper change event. Fixes: bc93e19d088b ("net: ethernet: adi: Add ADIN1110 support") Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com> Link: https://lore.kernel.org/r/20221003111636.54973-1-alexandru.tachici@analog.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	net: pse-pd: PSE_REGULATOR should depend on REGULATOR	Geert Uytterhoeven	2022-10-06	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The Regulator based PSE controller driver relies on regulator support to be enabled. If regulator support is disabled, it will still compile fine, but won't operate correctly. Hence add a dependency on REGULATOR, to prevent asking the user about this driver when configuring a kernel without regulator support. Fixes: 66741b4e94ca7bb1 ("net: pse-pd: add regulator based PSE driver") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/709caac8873ff2a8b72b92091429be7c1a939959.1664900558.git.geert+renesas@glider.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
*	Merge tag 'net-next-6.1' of ↵	Linus Torvalds	2022-10-04	1150	-23519/+80194
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core: - Introduce and use a single page frag cache for allocating small skb heads, clawing back the 10-20% performance regression in UDP flood test from previous fixes. - Run packets which already went thru HW coalescing thru SW GRO. This significantly improves TCP segment coalescing and simplifies deployments as different workloads benefit from HW or SW GRO. - Shrink the size of the base zero-copy send structure. - Move TCP init under a new slow / sleepable version of DO_ONCE(). BPF: - Add BPF-specific, any-context-safe memory allocator. - Add helpers/kfuncs for PKCS#7 signature verification from BPF programs. - Define a new map type and related helpers for user space -> kernel communication over a ring buffer (BPF_MAP_TYPE_USER_RINGBUF). - Allow targeting BPF iterators to loop through resources of one task/thread. - Add ability to call selected destructive functions. Expose crash_kexec() to allow BPF to trigger a kernel dump. Use CAP_SYS_BOOT check on the loading process to judge permissions. - Enable BPF to collect custom hierarchical cgroup stats efficiently by integrating with the rstat framework. - Support struct arguments for trampoline based programs. Only structs with size <= 16B and x86 are supported. - Invoke cgroup/connect{4,6} programs for unprivileged ICMP ping sockets (instead of just TCP and UDP sockets). - Add a helper for accessing CLOCK_TAI for time sensitive network related programs. - Support accessing network tunnel metadata's flags. - Make TCP SYN ACK RTO tunable by BPF programs with TCP Fast Open. - Add support for writing to Netfilter's nf_conn:mark. Protocols: - WiFi: more Extremely High Throughput (EHT) and Multi-Link Operation (MLO) work (802.11be, WiFi 7). - vsock: improve support for SO_RCVLOWAT. - SMC: support SO_REUSEPORT. - Netlink: define and document how to use netlink in a "modern" way. Support reporting missing attributes via extended ACK. - IPSec: support collect metadata mode for xfrm interfaces. - TCPv6: send consistent autoflowlabel in SYN_RECV state and RST packets. - TCP: introduce optional per-netns connection hash table to allow better isolation between namespaces (opt-in, at the cost of memory and cache pressure). - MPTCP: support TCP_FASTOPEN_CONNECT. - Add NEXT-C-SID support in Segment Routing (SRv6) End behavior. - Adjust IP_UNICAST_IF sockopt behavior for connected UDP sockets. - Open vSwitch: - Allow specifying ifindex of new interfaces. - Allow conntrack and metering in non-initial user namespace. - TLS: support the Korean ARIA-GCM crypto algorithm. - Remove DECnet support. Driver API: - Allow selecting the conduit interface used by each port in DSA switches, at runtime. - Ethernet Power Sourcing Equipment and Power Device support. - Add tc-taprio support for queueMaxSDU parameter, i.e. setting per traffic class max frame size for time-based packet schedules. - Support PHY rate matching - adapting between differing host-side and link-side speeds. - Introduce QUSGMII PHY mode and 1000BASE-KX interface mode. - Validate OF (device tree) nodes for DSA shared ports; make phylink-related properties mandatory on DSA and CPU ports. Enforcing more uniformity should allow transitioning to phylink. - Require that flash component name used during update matches one of the components for which version is reported by info_get(). - Remove "weight" argument from driver-facing NAPI API as much as possible. It's one of those magic knobs which seemed like a good idea at the time but is too indirect to use in practice. - Support offload of TLS connections with 256 bit keys. New hardware / drivers: - Ethernet: - Microchip KSZ9896 6-port Gigabit Ethernet Switch - Renesas Ethernet AVB (EtherAVB-IF) Gen4 SoCs - Analog Devices ADIN1110 and ADIN2111 industrial single pair Ethernet (10BASE-T1L) MAC+PHY. - Rockchip RV1126 Gigabit Ethernet (a version of stmmac IP). - Ethernet SFPs / modules: - RollBall / Hilink / Turris 10G copper SFPs - HALNy GPON module - WiFi: - CYW43439 SDIO chipset (brcmfmac) - CYW89459 PCIe chipset (brcmfmac) - BCM4378 on Apple platforms (brcmfmac) Drivers: - CAN: - gs_usb: HW timestamp support - Ethernet PHYs: - lan8814: cable diagnostics - Ethernet NICs: - Intel (100G): - implement control of FCS/CRC stripping - port splitting via devlink - L2TPv3 filtering offload - nVidia/Mellanox: - tunnel offload for sub-functions - MACSec offload, w/ Extended packet number and replay window offload - significantly restructure, and optimize the AF_XDP support, align the behavior with other vendors - Huawei: - configuring DSCP map for traffic class selection - querying standard FEC statistics - querying SerDes lane number via ethtool - Marvell/Cavium: - egress priority flow control - MACSec offload - AMD/SolarFlare: - PTP over IPv6 and raw Ethernet - small / embedded: - ax88772: convert to phylink (to support SFP cages) - altera: tse: convert to phylink - ftgmac100: support fixed link - enetc: standard Ethtool counters - macb: ZynqMP SGMII dynamic configuration support - tsnep: support multi-queue and use page pool - lan743x: Rx IP & TCP checksum offload - igc: add xdp frags support to ndo_xdp_xmit - Ethernet high-speed switches: - Marvell (prestera): - support SPAN port features (traffic mirroring) - nexthop object offloading - Microchip (sparx5): - multicast forwarding offload - QoS queuing offload (tc-mqprio, tc-tbf, tc-ets) - Ethernet embedded switches: - Marvell (mv88e6xxx): - support RGMII cmode - NXP (felix): - standardized ethtool counters - Microchip (lan966x): - QoS queuing offload (tc-mqprio, tc-tbf, tc-cbs, tc-ets) - traffic policing and mirroring - link aggregation / bonding offload - QUSGMII PHY mode support - Qualcomm 802.11ax WiFi (ath11k): - cold boot calibration support on WCN6750 - support to connect to a non-transmit MBSSID AP profile - enable remain-on-channel support on WCN6750 - Wake-on-WLAN support for WCN6750 - support to provide transmit power from firmware via nl80211 - support to get power save duration for each client - spectral scan support for 160 MHz - MediaTek WiFi (mt76): - WiFi-to-Ethernet bridging offload for MT7986 chips - RealTek WiFi (rtw89): - P2P support" * tag 'net-next-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1864 commits) eth: pse: add missing static inlines once: rename _SLOW to _SLEEPABLE net: pse-pd: add regulator based PSE driver dt-bindings: net: pse-dt: add bindings for regulator based PoDL PSE controller ethtool: add interface to interact with Ethernet Power Equipment net: mdiobus: search for PSE nodes by parsing PHY nodes. net: mdiobus: fwnode_mdiobus_register_phy() rework error handling net: add framework to support Ethernet PSE and PDs devices dt-bindings: net: phy: add PoDL PSE property net: marvell: prestera: Propagate nh state from hw to kernel net: marvell: prestera: Add neighbour cache accounting net: marvell: prestera: add stub handler neighbour events net: marvell: prestera: Add heplers to interact with fib_notifier_info net: marvell: prestera: Add length macros for prestera_ip_addr net: marvell: prestera: add delayed wq and flush wq on deinit net: marvell: prestera: Add strict cleanup of fib arbiter net: marvell: prestera: Add cleanup of allocated fib_nodes net: marvell: prestera: Add router nexthops ABI eth: octeon: fix build after netif_napi_add() changes net/mlx5: E-Switch, Return EBUSY if can't get mode lock ...
\| *	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	2022-10-04	15	-16/+58
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Merge in the left-over fixes before the net-next pull-request. Conflicts: drivers/net/ethernet/mediatek/mtk_ppe.c ae3ed15da588 ("net: ethernet: mtk_eth_soc: fix state in __mtk_foe_entry_clear") 9d8cb4c096ab ("net: ethernet: mtk_eth_soc: add foe_entry_size to mtk_eth_soc") https://lore.kernel.org/all/6cb6893b-4921-a068-4c30-1109795110bb@tessares.net/ kernel/bpf/helpers.c 8addbfc7b308 ("bpf: Gate dynptr API behind CAP_BPF") 5679ff2f138f ("bpf: Move bpf_loop and bpf_for_each_map_elem under CAP_BPF") 8a67f2de9b1d ("bpf: expose bpf_strtol and bpf_strtoul to all program types") https://lore.kernel.org/all/20221003201957.13149-1-daniel@iogearbox.net/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| \| *	net: mvpp2: fix mvpp2 debugfs leak	Russell King (Oracle)	2022-10-04	3	-3/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When mvpp2 is unloaded, the driver specific debugfs directory is not removed, which technically leads to a memory leak. However, this directory is only created when the first device is probed, so the hardware is present. Removing the module is only something a developer would to when e.g. testing out changes, so the module would be reloaded. So this memory leak is minor. The original attempt in commit fe2c9c61f668 ("net: mvpp2: debugfs: fix memory leak when using debugfs_lookup()") that was labelled as a memory leak fix was not, it fixed a refcount leak, but in doing so created a problem when the module is reloaded - the directory already exists, but mvpp2_root is NULL, so we lose all debugfs entries. This fix has been reverted. This is the alternative fix, where we remove the offending directory whenever the driver is unloaded. Fixes: 21da57a23125 ("net: mvpp2: add a debugfs interface for the Header Parser") Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Marcin Wojtas <mw@semihalf.com> Link: https://lore.kernel.org/r/E1ofOAB-00CzkG-UO@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| \| *	r8152: Rate limit overflow messages	Andrew Gaul	2022-10-04	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	My system shows almost 10 million of these messages over a 24-hour period which pollutes my logs. Signed-off-by: Andrew Gaul <gaul@google.com> Link: https://lore.kernel.org/r/20221002034128.2026653-1-gaul@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| \| *	net: wwan: iosm: Call mutex_init before locking it	Maxim Mikityanskiy	2022-10-03	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	wwan_register_ops calls wwan_create_default_link, which ends up in the ipc_wwan_newlink callback that locks ipc_wwan->if_mutex. However, this mutex is not yet initialized by that point. Fix it by moving mutex_init above the wwan_register_ops call. This also makes the order of operations in ipc_wwan_init symmetric to ipc_wwan_deinit. Fixes: 83068395bbfc ("net: iosm: create default link via WWAN core") Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com> Reviewed-by: M Chetan Kumar <m.chetan.kumar@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	eth: sp7021: fix use after free bug in spl2sw_nvmem_get_mac_address	Zheng Wang	2022-10-03	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This frees "mac" and tries to display its address as part of the error message on the next line. Swap the order. Fixes: fd3040b9394c ("net: ethernet: Add driver for Sunplus SP7021") Signed-off-by: Zheng Wang <zyytlz.wz@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	bnx2x: fix potential memory leak in bnx2x_tpa_stop()	Jianglei Nie	2022-10-03	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bnx2x_tpa_stop() allocates a memory chunk from new_data with bnx2x_frag_alloc(). The new_data should be freed when gets some error. But when "pad + len > fp->rx_buf_size" is true, bnx2x_tpa_stop() returns without releasing the new_data, which will lead to a memory leak. We should free the new_data with bnx2x_frag_free() when "pad + len > fp->rx_buf_size" is true. Fixes: 07b0f00964def8af9321cfd6c4a7e84f6362f728 ("bnx2x: fix possible panic under memory stress") Signed-off-by: Jianglei Nie <niejianglei2021@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	eth: lan743x: reject extts for non-pci11x1x devices	Raju Lakkaraju	2022-10-03	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove PTP_PF_EXTTS support for non-PCI11x1x devices since they do not support the PTP-IO Input event triggered timestamping mechanisms added Fixes: 60942c397af6 ("net: lan743x: Add support for PTP-IO Event Input External Timestamp (extts)") Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	net: prestera: acl: Add check for kmemdup	Jiasheng Jiang	2022-10-03	3	-5/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As the kemdup could return NULL, it should be better to check the return value and return error if fails. Moreover, the return value of prestera_acl_ruleset_keymask_set() should be checked by cascade. Fixes: 604ba230902d ("net: prestera: flower template support") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Reviewed-by: Taras Chornyi<tchornyi@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	net: sparx5: Fix return type of sparx5_port_xmit_impl	Nathan Huckleberry	2022-10-03	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ndo_start_xmit field in net_device_ops is expected to be of type netdev_tx_t (ndo_start_xmit)(struct sk_buff skb, struct net_device *dev). The mismatched return type breaks forward edge kCFI since the underlying function definition does not match the function hook definition. The return type of sparx5_port_xmit_impl should be changed from int to netdev_tx_t. Reported-by: Dan Carpenter <error27@gmail.com> Link: https://github.com/ClangBuiltLinux/linux/issues/1703 Cc: llvm@lists.linux.dev Signed-off-by: Nathan Huckleberry <nhuck@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	net: ethernet: mtk_eth_soc: fix state in __mtk_foe_entry_clear	Daniel Golle	2022-10-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Setting ib1 state to MTK_FOE_STATE_UNBIND in __mtk_foe_entry_clear routine as done by commit 0e80707d94e4c8 ("net: ethernet: mtk_eth_soc: fix typo in __mtk_foe_entry_clear") breaks flow offloading, at least on older MTK_NETSYS_V1 SoCs, OpenWrt users have confirmed the bug on MT7622 and MT7621 systems. Felix Fietkau suggested to use MTK_FOE_STATE_INVALID instead which works well on both, MTK_NETSYS_V1 and MTK_NETSYS_V2. Tested on MT7622 (Linksys E8450) and MT7986 (BananaPi BPI-R3). Suggested-by: Felix Fietkau <nbd@nbd.name> Fixes: 0e80707d94e4c8 ("net: ethernet: mtk_eth_soc: fix typo in __mtk_foe_entry_clear") Fixes: 33fc42de33278b ("net: ethernet: mtk_eth_soc: support creating mac address based offload entries") Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://lore.kernel.org/r/YzY+1Yg0FBXcnrtc@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| \| *	eth: alx: take rtnl_lock on resume	Jakub Kicinski	2022-09-30	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Zbynek reports that alx trips an rtnl assertion on resume: RTNL: assertion failed at net/core/dev.c (2891) RIP: 0010:netif_set_real_num_tx_queues+0x1ac/0x1c0 Call Trace: <TASK> __alx_open+0x230/0x570 [alx] alx_resume+0x54/0x80 [alx] ? pci_legacy_resume+0x80/0x80 dpm_run_callback+0x4a/0x150 device_resume+0x8b/0x190 async_resume+0x19/0x30 async_run_entry_fn+0x30/0x130 process_one_work+0x1e5/0x3b0 indeed the driver does not hold rtnl_lock during its internal close and re-open functions during suspend/resume. Note that this is not a huge bug as the driver implements its own locking, and does not implement changing the number of queues, but we need to silence the splat. Fixes: 4a5fe57e7751 ("alx: use fine-grained locking instead of RTNL") Reported-and-tested-by: Zbynek Michl <zbynek.michl@gmail.com> Reviewed-by: Niels Dossche <dossche.niels@gmail.com> Link: https://lore.kernel.org/r/20220928181236.1053043-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net: pse-pd: add regulator based PSE driver	Oleksij Rempel	2022-10-04	3	-0/+160
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add generic, regulator based PSE driver to support simple Power Sourcing Equipment without automatic classification support. This driver was tested on 10Bast-T1L switch with regulator based PoDL PSE. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	ethtool: add interface to interact with Ethernet Power Equipment	Oleksij Rempel	2022-10-04	1	-0/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add interface to support Power Sourcing Equipment. At current step it provides generic way to address all variants of PSE devices as defined in IEEE 802.3-2018 but support only objects specified for IEEE 802.3-2018 104.4 PoDL Power Sourcing Equipment (PSE). Currently supported and mandatory objects are: IEEE 802.3-2018 30.15.1.1.3 aPoDLPSEPowerDetectionStatus IEEE 802.3-2018 30.15.1.1.2 aPoDLPSEAdminState IEEE 802.3-2018 30.15.1.2.1 acPoDLPSEAdminControl This is minimal interface needed to control PSE on each separate ethernet port but it provides not all mandatory objects specified in IEEE 802.3-2018. Since "PoDL PSE" and "PSE" have similar names, but some different values I decide to not merge them and keep separate naming schema. This should allow as to be as close to IEEE 802.3 spec as possible and avoid name conflicts in the future. This implementation is connected to PHYs instead of MACs because PSE auto classification can potentially interfere with PHY auto negotiation. So, may be some extra PHY related initialization will be needed. With WIP version of ethtools interaction with PSE capable link looks as following: $ ip l ... 5: t1l1@eth0: <BROADCAST,MULTICAST> .. ... $ ethtool --show-pse t1l1 PSE attributs for t1l1: PoDL PSE Admin State: disabled PoDL PSE Power Detection Status: disabled $ ethtool --set-pse t1l1 podl-pse-admin-control enable $ ethtool --show-pse t1l1 PSE attributs for t1l1: PoDL PSE Admin State: enabled PoDL PSE Power Detection Status: delivering power Signed-off-by: kernel test robot <lkp@intel.com> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net: mdiobus: search for PSE nodes by parsing PHY nodes.	Oleksij Rempel	2022-10-04	2	-2/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some PHYs can be linked with PSE (Power Sourcing Equipment), so search for related nodes and attach it to the phydev. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net: mdiobus: fwnode_mdiobus_register_phy() rework error handling	Oleksij Rempel	2022-10-04	1	-9/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rework error handling as preparation for PSE patch. This patch should make it easier to extend this function. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net: add framework to support Ethernet PSE and PDs devices	Oleksij Rempel	2022-10-04	5	-0/+274
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This framework was create with intention to provide support for Ethernet PSE (Power Sourcing Equipment) and PDs (Powered Device). At current step this patch implements generic PSE support for PoDL (Power over Data Lines 802.3bu) specification with reserving name space for PD devices as well. This framework can be extended to support 802.3af and 802.3at "Power via the Media Dependent Interface" (or PoE/Power over Ethernet) Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net: marvell: prestera: Propagate nh state from hw to kernel	Yevhen Orlov	2022-10-04	2	-0/+114
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We poll nexthops in HW and call for each active nexthop appropriate neighbour. Also we provide implicity neighbour resolving. For example, user have added nexthop route: # ip route add 5.5.5.5 via 1.1.1.2 But neighbour 1.1.1.2 doesn't exist. In this case we will try to call neigh_event_send, even if there is no traffic. This is useful, when you have add route, which will be used after some time but with a lot of traffic (burst). So, we has prepared, offloaded route in advance. Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net: marvell: prestera: Add neighbour cache accounting	Yevhen Orlov	2022-10-04	2	-3/+797
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move forward and use new PRESTERA_FIB_TYPE_UC_NH to provide basic nexthop routes support. Provide deinitialization sequence for all created router objects. Limitations: - Only "local" and "main" tables supported - Only generic interfaces supported for router (no bridges or vlans) Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net: marvell: prestera: add stub handler neighbour events	Yevhen Orlov	2022-10-04	2	-0/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Actual handler will be added in next patches Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net: marvell: prestera: Add heplers to interact with fib_notifier_info	Yevhen Orlov	2022-10-04	1	-34/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This will be used to implement nexthops related logic in next patches. Also try to keep ipv4/6 abstraction to be able to reuse helpers for ipv6 in the future. Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net: marvell: prestera: Add length macros for prestera_ip_addr	Yevhen Orlov	2022-10-04	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add macros to determine IP address length (internal driver types). This will be used in next patches for nexthops logic. Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net: marvell: prestera: add delayed wq and flush wq on deinit	Yevhen Orlov	2022-10-04	3	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Flushing workqueues ensures, that no more pending works, related to just unregistered or deinitialized notifiers. After that we can free memory. Delayed wq will be used for neighbours in next patches. Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net: marvell: prestera: Add strict cleanup of fib arbiter	Yevhen Orlov	2022-10-04	1	-1/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This will, ensure, that there is no more, preciously allocated fib_cache entries left after deinit. Will be used to free allocated resources of nexthop routes, that points to "not our" port (e.g. eth0). Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net: marvell: prestera: Add cleanup of allocated fib_nodes	Yevhen Orlov	2022-10-04	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Do explicity cleanup on router_hw_fini, to ensure, that all allocated objects cleaned. This will be used in cases, when upper layer (cache) is not mapped to router_hw layer. Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net: marvell: prestera: Add router nexthops ABI	Yevhen Orlov	2022-10-04	6	-8/+582
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Add functions to allocate/delete/set nexthop group - NOTE: non-ECMP nexthop is nexthop group with allocated size = 1 - Add function to read state of HW nh (if packets going through it) Co-developed-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu> Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	eth: octeon: fix build after netif_napi_add() changes	Jakub Kicinski	2022-10-04	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Guenter reports I missed a netif_napi_add() call in one of the platform-specific drivers: drivers/net/ethernet/cavium/octeon/octeon_mgmt.c: In function 'octeon_mgmt_probe': drivers/net/ethernet/cavium/octeon/octeon_mgmt.c:1399:9: error: too many arguments to function 'netif_napi_add' 1399 \| netif_napi_add(netdev, &p->napi, octeon_mgmt_napi_poll, \| ^~~~~~~~~~~~~~ Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: b48b89f9c189 ("net: drop the weight argument from netif_napi_add") Link: https://lore.kernel.org/r/20221002175650.1491124-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net/mlx5: E-Switch, Return EBUSY if can't get mode lock	Jianbo Liu	2022-10-04	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is to avoid tc retrying during device mode change. Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net/mlx5: E-switch, Don't update group if qos is not enabled	Chris Mi	2022-10-04	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, qos group will be updated and qos will be enabled when unregistering devlink port. Actually no need to update group if qos is not enabled. Add a check to prevent unnecessary enabling and disabling qos for every port. Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Dmytro Linkin <dlinkin@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net/mlx5: E-Switch, Allow offloading fwd dest flow table with vport	Roi Dayan	2022-10-04	1	-7/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before this commit a fwd dest flow table resulted in ignoring vport dests which is incorrect and is supported. With this commit the dests can be a mix of flow table and vport dests. There is still a limitation that there cannot be more than one flow table dest. Signed-off-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net/mlx5: Set default grace period based on function type	Maher Sanalla	2022-10-04	1	-2/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, driver sets the same grace period for fw fatal health reporter to any type of function. Since the lower level functions are more vulnerable to fw fatal errors as a result of parent function closure/reload, set a smaller grace period for the lower level functions, as follows: 1. For ECPF: 180 seconds. 2. For PF: 60 seconds. 3. For VF/SF: 30 seconds. Signed-off-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net/mlx5: Start health poll at earlier stage of driver load	Moshe Shemesh	2022-10-04	2	-10/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Start health poll at earlier stage, so if fw fatal issue occurred before or during initialization commands such as init_hca or set_hca_cap the poll health can detect and indicate that the driver is already in error state. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net/mlx5e: Expose rx_oversize_pkts_buffer counter	Gal Pressman	2022-10-04	3	-2/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the rx_oversize_pkts_buffer counter to ethtool statistics. This counter exposes the number of dropped received packets due to length which arrived to RQ and exceed software buffer size allocated by the device for incoming traffic. It might imply that the device MTU is larger than the software buffers size. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net/mlx5e: xsk: Optimize for unaligned mode with 3072-byte frames	Maxim Mikityanskiy	2022-10-04	4	-3/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When XSK frame size is 3072 (or another power of two multiplied by 3), KLM mechanism for NIC virtual memory page mapping can be optimized by replacing it with KSM. Before this change, two KLM entries were needed to map an XSK frame that is not a power of two: one entry maps the UMEM memory up to the frame length, the other maps the rest of the stride to the garbage page. When the frame length divided by 3 is a power of two, it can be mapped using 3 KSM entries, and the fourth will map the rest of the stride to the garbage page. All 4 KSM entries are of the same size, which allows for a much faster lookup. Frame size 3072 is useful in certain use cases, because it allows packing 4 frames into 3 pages. Generally speaking, other frame sizes equal to PAGE_SIZE minus a power of two can be optimized in a similar way, but it will require many more KSMs per frame, which slows down UMRs a little bit, but more importantly may hit the limit for the maximum number of KSM entries. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net/mlx5e: xsk: Print a warning in slow configurations	Maxim Mikityanskiy	2022-10-04	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On striding RQ, when the XSK frame size doesn't match the MKey page size, KLM is used for memory mappings, which is a slower mechanism than MTT or KSM. It may happen in two cases: 1. Frame size is not a power of two (only possible in the unaligned mode of XSK). 2. Frame size is 2048 bytes, and the firmware doesn't support MKey pages smaller than 4096 bytes. Depending on the case, print a warning and recommend to disable striding RQ or upgrade the firmware. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
\| * \|	net/mlx5e: xsk: Use KLM to protect frame overrun in unaligned mode	Maxim Mikityanskiy	2022-10-04	4	-10/+90
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	XSK RQs support striding RQ linear mode, but the stride size may be bigger than the XSK frame size, because: 1. The stride size must be a power of two. 2. The stride size must be equal to the UMR page size. Each XSK frame is treated as a separate page, because they aren't necessarily adjacent in physical memory, so the driver can't put more than one stride per page. 3. The minimal MTT page size is 4096 on older firmware. That means that if XSK frame size is 2048 or not a power of two, the strides may be bigger than XSK frames. Normally, it's not a problem if the hardware enforces the MTU. However, traffic between vports skips the hardware MTU check, and oversized packets may be received. If an oversized packet is bigger than the XSK frame but not bigger than the stride, it will cause overwriting of the adjacent UMEM region. If the packet takes more than one stride, they can be recycled for reuse, so it's not a problem when the XSK frame size matches the stride size. Work around the above issue by leveraging KLM to make a more fine-grained mapping. The beginning of each stride is mapped to the frame memory, and the padding up to the closest power of two is mapped to the overflow page that doesn't belong to UMEM. This way, application data corruption won't happen upon receiving packets bigger than MTU. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>