summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* leds: trigger: netdev: uninitialized variable in netdev_trig_activate()Dan Carpenter2023-06-151-1/+1
| | | | | | | | | | | | | The qca8k_cled_hw_control_get() function which implements ->hw_control_get sets the appropriate bits but does not clear them. This leads to an uninitialized variable bug. Fix this by setting mode to zero at the start. Fixes: e0256648c831 ("net: dsa: qca8k: implement hw_control ops") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Lee Jones <lee@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* tools: ynl: work around stale system headersJakub Kicinski2023-06-153-3/+35
| | | | | | | | | | | | | | | | | | | | | | | The inability to include the uAPI headers directly in tools/ is one of the bigger annoyances of compiling user space code. Most projects trade the pain for smaller inconvenience of having to copy the headers under tools/include. In case of netlink headers I think that we can avoid both. Netlink family headers are simple and should be self-contained. We can try to twiddle the Makefile a little to force-include just the family header, and use system headers for the rest. This works fairly well. There are two warts - for some reason if we specify -include $path/family.h as a compilation flag, the #ifdef header guard does not seem to work. So we need to throw the guard in on the command line as well. Seems like GCC detects that the header is different and tries to include both. Second problem is that make wants hash sign to be escaped or not depending on the version. Sigh. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: tls: make the offload check helper take skb not socketJakub Kicinski2023-06-1512-26/+17
| | | | | | | | | | | | | | | | | | | | | | | All callers of tls_is_sk_tx_device_offloaded() currently do an equivalent of: if (skb->sk && tls_is_skb_tx_device_offloaded(skb->sk)) Have the helper accept skb and do the skb->sk check locally. Two drivers have local static inlines with similar wrappers already. While at it change the ifdef condition to TLS_DEVICE. Only TLS_DEVICE selects SOCK_VALIDATE_XMIT, so the two are equivalent. This makes removing the duplicated IS_ENABLED() check in funeth more obviously correct. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Maxim Mikityanskiy <maxtram95@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Tariq Toukan <tariqt@nvidia.com> Acked-by: Dimitris Michailidis <dmichail@fungible.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'macb-partial-store-and-forward'David S. Miller2023-06-153-0/+50
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pranavi Somisetty says: ==================== Add support for partial store and forward Add support for partial store and forward mode in Cadence MACB. Link for v1: https://lore.kernel.org/all/20221213121245.13981-1-pranavi.somisetty@amd.com/ Changes v2: 1. Removed all the changes related to validating FCS when Rx checksum offload is disabled. 2. Instead of using a platform dependent number (0xFFF) for the reset value of rx watermark, derive it from designcfg_debug2 register. 3. Added a check to see if partial s/f is supported, by reading the designcfg_debug6 register. 4. Added devicetree bindings for "rx-watermark" property. Link for v2: https://lore.kernel.org/all/20230511071214.18611-1-pranavi.somisetty@amd.com/ Changes v3: 1. Fixed DT schema error: "scalar properties shouldn't have array keywords" 2. Modified description of rx-watermark in to include units of the watermark value 3. Modified the DT property name corresponding to rx_watermark in pbuf_rxcutthru to "cdns,rx-watermark". 4. Followed reverse christmas tree pattern in declaring variables. 5. Return -EINVAL when an invalid watermark value is set. 6. Removed netdev_info when partial store and forward is not enabled. 7. Validating the rx-watermark value in probe itself and only write to the register in init. 8. Writing a reset value to the pbuf_cuthru register before disabing partial store and forward is redundant. So removing it. 9. Removed the platform caps flag. 10. Instead of reading rx-watermark from DT in macb_configure_caps, reading it in probe. 11. Changed Signed-Off-By and author names on the macb driver patch. Link for v3: https://lore.kernel.org/all/20230530095138.1302-1-pranavi.somisetty@amd.com/ Changes v4: 1. Modified description for "rx-watermark" property in the DT bindings. 2. Changed the width of the rx-watermark property to uint32. 3. Removed redundant code and unused variables. 4. When the rx-watermark value is invalid, instead of returning EINVAL, do not enable partial store and forward. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: macb: Add support for partial store and forwardMaulik Jodhani2023-06-152-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the receive partial store and forward mode is activated, the receiver will only begin to forward the packet to the external AHB or AXI slave when enough packet data is stored in the packet buffer. The amount of packet data required to activate the forwarding process is programmable via watermark registers which are located at the same address as the partial store and forward enable bits. Adding support to read this rx-watermark value from device-tree, to program the watermark registers and enable partial store and forwarding. Signed-off-by: Maulik Jodhani <maulik.jodhani@xilinx.com> Signed-off-by: Pranavi Somisetty <pranavi.somisetty@amd.com> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * dt-bindings: net: cdns,macb: Add rx-watermark propertyPranavi Somisetty2023-06-151-0/+11
|/ | | | | | | | | | watermark value is the minimum amount of packet data required to activate the forwarding process. The watermark implementation and maximum size is dependent on the device where Cadence MACB/GEM is used. Signed-off-by: Pranavi Somisetty <pranavi.somisetty@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'netdev-tracking'David S. Miller2023-06-155-32/+62
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jakub Kicinski says: ==================== net: create device lookup API with reference tracking We still see dev_hold() / dev_put() calls without reference tracker getting added in new code. dev_get_by_name() / dev_get_by_index() seem to be one of the sources of those. Provide appropriate helpers. Allocating the tracker can obviously be done with an additional call to netdev_tracker_alloc(), but a single API feels cleaner. v2: - fix a dev_put() in ethtool v1: https://lore.kernel.org/all/20230609183207.1466075-1-kuba@kernel.org/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * netpoll: allocate netdev tracker right awayJakub Kicinski2023-06-151-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 5fa5ae605821 ("netpoll: add net device refcount tracker to struct netpoll") was part of one of the initial netdev tracker introduction patches. It added an explicit netdev_tracker_alloc() for netpoll, presumably because the flow of the function is somewhat odd. After most of the core networking stack was converted to use the tracking hold() variants, netpoll's call to old dev_hold() stands out a bit. np is allocated by the caller and ready to use, we can use netdev_hold() here, even tho np->ndev will only be set to ndev inside __netpoll_setup(). Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: create device lookup API with reference trackingJakub Kicinski2023-06-154-29/+60
|/ | | | | | | | | | | | | | New users of dev_get_by_index() and dev_get_by_name() keep getting added and it would be nice to steer them towards the APIs with reference tracking. Add variants of those calls which allocate the reference tracker and use them in a couple of places. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* rtnetlink: move validate_linkmsg out of do_setlinkXin Long2023-06-151-41/+42
| | | | | | | | | | | | | | | | | | | | This patch moves validate_linkmsg() out of do_setlink() to its callers and deletes the early validate_linkmsg() call in __rtnl_newlink(), so that it will not call validate_linkmsg() twice in either of the paths: - __rtnl_newlink() -> do_setlink() - __rtnl_newlink() -> rtnl_newlink_create() -> rtnl_create_link() Additionally, as validate_linkmsg() is now only called with a real dev, we can remove the NULL check for dev in validate_linkmsg(). Note that we moved validate_linkmsg() check to the places where it has not done any changes to the dev, as Jakub suggested. Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/cf2ef061e08251faf9e8be25ff0d61150c030475.1686585334.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* rtnetlink: extend RTEXT_FILTER_SKIP_STATS to IFLA_VF_INFOEdwin Peer2023-06-141-45/+51
| | | | | | | | | | | | | | | | | | | | | | | This filter already exists for excluding IPv6 SNMP stats. Extend its definition to also exclude IFLA_VF_INFO stats in RTM_GETLINK. This patch constitutes a partial fix for a netlink attribute nesting overflow bug in IFLA_VFINFO_LIST. By excluding the stats when the requester doesn't need them, the truncation of the VF list is avoided. While it was technically only the stats added in commit c5a9f6f0ab40 ("net/core: Add drop counters to VF statistics") breaking the camel's back, the appreciable size of the stats data should never have been included without due consideration for the maximum number of VFs supported by PCI. Fixes: 3b766cd83232 ("net/core: Add reading VF statistics through the PF netdevice") Fixes: c5a9f6f0ab40 ("net/core: Add drop counters to VF statistics") Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Cc: Edwin Peer <espeer@gmail.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://lore.kernel.org/r/20230611105108.122586-1-gal@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
* Merge branch 'mlxsw-preparations-for-out-of-order-operations-patches'Paolo Abeni2023-06-144-109/+211
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Petr Machata says: ==================== mlxsw: Preparations for out-of-order-operations patches The mlxsw driver currently makes the assumption that the user applies configuration in a bottom-up manner. Thus netdevices need to be added to the bridge before IP addresses are configured on that bridge or SVI added on top of it. Enslaving a netdevice to another netdevice that already has uppers is in fact forbidden by mlxsw for this reason. Despite this safety, it is rather easy to get into situations where the offloaded configuration is just plain wrong. As an example, take a front panel port, configure an IP address: it gets a RIF. Now enslave the port to a bridge, and the RIF is gone. Remove the port from the bridge again, but the RIF never comes back. There is a number of similar situations, where changing the configuration there and back utterly breaks the offload. Over the course of the following several patchsets, mlxsw code is going to be adjusted to diminish the space of wrongly offloaded configurations. Ideally the offload state will reflect the actual state, regardless of the sequence of operation used to construct that state. No functional changes are intended in this patchset yet. Rather the patches prepare the codebase for easier introduction of functional changes in later patchsets. - In patch #1, extract a helper to join a RIF of a given port, if there is one. In patch #2, use it in a newly-added helper to join a LAG interface. - In patches #3, #4 and #5, add helpers that abstract away the rif->dev access. This will make it simpler in the future to change the way the deduction is done. In patch #6, do this for deduction from nexthop group info to RIF. - In patch #7, add a helper to destroy a RIF. So far RIF was destroyed simply by kfree'ing it. - In patch #8, add a helper to check if any IP addresses are configured on a netdevice. This helper will be useful later. - In patch #9, add a helper to migrate a RIF. This will be a convenient place to put extensions later on. - Patch #10 move IPIP initialization up to make ipip_ops_arr available earlier. ==================== Link: https://lore.kernel.org/r/cover.1686581444.git.petrm@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * mlxsw: spectrum_router: Move IPIP init upPetr Machata2023-06-141-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | mlxsw will need to keep track of certain devices that are not related to any of its front panel ports. This includes IPIP netdevices. To be able to query the list of supported IPIP types, router->ipip_ops_arr needs to be initialized. To that end, move the IPIP initialization up (and finalization correspondingly down). Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * mlxsw: spectrum_router: Extract a helper for RIF migrationPetr Machata2023-06-141-6/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | RIF configuration contains a number of parameters that cannot be changed after the RIF is created. For the IPIP loopbacks, this is currently worked around by creating a new RIF with the desired configuration changes applied, and updating next hops to the new RIF, and then destroying the old RIF. This operation will be useful as a reusable atom, so extract a helper to that effect. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * mlxsw: spectrum_router: Add a helper to check if netdev has addressesPetr Machata2023-06-141-13/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This function will be useful later as the driver will need to retroactively create RIFs for new uppers with addresses. Add another helper that assumes RCU lock, and restructure the code to skip the IPv6 branch not through conditioning on the addr_list_empty variable, but by directly returning the result value. This makes the skip more obvious than it previously was. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * mlxsw: spectrum_router: Extract a helper to free a RIFPetr Machata2023-06-141-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Right now freeing the object that mlxsw uses to keep track of a RIF is as simple as calling a kfree. But later on as CRIF abstraction is brought in, it will involve severing the link between CRIF and its RIF as well. Better to have the logic encapsulated in a helper. Since a helper is being introduced, make it a full-fledged destructor and have it validate that the objects tracked at the RIF have been released. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * mlxsw: spectrum_router: Access nhgi->rif through a helperPetr Machata2023-06-141-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | To abstract away deduction of RIF from the corresponding next hop group info (NHGI), mlxsw currently uses a macro. In its current form, that macro is impossible to extend to more general computation. Therefore introduce a helper, mlxsw_sp_nhgi_rif(), and use it throughout. This will make it possible to change the deduction path easily later on. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * mlxsw: spectrum_router: Access nh->rif->dev through a helperPetr Machata2023-06-141-5/+18
| | | | | | | | | | | | | | | | | | | | | | In order to abstract away deduction of netdevice from the corresponding next hop, introduce a helper, mlxsw_sp_nexthop_dev(), and use it throughout. This will make it possible to change the deduction path easily later on. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * mlxsw: spectrum_router: Access rif->dev from params in mlxsw_sp_rif_create()Petr Machata2023-06-141-4/+4
| | | | | | | | | | | | | | | | | | | | The previous patch added a helper to access a netdevice given a RIF. Using this helper in mlxsw_sp_rif_create() is unreasonable: the netdevice was given in RIF creation parameters. Just take it there. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * mlxsw: spectrum_router: Access rif->dev through a helperPetr Machata2023-06-141-45/+64
| | | | | | | | | | | | | | | | | | | | In order to abstract away deduction of netdevice from the corresponding RIF, introduce a helper, mlxsw_sp_rif_dev(), and use it throughout. This will make it possible to change the deduction path easily later on. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * mlxsw: spectrum_router: Add a helper specifically for joining a LAGPetr Machata2023-06-144-22/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, joining a LAG very simply means that the LAG RIF should be joined by the subport representing untagged traffic. If the RIF does not exist, it does not have to be created: if the user wants there to be RIF for the LAG device, they are supposed to add an IP address, and they are supposed to do it after tha LAG becomes mlxsw upper. We can also assume that the LAG has no uppers, otherwise the enslavement is not allowed. In the future, these ordering dependencies should be removed. That means that joining LAG will be more complex operation, possibly involving a lazy RIF creation, and possibly joining / lazily creating RIFs for VLAN uppers of the LAG. It will be handy to have a dedicated function that handles all this. The new function mlxsw_sp_router_port_join_lag() is that. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * mlxsw: spectrum_router: Extract a helper from mlxsw_sp_port_vlan_router_join()Petr Machata2023-06-141-9/+20
|/ | | | | | | | | | | | | Split out of mlxsw_sp_port_vlan_router_join() the part that checks for RIF and dispatches to __mlxsw_sp_port_vlan_router_join(), leaving it as wrapper that just manages the router lock. The new function, mlxsw_sp_port_vlan_router_join_existing(), will be useful as an atom in later patches. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
* ethtool: ioctl: account for sopass diff in set_wolJustin Chen2023-06-141-1/+2
| | | | | | | | | | | | | | sopass won't be set if wolopt doesn't change. This means the following will fail to set the correct sopass. ethtool -s eth0 wol s sopass 11:22:33:44:55:66 ethtool -s eth0 wol s sopass 22:44:55:66:77:88 Make sure we call into the driver layer set_wol if sopass is different. Fixes: 55b24334c0f2 ("ethtool: ioctl: improve error checking for set_wol") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Link: https://lore.kernel.org/r/1686605822-34544-1-git-send-email-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* mctp i2c: Switch back to use struct i2c_driver's .probe()Uwe Kleine-König2023-06-141-1/+1
| | | | | | | | | | | | After commit b8a1a4cd5a98 ("i2c: Provide a temporary .probe_new() call-back type"), all drivers being converted to .probe_new() and then commit 03c835f498b5 ("i2c: Switch .probe() to not take an id parameter") convert back to (the new) .probe() to be able to eventually drop .probe_new() from struct i2c_driver. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20230612071641.836976-1-u.kleine-koenig@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Merge branch 'tools-ynl-gen-improvements-for-dpll'Jakub Kicinski2023-06-131-20/+37
|\ | | | | | | | | | | | | | | | | | | | | | | | | Jakub Kicinski says: ==================== tools: ynl-gen: improvements for DPLL A set of improvements needed to generate the kernel code for DPLL. ==================== Link: https://lore.kernel.org/r/20230612155920.1787579-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * tools: ynl-gen: inherit policy in multi-attrJakub Kicinski2023-06-131-18/+24
| | | | | | | | | | | | | | | | | | | | Instead of reimplementing policies in MutliAttr for every underlying type forward the calls to the base type. This will be needed for DPLL which uses a multi-attr nest, and currently gets an invalid NLA_NEST policy generated. Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * tools: ynl-gen: correct enum policiesJakub Kicinski2023-06-131-2/+13
|/ | | | | | | | Scalar range validation assumes enums start at 0. Teach it to properly calculate the value range. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* amd-xgbe: extend 10Mbps support to MAC version 21HRaju Rangoju2023-06-131-6/+7
| | | | | | | | | | MAC version 21H supports the 10Mbps speed. So, extend support to platforms that support it. Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'octeontx2-updates'David S. Miller2023-06-139-20/+379
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Naveen Mamindlapalli says: ==================== RVU NIX AF driver updates This patch series includes a few enhancements and other updates to the RVU NIX AF driver. The first patch adds devlink option to configure NPC MCAM high priority zone entries reservation. This is useful when the requester needs more high priority entries than default reserved entries. The second patch adds support for RSS hash computation using L3 SRC or DST only, or L4 SRC or DST only. The third patch updates DWRR MTU configuration for CN10KB silicon. HW uses the DWRR MTU to compute DWRR weight. Patch 4 configures the LBK link in TL3_TL2 configuration only when switch mode is enabled. Patch 5 adds an option in the mailbox request to enable/disable DROP_RE bit which drops packets with L2 errors when set. Patch 6 updates SMQ flush mechanism to stop other child nodes from enqueuing any packets while SMQ flush is active. Otherwise SMQ flush may timeout. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: Set XOFF on other child transmit schedulers during SMQ flushNaveen Mamindlapalli2023-06-132-2/+144
| | | | | | | | | | | | | | | | | | | | | | | | | | When multiple transmit scheduler queues feed a TL1 transmit link, the SMQ flush initiated on a low priority queue might get stuck when a high priority queue fully subscribes the transmit link. This inturn effects interface teardown. To avoid this, temporarily XOFF all TL1's other immediate child transmit scheduler queues and also clear any rate limit configuration on all the scheduler queues in SMQ(flush) hierarchy. Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: add option to toggle DROP_RE enable in rx cfgNithin Dabilpuram2023-06-132-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | Add option to toggle DROP_RE bit in rx cfg mbox. This helps in modifying the config runtime as opposed to setting available via nix_lf_alloc() mbox at NIX LF init time. Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com> Signed-off-by: Jerin Jacob Kollanukkaran <jerinj@marvell.com> Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com> Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: Enable LBK links only when switch mode is on.Subbaraya Sundeep2023-06-133-6/+25
| | | | | | | | | | | | | | | | | | | | Currently, all the TL3_TL2 nodes are being configured to enable switch LBK channel 63 in them. Instead enable them only when switch mode is enabled. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: cn10k: Set NIX DWRR MTU for CN10KB siliconSunil Goutham2023-06-138-12/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The DWRR MTU config added for SDP and RPM/LBK links on CN10K silicon is further extended on CK10KB silicon variant and made it configurable. Now there are 4 DWRR MTU config to choose while setting transmit scheduler's RR_WEIGHT. Here we are reserving one config for each of RPM, SDP and LBK. NIXX_AF_DWRR_MTUX(0) ---> RPM NIXX_AF_DWRR_MTUX(1) ---> SDP NIXX_AF_DWRR_MTUX(2) ---> LBK PF/VF drivers can choose the DWRR_MTU to be used by setting SMQX_CFG[pkt_link_type] to one of above. TLx_SCHEDULE[RR_WEIGHT] is to be as configured 'quantum / 2^DWRR_MTUX[MTU]'. DWRR_MTU of each link is exposed to PF/VF drivers via mailbox for RR_WEIGHT calculation. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: extend RSS supported offload typesKiran Kumar K2023-06-132-0/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support to select L3 SRC or DST only, L4 SRC or DST only for RSS calculation. AF consumer may have requirement as we can select only SRC or DST data for RSS calculation in L3, L4 layers. With this requirement there will be following combinations, IPV[4,6]_SRC_ONLY, IPV[4,6]_DST_ONLY, [TCP,UDP,SCTP]_SRC_ONLY, [TCP,UDP,SCTP]_DST_ONLY. So, instead of creating a bit for each combination, we are using upper 4 bits (31:28) in the flow_key_cfg to represent the SRC, DST selection. 31 => L3_SRC, 30 => L3_DST, 29 => L4_SRC, 28 => L4_DST. These won't be part of flow_cfg, so that we don't need to change the existing ABI. Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: Add devlink option to adjust mcam high prio zone entriesNaveen Mamindlapalli2023-06-131-0/+68
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The NPC MCAM entries are currently divided into three priority zones in AF driver: high, mid, and low. The high priority zone and low priority zone take up 1/8th (each) of the available MCAM entries, and remaining going to the mid priority zone. The current allocation scheme may not meet certain requirements, such as when a requester needs more high priority zone entries than are reserved. This patch adds a devlink configurable option to increase the number of high priority zone entries that can be allocated by requester. The max number of entries that can be reserved for high priority usage is 100% of available MCAM entries. Usage: 1) Change high priority zone percentage to 75%: devlink -p dev param set pci/0002:01:00.0 name npc_mcam_high_zone_percent \ value 75 cmode runtime 2) Read high priority zone percentage: devlink -p dev param show pci/0002:01:00.0 name npc_mcam_high_zone_percent The devlink set configuration is only permitted when no MCAM entries are assigned, i.e., all MCAM entries are free, indicating that no PF/VF driver is loaded. So user must unload/unbind PF/VF driver/devices before modifying the high priority zone percentage. Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* selftests: forwarding: Fix layer 2 miss test syntaxIdo Schimmel2023-06-131-7/+7
| | | | | | | | | | | | | The test currently specifies "l2_miss" as "true" / "false", but the version that eventually landed in iproute2 uses "1" / "0" [1]. Align the test accordingly. [1] https://lore.kernel.org/netdev/20230607153550.3829340-1-idosch@nvidia.com/ Fixes: 8c33266ae26a ("selftests: forwarding: Add layer 2 miss test cases") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'splice-net-some-miscellaneous-msg_splice_pages-changes'Jakub Kicinski2023-06-137-210/+77
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | David Howells says: ==================== splice, net: Some miscellaneous MSG_SPLICE_PAGES changes Now that the splice_to_socket() has been rewritten so that nothing now uses the ->sendpage() file op[1], some further changes can be made, so here are some miscellaneous changes that can now be done. (1) Remove the ->sendpage() file op. (2) Remove hash_sendpage*() from AF_ALG. (3) Make sunrpc send multiple pages in single sendmsg() call rather than calling sendpage() in TCP (or maybe TLS). (4) Make tcp_bpf_sendpage() a wrapper around tcp_bpf_sendmsg(). (5) Make AF_KCM use sendmsg() when calling down to TCP and then make it send entire fragment lists in single sendmsg calls. Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=fd5f4d7da29218485153fd8b4c08da7fc130c79f [1] ==================== Link: https://lore.kernel.org/r/20230609100221.2620633-1-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * kcm: Send multiple frags in one sendmsg()David Howells2023-06-132-77/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rewrite the AF_KCM transmission loop to send all the fragments in a single skb or frag_list-skb in one sendmsg() with MSG_SPLICE_PAGES set. The list of fragments in each skb is conveniently a bio_vec[] that can just be attached to a BVEC iter. Note: I'm working out the size of each fragment-skb by adding up bv_len for all the bio_vecs in skb->frags[] - but surely this information is recorded somewhere? For the skbs in head->frag_list, this is equal to skb->data_len, but not for the head. head->data_len includes all the tail frags too. Signed-off-by: David Howells <dhowells@redhat.com> cc: Tom Herbert <tom@herbertland.com> cc: Tom Herbert <tom@quantonium.net> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * kcm: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpageDavid Howells2023-06-131-5/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | When transmitting data, call down into the transport socket using sendmsg with MSG_SPLICE_PAGES to indicate that content should be spliced rather than using sendpage. Signed-off-by: David Howells <dhowells@redhat.com> cc: Tom Herbert <tom@herbertland.com> cc: Tom Herbert <tom@quantonium.net> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * tcp_bpf: Make tcp_bpf_sendpage() go through tcp_bpf_sendmsg(MSG_SPLICE_PAGES)David Howells2023-06-131-40/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | Make tcp_bpf_sendpage() a wrapper around tcp_bpf_sendmsg(MSG_SPLICE_PAGES) rather than a loop calling tcp_sendpage(). sendpage() will be removed in the future. Signed-off-by: David Howells <dhowells@redhat.com> cc: John Fastabend <john.fastabend@gmail.com> cc: Jakub Sitnicki <jakub@cloudflare.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpageDavid Howells2023-06-132-32/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When transmitting data, call down into TCP using sendmsg with MSG_SPLICE_PAGES to indicate that content should be spliced rather than performing sendpage calls to transmit header, data pages and trailer. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> cc: Trond Myklebust <trond.myklebust@hammerspace.com> cc: Anna Schumaker <anna@kernel.org> cc: Jeff Layton <jlayton@kernel.org> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * algif: Remove hash_sendpage*()David Howells2023-06-131-66/+0
| | | | | | | | | | | | | | | | | | | | | | | | Remove hash_sendpage*() as nothing should now call it since the rewrite of splice_to_socket()[1]. Signed-off-by: David Howells <dhowells@redhat.com> cc: Herbert Xu <herbert@gondor.apana.org.au> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=2dc334f1a63a8839b88483a3e73c0f27c9c1791c [1] Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * Remove file->f_op->sendpageDavid Howells2023-06-131-1/+0
|/ | | | | | | | | | Remove file->f_op->sendpage as splicing to a socket now calls sendmsg rather than sendpage. Signed-off-by: David Howells <dhowells@redhat.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* Merge branch 'net-flower-add-cfm-support'Jakub Kicinski2023-06-136-0/+369
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Zahari Doychev says: ==================== net: flower: add cfm support The first patch adds cfm support to the flow dissector. The second adds the flower classifier support. The third adds a selftest for the flower cfm functionality. ==================== Link: https://lore.kernel.org/r/20230608105648.266575-1-zahari.doychev@linux.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * selftests: net: add tc flower cfm testZahari Doychev2023-06-132-0/+207
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | New cfm flower test case is added to the net forwarding selfttests. Example output: # ./tc_flower_cfm.sh p1 p2 TEST: CFM opcode match test [ OK ] TEST: CFM level match test [ OK ] TEST: CFM opcode and level match test [ OK ] Signed-off-by: Zahari Doychev <zdoychev@maxlinear.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * net: flower: add support for matching cfm fieldsZahari Doychev2023-06-132-0/+111
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support to the tc flower classifier to match based on fields in CFM information elements like level and opcode. tc filter add dev ens6 ingress protocol 802.1q \ flower vlan_id 698 vlan_ethtype 0x8902 cfm mdl 5 op 46 \ action drop Signed-off-by: Zahari Doychev <zdoychev@maxlinear.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * net: flow_dissector: add support for cfm packetsZahari Doychev2023-06-132-0/+51
|/ | | | | | | | | | Add support for dissecting cfm packets. The cfm packet header fields maintenance domain level and opcode can be dissected. Signed-off-by: Zahari Doychev <zdoychev@maxlinear.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* net: mlxsw: i2c: Switch back to use struct i2c_driver's .probe()Uwe Kleine-König2023-06-121-1/+1
| | | | | | | | | | | | | After commit b8a1a4cd5a98 ("i2c: Provide a temporary .probe_new() call-back type"), all drivers being converted to .probe_new() and then commit 03c835f498b5 ("i2c: Switch .probe() to not take an id parameter") convert back to (the new) .probe() to be able to eventually drop .probe_new() from struct i2c_driver. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: phy: add driver for MediaTek SoC built-in GE PHYsDaniel Golle2023-06-125-1/+1140
| | | | | | | | | | | | | Some of MediaTek's Filogic SoCs come with built-in gigabit Ethernet PHYs which require calibration data from the SoC's efuse. Despite the similar design the driver doesn't share any code with the existing mediatek-ge.c. Add support for such PHYs by introducing a new driver with basic support for MediaTek SoCs MT7981 and MT7988 built-in 1GE PHYs. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge tag 'mlx5-updates-2023-06-09' of ↵David S. Miller2023-06-1220-177/+610
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux mlx5-updates-2023-06-09 1) Embedded CPU Virtual Functions 2) Lightweight local SFs Daniel Jurgens says: ==================== Embedded CPU Virtual Functions This series enables the creation of virtual functions on Bluefield (the embedded CPU platform). Embedded CPU virtual functions (EC VFs). EC VF creation, deletion and management interfaces are the same as those for virtual functions in a server with a Connect-X NIC. When using EC VFs on the ARM the creation of virtual functions on the host system is still supported. Host VFs eswitch vports occupy a range of 1..max_vfs, the EC VF vport range is max_vfs+1..max_ec_vfs. Every function (PF, ECPF, VF, EC VF, and subfunction) has a function ID associated with it. Prior to this series the function ID and the eswitch vport were the same. That is no longer the case, the EC VF function ID range is 1..max_ec_vfs. When querying or setting the capabilities of an EC VF function an new bit must be set in the query/set HCA cap structure. This is a high level overview of the changes made: - Allocate vports for EC VFs if they are enabled. - Create representors and devlink ports for the EC VF vports. - When querying/setting HCA caps by vport break the assumption that function ID is the same a vport number and adjust accordingly. - Create a new type of page, so that when SRIOV on the ARM is disabled, but remains enabled on the host, the driver can wait for the correct pages. - Update SRIOV code to support EC VF creation/deletion. =================== Lightweight local SFs: Last 3 patches form Shay Drory: SFs are heavy weight and by default they come with the full package of ConnectX features. Usually users want specialized SFs for one specific purpose and using devlink users will almost always override the set of advertises features of an SF and reload it. Shay Drory says: ================ In order to avoid the wasted time and resources on the reload, local SFs will probe without any auxiliary sub-device, so that the SFs can be configured prior to its full probe. The defaults of the enable_* devlink params of these SFs are set to false. Usage example: Create SF: $ devlink port add pci/0000:08:00.0 flavour pcisf pfnum 0 sfnum 11 $ devlink port function set pci/0000:08:00.0/32768 \ hw_addr 00:00:00:00:00:11 state active Enable ETH auxiliary device: $ devlink dev param set auxiliary/mlx5_core.sf.1 \ name enable_eth value true cmode driverinit Now, in order to fully probe the SF, use devlink reload: $ devlink dev reload auxiliary/mlx5_core.sf.1 At this point the user have SF devlink instance with auxiliary device for the Ethernet functionality only. ================