author | Linus Torvalds <torvalds@linux-foundation.org> | 2023-04-27 01:07:23 +0200 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2023-04-27 01:07:23 +0200 |
commit | 6e98b09da931a00bf4e0477d0fa52748bf28fcce (patch) | |
tree | 9c658ed95add5693f42f29f63df80a2ede3f6ec2 /drivers/net/ethernet | |
parent | Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi (diff) | |
parent | net: phy: hide the PHYLIB_LEDS knob (diff) | |
Merge tag 'net-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
"Core:
- Introduce a config option to tweak MAX_SKB_FRAGS. Increasing the
default value allows for better BIG TCP performance
- Reduce compound page head access for zero-copy data transfers
- RPS/RFS improvements, avoiding unneeded NET_RX_SOFTIRQ when
possible
- Threaded NAPI improvements, adding defer skb free support and
unneeded softirq avoidance
- Address dst_entry reference count scalability issues, via false
sharing avoidance and optimized refcount tracking
- Add lockless access annotations to sk_err[_soft]
- Optimize again the skb struct layout
- Extend the skb drop reasons to make them usable by multiple
subsystems
- Better const qualifier awareness for socket casts
BPF:
- Add skb and XDP typed dynptrs, which give BPF programs more
ergonomic and less brittle iteration through data and
variable-sized accesses
- Add a new BPF netfilter program type and minimal support to hook
BPF programs to netfilter hooks such as prerouting or forward
- Add more precise memory usage reporting for all BPF map types
- Add support for using {FOU,GUE} encap with an ipip device
operating in collect_md mode and add a set of BPF kfuncs for
controlling encap params
- Allow BPF programs to detect at load time whether a particular
kfunc exists or not, and also add support for this in light
skeleton
- Bigger batch of BPF verifier improvements to prepare for upcoming
BPF open-coded iterators allowing for less restrictive looping
capabilities
- Rework RCU enforcement in the verifier, add kptr_rcu and require
BPF programs to NULL-check before passing such pointers into kfuncs
- Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and
in local storage maps
- Enable RCU semantics for task BPF kptrs and allow referenced kptr
tasks to be stored in BPF maps
- Add support for refcounted local kptrs to the verifier for allowing
shared ownership, useful for adding a node to both the BPF list and
rbtree
- Add BPF verifier support for ST instructions in
convert_ctx_access() which will help new -mcpu=v4 clang flag to
start emitting them
- Add ARM32 USDT support to libbpf
- Improve bpftool's visual program dump which produces the control
flow graph in a DOT format by adding C source inline annotations
Protocols:
- IPv4: Allow adding a 'protocol' tag to IPv4 addresses. The value
indicates the provenance of the IP address
- IPv6: optimize route lookup, dropping unneeded R/W lock acquisition
- Add the handshake upcall mechanism, allowing user space to
implement a generic TLS handshake on the kernel's behalf
- Bridge: support per-{Port, VLAN} neighbor suppression, increasing
resilience to node failures
- SCTP: add support for Fair Capacity and Weighted Fair Queueing
schedulers
- MPTCP: delay first subflow allocation up to its first usage. This
will allow for better LSM interaction later
- xfrm: Remove inner/outer modes from input/output path. These are
not needed anymore
- WiFi:
- reduced neighbor report (RNR) handling for AP mode
- HW timestamping support
- support for randomized auth/deauth TA for PASN privacy
- per-link debugfs for multi-link
- TC offload support for mac80211 drivers
- mac80211 mesh fast-xmit and fast-rx support
- enable Wi-Fi 7 (EHT) mesh support
Netfilter:
- Add nf_tables 'brouting' support, to force a packet to be routed
instead of being bridged
- Update bridge netfilter and ovs conntrack helpers to handle IPv6
Jumbo packets properly, i.e. fetch the packet length from
hop-by-hop extension header. This is needed for BIG TCP support
- The iptables 32bit compat interface isn't compiled in by default
anymore
- Move ip(6)tables builtin icmp matches to the udptcp one. This has
the advantage that icmp/icmpv6 match doesn't load the
iptables/ip6tables modules anymore when iptables-nft is used
- Extend netlink error reporting for netdevices in flowtables and
netdev chains. Allow incrementally adding/deleting devices to/from a
netdev basechain. Allow creating a netdev chain without a device
Driver API:
- Remove redundant Device Control Error Reporting Enable, as PCI core
has already error reporting enabled at enumeration time
- Move Multicast DB netlink handlers to core, allowing devices other
than bridge to use them
- Allow the page_pool to directly recycle the pages from safely
localized NAPI
- Implement lockless TX queue stop/wake combo macros, allowing for
further code de-duplication and sanitization
- Add YNL support for user headers and struct attrs
- Add partial YNL specification for devlink
- Add partial YNL specification for ethtool
- Add tc-mqprio and tc-taprio support for preemptible traffic classes
- Add tx push buf len param to ethtool, which specifies the maximum
number of bytes of a transmitted packet a driver can push directly to
the underlying device
- Add basic LED support for switch/phy
- Add NAPI documentation, stop relying on external links
- Convert dsa_master_ioctl() to netdev notifier. This is
preparatory work to make the hardware timestamping layer selectable
by user space
- Add transceiver support and improve the error messages for CAN-FD
controllers
New hardware / drivers:
- Ethernet:
- AMD/Pensando core device support
- MediaTek MT7981 SoC
- MediaTek MT7988 SoC
- Broadcom BCM53134 embedded switch
- Texas Instruments CPSW9G ethernet switch
- Qualcomm EMAC3 DWMAC ethernet
- StarFive JH7110 SoC
- NXP CBTX ethernet PHY
- WiFi:
- Apple M1 Pro/Max devices
- RealTek rtl8710bu/rtl8188gu
- RealTek rtl8822bs, rtl8822cs and rtl8821cs SDIO chipset
- Bluetooth:
- Realtek RTL8821CS, RTL8851B, RTL8852BS
- Mediatek MT7663, MT7922
- NXP w8997
- Actions Semi ATS2851
- QTI WCN6855
- Marvell 88W8997
- CAN:
- STMicroelectronics bxcan stm32f429
Drivers:
- Ethernet NICs:
- Intel (1G, igc):
- add tracking and reporting of QBV config errors
- add support for configuring max SDU for each Tx queue
- Intel (100G, ice):
- refactor mailbox overflow detection to support Scalable IOV
- GNSS interface optimization
- Intel (i40e):
- support XDP multi-buffer
- nVidia/Mellanox:
- add support for Linux bridge multicast offload
- enable TC offload for egress and ingress MACVLAN over bond
- add support for VxLAN GBP encap/decap flows offload
- extend packet offload to fully support libreswan
- support tunnel mode in mlx5 IPsec packet offload
- extend XDP multi-buffer support
- support MACsec VLAN offload
- add support for dynamic msix vectors allocation
- drop RX page_cache and fully use page_pool
- implement thermal zone to report NIC temperature
- Netronome/Corigine:
- add support for multi-zone conntrack offload
- Solarflare/Xilinx:
- support offloading TC VLAN push/pop actions to the MAE
- support TC decap rules
- support unicast PTP
- Other NICs:
- Broadcom (bnxt): enforce software based freq adjustments only on
shared PHC NIC
- RealTek (r8169): refactor to address ASPM issues during NAPI poll
- Micrel (lan8841): add support for PTP_PF_PEROUT
- Cadence (macb): enable PTP unicast
- Engleder (tsnep): add XDP socket zero-copy support
- virtio-net: implement exact header length guest feature
- veth: add page_pool support for page recycling
- vxlan: add MDB data path support
- gve: add XDP support for GQI-QPL format
- geneve: accept every ethertype
- macvlan: allow some packets to bypass broadcast queue
- mana: add support for jumbo frames
- Ethernet high-speed switches:
- Microchip (sparx5): Add support for TC flower templates
- Ethernet embedded switches:
- Broadcom (b53):
- configure 6318 and 63268 RGMII ports
- Marvell (mv88e6xxx):
- faster C45 bus scan
- Microchip:
- lan966x:
- add support for IS1 VCAP
- better TX/RX from/to CPU performance
- ksz9477: add ETS Qdisc support
- ksz8: enhance static MAC table operations and error handling
- sama7g5: add PTP capability
- NXP (ocelot):
- add support for external ports
- add support for preemptible traffic classes
- Texas Instruments:
- add CPSWxG SGMII support for J7200 and J721E
- Intel WiFi (iwlwifi):
- preparation for Wi-Fi 7 EHT and multi-link support
- EHT (Wi-Fi 7) sniffer support
- hardware timestamping support for some devices/firmwares
- TX beacon protection on newer hardware
- Qualcomm 802.11ax WiFi (ath11k):
- MU-MIMO parameters support
- ack signal support for management packets
- RealTek WiFi (rtw88):
- SDIO bus support
- better support for some SDIO devices (e.g. MAC address from
efuse)
- RealTek WiFi (rtw89):
- HW scan support for 8852b
- better support for 6 GHz scanning
- support for various newer firmware APIs
- framework firmware backwards compatibility
- MediaTek WiFi (mt76):
- P2P support
- mesh A-MSDU support
- EHT (Wi-Fi 7) support
- coredump support"
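The "tx push buf len" item above is exercised by the ENA driver changes in this merge, which accept exactly two push-buffer lengths derived from the LLQ entry size. The following is a minimal userspace sketch of that arithmetic and of the checks in ena_set_ringparam(), not the driver code itself. The names tx_desc_sketch, check_tx_push_buf_len and the PUSH_* values are hypothetical, and the 16-byte descriptor size is an assumption about struct ena_eth_io_tx_desc; under that assumption the two lengths come out as 96 and 224 bytes.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct ena_eth_io_tx_desc.
 * ASSUMPTION: the real descriptor is four 32-bit words (16 bytes).
 */
struct tx_desc_sketch {
	uint32_t word[4];
};

/* Same arithmetic as the ENA_LLQ_* macros added in ena_eth_com.h:
 * an LLQ entry holds two descriptors, the rest is packet header.
 */
#define LLQ_ENTRY_DESC_CHUNK_SIZE (2 * sizeof(struct tx_desc_sketch))
#define LLQ_HEADER       (128UL - LLQ_ENTRY_DESC_CHUNK_SIZE)
#define LLQ_LARGE_HEADER (256UL - LLQ_ENTRY_DESC_CHUNK_SIZE)

/* Hypothetical result codes standing in for 0, -EOPNOTSUPP and -EINVAL. */
enum push_check { PUSH_OK = 0, PUSH_UNSUPPORTED = -1, PUSH_INVALID = -2 };

/* Mirrors the order of checks in ena_set_ringparam(): a zero length
 * means "leave unchanged", host placement cannot take a push buffer,
 * and only the two header sizes are accepted.
 */
enum push_check check_tx_push_buf_len(unsigned long requested,
				      int device_placement)
{
	if (requested == 0)
		return PUSH_OK;
	if (!device_placement)
		return PUSH_UNSUPPORTED;
	if (requested != LLQ_HEADER && requested != LLQ_LARGE_HEADER)
		return PUSH_INVALID;
	return PUSH_OK;
}
```

Note that, as in the diff, the "large LLQ supported" flag only shapes the error message; the length check itself accepts either value.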
* tag 'net-next-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2078 commits)
net: phy: hide the PHYLIB_LEDS knob
net: phy: marvell-88x2222: remove unnecessary (void*) conversions
tcp/udp: Fix memleaks of sk and zerocopy skbs with TX timestamp.
net: amd: Fix link leak when verifying config failed
net: phy: marvell: Fix inconsistent indenting in led_blink_set
lan966x: Don't use xdp_frame when action is XDP_TX
tsnep: Add XDP socket zero-copy TX support
tsnep: Add XDP socket zero-copy RX support
tsnep: Move skb receive action to separate function
tsnep: Add functions for queue enable/disable
tsnep: Rework TX/RX queue initialization
tsnep: Replace modulo operation with mask
net: phy: dp83867: Add led_brightness_set support
net: phy: Fix reading LED reg property
drivers: nfc: nfcsim: remove return value check of `dev_dir`
net: phy: dp83867: Remove unnecessary (void*) conversions
net: ethtool: coalesce: try to make user settings stick twice
net: mana: Check if netdev/napi_alloc_frag returns single page
net: mana: Rename mana_refill_rxoob and remove some empty lines
net: veth: add page_pool stats
...
Diffstat (limited to 'drivers/net/ethernet')
402 files changed, 21375 insertions, 13682 deletions
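The largest single change in the portion of the diff shown here is the reworked ena_calc_io_queue_size(). Its TX sizing rule can be sketched in plain C as follows: round the device maximum down to a power of two, halve it when large LLQ headers are forced, then clamp and round the requested size. This is a simplified illustration only. The helper names are hypothetical, and MIN_RING_SIZE = 256 is an assumed value for ENA_MIN_RING_SIZE, which the diff does not show.

```c
#include <stdint.h>

/* ASSUMPTION: stands in for ENA_MIN_RING_SIZE, not shown in the diff. */
#define MIN_RING_SIZE 256u

/* Largest power of two <= v. Unlike the kernel helper, which is
 * undefined for 0, this sketch returns 1 for v == 0.
 */
static inline uint32_t rounddown_pow2(uint32_t v)
{
	uint32_t p = 1;

	while (p <= v / 2)
		p *= 2;
	return p;
}

static inline uint32_t clamp_u32(uint32_t v, uint32_t lo, uint32_t hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical helper following the order of operations in the diff:
 * a 256-byte LLQ entry is twice the size of a 128-byte one, so the
 * maximum queue depth is halved to keep memory use unchanged.
 */
uint32_t tx_ring_size_sketch(uint32_t requested, uint32_t dev_max,
			     int large_llq_enabled)
{
	uint32_t max_size = rounddown_pow2(dev_max);

	if (large_llq_enabled)
		max_size /= 2;

	return rounddown_pow2(clamp_u32(requested, MIN_RING_SIZE, max_size));
}
```

For instance, a request for 1024 entries on a device whose maximum is 1024 stays at 1024 with normal headers and drops to 512 once large headers are enabled.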
diff --git a/drivers/net/ethernet/8390/axnet_cs.c b/drivers/net/ethernet/8390/axnet_cs.c index 3aef959fc25b..78f985885547 100644 --- a/drivers/net/ethernet/8390/axnet_cs.c +++ b/drivers/net/ethernet/8390/axnet_cs.c @@ -650,7 +650,6 @@ static void block_input(struct net_device *dev, int count, { unsigned int nic_base = dev->base_addr; struct ei_device *ei_local = netdev_priv(dev); - int xfer_count = count; char *buf = skb->data; if ((netif_msg_rx_status(ei_local)) && (count != 4)) @@ -662,9 +661,7 @@ static void block_input(struct net_device *dev, int count, insw(nic_base + AXNET_DATAPORT,buf,count>>1); if (count & 0x01) { buf[count-1] = inb(nic_base + AXNET_DATAPORT); - xfer_count++; } - } /*====================================================================*/ diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig index 1917da784191..5a274b99f299 100644 --- a/drivers/net/ethernet/Kconfig +++ b/drivers/net/ethernet/Kconfig @@ -84,7 +84,6 @@ source "drivers/net/ethernet/huawei/Kconfig" source "drivers/net/ethernet/i825xx/Kconfig" source "drivers/net/ethernet/ibm/Kconfig" source "drivers/net/ethernet/intel/Kconfig" -source "drivers/net/ethernet/wangxun/Kconfig" source "drivers/net/ethernet/xscale/Kconfig" config JME @@ -189,6 +188,7 @@ source "drivers/net/ethernet/toshiba/Kconfig" source "drivers/net/ethernet/tundra/Kconfig" source "drivers/net/ethernet/vertexcom/Kconfig" source "drivers/net/ethernet/via/Kconfig" +source "drivers/net/ethernet/wangxun/Kconfig" source "drivers/net/ethernet/wiznet/Kconfig" source "drivers/net/ethernet/xilinx/Kconfig" source "drivers/net/ethernet/xircom/Kconfig" diff --git a/drivers/net/ethernet/alteon/acenic.c b/drivers/net/ethernet/alteon/acenic.c index d7762da8b2c0..eafef84fe3be 100644 --- a/drivers/net/ethernet/alteon/acenic.c +++ b/drivers/net/ethernet/alteon/acenic.c @@ -2435,7 +2435,7 @@ restart: } else { dma_addr_t mapping; u32 vlan_tag = 0; - int i, len = 0; + int i; mapping = ace_map_tx_skb(ap, skb, NULL, 
idx); flagsize = (skb_headlen(skb) << 16); @@ -2454,7 +2454,6 @@ restart: const skb_frag_t *frag = &skb_shinfo(skb)->frags[i]; struct tx_ring_info *info; - len += skb_frag_size(frag); info = ap->skb->tx_skbuff + idx; desc = ap->tx_ring + idx; diff --git a/drivers/net/ethernet/amazon/ena/ena_eth_com.h b/drivers/net/ethernet/amazon/ena/ena_eth_com.h index 689313ee25a8..372b259279ec 100644 --- a/drivers/net/ethernet/amazon/ena/ena_eth_com.h +++ b/drivers/net/ethernet/amazon/ena/ena_eth_com.h @@ -10,6 +10,10 @@ /* head update threshold in units of (queue size / ENA_COMP_HEAD_THRESH) */ #define ENA_COMP_HEAD_THRESH 4 +/* we allow 2 DMA descriptors per LLQ entry */ +#define ENA_LLQ_ENTRY_DESC_CHUNK_SIZE (2 * sizeof(struct ena_eth_io_tx_desc)) +#define ENA_LLQ_HEADER (128UL - ENA_LLQ_ENTRY_DESC_CHUNK_SIZE) +#define ENA_LLQ_LARGE_HEADER (256UL - ENA_LLQ_ENTRY_DESC_CHUNK_SIZE) struct ena_com_tx_ctx { struct ena_com_tx_meta ena_meta; diff --git a/drivers/net/ethernet/amazon/ena/ena_ethtool.c b/drivers/net/ethernet/amazon/ena/ena_ethtool.c index 1d4f2f4d10f2..d671df4b76bc 100644 --- a/drivers/net/ethernet/amazon/ena/ena_ethtool.c +++ b/drivers/net/ethernet/amazon/ena/ena_ethtool.c @@ -476,6 +476,21 @@ static void ena_get_ringparam(struct net_device *netdev, ring->tx_max_pending = adapter->max_tx_ring_size; ring->rx_max_pending = adapter->max_rx_ring_size; + if (adapter->ena_dev->tx_mem_queue_type == ENA_ADMIN_PLACEMENT_POLICY_DEV) { + bool large_llq_supported = adapter->large_llq_header_supported; + + kernel_ring->tx_push = true; + kernel_ring->tx_push_buf_len = adapter->ena_dev->tx_max_header_size; + if (large_llq_supported) + kernel_ring->tx_push_buf_max_len = ENA_LLQ_LARGE_HEADER; + else + kernel_ring->tx_push_buf_max_len = ENA_LLQ_HEADER; + } else { + kernel_ring->tx_push = false; + kernel_ring->tx_push_buf_max_len = 0; + kernel_ring->tx_push_buf_len = 0; + } + ring->tx_pending = adapter->tx_ring[0].ring_size; ring->rx_pending = adapter->rx_ring[0].ring_size; } @@ -486,7 
+501,8 @@ static int ena_set_ringparam(struct net_device *netdev, struct netlink_ext_ack *extack) { struct ena_adapter *adapter = netdev_priv(netdev); - u32 new_tx_size, new_rx_size; + u32 new_tx_size, new_rx_size, new_tx_push_buf_len; + bool changed = false; new_tx_size = ring->tx_pending < ENA_MIN_RING_SIZE ? ENA_MIN_RING_SIZE : ring->tx_pending; @@ -496,11 +512,51 @@ static int ena_set_ringparam(struct net_device *netdev, ENA_MIN_RING_SIZE : ring->rx_pending; new_rx_size = rounddown_pow_of_two(new_rx_size); - if (new_tx_size == adapter->requested_tx_ring_size && - new_rx_size == adapter->requested_rx_ring_size) + changed |= new_tx_size != adapter->requested_tx_ring_size || + new_rx_size != adapter->requested_rx_ring_size; + + /* This value is ignored if LLQ is not supported */ + new_tx_push_buf_len = adapter->ena_dev->tx_max_header_size; + + if ((adapter->ena_dev->tx_mem_queue_type == ENA_ADMIN_PLACEMENT_POLICY_DEV) != + kernel_ring->tx_push) { + NL_SET_ERR_MSG_MOD(extack, "Push mode state cannot be modified"); + return -EINVAL; + } + + /* Validate that the push buffer is supported on the underlying device */ + if (kernel_ring->tx_push_buf_len) { + enum ena_admin_placement_policy_type placement; + + new_tx_push_buf_len = kernel_ring->tx_push_buf_len; + + placement = adapter->ena_dev->tx_mem_queue_type; + if (placement == ENA_ADMIN_PLACEMENT_POLICY_HOST) + return -EOPNOTSUPP; + + if (new_tx_push_buf_len != ENA_LLQ_HEADER && + new_tx_push_buf_len != ENA_LLQ_LARGE_HEADER) { + bool large_llq_sup = adapter->large_llq_header_supported; + char large_llq_size_str[40]; + + snprintf(large_llq_size_str, 40, ", %lu", ENA_LLQ_LARGE_HEADER); + + NL_SET_ERR_MSG_FMT_MOD(extack, + "Supported tx push buff values: [%lu%s]", + ENA_LLQ_HEADER, + large_llq_sup ? 
large_llq_size_str : ""); + + return -EINVAL; + } + + changed |= new_tx_push_buf_len != adapter->ena_dev->tx_max_header_size; + } + + if (!changed) return 0; - return ena_update_queue_sizes(adapter, new_tx_size, new_rx_size); + return ena_update_queue_params(adapter, new_tx_size, new_rx_size, + new_tx_push_buf_len); } static u32 ena_flow_hash_to_flow_type(u16 hash_fields) @@ -909,6 +965,8 @@ static int ena_set_tunable(struct net_device *netdev, static const struct ethtool_ops ena_ethtool_ops = { .supported_coalesce_params = ETHTOOL_COALESCE_USECS | ETHTOOL_COALESCE_USE_ADAPTIVE_RX, + .supported_ring_params = ETHTOOL_RING_USE_TX_PUSH_BUF_LEN | + ETHTOOL_RING_USE_TX_PUSH, .get_link_ksettings = ena_get_link_ksettings, .get_drvinfo = ena_get_drvinfo, .get_msglevel = ena_get_msglevel, diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c index cbfe7f977270..e6a6efaeb87c 100644 --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c @@ -1898,7 +1898,6 @@ static int ena_clean_xdp_irq(struct ena_ring *xdp_ring, u32 budget) { u32 total_done = 0; u16 next_to_clean; - u32 tx_bytes = 0; int tx_pkts = 0; u16 req_id; int rc; @@ -1936,7 +1935,6 @@ static int ena_clean_xdp_irq(struct ena_ring *xdp_ring, u32 budget) "tx_poll: q %d skb %p completed\n", xdp_ring->qid, xdpf); - tx_bytes += xdpf->len; tx_pkts++; total_done += tx_info->tx_descs; @@ -2809,11 +2807,13 @@ static int ena_close(struct net_device *netdev) return 0; } -int ena_update_queue_sizes(struct ena_adapter *adapter, - u32 new_tx_size, - u32 new_rx_size) +int ena_update_queue_params(struct ena_adapter *adapter, + u32 new_tx_size, + u32 new_rx_size, + u32 new_llq_header_len) { - bool dev_was_up; + bool dev_was_up, large_llq_changed = false; + int rc = 0; dev_was_up = test_bit(ENA_FLAG_DEV_UP, &adapter->flags); ena_close(adapter->netdev); @@ -2823,7 +2823,21 @@ int ena_update_queue_sizes(struct ena_adapter *adapter, 0, 
adapter->xdp_num_queues + adapter->num_io_queues); - return dev_was_up ? ena_up(adapter) : 0; + + large_llq_changed = adapter->ena_dev->tx_mem_queue_type == + ENA_ADMIN_PLACEMENT_POLICY_DEV; + large_llq_changed &= + new_llq_header_len != adapter->ena_dev->tx_max_header_size; + + /* a check that the configuration is valid is done by caller */ + if (large_llq_changed) { + adapter->large_llq_header_enabled = !adapter->large_llq_header_enabled; + + ena_destroy_device(adapter, false); + rc = ena_restore_device(adapter); + } + + return dev_was_up && !rc ? ena_up(adapter) : rc; } int ena_set_rx_copybreak(struct ena_adapter *adapter, u32 rx_copybreak) @@ -3364,6 +3378,98 @@ static const struct net_device_ops ena_netdev_ops = { .ndo_xdp_xmit = ena_xdp_xmit, }; +static void ena_calc_io_queue_size(struct ena_adapter *adapter, + struct ena_com_dev_get_features_ctx *get_feat_ctx) +{ + struct ena_admin_feature_llq_desc *llq = &get_feat_ctx->llq; + struct ena_com_dev *ena_dev = adapter->ena_dev; + u32 tx_queue_size = ENA_DEFAULT_RING_SIZE; + u32 rx_queue_size = ENA_DEFAULT_RING_SIZE; + u32 max_tx_queue_size; + u32 max_rx_queue_size; + + /* If this function is called after driver load, the ring sizes have already + * been configured. Take it into account when recalculating ring size. 
+ */ + if (adapter->tx_ring->ring_size) + tx_queue_size = adapter->tx_ring->ring_size; + + if (adapter->rx_ring->ring_size) + rx_queue_size = adapter->rx_ring->ring_size; + + if (ena_dev->supported_features & BIT(ENA_ADMIN_MAX_QUEUES_EXT)) { + struct ena_admin_queue_ext_feature_fields *max_queue_ext = + &get_feat_ctx->max_queue_ext.max_queue_ext; + max_rx_queue_size = min_t(u32, max_queue_ext->max_rx_cq_depth, + max_queue_ext->max_rx_sq_depth); + max_tx_queue_size = max_queue_ext->max_tx_cq_depth; + + if (ena_dev->tx_mem_queue_type == ENA_ADMIN_PLACEMENT_POLICY_DEV) + max_tx_queue_size = min_t(u32, max_tx_queue_size, + llq->max_llq_depth); + else + max_tx_queue_size = min_t(u32, max_tx_queue_size, + max_queue_ext->max_tx_sq_depth); + + adapter->max_tx_sgl_size = min_t(u16, ENA_PKT_MAX_BUFS, + max_queue_ext->max_per_packet_tx_descs); + adapter->max_rx_sgl_size = min_t(u16, ENA_PKT_MAX_BUFS, + max_queue_ext->max_per_packet_rx_descs); + } else { + struct ena_admin_queue_feature_desc *max_queues = + &get_feat_ctx->max_queues; + max_rx_queue_size = min_t(u32, max_queues->max_cq_depth, + max_queues->max_sq_depth); + max_tx_queue_size = max_queues->max_cq_depth; + + if (ena_dev->tx_mem_queue_type == ENA_ADMIN_PLACEMENT_POLICY_DEV) + max_tx_queue_size = min_t(u32, max_tx_queue_size, + llq->max_llq_depth); + else + max_tx_queue_size = min_t(u32, max_tx_queue_size, + max_queues->max_sq_depth); + + adapter->max_tx_sgl_size = min_t(u16, ENA_PKT_MAX_BUFS, + max_queues->max_packet_tx_descs); + adapter->max_rx_sgl_size = min_t(u16, ENA_PKT_MAX_BUFS, + max_queues->max_packet_rx_descs); + } + + max_tx_queue_size = rounddown_pow_of_two(max_tx_queue_size); + max_rx_queue_size = rounddown_pow_of_two(max_rx_queue_size); + + /* When forcing large headers, we multiply the entry size by 2, and therefore divide + * the queue size by 2, leaving the amount of memory used by the queues unchanged. 
+ */ + if (adapter->large_llq_header_enabled) { + if ((llq->entry_size_ctrl_supported & ENA_ADMIN_LIST_ENTRY_SIZE_256B) && + ena_dev->tx_mem_queue_type == ENA_ADMIN_PLACEMENT_POLICY_DEV) { + max_tx_queue_size /= 2; + dev_info(&adapter->pdev->dev, + "Forcing large headers and decreasing maximum TX queue size to %d\n", + max_tx_queue_size); + } else { + dev_err(&adapter->pdev->dev, + "Forcing large headers failed: LLQ is disabled or device does not support large headers\n"); + + adapter->large_llq_header_enabled = false; + } + } + + tx_queue_size = clamp_val(tx_queue_size, ENA_MIN_RING_SIZE, + max_tx_queue_size); + rx_queue_size = clamp_val(rx_queue_size, ENA_MIN_RING_SIZE, + max_rx_queue_size); + + tx_queue_size = rounddown_pow_of_two(tx_queue_size); + rx_queue_size = rounddown_pow_of_two(rx_queue_size); + + adapter->max_tx_ring_size = max_tx_queue_size; + adapter->max_rx_ring_size = max_rx_queue_size; + adapter->requested_tx_ring_size = tx_queue_size; + adapter->requested_rx_ring_size = rx_queue_size; +} + static int ena_device_validate_params(struct ena_adapter *adapter, struct ena_com_dev_get_features_ctx *get_feat_ctx) { @@ -3387,13 +3493,30 @@ static int ena_device_validate_params(struct ena_adapter *adapter, return 0; } -static void set_default_llq_configurations(struct ena_llq_configurations *llq_config) +static void set_default_llq_configurations(struct ena_adapter *adapter, + struct ena_llq_configurations *llq_config, + struct ena_admin_feature_llq_desc *llq) { + struct ena_com_dev *ena_dev = adapter->ena_dev; + llq_config->llq_header_location = ENA_ADMIN_INLINE_HEADER; llq_config->llq_stride_ctrl = ENA_ADMIN_MULTIPLE_DESCS_PER_ENTRY; llq_config->llq_num_decs_before_header = ENA_ADMIN_LLQ_NUM_DESCS_BEFORE_HEADER_2; - llq_config->llq_ring_entry_size = ENA_ADMIN_LIST_ENTRY_SIZE_128B; - llq_config->llq_ring_entry_size_value = 128; + + adapter->large_llq_header_supported = + !!(ena_dev->supported_features & BIT(ENA_ADMIN_LLQ)); + 
adapter->large_llq_header_supported &= + !!(llq->entry_size_ctrl_supported & + ENA_ADMIN_LIST_ENTRY_SIZE_256B); + + if ((llq->entry_size_ctrl_supported & ENA_ADMIN_LIST_ENTRY_SIZE_256B) && + adapter->large_llq_header_enabled) { + llq_config->llq_ring_entry_size = ENA_ADMIN_LIST_ENTRY_SIZE_256B; + llq_config->llq_ring_entry_size_value = 256; + } else { + llq_config->llq_ring_entry_size = ENA_ADMIN_LIST_ENTRY_SIZE_128B; + llq_config->llq_ring_entry_size_value = 128; + } } static int ena_set_queues_placement_policy(struct pci_dev *pdev, @@ -3412,6 +3535,13 @@ static int ena_set_queues_placement_policy(struct pci_dev *pdev, return 0; } + if (!ena_dev->mem_bar) { + netdev_err(ena_dev->net_device, + "LLQ is advertised as supported but device doesn't expose mem bar\n"); + ena_dev->tx_mem_queue_type = ENA_ADMIN_PLACEMENT_POLICY_HOST; + return 0; + } + rc = ena_com_config_dev_mode(ena_dev, llq, llq_default_configurations); if (unlikely(rc)) { dev_err(&pdev->dev, @@ -3427,15 +3557,8 @@ static int ena_map_llq_mem_bar(struct pci_dev *pdev, struct ena_com_dev *ena_dev { bool has_mem_bar = !!(bars & BIT(ENA_MEM_BAR)); - if (!has_mem_bar) { - if (ena_dev->tx_mem_queue_type == ENA_ADMIN_PLACEMENT_POLICY_DEV) { - dev_err(&pdev->dev, - "ENA device does not expose LLQ bar. 
Fallback to host mode policy.\n"); - ena_dev->tx_mem_queue_type = ENA_ADMIN_PLACEMENT_POLICY_HOST; - } - + if (!has_mem_bar) return 0; - } ena_dev->mem_bar = devm_ioremap_wc(&pdev->dev, pci_resource_start(pdev, ENA_MEM_BAR), @@ -3447,10 +3570,11 @@ static int ena_map_llq_mem_bar(struct pci_dev *pdev, struct ena_com_dev *ena_dev return 0; } -static int ena_device_init(struct ena_com_dev *ena_dev, struct pci_dev *pdev, +static int ena_device_init(struct ena_adapter *adapter, struct pci_dev *pdev, struct ena_com_dev_get_features_ctx *get_feat_ctx, bool *wd_state) { + struct ena_com_dev *ena_dev = adapter->ena_dev; struct ena_llq_configurations llq_config; struct device *dev = &pdev->dev; bool readless_supported; @@ -3535,7 +3659,7 @@ static int ena_device_init(struct ena_com_dev *ena_dev, struct pci_dev *pdev, *wd_state = !!(aenq_groups & BIT(ENA_ADMIN_KEEP_ALIVE)); - set_default_llq_configurations(&llq_config); + set_default_llq_configurations(adapter, &llq_config, &get_feat_ctx->llq); rc = ena_set_queues_placement_policy(pdev, ena_dev, &get_feat_ctx->llq, &llq_config); @@ -3544,6 +3668,8 @@ static int ena_device_init(struct ena_com_dev *ena_dev, struct pci_dev *pdev, goto err_admin_init; } + ena_calc_io_queue_size(adapter, get_feat_ctx); + return 0; err_admin_init: @@ -3638,17 +3764,25 @@ static int ena_restore_device(struct ena_adapter *adapter) struct ena_com_dev_get_features_ctx get_feat_ctx; struct ena_com_dev *ena_dev = adapter->ena_dev; struct pci_dev *pdev = adapter->pdev; + struct ena_ring *txr; + int rc, count, i; bool wd_state; - int rc; set_bit(ENA_FLAG_ONGOING_RESET, &adapter->flags); - rc = ena_device_init(ena_dev, adapter->pdev, &get_feat_ctx, &wd_state); + rc = ena_device_init(adapter, adapter->pdev, &get_feat_ctx, &wd_state); if (rc) { dev_err(&pdev->dev, "Can not initialize device\n"); goto err; } adapter->wd_state = wd_state; + count = adapter->xdp_num_queues + adapter->num_io_queues; + for (i = 0 ; i < count; i++) { + txr = &adapter->tx_ring[i]; + 
txr->tx_mem_queue_type = ena_dev->tx_mem_queue_type; + txr->tx_max_header_size = ena_dev->tx_max_header_size; + } + rc = ena_device_validate_params(adapter, &get_feat_ctx); if (rc) { dev_err(&pdev->dev, "Validation of device parameters failed\n"); @@ -4162,72 +4296,6 @@ static void ena_release_bars(struct ena_com_dev *ena_dev, struct pci_dev *pdev) pci_release_selected_regions(pdev, release_bars); } - -static void ena_calc_io_queue_size(struct ena_adapter *adapter, - struct ena_com_dev_get_features_ctx *get_feat_ctx) -{ - struct ena_admin_feature_llq_desc *llq = &get_feat_ctx->llq; - struct ena_com_dev *ena_dev = adapter->ena_dev; - u32 tx_queue_size = ENA_DEFAULT_RING_SIZE; - u32 rx_queue_size = ENA_DEFAULT_RING_SIZE; - u32 max_tx_queue_size; - u32 max_rx_queue_size; - - if (ena_dev->supported_features & BIT(ENA_ADMIN_MAX_QUEUES_EXT)) { - struct ena_admin_queue_ext_feature_fields *max_queue_ext = - &get_feat_ctx->max_queue_ext.max_queue_ext; - max_rx_queue_size = min_t(u32, max_queue_ext->max_rx_cq_depth, - max_queue_ext->max_rx_sq_depth); - max_tx_queue_size = max_queue_ext->max_tx_cq_depth; - - if (ena_dev->tx_mem_queue_type == ENA_ADMIN_PLACEMENT_POLICY_DEV) - max_tx_queue_size = min_t(u32, max_tx_queue_size, - llq->max_llq_depth); - else - max_tx_queue_size = min_t(u32, max_tx_queue_size, - max_queue_ext->max_tx_sq_depth); - - adapter->max_tx_sgl_size = min_t(u16, ENA_PKT_MAX_BUFS, - max_queue_ext->max_per_packet_tx_descs); - adapter->max_rx_sgl_size = min_t(u16, ENA_PKT_MAX_BUFS, - max_queue_ext->max_per_packet_rx_descs); - } else { - struct ena_admin_queue_feature_desc *max_queues = - &get_feat_ctx->max_queues; - max_rx_queue_size = min_t(u32, max_queues->max_cq_depth, - max_queues->max_sq_depth); - max_tx_queue_size = max_queues->max_cq_depth; - - if (ena_dev->tx_mem_queue_type == ENA_ADMIN_PLACEMENT_POLICY_DEV) - max_tx_queue_size = min_t(u32, max_tx_queue_size, - llq->max_llq_depth); - else - max_tx_queue_size = min_t(u32, max_tx_queue_size, - 
max_queues->max_sq_depth); - - adapter->max_tx_sgl_size = min_t(u16, ENA_PKT_MAX_BUFS, - max_queues->max_packet_tx_descs); - adapter->max_rx_sgl_size = min_t(u16, ENA_PKT_MAX_BUFS, - max_queues->max_packet_rx_descs); - } - - max_tx_queue_size = rounddown_pow_of_two(max_tx_queue_size); - max_rx_queue_size = rounddown_pow_of_two(max_rx_queue_size); - - tx_queue_size = clamp_val(tx_queue_size, ENA_MIN_RING_SIZE, - max_tx_queue_size); - rx_queue_size = clamp_val(rx_queue_size, ENA_MIN_RING_SIZE, - max_rx_queue_size); - - tx_queue_size = rounddown_pow_of_two(tx_queue_size); - rx_queue_size = rounddown_pow_of_two(rx_queue_size); - - adapter->max_tx_ring_size = max_tx_queue_size; - adapter->max_rx_ring_size = max_rx_queue_size; - adapter->requested_tx_ring_size = tx_queue_size; - adapter->requested_rx_ring_size = rx_queue_size; -} - /* ena_probe - Device Initialization Routine * @pdev: PCI device information struct * @ent: entry in ena_pci_tbl @@ -4310,7 +4378,13 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent) pci_set_drvdata(pdev, adapter); - rc = ena_device_init(ena_dev, pdev, &get_feat_ctx, &wd_state); + rc = ena_map_llq_mem_bar(pdev, ena_dev, bars); + if (rc) { + dev_err(&pdev->dev, "ENA LLQ bar mapping failed\n"); + goto err_netdev_destroy; + } + + rc = ena_device_init(adapter, pdev, &get_feat_ctx, &wd_state); if (rc) { dev_err(&pdev->dev, "ENA device init failed\n"); if (rc == -ETIME) @@ -4318,12 +4392,6 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent) goto err_netdev_destroy; } - rc = ena_map_llq_mem_bar(pdev, ena_dev, bars); - if (rc) { - dev_err(&pdev->dev, "ENA llq bar mapping failed\n"); - goto err_device_destroy; - } - /* Initial TX and RX interrupt delay. Assumes 1 usec granularity. 
* Updated during device initialization with the real granularity */ @@ -4331,7 +4399,6 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent) ena_dev->intr_moder_rx_interval = ENA_INTR_INITIAL_RX_INTERVAL_USECS; ena_dev->intr_delay_resolution = ENA_DEFAULT_INTR_DELAY_RESOLUTION; max_num_io_queues = ena_calc_max_io_queue_num(pdev, ena_dev, &get_feat_ctx); - ena_calc_io_queue_size(adapter, &get_feat_ctx); if (unlikely(!max_num_io_queues)) { rc = -EFAULT; goto err_device_destroy; @@ -4364,6 +4431,7 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent) "Failed to query interrupt moderation feature\n"); goto err_device_destroy; } + ena_init_io_rings(adapter, 0, adapter->xdp_num_queues + @@ -4488,6 +4556,7 @@ static void __ena_shutoff(struct pci_dev *pdev, bool shutdown) rtnl_lock(); /* lock released inside the below if-else block */ adapter->reset_reason = ENA_REGS_RESET_SHUTDOWN; ena_destroy_device(adapter, true); + if (shutdown) { netif_device_detach(netdev); dev_close(netdev); diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.h b/drivers/net/ethernet/amazon/ena/ena_netdev.h index 2cb141079474..5a0d4ee76172 100644 --- a/drivers/net/ethernet/amazon/ena/ena_netdev.h +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.h @@ -334,6 +334,14 @@ struct ena_adapter { u32 msg_enable; + /* large_llq_header_enabled is used for two purposes: + * 1. Indicates that large LLQ has been requested. + * 2. Indicates whether large LLQ is set or not after device + * initialization / configuration. 
+	 */
+	bool large_llq_header_enabled;
+	bool large_llq_header_supported;
+
 	u16 max_tx_sgl_size;
 	u16 max_rx_sgl_size;
 
@@ -388,9 +396,10 @@ void ena_dump_stats_to_buf(struct ena_adapter *adapter, u8 *buf);
 
 int ena_update_hw_stats(struct ena_adapter *adapter);
 
-int ena_update_queue_sizes(struct ena_adapter *adapter,
-			   u32 new_tx_size,
-			   u32 new_rx_size);
+int ena_update_queue_params(struct ena_adapter *adapter,
+			    u32 new_tx_size,
+			    u32 new_rx_size,
+			    u32 new_llq_header_len);
 
 int ena_update_queue_count(struct ena_adapter *adapter, u32 new_channel_count);
 
diff --git a/drivers/net/ethernet/amd/Kconfig b/drivers/net/ethernet/amd/Kconfig
index ab42f75b9413..235fcacef5c5 100644
--- a/drivers/net/ethernet/amd/Kconfig
+++ b/drivers/net/ethernet/amd/Kconfig
@@ -186,4 +186,16 @@ config AMD_XGBE_HAVE_ECC
 	bool
 	default n
 
+config PDS_CORE
+	tristate "AMD/Pensando Data Systems Core Device Support"
+	depends on 64BIT && PCI
+	help
+	  This enables the support for the AMD/Pensando Core device family of
+	  adapters. More specific information on this driver can be
+	  found in
+	  <file:Documentation/networking/device_drivers/ethernet/amd/pds_core.rst>.
+
+	  To compile this driver as a module, choose M here. The module
+	  will be called pds_core.
+
 endif # NET_VENDOR_AMD
diff --git a/drivers/net/ethernet/amd/Makefile b/drivers/net/ethernet/amd/Makefile
index 42742afe9115..2dcfb84731e1 100644
--- a/drivers/net/ethernet/amd/Makefile
+++ b/drivers/net/ethernet/amd/Makefile
@@ -17,3 +17,4 @@ obj-$(CONFIG_PCNET32) += pcnet32.o
 obj-$(CONFIG_SUN3LANCE) += sun3lance.o
 obj-$(CONFIG_SUNLANCE) += sunlance.o
 obj-$(CONFIG_AMD_XGBE) += xgbe/
+obj-$(CONFIG_PDS_CORE) += pds_core/
diff --git a/drivers/net/ethernet/amd/nmclan_cs.c b/drivers/net/ethernet/amd/nmclan_cs.c
index 823a329a921f..0dd391c84c13 100644
--- a/drivers/net/ethernet/amd/nmclan_cs.c
+++ b/drivers/net/ethernet/amd/nmclan_cs.c
@@ -651,7 +651,7 @@ static int nmclan_config(struct pcmcia_device *link)
     } else {
       pr_notice("mace id not found: %x %x should be 0x40 0x?9\n",
 		sig[0], sig[1]);
-      return -ENODEV;
+      goto failed;
     }
   }
 
diff --git a/drivers/net/ethernet/amd/pds_core/Makefile b/drivers/net/ethernet/amd/pds_core/Makefile
new file mode 100644
index 000000000000..0abc33ce826c
--- /dev/null
+++ b/drivers/net/ethernet/amd/pds_core/Makefile
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: GPL-2.0
+# Copyright (c) 2023 Advanced Micro Devices, Inc.
+
+obj-$(CONFIG_PDS_CORE) := pds_core.o
+
+pds_core-y := main.o \
+	      devlink.o \
+	      auxbus.o \
+	      dev.o \
+	      adminq.o \
+	      core.o \
+	      fw.o
+
+pds_core-$(CONFIG_DEBUG_FS) += debugfs.o
diff --git a/drivers/net/ethernet/amd/pds_core/adminq.c b/drivers/net/ethernet/amd/pds_core/adminq.c
new file mode 100644
index 000000000000..045fe133f6ee
--- /dev/null
+++ b/drivers/net/ethernet/amd/pds_core/adminq.c
@@ -0,0 +1,290 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2023 Advanced Micro Devices, Inc */
+
+#include <linux/dynamic_debug.h>
+
+#include "core.h"
+
+struct pdsc_wait_context {
+	struct pdsc_qcq *qcq;
+	struct completion wait_completion;
+};
+
+static int pdsc_process_notifyq(struct pdsc_qcq *qcq)
+{
+	union pds_core_notifyq_comp *comp;
+	struct pdsc *pdsc = qcq->pdsc;
+	struct pdsc_cq *cq = &qcq->cq;
+	struct pdsc_cq_info *cq_info;
+	int nq_work = 0;
+	u64 eid;
+
+	cq_info = &cq->info[cq->tail_idx];
+	comp = cq_info->comp;
+	eid = le64_to_cpu(comp->event.eid);
+	while (eid > pdsc->last_eid) {
+		u16 ecode = le16_to_cpu(comp->event.ecode);
+
+		switch (ecode) {
+		case PDS_EVENT_LINK_CHANGE:
+			dev_info(pdsc->dev, "NotifyQ LINK_CHANGE ecode %d eid %lld\n",
+				 ecode, eid);
+			pdsc_notify(PDS_EVENT_LINK_CHANGE, comp);
+			break;
+
+		case PDS_EVENT_RESET:
+			dev_info(pdsc->dev, "NotifyQ RESET ecode %d eid %lld\n",
+				 ecode, eid);
+			pdsc_notify(PDS_EVENT_RESET, comp);
+			break;
+
+		case PDS_EVENT_XCVR:
+			dev_info(pdsc->dev, "NotifyQ XCVR ecode %d eid %lld\n",
+				 ecode, eid);
+			break;
+
+		default:
+			dev_info(pdsc->dev, "NotifyQ ecode %d eid %lld\n",
+				 ecode, eid);
+			break;
+		}
+
+		pdsc->last_eid = eid;
+		cq->tail_idx = (cq->tail_idx + 1) & (cq->num_descs - 1);
+		cq_info = &cq->info[cq->tail_idx];
+		comp = cq_info->comp;
+		eid = le64_to_cpu(comp->event.eid);
+
+		nq_work++;
+	}
+
+	qcq->accum_work += nq_work;
+
+	return nq_work;
+}
+
+void pdsc_process_adminq(struct pdsc_qcq *qcq)
+{
+	union pds_core_adminq_comp *comp;
+	struct pdsc_queue *q = &qcq->q;
+	struct pdsc *pdsc = qcq->pdsc;
+	struct pdsc_cq *cq = &qcq->cq;
+	struct pdsc_q_info *q_info;
+	unsigned long irqflags;
+	int nq_work = 0;
+	int aq_work = 0;
+	int credits;
+
+	/* Don't process AdminQ when shutting down */
+	if (pdsc->state & BIT_ULL(PDSC_S_STOPPING_DRIVER)) {
+		dev_err(pdsc->dev, "%s: called while PDSC_S_STOPPING_DRIVER\n",
+			__func__);
+		return;
+	}
+
+	/* Check for NotifyQ event */
+	nq_work = pdsc_process_notifyq(&pdsc->notifyqcq);
+
+	/* Check for empty queue, which can happen if the interrupt was
+	 * for a NotifyQ event and there are no new AdminQ completions.
+	 */
+	if (q->tail_idx == q->head_idx)
+		goto credits;
+
+	/* Find the first completion to clean,
+	 * run the callback in the related q_info,
+	 * and continue while we still match done color
+	 */
+	spin_lock_irqsave(&pdsc->adminq_lock, irqflags);
+	comp = cq->info[cq->tail_idx].comp;
+	while (pdsc_color_match(comp->color, cq->done_color)) {
+		q_info = &q->info[q->tail_idx];
+		q->tail_idx = (q->tail_idx + 1) & (q->num_descs - 1);
+
+		/* Copy out the completion data */
+		memcpy(q_info->dest, comp, sizeof(*comp));
+
+		complete_all(&q_info->wc->wait_completion);
+
+		if (cq->tail_idx == cq->num_descs - 1)
+			cq->done_color = !cq->done_color;
+		cq->tail_idx = (cq->tail_idx + 1) & (cq->num_descs - 1);
+		comp = cq->info[cq->tail_idx].comp;
+
+		aq_work++;
+	}
+	spin_unlock_irqrestore(&pdsc->adminq_lock, irqflags);
+
+	qcq->accum_work += aq_work;
+
+credits:
+	/* Return the interrupt credits, one for each completion */
+	credits = nq_work + aq_work;
+	if (credits)
+		pds_core_intr_credits(&pdsc->intr_ctrl[qcq->intx],
+				      credits,
+				      PDS_CORE_INTR_CRED_REARM);
+}
+
+void pdsc_work_thread(struct work_struct *work)
+{
+	struct pdsc_qcq *qcq = container_of(work, struct pdsc_qcq, work);
+
+	pdsc_process_adminq(qcq);
+}
+
+irqreturn_t pdsc_adminq_isr(int irq, void *data)
+{
+	struct pdsc_qcq *qcq = data;
+	struct pdsc *pdsc = qcq->pdsc;
+
+	/* Don't process AdminQ when shutting down */
+	if (pdsc->state & BIT_ULL(PDSC_S_STOPPING_DRIVER)) {
+		dev_err(pdsc->dev, "%s: called while PDSC_S_STOPPING_DRIVER\n",
+			__func__);
+		return IRQ_HANDLED;
+	}
+
+	queue_work(pdsc->wq, &qcq->work);
+	pds_core_intr_mask(&pdsc->intr_ctrl[irq], PDS_CORE_INTR_MASK_CLEAR);
+
+	return IRQ_HANDLED;
+}
+
+static int __pdsc_adminq_post(struct pdsc *pdsc,
+			      struct pdsc_qcq *qcq,
+			      union pds_core_adminq_cmd *cmd,
+			      union pds_core_adminq_comp *comp,
+			      struct pdsc_wait_context *wc)
+{
+	struct pdsc_queue *q = &qcq->q;
+	struct pdsc_q_info *q_info;
+	unsigned long irqflags;
+	unsigned int avail;
+	int index;
+	int ret;
+
+	spin_lock_irqsave(&pdsc->adminq_lock, irqflags);
+
+	/* Check for space in the queue */
+	avail = q->tail_idx;
+	if (q->head_idx >= avail)
+		avail += q->num_descs - q->head_idx - 1;
+	else
+		avail -= q->head_idx + 1;
+	if (!avail) {
+		ret = -ENOSPC;
+		goto err_out_unlock;
+	}
+
+	/* Check that the FW is running */
+	if (!pdsc_is_fw_running(pdsc)) {
+		u8 fw_status = ioread8(&pdsc->info_regs->fw_status);
+
+		dev_info(pdsc->dev, "%s: post failed - fw not running %#02x:\n",
+			 __func__, fw_status);
+		ret = -ENXIO;
+
+		goto err_out_unlock;
+	}
+
+	/* Post the request */
+	index = q->head_idx;
+	q_info = &q->info[index];
+	q_info->wc = wc;
+	q_info->dest = comp;
+	memcpy(q_info->desc, cmd, sizeof(*cmd));
+
+	dev_dbg(pdsc->dev, "head_idx %d tail_idx %d\n",
+		q->head_idx, q->tail_idx);
+	dev_dbg(pdsc->dev, "post admin queue command:\n");
+	dynamic_hex_dump("cmd ", DUMP_PREFIX_OFFSET, 16, 1,
+			 cmd, sizeof(*cmd), true);
+
+	q->head_idx = (q->head_idx + 1) & (q->num_descs - 1);
+
+	pds_core_dbell_ring(pdsc->kern_dbpage,
+			    q->hw_type, q->dbval | q->head_idx);
+	ret = index;
+
+err_out_unlock:
+	spin_unlock_irqrestore(&pdsc->adminq_lock, irqflags);
+	return ret;
+}
+
+int pdsc_adminq_post(struct pdsc *pdsc,
+		     union pds_core_adminq_cmd *cmd,
+		     union pds_core_adminq_comp *comp,
+		     bool fast_poll)
+{
+	struct pdsc_wait_context wc = {
+		.wait_completion =
+			COMPLETION_INITIALIZER_ONSTACK(wc.wait_completion),
+	};
+	unsigned long poll_interval = 1;
+	unsigned long poll_jiffies;
+	unsigned long time_limit;
+	unsigned long time_start;
+	unsigned long time_done;
+	unsigned long remaining;
+	int err = 0;
+	int index;
+
+	wc.qcq = &pdsc->adminqcq;
+	index = __pdsc_adminq_post(pdsc, &pdsc->adminqcq, cmd, comp, &wc);
+	if (index < 0) {
+		err = index;
+		goto err_out;
+	}
+
+	time_start = jiffies;
+	time_limit = time_start + HZ * pdsc->devcmd_timeout;
+	do {
+		/* Timeslice the actual wait to catch IO errors etc early */
+		poll_jiffies = msecs_to_jiffies(poll_interval);
+		remaining = wait_for_completion_timeout(&wc.wait_completion,
+							poll_jiffies);
+		if (remaining)
+			break;
+
+		if (!pdsc_is_fw_running(pdsc)) {
+			u8 fw_status = ioread8(&pdsc->info_regs->fw_status);
+
+			dev_dbg(pdsc->dev, "%s: post wait failed - fw not running %#02x:\n",
+				__func__, fw_status);
+			err = -ENXIO;
+			break;
+		}
+
+		/* When fast_poll is not requested, prevent aggressive polling
+		 * on failures due to timeouts by doing exponential back off.
+		 */
+		if (!fast_poll && poll_interval < PDSC_ADMINQ_MAX_POLL_INTERVAL)
+			poll_interval <<= 1;
+	} while (time_before(jiffies, time_limit));
+	time_done = jiffies;
+	dev_dbg(pdsc->dev, "%s: elapsed %d msecs\n",
+		__func__, jiffies_to_msecs(time_done - time_start));
+
+	/* Check the results */
+	if (time_after_eq(time_done, time_limit))
+		err = -ETIMEDOUT;
+
+	dev_dbg(pdsc->dev, "read admin queue completion idx %d:\n", index);
+	dynamic_hex_dump("comp ", DUMP_PREFIX_OFFSET, 16, 1,
+			 comp, sizeof(*comp), true);
+
+	if (remaining && comp->status)
+		err = pdsc_err_to_errno(comp->status);
+
+err_out:
+	if (err) {
+		dev_dbg(pdsc->dev, "%s: opcode %d status %d err %pe\n",
+			__func__, cmd->opcode, comp->status, ERR_PTR(err));
+		if (err == -ENXIO || err == -ETIMEDOUT)
+			queue_work(pdsc->wq, &pdsc->health_work);
+	}
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(pdsc_adminq_post);
diff --git a/drivers/net/ethernet/amd/pds_core/auxbus.c b/drivers/net/ethernet/amd/pds_core/auxbus.c
new file mode 100644
index 000000000000..561af8e5b3ea
--- /dev/null
+++ b/drivers/net/ethernet/amd/pds_core/auxbus.c
@@ -0,0 +1,264 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2023 Advanced Micro Devices, Inc */
+
+#include <linux/pci.h>
+
+#include "core.h"
+#include <linux/pds/pds_auxbus.h>
+
+/**
+ * pds_client_register - Link the client to the firmware
+ * @pf_pdev:	ptr to the PF driver struct
+ * @devname:	name that includes service into, e.g. pds_core.vDPA
+ *
+ * Return: 0 on success, or
+ *         negative for error
+ */
+int pds_client_register(struct pci_dev *pf_pdev, char *devname)
+{
+	union pds_core_adminq_comp comp = {};
+	union pds_core_adminq_cmd cmd = {};
+	struct pdsc *pf;
+	int err;
+	u16 ci;
+
+	pf = pci_get_drvdata(pf_pdev);
+	if (pf->state)
+		return -ENXIO;
+
+	cmd.client_reg.opcode = PDS_AQ_CMD_CLIENT_REG;
+	strscpy(cmd.client_reg.devname, devname,
+		sizeof(cmd.client_reg.devname));
+
+	err = pdsc_adminq_post(pf, &cmd, &comp, false);
+	if (err) {
+		dev_info(pf->dev, "register dev_name %s with DSC failed, status %d: %pe\n",
+			 devname, comp.status, ERR_PTR(err));
+		return err;
+	}
+
+	ci = le16_to_cpu(comp.client_reg.client_id);
+	if (!ci) {
+		dev_err(pf->dev, "%s: device returned null client_id\n",
+			__func__);
+		return -EIO;
+	}
+
+	dev_dbg(pf->dev, "%s: device returned client_id %d for %s\n",
+		__func__, ci, devname);
+
+	return ci;
+}
+EXPORT_SYMBOL_GPL(pds_client_register);
+
+/**
+ * pds_client_unregister - Unlink the client from the firmware
+ * @pf_pdev:	ptr to the PF driver struct
+ * @client_id:	id returned from pds_client_register()
+ *
+ * Return: 0 on success, or
+ *         negative for error
+ */
+int pds_client_unregister(struct pci_dev *pf_pdev, u16 client_id)
+{
+	union pds_core_adminq_comp comp = {};
+	union pds_core_adminq_cmd cmd = {};
+	struct pdsc *pf;
+	int err;
+
+	pf = pci_get_drvdata(pf_pdev);
+	if (pf->state)
+		return -ENXIO;
+
+	cmd.client_unreg.opcode = PDS_AQ_CMD_CLIENT_UNREG;
+	cmd.client_unreg.client_id = cpu_to_le16(client_id);
+
+	err = pdsc_adminq_post(pf, &cmd, &comp, false);
+	if (err)
+		dev_info(pf->dev, "unregister client_id %d failed, status %d: %pe\n",
+			 client_id, comp.status, ERR_PTR(err));
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(pds_client_unregister);
+
+/**
+ * pds_client_adminq_cmd - Process an adminq request for the client
+ * @padev:   ptr to the client device
+ * @req:     ptr to buffer with request
+ * @req_len: length of actual struct used for request
+ * @resp:    ptr to buffer where answer is to be copied
+ * @flags:   optional flags from pds_core_adminq_flags
+ *
+ * Return: 0 on success, or
+ *         negative for error
+ *
+ * Client sends pointers to request and response buffers
+ * Core copies request data into pds_core_client_request_cmd
+ * Core sets other fields as needed
+ * Core posts to AdminQ
+ * Core copies completion data into response buffer
+ */
+int pds_client_adminq_cmd(struct pds_auxiliary_dev *padev,
+			  union pds_core_adminq_cmd *req,
+			  size_t req_len,
+			  union pds_core_adminq_comp *resp,
+			  u64 flags)
+{
+	union pds_core_adminq_cmd cmd = {};
+	struct pci_dev *pf_pdev;
+	struct pdsc *pf;
+	size_t cp_len;
+	int err;
+
+	pf_pdev = pci_physfn(padev->vf_pdev);
+	pf = pci_get_drvdata(pf_pdev);
+
+	dev_dbg(pf->dev, "%s: %s opcode %d\n",
+		__func__, dev_name(&padev->aux_dev.dev), req->opcode);
+
+	if (pf->state)
+		return -ENXIO;
+
+	/* Wrap the client's request */
+	cmd.client_request.opcode = PDS_AQ_CMD_CLIENT_CMD;
+	cmd.client_request.client_id = cpu_to_le16(padev->client_id);
+	cp_len = min_t(size_t, req_len, sizeof(cmd.client_request.client_cmd));
+	memcpy(cmd.client_request.client_cmd, req, cp_len);
+
+	err = pdsc_adminq_post(pf, &cmd, resp,
+			       !!(flags & PDS_AQ_FLAG_FASTPOLL));
+	if (err && err != -EAGAIN)
+		dev_info(pf->dev, "client admin cmd failed: %pe\n",
+			 ERR_PTR(err));
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(pds_client_adminq_cmd);
+
+static void pdsc_auxbus_dev_release(struct device *dev)
+{
+	struct pds_auxiliary_dev *padev =
+		container_of(dev, struct pds_auxiliary_dev, aux_dev.dev);
+
+	kfree(padev);
+}
+
+static struct pds_auxiliary_dev *pdsc_auxbus_dev_register(struct pdsc *cf,
+							  struct pdsc *pf,
+							  u16 client_id,
+							  char *name)
+{
+	struct auxiliary_device *aux_dev;
+	struct pds_auxiliary_dev *padev;
+	int err;
+
+	padev = kzalloc(sizeof(*padev), GFP_KERNEL);
+	if (!padev)
+		return ERR_PTR(-ENOMEM);
+
+	padev->vf_pdev = cf->pdev;
+	padev->client_id = client_id;
+
+	aux_dev = &padev->aux_dev;
+	aux_dev->name = name;
+	aux_dev->id = cf->uid;
+	aux_dev->dev.parent = cf->dev;
+	aux_dev->dev.release = pdsc_auxbus_dev_release;
+
+	err = auxiliary_device_init(aux_dev);
+	if (err < 0) {
+		dev_warn(cf->dev, "auxiliary_device_init of %s failed: %pe\n",
+			 name, ERR_PTR(err));
+		goto err_out;
+	}
+
+	err = auxiliary_device_add(aux_dev);
+	if (err) {
+		dev_warn(cf->dev, "auxiliary_device_add of %s failed: %pe\n",
+			 name, ERR_PTR(err));
+		goto err_out_uninit;
+	}
+
+	return padev;
+
+err_out_uninit:
+	auxiliary_device_uninit(aux_dev);
+err_out:
+	kfree(padev);
+	return ERR_PTR(err);
+}
+
+int pdsc_auxbus_dev_del(struct pdsc *cf, struct pdsc *pf)
+{
+	struct pds_auxiliary_dev *padev;
+	int err = 0;
+
+	mutex_lock(&pf->config_lock);
+
+	padev = pf->vfs[cf->vf_id].padev;
+	if (padev) {
+		pds_client_unregister(pf->pdev, padev->client_id);
+		auxiliary_device_delete(&padev->aux_dev);
+		auxiliary_device_uninit(&padev->aux_dev);
+		padev->client_id = 0;
+	}
+	pf->vfs[cf->vf_id].padev = NULL;
+
+	mutex_unlock(&pf->config_lock);
+	return err;
+}
+
+int pdsc_auxbus_dev_add(struct pdsc *cf, struct pdsc *pf)
+{
+	struct pds_auxiliary_dev *padev;
+	enum pds_core_vif_types vt;
+	char devname[PDS_DEVNAME_LEN];
+	u16 vt_support;
+	int client_id;
+	int err = 0;
+
+	mutex_lock(&pf->config_lock);
+
+	/* We only support vDPA so far, so it is the only one to
+	 * be verified that it is available in the Core device and
+	 * enabled in the devlink param.  In the future this might
+	 * become a loop for several VIF types.
+	 */
+
+	/* Verify that the type is supported and enabled.  It is not
+	 * an error if there is no auxbus device support for this
+	 * VF, it just means something else needs to happen with it.
+	 */
+	vt = PDS_DEV_TYPE_VDPA;
+	vt_support = !!le16_to_cpu(pf->dev_ident.vif_types[vt]);
+	if (!(vt_support &&
+	      pf->viftype_status[vt].supported &&
+	      pf->viftype_status[vt].enabled))
+		goto out_unlock;
+
+	/* Need to register with FW and get the client_id before
+	 * creating the aux device so that the aux client can run
+	 * adminq commands as part its probe
+	 */
+	snprintf(devname, sizeof(devname), "%s.%s.%d",
+		 PDS_CORE_DRV_NAME, pf->viftype_status[vt].name, cf->uid);
+	client_id = pds_client_register(pf->pdev, devname);
+	if (client_id < 0) {
+		err = client_id;
+		goto out_unlock;
+	}
+
+	padev = pdsc_auxbus_dev_register(cf, pf, client_id,
+					 pf->viftype_status[vt].name);
+	if (IS_ERR(padev)) {
+		pds_client_unregister(pf->pdev, client_id);
+		err = PTR_ERR(padev);
+		goto out_unlock;
+	}
+	pf->vfs[cf->vf_id].padev = padev;
+
+out_unlock:
+	mutex_unlock(&pf->config_lock);
+	return err;
+}
diff --git a/drivers/net/ethernet/amd/pds_core/core.c b/drivers/net/ethernet/amd/pds_core/core.c
new file mode 100644
index 000000000000..483a070d96fa
--- /dev/null
+++ b/drivers/net/ethernet/amd/pds_core/core.c
@@ -0,0 +1,597 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2023 Advanced Micro Devices, Inc */
+
+#include <linux/pci.h>
+#include <linux/vmalloc.h>
+
+#include "core.h"
+
+static BLOCKING_NOTIFIER_HEAD(pds_notify_chain);
+
+int pdsc_register_notify(struct notifier_block *nb)
+{
+	return blocking_notifier_chain_register(&pds_notify_chain, nb);
+}
+EXPORT_SYMBOL_GPL(pdsc_register_notify);
+
+void pdsc_unregister_notify(struct notifier_block *nb)
+{
+	blocking_notifier_chain_unregister(&pds_notify_chain, nb);
+}
+EXPORT_SYMBOL_GPL(pdsc_unregister_notify);
+
+void pdsc_notify(unsigned long event, void *data)
+{
+	blocking_notifier_call_chain(&pds_notify_chain, event, data);
+}
+
+void pdsc_intr_free(struct pdsc *pdsc, int index)
+{
+	struct pdsc_intr_info *intr_info;
+
+	if (index >= pdsc->nintrs || index < 0) {
+		WARN(true, "bad intr index %d\n", index);
+		return;
+	}
+
+	intr_info = &pdsc->intr_info[index];
+	if (!intr_info->vector)
+		return;
+	dev_dbg(pdsc->dev, "%s: idx %d vec %d name %s\n",
+		__func__, index, intr_info->vector, intr_info->name);
+
+	pds_core_intr_mask(&pdsc->intr_ctrl[index], PDS_CORE_INTR_MASK_SET);
+	pds_core_intr_clean(&pdsc->intr_ctrl[index]);
+
+	free_irq(intr_info->vector, intr_info->data);
+
+	memset(intr_info, 0, sizeof(*intr_info));
+}
+
+int pdsc_intr_alloc(struct pdsc *pdsc, char *name,
+		    irq_handler_t handler, void *data)
+{
+	struct pdsc_intr_info *intr_info;
+	unsigned int index;
+	int err;
+
+	/* Find the first available interrupt */
+	for (index = 0; index < pdsc->nintrs; index++)
+		if (!pdsc->intr_info[index].vector)
+			break;
+	if (index >= pdsc->nintrs) {
+		dev_warn(pdsc->dev, "%s: no intr, index=%d nintrs=%d\n",
+			 __func__, index, pdsc->nintrs);
+		return -ENOSPC;
+	}
+
+	pds_core_intr_clean_flags(&pdsc->intr_ctrl[index],
+				  PDS_CORE_INTR_CRED_RESET_COALESCE);
+
+	intr_info = &pdsc->intr_info[index];
+
+	intr_info->index = index;
+	intr_info->data = data;
+	strscpy(intr_info->name, name, sizeof(intr_info->name));
+
+	/* Get the OS vector number for the interrupt */
+	err = pci_irq_vector(pdsc->pdev, index);
+	if (err < 0) {
+		dev_err(pdsc->dev, "failed to get intr vector index %d: %pe\n",
+			index, ERR_PTR(err));
+		goto err_out_free_intr;
+	}
+	intr_info->vector = err;
+
+	/* Init the device's intr mask */
+	pds_core_intr_clean(&pdsc->intr_ctrl[index]);
+	pds_core_intr_mask_assert(&pdsc->intr_ctrl[index], 1);
+	pds_core_intr_mask(&pdsc->intr_ctrl[index], PDS_CORE_INTR_MASK_SET);
+
+	/* Register the isr with a name */
+	err = request_irq(intr_info->vector, handler, 0, intr_info->name, data);
+	if (err) {
+		dev_err(pdsc->dev, "failed to get intr irq vector %d: %pe\n",
+			intr_info->vector, ERR_PTR(err));
+		goto err_out_free_intr;
+	}
+
+	return index;
+
+err_out_free_intr:
+	pdsc_intr_free(pdsc, index);
+	return err;
+}
+
+static void pdsc_qcq_intr_free(struct pdsc *pdsc, struct pdsc_qcq *qcq)
+{
+	if (!(qcq->flags & PDS_CORE_QCQ_F_INTR) ||
+	    qcq->intx == PDS_CORE_INTR_INDEX_NOT_ASSIGNED)
+		return;
+
+	pdsc_intr_free(pdsc, qcq->intx);
+	qcq->intx = PDS_CORE_INTR_INDEX_NOT_ASSIGNED;
+}
+
+static int pdsc_qcq_intr_alloc(struct pdsc *pdsc, struct pdsc_qcq *qcq)
+{
+	char name[PDSC_INTR_NAME_MAX_SZ];
+	int index;
+
+	if (!(qcq->flags & PDS_CORE_QCQ_F_INTR)) {
+		qcq->intx = PDS_CORE_INTR_INDEX_NOT_ASSIGNED;
+		return 0;
+	}
+
+	snprintf(name, sizeof(name), "%s-%d-%s",
+		 PDS_CORE_DRV_NAME, pdsc->pdev->bus->number, qcq->q.name);
+	index = pdsc_intr_alloc(pdsc, name, pdsc_adminq_isr, qcq);
+	if (index < 0)
+		return index;
+	qcq->intx = index;
+
+	return 0;
+}
+
+void pdsc_qcq_free(struct pdsc *pdsc, struct pdsc_qcq *qcq)
+{
+	struct device *dev = pdsc->dev;
+
+	if (!(qcq && qcq->pdsc))
+		return;
+
+	pdsc_debugfs_del_qcq(qcq);
+
+	pdsc_qcq_intr_free(pdsc, qcq);
+
+	if (qcq->q_base)
+		dma_free_coherent(dev, qcq->q_size,
+				  qcq->q_base, qcq->q_base_pa);
+
+	if (qcq->cq_base)
+		dma_free_coherent(dev, qcq->cq_size,
+				  qcq->cq_base, qcq->cq_base_pa);
+
+	if (qcq->cq.info)
+		vfree(qcq->cq.info);
+
+	if (qcq->q.info)
+		vfree(qcq->q.info);
+
+	memset(qcq, 0, sizeof(*qcq));
+}
+
+static void pdsc_q_map(struct pdsc_queue *q, void *base, dma_addr_t base_pa)
+{
+	struct pdsc_q_info *cur;
+	unsigned int i;
+
+	q->base = base;
+	q->base_pa = base_pa;
+
+	for (i = 0, cur = q->info; i < q->num_descs; i++, cur++)
+		cur->desc = base + (i * q->desc_size);
+}
+
+static void pdsc_cq_map(struct pdsc_cq *cq, void *base, dma_addr_t base_pa)
+{
+	struct pdsc_cq_info *cur;
+	unsigned int i;
+
+	cq->base = base;
+	cq->base_pa = base_pa;
+
+	for (i = 0, cur = cq->info; i < cq->num_descs; i++, cur++)
+		cur->comp = base + (i * cq->desc_size);
+}
+
+int pdsc_qcq_alloc(struct pdsc *pdsc, unsigned int type, unsigned int index,
+		   const char *name, unsigned int flags, unsigned int num_descs,
+		   unsigned int desc_size, unsigned int cq_desc_size,
+		   unsigned int pid, struct pdsc_qcq *qcq)
+{
+	struct device *dev = pdsc->dev;
+	void *q_base, *cq_base;
+	dma_addr_t cq_base_pa;
+	dma_addr_t q_base_pa;
+	int err;
+
+	qcq->q.info = vzalloc(num_descs * sizeof(*qcq->q.info));
+	if (!qcq->q.info) {
+		err = -ENOMEM;
+		goto err_out;
+	}
+
+	qcq->pdsc = pdsc;
+	qcq->flags = flags;
+	INIT_WORK(&qcq->work, pdsc_work_thread);
+
+	qcq->q.type = type;
+	qcq->q.index = index;
+	qcq->q.num_descs = num_descs;
+	qcq->q.desc_size = desc_size;
+	qcq->q.tail_idx = 0;
+	qcq->q.head_idx = 0;
+	qcq->q.pid = pid;
+	snprintf(qcq->q.name, sizeof(qcq->q.name), "%s%u", name, index);
+
+	err = pdsc_qcq_intr_alloc(pdsc, qcq);
+	if (err)
+		goto err_out_free_q_info;
+
+	qcq->cq.info = vzalloc(num_descs * sizeof(*qcq->cq.info));
+	if (!qcq->cq.info) {
+		err = -ENOMEM;
+		goto err_out_free_irq;
+	}
+
+	qcq->cq.bound_intr = &pdsc->intr_info[qcq->intx];
+	qcq->cq.num_descs = num_descs;
+	qcq->cq.desc_size = cq_desc_size;
+	qcq->cq.tail_idx = 0;
+	qcq->cq.done_color = 1;
+
+	if (flags & PDS_CORE_QCQ_F_NOTIFYQ) {
+		/* q & cq need to be contiguous in case of notifyq */
+		qcq->q_size = PDS_PAGE_SIZE +
+			      ALIGN(num_descs * desc_size, PDS_PAGE_SIZE) +
+			      ALIGN(num_descs * cq_desc_size, PDS_PAGE_SIZE);
+		qcq->q_base = dma_alloc_coherent(dev,
+						 qcq->q_size + qcq->cq_size,
+						 &qcq->q_base_pa,
+						 GFP_KERNEL);
+		if (!qcq->q_base) {
+			err = -ENOMEM;
+			goto err_out_free_cq_info;
+		}
+		q_base = PTR_ALIGN(qcq->q_base, PDS_PAGE_SIZE);
+		q_base_pa = ALIGN(qcq->q_base_pa, PDS_PAGE_SIZE);
+		pdsc_q_map(&qcq->q, q_base, q_base_pa);
+
+		cq_base = PTR_ALIGN(q_base +
+				    ALIGN(num_descs * desc_size, PDS_PAGE_SIZE),
+				    PDS_PAGE_SIZE);
+		cq_base_pa = ALIGN(qcq->q_base_pa +
+				   ALIGN(num_descs * desc_size, PDS_PAGE_SIZE),
+				   PDS_PAGE_SIZE);
+
+	} else {
+		/* q DMA descriptors */
+		qcq->q_size = PDS_PAGE_SIZE + (num_descs * desc_size);
+		qcq->q_base = dma_alloc_coherent(dev, qcq->q_size,
+						 &qcq->q_base_pa,
+						 GFP_KERNEL);
+		if (!qcq->q_base) {
+			err = -ENOMEM;
+			goto err_out_free_cq_info;
+		}
+		q_base = PTR_ALIGN(qcq->q_base, PDS_PAGE_SIZE);
+		q_base_pa = ALIGN(qcq->q_base_pa, PDS_PAGE_SIZE);
+		pdsc_q_map(&qcq->q, q_base, q_base_pa);
+
+		/* cq DMA descriptors */
+		qcq->cq_size = PDS_PAGE_SIZE + (num_descs * cq_desc_size);
+		qcq->cq_base = dma_alloc_coherent(dev, qcq->cq_size,
+						  &qcq->cq_base_pa,
+						  GFP_KERNEL);
+		if (!qcq->cq_base) {
+			err = -ENOMEM;
+			goto err_out_free_q;
+		}
+		cq_base = PTR_ALIGN(qcq->cq_base, PDS_PAGE_SIZE);
+		cq_base_pa = ALIGN(qcq->cq_base_pa, PDS_PAGE_SIZE);
+	}
+
+	pdsc_cq_map(&qcq->cq, cq_base, cq_base_pa);
+	qcq->cq.bound_q = &qcq->q;
+
+	pdsc_debugfs_add_qcq(pdsc, qcq);
+
+	return 0;
+
+err_out_free_q:
+	dma_free_coherent(dev, qcq->q_size, qcq->q_base, qcq->q_base_pa);
+err_out_free_cq_info:
+	vfree(qcq->cq.info);
+err_out_free_irq:
+	pdsc_qcq_intr_free(pdsc, qcq);
+err_out_free_q_info:
+	vfree(qcq->q.info);
+	memset(qcq, 0, sizeof(*qcq));
+err_out:
+	dev_err(dev, "qcq alloc of %s%d failed %d\n", name, index, err);
+	return err;
+}
+
+static int pdsc_core_init(struct pdsc *pdsc)
+{
+	union pds_core_dev_comp comp = {};
+	union pds_core_dev_cmd cmd = {
+		.init.opcode = PDS_CORE_CMD_INIT,
+	};
+	struct pds_core_dev_init_data_out cido;
+	struct pds_core_dev_init_data_in cidi;
+	u32 dbid_count;
+	u32 dbpage_num;
+	size_t sz;
+	int err;
+
+	cidi.adminq_q_base = cpu_to_le64(pdsc->adminqcq.q_base_pa);
+	cidi.adminq_cq_base = cpu_to_le64(pdsc->adminqcq.cq_base_pa);
+	cidi.notifyq_cq_base = cpu_to_le64(pdsc->notifyqcq.cq.base_pa);
+	cidi.flags = cpu_to_le32(PDS_CORE_QINIT_F_IRQ | PDS_CORE_QINIT_F_ENA);
+	cidi.intr_index = cpu_to_le16(pdsc->adminqcq.intx);
+	cidi.adminq_ring_size = ilog2(pdsc->adminqcq.q.num_descs);
+	cidi.notifyq_ring_size = ilog2(pdsc->notifyqcq.q.num_descs);
+
+	mutex_lock(&pdsc->devcmd_lock);
+
+	sz = min_t(size_t, sizeof(cidi), sizeof(pdsc->cmd_regs->data));
+	memcpy_toio(&pdsc->cmd_regs->data, &cidi, sz);
+
+	err = pdsc_devcmd_locked(pdsc, &cmd, &comp, pdsc->devcmd_timeout);
+	if (!err) {
+		sz = min_t(size_t, sizeof(cido), sizeof(pdsc->cmd_regs->data));
+		memcpy_fromio(&cido, &pdsc->cmd_regs->data, sz);
+	}
+
+	mutex_unlock(&pdsc->devcmd_lock);
+	if (err) {
+		dev_err(pdsc->dev, "Device init command failed: %pe\n",
+			ERR_PTR(err));
+		return err;
+	}
+
+	pdsc->hw_index = le32_to_cpu(cido.core_hw_index);
+
+	dbid_count = le32_to_cpu(pdsc->dev_ident.ndbpgs_per_lif);
+	dbpage_num = pdsc->hw_index * dbid_count;
+	pdsc->kern_dbpage = pdsc_map_dbpage(pdsc, dbpage_num);
+	if (!pdsc->kern_dbpage) {
+		dev_err(pdsc->dev, "Cannot map dbpage, aborting\n");
+		return -ENOMEM;
+	}
+
+	pdsc->adminqcq.q.hw_type = cido.adminq_hw_type;
+	pdsc->adminqcq.q.hw_index = le32_to_cpu(cido.adminq_hw_index);
+	pdsc->adminqcq.q.dbval = PDS_CORE_DBELL_QID(pdsc->adminqcq.q.hw_index);
+
+	pdsc->notifyqcq.q.hw_type = cido.notifyq_hw_type;
+	pdsc->notifyqcq.q.hw_index = le32_to_cpu(cido.notifyq_hw_index);
+	pdsc->notifyqcq.q.dbval = PDS_CORE_DBELL_QID(pdsc->notifyqcq.q.hw_index);
+
+	pdsc->last_eid = 0;
+
+	return err;
+}
+
+static struct pdsc_viftype pdsc_viftype_defaults[] = {
+	[PDS_DEV_TYPE_VDPA] = { .name = PDS_DEV_TYPE_VDPA_STR,
+				.vif_id = PDS_DEV_TYPE_VDPA,
+				.dl_id = DEVLINK_PARAM_GENERIC_ID_ENABLE_VNET },
+	[PDS_DEV_TYPE_MAX] = {}
+};
+
+static int pdsc_viftypes_init(struct pdsc *pdsc)
+{
+	enum pds_core_vif_types vt;
+
+	pdsc->viftype_status = kzalloc(sizeof(pdsc_viftype_defaults),
+				       GFP_KERNEL);
+	if (!pdsc->viftype_status)
+		return -ENOMEM;
+
+	for (vt = 0; vt < PDS_DEV_TYPE_MAX; vt++) {
+		bool vt_support;
+
+		if (!pdsc_viftype_defaults[vt].name)
+			continue;
+
+		/* Grab the defaults */
+		pdsc->viftype_status[vt] = pdsc_viftype_defaults[vt];
+
+		/* See what the Core device has for support */
+		vt_support = !!le16_to_cpu(pdsc->dev_ident.vif_types[vt]);
+		dev_dbg(pdsc->dev, "VIF %s is %ssupported\n",
+			pdsc->viftype_status[vt].name,
+			vt_support ? "" : "not ");
+
+		pdsc->viftype_status[vt].supported = vt_support;
+	}
+
+	return 0;
+}
+
+int pdsc_setup(struct pdsc *pdsc, bool init)
+{
+	int numdescs;
+	int err;
+
+	if (init)
+		err = pdsc_dev_init(pdsc);
+	else
+		err = pdsc_dev_reinit(pdsc);
+	if (err)
+		return err;
+
+	/* Scale the descriptor ring length based on number of CPUs and VFs */
+	numdescs = max_t(int, PDSC_ADMINQ_MIN_LENGTH, num_online_cpus());
+	numdescs += 2 * pci_sriov_get_totalvfs(pdsc->pdev);
+	numdescs = roundup_pow_of_two(numdescs);
+	err = pdsc_qcq_alloc(pdsc, PDS_CORE_QTYPE_ADMINQ, 0, "adminq",
+			     PDS_CORE_QCQ_F_CORE | PDS_CORE_QCQ_F_INTR,
+			     numdescs,
+			     sizeof(union pds_core_adminq_cmd),
+			     sizeof(union pds_core_adminq_comp),
+			     0, &pdsc->adminqcq);
+	if (err)
+		goto err_out_teardown;
+
+	err = pdsc_qcq_alloc(pdsc, PDS_CORE_QTYPE_NOTIFYQ, 0, "notifyq",
+			     PDS_CORE_QCQ_F_NOTIFYQ,
+			     PDSC_NOTIFYQ_LENGTH,
+			     sizeof(struct pds_core_notifyq_cmd),
+			     sizeof(union pds_core_notifyq_comp),
+			     0, &pdsc->notifyqcq);
+	if (err)
+		goto err_out_teardown;
+
+	/* NotifyQ rides on the AdminQ interrupt */
+	pdsc->notifyqcq.intx = pdsc->adminqcq.intx;
+
+	/* Set up the Core with the AdminQ and NotifyQ info */
+	err = pdsc_core_init(pdsc);
+	if (err)
+		goto err_out_teardown;
+
+	/* Set up the VIFs */
+	err = pdsc_viftypes_init(pdsc);
+	if (err)
+		goto err_out_teardown;
+
+	if (init)
+		pdsc_debugfs_add_viftype(pdsc);
+
+	clear_bit(PDSC_S_FW_DEAD, &pdsc->state);
+	return 0;
+
+err_out_teardown:
+	pdsc_teardown(pdsc, init);
+	return err;
+}
+
+void pdsc_teardown(struct pdsc *pdsc, bool removing)
+{
+	int i;
+
+	pdsc_devcmd_reset(pdsc);
+	pdsc_qcq_free(pdsc, &pdsc->notifyqcq);
+	pdsc_qcq_free(pdsc, &pdsc->adminqcq);
+
+	kfree(pdsc->viftype_status);
+	pdsc->viftype_status = NULL;
+
+	if (pdsc->intr_info) {
+		for (i = 0; i < pdsc->nintrs; i++)
+			pdsc_intr_free(pdsc, i);
+
+		if (removing) {
+			kfree(pdsc->intr_info);
+			pdsc->intr_info = NULL;
+		}
+	}
+
+	if (pdsc->kern_dbpage) {
+		iounmap(pdsc->kern_dbpage);
+		pdsc->kern_dbpage = NULL;
+	}
+
+	set_bit(PDSC_S_FW_DEAD, &pdsc->state);
+}
+
+int pdsc_start(struct pdsc *pdsc)
+{
+	pds_core_intr_mask(&pdsc->intr_ctrl[pdsc->adminqcq.intx],
+			   PDS_CORE_INTR_MASK_CLEAR);
+
+	return 0;
+}
+
+void pdsc_stop(struct pdsc *pdsc)
+{
+	int i;
+
+	if (!pdsc->intr_info)
+		return;
+
+	/* Mask interrupts that are in use */
+	for (i = 0; i < pdsc->nintrs; i++)
+		if (pdsc->intr_info[i].vector)
+			pds_core_intr_mask(&pdsc->intr_ctrl[i],
+					   PDS_CORE_INTR_MASK_SET);
+}
+
+static void pdsc_fw_down(struct pdsc *pdsc)
+{
+	union pds_core_notifyq_comp reset_event = {
+		.reset.ecode = cpu_to_le16(PDS_EVENT_RESET),
+		.reset.state = 0,
+	};
+
+	if (test_and_set_bit(PDSC_S_FW_DEAD, &pdsc->state)) {
+		dev_err(pdsc->dev, "%s: already happening\n", __func__);
+		return;
+	}
+
+	/* Notify clients of fw_down */
+	devlink_health_report(pdsc->fw_reporter, "FW down reported", pdsc);
+	pdsc_notify(PDS_EVENT_RESET, &reset_event);
+
+	pdsc_stop(pdsc);
+	pdsc_teardown(pdsc, PDSC_TEARDOWN_RECOVERY);
+}
+
+static void pdsc_fw_up(struct pdsc *pdsc)
+{
+	union pds_core_notifyq_comp reset_event = {
+		.reset.ecode = cpu_to_le16(PDS_EVENT_RESET),
+		.reset.state = 1,
+	};
+	int err;
+
+	if (!test_bit(PDSC_S_FW_DEAD, &pdsc->state)) {
+		dev_err(pdsc->dev, "%s: fw not dead\n", __func__);
+		return;
+	}
+
+	err = pdsc_setup(pdsc, PDSC_SETUP_RECOVERY);
+	if (err)
+		goto err_out;
+
+	err = pdsc_start(pdsc);
+	if (err)
+		goto err_out;
+
+	/* Notify clients of fw_up */
+	pdsc->fw_recoveries++;
+	devlink_health_reporter_state_update(pdsc->fw_reporter,
+					     DEVLINK_HEALTH_REPORTER_STATE_HEALTHY);
+	pdsc_notify(PDS_EVENT_RESET, &reset_event);
+
+	return;
+
+err_out:
+	pdsc_teardown(pdsc, PDSC_TEARDOWN_RECOVERY);
+}
+
+void pdsc_health_thread(struct work_struct *work)
+{
+	struct pdsc *pdsc = container_of(work, struct pdsc, health_work);
+	unsigned long mask;
+	bool healthy;
+
+	mutex_lock(&pdsc->config_lock);
+
+	/* Don't do a check when in a transition state */
+	mask = BIT_ULL(PDSC_S_INITING_DRIVER) |
+	       BIT_ULL(PDSC_S_STOPPING_DRIVER);
+	if (pdsc->state & mask)
+		goto out_unlock;
+
+	healthy = pdsc_is_fw_good(pdsc);
+	dev_dbg(pdsc->dev, "%s: health %d fw_status %#02x fw_heartbeat %d\n",
+		__func__, healthy, pdsc->fw_status, pdsc->last_hb);
+
+	if (test_bit(PDSC_S_FW_DEAD, &pdsc->state)) {
+		if (healthy)
+			pdsc_fw_up(pdsc);
+	} else {
+		if (!healthy)
+			pdsc_fw_down(pdsc);
+	}
+
+	pdsc->fw_generation = pdsc->fw_status & PDS_CORE_FW_STS_F_GENERATION;
+
+out_unlock:
+	mutex_unlock(&pdsc->config_lock);
+}
diff --git a/drivers/net/ethernet/amd/pds_core/core.h b/drivers/net/ethernet/amd/pds_core/core.h
new file mode 100644
index 000000000000..e545fafc4819
--- /dev/null
+++ b/drivers/net/ethernet/amd/pds_core/core.h
@@ -0,0 +1,312 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2023 Advanced Micro Devices, Inc */
+
+#ifndef _PDSC_H_
+#define _PDSC_H_
+
+#include <linux/debugfs.h>
+#include <net/devlink.h>
+
+#include <linux/pds/pds_common.h>
+#include <linux/pds/pds_core_if.h>
+#include <linux/pds/pds_adminq.h>
+#include <linux/pds/pds_intr.h>
+
+#define PDSC_DRV_DESCRIPTION	"AMD/Pensando Core Driver"
+
+#define PDSC_WATCHDOG_SECS	5
+#define PDSC_QUEUE_NAME_MAX_SZ	32
+#define PDSC_ADMINQ_MIN_LENGTH	16	/* must be a power of two */
+#define PDSC_NOTIFYQ_LENGTH	64	/* must be a power of two */
+#define PDSC_TEARDOWN_RECOVERY	false
+#define PDSC_TEARDOWN_REMOVING	true
+#define PDSC_SETUP_RECOVERY	false
+#define PDSC_SETUP_INIT		true
+
+struct pdsc_dev_bar {
+	void __iomem *vaddr;
+	phys_addr_t bus_addr;
+	unsigned long len;
+	int res_index;
+};
+
+struct pdsc;
+
+struct pdsc_vf {
+	struct pds_auxiliary_dev *padev;
+	struct pdsc *vf;
+	u16 index;
+	__le16 vif_types[PDS_DEV_TYPE_MAX];
+};
+
+struct pdsc_devinfo {
+	u8 asic_type;
+	u8 asic_rev;
+	char fw_version[PDS_CORE_DEVINFO_FWVERS_BUFLEN + 1];
+	char serial_num[PDS_CORE_DEVINFO_SERIAL_BUFLEN + 1];
+};
+
+struct pdsc_queue {
+	struct pdsc_q_info *info;
+	u64 dbval;
+	u16 head_idx;
+ u16 tail_idx; + u8 hw_type; + unsigned int index; + unsigned int num_descs; + u64 dbell_count; + u64 features; + unsigned int type; + unsigned int hw_index; + union { + void *base; + struct pds_core_admin_cmd *adminq; + }; + dma_addr_t base_pa; /* must be page aligned */ + unsigned int desc_size; + unsigned int pid; + char name[PDSC_QUEUE_NAME_MAX_SZ]; +}; + +#define PDSC_INTR_NAME_MAX_SZ 32 + +struct pdsc_intr_info { + char name[PDSC_INTR_NAME_MAX_SZ]; + unsigned int index; + unsigned int vector; + void *data; +}; + +struct pdsc_cq_info { + void *comp; +}; + +struct pdsc_buf_info { + struct page *page; + dma_addr_t dma_addr; + u32 page_offset; + u32 len; +}; + +struct pdsc_q_info { + union { + void *desc; + struct pdsc_admin_cmd *adminq_desc; + }; + unsigned int bytes; + unsigned int nbufs; + struct pdsc_buf_info bufs[PDS_CORE_MAX_FRAGS]; + struct pdsc_wait_context *wc; + void *dest; +}; + +struct pdsc_cq { + struct pdsc_cq_info *info; + struct pdsc_queue *bound_q; + struct pdsc_intr_info *bound_intr; + u16 tail_idx; + bool done_color; + unsigned int num_descs; + unsigned int desc_size; + void *base; + dma_addr_t base_pa; /* must be page aligned */ +} ____cacheline_aligned_in_smp; + +struct pdsc_qcq { + struct pdsc *pdsc; + void *q_base; + dma_addr_t q_base_pa; /* might not be page aligned */ + void *cq_base; + dma_addr_t cq_base_pa; /* might not be page aligned */ + u32 q_size; + u32 cq_size; + bool armed; + unsigned int flags; + + struct work_struct work; + struct pdsc_queue q; + struct pdsc_cq cq; + int intx; + + u32 accum_work; + struct dentry *dentry; +}; + +struct pdsc_viftype { + char *name; + bool supported; + bool enabled; + int dl_id; + int vif_id; + struct pds_auxiliary_dev *padev; +}; + +/* No state flags set means we are in a steady running state */ +enum pdsc_state_flags { + PDSC_S_FW_DEAD, /* stopped, wait on startup or recovery */ + PDSC_S_INITING_DRIVER, /* initial startup from probe */ + PDSC_S_STOPPING_DRIVER, /* driver remove */ + + /* leave 
this as last */ + PDSC_S_STATE_SIZE +}; + +struct pdsc { + struct pci_dev *pdev; + struct dentry *dentry; + struct device *dev; + struct pdsc_dev_bar bars[PDS_CORE_BARS_MAX]; + struct pdsc_vf *vfs; + int num_vfs; + int vf_id; + int hw_index; + int uid; + + unsigned long state; + u8 fw_status; + u8 fw_generation; + unsigned long last_fw_time; + u32 last_hb; + struct timer_list wdtimer; + unsigned int wdtimer_period; + struct work_struct health_work; + struct devlink_health_reporter *fw_reporter; + u32 fw_recoveries; + + struct pdsc_devinfo dev_info; + struct pds_core_dev_identity dev_ident; + unsigned int nintrs; + struct pdsc_intr_info *intr_info; /* array of nintrs elements */ + + struct workqueue_struct *wq; + + unsigned int devcmd_timeout; + struct mutex devcmd_lock; /* lock for dev_cmd operations */ + struct mutex config_lock; /* lock for configuration operations */ + spinlock_t adminq_lock; /* lock for adminq operations */ + struct pds_core_dev_info_regs __iomem *info_regs; + struct pds_core_dev_cmd_regs __iomem *cmd_regs; + struct pds_core_intr __iomem *intr_ctrl; + u64 __iomem *intr_status; + u64 __iomem *db_pages; + dma_addr_t phy_db_pages; + u64 __iomem *kern_dbpage; + + struct pdsc_qcq adminqcq; + struct pdsc_qcq notifyqcq; + u64 last_eid; + struct pdsc_viftype *viftype_status; +}; + +/** enum pds_core_dbell_bits - bitwise composition of dbell values. + * + * @PDS_CORE_DBELL_QID_MASK: unshifted mask of valid queue id bits. + * @PDS_CORE_DBELL_QID_SHIFT: queue id shift amount in dbell value. + * @PDS_CORE_DBELL_QID: macro to build QID component of dbell value. + * + * @PDS_CORE_DBELL_RING_MASK: unshifted mask of valid ring bits. + * @PDS_CORE_DBELL_RING_SHIFT: ring shift amount in dbell value. + * @PDS_CORE_DBELL_RING: macro to build ring component of dbell value. + * + * @PDS_CORE_DBELL_RING_0: ring zero dbell component value. + * @PDS_CORE_DBELL_RING_1: ring one dbell component value. + * @PDS_CORE_DBELL_RING_2: ring two dbell component value. 
+ * @PDS_CORE_DBELL_RING_3: ring three dbell component value. + * + * @PDS_CORE_DBELL_INDEX_MASK: bit mask of valid index bits, no shift needed. + */ +enum pds_core_dbell_bits { + PDS_CORE_DBELL_QID_MASK = 0xffffff, + PDS_CORE_DBELL_QID_SHIFT = 24, + +#define PDS_CORE_DBELL_QID(n) \ + (((u64)(n) & PDS_CORE_DBELL_QID_MASK) << PDS_CORE_DBELL_QID_SHIFT) + + PDS_CORE_DBELL_RING_MASK = 0x7, + PDS_CORE_DBELL_RING_SHIFT = 16, + +#define PDS_CORE_DBELL_RING(n) \ + (((u64)(n) & PDS_CORE_DBELL_RING_MASK) << PDS_CORE_DBELL_RING_SHIFT) + + PDS_CORE_DBELL_RING_0 = 0, + PDS_CORE_DBELL_RING_1 = PDS_CORE_DBELL_RING(1), + PDS_CORE_DBELL_RING_2 = PDS_CORE_DBELL_RING(2), + PDS_CORE_DBELL_RING_3 = PDS_CORE_DBELL_RING(3), + + PDS_CORE_DBELL_INDEX_MASK = 0xffff, +}; + +static inline void pds_core_dbell_ring(u64 __iomem *db_page, + enum pds_core_logical_qtype qtype, + u64 val) +{ + writeq(val, &db_page[qtype]); +} + +int pdsc_fw_reporter_diagnose(struct devlink_health_reporter *reporter, + struct devlink_fmsg *fmsg, + struct netlink_ext_ack *extack); +int pdsc_dl_info_get(struct devlink *dl, struct devlink_info_req *req, + struct netlink_ext_ack *extack); +int pdsc_dl_flash_update(struct devlink *dl, + struct devlink_flash_update_params *params, + struct netlink_ext_ack *extack); +int pdsc_dl_enable_get(struct devlink *dl, u32 id, + struct devlink_param_gset_ctx *ctx); +int pdsc_dl_enable_set(struct devlink *dl, u32 id, + struct devlink_param_gset_ctx *ctx); +int pdsc_dl_enable_validate(struct devlink *dl, u32 id, + union devlink_param_value val, + struct netlink_ext_ack *extack); + +void __iomem *pdsc_map_dbpage(struct pdsc *pdsc, int page_num); + +void pdsc_debugfs_create(void); +void pdsc_debugfs_destroy(void); +void pdsc_debugfs_add_dev(struct pdsc *pdsc); +void pdsc_debugfs_del_dev(struct pdsc *pdsc); +void pdsc_debugfs_add_ident(struct pdsc *pdsc); +void pdsc_debugfs_add_viftype(struct pdsc *pdsc); +void pdsc_debugfs_add_irqs(struct pdsc *pdsc); +void pdsc_debugfs_add_qcq(struct 
pdsc *pdsc, struct pdsc_qcq *qcq); +void pdsc_debugfs_del_qcq(struct pdsc_qcq *qcq); + +int pdsc_err_to_errno(enum pds_core_status_code code); +bool pdsc_is_fw_running(struct pdsc *pdsc); +bool pdsc_is_fw_good(struct pdsc *pdsc); +int pdsc_devcmd(struct pdsc *pdsc, union pds_core_dev_cmd *cmd, + union pds_core_dev_comp *comp, int max_seconds); +int pdsc_devcmd_locked(struct pdsc *pdsc, union pds_core_dev_cmd *cmd, + union pds_core_dev_comp *comp, int max_seconds); +int pdsc_devcmd_init(struct pdsc *pdsc); +int pdsc_devcmd_reset(struct pdsc *pdsc); +int pdsc_dev_reinit(struct pdsc *pdsc); +int pdsc_dev_init(struct pdsc *pdsc); + +int pdsc_intr_alloc(struct pdsc *pdsc, char *name, + irq_handler_t handler, void *data); +void pdsc_intr_free(struct pdsc *pdsc, int index); +void pdsc_qcq_free(struct pdsc *pdsc, struct pdsc_qcq *qcq); +int pdsc_qcq_alloc(struct pdsc *pdsc, unsigned int type, unsigned int index, + const char *name, unsigned int flags, unsigned int num_descs, + unsigned int desc_size, unsigned int cq_desc_size, + unsigned int pid, struct pdsc_qcq *qcq); +int pdsc_setup(struct pdsc *pdsc, bool init); +void pdsc_teardown(struct pdsc *pdsc, bool removing); +int pdsc_start(struct pdsc *pdsc); +void pdsc_stop(struct pdsc *pdsc); +void pdsc_health_thread(struct work_struct *work); + +int pdsc_register_notify(struct notifier_block *nb); +void pdsc_unregister_notify(struct notifier_block *nb); +void pdsc_notify(unsigned long event, void *data); +int pdsc_auxbus_dev_add(struct pdsc *cf, struct pdsc *pf); +int pdsc_auxbus_dev_del(struct pdsc *cf, struct pdsc *pf); + +void pdsc_process_adminq(struct pdsc_qcq *qcq); +void pdsc_work_thread(struct work_struct *work); +irqreturn_t pdsc_adminq_isr(int irq, void *data); + +int pdsc_firmware_update(struct pdsc *pdsc, const struct firmware *fw, + struct netlink_ext_ack *extack); +#endif /* _PDSC_H_ */ diff --git a/drivers/net/ethernet/amd/pds_core/debugfs.c b/drivers/net/ethernet/amd/pds_core/debugfs.c new file mode 100644 
index 000000000000..8ec392299b7d --- /dev/null +++ b/drivers/net/ethernet/amd/pds_core/debugfs.c @@ -0,0 +1,170 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright(c) 2023 Advanced Micro Devices, Inc */ + +#include <linux/pci.h> + +#include "core.h" + +static struct dentry *pdsc_dir; + +void pdsc_debugfs_create(void) +{ + pdsc_dir = debugfs_create_dir(PDS_CORE_DRV_NAME, NULL); +} + +void pdsc_debugfs_destroy(void) +{ + debugfs_remove_recursive(pdsc_dir); +} + +void pdsc_debugfs_add_dev(struct pdsc *pdsc) +{ + pdsc->dentry = debugfs_create_dir(pci_name(pdsc->pdev), pdsc_dir); + + debugfs_create_ulong("state", 0400, pdsc->dentry, &pdsc->state); +} + +void pdsc_debugfs_del_dev(struct pdsc *pdsc) +{ + debugfs_remove_recursive(pdsc->dentry); + pdsc->dentry = NULL; +} + +static int identity_show(struct seq_file *seq, void *v) +{ + struct pdsc *pdsc = seq->private; + struct pds_core_dev_identity *ident; + int vt; + + ident = &pdsc->dev_ident; + + seq_printf(seq, "fw_heartbeat: 0x%x\n", + ioread32(&pdsc->info_regs->fw_heartbeat)); + + seq_printf(seq, "nlifs: %d\n", + le32_to_cpu(ident->nlifs)); + seq_printf(seq, "nintrs: %d\n", + le32_to_cpu(ident->nintrs)); + seq_printf(seq, "ndbpgs_per_lif: %d\n", + le32_to_cpu(ident->ndbpgs_per_lif)); + seq_printf(seq, "intr_coal_mult: %d\n", + le32_to_cpu(ident->intr_coal_mult)); + seq_printf(seq, "intr_coal_div: %d\n", + le32_to_cpu(ident->intr_coal_div)); + + seq_puts(seq, "vif_types: "); + for (vt = 0; vt < PDS_DEV_TYPE_MAX; vt++) + seq_printf(seq, "%d ", + le16_to_cpu(pdsc->dev_ident.vif_types[vt])); + seq_puts(seq, "\n"); + + return 0; +} +DEFINE_SHOW_ATTRIBUTE(identity); + +void pdsc_debugfs_add_ident(struct pdsc *pdsc) +{ + debugfs_create_file("identity", 0400, pdsc->dentry, + pdsc, &identity_fops); +} + +static int viftype_show(struct seq_file *seq, void *v) +{ + struct pdsc *pdsc = seq->private; + int vt; + + for (vt = 0; vt < PDS_DEV_TYPE_MAX; vt++) { + if (!pdsc->viftype_status[vt].name) + continue; + + seq_printf(seq, 
"%s\t%d supported %d enabled\n", + pdsc->viftype_status[vt].name, + pdsc->viftype_status[vt].supported, + pdsc->viftype_status[vt].enabled); + } + return 0; +} +DEFINE_SHOW_ATTRIBUTE(viftype); + +void pdsc_debugfs_add_viftype(struct pdsc *pdsc) +{ + debugfs_create_file("viftypes", 0400, pdsc->dentry, + pdsc, &viftype_fops); +} + +static const struct debugfs_reg32 intr_ctrl_regs[] = { + { .name = "coal_init", .offset = 0, }, + { .name = "mask", .offset = 4, }, + { .name = "credits", .offset = 8, }, + { .name = "mask_on_assert", .offset = 12, }, + { .name = "coal_timer", .offset = 16, }, +}; + +void pdsc_debugfs_add_qcq(struct pdsc *pdsc, struct pdsc_qcq *qcq) +{ + struct dentry *qcq_dentry, *q_dentry, *cq_dentry; + struct dentry *intr_dentry; + struct debugfs_regset32 *intr_ctrl_regset; + struct pdsc_intr_info *intr = &pdsc->intr_info[qcq->intx]; + struct pdsc_queue *q = &qcq->q; + struct pdsc_cq *cq = &qcq->cq; + + qcq_dentry = debugfs_create_dir(q->name, pdsc->dentry); + if (IS_ERR_OR_NULL(qcq_dentry)) + return; + qcq->dentry = qcq_dentry; + + debugfs_create_x64("q_base_pa", 0400, qcq_dentry, &qcq->q_base_pa); + debugfs_create_x32("q_size", 0400, qcq_dentry, &qcq->q_size); + debugfs_create_x64("cq_base_pa", 0400, qcq_dentry, &qcq->cq_base_pa); + debugfs_create_x32("cq_size", 0400, qcq_dentry, &qcq->cq_size); + debugfs_create_x32("accum_work", 0400, qcq_dentry, &qcq->accum_work); + + q_dentry = debugfs_create_dir("q", qcq->dentry); + if (IS_ERR_OR_NULL(q_dentry)) + return; + + debugfs_create_u32("index", 0400, q_dentry, &q->index); + debugfs_create_u32("num_descs", 0400, q_dentry, &q->num_descs); + debugfs_create_u32("desc_size", 0400, q_dentry, &q->desc_size); + debugfs_create_u32("pid", 0400, q_dentry, &q->pid); + + debugfs_create_u16("tail", 0400, q_dentry, &q->tail_idx); + debugfs_create_u16("head", 0400, q_dentry, &q->head_idx); + + cq_dentry = debugfs_create_dir("cq", qcq->dentry); + if (IS_ERR_OR_NULL(cq_dentry)) + return; + + debugfs_create_x64("base_pa", 
0400, cq_dentry, &cq->base_pa); + debugfs_create_u32("num_descs", 0400, cq_dentry, &cq->num_descs); + debugfs_create_u32("desc_size", 0400, cq_dentry, &cq->desc_size); + debugfs_create_bool("done_color", 0400, cq_dentry, &cq->done_color); + debugfs_create_u16("tail", 0400, cq_dentry, &cq->tail_idx); + + if (qcq->flags & PDS_CORE_QCQ_F_INTR) { + intr_dentry = debugfs_create_dir("intr", qcq->dentry); + if (IS_ERR_OR_NULL(intr_dentry)) + return; + + debugfs_create_u32("index", 0400, intr_dentry, &intr->index); + debugfs_create_u32("vector", 0400, intr_dentry, &intr->vector); + + intr_ctrl_regset = kzalloc(sizeof(*intr_ctrl_regset), + GFP_KERNEL); + if (!intr_ctrl_regset) + return; + intr_ctrl_regset->regs = intr_ctrl_regs; + intr_ctrl_regset->nregs = ARRAY_SIZE(intr_ctrl_regs); + intr_ctrl_regset->base = &pdsc->intr_ctrl[intr->index]; + + debugfs_create_regset32("intr_ctrl", 0400, intr_dentry, + intr_ctrl_regset); + } +}; + +void pdsc_debugfs_del_qcq(struct pdsc_qcq *qcq) +{ + debugfs_remove_recursive(qcq->dentry); + qcq->dentry = NULL; +} diff --git a/drivers/net/ethernet/amd/pds_core/dev.c b/drivers/net/ethernet/amd/pds_core/dev.c new file mode 100644 index 000000000000..f7c597ea5daf --- /dev/null +++ b/drivers/net/ethernet/amd/pds_core/dev.c @@ -0,0 +1,351 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright(c) 2023 Advanced Micro Devices, Inc */ + +#include <linux/errno.h> +#include <linux/pci.h> +#include <linux/utsname.h> + +#include "core.h" + +int pdsc_err_to_errno(enum pds_core_status_code code) +{ + switch (code) { + case PDS_RC_SUCCESS: + return 0; + case PDS_RC_EVERSION: + case PDS_RC_EQTYPE: + case PDS_RC_EQID: + case PDS_RC_EINVAL: + case PDS_RC_ENOSUPP: + return -EINVAL; + case PDS_RC_EPERM: + return -EPERM; + case PDS_RC_ENOENT: + return -ENOENT; + case PDS_RC_EAGAIN: + return -EAGAIN; + case PDS_RC_ENOMEM: + return -ENOMEM; + case PDS_RC_EFAULT: + return -EFAULT; + case PDS_RC_EBUSY: + return -EBUSY; + case PDS_RC_EEXIST: + return -EEXIST; + case 
PDS_RC_EVFID: + return -ENODEV; + case PDS_RC_ECLIENT: + return -ECHILD; + case PDS_RC_ENOSPC: + return -ENOSPC; + case PDS_RC_ERANGE: + return -ERANGE; + case PDS_RC_BAD_ADDR: + return -EFAULT; + case PDS_RC_EOPCODE: + case PDS_RC_EINTR: + case PDS_RC_DEV_CMD: + case PDS_RC_ERROR: + case PDS_RC_ERDMA: + case PDS_RC_EIO: + default: + return -EIO; + } +} + +bool pdsc_is_fw_running(struct pdsc *pdsc) +{ + pdsc->fw_status = ioread8(&pdsc->info_regs->fw_status); + pdsc->last_fw_time = jiffies; + pdsc->last_hb = ioread32(&pdsc->info_regs->fw_heartbeat); + + /* Firmware is useful only if the running bit is set and + * fw_status != 0xff (bad PCI read) + */ + return (pdsc->fw_status != 0xff) && + (pdsc->fw_status & PDS_CORE_FW_STS_F_RUNNING); +} + +bool pdsc_is_fw_good(struct pdsc *pdsc) +{ + u8 gen = pdsc->fw_status & PDS_CORE_FW_STS_F_GENERATION; + + return pdsc_is_fw_running(pdsc) && gen == pdsc->fw_generation; +} + +static u8 pdsc_devcmd_status(struct pdsc *pdsc) +{ + return ioread8(&pdsc->cmd_regs->comp.status); +} + +static bool pdsc_devcmd_done(struct pdsc *pdsc) +{ + return ioread32(&pdsc->cmd_regs->done) & PDS_CORE_DEV_CMD_DONE; +} + +static void pdsc_devcmd_dbell(struct pdsc *pdsc) +{ + iowrite32(0, &pdsc->cmd_regs->done); + iowrite32(1, &pdsc->cmd_regs->doorbell); +} + +static void pdsc_devcmd_clean(struct pdsc *pdsc) +{ + iowrite32(0, &pdsc->cmd_regs->doorbell); + memset_io(&pdsc->cmd_regs->cmd, 0, sizeof(pdsc->cmd_regs->cmd)); +} + +static const char *pdsc_devcmd_str(int opcode) +{ + switch (opcode) { + case PDS_CORE_CMD_NOP: + return "PDS_CORE_CMD_NOP"; + case PDS_CORE_CMD_IDENTIFY: + return "PDS_CORE_CMD_IDENTIFY"; + case PDS_CORE_CMD_RESET: + return "PDS_CORE_CMD_RESET"; + case PDS_CORE_CMD_INIT: + return "PDS_CORE_CMD_INIT"; + case PDS_CORE_CMD_FW_DOWNLOAD: + return "PDS_CORE_CMD_FW_DOWNLOAD"; + case PDS_CORE_CMD_FW_CONTROL: + return "PDS_CORE_CMD_FW_CONTROL"; + default: + return "PDS_CORE_CMD_UNKNOWN"; + } +} + +static int pdsc_devcmd_wait(struct pdsc 
*pdsc, int max_seconds) +{ + struct device *dev = pdsc->dev; + unsigned long start_time; + unsigned long max_wait; + unsigned long duration; + int timeout = 0; + int done = 0; + int err = 0; + int status; + int opcode; + + opcode = ioread8(&pdsc->cmd_regs->cmd.opcode); + + start_time = jiffies; + max_wait = start_time + (max_seconds * HZ); + + while (!done && !timeout) { + done = pdsc_devcmd_done(pdsc); + if (done) + break; + + timeout = time_after(jiffies, max_wait); + if (timeout) + break; + + usleep_range(100, 200); + } + duration = jiffies - start_time; + + if (done && duration > HZ) + dev_dbg(dev, "DEVCMD %d %s after %ld secs\n", + opcode, pdsc_devcmd_str(opcode), duration / HZ); + + if (!done || timeout) { + dev_err(dev, "DEVCMD %d %s timeout, done %d timeout %d max_seconds=%d\n", + opcode, pdsc_devcmd_str(opcode), done, timeout, + max_seconds); + err = -ETIMEDOUT; + pdsc_devcmd_clean(pdsc); + } + + status = pdsc_devcmd_status(pdsc); + err = pdsc_err_to_errno(status); + if (err && err != -EAGAIN) + dev_err(dev, "DEVCMD %d %s failed, status=%d err %d %pe\n", + opcode, pdsc_devcmd_str(opcode), status, err, + ERR_PTR(err)); + + return err; +} + +int pdsc_devcmd_locked(struct pdsc *pdsc, union pds_core_dev_cmd *cmd, + union pds_core_dev_comp *comp, int max_seconds) +{ + int err; + + memcpy_toio(&pdsc->cmd_regs->cmd, cmd, sizeof(*cmd)); + pdsc_devcmd_dbell(pdsc); + err = pdsc_devcmd_wait(pdsc, max_seconds); + memcpy_fromio(comp, &pdsc->cmd_regs->comp, sizeof(*comp)); + + if (err == -ENXIO || err == -ETIMEDOUT) + queue_work(pdsc->wq, &pdsc->health_work); + + return err; +} + +int pdsc_devcmd(struct pdsc *pdsc, union pds_core_dev_cmd *cmd, + union pds_core_dev_comp *comp, int max_seconds) +{ + int err; + + mutex_lock(&pdsc->devcmd_lock); + err = pdsc_devcmd_locked(pdsc, cmd, comp, max_seconds); + mutex_unlock(&pdsc->devcmd_lock); + + return err; +} + +int pdsc_devcmd_init(struct pdsc *pdsc) +{ + union pds_core_dev_comp comp = {}; + union pds_core_dev_cmd cmd = { + 
.opcode = PDS_CORE_CMD_INIT, + }; + + return pdsc_devcmd(pdsc, &cmd, &comp, pdsc->devcmd_timeout); +} + +int pdsc_devcmd_reset(struct pdsc *pdsc) +{ + union pds_core_dev_comp comp = {}; + union pds_core_dev_cmd cmd = { + .reset.opcode = PDS_CORE_CMD_RESET, + }; + + return pdsc_devcmd(pdsc, &cmd, &comp, pdsc->devcmd_timeout); +} + +static int pdsc_devcmd_identify_locked(struct pdsc *pdsc) +{ + union pds_core_dev_comp comp = {}; + union pds_core_dev_cmd cmd = { + .identify.opcode = PDS_CORE_CMD_IDENTIFY, + .identify.ver = PDS_CORE_IDENTITY_VERSION_1, + }; + + return pdsc_devcmd_locked(pdsc, &cmd, &comp, pdsc->devcmd_timeout); +} + +static void pdsc_init_devinfo(struct pdsc *pdsc) +{ + pdsc->dev_info.asic_type = ioread8(&pdsc->info_regs->asic_type); + pdsc->dev_info.asic_rev = ioread8(&pdsc->info_regs->asic_rev); + pdsc->fw_generation = PDS_CORE_FW_STS_F_GENERATION & + ioread8(&pdsc->info_regs->fw_status); + + memcpy_fromio(pdsc->dev_info.fw_version, + pdsc->info_regs->fw_version, + PDS_CORE_DEVINFO_FWVERS_BUFLEN); + pdsc->dev_info.fw_version[PDS_CORE_DEVINFO_FWVERS_BUFLEN] = 0; + + memcpy_fromio(pdsc->dev_info.serial_num, + pdsc->info_regs->serial_num, + PDS_CORE_DEVINFO_SERIAL_BUFLEN); + pdsc->dev_info.serial_num[PDS_CORE_DEVINFO_SERIAL_BUFLEN] = 0; + + dev_dbg(pdsc->dev, "fw_version %s\n", pdsc->dev_info.fw_version); +} + +static int pdsc_identify(struct pdsc *pdsc) +{ + struct pds_core_drv_identity drv = {}; + size_t sz; + int err; + + drv.drv_type = cpu_to_le32(PDS_DRIVER_LINUX); + snprintf(drv.driver_ver_str, sizeof(drv.driver_ver_str), + "%s %s", PDS_CORE_DRV_NAME, utsname()->release); + + /* Next let's get some info about the device + * We use the devcmd_lock at this level in order to + * get safe access to the cmd_regs->data before anyone + * else can mess it up + */ + mutex_lock(&pdsc->devcmd_lock); + + sz = min_t(size_t, sizeof(drv), sizeof(pdsc->cmd_regs->data)); + memcpy_toio(&pdsc->cmd_regs->data, &drv, sz); + + err = pdsc_devcmd_identify_locked(pdsc); + 
if (!err) { + sz = min_t(size_t, sizeof(pdsc->dev_ident), + sizeof(pdsc->cmd_regs->data)); + memcpy_fromio(&pdsc->dev_ident, &pdsc->cmd_regs->data, sz); + } + mutex_unlock(&pdsc->devcmd_lock); + + if (err) { + dev_err(pdsc->dev, "Cannot identify device: %pe\n", + ERR_PTR(err)); + return err; + } + + if (isprint(pdsc->dev_info.fw_version[0]) && + isascii(pdsc->dev_info.fw_version[0])) + dev_info(pdsc->dev, "FW: %.*s\n", + (int)(sizeof(pdsc->dev_info.fw_version) - 1), + pdsc->dev_info.fw_version); + else + dev_info(pdsc->dev, "FW: (invalid string) 0x%02x 0x%02x 0x%02x 0x%02x ...\n", + (u8)pdsc->dev_info.fw_version[0], + (u8)pdsc->dev_info.fw_version[1], + (u8)pdsc->dev_info.fw_version[2], + (u8)pdsc->dev_info.fw_version[3]); + + return 0; +} + +int pdsc_dev_reinit(struct pdsc *pdsc) +{ + pdsc_init_devinfo(pdsc); + + return pdsc_identify(pdsc); +} + +int pdsc_dev_init(struct pdsc *pdsc) +{ + unsigned int nintrs; + int err; + + /* Initial init and reset of device */ + pdsc_init_devinfo(pdsc); + pdsc->devcmd_timeout = PDS_CORE_DEVCMD_TIMEOUT; + + err = pdsc_devcmd_reset(pdsc); + if (err) + return err; + + err = pdsc_identify(pdsc); + if (err) + return err; + + pdsc_debugfs_add_ident(pdsc); + + /* Now we can reserve interrupts */ + nintrs = le32_to_cpu(pdsc->dev_ident.nintrs); + nintrs = min_t(unsigned int, num_online_cpus(), nintrs); + + /* Get intr_info struct array for tracking */ + pdsc->intr_info = kcalloc(nintrs, sizeof(*pdsc->intr_info), GFP_KERNEL); + if (!pdsc->intr_info) { + err = -ENOMEM; + goto err_out; + } + + err = pci_alloc_irq_vectors(pdsc->pdev, nintrs, nintrs, PCI_IRQ_MSIX); + if (err != nintrs) { + dev_err(pdsc->dev, "Can't get %d intrs from OS: %pe\n", + nintrs, ERR_PTR(err)); + err = -ENOSPC; + goto err_out; + } + pdsc->nintrs = nintrs; + + return 0; + +err_out: + kfree(pdsc->intr_info); + pdsc->intr_info = NULL; + + return err; +} diff --git a/drivers/net/ethernet/amd/pds_core/devlink.c b/drivers/net/ethernet/amd/pds_core/devlink.c new file mode 
100644 index 000000000000..9c6b3653c1c7 --- /dev/null +++ b/drivers/net/ethernet/amd/pds_core/devlink.c @@ -0,0 +1,183 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright(c) 2023 Advanced Micro Devices, Inc */ + +#include "core.h" +#include <linux/pds/pds_auxbus.h> + +static struct +pdsc_viftype *pdsc_dl_find_viftype_by_id(struct pdsc *pdsc, + enum devlink_param_type dl_id) +{ + int vt; + + for (vt = 0; vt < PDS_DEV_TYPE_MAX; vt++) { + if (pdsc->viftype_status[vt].dl_id == dl_id) + return &pdsc->viftype_status[vt]; + } + + return NULL; +} + +int pdsc_dl_enable_get(struct devlink *dl, u32 id, + struct devlink_param_gset_ctx *ctx) +{ + struct pdsc *pdsc = devlink_priv(dl); + struct pdsc_viftype *vt_entry; + + vt_entry = pdsc_dl_find_viftype_by_id(pdsc, id); + if (!vt_entry) + return -ENOENT; + + ctx->val.vbool = vt_entry->enabled; + + return 0; +} + +int pdsc_dl_enable_set(struct devlink *dl, u32 id, + struct devlink_param_gset_ctx *ctx) +{ + struct pdsc *pdsc = devlink_priv(dl); + struct pdsc_viftype *vt_entry; + int err = 0; + int vf_id; + + vt_entry = pdsc_dl_find_viftype_by_id(pdsc, id); + if (!vt_entry || !vt_entry->supported) + return -EOPNOTSUPP; + + if (vt_entry->enabled == ctx->val.vbool) + return 0; + + vt_entry->enabled = ctx->val.vbool; + for (vf_id = 0; vf_id < pdsc->num_vfs; vf_id++) { + struct pdsc *vf = pdsc->vfs[vf_id].vf; + + err = ctx->val.vbool ? 
pdsc_auxbus_dev_add(vf, pdsc) : + pdsc_auxbus_dev_del(vf, pdsc); + } + + return err; +} + +int pdsc_dl_enable_validate(struct devlink *dl, u32 id, + union devlink_param_value val, + struct netlink_ext_ack *extack) +{ + struct pdsc *pdsc = devlink_priv(dl); + struct pdsc_viftype *vt_entry; + + vt_entry = pdsc_dl_find_viftype_by_id(pdsc, id); + if (!vt_entry || !vt_entry->supported) + return -EOPNOTSUPP; + + if (!pdsc->viftype_status[vt_entry->vif_id].supported) + return -ENODEV; + + return 0; +} + +int pdsc_dl_flash_update(struct devlink *dl, + struct devlink_flash_update_params *params, + struct netlink_ext_ack *extack) +{ + struct pdsc *pdsc = devlink_priv(dl); + + return pdsc_firmware_update(pdsc, params->fw, extack); +} + +static char *fw_slotnames[] = { + "fw.goldfw", + "fw.mainfwa", + "fw.mainfwb", +}; + +int pdsc_dl_info_get(struct devlink *dl, struct devlink_info_req *req, + struct netlink_ext_ack *extack) +{ + union pds_core_dev_cmd cmd = { + .fw_control.opcode = PDS_CORE_CMD_FW_CONTROL, + .fw_control.oper = PDS_CORE_FW_GET_LIST, + }; + struct pds_core_fw_list_info fw_list; + struct pdsc *pdsc = devlink_priv(dl); + union pds_core_dev_comp comp; + char buf[16]; + int listlen; + int err; + int i; + + mutex_lock(&pdsc->devcmd_lock); + err = pdsc_devcmd_locked(pdsc, &cmd, &comp, pdsc->devcmd_timeout * 2); + memcpy_fromio(&fw_list, pdsc->cmd_regs->data, sizeof(fw_list)); + mutex_unlock(&pdsc->devcmd_lock); + if (err && err != -EIO) + return err; + + listlen = fw_list.num_fw_slots; + for (i = 0; i < listlen; i++) { + if (i < ARRAY_SIZE(fw_slotnames)) + strscpy(buf, fw_slotnames[i], sizeof(buf)); + else + snprintf(buf, sizeof(buf), "fw.slot_%d", i); + err = devlink_info_version_stored_put(req, buf, + fw_list.fw_names[i].fw_version); + } + + err = devlink_info_version_running_put(req, + DEVLINK_INFO_VERSION_GENERIC_FW, + pdsc->dev_info.fw_version); + if (err) + return err; + + snprintf(buf, sizeof(buf), "0x%x", pdsc->dev_info.asic_type); + err = 
devlink_info_version_fixed_put(req, + DEVLINK_INFO_VERSION_GENERIC_ASIC_ID, + buf); + if (err) + return err; + + snprintf(buf, sizeof(buf), "0x%x", pdsc->dev_info.asic_rev); + err = devlink_info_version_fixed_put(req, + DEVLINK_INFO_VERSION_GENERIC_ASIC_REV, + buf); + if (err) + return err; + + return devlink_info_serial_number_put(req, pdsc->dev_info.serial_num); +} + +int pdsc_fw_reporter_diagnose(struct devlink_health_reporter *reporter, + struct devlink_fmsg *fmsg, + struct netlink_ext_ack *extack) +{ + struct pdsc *pdsc = devlink_health_reporter_priv(reporter); + int err; + + mutex_lock(&pdsc->config_lock); + + if (test_bit(PDSC_S_FW_DEAD, &pdsc->state)) + err = devlink_fmsg_string_pair_put(fmsg, "Status", "dead"); + else if (!pdsc_is_fw_good(pdsc)) + err = devlink_fmsg_string_pair_put(fmsg, "Status", "unhealthy"); + else + err = devlink_fmsg_string_pair_put(fmsg, "Status", "healthy"); + + mutex_unlock(&pdsc->config_lock); + + if (err) + return err; + + err = devlink_fmsg_u32_pair_put(fmsg, "State", + pdsc->fw_status & + ~PDS_CORE_FW_STS_F_GENERATION); + if (err) + return err; + + err = devlink_fmsg_u32_pair_put(fmsg, "Generation", + pdsc->fw_generation >> 4); + if (err) + return err; + + return devlink_fmsg_u32_pair_put(fmsg, "Recoveries", + pdsc->fw_recoveries); +} diff --git a/drivers/net/ethernet/amd/pds_core/fw.c b/drivers/net/ethernet/amd/pds_core/fw.c new file mode 100644 index 000000000000..90a811f3878a --- /dev/null +++ b/drivers/net/ethernet/amd/pds_core/fw.c @@ -0,0 +1,194 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright(c) 2023 Advanced Micro Devices, Inc */ + +#include "core.h" + +/* The worst case wait for the install activity is about 25 minutes when + * installing a new CPLD, which is very seldom. Normal is about 30-35 + * seconds. Since the driver can't tell if a CPLD update will happen we + * set the timeout for the ugly case. 
+ */ +#define PDSC_FW_INSTALL_TIMEOUT (25 * 60) +#define PDSC_FW_SELECT_TIMEOUT 30 + +/* Number of periodic log updates during fw file download */ +#define PDSC_FW_INTERVAL_FRACTION 32 + +static int pdsc_devcmd_fw_download_locked(struct pdsc *pdsc, u64 addr, + u32 offset, u32 length) +{ + union pds_core_dev_cmd cmd = { + .fw_download.opcode = PDS_CORE_CMD_FW_DOWNLOAD, + .fw_download.offset = cpu_to_le32(offset), + .fw_download.addr = cpu_to_le64(addr), + .fw_download.length = cpu_to_le32(length), + }; + union pds_core_dev_comp comp; + + return pdsc_devcmd_locked(pdsc, &cmd, &comp, pdsc->devcmd_timeout); +} + +static int pdsc_devcmd_fw_install(struct pdsc *pdsc) +{ + union pds_core_dev_cmd cmd = { + .fw_control.opcode = PDS_CORE_CMD_FW_CONTROL, + .fw_control.oper = PDS_CORE_FW_INSTALL_ASYNC + }; + union pds_core_dev_comp comp; + int err; + + err = pdsc_devcmd(pdsc, &cmd, &comp, pdsc->devcmd_timeout); + if (err < 0) + return err; + + return comp.fw_control.slot; +} + +static int pdsc_devcmd_fw_activate(struct pdsc *pdsc, + enum pds_core_fw_slot slot) +{ + union pds_core_dev_cmd cmd = { + .fw_control.opcode = PDS_CORE_CMD_FW_CONTROL, + .fw_control.oper = PDS_CORE_FW_ACTIVATE_ASYNC, + .fw_control.slot = slot + }; + union pds_core_dev_comp comp; + + return pdsc_devcmd(pdsc, &cmd, &comp, pdsc->devcmd_timeout); +} + +static int pdsc_fw_status_long_wait(struct pdsc *pdsc, + const char *label, + unsigned long timeout, + u8 fw_cmd, + struct netlink_ext_ack *extack) +{ + union pds_core_dev_cmd cmd = { + .fw_control.opcode = PDS_CORE_CMD_FW_CONTROL, + .fw_control.oper = fw_cmd, + }; + union pds_core_dev_comp comp; + unsigned long start_time; + unsigned long end_time; + int err; + + /* Ping on the status of the long running async install + * command. We get EAGAIN while the command is still + * running, else we get the final command status. 
+ */ + start_time = jiffies; + end_time = start_time + (timeout * HZ); + do { + err = pdsc_devcmd(pdsc, &cmd, &comp, pdsc->devcmd_timeout); + msleep(20); + } while (time_before(jiffies, end_time) && + (err == -EAGAIN || err == -ETIMEDOUT)); + + if (err == -EAGAIN || err == -ETIMEDOUT) { + NL_SET_ERR_MSG_MOD(extack, "Firmware wait timed out"); + dev_err(pdsc->dev, "DEV_CMD firmware wait %s timed out\n", + label); + } else if (err) { + NL_SET_ERR_MSG_MOD(extack, "Firmware wait failed"); + } + + return err; +} + +int pdsc_firmware_update(struct pdsc *pdsc, const struct firmware *fw, + struct netlink_ext_ack *extack) +{ + u32 buf_sz, copy_sz, offset; + struct devlink *dl; + int next_interval; + u64 data_addr; + int err = 0; + int fw_slot; + + dev_info(pdsc->dev, "Installing firmware\n"); + + dl = priv_to_devlink(pdsc); + devlink_flash_update_status_notify(dl, "Preparing to flash", + NULL, 0, 0); + + buf_sz = sizeof(pdsc->cmd_regs->data); + + dev_dbg(pdsc->dev, + "downloading firmware - size %d part_sz %d nparts %lu\n", + (int)fw->size, buf_sz, DIV_ROUND_UP(fw->size, buf_sz)); + + offset = 0; + next_interval = 0; + data_addr = offsetof(struct pds_core_dev_cmd_regs, data); + while (offset < fw->size) { + if (offset >= next_interval) { + devlink_flash_update_status_notify(dl, "Downloading", + NULL, offset, + fw->size); + next_interval = offset + + (fw->size / PDSC_FW_INTERVAL_FRACTION); + } + + copy_sz = min_t(unsigned int, buf_sz, fw->size - offset); + mutex_lock(&pdsc->devcmd_lock); + memcpy_toio(&pdsc->cmd_regs->data, fw->data + offset, copy_sz); + err = pdsc_devcmd_fw_download_locked(pdsc, data_addr, + offset, copy_sz); + mutex_unlock(&pdsc->devcmd_lock); + if (err) { + dev_err(pdsc->dev, + "download failed offset 0x%x addr 0x%llx len 0x%x: %pe\n", + offset, data_addr, copy_sz, ERR_PTR(err)); + NL_SET_ERR_MSG_MOD(extack, "Segment download failed"); + goto err_out; + } + offset += copy_sz; + } + devlink_flash_update_status_notify(dl, "Downloading", NULL, + fw->size, 
fw->size);
+
+	devlink_flash_update_timeout_notify(dl, "Installing", NULL,
+					    PDSC_FW_INSTALL_TIMEOUT);
+
+	fw_slot = pdsc_devcmd_fw_install(pdsc);
+	if (fw_slot < 0) {
+		err = fw_slot;
+		dev_err(pdsc->dev, "install failed: %pe\n", ERR_PTR(err));
+		NL_SET_ERR_MSG_MOD(extack, "Failed to start firmware install");
+		goto err_out;
+	}
+
+	err = pdsc_fw_status_long_wait(pdsc, "Installing",
+				       PDSC_FW_INSTALL_TIMEOUT,
+				       PDS_CORE_FW_INSTALL_STATUS,
+				       extack);
+	if (err)
+		goto err_out;
+
+	devlink_flash_update_timeout_notify(dl, "Selecting", NULL,
+					    PDSC_FW_SELECT_TIMEOUT);
+
+	err = pdsc_devcmd_fw_activate(pdsc, fw_slot);
+	if (err) {
+		NL_SET_ERR_MSG_MOD(extack, "Failed to start firmware select");
+		goto err_out;
+	}
+
+	err = pdsc_fw_status_long_wait(pdsc, "Selecting",
+				       PDSC_FW_SELECT_TIMEOUT,
+				       PDS_CORE_FW_ACTIVATE_STATUS,
+				       extack);
+	if (err)
+		goto err_out;
+
+	dev_info(pdsc->dev, "Firmware update completed, slot %d\n", fw_slot);
+
+err_out:
+	if (err)
+		devlink_flash_update_status_notify(dl, "Flash failed",
+						   NULL, 0, 0);
+	else
+		devlink_flash_update_status_notify(dl, "Flash done",
+						   NULL, 0, 0);
+	return err;
+}
diff --git a/drivers/net/ethernet/amd/pds_core/main.c b/drivers/net/ethernet/amd/pds_core/main.c
new file mode 100644
index 000000000000..e2d14b1ca471
--- /dev/null
+++ b/drivers/net/ethernet/amd/pds_core/main.c
@@ -0,0 +1,475 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2023 Advanced Micro Devices, Inc */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/pci.h>
+
+#include <linux/pds/pds_common.h>
+
+#include "core.h"
+
+MODULE_DESCRIPTION(PDSC_DRV_DESCRIPTION);
+MODULE_AUTHOR("Advanced Micro Devices, Inc");
+MODULE_LICENSE("GPL");
+
+/* Supported devices */
+static const struct pci_device_id pdsc_id_table[] = {
+	{ PCI_VDEVICE(PENSANDO, PCI_DEVICE_ID_PENSANDO_CORE_PF) },
+	{ PCI_VDEVICE(PENSANDO, PCI_DEVICE_ID_PENSANDO_VDPA_VF) },
+	{ 0, }	/* end of table */
+};
+MODULE_DEVICE_TABLE(pci, pdsc_id_table);
+
+static void
pdsc_wdtimer_cb(struct timer_list *t)
+{
+	struct pdsc *pdsc = from_timer(pdsc, t, wdtimer);
+
+	dev_dbg(pdsc->dev, "%s: jiffies %ld\n", __func__, jiffies);
+	mod_timer(&pdsc->wdtimer,
+		  round_jiffies(jiffies + pdsc->wdtimer_period));
+
+	queue_work(pdsc->wq, &pdsc->health_work);
+}
+
+static void pdsc_unmap_bars(struct pdsc *pdsc)
+{
+	struct pdsc_dev_bar *bars = pdsc->bars;
+	unsigned int i;
+
+	for (i = 0; i < PDS_CORE_BARS_MAX; i++) {
+		if (bars[i].vaddr)
+			pci_iounmap(pdsc->pdev, bars[i].vaddr);
+	}
+}
+
+static int pdsc_map_bars(struct pdsc *pdsc)
+{
+	struct pdsc_dev_bar *bar = pdsc->bars;
+	struct pci_dev *pdev = pdsc->pdev;
+	struct device *dev = pdsc->dev;
+	struct pdsc_dev_bar *bars;
+	unsigned int i, j;
+	int num_bars = 0;
+	int err;
+	u32 sig;
+
+	bars = pdsc->bars;
+
+	/* Since the PCI interface in the hardware is configurable,
+	 * we need to poke into all the bars to find the set we're
+	 * expecting.
+	 */
+	for (i = 0, j = 0; i < PDS_CORE_BARS_MAX; i++) {
+		if (!(pci_resource_flags(pdev, i) & IORESOURCE_MEM))
+			continue;
+
+		bars[j].len = pci_resource_len(pdev, i);
+		bars[j].bus_addr = pci_resource_start(pdev, i);
+		bars[j].res_index = i;
+
+		/* only map the whole bar 0 */
+		if (j > 0) {
+			bars[j].vaddr = NULL;
+		} else {
+			bars[j].vaddr = pci_iomap(pdev, i, bars[j].len);
+			if (!bars[j].vaddr) {
+				dev_err(dev, "Cannot map BAR %d, aborting\n", i);
+				return -ENODEV;
+			}
+		}
+
+		j++;
+	}
+	num_bars = j;
+
+	/* BAR0: dev_cmd and interrupts */
+	if (num_bars < 1) {
+		dev_err(dev, "No bars found\n");
+		err = -EFAULT;
+		goto err_out;
+	}
+
+	if (bar->len < PDS_CORE_BAR0_SIZE) {
+		dev_err(dev, "Resource bar size %lu too small\n", bar->len);
+		err = -EFAULT;
+		goto err_out;
+	}
+
+	pdsc->info_regs = bar->vaddr + PDS_CORE_BAR0_DEV_INFO_REGS_OFFSET;
+	pdsc->cmd_regs = bar->vaddr + PDS_CORE_BAR0_DEV_CMD_REGS_OFFSET;
+	pdsc->intr_status = bar->vaddr + PDS_CORE_BAR0_INTR_STATUS_OFFSET;
+	pdsc->intr_ctrl = bar->vaddr + PDS_CORE_BAR0_INTR_CTRL_OFFSET;
+
+	sig =
ioread32(&pdsc->info_regs->signature);
+	if (sig != PDS_CORE_DEV_INFO_SIGNATURE) {
+		dev_err(dev, "Incompatible firmware signature %x", sig);
+		err = -EFAULT;
+		goto err_out;
+	}
+
+	/* BAR1: doorbells */
+	bar++;
+	if (num_bars < 2) {
+		dev_err(dev, "Doorbell bar missing\n");
+		err = -EFAULT;
+		goto err_out;
+	}
+
+	pdsc->db_pages = bar->vaddr;
+	pdsc->phy_db_pages = bar->bus_addr;
+
+	return 0;
+
+err_out:
+	pdsc_unmap_bars(pdsc);
+	return err;
+}
+
+void __iomem *pdsc_map_dbpage(struct pdsc *pdsc, int page_num)
+{
+	return pci_iomap_range(pdsc->pdev,
+			       pdsc->bars[PDS_CORE_PCI_BAR_DBELL].res_index,
+			       (u64)page_num << PAGE_SHIFT, PAGE_SIZE);
+}
+
+static int pdsc_sriov_configure(struct pci_dev *pdev, int num_vfs)
+{
+	struct pdsc *pdsc = pci_get_drvdata(pdev);
+	struct device *dev = pdsc->dev;
+	int ret = 0;
+
+	if (num_vfs > 0) {
+		pdsc->vfs = kcalloc(num_vfs, sizeof(struct pdsc_vf),
+				    GFP_KERNEL);
+		if (!pdsc->vfs)
+			return -ENOMEM;
+		pdsc->num_vfs = num_vfs;
+
+		ret = pci_enable_sriov(pdev, num_vfs);
+		if (ret) {
+			dev_err(dev, "Cannot enable SRIOV: %pe\n",
+				ERR_PTR(ret));
+			goto no_vfs;
+		}
+
+		return num_vfs;
+	}
+
+no_vfs:
+	pci_disable_sriov(pdev);
+
+	kfree(pdsc->vfs);
+	pdsc->vfs = NULL;
+	pdsc->num_vfs = 0;
+
+	return ret;
+}
+
+static int pdsc_init_vf(struct pdsc *vf)
+{
+	struct devlink *dl;
+	struct pdsc *pf;
+	int err;
+
+	pf = pdsc_get_pf_struct(vf->pdev);
+	if (IS_ERR_OR_NULL(pf))
+		return PTR_ERR(pf) ?: -1;
+
+	vf->vf_id = pci_iov_vf_id(vf->pdev);
+
+	dl = priv_to_devlink(vf);
+	devl_lock(dl);
+	devl_register(dl);
+	devl_unlock(dl);
+
+	pf->vfs[vf->vf_id].vf = vf;
+	err = pdsc_auxbus_dev_add(vf, pf);
+	if (err) {
+		devl_lock(dl);
+		devl_unregister(dl);
+		devl_unlock(dl);
+	}
+
+	return err;
+}
+
+static const struct devlink_health_reporter_ops pdsc_fw_reporter_ops = {
+	.name = "fw",
+	.diagnose = pdsc_fw_reporter_diagnose,
+};
+
+static const struct devlink_param pdsc_dl_params[] = {
+	DEVLINK_PARAM_GENERIC(ENABLE_VNET,
BIT(DEVLINK_PARAM_CMODE_RUNTIME),
+			      pdsc_dl_enable_get,
+			      pdsc_dl_enable_set,
+			      pdsc_dl_enable_validate),
+};
+
+#define PDSC_WQ_NAME_LEN 24
+
+static int pdsc_init_pf(struct pdsc *pdsc)
+{
+	struct devlink_health_reporter *hr;
+	char wq_name[PDSC_WQ_NAME_LEN];
+	struct devlink *dl;
+	int err;
+
+	pcie_print_link_status(pdsc->pdev);
+
+	err = pci_request_regions(pdsc->pdev, PDS_CORE_DRV_NAME);
+	if (err) {
+		dev_err(pdsc->dev, "Cannot request PCI regions: %pe\n",
+			ERR_PTR(err));
+		return err;
+	}
+
+	err = pdsc_map_bars(pdsc);
+	if (err)
+		goto err_out_release_regions;
+
+	/* General workqueue and timer, but don't start timer yet */
+	snprintf(wq_name, sizeof(wq_name), "%s.%d", PDS_CORE_DRV_NAME, pdsc->uid);
+	pdsc->wq = create_singlethread_workqueue(wq_name);
+	INIT_WORK(&pdsc->health_work, pdsc_health_thread);
+	timer_setup(&pdsc->wdtimer, pdsc_wdtimer_cb, 0);
+	pdsc->wdtimer_period = PDSC_WATCHDOG_SECS * HZ;
+
+	mutex_init(&pdsc->devcmd_lock);
+	mutex_init(&pdsc->config_lock);
+	spin_lock_init(&pdsc->adminq_lock);
+
+	mutex_lock(&pdsc->config_lock);
+	set_bit(PDSC_S_FW_DEAD, &pdsc->state);
+
+	err = pdsc_setup(pdsc, PDSC_SETUP_INIT);
+	if (err)
+		goto err_out_unmap_bars;
+	err = pdsc_start(pdsc);
+	if (err)
+		goto err_out_teardown;
+
+	mutex_unlock(&pdsc->config_lock);
+
+	dl = priv_to_devlink(pdsc);
+	devl_lock(dl);
+	err = devl_params_register(dl, pdsc_dl_params,
+				   ARRAY_SIZE(pdsc_dl_params));
+	if (err) {
+		dev_warn(pdsc->dev, "Failed to register devlink params: %pe\n",
+			 ERR_PTR(err));
+		goto err_out_unlock_dl;
+	}
+
+	hr = devl_health_reporter_create(dl, &pdsc_fw_reporter_ops, 0, pdsc);
+	if (IS_ERR(hr)) {
+		dev_warn(pdsc->dev, "Failed to create fw reporter: %pe\n", hr);
+		err = PTR_ERR(hr);
+		goto err_out_unreg_params;
+	}
+	pdsc->fw_reporter = hr;
+
+	devl_register(dl);
+	devl_unlock(dl);
+
+	/* Lastly, start the health check timer */
+	mod_timer(&pdsc->wdtimer, round_jiffies(jiffies + pdsc->wdtimer_period));
+
+	return 0;
+
+err_out_unreg_params:
+
devl_params_unregister(dl, pdsc_dl_params,
+			       ARRAY_SIZE(pdsc_dl_params));
+err_out_unlock_dl:
+	devl_unlock(dl);
+	pdsc_stop(pdsc);
+err_out_teardown:
+	pdsc_teardown(pdsc, PDSC_TEARDOWN_REMOVING);
+err_out_unmap_bars:
+	mutex_unlock(&pdsc->config_lock);
+	del_timer_sync(&pdsc->wdtimer);
+	if (pdsc->wq)
+		destroy_workqueue(pdsc->wq);
+	mutex_destroy(&pdsc->config_lock);
+	mutex_destroy(&pdsc->devcmd_lock);
+	pci_free_irq_vectors(pdsc->pdev);
+	pdsc_unmap_bars(pdsc);
+err_out_release_regions:
+	pci_release_regions(pdsc->pdev);
+
+	return err;
+}
+
+static const struct devlink_ops pdsc_dl_ops = {
+	.info_get = pdsc_dl_info_get,
+	.flash_update = pdsc_dl_flash_update,
+};
+
+static const struct devlink_ops pdsc_dl_vf_ops = {
+};
+
+static DEFINE_IDA(pdsc_ida);
+
+static int pdsc_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
+{
+	struct device *dev = &pdev->dev;
+	const struct devlink_ops *ops;
+	struct devlink *dl;
+	struct pdsc *pdsc;
+	bool is_pf;
+	int err;
+
+	is_pf = !pdev->is_virtfn;
+	ops = is_pf ? &pdsc_dl_ops : &pdsc_dl_vf_ops;
+	dl = devlink_alloc(ops, sizeof(struct pdsc), dev);
+	if (!dl)
+		return -ENOMEM;
+	pdsc = devlink_priv(dl);
+
+	pdsc->pdev = pdev;
+	pdsc->dev = &pdev->dev;
+	set_bit(PDSC_S_INITING_DRIVER, &pdsc->state);
+	pci_set_drvdata(pdev, pdsc);
+	pdsc_debugfs_add_dev(pdsc);
+
+	err = ida_alloc(&pdsc_ida, GFP_KERNEL);
+	if (err < 0) {
+		dev_err(pdsc->dev, "%s: id alloc failed: %pe\n",
+			__func__, ERR_PTR(err));
+		goto err_out_free_devlink;
+	}
+	pdsc->uid = err;
+
+	/* Query system for DMA addressing limitation for the device.
*/
+	err = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(PDS_CORE_ADDR_LEN));
+	if (err) {
+		dev_err(dev, "Unable to obtain 64-bit DMA for consistent allocations, aborting: %pe\n",
+			ERR_PTR(err));
+		goto err_out_free_ida;
+	}
+
+	err = pci_enable_device(pdev);
+	if (err) {
+		dev_err(dev, "Cannot enable PCI device: %pe\n", ERR_PTR(err));
+		goto err_out_free_ida;
+	}
+	pci_set_master(pdev);
+
+	if (is_pf)
+		err = pdsc_init_pf(pdsc);
+	else
+		err = pdsc_init_vf(pdsc);
+	if (err) {
+		dev_err(dev, "Cannot init device: %pe\n", ERR_PTR(err));
+		goto err_out_clear_master;
+	}
+
+	clear_bit(PDSC_S_INITING_DRIVER, &pdsc->state);
+	return 0;
+
+err_out_clear_master:
+	pci_clear_master(pdev);
+	pci_disable_device(pdev);
+err_out_free_ida:
+	ida_free(&pdsc_ida, pdsc->uid);
+err_out_free_devlink:
+	pdsc_debugfs_del_dev(pdsc);
+	devlink_free(dl);
+
+	return err;
+}
+
+static void pdsc_remove(struct pci_dev *pdev)
+{
+	struct pdsc *pdsc = pci_get_drvdata(pdev);
+	struct devlink *dl;
+
+	/* Unhook the registrations first to be sure there
+	 * are no requests while we're stopping.
+	 */
+	dl = priv_to_devlink(pdsc);
+	devl_lock(dl);
+	devl_unregister(dl);
+	if (!pdev->is_virtfn) {
+		if (pdsc->fw_reporter) {
+			devl_health_reporter_destroy(pdsc->fw_reporter);
+			pdsc->fw_reporter = NULL;
+		}
+		devl_params_unregister(dl, pdsc_dl_params,
+				       ARRAY_SIZE(pdsc_dl_params));
+	}
+	devl_unlock(dl);
+
+	if (pdev->is_virtfn) {
+		struct pdsc *pf;
+
+		pf = pdsc_get_pf_struct(pdsc->pdev);
+		if (!IS_ERR(pf)) {
+			pdsc_auxbus_dev_del(pdsc, pf);
+			pf->vfs[pdsc->vf_id].vf = NULL;
+		}
+	} else {
+		/* Remove the VFs and their aux_bus connections before other
+		 * cleanup so that the clients can use the AdminQ to cleanly
+		 * shut themselves down.
+		 */
+		pdsc_sriov_configure(pdev, 0);
+
+		del_timer_sync(&pdsc->wdtimer);
+		if (pdsc->wq)
+			destroy_workqueue(pdsc->wq);
+
+		mutex_lock(&pdsc->config_lock);
+		set_bit(PDSC_S_STOPPING_DRIVER, &pdsc->state);
+
+		pdsc_stop(pdsc);
+		pdsc_teardown(pdsc, PDSC_TEARDOWN_REMOVING);
+		mutex_unlock(&pdsc->config_lock);
+		mutex_destroy(&pdsc->config_lock);
+		mutex_destroy(&pdsc->devcmd_lock);
+
+		pci_free_irq_vectors(pdev);
+		pdsc_unmap_bars(pdsc);
+		pci_release_regions(pdev);
+	}
+
+	pci_clear_master(pdev);
+	pci_disable_device(pdev);
+
+	ida_free(&pdsc_ida, pdsc->uid);
+	pdsc_debugfs_del_dev(pdsc);
+	devlink_free(dl);
+}
+
+static struct pci_driver pdsc_driver = {
+	.name = PDS_CORE_DRV_NAME,
+	.id_table = pdsc_id_table,
+	.probe = pdsc_probe,
+	.remove = pdsc_remove,
+	.sriov_configure = pdsc_sriov_configure,
+};
+
+void *pdsc_get_pf_struct(struct pci_dev *vf_pdev)
+{
+	return pci_iov_get_pf_drvdata(vf_pdev, &pdsc_driver);
+}
+EXPORT_SYMBOL_GPL(pdsc_get_pf_struct);
+
+static int __init pdsc_init_module(void)
+{
+	if (strcmp(KBUILD_MODNAME, PDS_CORE_DRV_NAME))
+		return -EINVAL;
+
+	pdsc_debugfs_create();
+	return pci_register_driver(&pdsc_driver);
+}
+
+static void __exit pdsc_cleanup_module(void)
+{
+	pci_unregister_driver(&pdsc_driver);
+	pdsc_debugfs_destroy();
+}
+
+module_init(pdsc_init_module);
+module_exit(pdsc_cleanup_module);
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_drvinfo.c b/drivers/net/ethernet/aquantia/atlantic/aq_drvinfo.c
index d3526cd38f3d..414b2e448d59 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_drvinfo.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_drvinfo.c
@@ -124,7 +124,7 @@ static const struct hwmon_channel_info aq_hwmon_temp = {
 	.config = aq_hwmon_temp_config,
 };
-static const struct hwmon_channel_info *aq_hwmon_info[] = {
+static const struct hwmon_channel_info * const aq_hwmon_info[] = {
 	&aq_hwmon_temp,
 	NULL,
 };
diff --git a/drivers/net/ethernet/atheros/alx/main.c b/drivers/net/ethernet/atheros/alx/main.c
index
306393f8eeca..49bb9a8f00e6 100644
--- a/drivers/net/ethernet/atheros/alx/main.c
+++ b/drivers/net/ethernet/atheros/alx/main.c
@@ -39,7 +39,6 @@
 #include <linux/ipv6.h>
 #include <linux/if_vlan.h>
 #include <linux/mdio.h>
-#include <linux/aer.h>
 #include <linux/bitops.h>
 #include <linux/netdevice.h>
 #include <linux/etherdevice.h>
@@ -1745,7 +1744,6 @@ static int alx_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto out_pci_disable;
 	}
-	pci_enable_pcie_error_reporting(pdev);
 	pci_set_master(pdev);
 	if (!pdev->pm_cap) {
@@ -1879,7 +1877,6 @@ out_free_netdev:
 	free_netdev(netdev);
 out_pci_release:
 	pci_release_mem_regions(pdev);
-	pci_disable_pcie_error_reporting(pdev);
 out_pci_disable:
 	pci_disable_device(pdev);
 	return err;
@@ -1897,7 +1894,6 @@ static void alx_remove(struct pci_dev *pdev)
 	iounmap(hw->hw_addr);
 	pci_release_mem_regions(pdev);
-	pci_disable_pcie_error_reporting(pdev);
 	pci_disable_device(pdev);
 	mutex_destroy(&alx->mtx);
diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
index 40c781695d58..4a288799633f 100644
--- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
+++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
@@ -207,16 +207,6 @@ static inline void atl1c_irq_disable(struct atl1c_adapter *adapter)
 	synchronize_irq(adapter->pdev->irq);
 }
-/**
- * atl1c_irq_reset - reset interrupt confiure on the NIC
- * @adapter: board private structure
- */
-static inline void atl1c_irq_reset(struct atl1c_adapter *adapter)
-{
-	atomic_set(&adapter->irq_sem, 1);
-	atl1c_irq_enable(adapter);
-}
-
 /*
  * atl1c_wait_until_idle - wait up to AT_HW_MAX_IDLE_DELAY reads
  * of the idle status register until the device is actually idle
diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
index 9f473854b0f4..466e1d62bcf6 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -48,7 +48,6 @@
 #include <linux/cache.h>
 #include
<linux/firmware.h>
 #include <linux/log2.h>
-#include <linux/aer.h>
 #include <linux/crash_dump.h>
 #if IS_ENABLED(CONFIG_CNIC)
@@ -3829,7 +3828,7 @@ load_rv2p_fw(struct bnx2 *bp, u32 rv2p_proc,
 	return 0;
 }
-static int
+static void
 load_cpu_fw(struct bnx2 *bp, const struct cpu_reg *cpu_reg,
 	    const struct bnx2_mips_fw_file_entry *fw_entry)
 {
@@ -3897,48 +3896,34 @@ load_cpu_fw(struct bnx2 *bp, const struct cpu_reg *cpu_reg,
 	val &= ~cpu_reg->mode_value_halt;
 	bnx2_reg_wr_ind(bp, cpu_reg->state, cpu_reg->state_value_clear);
 	bnx2_reg_wr_ind(bp, cpu_reg->mode, val);
-
-	return 0;
 }
-static int
+static void
 bnx2_init_cpus(struct bnx2 *bp)
 {
 	const struct bnx2_mips_fw_file *mips_fw =
 		(const struct bnx2_mips_fw_file *) bp->mips_firmware->data;
 	const struct bnx2_rv2p_fw_file *rv2p_fw =
 		(const struct bnx2_rv2p_fw_file *) bp->rv2p_firmware->data;
-	int rc;
 	/* Initialize the RV2P processor. */
 	load_rv2p_fw(bp, RV2P_PROC1, &rv2p_fw->proc1);
 	load_rv2p_fw(bp, RV2P_PROC2, &rv2p_fw->proc2);
 	/* Initialize the RX Processor. */
-	rc = load_cpu_fw(bp, &cpu_reg_rxp, &mips_fw->rxp);
-	if (rc)
-		goto init_cpu_err;
+	load_cpu_fw(bp, &cpu_reg_rxp, &mips_fw->rxp);
 	/* Initialize the TX Processor. */
-	rc = load_cpu_fw(bp, &cpu_reg_txp, &mips_fw->txp);
-	if (rc)
-		goto init_cpu_err;
+	load_cpu_fw(bp, &cpu_reg_txp, &mips_fw->txp);
 	/* Initialize the TX Patch-up Processor. */
-	rc = load_cpu_fw(bp, &cpu_reg_tpat, &mips_fw->tpat);
-	if (rc)
-		goto init_cpu_err;
+	load_cpu_fw(bp, &cpu_reg_tpat, &mips_fw->tpat);
 	/* Initialize the Completion Processor. */
-	rc = load_cpu_fw(bp, &cpu_reg_com, &mips_fw->com);
-	if (rc)
-		goto init_cpu_err;
+	load_cpu_fw(bp, &cpu_reg_com, &mips_fw->com);
 	/* Initialize the Command Processor.
*/
-	rc = load_cpu_fw(bp, &cpu_reg_cp, &mips_fw->cp);
-
-init_cpu_err:
-	return rc;
+	load_cpu_fw(bp, &cpu_reg_cp, &mips_fw->cp);
 }
 static void
@@ -4951,8 +4936,7 @@ bnx2_init_chip(struct bnx2 *bp)
 	} else
 		bnx2_init_context(bp);
-	if ((rc = bnx2_init_cpus(bp)) != 0)
-		return rc;
+	bnx2_init_cpus(bp);
 	bnx2_init_nvram(bp);
@@ -8093,7 +8077,6 @@ bnx2_init_board(struct pci_dev *pdev, struct net_device *dev)
 	int rc, i, j;
 	u32 reg;
 	u64 dma_mask, persist_dma_mask;
-	int err;
 	SET_NETDEV_DEV(dev, &pdev->dev);
 	bp = netdev_priv(dev);
@@ -8176,12 +8159,6 @@ bnx2_init_board(struct pci_dev *pdev, struct net_device *dev)
 		bp->flags |= BNX2_FLAG_PCIE;
 		if (BNX2_CHIP_REV(bp) == BNX2_CHIP_REV_Ax)
 			bp->flags |= BNX2_FLAG_JUMBO_BROKEN;
-
-		/* AER (Advanced Error Reporting) hooks */
-		err = pci_enable_pcie_error_reporting(pdev);
-		if (!err)
-			bp->flags |= BNX2_FLAG_AER_ENABLED;
-
 	} else {
 		bp->pcix_cap = pci_find_capability(pdev, PCI_CAP_ID_PCIX);
 		if (bp->pcix_cap == 0) {
@@ -8460,11 +8437,6 @@ bnx2_init_board(struct pci_dev *pdev, struct net_device *dev)
 	return 0;
 err_out_unmap:
-	if (bp->flags & BNX2_FLAG_AER_ENABLED) {
-		pci_disable_pcie_error_reporting(pdev);
-		bp->flags &= ~BNX2_FLAG_AER_ENABLED;
-	}
-
 	pci_iounmap(pdev, bp->regview);
 	bp->regview = NULL;
@@ -8638,11 +8610,6 @@ bnx2_remove_one(struct pci_dev *pdev)
 	bnx2_free_stats_blk(dev);
 	kfree(bp->temp_stats_blk);
-	if (bp->flags & BNX2_FLAG_AER_ENABLED) {
-		pci_disable_pcie_error_reporting(pdev);
-		bp->flags &= ~BNX2_FLAG_AER_ENABLED;
-	}
-
 	bnx2_release_firmware(bp);
 	free_netdev(dev);
@@ -8766,9 +8733,6 @@ static pci_ers_result_t bnx2_io_slot_reset(struct pci_dev *pdev)
 	}
 	rtnl_unlock();
-	if (!(bp->flags & BNX2_FLAG_AER_ENABLED))
-		return result;
-
 	return result;
 }
diff --git a/drivers/net/ethernet/broadcom/bnx2.h b/drivers/net/ethernet/broadcom/bnx2.h
index a09ec47461c9..315b08c64edd 100644
--- a/drivers/net/ethernet/broadcom/bnx2.h
+++ b/drivers/net/ethernet/broadcom/bnx2.h
@@ -6808,7 +6808,6 @@ struct bnx2 {
 #define
BNX2_FLAG_JUMBO_BROKEN		0x00000800
 #define BNX2_FLAG_CAN_KEEP_VLAN		0x00001000
 #define BNX2_FLAG_BROKEN_STATS		0x00002000
-#define BNX2_FLAG_AER_ENABLED		0x00004000
 	struct bnx2_napi	bnx2_napi[BNX2_MAX_MSIX_VEC];
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
index dd5945c4bfec..8bcde0a6e011 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x.h
@@ -1486,7 +1486,6 @@ struct bnx2x {
 #define IS_VF_FLAG			(1 << 22)
 #define BC_SUPPORTS_RMMOD_CMD		(1 << 23)
 #define HAS_PHYS_PORT_ID		(1 << 24)
-#define AER_ENABLED			(1 << 25)
 #define PTP_SUPPORTED			(1 << 26)
 #define TX_TIMESTAMPING_EN		(1 << 27)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 12083b9679b5..6ea5521074d3 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -1935,8 +1935,7 @@ u16 bnx2x_select_queue(struct net_device *dev, struct sk_buff *skb,
 		/* Skip VLAN tag if present */
 		if (ether_type == ETH_P_8021Q) {
-			struct vlan_ethhdr *vhdr =
-				(struct vlan_ethhdr *)skb->data;
+			struct vlan_ethhdr *vhdr = skb_vlan_eth_hdr(skb);
 			ether_type = ntohs(vhdr->h_vlan_encapsulated_proto);
 		}
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index 5d1e4fe335aa..3bb5ea570c87 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -29,7 +29,6 @@
 #include <linux/slab.h>
 #include <linux/interrupt.h>
 #include <linux/pci.h>
-#include <linux/aer.h>
 #include <linux/init.h>
 #include <linux/netdevice.h>
 #include <linux/etherdevice.h>
@@ -13037,14 +13036,6 @@ static const struct net_device_ops bnx2x_netdev_ops = {
 	.ndo_features_check	= bnx2x_features_check,
 };
-static void bnx2x_disable_pcie_error_reporting(struct bnx2x *bp)
-{
-	if (bp->flags & AER_ENABLED) {
-
pci_disable_pcie_error_reporting(bp->pdev);
-		bp->flags &= ~AER_ENABLED;
-	}
-}
-
 static int bnx2x_init_dev(struct bnx2x *bp, struct pci_dev *pdev,
 			  struct net_device *dev, unsigned long board_type)
 {
@@ -13157,13 +13148,6 @@ static int bnx2x_init_dev(struct bnx2x *bp, struct pci_dev *pdev,
 	/* Set PCIe reset type to fundamental for EEH recovery */
 	pdev->needs_freset = 1;
-	/* AER (Advanced Error reporting) configuration */
-	rc = pci_enable_pcie_error_reporting(pdev);
-	if (!rc)
-		bp->flags |= AER_ENABLED;
-	else
-		BNX2X_DEV_INFO("Failed To configure PCIe AER [%d]\n", rc);
-
 	/*
 	 * Clean the following indirect addresses for all functions since it
 	 * is not used by the driver.
@@ -14020,8 +14004,6 @@ init_one_freemem:
 	bnx2x_free_mem_bp(bp);
 init_one_exit:
-	bnx2x_disable_pcie_error_reporting(bp);
-
 	if (bp->regview)
 		iounmap(bp->regview);
@@ -14102,7 +14084,6 @@ static void __bnx2x_remove(struct pci_dev *pdev,
 		pci_set_power_state(pdev, PCI_D3hot);
 	}
-	bnx2x_disable_pcie_error_reporting(bp);
 	if (remove_netdev) {
 		if (bp->regview)
 			iounmap(bp->regview);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 651b79ce5d80..dcd9367f05af 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -48,7 +48,6 @@
 #include <linux/prefetch.h>
 #include <linux/cache.h>
 #include <linux/log2.h>
-#include <linux/aer.h>
 #include <linux/bitmap.h>
 #include <linux/cpu_rmap.h>
 #include <linux/cpumask.h>
@@ -57,6 +56,7 @@
 #include <linux/hwmon-sysfs.h>
 #include <net/page_pool.h>
 #include <linux/align.h>
+#include <net/netdev_queues.h>
 #include "bnxt_hsi.h"
 #include "bnxt.h"
@@ -332,26 +332,6 @@ static void bnxt_txr_db_kick(struct bnxt *bp, struct bnxt_tx_ring_info *txr,
 	txr->kick_pending = 0;
 }
-static bool bnxt_txr_netif_try_stop_queue(struct bnxt *bp,
-					  struct bnxt_tx_ring_info *txr,
-					  struct netdev_queue *txq)
-{
-	netif_tx_stop_queue(txq);
-
-	/* netif_tx_stop_queue() must be done before checking
-	 * tx index in bnxt_tx_avail() below, because in
-	 * bnxt_tx_int(), we update tx index before checking for
-	 * netif_tx_queue_stopped().
-	 */
-	smp_mb();
-	if (bnxt_tx_avail(bp, txr) >= bp->tx_wake_thresh) {
-		netif_tx_wake_queue(txq);
-		return false;
-	}
-
-	return true;
-}
-
 static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct bnxt *bp = netdev_priv(dev);
@@ -385,7 +365,8 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		if (net_ratelimit() && txr->kick_pending)
 			netif_warn(bp, tx_err, dev,
 				   "bnxt: ring busy w/ flush pending!\n");
-		if (bnxt_txr_netif_try_stop_queue(bp, txr, txq))
+		if (!netif_txq_try_stop(txq, bnxt_tx_avail(bp, txr),
+					bp->tx_wake_thresh))
 			return NETDEV_TX_BUSY;
 	}
@@ -491,7 +472,7 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		prod = NEXT_TX(prod);
 		tx_push->doorbell =
 			cpu_to_le32(DB_KEY_TX_PUSH | DB_LONG_TX_PUSH | prod);
-		txr->tx_prod = prod;
+		WRITE_ONCE(txr->tx_prod, prod);
 		tx_buf->is_push = 1;
 		netdev_tx_sent_queue(txq, skb->len);
@@ -602,7 +583,7 @@ normal_tx:
 	wmb();
 	prod = NEXT_TX(prod);
-	txr->tx_prod = prod;
+	WRITE_ONCE(txr->tx_prod, prod);
 	if (!netdev_xmit_more() || netif_xmit_stopped(txq))
 		bnxt_txr_db_kick(bp, txr, prod);
@@ -615,7 +596,8 @@ tx_done:
 		if (netdev_xmit_more() && !tx_buf->is_push)
 			bnxt_txr_db_kick(bp, txr, prod);
-		bnxt_txr_netif_try_stop_queue(bp, txr, txq);
+		netif_txq_try_stop(txq, bnxt_tx_avail(bp, txr),
+				   bp->tx_wake_thresh);
 	}
 	return NETDEV_TX_OK;
@@ -706,20 +688,11 @@ next_tx_int:
 		dev_kfree_skb_any(skb);
 	}
-	netdev_tx_completed_queue(txq, nr_pkts, tx_bytes);
-	txr->tx_cons = cons;
+	WRITE_ONCE(txr->tx_cons, cons);
-	/* Need to make the tx_cons update visible to bnxt_start_xmit()
-	 * before checking for netif_tx_queue_stopped(). Without the
-	 * memory barrier, there is a small possibility that bnxt_start_xmit()
-	 * will miss it and cause the queue to be stopped forever.
-	 */
-	smp_mb();
-
-	if (unlikely(netif_tx_queue_stopped(txq)) &&
-	    bnxt_tx_avail(bp, txr) >= bp->tx_wake_thresh &&
-	    READ_ONCE(txr->dev_state) != BNXT_DEV_STATE_CLOSING)
-		netif_tx_wake_queue(txq);
+	__netif_txq_completed_wake(txq, nr_pkts, tx_bytes,
+				   bnxt_tx_avail(bp, txr), bp->tx_wake_thresh,
+				   READ_ONCE(txr->dev_state) != BNXT_DEV_STATE_CLOSING);
 }
 static struct page *__bnxt_alloc_rx_page(struct bnxt *bp, dma_addr_t *mapping,
@@ -3238,6 +3211,7 @@ static int bnxt_alloc_rx_page_pool(struct bnxt *bp,
 	pp.pool_size = bp->rx_ring_size;
 	pp.nid = dev_to_node(&bp->pdev->dev);
+	pp.napi = &rxr->bnapi->napi;
 	pp.dev = &bp->pdev->dev;
 	pp.dma_dir = DMA_BIDIRECTIONAL;
@@ -7770,7 +7744,7 @@ static int __bnxt_hwrm_func_qcaps(struct bnxt *bp)
 	if (flags & FUNC_QCAPS_RESP_FLAGS_WOL_MAGICPKT_SUPPORTED)
 		bp->flags |= BNXT_FLAG_WOL_CAP;
 	if (flags & FUNC_QCAPS_RESP_FLAGS_PTP_SUPPORTED) {
-		__bnxt_hwrm_ptp_qcfg(bp);
+		bp->fw_cap |= BNXT_FW_CAP_PTP;
 	} else {
 		bnxt_ptp_clear(bp);
 		kfree(bp->ptp_cfg);
@@ -12299,6 +12273,8 @@ static int bnxt_fw_init_one_p2(struct bnxt *bp)
 	bnxt_hwrm_vnic_qcaps(bp);
 	bnxt_hwrm_port_led_qcaps(bp);
 	bnxt_ethtool_init(bp);
+	if (bp->fw_cap & BNXT_FW_CAP_PTP)
+		__bnxt_hwrm_ptp_qcfg(bp);
 	bnxt_dcb_init(bp);
 	return 0;
 }
@@ -12705,8 +12681,6 @@ static int bnxt_init_board(struct pci_dev *pdev, struct net_device *dev)
 		goto init_err_release;
 	}
-	pci_enable_pcie_error_reporting(pdev);
-
 	INIT_WORK(&bp->sp_task, bnxt_sp_task);
 	INIT_DELAYED_WORK(&bp->fw_reset_task, bnxt_fw_reset_task);
@@ -13186,7 +13160,6 @@ static void bnxt_remove_one(struct pci_dev *pdev)
 	bnxt_rdma_aux_device_uninit(bp);
 	bnxt_ptp_clear(bp);
-	pci_disable_pcie_error_reporting(pdev);
 	unregister_netdev(dev);
 	clear_bit(BNXT_STATE_IN_FW_RESET, &bp->state);
 	/* Flush any pending tasks */
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 5928430f6f51..080e73496066 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++
b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1969,34 +1969,35 @@ struct bnxt {
 	u32			msg_enable;
-	u32			fw_cap;
-	#define BNXT_FW_CAP_SHORT_CMD			0x00000001
-	#define BNXT_FW_CAP_LLDP_AGENT			0x00000002
-	#define BNXT_FW_CAP_DCBX_AGENT			0x00000004
-	#define BNXT_FW_CAP_NEW_RM			0x00000008
-	#define BNXT_FW_CAP_IF_CHANGE			0x00000010
-	#define BNXT_FW_CAP_KONG_MB_CHNL		0x00000080
-	#define BNXT_FW_CAP_OVS_64BIT_HANDLE		0x00000400
-	#define BNXT_FW_CAP_TRUSTED_VF			0x00000800
-	#define BNXT_FW_CAP_ERROR_RECOVERY		0x00002000
-	#define BNXT_FW_CAP_PKG_VER			0x00004000
-	#define BNXT_FW_CAP_CFA_ADV_FLOW		0x00008000
-	#define BNXT_FW_CAP_CFA_RFS_RING_TBL_IDX_V2	0x00010000
-	#define BNXT_FW_CAP_PCIE_STATS_SUPPORTED	0x00020000
-	#define BNXT_FW_CAP_EXT_STATS_SUPPORTED		0x00040000
-	#define BNXT_FW_CAP_RSS_HASH_TYPE_DELTA		0x00080000
-	#define BNXT_FW_CAP_ERR_RECOVER_RELOAD		0x00100000
-	#define BNXT_FW_CAP_HOT_RESET			0x00200000
-	#define BNXT_FW_CAP_PTP_RTC			0x00400000
-	#define BNXT_FW_CAP_RX_ALL_PKT_TS		0x00800000
-	#define BNXT_FW_CAP_VLAN_RX_STRIP		0x01000000
-	#define BNXT_FW_CAP_VLAN_TX_INSERT		0x02000000
-	#define BNXT_FW_CAP_EXT_HW_STATS_SUPPORTED	0x04000000
-	#define BNXT_FW_CAP_LIVEPATCH			0x08000000
-	#define BNXT_FW_CAP_PTP_PPS			0x10000000
-	#define BNXT_FW_CAP_HOT_RESET_IF		0x20000000
-	#define BNXT_FW_CAP_RING_MONITOR		0x40000000
-	#define BNXT_FW_CAP_DBG_QCAPS			0x80000000
+	u64			fw_cap;
+	#define BNXT_FW_CAP_SHORT_CMD			BIT_ULL(0)
+	#define BNXT_FW_CAP_LLDP_AGENT			BIT_ULL(1)
+	#define BNXT_FW_CAP_DCBX_AGENT			BIT_ULL(2)
+	#define BNXT_FW_CAP_NEW_RM			BIT_ULL(3)
+	#define BNXT_FW_CAP_IF_CHANGE			BIT_ULL(4)
+	#define BNXT_FW_CAP_KONG_MB_CHNL		BIT_ULL(7)
+	#define BNXT_FW_CAP_OVS_64BIT_HANDLE		BIT_ULL(10)
+	#define BNXT_FW_CAP_TRUSTED_VF			BIT_ULL(11)
+	#define BNXT_FW_CAP_ERROR_RECOVERY		BIT_ULL(13)
+	#define BNXT_FW_CAP_PKG_VER			BIT_ULL(14)
+	#define BNXT_FW_CAP_CFA_ADV_FLOW		BIT_ULL(15)
+	#define BNXT_FW_CAP_CFA_RFS_RING_TBL_IDX_V2	BIT_ULL(16)
+	#define BNXT_FW_CAP_PCIE_STATS_SUPPORTED	BIT_ULL(17)
+
#define BNXT_FW_CAP_EXT_STATS_SUPPORTED		BIT_ULL(18)
+	#define BNXT_FW_CAP_RSS_HASH_TYPE_DELTA		BIT_ULL(19)
+	#define BNXT_FW_CAP_ERR_RECOVER_RELOAD		BIT_ULL(20)
+	#define BNXT_FW_CAP_HOT_RESET			BIT_ULL(21)
+	#define BNXT_FW_CAP_PTP_RTC			BIT_ULL(22)
+	#define BNXT_FW_CAP_RX_ALL_PKT_TS		BIT_ULL(23)
+	#define BNXT_FW_CAP_VLAN_RX_STRIP		BIT_ULL(24)
+	#define BNXT_FW_CAP_VLAN_TX_INSERT		BIT_ULL(25)
+	#define BNXT_FW_CAP_EXT_HW_STATS_SUPPORTED	BIT_ULL(26)
+	#define BNXT_FW_CAP_LIVEPATCH			BIT_ULL(27)
+	#define BNXT_FW_CAP_PTP_PPS			BIT_ULL(28)
+	#define BNXT_FW_CAP_HOT_RESET_IF		BIT_ULL(29)
+	#define BNXT_FW_CAP_RING_MONITOR		BIT_ULL(30)
+	#define BNXT_FW_CAP_DBG_QCAPS			BIT_ULL(31)
+	#define BNXT_FW_CAP_PTP				BIT_ULL(32)
 	u32			fw_dbg_cap;
@@ -2230,13 +2231,12 @@ struct bnxt {
 #define SFF_MODULE_ID_QSFP28			0x11
 #define BNXT_MAX_PHY_I2C_RESP_SIZE		64
-static inline u32 bnxt_tx_avail(struct bnxt *bp, struct bnxt_tx_ring_info *txr)
+static inline u32 bnxt_tx_avail(struct bnxt *bp,
+				const struct bnxt_tx_ring_info *txr)
 {
-	/* Tell compiler to fetch tx indices from memory.
*/
-	barrier();
+	u32 used = READ_ONCE(txr->tx_prod) - READ_ONCE(txr->tx_cons);
-	return bp->tx_ring_size -
-		((txr->tx_prod - txr->tx_cons) & bp->tx_ring_mask);
+	return bp->tx_ring_size - (used & bp->tx_ring_mask);
 }
 static inline void bnxt_writeq(struct bnxt *bp, u64 val,
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 6bd18eb5137f..2dd8ee4a6f75 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -2864,7 +2864,7 @@ static int bnxt_get_nvram_directory(struct net_device *dev, u32 len, u8 *data)
 	if (rc)
 		return rc;
-	buflen = dir_entries * entry_length;
+	buflen = mul_u32_u32(dir_entries, entry_length);
 	buf = hwrm_req_dma_slice(bp, req, buflen, &dma_handle);
 	if (!buf) {
 		hwrm_req_drop(bp, req);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
index a3a3978a4d1c..e46689128e32 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
@@ -230,7 +230,7 @@ static int bnxt_ptp_adjfine(struct ptp_clock_info *ptp_info, long scaled_ppm)
 						ptp_info);
 	struct bnxt *bp = ptp->bp;
-	if (BNXT_PTP_USE_RTC(bp))
+	if (!BNXT_MH(bp))
 		return bnxt_ptp_adjfine_rtc(bp, scaled_ppm);
 	spin_lock_bh(&ptp->ptp_lock);
@@ -861,9 +861,15 @@ static void bnxt_ptp_timecounter_init(struct bnxt *bp, bool init_tc)
 		memset(&ptp->cc, 0, sizeof(ptp->cc));
 		ptp->cc.read = bnxt_cc_read;
 		ptp->cc.mask = CYCLECOUNTER_MASK(48);
-		ptp->cc.shift = BNXT_CYCLES_SHIFT;
-		ptp->cc.mult = clocksource_khz2mult(BNXT_DEVCLK_FREQ, ptp->cc.shift);
-		ptp->cmult = ptp->cc.mult;
+		if (BNXT_MH(bp)) {
+			/* Use timecounter based non-real time mode */
+			ptp->cc.shift = BNXT_CYCLES_SHIFT;
+			ptp->cc.mult = clocksource_khz2mult(BNXT_DEVCLK_FREQ, ptp->cc.shift);
+			ptp->cmult = ptp->cc.mult;
+		} else {
+			ptp->cc.shift = 0;
+			ptp->cc.mult = 1;
+		}
 		ptp->next_overflow_check = jiffies +
BNXT_PHC_OVERFLOW_PERIOD;
 	}
 	if (init_tc)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index 3ed3a2b3b3a9..dde327f2c57e 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -825,8 +825,24 @@ static int bnxt_sriov_enable(struct bnxt *bp, int *num_vfs)
 	if (rc)
 		goto err_out2;
+	if (bp->eswitch_mode != DEVLINK_ESWITCH_MODE_SWITCHDEV)
+		return 0;
+
+	/* Create representors for VFs in switchdev mode */
+	devl_lock(bp->dl);
+	rc = bnxt_vf_reps_create(bp);
+	devl_unlock(bp->dl);
+	if (rc) {
+		netdev_info(bp->dev, "Cannot enable VFS as representors cannot be created\n");
+		goto err_out3;
+	}
+
 	return 0;
+err_out3:
+	/* Disable SR-IOV */
+	pci_disable_sriov(bp->pdev);
+
 err_out2:
 	/* Free the resources reserved for various VF's */
 	bnxt_hwrm_func_vf_resource_free(bp, *num_vfs);
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c
index fcc65890820a..2f1a1f2d2157 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c
@@ -356,10 +356,15 @@ void bnxt_vf_reps_destroy(struct bnxt *bp)
 	/* un-publish cfa_code_map so that RX path can't see it anymore */
 	kfree(bp->cfa_code_map);
 	bp->cfa_code_map = NULL;
-	bp->eswitch_mode = DEVLINK_ESWITCH_MODE_LEGACY;
-	if (closed)
+	if (closed) {
+		/* Temporarily set legacy mode to avoid re-opening
+		 * representors and restore switchdev mode after that.
+		 */
+		bp->eswitch_mode = DEVLINK_ESWITCH_MODE_LEGACY;
 		bnxt_open_nic(bp, false, false);
+		bp->eswitch_mode = DEVLINK_ESWITCH_MODE_SWITCHDEV;
+	}
 	rtnl_unlock();
 
 	/* Need to call vf_reps_destroy() outside of rntl_lock
@@ -482,7 +487,7 @@ static void bnxt_vf_rep_netdev_init(struct bnxt *bp, struct bnxt_vf_rep *vf_rep,
 	dev->min_mtu = ETH_ZLEN;
 }
 
-static int bnxt_vf_reps_create(struct bnxt *bp)
+int bnxt_vf_reps_create(struct bnxt *bp)
 {
 	u16 *cfa_code_map = NULL, num_vfs = pci_num_vf(bp->pdev);
 	struct bnxt_vf_rep *vf_rep;
@@ -535,7 +540,6 @@ static int bnxt_vf_reps_create(struct bnxt *bp)
 	/* publish cfa_code_map only after all VF-reps have been initialized */
 	bp->cfa_code_map = cfa_code_map;
-	bp->eswitch_mode = DEVLINK_ESWITCH_MODE_SWITCHDEV;
 	netif_keep_dst(bp->dev);
 
 	return 0;
@@ -559,6 +563,7 @@ int bnxt_dl_eswitch_mode_set(struct devlink *devlink, u16 mode,
 			     struct netlink_ext_ack *extack)
 {
 	struct bnxt *bp = bnxt_get_bp_from_dl(devlink);
+	int ret = 0;
 
 	if (bp->eswitch_mode == mode) {
 		netdev_info(bp->dev, "already in %s eswitch mode\n",
@@ -570,7 +575,7 @@ int bnxt_dl_eswitch_mode_set(struct devlink *devlink, u16 mode,
 	switch (mode) {
 	case DEVLINK_ESWITCH_MODE_LEGACY:
 		bnxt_vf_reps_destroy(bp);
-		return 0;
+		break;
 
 	case DEVLINK_ESWITCH_MODE_SWITCHDEV:
 		if (bp->hwrm_spec_code < 0x10803) {
@@ -578,15 +583,19 @@ int bnxt_dl_eswitch_mode_set(struct devlink *devlink, u16 mode,
 			return -ENOTSUPP;
 		}
 
-		if (pci_num_vf(bp->pdev) == 0) {
-			netdev_info(bp->dev, "Enable VFs before setting switchdev mode\n");
-			return -EPERM;
-		}
-		return bnxt_vf_reps_create(bp);
+		/* Create representors for existing VFs */
+		if (pci_num_vf(bp->pdev) > 0)
+			ret = bnxt_vf_reps_create(bp);
+		break;
 
 	default:
 		return -EINVAL;
 	}
+
+	if (!ret)
+		bp->eswitch_mode = mode;
+
+	return ret;
 }
 
 #endif
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.h b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.h
index 5637a84884d7..33a965631d0b 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.h
@@ -14,6 +14,7 @@
 
 #define MAX_CFA_CODE 65536
 
+int bnxt_vf_reps_create(struct bnxt *bp);
 void bnxt_vf_reps_destroy(struct bnxt *bp);
 void bnxt_vf_reps_close(struct bnxt *bp);
 void bnxt_vf_reps_open(struct bnxt *bp);
@@ -37,6 +38,11 @@ int bnxt_dl_eswitch_mode_set(struct devlink *devlink, u16 mode,
 
 #else
 
+static inline int bnxt_vf_reps_create(struct bnxt *bp)
+{
+	return 0;
+}
+
 static inline void bnxt_vf_reps_close(struct bnxt *bp)
 {
 }
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
index 5843c93b1711..4efa5fe6972b 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
@@ -64,7 +64,7 @@ struct bnxt_sw_tx_bd *bnxt_xmit_bd(struct bnxt *bp,
 		int frag_len;
 
 		prod = NEXT_TX(prod);
-		txr->tx_prod = prod;
+		WRITE_ONCE(txr->tx_prod, prod);
 
 		/* first fill up the first buffer */
 		frag_tx_buf = &txr->tx_buf_ring[prod];
@@ -94,7 +94,7 @@ struct bnxt_sw_tx_bd *bnxt_xmit_bd(struct bnxt *bp,
 	/* Sync TX BD */
 	wmb();
 	prod = NEXT_TX(prod);
-	txr->tx_prod = prod;
+	WRITE_ONCE(txr->tx_prod, prod);
 
 	return tx_buf;
 }
@@ -161,7 +161,7 @@ void bnxt_tx_int_xdp(struct bnxt *bp, struct bnxt_napi *bnapi, int nr_pkts)
 		}
 		tx_cons = NEXT_TX(tx_cons);
 	}
-	txr->tx_cons = tx_cons;
+	WRITE_ONCE(txr->tx_cons, tx_cons);
 	if (rx_doorbell_needed) {
 		tx_buf = &txr->tx_buf_ring[last_tx_cons];
 		bnxt_db_write(bp, &rxr->rx_db, tx_buf->rx_prod);
diff --git a/drivers/net/ethernet/broadcom/sb1250-mac.c b/drivers/net/ethernet/broadcom/sb1250-mac.c
index f02facb60fd1..3a6763c5e8b3 100644
--- a/drivers/net/ethernet/broadcom/sb1250-mac.c
+++ b/drivers/net/ethernet/broadcom/sb1250-mac.c
@@ -73,7 +73,7 @@ MODULE_PARM_DESC(int_timeout_rx, "RX timeout value");
 #include <asm/sibyte/board.h>
 #include <asm/sibyte/sb1250.h>
 
-#if defined(CONFIG_SIBYTE_BCM1x55) || defined(CONFIG_SIBYTE_BCM1x80)
+#if defined(CONFIG_SIBYTE_BCM1x80)
 #include <asm/sibyte/bcm1480_regs.h>
 #include <asm/sibyte/bcm1480_int.h>
 #define R_MAC_DMA_OODPKTLOST_RX R_MAC_DMA_OODPKTLOST
@@ -87,7 +87,7 @@ MODULE_PARM_DESC(int_timeout_rx, "RX timeout value");
 #include <asm/sibyte/sb1250_mac.h>
 #include <asm/sibyte/sb1250_dma.h>
 
-#if defined(CONFIG_SIBYTE_BCM1x55) || defined(CONFIG_SIBYTE_BCM1x80)
+#if defined(CONFIG_SIBYTE_BCM1x80)
 #define UNIT_INT(n) (K_BCM1480_INT_MAC_0 + ((n) * 2))
 #elif defined(CONFIG_SIBYTE_SB1250) || defined(CONFIG_SIBYTE_BCM112X)
 #define UNIT_INT(n) (K_INT_MAC_0 + (n))
@@ -1527,7 +1527,7 @@ static void sbmac_channel_start(struct sbmac_softc *s)
 	 * Turn on the rest of the bits in the enable register
 	 */
 
-#if defined(CONFIG_SIBYTE_BCM1x55) || defined(CONFIG_SIBYTE_BCM1x80)
+#if defined(CONFIG_SIBYTE_BCM1x80)
 	__raw_writeq(M_MAC_RXDMA_EN0 | M_MAC_TXDMA_EN0, s->sbm_macenable);
 
 #elif defined(CONFIG_SIBYTE_SB1250) || defined(CONFIG_SIBYTE_BCM112X)
diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index 14dfec4db8f9..cfbdd0022764 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -95,6 +95,8 @@
 #define GEM_SA4B 0x00A0 /* Specific4 Bottom */
 #define GEM_SA4T 0x00A4 /* Specific4 Top */
 #define GEM_WOL 0x00b8 /* Wake on LAN */
+#define GEM_RXPTPUNI 0x00D4 /* PTP RX Unicast address */
+#define GEM_TXPTPUNI 0x00D8 /* PTP TX Unicast address */
 #define GEM_EFTSH 0x00e8 /* PTP Event Frame Transmitted Seconds Register 47:32 */
 #define GEM_EFRSH 0x00ec /* PTP Event Frame Received Seconds Register 47:32 */
 #define GEM_PEFTSH 0x00f0 /* PTP Peer Event Frame Transmitted Seconds Register 47:32 */
@@ -245,6 +247,8 @@
 #define MACB_TZQ_OFFSET 12 /* Transmit zero quantum pause frame */
 #define MACB_TZQ_SIZE 1
 #define MACB_SRTSM_OFFSET 15 /* Store Receive Timestamp to Memory */
+#define MACB_PTPUNI_OFFSET 20 /* PTP Unicast packet enable */
+#define MACB_PTPUNI_SIZE 1
 #define MACB_OSSMODE_OFFSET 24 /* Enable One Step Synchro Mode */
 #define MACB_OSSMODE_SIZE 1
 #define MACB_MIIONRGMII_OFFSET 28 /* MII Usage on RGMII Interface */
@@ -692,6 +696,8 @@
 #define GEM_CLK_DIV48 3
 #define GEM_CLK_DIV64 4
 #define GEM_CLK_DIV96 5
+#define GEM_CLK_DIV128 6
+#define GEM_CLK_DIV224 7
 
 /* Constants for MAN register */
 #define MACB_MAN_C22_SOF 1
@@ -1361,7 +1367,7 @@ static inline bool macb_is_gem(struct macb *bp)
 
 static inline bool gem_has_ptp(struct macb *bp)
 {
-	return !!(bp->caps & MACB_CAPS_GEM_HAS_PTP);
+	return IS_ENABLED(CONFIG_MACB_USE_HWSTAMP) && (bp->caps & MACB_CAPS_GEM_HAS_PTP);
 }
 
 /**
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index e43d99ec50ba..29a1199dad14 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -94,8 +94,7 @@ struct sifive_fu540_macb_mgmt {
 /* Graceful stop timeouts in us. We should allow up to
  * 1 frame time (10 Mbits/s, full-duplex, ignoring collisions)
  */
-#define MACB_HALT_TIMEOUT 1230
-
+#define MACB_HALT_TIMEOUT 14000
 #define MACB_PM_TIMEOUT 100 /* ms */
 
 #define MACB_MDIO_TIMEOUT 1000000 /* in usecs */
@@ -288,6 +287,11 @@ static void macb_set_hwaddr(struct macb *bp)
 	top = cpu_to_le16(*((u16 *)(bp->dev->dev_addr + 4)));
 	macb_or_gem_writel(bp, SA1T, top);
 
+	if (gem_has_ptp(bp)) {
+		gem_writel(bp, RXPTPUNI, bottom);
+		gem_writel(bp, TXPTPUNI, bottom);
+	}
+
 	/* Clear unused address register sets */
 	macb_or_gem_writel(bp, SA2B, 0);
 	macb_or_gem_writel(bp, SA2T, 0);
@@ -774,8 +778,12 @@ static void macb_mac_link_up(struct phylink_config *config,
 
 	spin_unlock_irqrestore(&bp->lock, flags);
 
-	/* Enable Rx and Tx */
-	macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(RE) | MACB_BIT(TE));
+	/* Enable Rx and Tx; Enable PTP unicast */
+	ctrl = macb_readl(bp, NCR);
+	if (gem_has_ptp(bp))
+		ctrl |= MACB_BIT(PTPUNI);
+
+	macb_writel(bp, NCR, ctrl | MACB_BIT(RE) | MACB_BIT(TE));
 
 	netif_tx_wake_all_queues(ndev);
 }
@@ -1075,6 +1083,7 @@ static void macb_tx_error_task(struct work_struct *work)
 {
 	struct macb_queue *queue = container_of(work, struct macb_queue,
 						tx_error_task);
+	bool halt_timeout = false;
 	struct macb *bp = queue->bp;
 	struct macb_tx_skb *tx_skb;
 	struct macb_dma_desc *desc;
@@ -1102,9 +1111,11 @@ static void macb_tx_error_task(struct work_struct *work)
 	 * (in case we have just queued new packets)
 	 * macb/gem must be halted to write TBQP register
 	 */
-	if (macb_halt_tx(bp))
-		/* Just complain for now, reinitializing TX path can be good */
+	if (macb_halt_tx(bp)) {
 		netdev_err(bp->dev, "BUG: halt tx timed out\n");
+		macb_writel(bp, NCR, macb_readl(bp, NCR) & (~MACB_BIT(TE)));
+		halt_timeout = true;
+	}
 
 	/* Treat frames in TX queue including the ones that caused the error.
 	 * Free transmit buffers in upper layer.
@@ -1175,6 +1186,9 @@ static void macb_tx_error_task(struct work_struct *work)
 	macb_writel(bp, TSR, macb_readl(bp, TSR));
 	queue_writel(queue, IER, MACB_TX_INT_FLAGS);
 
+	if (halt_timeout)
+		macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TE));
+
 	/* Now we are ready to start transmission again */
 	netif_tx_start_all_queues(bp->dev);
 	macb_writel(bp, NCR, macb_readl(bp, NCR) | MACB_BIT(TSTART));
@@ -2645,8 +2659,12 @@ static u32 gem_mdc_clk_div(struct macb *bp)
 		config = GEM_BF(CLK, GEM_CLK_DIV48);
 	else if (pclk_hz <= 160000000)
 		config = GEM_BF(CLK, GEM_CLK_DIV64);
-	else
+	else if (pclk_hz <= 240000000)
 		config = GEM_BF(CLK, GEM_CLK_DIV96);
+	else if (pclk_hz <= 320000000)
+		config = GEM_BF(CLK, GEM_CLK_DIV128);
+	else
+		config = GEM_BF(CLK, GEM_CLK_DIV224);
 
 	return config;
 }
@@ -3884,17 +3902,17 @@ static void macb_configure_caps(struct macb *bp,
 		dcfg = gem_readl(bp, DCFG2);
 		if ((dcfg & (GEM_BIT(RX_PKT_BUFF) | GEM_BIT(TX_PKT_BUFF))) == 0)
 			bp->caps |= MACB_CAPS_FIFO_MODE;
-#ifdef CONFIG_MACB_USE_HWSTAMP
 		if (gem_has_ptp(bp)) {
 			if (!GEM_BFEXT(TSU, gem_readl(bp, DCFG5)))
 				dev_err(&bp->pdev->dev,
 					"GEM doesn't support hardware ptp.\n");
 			else {
+#ifdef CONFIG_MACB_USE_HWSTAMP
 				bp->hw_dma_cap |= HW_DMA_CAP_PTP;
 				bp->ptp_info = &gem_ptp_info;
+#endif
 			}
 		}
-#endif
 	}
 
 	dev_dbg(&bp->pdev->dev, "Cadence caps 0x%08x\n", bp->caps);
@@ -4848,7 +4866,7 @@ static const struct macb_config mpfs_config = {
 
 static const struct macb_config sama7g5_gem_config = {
 	.caps = MACB_CAPS_GIGABIT_MODE_AVAILABLE | MACB_CAPS_CLK_HW_CHG |
-		MACB_CAPS_MIIONRGMII,
+		MACB_CAPS_MIIONRGMII | MACB_CAPS_GEM_HAS_PTP,
 	.dma_burst_length = 16,
 	.clk_init = macb_clk_init,
 	.init = macb_init,
@@ -4857,7 +4875,8 @@ static const struct macb_config sama7g5_gem_config = {
 
 static const struct macb_config sama7g5_emac_config = {
 	.caps = MACB_CAPS_USRIO_DEFAULT_IS_MII_GMII |
-		MACB_CAPS_USRIO_HAS_CLKEN | MACB_CAPS_MIIONRGMII,
+		MACB_CAPS_USRIO_HAS_CLKEN | MACB_CAPS_MIIONRGMII |
+		MACB_CAPS_GEM_HAS_PTP,
 	.dma_burst_length = 16,
 	.clk_init = macb_clk_init,
 	.init = macb_init,
diff --git a/drivers/net/ethernet/cadence/macb_ptp.c b/drivers/net/ethernet/cadence/macb_ptp.c
index f962a95068a0..51d26fa190d7 100644
--- a/drivers/net/ethernet/cadence/macb_ptp.c
+++ b/drivers/net/ethernet/cadence/macb_ptp.c
@@ -258,6 +258,8 @@ static int gem_hw_timestamp(struct macb *bp, u32 dma_desc_ts_1,
 	 */
 	gem_tsu_get_time(&bp->ptp_clock_info, &tsu, NULL);
 
+	ts->tv_sec |= ((~GEM_DMA_SEC_MASK) & tsu.tv_sec);
+
 	/* If the top bit is set in the timestamp,
 	 * but not in 1588 timer, it has rolled over,
 	 * so subtract max size
@@ -266,8 +268,6 @@ static int gem_hw_timestamp(struct macb *bp, u32 dma_desc_ts_1,
 	    !(tsu.tv_sec & (GEM_DMA_SEC_TOP >> 1)))
 		ts->tv_sec -= GEM_DMA_SEC_TOP;
 
-	ts->tv_sec += ((~GEM_DMA_SEC_MASK) & tsu.tv_sec);
-
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/cavium/liquidio/lio_main.c b/drivers/net/ethernet/cavium/liquidio/lio_main.c
index fd7c80edb6e8..9bd1d2d7027d 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_main.c
@@ -1129,7 +1129,6 @@ static void octeon_destroy_resources(struct octeon_device *oct)
 
 		fallthrough;
 	case OCT_DEV_PCI_ENABLE_DONE:
-		pci_clear_master(oct->pci_dev);
 		/* Disable the device, releasing the PCI INT */
 		pci_disable_device(oct->pci_dev);
 
diff --git a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
index ac196883f07e..e2921aec3da0 100644
--- a/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
+++ b/drivers/net/ethernet/cavium/liquidio/lio_vf_main.c
@@ -577,7 +577,6 @@ static void octeon_destroy_resources(struct octeon_device *oct)
 
 		fallthrough;
 	case OCT_DEV_PCI_ENABLE_DONE:
-		pci_clear_master(oct->pci_dev);
 		/* Disable the device, releasing the PCI INT */
 		pci_disable_device(oct->pci_dev);
 
diff --git a/drivers/net/ethernet/cavium/liquidio/request_manager.c b/drivers/net/ethernet/cavium/liquidio/request_manager.c
index 8e59c2825533..32f854c0cd79 100644
--- a/drivers/net/ethernet/cavium/liquidio/request_manager.c
+++ b/drivers/net/ethernet/cavium/liquidio/request_manager.c
@@ -40,15 +40,6 @@ static void __check_db_timeout(struct octeon_device *oct, u64 iq_no);
 
 static void (*reqtype_free_fn[MAX_OCTEON_DEVICES][REQTYPE_LAST + 1]) (void *);
 
-static inline int IQ_INSTR_MODE_64B(struct octeon_device *oct, int iq_no)
-{
-	struct octeon_instr_queue *iq =
-	    (struct octeon_instr_queue *)oct->instr_queue[iq_no];
-	return iq->iqcmd_64B;
-}
-
-#define IQ_INSTR_MODE_32B(oct, iq_no) (!IQ_INSTR_MODE_64B(oct, iq_no))
-
 /* Define this to return the request status comaptible to old code */
 /*#define OCTEON_USE_OLD_REQ_STATUS*/
 
diff --git a/drivers/net/ethernet/chelsio/cxgb3/sge.c b/drivers/net/ethernet/chelsio/cxgb3/sge.c
index 62dfbdd33365..efa7f401529e 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb3/sge.c
@@ -166,11 +166,6 @@ static u8 flit_desc_map[] = {
 #endif
 };
 
-static inline struct sge_qset *fl_to_qset(const struct sge_fl *q, int qidx)
-{
-	return container_of(q, struct sge_qset, fl[qidx]);
-}
-
 static inline struct sge_qset *rspq_to_qset(const struct sge_rspq *q)
 {
 	return container_of(q, struct sge_qset, rspq);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 7db2403c4c9c..f0bc7396ce2b 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -51,7 +51,6 @@
 #include <linux/mutex.h>
 #include <linux/netdevice.h>
 #include <linux/pci.h>
-#include <linux/aer.h>
 #include <linux/rtnetlink.h>
 #include <linux/sched.h>
 #include <linux/seq_file.h>
@@ -6687,7 +6686,6 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto out_free_adapter;
 	}
 
-	pci_enable_pcie_error_reporting(pdev);
 	pci_set_master(pdev);
 	pci_save_state(pdev);
 	adap_idx++;
@@ -7092,7 +7090,6 @@ fw_attach_fail:
 out_unmap_bar0:
 	iounmap(regs);
 out_disable_device:
-	pci_disable_pcie_error_reporting(pdev);
 	pci_disable_device(pdev);
 out_release_regions:
 	pci_release_regions(pdev);
@@ -7171,7 +7168,6 @@ static void remove_one(struct pci_dev *pdev)
 	}
 #endif
 	iounmap(adapter->regs);
-	pci_disable_pcie_error_reporting(pdev);
 	if ((adapter->flags & CXGB4_DEV_ENABLED)) {
 		pci_disable_device(pdev);
 		adapter->flags &= ~CXGB4_DEV_ENABLED;
diff --git a/drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c b/drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c
index 63b2bd084130..9ba0864592e8 100644
--- a/drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c
@@ -3258,7 +3258,6 @@ err_free_adapter:
 
 err_release_regions:
 	pci_release_regions(pdev);
-	pci_clear_master(pdev);
 
 err_disable_device:
 	pci_disable_device(pdev);
@@ -3338,7 +3337,6 @@ static void cxgb4vf_pci_remove(struct pci_dev *pdev)
 	 * Disable the device and release its PCI resources.
 	 */
 	pci_disable_device(pdev);
-	pci_clear_master(pdev);
 	pci_release_regions(pdev);
 }
 
diff --git a/drivers/net/ethernet/ec_bhf.c b/drivers/net/ethernet/ec_bhf.c
index 46e3a05e9582..c2c5c589a5e3 100644
--- a/drivers/net/ethernet/ec_bhf.c
+++ b/drivers/net/ethernet/ec_bhf.c
@@ -558,7 +558,6 @@ err_unmap:
 err_release_regions:
 	pci_release_regions(dev);
 err_disable_dev:
-	pci_clear_master(dev);
 	pci_disable_device(dev);
 
 	return err;
@@ -577,7 +576,6 @@ static void ec_bhf_remove(struct pci_dev *dev)
 	free_netdev(net_dev);
 
 	pci_release_regions(dev);
-	pci_clear_master(dev);
 	pci_disable_device(dev);
 }
 
diff --git a/drivers/net/ethernet/emulex/benet/be_cmds.c b/drivers/net/ethernet/emulex/benet/be_cmds.c
index 08ec84cd21c0..61adcebeef01 100644
--- a/drivers/net/ethernet/emulex/benet/be_cmds.c
+++ b/drivers/net/ethernet/emulex/benet/be_cmds.c
@@ -135,7 +135,8 @@ static int be_mcc_notify(struct be_adapter *adapter)
 
 /* To check if valid bit is set, check the entire word as we don't know
  * the endianness of the data (old entry is host endian while a new entry is
- * little endian) */
+ * little endian)
+ */
 static inline bool be_mcc_compl_is_new(struct be_mcc_compl *compl)
 {
 	u32 flags;
@@ -248,7 +249,8 @@ static int be_mcc_compl_process(struct be_adapter *adapter,
 	u8 opcode = 0, subsystem = 0;
 
 	/* Just swap the status to host endian; mcc tag is opaquely copied
-	 * from mcc_wrb */
+	 * from mcc_wrb
+	 */
 	be_dws_le_to_cpu(compl, 4);
 
 	base_status = base_status(compl->status);
@@ -657,8 +659,7 @@ static int be_mbox_db_ready_wait(struct be_adapter *adapter, void __iomem *db)
 	return 0;
 }
 
-/*
- * Insert the mailbox address into the doorbell in two steps
+/* Insert the mailbox address into the doorbell in two steps
  * Polls on the mbox doorbell till a command completion (or a timeout) occurs
  */
 static int be_mbox_notify_wait(struct be_adapter *adapter)
@@ -802,7 +803,7 @@ static void be_wrb_cmd_hdr_prepare(struct be_cmd_req_hdr *req_hdr,
 	req_hdr->subsystem = subsystem;
 	req_hdr->request_length = cpu_to_le32(cmd_len - sizeof(*req_hdr));
 	req_hdr->version = 0;
-	fill_wrb_tags(wrb, (ulong) req_hdr);
+	fill_wrb_tags(wrb, (ulong)req_hdr);
 	wrb->payload_length = cmd_len;
 	if (mem) {
 		wrb->embedded |= (1 & MCC_WRB_SGE_CNT_MASK) <<
@@ -832,8 +833,8 @@ static void be_cmd_page_addrs_prepare(struct phys_addr *pages, u32 max_pages,
 static inline struct be_mcc_wrb *wrb_from_mbox(struct be_adapter *adapter)
 {
 	struct be_dma_mem *mbox_mem = &adapter->mbox_mem;
-	struct be_mcc_wrb *wrb
-		= &((struct be_mcc_mailbox *)(mbox_mem->va))->wrb;
+	struct be_mcc_wrb *wrb = &((struct be_mcc_mailbox *)(mbox_mem->va))->wrb;
+
 	memset(wrb, 0, sizeof(*wrb));
 	return wrb;
 }
@@ -896,7 +897,7 @@ static struct be_mcc_wrb *be_cmd_copy(struct be_adapter *adapter,
 
 	memcpy(dest_wrb, wrb, sizeof(*wrb));
 	if (wrb->embedded & cpu_to_le32(MCC_WRB_EMBEDDED_MASK))
-		fill_wrb_tags(dest_wrb, (ulong) embedded_payload(wrb));
+		fill_wrb_tags(dest_wrb, (ulong)embedded_payload(wrb));
 
 	return dest_wrb;
 }
@@ -1114,7 +1115,7 @@ int be_cmd_pmac_add(struct be_adapter *adapter, const u8 *mac_addr,
 err:
 	mutex_unlock(&adapter->mcc_lock);
 
-	if (base_status(status) == MCC_STATUS_UNAUTHORIZED_REQUEST)
+	if (base_status(status) == MCC_STATUS_UNAUTHORIZED_REQUEST)
 		status = -EPERM;
 
 	return status;
@@ -1803,7 +1804,7 @@ int be_cmd_get_fat_dump(struct be_adapter *adapter, u32 buf_len, void *buf)
 
 	total_size = buf_len;
 
-	get_fat_cmd.size = sizeof(struct be_cmd_req_get_fat) + 60*1024;
+	get_fat_cmd.size = sizeof(struct be_cmd_req_get_fat) + 60 * 1024;
 	get_fat_cmd.va = dma_alloc_coherent(&adapter->pdev->dev,
 					    get_fat_cmd.size,
 					    &get_fat_cmd.dma, GFP_ATOMIC);
@@ -1813,7 +1814,7 @@ int be_cmd_get_fat_dump(struct be_adapter *adapter, u32 buf_len, void *buf)
 	mutex_lock(&adapter->mcc_lock);
 
 	while (total_size) {
-		buf_size = min(total_size, (u32)60*1024);
+		buf_size = min(total_size, (u32)60 * 1024);
 		total_size -= buf_size;
 
 		wrb = wrb_from_mccq(adapter);
@@ -3362,7 +3363,7 @@ int be_cmd_ddr_dma_test(struct be_adapter *adapter, u64 pattern,
 	req->pattern = cpu_to_le64(pattern);
 	req->byte_count = cpu_to_le32(byte_cnt);
 	for (i = 0; i < byte_cnt; i++) {
-		req->snd_buff[i] = (u8)(pattern >> (j*8));
+		req->snd_buff[i] = (u8)(pattern >> (j * 8));
 		j++;
 		if (j > 7)
 			j = 0;
@@ -3846,7 +3847,7 @@ int be_cmd_set_mac_list(struct be_adapter *adapter, u8 *mac_array,
 	req->hdr.domain = domain;
 	req->mac_count = mac_count;
 	if (mac_count)
-		memcpy(req->mac, mac_array, ETH_ALEN*mac_count);
+		memcpy(req->mac, mac_array, ETH_ALEN * mac_count);
 
 	status = be_mcc_notify_wait(adapter);
 
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 46fe3d74e2e9..7e408bcc88de 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -16,7 +16,6 @@
 #include "be.h"
 #include "be_cmds.h"
 #include <asm/div64.h>
-#include <linux/aer.h>
 #include <linux/if_bridge.h>
 #include <net/busy_poll.h>
 #include <net/vxlan.h>
@@ -1125,7 +1124,7 @@ static struct sk_buff *be_lancer_xmit_workarounds(struct be_adapter *adapter,
 						  struct be_wrb_params
 						  *wrb_params)
 {
-	struct vlan_ethhdr *veh = (struct vlan_ethhdr *)skb->data;
+	struct vlan_ethhdr *veh = skb_vlan_eth_hdr(skb);
 	unsigned int eth_hdr_len;
 	struct iphdr *ip;
 
@@ -5726,8 +5725,6 @@ static void be_remove(struct pci_dev *pdev)
 	be_unmap_pci_bars(adapter);
 	be_drv_cleanup(adapter);
 
-	pci_disable_pcie_error_reporting(pdev);
-
 	pci_release_regions(pdev);
 	pci_disable_device(pdev);
 
@@ -5845,10 +5842,6 @@ static int be_probe(struct pci_dev *pdev, const struct pci_device_id *pdev_id)
 		goto free_netdev;
 	}
 
-	status = pci_enable_pcie_error_reporting(pdev);
-	if (!status)
-		dev_info(&pdev->dev, "PCIe error reporting enabled\n");
-
 	status = be_map_pci_bars(adapter);
 	if (status)
 		goto free_netdev;
@@ -5893,7 +5886,6 @@ drv_cleanup:
 unmap_bars:
 	be_unmap_pci_bars(adapter);
 free_netdev:
-	pci_disable_pcie_error_reporting(pdev);
 	free_netdev(netdev);
 rel_reg:
 	pci_release_regions(pdev);
diff --git a/drivers/net/ethernet/engleder/tsnep.h b/drivers/net/ethernet/engleder/tsnep.h
index 058c2bcf31a7..11b29f56aaf9 100644
--- a/drivers/net/ethernet/engleder/tsnep.h
+++ b/drivers/net/ethernet/engleder/tsnep.h
@@ -18,6 +18,7 @@
 #define TSNEP "tsnep"
 
 #define TSNEP_RING_SIZE 256
+#define TSNEP_RING_MASK (TSNEP_RING_SIZE - 1)
 #define TSNEP_RING_RX_REFILL 16
 #define TSNEP_RING_RX_REUSE (TSNEP_RING_SIZE - TSNEP_RING_SIZE / 4)
 #define TSNEP_RING_ENTRIES_PER_PAGE (PAGE_SIZE / TSNEP_DESC_SIZE)
@@ -69,6 +70,7 @@ struct tsnep_tx_entry {
 	union {
 		struct sk_buff *skb;
 		struct xdp_frame *xdpf;
+		bool zc;
 	};
 	size_t len;
 	DEFINE_DMA_UNMAP_ADDR(dma);
@@ -87,6 +89,7 @@ struct tsnep_tx {
 	int read;
 	u32 owner_counter;
 	int increment_owner_counter;
+	struct xsk_buff_pool *xsk_pool;
 
 	u32 packets;
 	u32 bytes;
@@ -100,7 +103,10 @@ struct tsnep_rx_entry {
 
 	u32 properties;
 
-	struct page *page;
+	union {
+		struct page *page;
+		struct xdp_buff *xdp;
+	};
 	size_t len;
 	dma_addr_t dma;
 };
@@ -120,6 +126,9 @@ struct tsnep_rx {
 	u32 owner_counter;
 	int increment_owner_counter;
 	struct page_pool *page_pool;
+	struct page **page_buffer;
+	struct xsk_buff_pool *xsk_pool;
+	struct xdp_buff **xdp_batch;
 
 	u32 packets;
 	u32 bytes;
@@ -128,6 +137,7 @@ struct tsnep_rx {
 	u32 alloc_failed;
 
 	struct xdp_rxq_info xdp_rxq;
+	struct xdp_rxq_info xdp_rxq_zc;
 };
 
 struct tsnep_queue {
@@ -213,6 +223,8 @@ int tsnep_rxnfc_del_rule(struct tsnep_adapter *adapter,
 
 int tsnep_xdp_setup_prog(struct tsnep_adapter *adapter, struct bpf_prog *prog,
 			 struct netlink_ext_ack *extack);
+int tsnep_xdp_setup_pool(struct tsnep_adapter *adapter,
+			 struct xsk_buff_pool *pool, u16 queue_id);
 
 #if IS_ENABLED(CONFIG_TSNEP_SELFTESTS)
 int tsnep_ethtool_get_test_count(void);
@@ -241,5 +253,7 @@ static inline void tsnep_ethtool_self_test(struct net_device *dev,
 void tsnep_get_system_time(struct tsnep_adapter *adapter, u64 *time);
 int tsnep_set_irq_coalesce(struct tsnep_queue *queue, u32 usecs);
 u32 tsnep_get_irq_coalesce(struct tsnep_queue *queue);
+int tsnep_enable_xsk(struct tsnep_queue *queue, struct xsk_buff_pool *pool);
+void tsnep_disable_xsk(struct tsnep_queue *queue);
 
 #endif /* _TSNEP_H */
diff --git a/drivers/net/ethernet/engleder/tsnep_main.c b/drivers/net/ethernet/engleder/tsnep_main.c
index 6982aaa928b5..84751bb303a6 100644
--- a/drivers/net/ethernet/engleder/tsnep_main.c
+++ b/drivers/net/ethernet/engleder/tsnep_main.c
@@ -28,11 +28,16 @@
 #include <linux/iopoll.h>
 #include <linux/bpf.h>
 #include <linux/bpf_trace.h>
+#include <net/xdp_sock_drv.h>
 
 #define TSNEP_RX_OFFSET (max(NET_SKB_PAD, XDP_PACKET_HEADROOM) + NET_IP_ALIGN)
 #define TSNEP_HEADROOM ALIGN(TSNEP_RX_OFFSET, 4)
 #define TSNEP_MAX_RX_BUF_SIZE (PAGE_SIZE - TSNEP_HEADROOM - \
 			       SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
+/* XSK buffer shall store at least Q-in-Q frame */
+#define TSNEP_XSK_RX_BUF_SIZE (ALIGN(TSNEP_RX_INLINE_METADATA_SIZE + \
+				     ETH_FRAME_LEN + ETH_FCS_LEN + \
+				     VLAN_HLEN * 2, 4))
 
 #ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
 #define DMA_ADDR_HIGH(dma_addr) ((u32)(((dma_addr) >> 32) & 0xFFFFFFFF))
@@ -49,6 +54,8 @@
 #define TSNEP_TX_TYPE_SKB_FRAG BIT(1)
 #define TSNEP_TX_TYPE_XDP_TX BIT(2)
 #define TSNEP_TX_TYPE_XDP_NDO BIT(3)
+#define TSNEP_TX_TYPE_XDP (TSNEP_TX_TYPE_XDP_TX | TSNEP_TX_TYPE_XDP_NDO)
+#define TSNEP_TX_TYPE_XSK BIT(4)
 
 #define TSNEP_XDP_TX BIT(0)
 #define TSNEP_XDP_REDIRECT BIT(1)
@@ -246,7 +253,6 @@ static void tsnep_phy_close(struct tsnep_adapter *adapter)
 {
 	phy_stop(adapter->netdev->phydev);
 	phy_disconnect(adapter->netdev->phydev);
-	adapter->netdev->phydev = NULL;
 }
 
 static void tsnep_tx_ring_cleanup(struct tsnep_tx *tx)
@@ -266,7 +272,7 @@ static void tsnep_tx_ring_cleanup(struct tsnep_tx *tx)
 	}
 }
 
-static int tsnep_tx_ring_init(struct tsnep_tx *tx)
+static int tsnep_tx_ring_create(struct tsnep_tx *tx)
 {
 	struct device *dmadev = tx->adapter->dmadev;
 	struct tsnep_tx_entry *entry;
@@ -289,11 +295,12 @@ static int tsnep_tx_ring_init(struct tsnep_tx *tx)
 			entry->desc = (struct tsnep_tx_desc *)
 				(((u8 *)entry->desc_wb) + TSNEP_DESC_OFFSET);
 			entry->desc_dma = tx->page_dma[i] + TSNEP_DESC_SIZE * j;
+			entry->owner_user_flag = false;
 		}
 	}
 	for (i = 0; i < TSNEP_RING_SIZE; i++) {
 		entry = &tx->entry[i];
-		next_entry = &tx->entry[(i + 1) % TSNEP_RING_SIZE];
+		next_entry = &tx->entry[(i + 1) & TSNEP_RING_MASK];
 		entry->desc->next = __cpu_to_le64(next_entry->desc_dma);
 	}
 
@@ -304,13 +311,60 @@ alloc_failed:
 	return retval;
 }
 
+static void tsnep_tx_init(struct tsnep_tx *tx)
+{
+	dma_addr_t dma;
+
+	dma = tx->entry[0].desc_dma | TSNEP_RESET_OWNER_COUNTER;
+	iowrite32(DMA_ADDR_LOW(dma), tx->addr + TSNEP_TX_DESC_ADDR_LOW);
+	iowrite32(DMA_ADDR_HIGH(dma), tx->addr + TSNEP_TX_DESC_ADDR_HIGH);
+	tx->write = 0;
+	tx->read = 0;
+	tx->owner_counter = 1;
+	tx->increment_owner_counter = TSNEP_RING_SIZE - 1;
+}
+
+static void tsnep_tx_enable(struct tsnep_tx *tx)
+{
+	struct netdev_queue *nq;
+
+	nq = netdev_get_tx_queue(tx->adapter->netdev, tx->queue_index);
+
+	__netif_tx_lock_bh(nq);
+	netif_tx_wake_queue(nq);
+	__netif_tx_unlock_bh(nq);
+}
+
+static void tsnep_tx_disable(struct tsnep_tx *tx, struct napi_struct *napi)
+{
+	struct netdev_queue *nq;
+	u32 val;
+
+	nq = netdev_get_tx_queue(tx->adapter->netdev, tx->queue_index);
+
+	__netif_tx_lock_bh(nq);
+	netif_tx_stop_queue(nq);
+	__netif_tx_unlock_bh(nq);
+
+	/* wait until TX is done in hardware */
+	readx_poll_timeout(ioread32, tx->addr + TSNEP_CONTROL, val,
+			   ((val & TSNEP_CONTROL_TX_ENABLE) == 0), 10000,
+			   1000000);
+
+	/* wait until TX is also done in software */
+	while (READ_ONCE(tx->read) != tx->write) {
+		napi_schedule(napi);
+		napi_synchronize(napi);
+	}
+}
+
 static void tsnep_tx_activate(struct tsnep_tx *tx, int index, int length,
 			      bool last)
 {
 	struct tsnep_tx_entry *entry = &tx->entry[index];
 
 	entry->properties = 0;
-	/* xdpf is union with skb */
+	/* xdpf and zc are union with skb */
 	if (entry->skb) {
 		entry->properties = length & TSNEP_DESC_LENGTH_MASK;
 		entry->properties |= TSNEP_DESC_INTERRUPT_FLAG;
@@ -382,7 +436,7 @@ static int tsnep_tx_map(struct sk_buff *skb, struct tsnep_tx *tx, int count)
 	int i;
 
 	for (i = 0; i < count; i++) {
-		entry = &tx->entry[(tx->write + i) % TSNEP_RING_SIZE];
+		entry = &tx->entry[(tx->write + i) & TSNEP_RING_MASK];
 
 		if (!i) {
 			len = skb_headlen(skb);
@@ -420,7 +474,7 @@ static int tsnep_tx_unmap(struct tsnep_tx *tx, int index, int count)
 	int i;
 
 	for (i = 0; i < count; i++) {
-		entry = &tx->entry[(index + i) % TSNEP_RING_SIZE];
+		entry = &tx->entry[(index + i) & TSNEP_RING_MASK];
 
 		if (entry->len) {
 			if (entry->type & TSNEP_TX_TYPE_SKB)
@@ -482,9 +536,9 @@ static netdev_tx_t tsnep_xmit_frame_ring(struct sk_buff *skb,
 		skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
 
 	for (i = 0; i < count; i++)
-		tsnep_tx_activate(tx, (tx->write + i) % TSNEP_RING_SIZE, length,
+		tsnep_tx_activate(tx, (tx->write + i) & TSNEP_RING_MASK, length,
 				  i == count - 1);
-	tx->write = (tx->write + count) % TSNEP_RING_SIZE;
+	tx->write = (tx->write + count) & TSNEP_RING_MASK;
 
 	skb_tx_timestamp(skb);
 
@@ -517,7 +571,7 @@ static int tsnep_xdp_tx_map(struct xdp_frame *xdpf, struct tsnep_tx *tx,
 	frag = NULL;
 	len = xdpf->len;
 	for (i = 0; i < count; i++) {
-		entry = &tx->entry[(tx->write + i) % TSNEP_RING_SIZE];
+		entry = &tx->entry[(tx->write + i) & TSNEP_RING_MASK];
 		if (type & TSNEP_TX_TYPE_XDP_NDO) {
 			data = unlikely(frag) ? skb_frag_address(frag) :
 						xdpf->data;
@@ -590,9 +644,9 @@ static bool tsnep_xdp_xmit_frame_ring(struct xdp_frame *xdpf,
 	length = retval;
 
 	for (i = 0; i < count; i++)
-		tsnep_tx_activate(tx, (tx->write + i) % TSNEP_RING_SIZE, length,
+		tsnep_tx_activate(tx, (tx->write + i) & TSNEP_RING_MASK, length,
 				  i == count - 1);
-	tx->write = (tx->write + count) % TSNEP_RING_SIZE;
+	tx->write = (tx->write + count) & TSNEP_RING_MASK;
 
 	/* descriptor properties shall be valid before hardware is notified */
 	dma_wmb();
@@ -628,10 +682,69 @@ static bool tsnep_xdp_xmit_back(struct tsnep_adapter *adapter,
 	return xmit;
 }
 
+static int tsnep_xdp_tx_map_zc(struct xdp_desc *xdpd, struct tsnep_tx *tx)
+{
+	struct tsnep_tx_entry *entry;
+	dma_addr_t dma;
+
+	entry = &tx->entry[tx->write];
+	entry->zc = true;
+
+	dma = xsk_buff_raw_get_dma(tx->xsk_pool, xdpd->addr);
+	xsk_buff_raw_dma_sync_for_device(tx->xsk_pool, dma, xdpd->len);
+
+	entry->type = TSNEP_TX_TYPE_XSK;
+	entry->len = xdpd->len;
+
+	entry->desc->tx = __cpu_to_le64(dma);
+
+	return xdpd->len;
+}
+
+static void tsnep_xdp_xmit_frame_ring_zc(struct xdp_desc *xdpd,
+					 struct tsnep_tx *tx)
+{
+	int length;
+
+	length = tsnep_xdp_tx_map_zc(xdpd, tx);
+
+	tsnep_tx_activate(tx, tx->write, length, true);
+	tx->write = (tx->write + 1) & TSNEP_RING_MASK;
+}
+
+static void tsnep_xdp_xmit_zc(struct tsnep_tx *tx)
+{
+	int desc_available = tsnep_tx_desc_available(tx);
+	struct xdp_desc *descs = tx->xsk_pool->tx_descs;
+	int batch, i;
+
+	/* ensure that TX ring is not filled up by XDP, always MAX_SKB_FRAGS
+	 * will be available for normal TX path and queue is stopped there if
+	 * necessary
+	 */
+	if (desc_available <= (MAX_SKB_FRAGS + 1))
+		return;
+	desc_available -= MAX_SKB_FRAGS + 1;
+
+	batch = xsk_tx_peek_release_desc_batch(tx->xsk_pool, desc_available);
+	for (i = 0; i < batch; i++)
+		tsnep_xdp_xmit_frame_ring_zc(&descs[i], tx);
+
+	if (batch) {
+		/* descriptor properties shall be valid before hardware is
+		 * notified
+		 */
+		dma_wmb();
+
+		tsnep_xdp_xmit_flush(tx);
+	}
+}
+
 static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
 {
 	struct tsnep_tx_entry *entry;
 	struct netdev_queue *nq;
+	int xsk_frames = 0;
 	int budget = 128;
 	int length;
 	int count;
@@ -658,7 +771,7 @@ static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
 		if ((entry->type & TSNEP_TX_TYPE_SKB) &&
 		    skb_shinfo(entry->skb)->nr_frags > 0)
 			count += skb_shinfo(entry->skb)->nr_frags;
-		else if (!(entry->type & TSNEP_TX_TYPE_SKB) &&
+		else if ((entry->type & TSNEP_TX_TYPE_XDP) &&
 			 xdp_frame_has_frags(entry->xdpf))
 			count += xdp_get_shared_info_from_frame(entry->xdpf)->nr_frags;
 
@@ -687,12 +800,14 @@ static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
 
 		if (entry->type & TSNEP_TX_TYPE_SKB)
 			napi_consume_skb(entry->skb, napi_budget);
-		else
+		else if (entry->type & TSNEP_TX_TYPE_XDP)
 			xdp_return_frame_rx_napi(entry->xdpf);
-		/* xdpf is union with skb */
+		else
+			xsk_frames++;
+		/* xdpf and zc are union with skb */
 		entry->skb = NULL;
 
-		tx->read = (tx->read + count) % TSNEP_RING_SIZE;
+		tx->read = (tx->read + count) & TSNEP_RING_MASK;
 
 		tx->packets++;
 		tx->bytes += length + ETH_FCS_LEN;
@@ -700,6 +815,14 @@ static bool tsnep_tx_poll(struct tsnep_tx *tx, int napi_budget)
 		budget--;
 	} while (likely(budget));
 
+	if (tx->xsk_pool) {
+		if (xsk_frames)
+			xsk_tx_completed(tx->xsk_pool, xsk_frames);
+		if (xsk_uses_need_wakeup(tx->xsk_pool))
+			xsk_set_tx_need_wakeup(tx->xsk_pool);
+		tsnep_xdp_xmit_zc(tx);
+	}
+
 	if ((tsnep_tx_desc_available(tx) >= ((MAX_SKB_FRAGS + 1) * 2)) &&
 	    netif_tx_queue_stopped(nq)) {
 		netif_tx_wake_queue(nq);
@@ -732,38 +855,21 @@ static bool tsnep_tx_pending(struct tsnep_tx *tx)
 	return pending;
 }
 
-static int tsnep_tx_open(struct tsnep_adapter *adapter, void __iomem *addr,
-			 int queue_index, struct tsnep_tx *tx)
+static int tsnep_tx_open(struct tsnep_tx *tx)
 {
-	dma_addr_t dma;
 	int retval;
 
-	memset(tx, 0, sizeof(*tx));
-	tx->adapter = adapter;
-	tx->addr = addr;
-	tx->queue_index = queue_index;
-
-	retval = tsnep_tx_ring_init(tx);
+	retval = tsnep_tx_ring_create(tx);
 	if (retval)
 		return retval;
 
-	dma = tx->entry[0].desc_dma | TSNEP_RESET_OWNER_COUNTER;
-	iowrite32(DMA_ADDR_LOW(dma), tx->addr + TSNEP_TX_DESC_ADDR_LOW);
-	iowrite32(DMA_ADDR_HIGH(dma), tx->addr + TSNEP_TX_DESC_ADDR_HIGH);
-	tx->owner_counter = 1;
-	tx->increment_owner_counter = TSNEP_RING_SIZE - 1;
+	tsnep_tx_init(tx);
 
 	return 0;
 }
 
 static void tsnep_tx_close(struct tsnep_tx *tx)
 {
-	u32 val;
-
-	readx_poll_timeout(ioread32, tx->addr + TSNEP_CONTROL, val,
-			   ((val & TSNEP_CONTROL_TX_ENABLE) == 0), 10000,
-			   1000000);
-
 	tsnep_tx_ring_cleanup(tx);
 }
 
@@ -775,9 +881,12 @@ static void tsnep_rx_ring_cleanup(struct tsnep_rx *rx)
 
 	for (i = 0; i < TSNEP_RING_SIZE; i++) {
 		entry = &rx->entry[i];
-		if (entry->page)
+		if (!rx->xsk_pool && entry->page)
 			page_pool_put_full_page(rx->page_pool, entry->page,
 						false);
+		if (rx->xsk_pool && entry->xdp)
+			xsk_buff_free(entry->xdp);
+		/* xdp is union with page */
 		entry->page = NULL;
 	}
 
@@ -796,7 +905,7 @@ static void tsnep_rx_ring_cleanup(struct tsnep_rx *rx)
 	}
 }
 
-static int tsnep_rx_ring_init(struct tsnep_rx *rx)
+static int tsnep_rx_ring_create(struct tsnep_rx *rx)
 {
 	struct device *dmadev = rx->adapter->dmadev;
 	struct tsnep_rx_entry *entry;
@@ -840,7 +949,7 @@ static int tsnep_rx_ring_init(struct tsnep_rx *rx)
 
 	for (i = 0; i < TSNEP_RING_SIZE; i++) {
 		entry = &rx->entry[i];
-		next_entry = &rx->entry[(i + 1) % TSNEP_RING_SIZE];
+		next_entry = &rx->entry[(i + 1) & TSNEP_RING_MASK];
 		entry->desc->next = __cpu_to_le64(next_entry->desc_dma);
 	}
 
@@ -851,6 +960,37 @@ failed:
 	return retval;
 }
 
+static void tsnep_rx_init(struct tsnep_rx *rx)
+{
+	dma_addr_t dma;
+
+	dma = rx->entry[0].desc_dma | TSNEP_RESET_OWNER_COUNTER;
+	iowrite32(DMA_ADDR_LOW(dma), rx->addr + TSNEP_RX_DESC_ADDR_LOW);
+	iowrite32(DMA_ADDR_HIGH(dma), rx->addr + TSNEP_RX_DESC_ADDR_HIGH);
+	rx->write = 0;
+	rx->read = 0;
+	rx->owner_counter = 1;
+	rx->increment_owner_counter = TSNEP_RING_SIZE - 1;
+}
+
+static void tsnep_rx_enable(struct tsnep_rx *rx)
+{
+	/* descriptor properties shall be valid before hardware is notified */
+	dma_wmb();
+
+	iowrite32(TSNEP_CONTROL_RX_ENABLE, rx->addr + TSNEP_CONTROL);
+}
+
+static void tsnep_rx_disable(struct tsnep_rx *rx)
+{
+	u32 val;
+
+	iowrite32(TSNEP_CONTROL_RX_DISABLE, rx->addr + TSNEP_CONTROL);
+	readx_poll_timeout(ioread32, rx->addr + TSNEP_CONTROL, val,
+			   ((val & TSNEP_CONTROL_RX_ENABLE) == 0), 10000,
+			   1000000);
+}
+
 static int tsnep_rx_desc_available(struct tsnep_rx *rx)
 {
 	if (rx->read <= rx->write)
@@ -859,6 +999,40 @@ static int tsnep_rx_desc_available(struct tsnep_rx *rx)
 	return rx->read - rx->write - 1;
 }
 
+static void tsnep_rx_free_page_buffer(struct tsnep_rx *rx)
+{
+	struct page **page;
+
+	/* last entry of page_buffer is always zero, because ring cannot be
+	 * filled completely
+	 */
+	page = rx->page_buffer;
+	while (*page) {
+		page_pool_put_full_page(rx->page_pool, *page, false);
+		*page = NULL;
+		page++;
+	}
+}
+
+static int tsnep_rx_alloc_page_buffer(struct tsnep_rx *rx)
+{
+	int i;
+
+	/* alloc for all ring entries except the last one, because ring cannot
+	 * be filled completely
+	 */
+	for (i = 0; i < TSNEP_RING_SIZE - 1; i++) {
+		rx->page_buffer[i] = page_pool_dev_alloc_pages(rx->page_pool);
+		if (!rx->page_buffer[i]) {
+			tsnep_rx_free_page_buffer(rx);
+
+			return -ENOMEM;
+		}
+	}
+
+	return 0;
+}
+
 static void tsnep_rx_set_page(struct tsnep_rx *rx, struct tsnep_rx_entry *entry,
 			      struct page *page)
 {
@@ -894,7 +1068,7 @@ static void tsnep_rx_activate(struct tsnep_rx *rx, int index)
 {
 	struct tsnep_rx_entry *entry = &rx->entry[index];
 
-	/* TSNEP_MAX_RX_BUF_SIZE is a multiple of 4 */
+	/* TSNEP_MAX_RX_BUF_SIZE and TSNEP_XSK_RX_BUF_SIZE are multiple of 4 */
 	entry->properties = entry->len & TSNEP_DESC_LENGTH_MASK;
 	entry->properties |= TSNEP_DESC_INTERRUPT_FLAG;
 	if (index == rx->increment_owner_counter) {
@@ -917,19 +1091,15 @@ static void tsnep_rx_activate(struct tsnep_rx *rx, int index)
 	entry->desc->properties =
__cpu_to_le32(entry->properties); } -static int tsnep_rx_refill(struct tsnep_rx *rx, int count, bool reuse) +static int tsnep_rx_alloc(struct tsnep_rx *rx, int count, bool reuse) { - int index; bool alloc_failed = false; - bool enable = false; - int i; - int retval; + int i, index; for (i = 0; i < count && !alloc_failed; i++) { - index = (rx->write + i) % TSNEP_RING_SIZE; + index = (rx->write + i) & TSNEP_RING_MASK; - retval = tsnep_rx_alloc_buffer(rx, index); - if (unlikely(retval)) { + if (unlikely(tsnep_rx_alloc_buffer(rx, index))) { rx->alloc_failed++; alloc_failed = true; @@ -941,24 +1111,95 @@ static int tsnep_rx_refill(struct tsnep_rx *rx, int count, bool reuse) } tsnep_rx_activate(rx, index); - - enable = true; } - if (enable) { - rx->write = (rx->write + i) % TSNEP_RING_SIZE; + if (i) + rx->write = (rx->write + i) & TSNEP_RING_MASK; - /* descriptor properties shall be valid before hardware is - * notified - */ - dma_wmb(); + return i; +} + +static int tsnep_rx_refill(struct tsnep_rx *rx, int count, bool reuse) +{ + int desc_refilled; + + desc_refilled = tsnep_rx_alloc(rx, count, reuse); + if (desc_refilled) + tsnep_rx_enable(rx); + + return desc_refilled; +} + +static void tsnep_rx_set_xdp(struct tsnep_rx *rx, struct tsnep_rx_entry *entry, + struct xdp_buff *xdp) +{ + entry->xdp = xdp; + entry->len = TSNEP_XSK_RX_BUF_SIZE; + entry->dma = xsk_buff_xdp_get_dma(entry->xdp); + entry->desc->rx = __cpu_to_le64(entry->dma); +} - iowrite32(TSNEP_CONTROL_RX_ENABLE, rx->addr + TSNEP_CONTROL); +static void tsnep_rx_reuse_buffer_zc(struct tsnep_rx *rx, int index) +{ + struct tsnep_rx_entry *entry = &rx->entry[index]; + struct tsnep_rx_entry *read = &rx->entry[rx->read]; + + tsnep_rx_set_xdp(rx, entry, read->xdp); + read->xdp = NULL; +} + +static int tsnep_rx_alloc_zc(struct tsnep_rx *rx, int count, bool reuse) +{ + u32 allocated; + int i; + + allocated = xsk_buff_alloc_batch(rx->xsk_pool, rx->xdp_batch, count); + for (i = 0; i < allocated; i++) { + int index = 
(rx->write + i) & TSNEP_RING_MASK; + struct tsnep_rx_entry *entry = &rx->entry[index]; + + tsnep_rx_set_xdp(rx, entry, rx->xdp_batch[i]); + tsnep_rx_activate(rx, index); + } + if (i == 0) { + rx->alloc_failed++; + + if (reuse) { + tsnep_rx_reuse_buffer_zc(rx, rx->write); + tsnep_rx_activate(rx, rx->write); + } } + if (i) + rx->write = (rx->write + i) & TSNEP_RING_MASK; + return i; } +static void tsnep_rx_free_zc(struct tsnep_rx *rx) +{ + int i; + + for (i = 0; i < TSNEP_RING_SIZE; i++) { + struct tsnep_rx_entry *entry = &rx->entry[i]; + + if (entry->xdp) + xsk_buff_free(entry->xdp); + entry->xdp = NULL; + } +} + +static int tsnep_rx_refill_zc(struct tsnep_rx *rx, int count, bool reuse) +{ + int desc_refilled; + + desc_refilled = tsnep_rx_alloc_zc(rx, count, reuse); + if (desc_refilled) + tsnep_rx_enable(rx); + + return desc_refilled; +} + static bool tsnep_xdp_run_prog(struct tsnep_rx *rx, struct bpf_prog *prog, struct xdp_buff *xdp, int *status, struct netdev_queue *tx_nq, struct tsnep_tx *tx) @@ -970,11 +1211,6 @@ static bool tsnep_xdp_run_prog(struct tsnep_rx *rx, struct bpf_prog *prog, length = xdp->data_end - xdp->data_hard_start - XDP_PACKET_HEADROOM; act = bpf_prog_run_xdp(prog, xdp); - - /* Due xdp_adjust_tail: DMA sync for_device cover max len CPU touch */ - sync = xdp->data_end - xdp->data_hard_start - XDP_PACKET_HEADROOM; - sync = max(sync, length); - switch (act) { case XDP_PASS: return false; @@ -996,12 +1232,56 @@ out_failure: trace_xdp_exception(rx->adapter->netdev, prog, act); fallthrough; case XDP_DROP: + /* Due xdp_adjust_tail: DMA sync for_device cover max len CPU + * touch + */ + sync = xdp->data_end - xdp->data_hard_start - + XDP_PACKET_HEADROOM; + sync = max(sync, length); page_pool_put_page(rx->page_pool, virt_to_head_page(xdp->data), sync, true); return true; } } +static bool tsnep_xdp_run_prog_zc(struct tsnep_rx *rx, struct bpf_prog *prog, + struct xdp_buff *xdp, int *status, + struct netdev_queue *tx_nq, + struct tsnep_tx *tx) +{ + u32 
act; + + act = bpf_prog_run_xdp(prog, xdp); + + /* XDP_REDIRECT is the main action for zero-copy */ + if (likely(act == XDP_REDIRECT)) { + if (xdp_do_redirect(rx->adapter->netdev, xdp, prog) < 0) + goto out_failure; + *status |= TSNEP_XDP_REDIRECT; + return true; + } + + switch (act) { + case XDP_PASS: + return false; + case XDP_TX: + if (!tsnep_xdp_xmit_back(rx->adapter, xdp, tx_nq, tx)) + goto out_failure; + *status |= TSNEP_XDP_TX; + return true; + default: + bpf_warn_invalid_xdp_action(rx->adapter->netdev, prog, act); + fallthrough; + case XDP_ABORTED: +out_failure: + trace_xdp_exception(rx->adapter->netdev, prog, act); + fallthrough; + case XDP_DROP: + xsk_buff_free(xdp); + return true; + } +} + static void tsnep_finalize_xdp(struct tsnep_adapter *adapter, int status, struct netdev_queue *tx_nq, struct tsnep_tx *tx) { @@ -1046,6 +1326,28 @@ static struct sk_buff *tsnep_build_skb(struct tsnep_rx *rx, struct page *page, return skb; } +static void tsnep_rx_page(struct tsnep_rx *rx, struct napi_struct *napi, + struct page *page, int length) +{ + struct sk_buff *skb; + + skb = tsnep_build_skb(rx, page, length); + if (skb) { + page_pool_release_page(rx->page_pool, page); + + rx->packets++; + rx->bytes += length; + if (skb->pkt_type == PACKET_MULTICAST) + rx->multicast++; + + napi_gro_receive(napi, skb); + } else { + page_pool_recycle_direct(rx->page_pool, page); + + rx->dropped++; + } +} + static int tsnep_rx_poll(struct tsnep_rx *rx, struct napi_struct *napi, int budget) { @@ -1055,7 +1357,6 @@ static int tsnep_rx_poll(struct tsnep_rx *rx, struct napi_struct *napi, struct netdev_queue *tx_nq; struct bpf_prog *prog; struct xdp_buff xdp; - struct sk_buff *skb; struct tsnep_tx *tx; int desc_available; int xdp_status = 0; @@ -1091,7 +1392,7 @@ static int tsnep_rx_poll(struct tsnep_rx *rx, struct napi_struct *napi, * empty RX ring, thus buffer cannot be used for * RX processing */ - rx->read = (rx->read + 1) % TSNEP_RING_SIZE; + rx->read = (rx->read + 1) & 
TSNEP_RING_MASK; desc_available++; rx->dropped++; @@ -1118,7 +1419,7 @@ static int tsnep_rx_poll(struct tsnep_rx *rx, struct napi_struct *napi, */ length -= TSNEP_RX_INLINE_METADATA_SIZE; - rx->read = (rx->read + 1) % TSNEP_RING_SIZE; + rx->read = (rx->read + 1) & TSNEP_RING_MASK; desc_available++; if (prog) { @@ -1140,31 +1441,135 @@ static int tsnep_rx_poll(struct tsnep_rx *rx, struct napi_struct *napi, } } - skb = tsnep_build_skb(rx, entry->page, length); - if (skb) { - page_pool_release_page(rx->page_pool, entry->page); + tsnep_rx_page(rx, napi, entry->page, length); + entry->page = NULL; + } + + if (xdp_status) + tsnep_finalize_xdp(rx->adapter, xdp_status, tx_nq, tx); + + if (desc_available) + tsnep_rx_refill(rx, desc_available, false); - rx->packets++; - rx->bytes += length; - if (skb->pkt_type == PACKET_MULTICAST) - rx->multicast++; + return done; +} - napi_gro_receive(napi, skb); - } else { - page_pool_recycle_direct(rx->page_pool, entry->page); +static int tsnep_rx_poll_zc(struct tsnep_rx *rx, struct napi_struct *napi, + int budget) +{ + struct tsnep_rx_entry *entry; + struct netdev_queue *tx_nq; + struct bpf_prog *prog; + struct tsnep_tx *tx; + int desc_available; + int xdp_status = 0; + struct page *page; + int done = 0; + int length; + + desc_available = tsnep_rx_desc_available(rx); + prog = READ_ONCE(rx->adapter->xdp_prog); + if (prog) { + tx_nq = netdev_get_tx_queue(rx->adapter->netdev, + rx->tx_queue_index); + tx = &rx->adapter->tx[rx->tx_queue_index]; + } + + while (likely(done < budget) && (rx->read != rx->write)) { + entry = &rx->entry[rx->read]; + if ((__le32_to_cpu(entry->desc_wb->properties) & + TSNEP_DESC_OWNER_COUNTER_MASK) != + (entry->properties & TSNEP_DESC_OWNER_COUNTER_MASK)) + break; + done++; + + if (desc_available >= TSNEP_RING_RX_REFILL) { + bool reuse = desc_available >= TSNEP_RING_RX_REUSE; + + desc_available -= tsnep_rx_refill_zc(rx, desc_available, + reuse); + if (!entry->xdp) { + /* buffer has been reused for refill to prevent + 
* empty RX ring, thus buffer cannot be used for + * RX processing + */ + rx->read = (rx->read + 1) & TSNEP_RING_MASK; + desc_available++; + + rx->dropped++; + + continue; + } + } + + /* descriptor properties shall be read first, because valid data + * is signaled there + */ + dma_rmb(); + + prefetch(entry->xdp->data); + length = __le32_to_cpu(entry->desc_wb->properties) & + TSNEP_DESC_LENGTH_MASK; + xsk_buff_set_size(entry->xdp, length); + xsk_buff_dma_sync_for_cpu(entry->xdp, rx->xsk_pool); + + /* RX metadata with timestamps is in front of actual data, + * subtract metadata size to get length of actual data and + * consider metadata size as offset of actual data during RX + * processing + */ + length -= TSNEP_RX_INLINE_METADATA_SIZE; + + rx->read = (rx->read + 1) & TSNEP_RING_MASK; + desc_available++; + + if (prog) { + bool consume; + + entry->xdp->data += TSNEP_RX_INLINE_METADATA_SIZE; + entry->xdp->data_meta += TSNEP_RX_INLINE_METADATA_SIZE; + + consume = tsnep_xdp_run_prog_zc(rx, prog, entry->xdp, + &xdp_status, tx_nq, tx); + if (consume) { + rx->packets++; + rx->bytes += length; + + entry->xdp = NULL; + continue; + } + } + + page = page_pool_dev_alloc_pages(rx->page_pool); + if (page) { + memcpy(page_address(page) + TSNEP_RX_OFFSET, + entry->xdp->data - TSNEP_RX_INLINE_METADATA_SIZE, + length + TSNEP_RX_INLINE_METADATA_SIZE); + tsnep_rx_page(rx, napi, page, length); + } else { rx->dropped++; } - entry->page = NULL; + xsk_buff_free(entry->xdp); + entry->xdp = NULL; } if (xdp_status) tsnep_finalize_xdp(rx->adapter, xdp_status, tx_nq, tx); if (desc_available) - tsnep_rx_refill(rx, desc_available, false); + desc_available -= tsnep_rx_refill_zc(rx, desc_available, false); - return done; + if (xsk_uses_need_wakeup(rx->xsk_pool)) { + if (desc_available) + xsk_set_rx_need_wakeup(rx->xsk_pool); + else + xsk_clear_rx_need_wakeup(rx->xsk_pool); + + return done; + } + + return desc_available ? 
budget : done; } static bool tsnep_rx_pending(struct tsnep_rx *rx) @@ -1182,44 +1587,125 @@ static bool tsnep_rx_pending(struct tsnep_rx *rx) return false; } -static int tsnep_rx_open(struct tsnep_adapter *adapter, void __iomem *addr, - int queue_index, struct tsnep_rx *rx) +static int tsnep_rx_open(struct tsnep_rx *rx) { - dma_addr_t dma; + int desc_available; int retval; - memset(rx, 0, sizeof(*rx)); - rx->adapter = adapter; - rx->addr = addr; - rx->queue_index = queue_index; - - retval = tsnep_rx_ring_init(rx); + retval = tsnep_rx_ring_create(rx); if (retval) return retval; - dma = rx->entry[0].desc_dma | TSNEP_RESET_OWNER_COUNTER; - iowrite32(DMA_ADDR_LOW(dma), rx->addr + TSNEP_RX_DESC_ADDR_LOW); - iowrite32(DMA_ADDR_HIGH(dma), rx->addr + TSNEP_RX_DESC_ADDR_HIGH); - rx->owner_counter = 1; - rx->increment_owner_counter = TSNEP_RING_SIZE - 1; + tsnep_rx_init(rx); - tsnep_rx_refill(rx, tsnep_rx_desc_available(rx), false); + desc_available = tsnep_rx_desc_available(rx); + if (rx->xsk_pool) + retval = tsnep_rx_alloc_zc(rx, desc_available, false); + else + retval = tsnep_rx_alloc(rx, desc_available, false); + if (retval != desc_available) { + retval = -ENOMEM; + + goto alloc_failed; + } + + /* prealloc pages to prevent allocation failures when XSK pool is + * disabled at runtime + */ + if (rx->xsk_pool) { + retval = tsnep_rx_alloc_page_buffer(rx); + if (retval) + goto alloc_failed; + } return 0; + +alloc_failed: + tsnep_rx_ring_cleanup(rx); + return retval; } static void tsnep_rx_close(struct tsnep_rx *rx) { - u32 val; - - iowrite32(TSNEP_CONTROL_RX_DISABLE, rx->addr + TSNEP_CONTROL); - readx_poll_timeout(ioread32, rx->addr + TSNEP_CONTROL, val, - ((val & TSNEP_CONTROL_RX_ENABLE) == 0), 10000, - 1000000); + if (rx->xsk_pool) + tsnep_rx_free_page_buffer(rx); tsnep_rx_ring_cleanup(rx); } +static void tsnep_rx_reopen(struct tsnep_rx *rx) +{ + struct page **page = rx->page_buffer; + int i; + + tsnep_rx_init(rx); + + for (i = 0; i < TSNEP_RING_SIZE; i++) { + struct 
tsnep_rx_entry *entry = &rx->entry[i]; + + /* defined initial values for properties are required for + * correct owner counter checking + */ + entry->desc->properties = 0; + entry->desc_wb->properties = 0; + + /* prevent allocation failures by reusing kept pages */ + if (*page) { + tsnep_rx_set_page(rx, entry, *page); + tsnep_rx_activate(rx, rx->write); + rx->write++; + + *page = NULL; + page++; + } + } +} + +static void tsnep_rx_reopen_xsk(struct tsnep_rx *rx) +{ + struct page **page = rx->page_buffer; + u32 allocated; + int i; + + tsnep_rx_init(rx); + + /* alloc all ring entries except the last one, because ring cannot be + * filled completely, as many buffers as possible is enough as wakeup is + * done if new buffers are available + */ + allocated = xsk_buff_alloc_batch(rx->xsk_pool, rx->xdp_batch, + TSNEP_RING_SIZE - 1); + + for (i = 0; i < TSNEP_RING_SIZE; i++) { + struct tsnep_rx_entry *entry = &rx->entry[i]; + + /* keep pages to prevent allocation failures when xsk is + * disabled + */ + if (entry->page) { + *page = entry->page; + entry->page = NULL; + + page++; + } + + /* defined initial values for properties are required for + * correct owner counter checking + */ + entry->desc->properties = 0; + entry->desc_wb->properties = 0; + + if (allocated) { + tsnep_rx_set_xdp(rx, entry, + rx->xdp_batch[allocated - 1]); + tsnep_rx_activate(rx, rx->write); + rx->write++; + + allocated--; + } + } +} + static bool tsnep_pending(struct tsnep_queue *queue) { if (queue->tx && tsnep_tx_pending(queue->tx)) @@ -1242,7 +1728,9 @@ static int tsnep_poll(struct napi_struct *napi, int budget) complete = tsnep_tx_poll(queue->tx, budget); if (queue->rx) { - done = tsnep_rx_poll(queue->rx, napi, budget); + done = queue->rx->xsk_pool ? 
+ tsnep_rx_poll_zc(queue->rx, napi, budget) : + tsnep_rx_poll(queue->rx, napi, budget); if (done >= budget) complete = false; } @@ -1323,8 +1811,12 @@ static void tsnep_queue_close(struct tsnep_queue *queue, bool first) tsnep_free_irq(queue, first); - if (rx && xdp_rxq_info_is_reg(&rx->xdp_rxq)) - xdp_rxq_info_unreg(&rx->xdp_rxq); + if (rx) { + if (xdp_rxq_info_is_reg(&rx->xdp_rxq)) + xdp_rxq_info_unreg(&rx->xdp_rxq); + if (xdp_rxq_info_is_reg(&rx->xdp_rxq_zc)) + xdp_rxq_info_unreg(&rx->xdp_rxq_zc); + } netif_napi_del(&queue->napi); } @@ -1336,8 +1828,6 @@ static int tsnep_queue_open(struct tsnep_adapter *adapter, struct tsnep_tx *tx = queue->tx; int retval; - queue->adapter = adapter; - netif_napi_add(adapter->netdev, &queue->napi, tsnep_poll); if (rx) { @@ -1349,6 +1839,10 @@ static int tsnep_queue_open(struct tsnep_adapter *adapter, else rx->tx_queue_index = 0; + /* prepare both memory models to eliminate possible registration + * errors when memory model is switched between page pool and + * XSK pool during runtime + */ retval = xdp_rxq_info_reg(&rx->xdp_rxq, adapter->netdev, rx->queue_index, queue->napi.napi_id); if (retval) @@ -1358,6 +1852,17 @@ static int tsnep_queue_open(struct tsnep_adapter *adapter, rx->page_pool); if (retval) goto failed; + retval = xdp_rxq_info_reg(&rx->xdp_rxq_zc, adapter->netdev, + rx->queue_index, queue->napi.napi_id); + if (retval) + goto failed; + retval = xdp_rxq_info_reg_mem_model(&rx->xdp_rxq_zc, + MEM_TYPE_XSK_BUFF_POOL, + NULL); + if (retval) + goto failed; + if (rx->xsk_pool) + xsk_pool_set_rxq_info(rx->xsk_pool, &rx->xdp_rxq_zc); } retval = tsnep_request_irq(queue, first); @@ -1375,30 +1880,48 @@ failed: return retval; } +static void tsnep_queue_enable(struct tsnep_queue *queue) +{ + napi_enable(&queue->napi); + tsnep_enable_irq(queue->adapter, queue->irq_mask); + + if (queue->tx) + tsnep_tx_enable(queue->tx); + + if (queue->rx) + tsnep_rx_enable(queue->rx); +} + +static void tsnep_queue_disable(struct tsnep_queue *queue) 
+{ + if (queue->tx) + tsnep_tx_disable(queue->tx, &queue->napi); + + napi_disable(&queue->napi); + tsnep_disable_irq(queue->adapter, queue->irq_mask); + + /* disable RX after NAPI polling has been disabled, because RX can be + * enabled during NAPI polling + */ + if (queue->rx) + tsnep_rx_disable(queue->rx); +} + static int tsnep_netdev_open(struct net_device *netdev) { struct tsnep_adapter *adapter = netdev_priv(netdev); - int tx_queue_index = 0; - int rx_queue_index = 0; - void __iomem *addr; int i, retval; for (i = 0; i < adapter->num_queues; i++) { if (adapter->queue[i].tx) { - addr = adapter->addr + TSNEP_QUEUE(tx_queue_index); - retval = tsnep_tx_open(adapter, addr, tx_queue_index, - adapter->queue[i].tx); + retval = tsnep_tx_open(adapter->queue[i].tx); if (retval) goto failed; - tx_queue_index++; } if (adapter->queue[i].rx) { - addr = adapter->addr + TSNEP_QUEUE(rx_queue_index); - retval = tsnep_rx_open(adapter, addr, rx_queue_index, - adapter->queue[i].rx); + retval = tsnep_rx_open(adapter->queue[i].rx); if (retval) goto failed; - rx_queue_index++; } retval = tsnep_queue_open(adapter, &adapter->queue[i], i == 0); @@ -1420,11 +1943,8 @@ static int tsnep_netdev_open(struct net_device *netdev) if (retval) goto phy_failed; - for (i = 0; i < adapter->num_queues; i++) { - napi_enable(&adapter->queue[i].napi); - - tsnep_enable_irq(adapter, adapter->queue[i].irq_mask); - } + for (i = 0; i < adapter->num_queues; i++) + tsnep_queue_enable(&adapter->queue[i]); return 0; @@ -1451,9 +1971,7 @@ static int tsnep_netdev_close(struct net_device *netdev) tsnep_phy_close(adapter); for (i = 0; i < adapter->num_queues; i++) { - tsnep_disable_irq(adapter, adapter->queue[i].irq_mask); - - napi_disable(&adapter->queue[i].napi); + tsnep_queue_disable(&adapter->queue[i]); tsnep_queue_close(&adapter->queue[i], i == 0); @@ -1466,6 +1984,69 @@ static int tsnep_netdev_close(struct net_device *netdev) return 0; } +int tsnep_enable_xsk(struct tsnep_queue *queue, struct xsk_buff_pool 
*pool) +{ + bool running = netif_running(queue->adapter->netdev); + u32 frame_size; + + frame_size = xsk_pool_get_rx_frame_size(pool); + if (frame_size < TSNEP_XSK_RX_BUF_SIZE) + return -EOPNOTSUPP; + + queue->rx->page_buffer = kcalloc(TSNEP_RING_SIZE, + sizeof(*queue->rx->page_buffer), + GFP_KERNEL); + if (!queue->rx->page_buffer) + return -ENOMEM; + queue->rx->xdp_batch = kcalloc(TSNEP_RING_SIZE, + sizeof(*queue->rx->xdp_batch), + GFP_KERNEL); + if (!queue->rx->xdp_batch) { + kfree(queue->rx->page_buffer); + queue->rx->page_buffer = NULL; + + return -ENOMEM; + } + + xsk_pool_set_rxq_info(pool, &queue->rx->xdp_rxq_zc); + + if (running) + tsnep_queue_disable(queue); + + queue->tx->xsk_pool = pool; + queue->rx->xsk_pool = pool; + + if (running) { + tsnep_rx_reopen_xsk(queue->rx); + tsnep_queue_enable(queue); + } + + return 0; +} + +void tsnep_disable_xsk(struct tsnep_queue *queue) +{ + bool running = netif_running(queue->adapter->netdev); + + if (running) + tsnep_queue_disable(queue); + + tsnep_rx_free_zc(queue->rx); + + queue->rx->xsk_pool = NULL; + queue->tx->xsk_pool = NULL; + + if (running) { + tsnep_rx_reopen(queue->rx); + tsnep_queue_enable(queue); + } + + kfree(queue->rx->xdp_batch); + queue->rx->xdp_batch = NULL; + kfree(queue->rx->page_buffer); + queue->rx->page_buffer = NULL; +} + static netdev_tx_t tsnep_netdev_xmit_frame(struct sk_buff *skb, struct net_device *netdev) { @@ -1615,6 +2196,9 @@ static int tsnep_netdev_bpf(struct net_device *dev, struct netdev_bpf *bpf) switch (bpf->command) { case XDP_SETUP_PROG: return tsnep_xdp_setup_prog(adapter, bpf->prog, bpf->extack); + case XDP_SETUP_XSK_POOL: + return tsnep_xdp_setup_pool(adapter, bpf->xsk.pool, + bpf->xsk.queue_id); default: return -EOPNOTSUPP; } @@ -1669,6 +2253,24 @@ static int tsnep_netdev_xdp_xmit(struct net_device *dev, int n, return nxmit; } +static int tsnep_netdev_xsk_wakeup(struct net_device *dev, u32 queue_id, + u32 flags) +{ + struct tsnep_adapter *adapter = netdev_priv(dev); + struct 
tsnep_queue *queue;
+
+	if (queue_id >= adapter->num_rx_queues ||
+	    queue_id >= adapter->num_tx_queues)
+		return -EINVAL;
+
+	queue = &adapter->queue[queue_id];
+
+	if (!napi_if_scheduled_mark_missed(&queue->napi))
+		napi_schedule(&queue->napi);
+
+	return 0;
+}
+
 static const struct net_device_ops tsnep_netdev_ops = {
 	.ndo_open = tsnep_netdev_open,
 	.ndo_stop = tsnep_netdev_close,
@@ -1682,6 +2284,7 @@ static const struct net_device_ops tsnep_netdev_ops = {
 	.ndo_setup_tc = tsnep_tc_setup,
 	.ndo_bpf = tsnep_netdev_bpf,
 	.ndo_xdp_xmit = tsnep_netdev_xdp_xmit,
+	.ndo_xsk_wakeup = tsnep_netdev_xsk_wakeup,
 };
 
 static int tsnep_mac_init(struct tsnep_adapter *adapter)
@@ -1797,9 +2400,16 @@ static int tsnep_queue_init(struct tsnep_adapter *adapter, int queue_count)
 	adapter->num_tx_queues = 1;
 	adapter->num_rx_queues = 1;
 	adapter->num_queues = 1;
+	adapter->queue[0].adapter = adapter;
 	adapter->queue[0].irq = retval;
 	adapter->queue[0].tx = &adapter->tx[0];
+	adapter->queue[0].tx->adapter = adapter;
+	adapter->queue[0].tx->addr = adapter->addr + TSNEP_QUEUE(0);
+	adapter->queue[0].tx->queue_index = 0;
 	adapter->queue[0].rx = &adapter->rx[0];
+	adapter->queue[0].rx->adapter = adapter;
+	adapter->queue[0].rx->addr = adapter->addr + TSNEP_QUEUE(0);
+	adapter->queue[0].rx->queue_index = 0;
 	adapter->queue[0].irq_mask = irq_mask;
 	adapter->queue[0].irq_delay_addr = adapter->addr + ECM_INT_DELAY;
 	retval = tsnep_set_irq_coalesce(&adapter->queue[0],
@@ -1821,9 +2431,16 @@ static int tsnep_queue_init(struct tsnep_adapter *adapter, int queue_count)
 		adapter->num_tx_queues++;
 		adapter->num_rx_queues++;
 		adapter->num_queues++;
+		adapter->queue[i].adapter = adapter;
 		adapter->queue[i].irq = retval;
 		adapter->queue[i].tx = &adapter->tx[i];
+		adapter->queue[i].tx->adapter = adapter;
+		adapter->queue[i].tx->addr = adapter->addr + TSNEP_QUEUE(i);
+		adapter->queue[i].tx->queue_index = i;
 		adapter->queue[i].rx = &adapter->rx[i];
+		adapter->queue[i].rx->adapter = adapter;
+		adapter->queue[i].rx->addr =
adapter->addr + TSNEP_QUEUE(i);
+		adapter->queue[i].rx->queue_index = i;
 		adapter->queue[i].irq_mask =
 			irq_mask << (ECM_INT_TXRX_SHIFT * i);
 		adapter->queue[i].irq_delay_addr =
@@ -1928,7 +2545,8 @@ static int tsnep_probe(struct platform_device *pdev)
 
 	netdev->xdp_features = NETDEV_XDP_ACT_BASIC | NETDEV_XDP_ACT_REDIRECT |
 			       NETDEV_XDP_ACT_NDO_XMIT |
-			       NETDEV_XDP_ACT_NDO_XMIT_SG;
+			       NETDEV_XDP_ACT_NDO_XMIT_SG |
+			       NETDEV_XDP_ACT_XSK_ZEROCOPY;
 
 	/* carrier off reporting is important to ethtool even BEFORE open */
 	netif_carrier_off(netdev);
diff --git a/drivers/net/ethernet/engleder/tsnep_xdp.c b/drivers/net/ethernet/engleder/tsnep_xdp.c
index 4d14cb1fd772..c0513848c547 100644
--- a/drivers/net/ethernet/engleder/tsnep_xdp.c
+++ b/drivers/net/ethernet/engleder/tsnep_xdp.c
@@ -17,3 +17,69 @@ int tsnep_xdp_setup_prog(struct tsnep_adapter *adapter, struct bpf_prog *prog,
 
 	return 0;
 }
+
+static int tsnep_xdp_enable_pool(struct tsnep_adapter *adapter,
+				 struct xsk_buff_pool *pool, u16 queue_id)
+{
+	struct tsnep_queue *queue;
+	int retval;
+
+	if (queue_id >= adapter->num_rx_queues ||
+	    queue_id >= adapter->num_tx_queues)
+		return -EINVAL;
+
+	queue = &adapter->queue[queue_id];
+	if (queue->rx->queue_index != queue_id ||
+	    queue->tx->queue_index != queue_id) {
+		netdev_err(adapter->netdev,
+			   "XSK support only for TX/RX queue pairs\n");
+
+		return -EOPNOTSUPP;
+	}
+
+	retval = xsk_pool_dma_map(pool, adapter->dmadev,
+				  DMA_ATTR_SKIP_CPU_SYNC);
+	if (retval) {
+		netdev_err(adapter->netdev, "failed to map XSK pool\n");
+
+		return retval;
+	}
+
+	retval = tsnep_enable_xsk(queue, pool);
+	if (retval) {
+		xsk_pool_dma_unmap(pool, DMA_ATTR_SKIP_CPU_SYNC);
+
+		return retval;
+	}
+
+	return 0;
+}
+
+static int tsnep_xdp_disable_pool(struct tsnep_adapter *adapter, u16 queue_id)
+{
+	struct xsk_buff_pool *pool;
+	struct tsnep_queue *queue;
+
+	if (queue_id >= adapter->num_rx_queues ||
+	    queue_id >= adapter->num_tx_queues)
+		return -EINVAL;
+
+	pool = xsk_get_pool_from_qid(adapter->netdev,
queue_id);
+	if (!pool)
+		return -EINVAL;
+
+	queue = &adapter->queue[queue_id];
+
+	tsnep_disable_xsk(queue);
+
+	xsk_pool_dma_unmap(pool, DMA_ATTR_SKIP_CPU_SYNC);
+
+	return 0;
+}
+
+int tsnep_xdp_setup_pool(struct tsnep_adapter *adapter,
+			 struct xsk_buff_pool *pool, u16 queue_id)
+{
+	return pool ? tsnep_xdp_enable_pool(adapter, pool, queue_id) :
+		      tsnep_xdp_disable_pool(adapter, queue_id);
+}
diff --git a/drivers/net/ethernet/freescale/Kconfig b/drivers/net/ethernet/freescale/Kconfig
index f1e80d6996ef..1c78f66a89da 100644
--- a/drivers/net/ethernet/freescale/Kconfig
+++ b/drivers/net/ethernet/freescale/Kconfig
@@ -71,6 +71,7 @@ config FSL_XGMAC_MDIO
 	tristate "Freescale XGMAC MDIO"
 	select PHYLIB
 	depends on OF
+	select MDIO_DEVRES
 	select OF_MDIO
 	help
 	  This driver supports the MDIO bus on the Fman 10G Ethernet MACs, and
diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
index 9318a2554056..431f8917dc39 100644
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
+++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
@@ -299,7 +299,8 @@ static int dpaa_stop(struct net_device *net_dev)
 {
 	struct mac_device *mac_dev;
 	struct dpaa_priv *priv;
-	int i, err, error;
+	int i, error;
+	int err = 0;
 
 	priv = netdev_priv(net_dev);
 	mac_dev = priv->mac_dev;
@@ -1482,13 +1483,8 @@ static int dpaa_enable_tx_csum(struct dpaa_priv *priv,
 	parse_result = (struct fman_prs_result *)parse_results;
 
 	/* If we're dealing with VLAN, get the real Ethernet type */
-	if (ethertype == ETH_P_8021Q) {
-		/* We can't always assume the MAC header is set correctly
-		 * by the stack, so reset to beginning of skb->data
-		 */
-		skb_reset_mac_header(skb);
-		ethertype = ntohs(vlan_eth_hdr(skb)->h_vlan_encapsulated_proto);
-	}
+	if (ethertype == ETH_P_8021Q)
+		ethertype = ntohs(skb_vlan_eth_hdr(skb)->h_vlan_encapsulated_proto);
 
 	/* Fill in the relevant L3 parse result fields
 	 * and read the L4 protocol type
diff --git
a/drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.c
index c886f33f8c6f..b1871e6c4006 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.c
@@ -159,7 +159,8 @@ static void dpaa2_mac_config(struct phylink_config *config, unsigned int mode,
 	struct dpmac_link_state *dpmac_state = &mac->state;
 	int err;
 
-	if (state->an_enabled)
+	if (linkmode_test_bit(ETHTOOL_LINK_MODE_Autoneg_BIT,
+			      state->advertising))
 		dpmac_state->options |= DPMAC_LINK_OPT_AUTONEG;
 	else
 		dpmac_state->options &= ~DPMAC_LINK_OPT_AUTONEG;
diff --git a/drivers/net/ethernet/freescale/enetc/Kconfig b/drivers/net/ethernet/freescale/enetc/Kconfig
index 9bc099cf3cb1..4d75e6807e92 100644
--- a/drivers/net/ethernet/freescale/enetc/Kconfig
+++ b/drivers/net/ethernet/freescale/enetc/Kconfig
@@ -10,6 +10,7 @@ config FSL_ENETC_CORE
 config FSL_ENETC
 	tristate "ENETC PF driver"
 	depends on PCI_MSI
+	select MDIO_DEVRES
 	select FSL_ENETC_CORE
 	select FSL_ENETC_IERB
 	select FSL_ENETC_MDIO
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index 2fc712b24d12..3c4fa26f0f9b 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -25,6 +25,13 @@ void enetc_port_mac_wr(struct enetc_si *si, u32 reg, u32 val)
 }
 EXPORT_SYMBOL_GPL(enetc_port_mac_wr);
 
+static void enetc_change_preemptible_tcs(struct enetc_ndev_priv *priv,
+					 u8 preemptible_tcs)
+{
+	priv->preemptible_tcs = preemptible_tcs;
+	enetc_mm_commit_preemptible_tcs(priv);
+}
+
 static int enetc_num_stack_tx_queues(struct enetc_ndev_priv *priv)
 {
 	int num_tx_rings = priv->num_tx_rings;
@@ -2640,16 +2647,19 @@ static void enetc_reset_tc_mqprio(struct net_device *ndev)
 	}
 
 	enetc_debug_tx_ring_prios(priv);
+
+	enetc_change_preemptible_tcs(priv, 0);
 }
 
 int enetc_setup_tc_mqprio(struct net_device *ndev, void *type_data)
 {
+	struct tc_mqprio_qopt_offload *mqprio =
type_data;
 	struct enetc_ndev_priv *priv = netdev_priv(ndev);
-	struct tc_mqprio_qopt *mqprio = type_data;
+	struct tc_mqprio_qopt *qopt = &mqprio->qopt;
 	struct enetc_hw *hw = &priv->si->hw;
 	int num_stack_tx_queues = 0;
-	u8 num_tc = mqprio->num_tc;
 	struct enetc_bdr *tx_ring;
+	u8 num_tc = qopt->num_tc;
 	int offset, count;
 	int err, tc, q;
 
@@ -2663,8 +2673,8 @@ int enetc_setup_tc_mqprio(struct net_device *ndev, void *type_data)
 		return err;
 
 	for (tc = 0; tc < num_tc; tc++) {
-		offset = mqprio->offset[tc];
-		count = mqprio->count[tc];
+		offset = qopt->offset[tc];
+		count = qopt->count[tc];
 		num_stack_tx_queues += count;
 
 		err = netdev_set_tc_queue(ndev, tc, count, offset);
@@ -2693,6 +2703,8 @@ int enetc_setup_tc_mqprio(struct net_device *ndev, void *type_data)
 
 	enetc_debug_tx_ring_prios(priv);
 
+	enetc_change_preemptible_tcs(priv, mqprio->preemptible_tcs);
+
 	return 0;
 
 err_reset_tc:
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h
index 8010f31cd10d..c97a8e3d7a7f 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc.h
@@ -355,6 +355,9 @@ struct enetc_ndev_priv {
 	u16 rx_bd_count, tx_bd_count;
 
 	u16 msg_enable;
+
+	u8 preemptible_tcs;
+
 	enum enetc_active_offloads active_offloads;
 
 	u32 speed; /* store speed for compare update pspeed */
@@ -433,6 +436,7 @@ int enetc_xdp_xmit(struct net_device *ndev, int num_frames,
 /* ethtool */
 void enetc_set_ethtool_ops(struct net_device *ndev);
 void enetc_mm_link_state_update(struct enetc_ndev_priv *priv, bool link);
+void enetc_mm_commit_preemptible_tcs(struct enetc_ndev_priv *priv);
 
 /* control buffer descriptor ring (CBDR) */
 int enetc_setup_cbdr(struct device *dev, struct enetc_hw *hw, int bd_count,
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c b/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c
index 838750a03cf6..e993ed04ab57 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c
+++
b/drivers/net/ethernet/freescale/enetc/enetc_ethtool.c
@@ -32,6 +32,12 @@ static const u32 enetc_port_regs[] = {
 	ENETC_PM0_CMD_CFG, ENETC_PM0_MAXFRM, ENETC_PM0_IF_MODE
 };
 
+static const u32 enetc_port_mm_regs[] = {
+	ENETC_MMCSR, ENETC_PFPMR, ENETC_PTCFPR(0), ENETC_PTCFPR(1),
+	ENETC_PTCFPR(2), ENETC_PTCFPR(3), ENETC_PTCFPR(4), ENETC_PTCFPR(5),
+	ENETC_PTCFPR(6), ENETC_PTCFPR(7),
+};
+
 static int enetc_get_reglen(struct net_device *ndev)
 {
 	struct enetc_ndev_priv *priv = netdev_priv(ndev);
@@ -45,6 +51,9 @@ static int enetc_get_reglen(struct net_device *ndev)
 	if (hw->port)
 		len += ARRAY_SIZE(enetc_port_regs);
 
+	if (hw->port && !!(priv->si->hw_features & ENETC_SI_F_QBU))
+		len += ARRAY_SIZE(enetc_port_mm_regs);
+
 	len *= sizeof(u32) * 2; /* store 2 entries per reg: addr and value */
 
 	return len;
@@ -90,6 +99,14 @@ static void enetc_get_regs(struct net_device *ndev, struct ethtool_regs *regs,
 		*buf++ = addr;
 		*buf++ = enetc_rd(hw, addr);
 	}
+
+	if (priv->si->hw_features & ENETC_SI_F_QBU) {
+		for (i = 0; i < ARRAY_SIZE(enetc_port_mm_regs); i++) {
+			addr = ENETC_PORT_BASE + enetc_port_mm_regs[i];
+			*buf++ = addr;
+			*buf++ = enetc_rd(hw, addr);
+		}
+	}
 }
 
 static const struct {
@@ -976,7 +993,9 @@ static int enetc_get_mm(struct net_device *ndev, struct ethtool_mm_state *state)
 	lafs = ENETC_MMCSR_GET_LAFS(val);
 	state->rx_min_frag_size = ethtool_mm_frag_size_add_to_min(lafs);
 	state->tx_enabled = !!(val & ENETC_MMCSR_LPE); /* mirror of MMCSR_ME */
-	state->tx_active = !!(val & ENETC_MMCSR_LPA);
+	state->tx_active = state->tx_enabled &&
+			   (state->verify_status == ETHTOOL_MM_VERIFY_STATUS_SUCCEEDED ||
+			    state->verify_status == ETHTOOL_MM_VERIFY_STATUS_DISABLED);
 	state->verify_enabled = !(val & ENETC_MMCSR_VDIS);
 	state->verify_time = ENETC_MMCSR_GET_VT(val);
 	/* A verifyTime of 128 ms would exceed the 7 bit width
@@ -989,6 +1008,64 @@ static int enetc_get_mm(struct net_device *ndev, struct ethtool_mm_state *state)
 	return 0;
 }
 
+static int enetc_mm_wait_tx_active(struct enetc_hw *hw, int 
verify_time)
+{
+	int timeout = verify_time * USEC_PER_MSEC * ENETC_MM_VERIFY_RETRIES;
+	u32 val;
+
+	/* This will time out after the standard value of 3 verification
+	 * attempts. To not sleep forever, it relies on a non-zero verify_time,
+	 * guarantee which is provided by the ethtool nlattr policy.
+	 */
+	return read_poll_timeout(enetc_port_rd, val,
+				 ENETC_MMCSR_GET_VSTS(val) == 3,
+				 ENETC_MM_VERIFY_SLEEP_US, timeout,
+				 true, hw, ENETC_MMCSR);
+}
+
+static void enetc_set_ptcfpr(struct enetc_hw *hw, u8 preemptible_tcs)
+{
+	u32 val;
+	int tc;
+
+	for (tc = 0; tc < 8; tc++) {
+		val = enetc_port_rd(hw, ENETC_PTCFPR(tc));
+
+		if (preemptible_tcs & BIT(tc))
+			val |= ENETC_PTCFPR_FPE;
+		else
+			val &= ~ENETC_PTCFPR_FPE;
+
+		enetc_port_wr(hw, ENETC_PTCFPR(tc), val);
+	}
+}
+
+/* ENETC does not have an IRQ to notify changes to the MAC Merge TX status
+ * (active/inactive), but the preemptible traffic classes should only be
+ * committed to hardware once TX is active. Resort to polling.
+ */
+void enetc_mm_commit_preemptible_tcs(struct enetc_ndev_priv *priv)
+{
+	struct enetc_hw *hw = &priv->si->hw;
+	u8 preemptible_tcs = 0;
+	u32 val;
+	int err;
+
+	val = enetc_port_rd(hw, ENETC_MMCSR);
+	if (!(val & ENETC_MMCSR_ME))
+		goto out;
+
+	if (!(val & ENETC_MMCSR_VDIS)) {
+		err = enetc_mm_wait_tx_active(hw, ENETC_MMCSR_GET_VT(val));
+		if (err)
+			goto out;
+	}
+
+	preemptible_tcs = priv->preemptible_tcs;
+out:
+	enetc_set_ptcfpr(hw, preemptible_tcs);
+}
+
 /* FIXME: Workaround for the link partner's verification failing if ENETC
  * priorly received too much express traffic. The documentation doesn't
  * suggest this is needed.
@@ -1041,10 +1118,13 @@ static int enetc_set_mm(struct net_device *ndev, struct ethtool_mm_cfg *cfg,
 	else
 		priv->active_offloads &= ~ENETC_F_QBU;
 
-	/* If link is up, enable MAC Merge right away */
-	if (!!(priv->active_offloads & ENETC_F_QBU) &&
-	    !(val & ENETC_MMCSR_LINK_FAIL))
-		val |= ENETC_MMCSR_ME;
+	/* If link is up, enable/disable MAC Merge right away */
+	if (!(val & ENETC_MMCSR_LINK_FAIL)) {
+		if (!!(priv->active_offloads & ENETC_F_QBU))
+			val |= ENETC_MMCSR_ME;
+		else
+			val &= ~ENETC_MMCSR_ME;
+	}
 
 	val &= ~ENETC_MMCSR_VT_MASK;
 	val |= ENETC_MMCSR_VT(cfg->verify_time);
@@ -1056,6 +1136,8 @@ static int enetc_set_mm(struct net_device *ndev, struct ethtool_mm_cfg *cfg,
 
 	enetc_restart_emac_rx(priv->si);
 
+	enetc_mm_commit_preemptible_tcs(priv);
+
 	mutex_unlock(&priv->mm_lock);
 
 	return 0;
@@ -1089,6 +1171,8 @@ void enetc_mm_link_state_update(struct enetc_ndev_priv *priv, bool link)
 
 	enetc_port_wr(hw, ENETC_MMCSR, val);
 
+	enetc_mm_commit_preemptible_tcs(priv);
+
 	mutex_unlock(&priv->mm_lock);
 }
 EXPORT_SYMBOL_GPL(enetc_mm_link_state_update);
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_hw.h b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
index de2e0ee8cdcb..1619943fb263 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_hw.h
+++ b/drivers/net/ethernet/freescale/enetc/enetc_hw.h
@@ -3,6 +3,9 @@
 
 #include <linux/bitops.h>
 
+#define ENETC_MM_VERIFY_SLEEP_US	USEC_PER_MSEC
+#define ENETC_MM_VERIFY_RETRIES		3
+
 /* ENETC device IDs */
 #define ENETC_DEV_ID_PF	0xe100
 #define ENETC_DEV_ID_VF	0xef00
@@ -965,6 +968,10 @@ static inline u32 enetc_usecs_to_cycles(u32 usecs)
 	return (u32)div_u64(usecs * ENETC_CLK, 1000000ULL);
 }
 
+/* Port traffic class frame preemption register */
+#define ENETC_PTCFPR(n)			(0x1910 + (n) * 4) /* n = [0 ..7] */
+#define ENETC_PTCFPR_FPE		BIT(31)
+
 /* port time gating control register */
 #define ENETC_PTGCR			0x11a00
 #define ENETC_PTGCR_TGE			BIT(31)
diff --git a/drivers/net/ethernet/fungible/funcore/fun_dev.c 
b/drivers/net/ethernet/fungible/funcore/fun_dev.c index fb5120d90f26..a7fbd4cd560a 100644 --- a/drivers/net/ethernet/fungible/funcore/fun_dev.c +++ b/drivers/net/ethernet/fungible/funcore/fun_dev.c @@ -1,6 +1,5 @@ // SPDX-License-Identifier: (GPL-2.0-only OR BSD-3-Clause) -#include <linux/aer.h> #include <linux/bitmap.h> #include <linux/delay.h> #include <linux/interrupt.h> @@ -747,8 +746,6 @@ void fun_dev_disable(struct fun_dev *fdev) bitmap_free(fdev->irq_map); pci_free_irq_vectors(pdev); - pci_clear_master(pdev); - pci_disable_pcie_error_reporting(pdev); pci_disable_device(pdev); fun_unmap_bars(fdev); @@ -781,8 +778,6 @@ int fun_dev_enable(struct fun_dev *fdev, struct pci_dev *pdev, goto unmap; } - pci_enable_pcie_error_reporting(pdev); - rc = sanitize_dev(fdev); if (rc) goto disable_dev; @@ -825,12 +820,10 @@ int fun_dev_enable(struct fun_dev *fdev, struct pci_dev *pdev, disable_admin: fun_disable_admin_queue(fdev); free_irq_mgr: - pci_clear_master(pdev); bitmap_free(fdev->irq_map); free_irqs: pci_free_irq_vectors(pdev); disable_dev: - pci_disable_pcie_error_reporting(pdev); pci_disable_device(pdev); unmap: fun_unmap_bars(fdev); diff --git a/drivers/net/ethernet/google/gve/gve.h b/drivers/net/ethernet/google/gve/gve.h index 005cb9dfe078..98eb78d98e9f 100644 --- a/drivers/net/ethernet/google/gve/gve.h +++ b/drivers/net/ethernet/google/gve/gve.h @@ -47,6 +47,8 @@ #define GVE_RX_BUFFER_SIZE_DQO 2048 +#define GVE_XDP_ACTIONS 5 + #define GVE_GQ_TX_MIN_PKT_DESC_BYTES 182 /* Each slot in the desc ring has a 1:1 mapping to a slot in the data ring */ @@ -232,7 +234,10 @@ struct gve_rx_ring { u64 rx_frag_flip_cnt; /* free-running count of rx segments where page_flip was used */ u64 rx_frag_copy_cnt; /* free-running count of rx segments copied */ u64 rx_frag_alloc_cnt; /* free-running count of rx page allocations */ - + u64 xdp_tx_errors; + u64 xdp_redirect_errors; + u64 xdp_alloc_fails; + u64 xdp_actions[GVE_XDP_ACTIONS]; u32 q_num; /* queue index */ u32 ntfy_id; /* 
notification block index */ struct gve_queue_resources *q_resources; /* head and tail pointer idx */ @@ -240,6 +245,12 @@ struct gve_rx_ring { struct u64_stats_sync statss; /* sync stats for 32bit archs */ struct gve_rx_ctx ctx; /* Info for packet currently being processed in this ring. */ + + /* XDP stuff */ + struct xdp_rxq_info xdp_rxq; + struct xdp_rxq_info xsk_rxq; + struct xsk_buff_pool *xsk_pool; + struct page_frag_cache page_cache; /* Page cache to allocate XDP frames */ }; /* A TX desc ring entry */ @@ -260,7 +271,14 @@ struct gve_tx_iovec { * ring entry but only used for a pkt_desc not a seg_desc */ struct gve_tx_buffer_state { - struct sk_buff *skb; /* skb for this pkt */ + union { + struct sk_buff *skb; /* skb for this pkt */ + struct xdp_frame *xdp_frame; /* xdp_frame */ + }; + struct { + u16 size; /* size of xmitted xdp pkt */ + u8 is_xsk; /* xsk buff */ + } xdp; union { struct gve_tx_iovec iov[GVE_TX_MAX_IOVEC]; /* segments of this pkt */ struct { @@ -375,6 +393,8 @@ struct gve_tx_ring { struct { /* Spinlock for when cleanup in progress */ spinlock_t clean_lock; + /* Spinlock for XDP tx traffic */ + spinlock_t xdp_lock; }; /* DQO fields. 
*/ @@ -452,6 +472,12 @@ struct gve_tx_ring { dma_addr_t q_resources_bus; /* dma address of the queue resources */ dma_addr_t complq_bus_dqo; /* dma address of the dqo.compl_ring */ struct u64_stats_sync statss; /* sync stats for 32bit archs */ + struct xsk_buff_pool *xsk_pool; + u32 xdp_xsk_wakeup; + u32 xdp_xsk_done; + u64 xdp_xsk_sent; + u64 xdp_xmit; + u64 xdp_xmit_errors; } ____cacheline_aligned; /* Wraps the info for one irq including the napi struct and the queues @@ -528,9 +554,11 @@ struct gve_priv { u16 rx_data_slot_cnt; /* rx buffer length */ u64 max_registered_pages; u64 num_registered_pages; /* num pages registered with NIC */ + struct bpf_prog *xdp_prog; /* XDP BPF program */ u32 rx_copybreak; /* copy packets smaller than this */ u16 default_num_queues; /* default num queues to set up */ + u16 num_xdp_queues; struct gve_queue_config tx_cfg; struct gve_queue_config rx_cfg; struct gve_qpl_config qpl_cfg; /* map used QPL ids */ @@ -787,7 +815,17 @@ static inline u32 gve_num_tx_qpls(struct gve_priv *priv) if (priv->queue_format != GVE_GQI_QPL_FORMAT) return 0; - return priv->tx_cfg.num_queues; + return priv->tx_cfg.num_queues + priv->num_xdp_queues; +} + +/* Returns the number of XDP tx queue page lists + */ +static inline u32 gve_num_xdp_qpls(struct gve_priv *priv) +{ + if (priv->queue_format != GVE_GQI_QPL_FORMAT) + return 0; + + return priv->num_xdp_queues; } /* Returns the number of rx queue page lists @@ -800,16 +838,35 @@ static inline u32 gve_num_rx_qpls(struct gve_priv *priv) return priv->rx_cfg.num_queues; } +static inline u32 gve_tx_qpl_id(struct gve_priv *priv, int tx_qid) +{ + return tx_qid; +} + +static inline u32 gve_rx_qpl_id(struct gve_priv *priv, int rx_qid) +{ + return priv->tx_cfg.max_queues + rx_qid; +} + +static inline u32 gve_tx_start_qpl_id(struct gve_priv *priv) +{ + return gve_tx_qpl_id(priv, 0); +} + +static inline u32 gve_rx_start_qpl_id(struct gve_priv *priv) +{ + return gve_rx_qpl_id(priv, 0); +} + /* Returns a pointer to the 
next available tx qpl in the list of qpls */ static inline -struct gve_queue_page_list *gve_assign_tx_qpl(struct gve_priv *priv) +struct gve_queue_page_list *gve_assign_tx_qpl(struct gve_priv *priv, int tx_qid) { - int id = find_first_zero_bit(priv->qpl_cfg.qpl_id_map, - priv->qpl_cfg.qpl_map_size); + int id = gve_tx_qpl_id(priv, tx_qid); - /* we are out of tx qpls */ - if (id >= gve_num_tx_qpls(priv)) + /* QPL already in use */ + if (test_bit(id, priv->qpl_cfg.qpl_id_map)) return NULL; set_bit(id, priv->qpl_cfg.qpl_id_map); @@ -819,14 +876,12 @@ struct gve_queue_page_list *gve_assign_tx_qpl(struct gve_priv *priv) /* Returns a pointer to the next available rx qpl in the list of qpls */ static inline -struct gve_queue_page_list *gve_assign_rx_qpl(struct gve_priv *priv) +struct gve_queue_page_list *gve_assign_rx_qpl(struct gve_priv *priv, int rx_qid) { - int id = find_next_zero_bit(priv->qpl_cfg.qpl_id_map, - priv->qpl_cfg.qpl_map_size, - gve_num_tx_qpls(priv)); + int id = gve_rx_qpl_id(priv, rx_qid); - /* we are out of rx qpls */ - if (id == gve_num_tx_qpls(priv) + gve_num_rx_qpls(priv)) + /* QPL already in use */ + if (test_bit(id, priv->qpl_cfg.qpl_id_map)) return NULL; set_bit(id, priv->qpl_cfg.qpl_id_map); @@ -845,7 +900,7 @@ static inline void gve_unassign_qpl(struct gve_priv *priv, int id) static inline enum dma_data_direction gve_qpl_dma_dir(struct gve_priv *priv, int id) { - if (id < gve_num_tx_qpls(priv)) + if (id < gve_rx_start_qpl_id(priv)) return DMA_TO_DEVICE; else return DMA_FROM_DEVICE; @@ -857,6 +912,21 @@ static inline bool gve_is_gqi(struct gve_priv *priv) priv->queue_format == GVE_GQI_QPL_FORMAT; } +static inline u32 gve_num_tx_queues(struct gve_priv *priv) +{ + return priv->tx_cfg.num_queues + priv->num_xdp_queues; +} + +static inline u32 gve_xdp_tx_queue_id(struct gve_priv *priv, u32 queue_id) +{ + return priv->tx_cfg.num_queues + queue_id; +} + +static inline u32 gve_xdp_tx_start_queue_id(struct gve_priv *priv) +{ + return 
gve_xdp_tx_queue_id(priv, 0); +} + /* buffers */ int gve_alloc_page(struct gve_priv *priv, struct device *dev, struct page **page, dma_addr_t *dma, @@ -865,9 +935,15 @@ void gve_free_page(struct device *dev, struct page *page, dma_addr_t dma, enum dma_data_direction); /* tx handling */ netdev_tx_t gve_tx(struct sk_buff *skb, struct net_device *dev); +int gve_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames, + u32 flags); +int gve_xdp_xmit_one(struct gve_priv *priv, struct gve_tx_ring *tx, + void *data, int len, void *frame_p); +void gve_xdp_tx_flush(struct gve_priv *priv, u32 xdp_qid); bool gve_tx_poll(struct gve_notify_block *block, int budget); -int gve_tx_alloc_rings(struct gve_priv *priv); -void gve_tx_free_rings_gqi(struct gve_priv *priv); +bool gve_xdp_poll(struct gve_notify_block *block, int budget); +int gve_tx_alloc_rings(struct gve_priv *priv, int start_id, int num_rings); +void gve_tx_free_rings_gqi(struct gve_priv *priv, int start_id, int num_rings); u32 gve_tx_load_event_counter(struct gve_priv *priv, struct gve_tx_ring *tx); bool gve_tx_clean_pending(struct gve_priv *priv, struct gve_tx_ring *tx); diff --git a/drivers/net/ethernet/google/gve/gve_adminq.c b/drivers/net/ethernet/google/gve/gve_adminq.c index 60061288ad9d..252974202a3f 100644 --- a/drivers/net/ethernet/google/gve/gve_adminq.c +++ b/drivers/net/ethernet/google/gve/gve_adminq.c @@ -516,12 +516,12 @@ static int gve_adminq_create_tx_queue(struct gve_priv *priv, u32 queue_index) return gve_adminq_issue_cmd(priv, &cmd); } -int gve_adminq_create_tx_queues(struct gve_priv *priv, u32 num_queues) +int gve_adminq_create_tx_queues(struct gve_priv *priv, u32 start_id, u32 num_queues) { int err; int i; - for (i = 0; i < num_queues; i++) { + for (i = start_id; i < start_id + num_queues; i++) { err = gve_adminq_create_tx_queue(priv, i); if (err) return err; @@ -604,12 +604,12 @@ static int gve_adminq_destroy_tx_queue(struct gve_priv *priv, u32 queue_index) return 0; } -int 
gve_adminq_destroy_tx_queues(struct gve_priv *priv, u32 num_queues) +int gve_adminq_destroy_tx_queues(struct gve_priv *priv, u32 start_id, u32 num_queues) { int err; int i; - for (i = 0; i < num_queues; i++) { + for (i = start_id; i < start_id + num_queues; i++) { err = gve_adminq_destroy_tx_queue(priv, i); if (err) return err; diff --git a/drivers/net/ethernet/google/gve/gve_adminq.h b/drivers/net/ethernet/google/gve/gve_adminq.h index cf29662e6ad1..f894beb3deaf 100644 --- a/drivers/net/ethernet/google/gve/gve_adminq.h +++ b/drivers/net/ethernet/google/gve/gve_adminq.h @@ -410,8 +410,8 @@ int gve_adminq_configure_device_resources(struct gve_priv *priv, dma_addr_t db_array_bus_addr, u32 num_ntfy_blks); int gve_adminq_deconfigure_device_resources(struct gve_priv *priv); -int gve_adminq_create_tx_queues(struct gve_priv *priv, u32 num_queues); -int gve_adminq_destroy_tx_queues(struct gve_priv *priv, u32 queue_id); +int gve_adminq_create_tx_queues(struct gve_priv *priv, u32 start_id, u32 num_queues); +int gve_adminq_destroy_tx_queues(struct gve_priv *priv, u32 start_id, u32 num_queues); int gve_adminq_create_rx_queues(struct gve_priv *priv, u32 num_queues); int gve_adminq_destroy_rx_queues(struct gve_priv *priv, u32 queue_id); int gve_adminq_register_page_list(struct gve_priv *priv, diff --git a/drivers/net/ethernet/google/gve/gve_ethtool.c b/drivers/net/ethernet/google/gve/gve_ethtool.c index 5f81470843b4..cfd4b8d284d1 100644 --- a/drivers/net/ethernet/google/gve/gve_ethtool.c +++ b/drivers/net/ethernet/google/gve/gve_ethtool.c @@ -34,6 +34,11 @@ static u32 gve_get_msglevel(struct net_device *netdev) return priv->msg_enable; } +/* For the following stats column string names, make sure the order + * matches how it is filled in the code. For xdp_aborted, xdp_drop, + * xdp_pass, xdp_tx, xdp_redirect, make sure it also matches the order + * as declared in enum xdp_action inside file uapi/linux/bpf.h . 
+ */ static const char gve_gstrings_main_stats[][ETH_GSTRING_LEN] = { "rx_packets", "tx_packets", "rx_bytes", "tx_bytes", "rx_dropped", "tx_dropped", "tx_timeouts", @@ -49,12 +54,16 @@ static const char gve_gstrings_rx_stats[][ETH_GSTRING_LEN] = { "rx_dropped_pkt[%u]", "rx_copybreak_pkt[%u]", "rx_copied_pkt[%u]", "rx_queue_drop_cnt[%u]", "rx_no_buffers_posted[%u]", "rx_drops_packet_over_mru[%u]", "rx_drops_invalid_checksum[%u]", + "rx_xdp_aborted[%u]", "rx_xdp_drop[%u]", "rx_xdp_pass[%u]", + "rx_xdp_tx[%u]", "rx_xdp_redirect[%u]", + "rx_xdp_tx_errors[%u]", "rx_xdp_redirect_errors[%u]", "rx_xdp_alloc_fails[%u]", }; static const char gve_gstrings_tx_stats[][ETH_GSTRING_LEN] = { "tx_posted_desc[%u]", "tx_completed_desc[%u]", "tx_consumed_desc[%u]", "tx_bytes[%u]", "tx_wake[%u]", "tx_stop[%u]", "tx_event_counter[%u]", - "tx_dma_mapping_error[%u]", + "tx_dma_mapping_error[%u]", "tx_xsk_wakeup[%u]", + "tx_xsk_done[%u]", "tx_xsk_sent[%u]", "tx_xdp_xmit[%u]", "tx_xdp_xmit_errors[%u]" }; static const char gve_gstrings_adminq_stats[][ETH_GSTRING_LEN] = { @@ -81,8 +90,10 @@ static void gve_get_strings(struct net_device *netdev, u32 stringset, u8 *data) { struct gve_priv *priv = netdev_priv(netdev); char *s = (char *)data; + int num_tx_queues; int i, j; + num_tx_queues = gve_num_tx_queues(priv); switch (stringset) { case ETH_SS_STATS: memcpy(s, *gve_gstrings_main_stats, @@ -97,7 +108,7 @@ static void gve_get_strings(struct net_device *netdev, u32 stringset, u8 *data) } } - for (i = 0; i < priv->tx_cfg.num_queues; i++) { + for (i = 0; i < num_tx_queues; i++) { for (j = 0; j < NUM_GVE_TX_CNTS; j++) { snprintf(s, ETH_GSTRING_LEN, gve_gstrings_tx_stats[j], i); @@ -124,12 +135,14 @@ static void gve_get_strings(struct net_device *netdev, u32 stringset, u8 *data) static int gve_get_sset_count(struct net_device *netdev, int sset) { struct gve_priv *priv = netdev_priv(netdev); + int num_tx_queues; + num_tx_queues = gve_num_tx_queues(priv); switch (sset) { case ETH_SS_STATS: return 
GVE_MAIN_STATS_LEN + GVE_ADMINQ_STATS_LEN + (priv->rx_cfg.num_queues * NUM_GVE_RX_CNTS) + - (priv->tx_cfg.num_queues * NUM_GVE_TX_CNTS); + (num_tx_queues * NUM_GVE_TX_CNTS); case ETH_SS_PRIV_FLAGS: return GVE_PRIV_FLAGS_STR_LEN; default: @@ -153,18 +166,20 @@ gve_get_ethtool_stats(struct net_device *netdev, struct gve_priv *priv; bool skip_nic_stats; unsigned int start; + int num_tx_queues; int ring; int i, j; ASSERT_RTNL(); priv = netdev_priv(netdev); + num_tx_queues = gve_num_tx_queues(priv); report_stats = priv->stats_report->stats; rx_qid_to_stats_idx = kmalloc_array(priv->rx_cfg.num_queues, sizeof(int), GFP_KERNEL); if (!rx_qid_to_stats_idx) return; - tx_qid_to_stats_idx = kmalloc_array(priv->tx_cfg.num_queues, + tx_qid_to_stats_idx = kmalloc_array(num_tx_queues, sizeof(int), GFP_KERNEL); if (!tx_qid_to_stats_idx) { kfree(rx_qid_to_stats_idx); @@ -195,7 +210,7 @@ gve_get_ethtool_stats(struct net_device *netdev, } } for (tx_pkts = 0, tx_bytes = 0, tx_dropped = 0, ring = 0; - ring < priv->tx_cfg.num_queues; ring++) { + ring < num_tx_queues; ring++) { if (priv->tx) { do { start = @@ -232,7 +247,7 @@ gve_get_ethtool_stats(struct net_device *netdev, i = GVE_MAIN_STATS_LEN; /* For rx cross-reporting stats, start from nic rx stats in report */ - base_stats_idx = GVE_TX_STATS_REPORT_NUM * priv->tx_cfg.num_queues + + base_stats_idx = GVE_TX_STATS_REPORT_NUM * num_tx_queues + GVE_RX_STATS_REPORT_NUM * priv->rx_cfg.num_queues; max_stats_idx = NIC_RX_STATS_REPORT_NUM * priv->rx_cfg.num_queues + base_stats_idx; @@ -283,14 +298,26 @@ gve_get_ethtool_stats(struct net_device *netdev, if (skip_nic_stats) { /* skip NIC rx stats */ i += NIC_RX_STATS_REPORT_NUM; - continue; - } - for (j = 0; j < NIC_RX_STATS_REPORT_NUM; j++) { - u64 value = - be64_to_cpu(report_stats[rx_qid_to_stats_idx[ring] + j].value); + } else { + stats_idx = rx_qid_to_stats_idx[ring]; + for (j = 0; j < NIC_RX_STATS_REPORT_NUM; j++) { + u64 value = + be64_to_cpu(report_stats[stats_idx + j].value); - data[i++] 
= value; + data[i++] = value; + } } + /* XDP rx counters */ + do { + start = u64_stats_fetch_begin(&priv->rx[ring].statss); + for (j = 0; j < GVE_XDP_ACTIONS; j++) + data[i + j] = rx->xdp_actions[j]; + data[i + j++] = rx->xdp_tx_errors; + data[i + j++] = rx->xdp_redirect_errors; + data[i + j++] = rx->xdp_alloc_fails; + } while (u64_stats_fetch_retry(&priv->rx[ring].statss, + start)); + i += GVE_XDP_ACTIONS + 3; /* XDP rx counters */ } } else { i += priv->rx_cfg.num_queues * NUM_GVE_RX_CNTS; @@ -298,7 +325,7 @@ gve_get_ethtool_stats(struct net_device *netdev, /* For tx cross-reporting stats, start from nic tx stats in report */ base_stats_idx = max_stats_idx; - max_stats_idx = NIC_TX_STATS_REPORT_NUM * priv->tx_cfg.num_queues + + max_stats_idx = NIC_TX_STATS_REPORT_NUM * num_tx_queues + max_stats_idx; /* Preprocess the stats report for tx, map queue id to start index */ skip_nic_stats = false; @@ -316,7 +343,7 @@ gve_get_ethtool_stats(struct net_device *netdev, } /* walk TX rings */ if (priv->tx) { - for (ring = 0; ring < priv->tx_cfg.num_queues; ring++) { + for (ring = 0; ring < num_tx_queues; ring++) { struct gve_tx_ring *tx = &priv->tx[ring]; if (gve_is_gqi(priv)) { @@ -346,16 +373,28 @@ gve_get_ethtool_stats(struct net_device *netdev, if (skip_nic_stats) { /* skip NIC tx stats */ i += NIC_TX_STATS_REPORT_NUM; - continue; - } - for (j = 0; j < NIC_TX_STATS_REPORT_NUM; j++) { - u64 value = - be64_to_cpu(report_stats[tx_qid_to_stats_idx[ring] + j].value); - data[i++] = value; + } else { + stats_idx = tx_qid_to_stats_idx[ring]; + for (j = 0; j < NIC_TX_STATS_REPORT_NUM; j++) { + u64 value = + be64_to_cpu(report_stats[stats_idx + j].value); + data[i++] = value; + } } + /* XDP xsk counters */ + data[i++] = tx->xdp_xsk_wakeup; + data[i++] = tx->xdp_xsk_done; + do { + start = u64_stats_fetch_begin(&priv->tx[ring].statss); + data[i] = tx->xdp_xsk_sent; + data[i + 1] = tx->xdp_xmit; + data[i + 2] = tx->xdp_xmit_errors; + } while 
(u64_stats_fetch_retry(&priv->tx[ring].statss, + start)); + i += 3; /* XDP tx counters */ } } else { - i += priv->tx_cfg.num_queues * NUM_GVE_TX_CNTS; + i += num_tx_queues * NUM_GVE_TX_CNTS; } kfree(rx_qid_to_stats_idx); @@ -412,6 +451,12 @@ static int gve_set_channels(struct net_device *netdev, if (!new_rx || !new_tx) return -EINVAL; + if (priv->num_xdp_queues && + (new_tx != new_rx || (2 * new_tx > priv->tx_cfg.max_queues))) { + dev_err(&priv->pdev->dev, "XDP load failed: The number of configured RX queues should be equal to the number of configured TX queues and the number of configured RX/TX queues should be less than or equal to half the maximum number of RX/TX queues"); + return -EINVAL; + } + if (!netif_carrier_ok(netdev)) { priv->tx_cfg.num_queues = new_tx; priv->rx_cfg.num_queues = new_rx; @@ -502,7 +547,9 @@ static int gve_set_priv_flags(struct net_device *netdev, u32 flags) { struct gve_priv *priv = netdev_priv(netdev); u64 ori_flags, new_flags; + int num_tx_queues; + num_tx_queues = gve_num_tx_queues(priv); ori_flags = READ_ONCE(priv->ethtool_flags); new_flags = ori_flags; @@ -522,7 +569,7 @@ static int gve_set_priv_flags(struct net_device *netdev, u32 flags) /* delete report stats timer. */ if (!(flags & BIT(0)) && (ori_flags & BIT(0))) { int tx_stats_num = GVE_TX_STATS_REPORT_NUM * - priv->tx_cfg.num_queues; + num_tx_queues; int rx_stats_num = GVE_RX_STATS_REPORT_NUM * priv->rx_cfg.num_queues; diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c index 07111c241e0e..57ce74315eba 100644 --- a/drivers/net/ethernet/google/gve/gve_main.c +++ b/drivers/net/ethernet/google/gve/gve_main.c @@ -4,8 +4,10 @@ * Copyright (C) 2015-2021 Google, Inc. 
*/ +#include <linux/bpf.h> #include <linux/cpumask.h> #include <linux/etherdevice.h> +#include <linux/filter.h> #include <linux/interrupt.h> #include <linux/module.h> #include <linux/pci.h> @@ -15,6 +17,7 @@ #include <linux/utsname.h> #include <linux/version.h> #include <net/sch_generic.h> +#include <net/xdp_sock_drv.h> #include "gve.h" #include "gve_dqo.h" #include "gve_adminq.h" @@ -90,8 +93,10 @@ static void gve_get_stats(struct net_device *dev, struct rtnl_link_stats64 *s) struct gve_priv *priv = netdev_priv(dev); unsigned int start; u64 packets, bytes; + int num_tx_queues; int ring; + num_tx_queues = gve_num_tx_queues(priv); if (priv->rx) { for (ring = 0; ring < priv->rx_cfg.num_queues; ring++) { do { @@ -106,7 +111,7 @@ static void gve_get_stats(struct net_device *dev, struct rtnl_link_stats64 *s) } } if (priv->tx) { - for (ring = 0; ring < priv->tx_cfg.num_queues; ring++) { + for (ring = 0; ring < num_tx_queues; ring++) { do { start = u64_stats_fetch_begin(&priv->tx[ring].statss); @@ -180,7 +185,7 @@ static int gve_alloc_stats_report(struct gve_priv *priv) int tx_stats_num, rx_stats_num; tx_stats_num = (GVE_TX_STATS_REPORT_NUM + NIC_TX_STATS_REPORT_NUM) * - priv->tx_cfg.num_queues; + gve_num_tx_queues(priv); rx_stats_num = (GVE_RX_STATS_REPORT_NUM + NIC_RX_STATS_REPORT_NUM) * priv->rx_cfg.num_queues; priv->stats_report_len = struct_size(priv->stats_report, stats, @@ -245,8 +250,13 @@ static int gve_napi_poll(struct napi_struct *napi, int budget) block = container_of(napi, struct gve_notify_block, napi); priv = block->priv; - if (block->tx) - reschedule |= gve_tx_poll(block, budget); + if (block->tx) { + if (block->tx->q_num < priv->tx_cfg.num_queues) + reschedule |= gve_tx_poll(block, budget); + else + reschedule |= gve_xdp_poll(block, budget); + } + if (block->rx) { work_done = gve_rx_poll(block, budget); reschedule |= work_done == budget; @@ -580,13 +590,36 @@ static void gve_remove_napi(struct gve_priv *priv, int ntfy_idx) netif_napi_del(&block->napi); } 
+static int gve_register_xdp_qpls(struct gve_priv *priv) +{ + int start_id; + int err; + int i; + + start_id = gve_tx_qpl_id(priv, gve_xdp_tx_start_queue_id(priv)); + for (i = start_id; i < start_id + gve_num_xdp_qpls(priv); i++) { + err = gve_adminq_register_page_list(priv, &priv->qpls[i]); + if (err) { + netif_err(priv, drv, priv->dev, + "failed to register queue page list %d\n", + priv->qpls[i].id); + /* This failure will trigger a reset - no need to clean + * up + */ + return err; + } + } + return 0; +} + static int gve_register_qpls(struct gve_priv *priv) { - int num_qpls = gve_num_tx_qpls(priv) + gve_num_rx_qpls(priv); + int start_id; int err; int i; - for (i = 0; i < num_qpls; i++) { + start_id = gve_tx_start_qpl_id(priv); + for (i = start_id; i < start_id + gve_num_tx_qpls(priv); i++) { err = gve_adminq_register_page_list(priv, &priv->qpls[i]); if (err) { netif_err(priv, drv, priv->dev, @@ -598,16 +631,63 @@ static int gve_register_qpls(struct gve_priv *priv) return err; } } + + start_id = gve_rx_start_qpl_id(priv); + for (i = start_id; i < start_id + gve_num_rx_qpls(priv); i++) { + err = gve_adminq_register_page_list(priv, &priv->qpls[i]); + if (err) { + netif_err(priv, drv, priv->dev, + "failed to register queue page list %d\n", + priv->qpls[i].id); + /* This failure will trigger a reset - no need to clean + * up + */ + return err; + } + } + return 0; +} + +static int gve_unregister_xdp_qpls(struct gve_priv *priv) +{ + int start_id; + int err; + int i; + + start_id = gve_tx_qpl_id(priv, gve_xdp_tx_start_queue_id(priv)); + for (i = start_id; i < start_id + gve_num_xdp_qpls(priv); i++) { + err = gve_adminq_unregister_page_list(priv, priv->qpls[i].id); + /* This failure will trigger a reset - no need to clean up */ + if (err) { + netif_err(priv, drv, priv->dev, + "Failed to unregister queue page list %d\n", + priv->qpls[i].id); + return err; + } + } return 0; } static int gve_unregister_qpls(struct gve_priv *priv) { - int num_qpls = gve_num_tx_qpls(priv) + 
gve_num_rx_qpls(priv); + int start_id; int err; int i; - for (i = 0; i < num_qpls; i++) { + start_id = gve_tx_start_qpl_id(priv); + for (i = start_id; i < start_id + gve_num_tx_qpls(priv); i++) { + err = gve_adminq_unregister_page_list(priv, priv->qpls[i].id); + /* This failure will trigger a reset - no need to clean up */ + if (err) { + netif_err(priv, drv, priv->dev, + "Failed to unregister queue page list %d\n", + priv->qpls[i].id); + return err; + } + } + + start_id = gve_rx_start_qpl_id(priv); + for (i = start_id; i < start_id + gve_num_rx_qpls(priv); i++) { err = gve_adminq_unregister_page_list(priv, priv->qpls[i].id); /* This failure will trigger a reset - no need to clean up */ if (err) { @@ -620,22 +700,44 @@ static int gve_unregister_qpls(struct gve_priv *priv) return 0; } +static int gve_create_xdp_rings(struct gve_priv *priv) +{ + int err; + + err = gve_adminq_create_tx_queues(priv, + gve_xdp_tx_start_queue_id(priv), + priv->num_xdp_queues); + if (err) { + netif_err(priv, drv, priv->dev, "failed to create %d XDP tx queues\n", + priv->num_xdp_queues); + /* This failure will trigger a reset - no need to clean + * up + */ + return err; + } + netif_dbg(priv, drv, priv->dev, "created %d XDP tx queues\n", + priv->num_xdp_queues); + + return 0; +} + static int gve_create_rings(struct gve_priv *priv) { + int num_tx_queues = gve_num_tx_queues(priv); int err; int i; - err = gve_adminq_create_tx_queues(priv, priv->tx_cfg.num_queues); + err = gve_adminq_create_tx_queues(priv, 0, num_tx_queues); if (err) { netif_err(priv, drv, priv->dev, "failed to create %d tx queues\n", - priv->tx_cfg.num_queues); + num_tx_queues); /* This failure will trigger a reset - no need to clean * up */ return err; } netif_dbg(priv, drv, priv->dev, "created %d tx queues\n", - priv->tx_cfg.num_queues); + num_tx_queues); err = gve_adminq_create_rx_queues(priv, priv->rx_cfg.num_queues); if (err) { @@ -668,6 +770,23 @@ static int gve_create_rings(struct gve_priv *priv) return 0; } +static void 
add_napi_init_xdp_sync_stats(struct gve_priv *priv, + int (*napi_poll)(struct napi_struct *napi, + int budget)) +{ + int start_id = gve_xdp_tx_start_queue_id(priv); + int i; + + /* Add xdp tx napi & init sync stats*/ + for (i = start_id; i < start_id + priv->num_xdp_queues; i++) { + int ntfy_idx = gve_tx_idx_to_ntfy(priv, i); + + u64_stats_init(&priv->tx[i].statss); + priv->tx[i].ntfy_id = ntfy_idx; + gve_add_napi(priv, ntfy_idx, napi_poll); + } +} + static void add_napi_init_sync_stats(struct gve_priv *priv, int (*napi_poll)(struct napi_struct *napi, int budget)) @@ -675,7 +794,7 @@ static void add_napi_init_sync_stats(struct gve_priv *priv, int i; /* Add tx napi & init sync stats*/ - for (i = 0; i < priv->tx_cfg.num_queues; i++) { + for (i = 0; i < gve_num_tx_queues(priv); i++) { int ntfy_idx = gve_tx_idx_to_ntfy(priv, i); u64_stats_init(&priv->tx[i].statss); @@ -692,34 +811,51 @@ static void add_napi_init_sync_stats(struct gve_priv *priv, } } -static void gve_tx_free_rings(struct gve_priv *priv) +static void gve_tx_free_rings(struct gve_priv *priv, int start_id, int num_rings) { if (gve_is_gqi(priv)) { - gve_tx_free_rings_gqi(priv); + gve_tx_free_rings_gqi(priv, start_id, num_rings); } else { gve_tx_free_rings_dqo(priv); } } +static int gve_alloc_xdp_rings(struct gve_priv *priv) +{ + int start_id; + int err = 0; + + if (!priv->num_xdp_queues) + return 0; + + start_id = gve_xdp_tx_start_queue_id(priv); + err = gve_tx_alloc_rings(priv, start_id, priv->num_xdp_queues); + if (err) + return err; + add_napi_init_xdp_sync_stats(priv, gve_napi_poll); + + return 0; +} + static int gve_alloc_rings(struct gve_priv *priv) { int err; /* Setup tx rings */ - priv->tx = kvcalloc(priv->tx_cfg.num_queues, sizeof(*priv->tx), + priv->tx = kvcalloc(priv->tx_cfg.max_queues, sizeof(*priv->tx), GFP_KERNEL); if (!priv->tx) return -ENOMEM; if (gve_is_gqi(priv)) - err = gve_tx_alloc_rings(priv); + err = gve_tx_alloc_rings(priv, 0, gve_num_tx_queues(priv)); else err = 
gve_tx_alloc_rings_dqo(priv); if (err) goto free_tx; /* Setup rx rings */ - priv->rx = kvcalloc(priv->rx_cfg.num_queues, sizeof(*priv->rx), + priv->rx = kvcalloc(priv->rx_cfg.max_queues, sizeof(*priv->rx), GFP_KERNEL); if (!priv->rx) { err = -ENOMEM; @@ -744,18 +880,39 @@ free_rx: kvfree(priv->rx); priv->rx = NULL; free_tx_queue: - gve_tx_free_rings(priv); + gve_tx_free_rings(priv, 0, gve_num_tx_queues(priv)); free_tx: kvfree(priv->tx); priv->tx = NULL; return err; } +static int gve_destroy_xdp_rings(struct gve_priv *priv) +{ + int start_id; + int err; + + start_id = gve_xdp_tx_start_queue_id(priv); + err = gve_adminq_destroy_tx_queues(priv, + start_id, + priv->num_xdp_queues); + if (err) { + netif_err(priv, drv, priv->dev, + "failed to destroy XDP queues\n"); + /* This failure will trigger a reset - no need to clean up */ + return err; + } + netif_dbg(priv, drv, priv->dev, "destroyed XDP queues\n"); + + return 0; +} + static int gve_destroy_rings(struct gve_priv *priv) { + int num_tx_queues = gve_num_tx_queues(priv); int err; - err = gve_adminq_destroy_tx_queues(priv, priv->tx_cfg.num_queues); + err = gve_adminq_destroy_tx_queues(priv, 0, num_tx_queues); if (err) { netif_err(priv, drv, priv->dev, "failed to destroy tx queues\n"); @@ -782,17 +939,33 @@ static void gve_rx_free_rings(struct gve_priv *priv) gve_rx_free_rings_dqo(priv); } +static void gve_free_xdp_rings(struct gve_priv *priv) +{ + int ntfy_idx, start_id; + int i; + + start_id = gve_xdp_tx_start_queue_id(priv); + if (priv->tx) { + for (i = start_id; i < start_id + priv->num_xdp_queues; i++) { + ntfy_idx = gve_tx_idx_to_ntfy(priv, i); + gve_remove_napi(priv, ntfy_idx); + } + gve_tx_free_rings(priv, start_id, priv->num_xdp_queues); + } +} + static void gve_free_rings(struct gve_priv *priv) { + int num_tx_queues = gve_num_tx_queues(priv); int ntfy_idx; int i; if (priv->tx) { - for (i = 0; i < priv->tx_cfg.num_queues; i++) { + for (i = 0; i < num_tx_queues; i++) { ntfy_idx = gve_tx_idx_to_ntfy(priv, i); 
gve_remove_napi(priv, ntfy_idx); } - gve_tx_free_rings(priv); + gve_tx_free_rings(priv, 0, num_tx_queues); kvfree(priv->tx); priv->tx = NULL; } @@ -889,40 +1062,68 @@ static void gve_free_queue_page_list(struct gve_priv *priv, u32 id) qpl->page_buses[i], gve_qpl_dma_dir(priv, id)); kvfree(qpl->page_buses); + qpl->page_buses = NULL; free_pages: kvfree(qpl->pages); + qpl->pages = NULL; priv->num_registered_pages -= qpl->num_entries; } +static int gve_alloc_xdp_qpls(struct gve_priv *priv) +{ + int start_id; + int i, j; + int err; + + start_id = gve_tx_qpl_id(priv, gve_xdp_tx_start_queue_id(priv)); + for (i = start_id; i < start_id + gve_num_xdp_qpls(priv); i++) { + err = gve_alloc_queue_page_list(priv, i, + priv->tx_pages_per_qpl); + if (err) + goto free_qpls; + } + + return 0; + +free_qpls: + for (j = start_id; j <= i; j++) + gve_free_queue_page_list(priv, j); + return err; +} + static int gve_alloc_qpls(struct gve_priv *priv) { - int num_qpls = gve_num_tx_qpls(priv) + gve_num_rx_qpls(priv); + int max_queues = priv->tx_cfg.max_queues + priv->rx_cfg.max_queues; + int start_id; int i, j; int err; - if (num_qpls == 0) + if (priv->queue_format != GVE_GQI_QPL_FORMAT) return 0; - priv->qpls = kvcalloc(num_qpls, sizeof(*priv->qpls), GFP_KERNEL); + priv->qpls = kvcalloc(max_queues, sizeof(*priv->qpls), GFP_KERNEL); if (!priv->qpls) return -ENOMEM; - for (i = 0; i < gve_num_tx_qpls(priv); i++) { + start_id = gve_tx_start_qpl_id(priv); + for (i = start_id; i < start_id + gve_num_tx_qpls(priv); i++) { err = gve_alloc_queue_page_list(priv, i, priv->tx_pages_per_qpl); if (err) goto free_qpls; } - for (; i < num_qpls; i++) { + + start_id = gve_rx_start_qpl_id(priv); + for (i = start_id; i < start_id + gve_num_rx_qpls(priv); i++) { err = gve_alloc_queue_page_list(priv, i, priv->rx_data_slot_cnt); if (err) goto free_qpls; } - priv->qpl_cfg.qpl_map_size = BITS_TO_LONGS(num_qpls) * + priv->qpl_cfg.qpl_map_size = BITS_TO_LONGS(max_queues) * sizeof(unsigned long) * BITS_PER_BYTE; - 
priv->qpl_cfg.qpl_id_map = kvcalloc(BITS_TO_LONGS(num_qpls), + priv->qpl_cfg.qpl_id_map = kvcalloc(BITS_TO_LONGS(max_queues), sizeof(unsigned long), GFP_KERNEL); if (!priv->qpl_cfg.qpl_id_map) { err = -ENOMEM; @@ -935,23 +1136,36 @@ free_qpls: for (j = 0; j <= i; j++) gve_free_queue_page_list(priv, j); kvfree(priv->qpls); + priv->qpls = NULL; return err; } +static void gve_free_xdp_qpls(struct gve_priv *priv) +{ + int start_id; + int i; + + start_id = gve_tx_qpl_id(priv, gve_xdp_tx_start_queue_id(priv)); + for (i = start_id; i < start_id + gve_num_xdp_qpls(priv); i++) + gve_free_queue_page_list(priv, i); +} + static void gve_free_qpls(struct gve_priv *priv) { - int num_qpls = gve_num_tx_qpls(priv) + gve_num_rx_qpls(priv); + int max_queues = priv->tx_cfg.max_queues + priv->rx_cfg.max_queues; int i; - if (num_qpls == 0) + if (!priv->qpls) return; kvfree(priv->qpl_cfg.qpl_id_map); + priv->qpl_cfg.qpl_id_map = NULL; - for (i = 0; i < num_qpls; i++) + for (i = 0; i < max_queues; i++) gve_free_queue_page_list(priv, i); kvfree(priv->qpls); + priv->qpls = NULL; } /* Use this to schedule a reset when the device is capable of continuing @@ -969,11 +1183,109 @@ static int gve_reset_recovery(struct gve_priv *priv, bool was_up); static void gve_turndown(struct gve_priv *priv); static void gve_turnup(struct gve_priv *priv); +static int gve_reg_xdp_info(struct gve_priv *priv, struct net_device *dev) +{ + struct napi_struct *napi; + struct gve_rx_ring *rx; + int err = 0; + int i, j; + u32 tx_qid; + + if (!priv->num_xdp_queues) + return 0; + + for (i = 0; i < priv->rx_cfg.num_queues; i++) { + rx = &priv->rx[i]; + napi = &priv->ntfy_blocks[rx->ntfy_id].napi; + + err = xdp_rxq_info_reg(&rx->xdp_rxq, dev, i, + napi->napi_id); + if (err) + goto err; + err = xdp_rxq_info_reg_mem_model(&rx->xdp_rxq, + MEM_TYPE_PAGE_SHARED, NULL); + if (err) + goto err; + rx->xsk_pool = xsk_get_pool_from_qid(dev, i); + if (rx->xsk_pool) { + err = xdp_rxq_info_reg(&rx->xsk_rxq, dev, i, + napi->napi_id); + 
if (err) + goto err; + err = xdp_rxq_info_reg_mem_model(&rx->xsk_rxq, + MEM_TYPE_XSK_BUFF_POOL, NULL); + if (err) + goto err; + xsk_pool_set_rxq_info(rx->xsk_pool, + &rx->xsk_rxq); + } + } + + for (i = 0; i < priv->num_xdp_queues; i++) { + tx_qid = gve_xdp_tx_queue_id(priv, i); + priv->tx[tx_qid].xsk_pool = xsk_get_pool_from_qid(dev, i); + } + return 0; + +err: + for (j = i; j >= 0; j--) { + rx = &priv->rx[j]; + if (xdp_rxq_info_is_reg(&rx->xdp_rxq)) + xdp_rxq_info_unreg(&rx->xdp_rxq); + if (xdp_rxq_info_is_reg(&rx->xsk_rxq)) + xdp_rxq_info_unreg(&rx->xsk_rxq); + } + return err; +} + +static void gve_unreg_xdp_info(struct gve_priv *priv) +{ + int i, tx_qid; + + if (!priv->num_xdp_queues) + return; + + for (i = 0; i < priv->rx_cfg.num_queues; i++) { + struct gve_rx_ring *rx = &priv->rx[i]; + + xdp_rxq_info_unreg(&rx->xdp_rxq); + if (rx->xsk_pool) { + xdp_rxq_info_unreg(&rx->xsk_rxq); + rx->xsk_pool = NULL; + } + } + + for (i = 0; i < priv->num_xdp_queues; i++) { + tx_qid = gve_xdp_tx_queue_id(priv, i); + priv->tx[tx_qid].xsk_pool = NULL; + } +} + +static void gve_drain_page_cache(struct gve_priv *priv) +{ + struct page_frag_cache *nc; + int i; + + for (i = 0; i < priv->rx_cfg.num_queues; i++) { + nc = &priv->rx[i].page_cache; + if (nc->va) { + __page_frag_cache_drain(virt_to_page(nc->va), + nc->pagecnt_bias); + nc->va = NULL; + } + } +} + static int gve_open(struct net_device *dev) { struct gve_priv *priv = netdev_priv(dev); int err; + if (priv->xdp_prog) + priv->num_xdp_queues = priv->rx_cfg.num_queues; + else + priv->num_xdp_queues = 0; + err = gve_alloc_qpls(priv); if (err) return err; @@ -989,6 +1301,10 @@ static int gve_open(struct net_device *dev) if (err) goto free_rings; + err = gve_reg_xdp_info(priv, dev); + if (err) + goto free_rings; + err = gve_register_qpls(priv); if (err) goto reset; @@ -1043,6 +1359,7 @@ static int gve_close(struct net_device *dev) netif_carrier_off(dev); if (gve_get_device_rings_ok(priv)) { gve_turndown(priv); + 
gve_drain_page_cache(priv); err = gve_destroy_rings(priv); if (err) goto err; @@ -1053,6 +1370,7 @@ static int gve_close(struct net_device *dev) } del_timer_sync(&priv->stats_report_timer); + gve_unreg_xdp_info(priv); gve_free_rings(priv); gve_free_qpls(priv); priv->interface_down_cnt++; @@ -1069,6 +1387,306 @@ err: return gve_reset_recovery(priv, false); } +static int gve_remove_xdp_queues(struct gve_priv *priv) +{ + int err; + + err = gve_destroy_xdp_rings(priv); + if (err) + return err; + + err = gve_unregister_xdp_qpls(priv); + if (err) + return err; + + gve_unreg_xdp_info(priv); + gve_free_xdp_rings(priv); + gve_free_xdp_qpls(priv); + priv->num_xdp_queues = 0; + return 0; +} + +static int gve_add_xdp_queues(struct gve_priv *priv) +{ + int err; + + priv->num_xdp_queues = priv->tx_cfg.num_queues; + + err = gve_alloc_xdp_qpls(priv); + if (err) + goto err; + + err = gve_alloc_xdp_rings(priv); + if (err) + goto free_xdp_qpls; + + err = gve_reg_xdp_info(priv, priv->dev); + if (err) + goto free_xdp_rings; + + err = gve_register_xdp_qpls(priv); + if (err) + goto free_xdp_rings; + + err = gve_create_xdp_rings(priv); + if (err) + goto free_xdp_rings; + + return 0; + +free_xdp_rings: + gve_free_xdp_rings(priv); +free_xdp_qpls: + gve_free_xdp_qpls(priv); +err: + priv->num_xdp_queues = 0; + return err; +} + +static void gve_handle_link_status(struct gve_priv *priv, bool link_status) +{ + if (!gve_get_napi_enabled(priv)) + return; + + if (link_status == netif_carrier_ok(priv->dev)) + return; + + if (link_status) { + netdev_info(priv->dev, "Device link is up.\n"); + netif_carrier_on(priv->dev); + } else { + netdev_info(priv->dev, "Device link is down.\n"); + netif_carrier_off(priv->dev); + } +} + +static int gve_set_xdp(struct gve_priv *priv, struct bpf_prog *prog, + struct netlink_ext_ack *extack) +{ + struct bpf_prog *old_prog; + int err = 0; + u32 status; + + old_prog = READ_ONCE(priv->xdp_prog); + if (!netif_carrier_ok(priv->dev)) { + WRITE_ONCE(priv->xdp_prog, prog); + 
if (old_prog) + bpf_prog_put(old_prog); + return 0; + } + + gve_turndown(priv); + if (!old_prog && prog) { + // Allocate XDP TX queues if an XDP program is + // being installed + err = gve_add_xdp_queues(priv); + if (err) + goto out; + } else if (old_prog && !prog) { + // Remove XDP TX queues if an XDP program is + // being uninstalled + err = gve_remove_xdp_queues(priv); + if (err) + goto out; + } + WRITE_ONCE(priv->xdp_prog, prog); + if (old_prog) + bpf_prog_put(old_prog); + +out: + gve_turnup(priv); + status = ioread32be(&priv->reg_bar0->device_status); + gve_handle_link_status(priv, GVE_DEVICE_STATUS_LINK_STATUS_MASK & status); + return err; +} + +static int gve_xsk_pool_enable(struct net_device *dev, + struct xsk_buff_pool *pool, + u16 qid) +{ + struct gve_priv *priv = netdev_priv(dev); + struct napi_struct *napi; + struct gve_rx_ring *rx; + int tx_qid; + int err; + + if (qid >= priv->rx_cfg.num_queues) { + dev_err(&priv->pdev->dev, "xsk pool invalid qid %d", qid); + return -EINVAL; + } + if (xsk_pool_get_rx_frame_size(pool) < + priv->dev->max_mtu + sizeof(struct ethhdr)) { + dev_err(&priv->pdev->dev, "xsk pool frame_len too small"); + return -EINVAL; + } + + err = xsk_pool_dma_map(pool, &priv->pdev->dev, + DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING); + if (err) + return err; + + /* If XDP prog is not installed, return */ + if (!priv->xdp_prog) + return 0; + + rx = &priv->rx[qid]; + napi = &priv->ntfy_blocks[rx->ntfy_id].napi; + err = xdp_rxq_info_reg(&rx->xsk_rxq, dev, qid, napi->napi_id); + if (err) + goto err; + + err = xdp_rxq_info_reg_mem_model(&rx->xsk_rxq, + MEM_TYPE_XSK_BUFF_POOL, NULL); + if (err) + goto err; + + xsk_pool_set_rxq_info(pool, &rx->xsk_rxq); + rx->xsk_pool = pool; + + tx_qid = gve_xdp_tx_queue_id(priv, qid); + priv->tx[tx_qid].xsk_pool = pool; + + return 0; +err: + if (xdp_rxq_info_is_reg(&rx->xsk_rxq)) + xdp_rxq_info_unreg(&rx->xsk_rxq); + + xsk_pool_dma_unmap(pool, + DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING); + return err; 
+} + +static int gve_xsk_pool_disable(struct net_device *dev, + u16 qid) +{ + struct gve_priv *priv = netdev_priv(dev); + struct napi_struct *napi_rx; + struct napi_struct *napi_tx; + struct xsk_buff_pool *pool; + int tx_qid; + + pool = xsk_get_pool_from_qid(dev, qid); + if (!pool) + return -EINVAL; + if (qid >= priv->rx_cfg.num_queues) + return -EINVAL; + + /* If XDP prog is not installed, unmap DMA and return */ + if (!priv->xdp_prog) + goto done; + + tx_qid = gve_xdp_tx_queue_id(priv, qid); + if (!netif_running(dev)) { + priv->rx[qid].xsk_pool = NULL; + xdp_rxq_info_unreg(&priv->rx[qid].xsk_rxq); + priv->tx[tx_qid].xsk_pool = NULL; + goto done; + } + + napi_rx = &priv->ntfy_blocks[priv->rx[qid].ntfy_id].napi; + napi_disable(napi_rx); /* make sure current rx poll is done */ + + napi_tx = &priv->ntfy_blocks[priv->tx[tx_qid].ntfy_id].napi; + napi_disable(napi_tx); /* make sure current tx poll is done */ + + priv->rx[qid].xsk_pool = NULL; + xdp_rxq_info_unreg(&priv->rx[qid].xsk_rxq); + priv->tx[tx_qid].xsk_pool = NULL; + smp_mb(); /* Make sure it is visible to the workers on datapath */ + + napi_enable(napi_rx); + if (gve_rx_work_pending(&priv->rx[qid])) + napi_schedule(napi_rx); + + napi_enable(napi_tx); + if (gve_tx_clean_pending(priv, &priv->tx[tx_qid])) + napi_schedule(napi_tx); + +done: + xsk_pool_dma_unmap(pool, + DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING); + return 0; +} + +static int gve_xsk_wakeup(struct net_device *dev, u32 queue_id, u32 flags) +{ + struct gve_priv *priv = netdev_priv(dev); + int tx_queue_id = gve_xdp_tx_queue_id(priv, queue_id); + + if (queue_id >= priv->rx_cfg.num_queues || !priv->xdp_prog) + return -EINVAL; + + if (flags & XDP_WAKEUP_TX) { + struct gve_tx_ring *tx = &priv->tx[tx_queue_id]; + struct napi_struct *napi = + &priv->ntfy_blocks[tx->ntfy_id].napi; + + if (!napi_if_scheduled_mark_missed(napi)) { + /* Call local_bh_enable to trigger SoftIRQ processing */ + local_bh_disable(); + napi_schedule(napi); + local_bh_enable(); + } 
+ + tx->xdp_xsk_wakeup++; + } + + return 0; +} + +static int verify_xdp_configuration(struct net_device *dev) +{ + struct gve_priv *priv = netdev_priv(dev); + + if (dev->features & NETIF_F_LRO) { + netdev_warn(dev, "XDP is not supported when LRO is on.\n"); + return -EOPNOTSUPP; + } + + if (priv->queue_format != GVE_GQI_QPL_FORMAT) { + netdev_warn(dev, "XDP is not supported in mode %d.\n", + priv->queue_format); + return -EOPNOTSUPP; + } + + if (dev->mtu > (PAGE_SIZE / 2) - sizeof(struct ethhdr) - GVE_RX_PAD) { + netdev_warn(dev, "XDP is not supported for mtu %d.\n", + dev->mtu); + return -EOPNOTSUPP; + } + + if (priv->rx_cfg.num_queues != priv->tx_cfg.num_queues || + (2 * priv->tx_cfg.num_queues > priv->tx_cfg.max_queues)) { + netdev_warn(dev, "XDP load failed: The number of configured RX queues %d should be equal to the number of configured TX queues %d and the number of configured RX/TX queues should be less than or equal to half the maximum number of RX/TX queues %d", + priv->rx_cfg.num_queues, + priv->tx_cfg.num_queues, + priv->tx_cfg.max_queues); + return -EINVAL; + } + return 0; +} + +static int gve_xdp(struct net_device *dev, struct netdev_bpf *xdp) +{ + struct gve_priv *priv = netdev_priv(dev); + int err; + + err = verify_xdp_configuration(dev); + if (err) + return err; + switch (xdp->command) { + case XDP_SETUP_PROG: + return gve_set_xdp(priv, xdp->prog, xdp->extack); + case XDP_SETUP_XSK_POOL: + if (xdp->xsk.pool) + return gve_xsk_pool_enable(dev, xdp->xsk.pool, xdp->xsk.queue_id); + else + return gve_xsk_pool_disable(dev, xdp->xsk.queue_id); + default: + return -EINVAL; + } +} + int gve_adjust_queues(struct gve_priv *priv, struct gve_queue_config new_rx_config, struct gve_queue_config new_tx_config) @@ -1118,7 +1736,7 @@ static void gve_turndown(struct gve_priv *priv) return; /* Disable napi to prevent more work from coming in */ - for (idx = 0; idx < priv->tx_cfg.num_queues; idx++) { + for (idx = 0; idx < gve_num_tx_queues(priv); idx++) { int ntfy_idx 
= gve_tx_idx_to_ntfy(priv, idx); struct gve_notify_block *block = &priv->ntfy_blocks[ntfy_idx]; @@ -1146,7 +1764,7 @@ static void gve_turnup(struct gve_priv *priv) netif_tx_start_all_queues(priv->dev); /* Enable napi and unmask interrupts for all queues */ - for (idx = 0; idx < priv->tx_cfg.num_queues; idx++) { + for (idx = 0; idx < gve_num_tx_queues(priv); idx++) { int ntfy_idx = gve_tx_idx_to_ntfy(priv, idx); struct gve_notify_block *block = &priv->ntfy_blocks[ntfy_idx]; @@ -1263,6 +1881,9 @@ static const struct net_device_ops gve_netdev_ops = { .ndo_get_stats64 = gve_get_stats, .ndo_tx_timeout = gve_tx_timeout, .ndo_set_features = gve_set_features, + .ndo_bpf = gve_xdp, + .ndo_xdp_xmit = gve_xdp_xmit, + .ndo_xsk_wakeup = gve_xsk_wakeup, }; static void gve_handle_status(struct gve_priv *priv, u32 status) @@ -1306,7 +1927,7 @@ void gve_handle_report_stats(struct gve_priv *priv) be64_add_cpu(&priv->stats_report->written_count, 1); /* tx stats */ if (priv->tx) { - for (idx = 0; idx < priv->tx_cfg.num_queues; idx++) { + for (idx = 0; idx < gve_num_tx_queues(priv); idx++) { u32 last_completion = 0; u32 tx_frames = 0; @@ -1369,23 +1990,6 @@ void gve_handle_report_stats(struct gve_priv *priv) } } -static void gve_handle_link_status(struct gve_priv *priv, bool link_status) -{ - if (!gve_get_napi_enabled(priv)) - return; - - if (link_status == netif_carrier_ok(priv->dev)) - return; - - if (link_status) { - netdev_info(priv->dev, "Device link is up.\n"); - netif_carrier_on(priv->dev); - } else { - netdev_info(priv->dev, "Device link is down.\n"); - netif_carrier_off(priv->dev); - } -} - /* Handle NIC status register changes, reset requests and report stats */ static void gve_service_task(struct work_struct *work) { @@ -1399,6 +2003,18 @@ static void gve_service_task(struct work_struct *work) gve_handle_link_status(priv, GVE_DEVICE_STATUS_LINK_STATUS_MASK & status); } +static void gve_set_netdev_xdp_features(struct gve_priv *priv) +{ + if (priv->queue_format == 
GVE_GQI_QPL_FORMAT) { + priv->dev->xdp_features = NETDEV_XDP_ACT_BASIC; + priv->dev->xdp_features |= NETDEV_XDP_ACT_REDIRECT; + priv->dev->xdp_features |= NETDEV_XDP_ACT_NDO_XMIT; + priv->dev->xdp_features |= NETDEV_XDP_ACT_XSK_ZEROCOPY; + } else { + priv->dev->xdp_features = 0; + } +} + static int gve_init_priv(struct gve_priv *priv, bool skip_describe_device) { int num_ntfy; @@ -1477,6 +2093,7 @@ static int gve_init_priv(struct gve_priv *priv, bool skip_describe_device) } setup_device: + gve_set_netdev_xdp_features(priv); err = gve_setup_device_resources(priv); if (!err) return 0; diff --git a/drivers/net/ethernet/google/gve/gve_rx.c b/drivers/net/ethernet/google/gve/gve_rx.c index 1f55137722b0..d1da7413dc4d 100644 --- a/drivers/net/ethernet/google/gve/gve_rx.c +++ b/drivers/net/ethernet/google/gve/gve_rx.c @@ -8,6 +8,9 @@ #include "gve_adminq.h" #include "gve_utils.h" #include <linux/etherdevice.h> +#include <linux/filter.h> +#include <net/xdp.h> +#include <net/xdp_sock_drv.h> static void gve_rx_free_buffer(struct device *dev, struct gve_rx_slot_page_info *page_info, @@ -124,7 +127,7 @@ static int gve_prefill_rx_pages(struct gve_rx_ring *rx) return -ENOMEM; if (!rx->data.raw_addressing) { - rx->data.qpl = gve_assign_rx_qpl(priv); + rx->data.qpl = gve_assign_rx_qpl(priv, rx->q_num); if (!rx->data.qpl) { kvfree(rx->data.page_info); rx->data.page_info = NULL; @@ -556,7 +559,7 @@ static struct sk_buff *gve_rx_skb(struct gve_priv *priv, struct gve_rx_ring *rx, if (len <= priv->rx_copybreak && is_only_frag) { /* Just copy small packets */ - skb = gve_rx_copy(netdev, napi, page_info, len, GVE_RX_PAD); + skb = gve_rx_copy(netdev, napi, page_info, len); if (skb) { u64_stats_update_begin(&rx->statss); rx->rx_copied_pkt++; @@ -591,6 +594,107 @@ static struct sk_buff *gve_rx_skb(struct gve_priv *priv, struct gve_rx_ring *rx, return skb; } +static int gve_xsk_pool_redirect(struct net_device *dev, + struct gve_rx_ring *rx, + void *data, int len, + struct bpf_prog *xdp_prog) 
+{ + struct xdp_buff *xdp; + int err; + + if (rx->xsk_pool->frame_len < len) + return -E2BIG; + xdp = xsk_buff_alloc(rx->xsk_pool); + if (!xdp) { + u64_stats_update_begin(&rx->statss); + rx->xdp_alloc_fails++; + u64_stats_update_end(&rx->statss); + return -ENOMEM; + } + xdp->data_end = xdp->data + len; + memcpy(xdp->data, data, len); + err = xdp_do_redirect(dev, xdp, xdp_prog); + if (err) + xsk_buff_free(xdp); + return err; +} + +static int gve_xdp_redirect(struct net_device *dev, struct gve_rx_ring *rx, + struct xdp_buff *orig, struct bpf_prog *xdp_prog) +{ + int total_len, len = orig->data_end - orig->data; + int headroom = XDP_PACKET_HEADROOM; + struct xdp_buff new; + void *frame; + int err; + + if (rx->xsk_pool) + return gve_xsk_pool_redirect(dev, rx, orig->data, + len, xdp_prog); + + total_len = headroom + SKB_DATA_ALIGN(len) + + SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); + frame = page_frag_alloc(&rx->page_cache, total_len, GFP_ATOMIC); + if (!frame) { + u64_stats_update_begin(&rx->statss); + rx->xdp_alloc_fails++; + u64_stats_update_end(&rx->statss); + return -ENOMEM; + } + xdp_init_buff(&new, total_len, &rx->xdp_rxq); + xdp_prepare_buff(&new, frame, headroom, len, false); + memcpy(new.data, orig->data, len); + + err = xdp_do_redirect(dev, &new, xdp_prog); + if (err) + page_frag_free(frame); + + return err; +} + +static void gve_xdp_done(struct gve_priv *priv, struct gve_rx_ring *rx, + struct xdp_buff *xdp, struct bpf_prog *xprog, + int xdp_act) +{ + struct gve_tx_ring *tx; + int tx_qid; + int err; + + switch (xdp_act) { + case XDP_ABORTED: + case XDP_DROP: + default: + break; + case XDP_TX: + tx_qid = gve_xdp_tx_queue_id(priv, rx->q_num); + tx = &priv->tx[tx_qid]; + spin_lock(&tx->xdp_lock); + err = gve_xdp_xmit_one(priv, tx, xdp->data, + xdp->data_end - xdp->data, NULL); + spin_unlock(&tx->xdp_lock); + + if (unlikely(err)) { + u64_stats_update_begin(&rx->statss); + rx->xdp_tx_errors++; + u64_stats_update_end(&rx->statss); + } + break; + case 
XDP_REDIRECT: + err = gve_xdp_redirect(priv->dev, rx, xdp, xprog); + + if (unlikely(err)) { + u64_stats_update_begin(&rx->statss); + rx->xdp_redirect_errors++; + u64_stats_update_end(&rx->statss); + } + break; + } + u64_stats_update_begin(&rx->statss); + if ((u32)xdp_act < GVE_XDP_ACTIONS) + rx->xdp_actions[xdp_act]++; + u64_stats_update_end(&rx->statss); +} + #define GVE_PKTCONT_BIT_IS_SET(x) (GVE_RXF_PKT_CONT & (x)) static void gve_rx(struct gve_rx_ring *rx, netdev_features_t feat, struct gve_rx_desc *desc, u32 idx, @@ -603,9 +707,12 @@ static void gve_rx(struct gve_rx_ring *rx, netdev_features_t feat, union gve_rx_data_slot *data_slot; struct gve_priv *priv = rx->gve; struct sk_buff *skb = NULL; + struct bpf_prog *xprog; + struct xdp_buff xdp; dma_addr_t page_bus; void *va; + u16 len = frag_size; struct napi_struct *napi = &priv->ntfy_blocks[rx->ntfy_id].napi; bool is_first_frag = ctx->frag_cnt == 0; @@ -645,9 +752,35 @@ static void gve_rx(struct gve_rx_ring *rx, netdev_features_t feat, dma_sync_single_for_cpu(&priv->pdev->dev, page_bus, PAGE_SIZE, DMA_FROM_DEVICE); page_info->pad = is_first_frag ? 
GVE_RX_PAD : 0; + len -= page_info->pad; frag_size -= page_info->pad; - skb = gve_rx_skb(priv, rx, page_info, napi, frag_size, + xprog = READ_ONCE(priv->xdp_prog); + if (xprog && is_only_frag) { + void *old_data; + int xdp_act; + + xdp_init_buff(&xdp, rx->packet_buffer_size, &rx->xdp_rxq); + xdp_prepare_buff(&xdp, page_info->page_address + + page_info->page_offset, GVE_RX_PAD, + len, false); + old_data = xdp.data; + xdp_act = bpf_prog_run_xdp(xprog, &xdp); + if (xdp_act != XDP_PASS) { + gve_xdp_done(priv, rx, &xdp, xprog, xdp_act); + ctx->total_size += frag_size; + goto finish_ok_pkt; + } + + page_info->pad += xdp.data - old_data; + len = xdp.data_end - xdp.data; + + u64_stats_update_begin(&rx->statss); + rx->xdp_actions[XDP_PASS]++; + u64_stats_update_end(&rx->statss); + } + + skb = gve_rx_skb(priv, rx, page_info, napi, len, data_slot, is_only_frag); if (!skb) { u64_stats_update_begin(&rx->statss); @@ -773,6 +906,8 @@ static bool gve_rx_refill_buffers(struct gve_priv *priv, struct gve_rx_ring *rx) static int gve_clean_rx_done(struct gve_rx_ring *rx, int budget, netdev_features_t feat) { + u64 xdp_redirects = rx->xdp_actions[XDP_REDIRECT]; + u64 xdp_txs = rx->xdp_actions[XDP_TX]; struct gve_rx_ctx *ctx = &rx->ctx; struct gve_priv *priv = rx->gve; struct gve_rx_cnts cnts = {0}; @@ -820,6 +955,12 @@ static int gve_clean_rx_done(struct gve_rx_ring *rx, int budget, u64_stats_update_end(&rx->statss); } + if (xdp_txs != rx->xdp_actions[XDP_TX]) + gve_xdp_tx_flush(priv, rx->q_num); + + if (xdp_redirects != rx->xdp_actions[XDP_REDIRECT]) + xdp_do_flush(); + /* restock ring slots */ if (!rx->data.raw_addressing) { /* In QPL mode buffs are refilled as the desc are processed */ diff --git a/drivers/net/ethernet/google/gve/gve_rx_dqo.c b/drivers/net/ethernet/google/gve/gve_rx_dqo.c index 630f42a3037b..e57b73eb70f6 100644 --- a/drivers/net/ethernet/google/gve/gve_rx_dqo.c +++ b/drivers/net/ethernet/google/gve/gve_rx_dqo.c @@ -568,7 +568,7 @@ static int gve_rx_dqo(struct 
napi_struct *napi, struct gve_rx_ring *rx, if (eop && buf_len <= priv->rx_copybreak) { rx->ctx.skb_head = gve_rx_copy(priv->dev, napi, - &buf_state->page_info, buf_len, 0); + &buf_state->page_info, buf_len); if (unlikely(!rx->ctx.skb_head)) goto error; rx->ctx.skb_tail = rx->ctx.skb_head; diff --git a/drivers/net/ethernet/google/gve/gve_tx.c b/drivers/net/ethernet/google/gve/gve_tx.c index 5e11b8236754..813da572abca 100644 --- a/drivers/net/ethernet/google/gve/gve_tx.c +++ b/drivers/net/ethernet/google/gve/gve_tx.c @@ -11,6 +11,7 @@ #include <linux/tcp.h> #include <linux/vmalloc.h> #include <linux/skbuff.h> +#include <net/xdp_sock_drv.h> static inline void gve_tx_put_doorbell(struct gve_priv *priv, struct gve_queue_resources *q_resources, @@ -19,6 +20,14 @@ static inline void gve_tx_put_doorbell(struct gve_priv *priv, iowrite32be(val, &priv->db_bar2[be32_to_cpu(q_resources->db_index)]); } +void gve_xdp_tx_flush(struct gve_priv *priv, u32 xdp_qid) +{ + u32 tx_qid = gve_xdp_tx_queue_id(priv, xdp_qid); + struct gve_tx_ring *tx = &priv->tx[tx_qid]; + + gve_tx_put_doorbell(priv, tx->q_resources, tx->req); +} + /* gvnic can only transmit from a Registered Segment. * We copy skb payloads into the registered segment before writing Tx * descriptors and ringing the Tx doorbell. 
@@ -132,6 +141,58 @@ static void gve_tx_free_fifo(struct gve_tx_fifo *fifo, size_t bytes) atomic_add(bytes, &fifo->available); } +static size_t gve_tx_clear_buffer_state(struct gve_tx_buffer_state *info) +{ + size_t space_freed = 0; + int i; + + for (i = 0; i < ARRAY_SIZE(info->iov); i++) { + space_freed += info->iov[i].iov_len + info->iov[i].iov_padding; + info->iov[i].iov_len = 0; + info->iov[i].iov_padding = 0; + } + return space_freed; +} + +static int gve_clean_xdp_done(struct gve_priv *priv, struct gve_tx_ring *tx, + u32 to_do) +{ + struct gve_tx_buffer_state *info; + u32 clean_end = tx->done + to_do; + u64 pkts = 0, bytes = 0; + size_t space_freed = 0; + u32 xsk_complete = 0; + u32 idx; + + for (; tx->done < clean_end; tx->done++) { + idx = tx->done & tx->mask; + info = &tx->info[idx]; + + if (unlikely(!info->xdp.size)) + continue; + + bytes += info->xdp.size; + pkts++; + xsk_complete += info->xdp.is_xsk; + + info->xdp.size = 0; + if (info->xdp_frame) { + xdp_return_frame(info->xdp_frame); + info->xdp_frame = NULL; + } + space_freed += gve_tx_clear_buffer_state(info); + } + + gve_tx_free_fifo(&tx->tx_fifo, space_freed); + if (xsk_complete > 0 && tx->xsk_pool) + xsk_tx_completed(tx->xsk_pool, xsk_complete); + u64_stats_update_begin(&tx->statss); + tx->bytes_done += bytes; + tx->pkt_done += pkts; + u64_stats_update_end(&tx->statss); + return pkts; +} + static int gve_clean_tx_done(struct gve_priv *priv, struct gve_tx_ring *tx, u32 to_do, bool try_to_wake); @@ -144,8 +205,12 @@ static void gve_tx_free_ring(struct gve_priv *priv, int idx) gve_tx_remove_from_block(priv, idx); slots = tx->mask + 1; - gve_clean_tx_done(priv, tx, priv->tx_desc_cnt, false); - netdev_tx_reset_queue(tx->netdev_txq); + if (tx->q_num < priv->tx_cfg.num_queues) { + gve_clean_tx_done(priv, tx, priv->tx_desc_cnt, false); + netdev_tx_reset_queue(tx->netdev_txq); + } else { + gve_clean_xdp_done(priv, tx, priv->tx_desc_cnt); + } dma_free_coherent(hdev, sizeof(*tx->q_resources), 
tx->q_resources, tx->q_resources_bus); @@ -177,6 +242,7 @@ static int gve_tx_alloc_ring(struct gve_priv *priv, int idx) /* Make sure everything is zeroed to start */ memset(tx, 0, sizeof(*tx)); spin_lock_init(&tx->clean_lock); + spin_lock_init(&tx->xdp_lock); tx->q_num = idx; tx->mask = slots - 1; @@ -195,7 +261,7 @@ static int gve_tx_alloc_ring(struct gve_priv *priv, int idx) tx->raw_addressing = priv->queue_format == GVE_GQI_RDA_FORMAT; tx->dev = &priv->pdev->dev; if (!tx->raw_addressing) { - tx->tx_fifo.qpl = gve_assign_tx_qpl(priv); + tx->tx_fifo.qpl = gve_assign_tx_qpl(priv, idx); if (!tx->tx_fifo.qpl) goto abort_with_desc; /* map Tx FIFO */ @@ -213,7 +279,8 @@ static int gve_tx_alloc_ring(struct gve_priv *priv, int idx) netif_dbg(priv, drv, priv->dev, "tx[%d]->bus=%lx\n", idx, (unsigned long)tx->bus); - tx->netdev_txq = netdev_get_tx_queue(priv->dev, idx); + if (idx < priv->tx_cfg.num_queues) + tx->netdev_txq = netdev_get_tx_queue(priv->dev, idx); gve_tx_add_to_block(priv, idx); return 0; @@ -233,12 +300,12 @@ abort_with_info: return -ENOMEM; } -int gve_tx_alloc_rings(struct gve_priv *priv) +int gve_tx_alloc_rings(struct gve_priv *priv, int start_id, int num_rings) { int err = 0; int i; - for (i = 0; i < priv->tx_cfg.num_queues; i++) { + for (i = start_id; i < start_id + num_rings; i++) { err = gve_tx_alloc_ring(priv, i); if (err) { netif_err(priv, drv, priv->dev, @@ -251,17 +318,17 @@ int gve_tx_alloc_rings(struct gve_priv *priv) if (err) { int j; - for (j = 0; j < i; j++) + for (j = start_id; j < i; j++) gve_tx_free_ring(priv, j); } return err; } -void gve_tx_free_rings_gqi(struct gve_priv *priv) +void gve_tx_free_rings_gqi(struct gve_priv *priv, int start_id, int num_rings) { int i; - for (i = 0; i < priv->tx_cfg.num_queues; i++) + for (i = start_id; i < start_id + num_rings; i++) gve_tx_free_ring(priv, i); } @@ -374,18 +441,18 @@ static int gve_maybe_stop_tx(struct gve_priv *priv, struct gve_tx_ring *tx, } static void gve_tx_fill_pkt_desc(union 
gve_tx_desc *pkt_desc,
-				 struct sk_buff *skb, bool is_gso,
+				 u16 csum_offset, u8 ip_summed, bool is_gso,
 				 int l4_hdr_offset, u32 desc_cnt,
-				 u16 hlen, u64 addr)
+				 u16 hlen, u64 addr, u16 pkt_len)
 {
 	/* l4_hdr_offset and csum_offset are in units of 16-bit words */
 	if (is_gso) {
 		pkt_desc->pkt.type_flags = GVE_TXD_TSO | GVE_TXF_L4CSUM;
-		pkt_desc->pkt.l4_csum_offset = skb->csum_offset >> 1;
+		pkt_desc->pkt.l4_csum_offset = csum_offset >> 1;
 		pkt_desc->pkt.l4_hdr_offset = l4_hdr_offset >> 1;
-	} else if (likely(skb->ip_summed == CHECKSUM_PARTIAL)) {
+	} else if (likely(ip_summed == CHECKSUM_PARTIAL)) {
 		pkt_desc->pkt.type_flags = GVE_TXD_STD | GVE_TXF_L4CSUM;
-		pkt_desc->pkt.l4_csum_offset = skb->csum_offset >> 1;
+		pkt_desc->pkt.l4_csum_offset = csum_offset >> 1;
 		pkt_desc->pkt.l4_hdr_offset = l4_hdr_offset >> 1;
 	} else {
 		pkt_desc->pkt.type_flags = GVE_TXD_STD;
@@ -393,7 +460,7 @@ static void gve_tx_fill_pkt_desc(union gve_tx_desc *pkt_desc,
 		pkt_desc->pkt.l4_hdr_offset = 0;
 	}
 	pkt_desc->pkt.desc_cnt = desc_cnt;
-	pkt_desc->pkt.len = cpu_to_be16(skb->len);
+	pkt_desc->pkt.len = cpu_to_be16(pkt_len);
 	pkt_desc->pkt.seg_len = cpu_to_be16(hlen);
 	pkt_desc->pkt.seg_addr = cpu_to_be64(addr);
 }
@@ -412,15 +479,16 @@ static void gve_tx_fill_mtd_desc(union gve_tx_desc *mtd_desc,
 }

 static void gve_tx_fill_seg_desc(union gve_tx_desc *seg_desc,
-				 struct sk_buff *skb, bool is_gso,
+				 u16 l3_offset, u16 gso_size,
+				 bool is_gso_v6, bool is_gso,
 				 u16 len, u64 addr)
 {
 	seg_desc->seg.type_flags = GVE_TXD_SEG;
 	if (is_gso) {
-		if (skb_is_gso_v6(skb))
+		if (is_gso_v6)
 			seg_desc->seg.type_flags |= GVE_TXSF_IPV6;
-		seg_desc->seg.l3_offset = skb_network_offset(skb) >> 1;
-		seg_desc->seg.mss = cpu_to_be16(skb_shinfo(skb)->gso_size);
+		seg_desc->seg.l3_offset = l3_offset >> 1;
+		seg_desc->seg.mss = cpu_to_be16(gso_size);
 	}
 	seg_desc->seg.seg_len = cpu_to_be16(len);
 	seg_desc->seg.seg_addr = cpu_to_be64(addr);
@@ -471,9 +539,10 @@ static int gve_tx_add_skb_copy(struct gve_priv *priv, struct gve_tx_ring *tx, st
 	payload_nfrags = gve_tx_alloc_fifo(&tx->tx_fifo, skb->len - hlen,
 					   &info->iov[payload_iov]);

-	gve_tx_fill_pkt_desc(pkt_desc, skb, is_gso, l4_hdr_offset,
+	gve_tx_fill_pkt_desc(pkt_desc, skb->csum_offset, skb->ip_summed,
+			     is_gso, l4_hdr_offset,
 			     1 + mtd_desc_nr + payload_nfrags, hlen,
-			     info->iov[hdr_nfrags - 1].iov_offset);
+			     info->iov[hdr_nfrags - 1].iov_offset, skb->len);

 	skb_copy_bits(skb, 0,
 		      tx->tx_fifo.base + info->iov[hdr_nfrags - 1].iov_offset,
@@ -492,7 +561,9 @@ static int gve_tx_add_skb_copy(struct gve_priv *priv, struct gve_tx_ring *tx, st
 		next_idx = (tx->req + 1 + mtd_desc_nr + i - payload_iov) & tx->mask;
 		seg_desc = &tx->desc[next_idx];

-		gve_tx_fill_seg_desc(seg_desc, skb, is_gso,
+		gve_tx_fill_seg_desc(seg_desc, skb_network_offset(skb),
+				     skb_shinfo(skb)->gso_size,
+				     skb_is_gso_v6(skb), is_gso,
 				     info->iov[i].iov_len,
 				     info->iov[i].iov_offset);

@@ -550,8 +621,9 @@ static int gve_tx_add_skb_no_copy(struct gve_priv *priv, struct gve_tx_ring *tx,
 	if (mtd_desc_nr)
 		num_descriptors++;

-	gve_tx_fill_pkt_desc(pkt_desc, skb, is_gso, l4_hdr_offset,
-			     num_descriptors, hlen, addr);
+	gve_tx_fill_pkt_desc(pkt_desc, skb->csum_offset, skb->ip_summed,
+			     is_gso, l4_hdr_offset,
+			     num_descriptors, hlen, addr, skb->len);

 	if (mtd_desc_nr) {
 		idx = (idx + 1) & tx->mask;
@@ -567,7 +639,9 @@ static int gve_tx_add_skb_no_copy(struct gve_priv *priv, struct gve_tx_ring *tx,
 		addr += hlen;
 		idx = (idx + 1) & tx->mask;
 		seg_desc = &tx->desc[idx];
-		gve_tx_fill_seg_desc(seg_desc, skb, is_gso, len, addr);
+		gve_tx_fill_seg_desc(seg_desc, skb_network_offset(skb),
+				     skb_shinfo(skb)->gso_size,
+				     skb_is_gso_v6(skb), is_gso, len, addr);
 	}

 	for (i = 0; i < shinfo->nr_frags; i++) {
@@ -585,7 +659,9 @@ static int gve_tx_add_skb_no_copy(struct gve_priv *priv, struct gve_tx_ring *tx,
 		dma_unmap_len_set(&tx->info[idx], len, len);
 		dma_unmap_addr_set(&tx->info[idx], dma, addr);

-		gve_tx_fill_seg_desc(seg_desc, skb, is_gso, len, addr);
+		gve_tx_fill_seg_desc(seg_desc, skb_network_offset(skb),
+				     skb_shinfo(skb)->gso_size,
+				     skb_is_gso_v6(skb), is_gso, len, addr);
 	}

 	return num_descriptors;
@@ -646,6 +722,103 @@ netdev_tx_t gve_tx(struct sk_buff *skb, struct net_device *dev)
 	return NETDEV_TX_OK;
 }

+static int gve_tx_fill_xdp(struct gve_priv *priv, struct gve_tx_ring *tx,
+			   void *data, int len, void *frame_p, bool is_xsk)
+{
+	int pad, nfrags, ndescs, iovi, offset;
+	struct gve_tx_buffer_state *info;
+	u32 reqi = tx->req;
+
+	pad = gve_tx_fifo_pad_alloc_one_frag(&tx->tx_fifo, len);
+	if (pad >= GVE_GQ_TX_MIN_PKT_DESC_BYTES)
+		pad = 0;
+	info = &tx->info[reqi & tx->mask];
+	info->xdp_frame = frame_p;
+	info->xdp.size = len;
+	info->xdp.is_xsk = is_xsk;
+
+	nfrags = gve_tx_alloc_fifo(&tx->tx_fifo, pad + len,
+				   &info->iov[0]);
+	iovi = pad > 0;
+	ndescs = nfrags - iovi;
+	offset = 0;
+
+	while (iovi < nfrags) {
+		if (!offset)
+			gve_tx_fill_pkt_desc(&tx->desc[reqi & tx->mask], 0,
+					     CHECKSUM_NONE, false, 0, ndescs,
+					     info->iov[iovi].iov_len,
+					     info->iov[iovi].iov_offset, len);
+		else
+			gve_tx_fill_seg_desc(&tx->desc[reqi & tx->mask],
+					     0, 0, false, false,
+					     info->iov[iovi].iov_len,
+					     info->iov[iovi].iov_offset);
+
+		memcpy(tx->tx_fifo.base + info->iov[iovi].iov_offset,
+		       data + offset, info->iov[iovi].iov_len);
+		gve_dma_sync_for_device(&priv->pdev->dev,
+					tx->tx_fifo.qpl->page_buses,
+					info->iov[iovi].iov_offset,
+					info->iov[iovi].iov_len);
+		offset += info->iov[iovi].iov_len;
+		iovi++;
+		reqi++;
+	}
+
+	return ndescs;
+}
+
+int gve_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames,
+		 u32 flags)
+{
+	struct gve_priv *priv = netdev_priv(dev);
+	struct gve_tx_ring *tx;
+	int i, err = 0, qid;
+
+	if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
+		return -EINVAL;
+
+	qid = gve_xdp_tx_queue_id(priv,
+				  smp_processor_id() % priv->num_xdp_queues);
+
+	tx = &priv->tx[qid];
+
+	spin_lock(&tx->xdp_lock);
+	for (i = 0; i < n; i++) {
+		err = gve_xdp_xmit_one(priv, tx, frames[i]->data,
+				       frames[i]->len, frames[i]);
+		if (err)
+			break;
+	}
+
+	if (flags & XDP_XMIT_FLUSH)
+		gve_tx_put_doorbell(priv, tx->q_resources, tx->req);
+
+	spin_unlock(&tx->xdp_lock);
+
+	u64_stats_update_begin(&tx->statss);
+	tx->xdp_xmit += n;
+	tx->xdp_xmit_errors += n - i;
+	u64_stats_update_end(&tx->statss);
+
+	return i ? i : err;
+}
+
+int gve_xdp_xmit_one(struct gve_priv *priv, struct gve_tx_ring *tx,
+		     void *data, int len, void *frame_p)
+{
+	int nsegs;
+
+	if (!gve_can_tx(tx, len + GVE_GQ_TX_MIN_PKT_DESC_BYTES - 1))
+		return -EBUSY;
+
+	nsegs = gve_tx_fill_xdp(priv, tx, data, len, frame_p, false);
+	tx->req += nsegs;
+
+	return 0;
+}
+
 #define GVE_TX_START_THRESH PAGE_SIZE

 static int gve_clean_tx_done(struct gve_priv *priv, struct gve_tx_ring *tx,
@@ -655,8 +828,8 @@ static int gve_clean_tx_done(struct gve_priv *priv, struct gve_tx_ring *tx,
 	u64 pkts = 0, bytes = 0;
 	size_t space_freed = 0;
 	struct sk_buff *skb;
-	int i, j;
 	u32 idx;
+	int j;

 	for (j = 0; j < to_do; j++) {
 		idx = tx->done & tx->mask;
@@ -678,12 +851,7 @@ static int gve_clean_tx_done(struct gve_priv *priv, struct gve_tx_ring *tx,
 			dev_consume_skb_any(skb);
 			if (tx->raw_addressing)
 				continue;
-			/* FIFO free */
-			for (i = 0; i < ARRAY_SIZE(info->iov); i++) {
-				space_freed += info->iov[i].iov_len + info->iov[i].iov_padding;
-				info->iov[i].iov_len = 0;
-				info->iov[i].iov_padding = 0;
-			}
+			space_freed += gve_tx_clear_buffer_state(info);
 		}
 	}

@@ -718,6 +886,70 @@ u32 gve_tx_load_event_counter(struct gve_priv *priv,
 	return be32_to_cpu(counter);
 }

+static int gve_xsk_tx(struct gve_priv *priv, struct gve_tx_ring *tx,
+		      int budget)
+{
+	struct xdp_desc desc;
+	int sent = 0, nsegs;
+	void *data;
+
+	spin_lock(&tx->xdp_lock);
+	while (sent < budget) {
+		if (!gve_can_tx(tx, GVE_TX_START_THRESH))
+			goto out;
+
+		if (!xsk_tx_peek_desc(tx->xsk_pool, &desc)) {
+			tx->xdp_xsk_done = tx->xdp_xsk_wakeup;
+			goto out;
+		}
+
+		data = xsk_buff_raw_get_data(tx->xsk_pool, desc.addr);
+		nsegs = gve_tx_fill_xdp(priv, tx, data, desc.len, NULL, true);
+		tx->req += nsegs;
+		sent++;
+	}
+out:
+	if (sent > 0) {
+		gve_tx_put_doorbell(priv, tx->q_resources, tx->req);
+		xsk_tx_release(tx->xsk_pool);
+	}
+	spin_unlock(&tx->xdp_lock);
+	return sent;
+}
+
+bool gve_xdp_poll(struct gve_notify_block *block, int budget)
+{
+	struct gve_priv *priv = block->priv;
+	struct gve_tx_ring *tx = block->tx;
+	u32 nic_done;
+	bool repoll;
+	u32 to_do;
+
+	/* If budget is 0, do all the work */
+	if (budget == 0)
+		budget = INT_MAX;
+
+	/* Find out how much work there is to be done */
+	nic_done = gve_tx_load_event_counter(priv, tx);
+	to_do = min_t(u32, (nic_done - tx->done), budget);
+	gve_clean_xdp_done(priv, tx, to_do);
+	repoll = nic_done != tx->done;
+
+	if (tx->xsk_pool) {
+		int sent = gve_xsk_tx(priv, tx, budget);
+
+		u64_stats_update_begin(&tx->statss);
+		tx->xdp_xsk_sent += sent;
+		u64_stats_update_end(&tx->statss);
+		repoll |= (sent == budget);
+		if (xsk_uses_need_wakeup(tx->xsk_pool))
+			xsk_set_tx_need_wakeup(tx->xsk_pool);
+	}
+
+	/* If we still have work we want to repoll */
+	return repoll;
+}
+
 bool gve_tx_poll(struct gve_notify_block *block, int budget)
 {
 	struct gve_priv *priv = block->priv;
diff --git a/drivers/net/ethernet/google/gve/gve_utils.c b/drivers/net/ethernet/google/gve/gve_utils.c
index 6ba46adaaee3..26e08d753270 100644
--- a/drivers/net/ethernet/google/gve/gve_utils.c
+++ b/drivers/net/ethernet/google/gve/gve_utils.c
@@ -49,10 +49,10 @@ void gve_rx_add_to_block(struct gve_priv *priv, int queue_idx)
 }

 struct sk_buff *gve_rx_copy(struct net_device *dev, struct napi_struct *napi,
-			    struct gve_rx_slot_page_info *page_info, u16 len,
-			    u16 padding)
+			    struct gve_rx_slot_page_info *page_info, u16 len)
 {
-	void *va = page_info->page_address + padding + page_info->page_offset;
+	void *va = page_info->page_address + page_info->page_offset +
+		page_info->pad;
 	struct sk_buff *skb;

 	skb = napi_alloc_skb(napi, len);
diff --git a/drivers/net/ethernet/google/gve/gve_utils.h b/drivers/net/ethernet/google/gve/gve_utils.h
index 79595940b351..324fd98a6112 100644
--- a/drivers/net/ethernet/google/gve/gve_utils.h
+++ b/drivers/net/ethernet/google/gve/gve_utils.h
@@ -18,8 +18,7 @@ void gve_rx_remove_from_block(struct gve_priv *priv, int queue_idx);
 void gve_rx_add_to_block(struct gve_priv *priv, int queue_idx);

 struct sk_buff *gve_rx_copy(struct net_device *dev, struct napi_struct *napi,
-			    struct gve_rx_slot_page_info *page_info, u16 len,
-			    u16 pad);
+			    struct gve_rx_slot_page_info *page_info, u16 len);

 /* Decrement pagecnt_bias. Set it back to INT_MAX if it reached zero. */
 void gve_dec_pagecnt_bias(struct gve_rx_slot_page_info *page_info);
diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
index 40f4306449eb..9c9c72dc57e0 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h
@@ -100,6 +100,7 @@ enum HNAE3_DEV_CAP_BITS {
 	HNAE3_DEV_SUPPORT_CQ_B,
 	HNAE3_DEV_SUPPORT_FEC_STATS_B,
 	HNAE3_DEV_SUPPORT_LANE_NUM_B,
+	HNAE3_DEV_SUPPORT_WOL_B,
 };

 #define hnae3_ae_dev_fd_supported(ae_dev) \
@@ -168,6 +169,9 @@ enum HNAE3_DEV_CAP_BITS {
 #define hnae3_ae_dev_lane_num_supported(ae_dev) \
 	test_bit(HNAE3_DEV_SUPPORT_LANE_NUM_B, (ae_dev)->caps)

+#define hnae3_ae_dev_wol_supported(ae_dev) \
+	test_bit(HNAE3_DEV_SUPPORT_WOL_B, (ae_dev)->caps)
+
 enum HNAE3_PF_CAP_BITS {
 	HNAE3_PF_SUPPORT_VLAN_FLTR_MDF_B = 0,
 };
@@ -561,6 +565,10 @@ struct hnae3_ae_dev {
  *   Get phc info
  * clean_vf_config
  *   Clean residual vf info after disable sriov
+ * get_wol
+ *   Get wake on lan info
+ * set_wol
+ *   Config wake on lan
  */
 struct hnae3_ae_ops {
 	int (*init_ae_dev)(struct hnae3_ae_dev *ae_dev);
@@ -760,6 +768,10 @@ struct hnae3_ae_ops {
 	void (*clean_vf_config)(struct hnae3_ae_dev *ae_dev, int num_vfs);
 	int (*get_dscp_prio)(struct hnae3_handle *handle, u8 dscp,
 			     u8 *tc_map_mode, u8 *priority);
+	void (*get_wol)(struct hnae3_handle *handle,
+			struct ethtool_wolinfo *wol);
+	int (*set_wol)(struct hnae3_handle *handle,
+		       struct ethtool_wolinfo *wol);
 };

 struct hnae3_dcb_ops {
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.c b/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.c
index f671a63cecde..cbbab5b2b402 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.c
@@ -155,6 +155,7 @@ static const struct hclge_comm_caps_bit_map hclge_pf_cmd_caps[] = {
 	{HCLGE_COMM_CAP_FD_B, HNAE3_DEV_SUPPORT_FD_B},
 	{HCLGE_COMM_CAP_FEC_STATS_B, HNAE3_DEV_SUPPORT_FEC_STATS_B},
 	{HCLGE_COMM_CAP_LANE_NUM_B, HNAE3_DEV_SUPPORT_LANE_NUM_B},
+	{HCLGE_COMM_CAP_WOL_B, HNAE3_DEV_SUPPORT_WOL_B},
 };

 static const struct hclge_comm_caps_bit_map hclge_vf_cmd_caps[] = {
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.h b/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.h
index b1f9383b418f..de72ecbfd5ad 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_common/hclge_comm_cmd.h
@@ -294,6 +294,8 @@ enum hclge_opcode_type {
 	HCLGE_PPP_CMD0_INT_CMD = 0x2100,
 	HCLGE_PPP_CMD1_INT_CMD = 0x2101,
 	HCLGE_MAC_ETHERTYPE_IDX_RD = 0x2105,
+	HCLGE_OPC_WOL_GET_SUPPORTED_MODE = 0x2201,
+	HCLGE_OPC_WOL_CFG = 0x2202,
 	HCLGE_NCSI_INT_EN = 0x2401,

 	/* ROH MAC commands */
@@ -345,6 +347,7 @@ enum HCLGE_COMM_CAP_BITS {
 	HCLGE_COMM_CAP_FD_B = 21,
 	HCLGE_COMM_CAP_FEC_STATS_B = 25,
 	HCLGE_COMM_CAP_LANE_NUM_B = 27,
+	HCLGE_COMM_CAP_WOL_B = 28,
 };

 enum HCLGE_COMM_API_CAP_BITS {
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
index 66feb23f7b7b..4c3e90a1c4d0 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c
@@ -408,6 +408,9 @@ static struct hns3_dbg_cap_info hns3_dbg_cap[] = {
 	}, {
 		.name = "support lane num",
 		.cap_bit = HNAE3_DEV_SUPPORT_LANE_NUM_B,
+	}, {
+		.name = "support wake on lan",
+		.cap_bit = HNAE3_DEV_SUPPORT_WOL_B,
 	}
 };

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 25be7f8ac7cd..7356ad965487 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -13,7 +13,6 @@
 #include <linux/ipv6.h>
 #include <linux/module.h>
 #include <linux/pci.h>
-#include <linux/aer.h>
 #include <linux/skbuff.h>
 #include <linux/sctp.h>
 #include <net/gre.h>
@@ -1533,7 +1532,7 @@ static int hns3_handle_vtags(struct hns3_enet_ring *tx_ring,
 	if (unlikely(rc < 0))
 		return rc;

-	vhdr = (struct vlan_ethhdr *)skb->data;
+	vhdr = skb_vlan_eth_hdr(skb);
 	vhdr->h_vlan_TCI |= cpu_to_be16((skb->priority << VLAN_PRIO_SHIFT)
 					 & VLAN_PRIO_MASK);

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
index 294a14b4fdef..88af34bbee34 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h
@@ -695,6 +695,12 @@ static inline unsigned int hns3_page_order(struct hns3_enet_ring *ring)
 #define hns3_get_handle(ndev) \
 	(((struct hns3_nic_priv *)netdev_priv(ndev))->ae_handle)

+#define hns3_get_ae_dev(handle) \
+	(pci_get_drvdata((handle)->pdev))
+
+#define hns3_get_ops(handle) \
+	((handle)->ae_algo->ops)
+
 #define hns3_gl_usec_to_reg(int_gl) ((int_gl) >> 1)
 #define hns3_gl_round_down(int_gl) round_down(int_gl, 2)

diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
index 55306fe8a540..51d1278b18f6 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c
@@ -2063,6 +2063,31 @@ static int hns3_get_link_ext_state(struct net_device *netdev,
 	return -ENODATA;
 }

+static void hns3_get_wol(struct net_device *netdev, struct ethtool_wolinfo *wol)
+{
+	struct hnae3_handle *handle = hns3_get_handle(netdev);
+	const struct hnae3_ae_ops *ops = hns3_get_ops(handle);
+	struct hnae3_ae_dev *ae_dev = hns3_get_ae_dev(handle);
+
+	if (!hnae3_ae_dev_wol_supported(ae_dev))
+		return;
+
+	ops->get_wol(handle, wol);
+}
+
+static int hns3_set_wol(struct net_device *netdev,
+			struct ethtool_wolinfo *wol)
+{
+	struct hnae3_handle *handle = hns3_get_handle(netdev);
+	const struct hnae3_ae_ops *ops = hns3_get_ops(handle);
+	struct hnae3_ae_dev *ae_dev = hns3_get_ae_dev(handle);
+
+	if (!hnae3_ae_dev_wol_supported(ae_dev))
+		return -EOPNOTSUPP;
+
+	return ops->set_wol(handle, wol);
+}
+
 static const struct ethtool_ops hns3vf_ethtool_ops = {
 	.supported_coalesce_params = HNS3_ETHTOOL_COALESCE,
 	.supported_ring_params = HNS3_ETHTOOL_RING,
@@ -2139,6 +2164,8 @@ static const struct ethtool_ops hns3_ethtool_ops = {
 	.set_tunable = hns3_set_tunable,
 	.reset = hns3_set_reset,
 	.get_link_ext_state = hns3_get_link_ext_state,
+	.get_wol = hns3_get_wol,
+	.set_wol = hns3_set_wol,
 };

 void hns3_ethtool_set_ops(struct net_device *netdev)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
index 43cada51d8cb..91c173f40701 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_cmd.h
@@ -872,6 +872,18 @@ struct hclge_phy_reg_cmd {
 	u8 rsv1[18];
 };

+struct hclge_wol_cfg_cmd {
+	__le32 wake_on_lan_mode;
+	u8 sopass[SOPASS_MAX];
+	u8 sopass_size;
+	u8 rsv[13];
+};
+
+struct hclge_query_wol_supported_cmd {
+	__le32 supported_wake_mode;
+	u8 rsv[20];
+};
+
 struct hclge_hw;
 int hclge_cmd_send(struct hclge_hw *hw, struct hclge_desc *desc, int num);
 enum hclge_comm_cmd_status hclge_cmd_mdio_write(struct hclge_hw *hw,
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
index 07ad5f35219e..4fb5406c1951 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c
@@ -11365,7 +11365,7 @@ static int hclge_pci_init(struct hclge_dev *hdev)
 	if (!hw->hw.io_base) {
 		dev_err(&pdev->dev, "Can't map configuration register space\n");
 		ret = -ENOMEM;
-		goto err_clr_master;
+		goto err_release_regions;
 	}

 	ret = hclge_dev_mem_map(hdev);
@@ -11378,8 +11378,7 @@ static int hclge_pci_init(struct hclge_dev *hdev)

 err_unmap_io_base:
 	pcim_iounmap(pdev, hdev->hw.hw.io_base);
-err_clr_master:
-	pci_clear_master(pdev);
+err_release_regions:
 	pci_release_regions(pdev);
 err_disable_device:
 	pci_disable_device(pdev);
@@ -11396,7 +11395,6 @@ static void hclge_pci_uninit(struct hclge_dev *hdev)

 	pcim_iounmap(pdev, hdev->hw.hw.io_base);
 	pci_free_irq_vectors(pdev);
-	pci_clear_master(pdev);
 	pci_release_mem_regions(pdev);
 	pci_disable_device(pdev);
 }
@@ -11524,6 +11522,124 @@ static void hclge_uninit_rxd_adv_layout(struct hclge_dev *hdev)
 		hclge_write_dev(&hdev->hw, HCLGE_RXD_ADV_LAYOUT_EN_REG, 0);
 }

+static struct hclge_wol_info *hclge_get_wol_info(struct hnae3_handle *handle)
+{
+	struct hclge_vport *vport = hclge_get_vport(handle);
+
+	return &vport->back->hw.mac.wol;
+}
+
+static int hclge_get_wol_supported_mode(struct hclge_dev *hdev,
+					u32 *wol_supported)
+{
+	struct hclge_query_wol_supported_cmd *wol_supported_cmd;
+	struct hclge_desc desc;
+	int ret;
+
+	hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_WOL_GET_SUPPORTED_MODE,
+				   true);
+	wol_supported_cmd = (struct hclge_query_wol_supported_cmd *)desc.data;
+
+	ret = hclge_cmd_send(&hdev->hw, &desc, 1);
+	if (ret) {
+		dev_err(&hdev->pdev->dev,
+			"failed to query wol supported, ret = %d\n", ret);
+		return ret;
+	}
+
+	*wol_supported = le32_to_cpu(wol_supported_cmd->supported_wake_mode);
+
+	return 0;
+}
+
+static int hclge_set_wol_cfg(struct hclge_dev *hdev,
+			     struct hclge_wol_info *wol_info)
+{
+	struct hclge_wol_cfg_cmd *wol_cfg_cmd;
+	struct hclge_desc desc;
+	int ret;
+
+	hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_WOL_CFG, false);
+	wol_cfg_cmd = (struct hclge_wol_cfg_cmd *)desc.data;
+	wol_cfg_cmd->wake_on_lan_mode = cpu_to_le32(wol_info->wol_current_mode);
+	wol_cfg_cmd->sopass_size = wol_info->wol_sopass_size;
+	memcpy(wol_cfg_cmd->sopass, wol_info->wol_sopass, SOPASS_MAX);
+
+	ret = hclge_cmd_send(&hdev->hw, &desc, 1);
+	if (ret)
+		dev_err(&hdev->pdev->dev,
+			"failed to set wol config, ret = %d\n", ret);
+
+	return ret;
+}
+
+static int hclge_update_wol(struct hclge_dev *hdev)
+{
+	struct hclge_wol_info *wol_info = &hdev->hw.mac.wol;
+
+	if (!hnae3_ae_dev_wol_supported(hdev->ae_dev))
+		return 0;
+
+	return hclge_set_wol_cfg(hdev, wol_info);
+}
+
+static int hclge_init_wol(struct hclge_dev *hdev)
+{
+	struct hclge_wol_info *wol_info = &hdev->hw.mac.wol;
+	int ret;
+
+	if (!hnae3_ae_dev_wol_supported(hdev->ae_dev))
+		return 0;
+
+	memset(wol_info, 0, sizeof(struct hclge_wol_info));
+	ret = hclge_get_wol_supported_mode(hdev,
+					   &wol_info->wol_support_mode);
+	if (ret) {
+		wol_info->wol_support_mode = 0;
+		return ret;
+	}
+
+	return hclge_update_wol(hdev);
+}
+
+static void hclge_get_wol(struct hnae3_handle *handle,
+			  struct ethtool_wolinfo *wol)
+{
+	struct hclge_wol_info *wol_info = hclge_get_wol_info(handle);
+
+	wol->supported = wol_info->wol_support_mode;
+	wol->wolopts = wol_info->wol_current_mode;
+	if (wol_info->wol_current_mode & WAKE_MAGICSECURE)
+		memcpy(wol->sopass, wol_info->wol_sopass, SOPASS_MAX);
+}
+
+static int hclge_set_wol(struct hnae3_handle *handle,
+			 struct ethtool_wolinfo *wol)
+{
+	struct hclge_wol_info *wol_info = hclge_get_wol_info(handle);
+	struct hclge_vport *vport = hclge_get_vport(handle);
+	u32 wol_mode;
+	int ret;
+
+	wol_mode = wol->wolopts;
+	if (wol_mode & ~wol_info->wol_support_mode)
+		return -EINVAL;
+
+	wol_info->wol_current_mode = wol_mode;
+	if (wol_mode & WAKE_MAGICSECURE) {
+		memcpy(wol_info->wol_sopass, wol->sopass, SOPASS_MAX);
+		wol_info->wol_sopass_size = SOPASS_MAX;
+	} else {
+		wol_info->wol_sopass_size = 0;
+	}
+
+	ret = hclge_set_wol_cfg(vport->back, wol_info);
+	if (ret)
+		wol_info->wol_current_mode = 0;
+
+	return ret;
+}
+
 static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev)
 {
 	struct pci_dev *pdev = ae_dev->pdev;
@@ -11720,6 +11836,11 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev)
 	/* Enable MISC vector(vector0) */
 	hclge_enable_vector(&hdev->misc_vector, true);

+	ret = hclge_init_wol(hdev);
+	if (ret)
+		dev_warn(&pdev->dev,
+			 "failed to wake on lan init, ret = %d\n", ret);
+
 	hclge_state_init(hdev);
 	hdev->last_reset_time = jiffies;

@@ -11743,7 +11864,6 @@ err_devlink_uninit:
 	hclge_devlink_uninit(hdev);
 err_pci_uninit:
 	pcim_iounmap(pdev, hdev->hw.hw.io_base);
-	pci_clear_master(pdev);
 	pci_release_regions(pdev);
 	pci_disable_device(pdev);
 out:
@@ -12099,6 +12219,11 @@ static int hclge_reset_ae_dev(struct hnae3_ae_dev *ae_dev)

 	hclge_init_rxd_adv_layout(hdev);

+	ret = hclge_update_wol(hdev);
+	if (ret)
+		dev_warn(&pdev->dev,
+			 "failed to update wol config, ret = %d\n", ret);
+
 	dev_info(&pdev->dev, "Reset done, %s driver initialization finished.\n",
 		 HCLGE_DRIVER_NAME);

@@ -13145,6 +13270,8 @@ static const struct hnae3_ae_ops hclge_ops = {
 	.get_link_diagnosis_info = hclge_get_link_diagnosis_info,
 	.clean_vf_config = hclge_clean_vport_config,
 	.get_dscp_prio = hclge_get_dscp_prio,
+	.get_wol = hclge_get_wol,
+	.set_wol = hclge_set_wol,
 };

 static struct hnae3_ae_algo ae_algo = {
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
index 13f23d606e77..81aa6b0facf5 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.h
@@ -249,6 +249,13 @@ enum HCLGE_MAC_DUPLEX {
 #define QUERY_SFP_SPEED 0
 #define QUERY_ACTIVE_SPEED 1

+struct hclge_wol_info {
+	u32 wol_support_mode; /* store the wake on lan info */
+	u32 wol_current_mode;
+	u8 wol_sopass[SOPASS_MAX];
+	u8 wol_sopass_size;
+};
+
 struct hclge_mac {
 	u8 mac_id;
 	u8 phy_addr;
@@ -268,6 +275,7 @@ struct hclge_mac {
 	u32 user_fec_mode;
 	u32 fec_ability;
 	int link; /* store the link status of mac & phy (if phy exists) */
+	struct hclge_wol_info wol;
 	struct phy_device *phydev;
 	struct mii_bus *mdio_bus;
 	phy_interface_t phy_if;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
index e84e5be8e59e..f24046250341 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -2598,7 +2598,7 @@ static int hclgevf_pci_init(struct hclgevf_dev *hdev)
 	if (!hw->hw.io_base) {
 		dev_err(&pdev->dev, "can't map configuration register space\n");
 		ret = -ENOMEM;
-		goto err_clr_master;
+		goto err_release_regions;
 	}

 	ret = hclgevf_dev_mem_map(hdev);
@@ -2609,8 +2609,7 @@ static int hclgevf_pci_init(struct hclgevf_dev *hdev)

 err_unmap_io_base:
 	pci_iounmap(pdev, hdev->hw.hw.io_base);
-err_clr_master:
-	pci_clear_master(pdev);
+err_release_regions:
 	pci_release_regions(pdev);
 err_disable_device:
 	pci_disable_device(pdev);
@@ -2626,7 +2625,6 @@ static void hclgevf_pci_uninit(struct hclgevf_dev *hdev)
 		devm_iounmap(&pdev->dev, hdev->hw.hw.mem_base);

 	pci_iounmap(pdev, hdev->hw.hw.io_base);
-	pci_clear_master(pdev);
 	pci_release_regions(pdev);
 	pci_disable_device(pdev);
 }
diff --git a/drivers/net/ethernet/intel/Kconfig b/drivers/net/ethernet/intel/Kconfig
index c18c3b373846..9bc0a9519899 100644
--- a/drivers/net/ethernet/intel/Kconfig
+++ b/drivers/net/ethernet/intel/Kconfig
@@ -139,23 +139,6 @@ config IGBVF
 	  To compile this driver as a module, choose M here. The module
 	  will be called igbvf.

-config IXGB
-	tristate "Intel(R) PRO/10GbE support"
-	depends on PCI
-	help
-	  This driver supports Intel(R) PRO/10GbE family of adapters for
-	  PCI-X type cards. For PCI-E type cards, use the "ixgbe" driver
-	  instead. For more information on how to identify your adapter, go
-	  to the Adapter & Driver ID Guide that can be located at:
-
-	  <http://support.intel.com>
-
-	  More specific information on configuring the driver is in
-	  <file:Documentation/networking/device_drivers/ethernet/intel/ixgb.rst>.
-
-	  To compile this driver as a module, choose M here. The module
-	  will be called ixgb.
-
 config IXGBE
 	tristate "Intel(R) 10GbE PCI Express adapters support"
 	depends on PCI
diff --git a/drivers/net/ethernet/intel/Makefile b/drivers/net/ethernet/intel/Makefile
index 3075290063f6..d80d04132073 100644
--- a/drivers/net/ethernet/intel/Makefile
+++ b/drivers/net/ethernet/intel/Makefile
@@ -12,7 +12,6 @@ obj-$(CONFIG_IGBVF) += igbvf/
 obj-$(CONFIG_IXGBE) += ixgbe/
 obj-$(CONFIG_IXGBEVF) += ixgbevf/
 obj-$(CONFIG_I40E) += i40e/
-obj-$(CONFIG_IXGB) += ixgb/
 obj-$(CONFIG_IAVF) += iavf/
 obj-$(CONFIG_FM10K) += fm10k/
 obj-$(CONFIG_ICE) += ice/
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index e14d1e45318f..bd7ef59b1f2e 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -23,7 +23,6 @@
 #include <linux/smp.h>
 #include <linux/pm_qos.h>
 #include <linux/pm_runtime.h>
-#include <linux/aer.h>
 #include <linux/prefetch.h>
 #include <linux/suspend.h>

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 027d721feb18..d748b98274e7 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -3,7 +3,6 @@

 #include <linux/module.h>
 #include <linux/interrupt.h>
-#include <linux/aer.h>

 #include "fm10k.h"

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 60ce4d15d82a..6e310a539467 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -10,7 +10,6 @@
 #include <linux/errno.h>
 #include <linux/module.h>
 #include <linux/pci.h>
-#include <linux/aer.h>
 #include <linux/netdevice.h>
 #include <linux/ioport.h>
 #include <linux/iommu.h>
diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
index 4934ff58332c..afc4fa8c66af 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
@@ -5402,6 +5402,13 @@ flags_complete:
 		return -EOPNOTSUPP;
 	}

+	if ((changed_flags & I40E_FLAG_LEGACY_RX) &&
+	    I40E_2K_TOO_SMALL_WITH_PADDING) {
+		dev_warn(&pf->pdev->dev,
+			 "2k Rx buffer is too small to fit standard MTU and skb_shared_info\n");
+		return -EOPNOTSUPP;
+	}
+
 	if ((changed_flags & new_flags &
 	     I40E_FLAG_LINK_DOWN_ON_CLOSE_ENABLED) &&
 	    (new_flags & I40E_FLAG_MFP_ENABLED))
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 7c30abd0dfc2..b847bd105b16 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -2896,15 +2896,35 @@ static void i40e_sync_filters_subtask(struct i40e_pf *pf)
 }

 /**
- * i40e_max_xdp_frame_size - returns the maximum allowed frame size for XDP
+ * i40e_calculate_vsi_rx_buf_len - Calculates buffer length
+ *
+ * @vsi: VSI to calculate rx_buf_len from
+ */
+static u16 i40e_calculate_vsi_rx_buf_len(struct i40e_vsi *vsi)
+{
+	if (!vsi->netdev || (vsi->back->flags & I40E_FLAG_LEGACY_RX))
+		return SKB_WITH_OVERHEAD(I40E_RXBUFFER_2048);
+
+	return PAGE_SIZE < 8192 ? I40E_RXBUFFER_3072 : I40E_RXBUFFER_2048;
+}
+
+/**
+ * i40e_max_vsi_frame_size - returns the maximum allowed frame size for VSI
  * @vsi: the vsi
+ * @xdp_prog: XDP program
  **/
-static int i40e_max_xdp_frame_size(struct i40e_vsi *vsi)
+static int i40e_max_vsi_frame_size(struct i40e_vsi *vsi,
+				   struct bpf_prog *xdp_prog)
 {
-	if (PAGE_SIZE >= 8192 || (vsi->back->flags & I40E_FLAG_LEGACY_RX))
-		return I40E_RXBUFFER_2048;
+	u16 rx_buf_len = i40e_calculate_vsi_rx_buf_len(vsi);
+	u16 chain_len;
+
+	if (xdp_prog && !xdp_prog->aux->xdp_has_frags)
+		chain_len = 1;
 	else
-		return I40E_RXBUFFER_3072;
+		chain_len = I40E_MAX_CHAINED_RX_BUFFERS;
+
+	return min_t(u16, rx_buf_len * chain_len, I40E_MAX_RXBUFFER);
 }

 /**
@@ -2919,12 +2939,13 @@ static int i40e_change_mtu(struct net_device *netdev, int new_mtu)
 	struct i40e_netdev_priv *np = netdev_priv(netdev);
 	struct i40e_vsi *vsi = np->vsi;
 	struct i40e_pf *pf = vsi->back;
+	int frame_size;

-	if (i40e_enabled_xdp_vsi(vsi)) {
-		int frame_size = new_mtu + I40E_PACKET_HDR_PAD;
-
-		if (frame_size > i40e_max_xdp_frame_size(vsi))
-			return -EINVAL;
+	frame_size = i40e_max_vsi_frame_size(vsi, vsi->xdp_prog);
+	if (new_mtu > frame_size - I40E_PACKET_HDR_PAD) {
+		netdev_err(netdev, "Error changing mtu to %d, Max is %d\n",
+			   new_mtu, frame_size - I40E_PACKET_HDR_PAD);
+		return -EINVAL;
 	}

 	netdev_dbg(netdev, "changing MTU from %d to %d\n",
@@ -3595,6 +3616,8 @@ static int i40e_configure_rx_ring(struct i40e_ring *ring)
 		}
 	}

+	xdp_init_buff(&ring->xdp, i40e_rx_pg_size(ring) / 2, &ring->xdp_rxq);
+
 	rx_ctx.dbuff = DIV_ROUND_UP(ring->rx_buf_len,
 				    BIT_ULL(I40E_RXQ_CTX_DBUFF_SHIFT));

@@ -3640,10 +3663,16 @@ static int i40e_configure_rx_ring(struct i40e_ring *ring)
 	}

 	/* configure Rx buffer alignment */
-	if (!vsi->netdev || (vsi->back->flags & I40E_FLAG_LEGACY_RX))
+	if (!vsi->netdev || (vsi->back->flags & I40E_FLAG_LEGACY_RX)) {
+		if (I40E_2K_TOO_SMALL_WITH_PADDING) {
+			dev_info(&vsi->back->pdev->dev,
+				 "2k Rx buffer is too small to fit standard MTU and skb_shared_info\n");
+			return -EOPNOTSUPP;
+		}
 		clear_ring_build_skb_enabled(ring);
-	else
+	} else {
 		set_ring_build_skb_enabled(ring);
+	}

 	ring->rx_offset = i40e_rx_offset(ring);

@@ -3694,24 +3723,6 @@ static int i40e_vsi_configure_tx(struct i40e_vsi *vsi)
 }

 /**
- * i40e_calculate_vsi_rx_buf_len - Calculates buffer length
- *
- * @vsi: VSI to calculate rx_buf_len from
- */
-static u16 i40e_calculate_vsi_rx_buf_len(struct i40e_vsi *vsi)
-{
-	if (!vsi->netdev || (vsi->back->flags & I40E_FLAG_LEGACY_RX))
-		return I40E_RXBUFFER_2048;
-
-#if (PAGE_SIZE < 8192)
-	if (!I40E_2K_TOO_SMALL_WITH_PADDING && vsi->netdev->mtu <= ETH_DATA_LEN)
-		return I40E_RXBUFFER_1536 - NET_IP_ALIGN;
-#endif
-
-	return PAGE_SIZE < 8192 ? I40E_RXBUFFER_3072 : I40E_RXBUFFER_2048;
-}
-
-/**
  * i40e_vsi_configure_rx - Configure the VSI for Rx
  * @vsi: the VSI being configured
  *
@@ -3722,13 +3733,15 @@ static int i40e_vsi_configure_rx(struct i40e_vsi *vsi)
 	int err = 0;
 	u16 i;

-	vsi->max_frame = I40E_MAX_RXBUFFER;
+	vsi->max_frame = i40e_max_vsi_frame_size(vsi, vsi->xdp_prog);
 	vsi->rx_buf_len = i40e_calculate_vsi_rx_buf_len(vsi);

 #if (PAGE_SIZE < 8192)
 	if (vsi->netdev && !I40E_2K_TOO_SMALL_WITH_PADDING &&
-	    vsi->netdev->mtu <= ETH_DATA_LEN)
-		vsi->max_frame = I40E_RXBUFFER_1536 - NET_IP_ALIGN;
+	    vsi->netdev->mtu <= ETH_DATA_LEN) {
+		vsi->rx_buf_len = I40E_RXBUFFER_1536 - NET_IP_ALIGN;
+		vsi->max_frame = vsi->rx_buf_len;
+	}
 #endif

 	/* set up individual rings */
@@ -13319,15 +13332,15 @@ out_err:
 static int i40e_xdp_setup(struct i40e_vsi *vsi, struct bpf_prog *prog,
 			  struct netlink_ext_ack *extack)
 {
-	int frame_size = vsi->netdev->mtu + ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN;
+	int frame_size = i40e_max_vsi_frame_size(vsi, prog);
 	struct i40e_pf *pf = vsi->back;
 	struct bpf_prog *old_prog;
 	bool need_reset;
 	int i;

 	/* Don't allow frames that span over multiple buffers */
-	if (frame_size > i40e_calculate_vsi_rx_buf_len(vsi)) {
-		NL_SET_ERR_MSG_MOD(extack, "MTU too large to enable XDP");
+	if (vsi->netdev->mtu > frame_size - I40E_PACKET_HDR_PAD) {
+		NL_SET_ERR_MSG_MOD(extack, "MTU too large for linear frames and XDP prog does not support frags");
 		return -EINVAL;
 	}

@@ -13813,7 +13826,8 @@ static int i40e_config_netdev(struct i40e_vsi *vsi)

 		netdev->xdp_features = NETDEV_XDP_ACT_BASIC |
 				       NETDEV_XDP_ACT_REDIRECT |
-				       NETDEV_XDP_ACT_XSK_ZEROCOPY;
+				       NETDEV_XDP_ACT_XSK_ZEROCOPY |
+				       NETDEV_XDP_ACT_RX_SG;
 	} else {
 		/* Relate the VSI_VMDQ name to the VSI_MAIN name. Note that we
 		 * are still limited by IFNAMSIZ, but we're adding 'v%d\0' to
diff --git a/drivers/net/ethernet/intel/i40e/i40e_trace.h b/drivers/net/ethernet/intel/i40e/i40e_trace.h
index 79d587ad5409..33b4e30f5e00 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_trace.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_trace.h
@@ -162,45 +162,45 @@ DECLARE_EVENT_CLASS(

 	TP_PROTO(struct i40e_ring *ring,
 		 union i40e_16byte_rx_desc *desc,
-		 struct sk_buff *skb),
+		 struct xdp_buff *xdp),

-	TP_ARGS(ring, desc, skb),
+	TP_ARGS(ring, desc, xdp),

 	TP_STRUCT__entry(
 		__field(void*, ring)
 		__field(void*, desc)
-		__field(void*, skb)
+		__field(void*, xdp)
 		__string(devname, ring->netdev->name)
 	),

 	TP_fast_assign(
 		__entry->ring = ring;
 		__entry->desc = desc;
-		__entry->skb = skb;
+		__entry->xdp = xdp;
 		__assign_str(devname, ring->netdev->name);
 	),

 	TP_printk(
-		"netdev: %s ring: %p desc: %p skb %p",
+		"netdev: %s ring: %p desc: %p xdp %p",
 		__get_str(devname), __entry->ring,
-		__entry->desc, __entry->skb)
+		__entry->desc, __entry->xdp)
 );

 DEFINE_EVENT(
 	i40e_rx_template, i40e_clean_rx_irq,
 	TP_PROTO(struct i40e_ring *ring,
 		 union i40e_16byte_rx_desc *desc,
-		 struct sk_buff *skb),
+		 struct xdp_buff *xdp),

-	TP_ARGS(ring, desc, skb));
+	TP_ARGS(ring, desc, xdp));

 DEFINE_EVENT(
 	i40e_rx_template, i40e_clean_rx_irq_rx,
 	TP_PROTO(struct i40e_ring *ring,
 		 union i40e_16byte_rx_desc *desc,
-		 struct sk_buff *skb),
+		 struct xdp_buff *xdp),

-	TP_ARGS(ring, desc, skb));
+	TP_ARGS(ring, desc, xdp));

 DECLARE_EVENT_CLASS(
 	i40e_xmit_template,
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 72b091f2509d..8b8bf4880faa 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -1477,9 +1477,6 @@ void i40e_clean_rx_ring(struct i40e_ring *rx_ring)
 	if (!rx_ring->rx_bi)
 		return;

-	dev_kfree_skb(rx_ring->skb);
-	rx_ring->skb = NULL;
-
 	if (rx_ring->xsk_pool) {
 		i40e_xsk_clean_rx_ring(rx_ring);
 		goto skip_free;
@@ -1524,6 +1521,7 @@ skip_free:

 	rx_ring->next_to_alloc = 0;
 	rx_ring->next_to_clean = 0;
+	rx_ring->next_to_process = 0;
 	rx_ring->next_to_use = 0;
 }

@@ -1576,6 +1574,7 @@ int i40e_setup_rx_descriptors(struct i40e_ring *rx_ring)

 	rx_ring->next_to_alloc = 0;
 	rx_ring->next_to_clean = 0;
+	rx_ring->next_to_process = 0;
 	rx_ring->next_to_use = 0;

 	/* XDP RX-queue info only needed for RX rings exposed to XDP */
@@ -1617,21 +1616,19 @@ void i40e_release_rx_desc(struct i40e_ring *rx_ring, u32 val)
 	writel(val, rx_ring->tail);
 }

+#if (PAGE_SIZE >= 8192)
 static unsigned int i40e_rx_frame_truesize(struct i40e_ring *rx_ring,
 					   unsigned int size)
 {
 	unsigned int truesize;

-#if (PAGE_SIZE < 8192)
-	truesize = i40e_rx_pg_size(rx_ring) / 2; /* Must be power-of-2 */
-#else
 	truesize = rx_ring->rx_offset ?
 		SKB_DATA_ALIGN(size + rx_ring->rx_offset) +
 		SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) :
 		SKB_DATA_ALIGN(size);
-#endif
 	return truesize;
 }
+#endif

 /**
  * i40e_alloc_mapped_page - recycle or make a new page
@@ -1970,7 +1967,6 @@ static bool i40e_cleanup_headers(struct i40e_ring *rx_ring, struct sk_buff *skb,
  * i40e_can_reuse_rx_page - Determine if page can be reused for another Rx
  * @rx_buffer: buffer containing the page
  * @rx_stats: rx stats structure for the rx ring
- * @rx_buffer_pgcnt: buffer page refcount pre xdp_do_redirect() call
  *
  * If page is reusable, we have a green light for calling i40e_reuse_rx_page,
  * which will assign the current buffer to the buffer that next_to_alloc is
@@ -1981,8 +1977,7 @@ static bool i40e_cleanup_headers(struct i40e_ring *rx_ring, struct sk_buff *skb,
  * or busy if it could not be reused.
  */
 static bool i40e_can_reuse_rx_page(struct i40e_rx_buffer *rx_buffer,
-				   struct i40e_rx_queue_stats *rx_stats,
-				   int rx_buffer_pgcnt)
+				   struct i40e_rx_queue_stats *rx_stats)
 {
 	unsigned int pagecnt_bias = rx_buffer->pagecnt_bias;
 	struct page *page = rx_buffer->page;
@@ -1995,7 +1990,7 @@ static bool i40e_can_reuse_rx_page(struct i40e_rx_buffer *rx_buffer,

 #if (PAGE_SIZE < 8192)
 	/* if we are only owner of page we can reuse it */
-	if (unlikely((rx_buffer_pgcnt - pagecnt_bias) > 1)) {
+	if (unlikely((rx_buffer->page_count - pagecnt_bias) > 1)) {
 		rx_stats->page_busy_count++;
 		return false;
 	}
@@ -2021,33 +2016,14 @@ static bool i40e_can_reuse_rx_page(struct i40e_rx_buffer *rx_buffer,
 }

 /**
- * i40e_add_rx_frag - Add contents of Rx buffer to sk_buff
- * @rx_ring: rx descriptor ring to transact packets on
- * @rx_buffer: buffer containing page to add
- * @skb: sk_buff to place the data into
- * @size: packet length from rx_desc
- *
- * This function will add the data contained in rx_buffer->page to the skb.
- * It will just attach the page as a frag to the skb.
- *
- * The function will then update the page offset.
+ * i40e_rx_buffer_flip - adjusted rx_buffer to point to an unused region
+ * @rx_buffer: Rx buffer to adjust
+ * @truesize: Size of adjustment
  **/
-static void i40e_add_rx_frag(struct i40e_ring *rx_ring,
-			     struct i40e_rx_buffer *rx_buffer,
-			     struct sk_buff *skb,
-			     unsigned int size)
+static void i40e_rx_buffer_flip(struct i40e_rx_buffer *rx_buffer,
+				unsigned int truesize)
 {
 #if (PAGE_SIZE < 8192)
-	unsigned int truesize = i40e_rx_pg_size(rx_ring) / 2;
-#else
-	unsigned int truesize = SKB_DATA_ALIGN(size + rx_ring->rx_offset);
-#endif
-
-	skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, rx_buffer->page,
-			rx_buffer->page_offset, size, truesize);
-
-	/* page is being used so we must update the page offset */
-#if (PAGE_SIZE < 8192)
 	rx_buffer->page_offset ^= truesize;
 #else
 	rx_buffer->page_offset += truesize;
@@ -2058,19 +2034,17 @@ static void i40e_add_rx_frag(struct i40e_ring *rx_ring,
  * i40e_get_rx_buffer - Fetch Rx buffer and synchronize data for use
  * @rx_ring: rx descriptor ring to transact packets on
  * @size: size of buffer to add to skb
- * @rx_buffer_pgcnt: buffer page refcount
  *
  * This function will pull an Rx buffer from the ring and synchronize it
  * for use by the CPU.
  */
 static struct i40e_rx_buffer *i40e_get_rx_buffer(struct i40e_ring *rx_ring,
-						 const unsigned int size,
-						 int *rx_buffer_pgcnt)
+						 const unsigned int size)
 {
 	struct i40e_rx_buffer *rx_buffer;

-	rx_buffer = i40e_rx_bi(rx_ring, rx_ring->next_to_clean);
-	*rx_buffer_pgcnt =
+	rx_buffer = i40e_rx_bi(rx_ring, rx_ring->next_to_process);
+	rx_buffer->page_count =
 #if (PAGE_SIZE < 8192)
 		page_count(rx_buffer->page);
 #else
@@ -2092,25 +2066,82 @@ static struct i40e_rx_buffer *i40e_get_rx_buffer(struct i40e_ring *rx_ring,
 }

 /**
- * i40e_construct_skb - Allocate skb and populate it
+ * i40e_put_rx_buffer - Clean up used buffer and either recycle or free
  * @rx_ring: rx descriptor ring to transact packets on
  * @rx_buffer: rx buffer to pull data from
+ *
+ * This function will clean up the contents of the rx_buffer. It will
+ * either recycle the buffer or unmap it and free the associated resources.
+ */
+static void i40e_put_rx_buffer(struct i40e_ring *rx_ring,
+			       struct i40e_rx_buffer *rx_buffer)
+{
+	if (i40e_can_reuse_rx_page(rx_buffer, &rx_ring->rx_stats)) {
+		/* hand second half of page back to the ring */
+		i40e_reuse_rx_page(rx_ring, rx_buffer);
+	} else {
+		/* we are not reusing the buffer so unmap it */
+		dma_unmap_page_attrs(rx_ring->dev, rx_buffer->dma,
+				     i40e_rx_pg_size(rx_ring),
+				     DMA_FROM_DEVICE, I40E_RX_DMA_ATTR);
+		__page_frag_cache_drain(rx_buffer->page,
+					rx_buffer->pagecnt_bias);
+		/* clear contents of buffer_info */
+		rx_buffer->page = NULL;
+	}
+}
+
+/**
+ * i40e_process_rx_buffs- Processing of buffers post XDP prog or on error
+ * @rx_ring: Rx descriptor ring to transact packets on
+ * @xdp_res: Result of the XDP program
+ * @xdp: xdp_buff pointing to the data
+ **/
+static void i40e_process_rx_buffs(struct i40e_ring *rx_ring, int xdp_res,
+				  struct xdp_buff *xdp)
+{
+	u32 next = rx_ring->next_to_clean;
+	struct i40e_rx_buffer *rx_buffer;
+
+	xdp->flags = 0;
+
+	while (1) {
+		rx_buffer = i40e_rx_bi(rx_ring, next);
+		if (++next == rx_ring->count)
+			next = 0;
+
+		if (!rx_buffer->page)
+			continue;
+
+		if (xdp_res == I40E_XDP_CONSUMED)
+			rx_buffer->pagecnt_bias++;
+		else
+			i40e_rx_buffer_flip(rx_buffer, xdp->frame_sz);
+
+		/* EOP buffer will be put in i40e_clean_rx_irq() */
+		if (next == rx_ring->next_to_process)
+			return;
+
+		i40e_put_rx_buffer(rx_ring, rx_buffer);
+	}
+}
+
+/**
+ * i40e_construct_skb - Allocate skb and populate it
+ * @rx_ring: rx descriptor ring to transact packets on
  * @xdp: xdp_buff pointing to the data
+ * @nr_frags: number of buffers for the packet
  *
  * This function allocates an skb. It then populates it with the page
  * data from the current receive descriptor, taking care to set up the
  * skb correctly.
  */
 static struct sk_buff *i40e_construct_skb(struct i40e_ring *rx_ring,
-					  struct i40e_rx_buffer *rx_buffer,
-					  struct xdp_buff *xdp)
+					  struct xdp_buff *xdp,
+					  u32 nr_frags)
 {
 	unsigned int size = xdp->data_end - xdp->data;
-#if (PAGE_SIZE < 8192)
-	unsigned int truesize = i40e_rx_pg_size(rx_ring) / 2;
-#else
-	unsigned int truesize = SKB_DATA_ALIGN(size);
-#endif
+	struct i40e_rx_buffer *rx_buffer;
 	unsigned int headlen;
 	struct sk_buff *skb;

@@ -2150,48 +2181,60 @@ static struct sk_buff *i40e_construct_skb(struct i40e_ring *rx_ring,
 	memcpy(__skb_put(skb, headlen), xdp->data,
 	       ALIGN(headlen, sizeof(long)));

+	rx_buffer = i40e_rx_bi(rx_ring, rx_ring->next_to_clean);
 	/* update all of the pointers */
 	size -= headlen;
 	if (size) {
+		if (unlikely(nr_frags >= MAX_SKB_FRAGS)) {
+			dev_kfree_skb(skb);
+			return NULL;
+		}
 		skb_add_rx_frag(skb, 0, rx_buffer->page,
 				rx_buffer->page_offset + headlen,
-				size, truesize);
-
+				size, xdp->frame_sz);
 		/* buffer is used by skb, update page_offset */
-#if (PAGE_SIZE < 8192)
-		rx_buffer->page_offset ^= truesize;
-#else
-		rx_buffer->page_offset += truesize;
-#endif
+		i40e_rx_buffer_flip(rx_buffer, xdp->frame_sz);
 	} else {
 		/* buffer is unused, reset bias back to rx_buffer */
 		rx_buffer->pagecnt_bias++;
 	}

+	if (unlikely(xdp_buff_has_frags(xdp))) {
+		struct skb_shared_info *sinfo, *skinfo = skb_shinfo(skb);
+
+		sinfo = xdp_get_shared_info_from_buff(xdp);
+		memcpy(&skinfo->frags[skinfo->nr_frags], &sinfo->frags[0],
+		       sizeof(skb_frag_t) * nr_frags);
+
+		xdp_update_skb_shared_info(skb, skinfo->nr_frags + nr_frags,
+					   sinfo->xdp_frags_size,
+					   nr_frags * xdp->frame_sz,
+					   xdp_buff_is_frag_pfmemalloc(xdp));
+
+		/* First buffer has already been processed, so bump ntc */
+		if (++rx_ring->next_to_clean == rx_ring->count)
+			rx_ring->next_to_clean = 0;
+
+		i40e_process_rx_buffs(rx_ring, I40E_XDP_PASS, xdp);
+	}
+
 	return skb;
 }

 /**
  * i40e_build_skb - Build skb around an existing buffer
  * @rx_ring: Rx descriptor ring to transact packets on
- * @rx_buffer: Rx buffer to pull data from
  * @xdp: xdp_buff pointing to the data
+ * @nr_frags: number of buffers for the packet
  *
  * This function builds an skb around an existing Rx buffer, taking care
  * to set up the skb correctly and avoid any memcpy overhead.
  */
 static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
-				      struct i40e_rx_buffer *rx_buffer,
-				      struct xdp_buff *xdp)
+				      struct xdp_buff *xdp,
+				      u32 nr_frags)
 {
 	unsigned int metasize = xdp->data - xdp->data_meta;
-#if (PAGE_SIZE < 8192)
-	unsigned int truesize = i40e_rx_pg_size(rx_ring) / 2;
-#else
-	unsigned int truesize = SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) +
-				SKB_DATA_ALIGN(xdp->data_end -
-					       xdp->data_hard_start);
-#endif
 	struct sk_buff *skb;

 	/* Prefetch first cache line of first page. If xdp->data_meta
@@ -2202,7 +2245,7 @@ static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
 	net_prefetch(xdp->data_meta);

 	/* build an skb around the page buffer */
-	skb = napi_build_skb(xdp->data_hard_start, truesize);
+	skb = napi_build_skb(xdp->data_hard_start, xdp->frame_sz);
 	if (unlikely(!skb))
 		return NULL;

@@ -2212,42 +2255,25 @@ static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
 	if (metasize)
 		skb_metadata_set(skb, metasize);

-	/* buffer is used by skb, update page_offset */
-#if (PAGE_SIZE < 8192)
-	rx_buffer->page_offset ^= truesize;
-#else
-	rx_buffer->page_offset += truesize;
-#endif
+	if (unlikely(xdp_buff_has_frags(xdp))) {
+		struct skb_shared_info *sinfo;

-	return skb;
-}
+		sinfo = xdp_get_shared_info_from_buff(xdp);
+		xdp_update_skb_shared_info(skb, nr_frags,
+					   sinfo->xdp_frags_size,
+					   nr_frags * xdp->frame_sz,
+					   xdp_buff_is_frag_pfmemalloc(xdp));

-/**
- * i40e_put_rx_buffer - Clean up used buffer and either recycle or free
- * @rx_ring: rx descriptor ring to transact packets on
- * @rx_buffer: rx buffer to pull data from
- * @rx_buffer_pgcnt: rx buffer page refcount pre xdp_do_redirect() call
- *
- * This function will clean up the contents of the rx_buffer. It will
- * either recycle the buffer or unmap it and free the associated resources.
- */
-static void i40e_put_rx_buffer(struct i40e_ring *rx_ring,
-			       struct i40e_rx_buffer *rx_buffer,
-			       int rx_buffer_pgcnt)
-{
-	if (i40e_can_reuse_rx_page(rx_buffer, &rx_ring->rx_stats, rx_buffer_pgcnt)) {
-		/* hand second half of page back to the ring */
-		i40e_reuse_rx_page(rx_ring, rx_buffer);
+		i40e_process_rx_buffs(rx_ring, I40E_XDP_PASS, xdp);
 	} else {
-		/* we are not reusing the buffer so unmap it */
-		dma_unmap_page_attrs(rx_ring->dev, rx_buffer->dma,
-				     i40e_rx_pg_size(rx_ring),
-				     DMA_FROM_DEVICE, I40E_RX_DMA_ATTR);
-		__page_frag_cache_drain(rx_buffer->page,
-					rx_buffer->pagecnt_bias);
-		/* clear contents of buffer_info */
-		rx_buffer->page = NULL;
+		struct i40e_rx_buffer *rx_buffer;
+
+		rx_buffer = i40e_rx_bi(rx_ring, rx_ring->next_to_clean);
+		/* buffer is used by skb, update page_offset */
+		i40e_rx_buffer_flip(rx_buffer, xdp->frame_sz);
 	}
+
+	return skb;
 }

 /**
@@ -2333,25 +2359,6 @@ xdp_out:
 }

 /**
- * i40e_rx_buffer_flip - adjusted rx_buffer to point to an unused region
- * @rx_ring: Rx ring
- * @rx_buffer: Rx buffer to adjust
- * @size: Size of adjustment
- **/
-static void i40e_rx_buffer_flip(struct i40e_ring *rx_ring,
-				struct i40e_rx_buffer *rx_buffer,
-				unsigned int size)
-{
-	unsigned int truesize = i40e_rx_frame_truesize(rx_ring, size);
-
-#if (PAGE_SIZE < 8192)
-	rx_buffer->page_offset ^= truesize;
-#else
-	rx_buffer->page_offset += truesize;
-#endif
-}
-
-/**
  * i40e_xdp_ring_update_tail - Updates the XDP Tx ring tail register
  * @xdp_ring: XDP Tx ring
  *
@@ -2409,16 +2416,65 @@ void i40e_finalize_xdp_rx(struct i40e_ring *rx_ring, unsigned int xdp_res)
 }

 /**
- * i40e_inc_ntc: Advance the next_to_clean index
+ * i40e_inc_ntp: Advance the next_to_process index
  * @rx_ring: Rx ring
  **/
-static void i40e_inc_ntc(struct i40e_ring *rx_ring)
+static void i40e_inc_ntp(struct i40e_ring *rx_ring)
+{
+	u32 ntp = rx_ring->next_to_process + 1;
+
+	ntp = (ntp < rx_ring->count) ? ntp : 0;
+	rx_ring->next_to_process = ntp;
+	prefetch(I40E_RX_DESC(rx_ring, ntp));
+}
+
+/**
+ * i40e_add_xdp_frag: Add a frag to xdp_buff
+ * @xdp: xdp_buff pointing to the data
+ * @nr_frags: return number of buffers for the packet
+ * @rx_buffer: rx_buffer holding data of the current frag
+ * @size: size of data of current frag
+ */
+static int i40e_add_xdp_frag(struct xdp_buff *xdp, u32 *nr_frags,
+			     struct i40e_rx_buffer *rx_buffer, u32 size)
 {
-	u32 ntc = rx_ring->next_to_clean + 1;
+	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
+
+	if (!xdp_buff_has_frags(xdp)) {
+		sinfo->nr_frags = 0;
+		sinfo->xdp_frags_size = 0;
+		xdp_buff_set_frags_flag(xdp);
+	} else if (unlikely(sinfo->nr_frags >= MAX_SKB_FRAGS)) {
+		/* Overflowing packet: All frags need to be dropped */
+		return -ENOMEM;
+	}
+
+	__skb_fill_page_desc_noacc(sinfo, sinfo->nr_frags++, rx_buffer->page,
+				   rx_buffer->page_offset, size);
+
+	sinfo->xdp_frags_size += size;

-	ntc = (ntc < rx_ring->count) ? ntc : 0;
-	rx_ring->next_to_clean = ntc;
-	prefetch(I40E_RX_DESC(rx_ring, ntc));
+	if (page_is_pfmemalloc(rx_buffer->page))
+		xdp_buff_set_frag_pfmemalloc(xdp);
+	*nr_frags = sinfo->nr_frags;
+
+	return 0;
+}
+
+/**
+ * i40e_consume_xdp_buff - Consume all the buffers of the packet and update ntc
+ * @rx_ring: rx descriptor ring to transact packets on
+ * @xdp: xdp_buff pointing to the data
+ * @rx_buffer: rx_buffer of eop desc
+ */
+static void i40e_consume_xdp_buff(struct i40e_ring *rx_ring,
+				  struct xdp_buff *xdp,
+				  struct i40e_rx_buffer *rx_buffer)
+{
+	i40e_process_rx_buffs(rx_ring, I40E_XDP_CONSUMED, xdp);
+	i40e_put_rx_buffer(rx_ring, rx_buffer);
+	rx_ring->next_to_clean = rx_ring->next_to_process;
+	xdp->data = NULL;
 }

 /**
@@ -2437,38 +2493,36 @@ static void i40e_inc_ntc(struct i40e_ring *rx_ring)
 static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
 			     unsigned int *rx_cleaned)
 {
-	unsigned int total_rx_bytes = 0, total_rx_packets = 0, frame_sz = 0;
+	unsigned int total_rx_bytes = 0, total_rx_packets = 0;
 	u16 cleaned_count = I40E_DESC_UNUSED(rx_ring);
+	u16 clean_threshold = rx_ring->count / 2;
 	unsigned int offset = rx_ring->rx_offset;
-	struct sk_buff *skb = rx_ring->skb;
+	struct xdp_buff *xdp = &rx_ring->xdp;
 	unsigned int xdp_xmit = 0;
 	struct bpf_prog *xdp_prog;
 	bool failure = false;
-	struct xdp_buff xdp;
 	int xdp_res = 0;

-#if (PAGE_SIZE < 8192)
-	frame_sz = i40e_rx_frame_truesize(rx_ring, 0);
-#endif
-	xdp_init_buff(&xdp, frame_sz, &rx_ring->xdp_rxq);
-
 	xdp_prog = READ_ONCE(rx_ring->xdp_prog);

 	while (likely(total_rx_packets < (unsigned int)budget)) {
+		u16 ntp = rx_ring->next_to_process;
 		struct i40e_rx_buffer *rx_buffer;
 		union i40e_rx_desc *rx_desc;
-		int rx_buffer_pgcnt;
+		struct sk_buff *skb;
 		unsigned int size;
+		u32 nfrags = 0;
+		bool neop;
 		u64 qword;

 		/* return some buffers to hardware, one at a time is too slow */
-		if (cleaned_count >= I40E_RX_BUFFER_WRITE) {
+		if (cleaned_count >= clean_threshold) {
 			failure = failure ||
 				  i40e_alloc_rx_buffers(rx_ring, cleaned_count);
 			cleaned_count = 0;
 		}

-		rx_desc = I40E_RX_DESC(rx_ring, rx_ring->next_to_clean);
+		rx_desc = I40E_RX_DESC(rx_ring, ntp);

 		/* status_error_len will always be zero for unused descriptors
 		 * because it's cleared in cleanup, and overlaps with hdr_addr
@@ -2487,8 +2541,8 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
 			i40e_clean_programming_status(rx_ring,
 						      rx_desc->raw.qword[0],
 						      qword);
-			rx_buffer = i40e_rx_bi(rx_ring, rx_ring->next_to_clean);
-			i40e_inc_ntc(rx_ring);
+			rx_buffer = i40e_rx_bi(rx_ring, ntp);
+			i40e_inc_ntp(rx_ring);
 			i40e_reuse_rx_page(rx_ring, rx_buffer);
 			cleaned_count++;
 			continue;
@@ -2499,76 +2553,84 @@ static int i40e_clean_rx_irq(struct i40e_ring *rx_ring, int budget,
 		if (!size)
 			break;

-		i40e_trace(clean_rx_irq, rx_ring, rx_desc, skb);
-		rx_buffer = i40e_get_rx_buffer(rx_ring, size, &rx_buffer_pgcnt);
-
+		i40e_trace(clean_rx_irq, rx_ring, rx_desc, xdp);
 		/* retrieve a buffer from the ring */
-		if (!skb) {
+		rx_buffer = i40e_get_rx_buffer(rx_ring, size);
+
+		neop = i40e_is_non_eop(rx_ring, rx_desc);
+		i40e_inc_ntp(rx_ring);
+
+		if (!xdp->data) {
 			unsigned char *hard_start;

 			hard_start = page_address(rx_buffer->page) +
 				     rx_buffer->page_offset - offset;
-			xdp_prepare_buff(&xdp, hard_start, offset, size, true);
-			xdp_buff_clear_frags_flag(&xdp);
+			xdp_prepare_buff(xdp, hard_start, offset, size, true);
 #if (PAGE_SIZE > 4096)
 			/* At larger PAGE_SIZE, frame_sz depend on len size */
-			xdp.frame_sz = i40e_rx_frame_truesize(rx_ring, size);
+			xdp->frame_sz = i40e_rx_frame_truesize(rx_ring, size);
 #endif
-			xdp_res = i40e_run_xdp(rx_ring, &xdp, xdp_prog);
+		} else if (i40e_add_xdp_frag(xdp, &nfrags, rx_buffer, size) &&
+			   !neop) {
+			/* Overflowing packet: Drop all frags on EOP */
+			i40e_consume_xdp_buff(rx_ring, xdp, rx_buffer);
+			break;
 		}

+		if (neop)
+			continue;
+
+		xdp_res = i40e_run_xdp(rx_ring, xdp, xdp_prog);
+
 		if (xdp_res) {
-			if (xdp_res & (I40E_XDP_TX | I40E_XDP_REDIR)) {
-				xdp_xmit |= xdp_res;
-				i40e_rx_buffer_flip(rx_ring, rx_buffer, size);
+			xdp_xmit |= xdp_res & (I40E_XDP_TX | I40E_XDP_REDIR);
+
+			if (unlikely(xdp_buff_has_frags(xdp))) {
+				i40e_process_rx_buffs(rx_ring, xdp_res, xdp);
+				size = xdp_get_buff_len(xdp);
+			} else if (xdp_res & (I40E_XDP_TX | I40E_XDP_REDIR)) {
+				i40e_rx_buffer_flip(rx_buffer, xdp->frame_sz);
 			} else {
 				rx_buffer->pagecnt_bias++;
 			}
 			total_rx_bytes += size;
-			total_rx_packets++;
-		} else if (skb) {
-			i40e_add_rx_frag(rx_ring, rx_buffer, skb, size);
-		} else if (ring_uses_build_skb(rx_ring)) {
-			skb = i40e_build_skb(rx_ring, rx_buffer, &xdp);
 		} else {
-			skb = i40e_construct_skb(rx_ring, rx_buffer, &xdp);
-		}
+			if (ring_uses_build_skb(rx_ring))
+				skb = i40e_build_skb(rx_ring, xdp, nfrags);
+			else
+				skb = i40e_construct_skb(rx_ring, xdp, nfrags);
+
+			/* drop if we failed to retrieve a buffer */
+			if (!skb) {
+				rx_ring->rx_stats.alloc_buff_failed++;
+				i40e_consume_xdp_buff(rx_ring, xdp, rx_buffer);
+				break;
+			}

-		/* exit if we failed to retrieve a buffer */
-		if (!xdp_res && !skb) {
-			rx_ring->rx_stats.alloc_buff_failed++;
-			rx_buffer->pagecnt_bias++;
-			break;
-		}
+			if (i40e_cleanup_headers(rx_ring, skb, rx_desc))
+				goto process_next;

-		i40e_put_rx_buffer(rx_ring, rx_buffer, rx_buffer_pgcnt);
-		cleaned_count++;
+			/* probably a little skewed due to removing CRC */
+			total_rx_bytes += skb->len;

-		i40e_inc_ntc(rx_ring);
-		if (i40e_is_non_eop(rx_ring, rx_desc))
-			continue;
+			/* populate checksum, VLAN, and protocol */
+			i40e_process_skb_fields(rx_ring, rx_desc, skb);

-		if (xdp_res || i40e_cleanup_headers(rx_ring, skb, rx_desc)) {
-			skb = NULL;
-			continue;
+			i40e_trace(clean_rx_irq_rx, rx_ring, rx_desc, xdp);
+			napi_gro_receive(&rx_ring->q_vector->napi, skb);
 		}

-		/* probably a little skewed due to removing CRC */
-		total_rx_bytes += skb->len;
-
-		/* populate checksum, VLAN, and protocol */
-		i40e_process_skb_fields(rx_ring, rx_desc, skb);
-
-		i40e_trace(clean_rx_irq_rx, rx_ring, rx_desc, skb);
-		napi_gro_receive(&rx_ring->q_vector->napi, skb);
-		skb = NULL;
-
 		/* update budget accounting */
 		total_rx_packets++;
+process_next:
+		cleaned_count += nfrags + 1;
+		i40e_put_rx_buffer(rx_ring, rx_buffer);
+		rx_ring->next_to_clean = rx_ring->next_to_process;
+
+		xdp->data = NULL;
 	}

 	i40e_finalize_xdp_rx(rx_ring, xdp_xmit);
-	rx_ring->skb = skb;

 	i40e_update_rx_stats(rx_ring, total_rx_bytes, total_rx_packets);

@@ -3001,7 +3063,7 @@ static inline int i40e_tx_prepare_vlan_flags(struct sk_buff *skb,
 		rc = skb_cow_head(skb, 0);
 		if (rc < 0)
 			return rc;
-		vhdr = (struct vlan_ethhdr *)skb->data;
+		vhdr = skb_vlan_eth_hdr(skb);
 		vhdr->h_vlan_TCI = htons(tx_flags >>
 					 I40E_TX_FLAGS_VLAN_SHIFT);
 	} else {
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
index 768290dc6f48..8c3d24012c54 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
@@ -277,6 +277,7 @@ struct i40e_rx_buffer {
 	struct page *page;
 	__u32 page_offset;
 	__u16 pagecnt_bias;
+	__u32 page_count;
 };

 struct i40e_queue_stats {
@@ -336,6 +337,17 @@ struct i40e_ring {
 	u8 dcb_tc;			/* Traffic class of ring */
 	u8 __iomem *tail;

+	/* Storing xdp_buff on ring helps in saving the state of partially built
+	 * packet when i40e_clean_rx_ring_irq() must return before it sees EOP
+	 * and to resume packet building for this ring in the next call to
+	 * i40e_clean_rx_ring_irq().
+	 */
+	struct xdp_buff xdp;
+
+	/* Next descriptor to be processed; next_to_clean is updated only on
+	 * processing EOP descriptor
+	 */
+	u16 next_to_process;
 	/* high bit set means dynamic, use accessor routines to read/write.
 	 * hardware only supports 2us resolution for the ITR registers.
 	 * these values always store the USER setting, and must be converted
@@ -380,14 +392,6 @@ struct i40e_ring {

 	struct rcu_head rcu;		/* to avoid race on free */
 	u16 next_to_alloc;
-	struct sk_buff *skb;		/* When i40e_clean_rx_ring_irq() must
-					 * return before it sees the EOP for
-					 * the current packet, we save that skb
-					 * here and resume receiving this
-					 * packet the next time
-					 * i40e_clean_rx_ring_irq() is called
-					 * for this ring.
- */ struct i40e_channel *ch; u16 rx_offset; diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c index 8a4587585acd..be59ba3774e1 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c +++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c @@ -2915,6 +2915,72 @@ static inline int i40e_check_vf_permission(struct i40e_vf *vf, } /** + * i40e_vc_ether_addr_type - get type of virtchnl_ether_addr + * @vc_ether_addr: used to extract the type + **/ +static u8 +i40e_vc_ether_addr_type(struct virtchnl_ether_addr *vc_ether_addr) +{ + return vc_ether_addr->type & VIRTCHNL_ETHER_ADDR_TYPE_MASK; +} + +/** + * i40e_is_vc_addr_legacy + * @vc_ether_addr: VIRTCHNL structure that contains MAC and type + * + * check if the MAC address is from an older VF + **/ +static bool +i40e_is_vc_addr_legacy(struct virtchnl_ether_addr *vc_ether_addr) +{ + return i40e_vc_ether_addr_type(vc_ether_addr) == + VIRTCHNL_ETHER_ADDR_LEGACY; +} + +/** + * i40e_is_vc_addr_primary + * @vc_ether_addr: VIRTCHNL structure that contains MAC and type + * + * check if the MAC address is the VF's primary MAC + * This function should only be called when the MAC address in + * virtchnl_ether_addr is a valid unicast MAC + **/ +static bool +i40e_is_vc_addr_primary(struct virtchnl_ether_addr *vc_ether_addr) +{ + return i40e_vc_ether_addr_type(vc_ether_addr) == + VIRTCHNL_ETHER_ADDR_PRIMARY; +} + +/** + * i40e_update_vf_mac_addr + * @vf: VF to update + * @vc_ether_addr: structure from VIRTCHNL with MAC to add + * + * update the VF's cached hardware MAC if allowed + **/ +static void +i40e_update_vf_mac_addr(struct i40e_vf *vf, + struct virtchnl_ether_addr *vc_ether_addr) +{ + u8 *mac_addr = vc_ether_addr->addr; + + if (!is_valid_ether_addr(mac_addr)) + return; + + /* If request to add MAC filter is a primary request update its default + * MAC address with the requested one. 
If it is a legacy request then + * check if current default is empty if so update the default MAC + */ + if (i40e_is_vc_addr_primary(vc_ether_addr)) { + ether_addr_copy(vf->default_lan_addr.addr, mac_addr); + } else if (i40e_is_vc_addr_legacy(vc_ether_addr)) { + if (is_zero_ether_addr(vf->default_lan_addr.addr)) + ether_addr_copy(vf->default_lan_addr.addr, mac_addr); + } +} + +/** * i40e_vc_add_mac_addr_msg * @vf: pointer to the VF info * @msg: pointer to the msg buffer @@ -2965,11 +3031,8 @@ static int i40e_vc_add_mac_addr_msg(struct i40e_vf *vf, u8 *msg) spin_unlock_bh(&vsi->mac_filter_hash_lock); goto error_param; } - if (is_valid_ether_addr(al->list[i].addr) && - is_zero_ether_addr(vf->default_lan_addr.addr)) - ether_addr_copy(vf->default_lan_addr.addr, - al->list[i].addr); } + i40e_update_vf_mac_addr(vf, &al->list[i]); } spin_unlock_bh(&vsi->mac_filter_hash_lock); @@ -3032,6 +3095,9 @@ static int i40e_vc_del_mac_addr_msg(struct i40e_vf *vf, u8 *msg) spin_unlock_bh(&vsi->mac_filter_hash_lock); + if (was_unimac_deleted) + eth_zero_addr(vf->default_lan_addr.addr); + /* program the updated filter list */ ret = i40e_sync_vsi_filters(vsi); if (ret) diff --git a/drivers/net/ethernet/intel/iavf/iavf.h b/drivers/net/ethernet/intel/iavf/iavf.h index 746ff76f2fb1..9abaff1f2aff 100644 --- a/drivers/net/ethernet/intel/iavf/iavf.h +++ b/drivers/net/ethernet/intel/iavf/iavf.h @@ -6,7 +6,6 @@ #include <linux/module.h> #include <linux/pci.h> -#include <linux/aer.h> #include <linux/netdevice.h> #include <linux/vmalloc.h> #include <linux/interrupt.h> diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h index e809249500e1..aa32111afd6e 100644 --- a/drivers/net/ethernet/intel/ice/ice.h +++ b/drivers/net/ethernet/intel/ice/ice.h @@ -20,7 +20,6 @@ #include <linux/pci.h> #include <linux/workqueue.h> #include <linux/wait.h> -#include <linux/aer.h> #include <linux/interrupt.h> #include <linux/ethtool.h> #include <linux/timer.h> diff --git 
a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c index c2fda4fa4188..0157f6e98d3e 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.c +++ b/drivers/net/ethernet/intel/ice/ice_common.c @@ -1619,7 +1619,6 @@ ice_sq_send_cmd_retry(struct ice_hw *hw, struct ice_ctl_q_info *cq, { struct ice_aq_desc desc_cpy; bool is_cmd_for_retry; - u8 *buf_cpy = NULL; u8 idx = 0; u16 opcode; int status; @@ -1629,11 +1628,8 @@ ice_sq_send_cmd_retry(struct ice_hw *hw, struct ice_ctl_q_info *cq, memset(&desc_cpy, 0, sizeof(desc_cpy)); if (is_cmd_for_retry) { - if (buf) { - buf_cpy = kzalloc(buf_size, GFP_KERNEL); - if (!buf_cpy) - return -ENOMEM; - } + /* All retryable cmds are direct, without buf. */ + WARN_ON(buf); memcpy(&desc_cpy, desc, sizeof(desc_cpy)); } @@ -1645,17 +1641,12 @@ ice_sq_send_cmd_retry(struct ice_hw *hw, struct ice_ctl_q_info *cq, hw->adminq.sq_last_status != ICE_AQ_RC_EBUSY) break; - if (buf_cpy) - memcpy(buf, buf_cpy, buf_size); - memcpy(desc, &desc_cpy, sizeof(desc_cpy)); - mdelay(ICE_SQ_SEND_DELAY_TIME_MS); + msleep(ICE_SQ_SEND_DELAY_TIME_MS); } while (++idx < ICE_SQ_SEND_MAX_EXECUTE); - kfree(buf_cpy); - return status; } @@ -1992,19 +1983,19 @@ ice_acquire_res_exit: */ void ice_release_res(struct ice_hw *hw, enum ice_aq_res_ids res) { - u32 total_delay = 0; + unsigned long timeout; int status; - status = ice_aq_release_res(hw, res, 0, NULL); - /* there are some rare cases when trying to release the resource * results in an admin queue timeout, so handle them correctly */ - while ((status == -EIO) && (total_delay < hw->adminq.sq_cmd_timeout)) { - mdelay(1); + timeout = jiffies + 10 * ICE_CTL_Q_SQ_CMD_TIMEOUT; + do { status = ice_aq_release_res(hw, res, 0, NULL); - total_delay++; - } + if (status != -EIO) + break; + usleep_range(1000, 2000); + } while (time_before(jiffies, timeout)); } /** diff --git a/drivers/net/ethernet/intel/ice/ice_controlq.c b/drivers/net/ethernet/intel/ice/ice_controlq.c index 
6bcfee295991..d2faf1baad2f 100644 --- a/drivers/net/ethernet/intel/ice/ice_controlq.c +++ b/drivers/net/ethernet/intel/ice/ice_controlq.c @@ -637,9 +637,6 @@ static int ice_init_ctrlq(struct ice_hw *hw, enum ice_ctl_q q_type) return -EIO; } - /* setup SQ command write back timeout */ - cq->sq_cmd_timeout = ICE_CTL_Q_SQ_CMD_TIMEOUT; - /* allocate the ATQ */ ret_code = ice_init_sq(hw, cq); if (ret_code) @@ -967,7 +964,7 @@ ice_sq_send_cmd(struct ice_hw *hw, struct ice_ctl_q_info *cq, struct ice_aq_desc *desc_on_ring; bool cmd_completed = false; struct ice_sq_cd *details; - u32 total_delay = 0; + unsigned long timeout; int status = 0; u16 retval = 0; u32 val = 0; @@ -1060,13 +1057,14 @@ ice_sq_send_cmd(struct ice_hw *hw, struct ice_ctl_q_info *cq, cq->sq.next_to_use = 0; wr32(hw, cq->sq.tail, cq->sq.next_to_use); + timeout = jiffies + ICE_CTL_Q_SQ_CMD_TIMEOUT; do { if (ice_sq_done(hw, cq)) break; - udelay(ICE_CTL_Q_SQ_CMD_USEC); - total_delay++; - } while (total_delay < cq->sq_cmd_timeout); + usleep_range(ICE_CTL_Q_SQ_CMD_USEC, + ICE_CTL_Q_SQ_CMD_USEC * 3 / 2); + } while (time_before(jiffies, timeout)); /* if ready, copy the desc back to temp */ if (ice_sq_done(hw, cq)) { diff --git a/drivers/net/ethernet/intel/ice/ice_controlq.h b/drivers/net/ethernet/intel/ice/ice_controlq.h index c07e9cc9fc6e..950b7f4a7a05 100644 --- a/drivers/net/ethernet/intel/ice/ice_controlq.h +++ b/drivers/net/ethernet/intel/ice/ice_controlq.h @@ -34,7 +34,7 @@ enum ice_ctl_q { }; /* Control Queue timeout settings - max delay 1s */ -#define ICE_CTL_Q_SQ_CMD_TIMEOUT 10000 /* Count 10000 times */ +#define ICE_CTL_Q_SQ_CMD_TIMEOUT HZ /* Wait max 1s */ #define ICE_CTL_Q_SQ_CMD_USEC 100 /* Check every 100usec */ #define ICE_CTL_Q_ADMIN_INIT_TIMEOUT 10 /* Count 10 times */ #define ICE_CTL_Q_ADMIN_INIT_MSEC 100 /* Check every 100msec */ @@ -87,7 +87,6 @@ struct ice_ctl_q_info { enum ice_ctl_q qtype; struct ice_ctl_q_ring rq; /* receive queue */ struct ice_ctl_q_ring sq; /* send queue */ - u32 
sq_cmd_timeout; /* send queue cmd write back timeout */ u16 num_rq_entries; /* receive queue depth */ u16 num_sq_entries; /* send queue depth */ u16 rq_buf_size; /* receive queue buffer size */ diff --git a/drivers/net/ethernet/intel/ice/ice_devlink.c b/drivers/net/ethernet/intel/ice/ice_devlink.c index 05f216af8c81..bc44cc220818 100644 --- a/drivers/net/ethernet/intel/ice/ice_devlink.c +++ b/drivers/net/ethernet/intel/ice/ice_devlink.c @@ -1254,7 +1254,6 @@ static const struct devlink_ops ice_devlink_ops = { .supported_flash_update_params = DEVLINK_SUPPORT_FLASH_UPDATE_OVERWRITE_MASK, .reload_actions = BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT) | BIT(DEVLINK_RELOAD_ACTION_FW_ACTIVATE), - /* The ice driver currently does not support driver reinit */ .reload_down = ice_devlink_reload_down, .reload_up = ice_devlink_reload_up, .port_split = ice_devlink_port_split, diff --git a/drivers/net/ethernet/intel/ice/ice_gnss.c b/drivers/net/ethernet/intel/ice/ice_gnss.c index 8dec748bb53a..2ea8a2b11bcd 100644 --- a/drivers/net/ethernet/intel/ice/ice_gnss.c +++ b/drivers/net/ethernet/intel/ice/ice_gnss.c @@ -117,6 +117,7 @@ static void ice_gnss_read(struct kthread_work *work) { struct gnss_serial *gnss = container_of(work, struct gnss_serial, read_work.work); + unsigned long delay = ICE_GNSS_POLL_DATA_DELAY_TIME; unsigned int i, bytes_read, data_len, count; struct ice_aqc_link_topo_addr link_topo; struct ice_pf *pf; @@ -136,11 +137,6 @@ static void ice_gnss_read(struct kthread_work *work) return; hw = &pf->hw; - buf = (char *)get_zeroed_page(GFP_KERNEL); - if (!buf) { - err = -ENOMEM; - goto exit; - } memset(&link_topo, 0, sizeof(struct ice_aqc_link_topo_addr)); link_topo.topo_params.index = ICE_E810T_GNSS_I2C_BUS; @@ -151,25 +147,24 @@ static void ice_gnss_read(struct kthread_work *work) i2c_params = ICE_GNSS_UBX_DATA_LEN_WIDTH | ICE_AQC_I2C_USE_REPEATED_START; - /* Read data length in a loop, when it's not 0 the data is ready */ - for (i = 0; i < ICE_MAX_UBX_READ_TRIES; i++) { 
-		err = ice_aq_read_i2c(hw, link_topo, ICE_GNSS_UBX_I2C_BUS_ADDR,
-				      cpu_to_le16(ICE_GNSS_UBX_DATA_LEN_H),
-				      i2c_params, (u8 *)&data_len_b, NULL);
-		if (err)
-			goto exit_buf;
+	err = ice_aq_read_i2c(hw, link_topo, ICE_GNSS_UBX_I2C_BUS_ADDR,
+			      cpu_to_le16(ICE_GNSS_UBX_DATA_LEN_H),
+			      i2c_params, (u8 *)&data_len_b, NULL);
+	if (err)
+		goto requeue;
-		data_len = be16_to_cpu(data_len_b);
-		if (data_len != 0 && data_len != U16_MAX)
-			break;
+	data_len = be16_to_cpu(data_len_b);
+	if (data_len == 0 || data_len == U16_MAX)
+		goto requeue;
-		mdelay(10);
-	}
+	/* The u-blox has data_len bytes for us to read */
 	data_len = min_t(typeof(data_len), data_len, PAGE_SIZE);
-	if (!data_len) {
+
+	buf = (char *)get_zeroed_page(GFP_KERNEL);
+	if (!buf) {
 		err = -ENOMEM;
-		goto exit_buf;
+		goto requeue;
 	}
 	/* Read received data */
@@ -183,7 +178,7 @@ static void ice_gnss_read(struct kthread_work *work)
 				      cpu_to_le16(ICE_GNSS_UBX_EMPTY_DATA),
 				      bytes_read, &buf[i], NULL);
 		if (err)
-			goto exit_buf;
+			goto free_buf;
 	}
 	count = gnss_insert_raw(pf->gnss_dev, buf, i);
@@ -191,10 +186,11 @@ static void ice_gnss_read(struct kthread_work *work)
 		dev_warn(ice_pf_to_dev(pf),
 			 "gnss_insert_raw ret=%d size=%d\n",
 			 count, i);
-exit_buf:
+	delay = ICE_GNSS_TIMER_DELAY_TIME;
+free_buf:
 	free_page((unsigned long)buf);
-	kthread_queue_delayed_work(gnss->kworker, &gnss->read_work,
-				   ICE_GNSS_TIMER_DELAY_TIME);
+requeue:
+	kthread_queue_delayed_work(gnss->kworker, &gnss->read_work, delay);
 exit:
 	if (err)
 		dev_dbg(ice_pf_to_dev(pf), "GNSS failed to read err=%d\n", err);
diff --git a/drivers/net/ethernet/intel/ice/ice_gnss.h b/drivers/net/ethernet/intel/ice/ice_gnss.h
index 4d49e5b0b4b8..b8bb8b63d081 100644
--- a/drivers/net/ethernet/intel/ice/ice_gnss.h
+++ b/drivers/net/ethernet/intel/ice/ice_gnss.h
@@ -5,6 +5,7 @@
 #define _ICE_GNSS_H_
 #define ICE_E810T_GNSS_I2C_BUS 0x2
+#define ICE_GNSS_POLL_DATA_DELAY_TIME (HZ / 50) /* poll every 20 ms */
 #define ICE_GNSS_TIMER_DELAY_TIME (HZ / 10) /* 0.1 second per message */
 #define ICE_GNSS_TTY_WRITE_BUF 250
 #define ICE_MAX_I2C_DATA_SIZE FIELD_MAX(ICE_AQC_I2C_DATA_SIZE_M)
@@ -20,8 +21,6 @@
  * passed as I2C addr parameter.
  */
 #define ICE_GNSS_UBX_WRITE_BYTES (ICE_MAX_I2C_WRITE_BYTES + 1)
-#define ICE_MAX_UBX_READ_TRIES 255
-#define ICE_MAX_UBX_ACK_READ_TRIES 4095
 struct gnss_write_buf {
 	struct list_head queue;
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 0d8b8c6f9bd3..a1f7c8edc22f 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -1393,6 +1393,8 @@ static void ice_aq_cancel_waiting_tasks(struct ice_pf *pf)
 	wake_up(&pf->aq_wait_queue);
 }
+#define ICE_MBX_OVERFLOW_WATERMARK 64
+
 /**
  * __ice_clean_ctrlq - helper function to clean controlq rings
  * @pf: ptr to struct ice_pf
@@ -1483,6 +1485,7 @@ static int __ice_clean_ctrlq(struct ice_pf *pf, enum ice_ctl_q q_type)
 		return 0;
 	do {
+		struct ice_mbx_data data = {};
 		u16 opcode;
 		int ret;
@@ -1509,8 +1512,12 @@ static int __ice_clean_ctrlq(struct ice_pf *pf, enum ice_ctl_q q_type)
 			ice_vf_lan_overflow_event(pf, &event);
 			break;
 		case ice_mbx_opc_send_msg_to_pf:
-			if (!ice_is_malicious_vf(pf, &event, i, pending))
-				ice_vc_process_vf_msg(pf, &event);
+			data.num_msg_proc = i;
+			data.num_pending_arq = pending;
+			data.max_num_msgs_mbx = hw->mailboxq.num_rq_entries;
+			data.async_watermark_val = ICE_MBX_OVERFLOW_WATERMARK;
+
+			ice_vc_process_vf_msg(pf, &event, &data);
 			break;
 		case ice_aqc_opc_fw_logging:
 			ice_output_fw_log(hw, &event.desc, event.msg_buf);
@@ -3888,6 +3895,7 @@ static int ice_init_pf(struct ice_pf *pf)
 	mutex_init(&pf->vfs.table_lock);
 	hash_init(pf->vfs.table);
+	ice_mbx_init_snapshot(&pf->hw);
 	return 0;
 }
diff --git a/drivers/net/ethernet/intel/ice/ice_sriov.c b/drivers/net/ethernet/intel/ice/ice_sriov.c
index 0cc05e54a781..f1dca59bd844 100644
--- a/drivers/net/ethernet/intel/ice/ice_sriov.c
+++ b/drivers/net/ethernet/intel/ice/ice_sriov.c
@@ -204,10 +204,7 @@ void ice_free_vfs(struct ice_pf *pf)
 		}
 		/* clear malicious info since the VF is getting released */
-		if (ice_mbx_clear_malvf(&hw->mbx_snapshot, pf->vfs.malvfs,
-					ICE_MAX_SRIOV_VFS, vf->vf_id))
-			dev_dbg(dev, "failed to clear malicious VF state for VF %u\n",
-				vf->vf_id);
+		list_del(&vf->mbx_info.list_entry);
 		mutex_unlock(&vf->cfg_lock);
 	}
@@ -1017,7 +1014,6 @@ int ice_sriov_configure(struct pci_dev *pdev, int num_vfs)
 	if (!num_vfs) {
 		if (!pci_vfs_assigned(pdev)) {
 			ice_free_vfs(pf);
-			ice_mbx_deinit_snapshot(&pf->hw);
 			if (pf->lag)
 				ice_enable_lag(pf->lag);
 			return 0;
@@ -1027,15 +1023,9 @@ int ice_sriov_configure(struct pci_dev *pdev, int num_vfs)
 		return -EBUSY;
 	}
-	err = ice_mbx_init_snapshot(&pf->hw, num_vfs);
-	if (err)
-		return err;
-
 	err = ice_pci_sriov_ena(pf, num_vfs);
-	if (err) {
-		ice_mbx_deinit_snapshot(&pf->hw);
+	if (err)
 		return err;
-	}
 	if (pf->lag)
 		ice_disable_lag(pf->lag);
@@ -1787,66 +1777,3 @@ void ice_restore_all_vfs_msi_state(struct pci_dev *pdev)
 		}
 	}
 }
-
-/**
- * ice_is_malicious_vf - helper function to detect a malicious VF
- * @pf: ptr to struct ice_pf
- * @event: pointer to the AQ event
- * @num_msg_proc: the number of messages processed so far
- * @num_msg_pending: the number of messages peinding in admin queue
- */
-bool
-ice_is_malicious_vf(struct ice_pf *pf, struct ice_rq_event_info *event,
-		    u16 num_msg_proc, u16 num_msg_pending)
-{
-	s16 vf_id = le16_to_cpu(event->desc.retval);
-	struct device *dev = ice_pf_to_dev(pf);
-	struct ice_mbx_data mbxdata;
-	bool malvf = false;
-	struct ice_vf *vf;
-	int status;
-
-	vf = ice_get_vf_by_id(pf, vf_id);
-	if (!vf)
-		return false;
-
-	if (test_bit(ICE_VF_STATE_DIS, vf->vf_states))
-		goto out_put_vf;
-
-	mbxdata.num_msg_proc = num_msg_proc;
-	mbxdata.num_pending_arq = num_msg_pending;
-	mbxdata.max_num_msgs_mbx = pf->hw.mailboxq.num_rq_entries;
-#define ICE_MBX_OVERFLOW_WATERMARK 64
-	mbxdata.async_watermark_val = ICE_MBX_OVERFLOW_WATERMARK;
-
-	/* check to see if we have a malicious VF */
-	status = ice_mbx_vf_state_handler(&pf->hw, &mbxdata, vf_id, &malvf);
-	if (status)
-		goto out_put_vf;
-
-	if (malvf) {
-		bool report_vf = false;
-
-		/* if the VF is malicious and we haven't let the user
-		 * know about it, then let them know now
-		 */
-		status = ice_mbx_report_malvf(&pf->hw, pf->vfs.malvfs,
-					      ICE_MAX_SRIOV_VFS, vf_id,
-					      &report_vf);
-		if (status)
-			dev_dbg(dev, "Error reporting malicious VF\n");
-
-		if (report_vf) {
-			struct ice_vsi *pf_vsi = ice_get_main_vsi(pf);
-
-			if (pf_vsi)
-				dev_warn(dev, "VF MAC %pM on PF MAC %pM is generating asynchronous messages and may be overflowing the PF message queue. Please see the Adapter User Guide for more information\n",
-					 &vf->dev_lan_addr[0],
-					 pf_vsi->netdev->dev_addr);
-		}
-	}
-
-out_put_vf:
-	ice_put_vf(vf);
-	return malvf;
-}
diff --git a/drivers/net/ethernet/intel/ice/ice_sriov.h b/drivers/net/ethernet/intel/ice/ice_sriov.h
index 955ab810a198..346cb2666f3a 100644
--- a/drivers/net/ethernet/intel/ice/ice_sriov.h
+++ b/drivers/net/ethernet/intel/ice/ice_sriov.h
@@ -33,11 +33,7 @@ int
 ice_get_vf_cfg(struct net_device *netdev, int vf_id, struct ifla_vf_info *ivi);
 void ice_free_vfs(struct ice_pf *pf);
-void ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event);
 void ice_restore_all_vfs_msi_state(struct pci_dev *pdev);
-bool
-ice_is_malicious_vf(struct ice_pf *pf, struct ice_rq_event_info *event,
-		    u16 num_msg_proc, u16 num_msg_pending);
 int
 ice_set_vf_port_vlan(struct net_device *netdev, int vf_id, u16 vlan_id, u8 qos,
@@ -68,22 +64,11 @@ ice_vc_validate_pattern(struct ice_vf *vf, struct virtchnl_proto_hdrs *proto);
 static inline void ice_process_vflr_event(struct ice_pf *pf) { }
 static inline void ice_free_vfs(struct ice_pf *pf) { }
 static inline
-void ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event) { }
-static inline
 void ice_vf_lan_overflow_event(struct ice_pf *pf, struct ice_rq_event_info *event) { }
 static inline void ice_print_vfs_mdd_events(struct ice_pf *pf) { }
 static inline void ice_print_vf_rx_mdd_event(struct ice_vf *vf) { }
 static inline void ice_restore_all_vfs_msi_state(struct pci_dev *pdev) { }
-static inline bool
-ice_is_malicious_vf(struct ice_pf __always_unused *pf,
-		    struct ice_rq_event_info __always_unused *event,
-		    u16 __always_unused num_msg_proc,
-		    u16 __always_unused num_msg_pending)
-{
-	return false;
-}
-
 static inline int
 ice_sriov_configure(struct pci_dev __always_unused *pdev,
 		    int __always_unused num_vfs)
diff --git a/drivers/net/ethernet/intel/ice/ice_type.h b/drivers/net/ethernet/intel/ice/ice_type.h
index e3f622cad425..a09556e57803 100644
--- a/drivers/net/ethernet/intel/ice/ice_type.h
+++ b/drivers/net/ethernet/intel/ice/ice_type.h
@@ -784,14 +784,15 @@ struct ice_mbx_snap_buffer_data {
 	u16 max_num_msgs_mbx;
 };
-/* Structure to track messages sent by VFs on mailbox:
- * 1. vf_cntr: a counter array of VFs to track the number of
- * asynchronous messages sent by each VF
- * 2. vfcntr_len: number of entries in VF counter array
+/* Structure used to track a single VF's messages on the mailbox:
+ * 1. list_entry: linked list entry node
+ * 2. msg_count: the number of asynchronous messages sent by this VF
+ * 3. malicious: whether this VF has been detected as malicious before
  */
-struct ice_mbx_vf_counter {
-	u32 *vf_cntr;
-	u32 vfcntr_len;
+struct ice_mbx_vf_info {
+	struct list_head list_entry;
+	u32 msg_count;
+	u8 malicious : 1;
 };
 /* Structure to hold data relevant to the captured static snapshot
@@ -799,7 +800,7 @@ struct ice_mbx_vf_counter {
  */
 struct ice_mbx_snapshot {
 	struct ice_mbx_snap_buffer_data mbx_buf;
-	struct ice_mbx_vf_counter mbx_vf;
+	struct list_head mbx_vf;
 };
 /* Structure to hold data to be used for capturing or updating a
diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.c b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
index 0e57bd1b85fd..89fd6982df09 100644
--- a/drivers/net/ethernet/intel/ice/ice_vf_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
@@ -496,10 +496,7 @@ void ice_reset_all_vfs(struct ice_pf *pf)
 	/* clear all malicious info if the VFs are getting reset */
 	ice_for_each_vf(pf, bkt, vf)
-		if (ice_mbx_clear_malvf(&hw->mbx_snapshot, pf->vfs.malvfs,
-					ICE_MAX_SRIOV_VFS, vf->vf_id))
-			dev_dbg(dev, "failed to clear malicious VF state for VF %u\n",
-				vf->vf_id);
+		ice_mbx_clear_malvf(&vf->mbx_info);
 	/* If VFs have been disabled, there is no need to reset */
 	if (test_and_set_bit(ICE_VF_DIS, pf->state)) {
@@ -601,12 +598,10 @@ int ice_reset_vf(struct ice_vf *vf, u32 flags)
 	struct ice_pf *pf = vf->pf;
 	struct ice_vsi *vsi;
 	struct device *dev;
-	struct ice_hw *hw;
 	int err = 0;
 	bool rsd;
 	dev = ice_pf_to_dev(pf);
-	hw = &pf->hw;
 	if (flags & ICE_VF_RESET_NOTIFY)
 		ice_notify_vf_reset(vf);
@@ -705,10 +700,7 @@ int ice_reset_vf(struct ice_vf *vf, u32 flags)
 	ice_eswitch_replay_vf_mac_rule(vf);
 	/* if the VF has been reset allow it to come up again */
-	if (ice_mbx_clear_malvf(&hw->mbx_snapshot, pf->vfs.malvfs,
-				ICE_MAX_SRIOV_VFS, vf->vf_id))
-		dev_dbg(dev, "failed to clear malicious VF state for VF %u\n",
-			vf->vf_id);
+	ice_mbx_clear_malvf(&vf->mbx_info);
 out_unlock:
 	if (flags & ICE_VF_RESET_LOCK)
@@ -764,6 +756,9 @@ void ice_initialize_vf_entry(struct ice_vf *vf)
 	ice_vf_ctrl_invalidate_vsi(vf);
 	ice_vf_fdir_init(vf);
+	/* Initialize mailbox info for this VF */
+	ice_mbx_init_vf_info(&pf->hw, &vf->mbx_info);
+
 	mutex_init(&vf->cfg_lock);
 }
diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.h b/drivers/net/ethernet/intel/ice/ice_vf_lib.h
index ef30f05b5d02..e3cda6fb71ab 100644
--- a/drivers/net/ethernet/intel/ice/ice_vf_lib.h
+++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.h
@@ -74,7 +74,6 @@ struct ice_vfs {
 	u16 num_qps_per; /* number of queue pairs per VF */
 	u16 num_msix_per; /* number of MSI-X vectors per VF */
 	unsigned long last_printed_mdd_jiffies; /* MDD message rate limit */
-	DECLARE_BITMAP(malvfs, ICE_MAX_SRIOV_VFS); /* malicious VF indicator */
 };
 /* VF information structure */
@@ -105,6 +104,7 @@ struct ice_vf {
 	DECLARE_BITMAP(rxq_ena, ICE_MAX_RSS_QS_PER_VF);
 	struct ice_vlan port_vlan_info; /* Port VLAN ID, QoS, and TPID */
 	struct virtchnl_vlan_caps vlan_v2_caps;
+	struct ice_mbx_vf_info mbx_info;
 	u8 pf_set_mac:1; /* VF MAC address set by VMM admin */
 	u8 trusted:1;
 	u8 spoofchk:1;
diff --git a/drivers/net/ethernet/intel/ice/ice_vf_mbx.c b/drivers/net/ethernet/intel/ice/ice_vf_mbx.c
index f56fa94ff3d0..40cb4ba0789c 100644
--- a/drivers/net/ethernet/intel/ice/ice_vf_mbx.c
+++ b/drivers/net/ethernet/intel/ice/ice_vf_mbx.c
@@ -93,36 +93,31 @@ u32 ice_conv_link_speed_to_virtchnl(bool adv_link_support, u16 link_speed)
  *
  * 2. When the caller starts processing its mailbox queue in response to an
  * interrupt, the structure ice_mbx_snapshot is expected to be cleared before
- * the algorithm can be run for the first time for that interrupt. This can be
- * done via ice_mbx_reset_snapshot().
+ * the algorithm can be run for the first time for that interrupt. This
+ * requires calling ice_mbx_reset_snapshot() as well as calling
+ * ice_mbx_reset_vf_info() for each VF tracking structure.
  *
  * 3. For every message read by the caller from the MBX Queue, the caller must
  * call the detection algorithm's entry function ice_mbx_vf_state_handler().
  * Before every call to ice_mbx_vf_state_handler() the struct ice_mbx_data is
  * filled as it is required to be passed to the algorithm.
  *
- * 4. Every time a message is read from the MBX queue, a VFId is received which
- * is passed to the state handler. The boolean output is_malvf of the state
- * handler ice_mbx_vf_state_handler() serves as an indicator to the caller
- * whether this VF is malicious or not.
+ * 4. Every time a message is read from the MBX queue, a tracking structure
+ * for the VF must be passed to the state handler. The boolean output
+ * report_malvf from ice_mbx_vf_state_handler() serves as an indicator to the
+ * caller whether it must report this VF as malicious or not.
  *
  * 5. When a VF is identified to be malicious, the caller can send a message
- * to the system administrator. The caller can invoke ice_mbx_report_malvf()
- * to help determine if a malicious VF is to be reported or not. This function
- * requires the caller to maintain a global bitmap to track all malicious VFs
- * and pass that to ice_mbx_report_malvf() along with the VFID which was identified
- * to be malicious by ice_mbx_vf_state_handler().
+ * to the system administrator.
  *
- * 6. The global bitmap maintained by PF can be cleared completely if PF is in
- * reset or the bit corresponding to a VF can be cleared if that VF is in reset.
- * When a VF is shut down and brought back up, we assume that the new VF
- * brought up is not malicious and hence report it if found malicious.
+ * 6. The PF is responsible for maintaining the struct ice_mbx_vf_info
+ * structure for each VF. The PF should clear the VF tracking structure if the
+ * VF is reset. When a VF is shut down and brought back up, we will then
+ * assume that the new VF is not malicious and may report it again if we
+ * detect it again.
  *
  * 7. The function ice_mbx_reset_snapshot() is called to reset the information
  * in ice_mbx_snapshot for every new mailbox interrupt handled.
- *
- * 8. The memory allocated for variables in ice_mbx_snapshot is de-allocated
- * when driver is unloaded.
  */
 #define ICE_RQ_DATA_MASK(rq_data) ((rq_data) & PF_MBX_ARQH_ARQH_M)
 /* Using the highest value for an unsigned 16-bit value 0xFFFF to indicate that
@@ -131,6 +126,25 @@ u32 ice_conv_link_speed_to_virtchnl(bool adv_link_support, u16 link_speed)
 #define ICE_IGNORE_MAX_MSG_CNT 0xFFFF
 /**
+ * ice_mbx_reset_snapshot - Reset mailbox snapshot structure
+ * @snap: pointer to the mailbox snapshot
+ */
+static void ice_mbx_reset_snapshot(struct ice_mbx_snapshot *snap)
+{
+	struct ice_mbx_vf_info *vf_info;
+
+	/* Clear mbx_buf in the mailbox snaphot structure and setting the
+	 * mailbox snapshot state to a new capture.
+	 */
+	memset(&snap->mbx_buf, 0, sizeof(snap->mbx_buf));
+	snap->mbx_buf.state = ICE_MAL_VF_DETECT_STATE_NEW_SNAPSHOT;
+
+	/* Reset message counts for all VFs to zero */
+	list_for_each_entry(vf_info, &snap->mbx_vf, list_entry)
+		vf_info->msg_count = 0;
+}
+
+/**
  * ice_mbx_traverse - Pass through mailbox snapshot
  * @hw: pointer to the HW struct
  * @new_state: new algorithm state
@@ -171,7 +185,7 @@ ice_mbx_traverse(struct ice_hw *hw,
 /**
  * ice_mbx_detect_malvf - Detect malicious VF in snapshot
  * @hw: pointer to the HW struct
- * @vf_id: relative virtual function ID
+ * @vf_info: mailbox tracking structure for a VF
  * @new_state: new algorithm state
  * @is_malvf: boolean output to indicate if VF is malicious
  *
@@ -180,19 +194,14 @@ ice_mbx_traverse(struct ice_hw *hw,
  * the permissible number of messages to send.
  */
 static int
-ice_mbx_detect_malvf(struct ice_hw *hw, u16 vf_id,
+ice_mbx_detect_malvf(struct ice_hw *hw, struct ice_mbx_vf_info *vf_info,
 		     enum ice_mbx_snapshot_state *new_state,
 		     bool *is_malvf)
 {
-	struct ice_mbx_snapshot *snap = &hw->mbx_snapshot;
-
-	if (vf_id >= snap->mbx_vf.vfcntr_len)
-		return -EIO;
-
-	/* increment the message count in the VF array */
-	snap->mbx_vf.vf_cntr[vf_id]++;
+	/* increment the message count for this VF */
+	vf_info->msg_count++;
-	if (snap->mbx_vf.vf_cntr[vf_id] >= ICE_ASYNC_VF_MSG_THRESHOLD)
+	if (vf_info->msg_count >= ICE_ASYNC_VF_MSG_THRESHOLD)
 		*is_malvf = true;
 	/* continue to iterate through the mailbox snapshot */
@@ -202,35 +211,11 @@ ice_mbx_detect_malvf(struct ice_hw *hw, u16 vf_id,
 }
 /**
- * ice_mbx_reset_snapshot - Reset mailbox snapshot structure
- * @snap: pointer to mailbox snapshot structure in the ice_hw struct
- *
- * Reset the mailbox snapshot structure and clear VF counter array.
- */
-static void ice_mbx_reset_snapshot(struct ice_mbx_snapshot *snap)
-{
-	u32 vfcntr_len;
-
-	if (!snap || !snap->mbx_vf.vf_cntr)
-		return;
-
-	/* Clear VF counters. */
-	vfcntr_len = snap->mbx_vf.vfcntr_len;
-	if (vfcntr_len)
-		memset(snap->mbx_vf.vf_cntr, 0,
-		       (vfcntr_len * sizeof(*snap->mbx_vf.vf_cntr)));
-
-	/* Reset mailbox snapshot for a new capture. */
-	memset(&snap->mbx_buf, 0, sizeof(snap->mbx_buf));
-	snap->mbx_buf.state = ICE_MAL_VF_DETECT_STATE_NEW_SNAPSHOT;
-}
-
-/**
  * ice_mbx_vf_state_handler - Handle states of the overflow algorithm
  * @hw: pointer to the HW struct
  * @mbx_data: pointer to structure containing mailbox data
- * @vf_id: relative virtual function (VF) ID
- * @is_malvf: boolean output to indicate if VF is malicious
+ * @vf_info: mailbox tracking structure for the VF in question
+ * @report_malvf: boolean output to indicate whether VF should be reported
  *
  * The function serves as an entry point for the malicious VF
  * detection algorithm by handling the different states and state
@@ -249,24 +234,24 @@ static void ice_mbx_reset_snapshot(struct ice_mbx_snapshot *snap)
  * the static snapshot and look for a malicious VF.
  */
 int
-ice_mbx_vf_state_handler(struct ice_hw *hw,
-			 struct ice_mbx_data *mbx_data, u16 vf_id,
-			 bool *is_malvf)
+ice_mbx_vf_state_handler(struct ice_hw *hw, struct ice_mbx_data *mbx_data,
+			 struct ice_mbx_vf_info *vf_info, bool *report_malvf)
 {
 	struct ice_mbx_snapshot *snap = &hw->mbx_snapshot;
 	struct ice_mbx_snap_buffer_data *snap_buf;
 	struct ice_ctl_q_info *cq = &hw->mailboxq;
 	enum ice_mbx_snapshot_state new_state;
+	bool is_malvf = false;
 	int status = 0;
-	if (!is_malvf || !mbx_data)
+	if (!report_malvf || !mbx_data || !vf_info)
 		return -EINVAL;
+	*report_malvf = false;
+
 	/* When entering the mailbox state machine assume that the VF
 	 * is not malicious until detected.
 	 */
-	*is_malvf = false;
-
 	/* Checking if max messages allowed to be processed while servicing current
 	 * interrupt is not less than the defined AVF message threshold.
 	 */
@@ -315,7 +300,7 @@ ice_mbx_vf_state_handler(struct ice_hw *hw,
 		if (snap_buf->num_pending_arq >=
 		    mbx_data->async_watermark_val) {
 			new_state = ICE_MAL_VF_DETECT_STATE_DETECT;
-			status = ice_mbx_detect_malvf(hw, vf_id, &new_state, is_malvf);
+			status = ice_mbx_detect_malvf(hw, vf_info, &new_state, &is_malvf);
 		} else {
 			new_state = ICE_MAL_VF_DETECT_STATE_TRAVERSE;
 			ice_mbx_traverse(hw, &new_state);
@@ -329,7 +314,7 @@ ice_mbx_vf_state_handler(struct ice_hw *hw,
 	case ICE_MAL_VF_DETECT_STATE_DETECT:
 		new_state = ICE_MAL_VF_DETECT_STATE_DETECT;
-		status = ice_mbx_detect_malvf(hw, vf_id, &new_state, is_malvf);
+		status = ice_mbx_detect_malvf(hw, vf_info, &new_state, &is_malvf);
 		break;
 	default:
@@ -339,145 +324,57 @@ ice_mbx_vf_state_handler(struct ice_hw *hw,
 	snap_buf->state = new_state;
-	return status;
-}
-
-/**
- * ice_mbx_report_malvf - Track and note malicious VF
- * @hw: pointer to the HW struct
- * @all_malvfs: all malicious VFs tracked by PF
- * @bitmap_len: length of bitmap in bits
- * @vf_id: relative virtual function ID of the malicious VF
- * @report_malvf: boolean to indicate if malicious VF must be reported
- *
- * This function will update a bitmap that keeps track of the malicious
- * VFs attached to the PF. A malicious VF must be reported only once if
- * discovered between VF resets or loading so the function checks
- * the input vf_id against the bitmap to verify if the VF has been
- * detected in any previous mailbox iterations.
- */
-int
-ice_mbx_report_malvf(struct ice_hw *hw, unsigned long *all_malvfs,
-		     u16 bitmap_len, u16 vf_id, bool *report_malvf)
-{
-	if (!all_malvfs || !report_malvf)
-		return -EINVAL;
-
-	*report_malvf = false;
-
-	if (bitmap_len < hw->mbx_snapshot.mbx_vf.vfcntr_len)
-		return -EINVAL;
-
-	if (vf_id >= bitmap_len)
-		return -EIO;
-
-	/* If the vf_id is found in the bitmap set bit and boolean to true */
-	if (!test_and_set_bit(vf_id, all_malvfs))
+	/* Only report VFs as malicious the first time we detect it */
+	if (is_malvf && !vf_info->malicious) {
+		vf_info->malicious = 1;
 		*report_malvf = true;
+	}
-	return 0;
+	return status;
 }
 /**
- * ice_mbx_clear_malvf - Clear VF bitmap and counter for VF ID
- * @snap: pointer to the mailbox snapshot structure
- * @all_malvfs: all malicious VFs tracked by PF
- * @bitmap_len: length of bitmap in bits
- * @vf_id: relative virtual function ID of the malicious VF
+ * ice_mbx_clear_malvf - Clear VF mailbox info
+ * @vf_info: the mailbox tracking structure for a VF
  *
- * In case of a VF reset, this function can be called to clear
- * the bit corresponding to the VF ID in the bitmap tracking all
- * malicious VFs attached to the PF. The function also clears the
- * VF counter array at the index of the VF ID. This is to ensure
- * that the new VF loaded is not considered malicious before going
- * through the overflow detection algorithm.
+ * In case of a VF reset, this function shall be called to clear the VF's
+ * current mailbox tracking state.
  */
-int
-ice_mbx_clear_malvf(struct ice_mbx_snapshot *snap, unsigned long *all_malvfs,
-		    u16 bitmap_len, u16 vf_id)
+void ice_mbx_clear_malvf(struct ice_mbx_vf_info *vf_info)
 {
-	if (!snap || !all_malvfs)
-		return -EINVAL;
-
-	if (bitmap_len < snap->mbx_vf.vfcntr_len)
-		return -EINVAL;
-
-	/* Ensure VF ID value is not larger than bitmap or VF counter length */
-	if (vf_id >= bitmap_len || vf_id >= snap->mbx_vf.vfcntr_len)
-		return -EIO;
-
-	/* Clear VF ID bit in the bitmap tracking malicious VFs attached to PF */
-	clear_bit(vf_id, all_malvfs);
-
-	/* Clear the VF counter in the mailbox snapshot structure for that VF ID.
-	 * This is to ensure that if a VF is unloaded and a new one brought back
-	 * up with the same VF ID for a snapshot currently in traversal or detect
-	 * state the counter for that VF ID does not increment on top of existing
-	 * values in the mailbox overflow detection algorithm.
-	 */
-	snap->mbx_vf.vf_cntr[vf_id] = 0;
-
-	return 0;
+	vf_info->malicious = 0;
+	vf_info->msg_count = 0;
 }
 /**
- * ice_mbx_init_snapshot - Initialize mailbox snapshot structure
+ * ice_mbx_init_vf_info - Initialize a new VF mailbox tracking info
  * @hw: pointer to the hardware structure
- * @vf_count: number of VFs allocated on a PF
+ * @vf_info: the mailbox tracking info structure for a VF
  *
- * Clear the mailbox snapshot structure and allocate memory
- * for the VF counter array based on the number of VFs allocated
- * on that PF.
+ * Initialize a VF mailbox tracking info structure and insert it into the
+ * snapshot list.
  *
- * Assumption: This function will assume ice_get_caps() has already been
- * called to ensure that the vf_count can be compared against the number
- * of VFs supported as defined in the functional capabilities of the device.
+ * If you remove the VF, you must also delete the associated VF info structure
+ * from the linked list.
  */
-int ice_mbx_init_snapshot(struct ice_hw *hw, u16 vf_count)
+void ice_mbx_init_vf_info(struct ice_hw *hw, struct ice_mbx_vf_info *vf_info)
 {
 	struct ice_mbx_snapshot *snap = &hw->mbx_snapshot;
-	/* Ensure that the number of VFs allocated is non-zero and
-	 * is not greater than the number of supported VFs defined in
-	 * the functional capabilities of the PF.
-	 */
-	if (!vf_count || vf_count > hw->func_caps.num_allocd_vfs)
-		return -EINVAL;
-
-	snap->mbx_vf.vf_cntr = devm_kcalloc(ice_hw_to_dev(hw), vf_count,
-					    sizeof(*snap->mbx_vf.vf_cntr),
-					    GFP_KERNEL);
-	if (!snap->mbx_vf.vf_cntr)
-		return -ENOMEM;
-
-	/* Setting the VF counter length to the number of allocated
-	 * VFs for given PF's functional capabilities.
-	 */
-	snap->mbx_vf.vfcntr_len = vf_count;
-
-	/* Clear mbx_buf in the mailbox snaphot structure and setting the
-	 * mailbox snapshot state to a new capture.
-	 */
-	memset(&snap->mbx_buf, 0, sizeof(snap->mbx_buf));
-	snap->mbx_buf.state = ICE_MAL_VF_DETECT_STATE_NEW_SNAPSHOT;
-
-	return 0;
+	ice_mbx_clear_malvf(vf_info);
+	list_add(&vf_info->list_entry, &snap->mbx_vf);
 }
 /**
- * ice_mbx_deinit_snapshot - Free mailbox snapshot structure
+ * ice_mbx_init_snapshot - Initialize mailbox snapshot data
  * @hw: pointer to the hardware structure
  *
- * Clear the mailbox snapshot structure and free the VF counter array.
+ * Clear the mailbox snapshot structure and initialize the VF mailbox list.
  */
-void ice_mbx_deinit_snapshot(struct ice_hw *hw)
+void ice_mbx_init_snapshot(struct ice_hw *hw)
 {
 	struct ice_mbx_snapshot *snap = &hw->mbx_snapshot;
-	/* Free VF counter array and reset VF counter length */
-	devm_kfree(ice_hw_to_dev(hw), snap->mbx_vf.vf_cntr);
-	snap->mbx_vf.vfcntr_len = 0;
-
-	/* Clear mbx_buf in the mailbox snaphot structure */
-	memset(&snap->mbx_buf, 0, sizeof(snap->mbx_buf));
+	INIT_LIST_HEAD(&snap->mbx_vf);
+	ice_mbx_reset_snapshot(snap);
 }
diff --git a/drivers/net/ethernet/intel/ice/ice_vf_mbx.h b/drivers/net/ethernet/intel/ice/ice_vf_mbx.h
index 582716e6d5f9..44bc030d17e0 100644
--- a/drivers/net/ethernet/intel/ice/ice_vf_mbx.h
+++ b/drivers/net/ethernet/intel/ice/ice_vf_mbx.h
@@ -21,15 +21,10 @@ ice_aq_send_msg_to_vf(struct ice_hw *hw, u16 vfid, u32 v_opcode, u32 v_retval,
 u32 ice_conv_link_speed_to_virtchnl(bool adv_link_support, u16 link_speed);
 int
 ice_mbx_vf_state_handler(struct ice_hw *hw, struct ice_mbx_data *mbx_data,
-			 u16 vf_id, bool *is_mal_vf);
-int
-ice_mbx_clear_malvf(struct ice_mbx_snapshot *snap, unsigned long *all_malvfs,
-		    u16 bitmap_len, u16 vf_id);
-int ice_mbx_init_snapshot(struct ice_hw *hw, u16 vf_count);
-void ice_mbx_deinit_snapshot(struct ice_hw *hw);
-int
-ice_mbx_report_malvf(struct ice_hw *hw, unsigned long *all_malvfs,
-		     u16 bitmap_len, u16 vf_id, bool *report_malvf);
+			 struct ice_mbx_vf_info *vf_info, bool *report_malvf);
+void ice_mbx_clear_malvf(struct ice_mbx_vf_info *vf_info);
+void ice_mbx_init_vf_info(struct ice_hw *hw, struct ice_mbx_vf_info *vf_info);
+void ice_mbx_init_snapshot(struct ice_hw *hw);
 #else /* CONFIG_PCI_IOV */
 static inline int
 ice_aq_send_msg_to_vf(struct ice_hw __always_unused *hw,
@@ -48,5 +43,9 @@ ice_conv_link_speed_to_virtchnl(bool __always_unused adv_link_support,
 	return 0;
 }
+static inline void ice_mbx_init_snapshot(struct ice_hw *hw)
+{
+}
+
 #endif /* CONFIG_PCI_IOV */
 #endif /* _ICE_VF_MBX_H_ */
diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl.c b/drivers/net/ethernet/intel/ice/ice_virtchnl.c
index e24e3f5017ca..97243c616d5d 100644
--- a/drivers/net/ethernet/intel/ice/ice_virtchnl.c
+++ b/drivers/net/ethernet/intel/ice/ice_virtchnl.c
@@ -3834,14 +3834,57 @@ void ice_virtchnl_set_repr_ops(struct ice_vf *vf)
 }
 /**
+ * ice_is_malicious_vf - check if this vf might be overflowing mailbox
+ * @vf: the VF to check
+ * @mbxdata: data about the state of the mailbox
+ *
+ * Detect if a given VF might be malicious and attempting to overflow the PF
+ * mailbox. If so, log a warning message and ignore this event.
+ */
+static bool
+ice_is_malicious_vf(struct ice_vf *vf, struct ice_mbx_data *mbxdata)
+{
+	bool report_malvf = false;
+	struct device *dev;
+	struct ice_pf *pf;
+	int status;
+
+	pf = vf->pf;
+	dev = ice_pf_to_dev(pf);
+
+	if (test_bit(ICE_VF_STATE_DIS, vf->vf_states))
+		return vf->mbx_info.malicious;
+
+	/* check to see if we have a newly malicious VF */
+	status = ice_mbx_vf_state_handler(&pf->hw, mbxdata, &vf->mbx_info,
+					  &report_malvf);
+	if (status)
+		dev_warn_ratelimited(dev, "Unable to check status of mailbox overflow for VF %u MAC %pM, status %d\n",
+				     vf->vf_id, vf->dev_lan_addr, status);
+
+	if (report_malvf) {
+		struct ice_vsi *pf_vsi = ice_get_main_vsi(pf);
+		u8 zero_addr[ETH_ALEN] = {};
+
+		dev_warn(dev, "VF MAC %pM on PF MAC %pM is generating asynchronous messages and may be overflowing the PF message queue. Please see the Adapter User Guide for more information\n",
+			 vf->dev_lan_addr,
+			 pf_vsi ? pf_vsi->netdev->dev_addr : zero_addr);
+	}
+
+	return vf->mbx_info.malicious;
+}
+
+/**
  * ice_vc_process_vf_msg - Process request from VF
  * @pf: pointer to the PF structure
  * @event: pointer to the AQ event
+ * @mbxdata: information used to detect VF attempting mailbox overflow
  *
  * called from the common asq/arq handler to
  * process request from VF
  */
-void ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event)
+void ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event,
+			   struct ice_mbx_data *mbxdata)
 {
 	u32 v_opcode = le32_to_cpu(event->desc.cookie_high);
 	s16 vf_id = le16_to_cpu(event->desc.retval);
@@ -3863,6 +3906,10 @@ void ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event)
 	mutex_lock(&vf->cfg_lock);
+	/* Check if the VF is trying to overflow the mailbox */
+	if (ice_is_malicious_vf(vf, mbxdata))
+		goto finish;
+
 	/* Check if VF is disabled. */
 	if (test_bit(ICE_VF_STATE_DIS, vf->vf_states)) {
 		err = -EPERM;
diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl.h b/drivers/net/ethernet/intel/ice/ice_virtchnl.h
index b454654d7b0c..cd747718de73 100644
--- a/drivers/net/ethernet/intel/ice/ice_virtchnl.h
+++ b/drivers/net/ethernet/intel/ice/ice_virtchnl.h
@@ -63,6 +63,8 @@ int ice_vc_send_msg_to_vf(struct ice_vf *vf, u32 v_opcode,
 			  enum virtchnl_status_code v_retval, u8 *msg, u16 msglen);
 bool ice_vc_isvalid_vsi_id(struct ice_vf *vf, u16 vsi_id);
+void ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event,
+			   struct ice_mbx_data *mbxdata);
 #else /* CONFIG_PCI_IOV */
 static inline void ice_virtchnl_set_dflt_ops(struct ice_vf *vf) { }
 static inline void ice_virtchnl_set_repr_ops(struct ice_vf *vf) { }
@@ -81,6 +83,12 @@ static inline bool ice_vc_isvalid_vsi_id(struct ice_vf *vf, u16 vsi_id)
 {
 	return false;
 }
+
+static inline void
+ice_vc_process_vf_msg(struct ice_pf *pf, struct ice_rq_event_info *event,
+		      struct ice_mbx_data *mbxdata)
+{
+}
 #endif /* !CONFIG_PCI_IOV */
 #endif /* _ICE_VIRTCHNL_H_ */
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 274c781b5547..58872a4c2540 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -28,7 +28,6 @@
 #include <linux/tcp.h>
 #include <linux/sctp.h>
 #include <linux/if_ether.h>
-#include <linux/aer.h>
 #include <linux/prefetch.h>
 #include <linux/bpf.h>
 #include <linux/bpf_trace.h>
diff --git a/drivers/net/ethernet/intel/igb/igb_ptp.c b/drivers/net/ethernet/intel/igb/igb_ptp.c
index 6f471b91f562..405886ee5261 100644
--- a/drivers/net/ethernet/intel/igb/igb_ptp.c
+++ b/drivers/net/ethernet/intel/igb/igb_ptp.c
@@ -67,6 +67,7 @@
 #define INCVALUE_82576_MASK GENMASK(E1000_TIMINCA_16NS_SHIFT - 1, 0)
 #define INCVALUE_82576 (16u << IGB_82576_TSYNC_SHIFT)
 #define IGB_NBITS_82580 40
+#define IGB_82580_BASE_PERIOD 0x800000000
 static void igb_ptp_tx_hwtstamp(struct igb_adapter *adapter);
 static void igb_ptp_sdp_init(struct igb_adapter *adapter);
@@ -209,17 +210,11 @@ static int igb_ptp_adjfine_82580(struct ptp_clock_info *ptp, long scaled_ppm)
 	struct igb_adapter *igb = container_of(ptp, struct igb_adapter,
 					       ptp_caps);
 	struct e1000_hw *hw = &igb->hw;
-	int neg_adj = 0;
+	bool neg_adj;
 	u64 rate;
 	u32 inca;
-	if (scaled_ppm < 0) {
-		neg_adj = 1;
-		scaled_ppm = -scaled_ppm;
-	}
-	rate = scaled_ppm;
-	rate <<= 13;
-	rate = div_u64(rate, 15625);
+	neg_adj = diff_by_scaled_ppm(IGB_82580_BASE_PERIOD, scaled_ppm, &rate);
 	inca = rate & INCVALUE_MASK;
 	if (neg_adj)
diff --git a/drivers/net/ethernet/intel/igbvf/netdev.c b/drivers/net/ethernet/intel/igbvf/netdev.c
index 72cb1b56e9f2..7ff2752dd763 100644
--- a/drivers/net/ethernet/intel/igbvf/netdev.c
+++ b/drivers/net/ethernet/intel/igbvf/netdev.c
@@ -2593,6 +2593,33 @@ static void igbvf_io_resume(struct pci_dev *pdev)
 	netif_device_attach(netdev);
 }
+/**
+ * igbvf_io_prepare - prepare device driver for PCI reset
+ * @pdev: PCI device information struct
+ */
+static void igbvf_io_prepare(struct pci_dev *pdev)
+{
+	struct net_device *netdev = pci_get_drvdata(pdev);
+	struct igbvf_adapter *adapter = netdev_priv(netdev);
+
+	while (test_and_set_bit(__IGBVF_RESETTING, &adapter->state))
+		usleep_range(1000, 2000);
+	igbvf_down(adapter);
+}
+
+/**
+ * igbvf_io_reset_done - PCI reset done, device driver reset can begin
+ * @pdev: PCI device information struct
+ */
+static void igbvf_io_reset_done(struct pci_dev *pdev)
+{
+	struct net_device *netdev = pci_get_drvdata(pdev);
+	struct igbvf_adapter *adapter = netdev_priv(netdev);
+
+	igbvf_up(adapter);
+	clear_bit(__IGBVF_RESETTING, &adapter->state);
+}
+
 static void igbvf_print_device_info(struct igbvf_adapter *adapter)
 {
 	struct e1000_hw *hw = &adapter->hw;
@@ -2920,6 +2947,8 @@ static const struct pci_error_handlers igbvf_err_handler = {
 	.error_detected = igbvf_io_error_detected,
 	.slot_reset = igbvf_io_slot_reset,
 	.resume = igbvf_io_resume,
+	.reset_prepare = igbvf_io_prepare,
+	.reset_done = igbvf_io_reset_done,
 };
 static const struct pci_device_id igbvf_pci_tbl[] = {
diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
index df3e26c0cf01..34aebf00a512 100644
--- a/drivers/net/ethernet/intel/igc/igc.h
+++ b/drivers/net/ethernet/intel/igc/igc.h
@@ -99,6 +99,7 @@ struct igc_ring {
 	u32 start_time;
 	u32 end_time;
+	u32 max_sdu;
 	/* CBS parameters */
 	bool cbs_enable; /* indicates if CBS is enabled */
@@ -185,6 +186,7 @@ struct igc_adapter {
 	ktime_t base_time;
 	ktime_t cycle_time;
 	bool qbv_enable;
+	u32 qbv_config_change_errors;
 	/* OS defined structs */
 	struct pci_dev *pdev;
@@ -292,8 +294,6 @@ extern char igc_driver_name[];
 #define IGC_FLAG_PTP BIT(8)
 #define IGC_FLAG_WOL_SUPPORTED BIT(8)
 #define IGC_FLAG_NEED_LINK_UPDATE BIT(9)
-#define IGC_FLAG_MEDIA_RESET BIT(10)
-#define IGC_FLAG_MAS_ENABLE BIT(12)
 #define IGC_FLAG_HAS_MSIX BIT(13)
 #define IGC_FLAG_EEE BIT(14)
 #define IGC_FLAG_VLAN_PROMISC BIT(15)
diff --git a/drivers/net/ethernet/intel/igc/igc_defines.h b/drivers/net/ethernet/intel/igc/igc_defines.h
index 9dec3563ce3a..44a507029946 100644
--- a/drivers/net/ethernet/intel/igc/igc_defines.h
+++ b/drivers/net/ethernet/intel/igc/igc_defines.h
@@ -662,9 +662,6 @@
  */
 #define IGC_TW_SYSTEM_100_MASK 0x0000FF00
 #define IGC_TW_SYSTEM_100_SHIFT 8
-#define IGC_DMACR_DMAC_EN 0x80000000 /* Enable DMA Coalescing */
-#define IGC_DMACR_DMACTHR_MASK 0x00FF0000
-#define IGC_DMACR_DMACTHR_SHIFT 16
 /* Reg val to set scale to 1024 nsec */
 #define IGC_LTRMINV_SCALE_1024 2
 /* Reg val to set scale to 32768 nsec */
diff --git a/drivers/net/ethernet/intel/igc/igc_ethtool.c b/drivers/net/ethernet/intel/igc/igc_ethtool.c
index 5a26a7805ef8..0e2cb00622d1 100644
--- a/drivers/net/ethernet/intel/igc/igc_ethtool.c
+++ b/drivers/net/ethernet/intel/igc/igc_ethtool.c
@@ -67,6 +67,7 @@ static const struct igc_stats igc_gstrings_stats[] = {
 	IGC_STAT("rx_hwtstamp_cleared", rx_hwtstamp_cleared),
 	IGC_STAT("tx_lpi_counter", stats.tlpic),
 	IGC_STAT("rx_lpi_counter", stats.rlpic),
+	IGC_STAT("qbv_config_change_errors", qbv_config_change_errors),
 };
 #define IGC_NETDEV_STAT(_net_stat) { \
diff --git a/drivers/net/ethernet/intel/igc/igc_hw.h b/drivers/net/ethernet/intel/igc/igc_hw.h
index 88680e3d613d..e1c572e0d4ef 100644
--- a/drivers/net/ethernet/intel/igc/igc_hw.h
+++ b/drivers/net/ethernet/intel/igc/igc_hw.h
@@ -273,6 +273,7 @@ struct igc_hw_stats {
 	u64 o2bspc;
 	u64 b2ospc;
 	u64 b2ogprc;
+	u64 txdrop;
 };
 struct net_device *igc_get_hw_dev(struct igc_hw *hw);
diff --git a/drivers/net/ethernet/intel/igc/igc_i225.c b/drivers/net/ethernet/intel/igc/igc_i225.c
index 59d5c467ea6e..17546a035ab1 100644
--- a/drivers/net/ethernet/intel/igc/igc_i225.c
+++ b/drivers/net/ethernet/intel/igc/igc_i225.c
@@ -593,20 +593,11 @@ s32 igc_set_ltr_i225(struct igc_hw *hw, bool link)
 			size = rd32(IGC_RXPBS) & IGC_RXPBS_SIZE_I225_MASK;
 			/* Calculations vary based on DMAC settings.
*/ - if (rd32(IGC_DMACR) & IGC_DMACR_DMAC_EN) { - size -= (rd32(IGC_DMACR) & - IGC_DMACR_DMACTHR_MASK) >> - IGC_DMACR_DMACTHR_SHIFT; - /* Convert size to bits. */ - size *= 1024 * 8; - } else { - /* Convert size to bytes, subtract the MTU, and then - * convert the size to bits. - */ - size *= 1024; - size *= 8; - } + /* Convert size to bytes, subtract the MTU, and then + * convert the size to bits. + */ + size *= 1024; + size *= 8; if (size < 0) { hw_dbg("Invalid effective Rx buffer size %d\n", diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 25fc6c65209b..ba49728be919 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -4,7 +4,6 @@ #include <linux/module.h> #include <linux/types.h> #include <linux/if_vlan.h> -#include <linux/aer.h> #include <linux/tcp.h> #include <linux/udp.h> #include <linux/ip.h> @@ -1501,6 +1500,7 @@ static int igc_tso(struct igc_ring *tx_ring, static netdev_tx_t igc_xmit_frame_ring(struct sk_buff *skb, struct igc_ring *tx_ring) { + struct igc_adapter *adapter = netdev_priv(tx_ring->netdev); bool first_flag = false, insert_empty = false; u16 count = TXD_USE_COUNT(skb_headlen(skb)); __be16 protocol = vlan_get_protocol(skb); @@ -1563,9 +1563,19 @@ done: first->bytecount = skb->len; first->gso_segs = 1; - if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)) { - struct igc_adapter *adapter = netdev_priv(tx_ring->netdev); + if (tx_ring->max_sdu > 0) { + u32 max_sdu = 0; + + max_sdu = tx_ring->max_sdu + + (skb_vlan_tagged(first->skb) ? VLAN_HLEN : 0); + + if (first->bytecount > max_sdu) { + adapter->stats.txdrop++; + goto out_drop; + } + } + if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)) { /* FIXME: add support for retrieving timestamps from * the other timer registers before skipping the * timestamping request. 
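The per-queue max SDU handling that this diff adds to igc_main.c (the limit stored in `ring->max_sdu` and the check in `igc_xmit_frame_ring()`) can be sketched in plain userspace C. This is only an illustrative model, not driver code: `struct txq_model`, `txq_set_max_sdu()` and `txq_frame_allowed()` are hypothetical names, and `VLAN_HLEN` and `ETH_HLEN` are redefined locally with their usual values of 4 and 14 bytes.

```c
#include <stdbool.h>
#include <stdint.h>

#define VLAN_HLEN 4	/* one 802.1Q tag, in bytes (local copy for the sketch) */
#define ETH_HLEN 14	/* plain Ethernet header, in bytes (local copy) */

/* Hypothetical stand-in for the per-queue state touched by the diff. */
struct txq_model {
	uint32_t max_sdu;	/* 0 = no limit, else SDU limit plus L2 header */
	uint64_t txdrop;	/* frames dropped for exceeding the limit */
};

/* Mirrors the loop added to igc_save_qbv_schedule(): a non-zero
 * configured limit is stored with the link-layer header length added.
 */
static void txq_set_max_sdu(struct txq_model *q, uint32_t qopt_max_sdu,
			    uint32_t hard_header_len)
{
	q->max_sdu = qopt_max_sdu ? qopt_max_sdu + hard_header_len : 0;
}

/* Mirrors the check added to igc_xmit_frame_ring(): a VLAN-tagged frame
 * is allowed VLAN_HLEN extra bytes; anything longer is counted and dropped.
 */
static bool txq_frame_allowed(struct txq_model *q, uint32_t frame_len,
			      bool vlan_tagged)
{
	uint32_t limit;

	if (q->max_sdu == 0)
		return true;

	limit = q->max_sdu + (vlan_tagged ? VLAN_HLEN : 0);
	if (frame_len > limit) {
		q->txdrop++;
		return false;
	}
	return true;
}
```

In this model, a configured limit of 1500 with a 14-byte header gives a stored limit of 1514 bytes, so an untagged 1515-byte frame is dropped while a tagged 1518-byte frame still passes.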
@@ -4920,7 +4930,8 @@ void igc_update_stats(struct igc_adapter *adapter)
 	net_stats->tx_window_errors = adapter->stats.latecol;
 	net_stats->tx_carrier_errors = adapter->stats.tncrs;
 
-	/* Tx Dropped needs to be maintained elsewhere */
+	/* Tx Dropped */
+	net_stats->tx_dropped = adapter->stats.txdrop;
 
 	/* Management Stats */
 	adapter->stats.mgptc += rd32(IGC_MGTPTC);
@@ -5566,25 +5577,8 @@ no_wait:
 				mod_timer(&adapter->phy_info_timer,
 					  round_jiffies(jiffies + 2 * HZ));
 
-			/* link is down, time to check for alternate media */
-			if (adapter->flags & IGC_FLAG_MAS_ENABLE) {
-				if (adapter->flags & IGC_FLAG_MEDIA_RESET) {
-					schedule_work(&adapter->reset_task);
-					/* return immediately */
-					return;
-				}
-			}
 			pm_schedule_suspend(netdev->dev.parent,
 					    MSEC_PER_SEC * 5);
-
-		/* also check for alternate media here */
-		} else if (!netif_carrier_ok(netdev) &&
-			   (adapter->flags & IGC_FLAG_MAS_ENABLE)) {
-			if (adapter->flags & IGC_FLAG_MEDIA_RESET) {
-				schedule_work(&adapter->reset_task);
-				/* return immediately */
-				return;
-			}
 		}
 	}
 
@@ -6049,12 +6043,14 @@ static int igc_tsn_clear_schedule(struct igc_adapter *adapter)
 
 	adapter->base_time = 0;
 	adapter->cycle_time = NSEC_PER_SEC;
+	adapter->qbv_config_change_errors = 0;
 
 	for (i = 0; i < adapter->num_tx_queues; i++) {
 		struct igc_ring *ring = adapter->tx_ring[i];
 
 		ring->start_time = 0;
 		ring->end_time = NSEC_PER_SEC;
+		ring->max_sdu = 0;
 	}
 
 	return 0;
@@ -6138,6 +6134,16 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter,
 		}
 	}
 
+	for (i = 0; i < adapter->num_tx_queues; i++) {
+		struct igc_ring *ring = adapter->tx_ring[i];
+		struct net_device *dev = adapter->netdev;
+
+		if (qopt->max_sdu[i])
+			ring->max_sdu = qopt->max_sdu[i] + dev->hard_header_len;
+		else
+			ring->max_sdu = 0;
+	}
+
 	return 0;
 }
 
@@ -6236,8 +6242,10 @@ static int igc_tc_query_caps(struct igc_adapter *adapter,
 
 		caps->broken_mqprio = true;
 
-		if (hw->mac.type == igc_i225)
+		if (hw->mac.type == igc_i225) {
+			caps->supports_queue_max_sdu = true;
 			caps->gate_mask_per_txq = true;
+		}
 
 		return 0;
 	}
diff --git a/drivers/net/ethernet/intel/igc/igc_regs.h b/drivers/net/ethernet/intel/igc/igc_regs.h
index 01c86d36856d..dba5a5759b1c 100644
--- a/drivers/net/ethernet/intel/igc/igc_regs.h
+++ b/drivers/net/ethernet/intel/igc/igc_regs.h
@@ -292,7 +292,6 @@
 
 /* LTR registers */
 #define IGC_LTRC	0x01A0 /* Latency Tolerance Reporting Control */
-#define IGC_DMACR	0x02508 /* DMA Coalescing Control Register */
 #define IGC_LTRMINV	0x5BB0 /* LTR Minimum Value */
 #define IGC_LTRMAXV	0x5BB4 /* LTR Maximum Value */
 
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
index a386c8d61dbf..94a2b0dfb54d 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.c
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
@@ -114,6 +114,7 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter)
 static int igc_tsn_enable_offload(struct igc_adapter *adapter)
 {
 	struct igc_hw *hw = &adapter->hw;
+	bool tsn_mode_reconfig = false;
 	u32 tqavctrl, baset_l, baset_h;
 	u32 sec, nsec, cycle;
 	ktime_t base_time, systim;
@@ -226,6 +227,10 @@ skip_cbs:
 	}
 
 	tqavctrl = rd32(IGC_TQAVCTRL) & ~IGC_TQAVCTRL_FUTSCDDIS;
+
+	if (tqavctrl & IGC_TQAVCTRL_TRANSMIT_MODE_TSN)
+		tsn_mode_reconfig = true;
+
 	tqavctrl |= IGC_TQAVCTRL_TRANSMIT_MODE_TSN | IGC_TQAVCTRL_ENHANCED_QAV;
 
 	cycle = adapter->cycle_time;
@@ -239,6 +244,13 @@ skip_cbs:
 		s64 n = div64_s64(ktime_sub_ns(systim, base_time), cycle);
 
 		base_time = ktime_add_ns(base_time, (n + 1) * cycle);
+
+		/* Increase the counter if scheduling into the past while
+		 * Gate Control List (GCL) is running.
+		 */
+		if ((rd32(IGC_BASET_H) || rd32(IGC_BASET_L)) &&
+		    tsn_mode_reconfig)
+			adapter->qbv_config_change_errors++;
 	} else {
 		/* According to datasheet section 7.5.2.9.3.3, FutScdDis bit
 		 * has to be configured before the cycle time and base time.
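The igc_tsn.c hunks above move a base time that already lies in the past forward to the next cycle boundary, and count an error when that happens while a schedule is already running. The userspace sketch below models that arithmetic under stated assumptions: `struct tsn_model` and `tsn_program_base_time()` are hypothetical names, times are plain signed nanosecond counts instead of `ktime_t`, a single `baset` field stands in for the `IGC_BASET_H`/`IGC_BASET_L` registers, and the enclosing condition (current time later than the requested base time) is assumed because the diff context does not show it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the state consulted by the added check. */
struct tsn_model {
	bool tsn_mode_enabled;	/* models the TSN transmit mode bit in TQAVCTRL */
	uint64_t baset;		/* models BASET_H/BASET_L; non-zero = programmed */
	uint32_t qbv_config_change_errors;
};

/* Returns the base time that would be programmed. cycle must be > 0. */
static int64_t tsn_program_base_time(struct tsn_model *m, int64_t base_time,
				     int64_t now, int64_t cycle)
{
	/* Remember whether TSN mode was already on before this call. */
	bool reconfig = m->tsn_mode_enabled;

	m->tsn_mode_enabled = true;

	if (now > base_time) {
		/* Whole cycles elapsed since the requested base time. */
		int64_t n = (now - base_time) / cycle;

		/* First cycle boundary strictly after 'now'. */
		base_time += (n + 1) * cycle;

		/* Scheduling into the past while a schedule is running. */
		if (m->baset != 0 && reconfig)
			m->qbv_config_change_errors++;
	}

	m->baset = (uint64_t)base_time;
	return base_time;
}
```

With a base time of 100, a current time of 1050 and a cycle of 300, three whole cycles have elapsed, so the model programs 100 + 4 * 300 = 1300. The first call leaves the counter at zero because no schedule was running yet. A later call with a stale base time increments it.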
diff --git a/drivers/net/ethernet/intel/ixgb/Makefile b/drivers/net/ethernet/intel/ixgb/Makefile deleted file mode 100644 index 2433e9300a33..000000000000 --- a/drivers/net/ethernet/intel/ixgb/Makefile +++ /dev/null @@ -1,9 +0,0 @@ -# SPDX-License-Identifier: GPL-2.0 -# Copyright(c) 1999 - 2008 Intel Corporation. -# -# Makefile for the Intel(R) PRO/10GbE ethernet driver -# - -obj-$(CONFIG_IXGB) += ixgb.o - -ixgb-objs := ixgb_main.o ixgb_hw.o ixgb_ee.o ixgb_ethtool.o ixgb_param.o diff --git a/drivers/net/ethernet/intel/ixgb/ixgb.h b/drivers/net/ethernet/intel/ixgb/ixgb.h deleted file mode 100644 index 81ac39576803..000000000000 --- a/drivers/net/ethernet/intel/ixgb/ixgb.h +++ /dev/null @@ -1,179 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -/* Copyright(c) 1999 - 2008 Intel Corporation. */ - -#ifndef _IXGB_H_ -#define _IXGB_H_ - -#include <linux/stddef.h> -#include <linux/module.h> -#include <linux/types.h> -#include <asm/byteorder.h> -#include <linux/mm.h> -#include <linux/errno.h> -#include <linux/ioport.h> -#include <linux/pci.h> -#include <linux/kernel.h> -#include <linux/netdevice.h> -#include <linux/etherdevice.h> -#include <linux/skbuff.h> -#include <linux/delay.h> -#include <linux/timer.h> -#include <linux/slab.h> -#include <linux/vmalloc.h> -#include <linux/interrupt.h> -#include <linux/string.h> -#include <linux/pagemap.h> -#include <linux/dma-mapping.h> -#include <linux/bitops.h> -#include <asm/io.h> -#include <asm/irq.h> -#include <linux/capability.h> -#include <linux/in.h> -#include <linux/ip.h> -#include <linux/tcp.h> -#include <linux/udp.h> -#include <net/pkt_sched.h> -#include <linux/list.h> -#include <linux/reboot.h> -#include <net/checksum.h> - -#include <linux/ethtool.h> -#include <linux/if_vlan.h> - -#define BAR_0 0 -#define BAR_1 1 - -struct ixgb_adapter; -#include "ixgb_hw.h" -#include "ixgb_ee.h" -#include "ixgb_ids.h" - -/* TX/RX descriptor defines */ -#define DEFAULT_TXD 256 -#define MAX_TXD 4096 -#define MIN_TXD 64 - -/* hardware cannot 
reliably support more than 512 descriptors owned by - * hardware descriptor cache otherwise an unreliable ring under heavy - * receive load may result */ -#define DEFAULT_RXD 512 -#define MAX_RXD 512 -#define MIN_RXD 64 - -/* Supported Rx Buffer Sizes */ -#define IXGB_RXBUFFER_2048 2048 -#define IXGB_RXBUFFER_4096 4096 -#define IXGB_RXBUFFER_8192 8192 -#define IXGB_RXBUFFER_16384 16384 - -/* How many Rx Buffers do we bundle into one write to the hardware ? */ -#define IXGB_RX_BUFFER_WRITE 8 /* Must be power of 2 */ - -/* wrapper around a pointer to a socket buffer, - * so a DMA handle can be stored along with the buffer */ -struct ixgb_buffer { - struct sk_buff *skb; - dma_addr_t dma; - unsigned long time_stamp; - u16 length; - u16 next_to_watch; - u16 mapped_as_page; -}; - -struct ixgb_desc_ring { - /* pointer to the descriptor ring memory */ - void *desc; - /* physical address of the descriptor ring */ - dma_addr_t dma; - /* length of descriptor ring in bytes */ - unsigned int size; - /* number of descriptors in the ring */ - unsigned int count; - /* next descriptor to associate a buffer with */ - unsigned int next_to_use; - /* next descriptor to check for DD status bit */ - unsigned int next_to_clean; - /* array of buffer information structs */ - struct ixgb_buffer *buffer_info; -}; - -#define IXGB_DESC_UNUSED(R) \ - ((((R)->next_to_clean > (R)->next_to_use) ? 
0 : (R)->count) + \ - (R)->next_to_clean - (R)->next_to_use - 1) - -#define IXGB_GET_DESC(R, i, type) (&(((struct type *)((R).desc))[i])) -#define IXGB_RX_DESC(R, i) IXGB_GET_DESC(R, i, ixgb_rx_desc) -#define IXGB_TX_DESC(R, i) IXGB_GET_DESC(R, i, ixgb_tx_desc) -#define IXGB_CONTEXT_DESC(R, i) IXGB_GET_DESC(R, i, ixgb_context_desc) - -/* board specific private data structure */ - -struct ixgb_adapter { - struct timer_list watchdog_timer; - unsigned long active_vlans[BITS_TO_LONGS(VLAN_N_VID)]; - u32 bd_number; - u32 rx_buffer_len; - u32 part_num; - u16 link_speed; - u16 link_duplex; - struct work_struct tx_timeout_task; - - /* TX */ - struct ixgb_desc_ring tx_ring ____cacheline_aligned_in_smp; - unsigned int restart_queue; - unsigned long timeo_start; - u32 tx_cmd_type; - u64 hw_csum_tx_good; - u64 hw_csum_tx_error; - u32 tx_int_delay; - u32 tx_timeout_count; - bool tx_int_delay_enable; - bool detect_tx_hung; - - /* RX */ - struct ixgb_desc_ring rx_ring; - u64 hw_csum_rx_error; - u64 hw_csum_rx_good; - u32 rx_int_delay; - bool rx_csum; - - /* OS defined structs */ - struct napi_struct napi; - struct net_device *netdev; - struct pci_dev *pdev; - - /* structs defined in ixgb_hw.h */ - struct ixgb_hw hw; - u16 msg_enable; - struct ixgb_hw_stats stats; - u32 alloc_rx_buff_failed; - bool have_msi; - unsigned long flags; -}; - -enum ixgb_state_t { - /* TBD - __IXGB_TESTING, - __IXGB_RESETTING, - */ - __IXGB_DOWN -}; - -/* Exported from other modules */ -void ixgb_check_options(struct ixgb_adapter *adapter); -void ixgb_set_ethtool_ops(struct net_device *netdev); -extern char ixgb_driver_name[]; - -void ixgb_set_speed_duplex(struct net_device *netdev); - -int ixgb_up(struct ixgb_adapter *adapter); -void ixgb_down(struct ixgb_adapter *adapter, bool kill_watchdog); -void ixgb_reset(struct ixgb_adapter *adapter); -int ixgb_setup_rx_resources(struct ixgb_adapter *adapter); -int ixgb_setup_tx_resources(struct ixgb_adapter *adapter); -void ixgb_free_rx_resources(struct 
ixgb_adapter *adapter); -void ixgb_free_tx_resources(struct ixgb_adapter *adapter); -void ixgb_update_stats(struct ixgb_adapter *adapter); - - -#endif /* _IXGB_H_ */ diff --git a/drivers/net/ethernet/intel/ixgb/ixgb_ee.c b/drivers/net/ethernet/intel/ixgb/ixgb_ee.c deleted file mode 100644 index 129286fc1634..000000000000 --- a/drivers/net/ethernet/intel/ixgb/ixgb_ee.c +++ /dev/null @@ -1,580 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/* Copyright(c) 1999 - 2008 Intel Corporation. */ - -#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt - -#include "ixgb_hw.h" -#include "ixgb_ee.h" -/* Local prototypes */ -static u16 ixgb_shift_in_bits(struct ixgb_hw *hw); - -static void ixgb_shift_out_bits(struct ixgb_hw *hw, - u16 data, - u16 count); -static void ixgb_standby_eeprom(struct ixgb_hw *hw); - -static bool ixgb_wait_eeprom_command(struct ixgb_hw *hw); - -static void ixgb_cleanup_eeprom(struct ixgb_hw *hw); - -/****************************************************************************** - * Raises the EEPROM's clock input. - * - * hw - Struct containing variables accessed by shared code - * eecd_reg - EECD's current value - *****************************************************************************/ -static void -ixgb_raise_clock(struct ixgb_hw *hw, - u32 *eecd_reg) -{ - /* Raise the clock input to the EEPROM (by setting the SK bit), and then - * wait 50 microseconds. - */ - *eecd_reg = *eecd_reg | IXGB_EECD_SK; - IXGB_WRITE_REG(hw, EECD, *eecd_reg); - IXGB_WRITE_FLUSH(hw); - udelay(50); -} - -/****************************************************************************** - * Lowers the EEPROM's clock input. - * - * hw - Struct containing variables accessed by shared code - * eecd_reg - EECD's current value - *****************************************************************************/ -static void -ixgb_lower_clock(struct ixgb_hw *hw, - u32 *eecd_reg) -{ - /* Lower the clock input to the EEPROM (by clearing the SK bit), and then - * wait 50 microseconds. 
- */ - *eecd_reg = *eecd_reg & ~IXGB_EECD_SK; - IXGB_WRITE_REG(hw, EECD, *eecd_reg); - IXGB_WRITE_FLUSH(hw); - udelay(50); -} - -/****************************************************************************** - * Shift data bits out to the EEPROM. - * - * hw - Struct containing variables accessed by shared code - * data - data to send to the EEPROM - * count - number of bits to shift out - *****************************************************************************/ -static void -ixgb_shift_out_bits(struct ixgb_hw *hw, - u16 data, - u16 count) -{ - u32 eecd_reg; - u32 mask; - - /* We need to shift "count" bits out to the EEPROM. So, value in the - * "data" parameter will be shifted out to the EEPROM one bit at a time. - * In order to do this, "data" must be broken down into bits. - */ - mask = 0x01 << (count - 1); - eecd_reg = IXGB_READ_REG(hw, EECD); - eecd_reg &= ~(IXGB_EECD_DO | IXGB_EECD_DI); - do { - /* A "1" is shifted out to the EEPROM by setting bit "DI" to a "1", - * and then raising and then lowering the clock (the SK bit controls - * the clock input to the EEPROM). A "0" is shifted out to the EEPROM - * by setting "DI" to "0" and then raising and then lowering the clock. - */ - eecd_reg &= ~IXGB_EECD_DI; - - if (data & mask) - eecd_reg |= IXGB_EECD_DI; - - IXGB_WRITE_REG(hw, EECD, eecd_reg); - IXGB_WRITE_FLUSH(hw); - - udelay(50); - - ixgb_raise_clock(hw, &eecd_reg); - ixgb_lower_clock(hw, &eecd_reg); - - mask = mask >> 1; - - } while (mask); - - /* We leave the "DI" bit set to "0" when we leave this routine. 
*/ - eecd_reg &= ~IXGB_EECD_DI; - IXGB_WRITE_REG(hw, EECD, eecd_reg); -} - -/****************************************************************************** - * Shift data bits in from the EEPROM - * - * hw - Struct containing variables accessed by shared code - *****************************************************************************/ -static u16 -ixgb_shift_in_bits(struct ixgb_hw *hw) -{ - u32 eecd_reg; - u32 i; - u16 data; - - /* In order to read a register from the EEPROM, we need to shift 16 bits - * in from the EEPROM. Bits are "shifted in" by raising the clock input to - * the EEPROM (setting the SK bit), and then reading the value of the "DO" - * bit. During this "shifting in" process the "DI" bit should always be - * clear.. - */ - - eecd_reg = IXGB_READ_REG(hw, EECD); - - eecd_reg &= ~(IXGB_EECD_DO | IXGB_EECD_DI); - data = 0; - - for (i = 0; i < 16; i++) { - data = data << 1; - ixgb_raise_clock(hw, &eecd_reg); - - eecd_reg = IXGB_READ_REG(hw, EECD); - - eecd_reg &= ~(IXGB_EECD_DI); - if (eecd_reg & IXGB_EECD_DO) - data |= 1; - - ixgb_lower_clock(hw, &eecd_reg); - } - - return data; -} - -/****************************************************************************** - * Prepares EEPROM for access - * - * hw - Struct containing variables accessed by shared code - * - * Lowers EEPROM clock. Clears input pin. Sets the chip select pin. This - * function should be called before issuing a command to the EEPROM. 
- *****************************************************************************/ -static void -ixgb_setup_eeprom(struct ixgb_hw *hw) -{ - u32 eecd_reg; - - eecd_reg = IXGB_READ_REG(hw, EECD); - - /* Clear SK and DI */ - eecd_reg &= ~(IXGB_EECD_SK | IXGB_EECD_DI); - IXGB_WRITE_REG(hw, EECD, eecd_reg); - - /* Set CS */ - eecd_reg |= IXGB_EECD_CS; - IXGB_WRITE_REG(hw, EECD, eecd_reg); -} - -/****************************************************************************** - * Returns EEPROM to a "standby" state - * - * hw - Struct containing variables accessed by shared code - *****************************************************************************/ -static void -ixgb_standby_eeprom(struct ixgb_hw *hw) -{ - u32 eecd_reg; - - eecd_reg = IXGB_READ_REG(hw, EECD); - - /* Deselect EEPROM */ - eecd_reg &= ~(IXGB_EECD_CS | IXGB_EECD_SK); - IXGB_WRITE_REG(hw, EECD, eecd_reg); - IXGB_WRITE_FLUSH(hw); - udelay(50); - - /* Clock high */ - eecd_reg |= IXGB_EECD_SK; - IXGB_WRITE_REG(hw, EECD, eecd_reg); - IXGB_WRITE_FLUSH(hw); - udelay(50); - - /* Select EEPROM */ - eecd_reg |= IXGB_EECD_CS; - IXGB_WRITE_REG(hw, EECD, eecd_reg); - IXGB_WRITE_FLUSH(hw); - udelay(50); - - /* Clock low */ - eecd_reg &= ~IXGB_EECD_SK; - IXGB_WRITE_REG(hw, EECD, eecd_reg); - IXGB_WRITE_FLUSH(hw); - udelay(50); -} - -/****************************************************************************** - * Raises then lowers the EEPROM's clock pin - * - * hw - Struct containing variables accessed by shared code - *****************************************************************************/ -static void -ixgb_clock_eeprom(struct ixgb_hw *hw) -{ - u32 eecd_reg; - - eecd_reg = IXGB_READ_REG(hw, EECD); - - /* Rising edge of clock */ - eecd_reg |= IXGB_EECD_SK; - IXGB_WRITE_REG(hw, EECD, eecd_reg); - IXGB_WRITE_FLUSH(hw); - udelay(50); - - /* Falling edge of clock */ - eecd_reg &= ~IXGB_EECD_SK; - IXGB_WRITE_REG(hw, EECD, eecd_reg); - IXGB_WRITE_FLUSH(hw); - udelay(50); -} - 
-/****************************************************************************** - * Terminates a command by lowering the EEPROM's chip select pin - * - * hw - Struct containing variables accessed by shared code - *****************************************************************************/ -static void -ixgb_cleanup_eeprom(struct ixgb_hw *hw) -{ - u32 eecd_reg; - - eecd_reg = IXGB_READ_REG(hw, EECD); - - eecd_reg &= ~(IXGB_EECD_CS | IXGB_EECD_DI); - - IXGB_WRITE_REG(hw, EECD, eecd_reg); - - ixgb_clock_eeprom(hw); -} - -/****************************************************************************** - * Waits for the EEPROM to finish the current command. - * - * hw - Struct containing variables accessed by shared code - * - * The command is done when the EEPROM's data out pin goes high. - * - * Returns: - * true: EEPROM data pin is high before timeout. - * false: Time expired. - *****************************************************************************/ -static bool -ixgb_wait_eeprom_command(struct ixgb_hw *hw) -{ - u32 eecd_reg; - u32 i; - - /* Toggle the CS line. This in effect tells to EEPROM to actually execute - * the command in question. - */ - ixgb_standby_eeprom(hw); - - /* Now read DO repeatedly until is high (equal to '1'). The EEPROM will - * signal that the command has been completed by raising the DO signal. - * If DO does not go high in 10 milliseconds, then error out. - */ - for (i = 0; i < 200; i++) { - eecd_reg = IXGB_READ_REG(hw, EECD); - - if (eecd_reg & IXGB_EECD_DO) - return true; - - udelay(50); - } - ASSERT(0); - return false; -} - -/****************************************************************************** - * Verifies that the EEPROM has a valid checksum - * - * hw - Struct containing variables accessed by shared code - * - * Reads the first 64 16 bit words of the EEPROM and sums the values read. - * If the sum of the 64 16 bit words is 0xBABA, the EEPROM's checksum is - * valid. 
- * - * Returns: - * true: Checksum is valid - * false: Checksum is not valid. - *****************************************************************************/ -bool -ixgb_validate_eeprom_checksum(struct ixgb_hw *hw) -{ - u16 checksum = 0; - u16 i; - - for (i = 0; i < (EEPROM_CHECKSUM_REG + 1); i++) - checksum += ixgb_read_eeprom(hw, i); - - if (checksum == (u16) EEPROM_SUM) - return true; - else - return false; -} - -/****************************************************************************** - * Calculates the EEPROM checksum and writes it to the EEPROM - * - * hw - Struct containing variables accessed by shared code - * - * Sums the first 63 16 bit words of the EEPROM. Subtracts the sum from 0xBABA. - * Writes the difference to word offset 63 of the EEPROM. - *****************************************************************************/ -void -ixgb_update_eeprom_checksum(struct ixgb_hw *hw) -{ - u16 checksum = 0; - u16 i; - - for (i = 0; i < EEPROM_CHECKSUM_REG; i++) - checksum += ixgb_read_eeprom(hw, i); - - checksum = (u16) EEPROM_SUM - checksum; - - ixgb_write_eeprom(hw, EEPROM_CHECKSUM_REG, checksum); -} - -/****************************************************************************** - * Writes a 16 bit word to a given offset in the EEPROM. - * - * hw - Struct containing variables accessed by shared code - * reg - offset within the EEPROM to be written to - * data - 16 bit word to be written to the EEPROM - * - * If ixgb_update_eeprom_checksum is not called after this function, the - * EEPROM will most likely contain an invalid checksum. - * - *****************************************************************************/ -void -ixgb_write_eeprom(struct ixgb_hw *hw, u16 offset, u16 data) -{ - struct ixgb_ee_map_type *ee_map = (struct ixgb_ee_map_type *)hw->eeprom; - - /* Prepare the EEPROM for writing */ - ixgb_setup_eeprom(hw); - - /* Send the 9-bit EWEN (write enable) command to the EEPROM (5-bit opcode - * plus 4-bit dummy). 
This puts the EEPROM into write/erase mode. - */ - ixgb_shift_out_bits(hw, EEPROM_EWEN_OPCODE, 5); - ixgb_shift_out_bits(hw, 0, 4); - - /* Prepare the EEPROM */ - ixgb_standby_eeprom(hw); - - /* Send the Write command (3-bit opcode + 6-bit addr) */ - ixgb_shift_out_bits(hw, EEPROM_WRITE_OPCODE, 3); - ixgb_shift_out_bits(hw, offset, 6); - - /* Send the data */ - ixgb_shift_out_bits(hw, data, 16); - - ixgb_wait_eeprom_command(hw); - - /* Recover from write */ - ixgb_standby_eeprom(hw); - - /* Send the 9-bit EWDS (write disable) command to the EEPROM (5-bit - * opcode plus 4-bit dummy). This takes the EEPROM out of write/erase - * mode. - */ - ixgb_shift_out_bits(hw, EEPROM_EWDS_OPCODE, 5); - ixgb_shift_out_bits(hw, 0, 4); - - /* Done with writing */ - ixgb_cleanup_eeprom(hw); - - /* clear the init_ctrl_reg_1 to signify that the cache is invalidated */ - ee_map->init_ctrl_reg_1 = cpu_to_le16(EEPROM_ICW1_SIGNATURE_CLEAR); -} - -/****************************************************************************** - * Reads a 16 bit word from the EEPROM. - * - * hw - Struct containing variables accessed by shared code - * offset - offset of 16 bit word in the EEPROM to read - * - * Returns: - * The 16-bit value read from the eeprom - *****************************************************************************/ -u16 -ixgb_read_eeprom(struct ixgb_hw *hw, - u16 offset) -{ - u16 data; - - /* Prepare the EEPROM for reading */ - ixgb_setup_eeprom(hw); - - /* Send the READ command (opcode + addr) */ - ixgb_shift_out_bits(hw, EEPROM_READ_OPCODE, 3); - /* - * We have a 64 word EEPROM, there are 6 address bits - */ - ixgb_shift_out_bits(hw, offset, 6); - - /* Read the data */ - data = ixgb_shift_in_bits(hw); - - /* End this read operation */ - ixgb_standby_eeprom(hw); - - return data; -} - -/****************************************************************************** - * Reads eeprom and stores data in shared structure. - * Validates eeprom checksum and eeprom signature. 
- * - * hw - Struct containing variables accessed by shared code - * - * Returns: - * true: if eeprom read is successful - * false: otherwise. - *****************************************************************************/ -bool -ixgb_get_eeprom_data(struct ixgb_hw *hw) -{ - u16 i; - u16 checksum = 0; - struct ixgb_ee_map_type *ee_map; - - ENTER(); - - ee_map = (struct ixgb_ee_map_type *)hw->eeprom; - - pr_debug("Reading eeprom data\n"); - for (i = 0; i < IXGB_EEPROM_SIZE ; i++) { - u16 ee_data; - ee_data = ixgb_read_eeprom(hw, i); - checksum += ee_data; - hw->eeprom[i] = cpu_to_le16(ee_data); - } - - if (checksum != (u16) EEPROM_SUM) { - pr_debug("Checksum invalid\n"); - /* clear the init_ctrl_reg_1 to signify that the cache is - * invalidated */ - ee_map->init_ctrl_reg_1 = cpu_to_le16(EEPROM_ICW1_SIGNATURE_CLEAR); - return false; - } - - if ((ee_map->init_ctrl_reg_1 & cpu_to_le16(EEPROM_ICW1_SIGNATURE_MASK)) - != cpu_to_le16(EEPROM_ICW1_SIGNATURE_VALID)) { - pr_debug("Signature invalid\n"); - return false; - } - - return true; -} - -/****************************************************************************** - * Local function to check if the eeprom signature is good - * If the eeprom signature is good, calls ixgb)get_eeprom_data. - * - * hw - Struct containing variables accessed by shared code - * - * Returns: - * true: eeprom signature was good and the eeprom read was successful - * false: otherwise. 
- ******************************************************************************/ -static bool -ixgb_check_and_get_eeprom_data (struct ixgb_hw* hw) -{ - struct ixgb_ee_map_type *ee_map = (struct ixgb_ee_map_type *)hw->eeprom; - - if ((ee_map->init_ctrl_reg_1 & cpu_to_le16(EEPROM_ICW1_SIGNATURE_MASK)) - == cpu_to_le16(EEPROM_ICW1_SIGNATURE_VALID)) { - return true; - } else { - return ixgb_get_eeprom_data(hw); - } -} - -/****************************************************************************** - * return a word from the eeprom - * - * hw - Struct containing variables accessed by shared code - * index - Offset of eeprom word - * - * Returns: - * Word at indexed offset in eeprom, if valid, 0 otherwise. - ******************************************************************************/ -__le16 -ixgb_get_eeprom_word(struct ixgb_hw *hw, u16 index) -{ - - if (index < IXGB_EEPROM_SIZE && ixgb_check_and_get_eeprom_data(hw)) - return hw->eeprom[index]; - - return 0; -} - -/****************************************************************************** - * return the mac address from EEPROM - * - * hw - Struct containing variables accessed by shared code - * mac_addr - Ethernet Address if EEPROM contents are valid, 0 otherwise - * - * Returns: None. 
- ******************************************************************************/
-void
-ixgb_get_ee_mac_addr(struct ixgb_hw *hw,
-			u8 *mac_addr)
-{
-	int i;
-	struct ixgb_ee_map_type *ee_map = (struct ixgb_ee_map_type *)hw->eeprom;
-
-	ENTER();
-
-	if (ixgb_check_and_get_eeprom_data(hw)) {
-		for (i = 0; i < ETH_ALEN; i++) {
-			mac_addr[i] = ee_map->mac_addr[i];
-		}
-		pr_debug("eeprom mac address = %pM\n", mac_addr);
-	}
-}
-
-
-/******************************************************************************
- * return the Printed Board Assembly number from EEPROM
- *
- * hw - Struct containing variables accessed by shared code
- *
- * Returns:
- * PBA number if EEPROM contents are valid, 0 otherwise
- ******************************************************************************/
-u32
-ixgb_get_ee_pba_number(struct ixgb_hw *hw)
-{
-	if (ixgb_check_and_get_eeprom_data(hw))
-		return le16_to_cpu(hw->eeprom[EEPROM_PBA_1_2_REG])
-			| (le16_to_cpu(hw->eeprom[EEPROM_PBA_3_4_REG])<<16);
-
-	return 0;
-}
-
-
-/******************************************************************************
- * return the Device Id from EEPROM
- *
- * hw - Struct containing variables accessed by shared code
- *
- * Returns:
- * Device Id if EEPROM contents are valid, 0 otherwise
- ******************************************************************************/
-u16
-ixgb_get_ee_device_id(struct ixgb_hw *hw)
-{
-	struct ixgb_ee_map_type *ee_map = (struct ixgb_ee_map_type *)hw->eeprom;
-
-	if (ixgb_check_and_get_eeprom_data(hw))
-		return le16_to_cpu(ee_map->device_id);
-
-	return 0;
-}
-
diff --git a/drivers/net/ethernet/intel/ixgb/ixgb_ee.h b/drivers/net/ethernet/intel/ixgb/ixgb_ee.h
deleted file mode 100644
index 3ee0a09e5d0a..000000000000
--- a/drivers/net/ethernet/intel/ixgb/ixgb_ee.h
+++ /dev/null
@@ -1,79 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright(c) 1999 - 2008 Intel Corporation. */
-
-#ifndef _IXGB_EE_H_
-#define _IXGB_EE_H_
-
-#define IXGB_EEPROM_SIZE	64	/* Size in words */
-
-/* EEPROM Commands */
-#define EEPROM_READ_OPCODE	0x6	/* EEPROM read opcode */
-#define EEPROM_WRITE_OPCODE	0x5	/* EEPROM write opcode */
-#define EEPROM_ERASE_OPCODE	0x7	/* EEPROM erase opcode */
-#define EEPROM_EWEN_OPCODE	0x13	/* EEPROM erase/write enable */
-#define EEPROM_EWDS_OPCODE	0x10	/* EEPROM erase/write disable */
-
-/* EEPROM MAP (Word Offsets) */
-#define EEPROM_IA_1_2_REG	0x0000
-#define EEPROM_IA_3_4_REG	0x0001
-#define EEPROM_IA_5_6_REG	0x0002
-#define EEPROM_COMPATIBILITY_REG	0x0003
-#define EEPROM_PBA_1_2_REG	0x0008
-#define EEPROM_PBA_3_4_REG	0x0009
-#define EEPROM_INIT_CONTROL1_REG	0x000A
-#define EEPROM_SUBSYS_ID_REG	0x000B
-#define EEPROM_SUBVEND_ID_REG	0x000C
-#define EEPROM_DEVICE_ID_REG	0x000D
-#define EEPROM_VENDOR_ID_REG	0x000E
-#define EEPROM_INIT_CONTROL2_REG	0x000F
-#define EEPROM_SWDPINS_REG	0x0020
-#define EEPROM_CIRCUIT_CTRL_REG	0x0021
-#define EEPROM_D0_D3_POWER_REG	0x0022
-#define EEPROM_FLASH_VERSION	0x0032
-#define EEPROM_CHECKSUM_REG	0x003F
-
-/* Mask bits for fields in Word 0x0a of the EEPROM */
-
-#define EEPROM_ICW1_SIGNATURE_MASK	0xC000
-#define EEPROM_ICW1_SIGNATURE_VALID	0x4000
-#define EEPROM_ICW1_SIGNATURE_CLEAR	0x0000
-
-/* For checksumming, the sum of all words in the EEPROM should equal 0xBABA. */
-#define EEPROM_SUM	0xBABA
-
-/* EEPROM Map Sizes (Byte Counts) */
-#define PBA_SIZE	4
-
-/* EEPROM Map defines (WORD OFFSETS)*/
-
-/* EEPROM structure */
-struct ixgb_ee_map_type {
-	u8 mac_addr[ETH_ALEN];
-	__le16 compatibility;
-	__le16 reserved1[4];
-	__le32 pba_number;
-	__le16 init_ctrl_reg_1;
-	__le16 subsystem_id;
-	__le16 subvendor_id;
-	__le16 device_id;
-	__le16 vendor_id;
-	__le16 init_ctrl_reg_2;
-	__le16 oem_reserved[16];
-	__le16 swdpins_reg;
-	__le16 circuit_ctrl_reg;
-	u8 d3_power;
-	u8 d0_power;
-	__le16 reserved2[28];
-	__le16 checksum;
-};
-
-/* EEPROM Functions */
-u16 ixgb_read_eeprom(struct ixgb_hw *hw, u16 reg);
-
-bool ixgb_validate_eeprom_checksum(struct ixgb_hw *hw);
-
-void ixgb_update_eeprom_checksum(struct ixgb_hw *hw);
-
-void ixgb_write_eeprom(struct ixgb_hw *hw, u16 reg, u16 data);
-
-#endif /* IXGB_EE_H */
diff --git a/drivers/net/ethernet/intel/ixgb/ixgb_ethtool.c b/drivers/net/ethernet/intel/ixgb/ixgb_ethtool.c
deleted file mode 100644
index efa980514944..000000000000
--- a/drivers/net/ethernet/intel/ixgb/ixgb_ethtool.c
+++ /dev/null
@@ -1,642 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/* Copyright(c) 1999 - 2008 Intel Corporation.
*/ - -/* ethtool support for ixgb */ - -#include "ixgb.h" - -#include <linux/uaccess.h> - -#define IXGB_ALL_RAR_ENTRIES 16 - -enum {NETDEV_STATS, IXGB_STATS}; - -struct ixgb_stats { - char stat_string[ETH_GSTRING_LEN]; - int type; - int sizeof_stat; - int stat_offset; -}; - -#define IXGB_STAT(m) IXGB_STATS, \ - sizeof_field(struct ixgb_adapter, m), \ - offsetof(struct ixgb_adapter, m) -#define IXGB_NETDEV_STAT(m) NETDEV_STATS, \ - sizeof_field(struct net_device, m), \ - offsetof(struct net_device, m) - -static struct ixgb_stats ixgb_gstrings_stats[] = { - {"rx_packets", IXGB_NETDEV_STAT(stats.rx_packets)}, - {"tx_packets", IXGB_NETDEV_STAT(stats.tx_packets)}, - {"rx_bytes", IXGB_NETDEV_STAT(stats.rx_bytes)}, - {"tx_bytes", IXGB_NETDEV_STAT(stats.tx_bytes)}, - {"rx_errors", IXGB_NETDEV_STAT(stats.rx_errors)}, - {"tx_errors", IXGB_NETDEV_STAT(stats.tx_errors)}, - {"rx_dropped", IXGB_NETDEV_STAT(stats.rx_dropped)}, - {"tx_dropped", IXGB_NETDEV_STAT(stats.tx_dropped)}, - {"multicast", IXGB_NETDEV_STAT(stats.multicast)}, - {"collisions", IXGB_NETDEV_STAT(stats.collisions)}, - -/* { "rx_length_errors", IXGB_NETDEV_STAT(stats.rx_length_errors) }, */ - {"rx_over_errors", IXGB_NETDEV_STAT(stats.rx_over_errors)}, - {"rx_crc_errors", IXGB_NETDEV_STAT(stats.rx_crc_errors)}, - {"rx_frame_errors", IXGB_NETDEV_STAT(stats.rx_frame_errors)}, - {"rx_no_buffer_count", IXGB_STAT(stats.rnbc)}, - {"rx_fifo_errors", IXGB_NETDEV_STAT(stats.rx_fifo_errors)}, - {"rx_missed_errors", IXGB_NETDEV_STAT(stats.rx_missed_errors)}, - {"tx_aborted_errors", IXGB_NETDEV_STAT(stats.tx_aborted_errors)}, - {"tx_carrier_errors", IXGB_NETDEV_STAT(stats.tx_carrier_errors)}, - {"tx_fifo_errors", IXGB_NETDEV_STAT(stats.tx_fifo_errors)}, - {"tx_heartbeat_errors", IXGB_NETDEV_STAT(stats.tx_heartbeat_errors)}, - {"tx_window_errors", IXGB_NETDEV_STAT(stats.tx_window_errors)}, - {"tx_deferred_ok", IXGB_STAT(stats.dc)}, - {"tx_timeout_count", IXGB_STAT(tx_timeout_count) }, - {"tx_restart_queue", 
IXGB_STAT(restart_queue) }, - {"rx_long_length_errors", IXGB_STAT(stats.roc)}, - {"rx_short_length_errors", IXGB_STAT(stats.ruc)}, - {"tx_tcp_seg_good", IXGB_STAT(stats.tsctc)}, - {"tx_tcp_seg_failed", IXGB_STAT(stats.tsctfc)}, - {"rx_flow_control_xon", IXGB_STAT(stats.xonrxc)}, - {"rx_flow_control_xoff", IXGB_STAT(stats.xoffrxc)}, - {"tx_flow_control_xon", IXGB_STAT(stats.xontxc)}, - {"tx_flow_control_xoff", IXGB_STAT(stats.xofftxc)}, - {"rx_csum_offload_good", IXGB_STAT(hw_csum_rx_good)}, - {"rx_csum_offload_errors", IXGB_STAT(hw_csum_rx_error)}, - {"tx_csum_offload_good", IXGB_STAT(hw_csum_tx_good)}, - {"tx_csum_offload_errors", IXGB_STAT(hw_csum_tx_error)} -}; - -#define IXGB_STATS_LEN ARRAY_SIZE(ixgb_gstrings_stats) - -static int -ixgb_get_link_ksettings(struct net_device *netdev, - struct ethtool_link_ksettings *cmd) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - - ethtool_link_ksettings_zero_link_mode(cmd, supported); - ethtool_link_ksettings_add_link_mode(cmd, supported, 10000baseT_Full); - ethtool_link_ksettings_add_link_mode(cmd, supported, FIBRE); - - ethtool_link_ksettings_zero_link_mode(cmd, advertising); - ethtool_link_ksettings_add_link_mode(cmd, advertising, 10000baseT_Full); - ethtool_link_ksettings_add_link_mode(cmd, advertising, FIBRE); - - cmd->base.port = PORT_FIBRE; - - if (netif_carrier_ok(adapter->netdev)) { - cmd->base.speed = SPEED_10000; - cmd->base.duplex = DUPLEX_FULL; - } else { - cmd->base.speed = SPEED_UNKNOWN; - cmd->base.duplex = DUPLEX_UNKNOWN; - } - - cmd->base.autoneg = AUTONEG_DISABLE; - return 0; -} - -void ixgb_set_speed_duplex(struct net_device *netdev) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - /* be optimistic about our link, since we were up before */ - adapter->link_speed = 10000; - adapter->link_duplex = FULL_DUPLEX; - netif_carrier_on(netdev); - netif_wake_queue(netdev); -} - -static int -ixgb_set_link_ksettings(struct net_device *netdev, - const struct ethtool_link_ksettings *cmd) -{ - 
struct ixgb_adapter *adapter = netdev_priv(netdev); - u32 speed = cmd->base.speed; - - if (cmd->base.autoneg == AUTONEG_ENABLE || - (speed + cmd->base.duplex != SPEED_10000 + DUPLEX_FULL)) - return -EINVAL; - - if (netif_running(adapter->netdev)) { - ixgb_down(adapter, true); - ixgb_reset(adapter); - ixgb_up(adapter); - ixgb_set_speed_duplex(netdev); - } else - ixgb_reset(adapter); - - return 0; -} - -static void -ixgb_get_pauseparam(struct net_device *netdev, - struct ethtool_pauseparam *pause) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - struct ixgb_hw *hw = &adapter->hw; - - pause->autoneg = AUTONEG_DISABLE; - - if (hw->fc.type == ixgb_fc_rx_pause) - pause->rx_pause = 1; - else if (hw->fc.type == ixgb_fc_tx_pause) - pause->tx_pause = 1; - else if (hw->fc.type == ixgb_fc_full) { - pause->rx_pause = 1; - pause->tx_pause = 1; - } -} - -static int -ixgb_set_pauseparam(struct net_device *netdev, - struct ethtool_pauseparam *pause) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - struct ixgb_hw *hw = &adapter->hw; - - if (pause->autoneg == AUTONEG_ENABLE) - return -EINVAL; - - if (pause->rx_pause && pause->tx_pause) - hw->fc.type = ixgb_fc_full; - else if (pause->rx_pause && !pause->tx_pause) - hw->fc.type = ixgb_fc_rx_pause; - else if (!pause->rx_pause && pause->tx_pause) - hw->fc.type = ixgb_fc_tx_pause; - else if (!pause->rx_pause && !pause->tx_pause) - hw->fc.type = ixgb_fc_none; - - if (netif_running(adapter->netdev)) { - ixgb_down(adapter, true); - ixgb_up(adapter); - ixgb_set_speed_duplex(netdev); - } else - ixgb_reset(adapter); - - return 0; -} - -static u32 -ixgb_get_msglevel(struct net_device *netdev) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - return adapter->msg_enable; -} - -static void -ixgb_set_msglevel(struct net_device *netdev, u32 data) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - adapter->msg_enable = data; -} -#define IXGB_GET_STAT(_A_, _R_) _A_->stats._R_ - -static int -ixgb_get_regs_len(struct 
net_device *netdev) -{ -#define IXGB_REG_DUMP_LEN 136*sizeof(u32) - return IXGB_REG_DUMP_LEN; -} - -static void -ixgb_get_regs(struct net_device *netdev, - struct ethtool_regs *regs, void *p) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - struct ixgb_hw *hw = &adapter->hw; - u32 *reg = p; - u32 *reg_start = reg; - u8 i; - - /* the 1 (one) below indicates an attempt at versioning, if the - * interface in ethtool or the driver changes, this 1 should be - * incremented */ - regs->version = (1<<24) | hw->revision_id << 16 | hw->device_id; - - /* General Registers */ - *reg++ = IXGB_READ_REG(hw, CTRL0); /* 0 */ - *reg++ = IXGB_READ_REG(hw, CTRL1); /* 1 */ - *reg++ = IXGB_READ_REG(hw, STATUS); /* 2 */ - *reg++ = IXGB_READ_REG(hw, EECD); /* 3 */ - *reg++ = IXGB_READ_REG(hw, MFS); /* 4 */ - - /* Interrupt */ - *reg++ = IXGB_READ_REG(hw, ICR); /* 5 */ - *reg++ = IXGB_READ_REG(hw, ICS); /* 6 */ - *reg++ = IXGB_READ_REG(hw, IMS); /* 7 */ - *reg++ = IXGB_READ_REG(hw, IMC); /* 8 */ - - /* Receive */ - *reg++ = IXGB_READ_REG(hw, RCTL); /* 9 */ - *reg++ = IXGB_READ_REG(hw, FCRTL); /* 10 */ - *reg++ = IXGB_READ_REG(hw, FCRTH); /* 11 */ - *reg++ = IXGB_READ_REG(hw, RDBAL); /* 12 */ - *reg++ = IXGB_READ_REG(hw, RDBAH); /* 13 */ - *reg++ = IXGB_READ_REG(hw, RDLEN); /* 14 */ - *reg++ = IXGB_READ_REG(hw, RDH); /* 15 */ - *reg++ = IXGB_READ_REG(hw, RDT); /* 16 */ - *reg++ = IXGB_READ_REG(hw, RDTR); /* 17 */ - *reg++ = IXGB_READ_REG(hw, RXDCTL); /* 18 */ - *reg++ = IXGB_READ_REG(hw, RAIDC); /* 19 */ - *reg++ = IXGB_READ_REG(hw, RXCSUM); /* 20 */ - - /* there are 16 RAR entries in hardware, we only use 3 */ - for (i = 0; i < IXGB_ALL_RAR_ENTRIES; i++) { - *reg++ = IXGB_READ_REG_ARRAY(hw, RAL, (i << 1)); /*21,...,51 */ - *reg++ = IXGB_READ_REG_ARRAY(hw, RAH, (i << 1)); /*22,...,52 */ - } - - /* Transmit */ - *reg++ = IXGB_READ_REG(hw, TCTL); /* 53 */ - *reg++ = IXGB_READ_REG(hw, TDBAL); /* 54 */ - *reg++ = IXGB_READ_REG(hw, TDBAH); /* 55 */ - *reg++ = IXGB_READ_REG(hw, TDLEN); 
/* 56 */ - *reg++ = IXGB_READ_REG(hw, TDH); /* 57 */ - *reg++ = IXGB_READ_REG(hw, TDT); /* 58 */ - *reg++ = IXGB_READ_REG(hw, TIDV); /* 59 */ - *reg++ = IXGB_READ_REG(hw, TXDCTL); /* 60 */ - *reg++ = IXGB_READ_REG(hw, TSPMT); /* 61 */ - *reg++ = IXGB_READ_REG(hw, PAP); /* 62 */ - - /* Physical */ - *reg++ = IXGB_READ_REG(hw, PCSC1); /* 63 */ - *reg++ = IXGB_READ_REG(hw, PCSC2); /* 64 */ - *reg++ = IXGB_READ_REG(hw, PCSS1); /* 65 */ - *reg++ = IXGB_READ_REG(hw, PCSS2); /* 66 */ - *reg++ = IXGB_READ_REG(hw, XPCSS); /* 67 */ - *reg++ = IXGB_READ_REG(hw, UCCR); /* 68 */ - *reg++ = IXGB_READ_REG(hw, XPCSTC); /* 69 */ - *reg++ = IXGB_READ_REG(hw, MACA); /* 70 */ - *reg++ = IXGB_READ_REG(hw, APAE); /* 71 */ - *reg++ = IXGB_READ_REG(hw, ARD); /* 72 */ - *reg++ = IXGB_READ_REG(hw, AIS); /* 73 */ - *reg++ = IXGB_READ_REG(hw, MSCA); /* 74 */ - *reg++ = IXGB_READ_REG(hw, MSRWD); /* 75 */ - - /* Statistics */ - *reg++ = IXGB_GET_STAT(adapter, tprl); /* 76 */ - *reg++ = IXGB_GET_STAT(adapter, tprh); /* 77 */ - *reg++ = IXGB_GET_STAT(adapter, gprcl); /* 78 */ - *reg++ = IXGB_GET_STAT(adapter, gprch); /* 79 */ - *reg++ = IXGB_GET_STAT(adapter, bprcl); /* 80 */ - *reg++ = IXGB_GET_STAT(adapter, bprch); /* 81 */ - *reg++ = IXGB_GET_STAT(adapter, mprcl); /* 82 */ - *reg++ = IXGB_GET_STAT(adapter, mprch); /* 83 */ - *reg++ = IXGB_GET_STAT(adapter, uprcl); /* 84 */ - *reg++ = IXGB_GET_STAT(adapter, uprch); /* 85 */ - *reg++ = IXGB_GET_STAT(adapter, vprcl); /* 86 */ - *reg++ = IXGB_GET_STAT(adapter, vprch); /* 87 */ - *reg++ = IXGB_GET_STAT(adapter, jprcl); /* 88 */ - *reg++ = IXGB_GET_STAT(adapter, jprch); /* 89 */ - *reg++ = IXGB_GET_STAT(adapter, gorcl); /* 90 */ - *reg++ = IXGB_GET_STAT(adapter, gorch); /* 91 */ - *reg++ = IXGB_GET_STAT(adapter, torl); /* 92 */ - *reg++ = IXGB_GET_STAT(adapter, torh); /* 93 */ - *reg++ = IXGB_GET_STAT(adapter, rnbc); /* 94 */ - *reg++ = IXGB_GET_STAT(adapter, ruc); /* 95 */ - *reg++ = IXGB_GET_STAT(adapter, roc); /* 96 */ - *reg++ = 
IXGB_GET_STAT(adapter, rlec); /* 97 */ - *reg++ = IXGB_GET_STAT(adapter, crcerrs); /* 98 */ - *reg++ = IXGB_GET_STAT(adapter, icbc); /* 99 */ - *reg++ = IXGB_GET_STAT(adapter, ecbc); /* 100 */ - *reg++ = IXGB_GET_STAT(adapter, mpc); /* 101 */ - *reg++ = IXGB_GET_STAT(adapter, tptl); /* 102 */ - *reg++ = IXGB_GET_STAT(adapter, tpth); /* 103 */ - *reg++ = IXGB_GET_STAT(adapter, gptcl); /* 104 */ - *reg++ = IXGB_GET_STAT(adapter, gptch); /* 105 */ - *reg++ = IXGB_GET_STAT(adapter, bptcl); /* 106 */ - *reg++ = IXGB_GET_STAT(adapter, bptch); /* 107 */ - *reg++ = IXGB_GET_STAT(adapter, mptcl); /* 108 */ - *reg++ = IXGB_GET_STAT(adapter, mptch); /* 109 */ - *reg++ = IXGB_GET_STAT(adapter, uptcl); /* 110 */ - *reg++ = IXGB_GET_STAT(adapter, uptch); /* 111 */ - *reg++ = IXGB_GET_STAT(adapter, vptcl); /* 112 */ - *reg++ = IXGB_GET_STAT(adapter, vptch); /* 113 */ - *reg++ = IXGB_GET_STAT(adapter, jptcl); /* 114 */ - *reg++ = IXGB_GET_STAT(adapter, jptch); /* 115 */ - *reg++ = IXGB_GET_STAT(adapter, gotcl); /* 116 */ - *reg++ = IXGB_GET_STAT(adapter, gotch); /* 117 */ - *reg++ = IXGB_GET_STAT(adapter, totl); /* 118 */ - *reg++ = IXGB_GET_STAT(adapter, toth); /* 119 */ - *reg++ = IXGB_GET_STAT(adapter, dc); /* 120 */ - *reg++ = IXGB_GET_STAT(adapter, plt64c); /* 121 */ - *reg++ = IXGB_GET_STAT(adapter, tsctc); /* 122 */ - *reg++ = IXGB_GET_STAT(adapter, tsctfc); /* 123 */ - *reg++ = IXGB_GET_STAT(adapter, ibic); /* 124 */ - *reg++ = IXGB_GET_STAT(adapter, rfc); /* 125 */ - *reg++ = IXGB_GET_STAT(adapter, lfc); /* 126 */ - *reg++ = IXGB_GET_STAT(adapter, pfrc); /* 127 */ - *reg++ = IXGB_GET_STAT(adapter, pftc); /* 128 */ - *reg++ = IXGB_GET_STAT(adapter, mcfrc); /* 129 */ - *reg++ = IXGB_GET_STAT(adapter, mcftc); /* 130 */ - *reg++ = IXGB_GET_STAT(adapter, xonrxc); /* 131 */ - *reg++ = IXGB_GET_STAT(adapter, xontxc); /* 132 */ - *reg++ = IXGB_GET_STAT(adapter, xoffrxc); /* 133 */ - *reg++ = IXGB_GET_STAT(adapter, xofftxc); /* 134 */ - *reg++ = IXGB_GET_STAT(adapter, rjc); /* 135 
*/ - - regs->len = (reg - reg_start) * sizeof(u32); -} - -static int -ixgb_get_eeprom_len(struct net_device *netdev) -{ - /* return size in bytes */ - return IXGB_EEPROM_SIZE << 1; -} - -static int -ixgb_get_eeprom(struct net_device *netdev, - struct ethtool_eeprom *eeprom, u8 *bytes) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - struct ixgb_hw *hw = &adapter->hw; - __le16 *eeprom_buff; - int i, max_len, first_word, last_word; - int ret_val = 0; - - if (eeprom->len == 0) { - ret_val = -EINVAL; - goto geeprom_error; - } - - eeprom->magic = hw->vendor_id | (hw->device_id << 16); - - max_len = ixgb_get_eeprom_len(netdev); - - if (eeprom->offset > eeprom->offset + eeprom->len) { - ret_val = -EINVAL; - goto geeprom_error; - } - - if ((eeprom->offset + eeprom->len) > max_len) - eeprom->len = (max_len - eeprom->offset); - - first_word = eeprom->offset >> 1; - last_word = (eeprom->offset + eeprom->len - 1) >> 1; - - eeprom_buff = kmalloc_array(last_word - first_word + 1, - sizeof(__le16), - GFP_KERNEL); - if (!eeprom_buff) - return -ENOMEM; - - /* note the eeprom was good because the driver loaded */ - for (i = 0; i <= (last_word - first_word); i++) - eeprom_buff[i] = ixgb_get_eeprom_word(hw, (first_word + i)); - - memcpy(bytes, (u8 *)eeprom_buff + (eeprom->offset & 1), eeprom->len); - kfree(eeprom_buff); - -geeprom_error: - return ret_val; -} - -static int -ixgb_set_eeprom(struct net_device *netdev, - struct ethtool_eeprom *eeprom, u8 *bytes) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - struct ixgb_hw *hw = &adapter->hw; - u16 *eeprom_buff; - void *ptr; - int max_len, first_word, last_word; - u16 i; - - if (eeprom->len == 0) - return -EINVAL; - - if (eeprom->magic != (hw->vendor_id | (hw->device_id << 16))) - return -EFAULT; - - max_len = ixgb_get_eeprom_len(netdev); - - if (eeprom->offset > eeprom->offset + eeprom->len) - return -EINVAL; - - if ((eeprom->offset + eeprom->len) > max_len) - eeprom->len = (max_len - eeprom->offset); - - first_word 
= eeprom->offset >> 1; - last_word = (eeprom->offset + eeprom->len - 1) >> 1; - eeprom_buff = kmalloc(max_len, GFP_KERNEL); - if (!eeprom_buff) - return -ENOMEM; - - ptr = (void *)eeprom_buff; - - if (eeprom->offset & 1) { - /* need read/modify/write of first changed EEPROM word */ - /* only the second byte of the word is being modified */ - eeprom_buff[0] = ixgb_read_eeprom(hw, first_word); - ptr++; - } - if ((eeprom->offset + eeprom->len) & 1) { - /* need read/modify/write of last changed EEPROM word */ - /* only the first byte of the word is being modified */ - eeprom_buff[last_word - first_word] - = ixgb_read_eeprom(hw, last_word); - } - - memcpy(ptr, bytes, eeprom->len); - for (i = 0; i <= (last_word - first_word); i++) - ixgb_write_eeprom(hw, first_word + i, eeprom_buff[i]); - - /* Update the checksum over the first part of the EEPROM if needed */ - if (first_word <= EEPROM_CHECKSUM_REG) - ixgb_update_eeprom_checksum(hw); - - kfree(eeprom_buff); - return 0; -} - -static void -ixgb_get_drvinfo(struct net_device *netdev, - struct ethtool_drvinfo *drvinfo) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - - strscpy(drvinfo->driver, ixgb_driver_name, - sizeof(drvinfo->driver)); - strscpy(drvinfo->bus_info, pci_name(adapter->pdev), - sizeof(drvinfo->bus_info)); -} - -static void -ixgb_get_ringparam(struct net_device *netdev, - struct ethtool_ringparam *ring, - struct kernel_ethtool_ringparam *kernel_ring, - struct netlink_ext_ack *extack) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - struct ixgb_desc_ring *txdr = &adapter->tx_ring; - struct ixgb_desc_ring *rxdr = &adapter->rx_ring; - - ring->rx_max_pending = MAX_RXD; - ring->tx_max_pending = MAX_TXD; - ring->rx_pending = rxdr->count; - ring->tx_pending = txdr->count; -} - -static int -ixgb_set_ringparam(struct net_device *netdev, - struct ethtool_ringparam *ring, - struct kernel_ethtool_ringparam *kernel_ring, - struct netlink_ext_ack *extack) -{ - struct ixgb_adapter *adapter = 
netdev_priv(netdev); - struct ixgb_desc_ring *txdr = &adapter->tx_ring; - struct ixgb_desc_ring *rxdr = &adapter->rx_ring; - struct ixgb_desc_ring tx_old, tx_new, rx_old, rx_new; - int err; - - tx_old = adapter->tx_ring; - rx_old = adapter->rx_ring; - - if ((ring->rx_mini_pending) || (ring->rx_jumbo_pending)) - return -EINVAL; - - if (netif_running(adapter->netdev)) - ixgb_down(adapter, true); - - rxdr->count = max(ring->rx_pending,(u32)MIN_RXD); - rxdr->count = min(rxdr->count,(u32)MAX_RXD); - rxdr->count = ALIGN(rxdr->count, IXGB_REQ_RX_DESCRIPTOR_MULTIPLE); - - txdr->count = max(ring->tx_pending,(u32)MIN_TXD); - txdr->count = min(txdr->count,(u32)MAX_TXD); - txdr->count = ALIGN(txdr->count, IXGB_REQ_TX_DESCRIPTOR_MULTIPLE); - - if (netif_running(adapter->netdev)) { - /* Try to get new resources before deleting old */ - if ((err = ixgb_setup_rx_resources(adapter))) - goto err_setup_rx; - if ((err = ixgb_setup_tx_resources(adapter))) - goto err_setup_tx; - - /* save the new, restore the old in order to free it, - * then restore the new back again */ - - rx_new = adapter->rx_ring; - tx_new = adapter->tx_ring; - adapter->rx_ring = rx_old; - adapter->tx_ring = tx_old; - ixgb_free_rx_resources(adapter); - ixgb_free_tx_resources(adapter); - adapter->rx_ring = rx_new; - adapter->tx_ring = tx_new; - if ((err = ixgb_up(adapter))) - return err; - ixgb_set_speed_duplex(netdev); - } - - return 0; -err_setup_tx: - ixgb_free_rx_resources(adapter); -err_setup_rx: - adapter->rx_ring = rx_old; - adapter->tx_ring = tx_old; - ixgb_up(adapter); - return err; -} - -static int -ixgb_set_phys_id(struct net_device *netdev, enum ethtool_phys_id_state state) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - - switch (state) { - case ETHTOOL_ID_ACTIVE: - return 2; - - case ETHTOOL_ID_ON: - ixgb_led_on(&adapter->hw); - break; - - case ETHTOOL_ID_OFF: - case ETHTOOL_ID_INACTIVE: - ixgb_led_off(&adapter->hw); - } - - return 0; -} - -static int -ixgb_get_sset_count(struct net_device 
*netdev, int sset) -{ - switch (sset) { - case ETH_SS_STATS: - return IXGB_STATS_LEN; - default: - return -EOPNOTSUPP; - } -} - -static void -ixgb_get_ethtool_stats(struct net_device *netdev, - struct ethtool_stats *stats, u64 *data) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - int i; - char *p = NULL; - - ixgb_update_stats(adapter); - for (i = 0; i < IXGB_STATS_LEN; i++) { - switch (ixgb_gstrings_stats[i].type) { - case NETDEV_STATS: - p = (char *) netdev + - ixgb_gstrings_stats[i].stat_offset; - break; - case IXGB_STATS: - p = (char *) adapter + - ixgb_gstrings_stats[i].stat_offset; - break; - } - - data[i] = (ixgb_gstrings_stats[i].sizeof_stat == - sizeof(u64)) ? *(u64 *)p : *(u32 *)p; - } -} - -static void -ixgb_get_strings(struct net_device *netdev, u32 stringset, u8 *data) -{ - int i; - - switch(stringset) { - case ETH_SS_STATS: - for (i = 0; i < IXGB_STATS_LEN; i++) { - memcpy(data + i * ETH_GSTRING_LEN, - ixgb_gstrings_stats[i].stat_string, - ETH_GSTRING_LEN); - } - break; - } -} - -static const struct ethtool_ops ixgb_ethtool_ops = { - .get_drvinfo = ixgb_get_drvinfo, - .get_regs_len = ixgb_get_regs_len, - .get_regs = ixgb_get_regs, - .get_link = ethtool_op_get_link, - .get_eeprom_len = ixgb_get_eeprom_len, - .get_eeprom = ixgb_get_eeprom, - .set_eeprom = ixgb_set_eeprom, - .get_ringparam = ixgb_get_ringparam, - .set_ringparam = ixgb_set_ringparam, - .get_pauseparam = ixgb_get_pauseparam, - .set_pauseparam = ixgb_set_pauseparam, - .get_msglevel = ixgb_get_msglevel, - .set_msglevel = ixgb_set_msglevel, - .get_strings = ixgb_get_strings, - .set_phys_id = ixgb_set_phys_id, - .get_sset_count = ixgb_get_sset_count, - .get_ethtool_stats = ixgb_get_ethtool_stats, - .get_link_ksettings = ixgb_get_link_ksettings, - .set_link_ksettings = ixgb_set_link_ksettings, -}; - -void ixgb_set_ethtool_ops(struct net_device *netdev) -{ - netdev->ethtool_ops = &ixgb_ethtool_ops; -} diff --git a/drivers/net/ethernet/intel/ixgb/ixgb_hw.c 
b/drivers/net/ethernet/intel/ixgb/ixgb_hw.c deleted file mode 100644 index 98bd3267b99b..000000000000 --- a/drivers/net/ethernet/intel/ixgb/ixgb_hw.c +++ /dev/null @@ -1,1229 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/* Copyright(c) 1999 - 2008 Intel Corporation. */ - -/* ixgb_hw.c - * Shared functions for accessing and configuring the adapter - */ - -#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt - -#include <linux/pci_ids.h> -#include "ixgb_hw.h" -#include "ixgb_ids.h" - -#include <linux/etherdevice.h> - -/* Local function prototypes */ - -static u32 ixgb_hash_mc_addr(struct ixgb_hw *hw, u8 * mc_addr); - -static void ixgb_mta_set(struct ixgb_hw *hw, u32 hash_value); - -static void ixgb_get_bus_info(struct ixgb_hw *hw); - -static bool ixgb_link_reset(struct ixgb_hw *hw); - -static void ixgb_optics_reset(struct ixgb_hw *hw); - -static void ixgb_optics_reset_bcm(struct ixgb_hw *hw); - -static ixgb_phy_type ixgb_identify_phy(struct ixgb_hw *hw); - -static void ixgb_clear_hw_cntrs(struct ixgb_hw *hw); - -static void ixgb_clear_vfta(struct ixgb_hw *hw); - -static void ixgb_init_rx_addrs(struct ixgb_hw *hw); - -static u16 ixgb_read_phy_reg(struct ixgb_hw *hw, - u32 reg_address, - u32 phy_address, - u32 device_type); - -static bool ixgb_setup_fc(struct ixgb_hw *hw); - -static bool mac_addr_valid(u8 *mac_addr); - -static u32 ixgb_mac_reset(struct ixgb_hw *hw) -{ - u32 ctrl_reg; - - ctrl_reg = IXGB_CTRL0_RST | - IXGB_CTRL0_SDP3_DIR | /* All pins are Output=1 */ - IXGB_CTRL0_SDP2_DIR | - IXGB_CTRL0_SDP1_DIR | - IXGB_CTRL0_SDP0_DIR | - IXGB_CTRL0_SDP3 | /* Initial value 1101 */ - IXGB_CTRL0_SDP2 | - IXGB_CTRL0_SDP0; - -#ifdef HP_ZX1 - /* Workaround for 82597EX reset errata */ - IXGB_WRITE_REG_IO(hw, CTRL0, ctrl_reg); -#else - IXGB_WRITE_REG(hw, CTRL0, ctrl_reg); -#endif - - /* Delay a few ms just to allow the reset to complete */ - msleep(IXGB_DELAY_AFTER_RESET); - ctrl_reg = IXGB_READ_REG(hw, CTRL0); -#ifdef DBG - /* Make sure the self-clearing global reset bit did self 
clear */ - ASSERT(!(ctrl_reg & IXGB_CTRL0_RST)); -#endif - - if (hw->subsystem_vendor_id == PCI_VENDOR_ID_SUN) { - ctrl_reg = /* Enable interrupt from XFP and SerDes */ - IXGB_CTRL1_GPI0_EN | - IXGB_CTRL1_SDP6_DIR | - IXGB_CTRL1_SDP7_DIR | - IXGB_CTRL1_SDP6 | - IXGB_CTRL1_SDP7; - IXGB_WRITE_REG(hw, CTRL1, ctrl_reg); - ixgb_optics_reset_bcm(hw); - } - - if (hw->phy_type == ixgb_phy_type_txn17401) - ixgb_optics_reset(hw); - - return ctrl_reg; -} - -/****************************************************************************** - * Reset the transmit and receive units; mask and clear all interrupts. - * - * hw - Struct containing variables accessed by shared code - *****************************************************************************/ -bool -ixgb_adapter_stop(struct ixgb_hw *hw) -{ - u32 ctrl_reg; - - ENTER(); - - /* If we are stopped or resetting exit gracefully and wait to be - * started again before accessing the hardware. - */ - if (hw->adapter_stopped) { - pr_debug("Exiting because the adapter is already stopped!!!\n"); - return false; - } - - /* Set the Adapter Stopped flag so other driver functions stop - * touching the Hardware. - */ - hw->adapter_stopped = true; - - /* Clear interrupt mask to stop board from generating interrupts */ - pr_debug("Masking off all interrupts\n"); - IXGB_WRITE_REG(hw, IMC, 0xFFFFFFFF); - - /* Disable the Transmit and Receive units. Then delay to allow - * any pending transactions to complete before we hit the MAC with - * the global reset. - */ - IXGB_WRITE_REG(hw, RCTL, IXGB_READ_REG(hw, RCTL) & ~IXGB_RCTL_RXEN); - IXGB_WRITE_REG(hw, TCTL, IXGB_READ_REG(hw, TCTL) & ~IXGB_TCTL_TXEN); - IXGB_WRITE_FLUSH(hw); - msleep(IXGB_DELAY_BEFORE_RESET); - - /* Issue a global reset to the MAC. This will reset the chip's - * transmit, receive, DMA, and link units. It will not effect - * the current PCI configuration. The global reset bit is self- - * clearing, and should clear within a microsecond. 
- */ - pr_debug("Issuing a global reset to MAC\n"); - - ctrl_reg = ixgb_mac_reset(hw); - - /* Clear interrupt mask to stop board from generating interrupts */ - pr_debug("Masking off all interrupts\n"); - IXGB_WRITE_REG(hw, IMC, 0xffffffff); - - /* Clear any pending interrupt events. */ - IXGB_READ_REG(hw, ICR); - - return ctrl_reg & IXGB_CTRL0_RST; -} - - -/****************************************************************************** - * Identifies the vendor of the optics module on the adapter. The SR adapters - * support two different types of XPAK optics, so it is necessary to determine - * which optics are present before applying any optics-specific workarounds. - * - * hw - Struct containing variables accessed by shared code. - * - * Returns: the vendor of the XPAK optics module. - *****************************************************************************/ -static ixgb_xpak_vendor -ixgb_identify_xpak_vendor(struct ixgb_hw *hw) -{ - u32 i; - u16 vendor_name[5]; - ixgb_xpak_vendor xpak_vendor; - - ENTER(); - - /* Read the first few bytes of the vendor string from the XPAK NVR - * registers. These are standard XENPAK/XPAK registers, so all XPAK - * devices should implement them. */ - for (i = 0; i < 5; i++) { - vendor_name[i] = ixgb_read_phy_reg(hw, - MDIO_PMA_PMD_XPAK_VENDOR_NAME - + i, IXGB_PHY_ADDRESS, - MDIO_MMD_PMAPMD); - } - - /* Determine the actual vendor */ - if (vendor_name[0] == 'I' && - vendor_name[1] == 'N' && - vendor_name[2] == 'T' && - vendor_name[3] == 'E' && vendor_name[4] == 'L') { - xpak_vendor = ixgb_xpak_vendor_intel; - } else { - xpak_vendor = ixgb_xpak_vendor_infineon; - } - - return xpak_vendor; -} - -/****************************************************************************** - * Determine the physical layer module on the adapter. - * - * hw - Struct containing variables accessed by shared code. The device_id - * field must be (correctly) populated before calling this routine. - * - * Returns: the phy type of the adapter. 
- *****************************************************************************/ -static ixgb_phy_type -ixgb_identify_phy(struct ixgb_hw *hw) -{ - ixgb_phy_type phy_type; - ixgb_xpak_vendor xpak_vendor; - - ENTER(); - - /* Infer the transceiver/phy type from the device id */ - switch (hw->device_id) { - case IXGB_DEVICE_ID_82597EX: - pr_debug("Identified TXN17401 optics\n"); - phy_type = ixgb_phy_type_txn17401; - break; - - case IXGB_DEVICE_ID_82597EX_SR: - /* The SR adapters carry two different types of XPAK optics - * modules; read the vendor identifier to determine the exact - * type of optics. */ - xpak_vendor = ixgb_identify_xpak_vendor(hw); - if (xpak_vendor == ixgb_xpak_vendor_intel) { - pr_debug("Identified TXN17201 optics\n"); - phy_type = ixgb_phy_type_txn17201; - } else { - pr_debug("Identified G6005 optics\n"); - phy_type = ixgb_phy_type_g6005; - } - break; - case IXGB_DEVICE_ID_82597EX_LR: - pr_debug("Identified G6104 optics\n"); - phy_type = ixgb_phy_type_g6104; - break; - case IXGB_DEVICE_ID_82597EX_CX4: - pr_debug("Identified CX4\n"); - xpak_vendor = ixgb_identify_xpak_vendor(hw); - if (xpak_vendor == ixgb_xpak_vendor_intel) { - pr_debug("Identified TXN17201 optics\n"); - phy_type = ixgb_phy_type_txn17201; - } else { - pr_debug("Identified G6005 optics\n"); - phy_type = ixgb_phy_type_g6005; - } - break; - default: - pr_debug("Unknown physical layer module\n"); - phy_type = ixgb_phy_type_unknown; - break; - } - - /* update phy type for sun specific board */ - if (hw->subsystem_vendor_id == PCI_VENDOR_ID_SUN) - phy_type = ixgb_phy_type_bcm; - - return phy_type; -} - -/****************************************************************************** - * Performs basic configuration of the adapter. - * - * hw - Struct containing variables accessed by shared code - * - * Resets the controller. - * Reads and validates the EEPROM. - * Initializes the receive address registers. - * Initializes the multicast table. - * Clears all on-chip counters. 
- * Calls routine to setup flow control settings. - * Leaves the transmit and receive units disabled and uninitialized. - * - * Returns: - * true if successful, - * false if unrecoverable problems were encountered. - *****************************************************************************/ -bool -ixgb_init_hw(struct ixgb_hw *hw) -{ - u32 i; - bool status; - - ENTER(); - - /* Issue a global reset to the MAC. This will reset the chip's - * transmit, receive, DMA, and link units. It will not effect - * the current PCI configuration. The global reset bit is self- - * clearing, and should clear within a microsecond. - */ - pr_debug("Issuing a global reset to MAC\n"); - - ixgb_mac_reset(hw); - - pr_debug("Issuing an EE reset to MAC\n"); -#ifdef HP_ZX1 - /* Workaround for 82597EX reset errata */ - IXGB_WRITE_REG_IO(hw, CTRL1, IXGB_CTRL1_EE_RST); -#else - IXGB_WRITE_REG(hw, CTRL1, IXGB_CTRL1_EE_RST); -#endif - - /* Delay a few ms just to allow the reset to complete */ - msleep(IXGB_DELAY_AFTER_EE_RESET); - - if (!ixgb_get_eeprom_data(hw)) - return false; - - /* Use the device id to determine the type of phy/transceiver. */ - hw->device_id = ixgb_get_ee_device_id(hw); - hw->phy_type = ixgb_identify_phy(hw); - - /* Setup the receive addresses. - * Receive Address Registers (RARs 0 - 15). - */ - ixgb_init_rx_addrs(hw); - - /* - * Check that a valid MAC address has been set. - * If it is not valid, we fail hardware init. 
- */ - if (!mac_addr_valid(hw->curr_mac_addr)) { - pr_debug("MAC address invalid after ixgb_init_rx_addrs\n"); - return(false); - } - - /* tell the routines in this file they can access hardware again */ - hw->adapter_stopped = false; - - /* Fill in the bus_info structure */ - ixgb_get_bus_info(hw); - - /* Zero out the Multicast HASH table */ - pr_debug("Zeroing the MTA\n"); - for (i = 0; i < IXGB_MC_TBL_SIZE; i++) - IXGB_WRITE_REG_ARRAY(hw, MTA, i, 0); - - /* Zero out the VLAN Filter Table Array */ - ixgb_clear_vfta(hw); - - /* Zero all of the hardware counters */ - ixgb_clear_hw_cntrs(hw); - - /* Call a subroutine to setup flow control. */ - status = ixgb_setup_fc(hw); - - /* 82597EX errata: Call check-for-link in case lane deskew is locked */ - ixgb_check_for_link(hw); - - return status; -} - -/****************************************************************************** - * Initializes receive address filters. - * - * hw - Struct containing variables accessed by shared code - * - * Places the MAC address in receive address register 0 and clears the rest - * of the receive address registers. Clears the multicast table. Assumes - * the receiver is in reset when the routine is called. - *****************************************************************************/ -static void -ixgb_init_rx_addrs(struct ixgb_hw *hw) -{ - u32 i; - - ENTER(); - - /* - * If the current mac address is valid, assume it is a software override - * to the permanent address. - * Otherwise, use the permanent address from the eeprom. - */ - if (!mac_addr_valid(hw->curr_mac_addr)) { - - /* Get the MAC address from the eeprom for later reference */ - ixgb_get_ee_mac_addr(hw, hw->curr_mac_addr); - - pr_debug("Keeping Permanent MAC Addr = %pM\n", - hw->curr_mac_addr); - } else { - - /* Setup the receive address. 
*/
-		pr_debug("Overriding MAC Address in RAR[0]\n");
-		pr_debug("New MAC Addr = %pM\n", hw->curr_mac_addr);
-
-		ixgb_rar_set(hw, hw->curr_mac_addr, 0);
-	}
-
-	/* Zero out the other 15 receive addresses. */
-	pr_debug("Clearing RAR[1-15]\n");
-	for (i = 1; i < IXGB_RAR_ENTRIES; i++) {
-		/* Write high reg first to disable the AV bit first */
-		IXGB_WRITE_REG_ARRAY(hw, RA, ((i << 1) + 1), 0);
-		IXGB_WRITE_REG_ARRAY(hw, RA, (i << 1), 0);
-	}
-}
-
-/******************************************************************************
- * Updates the MAC's list of multicast addresses.
- *
- * hw - Struct containing variables accessed by shared code
- * mc_addr_list - the list of new multicast addresses
- * mc_addr_count - number of addresses
- * pad - number of bytes between addresses in the list
- *
- * The given list replaces any existing list. Clears the last 15 receive
- * address registers and the multicast table. Uses receive address registers
- * for the first 15 multicast addresses, and hashes the rest into the
- * multicast table.
- *****************************************************************************/
-void
-ixgb_mc_addr_list_update(struct ixgb_hw *hw,
-			  u8 *mc_addr_list,
-			  u32 mc_addr_count,
-			  u32 pad)
-{
-	u32 hash_value;
-	u32 i;
-	u32 rar_used_count = 1; /* RAR[0] is used for our MAC address */
-	u8 *mca;
-
-	ENTER();
-
-	/* Set the new number of MC addresses that we are being requested to use.
*/
-	hw->num_mc_addrs = mc_addr_count;
-
-	/* Clear RAR[1-15] */
-	pr_debug("Clearing RAR[1-15]\n");
-	for (i = rar_used_count; i < IXGB_RAR_ENTRIES; i++) {
-		IXGB_WRITE_REG_ARRAY(hw, RA, (i << 1), 0);
-		IXGB_WRITE_REG_ARRAY(hw, RA, ((i << 1) + 1), 0);
-	}
-
-	/* Clear the MTA */
-	pr_debug("Clearing MTA\n");
-	for (i = 0; i < IXGB_MC_TBL_SIZE; i++)
-		IXGB_WRITE_REG_ARRAY(hw, MTA, i, 0);
-
-	/* Add the new addresses */
-	mca = mc_addr_list;
-	for (i = 0; i < mc_addr_count; i++) {
-		pr_debug("Adding the multicast addresses:\n");
-		pr_debug("MC Addr #%d = %pM\n", i, mca);
-
-		/* Place this multicast address in the RAR if there is room, *
-		 * else put it in the MTA
-		 */
-		if (rar_used_count < IXGB_RAR_ENTRIES) {
-			ixgb_rar_set(hw, mca, rar_used_count);
-			pr_debug("Added a multicast address to RAR[%d]\n", i);
-			rar_used_count++;
-		} else {
-			hash_value = ixgb_hash_mc_addr(hw, mca);
-
-			pr_debug("Hash value = 0x%03X\n", hash_value);
-
-			ixgb_mta_set(hw, hash_value);
-		}
-
-		mca += ETH_ALEN + pad;
-	}
-
-	pr_debug("MC Update Complete\n");
-}
-
-/******************************************************************************
- * Hashes an address to determine its location in the multicast table
- *
- * hw - Struct containing variables accessed by shared code
- * mc_addr - the multicast address to hash
- *
- * Returns:
- * The hash value
- *****************************************************************************/
-static u32
-ixgb_hash_mc_addr(struct ixgb_hw *hw,
-		   u8 *mc_addr)
-{
-	u32 hash_value = 0;
-
-	ENTER();
-
-	/* The portion of the address that is used for the hash table is
-	 * determined by the mc_filter_type setting.
-	 */
-	switch (hw->mc_filter_type) {
-		/* [0] [1] [2] [3] [4] [5]
-		 * 01 AA 00 12 34 56
-		 * LSB MSB - According to H/W docs */
-	case 0:
-		/* [47:36] i.e. 0x563 for above example address */
-		hash_value =
-		    ((mc_addr[4] >> 4) | (((u16) mc_addr[5]) << 4));
-		break;
-	case 1: /* [46:35] i.e.
0xAC6 for above example address */
-		hash_value =
-		    ((mc_addr[4] >> 3) | (((u16) mc_addr[5]) << 5));
-		break;
-	case 2: /* [45:34] i.e. 0x5D8 for above example address */
-		hash_value =
-		    ((mc_addr[4] >> 2) | (((u16) mc_addr[5]) << 6));
-		break;
-	case 3: /* [43:32] i.e. 0x634 for above example address */
-		hash_value = ((mc_addr[4]) | (((u16) mc_addr[5]) << 8));
-		break;
-	default:
-		/* Invalid mc_filter_type, what should we do? */
-		pr_debug("MC filter type param set incorrectly\n");
-		ASSERT(0);
-		break;
-	}
-
-	hash_value &= 0xFFF;
-	return hash_value;
-}
-
-/******************************************************************************
- * Sets the bit in the multicast table corresponding to the hash value.
- *
- * hw - Struct containing variables accessed by shared code
- * hash_value - Multicast address hash value
- *****************************************************************************/
-static void
-ixgb_mta_set(struct ixgb_hw *hw,
-		  u32 hash_value)
-{
-	u32 hash_bit, hash_reg;
-	u32 mta_reg;
-
-	/* The MTA is a register array of 128 32-bit registers.
-	 * It is treated like an array of 4096 bits. We want to set
-	 * bit BitArray[hash_value]. So we figure out what register
-	 * the bit is in, read it, OR in the new bit, then write
-	 * back the new value. The register is determined by the
-	 * upper 7 bits of the hash value and the bit within that
-	 * register are determined by the lower 5 bits of the value.
-	 */
-	hash_reg = (hash_value >> 5) & 0x7F;
-	hash_bit = hash_value & 0x1F;
-
-	mta_reg = IXGB_READ_REG_ARRAY(hw, MTA, hash_reg);
-
-	mta_reg |= (1 << hash_bit);
-
-	IXGB_WRITE_REG_ARRAY(hw, MTA, hash_reg, mta_reg);
-}
-
-/******************************************************************************
- * Puts an ethernet address into a receive address register.
- *
- * hw - Struct containing variables accessed by shared code
- * addr - Address to put into receive address register
- * index - Receive address register to write
- *****************************************************************************/
-void
-ixgb_rar_set(struct ixgb_hw *hw,
-		  const u8 *addr,
-		  u32 index)
-{
-	u32 rar_low, rar_high;
-
-	ENTER();
-
-	/* HW expects these in little endian so we reverse the byte order
-	 * from network order (big endian) to little endian
-	 */
-	rar_low = ((u32) addr[0] |
-		   ((u32)addr[1] << 8) |
-		   ((u32)addr[2] << 16) |
-		   ((u32)addr[3] << 24));
-
-	rar_high = ((u32) addr[4] |
-		    ((u32)addr[5] << 8) |
-		    IXGB_RAH_AV);
-
-	IXGB_WRITE_REG_ARRAY(hw, RA, (index << 1), rar_low);
-	IXGB_WRITE_REG_ARRAY(hw, RA, ((index << 1) + 1), rar_high);
-}
-
-/******************************************************************************
- * Writes a value to the specified offset in the VLAN filter table.
- *
- * hw - Struct containing variables accessed by shared code
- * offset - Offset in VLAN filter table to write
- * value - Value to write into VLAN filter table
- *****************************************************************************/
-void
-ixgb_write_vfta(struct ixgb_hw *hw,
-		 u32 offset,
-		 u32 value)
-{
-	IXGB_WRITE_REG_ARRAY(hw, VFTA, offset, value);
-}
-
-/******************************************************************************
- * Clears the VLAN filter table
- *
- * hw - Struct containing variables accessed by shared code
- *****************************************************************************/
-static void
-ixgb_clear_vfta(struct ixgb_hw *hw)
-{
-	u32 offset;
-
-	for (offset = 0; offset < IXGB_VLAN_FILTER_TBL_SIZE; offset++)
-		IXGB_WRITE_REG_ARRAY(hw, VFTA, offset, 0);
-}
-
-/******************************************************************************
- * Configures the flow control settings based on SW configuration.
- *
- * hw - Struct containing variables accessed by shared code
- *****************************************************************************/
-
-static bool
-ixgb_setup_fc(struct ixgb_hw *hw)
-{
-	u32 ctrl_reg;
-	u32 pap_reg = 0; /* by default, assume no pause time */
-	bool status = true;
-
-	ENTER();
-
-	/* Get the current control reg 0 settings */
-	ctrl_reg = IXGB_READ_REG(hw, CTRL0);
-
-	/* Clear the Receive Pause Enable and Transmit Pause Enable bits */
-	ctrl_reg &= ~(IXGB_CTRL0_RPE | IXGB_CTRL0_TPE);
-
-	/* The possible values of the "flow_control" parameter are:
-	 * 0: Flow control is completely disabled
-	 * 1: Rx flow control is enabled (we can receive pause frames
-	 * but not send pause frames).
-	 * 2: Tx flow control is enabled (we can send pause frames
-	 * but we do not support receiving pause frames).
-	 * 3: Both Rx and TX flow control (symmetric) are enabled.
-	 * other: Invalid.
-	 */
-	switch (hw->fc.type) {
-	case ixgb_fc_none: /* 0 */
-		/* Set CMDC bit to disable Rx Flow control */
-		ctrl_reg |= (IXGB_CTRL0_CMDC);
-		break;
-	case ixgb_fc_rx_pause: /* 1 */
-		/* RX Flow control is enabled, and TX Flow control is
-		 * disabled.
-		 */
-		ctrl_reg |= (IXGB_CTRL0_RPE);
-		break;
-	case ixgb_fc_tx_pause: /* 2 */
-		/* TX Flow control is enabled, and RX Flow control is
-		 * disabled, by a software over-ride.
-		 */
-		ctrl_reg |= (IXGB_CTRL0_TPE);
-		pap_reg = hw->fc.pause_time;
-		break;
-	case ixgb_fc_full: /* 3 */
-		/* Flow control (both RX and TX) is enabled by a software
-		 * over-ride.
-		 */
-		ctrl_reg |= (IXGB_CTRL0_RPE | IXGB_CTRL0_TPE);
-		pap_reg = hw->fc.pause_time;
-		break;
-	default:
-		/* We should never get here. The value should be 0-3. */
-		pr_debug("Flow control param set incorrectly\n");
-		ASSERT(0);
-		break;
-	}
-
-	/* Write the new settings */
-	IXGB_WRITE_REG(hw, CTRL0, ctrl_reg);
-
-	if (pap_reg != 0)
-		IXGB_WRITE_REG(hw, PAP, pap_reg);
-
-	/* Set the flow control receive threshold registers.
Normally,
-	 * these registers will be set to a default threshold that may be
-	 * adjusted later by the driver's runtime code. However, if the
-	 * ability to transmit pause frames in not enabled, then these
-	 * registers will be set to 0.
-	 */
-	if (!(hw->fc.type & ixgb_fc_tx_pause)) {
-		IXGB_WRITE_REG(hw, FCRTL, 0);
-		IXGB_WRITE_REG(hw, FCRTH, 0);
-	} else {
-		/* We need to set up the Receive Threshold high and low water
-		 * marks as well as (optionally) enabling the transmission of XON
-		 * frames. */
-		if (hw->fc.send_xon) {
-			IXGB_WRITE_REG(hw, FCRTL,
-				(hw->fc.low_water | IXGB_FCRTL_XONE));
-		} else {
-			IXGB_WRITE_REG(hw, FCRTL, hw->fc.low_water);
-		}
-		IXGB_WRITE_REG(hw, FCRTH, hw->fc.high_water);
-	}
-	return status;
-}
-
-/******************************************************************************
- * Reads a word from a device over the Management Data Interface (MDI) bus.
- * This interface is used to manage Physical layer devices.
- *
- * hw - Struct containing variables accessed by hw code
- * reg_address - Offset of device register being read.
- * phy_address - Address of device on MDI.
- *
- * Returns: Data word (16 bits) from MDI device.
- *
- * The 82597EX has support for several MDI access methods. This routine
- * uses the new protocol MDI Single Command and Address Operation.
- * This requires that first an address cycle command is sent, followed by a
- * read command.
- *****************************************************************************/
-static u16
-ixgb_read_phy_reg(struct ixgb_hw *hw,
-		u32 reg_address,
-		u32 phy_address,
-		u32 device_type)
-{
-	u32 i;
-	u32 data;
-	u32 command = 0;
-
-	ASSERT(reg_address <= IXGB_MAX_PHY_REG_ADDRESS);
-	ASSERT(phy_address <= IXGB_MAX_PHY_ADDRESS);
-	ASSERT(device_type <= IXGB_MAX_PHY_DEV_TYPE);
-
-	/* Setup and write the address cycle command */
-	command = ((reg_address << IXGB_MSCA_NP_ADDR_SHIFT) |
-		   (device_type << IXGB_MSCA_DEV_TYPE_SHIFT) |
-		   (phy_address << IXGB_MSCA_PHY_ADDR_SHIFT) |
-		   (IXGB_MSCA_ADDR_CYCLE | IXGB_MSCA_MDI_COMMAND));
-
-	IXGB_WRITE_REG(hw, MSCA, command);
-
-	/**************************************************************
-	** Check every 10 usec to see if the address cycle completed
-	** The COMMAND bit will clear when the operation is complete.
-	** This may take as long as 64 usecs (we'll wait 100 usecs max)
-	** from the CPU Write to the Ready bit assertion.
-	**************************************************************/
-
-	for (i = 0; i < 10; i++)
-	{
-		udelay(10);
-
-		command = IXGB_READ_REG(hw, MSCA);
-
-		if ((command & IXGB_MSCA_MDI_COMMAND) == 0)
-			break;
-	}
-
-	ASSERT((command & IXGB_MSCA_MDI_COMMAND) == 0);
-
-	/* Address cycle complete, setup and write the read command */
-	command = ((reg_address << IXGB_MSCA_NP_ADDR_SHIFT) |
-		   (device_type << IXGB_MSCA_DEV_TYPE_SHIFT) |
-		   (phy_address << IXGB_MSCA_PHY_ADDR_SHIFT) |
-		   (IXGB_MSCA_READ | IXGB_MSCA_MDI_COMMAND));
-
-	IXGB_WRITE_REG(hw, MSCA, command);
-
-	/**************************************************************
-	** Check every 10 usec to see if the read command completed
-	** The COMMAND bit will clear when the operation is complete.
-	** The read may take as long as 64 usecs (we'll wait 100 usecs max)
-	** from the CPU Write to the Ready bit assertion.
-	**************************************************************/
-
-	for (i = 0; i < 10; i++)
-	{
-		udelay(10);
-
-		command = IXGB_READ_REG(hw, MSCA);
-
-		if ((command & IXGB_MSCA_MDI_COMMAND) == 0)
-			break;
-	}
-
-	ASSERT((command & IXGB_MSCA_MDI_COMMAND) == 0);
-
-	/* Operation is complete, get the data from the MDIO Read/Write Data
-	 * register and return.
-	 */
-	data = IXGB_READ_REG(hw, MSRWD);
-	data >>= IXGB_MSRWD_READ_DATA_SHIFT;
-	return((u16) data);
-}
-
-/******************************************************************************
- * Writes a word to a device over the Management Data Interface (MDI) bus.
- * This interface is used to manage Physical layer devices.
- *
- * hw - Struct containing variables accessed by hw code
- * reg_address - Offset of device register being read.
- * phy_address - Address of device on MDI.
- * device_type - Also known as the Device ID or DID.
- * data - 16-bit value to be written
- *
- * Returns: void.
- *
- * The 82597EX has support for several MDI access methods. This routine
- * uses the new protocol MDI Single Command and Address Operation.
- * This requires that first an address cycle command is sent, followed by a
- * write command.
- *****************************************************************************/
-static void
-ixgb_write_phy_reg(struct ixgb_hw *hw,
-			u32 reg_address,
-			u32 phy_address,
-			u32 device_type,
-			u16 data)
-{
-	u32 i;
-	u32 command = 0;
-
-	ASSERT(reg_address <= IXGB_MAX_PHY_REG_ADDRESS);
-	ASSERT(phy_address <= IXGB_MAX_PHY_ADDRESS);
-	ASSERT(device_type <= IXGB_MAX_PHY_DEV_TYPE);
-
-	/* Put the data in the MDIO Read/Write Data register */
-	IXGB_WRITE_REG(hw, MSRWD, (u32)data);
-
-	/* Setup and write the address cycle command */
-	command = ((reg_address << IXGB_MSCA_NP_ADDR_SHIFT) |
-		   (device_type << IXGB_MSCA_DEV_TYPE_SHIFT) |
-		   (phy_address << IXGB_MSCA_PHY_ADDR_SHIFT) |
-		   (IXGB_MSCA_ADDR_CYCLE | IXGB_MSCA_MDI_COMMAND));
-
-	IXGB_WRITE_REG(hw, MSCA, command);
-
-	/**************************************************************
-	** Check every 10 usec to see if the address cycle completed
-	** The COMMAND bit will clear when the operation is complete.
-	** This may take as long as 64 usecs (we'll wait 100 usecs max)
-	** from the CPU Write to the Ready bit assertion.
-	**************************************************************/
-
-	for (i = 0; i < 10; i++)
-	{
-		udelay(10);
-
-		command = IXGB_READ_REG(hw, MSCA);
-
-		if ((command & IXGB_MSCA_MDI_COMMAND) == 0)
-			break;
-	}
-
-	ASSERT((command & IXGB_MSCA_MDI_COMMAND) == 0);
-
-	/* Address cycle complete, setup and write the write command */
-	command = ((reg_address << IXGB_MSCA_NP_ADDR_SHIFT) |
-		   (device_type << IXGB_MSCA_DEV_TYPE_SHIFT) |
-		   (phy_address << IXGB_MSCA_PHY_ADDR_SHIFT) |
-		   (IXGB_MSCA_WRITE | IXGB_MSCA_MDI_COMMAND));
-
-	IXGB_WRITE_REG(hw, MSCA, command);
-
-	/**************************************************************
-	** Check every 10 usec to see if the read command completed
-	** The COMMAND bit will clear when the operation is complete.
-	** The write may take as long as 64 usecs (we'll wait 100 usecs max)
-	** from the CPU Write to the Ready bit assertion.
-	**************************************************************/
-
-	for (i = 0; i < 10; i++)
-	{
-		udelay(10);
-
-		command = IXGB_READ_REG(hw, MSCA);
-
-		if ((command & IXGB_MSCA_MDI_COMMAND) == 0)
-			break;
-	}
-
-	ASSERT((command & IXGB_MSCA_MDI_COMMAND) == 0);
-
-	/* Operation is complete, return. */
-}
-
-/******************************************************************************
- * Checks to see if the link status of the hardware has changed.
- *
- * hw - Struct containing variables accessed by hw code
- *
- * Called by any function that needs to check the link status of the adapter.
- *****************************************************************************/
-void
-ixgb_check_for_link(struct ixgb_hw *hw)
-{
-	u32 status_reg;
-	u32 xpcss_reg;
-
-	ENTER();
-
-	xpcss_reg = IXGB_READ_REG(hw, XPCSS);
-	status_reg = IXGB_READ_REG(hw, STATUS);
-
-	if ((xpcss_reg & IXGB_XPCSS_ALIGN_STATUS) &&
-	    (status_reg & IXGB_STATUS_LU)) {
-		hw->link_up = true;
-	} else if (!(xpcss_reg & IXGB_XPCSS_ALIGN_STATUS) &&
-		   (status_reg & IXGB_STATUS_LU)) {
-		pr_debug("XPCSS Not Aligned while Status:LU is set\n");
-		hw->link_up = ixgb_link_reset(hw);
-	} else {
-		/*
-		 * 82597EX errata. Since the lane deskew problem may prevent
-		 * link, reset the link before reporting link down.
-		 */
-		hw->link_up = ixgb_link_reset(hw);
-	}
-	/* Anything else for 10 Gig?? */
-}
-
-/******************************************************************************
- * Check for a bad link condition that may have occurred.
- * The indication is that the RFC / LFC registers may be incrementing
- * continually. A full adapter reset is required to recover.
- *
- * hw - Struct containing variables accessed by hw code
- *
- * Called by any function that needs to check the link status of the adapter.
- *****************************************************************************/
-bool ixgb_check_for_bad_link(struct ixgb_hw *hw)
-{
-	u32 newLFC, newRFC;
-	bool bad_link_returncode = false;
-
-	if (hw->phy_type == ixgb_phy_type_txn17401) {
-		newLFC = IXGB_READ_REG(hw, LFC);
-		newRFC = IXGB_READ_REG(hw, RFC);
-		if ((hw->lastLFC + 250 < newLFC)
-		    || (hw->lastRFC + 250 < newRFC)) {
-			pr_debug("BAD LINK! too many LFC/RFC since last check\n");
-			bad_link_returncode = true;
-		}
-		hw->lastLFC = newLFC;
-		hw->lastRFC = newRFC;
-	}
-
-	return bad_link_returncode;
-}
-
-/******************************************************************************
- * Clears all hardware statistics counters.
- *
- * hw - Struct containing variables accessed by shared code
- *****************************************************************************/
-static void
-ixgb_clear_hw_cntrs(struct ixgb_hw *hw)
-{
-	ENTER();
-
-	/* if we are stopped or resetting exit gracefully */
-	if (hw->adapter_stopped) {
-		pr_debug("Exiting because the adapter is stopped!!!\n");
-		return;
-	}
-
-	IXGB_READ_REG(hw, TPRL);
-	IXGB_READ_REG(hw, TPRH);
-	IXGB_READ_REG(hw, GPRCL);
-	IXGB_READ_REG(hw, GPRCH);
-	IXGB_READ_REG(hw, BPRCL);
-	IXGB_READ_REG(hw, BPRCH);
-	IXGB_READ_REG(hw, MPRCL);
-	IXGB_READ_REG(hw, MPRCH);
-	IXGB_READ_REG(hw, UPRCL);
-	IXGB_READ_REG(hw, UPRCH);
-	IXGB_READ_REG(hw, VPRCL);
-	IXGB_READ_REG(hw, VPRCH);
-	IXGB_READ_REG(hw, JPRCL);
-	IXGB_READ_REG(hw, JPRCH);
-	IXGB_READ_REG(hw, GORCL);
-	IXGB_READ_REG(hw, GORCH);
-	IXGB_READ_REG(hw, TORL);
-	IXGB_READ_REG(hw, TORH);
-	IXGB_READ_REG(hw, RNBC);
-	IXGB_READ_REG(hw, RUC);
-	IXGB_READ_REG(hw, ROC);
-	IXGB_READ_REG(hw, RLEC);
-	IXGB_READ_REG(hw, CRCERRS);
-	IXGB_READ_REG(hw, ICBC);
-	IXGB_READ_REG(hw, ECBC);
-	IXGB_READ_REG(hw, MPC);
-	IXGB_READ_REG(hw, TPTL);
-	IXGB_READ_REG(hw, TPTH);
-	IXGB_READ_REG(hw, GPTCL);
-	IXGB_READ_REG(hw, GPTCH);
-	IXGB_READ_REG(hw, BPTCL);
-	IXGB_READ_REG(hw, BPTCH);
-	IXGB_READ_REG(hw, MPTCL);
-	IXGB_READ_REG(hw,
MPTCH);
-	IXGB_READ_REG(hw, UPTCL);
-	IXGB_READ_REG(hw, UPTCH);
-	IXGB_READ_REG(hw, VPTCL);
-	IXGB_READ_REG(hw, VPTCH);
-	IXGB_READ_REG(hw, JPTCL);
-	IXGB_READ_REG(hw, JPTCH);
-	IXGB_READ_REG(hw, GOTCL);
-	IXGB_READ_REG(hw, GOTCH);
-	IXGB_READ_REG(hw, TOTL);
-	IXGB_READ_REG(hw, TOTH);
-	IXGB_READ_REG(hw, DC);
-	IXGB_READ_REG(hw, PLT64C);
-	IXGB_READ_REG(hw, TSCTC);
-	IXGB_READ_REG(hw, TSCTFC);
-	IXGB_READ_REG(hw, IBIC);
-	IXGB_READ_REG(hw, RFC);
-	IXGB_READ_REG(hw, LFC);
-	IXGB_READ_REG(hw, PFRC);
-	IXGB_READ_REG(hw, PFTC);
-	IXGB_READ_REG(hw, MCFRC);
-	IXGB_READ_REG(hw, MCFTC);
-	IXGB_READ_REG(hw, XONRXC);
-	IXGB_READ_REG(hw, XONTXC);
-	IXGB_READ_REG(hw, XOFFRXC);
-	IXGB_READ_REG(hw, XOFFTXC);
-	IXGB_READ_REG(hw, RJC);
-}
-
-/******************************************************************************
- * Turns on the software controllable LED
- *
- * hw - Struct containing variables accessed by shared code
- *****************************************************************************/
-void
-ixgb_led_on(struct ixgb_hw *hw)
-{
-	u32 ctrl0_reg = IXGB_READ_REG(hw, CTRL0);
-
-	/* To turn on the LED, clear software-definable pin 0 (SDP0). */
-	ctrl0_reg &= ~IXGB_CTRL0_SDP0;
-	IXGB_WRITE_REG(hw, CTRL0, ctrl0_reg);
-}
-
-/******************************************************************************
- * Turns off the software controllable LED
- *
- * hw - Struct containing variables accessed by shared code
- *****************************************************************************/
-void
-ixgb_led_off(struct ixgb_hw *hw)
-{
-	u32 ctrl0_reg = IXGB_READ_REG(hw, CTRL0);
-
-	/* To turn off the LED, set software-definable pin 0 (SDP0).
*/
-	ctrl0_reg |= IXGB_CTRL0_SDP0;
-	IXGB_WRITE_REG(hw, CTRL0, ctrl0_reg);
-}
-
-/******************************************************************************
- * Gets the current PCI bus type, speed, and width of the hardware
- *
- * hw - Struct containing variables accessed by shared code
- *****************************************************************************/
-static void
-ixgb_get_bus_info(struct ixgb_hw *hw)
-{
-	u32 status_reg;
-
-	status_reg = IXGB_READ_REG(hw, STATUS);
-
-	hw->bus.type = (status_reg & IXGB_STATUS_PCIX_MODE) ?
-		ixgb_bus_type_pcix : ixgb_bus_type_pci;
-
-	if (hw->bus.type == ixgb_bus_type_pci) {
-		hw->bus.speed = (status_reg & IXGB_STATUS_PCI_SPD) ?
-			ixgb_bus_speed_66 : ixgb_bus_speed_33;
-	} else {
-		switch (status_reg & IXGB_STATUS_PCIX_SPD_MASK) {
-		case IXGB_STATUS_PCIX_SPD_66:
-			hw->bus.speed = ixgb_bus_speed_66;
-			break;
-		case IXGB_STATUS_PCIX_SPD_100:
-			hw->bus.speed = ixgb_bus_speed_100;
-			break;
-		case IXGB_STATUS_PCIX_SPD_133:
-			hw->bus.speed = ixgb_bus_speed_133;
-			break;
-		default:
-			hw->bus.speed = ixgb_bus_speed_reserved;
-			break;
-		}
-	}
-
-	hw->bus.width = (status_reg & IXGB_STATUS_BUS64) ?
-		ixgb_bus_width_64 : ixgb_bus_width_32;
-}
-
-/******************************************************************************
- * Tests a MAC address to ensure it is a valid Individual Address
- *
- * mac_addr - pointer to MAC address.
- *
- *****************************************************************************/
-static bool
-mac_addr_valid(u8 *mac_addr)
-{
-	bool is_valid = true;
-	ENTER();
-
-	/* Make sure it is not a multicast address */
-	if (is_multicast_ether_addr(mac_addr)) {
-		pr_debug("MAC address is multicast\n");
-		is_valid = false;
-	}
-	/* Not a broadcast address */
-	else if (is_broadcast_ether_addr(mac_addr)) {
-		pr_debug("MAC address is broadcast\n");
-		is_valid = false;
-	}
-	/* Reject the zero address */
-	else if (is_zero_ether_addr(mac_addr)) {
-		pr_debug("MAC address is all zeros\n");
-		is_valid = false;
-	}
-	return is_valid;
-}
-
-/******************************************************************************
- * Resets the 10GbE link. Waits the settle time and returns the state of
- * the link.
- *
- * hw - Struct containing variables accessed by shared code
- *****************************************************************************/
-static bool
-ixgb_link_reset(struct ixgb_hw *hw)
-{
-	bool link_status = false;
-	u8 wait_retries = MAX_RESET_ITERATIONS;
-	u8 lrst_retries = MAX_RESET_ITERATIONS;
-
-	do {
-		/* Reset the link */
-		IXGB_WRITE_REG(hw, CTRL0,
-			       IXGB_READ_REG(hw, CTRL0) | IXGB_CTRL0_LRST);
-
-		/* Wait for link-up and lane re-alignment */
-		do {
-			udelay(IXGB_DELAY_USECS_AFTER_LINK_RESET);
-			link_status =
-			    ((IXGB_READ_REG(hw, STATUS) & IXGB_STATUS_LU)
-			     && (IXGB_READ_REG(hw, XPCSS) &
-				 IXGB_XPCSS_ALIGN_STATUS)) ? true : false;
-		} while (!link_status && --wait_retries);
-
-	} while (!link_status && --lrst_retries);
-
-	return link_status;
-}
-
-/******************************************************************************
- * Resets the 10GbE optics module.
- *
- * hw - Struct containing variables accessed by shared code
- *****************************************************************************/
-static void
-ixgb_optics_reset(struct ixgb_hw *hw)
-{
-	if (hw->phy_type == ixgb_phy_type_txn17401) {
-		ixgb_write_phy_reg(hw,
-				   MDIO_CTRL1,
-				   IXGB_PHY_ADDRESS,
-				   MDIO_MMD_PMAPMD,
-				   MDIO_CTRL1_RESET);
-
-		ixgb_read_phy_reg(hw, MDIO_CTRL1, IXGB_PHY_ADDRESS, MDIO_MMD_PMAPMD);
-	}
-}
-
-/******************************************************************************
- * Resets the 10GbE optics module for Sun variant NIC.
- *
- * hw - Struct containing variables accessed by shared code
- *****************************************************************************/
-
-#define IXGB_BCM8704_USER_PMD_TX_CTRL_REG 0xC803
-#define IXGB_BCM8704_USER_PMD_TX_CTRL_REG_VAL 0x0164
-#define IXGB_BCM8704_USER_CTRL_REG 0xC800
-#define IXGB_BCM8704_USER_CTRL_REG_VAL 0x7FBF
-#define IXGB_BCM8704_USER_DEV3_ADDR 0x0003
-#define IXGB_SUN_PHY_ADDRESS 0x0000
-#define IXGB_SUN_PHY_RESET_DELAY 305
-
-static void
-ixgb_optics_reset_bcm(struct ixgb_hw *hw)
-{
-	u32 ctrl = IXGB_READ_REG(hw, CTRL0);
-	ctrl &= ~IXGB_CTRL0_SDP2;
-	ctrl |= IXGB_CTRL0_SDP3;
-	IXGB_WRITE_REG(hw, CTRL0, ctrl);
-	IXGB_WRITE_FLUSH(hw);
-
-	/* SerDes needs extra delay */
-	msleep(IXGB_SUN_PHY_RESET_DELAY);
-
-	/* Broadcom 7408L configuration */
-	/* Reference clock config */
-	ixgb_write_phy_reg(hw,
-			   IXGB_BCM8704_USER_PMD_TX_CTRL_REG,
-			   IXGB_SUN_PHY_ADDRESS,
-			   IXGB_BCM8704_USER_DEV3_ADDR,
-			   IXGB_BCM8704_USER_PMD_TX_CTRL_REG_VAL);
-	/* we must read the registers twice */
-	ixgb_read_phy_reg(hw,
-			  IXGB_BCM8704_USER_PMD_TX_CTRL_REG,
-			  IXGB_SUN_PHY_ADDRESS,
-			  IXGB_BCM8704_USER_DEV3_ADDR);
-	ixgb_read_phy_reg(hw,
-			  IXGB_BCM8704_USER_PMD_TX_CTRL_REG,
-			  IXGB_SUN_PHY_ADDRESS,
-			  IXGB_BCM8704_USER_DEV3_ADDR);
-
-	ixgb_write_phy_reg(hw,
-			   IXGB_BCM8704_USER_CTRL_REG,
-			   IXGB_SUN_PHY_ADDRESS,
-			   IXGB_BCM8704_USER_DEV3_ADDR,
-			   IXGB_BCM8704_USER_CTRL_REG_VAL);
-	ixgb_read_phy_reg(hw, -
IXGB_BCM8704_USER_CTRL_REG,
-			  IXGB_SUN_PHY_ADDRESS,
-			  IXGB_BCM8704_USER_DEV3_ADDR);
-	ixgb_read_phy_reg(hw,
-			  IXGB_BCM8704_USER_CTRL_REG,
-			  IXGB_SUN_PHY_ADDRESS,
-			  IXGB_BCM8704_USER_DEV3_ADDR);
-
-	/* SerDes needs extra delay */
-	msleep(IXGB_SUN_PHY_RESET_DELAY);
-}
diff --git a/drivers/net/ethernet/intel/ixgb/ixgb_hw.h b/drivers/net/ethernet/intel/ixgb/ixgb_hw.h
deleted file mode 100644
index 70bcff5fb3db..000000000000
--- a/drivers/net/ethernet/intel/ixgb/ixgb_hw.h
+++ /dev/null
@@ -1,767 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright(c) 1999 - 2008 Intel Corporation. */
-
-#ifndef _IXGB_HW_H_
-#define _IXGB_HW_H_
-
-#include <linux/mdio.h>
-
-#include "ixgb_osdep.h"
-
-/* Enums */
-typedef enum {
-	ixgb_mac_unknown = 0,
-	ixgb_82597,
-	ixgb_num_macs
-} ixgb_mac_type;
-
-/* Types of physical layer modules */
-typedef enum {
-	ixgb_phy_type_unknown = 0,
-	ixgb_phy_type_g6005, /* 850nm, MM fiber, XPAK transceiver */
-	ixgb_phy_type_g6104, /* 1310nm, SM fiber, XPAK transceiver */
-	ixgb_phy_type_txn17201, /* 850nm, MM fiber, XPAK transceiver */
-	ixgb_phy_type_txn17401, /* 1310nm, SM fiber, XENPAK transceiver */
-	ixgb_phy_type_bcm /* SUN specific board */
-} ixgb_phy_type;
-
-/* XPAK transceiver vendors, for the SR adapters */
-typedef enum {
-	ixgb_xpak_vendor_intel,
-	ixgb_xpak_vendor_infineon
-} ixgb_xpak_vendor;
-
-/* Media Types */
-typedef enum {
-	ixgb_media_type_unknown = 0,
-	ixgb_media_type_fiber = 1,
-	ixgb_media_type_copper = 2,
-	ixgb_num_media_types
-} ixgb_media_type;
-
-/* Flow Control Settings */
-typedef enum {
-	ixgb_fc_none = 0,
-	ixgb_fc_rx_pause = 1,
-	ixgb_fc_tx_pause = 2,
-	ixgb_fc_full = 3,
-	ixgb_fc_default = 0xFF
-} ixgb_fc_type;
-
-/* PCI bus types */
-typedef enum {
-	ixgb_bus_type_unknown = 0,
-	ixgb_bus_type_pci,
-	ixgb_bus_type_pcix
-} ixgb_bus_type;
-
-/* PCI bus speeds */
-typedef enum {
-	ixgb_bus_speed_unknown = 0,
-	ixgb_bus_speed_33,
-	ixgb_bus_speed_66,
-	ixgb_bus_speed_100,
-	ixgb_bus_speed_133, -
ixgb_bus_speed_reserved
-} ixgb_bus_speed;
-
-/* PCI bus widths */
-typedef enum {
-	ixgb_bus_width_unknown = 0,
-	ixgb_bus_width_32,
-	ixgb_bus_width_64
-} ixgb_bus_width;
-
-#define IXGB_EEPROM_SIZE 64 /* Size in words */
-
-#define SPEED_10000 10000
-#define FULL_DUPLEX 2
-
-#define MIN_NUMBER_OF_DESCRIPTORS 8
-#define MAX_NUMBER_OF_DESCRIPTORS 0xFFF8 /* 13 bits in RDLEN/TDLEN, 128B aligned */
-
-#define IXGB_DELAY_BEFORE_RESET 10 /* allow 10ms after idling rx/tx units */
-#define IXGB_DELAY_AFTER_RESET 1 /* allow 1ms after the reset */
-#define IXGB_DELAY_AFTER_EE_RESET 10 /* allow 10ms after the EEPROM reset */
-
-#define IXGB_DELAY_USECS_AFTER_LINK_RESET 13 /* allow 13 microseconds after the reset */
-					/* NOTE: this is MICROSECONDS */
-#define MAX_RESET_ITERATIONS 8 /* number of iterations to get things right */
-
-/* General Registers */
-#define IXGB_CTRL0 0x00000 /* Device Control Register 0 - RW */
-#define IXGB_CTRL1 0x00008 /* Device Control Register 1 - RW */
-#define IXGB_STATUS 0x00010 /* Device Status Register - RO */
-#define IXGB_EECD 0x00018 /* EEPROM/Flash Control/Data Register - RW */
-#define IXGB_MFS 0x00020 /* Maximum Frame Size - RW */
-
-/* Interrupt */
-#define IXGB_ICR 0x00080 /* Interrupt Cause Read - R/clr */
-#define IXGB_ICS 0x00088 /* Interrupt Cause Set - RW */
-#define IXGB_IMS 0x00090 /* Interrupt Mask Set/Read - RW */
-#define IXGB_IMC 0x00098 /* Interrupt Mask Clear - WO */
-
-/* Receive */
-#define IXGB_RCTL 0x00100 /* RX Control - RW */
-#define IXGB_FCRTL 0x00108 /* Flow Control Receive Threshold Low - RW */
-#define IXGB_FCRTH 0x00110 /* Flow Control Receive Threshold High - RW */
-#define IXGB_RDBAL 0x00118 /* RX Descriptor Base Low - RW */
-#define IXGB_RDBAH 0x0011C /* RX Descriptor Base High - RW */
-#define IXGB_RDLEN 0x00120 /* RX Descriptor Length - RW */
-#define IXGB_RDH 0x00128 /* RX Descriptor Head - RW */
-#define IXGB_RDT 0x00130 /* RX Descriptor Tail - RW */
-#define IXGB_RDTR 0x00138 /* RX Delay Timer Ring -
RW */
-#define IXGB_RXDCTL 0x00140 /* Receive Descriptor Control - RW */
-#define IXGB_RAIDC 0x00148 /* Receive Adaptive Interrupt Delay Control - RW */
-#define IXGB_RXCSUM 0x00158 /* Receive Checksum Control - RW */
-#define IXGB_RA 0x00180 /* Receive Address Array Base - RW */
-#define IXGB_RAL 0x00180 /* Receive Address Low [0:15] - RW */
-#define IXGB_RAH 0x00184 /* Receive Address High [0:15] - RW */
-#define IXGB_MTA 0x00200 /* Multicast Table Array [0:127] - RW */
-#define IXGB_VFTA 0x00400 /* VLAN Filter Table Array [0:127] - RW */
-#define IXGB_REQ_RX_DESCRIPTOR_MULTIPLE 8
-
-/* Transmit */
-#define IXGB_TCTL 0x00600 /* TX Control - RW */
-#define IXGB_TDBAL 0x00608 /* TX Descriptor Base Low - RW */
-#define IXGB_TDBAH 0x0060C /* TX Descriptor Base High - RW */
-#define IXGB_TDLEN 0x00610 /* TX Descriptor Length - RW */
-#define IXGB_TDH 0x00618 /* TX Descriptor Head - RW */
-#define IXGB_TDT 0x00620 /* TX Descriptor Tail - RW */
-#define IXGB_TIDV 0x00628 /* TX Interrupt Delay Value - RW */
-#define IXGB_TXDCTL 0x00630 /* Transmit Descriptor Control - RW */
-#define IXGB_TSPMT 0x00638 /* TCP Segmentation PAD & Min Threshold - RW */
-#define IXGB_PAP 0x00640 /* Pause and Pace - RW */
-#define IXGB_REQ_TX_DESCRIPTOR_MULTIPLE 8
-
-/* Physical */
-#define IXGB_PCSC1 0x00700 /* PCS Control 1 - RW */
-#define IXGB_PCSC2 0x00708 /* PCS Control 2 - RW */
-#define IXGB_PCSS1 0x00710 /* PCS Status 1 - RO */
-#define IXGB_PCSS2 0x00718 /* PCS Status 2 - RO */
-#define IXGB_XPCSS 0x00720 /* 10GBASE-X PCS Status (or XGXS Lane Status) - RO */
-#define IXGB_UCCR 0x00728 /* Unilink Circuit Control Register */
-#define IXGB_XPCSTC 0x00730 /* 10GBASE-X PCS Test Control */
-#define IXGB_MACA 0x00738 /* MDI Autoscan Command and Address - RW */
-#define IXGB_APAE 0x00740 /* Autoscan PHY Address Enable - RW */
-#define IXGB_ARD 0x00748 /* Autoscan Read Data - RO */
-#define IXGB_AIS 0x00750 /* Autoscan Interrupt Status - RO */
-#define IXGB_MSCA 0x00758 /* MDI Single Command
and Address - RW */ -#define IXGB_MSRWD 0x00760 /* MDI Single Read and Write Data - RW, RO */ - -/* Wake-up */ -#define IXGB_WUFC 0x00808 /* Wake Up Filter Control - RW */ -#define IXGB_WUS 0x00810 /* Wake Up Status - RO */ -#define IXGB_FFLT 0x01000 /* Flexible Filter Length Table - RW */ -#define IXGB_FFMT 0x01020 /* Flexible Filter Mask Table - RW */ -#define IXGB_FTVT 0x01420 /* Flexible Filter Value Table - RW */ - -/* Statistics */ -#define IXGB_TPRL 0x02000 /* Total Packets Received (Low) */ -#define IXGB_TPRH 0x02004 /* Total Packets Received (High) */ -#define IXGB_GPRCL 0x02008 /* Good Packets Received Count (Low) */ -#define IXGB_GPRCH 0x0200C /* Good Packets Received Count (High) */ -#define IXGB_BPRCL 0x02010 /* Broadcast Packets Received Count (Low) */ -#define IXGB_BPRCH 0x02014 /* Broadcast Packets Received Count (High) */ -#define IXGB_MPRCL 0x02018 /* Multicast Packets Received Count (Low) */ -#define IXGB_MPRCH 0x0201C /* Multicast Packets Received Count (High) */ -#define IXGB_UPRCL 0x02020 /* Unicast Packets Received Count (Low) */ -#define IXGB_UPRCH 0x02024 /* Unicast Packets Received Count (High) */ -#define IXGB_VPRCL 0x02028 /* VLAN Packets Received Count (Low) */ -#define IXGB_VPRCH 0x0202C /* VLAN Packets Received Count (High) */ -#define IXGB_JPRCL 0x02030 /* Jumbo Packets Received Count (Low) */ -#define IXGB_JPRCH 0x02034 /* Jumbo Packets Received Count (High) */ -#define IXGB_GORCL 0x02038 /* Good Octets Received Count (Low) */ -#define IXGB_GORCH 0x0203C /* Good Octets Received Count (High) */ -#define IXGB_TORL 0x02040 /* Total Octets Received (Low) */ -#define IXGB_TORH 0x02044 /* Total Octets Received (High) */ -#define IXGB_RNBC 0x02048 /* Receive No Buffers Count */ -#define IXGB_RUC 0x02050 /* Receive Undersize Count */ -#define IXGB_ROC 0x02058 /* Receive Oversize Count */ -#define IXGB_RLEC 0x02060 /* Receive Length Error Count */ -#define IXGB_CRCERRS 0x02068 /* CRC Error Count */ -#define IXGB_ICBC 0x02070 /* Illegal 
control byte in mid-packet Count */ -#define IXGB_ECBC 0x02078 /* Error Control byte in mid-packet Count */ -#define IXGB_MPC 0x02080 /* Missed Packets Count */ -#define IXGB_TPTL 0x02100 /* Total Packets Transmitted (Low) */ -#define IXGB_TPTH 0x02104 /* Total Packets Transmitted (High) */ -#define IXGB_GPTCL 0x02108 /* Good Packets Transmitted Count (Low) */ -#define IXGB_GPTCH 0x0210C /* Good Packets Transmitted Count (High) */ -#define IXGB_BPTCL 0x02110 /* Broadcast Packets Transmitted Count (Low) */ -#define IXGB_BPTCH 0x02114 /* Broadcast Packets Transmitted Count (High) */ -#define IXGB_MPTCL 0x02118 /* Multicast Packets Transmitted Count (Low) */ -#define IXGB_MPTCH 0x0211C /* Multicast Packets Transmitted Count (High) */ -#define IXGB_UPTCL 0x02120 /* Unicast Packets Transmitted Count (Low) */ -#define IXGB_UPTCH 0x02124 /* Unicast Packets Transmitted Count (High) */ -#define IXGB_VPTCL 0x02128 /* VLAN Packets Transmitted Count (Low) */ -#define IXGB_VPTCH 0x0212C /* VLAN Packets Transmitted Count (High) */ -#define IXGB_JPTCL 0x02130 /* Jumbo Packets Transmitted Count (Low) */ -#define IXGB_JPTCH 0x02134 /* Jumbo Packets Transmitted Count (High) */ -#define IXGB_GOTCL 0x02138 /* Good Octets Transmitted Count (Low) */ -#define IXGB_GOTCH 0x0213C /* Good Octets Transmitted Count (High) */ -#define IXGB_TOTL 0x02140 /* Total Octets Transmitted Count (Low) */ -#define IXGB_TOTH 0x02144 /* Total Octets Transmitted Count (High) */ -#define IXGB_DC 0x02148 /* Defer Count */ -#define IXGB_PLT64C 0x02150 /* Packet Transmitted was less than 64 bytes Count */ -#define IXGB_TSCTC 0x02170 /* TCP Segmentation Context Transmitted Count */ -#define IXGB_TSCTFC 0x02178 /* TCP Segmentation Context Tx Fail Count */ -#define IXGB_IBIC 0x02180 /* Illegal byte during Idle stream count */ -#define IXGB_RFC 0x02188 /* Remote Fault Count */ -#define IXGB_LFC 0x02190 /* Local Fault Count */ -#define IXGB_PFRC 0x02198 /* Pause Frame Receive Count */ -#define IXGB_PFTC 0x021A0 /* 
Pause Frame Transmit Count */ -#define IXGB_MCFRC 0x021A8 /* MAC Control Frames (non-Pause) Received Count */ -#define IXGB_MCFTC 0x021B0 /* MAC Control Frames (non-Pause) Transmitted Count */ -#define IXGB_XONRXC 0x021B8 /* XON Received Count */ -#define IXGB_XONTXC 0x021C0 /* XON Transmitted Count */ -#define IXGB_XOFFRXC 0x021C8 /* XOFF Received Count */ -#define IXGB_XOFFTXC 0x021D0 /* XOFF Transmitted Count */ -#define IXGB_RJC 0x021D8 /* Receive Jabber Count */ - -/* CTRL0 Bit Masks */ -#define IXGB_CTRL0_LRST 0x00000008 -#define IXGB_CTRL0_JFE 0x00000010 -#define IXGB_CTRL0_XLE 0x00000020 -#define IXGB_CTRL0_MDCS 0x00000040 -#define IXGB_CTRL0_CMDC 0x00000080 -#define IXGB_CTRL0_SDP0 0x00040000 -#define IXGB_CTRL0_SDP1 0x00080000 -#define IXGB_CTRL0_SDP2 0x00100000 -#define IXGB_CTRL0_SDP3 0x00200000 -#define IXGB_CTRL0_SDP0_DIR 0x00400000 -#define IXGB_CTRL0_SDP1_DIR 0x00800000 -#define IXGB_CTRL0_SDP2_DIR 0x01000000 -#define IXGB_CTRL0_SDP3_DIR 0x02000000 -#define IXGB_CTRL0_RST 0x04000000 -#define IXGB_CTRL0_RPE 0x08000000 -#define IXGB_CTRL0_TPE 0x10000000 -#define IXGB_CTRL0_VME 0x40000000 - -/* CTRL1 Bit Masks */ -#define IXGB_CTRL1_GPI0_EN 0x00000001 -#define IXGB_CTRL1_GPI1_EN 0x00000002 -#define IXGB_CTRL1_GPI2_EN 0x00000004 -#define IXGB_CTRL1_GPI3_EN 0x00000008 -#define IXGB_CTRL1_SDP4 0x00000010 -#define IXGB_CTRL1_SDP5 0x00000020 -#define IXGB_CTRL1_SDP6 0x00000040 -#define IXGB_CTRL1_SDP7 0x00000080 -#define IXGB_CTRL1_SDP4_DIR 0x00000100 -#define IXGB_CTRL1_SDP5_DIR 0x00000200 -#define IXGB_CTRL1_SDP6_DIR 0x00000400 -#define IXGB_CTRL1_SDP7_DIR 0x00000800 -#define IXGB_CTRL1_EE_RST 0x00002000 -#define IXGB_CTRL1_RO_DIS 0x00020000 -#define IXGB_CTRL1_PCIXHM_MASK 0x00C00000 -#define IXGB_CTRL1_PCIXHM_1_2 0x00000000 -#define IXGB_CTRL1_PCIXHM_5_8 0x00400000 -#define IXGB_CTRL1_PCIXHM_3_4 0x00800000 -#define IXGB_CTRL1_PCIXHM_7_8 0x00C00000 - -/* STATUS Bit Masks */ -#define IXGB_STATUS_LU 0x00000002 -#define IXGB_STATUS_AIP 0x00000004 -#define 
IXGB_STATUS_TXOFF 0x00000010 -#define IXGB_STATUS_XAUIME 0x00000020 -#define IXGB_STATUS_RES 0x00000040 -#define IXGB_STATUS_RIS 0x00000080 -#define IXGB_STATUS_RIE 0x00000100 -#define IXGB_STATUS_RLF 0x00000200 -#define IXGB_STATUS_RRF 0x00000400 -#define IXGB_STATUS_PCI_SPD 0x00000800 -#define IXGB_STATUS_BUS64 0x00001000 -#define IXGB_STATUS_PCIX_MODE 0x00002000 -#define IXGB_STATUS_PCIX_SPD_MASK 0x0000C000 -#define IXGB_STATUS_PCIX_SPD_66 0x00000000 -#define IXGB_STATUS_PCIX_SPD_100 0x00004000 -#define IXGB_STATUS_PCIX_SPD_133 0x00008000 -#define IXGB_STATUS_REV_ID_MASK 0x000F0000 -#define IXGB_STATUS_REV_ID_SHIFT 16 - -/* EECD Bit Masks */ -#define IXGB_EECD_SK 0x00000001 -#define IXGB_EECD_CS 0x00000002 -#define IXGB_EECD_DI 0x00000004 -#define IXGB_EECD_DO 0x00000008 -#define IXGB_EECD_FWE_MASK 0x00000030 -#define IXGB_EECD_FWE_DIS 0x00000010 -#define IXGB_EECD_FWE_EN 0x00000020 - -/* MFS */ -#define IXGB_MFS_SHIFT 16 - -/* Interrupt Register Bit Masks (used for ICR, ICS, IMS, and IMC) */ -#define IXGB_INT_TXDW 0x00000001 -#define IXGB_INT_TXQE 0x00000002 -#define IXGB_INT_LSC 0x00000004 -#define IXGB_INT_RXSEQ 0x00000008 -#define IXGB_INT_RXDMT0 0x00000010 -#define IXGB_INT_RXO 0x00000040 -#define IXGB_INT_RXT0 0x00000080 -#define IXGB_INT_AUTOSCAN 0x00000200 -#define IXGB_INT_GPI0 0x00000800 -#define IXGB_INT_GPI1 0x00001000 -#define IXGB_INT_GPI2 0x00002000 -#define IXGB_INT_GPI3 0x00004000 - -/* RCTL Bit Masks */ -#define IXGB_RCTL_RXEN 0x00000002 -#define IXGB_RCTL_SBP 0x00000004 -#define IXGB_RCTL_UPE 0x00000008 -#define IXGB_RCTL_MPE 0x00000010 -#define IXGB_RCTL_RDMTS_MASK 0x00000300 -#define IXGB_RCTL_RDMTS_1_2 0x00000000 -#define IXGB_RCTL_RDMTS_1_4 0x00000100 -#define IXGB_RCTL_RDMTS_1_8 0x00000200 -#define IXGB_RCTL_MO_MASK 0x00003000 -#define IXGB_RCTL_MO_47_36 0x00000000 -#define IXGB_RCTL_MO_46_35 0x00001000 -#define IXGB_RCTL_MO_45_34 0x00002000 -#define IXGB_RCTL_MO_43_32 0x00003000 -#define IXGB_RCTL_MO_SHIFT 12 -#define IXGB_RCTL_BAM 
0x00008000 -#define IXGB_RCTL_BSIZE_MASK 0x00030000 -#define IXGB_RCTL_BSIZE_2048 0x00000000 -#define IXGB_RCTL_BSIZE_4096 0x00010000 -#define IXGB_RCTL_BSIZE_8192 0x00020000 -#define IXGB_RCTL_BSIZE_16384 0x00030000 -#define IXGB_RCTL_VFE 0x00040000 -#define IXGB_RCTL_CFIEN 0x00080000 -#define IXGB_RCTL_CFI 0x00100000 -#define IXGB_RCTL_RPDA_MASK 0x00600000 -#define IXGB_RCTL_RPDA_MC_MAC 0x00000000 -#define IXGB_RCTL_MC_ONLY 0x00400000 -#define IXGB_RCTL_CFF 0x00800000 -#define IXGB_RCTL_SECRC 0x04000000 -#define IXGB_RDT_FPDB 0x80000000 - -#define IXGB_RCTL_IDLE_RX_UNIT 0 - -/* FCRTL Bit Masks */ -#define IXGB_FCRTL_XONE 0x80000000 - -/* RXDCTL Bit Masks */ -#define IXGB_RXDCTL_PTHRESH_MASK 0x000001FF -#define IXGB_RXDCTL_PTHRESH_SHIFT 0 -#define IXGB_RXDCTL_HTHRESH_MASK 0x0003FE00 -#define IXGB_RXDCTL_HTHRESH_SHIFT 9 -#define IXGB_RXDCTL_WTHRESH_MASK 0x07FC0000 -#define IXGB_RXDCTL_WTHRESH_SHIFT 18 - -/* RAIDC Bit Masks */ -#define IXGB_RAIDC_HIGHTHRS_MASK 0x0000003F -#define IXGB_RAIDC_DELAY_MASK 0x000FF800 -#define IXGB_RAIDC_DELAY_SHIFT 11 -#define IXGB_RAIDC_POLL_MASK 0x1FF00000 -#define IXGB_RAIDC_POLL_SHIFT 20 -#define IXGB_RAIDC_RXT_GATE 0x40000000 -#define IXGB_RAIDC_EN 0x80000000 - -#define IXGB_RAIDC_POLL_1000_INTERRUPTS_PER_SECOND 1220 -#define IXGB_RAIDC_POLL_5000_INTERRUPTS_PER_SECOND 244 -#define IXGB_RAIDC_POLL_10000_INTERRUPTS_PER_SECOND 122 -#define IXGB_RAIDC_POLL_20000_INTERRUPTS_PER_SECOND 61 - -/* RXCSUM Bit Masks */ -#define IXGB_RXCSUM_IPOFL 0x00000100 -#define IXGB_RXCSUM_TUOFL 0x00000200 - -/* RAH Bit Masks */ -#define IXGB_RAH_ASEL_MASK 0x00030000 -#define IXGB_RAH_ASEL_DEST 0x00000000 -#define IXGB_RAH_ASEL_SRC 0x00010000 -#define IXGB_RAH_AV 0x80000000 - -/* TCTL Bit Masks */ -#define IXGB_TCTL_TCE 0x00000001 -#define IXGB_TCTL_TXEN 0x00000002 -#define IXGB_TCTL_TPDE 0x00000004 - -#define IXGB_TCTL_IDLE_TX_UNIT 0 - -/* TXDCTL Bit Masks */ -#define IXGB_TXDCTL_PTHRESH_MASK 0x0000007F -#define IXGB_TXDCTL_HTHRESH_MASK 0x00007F00 
-#define IXGB_TXDCTL_HTHRESH_SHIFT 8
-#define IXGB_TXDCTL_WTHRESH_MASK 0x007F0000
-#define IXGB_TXDCTL_WTHRESH_SHIFT 16
-
-/* TSPMT Bit Masks */
-#define IXGB_TSPMT_TSMT_MASK 0x0000FFFF
-#define IXGB_TSPMT_TSPBP_MASK 0xFFFF0000
-#define IXGB_TSPMT_TSPBP_SHIFT 16
-
-/* PAP Bit Masks */
-#define IXGB_PAP_TXPC_MASK 0x0000FFFF
-#define IXGB_PAP_TXPV_MASK 0x000F0000
-#define IXGB_PAP_TXPV_10G 0x00000000
-#define IXGB_PAP_TXPV_1G 0x00010000
-#define IXGB_PAP_TXPV_2G 0x00020000
-#define IXGB_PAP_TXPV_3G 0x00030000
-#define IXGB_PAP_TXPV_4G 0x00040000
-#define IXGB_PAP_TXPV_5G 0x00050000
-#define IXGB_PAP_TXPV_6G 0x00060000
-#define IXGB_PAP_TXPV_7G 0x00070000
-#define IXGB_PAP_TXPV_8G 0x00080000
-#define IXGB_PAP_TXPV_9G 0x00090000
-#define IXGB_PAP_TXPV_WAN 0x000F0000
-
-/* PCSC1 Bit Masks */
-#define IXGB_PCSC1_LOOPBACK 0x00004000
-
-/* PCSC2 Bit Masks */
-#define IXGB_PCSC2_PCS_TYPE_MASK 0x00000003
-#define IXGB_PCSC2_PCS_TYPE_10GBX 0x00000001
-
-/* PCSS1 Bit Masks */
-#define IXGB_PCSS1_LOCAL_FAULT 0x00000080
-#define IXGB_PCSS1_RX_LINK_STATUS 0x00000004
-
-/* PCSS2 Bit Masks */
-#define IXGB_PCSS2_DEV_PRES_MASK 0x0000C000
-#define IXGB_PCSS2_DEV_PRES 0x00004000
-#define IXGB_PCSS2_TX_LF 0x00000800
-#define IXGB_PCSS2_RX_LF 0x00000400
-#define IXGB_PCSS2_10GBW 0x00000004
-#define IXGB_PCSS2_10GBX 0x00000002
-#define IXGB_PCSS2_10GBR 0x00000001
-
-/* XPCSS Bit Masks */
-#define IXGB_XPCSS_ALIGN_STATUS 0x00001000
-#define IXGB_XPCSS_PATTERN_TEST 0x00000800
-#define IXGB_XPCSS_LANE_3_SYNC 0x00000008
-#define IXGB_XPCSS_LANE_2_SYNC 0x00000004
-#define IXGB_XPCSS_LANE_1_SYNC 0x00000002
-#define IXGB_XPCSS_LANE_0_SYNC 0x00000001
-
-/* XPCSTC Bit Masks */
-#define IXGB_XPCSTC_BERT_TRIG 0x00200000
-#define IXGB_XPCSTC_BERT_SST 0x00100000
-#define IXGB_XPCSTC_BERT_PSZ_MASK 0x000C0000
-#define IXGB_XPCSTC_BERT_PSZ_SHIFT 17
-#define IXGB_XPCSTC_BERT_PSZ_INF 0x00000003
-#define IXGB_XPCSTC_BERT_PSZ_68 0x00000001
-#define IXGB_XPCSTC_BERT_PSZ_1028 0x00000000
-
-/* MSCA bit Masks */
-/* New Protocol Address */
-#define IXGB_MSCA_NP_ADDR_MASK 0x0000FFFF
-#define IXGB_MSCA_NP_ADDR_SHIFT 0
-/* Either Device Type or Register Address,depending on ST_CODE */
-#define IXGB_MSCA_DEV_TYPE_MASK 0x001F0000
-#define IXGB_MSCA_DEV_TYPE_SHIFT 16
-#define IXGB_MSCA_PHY_ADDR_MASK 0x03E00000
-#define IXGB_MSCA_PHY_ADDR_SHIFT 21
-#define IXGB_MSCA_OP_CODE_MASK 0x0C000000
-/* OP_CODE == 00, Address cycle, New Protocol */
-/* OP_CODE == 01, Write operation */
-/* OP_CODE == 10, Read operation */
-/* OP_CODE == 11, Read, auto increment, New Protocol */
-#define IXGB_MSCA_ADDR_CYCLE 0x00000000
-#define IXGB_MSCA_WRITE 0x04000000
-#define IXGB_MSCA_READ 0x08000000
-#define IXGB_MSCA_READ_AUTOINC 0x0C000000
-#define IXGB_MSCA_OP_CODE_SHIFT 26
-#define IXGB_MSCA_ST_CODE_MASK 0x30000000
-/* ST_CODE == 00, New Protocol */
-/* ST_CODE == 01, Old Protocol */
-#define IXGB_MSCA_NEW_PROTOCOL 0x00000000
-#define IXGB_MSCA_OLD_PROTOCOL 0x10000000
-#define IXGB_MSCA_ST_CODE_SHIFT 28
-/* Initiate command, self-clearing when command completes */
-#define IXGB_MSCA_MDI_COMMAND 0x40000000
-/*MDI In Progress Enable. */
-#define IXGB_MSCA_MDI_IN_PROG_EN 0x80000000
-
-/* MSRWD bit masks */
-#define IXGB_MSRWD_WRITE_DATA_MASK 0x0000FFFF
-#define IXGB_MSRWD_WRITE_DATA_SHIFT 0
-#define IXGB_MSRWD_READ_DATA_MASK 0xFFFF0000
-#define IXGB_MSRWD_READ_DATA_SHIFT 16
-
-/* Definitions for the optics devices on the MDIO bus. */
-#define IXGB_PHY_ADDRESS 0x0 /* Single PHY, multiple "Devices" */
-
-#define MDIO_PMA_PMD_XPAK_VENDOR_NAME 0x803A /* XPAK/XENPAK devices only */
-
-/* Vendor-specific MDIO registers */
-#define G6XXX_PMA_PMD_VS1 0xC001 /* Vendor-specific register */
-#define G6XXX_XGXS_XAUI_VS2 0x18 /* Vendor-specific register */
-
-#define G6XXX_PMA_PMD_VS1_PLL_RESET 0x80
-#define G6XXX_PMA_PMD_VS1_REMOVE_PLL_RESET 0x00
-#define G6XXX_XGXS_XAUI_VS2_INPUT_MASK 0x0F /* XAUI lanes synchronized */
-
-/* Layout of a single receive descriptor. The controller assumes that this
- * structure is packed into 16 bytes, which is a safe assumption with most
- * compilers. However, some compilers may insert padding between the fields,
- * in which case the structure must be packed in some compiler-specific
- * manner. */
-struct ixgb_rx_desc {
-	__le64 buff_addr;
-	__le16 length;
-	__le16 reserved;
-	u8 status;
-	u8 errors;
-	__le16 special;
-};
-
-#define IXGB_RX_DESC_STATUS_DD 0x01
-#define IXGB_RX_DESC_STATUS_EOP 0x02
-#define IXGB_RX_DESC_STATUS_IXSM 0x04
-#define IXGB_RX_DESC_STATUS_VP 0x08
-#define IXGB_RX_DESC_STATUS_TCPCS 0x20
-#define IXGB_RX_DESC_STATUS_IPCS 0x40
-#define IXGB_RX_DESC_STATUS_PIF 0x80
-
-#define IXGB_RX_DESC_ERRORS_CE 0x01
-#define IXGB_RX_DESC_ERRORS_SE 0x02
-#define IXGB_RX_DESC_ERRORS_P 0x08
-#define IXGB_RX_DESC_ERRORS_TCPE 0x20
-#define IXGB_RX_DESC_ERRORS_IPE 0x40
-#define IXGB_RX_DESC_ERRORS_RXE 0x80
-
-#define IXGB_RX_DESC_SPECIAL_VLAN_MASK 0x0FFF /* VLAN ID is in lower 12 bits */
-#define IXGB_RX_DESC_SPECIAL_PRI_MASK 0xE000 /* Priority is in upper 3 bits */
-#define IXGB_RX_DESC_SPECIAL_PRI_SHIFT 0x000D /* Priority is in upper 3 of 16 */
-
-/* Layout of a single transmit descriptor. The controller assumes that this
- * structure is packed into 16 bytes, which is a safe assumption with most
- * compilers. However, some compilers may insert padding between the fields,
- * in which case the structure must be packed in some compiler-specific
- * manner. */
-struct ixgb_tx_desc {
-	__le64 buff_addr;
-	__le32 cmd_type_len;
-	u8 status;
-	u8 popts;
-	__le16 vlan;
-};
-
-#define IXGB_TX_DESC_LENGTH_MASK 0x000FFFFF
-#define IXGB_TX_DESC_TYPE_MASK 0x00F00000
-#define IXGB_TX_DESC_TYPE_SHIFT 20
-#define IXGB_TX_DESC_CMD_MASK 0xFF000000
-#define IXGB_TX_DESC_CMD_SHIFT 24
-#define IXGB_TX_DESC_CMD_EOP 0x01000000
-#define IXGB_TX_DESC_CMD_TSE 0x04000000
-#define IXGB_TX_DESC_CMD_RS 0x08000000
-#define IXGB_TX_DESC_CMD_VLE 0x40000000
-#define IXGB_TX_DESC_CMD_IDE 0x80000000
-
-#define IXGB_TX_DESC_TYPE 0x00100000
-
-#define IXGB_TX_DESC_STATUS_DD 0x01
-
-#define IXGB_TX_DESC_POPTS_IXSM 0x01
-#define IXGB_TX_DESC_POPTS_TXSM 0x02
-#define IXGB_TX_DESC_SPECIAL_PRI_SHIFT IXGB_RX_DESC_SPECIAL_PRI_SHIFT /* Priority is in upper 3 of 16 */
-
-struct ixgb_context_desc {
-	u8 ipcss;
-	u8 ipcso;
-	__le16 ipcse;
-	u8 tucss;
-	u8 tucso;
-	__le16 tucse;
-	__le32 cmd_type_len;
-	u8 status;
-	u8 hdr_len;
-	__le16 mss;
-};
-
-#define IXGB_CONTEXT_DESC_CMD_TCP 0x01000000
-#define IXGB_CONTEXT_DESC_CMD_IP 0x02000000
-#define IXGB_CONTEXT_DESC_CMD_TSE 0x04000000
-#define IXGB_CONTEXT_DESC_CMD_RS 0x08000000
-#define IXGB_CONTEXT_DESC_CMD_IDE 0x80000000
-
-#define IXGB_CONTEXT_DESC_TYPE 0x00000000
-
-#define IXGB_CONTEXT_DESC_STATUS_DD 0x01
-
-/* Filters */
-#define IXGB_MC_TBL_SIZE 128 /* Multicast Filter Table (4096 bits) */
-#define IXGB_VLAN_FILTER_TBL_SIZE 128 /* VLAN Filter Table (4096 bits) */
-#define IXGB_RAR_ENTRIES 3 /* Number of entries in Rx Address array */
-
-#define IXGB_MEMORY_REGISTER_BASE_ADDRESS 0
-#define ENET_HEADER_SIZE 14
-#define ENET_FCS_LENGTH 4
-#define IXGB_MAX_NUM_MULTICAST_ADDRESSES 128
-#define IXGB_MIN_ENET_FRAME_SIZE_WITHOUT_FCS 60
-#define IXGB_MAX_ENET_FRAME_SIZE_WITHOUT_FCS 1514
-#define IXGB_MAX_JUMBO_FRAME_SIZE 0x3F00
-
-/* Phy Addresses */
-#define IXGB_OPTICAL_PHY_ADDR 0x0 /* Optical Module phy address */
-#define IXGB_XAUII_PHY_ADDR 0x1 /* Xauii transceiver phy address */
-#define IXGB_DIAG_PHY_ADDR 0x1F /* Diagnostic Device phy address */
-
-/* This structure takes a 64k flash and maps it for identification commands */
-struct ixgb_flash_buffer {
-	u8 manufacturer_id;
-	u8 device_id;
-	u8 filler1[0x2AA8];
-	u8 cmd2;
-	u8 filler2[0x2AAA];
-	u8 cmd1;
-	u8 filler3[0xAAAA];
-};
-
-/* Flow control parameters */
-struct ixgb_fc {
-	u32 high_water; /* Flow Control High-water */
-	u32 low_water; /* Flow Control Low-water */
-	u16 pause_time; /* Flow Control Pause timer */
-	bool send_xon; /* Flow control send XON */
-	ixgb_fc_type type; /* Type of flow control */
-};
-
-/* The historical defaults for the flow control values are given below. */
-#define FC_DEFAULT_HI_THRESH (0x8000) /* 32KB */
-#define FC_DEFAULT_LO_THRESH (0x4000) /* 16KB */
-#define FC_DEFAULT_TX_TIMER (0x100) /* ~130 us */
-
-/* Phy definitions */
-#define IXGB_MAX_PHY_REG_ADDRESS 0xFFFF
-#define IXGB_MAX_PHY_ADDRESS 31
-#define IXGB_MAX_PHY_DEV_TYPE 31
-
-/* Bus parameters */
-struct ixgb_bus {
-	ixgb_bus_speed speed;
-	ixgb_bus_width width;
-	ixgb_bus_type type;
-};
-
-struct ixgb_hw {
-	u8 __iomem *hw_addr;/* Base Address of the hardware */
-	void *back; /* Pointer to OS-dependent struct */
-	struct ixgb_fc fc; /* Flow control parameters */
-	struct ixgb_bus bus; /* Bus parameters */
-	u32 phy_id; /* Phy Identifier */
-	u32 phy_addr; /* XGMII address of Phy */
-	ixgb_mac_type mac_type; /* Identifier for MAC controller */
-	ixgb_phy_type phy_type; /* Transceiver/phy identifier */
-	u32 max_frame_size; /* Maximum frame size supported */
-	u32 mc_filter_type; /* Multicast filter hash type */
-	u32 num_mc_addrs; /* Number of current Multicast addrs */
-	u8 curr_mac_addr[ETH_ALEN]; /* Individual address currently programmed in MAC */
-	u32 num_tx_desc; /* Number of Transmit descriptors */
-	u32 num_rx_desc; /* Number of Receive descriptors */
-	u32 rx_buffer_size; /* Size of Receive buffer */
-	bool link_up; /* true if link is valid */
-	bool adapter_stopped; /* State of adapter */
-	u16 device_id; /* device id from PCI configuration space */
-	u16 vendor_id; /* vendor id from PCI configuration space */
-	u8 revision_id; /* revision id from PCI configuration space */
-	u16 subsystem_vendor_id; /* subsystem vendor id from PCI configuration space */
-	u16 subsystem_id; /* subsystem id from PCI configuration space */
-	u32 bar0; /* Base Address registers */
-	u32 bar1;
-	u32 bar2;
-	u32 bar3;
-	u16 pci_cmd_word; /* PCI command register id from PCI configuration space */
-	__le16 eeprom[IXGB_EEPROM_SIZE]; /* EEPROM contents read at init time */
-	unsigned long io_base; /* Our I/O mapped location */
-	u32 lastLFC;
-	u32 lastRFC;
-};
-
-/* Statistics reported by the hardware */
-struct ixgb_hw_stats {
-	u64 tprl;
-	u64 tprh;
-	u64 gprcl;
-	u64 gprch;
-	u64 bprcl;
-	u64 bprch;
-	u64 mprcl;
-	u64 mprch;
-	u64 uprcl;
-	u64 uprch;
-	u64 vprcl;
-	u64 vprch;
-	u64 jprcl;
-	u64 jprch;
-	u64 gorcl;
-	u64 gorch;
-	u64 torl;
-	u64 torh;
-	u64 rnbc;
-	u64 ruc;
-	u64 roc;
-	u64 rlec;
-	u64 crcerrs;
-	u64 icbc;
-	u64 ecbc;
-	u64 mpc;
-	u64 tptl;
-	u64 tpth;
-	u64 gptcl;
-	u64 gptch;
-	u64 bptcl;
-	u64 bptch;
-	u64 mptcl;
-	u64 mptch;
-	u64 uptcl;
-	u64 uptch;
-	u64 vptcl;
-	u64 vptch;
-	u64 jptcl;
-	u64 jptch;
-	u64 gotcl;
-	u64 gotch;
-	u64 totl;
-	u64 toth;
-	u64 dc;
-	u64 plt64c;
-	u64 tsctc;
-	u64 tsctfc;
-	u64 ibic;
-	u64 rfc;
-	u64 lfc;
-	u64 pfrc;
-	u64 pftc;
-	u64 mcfrc;
-	u64 mcftc;
-	u64 xonrxc;
-	u64 xontxc;
-	u64 xoffrxc;
-	u64 xofftxc;
-	u64 rjc;
-};
-
-/* Function Prototypes */
-bool ixgb_adapter_stop(struct ixgb_hw *hw);
-bool ixgb_init_hw(struct ixgb_hw *hw);
-bool ixgb_adapter_start(struct ixgb_hw *hw);
-void ixgb_check_for_link(struct ixgb_hw *hw);
-bool ixgb_check_for_bad_link(struct ixgb_hw *hw);
-
-void ixgb_rar_set(struct ixgb_hw *hw, const u8 *addr, u32 index);
-
-/* Filters (multicast, vlan, receive) */
-void ixgb_mc_addr_list_update(struct ixgb_hw *hw, u8 *mc_addr_list,
-			      u32 mc_addr_count, u32 pad);
-
-/* Vfta functions */
-void ixgb_write_vfta(struct ixgb_hw *hw, u32 offset, u32 value);
-
-/* Access functions to eeprom data */
-void ixgb_get_ee_mac_addr(struct ixgb_hw *hw, u8 *mac_addr);
-u32 ixgb_get_ee_pba_number(struct ixgb_hw *hw);
-u16 ixgb_get_ee_device_id(struct ixgb_hw *hw);
-bool ixgb_get_eeprom_data(struct ixgb_hw *hw);
-__le16 ixgb_get_eeprom_word(struct ixgb_hw *hw, u16 index);
-
-/* Everything else */
-void ixgb_led_on(struct ixgb_hw *hw);
-void ixgb_led_off(struct ixgb_hw *hw);
-void ixgb_write_pci_cfg(struct ixgb_hw *hw,
-			 u32 reg,
-			 u16 * value);
-
-
-#endif /* _IXGB_HW_H_ */
diff --git a/drivers/net/ethernet/intel/ixgb/ixgb_ids.h b/drivers/net/ethernet/intel/ixgb/ixgb_ids.h
deleted file mode 100644
index 9695b8215f01..000000000000
--- a/drivers/net/ethernet/intel/ixgb/ixgb_ids.h
+++ /dev/null
@@ -1,23 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright(c) 1999 - 2008 Intel Corporation. */
-
-#ifndef _IXGB_IDS_H_
-#define _IXGB_IDS_H_
-
-/**********************************************************************
-** The Device and Vendor IDs for 10 Gigabit MACs
-**********************************************************************/
-
-#define IXGB_DEVICE_ID_82597EX 0x1048
-#define IXGB_DEVICE_ID_82597EX_SR 0x1A48
-#define IXGB_DEVICE_ID_82597EX_LR 0x1B48
-#define IXGB_SUBDEVICE_ID_A11F 0xA11F
-#define IXGB_SUBDEVICE_ID_A01F 0xA01F
-
-#define IXGB_DEVICE_ID_82597EX_CX4 0x109E
-#define IXGB_SUBDEVICE_ID_A00C 0xA00C
-#define IXGB_SUBDEVICE_ID_A01C 0xA01C
-#define IXGB_SUBDEVICE_ID_7036 0x7036
-
-#endif /* #ifndef _IXGB_IDS_H_ */
-/* End of File */
diff --git a/drivers/net/ethernet/intel/ixgb/ixgb_main.c b/drivers/net/ethernet/intel/ixgb/ixgb_main.c
deleted file mode 100644
index b4d47e7a76c8..000000000000
--- a/drivers/net/ethernet/intel/ixgb/ixgb_main.c
+++ /dev/null
@@ -1,2285 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/* Copyright(c) 1999 - 2008 Intel Corporation. */
-
-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-
-#include <linux/prefetch.h>
-#include "ixgb.h"
-
-char ixgb_driver_name[] = "ixgb";
-static char ixgb_driver_string[] = "Intel(R) PRO/10GbE Network Driver";
-
-static const char ixgb_copyright[] = "Copyright (c) 1999-2008 Intel Corporation.";
-
-#define IXGB_CB_LENGTH 256
-static unsigned int copybreak __read_mostly = IXGB_CB_LENGTH;
-module_param(copybreak, uint, 0644);
-MODULE_PARM_DESC(copybreak,
-	"Maximum size of packet that is copied to a new buffer on receive");
-
-/* ixgb_pci_tbl - PCI Device ID Table
- *
- * Wildcard entries (PCI_ANY_ID) should come last
- * Last entry must be all 0s
- *
- * { Vendor ID, Device ID, SubVendor ID, SubDevice ID,
- * Class, Class Mask, private data (not used) }
- */
-static const struct pci_device_id ixgb_pci_tbl[] = {
-	{PCI_VENDOR_ID_INTEL, IXGB_DEVICE_ID_82597EX,
-	 PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
-	{PCI_VENDOR_ID_INTEL, IXGB_DEVICE_ID_82597EX_CX4,
-	 PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
-	{PCI_VENDOR_ID_INTEL, IXGB_DEVICE_ID_82597EX_SR,
-	 PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
-	{PCI_VENDOR_ID_INTEL, IXGB_DEVICE_ID_82597EX_LR,
-	 PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0},
-
-	/* required last entry */
-	{0,}
-};
-
-MODULE_DEVICE_TABLE(pci, ixgb_pci_tbl);
-
-/* Local Function Prototypes */
-static int ixgb_init_module(void);
-static void ixgb_exit_module(void);
-static int ixgb_probe(struct pci_dev *pdev, const struct pci_device_id *ent);
-static void ixgb_remove(struct pci_dev *pdev);
-static int ixgb_sw_init(struct ixgb_adapter *adapter);
-static int ixgb_open(struct net_device *netdev);
-static int ixgb_close(struct net_device *netdev);
-static void ixgb_configure_tx(struct ixgb_adapter *adapter);
-static void ixgb_configure_rx(struct ixgb_adapter *adapter);
-static void ixgb_setup_rctl(struct ixgb_adapter *adapter);
-static void ixgb_clean_tx_ring(struct ixgb_adapter *adapter);
-static void ixgb_clean_rx_ring(struct ixgb_adapter *adapter);
-static void ixgb_set_multi(struct net_device *netdev);
-static void ixgb_watchdog(struct timer_list *t);
-static netdev_tx_t ixgb_xmit_frame(struct sk_buff *skb,
-				   struct net_device *netdev);
-static int ixgb_change_mtu(struct net_device *netdev, int new_mtu);
-static int ixgb_set_mac(struct net_device *netdev, void *p);
-static irqreturn_t ixgb_intr(int irq, void *data);
-static bool ixgb_clean_tx_irq(struct ixgb_adapter *adapter);
-
-static int ixgb_clean(struct napi_struct *, int);
-static bool ixgb_clean_rx_irq(struct ixgb_adapter *, int *, int);
-static void ixgb_alloc_rx_buffers(struct ixgb_adapter *, int);
-
-static void ixgb_tx_timeout(struct net_device *dev, unsigned int txqueue);
-static void ixgb_tx_timeout_task(struct work_struct *work);
-
-static void ixgb_vlan_strip_enable(struct ixgb_adapter *adapter);
-static void ixgb_vlan_strip_disable(struct ixgb_adapter *adapter);
-static int ixgb_vlan_rx_add_vid(struct net_device *netdev,
-				__be16 proto, u16 vid);
-static int ixgb_vlan_rx_kill_vid(struct net_device *netdev,
-				 __be16 proto, u16 vid);
-static void ixgb_restore_vlan(struct ixgb_adapter *adapter);
-
-static pci_ers_result_t ixgb_io_error_detected (struct pci_dev *pdev,
-						 pci_channel_state_t state);
-static pci_ers_result_t ixgb_io_slot_reset (struct pci_dev *pdev);
-static void ixgb_io_resume (struct pci_dev *pdev);
-
-static const struct pci_error_handlers ixgb_err_handler = {
-	.error_detected = ixgb_io_error_detected,
-	.slot_reset = ixgb_io_slot_reset,
-	.resume = ixgb_io_resume,
-};
-
-static struct pci_driver ixgb_driver = {
-	.name = ixgb_driver_name,
-	.id_table = ixgb_pci_tbl,
-	.probe = ixgb_probe,
-	.remove = ixgb_remove,
-	.err_handler = &ixgb_err_handler
-};
-
-MODULE_AUTHOR("Intel Corporation, <linux.nics@intel.com>");
-MODULE_DESCRIPTION("Intel(R) PRO/10GbE Network Driver");
-MODULE_LICENSE("GPL v2");
-
-#define DEFAULT_MSG_ENABLE (NETIF_MSG_DRV|NETIF_MSG_PROBE|NETIF_MSG_LINK)
-static int debug = -1;
-module_param(debug, int, 0);
-MODULE_PARM_DESC(debug, "Debug level (0=none,...,16=all)");
-
-/**
- * ixgb_init_module - Driver Registration Routine
- *
- * ixgb_init_module is the first routine called when the driver is
- * loaded. All it does is register with the PCI subsystem.
- **/
-
-static int __init
-ixgb_init_module(void)
-{
-	pr_info("%s\n", ixgb_driver_string);
-	pr_info("%s\n", ixgb_copyright);
-
-	return pci_register_driver(&ixgb_driver);
-}
-
-module_init(ixgb_init_module);
-
-/**
- * ixgb_exit_module - Driver Exit Cleanup Routine
- *
- * ixgb_exit_module is called just before the driver is removed
- * from memory.
- **/
-
-static void __exit
-ixgb_exit_module(void)
-{
-	pci_unregister_driver(&ixgb_driver);
-}
-
-module_exit(ixgb_exit_module);
-
-/**
- * ixgb_irq_disable - Mask off interrupt generation on the NIC
- * @adapter: board private structure
- **/
-
-static void
-ixgb_irq_disable(struct ixgb_adapter *adapter)
-{
-	IXGB_WRITE_REG(&adapter->hw, IMC, ~0);
-	IXGB_WRITE_FLUSH(&adapter->hw);
-	synchronize_irq(adapter->pdev->irq);
-}
-
-/**
- * ixgb_irq_enable - Enable default interrupt generation settings
- * @adapter: board private structure
- **/
-
-static void
-ixgb_irq_enable(struct ixgb_adapter *adapter)
-{
-	u32 val = IXGB_INT_RXT0 | IXGB_INT_RXDMT0 |
-		  IXGB_INT_TXDW | IXGB_INT_LSC;
-	if (adapter->hw.subsystem_vendor_id == PCI_VENDOR_ID_SUN)
-		val |= IXGB_INT_GPI0;
-	IXGB_WRITE_REG(&adapter->hw, IMS, val);
-	IXGB_WRITE_FLUSH(&adapter->hw);
-}
-
-int
-ixgb_up(struct ixgb_adapter *adapter)
-{
-	struct net_device *netdev = adapter->netdev;
-	int err, irq_flags = IRQF_SHARED;
-	int max_frame = netdev->mtu + ENET_HEADER_SIZE + ENET_FCS_LENGTH;
-	struct ixgb_hw *hw = &adapter->hw;
-
-	/* hardware has been reset, we need to reload some things */
-
-	ixgb_rar_set(hw, netdev->dev_addr, 0);
-	ixgb_set_multi(netdev);
-
-	ixgb_restore_vlan(adapter);
-
-	ixgb_configure_tx(adapter);
-	ixgb_setup_rctl(adapter);
-	ixgb_configure_rx(adapter);
-	ixgb_alloc_rx_buffers(adapter, IXGB_DESC_UNUSED(&adapter->rx_ring));
-
-	/* disable interrupts and get the hardware into a known state */
-	IXGB_WRITE_REG(&adapter->hw, IMC, 0xffffffff);
-
-	/* only enable MSI if bus is in PCI-X mode */
-	if (IXGB_READ_REG(&adapter->hw, STATUS) & IXGB_STATUS_PCIX_MODE) {
-		err = pci_enable_msi(adapter->pdev);
-		if (!err) {
-			adapter->have_msi = true;
-			irq_flags = 0;
-		}
-		/* proceed to try to request regular interrupt */
-	}
-
-	err = request_irq(adapter->pdev->irq, ixgb_intr, irq_flags,
-			  netdev->name, netdev);
-	if (err) {
-		if (adapter->have_msi)
-			pci_disable_msi(adapter->pdev);
-		netif_err(adapter, probe, adapter->netdev,
-			  "Unable to allocate interrupt Error: %d\n", err);
-		return err;
-	}
-
-	if ((hw->max_frame_size != max_frame) ||
-		(hw->max_frame_size !=
-		(IXGB_READ_REG(hw, MFS) >> IXGB_MFS_SHIFT))) {
-
-		hw->max_frame_size = max_frame;
-
-		IXGB_WRITE_REG(hw, MFS, hw->max_frame_size << IXGB_MFS_SHIFT);
-
-		if (hw->max_frame_size >
-		   IXGB_MAX_ENET_FRAME_SIZE_WITHOUT_FCS + ENET_FCS_LENGTH) {
-			u32 ctrl0 = IXGB_READ_REG(hw, CTRL0);
-
-			if (!(ctrl0 & IXGB_CTRL0_JFE)) {
-				ctrl0 |= IXGB_CTRL0_JFE;
-				IXGB_WRITE_REG(hw, CTRL0, ctrl0);
-			}
-		}
-	}
-
-	clear_bit(__IXGB_DOWN, &adapter->flags);
-
-	napi_enable(&adapter->napi);
-	ixgb_irq_enable(adapter);
-
-	netif_wake_queue(netdev);
-
-	mod_timer(&adapter->watchdog_timer, jiffies);
-
-	return 0;
-}
-
-void
-ixgb_down(struct ixgb_adapter *adapter, bool kill_watchdog)
-{
-	struct net_device *netdev = adapter->netdev;
-
-	/* prevent the interrupt handler from restarting watchdog */
-	set_bit(__IXGB_DOWN, &adapter->flags);
-
-	netif_carrier_off(netdev);
-
-	napi_disable(&adapter->napi);
-	/* waiting for NAPI to complete can re-enable interrupts */
-	ixgb_irq_disable(adapter);
-	free_irq(adapter->pdev->irq, netdev);
-
-	if (adapter->have_msi)
-		pci_disable_msi(adapter->pdev);
-
-	if (kill_watchdog)
-		del_timer_sync(&adapter->watchdog_timer);
-
-	adapter->link_speed = 0;
-	adapter->link_duplex = 0;
-	netif_stop_queue(netdev);
-
-	ixgb_reset(adapter);
-	ixgb_clean_tx_ring(adapter);
-	ixgb_clean_rx_ring(adapter);
-}
-
-void
-ixgb_reset(struct ixgb_adapter *adapter)
-{
-	struct ixgb_hw *hw = &adapter->hw;
-
-	ixgb_adapter_stop(hw);
-	if (!ixgb_init_hw(hw))
-		netif_err(adapter, probe, adapter->netdev, "ixgb_init_hw failed\n");
-
-	/* restore frame size information */
-	IXGB_WRITE_REG(hw, MFS, hw->max_frame_size << IXGB_MFS_SHIFT);
-	if (hw->max_frame_size >
-	    IXGB_MAX_ENET_FRAME_SIZE_WITHOUT_FCS + ENET_FCS_LENGTH) {
-		u32 ctrl0 = IXGB_READ_REG(hw, CTRL0);
-		if (!(ctrl0 & IXGB_CTRL0_JFE)) {
-			ctrl0 |= IXGB_CTRL0_JFE;
-			IXGB_WRITE_REG(hw, CTRL0, ctrl0);
-		}
-	}
-}
-
-static netdev_features_t
-ixgb_fix_features(struct net_device *netdev, netdev_features_t features)
-{
-	/*
-	 * Tx VLAN insertion does not work per HW design when Rx stripping is
-	 * disabled.
-	 */
-	if (!(features & NETIF_F_HW_VLAN_CTAG_RX))
-		features &= ~NETIF_F_HW_VLAN_CTAG_TX;
-
-	return features;
-}
-
-static int
-ixgb_set_features(struct net_device *netdev, netdev_features_t features)
-{
-	struct ixgb_adapter *adapter = netdev_priv(netdev);
-	netdev_features_t changed = features ^ netdev->features;
-
-	if (!(changed & (NETIF_F_RXCSUM|NETIF_F_HW_VLAN_CTAG_RX)))
-		return 0;
-
-	adapter->rx_csum = !!(features & NETIF_F_RXCSUM);
-
-	if (netif_running(netdev)) {
-		ixgb_down(adapter, true);
-		ixgb_up(adapter);
-		ixgb_set_speed_duplex(netdev);
-	} else
-		ixgb_reset(adapter);
-
-	return 0;
-}
-
-
-static const struct net_device_ops ixgb_netdev_ops = {
-	.ndo_open = ixgb_open,
-	.ndo_stop = ixgb_close,
-	.ndo_start_xmit = ixgb_xmit_frame,
-	.ndo_set_rx_mode = ixgb_set_multi,
-	.ndo_validate_addr = eth_validate_addr,
-	.ndo_set_mac_address = ixgb_set_mac,
-	.ndo_change_mtu = ixgb_change_mtu,
-	.ndo_tx_timeout = ixgb_tx_timeout,
-	.ndo_vlan_rx_add_vid = ixgb_vlan_rx_add_vid,
-	.ndo_vlan_rx_kill_vid = ixgb_vlan_rx_kill_vid,
-	.ndo_fix_features = ixgb_fix_features,
-	.ndo_set_features = ixgb_set_features,
-};
-
-/**
- * ixgb_probe - Device Initialization Routine
- * @pdev: PCI device information struct
- * @ent: entry in ixgb_pci_tbl
- *
- * Returns 0 on success, negative on failure
- *
- * ixgb_probe initializes an adapter identified by a pci_dev structure.
- * The OS initialization, configuring of the adapter private structure,
- * and a hardware reset occur.
- **/
-
-static int
-ixgb_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
-{
-	struct net_device *netdev = NULL;
-	struct ixgb_adapter *adapter;
-	static int cards_found = 0;
-	u8 addr[ETH_ALEN];
-	int i;
-	int err;
-
-	err = pci_enable_device(pdev);
-	if (err)
-		return err;
-
-	err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
-	if (err) {
-		pr_err("No usable DMA configuration, aborting\n");
-		goto err_dma_mask;
-	}
-
-	err = pci_request_regions(pdev, ixgb_driver_name);
-	if (err)
-		goto err_request_regions;
-
-	pci_set_master(pdev);
-
-	netdev = alloc_etherdev(sizeof(struct ixgb_adapter));
-	if (!netdev) {
-		err = -ENOMEM;
-		goto err_alloc_etherdev;
-	}
-
-	SET_NETDEV_DEV(netdev, &pdev->dev);
-
-	pci_set_drvdata(pdev, netdev);
-	adapter = netdev_priv(netdev);
-	adapter->netdev = netdev;
-	adapter->pdev = pdev;
-	adapter->hw.back = adapter;
-	adapter->msg_enable = netif_msg_init(debug, DEFAULT_MSG_ENABLE);
-
-	adapter->hw.hw_addr = pci_ioremap_bar(pdev, BAR_0);
-	if (!adapter->hw.hw_addr) {
-		err = -EIO;
-		goto err_ioremap;
-	}
-
-	for (i = BAR_1; i < PCI_STD_NUM_BARS; i++) {
-		if (pci_resource_len(pdev, i) == 0)
-			continue;
-		if (pci_resource_flags(pdev, i) & IORESOURCE_IO) {
-			adapter->hw.io_base = pci_resource_start(pdev, i);
-			break;
-		}
-	}
-
-	netdev->netdev_ops = &ixgb_netdev_ops;
-	ixgb_set_ethtool_ops(netdev);
-	netdev->watchdog_timeo = 5 * HZ;
-	netif_napi_add(netdev, &adapter->napi, ixgb_clean);
-
-	strncpy(netdev->name, pci_name(pdev), sizeof(netdev->name) - 1);
-
-	adapter->bd_number = cards_found;
-	adapter->link_speed = 0;
-	adapter->link_duplex = 0;
-
-	/* setup the private structure */
-
-	err = ixgb_sw_init(adapter);
-	if
(err) - goto err_sw_init; - - netdev->hw_features = NETIF_F_SG | - NETIF_F_TSO | - NETIF_F_HW_CSUM | - NETIF_F_HW_VLAN_CTAG_TX | - NETIF_F_HW_VLAN_CTAG_RX; - netdev->features = netdev->hw_features | - NETIF_F_HW_VLAN_CTAG_FILTER; - netdev->hw_features |= NETIF_F_RXCSUM; - - netdev->features |= NETIF_F_HIGHDMA; - netdev->vlan_features |= NETIF_F_HIGHDMA; - - /* MTU range: 68 - 16114 */ - netdev->min_mtu = ETH_MIN_MTU; - netdev->max_mtu = IXGB_MAX_JUMBO_FRAME_SIZE - ETH_HLEN; - - /* make sure the EEPROM is good */ - - if (!ixgb_validate_eeprom_checksum(&adapter->hw)) { - netif_err(adapter, probe, adapter->netdev, - "The EEPROM Checksum Is Not Valid\n"); - err = -EIO; - goto err_eeprom; - } - - ixgb_get_ee_mac_addr(&adapter->hw, addr); - eth_hw_addr_set(netdev, addr); - - if (!is_valid_ether_addr(netdev->dev_addr)) { - netif_err(adapter, probe, adapter->netdev, "Invalid MAC Address\n"); - err = -EIO; - goto err_eeprom; - } - - adapter->part_num = ixgb_get_ee_pba_number(&adapter->hw); - - timer_setup(&adapter->watchdog_timer, ixgb_watchdog, 0); - - INIT_WORK(&adapter->tx_timeout_task, ixgb_tx_timeout_task); - - strcpy(netdev->name, "eth%d"); - err = register_netdev(netdev); - if (err) - goto err_register; - - /* carrier off reporting is important to ethtool even BEFORE open */ - netif_carrier_off(netdev); - - netif_info(adapter, probe, adapter->netdev, - "Intel(R) PRO/10GbE Network Connection\n"); - ixgb_check_options(adapter); - /* reset the hardware with the new settings */ - - ixgb_reset(adapter); - - cards_found++; - return 0; - -err_register: -err_sw_init: -err_eeprom: - iounmap(adapter->hw.hw_addr); -err_ioremap: - free_netdev(netdev); -err_alloc_etherdev: - pci_release_regions(pdev); -err_request_regions: -err_dma_mask: - pci_disable_device(pdev); - return err; -} - -/** - * ixgb_remove - Device Removal Routine - * @pdev: PCI device information struct - * - * ixgb_remove is called by the PCI subsystem to alert the driver - * that it should release a PCI device. 
This could be caused by a - * Hot-Plug event, or because the driver is going to be removed from - * memory. - **/ - -static void -ixgb_remove(struct pci_dev *pdev) -{ - struct net_device *netdev = pci_get_drvdata(pdev); - struct ixgb_adapter *adapter = netdev_priv(netdev); - - cancel_work_sync(&adapter->tx_timeout_task); - - unregister_netdev(netdev); - - iounmap(adapter->hw.hw_addr); - pci_release_regions(pdev); - - free_netdev(netdev); - pci_disable_device(pdev); -} - -/** - * ixgb_sw_init - Initialize general software structures (struct ixgb_adapter) - * @adapter: board private structure to initialize - * - * ixgb_sw_init initializes the Adapter private data structure. - * Fields are initialized based on PCI device information and - * OS network device settings (MTU size). - **/ - -static int -ixgb_sw_init(struct ixgb_adapter *adapter) -{ - struct ixgb_hw *hw = &adapter->hw; - struct net_device *netdev = adapter->netdev; - struct pci_dev *pdev = adapter->pdev; - - /* PCI config space info */ - - hw->vendor_id = pdev->vendor; - hw->device_id = pdev->device; - hw->subsystem_vendor_id = pdev->subsystem_vendor; - hw->subsystem_id = pdev->subsystem_device; - - hw->max_frame_size = netdev->mtu + ENET_HEADER_SIZE + ENET_FCS_LENGTH; - adapter->rx_buffer_len = hw->max_frame_size + 8; /* + 8 for errata */ - - if ((hw->device_id == IXGB_DEVICE_ID_82597EX) || - (hw->device_id == IXGB_DEVICE_ID_82597EX_CX4) || - (hw->device_id == IXGB_DEVICE_ID_82597EX_LR) || - (hw->device_id == IXGB_DEVICE_ID_82597EX_SR)) - hw->mac_type = ixgb_82597; - else { - /* should never have loaded on this device */ - netif_err(adapter, probe, adapter->netdev, "unsupported device id\n"); - } - - /* enable flow control to be programmed */ - hw->fc.send_xon = 1; - - set_bit(__IXGB_DOWN, &adapter->flags); - return 0; -} - -/** - * ixgb_open - Called when a network interface is made active - * @netdev: network interface device structure - * - * Returns 0 on success, negative value on failure - * - * The 
open entry point is called when a network interface is made - * active by the system (IFF_UP). At this point all resources needed - * for transmit and receive operations are allocated, the interrupt - * handler is registered with the OS, the watchdog timer is started, - * and the stack is notified that the interface is ready. - **/ - -static int -ixgb_open(struct net_device *netdev) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - int err; - - /* allocate transmit descriptors */ - err = ixgb_setup_tx_resources(adapter); - if (err) - goto err_setup_tx; - - netif_carrier_off(netdev); - - /* allocate receive descriptors */ - - err = ixgb_setup_rx_resources(adapter); - if (err) - goto err_setup_rx; - - err = ixgb_up(adapter); - if (err) - goto err_up; - - netif_start_queue(netdev); - - return 0; - -err_up: - ixgb_free_rx_resources(adapter); -err_setup_rx: - ixgb_free_tx_resources(adapter); -err_setup_tx: - ixgb_reset(adapter); - - return err; -} - -/** - * ixgb_close - Disables a network interface - * @netdev: network interface device structure - * - * Returns 0, this is not allowed to fail - * - * The close entry point is called when an interface is de-activated - * by the OS. The hardware is still under the driver's control, but - * needs to be disabled. A global MAC reset is issued to stop the - * hardware, and all transmit and receive resources are freed. 
- **/ - -static int -ixgb_close(struct net_device *netdev) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - - ixgb_down(adapter, true); - - ixgb_free_tx_resources(adapter); - ixgb_free_rx_resources(adapter); - - return 0; -} - -/** - * ixgb_setup_tx_resources - allocate Tx resources (Descriptors) - * @adapter: board private structure - * - * Return 0 on success, negative on failure - **/ - -int -ixgb_setup_tx_resources(struct ixgb_adapter *adapter) -{ - struct ixgb_desc_ring *txdr = &adapter->tx_ring; - struct pci_dev *pdev = adapter->pdev; - int size; - - size = sizeof(struct ixgb_buffer) * txdr->count; - txdr->buffer_info = vzalloc(size); - if (!txdr->buffer_info) - return -ENOMEM; - - /* round up to nearest 4K */ - - txdr->size = txdr->count * sizeof(struct ixgb_tx_desc); - txdr->size = ALIGN(txdr->size, 4096); - - txdr->desc = dma_alloc_coherent(&pdev->dev, txdr->size, &txdr->dma, - GFP_KERNEL); - if (!txdr->desc) { - vfree(txdr->buffer_info); - return -ENOMEM; - } - - txdr->next_to_use = 0; - txdr->next_to_clean = 0; - - return 0; -} - -/** - * ixgb_configure_tx - Configure 82597 Transmit Unit after Reset. - * @adapter: board private structure - * - * Configure the Tx unit of the MAC after a reset. 
- **/ - -static void -ixgb_configure_tx(struct ixgb_adapter *adapter) -{ - u64 tdba = adapter->tx_ring.dma; - u32 tdlen = adapter->tx_ring.count * sizeof(struct ixgb_tx_desc); - u32 tctl; - struct ixgb_hw *hw = &adapter->hw; - - /* Setup the Base and Length of the Tx Descriptor Ring - * tx_ring.dma can be either a 32 or 64 bit value - */ - - IXGB_WRITE_REG(hw, TDBAL, (tdba & 0x00000000ffffffffULL)); - IXGB_WRITE_REG(hw, TDBAH, (tdba >> 32)); - - IXGB_WRITE_REG(hw, TDLEN, tdlen); - - /* Setup the HW Tx Head and Tail descriptor pointers */ - - IXGB_WRITE_REG(hw, TDH, 0); - IXGB_WRITE_REG(hw, TDT, 0); - - /* don't set up txdctl, it induces performance problems if configured - * incorrectly */ - /* Set the Tx Interrupt Delay register */ - - IXGB_WRITE_REG(hw, TIDV, adapter->tx_int_delay); - - /* Program the Transmit Control Register */ - - tctl = IXGB_TCTL_TCE | IXGB_TCTL_TXEN | IXGB_TCTL_TPDE; - IXGB_WRITE_REG(hw, TCTL, tctl); - - /* Setup Transmit Descriptor Settings for this adapter */ - adapter->tx_cmd_type = - IXGB_TX_DESC_TYPE | - (adapter->tx_int_delay_enable ? 
IXGB_TX_DESC_CMD_IDE : 0); -} - -/** - * ixgb_setup_rx_resources - allocate Rx resources (Descriptors) - * @adapter: board private structure - * - * Returns 0 on success, negative on failure - **/ - -int -ixgb_setup_rx_resources(struct ixgb_adapter *adapter) -{ - struct ixgb_desc_ring *rxdr = &adapter->rx_ring; - struct pci_dev *pdev = adapter->pdev; - int size; - - size = sizeof(struct ixgb_buffer) * rxdr->count; - rxdr->buffer_info = vzalloc(size); - if (!rxdr->buffer_info) - return -ENOMEM; - - /* Round up to nearest 4K */ - - rxdr->size = rxdr->count * sizeof(struct ixgb_rx_desc); - rxdr->size = ALIGN(rxdr->size, 4096); - - rxdr->desc = dma_alloc_coherent(&pdev->dev, rxdr->size, &rxdr->dma, - GFP_KERNEL); - - if (!rxdr->desc) { - vfree(rxdr->buffer_info); - return -ENOMEM; - } - - rxdr->next_to_clean = 0; - rxdr->next_to_use = 0; - - return 0; -} - -/** - * ixgb_setup_rctl - configure the receive control register - * @adapter: Board private structure - **/ - -static void -ixgb_setup_rctl(struct ixgb_adapter *adapter) -{ - u32 rctl; - - rctl = IXGB_READ_REG(&adapter->hw, RCTL); - - rctl &= ~(3 << IXGB_RCTL_MO_SHIFT); - - rctl |= - IXGB_RCTL_BAM | IXGB_RCTL_RDMTS_1_2 | - IXGB_RCTL_RXEN | IXGB_RCTL_CFF | - (adapter->hw.mc_filter_type << IXGB_RCTL_MO_SHIFT); - - rctl |= IXGB_RCTL_SECRC; - - if (adapter->rx_buffer_len <= IXGB_RXBUFFER_2048) - rctl |= IXGB_RCTL_BSIZE_2048; - else if (adapter->rx_buffer_len <= IXGB_RXBUFFER_4096) - rctl |= IXGB_RCTL_BSIZE_4096; - else if (adapter->rx_buffer_len <= IXGB_RXBUFFER_8192) - rctl |= IXGB_RCTL_BSIZE_8192; - else if (adapter->rx_buffer_len <= IXGB_RXBUFFER_16384) - rctl |= IXGB_RCTL_BSIZE_16384; - - IXGB_WRITE_REG(&adapter->hw, RCTL, rctl); -} - -/** - * ixgb_configure_rx - Configure 82597 Receive Unit after Reset. - * @adapter: board private structure - * - * Configure the Rx unit of the MAC after a reset. 
- **/ - -static void -ixgb_configure_rx(struct ixgb_adapter *adapter) -{ - u64 rdba = adapter->rx_ring.dma; - u32 rdlen = adapter->rx_ring.count * sizeof(struct ixgb_rx_desc); - struct ixgb_hw *hw = &adapter->hw; - u32 rctl; - u32 rxcsum; - - /* make sure receives are disabled while setting up the descriptors */ - - rctl = IXGB_READ_REG(hw, RCTL); - IXGB_WRITE_REG(hw, RCTL, rctl & ~IXGB_RCTL_RXEN); - - /* set the Receive Delay Timer Register */ - - IXGB_WRITE_REG(hw, RDTR, adapter->rx_int_delay); - - /* Setup the Base and Length of the Rx Descriptor Ring */ - - IXGB_WRITE_REG(hw, RDBAL, (rdba & 0x00000000ffffffffULL)); - IXGB_WRITE_REG(hw, RDBAH, (rdba >> 32)); - - IXGB_WRITE_REG(hw, RDLEN, rdlen); - - /* Setup the HW Rx Head and Tail Descriptor Pointers */ - IXGB_WRITE_REG(hw, RDH, 0); - IXGB_WRITE_REG(hw, RDT, 0); - - /* due to the hardware errata with RXDCTL, we are unable to use any of - * the performance enhancing features of it without causing other - * subtle bugs, some of the bugs could include receive length - * corruption at high data rates (WTHRESH > 0) and/or receive - * descriptor ring irregularities (particularly in hardware cache) */ - IXGB_WRITE_REG(hw, RXDCTL, 0); - - /* Enable Receive Checksum Offload for TCP and UDP */ - if (adapter->rx_csum) { - rxcsum = IXGB_READ_REG(hw, RXCSUM); - rxcsum |= IXGB_RXCSUM_TUOFL; - IXGB_WRITE_REG(hw, RXCSUM, rxcsum); - } - - /* Enable Receives */ - - IXGB_WRITE_REG(hw, RCTL, rctl); -} - -/** - * ixgb_free_tx_resources - Free Tx Resources - * @adapter: board private structure - * - * Free all transmit software resources - **/ - -void -ixgb_free_tx_resources(struct ixgb_adapter *adapter) -{ - struct pci_dev *pdev = adapter->pdev; - - ixgb_clean_tx_ring(adapter); - - vfree(adapter->tx_ring.buffer_info); - adapter->tx_ring.buffer_info = NULL; - - dma_free_coherent(&pdev->dev, adapter->tx_ring.size, - adapter->tx_ring.desc, adapter->tx_ring.dma); - - adapter->tx_ring.desc = NULL; -} - -static void 
-ixgb_unmap_and_free_tx_resource(struct ixgb_adapter *adapter, - struct ixgb_buffer *buffer_info) -{ - if (buffer_info->dma) { - if (buffer_info->mapped_as_page) - dma_unmap_page(&adapter->pdev->dev, buffer_info->dma, - buffer_info->length, DMA_TO_DEVICE); - else - dma_unmap_single(&adapter->pdev->dev, buffer_info->dma, - buffer_info->length, DMA_TO_DEVICE); - buffer_info->dma = 0; - } - - if (buffer_info->skb) { - dev_kfree_skb_any(buffer_info->skb); - buffer_info->skb = NULL; - } - buffer_info->time_stamp = 0; - /* these fields must always be initialized in tx - * buffer_info->length = 0; - * buffer_info->next_to_watch = 0; */ -} - -/** - * ixgb_clean_tx_ring - Free Tx Buffers - * @adapter: board private structure - **/ - -static void -ixgb_clean_tx_ring(struct ixgb_adapter *adapter) -{ - struct ixgb_desc_ring *tx_ring = &adapter->tx_ring; - struct ixgb_buffer *buffer_info; - unsigned long size; - unsigned int i; - - /* Free all the Tx ring sk_buffs */ - - for (i = 0; i < tx_ring->count; i++) { - buffer_info = &tx_ring->buffer_info[i]; - ixgb_unmap_and_free_tx_resource(adapter, buffer_info); - } - - size = sizeof(struct ixgb_buffer) * tx_ring->count; - memset(tx_ring->buffer_info, 0, size); - - /* Zero out the descriptor ring */ - - memset(tx_ring->desc, 0, tx_ring->size); - - tx_ring->next_to_use = 0; - tx_ring->next_to_clean = 0; - - IXGB_WRITE_REG(&adapter->hw, TDH, 0); - IXGB_WRITE_REG(&adapter->hw, TDT, 0); -} - -/** - * ixgb_free_rx_resources - Free Rx Resources - * @adapter: board private structure - * - * Free all receive software resources - **/ - -void -ixgb_free_rx_resources(struct ixgb_adapter *adapter) -{ - struct ixgb_desc_ring *rx_ring = &adapter->rx_ring; - struct pci_dev *pdev = adapter->pdev; - - ixgb_clean_rx_ring(adapter); - - vfree(rx_ring->buffer_info); - rx_ring->buffer_info = NULL; - - dma_free_coherent(&pdev->dev, rx_ring->size, rx_ring->desc, - rx_ring->dma); - - rx_ring->desc = NULL; -} - -/** - * ixgb_clean_rx_ring - Free Rx Buffers - 
* @adapter: board private structure - **/ - -static void -ixgb_clean_rx_ring(struct ixgb_adapter *adapter) -{ - struct ixgb_desc_ring *rx_ring = &adapter->rx_ring; - struct ixgb_buffer *buffer_info; - struct pci_dev *pdev = adapter->pdev; - unsigned long size; - unsigned int i; - - /* Free all the Rx ring sk_buffs */ - - for (i = 0; i < rx_ring->count; i++) { - buffer_info = &rx_ring->buffer_info[i]; - if (buffer_info->dma) { - dma_unmap_single(&pdev->dev, - buffer_info->dma, - buffer_info->length, - DMA_FROM_DEVICE); - buffer_info->dma = 0; - buffer_info->length = 0; - } - - if (buffer_info->skb) { - dev_kfree_skb(buffer_info->skb); - buffer_info->skb = NULL; - } - } - - size = sizeof(struct ixgb_buffer) * rx_ring->count; - memset(rx_ring->buffer_info, 0, size); - - /* Zero out the descriptor ring */ - - memset(rx_ring->desc, 0, rx_ring->size); - - rx_ring->next_to_clean = 0; - rx_ring->next_to_use = 0; - - IXGB_WRITE_REG(&adapter->hw, RDH, 0); - IXGB_WRITE_REG(&adapter->hw, RDT, 0); -} - -/** - * ixgb_set_mac - Change the Ethernet Address of the NIC - * @netdev: network interface device structure - * @p: pointer to an address structure - * - * Returns 0 on success, negative on failure - **/ - -static int -ixgb_set_mac(struct net_device *netdev, void *p) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - struct sockaddr *addr = p; - - if (!is_valid_ether_addr(addr->sa_data)) - return -EADDRNOTAVAIL; - - eth_hw_addr_set(netdev, addr->sa_data); - - ixgb_rar_set(&adapter->hw, addr->sa_data, 0); - - return 0; -} - -/** - * ixgb_set_multi - Multicast and Promiscuous mode set - * @netdev: network interface device structure - * - * The set_multi entry point is called whenever the multicast address - * list or the network interface flags are updated. This routine is - * responsible for configuring the hardware for proper multicast, - * promiscuous mode, and all-multi behavior. 
- **/ - -static void -ixgb_set_multi(struct net_device *netdev) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - struct ixgb_hw *hw = &adapter->hw; - struct netdev_hw_addr *ha; - u32 rctl; - - /* Check for Promiscuous and All Multicast modes */ - - rctl = IXGB_READ_REG(hw, RCTL); - - if (netdev->flags & IFF_PROMISC) { - rctl |= (IXGB_RCTL_UPE | IXGB_RCTL_MPE); - /* disable VLAN filtering */ - rctl &= ~IXGB_RCTL_CFIEN; - rctl &= ~IXGB_RCTL_VFE; - } else { - if (netdev->flags & IFF_ALLMULTI) { - rctl |= IXGB_RCTL_MPE; - rctl &= ~IXGB_RCTL_UPE; - } else { - rctl &= ~(IXGB_RCTL_UPE | IXGB_RCTL_MPE); - } - /* enable VLAN filtering */ - rctl |= IXGB_RCTL_VFE; - rctl &= ~IXGB_RCTL_CFIEN; - } - - if (netdev_mc_count(netdev) > IXGB_MAX_NUM_MULTICAST_ADDRESSES) { - rctl |= IXGB_RCTL_MPE; - IXGB_WRITE_REG(hw, RCTL, rctl); - } else { - u8 *mta = kmalloc_array(ETH_ALEN, - IXGB_MAX_NUM_MULTICAST_ADDRESSES, - GFP_ATOMIC); - u8 *addr; - if (!mta) - goto alloc_failed; - - IXGB_WRITE_REG(hw, RCTL, rctl); - - addr = mta; - netdev_for_each_mc_addr(ha, netdev) { - memcpy(addr, ha->addr, ETH_ALEN); - addr += ETH_ALEN; - } - - ixgb_mc_addr_list_update(hw, mta, netdev_mc_count(netdev), 0); - kfree(mta); - } - -alloc_failed: - if (netdev->features & NETIF_F_HW_VLAN_CTAG_RX) - ixgb_vlan_strip_enable(adapter); - else - ixgb_vlan_strip_disable(adapter); - -} - -/** - * ixgb_watchdog - Timer Call-back - * @t: pointer to timer_list containing our private info pointer - **/ - -static void -ixgb_watchdog(struct timer_list *t) -{ - struct ixgb_adapter *adapter = from_timer(adapter, t, watchdog_timer); - struct net_device *netdev = adapter->netdev; - struct ixgb_desc_ring *txdr = &adapter->tx_ring; - - ixgb_check_for_link(&adapter->hw); - - if (ixgb_check_for_bad_link(&adapter->hw)) { - /* force the reset path */ - netif_stop_queue(netdev); - } - - if (adapter->hw.link_up) { - if (!netif_carrier_ok(netdev)) { - netdev_info(netdev, - "NIC Link is Up 10 Gbps Full Duplex, Flow Control: 
%s\n", - (adapter->hw.fc.type == ixgb_fc_full) ? - "RX/TX" : - (adapter->hw.fc.type == ixgb_fc_rx_pause) ? - "RX" : - (adapter->hw.fc.type == ixgb_fc_tx_pause) ? - "TX" : "None"); - adapter->link_speed = 10000; - adapter->link_duplex = FULL_DUPLEX; - netif_carrier_on(netdev); - } - } else { - if (netif_carrier_ok(netdev)) { - adapter->link_speed = 0; - adapter->link_duplex = 0; - netdev_info(netdev, "NIC Link is Down\n"); - netif_carrier_off(netdev); - } - } - - ixgb_update_stats(adapter); - - if (!netif_carrier_ok(netdev)) { - if (IXGB_DESC_UNUSED(txdr) + 1 < txdr->count) { - /* We've lost link, so the controller stops DMA, - * but we've got queued Tx work that's never going - * to get done, so reset controller to flush Tx. - * (Do the reset outside of interrupt context). */ - schedule_work(&adapter->tx_timeout_task); - /* return immediately since reset is imminent */ - return; - } - } - - /* Force detection of hung controller every watchdog period */ - adapter->detect_tx_hung = true; - - /* generate an interrupt to force clean up of any stragglers */ - IXGB_WRITE_REG(&adapter->hw, ICS, IXGB_INT_TXDW); - - /* Reset the timer */ - mod_timer(&adapter->watchdog_timer, jiffies + 2 * HZ); -} - -#define IXGB_TX_FLAGS_CSUM 0x00000001 -#define IXGB_TX_FLAGS_VLAN 0x00000002 -#define IXGB_TX_FLAGS_TSO 0x00000004 - -static int -ixgb_tso(struct ixgb_adapter *adapter, struct sk_buff *skb) -{ - struct ixgb_context_desc *context_desc; - unsigned int i; - u8 ipcss, ipcso, tucss, tucso, hdr_len; - u16 ipcse, tucse, mss; - - if (likely(skb_is_gso(skb))) { - struct ixgb_buffer *buffer_info; - struct iphdr *iph; - int err; - - err = skb_cow_head(skb, 0); - if (err < 0) - return err; - - hdr_len = skb_tcp_all_headers(skb); - mss = skb_shinfo(skb)->gso_size; - iph = ip_hdr(skb); - iph->tot_len = 0; - iph->check = 0; - tcp_hdr(skb)->check = ~csum_tcpudp_magic(iph->saddr, - iph->daddr, 0, - IPPROTO_TCP, 0); - ipcss = skb_network_offset(skb); - ipcso = (void *)&(iph->check) - (void 
*)skb->data; - ipcse = skb_transport_offset(skb) - 1; - tucss = skb_transport_offset(skb); - tucso = (void *)&(tcp_hdr(skb)->check) - (void *)skb->data; - tucse = 0; - - i = adapter->tx_ring.next_to_use; - context_desc = IXGB_CONTEXT_DESC(adapter->tx_ring, i); - buffer_info = &adapter->tx_ring.buffer_info[i]; - WARN_ON(buffer_info->dma != 0); - - context_desc->ipcss = ipcss; - context_desc->ipcso = ipcso; - context_desc->ipcse = cpu_to_le16(ipcse); - context_desc->tucss = tucss; - context_desc->tucso = tucso; - context_desc->tucse = cpu_to_le16(tucse); - context_desc->mss = cpu_to_le16(mss); - context_desc->hdr_len = hdr_len; - context_desc->status = 0; - context_desc->cmd_type_len = cpu_to_le32( - IXGB_CONTEXT_DESC_TYPE - | IXGB_CONTEXT_DESC_CMD_TSE - | IXGB_CONTEXT_DESC_CMD_IP - | IXGB_CONTEXT_DESC_CMD_TCP - | IXGB_CONTEXT_DESC_CMD_IDE - | (skb->len - (hdr_len))); - - - if (++i == adapter->tx_ring.count) i = 0; - adapter->tx_ring.next_to_use = i; - - return 1; - } - - return 0; -} - -static bool -ixgb_tx_csum(struct ixgb_adapter *adapter, struct sk_buff *skb) -{ - struct ixgb_context_desc *context_desc; - unsigned int i; - u8 css, cso; - - if (likely(skb->ip_summed == CHECKSUM_PARTIAL)) { - struct ixgb_buffer *buffer_info; - css = skb_checksum_start_offset(skb); - cso = css + skb->csum_offset; - - i = adapter->tx_ring.next_to_use; - context_desc = IXGB_CONTEXT_DESC(adapter->tx_ring, i); - buffer_info = &adapter->tx_ring.buffer_info[i]; - WARN_ON(buffer_info->dma != 0); - - context_desc->tucss = css; - context_desc->tucso = cso; - context_desc->tucse = 0; - /* zero out any previously existing data in one instruction */ - *(u32 *)&(context_desc->ipcss) = 0; - context_desc->status = 0; - context_desc->hdr_len = 0; - context_desc->mss = 0; - context_desc->cmd_type_len = - cpu_to_le32(IXGB_CONTEXT_DESC_TYPE - | IXGB_TX_DESC_CMD_IDE); - - if (++i == adapter->tx_ring.count) i = 0; - adapter->tx_ring.next_to_use = i; - - return true; - } - - return false; -} - -#define 
IXGB_MAX_TXD_PWR 14 -#define IXGB_MAX_DATA_PER_TXD (1<<IXGB_MAX_TXD_PWR) - -static int -ixgb_tx_map(struct ixgb_adapter *adapter, struct sk_buff *skb, - unsigned int first) -{ - struct ixgb_desc_ring *tx_ring = &adapter->tx_ring; - struct pci_dev *pdev = adapter->pdev; - struct ixgb_buffer *buffer_info; - int len = skb_headlen(skb); - unsigned int offset = 0, size, count = 0, i; - unsigned int mss = skb_shinfo(skb)->gso_size; - unsigned int nr_frags = skb_shinfo(skb)->nr_frags; - unsigned int f; - - i = tx_ring->next_to_use; - - while (len) { - buffer_info = &tx_ring->buffer_info[i]; - size = min(len, IXGB_MAX_DATA_PER_TXD); - /* Workaround for premature desc write-backs - * in TSO mode. Append 4-byte sentinel desc */ - if (unlikely(mss && !nr_frags && size == len && size > 8)) - size -= 4; - - buffer_info->length = size; - WARN_ON(buffer_info->dma != 0); - buffer_info->time_stamp = jiffies; - buffer_info->mapped_as_page = false; - buffer_info->dma = dma_map_single(&pdev->dev, - skb->data + offset, - size, DMA_TO_DEVICE); - if (dma_mapping_error(&pdev->dev, buffer_info->dma)) - goto dma_error; - buffer_info->next_to_watch = 0; - - len -= size; - offset += size; - count++; - if (len) { - i++; - if (i == tx_ring->count) - i = 0; - } - } - - for (f = 0; f < nr_frags; f++) { - const skb_frag_t *frag = &skb_shinfo(skb)->frags[f]; - len = skb_frag_size(frag); - offset = 0; - - while (len) { - i++; - if (i == tx_ring->count) - i = 0; - - buffer_info = &tx_ring->buffer_info[i]; - size = min(len, IXGB_MAX_DATA_PER_TXD); - - /* Workaround for premature desc write-backs - * in TSO mode. 
Append 4-byte sentinel desc */ - if (unlikely(mss && (f == (nr_frags - 1)) - && size == len && size > 8)) - size -= 4; - - buffer_info->length = size; - buffer_info->time_stamp = jiffies; - buffer_info->mapped_as_page = true; - buffer_info->dma = - skb_frag_dma_map(&pdev->dev, frag, offset, size, - DMA_TO_DEVICE); - if (dma_mapping_error(&pdev->dev, buffer_info->dma)) - goto dma_error; - buffer_info->next_to_watch = 0; - - len -= size; - offset += size; - count++; - } - } - tx_ring->buffer_info[i].skb = skb; - tx_ring->buffer_info[first].next_to_watch = i; - - return count; - -dma_error: - dev_err(&pdev->dev, "TX DMA map failed\n"); - buffer_info->dma = 0; - if (count) - count--; - - while (count--) { - if (i==0) - i += tx_ring->count; - i--; - buffer_info = &tx_ring->buffer_info[i]; - ixgb_unmap_and_free_tx_resource(adapter, buffer_info); - } - - return 0; -} - -static void -ixgb_tx_queue(struct ixgb_adapter *adapter, int count, int vlan_id,int tx_flags) -{ - struct ixgb_desc_ring *tx_ring = &adapter->tx_ring; - struct ixgb_tx_desc *tx_desc = NULL; - struct ixgb_buffer *buffer_info; - u32 cmd_type_len = adapter->tx_cmd_type; - u8 status = 0; - u8 popts = 0; - unsigned int i; - - if (tx_flags & IXGB_TX_FLAGS_TSO) { - cmd_type_len |= IXGB_TX_DESC_CMD_TSE; - popts |= (IXGB_TX_DESC_POPTS_IXSM | IXGB_TX_DESC_POPTS_TXSM); - } - - if (tx_flags & IXGB_TX_FLAGS_CSUM) - popts |= IXGB_TX_DESC_POPTS_TXSM; - - if (tx_flags & IXGB_TX_FLAGS_VLAN) - cmd_type_len |= IXGB_TX_DESC_CMD_VLE; - - i = tx_ring->next_to_use; - - while (count--) { - buffer_info = &tx_ring->buffer_info[i]; - tx_desc = IXGB_TX_DESC(*tx_ring, i); - tx_desc->buff_addr = cpu_to_le64(buffer_info->dma); - tx_desc->cmd_type_len = - cpu_to_le32(cmd_type_len | buffer_info->length); - tx_desc->status = status; - tx_desc->popts = popts; - tx_desc->vlan = cpu_to_le16(vlan_id); - - if (++i == tx_ring->count) i = 0; - } - - tx_desc->cmd_type_len |= - cpu_to_le32(IXGB_TX_DESC_CMD_EOP | IXGB_TX_DESC_CMD_RS); - - /* Force 
memory writes to complete before letting h/w - * know there are new descriptors to fetch. (Only - * applicable for weak-ordered memory model archs, - * such as IA-64). */ - wmb(); - - tx_ring->next_to_use = i; - IXGB_WRITE_REG(&adapter->hw, TDT, i); -} - -static int __ixgb_maybe_stop_tx(struct net_device *netdev, int size) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - struct ixgb_desc_ring *tx_ring = &adapter->tx_ring; - - netif_stop_queue(netdev); - /* Herbert's original patch had: - * smp_mb__after_netif_stop_queue(); - * but since that doesn't exist yet, just open code it. */ - smp_mb(); - - /* We need to check again in a case another CPU has just - * made room available. */ - if (likely(IXGB_DESC_UNUSED(tx_ring) < size)) - return -EBUSY; - - /* A reprieve! */ - netif_start_queue(netdev); - ++adapter->restart_queue; - return 0; -} - -static int ixgb_maybe_stop_tx(struct net_device *netdev, - struct ixgb_desc_ring *tx_ring, int size) -{ - if (likely(IXGB_DESC_UNUSED(tx_ring) >= size)) - return 0; - return __ixgb_maybe_stop_tx(netdev, size); -} - - -/* Tx Descriptors needed, worst case */ -#define TXD_USE_COUNT(S) (((S) >> IXGB_MAX_TXD_PWR) + \ - (((S) & (IXGB_MAX_DATA_PER_TXD - 1)) ? 
1 : 0)) -#define DESC_NEEDED TXD_USE_COUNT(IXGB_MAX_DATA_PER_TXD) /* skb->data */ + \ - MAX_SKB_FRAGS * TXD_USE_COUNT(PAGE_SIZE) + 1 /* for context */ \ - + 1 /* one more needed for sentinel TSO workaround */ - -static netdev_tx_t -ixgb_xmit_frame(struct sk_buff *skb, struct net_device *netdev) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - unsigned int first; - unsigned int tx_flags = 0; - int vlan_id = 0; - int count = 0; - int tso; - - if (test_bit(__IXGB_DOWN, &adapter->flags)) { - dev_kfree_skb_any(skb); - return NETDEV_TX_OK; - } - - if (skb->len <= 0) { - dev_kfree_skb_any(skb); - return NETDEV_TX_OK; - } - - if (unlikely(ixgb_maybe_stop_tx(netdev, &adapter->tx_ring, - DESC_NEEDED))) - return NETDEV_TX_BUSY; - - if (skb_vlan_tag_present(skb)) { - tx_flags |= IXGB_TX_FLAGS_VLAN; - vlan_id = skb_vlan_tag_get(skb); - } - - first = adapter->tx_ring.next_to_use; - - tso = ixgb_tso(adapter, skb); - if (tso < 0) { - dev_kfree_skb_any(skb); - return NETDEV_TX_OK; - } - - if (likely(tso)) - tx_flags |= IXGB_TX_FLAGS_TSO; - else if (ixgb_tx_csum(adapter, skb)) - tx_flags |= IXGB_TX_FLAGS_CSUM; - - count = ixgb_tx_map(adapter, skb, first); - - if (count) { - ixgb_tx_queue(adapter, count, vlan_id, tx_flags); - /* Make sure there is space in the ring for the next send. 
*/ - ixgb_maybe_stop_tx(netdev, &adapter->tx_ring, DESC_NEEDED); - - } else { - dev_kfree_skb_any(skb); - adapter->tx_ring.buffer_info[first].time_stamp = 0; - adapter->tx_ring.next_to_use = first; - } - - return NETDEV_TX_OK; -} - -/** - * ixgb_tx_timeout - Respond to a Tx Hang - * @netdev: network interface device structure - * @txqueue: queue hanging (unused) - **/ - -static void -ixgb_tx_timeout(struct net_device *netdev, unsigned int __always_unused txqueue) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - - /* Do the reset outside of interrupt context */ - schedule_work(&adapter->tx_timeout_task); -} - -static void -ixgb_tx_timeout_task(struct work_struct *work) -{ - struct ixgb_adapter *adapter = - container_of(work, struct ixgb_adapter, tx_timeout_task); - - adapter->tx_timeout_count++; - ixgb_down(adapter, true); - ixgb_up(adapter); -} - -/** - * ixgb_change_mtu - Change the Maximum Transfer Unit - * @netdev: network interface device structure - * @new_mtu: new value for maximum frame size - * - * Returns 0 on success, negative on failure - **/ - -static int -ixgb_change_mtu(struct net_device *netdev, int new_mtu) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - int max_frame = new_mtu + ENET_HEADER_SIZE + ENET_FCS_LENGTH; - - if (netif_running(netdev)) - ixgb_down(adapter, true); - - adapter->rx_buffer_len = max_frame + 8; /* + 8 for errata */ - - netdev->mtu = new_mtu; - - if (netif_running(netdev)) - ixgb_up(adapter); - - return 0; -} - -/** - * ixgb_update_stats - Update the board statistics counters. 
- * @adapter: board private structure - **/ - -void -ixgb_update_stats(struct ixgb_adapter *adapter) -{ - struct net_device *netdev = adapter->netdev; - struct pci_dev *pdev = adapter->pdev; - - /* Prevent stats update while adapter is being reset */ - if (pci_channel_offline(pdev)) - return; - - if ((netdev->flags & IFF_PROMISC) || (netdev->flags & IFF_ALLMULTI) || - (netdev_mc_count(netdev) > IXGB_MAX_NUM_MULTICAST_ADDRESSES)) { - u64 multi = IXGB_READ_REG(&adapter->hw, MPRCL); - u32 bcast_l = IXGB_READ_REG(&adapter->hw, BPRCL); - u32 bcast_h = IXGB_READ_REG(&adapter->hw, BPRCH); - u64 bcast = ((u64)bcast_h << 32) | bcast_l; - - multi |= ((u64)IXGB_READ_REG(&adapter->hw, MPRCH) << 32); - /* fix up multicast stats by removing broadcasts */ - if (multi >= bcast) - multi -= bcast; - - adapter->stats.mprcl += (multi & 0xFFFFFFFF); - adapter->stats.mprch += (multi >> 32); - adapter->stats.bprcl += bcast_l; - adapter->stats.bprch += bcast_h; - } else { - adapter->stats.mprcl += IXGB_READ_REG(&adapter->hw, MPRCL); - adapter->stats.mprch += IXGB_READ_REG(&adapter->hw, MPRCH); - adapter->stats.bprcl += IXGB_READ_REG(&adapter->hw, BPRCL); - adapter->stats.bprch += IXGB_READ_REG(&adapter->hw, BPRCH); - } - adapter->stats.tprl += IXGB_READ_REG(&adapter->hw, TPRL); - adapter->stats.tprh += IXGB_READ_REG(&adapter->hw, TPRH); - adapter->stats.gprcl += IXGB_READ_REG(&adapter->hw, GPRCL); - adapter->stats.gprch += IXGB_READ_REG(&adapter->hw, GPRCH); - adapter->stats.uprcl += IXGB_READ_REG(&adapter->hw, UPRCL); - adapter->stats.uprch += IXGB_READ_REG(&adapter->hw, UPRCH); - adapter->stats.vprcl += IXGB_READ_REG(&adapter->hw, VPRCL); - adapter->stats.vprch += IXGB_READ_REG(&adapter->hw, VPRCH); - adapter->stats.jprcl += IXGB_READ_REG(&adapter->hw, JPRCL); - adapter->stats.jprch += IXGB_READ_REG(&adapter->hw, JPRCH); - adapter->stats.gorcl += IXGB_READ_REG(&adapter->hw, GORCL); - adapter->stats.gorch += IXGB_READ_REG(&adapter->hw, GORCH); - adapter->stats.torl += 
IXGB_READ_REG(&adapter->hw, TORL); - adapter->stats.torh += IXGB_READ_REG(&adapter->hw, TORH); - adapter->stats.rnbc += IXGB_READ_REG(&adapter->hw, RNBC); - adapter->stats.ruc += IXGB_READ_REG(&adapter->hw, RUC); - adapter->stats.roc += IXGB_READ_REG(&adapter->hw, ROC); - adapter->stats.rlec += IXGB_READ_REG(&adapter->hw, RLEC); - adapter->stats.crcerrs += IXGB_READ_REG(&adapter->hw, CRCERRS); - adapter->stats.icbc += IXGB_READ_REG(&adapter->hw, ICBC); - adapter->stats.ecbc += IXGB_READ_REG(&adapter->hw, ECBC); - adapter->stats.mpc += IXGB_READ_REG(&adapter->hw, MPC); - adapter->stats.tptl += IXGB_READ_REG(&adapter->hw, TPTL); - adapter->stats.tpth += IXGB_READ_REG(&adapter->hw, TPTH); - adapter->stats.gptcl += IXGB_READ_REG(&adapter->hw, GPTCL); - adapter->stats.gptch += IXGB_READ_REG(&adapter->hw, GPTCH); - adapter->stats.bptcl += IXGB_READ_REG(&adapter->hw, BPTCL); - adapter->stats.bptch += IXGB_READ_REG(&adapter->hw, BPTCH); - adapter->stats.mptcl += IXGB_READ_REG(&adapter->hw, MPTCL); - adapter->stats.mptch += IXGB_READ_REG(&adapter->hw, MPTCH); - adapter->stats.uptcl += IXGB_READ_REG(&adapter->hw, UPTCL); - adapter->stats.uptch += IXGB_READ_REG(&adapter->hw, UPTCH); - adapter->stats.vptcl += IXGB_READ_REG(&adapter->hw, VPTCL); - adapter->stats.vptch += IXGB_READ_REG(&adapter->hw, VPTCH); - adapter->stats.jptcl += IXGB_READ_REG(&adapter->hw, JPTCL); - adapter->stats.jptch += IXGB_READ_REG(&adapter->hw, JPTCH); - adapter->stats.gotcl += IXGB_READ_REG(&adapter->hw, GOTCL); - adapter->stats.gotch += IXGB_READ_REG(&adapter->hw, GOTCH); - adapter->stats.totl += IXGB_READ_REG(&adapter->hw, TOTL); - adapter->stats.toth += IXGB_READ_REG(&adapter->hw, TOTH); - adapter->stats.dc += IXGB_READ_REG(&adapter->hw, DC); - adapter->stats.plt64c += IXGB_READ_REG(&adapter->hw, PLT64C); - adapter->stats.tsctc += IXGB_READ_REG(&adapter->hw, TSCTC); - adapter->stats.tsctfc += IXGB_READ_REG(&adapter->hw, TSCTFC); - adapter->stats.ibic += IXGB_READ_REG(&adapter->hw, IBIC); - 
adapter->stats.rfc += IXGB_READ_REG(&adapter->hw, RFC); - adapter->stats.lfc += IXGB_READ_REG(&adapter->hw, LFC); - adapter->stats.pfrc += IXGB_READ_REG(&adapter->hw, PFRC); - adapter->stats.pftc += IXGB_READ_REG(&adapter->hw, PFTC); - adapter->stats.mcfrc += IXGB_READ_REG(&adapter->hw, MCFRC); - adapter->stats.mcftc += IXGB_READ_REG(&adapter->hw, MCFTC); - adapter->stats.xonrxc += IXGB_READ_REG(&adapter->hw, XONRXC); - adapter->stats.xontxc += IXGB_READ_REG(&adapter->hw, XONTXC); - adapter->stats.xoffrxc += IXGB_READ_REG(&adapter->hw, XOFFRXC); - adapter->stats.xofftxc += IXGB_READ_REG(&adapter->hw, XOFFTXC); - adapter->stats.rjc += IXGB_READ_REG(&adapter->hw, RJC); - - /* Fill out the OS statistics structure */ - - netdev->stats.rx_packets = adapter->stats.gprcl; - netdev->stats.tx_packets = adapter->stats.gptcl; - netdev->stats.rx_bytes = adapter->stats.gorcl; - netdev->stats.tx_bytes = adapter->stats.gotcl; - netdev->stats.multicast = adapter->stats.mprcl; - netdev->stats.collisions = 0; - - /* ignore RLEC as it reports errors for padded (<64bytes) frames - * with a length in the type/len field */ - netdev->stats.rx_errors = - /* adapter->stats.rnbc + */ adapter->stats.crcerrs + - adapter->stats.ruc + - adapter->stats.roc /*+ adapter->stats.rlec */ + - adapter->stats.icbc + - adapter->stats.ecbc + adapter->stats.mpc; - - /* see above - * netdev->stats.rx_length_errors = adapter->stats.rlec; - */ - - netdev->stats.rx_crc_errors = adapter->stats.crcerrs; - netdev->stats.rx_fifo_errors = adapter->stats.mpc; - netdev->stats.rx_missed_errors = adapter->stats.mpc; - netdev->stats.rx_over_errors = adapter->stats.mpc; - - netdev->stats.tx_errors = 0; - netdev->stats.rx_frame_errors = 0; - netdev->stats.tx_aborted_errors = 0; - netdev->stats.tx_carrier_errors = 0; - netdev->stats.tx_fifo_errors = 0; - netdev->stats.tx_heartbeat_errors = 0; - netdev->stats.tx_window_errors = 0; -} - -/** - * ixgb_intr - Interrupt Handler - * @irq: interrupt number - * @data: pointer to a 
network interface device structure - **/ - -static irqreturn_t -ixgb_intr(int irq, void *data) -{ - struct net_device *netdev = data; - struct ixgb_adapter *adapter = netdev_priv(netdev); - struct ixgb_hw *hw = &adapter->hw; - u32 icr = IXGB_READ_REG(hw, ICR); - - if (unlikely(!icr)) - return IRQ_NONE; /* Not our interrupt */ - - if (unlikely(icr & (IXGB_INT_RXSEQ | IXGB_INT_LSC))) - if (!test_bit(__IXGB_DOWN, &adapter->flags)) - mod_timer(&adapter->watchdog_timer, jiffies); - - if (napi_schedule_prep(&adapter->napi)) { - - /* Disable interrupts and register for poll. The flush - of the posted write is intentionally left out. - */ - - IXGB_WRITE_REG(&adapter->hw, IMC, ~0); - __napi_schedule(&adapter->napi); - } - return IRQ_HANDLED; -} - -/** - * ixgb_clean - NAPI Rx polling callback - * @napi: napi struct pointer - * @budget: max number of receives to clean - **/ - -static int -ixgb_clean(struct napi_struct *napi, int budget) -{ - struct ixgb_adapter *adapter = container_of(napi, struct ixgb_adapter, napi); - int work_done = 0; - - ixgb_clean_tx_irq(adapter); - ixgb_clean_rx_irq(adapter, &work_done, budget); - - /* If budget not fully consumed, exit the polling mode */ - if (work_done < budget) { - napi_complete_done(napi, work_done); - if (!test_bit(__IXGB_DOWN, &adapter->flags)) - ixgb_irq_enable(adapter); - } - - return work_done; -} - -/** - * ixgb_clean_tx_irq - Reclaim resources after transmit completes - * @adapter: board private structure - **/ - -static bool -ixgb_clean_tx_irq(struct ixgb_adapter *adapter) -{ - struct ixgb_desc_ring *tx_ring = &adapter->tx_ring; - struct net_device *netdev = adapter->netdev; - struct ixgb_tx_desc *tx_desc, *eop_desc; - struct ixgb_buffer *buffer_info; - unsigned int i, eop; - bool cleaned = false; - - i = tx_ring->next_to_clean; - eop = tx_ring->buffer_info[i].next_to_watch; - eop_desc = IXGB_TX_DESC(*tx_ring, eop); - - while (eop_desc->status & IXGB_TX_DESC_STATUS_DD) { - - rmb(); /* read buffer_info after eop_desc */ - 
for (cleaned = false; !cleaned; ) { - tx_desc = IXGB_TX_DESC(*tx_ring, i); - buffer_info = &tx_ring->buffer_info[i]; - - if (tx_desc->popts & - (IXGB_TX_DESC_POPTS_TXSM | - IXGB_TX_DESC_POPTS_IXSM)) - adapter->hw_csum_tx_good++; - - ixgb_unmap_and_free_tx_resource(adapter, buffer_info); - - *(u32 *)&(tx_desc->status) = 0; - - cleaned = (i == eop); - if (++i == tx_ring->count) i = 0; - } - - eop = tx_ring->buffer_info[i].next_to_watch; - eop_desc = IXGB_TX_DESC(*tx_ring, eop); - } - - tx_ring->next_to_clean = i; - - if (unlikely(cleaned && netif_carrier_ok(netdev) && - IXGB_DESC_UNUSED(tx_ring) >= DESC_NEEDED)) { - /* Make sure that anybody stopping the queue after this - * sees the new next_to_clean. */ - smp_mb(); - - if (netif_queue_stopped(netdev) && - !(test_bit(__IXGB_DOWN, &adapter->flags))) { - netif_wake_queue(netdev); - ++adapter->restart_queue; - } - } - - if (adapter->detect_tx_hung) { - /* detect a transmit hang in hardware, this serializes the - * check with the clearing of time_stamp and movement of i */ - adapter->detect_tx_hung = false; - if (tx_ring->buffer_info[eop].time_stamp && - time_after(jiffies, tx_ring->buffer_info[eop].time_stamp + HZ) - && !(IXGB_READ_REG(&adapter->hw, STATUS) & - IXGB_STATUS_TXOFF)) { - /* detected Tx unit hang */ - netif_err(adapter, drv, adapter->netdev, - "Detected Tx Unit Hang\n" - " TDH <%x>\n" - " TDT <%x>\n" - " next_to_use <%x>\n" - " next_to_clean <%x>\n" - "buffer_info[next_to_clean]\n" - " time_stamp <%lx>\n" - " next_to_watch <%x>\n" - " jiffies <%lx>\n" - " next_to_watch.status <%x>\n", - IXGB_READ_REG(&adapter->hw, TDH), - IXGB_READ_REG(&adapter->hw, TDT), - tx_ring->next_to_use, - tx_ring->next_to_clean, - tx_ring->buffer_info[eop].time_stamp, - eop, - jiffies, - eop_desc->status); - netif_stop_queue(netdev); - } - } - - return cleaned; -} - -/** - * ixgb_rx_checksum - Receive Checksum Offload for 82597. 
- * @adapter: board private structure - * @rx_desc: receive descriptor - * @skb: socket buffer with received data - **/ - -static void -ixgb_rx_checksum(struct ixgb_adapter *adapter, - struct ixgb_rx_desc *rx_desc, - struct sk_buff *skb) -{ - /* Ignore Checksum bit is set OR - * TCP Checksum has not been calculated - */ - if ((rx_desc->status & IXGB_RX_DESC_STATUS_IXSM) || - (!(rx_desc->status & IXGB_RX_DESC_STATUS_TCPCS))) { - skb_checksum_none_assert(skb); - return; - } - - /* At this point we know the hardware did the TCP checksum */ - /* now look at the TCP checksum error bit */ - if (rx_desc->errors & IXGB_RX_DESC_ERRORS_TCPE) { - /* let the stack verify checksum errors */ - skb_checksum_none_assert(skb); - adapter->hw_csum_rx_error++; - } else { - /* TCP checksum is good */ - skb->ip_summed = CHECKSUM_UNNECESSARY; - adapter->hw_csum_rx_good++; - } -} - -/* - * this should improve performance for small packets with large amounts - * of reassembly being done in the stack - */ -static void ixgb_check_copybreak(struct napi_struct *napi, - struct ixgb_buffer *buffer_info, - u32 length, struct sk_buff **skb) -{ - struct sk_buff *new_skb; - - if (length > copybreak) - return; - - new_skb = napi_alloc_skb(napi, length); - if (!new_skb) - return; - - skb_copy_to_linear_data_offset(new_skb, -NET_IP_ALIGN, - (*skb)->data - NET_IP_ALIGN, - length + NET_IP_ALIGN); - /* save the skb in buffer_info as good */ - buffer_info->skb = *skb; - *skb = new_skb; -} - -/** - * ixgb_clean_rx_irq - Send received data up the network stack, - * @adapter: board private structure - * @work_done: output pointer to amount of packets cleaned - * @work_to_do: how much work we can complete - **/ - -static bool -ixgb_clean_rx_irq(struct ixgb_adapter *adapter, int *work_done, int work_to_do) -{ - struct ixgb_desc_ring *rx_ring = &adapter->rx_ring; - struct net_device *netdev = adapter->netdev; - struct pci_dev *pdev = adapter->pdev; - struct ixgb_rx_desc *rx_desc, *next_rxd; - struct ixgb_buffer 
*buffer_info, *next_buffer, *next2_buffer; - u32 length; - unsigned int i, j; - int cleaned_count = 0; - bool cleaned = false; - - i = rx_ring->next_to_clean; - rx_desc = IXGB_RX_DESC(*rx_ring, i); - buffer_info = &rx_ring->buffer_info[i]; - - while (rx_desc->status & IXGB_RX_DESC_STATUS_DD) { - struct sk_buff *skb; - u8 status; - - if (*work_done >= work_to_do) - break; - - (*work_done)++; - rmb(); /* read descriptor and rx_buffer_info after status DD */ - status = rx_desc->status; - skb = buffer_info->skb; - buffer_info->skb = NULL; - - prefetch(skb->data - NET_IP_ALIGN); - - if (++i == rx_ring->count) - i = 0; - next_rxd = IXGB_RX_DESC(*rx_ring, i); - prefetch(next_rxd); - - j = i + 1; - if (j == rx_ring->count) - j = 0; - next2_buffer = &rx_ring->buffer_info[j]; - prefetch(next2_buffer); - - next_buffer = &rx_ring->buffer_info[i]; - - cleaned = true; - cleaned_count++; - - dma_unmap_single(&pdev->dev, - buffer_info->dma, - buffer_info->length, - DMA_FROM_DEVICE); - buffer_info->dma = 0; - - length = le16_to_cpu(rx_desc->length); - rx_desc->length = 0; - - if (unlikely(!(status & IXGB_RX_DESC_STATUS_EOP))) { - - /* All receives must fit into a single buffer */ - - pr_debug("Receive packet consumed multiple buffers length<%x>\n", - length); - - dev_kfree_skb_irq(skb); - goto rxdesc_done; - } - - if (unlikely(rx_desc->errors & - (IXGB_RX_DESC_ERRORS_CE | IXGB_RX_DESC_ERRORS_SE | - IXGB_RX_DESC_ERRORS_P | IXGB_RX_DESC_ERRORS_RXE))) { - dev_kfree_skb_irq(skb); - goto rxdesc_done; - } - - ixgb_check_copybreak(&adapter->napi, buffer_info, length, &skb); - - /* Good Receive */ - skb_put(skb, length); - - /* Receive Checksum Offload */ - ixgb_rx_checksum(adapter, rx_desc, skb); - - skb->protocol = eth_type_trans(skb, netdev); - if (status & IXGB_RX_DESC_STATUS_VP) - __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), - le16_to_cpu(rx_desc->special)); - - netif_receive_skb(skb); - -rxdesc_done: - /* clean up descriptor, might be written over by hw */ - rx_desc->status = 0; - 
- /* return some buffers to hardware, one at a time is too slow */ - if (unlikely(cleaned_count >= IXGB_RX_BUFFER_WRITE)) { - ixgb_alloc_rx_buffers(adapter, cleaned_count); - cleaned_count = 0; - } - - /* use prefetched values */ - rx_desc = next_rxd; - buffer_info = next_buffer; - } - - rx_ring->next_to_clean = i; - - cleaned_count = IXGB_DESC_UNUSED(rx_ring); - if (cleaned_count) - ixgb_alloc_rx_buffers(adapter, cleaned_count); - - return cleaned; -} - -/** - * ixgb_alloc_rx_buffers - Replace used receive buffers - * @adapter: address of board private structure - * @cleaned_count: how many buffers to allocate - **/ - -static void -ixgb_alloc_rx_buffers(struct ixgb_adapter *adapter, int cleaned_count) -{ - struct ixgb_desc_ring *rx_ring = &adapter->rx_ring; - struct net_device *netdev = adapter->netdev; - struct pci_dev *pdev = adapter->pdev; - struct ixgb_rx_desc *rx_desc; - struct ixgb_buffer *buffer_info; - struct sk_buff *skb; - unsigned int i; - long cleancount; - - i = rx_ring->next_to_use; - buffer_info = &rx_ring->buffer_info[i]; - cleancount = IXGB_DESC_UNUSED(rx_ring); - - - /* leave three descriptors unused */ - while (--cleancount > 2 && cleaned_count--) { - /* recycle! its good for you */ - skb = buffer_info->skb; - if (skb) { - skb_trim(skb, 0); - goto map_skb; - } - - skb = netdev_alloc_skb_ip_align(netdev, adapter->rx_buffer_len); - if (unlikely(!skb)) { - /* Better luck next round */ - adapter->alloc_rx_buff_failed++; - break; - } - - buffer_info->skb = skb; - buffer_info->length = adapter->rx_buffer_len; -map_skb: - buffer_info->dma = dma_map_single(&pdev->dev, - skb->data, - adapter->rx_buffer_len, - DMA_FROM_DEVICE); - if (dma_mapping_error(&pdev->dev, buffer_info->dma)) { - adapter->alloc_rx_buff_failed++; - break; - } - - rx_desc = IXGB_RX_DESC(*rx_ring, i); - rx_desc->buff_addr = cpu_to_le64(buffer_info->dma); - /* guarantee DD bit not set now before h/w gets descriptor - * this is the rest of the workaround for h/w double - * writeback. 
*/ - rx_desc->status = 0; - - - if (++i == rx_ring->count) - i = 0; - buffer_info = &rx_ring->buffer_info[i]; - } - - if (likely(rx_ring->next_to_use != i)) { - rx_ring->next_to_use = i; - if (unlikely(i-- == 0)) - i = (rx_ring->count - 1); - - /* Force memory writes to complete before letting h/w - * know there are new descriptors to fetch. (Only - * applicable for weak-ordered memory model archs, such - * as IA-64). */ - wmb(); - IXGB_WRITE_REG(&adapter->hw, RDT, i); - } -} - -static void -ixgb_vlan_strip_enable(struct ixgb_adapter *adapter) -{ - u32 ctrl; - - /* enable VLAN tag insert/strip */ - ctrl = IXGB_READ_REG(&adapter->hw, CTRL0); - ctrl |= IXGB_CTRL0_VME; - IXGB_WRITE_REG(&adapter->hw, CTRL0, ctrl); -} - -static void -ixgb_vlan_strip_disable(struct ixgb_adapter *adapter) -{ - u32 ctrl; - - /* disable VLAN tag insert/strip */ - ctrl = IXGB_READ_REG(&adapter->hw, CTRL0); - ctrl &= ~IXGB_CTRL0_VME; - IXGB_WRITE_REG(&adapter->hw, CTRL0, ctrl); -} - -static int -ixgb_vlan_rx_add_vid(struct net_device *netdev, __be16 proto, u16 vid) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - u32 vfta, index; - - /* add VID to filter table */ - - index = (vid >> 5) & 0x7F; - vfta = IXGB_READ_REG_ARRAY(&adapter->hw, VFTA, index); - vfta |= (1 << (vid & 0x1F)); - ixgb_write_vfta(&adapter->hw, index, vfta); - set_bit(vid, adapter->active_vlans); - - return 0; -} - -static int -ixgb_vlan_rx_kill_vid(struct net_device *netdev, __be16 proto, u16 vid) -{ - struct ixgb_adapter *adapter = netdev_priv(netdev); - u32 vfta, index; - - /* remove VID from filter table */ - - index = (vid >> 5) & 0x7F; - vfta = IXGB_READ_REG_ARRAY(&adapter->hw, VFTA, index); - vfta &= ~(1 << (vid & 0x1F)); - ixgb_write_vfta(&adapter->hw, index, vfta); - clear_bit(vid, adapter->active_vlans); - - return 0; -} - -static void -ixgb_restore_vlan(struct ixgb_adapter *adapter) -{ - u16 vid; - - for_each_set_bit(vid, adapter->active_vlans, VLAN_N_VID) - ixgb_vlan_rx_add_vid(adapter->netdev, 
htons(ETH_P_8021Q), vid); -} - -/** - * ixgb_io_error_detected - called when PCI error is detected - * @pdev: pointer to pci device with error - * @state: pci channel state after error - * - * This callback is called by the PCI subsystem whenever - * a PCI bus error is detected. - */ -static pci_ers_result_t ixgb_io_error_detected(struct pci_dev *pdev, - pci_channel_state_t state) -{ - struct net_device *netdev = pci_get_drvdata(pdev); - struct ixgb_adapter *adapter = netdev_priv(netdev); - - netif_device_detach(netdev); - - if (state == pci_channel_io_perm_failure) - return PCI_ERS_RESULT_DISCONNECT; - - if (netif_running(netdev)) - ixgb_down(adapter, true); - - pci_disable_device(pdev); - - /* Request a slot reset. */ - return PCI_ERS_RESULT_NEED_RESET; -} - -/** - * ixgb_io_slot_reset - called after the pci bus has been reset. - * @pdev: pointer to pci device with error - * - * This callback is called after the PCI bus has been reset. - * Basically, this tries to restart the card from scratch. - * This is a shortened version of the device probe/discovery code, - * it resembles the first-half of the ixgb_probe() routine. 
- */ -static pci_ers_result_t ixgb_io_slot_reset(struct pci_dev *pdev) -{ - struct net_device *netdev = pci_get_drvdata(pdev); - struct ixgb_adapter *adapter = netdev_priv(netdev); - u8 addr[ETH_ALEN]; - - if (pci_enable_device(pdev)) { - netif_err(adapter, probe, adapter->netdev, - "Cannot re-enable PCI device after reset\n"); - return PCI_ERS_RESULT_DISCONNECT; - } - - /* Perform card reset only on one instance of the card */ - if (0 != PCI_FUNC (pdev->devfn)) - return PCI_ERS_RESULT_RECOVERED; - - pci_set_master(pdev); - - netif_carrier_off(netdev); - netif_stop_queue(netdev); - ixgb_reset(adapter); - - /* Make sure the EEPROM is good */ - if (!ixgb_validate_eeprom_checksum(&adapter->hw)) { - netif_err(adapter, probe, adapter->netdev, - "After reset, the EEPROM checksum is not valid\n"); - return PCI_ERS_RESULT_DISCONNECT; - } - ixgb_get_ee_mac_addr(&adapter->hw, addr); - eth_hw_addr_set(netdev, addr); - memcpy(netdev->perm_addr, netdev->dev_addr, netdev->addr_len); - - if (!is_valid_ether_addr(netdev->perm_addr)) { - netif_err(adapter, probe, adapter->netdev, - "After reset, invalid MAC address\n"); - return PCI_ERS_RESULT_DISCONNECT; - } - - return PCI_ERS_RESULT_RECOVERED; -} - -/** - * ixgb_io_resume - called when its OK to resume normal operations - * @pdev: pointer to pci device with error - * - * The error recovery driver tells us that its OK to resume - * normal operation. Implementation resembles the second-half - * of the ixgb_probe() routine. 
- */
-static void ixgb_io_resume(struct pci_dev *pdev)
-{
-	struct net_device *netdev = pci_get_drvdata(pdev);
-	struct ixgb_adapter *adapter = netdev_priv(netdev);
-
-	pci_set_master(pdev);
-
-	if (netif_running(netdev)) {
-		if (ixgb_up(adapter)) {
-			pr_err("can't bring device back up after reset\n");
-			return;
-		}
-	}
-
-	netif_device_attach(netdev);
-	mod_timer(&adapter->watchdog_timer, jiffies);
-}
-
-/* ixgb_main.c */
diff --git a/drivers/net/ethernet/intel/ixgb/ixgb_osdep.h b/drivers/net/ethernet/intel/ixgb/ixgb_osdep.h
deleted file mode 100644
index 7bd54efa698d..000000000000
--- a/drivers/net/ethernet/intel/ixgb/ixgb_osdep.h
+++ /dev/null
@@ -1,39 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright(c) 1999 - 2008 Intel Corporation. */
-
-/* glue for the OS independent part of ixgb
- * includes register access macros
- */
-
-#ifndef _IXGB_OSDEP_H_
-#define _IXGB_OSDEP_H_
-
-#include <linux/types.h>
-#include <linux/delay.h>
-#include <asm/io.h>
-#include <linux/interrupt.h>
-#include <linux/sched.h>
-#include <linux/if_ether.h>
-
-#undef ASSERT
-#define ASSERT(x)	BUG_ON(!(x))
-
-#define ENTER() pr_debug("%s\n", __func__);
-
-#define IXGB_WRITE_REG(a, reg, value) ( \
-	writel((value), ((a)->hw_addr + IXGB_##reg)))
-
-#define IXGB_READ_REG(a, reg) ( \
-	readl((a)->hw_addr + IXGB_##reg))
-
-#define IXGB_WRITE_REG_ARRAY(a, reg, offset, value) ( \
-	writel((value), ((a)->hw_addr + IXGB_##reg + ((offset) << 2))))
-
-#define IXGB_READ_REG_ARRAY(a, reg, offset) ( \
-	readl((a)->hw_addr + IXGB_##reg + ((offset) << 2)))
-
-#define IXGB_WRITE_FLUSH(a) IXGB_READ_REG(a, STATUS)
-
-#define IXGB_MEMCPY memcpy
-
-#endif /* _IXGB_OSDEP_H_ */
diff --git a/drivers/net/ethernet/intel/ixgb/ixgb_param.c b/drivers/net/ethernet/intel/ixgb/ixgb_param.c
deleted file mode 100644
index d40f96250691..000000000000
--- a/drivers/net/ethernet/intel/ixgb/ixgb_param.c
+++ /dev/null
@@ -1,442 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/* Copyright(c) 1999 - 2008 Intel
Corporation. */ - -#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt - -#include "ixgb.h" - -/* This is the only thing that needs to be changed to adjust the - * maximum number of ports that the driver can manage. - */ - -#define IXGB_MAX_NIC 8 - -#define OPTION_UNSET -1 -#define OPTION_DISABLED 0 -#define OPTION_ENABLED 1 - -/* All parameters are treated the same, as an integer array of values. - * This macro just reduces the need to repeat the same declaration code - * over and over (plus this helps to avoid typo bugs). - */ - -#define IXGB_PARAM_INIT { [0 ... IXGB_MAX_NIC] = OPTION_UNSET } -#define IXGB_PARAM(X, desc) \ - static int X[IXGB_MAX_NIC+1] \ - = IXGB_PARAM_INIT; \ - static unsigned int num_##X = 0; \ - module_param_array_named(X, X, int, &num_##X, 0); \ - MODULE_PARM_DESC(X, desc); - -/* Transmit Descriptor Count - * - * Valid Range: 64-4096 - * - * Default Value: 256 - */ - -IXGB_PARAM(TxDescriptors, "Number of transmit descriptors"); - -/* Receive Descriptor Count - * - * Valid Range: 64-4096 - * - * Default Value: 1024 - */ - -IXGB_PARAM(RxDescriptors, "Number of receive descriptors"); - -/* User Specified Flow Control Override - * - * Valid Range: 0-3 - * - 0 - No Flow Control - * - 1 - Rx only, respond to PAUSE frames but do not generate them - * - 2 - Tx only, generate PAUSE frames but ignore them on receive - * - 3 - Full Flow Control Support - * - * Default Value: 2 - Tx only (silicon bug avoidance) - */ - -IXGB_PARAM(FlowControl, "Flow Control setting"); - -/* XsumRX - Receive Checksum Offload Enable/Disable - * - * Valid Range: 0, 1 - * - 0 - disables all checksum offload - * - 1 - enables receive IP/TCP/UDP checksum offload - * on 82597 based NICs - * - * Default Value: 1 - */ - -IXGB_PARAM(XsumRX, "Disable or enable Receive Checksum offload"); - -/* Transmit Interrupt Delay in units of 0.8192 microseconds - * - * Valid Range: 0-65535 - * - * Default Value: 32 - */ - -IXGB_PARAM(TxIntDelay, "Transmit Interrupt Delay"); - -/* Receive Interrupt 
Delay in units of 0.8192 microseconds - * - * Valid Range: 0-65535 - * - * Default Value: 72 - */ - -IXGB_PARAM(RxIntDelay, "Receive Interrupt Delay"); - -/* Receive Flow control high threshold (when we send a pause frame) - * (FCRTH) - * - * Valid Range: 1,536 - 262,136 (0x600 - 0x3FFF8, 8 byte granularity) - * - * Default Value: 196,608 (0x30000) - */ - -IXGB_PARAM(RxFCHighThresh, "Receive Flow Control High Threshold"); - -/* Receive Flow control low threshold (when we send a resume frame) - * (FCRTL) - * - * Valid Range: 64 - 262,136 (0x40 - 0x3FFF8, 8 byte granularity) - * must be less than high threshold by at least 8 bytes - * - * Default Value: 163,840 (0x28000) - */ - -IXGB_PARAM(RxFCLowThresh, "Receive Flow Control Low Threshold"); - -/* Flow control request timeout (how long to pause the link partner's tx) - * (PAP 15:0) - * - * Valid Range: 1 - 65535 - * - * Default Value: 65535 (0xffff) (we'll send an xon if we recover) - */ - -IXGB_PARAM(FCReqTimeout, "Flow Control Request Timeout"); - -/* Interrupt Delay Enable - * - * Valid Range: 0, 1 - * - * - 0 - disables transmit interrupt delay - * - 1 - enables transmmit interrupt delay - * - * Default Value: 1 - */ - -IXGB_PARAM(IntDelayEnable, "Transmit Interrupt Delay Enable"); - - -#define DEFAULT_TIDV 32 -#define MAX_TIDV 0xFFFF -#define MIN_TIDV 0 - -#define DEFAULT_RDTR 72 -#define MAX_RDTR 0xFFFF -#define MIN_RDTR 0 - -#define DEFAULT_FCRTL 0x28000 -#define DEFAULT_FCRTH 0x30000 -#define MIN_FCRTL 0 -#define MAX_FCRTL 0x3FFE8 -#define MIN_FCRTH 8 -#define MAX_FCRTH 0x3FFF0 - -#define MIN_FCPAUSE 1 -#define MAX_FCPAUSE 0xffff -#define DEFAULT_FCPAUSE 0xFFFF /* this may be too long */ - -struct ixgb_option { - enum { enable_option, range_option, list_option } type; - const char *name; - const char *err; - int def; - union { - struct { /* range_option info */ - int min; - int max; - } r; - struct { /* list_option info */ - int nr; - const struct ixgb_opt_list { - int i; - const char *str; - } *p; - } l; - 
} arg; -}; - -static int -ixgb_validate_option(unsigned int *value, const struct ixgb_option *opt) -{ - if (*value == OPTION_UNSET) { - *value = opt->def; - return 0; - } - - switch (opt->type) { - case enable_option: - switch (*value) { - case OPTION_ENABLED: - pr_info("%s Enabled\n", opt->name); - return 0; - case OPTION_DISABLED: - pr_info("%s Disabled\n", opt->name); - return 0; - } - break; - case range_option: - if (*value >= opt->arg.r.min && *value <= opt->arg.r.max) { - pr_info("%s set to %i\n", opt->name, *value); - return 0; - } - break; - case list_option: { - int i; - const struct ixgb_opt_list *ent; - - for (i = 0; i < opt->arg.l.nr; i++) { - ent = &opt->arg.l.p[i]; - if (*value == ent->i) { - if (ent->str[0] != '\0') - pr_info("%s\n", ent->str); - return 0; - } - } - } - break; - default: - BUG(); - } - - pr_info("Invalid %s specified (%i) %s\n", opt->name, *value, opt->err); - *value = opt->def; - return -1; -} - -/** - * ixgb_check_options - Range Checking for Command Line Parameters - * @adapter: board private structure - * - * This routine checks all command line parameters for valid user - * input. If an invalid value is given, or if no user specified - * value exists, a default value is used. The final value is stored - * in a variable in the adapter structure. 
- **/ - -void -ixgb_check_options(struct ixgb_adapter *adapter) -{ - int bd = adapter->bd_number; - if (bd >= IXGB_MAX_NIC) { - pr_notice("Warning: no configuration for board #%i\n", bd); - pr_notice("Using defaults for all values\n"); - } - - { /* Transmit Descriptor Count */ - static const struct ixgb_option opt = { - .type = range_option, - .name = "Transmit Descriptors", - .err = "using default of " __MODULE_STRING(DEFAULT_TXD), - .def = DEFAULT_TXD, - .arg = { .r = { .min = MIN_TXD, - .max = MAX_TXD}} - }; - struct ixgb_desc_ring *tx_ring = &adapter->tx_ring; - - if (num_TxDescriptors > bd) { - tx_ring->count = TxDescriptors[bd]; - ixgb_validate_option(&tx_ring->count, &opt); - } else { - tx_ring->count = opt.def; - } - tx_ring->count = ALIGN(tx_ring->count, IXGB_REQ_TX_DESCRIPTOR_MULTIPLE); - } - { /* Receive Descriptor Count */ - static const struct ixgb_option opt = { - .type = range_option, - .name = "Receive Descriptors", - .err = "using default of " __MODULE_STRING(DEFAULT_RXD), - .def = DEFAULT_RXD, - .arg = { .r = { .min = MIN_RXD, - .max = MAX_RXD}} - }; - struct ixgb_desc_ring *rx_ring = &adapter->rx_ring; - - if (num_RxDescriptors > bd) { - rx_ring->count = RxDescriptors[bd]; - ixgb_validate_option(&rx_ring->count, &opt); - } else { - rx_ring->count = opt.def; - } - rx_ring->count = ALIGN(rx_ring->count, IXGB_REQ_RX_DESCRIPTOR_MULTIPLE); - } - { /* Receive Checksum Offload Enable */ - static const struct ixgb_option opt = { - .type = enable_option, - .name = "Receive Checksum Offload", - .err = "defaulting to Enabled", - .def = OPTION_ENABLED - }; - - if (num_XsumRX > bd) { - unsigned int rx_csum = XsumRX[bd]; - ixgb_validate_option(&rx_csum, &opt); - adapter->rx_csum = rx_csum; - } else { - adapter->rx_csum = opt.def; - } - } - { /* Flow Control */ - - static const struct ixgb_opt_list fc_list[] = { - { ixgb_fc_none, "Flow Control Disabled" }, - { ixgb_fc_rx_pause, "Flow Control Receive Only" }, - { ixgb_fc_tx_pause, "Flow Control Transmit Only" }, 
- { ixgb_fc_full, "Flow Control Enabled" }, - { ixgb_fc_default, "Flow Control Hardware Default" } - }; - - static const struct ixgb_option opt = { - .type = list_option, - .name = "Flow Control", - .err = "reading default settings from EEPROM", - .def = ixgb_fc_tx_pause, - .arg = { .l = { .nr = ARRAY_SIZE(fc_list), - .p = fc_list }} - }; - - if (num_FlowControl > bd) { - unsigned int fc = FlowControl[bd]; - ixgb_validate_option(&fc, &opt); - adapter->hw.fc.type = fc; - } else { - adapter->hw.fc.type = opt.def; - } - } - { /* Receive Flow Control High Threshold */ - static const struct ixgb_option opt = { - .type = range_option, - .name = "Rx Flow Control High Threshold", - .err = "using default of " __MODULE_STRING(DEFAULT_FCRTH), - .def = DEFAULT_FCRTH, - .arg = { .r = { .min = MIN_FCRTH, - .max = MAX_FCRTH}} - }; - - if (num_RxFCHighThresh > bd) { - adapter->hw.fc.high_water = RxFCHighThresh[bd]; - ixgb_validate_option(&adapter->hw.fc.high_water, &opt); - } else { - adapter->hw.fc.high_water = opt.def; - } - if (!(adapter->hw.fc.type & ixgb_fc_tx_pause) ) - pr_info("Ignoring RxFCHighThresh when no RxFC\n"); - } - { /* Receive Flow Control Low Threshold */ - static const struct ixgb_option opt = { - .type = range_option, - .name = "Rx Flow Control Low Threshold", - .err = "using default of " __MODULE_STRING(DEFAULT_FCRTL), - .def = DEFAULT_FCRTL, - .arg = { .r = { .min = MIN_FCRTL, - .max = MAX_FCRTL}} - }; - - if (num_RxFCLowThresh > bd) { - adapter->hw.fc.low_water = RxFCLowThresh[bd]; - ixgb_validate_option(&adapter->hw.fc.low_water, &opt); - } else { - adapter->hw.fc.low_water = opt.def; - } - if (!(adapter->hw.fc.type & ixgb_fc_tx_pause) ) - pr_info("Ignoring RxFCLowThresh when no RxFC\n"); - } - { /* Flow Control Pause Time Request*/ - static const struct ixgb_option opt = { - .type = range_option, - .name = "Flow Control Pause Time Request", - .err = "using default of "__MODULE_STRING(DEFAULT_FCPAUSE), - .def = DEFAULT_FCPAUSE, - .arg = { .r = { .min = 
MIN_FCPAUSE, - .max = MAX_FCPAUSE}} - }; - - if (num_FCReqTimeout > bd) { - unsigned int pause_time = FCReqTimeout[bd]; - ixgb_validate_option(&pause_time, &opt); - adapter->hw.fc.pause_time = pause_time; - } else { - adapter->hw.fc.pause_time = opt.def; - } - if (!(adapter->hw.fc.type & ixgb_fc_tx_pause) ) - pr_info("Ignoring FCReqTimeout when no RxFC\n"); - } - /* high low and spacing check for rx flow control thresholds */ - if (adapter->hw.fc.type & ixgb_fc_tx_pause) { - /* high must be greater than low */ - if (adapter->hw.fc.high_water < (adapter->hw.fc.low_water + 8)) { - /* set defaults */ - pr_info("RxFCHighThresh must be >= (RxFCLowThresh + 8), Using Defaults\n"); - adapter->hw.fc.high_water = DEFAULT_FCRTH; - adapter->hw.fc.low_water = DEFAULT_FCRTL; - } - } - { /* Receive Interrupt Delay */ - static const struct ixgb_option opt = { - .type = range_option, - .name = "Receive Interrupt Delay", - .err = "using default of " __MODULE_STRING(DEFAULT_RDTR), - .def = DEFAULT_RDTR, - .arg = { .r = { .min = MIN_RDTR, - .max = MAX_RDTR}} - }; - - if (num_RxIntDelay > bd) { - adapter->rx_int_delay = RxIntDelay[bd]; - ixgb_validate_option(&adapter->rx_int_delay, &opt); - } else { - adapter->rx_int_delay = opt.def; - } - } - { /* Transmit Interrupt Delay */ - static const struct ixgb_option opt = { - .type = range_option, - .name = "Transmit Interrupt Delay", - .err = "using default of " __MODULE_STRING(DEFAULT_TIDV), - .def = DEFAULT_TIDV, - .arg = { .r = { .min = MIN_TIDV, - .max = MAX_TIDV}} - }; - - if (num_TxIntDelay > bd) { - adapter->tx_int_delay = TxIntDelay[bd]; - ixgb_validate_option(&adapter->tx_int_delay, &opt); - } else { - adapter->tx_int_delay = opt.def; - } - } - - { /* Transmit Interrupt Delay Enable */ - static const struct ixgb_option opt = { - .type = enable_option, - .name = "Tx Interrupt Delay Enable", - .err = "defaulting to Enabled", - .def = OPTION_ENABLED - }; - - if (num_IntDelayEnable > bd) { - unsigned int ide = IntDelayEnable[bd]; - 
ixgb_validate_option(&ide, &opt);
-			adapter->tx_int_delay_enable = ide;
-		} else {
-			adapter->tx_int_delay_enable = opt.def;
-		}
-	}
-}
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 8736ca4b2628..63d4e32df029 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -9,7 +9,6 @@
 #include <linux/pci.h>
 #include <linux/netdevice.h>
 #include <linux/cpumask.h>
-#include <linux/aer.h>
 #include <linux/if_vlan.h>
 #include <linux/jiffies.h>
 #include <linux/phy.h>
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index 6cfc9dc16537..0bbad4a5cc2f 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -2665,6 +2665,14 @@ static int ixgbe_get_rss_hash_opts(struct ixgbe_adapter *adapter,
 	return 0;
 }
 
+static int ixgbe_rss_indir_tbl_max(struct ixgbe_adapter *adapter)
+{
+	if (adapter->hw.mac.type < ixgbe_mac_X550)
+		return 16;
+	else
+		return 64;
+}
+
 static int ixgbe_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd,
 			   u32 *rule_locs)
 {
@@ -2673,7 +2681,8 @@ static int ixgbe_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd,
 
 	switch (cmd->cmd) {
 	case ETHTOOL_GRXRINGS:
-		cmd->data = adapter->num_rx_queues;
+		cmd->data = min_t(int, adapter->num_rx_queues,
+				  ixgbe_rss_indir_tbl_max(adapter));
 		ret = 0;
 		break;
 	case ETHTOOL_GRXCLSRLCNT:
@@ -3075,14 +3084,6 @@ static int ixgbe_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd)
 	return ret;
 }
 
-static int ixgbe_rss_indir_tbl_max(struct ixgbe_adapter *adapter)
-{
-	if (adapter->hw.mac.type < ixgbe_mac_X550)
-		return 16;
-	else
-		return 64;
-}
-
 static u32 ixgbe_get_rxfh_key_size(struct net_device *netdev)
 {
 	return IXGBE_RSS_KEY_SIZE;
@@ -3131,8 +3132,8 @@ static int ixgbe_set_rxfh(struct net_device *netdev, const u32 *indir,
 	int i;
 	u32 reta_entries =
ixgbe_rss_indir_tbl_entries(adapter);

-	if (hfunc)
-		return -EINVAL;
+	if (hfunc != ETH_RSS_HASH_NO_CHANGE && hfunc != ETH_RSS_HASH_TOP)
+		return -EOPNOTSUPP;

 	/* Fill out the redirection table */
 	if (indir) {
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 773c35fecace..e961ef4bbf4d 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -36,6 +36,7 @@
 #include <net/tc_act/tc_mirred.h>
 #include <net/vxlan.h>
 #include <net/mpls.h>
+#include <net/netdev_queues.h>
 #include <net/xdp_sock_drv.h>
 #include <net/xfrm.h>

@@ -1119,6 +1120,7 @@ static bool ixgbe_clean_tx_irq(struct ixgbe_q_vector *q_vector,
 	unsigned int total_bytes = 0, total_packets = 0, total_ipsec = 0;
 	unsigned int budget = q_vector->tx.work_limit;
 	unsigned int i = tx_ring->next_to_clean;
+	struct netdev_queue *txq;

 	if (test_bit(__IXGBE_DOWN, &adapter->state))
 		return true;
@@ -1249,24 +1251,14 @@ static bool ixgbe_clean_tx_irq(struct ixgbe_q_vector *q_vector,
 	if (ring_is_xdp(tx_ring))
 		return !!budget;

-	netdev_tx_completed_queue(txring_txq(tx_ring),
-				  total_packets, total_bytes);
-
 #define TX_WAKE_THRESHOLD (DESC_NEEDED * 2)
-	if (unlikely(total_packets && netif_carrier_ok(tx_ring->netdev) &&
-		     (ixgbe_desc_unused(tx_ring) >= TX_WAKE_THRESHOLD))) {
-		/* Make sure that anybody stopping the queue after this
-		 * sees the new next_to_clean.
-		 */
-		smp_mb();
-		if (__netif_subqueue_stopped(tx_ring->netdev,
-					     tx_ring->queue_index)
-		    && !test_bit(__IXGBE_DOWN, &adapter->state)) {
-			netif_wake_subqueue(tx_ring->netdev,
-					    tx_ring->queue_index);
-			++tx_ring->tx_stats.restart_queue;
-		}
-	}
+	txq = netdev_get_tx_queue(tx_ring->netdev, tx_ring->queue_index);
+	if (!__netif_txq_completed_wake(txq, total_packets, total_bytes,
+					ixgbe_desc_unused(tx_ring),
+					TX_WAKE_THRESHOLD,
+					netif_carrier_ok(tx_ring->netdev) &&
+					test_bit(__IXGBE_DOWN, &adapter->state)))
+		++tx_ring->tx_stats.restart_queue;

 	return !!budget;
 }
@@ -8270,22 +8262,10 @@ static void ixgbe_tx_olinfo_status(union ixgbe_adv_tx_desc *tx_desc,

 static int __ixgbe_maybe_stop_tx(struct ixgbe_ring *tx_ring, u16 size)
 {
-	netif_stop_subqueue(tx_ring->netdev, tx_ring->queue_index);
-
-	/* Herbert's original patch had:
-	 *  smp_mb__after_netif_stop_queue();
-	 * but since that doesn't exist yet, just open code it.
-	 */
-	smp_mb();
-
-	/* We need to check again in a case another CPU has just
-	 * made room available.
-	 */
-	if (likely(ixgbe_desc_unused(tx_ring) < size))
+	if (!netif_subqueue_try_stop(tx_ring->netdev, tx_ring->queue_index,
+				     ixgbe_desc_unused(tx_ring), size))
 		return -EBUSY;

-	/* A reprieve!
- use start_queue because it doesn't call schedule */
-	netif_start_subqueue(tx_ring->netdev, tx_ring->queue_index);
 	++tx_ring->tx_stats.restart_queue;
 	return 0;
 }
@@ -8818,7 +8798,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,

 			if (skb_cow_head(skb, 0))
 				goto out_drop;
-			vhdr = (struct vlan_ethhdr *)skb->data;
+			vhdr = skb_vlan_eth_hdr(skb);
 			vhdr->h_vlan_TCI = htons(tx_flags >>
 						 IXGBE_TX_FLAGS_VLAN_SHIFT);
 		} else {
diff --git a/drivers/net/ethernet/marvell/Kconfig b/drivers/net/ethernet/marvell/Kconfig
index f58a1c0144ba..884d64114bff 100644
--- a/drivers/net/ethernet/marvell/Kconfig
+++ b/drivers/net/ethernet/marvell/Kconfig
@@ -34,6 +34,7 @@ config MV643XX_ETH
 config MVMDIO
 	tristate "Marvell MDIO interface support"
 	depends on HAS_IOMEM
+	select MDIO_DEVRES
 	select PHYLIB
 	help
 	  This driver supports the MDIO interface found in the network
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
index 3ea00bc9b91c..adc953611913 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -6089,18 +6089,19 @@ static bool mvpp2_port_has_irqs(struct mvpp2 *priv,
 	return true;
 }

-static void mvpp2_port_copy_mac_addr(struct net_device *dev, struct mvpp2 *priv,
-				     struct fwnode_handle *fwnode,
-				     char **mac_from)
+static int mvpp2_port_copy_mac_addr(struct net_device *dev, struct mvpp2 *priv,
+				    struct fwnode_handle *fwnode,
+				    char **mac_from)
 {
 	struct mvpp2_port *port = netdev_priv(dev);
 	char hw_mac_addr[ETH_ALEN] = {0};
 	char fw_mac_addr[ETH_ALEN];
+	int ret;

 	if (!fwnode_get_mac_address(fwnode, fw_mac_addr)) {
 		*mac_from = "firmware node";
 		eth_hw_addr_set(dev, fw_mac_addr);
-		return;
+		return 0;
 	}

 	if (priv->hw_version == MVPP21) {
@@ -6108,19 +6109,24 @@ static void mvpp2_port_copy_mac_addr(struct net_device *dev, struct mvpp2 *priv,
 		if (is_valid_ether_addr(hw_mac_addr)) {
 			*mac_from = "hardware";
 			eth_hw_addr_set(dev, hw_mac_addr);
-			return;
+			return 0;
 		}
 	}
 	/* Only valid on OF enabled platforms */
-	if (!of_get_mac_address_nvmem(to_of_node(fwnode), fw_mac_addr)) {
+	ret = of_get_mac_address_nvmem(to_of_node(fwnode), fw_mac_addr);
+	if (ret == -EPROBE_DEFER)
+		return ret;
+	if (!ret) {
 		*mac_from = "nvmem cell";
 		eth_hw_addr_set(dev, fw_mac_addr);
-		return;
+		return 0;
 	}

 	*mac_from = "random";
 	eth_hw_addr_random(dev);
+
+	return 0;
 }

 static struct mvpp2_port *mvpp2_phylink_to_port(struct phylink_config *config)
@@ -6823,7 +6829,9 @@ static int mvpp2_port_probe(struct platform_device *pdev,
 	mutex_init(&port->gather_stats_lock);
 	INIT_DELAYED_WORK(&port->stats_work, mvpp2_gather_hw_statistics);

-	mvpp2_port_copy_mac_addr(dev, priv, port_fwnode, &mac_from);
+	err = mvpp2_port_copy_mac_addr(dev, priv, port_fwnode, &mac_from);
+	if (err < 0)
+		goto err_free_stats;

 	port->tx_ring_size = MVPP2_MAX_TXD_DFLT;
 	port->rx_ring_size = MVPP2_MAX_RXD_DFLT;
diff --git a/drivers/net/ethernet/marvell/octeon_ep/octep_cn9k_pf.c b/drivers/net/ethernet/marvell/octeon_ep/octep_cn9k_pf.c
index 6ad88d0fe43f..90c3a419932d 100644
--- a/drivers/net/ethernet/marvell/octeon_ep/octep_cn9k_pf.c
+++ b/drivers/net/ethernet/marvell/octeon_ep/octep_cn9k_pf.c
@@ -13,6 +13,12 @@
 #include "octep_main.h"
 #include "octep_regs_cn9k_pf.h"

+#define CTRL_MBOX_MAX_PF	128
+#define CTRL_MBOX_SZ		((size_t)(0x400000 / CTRL_MBOX_MAX_PF))
+
+#define FW_HB_INTERVAL_IN_SECS		1
+#define FW_HB_MISS_COUNT		10
+
 /* Names of Hardware non-queue generic interrupts */
 static char *cn93_non_ioq_msix_names[] = {
 	"epf_ire_rint",
@@ -198,7 +204,9 @@ static void octep_init_config_cn93_pf(struct octep_device *oct)
 {
 	struct octep_config *conf = oct->conf;
 	struct pci_dev *pdev = oct->pdev;
+	u8 link = 0;
 	u64 val;
+	int pos;

 	/* Read ring configuration:
 	 * PF ring count, number of VFs and rings per VF supported
@@ -234,7 +242,20 @@ static void octep_init_config_cn93_pf(struct octep_device *oct)
 	conf->msix_cfg.ioq_msix = conf->pf_ring_cfg.active_io_rings;
 	conf->msix_cfg.non_ioq_msix_names =
cn93_non_ioq_msix_names;

-	conf->ctrl_mbox_cfg.barmem_addr = (void __iomem *)oct->mmio[2].hw_addr + (0x400000ull * 7);
+	pos = pci_find_ext_capability(oct->pdev, PCI_EXT_CAP_ID_SRIOV);
+	if (pos) {
+		pci_read_config_byte(oct->pdev,
+				     pos + PCI_SRIOV_FUNC_LINK,
+				     &link);
+		link = PCI_DEVFN(PCI_SLOT(oct->pdev->devfn), link);
+	}
+	conf->ctrl_mbox_cfg.barmem_addr = (void __iomem *)oct->mmio[2].hw_addr +
+					  (0x400000ull * 7) +
+					  (link * CTRL_MBOX_SZ);
+
+	conf->hb_interval = FW_HB_INTERVAL_IN_SECS;
+	conf->max_hb_miss_cnt = FW_HB_MISS_COUNT;
+
 }

 /* Setup registers for a hardware Tx Queue */
@@ -352,19 +373,30 @@ static void octep_setup_mbox_regs_cn93_pf(struct octep_device *oct, int q_no)
 	mbox->mbox_read_reg = oct->mmio[0].hw_addr + CN93_SDP_R_MBOX_VF_PF_DATA(q_no);
 }

-/* Mailbox Interrupt handler */
-static void cn93_handle_pf_mbox_intr(struct octep_device *oct)
+/* Process non-ioq interrupts required to keep pf interface running.
+ * OEI_RINT is needed for control mailbox
+ */
+static bool octep_poll_non_ioq_interrupts_cn93_pf(struct octep_device *oct)
 {
-	u64 mbox_int_val = 0ULL, val = 0ULL, qno = 0ULL;
+	bool handled = false;
+	u64 reg0;

-	mbox_int_val = readq(oct->mbox[0]->mbox_int_reg);
-	for (qno = 0; qno < OCTEP_MAX_VF; qno++) {
-		val = readq(oct->mbox[qno]->mbox_read_reg);
-		dev_dbg(&oct->pdev->dev,
-			"PF MBOX READ: val:%llx from VF:%llx\n", val, qno);
+	/* Check for OEI INTR */
+	reg0 = octep_read_csr64(oct, CN93_SDP_EPF_OEI_RINT);
+	if (reg0) {
+		dev_info(&oct->pdev->dev,
+			 "Received OEI_RINT intr: 0x%llx\n",
+			 reg0);
+		octep_write_csr64(oct, CN93_SDP_EPF_OEI_RINT, reg0);
+		if (reg0 & CN93_SDP_EPF_OEI_RINT_DATA_BIT_MBOX)
+			queue_work(octep_wq, &oct->ctrl_mbox_task);
+		else if (reg0 & CN93_SDP_EPF_OEI_RINT_DATA_BIT_HBEAT)
+			atomic_set(&oct->hb_miss_cnt, 0);
+
+		handled = true;
 	}

-	writeq(mbox_int_val, oct->mbox[0]->mbox_int_reg);
+	return handled;
 }

 /* Interrupts handler for all non-queue generic interrupts.
*/
@@ -434,24 +466,9 @@ static irqreturn_t octep_non_ioq_intr_handler_cn93_pf(void *dev)
 		goto irq_handled;
 	}

-	/* Check for MBOX INTR */
-	reg_val = octep_read_csr64(oct, CN93_SDP_EPF_MBOX_RINT(0));
-	if (reg_val) {
-		dev_info(&pdev->dev,
-			 "Received MBOX_RINT intr: 0x%llx\n", reg_val);
-		cn93_handle_pf_mbox_intr(oct);
-		goto irq_handled;
-	}
-
-	/* Check for OEI INTR */
-	reg_val = octep_read_csr64(oct, CN93_SDP_EPF_OEI_RINT);
-	if (reg_val) {
-		dev_info(&pdev->dev,
-			 "Received OEI_EINT intr: 0x%llx\n", reg_val);
-		octep_write_csr64(oct, CN93_SDP_EPF_OEI_RINT, reg_val);
-		queue_work(octep_wq, &oct->ctrl_mbox_task);
+	/* Check for MBOX INTR and OEI INTR */
+	if (octep_poll_non_ioq_interrupts_cn93_pf(oct))
 		goto irq_handled;
-	}

 	/* Check for DMA INTR */
 	reg_val = octep_read_csr64(oct, CN93_SDP_EPF_DMA_RINT);
@@ -712,6 +729,7 @@ void octep_device_setup_cn93_pf(struct octep_device *oct)
 	oct->hw_ops.enable_interrupts = octep_enable_interrupts_cn93_pf;
 	oct->hw_ops.disable_interrupts = octep_disable_interrupts_cn93_pf;
+	oct->hw_ops.poll_non_ioq_interrupts = octep_poll_non_ioq_interrupts_cn93_pf;

 	oct->hw_ops.update_iq_read_idx = octep_update_iq_read_index_cn93_pf;
diff --git a/drivers/net/ethernet/marvell/octeon_ep/octep_config.h b/drivers/net/ethernet/marvell/octeon_ep/octep_config.h
index f208f3f9a447..df7cd39d9fce 100644
--- a/drivers/net/ethernet/marvell/octeon_ep/octep_config.h
+++ b/drivers/net/ethernet/marvell/octeon_ep/octep_config.h
@@ -200,5 +200,11 @@ struct octep_config {

 	/* ctrl mbox config */
 	struct octep_ctrl_mbox_config ctrl_mbox_cfg;
+
+	/* Configured maximum heartbeat miss count */
+	u32 max_hb_miss_cnt;
+
+	/* Configured firmware heartbeat interval in secs */
+	u32 hb_interval;
 };
 #endif /* _OCTEP_CONFIG_H_ */
diff --git a/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_mbox.c b/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_mbox.c
index 39322e4dd100..035ead7935c7 100644
--- a/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_mbox.c
+++
b/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_mbox.c
@@ -24,41 +24,49 @@
 /* Time in msecs to wait for message response */
 #define OCTEP_CTRL_MBOX_MSG_WAIT_MS	10

-#define OCTEP_CTRL_MBOX_INFO_MAGIC_NUM_OFFSET(m)	(m)
-#define OCTEP_CTRL_MBOX_INFO_BARMEM_SZ_OFFSET(m)	((m) + 8)
-#define OCTEP_CTRL_MBOX_INFO_HOST_STATUS_OFFSET(m)	((m) + 24)
-#define OCTEP_CTRL_MBOX_INFO_FW_STATUS_OFFSET(m)	((m) + 144)
-
-#define OCTEP_CTRL_MBOX_H2FQ_INFO_OFFSET(m)	((m) + OCTEP_CTRL_MBOX_INFO_SZ)
-#define OCTEP_CTRL_MBOX_H2FQ_PROD_OFFSET(m)	(OCTEP_CTRL_MBOX_H2FQ_INFO_OFFSET(m))
-#define OCTEP_CTRL_MBOX_H2FQ_CONS_OFFSET(m)	((OCTEP_CTRL_MBOX_H2FQ_INFO_OFFSET(m)) + 4)
-#define OCTEP_CTRL_MBOX_H2FQ_ELEM_SZ_OFFSET(m)	((OCTEP_CTRL_MBOX_H2FQ_INFO_OFFSET(m)) + 8)
-#define OCTEP_CTRL_MBOX_H2FQ_ELEM_CNT_OFFSET(m)	((OCTEP_CTRL_MBOX_H2FQ_INFO_OFFSET(m)) + 12)
-
-#define OCTEP_CTRL_MBOX_F2HQ_INFO_OFFSET(m)	((m) + \
-						 OCTEP_CTRL_MBOX_INFO_SZ + \
-						 OCTEP_CTRL_MBOX_H2FQ_INFO_SZ)
-#define OCTEP_CTRL_MBOX_F2HQ_PROD_OFFSET(m)	(OCTEP_CTRL_MBOX_F2HQ_INFO_OFFSET(m))
-#define OCTEP_CTRL_MBOX_F2HQ_CONS_OFFSET(m)	((OCTEP_CTRL_MBOX_F2HQ_INFO_OFFSET(m)) + 4)
-#define OCTEP_CTRL_MBOX_F2HQ_ELEM_SZ_OFFSET(m)	((OCTEP_CTRL_MBOX_F2HQ_INFO_OFFSET(m)) + 8)
-#define OCTEP_CTRL_MBOX_F2HQ_ELEM_CNT_OFFSET(m)	((OCTEP_CTRL_MBOX_F2HQ_INFO_OFFSET(m)) + 12)
-
-#define OCTEP_CTRL_MBOX_Q_OFFSET(m, i)	((m) + \
-					 (sizeof(struct octep_ctrl_mbox_msg) * (i)))
-
-static u32 octep_ctrl_mbox_circq_inc(u32 index, u32 mask)
+/* Size of mbox info in bytes */
+#define OCTEP_CTRL_MBOX_INFO_SZ		256
+/* Size of mbox host to fw queue info in bytes */
+#define OCTEP_CTRL_MBOX_H2FQ_INFO_SZ	16
+/* Size of mbox fw to host queue info in bytes */
+#define OCTEP_CTRL_MBOX_F2HQ_INFO_SZ	16
+
+#define OCTEP_CTRL_MBOX_TOTAL_INFO_SZ	(OCTEP_CTRL_MBOX_INFO_SZ + \
+					 OCTEP_CTRL_MBOX_H2FQ_INFO_SZ + \
+					 OCTEP_CTRL_MBOX_F2HQ_INFO_SZ)
+
+#define OCTEP_CTRL_MBOX_INFO_MAGIC_NUM(m)	(m)
+#define OCTEP_CTRL_MBOX_INFO_BARMEM_SZ(m)	((m) + 8)
+#define
OCTEP_CTRL_MBOX_INFO_HOST_STATUS(m)	((m) + 24)
+#define OCTEP_CTRL_MBOX_INFO_FW_STATUS(m)	((m) + 144)
+
+#define OCTEP_CTRL_MBOX_H2FQ_INFO(m)	((m) + OCTEP_CTRL_MBOX_INFO_SZ)
+#define OCTEP_CTRL_MBOX_H2FQ_PROD(m)	(OCTEP_CTRL_MBOX_H2FQ_INFO(m))
+#define OCTEP_CTRL_MBOX_H2FQ_CONS(m)	((OCTEP_CTRL_MBOX_H2FQ_INFO(m)) + 4)
+#define OCTEP_CTRL_MBOX_H2FQ_SZ(m)	((OCTEP_CTRL_MBOX_H2FQ_INFO(m)) + 8)
+
+#define OCTEP_CTRL_MBOX_F2HQ_INFO(m)	((m) + \
+					 OCTEP_CTRL_MBOX_INFO_SZ + \
+					 OCTEP_CTRL_MBOX_H2FQ_INFO_SZ)
+#define OCTEP_CTRL_MBOX_F2HQ_PROD(m)	(OCTEP_CTRL_MBOX_F2HQ_INFO(m))
+#define OCTEP_CTRL_MBOX_F2HQ_CONS(m)	((OCTEP_CTRL_MBOX_F2HQ_INFO(m)) + 4)
+#define OCTEP_CTRL_MBOX_F2HQ_SZ(m)	((OCTEP_CTRL_MBOX_F2HQ_INFO(m)) + 8)
+
+static const u32 mbox_hdr_sz = sizeof(union octep_ctrl_mbox_msg_hdr);
+
+static u32 octep_ctrl_mbox_circq_inc(u32 index, u32 inc, u32 sz)
 {
-	return (index + 1) & mask;
+	return (index + inc) % sz;
 }

-static u32 octep_ctrl_mbox_circq_space(u32 pi, u32 ci, u32 mask)
+static u32 octep_ctrl_mbox_circq_space(u32 pi, u32 ci, u32 sz)
 {
-	return mask - ((pi - ci) & mask);
+	return sz - (abs(pi - ci) % sz);
 }

-static u32 octep_ctrl_mbox_circq_depth(u32 pi, u32 ci, u32 mask)
+static u32 octep_ctrl_mbox_circq_depth(u32 pi, u32 ci, u32 sz)
 {
-	return ((pi - ci) & mask);
+	return (abs(pi - ci) % sz);
 }

 int octep_ctrl_mbox_init(struct octep_ctrl_mbox *mbox)
@@ -73,153 +81,170 @@ int octep_ctrl_mbox_init(struct octep_ctrl_mbox *mbox)
 		return -EINVAL;
 	}

-	magic_num = readq(OCTEP_CTRL_MBOX_INFO_MAGIC_NUM_OFFSET(mbox->barmem));
+	magic_num = readq(OCTEP_CTRL_MBOX_INFO_MAGIC_NUM(mbox->barmem));
 	if (magic_num != OCTEP_CTRL_MBOX_MAGIC_NUMBER) {
 		pr_info("octep_ctrl_mbox : Invalid magic number %llx\n", magic_num);
 		return -EINVAL;
 	}

-	status = readq(OCTEP_CTRL_MBOX_INFO_FW_STATUS_OFFSET(mbox->barmem));
+	status = readq(OCTEP_CTRL_MBOX_INFO_FW_STATUS(mbox->barmem));
 	if (status != OCTEP_CTRL_MBOX_STATUS_READY) {
 		pr_info("octep_ctrl_mbox : Firmware is not ready.\n");
 		return -EINVAL;
 	}

-	mbox->barmem_sz = readl(OCTEP_CTRL_MBOX_INFO_BARMEM_SZ_OFFSET(mbox->barmem));
+	mbox->barmem_sz = readl(OCTEP_CTRL_MBOX_INFO_BARMEM_SZ(mbox->barmem));

-	writeq(OCTEP_CTRL_MBOX_STATUS_INIT, OCTEP_CTRL_MBOX_INFO_HOST_STATUS_OFFSET(mbox->barmem));
+	writeq(OCTEP_CTRL_MBOX_STATUS_INIT,
+	       OCTEP_CTRL_MBOX_INFO_HOST_STATUS(mbox->barmem));

-	mbox->h2fq.elem_cnt = readl(OCTEP_CTRL_MBOX_H2FQ_ELEM_CNT_OFFSET(mbox->barmem));
-	mbox->h2fq.elem_sz = readl(OCTEP_CTRL_MBOX_H2FQ_ELEM_SZ_OFFSET(mbox->barmem));
-	mbox->h2fq.mask = (mbox->h2fq.elem_cnt - 1);
-	mutex_init(&mbox->h2fq_lock);
+	mbox->h2fq.sz = readl(OCTEP_CTRL_MBOX_H2FQ_SZ(mbox->barmem));
+	mbox->h2fq.hw_prod = OCTEP_CTRL_MBOX_H2FQ_PROD(mbox->barmem);
+	mbox->h2fq.hw_cons = OCTEP_CTRL_MBOX_H2FQ_CONS(mbox->barmem);
+	mbox->h2fq.hw_q = mbox->barmem + OCTEP_CTRL_MBOX_TOTAL_INFO_SZ;

-	mbox->f2hq.elem_cnt = readl(OCTEP_CTRL_MBOX_F2HQ_ELEM_CNT_OFFSET(mbox->barmem));
-	mbox->f2hq.elem_sz = readl(OCTEP_CTRL_MBOX_F2HQ_ELEM_SZ_OFFSET(mbox->barmem));
-	mbox->f2hq.mask = (mbox->f2hq.elem_cnt - 1);
-	mutex_init(&mbox->f2hq_lock);
-
-	mbox->h2fq.hw_prod = OCTEP_CTRL_MBOX_H2FQ_PROD_OFFSET(mbox->barmem);
-	mbox->h2fq.hw_cons = OCTEP_CTRL_MBOX_H2FQ_CONS_OFFSET(mbox->barmem);
-	mbox->h2fq.hw_q = mbox->barmem +
-			  OCTEP_CTRL_MBOX_INFO_SZ +
-			  OCTEP_CTRL_MBOX_H2FQ_INFO_SZ +
-			  OCTEP_CTRL_MBOX_F2HQ_INFO_SZ;
-
-	mbox->f2hq.hw_prod = OCTEP_CTRL_MBOX_F2HQ_PROD_OFFSET(mbox->barmem);
-	mbox->f2hq.hw_cons = OCTEP_CTRL_MBOX_F2HQ_CONS_OFFSET(mbox->barmem);
-	mbox->f2hq.hw_q = mbox->h2fq.hw_q +
-			  ((mbox->h2fq.elem_sz + sizeof(union octep_ctrl_mbox_msg_hdr)) *
-			   mbox->h2fq.elem_cnt);
+	mbox->f2hq.sz = readl(OCTEP_CTRL_MBOX_F2HQ_SZ(mbox->barmem));
+	mbox->f2hq.hw_prod = OCTEP_CTRL_MBOX_F2HQ_PROD(mbox->barmem);
+	mbox->f2hq.hw_cons = OCTEP_CTRL_MBOX_F2HQ_CONS(mbox->barmem);
+	mbox->f2hq.hw_q = mbox->barmem +
+			  OCTEP_CTRL_MBOX_TOTAL_INFO_SZ +
+			  mbox->h2fq.sz;

 	/* ensure ready state is seen after everything is initialized */
 	wmb();
-
	writeq(OCTEP_CTRL_MBOX_STATUS_READY, OCTEP_CTRL_MBOX_INFO_HOST_STATUS_OFFSET(mbox->barmem));
+	writeq(OCTEP_CTRL_MBOX_STATUS_READY,
+	       OCTEP_CTRL_MBOX_INFO_HOST_STATUS(mbox->barmem));

 	pr_info("Octep ctrl mbox : Init successful.\n");

 	return 0;
 }

+static void
+octep_write_mbox_data(struct octep_ctrl_mbox_q *q, u32 *pi, u32 ci, void *buf, u32 w_sz)
+{
+	u8 __iomem *qbuf;
+	u32 cp_sz;
+
+	/* Assumption: Caller has ensured enough write space */
+	qbuf = (q->hw_q + *pi);
+	if (*pi < ci) {
+		/* copy entire w_sz */
+		memcpy_toio(qbuf, buf, w_sz);
+		*pi = octep_ctrl_mbox_circq_inc(*pi, w_sz, q->sz);
+	} else {
+		/* copy up to end of queue */
+		cp_sz = min((q->sz - *pi), w_sz);
+		memcpy_toio(qbuf, buf, cp_sz);
+		w_sz -= cp_sz;
+		*pi = octep_ctrl_mbox_circq_inc(*pi, cp_sz, q->sz);
+		if (w_sz) {
+			/* roll over and copy remaining w_sz */
+			buf += cp_sz;
+			qbuf = (q->hw_q + *pi);
+			memcpy_toio(qbuf, buf, w_sz);
+			*pi = octep_ctrl_mbox_circq_inc(*pi, w_sz, q->sz);
+		}
+	}
+}
+
 int octep_ctrl_mbox_send(struct octep_ctrl_mbox *mbox, struct octep_ctrl_mbox_msg *msg)
 {
-	unsigned long timeout = msecs_to_jiffies(OCTEP_CTRL_MBOX_MSG_TIMEOUT_MS);
-	unsigned long period = msecs_to_jiffies(OCTEP_CTRL_MBOX_MSG_WAIT_MS);
+	struct octep_ctrl_mbox_msg_buf *sg;
 	struct octep_ctrl_mbox_q *q;
-	unsigned long expire;
-	u64 *mbuf, *word0;
-	u8 __iomem *qidx;
-	u16 pi, ci;
-	int i;
+	u32 pi, ci, buf_sz, w_sz;
+	int s;

 	if (!mbox || !msg)
 		return -EINVAL;

+	if (readq(OCTEP_CTRL_MBOX_INFO_FW_STATUS(mbox->barmem)) != OCTEP_CTRL_MBOX_STATUS_READY)
+		return -EIO;
+
+	mutex_lock(&mbox->h2fq_lock);
 	q = &mbox->h2fq;
 	pi = readl(q->hw_prod);
 	ci = readl(q->hw_cons);

-	if (!octep_ctrl_mbox_circq_space(pi, ci, q->mask))
-		return -ENOMEM;
-
-	qidx = OCTEP_CTRL_MBOX_Q_OFFSET(q->hw_q, pi);
-	mbuf = (u64 *)msg->msg;
-	word0 = &msg->hdr.word0;
-
-	mutex_lock(&mbox->h2fq_lock);
-	for (i = 1; i <= msg->hdr.sizew; i++)
-		writeq(*mbuf++, (qidx + (i * 8)));
-
-	writeq(*word0, qidx);
+	if (octep_ctrl_mbox_circq_space(pi, ci,
q->sz) < (msg->hdr.s.sz + mbox_hdr_sz)) {
+		mutex_unlock(&mbox->h2fq_lock);
+		return -EAGAIN;
+	}

-	pi = octep_ctrl_mbox_circq_inc(pi, q->mask);
+	octep_write_mbox_data(q, &pi, ci, (void *)&msg->hdr, mbox_hdr_sz);
+	buf_sz = msg->hdr.s.sz;
+	for (s = 0; ((s < msg->sg_num) && (buf_sz > 0)); s++) {
+		sg = &msg->sg_list[s];
+		w_sz = (sg->sz <= buf_sz) ? sg->sz : buf_sz;
+		octep_write_mbox_data(q, &pi, ci, sg->msg, w_sz);
+		buf_sz -= w_sz;
+	}
 	writel(pi, q->hw_prod);
 	mutex_unlock(&mbox->h2fq_lock);

-	/* don't check for notification response */
-	if (msg->hdr.flags & OCTEP_CTRL_MBOX_MSG_HDR_FLAG_NOTIFY)
-		return 0;
-
-	expire = jiffies + timeout;
-	while (true) {
-		*word0 = readq(qidx);
-		if (msg->hdr.flags == OCTEP_CTRL_MBOX_MSG_HDR_FLAG_RESP)
-			break;
-		schedule_timeout_interruptible(period);
-		if (signal_pending(current) || time_after(jiffies, expire)) {
-			pr_info("octep_ctrl_mbox: Timed out\n");
-			return -EBUSY;
+	return 0;
+}
+
+static void
+octep_read_mbox_data(struct octep_ctrl_mbox_q *q, u32 pi, u32 *ci, void *buf, u32 r_sz)
+{
+	u8 __iomem *qbuf;
+	u32 cp_sz;
+
+	/* Assumption: Caller has ensured enough read space */
+	qbuf = (q->hw_q + *ci);
+	if (*ci < pi) {
+		/* copy entire r_sz */
+		memcpy_fromio(buf, qbuf, r_sz);
+		*ci = octep_ctrl_mbox_circq_inc(*ci, r_sz, q->sz);
+	} else {
+		/* copy up to end of queue */
+		cp_sz = min((q->sz - *ci), r_sz);
+		memcpy_fromio(buf, qbuf, cp_sz);
+		r_sz -= cp_sz;
+		*ci = octep_ctrl_mbox_circq_inc(*ci, cp_sz, q->sz);
+		if (r_sz) {
+			/* roll over and copy remaining r_sz */
+			buf += cp_sz;
+			qbuf = (q->hw_q + *ci);
+			memcpy_fromio(buf, qbuf, r_sz);
+			*ci = octep_ctrl_mbox_circq_inc(*ci, r_sz, q->sz);
 		}
 	}
-	mbuf = (u64 *)msg->msg;
-	for (i = 1; i <= msg->hdr.sizew; i++)
-		*mbuf++ = readq(qidx + (i * 8));
-
-	return 0;
 }

 int octep_ctrl_mbox_recv(struct octep_ctrl_mbox *mbox, struct octep_ctrl_mbox_msg *msg)
 {
+	struct octep_ctrl_mbox_msg_buf *sg;
+	u32 pi, ci, r_sz, buf_sz, q_depth;
 	struct octep_ctrl_mbox_q *q;
-	u32 count, pi,
ci;
-	u8 __iomem *qidx;
-	u64 *mbuf;
-	int i;
+	int s;

-	if (!mbox || !msg)
-		return -EINVAL;
+	if (readq(OCTEP_CTRL_MBOX_INFO_FW_STATUS(mbox->barmem)) != OCTEP_CTRL_MBOX_STATUS_READY)
+		return -EIO;

+	mutex_lock(&mbox->f2hq_lock);
 	q = &mbox->f2hq;
 	pi = readl(q->hw_prod);
 	ci = readl(q->hw_cons);
-	count = octep_ctrl_mbox_circq_depth(pi, ci, q->mask);
-	if (!count)
-		return -EAGAIN;
-
-	qidx = OCTEP_CTRL_MBOX_Q_OFFSET(q->hw_q, ci);
-	mbuf = (u64 *)msg->msg;

-	mutex_lock(&mbox->f2hq_lock);
-
-	msg->hdr.word0 = readq(qidx);
-	for (i = 1; i <= msg->hdr.sizew; i++)
-		*mbuf++ = readq(qidx + (i * 8));
+	q_depth = octep_ctrl_mbox_circq_depth(pi, ci, q->sz);
+	if (q_depth < mbox_hdr_sz) {
+		mutex_unlock(&mbox->f2hq_lock);
+		return -EAGAIN;
+	}

-	ci = octep_ctrl_mbox_circq_inc(ci, q->mask);
+	octep_read_mbox_data(q, pi, &ci, (void *)&msg->hdr, mbox_hdr_sz);
+	buf_sz = msg->hdr.s.sz;
+	for (s = 0; ((s < msg->sg_num) && (buf_sz > 0)); s++) {
+		sg = &msg->sg_list[s];
+		r_sz = (sg->sz <= buf_sz) ? sg->sz : buf_sz;
+		octep_read_mbox_data(q, pi, &ci, sg->msg, r_sz);
+		buf_sz -= r_sz;
+	}
 	writel(ci, q->hw_cons);
-
 	mutex_unlock(&mbox->f2hq_lock);

-	if (msg->hdr.flags != OCTEP_CTRL_MBOX_MSG_HDR_FLAG_REQ || !mbox->process_req)
-		return 0;
-
-	mbox->process_req(mbox->user_ctx, msg);
-	mbuf = (u64 *)msg->msg;
-	for (i = 1; i <= msg->hdr.sizew; i++)
-		writeq(*mbuf++, (qidx + (i * 8)));
-
-	writeq(msg->hdr.word0, qidx);
-
 	return 0;
 }

@@ -227,18 +252,17 @@ int octep_ctrl_mbox_uninit(struct octep_ctrl_mbox *mbox)
 {
 	if (!mbox)
 		return -EINVAL;
+	if (!mbox->barmem)
+		return -EINVAL;

-	writeq(OCTEP_CTRL_MBOX_STATUS_UNINIT,
-	       OCTEP_CTRL_MBOX_INFO_HOST_STATUS_OFFSET(mbox->barmem));
+	writeq(OCTEP_CTRL_MBOX_STATUS_INVALID,
+	       OCTEP_CTRL_MBOX_INFO_HOST_STATUS(mbox->barmem));
 	/* ensure uninit state is written before uninitialization */
 	wmb();

 	mutex_destroy(&mbox->h2fq_lock);
 	mutex_destroy(&mbox->f2hq_lock);

-	writeq(OCTEP_CTRL_MBOX_STATUS_INVALID,
-
	       OCTEP_CTRL_MBOX_INFO_HOST_STATUS_OFFSET(mbox->barmem));
-
 	pr_info("Octep ctrl mbox : Uninit successful.\n");

 	return 0;
diff --git a/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_mbox.h b/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_mbox.h
index 2dc5753cfec6..9c4ff0fba6a0 100644
--- a/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_mbox.h
+++ b/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_mbox.h
@@ -27,50 +27,39 @@
  * |-------------------------------------------|
  * |producer index (4 bytes)                   |
  * |consumer index (4 bytes)                   |
- * |element size (4 bytes)                     |
- * |element count (4 bytes)                    |
+ * |max element size (4 bytes)                 |
+ * |reserved (4 bytes)                         |
  * |===========================================|
  * |Fw to Host Queue info (16 bytes)           |
  * |-------------------------------------------|
  * |producer index (4 bytes)                   |
  * |consumer index (4 bytes)                   |
- * |element size (4 bytes)                     |
- * |element count (4 bytes)                    |
+ * |max element size (4 bytes)                 |
+ * |reserved (4 bytes)                         |
  * |===========================================|
- * |Host to Fw Queue                           |
+ * |Host to Fw Queue ((total size-288/2) bytes)|
  * |-------------------------------------------|
- * |((elem_sz + hdr(8 bytes)) * elem_cnt) bytes|
+ * |                                           |
  * |===========================================|
  * |===========================================|
- * |Fw to Host Queue                           |
+ * |Fw to Host Queue ((total size-288/2) bytes)|
  * |-------------------------------------------|
- * |((elem_sz + hdr(8 bytes)) * elem_cnt) bytes|
+ * |                                           |
  * |===========================================|
  */

 #define OCTEP_CTRL_MBOX_MAGIC_NUMBER			0xdeaddeadbeefbeefull

-/* Size of mbox info in bytes */
-#define OCTEP_CTRL_MBOX_INFO_SZ				256
-/* Size of mbox host to target queue info in bytes */
-#define OCTEP_CTRL_MBOX_H2FQ_INFO_SZ			16
-/* Size of mbox target to host queue info in bytes */
-#define OCTEP_CTRL_MBOX_F2HQ_INFO_SZ			16
-/* Size of mbox queue in bytes */
-#define OCTEP_CTRL_MBOX_Q_SZ(sz, cnt)			(((sz) + 8) * (cnt))
-/* Size of mbox in bytes */
-#define
OCTEP_CTRL_MBOX_SZ(hsz, hcnt, fsz, fcnt)	(OCTEP_CTRL_MBOX_INFO_SZ + \
-							 OCTEP_CTRL_MBOX_H2FQ_INFO_SZ + \
-							 OCTEP_CTRL_MBOX_F2HQ_INFO_SZ + \
-							 OCTEP_CTRL_MBOX_Q_SZ(hsz, hcnt) + \
-							 OCTEP_CTRL_MBOX_Q_SZ(fsz, fcnt))
-
 /* Valid request message */
 #define OCTEP_CTRL_MBOX_MSG_HDR_FLAG_REQ		BIT(0)
 /* Valid response message */
 #define OCTEP_CTRL_MBOX_MSG_HDR_FLAG_RESP		BIT(1)
 /* Valid notification, no response required */
 #define OCTEP_CTRL_MBOX_MSG_HDR_FLAG_NOTIFY		BIT(2)
+/* Valid custom message */
+#define OCTEP_CTRL_MBOX_MSG_HDR_FLAG_CUSTOM		BIT(3)
+
+#define OCTEP_CTRL_MBOX_MSG_DESC_MAX			4

 enum octep_ctrl_mbox_status {
 	OCTEP_CTRL_MBOX_STATUS_INVALID = 0,
@@ -81,31 +70,48 @@ enum octep_ctrl_mbox_status {

 /* mbox message */
 union octep_ctrl_mbox_msg_hdr {
-	u64 word0;
+	u64 words[2];
 	struct {
+		/* must be 0 */
+		u16 reserved1:15;
+		/* vf_idx is valid if 1 */
+		u16 is_vf:1;
+		/* sender vf index 0-(n-1), 0 if (is_vf==0) */
+		u16 vf_idx;
+		/* total size of message excluding header */
+		u32 sz;
 		/* OCTEP_CTRL_MBOX_MSG_HDR_FLAG_* */
 		u32 flags;
-		/* size of message in words excluding header */
-		u32 sizew;
-	};
+		/* identifier to match responses */
+		u16 msg_id;
+		u16 reserved2;
+	} s;
+};
+
+/* mbox message buffer */
+struct octep_ctrl_mbox_msg_buf {
+	u32 reserved1;
+	u16 reserved2;
+	/* size of buffer */
+	u16 sz;
+	/* pointer to message buffer */
+	void *msg;
 };

 /* mbox message */
 struct octep_ctrl_mbox_msg {
 	/* mbox transaction header */
 	union octep_ctrl_mbox_msg_hdr hdr;
-	/* pointer to message buffer */
-	void *msg;
+	/* number of sg buffer's */
+	int sg_num;
+	/* message buffer's */
+	struct octep_ctrl_mbox_msg_buf sg_list[OCTEP_CTRL_MBOX_MSG_DESC_MAX];
 };

 /* Mbox queue */
 struct octep_ctrl_mbox_q {
-	/* q element size, should be aligned to unsigned long */
-	u16 elem_sz;
-	/* q element count, should be power of 2 */
-	u16 elem_cnt;
-	/* q mask */
-	u16 mask;
+	/* size of queue buffer */
+	u32 sz;
 	/* producer address in bar mem */
 	u8 __iomem *hw_prod;
 	/* consumer address in
bar mem */
@@ -115,16 +121,10 @@ struct octep_ctrl_mbox_q {
 };

 struct octep_ctrl_mbox {
-	/* host driver version */
-	u64 version;
 	/* size of bar memory */
 	u32 barmem_sz;
 	/* pointer to BAR memory */
 	u8 __iomem *barmem;
-	/* user context for callback, can be null */
-	void *user_ctx;
-	/* callback handler for processing request, called from octep_ctrl_mbox_recv */
-	int (*process_req)(void *user_ctx, struct octep_ctrl_mbox_msg *msg);
 	/* host-to-fw queue */
 	struct octep_ctrl_mbox_q h2fq;
 	/* fw-to-host queue */
@@ -146,6 +146,8 @@ int octep_ctrl_mbox_init(struct octep_ctrl_mbox *mbox);
 /* Send mbox message.
  *
  * @param mbox: non-null pointer to struct octep_ctrl_mbox.
+ * @param msg: non-null pointer to struct octep_ctrl_mbox_msg.
+ *             Caller should fill msg.sz and msg.desc.sz for each message.
  *
  * return value: 0 on success, -errno on failure.
  */
@@ -154,6 +156,8 @@ int octep_ctrl_mbox_send(struct octep_ctrl_mbox *mbox, struct octep_ctrl_mbox_ms
 /* Retrieve mbox message.
  *
  * @param mbox: non-null pointer to struct octep_ctrl_mbox.
+ * @param msg: non-null pointer to struct octep_ctrl_mbox_msg.
+ *             Caller should fill msg.sz and msg.desc.sz for each message.
  *
  * return value: 0 on success, -errno on failure.
  */
@@ -161,7 +165,7 @@ int octep_ctrl_mbox_recv(struct octep_ctrl_mbox *mbox, struct octep_ctrl_mbox_ms

 /* Uninitialize control mbox.
  *
- * @param ep: non-null pointer to struct octep_ctrl_mbox.
+ * @param mbox: non-null pointer to struct octep_ctrl_mbox.
  *
  * return value: 0 on success, -errno on failure.
  */
diff --git a/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_net.c b/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_net.c
index 7c00c896ab98..1cc6af2feb38 100644
--- a/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_net.c
+++ b/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_net.c
@@ -8,187 +8,328 @@
 #include <linux/types.h>
 #include <linux/etherdevice.h>
 #include <linux/pci.h>
+#include <linux/wait.h>

 #include "octep_config.h"
 #include "octep_main.h"
 #include "octep_ctrl_net.h"

-int octep_get_link_status(struct octep_device *oct)
+static const u32 req_hdr_sz = sizeof(union octep_ctrl_net_req_hdr);
+static const u32 mtu_sz = sizeof(struct octep_ctrl_net_h2f_req_cmd_mtu);
+static const u32 mac_sz = sizeof(struct octep_ctrl_net_h2f_req_cmd_mac);
+static const u32 state_sz = sizeof(struct octep_ctrl_net_h2f_req_cmd_state);
+static const u32 link_info_sz = sizeof(struct octep_ctrl_net_link_info);
+static atomic_t ctrl_net_msg_id;
+
+static void init_send_req(struct octep_ctrl_mbox_msg *msg, void *buf,
+			  u16 sz, int vfid)
 {
-	struct octep_ctrl_net_h2f_req req = {};
-	struct octep_ctrl_net_h2f_resp *resp;
-	struct octep_ctrl_mbox_msg msg = {};
-	int err;
+	msg->hdr.s.flags = OCTEP_CTRL_MBOX_MSG_HDR_FLAG_REQ;
+	msg->hdr.s.msg_id = atomic_inc_return(&ctrl_net_msg_id) &
+			    GENMASK(sizeof(msg->hdr.s.msg_id) * BITS_PER_BYTE, 0);
+	msg->hdr.s.sz = req_hdr_sz + sz;
+	msg->sg_num = 1;
+	msg->sg_list[0].msg = buf;
+	msg->sg_list[0].sz = msg->hdr.s.sz;
+	if (vfid != OCTEP_CTRL_NET_INVALID_VFID) {
+		msg->hdr.s.is_vf = 1;
+		msg->hdr.s.vf_idx = vfid;
+	}
+}
+
+static int octep_send_mbox_req(struct octep_device *oct,
+			       struct octep_ctrl_net_wait_data *d,
+			       bool wait_for_response)
+{
+	int err, ret;
+
+	err = octep_ctrl_mbox_send(&oct->ctrl_mbox, &d->msg);
+	if (err < 0)
+		return err;
+
+	if (!wait_for_response)
+		return 0;
+
+	d->done = 0;
+	INIT_LIST_HEAD(&d->list);
+	list_add_tail(&d->list, &oct->ctrl_req_wait_list);
+	ret =
wait_event_interruptible_timeout(oct->ctrl_req_wait_q,
+					       (d->done != 0),
+					       jiffies + msecs_to_jiffies(500));
+	list_del(&d->list);
+	if (ret == 0 || ret == 1)
+		return -EAGAIN;
+
+	/**
+	 * (ret == 0)  cond = false && timeout, return 0
+	 * (ret < 0) interrupted by signal, return 0
+	 * (ret == 1) cond = true && timeout, return 1
+	 * (ret >= 1) cond = true && !timeout, return 1
+	 */
+
+	if (d->data.resp.hdr.s.reply != OCTEP_CTRL_NET_REPLY_OK)
+		return -EAGAIN;
+
+	return 0;
+}

-	req.hdr.cmd = OCTEP_CTRL_NET_H2F_CMD_LINK_STATUS;
-	req.link.cmd = OCTEP_CTRL_NET_CMD_GET;
+int octep_ctrl_net_init(struct octep_device *oct)
+{
+	struct octep_ctrl_mbox *ctrl_mbox;
+	struct pci_dev *pdev = oct->pdev;
+	int ret;
+
+	init_waitqueue_head(&oct->ctrl_req_wait_q);
+	INIT_LIST_HEAD(&oct->ctrl_req_wait_list);
+
+	/* Initialize control mbox */
+	ctrl_mbox = &oct->ctrl_mbox;
+	ctrl_mbox->barmem = CFG_GET_CTRL_MBOX_MEM_ADDR(oct->conf);
+	ret = octep_ctrl_mbox_init(ctrl_mbox);
+	if (ret) {
+		dev_err(&pdev->dev, "Failed to initialize control mbox\n");
+		return ret;
+	}
+	oct->ctrl_mbox_ifstats_offset = ctrl_mbox->barmem_sz;
+
+	return 0;
+}

-	msg.hdr.flags = OCTEP_CTRL_MBOX_MSG_HDR_FLAG_REQ;
-	msg.hdr.sizew = OCTEP_CTRL_NET_H2F_STATE_REQ_SZW;
-	msg.msg = &req;
-	err = octep_ctrl_mbox_send(&oct->ctrl_mbox, &msg);
-	if (err)
+int octep_ctrl_net_get_link_status(struct octep_device *oct, int vfid)
+{
+	struct octep_ctrl_net_wait_data d = {0};
+	struct octep_ctrl_net_h2f_req *req = &d.data.req;
+	int err;
+
+	init_send_req(&d.msg, (void *)req, state_sz, vfid);
+	req->hdr.s.cmd = OCTEP_CTRL_NET_H2F_CMD_LINK_STATUS;
+	req->link.cmd = OCTEP_CTRL_NET_CMD_GET;
+	err = octep_send_mbox_req(oct, &d, true);
+	if (err < 0)
 		return err;

-	resp = (struct octep_ctrl_net_h2f_resp *)&req;
-	return resp->link.state;
+	return d.data.resp.link.state;
 }

-void octep_set_link_status(struct octep_device *oct, bool up)
+int octep_ctrl_net_set_link_status(struct octep_device *oct, int vfid, bool up,
+				   bool
wait_for_response)
 {
-	struct octep_ctrl_net_h2f_req req = {};
-	struct octep_ctrl_mbox_msg msg = {};
+	struct octep_ctrl_net_wait_data d = {0};
+	struct octep_ctrl_net_h2f_req *req = &d.data.req;

-	req.hdr.cmd = OCTEP_CTRL_NET_H2F_CMD_LINK_STATUS;
-	req.link.cmd = OCTEP_CTRL_NET_CMD_SET;
-	req.link.state = (up) ? OCTEP_CTRL_NET_STATE_UP : OCTEP_CTRL_NET_STATE_DOWN;
+	init_send_req(&d.msg, req, state_sz, vfid);
+	req->hdr.s.cmd = OCTEP_CTRL_NET_H2F_CMD_LINK_STATUS;
+	req->link.cmd = OCTEP_CTRL_NET_CMD_SET;
+	req->link.state = (up) ? OCTEP_CTRL_NET_STATE_UP :
+				 OCTEP_CTRL_NET_STATE_DOWN;

-	msg.hdr.flags = OCTEP_CTRL_MBOX_MSG_HDR_FLAG_REQ;
-	msg.hdr.sizew = OCTEP_CTRL_NET_H2F_STATE_REQ_SZW;
-	msg.msg = &req;
-	octep_ctrl_mbox_send(&oct->ctrl_mbox, &msg);
+	return octep_send_mbox_req(oct, &d, wait_for_response);
 }

-void octep_set_rx_state(struct octep_device *oct, bool up)
+int octep_ctrl_net_set_rx_state(struct octep_device *oct, int vfid, bool up,
+				bool wait_for_response)
 {
-	struct octep_ctrl_net_h2f_req req = {};
-	struct octep_ctrl_mbox_msg msg = {};
+	struct octep_ctrl_net_wait_data d = {0};
+	struct octep_ctrl_net_h2f_req *req = &d.data.req;

-	req.hdr.cmd = OCTEP_CTRL_NET_H2F_CMD_RX_STATE;
-	req.link.cmd = OCTEP_CTRL_NET_CMD_SET;
-	req.link.state = (up) ? OCTEP_CTRL_NET_STATE_UP : OCTEP_CTRL_NET_STATE_DOWN;
+	init_send_req(&d.msg, req, state_sz, vfid);
+	req->hdr.s.cmd = OCTEP_CTRL_NET_H2F_CMD_RX_STATE;
+	req->link.cmd = OCTEP_CTRL_NET_CMD_SET;
+	req->link.state = (up) ?
OCTEP_CTRL_NET_STATE_UP : + OCTEP_CTRL_NET_STATE_DOWN; - msg.hdr.flags = OCTEP_CTRL_MBOX_MSG_HDR_FLAG_REQ; - msg.hdr.sizew = OCTEP_CTRL_NET_H2F_STATE_REQ_SZW; - msg.msg = &req; - octep_ctrl_mbox_send(&oct->ctrl_mbox, &msg); + return octep_send_mbox_req(oct, &d, wait_for_response); } -int octep_get_mac_addr(struct octep_device *oct, u8 *addr) +int octep_ctrl_net_get_mac_addr(struct octep_device *oct, int vfid, u8 *addr) { - struct octep_ctrl_net_h2f_req req = {}; - struct octep_ctrl_net_h2f_resp *resp; - struct octep_ctrl_mbox_msg msg = {}; + struct octep_ctrl_net_wait_data d = {0}; + struct octep_ctrl_net_h2f_req *req = &d.data.req; int err; - req.hdr.cmd = OCTEP_CTRL_NET_H2F_CMD_MAC; - req.link.cmd = OCTEP_CTRL_NET_CMD_GET; - - msg.hdr.flags = OCTEP_CTRL_MBOX_MSG_HDR_FLAG_REQ; - msg.hdr.sizew = OCTEP_CTRL_NET_H2F_MAC_REQ_SZW; - msg.msg = &req; - err = octep_ctrl_mbox_send(&oct->ctrl_mbox, &msg); - if (err) + init_send_req(&d.msg, req, mac_sz, vfid); + req->hdr.s.cmd = OCTEP_CTRL_NET_H2F_CMD_MAC; + req->link.cmd = OCTEP_CTRL_NET_CMD_GET; + err = octep_send_mbox_req(oct, &d, true); + if (err < 0) return err; - resp = (struct octep_ctrl_net_h2f_resp *)&req; - memcpy(addr, resp->mac.addr, ETH_ALEN); + memcpy(addr, d.data.resp.mac.addr, ETH_ALEN); - return err; + return 0; } -int octep_set_mac_addr(struct octep_device *oct, u8 *addr) +int octep_ctrl_net_set_mac_addr(struct octep_device *oct, int vfid, u8 *addr, + bool wait_for_response) { - struct octep_ctrl_net_h2f_req req = {}; - struct octep_ctrl_mbox_msg msg = {}; - - req.hdr.cmd = OCTEP_CTRL_NET_H2F_CMD_MAC; - req.mac.cmd = OCTEP_CTRL_NET_CMD_SET; - memcpy(&req.mac.addr, addr, ETH_ALEN); + struct octep_ctrl_net_wait_data d = {0}; + struct octep_ctrl_net_h2f_req *req = &d.data.req; - msg.hdr.flags = OCTEP_CTRL_MBOX_MSG_HDR_FLAG_REQ; - msg.hdr.sizew = OCTEP_CTRL_NET_H2F_MAC_REQ_SZW; - msg.msg = &req; + init_send_req(&d.msg, req, mac_sz, vfid); + req->hdr.s.cmd = OCTEP_CTRL_NET_H2F_CMD_MAC; + req->mac.cmd = 
OCTEP_CTRL_NET_CMD_SET; + memcpy(&req->mac.addr, addr, ETH_ALEN); - return octep_ctrl_mbox_send(&oct->ctrl_mbox, &msg); + return octep_send_mbox_req(oct, &d, wait_for_response); } -int octep_set_mtu(struct octep_device *oct, int mtu) +int octep_ctrl_net_set_mtu(struct octep_device *oct, int vfid, int mtu, + bool wait_for_response) { - struct octep_ctrl_net_h2f_req req = {}; - struct octep_ctrl_mbox_msg msg = {}; + struct octep_ctrl_net_wait_data d = {0}; + struct octep_ctrl_net_h2f_req *req = &d.data.req; - req.hdr.cmd = OCTEP_CTRL_NET_H2F_CMD_MTU; - req.mtu.cmd = OCTEP_CTRL_NET_CMD_SET; - req.mtu.val = mtu; + init_send_req(&d.msg, req, mtu_sz, vfid); + req->hdr.s.cmd = OCTEP_CTRL_NET_H2F_CMD_MTU; + req->mtu.cmd = OCTEP_CTRL_NET_CMD_SET; + req->mtu.val = mtu; - msg.hdr.flags = OCTEP_CTRL_MBOX_MSG_HDR_FLAG_REQ; - msg.hdr.sizew = OCTEP_CTRL_NET_H2F_MTU_REQ_SZW; - msg.msg = &req; - - return octep_ctrl_mbox_send(&oct->ctrl_mbox, &msg); + return octep_send_mbox_req(oct, &d, wait_for_response); } -int octep_get_if_stats(struct octep_device *oct) +int octep_ctrl_net_get_if_stats(struct octep_device *oct, int vfid, + struct octep_iface_rx_stats *rx_stats, + struct octep_iface_tx_stats *tx_stats) { - void __iomem *iface_rx_stats; - void __iomem *iface_tx_stats; - struct octep_ctrl_net_h2f_req req = {}; - struct octep_ctrl_mbox_msg msg = {}; + struct octep_ctrl_net_wait_data d = {0}; + struct octep_ctrl_net_h2f_req *req = &d.data.req; + struct octep_ctrl_net_h2f_resp *resp; int err; - req.hdr.cmd = OCTEP_CTRL_NET_H2F_CMD_GET_IF_STATS; - req.mac.cmd = OCTEP_CTRL_NET_CMD_GET; - req.get_stats.offset = oct->ctrl_mbox_ifstats_offset; - - msg.hdr.flags = OCTEP_CTRL_MBOX_MSG_HDR_FLAG_REQ; - msg.hdr.sizew = OCTEP_CTRL_NET_H2F_GET_STATS_REQ_SZW; - msg.msg = &req; - err = octep_ctrl_mbox_send(&oct->ctrl_mbox, &msg); - if (err) + init_send_req(&d.msg, req, 0, vfid); + req->hdr.s.cmd = OCTEP_CTRL_NET_H2F_CMD_GET_IF_STATS; + err = octep_send_mbox_req(oct, &d, true); + if (err < 0) return 
err; - iface_rx_stats = oct->ctrl_mbox.barmem + oct->ctrl_mbox_ifstats_offset; - iface_tx_stats = oct->ctrl_mbox.barmem + oct->ctrl_mbox_ifstats_offset + - sizeof(struct octep_iface_rx_stats); - memcpy_fromio(&oct->iface_rx_stats, iface_rx_stats, sizeof(struct octep_iface_rx_stats)); - memcpy_fromio(&oct->iface_tx_stats, iface_tx_stats, sizeof(struct octep_iface_tx_stats)); - - return err; + resp = &d.data.resp; + memcpy(rx_stats, &resp->if_stats.rx_stats, sizeof(struct octep_iface_rx_stats)); + memcpy(tx_stats, &resp->if_stats.tx_stats, sizeof(struct octep_iface_tx_stats)); + return 0; } -int octep_get_link_info(struct octep_device *oct) +int octep_ctrl_net_get_link_info(struct octep_device *oct, int vfid, + struct octep_iface_link_info *link_info) { - struct octep_ctrl_net_h2f_req req = {}; + struct octep_ctrl_net_wait_data d = {0}; + struct octep_ctrl_net_h2f_req *req = &d.data.req; struct octep_ctrl_net_h2f_resp *resp; - struct octep_ctrl_mbox_msg msg = {}; int err; - req.hdr.cmd = OCTEP_CTRL_NET_H2F_CMD_LINK_INFO; - req.mac.cmd = OCTEP_CTRL_NET_CMD_GET; - - msg.hdr.flags = OCTEP_CTRL_MBOX_MSG_HDR_FLAG_REQ; - msg.hdr.sizew = OCTEP_CTRL_NET_H2F_LINK_INFO_REQ_SZW; - msg.msg = &req; - err = octep_ctrl_mbox_send(&oct->ctrl_mbox, &msg); - if (err) + init_send_req(&d.msg, req, link_info_sz, vfid); + req->hdr.s.cmd = OCTEP_CTRL_NET_H2F_CMD_LINK_INFO; + req->link_info.cmd = OCTEP_CTRL_NET_CMD_GET; + err = octep_send_mbox_req(oct, &d, true); + if (err < 0) return err; - resp = (struct octep_ctrl_net_h2f_resp *)&req; - oct->link_info.supported_modes = resp->link_info.supported_modes; - oct->link_info.advertised_modes = resp->link_info.advertised_modes; - oct->link_info.autoneg = resp->link_info.autoneg; - oct->link_info.pause = resp->link_info.pause; - oct->link_info.speed = resp->link_info.speed; + resp = &d.data.resp; + link_info->supported_modes = resp->link_info.supported_modes; + link_info->advertised_modes = resp->link_info.advertised_modes; + link_info->autoneg = 
resp->link_info.autoneg; + link_info->pause = resp->link_info.pause; + link_info->speed = resp->link_info.speed; + + return 0; +} + +int octep_ctrl_net_set_link_info(struct octep_device *oct, int vfid, + struct octep_iface_link_info *link_info, + bool wait_for_response) +{ + struct octep_ctrl_net_wait_data d = {0}; + struct octep_ctrl_net_h2f_req *req = &d.data.req; + + init_send_req(&d.msg, req, link_info_sz, vfid); + req->hdr.s.cmd = OCTEP_CTRL_NET_H2F_CMD_LINK_INFO; + req->link_info.cmd = OCTEP_CTRL_NET_CMD_SET; + req->link_info.info.advertised_modes = link_info->advertised_modes; + req->link_info.info.autoneg = link_info->autoneg; + req->link_info.info.pause = link_info->pause; + req->link_info.info.speed = link_info->speed; + + return octep_send_mbox_req(oct, &d, wait_for_response); +} + +static void process_mbox_resp(struct octep_device *oct, + struct octep_ctrl_mbox_msg *msg) +{ + struct octep_ctrl_net_wait_data *pos, *n; + + list_for_each_entry_safe(pos, n, &oct->ctrl_req_wait_list, list) { + if (pos->msg.hdr.s.msg_id == msg->hdr.s.msg_id) { + memcpy(&pos->data.resp, + msg->sg_list[0].msg, + msg->hdr.s.sz); + pos->done = 1; + wake_up_interruptible_all(&oct->ctrl_req_wait_q); + break; + } + } +} - return err; +static int process_mbox_notify(struct octep_device *oct, + struct octep_ctrl_mbox_msg *msg) +{ + struct net_device *netdev = oct->netdev; + struct octep_ctrl_net_f2h_req *req; + + req = (struct octep_ctrl_net_f2h_req *)msg->sg_list[0].msg; + switch (req->hdr.s.cmd) { + case OCTEP_CTRL_NET_F2H_CMD_LINK_STATUS: + if (netif_running(netdev)) { + if (req->link.state) { + dev_info(&oct->pdev->dev, "netif_carrier_on\n"); + netif_carrier_on(netdev); + } else { + dev_info(&oct->pdev->dev, "netif_carrier_off\n"); + netif_carrier_off(netdev); + } + } + break; + default: + pr_info("Unknown mbox req : %u\n", req->hdr.s.cmd); + break; + } + + return 0; } -int octep_set_link_info(struct octep_device *oct, struct octep_iface_link_info *link_info) +void 
octep_ctrl_net_recv_fw_messages(struct octep_device *oct) { - struct octep_ctrl_net_h2f_req req = {}; - struct octep_ctrl_mbox_msg msg = {}; + static u16 msg_sz = sizeof(union octep_ctrl_net_max_data); + union octep_ctrl_net_max_data data = {0}; + struct octep_ctrl_mbox_msg msg = {0}; + int ret; + + msg.hdr.s.sz = msg_sz; + msg.sg_num = 1; + msg.sg_list[0].sz = msg_sz; + msg.sg_list[0].msg = &data; + while (true) { + /* mbox will overwrite msg.hdr.s.sz so initialize it */ + msg.hdr.s.sz = msg_sz; + ret = octep_ctrl_mbox_recv(&oct->ctrl_mbox, (struct octep_ctrl_mbox_msg *)&msg); + if (ret < 0) + break; + + if (msg.hdr.s.flags & OCTEP_CTRL_MBOX_MSG_HDR_FLAG_RESP) + process_mbox_resp(oct, &msg); + else if (msg.hdr.s.flags & OCTEP_CTRL_MBOX_MSG_HDR_FLAG_NOTIFY) + process_mbox_notify(oct, &msg); + } +} + +int octep_ctrl_net_uninit(struct octep_device *oct) +{ + struct octep_ctrl_net_wait_data *pos, *n; + + list_for_each_entry_safe(pos, n, &oct->ctrl_req_wait_list, list) + pos->done = 1; - req.hdr.cmd = OCTEP_CTRL_NET_H2F_CMD_LINK_INFO; - req.link_info.cmd = OCTEP_CTRL_NET_CMD_SET; - req.link_info.info.advertised_modes = link_info->advertised_modes; - req.link_info.info.autoneg = link_info->autoneg; - req.link_info.info.pause = link_info->pause; - req.link_info.info.speed = link_info->speed; + wake_up_interruptible_all(&oct->ctrl_req_wait_q); - msg.hdr.flags = OCTEP_CTRL_MBOX_MSG_HDR_FLAG_REQ; - msg.hdr.sizew = OCTEP_CTRL_NET_H2F_LINK_INFO_REQ_SZW; - msg.msg = &req; + octep_ctrl_mbox_uninit(&oct->ctrl_mbox); - return octep_ctrl_mbox_send(&oct->ctrl_mbox, &msg); + return 0; } diff --git a/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_net.h b/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_net.h index f23b58381322..37880dd79116 100644 --- a/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_net.h +++ b/drivers/net/ethernet/marvell/octeon_ep/octep_ctrl_net.h @@ -7,6 +7,8 @@ #ifndef __OCTEP_CTRL_NET_H__ #define __OCTEP_CTRL_NET_H__ +#define OCTEP_CTRL_NET_INVALID_VFID 
(-1) + /* Supported commands */ enum octep_ctrl_net_cmd { OCTEP_CTRL_NET_CMD_GET = 0, @@ -45,15 +47,18 @@ enum octep_ctrl_net_f2h_cmd { OCTEP_CTRL_NET_F2H_CMD_LINK_STATUS, }; -struct octep_ctrl_net_req_hdr { - /* sender id */ - u16 sender; - /* receiver id */ - u16 receiver; - /* octep_ctrl_net_h2t_cmd */ - u16 cmd; - /* reserved */ - u16 rsvd0; +union octep_ctrl_net_req_hdr { + u64 words[1]; + struct { + /* sender id */ + u16 sender; + /* receiver id */ + u16 receiver; + /* octep_ctrl_net_h2t_cmd */ + u16 cmd; + /* reserved */ + u16 rsvd0; + } s; }; /* get/set mtu request */ @@ -72,12 +77,6 @@ struct octep_ctrl_net_h2f_req_cmd_mac { u8 addr[ETH_ALEN]; }; -/* get if_stats, xstats, q_stats request */ -struct octep_ctrl_net_h2f_req_cmd_get_stats { - /* offset into barmem where fw should copy over stats */ - u32 offset; -}; - /* get/set link state, rx state */ struct octep_ctrl_net_h2f_req_cmd_state { /* enum octep_ctrl_net_cmd */ @@ -110,26 +109,28 @@ struct octep_ctrl_net_h2f_req_cmd_link_info { /* Host to fw request data */ struct octep_ctrl_net_h2f_req { - struct octep_ctrl_net_req_hdr hdr; + union octep_ctrl_net_req_hdr hdr; union { struct octep_ctrl_net_h2f_req_cmd_mtu mtu; struct octep_ctrl_net_h2f_req_cmd_mac mac; - struct octep_ctrl_net_h2f_req_cmd_get_stats get_stats; struct octep_ctrl_net_h2f_req_cmd_state link; struct octep_ctrl_net_h2f_req_cmd_state rx; struct octep_ctrl_net_h2f_req_cmd_link_info link_info; }; } __packed; -struct octep_ctrl_net_resp_hdr { - /* sender id */ - u16 sender; - /* receiver id */ - u16 receiver; - /* octep_ctrl_net_h2t_cmd */ - u16 cmd; - /* octep_ctrl_net_reply */ - u16 reply; +union octep_ctrl_net_resp_hdr { + u64 words[1]; + struct { + /* sender id */ + u16 sender; + /* receiver id */ + u16 receiver; + /* octep_ctrl_net_h2t_cmd */ + u16 cmd; + /* octep_ctrl_net_reply */ + u16 reply; + } s; }; /* get mtu response */ @@ -144,6 +145,12 @@ struct octep_ctrl_net_h2f_resp_cmd_mac { u8 addr[ETH_ALEN]; }; +/* get if_stats, xstats, 
q_stats request */ +struct octep_ctrl_net_h2f_resp_cmd_get_stats { + struct octep_iface_rx_stats rx_stats; + struct octep_iface_tx_stats tx_stats; +}; + /* get link state, rx state response */ struct octep_ctrl_net_h2f_resp_cmd_state { /* enum octep_ctrl_net_state */ @@ -152,10 +159,11 @@ struct octep_ctrl_net_h2f_resp_cmd_state { /* Host to fw response data */ struct octep_ctrl_net_h2f_resp { - struct octep_ctrl_net_resp_hdr hdr; + union octep_ctrl_net_resp_hdr hdr; union { struct octep_ctrl_net_h2f_resp_cmd_mtu mtu; struct octep_ctrl_net_h2f_resp_cmd_mac mac; + struct octep_ctrl_net_h2f_resp_cmd_get_stats if_stats; struct octep_ctrl_net_h2f_resp_cmd_state link; struct octep_ctrl_net_h2f_resp_cmd_state rx; struct octep_ctrl_net_link_info link_info; @@ -170,7 +178,7 @@ struct octep_ctrl_net_f2h_req_cmd_state { /* Fw to host request data */ struct octep_ctrl_net_f2h_req { - struct octep_ctrl_net_req_hdr hdr; + union octep_ctrl_net_req_hdr hdr; union { struct octep_ctrl_net_f2h_req_cmd_state link; }; @@ -178,122 +186,152 @@ struct octep_ctrl_net_f2h_req { /* Fw to host response data */ struct octep_ctrl_net_f2h_resp { - struct octep_ctrl_net_resp_hdr hdr; + union octep_ctrl_net_resp_hdr hdr; }; -/* Size of host to fw octep_ctrl_mbox queue element */ -union octep_ctrl_net_h2f_data_sz { +/* Max data size to be transferred over mbox */ +union octep_ctrl_net_max_data { struct octep_ctrl_net_h2f_req h2f_req; struct octep_ctrl_net_h2f_resp h2f_resp; -}; - -/* Size of fw to host octep_ctrl_mbox queue element */ -union octep_ctrl_net_f2h_data_sz { struct octep_ctrl_net_f2h_req f2h_req; struct octep_ctrl_net_f2h_resp f2h_resp; }; -/* size of host to fw data in words */ -#define OCTEP_CTRL_NET_H2F_DATA_SZW ((sizeof(union octep_ctrl_net_h2f_data_sz)) / \ - (sizeof(unsigned long))) - -/* size of fw to host data in words */ -#define OCTEP_CTRL_NET_F2H_DATA_SZW ((sizeof(union octep_ctrl_net_f2h_data_sz)) / \ - (sizeof(unsigned long))) - -/* size in words of get/set mtu request */ 
-#define OCTEP_CTRL_NET_H2F_MTU_REQ_SZW 2 -/* size in words of get/set mac request */ -#define OCTEP_CTRL_NET_H2F_MAC_REQ_SZW 2 -/* size in words of get stats request */ -#define OCTEP_CTRL_NET_H2F_GET_STATS_REQ_SZW 2 -/* size in words of get/set state request */ -#define OCTEP_CTRL_NET_H2F_STATE_REQ_SZW 2 -/* size in words of get/set link info request */ -#define OCTEP_CTRL_NET_H2F_LINK_INFO_REQ_SZW 4 - -/* size in words of get mtu response */ -#define OCTEP_CTRL_NET_H2F_GET_MTU_RESP_SZW 2 -/* size in words of set mtu response */ -#define OCTEP_CTRL_NET_H2F_SET_MTU_RESP_SZW 1 -/* size in words of get mac response */ -#define OCTEP_CTRL_NET_H2F_GET_MAC_RESP_SZW 2 -/* size in words of set mac response */ -#define OCTEP_CTRL_NET_H2F_SET_MAC_RESP_SZW 1 -/* size in words of get state request */ -#define OCTEP_CTRL_NET_H2F_GET_STATE_RESP_SZW 2 -/* size in words of set state request */ -#define OCTEP_CTRL_NET_H2F_SET_STATE_RESP_SZW 1 -/* size in words of get link info request */ -#define OCTEP_CTRL_NET_H2F_GET_LINK_INFO_RESP_SZW 4 -/* size in words of set link info request */ -#define OCTEP_CTRL_NET_H2F_SET_LINK_INFO_RESP_SZW 1 +struct octep_ctrl_net_wait_data { + struct list_head list; + int done; + struct octep_ctrl_mbox_msg msg; + union { + struct octep_ctrl_net_h2f_req req; + struct octep_ctrl_net_h2f_resp resp; + } data; +}; + +/** Initialize data for ctrl net. + * + * @param oct: non-null pointer to struct octep_device. + * + * return value: 0 on success, -errno on error. + */ +int octep_ctrl_net_init(struct octep_device *oct); /** Get link status from firmware. * * @param oct: non-null pointer to struct octep_device. + * @param vfid: Index of virtual function. * * return value: link status 0=down, 1=up. */ -int octep_get_link_status(struct octep_device *oct); +int octep_ctrl_net_get_link_status(struct octep_device *oct, int vfid); /** Set link status in firmware. * * @param oct: non-null pointer to struct octep_device. + * @param vfid: Index of virtual function. 
* @param up: boolean status. + * @param wait_for_response: poll for response. + * + * return value: 0 on success, -errno on failure */ -void octep_set_link_status(struct octep_device *oct, bool up); +int octep_ctrl_net_set_link_status(struct octep_device *oct, int vfid, bool up, + bool wait_for_response); /** Set rx state in firmware. * * @param oct: non-null pointer to struct octep_device. + * @param vfid: Index of virtual function. * @param up: boolean status. + * @param wait_for_response: poll for response. + * + * return value: 0 on success, -errno on failure. */ -void octep_set_rx_state(struct octep_device *oct, bool up); +int octep_ctrl_net_set_rx_state(struct octep_device *oct, int vfid, bool up, + bool wait_for_response); /** Get mac address from firmware. * * @param oct: non-null pointer to struct octep_device. + * @param vfid: Index of virtual function. * @param addr: non-null pointer to mac address. * * return value: 0 on success, -errno on failure. */ -int octep_get_mac_addr(struct octep_device *oct, u8 *addr); +int octep_ctrl_net_get_mac_addr(struct octep_device *oct, int vfid, u8 *addr); /** Set mac address in firmware. * * @param oct: non-null pointer to struct octep_device. + * @param vfid: Index of virtual function. * @param addr: non-null pointer to mac address. + * @param wait_for_response: poll for response. + * + * return value: 0 on success, -errno on failure. */ -int octep_set_mac_addr(struct octep_device *oct, u8 *addr); +int octep_ctrl_net_set_mac_addr(struct octep_device *oct, int vfid, u8 *addr, + bool wait_for_response); /** Set mtu in firmware. * * @param oct: non-null pointer to struct octep_device. + * @param vfid: Index of virtual function. * @param mtu: mtu. + * @param wait_for_response: poll for response. + * + * return value: 0 on success, -errno on failure. 
*/ -int octep_set_mtu(struct octep_device *oct, int mtu); +int octep_ctrl_net_set_mtu(struct octep_device *oct, int vfid, int mtu, + bool wait_for_response); /** Get interface statistics from firmware. * * @param oct: non-null pointer to struct octep_device. + * @param vfid: Index of virtual function. + * @param rx_stats: non-null pointer struct octep_iface_rx_stats. + * @param tx_stats: non-null pointer struct octep_iface_tx_stats. * * return value: 0 on success, -errno on failure. */ -int octep_get_if_stats(struct octep_device *oct); +int octep_ctrl_net_get_if_stats(struct octep_device *oct, int vfid, + struct octep_iface_rx_stats *rx_stats, + struct octep_iface_tx_stats *tx_stats); /** Get link info from firmware. * * @param oct: non-null pointer to struct octep_device. + * @param vfid: Index of virtual function. + * @param link_info: non-null pointer to struct octep_iface_link_info. * * return value: 0 on success, -errno on failure. */ -int octep_get_link_info(struct octep_device *oct); +int octep_ctrl_net_get_link_info(struct octep_device *oct, int vfid, + struct octep_iface_link_info *link_info); /** Set link info in firmware. * * @param oct: non-null pointer to struct octep_device. + * @param vfid: Index of virtual function. + * @param link_info: non-null pointer to struct octep_iface_link_info. + * @param wait_for_response: poll for response. + * + * return value: 0 on success, -errno on failure. + */ +int octep_ctrl_net_set_link_info(struct octep_device *oct, + int vfid, + struct octep_iface_link_info *link_info, + bool wait_for_response); + +/** Poll for firmware messages and process them. + * + * @param oct: non-null pointer to struct octep_device. + */ +void octep_ctrl_net_recv_fw_messages(struct octep_device *oct); + +/** Uninitialize data for ctrl net. + * + * @param oct: non-null pointer to struct octep_device. + * + * return value: 0 on success, -errno on error. 
*/ -int octep_set_link_info(struct octep_device *oct, struct octep_iface_link_info *link_info); +int octep_ctrl_net_uninit(struct octep_device *oct); #endif /* __OCTEP_CTRL_NET_H__ */ diff --git a/drivers/net/ethernet/marvell/octeon_ep/octep_ethtool.c b/drivers/net/ethernet/marvell/octeon_ep/octep_ethtool.c index 87ef129b269a..7d0124b283da 100644 --- a/drivers/net/ethernet/marvell/octeon_ep/octep_ethtool.c +++ b/drivers/net/ethernet/marvell/octeon_ep/octep_ethtool.c @@ -150,9 +150,12 @@ octep_get_ethtool_stats(struct net_device *netdev, rx_packets = 0; rx_bytes = 0; - octep_get_if_stats(oct); iface_tx_stats = &oct->iface_tx_stats; iface_rx_stats = &oct->iface_rx_stats; + octep_ctrl_net_get_if_stats(oct, + OCTEP_CTRL_NET_INVALID_VFID, + iface_rx_stats, + iface_tx_stats); for (q = 0; q < oct->num_oqs; q++) { struct octep_iq *iq = oct->iq[q]; @@ -283,11 +286,11 @@ static int octep_get_link_ksettings(struct net_device *netdev, ethtool_link_ksettings_zero_link_mode(cmd, supported); ethtool_link_ksettings_zero_link_mode(cmd, advertising); - octep_get_link_info(oct); + link_info = &oct->link_info; + octep_ctrl_net_get_link_info(oct, OCTEP_CTRL_NET_INVALID_VFID, link_info); advertised_modes = oct->link_info.advertised_modes; supported_modes = oct->link_info.supported_modes; - link_info = &oct->link_info; OCTEP_SET_ETHTOOL_LINK_MODES_BITMAP(supported_modes, cmd, supported); OCTEP_SET_ETHTOOL_LINK_MODES_BITMAP(advertised_modes, cmd, advertising); @@ -439,7 +442,8 @@ static int octep_set_link_ksettings(struct net_device *netdev, link_info_new.speed = cmd->base.speed; link_info_new.autoneg = autoneg; - err = octep_set_link_info(oct, &link_info_new); + err = octep_ctrl_net_set_link_info(oct, OCTEP_CTRL_NET_INVALID_VFID, + &link_info_new, true); if (err) return err; diff --git a/drivers/net/ethernet/marvell/octeon_ep/octep_main.c b/drivers/net/ethernet/marvell/octeon_ep/octep_main.c index 5a898fb88e37..e1853da280f9 100644 --- a/drivers/net/ethernet/marvell/octeon_ep/octep_main.c 
+++ b/drivers/net/ethernet/marvell/octeon_ep/octep_main.c @@ -8,7 +8,6 @@ #include <linux/types.h> #include <linux/module.h> #include <linux/pci.h> -#include <linux/aer.h> #include <linux/netdevice.h> #include <linux/etherdevice.h> #include <linux/rtnetlink.h> @@ -18,6 +17,7 @@ #include "octep_main.h" #include "octep_ctrl_net.h" +#define OCTEP_INTR_POLL_TIME_MSECS 100 struct workqueue_struct *octep_wq; /* Supported Devices */ @@ -507,11 +507,11 @@ static int octep_open(struct net_device *netdev) octep_napi_enable(oct); oct->link_info.admin_up = 1; - octep_set_rx_state(oct, true); - - ret = octep_get_link_status(oct); - if (!ret) - octep_set_link_status(oct, true); + octep_ctrl_net_set_rx_state(oct, OCTEP_CTRL_NET_INVALID_VFID, true, + false); + octep_ctrl_net_set_link_status(oct, OCTEP_CTRL_NET_INVALID_VFID, true, + false); + oct->poll_non_ioq_intr = false; /* Enable the input and output queues for this Octeon device */ oct->hw_ops.enable_io_queues(oct); @@ -521,7 +521,7 @@ static int octep_open(struct net_device *netdev) octep_oq_dbell_init(oct); - ret = octep_get_link_status(oct); + ret = octep_ctrl_net_get_link_status(oct, OCTEP_CTRL_NET_INVALID_VFID); if (ret > 0) octep_link_up(netdev); @@ -551,14 +551,16 @@ static int octep_stop(struct net_device *netdev) netdev_info(netdev, "Stopping the device ...\n"); + octep_ctrl_net_set_link_status(oct, OCTEP_CTRL_NET_INVALID_VFID, false, + false); + octep_ctrl_net_set_rx_state(oct, OCTEP_CTRL_NET_INVALID_VFID, false, + false); + /* Stop Tx from stack */ netif_tx_stop_all_queues(netdev); netif_carrier_off(netdev); netif_tx_disable(netdev); - octep_set_link_status(oct, false); - octep_set_rx_state(oct, false); - oct->link_info.admin_up = 0; oct->link_info.oper_up = 0; @@ -573,6 +575,11 @@ static int octep_stop(struct net_device *netdev) oct->hw_ops.reset_io_queues(oct); octep_free_oqs(oct); octep_free_iqs(oct); + + oct->poll_non_ioq_intr = true; + queue_delayed_work(octep_wq, &oct->intr_poll_task, + 
msecs_to_jiffies(OCTEP_INTR_POLL_TIME_MSECS)); + netdev_info(netdev, "Device stopped !!\n"); return 0; } @@ -755,7 +762,12 @@ static void octep_get_stats64(struct net_device *netdev, struct octep_device *oct = netdev_priv(netdev); int q; - octep_get_if_stats(oct); + if (netif_running(netdev)) + octep_ctrl_net_get_if_stats(oct, + OCTEP_CTRL_NET_INVALID_VFID, + &oct->iface_rx_stats, + &oct->iface_tx_stats); + tx_packets = 0; tx_bytes = 0; rx_packets = 0; @@ -826,7 +838,8 @@ static int octep_set_mac(struct net_device *netdev, void *p) if (!is_valid_ether_addr(addr->sa_data)) return -EADDRNOTAVAIL; - err = octep_set_mac_addr(oct, addr->sa_data); + err = octep_ctrl_net_set_mac_addr(oct, OCTEP_CTRL_NET_INVALID_VFID, + addr->sa_data, true); if (err) return err; @@ -846,7 +859,8 @@ static int octep_change_mtu(struct net_device *netdev, int new_mtu) if (link_info->mtu == new_mtu) return 0; - err = octep_set_mtu(oct, new_mtu); + err = octep_ctrl_net_set_mtu(oct, OCTEP_CTRL_NET_INVALID_VFID, new_mtu, + true); if (!err) { oct->link_info.mtu = new_mtu; netdev->mtu = new_mtu; @@ -866,6 +880,59 @@ static const struct net_device_ops octep_netdev_ops = { }; /** + * octep_intr_poll_task - work queue task to process non-ioq interrupts. + * + * @work: pointer to mbox work_struct + * + * Process non-ioq interrupts to handle control mailbox, pfvf mailbox. + **/ +static void octep_intr_poll_task(struct work_struct *work) +{ + struct octep_device *oct = container_of(work, struct octep_device, + intr_poll_task.work); + + if (!oct->poll_non_ioq_intr) { + dev_info(&oct->pdev->dev, "Interrupt poll task stopped.\n"); + return; + } + + oct->hw_ops.poll_non_ioq_interrupts(oct); + queue_delayed_work(octep_wq, &oct->intr_poll_task, + msecs_to_jiffies(OCTEP_INTR_POLL_TIME_MSECS)); +} + +/** + * octep_hb_timeout_task - work queue task to check firmware heartbeat. + * + * @work: pointer to hb work_struct + * + * Check for heartbeat miss count. 
Uninitialize oct device if miss count + * exceeds configured max heartbeat miss count. + * + **/ +static void octep_hb_timeout_task(struct work_struct *work) +{ + struct octep_device *oct = container_of(work, struct octep_device, + hb_task.work); + + int miss_cnt; + + miss_cnt = atomic_inc_return(&oct->hb_miss_cnt); + if (miss_cnt < oct->conf->max_hb_miss_cnt) { + queue_delayed_work(octep_wq, &oct->hb_task, + msecs_to_jiffies(oct->conf->hb_interval * 1000)); + return; + } + + dev_err(&oct->pdev->dev, "Missed %u heartbeats. Uninitializing\n", + miss_cnt); + rtnl_lock(); + if (netif_running(oct->netdev)) + octep_stop(oct->netdev); + rtnl_unlock(); +} + +/** * octep_ctrl_mbox_task - work queue task to handle ctrl mbox messages. * * @work: pointer to ctrl mbox work_struct @@ -876,34 +943,8 @@ static void octep_ctrl_mbox_task(struct work_struct *work) { struct octep_device *oct = container_of(work, struct octep_device, ctrl_mbox_task); - struct net_device *netdev = oct->netdev; - struct octep_ctrl_net_f2h_req req = {}; - struct octep_ctrl_mbox_msg msg; - int ret = 0; - - msg.msg = &req; - while (true) { - ret = octep_ctrl_mbox_recv(&oct->ctrl_mbox, &msg); - if (ret) - break; - - switch (req.hdr.cmd) { - case OCTEP_CTRL_NET_F2H_CMD_LINK_STATUS: - if (netif_running(netdev)) { - if (req.link.state) { - dev_info(&oct->pdev->dev, "netif_carrier_on\n"); - netif_carrier_on(netdev); - } else { - dev_info(&oct->pdev->dev, "netif_carrier_off\n"); - netif_carrier_off(netdev); - } - } - break; - default: - pr_info("Unknown mbox req : %u\n", req.hdr.cmd); - break; - } - } + + octep_ctrl_net_recv_fw_messages(oct); } static const char *octep_devid_to_str(struct octep_device *oct) @@ -927,7 +968,6 @@ static const char *octep_devid_to_str(struct octep_device *oct) */ int octep_device_setup(struct octep_device *oct) { - struct octep_ctrl_mbox *ctrl_mbox; struct pci_dev *pdev = oct->pdev; int i, ret; @@ -964,19 +1004,14 @@ int octep_device_setup(struct octep_device *oct) oct->pkind = 
CFG_GET_IQ_PKIND(oct->conf); - /* Initialize control mbox */ - ctrl_mbox = &oct->ctrl_mbox; - ctrl_mbox->barmem = CFG_GET_CTRL_MBOX_MEM_ADDR(oct->conf); - ret = octep_ctrl_mbox_init(ctrl_mbox); - if (ret) { - dev_err(&pdev->dev, "Failed to initialize control mbox\n"); - goto unsupported_dev; - } - oct->ctrl_mbox_ifstats_offset = OCTEP_CTRL_MBOX_SZ(ctrl_mbox->h2fq.elem_sz, - ctrl_mbox->h2fq.elem_cnt, - ctrl_mbox->f2hq.elem_sz, - ctrl_mbox->f2hq.elem_cnt); + ret = octep_ctrl_net_init(oct); + if (ret) + return ret; + atomic_set(&oct->hb_miss_cnt, 0); + INIT_DELAYED_WORK(&oct->hb_task, octep_hb_timeout_task); + queue_delayed_work(octep_wq, &oct->hb_task, + msecs_to_jiffies(oct->conf->hb_interval * 1000)); return 0; unsupported_dev: @@ -1005,7 +1040,8 @@ static void octep_device_cleanup(struct octep_device *oct) oct->mbox[i] = NULL; } - octep_ctrl_mbox_uninit(&oct->ctrl_mbox); + octep_ctrl_net_uninit(oct); + cancel_delayed_work_sync(&oct->hb_task); oct->hw_ops.soft_reset(oct); for (i = 0; i < OCTEP_MMIO_REGIONS; i++) { @@ -1017,6 +1053,26 @@ static void octep_device_cleanup(struct octep_device *oct) oct->conf = NULL; } +static bool get_fw_ready_status(struct pci_dev *pdev) +{ + u32 pos = 0; + u16 vsec_id; + u8 status; + + while ((pos = pci_find_next_ext_capability(pdev, pos, + PCI_EXT_CAP_ID_VNDR))) { + pci_read_config_word(pdev, pos + 4, &vsec_id); +#define FW_STATUS_VSEC_ID 0xA3 + if (vsec_id != FW_STATUS_VSEC_ID) + continue; + + pci_read_config_byte(pdev, (pos + 8), &status); + dev_info(&pdev->dev, "Firmware ready status = %u\n", status); + return status; + } + return false; +} + /** * octep_probe() - Octeon PCI device probe handler. 
* @@ -1050,9 +1106,14 @@ static int octep_probe(struct pci_dev *pdev, const struct pci_device_id *ent) goto err_pci_regions; } - pci_enable_pcie_error_reporting(pdev); pci_set_master(pdev); + if (!get_fw_ready_status(pdev)) { + dev_notice(&pdev->dev, "Firmware not ready; defer probe.\n"); + err = -EPROBE_DEFER; + goto err_alloc_netdev; + } + netdev = alloc_etherdev_mq(sizeof(struct octep_device), OCTEP_MAX_QUEUES); if (!netdev) { @@ -1075,6 +1136,10 @@ static int octep_probe(struct pci_dev *pdev, const struct pci_device_id *ent) } INIT_WORK(&octep_dev->tx_timeout_task, octep_tx_timeout_task); INIT_WORK(&octep_dev->ctrl_mbox_task, octep_ctrl_mbox_task); + INIT_DELAYED_WORK(&octep_dev->intr_poll_task, octep_intr_poll_task); + octep_dev->poll_non_ioq_intr = true; + queue_delayed_work(octep_wq, &octep_dev->intr_poll_task, + msecs_to_jiffies(OCTEP_INTR_POLL_TIME_MSECS)); netdev->netdev_ops = &octep_netdev_ops; octep_set_ethtool_ops(netdev); @@ -1086,7 +1151,8 @@ static int octep_probe(struct pci_dev *pdev, const struct pci_device_id *ent) netdev->max_mtu = OCTEP_MAX_MTU; netdev->mtu = OCTEP_DEFAULT_MTU; - err = octep_get_mac_addr(octep_dev, octep_dev->mac_addr); + err = octep_ctrl_net_get_mac_addr(octep_dev, OCTEP_CTRL_NET_INVALID_VFID, + octep_dev->mac_addr); if (err) { dev_err(&pdev->dev, "Failed to get mac address\n"); goto register_dev_err; @@ -1106,7 +1172,6 @@ register_dev_err: err_octep_config: free_netdev(netdev); err_alloc_netdev: - pci_disable_pcie_error_reporting(pdev); pci_release_mem_regions(pdev); err_pci_regions: err_dma_mask: @@ -1136,10 +1201,11 @@ static void octep_remove(struct pci_dev *pdev) if (netdev->reg_state == NETREG_REGISTERED) unregister_netdev(netdev); + oct->poll_non_ioq_intr = false; + cancel_delayed_work_sync(&oct->intr_poll_task); octep_device_cleanup(oct); pci_release_mem_regions(pdev); free_netdev(netdev); - pci_disable_pcie_error_reporting(pdev); pci_disable_device(pdev); } diff --git 
a/drivers/net/ethernet/marvell/octeon_ep/octep_main.h b/drivers/net/ethernet/marvell/octeon_ep/octep_main.h index 123ffc13754d..e0907a719133 100644 --- a/drivers/net/ethernet/marvell/octeon_ep/octep_main.h +++ b/drivers/net/ethernet/marvell/octeon_ep/octep_main.h @@ -73,6 +73,7 @@ struct octep_hw_ops { void (*enable_interrupts)(struct octep_device *oct); void (*disable_interrupts)(struct octep_device *oct); + bool (*poll_non_ioq_interrupts)(struct octep_device *oct); void (*enable_io_queues)(struct octep_device *oct); void (*disable_io_queues)(struct octep_device *oct); @@ -270,7 +271,22 @@ struct octep_device { /* Work entry to handle ctrl mbox interrupt */ struct work_struct ctrl_mbox_task; - + /* Wait queue for host to firmware requests */ + wait_queue_head_t ctrl_req_wait_q; + /* List of objects waiting for h2f response */ + struct list_head ctrl_req_wait_list; + + /* Enable non-ioq interrupt polling */ + bool poll_non_ioq_intr; + /* Work entry to poll non-ioq interrupts */ + struct delayed_work intr_poll_task; + + /* Firmware heartbeat timer */ + struct timer_list hb_timer; + /* Firmware heartbeat miss count tracked by timer */ + atomic_t hb_miss_cnt; + /* Task to reset device on heartbeat miss */ + struct delayed_work hb_task; }; static inline u16 OCTEP_MAJOR_REV(struct octep_device *oct) diff --git a/drivers/net/ethernet/marvell/octeon_ep/octep_regs_cn9k_pf.h b/drivers/net/ethernet/marvell/octeon_ep/octep_regs_cn9k_pf.h index 3d5d39a52fe6..b25c3093dc7b 100644 --- a/drivers/net/ethernet/marvell/octeon_ep/octep_regs_cn9k_pf.h +++ b/drivers/net/ethernet/marvell/octeon_ep/octep_regs_cn9k_pf.h @@ -364,4 +364,10 @@ /* Number of non-queue interrupts in CN93xx */ #define CN93_NUM_NON_IOQ_INTR 16 + +/* bit 0 for control mbox interrupt */ +#define CN93_SDP_EPF_OEI_RINT_DATA_BIT_MBOX BIT_ULL(0) +/* bit 1 for firmware heartbeat interrupt */ +#define CN93_SDP_EPF_OEI_RINT_DATA_BIT_HBEAT BIT_ULL(1) + #endif /* _OCTEP_REGS_CN9K_PF_H_ */ diff --git 
a/drivers/net/ethernet/marvell/octeontx2/af/mbox.h b/drivers/net/ethernet/marvell/octeontx2/af/mbox.h index 5727d67e0259..8fb5cae7285b 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/mbox.h +++ b/drivers/net/ethernet/marvell/octeontx2/af/mbox.h @@ -936,7 +936,7 @@ struct nix_aq_enq_req { struct nix_cq_ctx_s cq; struct nix_rsse_s rss; struct nix_rx_mce_s mce; - u64 prof; + struct nix_bandprof_s prof; }; union { struct nix_rq_ctx_s rq_mask; @@ -944,7 +944,7 @@ struct nix_aq_enq_req { struct nix_cq_ctx_s cq_mask; struct nix_rsse_s rss_mask; struct nix_rx_mce_s mce_mask; - u64 prof_mask; + struct nix_bandprof_s prof_mask; }; }; diff --git a/drivers/net/ethernet/marvell/pxa168_eth.c b/drivers/net/ethernet/marvell/pxa168_eth.c index 87fff539d39d..d5691b6a2bc5 100644 --- a/drivers/net/ethernet/marvell/pxa168_eth.c +++ b/drivers/net/ethernet/marvell/pxa168_eth.c @@ -1586,7 +1586,7 @@ static struct platform_driver pxa168_eth_driver = { .suspend = pxa168_eth_suspend, .driver = { .name = DRIVER_NAME, - .of_match_table = of_match_ptr(pxa168_eth_of_match), + .of_match_table = pxa168_eth_of_match, }, }; diff --git a/drivers/net/ethernet/mediatek/Kconfig b/drivers/net/ethernet/mediatek/Kconfig index 97374fb3ee79..da0db417ab69 100644 --- a/drivers/net/ethernet/mediatek/Kconfig +++ b/drivers/net/ethernet/mediatek/Kconfig @@ -19,6 +19,8 @@ config NET_MEDIATEK_SOC select DIMLIB select PAGE_POOL select PAGE_POOL_STATS + select PCS_MTK_LYNXI + select REGMAP_MMIO help This driver supports the gigabit ethernet MACs in the MediaTek SoC family. 
diff --git a/drivers/net/ethernet/mediatek/Makefile b/drivers/net/ethernet/mediatek/Makefile index 8e0c61c33ff8..03e008fbc859 100644 --- a/drivers/net/ethernet/mediatek/Makefile +++ b/drivers/net/ethernet/mediatek/Makefile @@ -4,7 +4,7 @@ # obj-$(CONFIG_NET_MEDIATEK_SOC) += mtk_eth.o -mtk_eth-y := mtk_eth_soc.o mtk_sgmii.o mtk_eth_path.o mtk_ppe.o mtk_ppe_debugfs.o mtk_ppe_offload.o +mtk_eth-y := mtk_eth_soc.o mtk_eth_path.o mtk_ppe.o mtk_ppe_debugfs.o mtk_ppe_offload.o mtk_eth-$(CONFIG_NET_MEDIATEK_SOC_WED) += mtk_wed.o mtk_wed_mcu.o mtk_wed_wo.o ifdef CONFIG_DEBUG_FS mtk_eth-$(CONFIG_NET_MEDIATEK_SOC_WED) += mtk_wed_debugfs.o diff --git a/drivers/net/ethernet/mediatek/mtk_eth_path.c b/drivers/net/ethernet/mediatek/mtk_eth_path.c index 72648535a14d..317e447f4991 100644 --- a/drivers/net/ethernet/mediatek/mtk_eth_path.c +++ b/drivers/net/ethernet/mediatek/mtk_eth_path.c @@ -96,12 +96,20 @@ static int set_mux_gmac2_gmac0_to_gephy(struct mtk_eth *eth, int path) static int set_mux_u3_gmac2_to_qphy(struct mtk_eth *eth, int path) { - unsigned int val = 0; + unsigned int val = 0, mask = 0, reg = 0; bool updated = true; switch (path) { case MTK_ETH_PATH_GMAC2_SGMII: - val = CO_QPHY_SEL; + if (MTK_HAS_CAPS(eth->soc->caps, MTK_U3_COPHY_V2)) { + reg = USB_PHY_SWITCH_REG; + val = SGMII_QPHY_SEL; + mask = QPHY_SEL_MASK; + } else { + reg = INFRA_MISC2; + val = CO_QPHY_SEL; + mask = val; + } break; default: updated = false; @@ -109,7 +117,7 @@ static int set_mux_u3_gmac2_to_qphy(struct mtk_eth *eth, int path) } if (updated) - regmap_update_bits(eth->infra, INFRA_MISC2, CO_QPHY_SEL, val); + regmap_update_bits(eth->infra, reg, mask, val); dev_dbg(eth->dev, "path %s in %s updated = %d\n", mtk_eth_path_name(path), __func__, updated); diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c index e14050e17862..9e948d091a69 100644 --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c @@ -20,6 
+20,7 @@ #include <linux/interrupt.h> #include <linux/pinctrl/devinfo.h> #include <linux/phylink.h> +#include <linux/pcs/pcs-mtk-lynxi.h> #include <linux/jhash.h> #include <linux/bitfield.h> #include <net/dsa.h> @@ -374,17 +375,6 @@ static int mt7621_gmac0_rgmii_adjust(struct mtk_eth *eth, { u32 val; - /* Check DDR memory type. - * Currently TRGMII mode with DDR2 memory is not supported. - */ - regmap_read(eth->ethsys, ETHSYS_SYSCFG, &val); - if (interface == PHY_INTERFACE_MODE_TRGMII && - val & SYSCFG_DRAM_TYPE_DDR2) { - dev_err(eth->dev, - "TRGMII mode with DDR2 memory is not supported!\n"); - return -EOPNOTSUPP; - } - val = (interface == PHY_INTERFACE_MODE_TRGMII) ? ETHSYS_TRGMII_MT7621_DDR_PLL : 0; @@ -397,38 +387,42 @@ static int mt7621_gmac0_rgmii_adjust(struct mtk_eth *eth, static void mtk_gmac0_rgmii_adjust(struct mtk_eth *eth, phy_interface_t interface, int speed) { - u32 val; + unsigned long rate; + u32 tck, rck, intf; int ret; if (interface == PHY_INTERFACE_MODE_TRGMII) { mtk_w32(eth, TRGMII_MODE, INTF_MODE); - val = 500000000; - ret = clk_set_rate(eth->clks[MTK_CLK_TRGPLL], val); + ret = clk_set_rate(eth->clks[MTK_CLK_TRGPLL], 500000000); if (ret) dev_err(eth->dev, "Failed to set trgmii pll: %d\n", ret); return; } - val = (speed == SPEED_1000) ? - INTF_MODE_RGMII_1000 : INTF_MODE_RGMII_10_100; - mtk_w32(eth, val, INTF_MODE); + if (speed == SPEED_1000) { + intf = INTF_MODE_RGMII_1000; + rate = 250000000; + rck = RCK_CTRL_RGMII_1000; + tck = TCK_CTRL_RGMII_1000; + } else { + intf = INTF_MODE_RGMII_10_100; + rate = 500000000; + rck = RCK_CTRL_RGMII_10_100; + tck = TCK_CTRL_RGMII_10_100; + } + + mtk_w32(eth, intf, INTF_MODE); regmap_update_bits(eth->ethsys, ETHSYS_CLKCFG0, ETHSYS_TRGMII_CLK_SEL362_5, ETHSYS_TRGMII_CLK_SEL362_5); - val = (speed == SPEED_1000) ? 
250000000 : 500000000; - ret = clk_set_rate(eth->clks[MTK_CLK_TRGPLL], val); + ret = clk_set_rate(eth->clks[MTK_CLK_TRGPLL], rate); if (ret) dev_err(eth->dev, "Failed to set trgmii pll: %d\n", ret); - val = (speed == SPEED_1000) ? - RCK_CTRL_RGMII_1000 : RCK_CTRL_RGMII_10_100; - mtk_w32(eth, val, TRGMII_RCK_CTRL); - - val = (speed == SPEED_1000) ? - TCK_CTRL_RGMII_1000 : TCK_CTRL_RGMII_10_100; - mtk_w32(eth, val, TRGMII_TCK_CTRL); + mtk_w32(eth, rck, TRGMII_RCK_CTRL); + mtk_w32(eth, tck, TRGMII_TCK_CTRL); } static struct phylink_pcs *mtk_mac_select_pcs(struct phylink_config *config, @@ -444,7 +438,7 @@ static struct phylink_pcs *mtk_mac_select_pcs(struct phylink_config *config, sid = (MTK_HAS_CAPS(eth->soc->caps, MTK_SHARED_SGMII)) ? 0 : mac->id; - return mtk_sgmii_select_pcs(eth->sgmii, sid); + return eth->sgmii_pcs[sid]; } return NULL; @@ -465,19 +459,11 @@ static void mtk_mac_config(struct phylink_config *config, unsigned int mode, /* Setup soc pin functions */ switch (state->interface) { case PHY_INTERFACE_MODE_TRGMII: - if (mac->id) - goto err_phy; - if (!MTK_HAS_CAPS(mac->hw->soc->caps, - MTK_GMAC1_TRGMII)) - goto err_phy; - fallthrough; case PHY_INTERFACE_MODE_RGMII_TXID: case PHY_INTERFACE_MODE_RGMII_RXID: case PHY_INTERFACE_MODE_RGMII_ID: case PHY_INTERFACE_MODE_RGMII: case PHY_INTERFACE_MODE_MII: - case PHY_INTERFACE_MODE_REVMII: - case PHY_INTERFACE_MODE_RMII: if (MTK_HAS_CAPS(eth->soc->caps, MTK_RGMII)) { err = mtk_gmac_rgmii_path_setup(eth, mac->id); if (err) @@ -487,11 +473,9 @@ static void mtk_mac_config(struct phylink_config *config, unsigned int mode, case PHY_INTERFACE_MODE_1000BASEX: case PHY_INTERFACE_MODE_2500BASEX: case PHY_INTERFACE_MODE_SGMII: - if (MTK_HAS_CAPS(eth->soc->caps, MTK_SGMII)) { - err = mtk_gmac_sgmii_path_setup(eth, mac->id); - if (err) - goto init_err; - } + err = mtk_gmac_sgmii_path_setup(eth, mac->id); + if (err) + goto init_err; break; case PHY_INTERFACE_MODE_GMII: if (MTK_HAS_CAPS(eth->soc->caps, MTK_GEPHY)) { @@ -539,21 
+523,13 @@ static void mtk_mac_config(struct phylink_config *config, unsigned int mode, } } - ge_mode = 0; switch (state->interface) { case PHY_INTERFACE_MODE_MII: case PHY_INTERFACE_MODE_GMII: ge_mode = 1; break; - case PHY_INTERFACE_MODE_REVMII: - ge_mode = 2; - break; - case PHY_INTERFACE_MODE_RMII: - if (mac->id) - goto err_phy; - ge_mode = 3; - break; default: + ge_mode = 0; break; } @@ -789,8 +765,10 @@ static const struct phylink_mac_ops mtk_phylink_ops = { static int mtk_mdio_init(struct mtk_eth *eth) { + unsigned int max_clk = 2500000, divider; struct device_node *mii_np; int ret; + u32 val; mii_np = of_get_child_by_name(eth->dev->of_node, "mdio-bus"); if (!mii_np) { @@ -818,6 +796,25 @@ static int mtk_mdio_init(struct mtk_eth *eth) eth->mii_bus->parent = eth->dev; snprintf(eth->mii_bus->id, MII_BUS_ID_SIZE, "%pOFn", mii_np); + + if (!of_property_read_u32(mii_np, "clock-frequency", &val)) { + if (val > MDC_MAX_FREQ || val < MDC_MAX_FREQ / MDC_MAX_DIVIDER) { + dev_err(eth->dev, "MDIO clock frequency out of range"); + ret = -EINVAL; + goto err_put_node; + } + max_clk = val; + } + divider = min_t(unsigned int, DIV_ROUND_UP(MDC_MAX_FREQ, max_clk), 63); + + /* Configure MDC Divider */ + val = mtk_r32(eth, MTK_PPSC); + val &= ~PPSC_MDC_CFG; + val |= FIELD_PREP(PPSC_MDC_CFG, divider) | PPSC_MDC_TURBO; + mtk_w32(eth, val, MTK_PPSC); + + dev_dbg(eth->dev, "MDC is running on %d Hz\n", MDC_MAX_FREQ / divider); + ret = of_mdiobus_register(eth->mii_bus, mii_np); err_put_node: @@ -4059,8 +4056,17 @@ static int mtk_unreg_dev(struct mtk_eth *eth) return 0; } +static void mtk_sgmii_destroy(struct mtk_eth *eth) +{ + int i; + + for (i = 0; i < MTK_MAX_DEVS; i++) + mtk_pcs_lynxi_destroy(eth->sgmii_pcs[i]); +} + static int mtk_cleanup(struct mtk_eth *eth) { + mtk_sgmii_destroy(eth); mtk_unreg_dev(eth); mtk_free_dev(eth); cancel_work_sync(&eth->pending_work); @@ -4332,6 +4338,7 @@ static int mtk_add_mac(struct mtk_eth *eth, struct device_node *np) struct mtk_mac *mac; int id, err; 
int txqs = 1; + u32 val; if (!_id) { dev_err(eth->dev, "missing mac id\n"); @@ -4408,6 +4415,15 @@ static int mtk_add_mac(struct mtk_eth *eth, struct device_node *np) __set_bit(PHY_INTERFACE_MODE_TRGMII, mac->phylink_config.supported_interfaces); + /* TRGMII is not permitted on MT7621 if using DDR2 */ + if (MTK_HAS_CAPS(mac->hw->soc->caps, MTK_GMAC1_TRGMII) && + MTK_HAS_CAPS(mac->hw->soc->caps, MTK_TRGMII_MT7621_CLK)) { + regmap_read(eth->ethsys, ETHSYS_SYSCFG, &val); + if (val & SYSCFG_DRAM_TYPE_DDR2) + __clear_bit(PHY_INTERFACE_MODE_TRGMII, + mac->phylink_config.supported_interfaces); + } + if (MTK_HAS_CAPS(mac->hw->soc->caps, MTK_SGMII)) { __set_bit(PHY_INTERFACE_MODE_SGMII, mac->phylink_config.supported_interfaces); @@ -4496,6 +4512,36 @@ void mtk_eth_set_dma_device(struct mtk_eth *eth, struct device *dma_dev) rtnl_unlock(); } +static int mtk_sgmii_init(struct mtk_eth *eth) +{ + struct device_node *np; + struct regmap *regmap; + u32 flags; + int i; + + for (i = 0; i < MTK_MAX_DEVS; i++) { + np = of_parse_phandle(eth->dev->of_node, "mediatek,sgmiisys", i); + if (!np) + break; + + regmap = syscon_node_to_regmap(np); + flags = 0; + if (of_property_read_bool(np, "mediatek,pnswap")) + flags |= MTK_SGMII_FLAG_PN_SWAP; + + of_node_put(np); + + if (IS_ERR(regmap)) + return PTR_ERR(regmap); + + eth->sgmii_pcs[i] = mtk_pcs_lynxi_create(eth->dev, regmap, + eth->soc->ana_rgc3, + flags); + } + + return 0; +} + static int mtk_probe(struct platform_device *pdev) { struct resource *res = NULL; @@ -4559,13 +4605,7 @@ static int mtk_probe(struct platform_device *pdev) } if (MTK_HAS_CAPS(eth->soc->caps, MTK_SGMII)) { - eth->sgmii = devm_kzalloc(eth->dev, sizeof(*eth->sgmii), - GFP_KERNEL); - if (!eth->sgmii) - return -ENOMEM; - - err = mtk_sgmii_init(eth->sgmii, pdev->dev.of_node, - eth->soc->ana_rgc3); + err = mtk_sgmii_init(eth); if (err) return err; @@ -4576,14 +4616,17 @@ static int mtk_probe(struct platform_device *pdev) "mediatek,pctl"); if (IS_ERR(eth->pctl)) { 
dev_err(&pdev->dev, "no pctl regmap found\n"); - return PTR_ERR(eth->pctl); + err = PTR_ERR(eth->pctl); + goto err_destroy_sgmii; } } if (MTK_HAS_CAPS(eth->soc->caps, MTK_NETSYS_V2)) { res = platform_get_resource(pdev, IORESOURCE_MEM, 0); - if (!res) - return -EINVAL; + if (!res) { + err = -EINVAL; + goto err_destroy_sgmii; + } } if (eth->soc->offload_version) { @@ -4693,8 +4736,8 @@ static int mtk_probe(struct platform_device *pdev) for (i = 0; i < num_ppe; i++) { u32 ppe_addr = eth->soc->reg_map->ppe_base + i * 0x400; - eth->ppe[i] = mtk_ppe_init(eth, eth->base + ppe_addr, - eth->soc->offload_version, i); + eth->ppe[i] = mtk_ppe_init(eth, eth->base + ppe_addr, i); + if (!eth->ppe[i]) { err = -ENOMEM; goto err_deinit_ppe; @@ -4742,6 +4785,8 @@ err_deinit_hw: mtk_hw_deinit(eth); err_wed_exit: mtk_wed_exit(); +err_destroy_sgmii: + mtk_sgmii_destroy(eth); return err; } @@ -4816,6 +4861,7 @@ static const struct mtk_soc_data mt7622_data = { .required_pctl = false, .offload_version = 2, .hash_offset = 2, + .has_accounting = true, .foe_entry_size = sizeof(struct mtk_foe_entry) - 16, .txrx = { .txd_size = sizeof(struct mtk_tx_dma), @@ -4853,6 +4899,7 @@ static const struct mtk_soc_data mt7629_data = { .hw_features = MTK_HW_FEATURES, .required_clks = MT7629_CLKS_BITMAP, .required_pctl = false, + .has_accounting = true, .txrx = { .txd_size = sizeof(struct mtk_tx_dma), .rxd_size = sizeof(struct mtk_rx_dma), @@ -4863,6 +4910,27 @@ static const struct mtk_soc_data mt7629_data = { }, }; +static const struct mtk_soc_data mt7981_data = { + .reg_map = &mt7986_reg_map, + .ana_rgc3 = 0x128, + .caps = MT7981_CAPS, + .hw_features = MTK_HW_FEATURES, + .required_clks = MT7981_CLKS_BITMAP, + .required_pctl = false, + .offload_version = 2, + .hash_offset = 4, + .foe_entry_size = sizeof(struct mtk_foe_entry), + .has_accounting = true, + .txrx = { + .txd_size = sizeof(struct mtk_tx_dma_v2), + .rxd_size = sizeof(struct mtk_rx_dma_v2), + .rx_irq_done_mask = MTK_RX_DONE_INT_V2, + 
.rx_dma_l4_valid = RX_DMA_L4_VALID_V2, + .dma_max_len = MTK_TX_DMA_BUF_LEN_V2, + .dma_len_offset = 8, + }, +}; + static const struct mtk_soc_data mt7986_data = { .reg_map = &mt7986_reg_map, .ana_rgc3 = 0x128, @@ -4873,6 +4941,7 @@ static const struct mtk_soc_data mt7986_data = { .offload_version = 2, .hash_offset = 4, .foe_entry_size = sizeof(struct mtk_foe_entry), + .has_accounting = true, .txrx = { .txd_size = sizeof(struct mtk_tx_dma_v2), .rxd_size = sizeof(struct mtk_rx_dma_v2), @@ -4905,6 +4974,7 @@ const struct of_device_id of_mtk_match[] = { { .compatible = "mediatek,mt7622-eth", .data = &mt7622_data}, { .compatible = "mediatek,mt7623-eth", .data = &mt7623_data}, { .compatible = "mediatek,mt7629-eth", .data = &mt7629_data}, + { .compatible = "mediatek,mt7981-eth", .data = &mt7981_data}, { .compatible = "mediatek,mt7986-eth", .data = &mt7986_data}, { .compatible = "ralink,rt5350-eth", .data = &rt5350_data}, {}, diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.h b/drivers/net/ethernet/mediatek/mtk_eth_soc.h index 084a6badef6d..cdcf8534283e 100644 --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.h +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.h @@ -363,6 +363,13 @@ #define RX_DMA_VTAG_V2 BIT(0) #define RX_DMA_L4_VALID_V2 BIT(2) +/* PHY Polling and SMI Master Control registers */ +#define MTK_PPSC 0x10000 +#define PPSC_MDC_CFG GENMASK(29, 24) +#define PPSC_MDC_TURBO BIT(20) +#define MDC_MAX_FREQ 25000000 +#define MDC_MAX_DIVIDER 63 + /* PHY Indirect Access Control registers */ #define MTK_PHY_IAC 0x10004 #define PHY_IAC_ACCESS BIT(31) @@ -503,64 +510,16 @@ #define ETHSYS_DMA_AG_MAP_QDMA BIT(1) #define ETHSYS_DMA_AG_MAP_PPE BIT(2) -/* SGMII subsystem config registers */ -/* BMCR (low 16) BMSR (high 16) */ -#define SGMSYS_PCS_CONTROL_1 0x0 -#define SGMII_BMCR GENMASK(15, 0) -#define SGMII_BMSR GENMASK(31, 16) -#define SGMII_AN_RESTART BIT(9) -#define SGMII_ISOLATE BIT(10) -#define SGMII_AN_ENABLE BIT(12) -#define SGMII_LINK_STATYS BIT(18) -#define 
SGMII_AN_ABILITY BIT(19) -#define SGMII_AN_COMPLETE BIT(21) -#define SGMII_PCS_FAULT BIT(23) -#define SGMII_AN_EXPANSION_CLR BIT(30) - -#define SGMSYS_PCS_ADVERTISE 0x8 -#define SGMII_ADVERTISE GENMASK(15, 0) -#define SGMII_LPA GENMASK(31, 16) - -/* Register to programmable link timer, the unit in 2 * 8ns */ -#define SGMSYS_PCS_LINK_TIMER 0x18 -#define SGMII_LINK_TIMER_MASK GENMASK(19, 0) -#define SGMII_LINK_TIMER_DEFAULT (0x186a0 & SGMII_LINK_TIMER_MASK) - -/* Register to control remote fault */ -#define SGMSYS_SGMII_MODE 0x20 -#define SGMII_IF_MODE_SGMII BIT(0) -#define SGMII_SPEED_DUPLEX_AN BIT(1) -#define SGMII_SPEED_MASK GENMASK(3, 2) -#define SGMII_SPEED_10 FIELD_PREP(SGMII_SPEED_MASK, 0) -#define SGMII_SPEED_100 FIELD_PREP(SGMII_SPEED_MASK, 1) -#define SGMII_SPEED_1000 FIELD_PREP(SGMII_SPEED_MASK, 2) -#define SGMII_DUPLEX_HALF BIT(4) -#define SGMII_IF_MODE_BIT5 BIT(5) -#define SGMII_REMOTE_FAULT_DIS BIT(8) -#define SGMII_CODE_SYNC_SET_VAL BIT(9) -#define SGMII_CODE_SYNC_SET_EN BIT(10) -#define SGMII_SEND_AN_ERROR_EN BIT(11) -#define SGMII_IF_MODE_MASK GENMASK(5, 1) - -/* Register to reset SGMII design */ -#define SGMII_RESERVED_0 0x34 -#define SGMII_SW_RESET BIT(0) - -/* Register to set SGMII speed, ANA RG_ Control Signals III*/ -#define SGMSYS_ANA_RG_CS3 0x2028 -#define RG_PHY_SPEED_MASK (BIT(2) | BIT(3)) -#define RG_PHY_SPEED_1_25G 0x0 -#define RG_PHY_SPEED_3_125G BIT(2) - -/* Register to power up QPHY */ -#define SGMSYS_QPHY_PWR_STATE_CTRL 0xe8 -#define SGMII_PHYA_PWD BIT(4) - /* Infrasys subsystem config registers */ #define INFRA_MISC2 0x70c #define CO_QPHY_SEL BIT(0) #define GEPHY_MAC_SEL BIT(1) +/* Top misc registers */ +#define USB_PHY_SWITCH_REG 0x218 +#define QPHY_SEL_MASK GENMASK(1, 0) +#define SGMII_QPHY_SEL 0x2 + /* MT7628/88 specific stuff */ #define MT7628_PDMA_OFFSET 0x0800 #define MT7628_SDM_OFFSET 0x0c00 @@ -741,6 +700,17 @@ enum mtk_clks_map { BIT(MTK_CLK_SGMII2_CDR_FB) | \ BIT(MTK_CLK_SGMII_CK) | \ BIT(MTK_CLK_ETH2PLL) | 
BIT(MTK_CLK_SGMIITOP)) +#define MT7981_CLKS_BITMAP (BIT(MTK_CLK_FE) | BIT(MTK_CLK_GP2) | BIT(MTK_CLK_GP1) | \ + BIT(MTK_CLK_WOCPU0) | \ + BIT(MTK_CLK_SGMII_TX_250M) | \ + BIT(MTK_CLK_SGMII_RX_250M) | \ + BIT(MTK_CLK_SGMII_CDR_REF) | \ + BIT(MTK_CLK_SGMII_CDR_FB) | \ + BIT(MTK_CLK_SGMII2_TX_250M) | \ + BIT(MTK_CLK_SGMII2_RX_250M) | \ + BIT(MTK_CLK_SGMII2_CDR_REF) | \ + BIT(MTK_CLK_SGMII2_CDR_FB) | \ + BIT(MTK_CLK_SGMII_CK)) #define MT7986_CLKS_BITMAP (BIT(MTK_CLK_FE) | BIT(MTK_CLK_GP2) | BIT(MTK_CLK_GP1) | \ BIT(MTK_CLK_WOCPU1) | BIT(MTK_CLK_WOCPU0) | \ BIT(MTK_CLK_SGMII_TX_250M) | \ @@ -854,6 +824,7 @@ enum mkt_eth_capabilities { MTK_NETSYS_V2_BIT, MTK_SOC_MT7628_BIT, MTK_RSTCTRL_PPE1_BIT, + MTK_U3_COPHY_V2_BIT, /* MUX BITS*/ MTK_ETH_MUX_GDM1_TO_GMAC1_ESW_BIT, @@ -888,6 +859,7 @@ enum mkt_eth_capabilities { #define MTK_NETSYS_V2 BIT(MTK_NETSYS_V2_BIT) #define MTK_SOC_MT7628 BIT(MTK_SOC_MT7628_BIT) #define MTK_RSTCTRL_PPE1 BIT(MTK_RSTCTRL_PPE1_BIT) +#define MTK_U3_COPHY_V2 BIT(MTK_U3_COPHY_V2_BIT) #define MTK_ETH_MUX_GDM1_TO_GMAC1_ESW \ BIT(MTK_ETH_MUX_GDM1_TO_GMAC1_ESW_BIT) @@ -960,6 +932,11 @@ enum mkt_eth_capabilities { MTK_MUX_U3_GMAC2_TO_QPHY | \ MTK_MUX_GMAC12_TO_GEPHY_SGMII | MTK_QDMA) +#define MT7981_CAPS (MTK_GMAC1_SGMII | MTK_GMAC2_SGMII | MTK_GMAC2_GEPHY | \ + MTK_MUX_GMAC12_TO_GEPHY_SGMII | MTK_QDMA | \ + MTK_MUX_U3_GMAC2_TO_QPHY | MTK_U3_COPHY_V2 | \ + MTK_NETSYS_V2 | MTK_RSTCTRL_PPE1) + #define MT7986_CAPS (MTK_GMAC1_SGMII | MTK_GMAC2_SGMII | \ MTK_MUX_GMAC12_TO_GEPHY_SGMII | MTK_QDMA | \ MTK_NETSYS_V2 | MTK_RSTCTRL_PPE1) @@ -1034,6 +1011,8 @@ struct mtk_reg_map { * the extra setup for those pins used by GMAC. * @hash_offset Flow table hash offset. * @foe_entry_size Foe table entry size. + * @has_accounting Bool indicating support for accounting of + * offloaded flows. * @txd_size Tx DMA descriptor size. * @rxd_size Rx DMA descriptor size. * @rx_irq_done_mask Rx irq done register mask. 
@@ -1051,6 +1030,7 @@ struct mtk_soc_data { u8 hash_offset; u16 foe_entry_size; netdev_features_t hw_features; + bool has_accounting; struct { u32 txd_size; u32 rxd_size; @@ -1066,29 +1046,6 @@ struct mtk_soc_data { /* currently no SoC has more than 2 macs */ #define MTK_MAX_DEVS 2 -/* struct mtk_pcs - This structure holds each sgmii regmap and associated - * data - * @regmap: The register map pointing at the range used to setup - * SGMII modes - * @ana_rgc3: The offset refers to register ANA_RGC3 related to regmap - * @interface: Currently configured interface mode - * @pcs: Phylink PCS structure - */ -struct mtk_pcs { - struct regmap *regmap; - u32 ana_rgc3; - phy_interface_t interface; - struct phylink_pcs pcs; -}; - -/* struct mtk_sgmii - This is the structure holding sgmii regmap and its - * characteristics - * @pcs Array of individual PCS structures - */ -struct mtk_sgmii { - struct mtk_pcs pcs[MTK_MAX_DEVS]; -}; - /* struct mtk_eth - This is the main datasructure for holding the state * of the driver * @dev: The device pointer @@ -1108,6 +1065,7 @@ struct mtk_sgmii { * MII modes * @infra: The register map pointing at the range used to setup * SGMII and GePHY path + * @sgmii_pcs: Pointers to mtk-pcs-lynxi phylink_pcs instances * @pctl: The register map pointing at the range used to setup * GMAC port drive/slew values * @dma_refcnt: track how many netdevs are using the DMA engine @@ -1148,8 +1106,8 @@ struct mtk_eth { u32 msg_enable; unsigned long sysclk; struct regmap *ethsys; - struct regmap *infra; - struct mtk_sgmii *sgmii; + struct regmap *infra; + struct phylink_pcs *sgmii_pcs[MTK_MAX_DEVS]; struct regmap *pctl; bool hwlro; refcount_t dma_refcnt; @@ -1311,10 +1269,6 @@ void mtk_stats_update_mac(struct mtk_mac *mac); void mtk_w32(struct mtk_eth *eth, u32 val, unsigned reg); u32 mtk_r32(struct mtk_eth *eth, unsigned reg); -struct phylink_pcs *mtk_sgmii_select_pcs(struct mtk_sgmii *ss, int id); -int mtk_sgmii_init(struct mtk_sgmii *ss, struct device_node 
*np, - u32 ana_rgc3); - int mtk_gmac_sgmii_path_setup(struct mtk_eth *eth, int mac_id); int mtk_gmac_gephy_path_setup(struct mtk_eth *eth, int mac_id); int mtk_gmac_rgmii_path_setup(struct mtk_eth *eth, int mac_id); @@ -1322,6 +1276,9 @@ int mtk_gmac_rgmii_path_setup(struct mtk_eth *eth, int mac_id); int mtk_eth_offload_init(struct mtk_eth *eth); int mtk_eth_setup_tc(struct net_device *dev, enum tc_setup_type type, void *type_data); +int mtk_flow_offload_cmd(struct mtk_eth *eth, struct flow_cls_offload *cls, + int ppe_index); +void mtk_flow_offload_cleanup(struct mtk_eth *eth, struct list_head *list); void mtk_eth_set_dma_device(struct mtk_eth *eth, struct device *dma_dev); diff --git a/drivers/net/ethernet/mediatek/mtk_ppe.c b/drivers/net/ethernet/mediatek/mtk_ppe.c index fd07d6e14273..9129821f3ab8 100644 --- a/drivers/net/ethernet/mediatek/mtk_ppe.c +++ b/drivers/net/ethernet/mediatek/mtk_ppe.c @@ -75,6 +75,48 @@ static int mtk_ppe_wait_busy(struct mtk_ppe *ppe) return ret; } +static int mtk_ppe_mib_wait_busy(struct mtk_ppe *ppe) +{ + int ret; + u32 val; + + ret = readl_poll_timeout(ppe->base + MTK_PPE_MIB_SER_CR, val, + !(val & MTK_PPE_MIB_SER_CR_ST), + 20, MTK_PPE_WAIT_TIMEOUT_US); + + if (ret) + dev_err(ppe->dev, "MIB table busy"); + + return ret; +} + +static int mtk_mib_entry_read(struct mtk_ppe *ppe, u16 index, u64 *bytes, u64 *packets) +{ + u32 byte_cnt_low, byte_cnt_high, pkt_cnt_low, pkt_cnt_high; + u32 val, cnt_r0, cnt_r1, cnt_r2; + int ret; + + val = FIELD_PREP(MTK_PPE_MIB_SER_CR_ADDR, index) | MTK_PPE_MIB_SER_CR_ST; + ppe_w32(ppe, MTK_PPE_MIB_SER_CR, val); + + ret = mtk_ppe_mib_wait_busy(ppe); + if (ret) + return ret; + + cnt_r0 = readl(ppe->base + MTK_PPE_MIB_SER_R0); + cnt_r1 = readl(ppe->base + MTK_PPE_MIB_SER_R1); + cnt_r2 = readl(ppe->base + MTK_PPE_MIB_SER_R2); + + byte_cnt_low = FIELD_GET(MTK_PPE_MIB_SER_R0_BYTE_CNT_LOW, cnt_r0); + byte_cnt_high = FIELD_GET(MTK_PPE_MIB_SER_R1_BYTE_CNT_HIGH, cnt_r1); + pkt_cnt_low = 
FIELD_GET(MTK_PPE_MIB_SER_R1_PKT_CNT_LOW, cnt_r1); + pkt_cnt_high = FIELD_GET(MTK_PPE_MIB_SER_R2_PKT_CNT_HIGH, cnt_r2); + *bytes = ((u64)byte_cnt_high << 32) | byte_cnt_low; + *packets = (pkt_cnt_high << 16) | pkt_cnt_low; + + return 0; +} + static void mtk_ppe_cache_clear(struct mtk_ppe *ppe) { ppe_set(ppe, MTK_PPE_CACHE_CTL, MTK_PPE_CACHE_CTL_CLEAR); @@ -460,6 +502,14 @@ __mtk_foe_entry_clear(struct mtk_ppe *ppe, struct mtk_flow_entry *entry) hwe->ib1 |= FIELD_PREP(MTK_FOE_IB1_STATE, MTK_FOE_STATE_INVALID); dma_wmb(); mtk_ppe_cache_clear(ppe); + + if (ppe->accounting) { + struct mtk_foe_accounting *acct; + + acct = ppe->acct_table + entry->hash * sizeof(*acct); + acct->packets = 0; + acct->bytes = 0; + } } entry->hash = 0xffff; @@ -551,6 +601,7 @@ __mtk_foe_entry_commit(struct mtk_ppe *ppe, struct mtk_foe_entry *entry, struct mtk_eth *eth = ppe->eth; u16 timestamp = mtk_eth_timestamp(eth); struct mtk_foe_entry *hwe; + u32 val; if (MTK_HAS_CAPS(eth->soc->caps, MTK_NETSYS_V2)) { entry->ib1 &= ~MTK_FOE_IB1_BIND_TIMESTAMP_V2; @@ -567,6 +618,14 @@ __mtk_foe_entry_commit(struct mtk_ppe *ppe, struct mtk_foe_entry *entry, wmb(); hwe->ib1 = entry->ib1; + if (ppe->accounting) { + if (MTK_HAS_CAPS(eth->soc->caps, MTK_NETSYS_V2)) + val = MTK_FOE_IB2_MIB_CNT_V2; + else + val = MTK_FOE_IB2_MIB_CNT; + *mtk_foe_entry_ib2(eth, hwe) |= val; + } + dma_wmb(); mtk_ppe_cache_clear(ppe); @@ -582,10 +641,20 @@ void mtk_foe_entry_clear(struct mtk_ppe *ppe, struct mtk_flow_entry *entry) static int mtk_foe_entry_commit_l2(struct mtk_ppe *ppe, struct mtk_flow_entry *entry) { + struct mtk_flow_entry *prev; + entry->type = MTK_FLOW_TYPE_L2; - return rhashtable_insert_fast(&ppe->l2_flows, &entry->l2_node, - mtk_flow_l2_ht_params); + prev = rhashtable_lookup_get_insert_fast(&ppe->l2_flows, &entry->l2_node, + mtk_flow_l2_ht_params); + if (likely(!prev)) + return 0; + + if (IS_ERR(prev)) + return PTR_ERR(prev); + + return rhashtable_replace_fast(&ppe->l2_flows, &prev->l2_node, + &entry->l2_node, 
mtk_flow_l2_ht_params); } int mtk_foe_entry_commit(struct mtk_ppe *ppe, struct mtk_flow_entry *entry) @@ -760,11 +829,39 @@ int mtk_ppe_prepare_reset(struct mtk_ppe *ppe) return mtk_ppe_wait_busy(ppe); } -struct mtk_ppe *mtk_ppe_init(struct mtk_eth *eth, void __iomem *base, - int version, int index) +struct mtk_foe_accounting *mtk_foe_entry_get_mib(struct mtk_ppe *ppe, u32 index, + struct mtk_foe_accounting *diff) +{ + struct mtk_foe_accounting *acct; + int size = sizeof(struct mtk_foe_accounting); + u64 bytes, packets; + + if (!ppe->accounting) + return NULL; + + if (mtk_mib_entry_read(ppe, index, &bytes, &packets)) + return NULL; + + acct = ppe->acct_table + index * size; + + acct->bytes += bytes; + acct->packets += packets; + + if (diff) { + diff->bytes = bytes; + diff->packets = packets; + } + + return acct; +} + +struct mtk_ppe *mtk_ppe_init(struct mtk_eth *eth, void __iomem *base, int index) { + bool accounting = eth->soc->has_accounting; const struct mtk_soc_data *soc = eth->soc; + struct mtk_foe_accounting *acct; struct device *dev = eth->dev; + struct mtk_mib_entry *mib; struct mtk_ppe *ppe; u32 foe_flow_size; void *foe; @@ -781,7 +878,8 @@ struct mtk_ppe *mtk_ppe_init(struct mtk_eth *eth, void __iomem *base, ppe->base = base; ppe->eth = eth; ppe->dev = dev; - ppe->version = version; + ppe->version = eth->soc->offload_version; + ppe->accounting = accounting; foe = dmam_alloc_coherent(ppe->dev, MTK_PPE_ENTRIES * soc->foe_entry_size, @@ -797,6 +895,23 @@ struct mtk_ppe *mtk_ppe_init(struct mtk_eth *eth, void __iomem *base, if (!ppe->foe_flow) goto err_free_l2_flows; + if (accounting) { + mib = dmam_alloc_coherent(ppe->dev, MTK_PPE_ENTRIES * sizeof(*mib), + &ppe->mib_phys, GFP_KERNEL); + if (!mib) + return NULL; + + ppe->mib_table = mib; + + acct = devm_kzalloc(dev, MTK_PPE_ENTRIES * sizeof(*acct), + GFP_KERNEL); + + if (!acct) + return NULL; + + ppe->acct_table = acct; + } + mtk_ppe_debugfs_init(ppe, index); return ppe; @@ -926,6 +1041,16 @@ void 
mtk_ppe_start(struct mtk_ppe *ppe) ppe_w32(ppe, MTK_PPE_DEFAULT_CPU_PORT1, 0xcb777); ppe_w32(ppe, MTK_PPE_SBW_CTRL, 0x7f); } + + if (ppe->accounting && ppe->mib_phys) { + ppe_w32(ppe, MTK_PPE_MIB_TB_BASE, ppe->mib_phys); + ppe_m32(ppe, MTK_PPE_MIB_CFG, MTK_PPE_MIB_CFG_EN, + MTK_PPE_MIB_CFG_EN); + ppe_m32(ppe, MTK_PPE_MIB_CFG, MTK_PPE_MIB_CFG_RD_CLR, + MTK_PPE_MIB_CFG_RD_CLR); + ppe_m32(ppe, MTK_PPE_MIB_CACHE_CTL, MTK_PPE_MIB_CACHE_CTL_EN, + MTK_PPE_MIB_CFG_RD_CLR); + } } int mtk_ppe_stop(struct mtk_ppe *ppe) diff --git a/drivers/net/ethernet/mediatek/mtk_ppe.h b/drivers/net/ethernet/mediatek/mtk_ppe.h index 5e8bc48252b1..e51de31a52ec 100644 --- a/drivers/net/ethernet/mediatek/mtk_ppe.h +++ b/drivers/net/ethernet/mediatek/mtk_ppe.h @@ -55,8 +55,10 @@ enum { #define MTK_FOE_IB2_PSE_QOS BIT(4) #define MTK_FOE_IB2_DEST_PORT GENMASK(7, 5) #define MTK_FOE_IB2_MULTICAST BIT(8) +#define MTK_FOE_IB2_MIB_CNT BIT(10) #define MTK_FOE_IB2_WDMA_QID2 GENMASK(13, 12) +#define MTK_FOE_IB2_MIB_CNT_V2 BIT(15) #define MTK_FOE_IB2_WDMA_DEVIDX BIT(16) #define MTK_FOE_IB2_WDMA_WINFO BIT(17) @@ -285,16 +287,34 @@ struct mtk_flow_entry { unsigned long cookie; }; +struct mtk_mib_entry { + u32 byt_cnt_l; + u16 byt_cnt_h; + u32 pkt_cnt_l; + u8 pkt_cnt_h; + u8 _rsv0; + u32 _rsv1; +} __packed; + +struct mtk_foe_accounting { + u64 bytes; + u64 packets; +}; + struct mtk_ppe { struct mtk_eth *eth; struct device *dev; void __iomem *base; int version; char dirname[5]; + bool accounting; void *foe_table; dma_addr_t foe_phys; + struct mtk_mib_entry *mib_table; + dma_addr_t mib_phys; + u16 foe_check_time[MTK_PPE_ENTRIES]; struct hlist_head *foe_flow; @@ -303,8 +323,8 @@ struct mtk_ppe { void *acct_table; }; -struct mtk_ppe *mtk_ppe_init(struct mtk_eth *eth, void __iomem *base, - int version, int index); +struct mtk_ppe *mtk_ppe_init(struct mtk_eth *eth, void __iomem *base, int index); + void mtk_ppe_deinit(struct mtk_eth *eth); void mtk_ppe_start(struct mtk_ppe *ppe); int mtk_ppe_stop(struct mtk_ppe 
*ppe); @@ -359,5 +379,7 @@ int mtk_foe_entry_commit(struct mtk_ppe *ppe, struct mtk_flow_entry *entry); void mtk_foe_entry_clear(struct mtk_ppe *ppe, struct mtk_flow_entry *entry); int mtk_foe_entry_idle_time(struct mtk_ppe *ppe, struct mtk_flow_entry *entry); int mtk_ppe_debugfs_init(struct mtk_ppe *ppe, int index); +struct mtk_foe_accounting *mtk_foe_entry_get_mib(struct mtk_ppe *ppe, u32 index, + struct mtk_foe_accounting *diff); #endif diff --git a/drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c b/drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c index 391b071bcff3..316fe2e70fea 100644 --- a/drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c +++ b/drivers/net/ethernet/mediatek/mtk_ppe_debugfs.c @@ -47,7 +47,7 @@ static const char *mtk_foe_pkt_type_str(int type) static void mtk_print_addr(struct seq_file *m, u32 *addr, bool ipv6) { - u32 n_addr[4]; + __be32 n_addr[4]; int i; if (!ipv6) { @@ -82,6 +82,7 @@ mtk_ppe_debugfs_foe_show(struct seq_file *m, void *private, bool bind) struct mtk_foe_entry *entry = mtk_foe_get_entry(ppe, i); struct mtk_foe_mac_info *l2; struct mtk_flow_addr_info ai = {}; + struct mtk_foe_accounting *acct; unsigned char h_source[ETH_ALEN]; unsigned char h_dest[ETH_ALEN]; int type, state; @@ -95,6 +96,8 @@ mtk_ppe_debugfs_foe_show(struct seq_file *m, void *private, bool bind) if (bind && state != MTK_FOE_STATE_BIND) continue; + acct = mtk_foe_entry_get_mib(ppe, i, NULL); + type = FIELD_GET(MTK_FOE_IB1_PACKET_TYPE, entry->ib1); seq_printf(m, "%05x %s %7s", i, mtk_foe_entry_state_str(state), @@ -153,9 +156,11 @@ mtk_ppe_debugfs_foe_show(struct seq_file *m, void *private, bool bind) *((__be16 *)&h_dest[4]) = htons(l2->dest_mac_lo); seq_printf(m, " eth=%pM->%pM etype=%04x" - " vlan=%d,%d ib1=%08x ib2=%08x\n", + " vlan=%d,%d ib1=%08x ib2=%08x" + " packets=%llu bytes=%llu\n", h_source, h_dest, ntohs(l2->etype), - l2->vlan1, l2->vlan2, entry->ib1, ib2); + l2->vlan1, l2->vlan2, entry->ib1, ib2, + acct ? acct->packets : 0, acct ? 
acct->bytes : 0); } return 0; diff --git a/drivers/net/ethernet/mediatek/mtk_ppe_offload.c b/drivers/net/ethernet/mediatek/mtk_ppe_offload.c index 161751bb36c9..02eebff02d45 100644 --- a/drivers/net/ethernet/mediatek/mtk_ppe_offload.c +++ b/drivers/net/ethernet/mediatek/mtk_ppe_offload.c @@ -235,7 +235,8 @@ out: } static int -mtk_flow_offload_replace(struct mtk_eth *eth, struct flow_cls_offload *f) +mtk_flow_offload_replace(struct mtk_eth *eth, struct flow_cls_offload *f, + int ppe_index) { struct flow_rule *rule = flow_cls_offload_flow_rule(f); struct flow_action_entry *act; @@ -452,6 +453,7 @@ mtk_flow_offload_replace(struct mtk_eth *eth, struct flow_cls_offload *f) entry->cookie = f->cookie; memcpy(&entry->data, &foe, sizeof(entry->data)); entry->wed_index = wed_index; + entry->ppe_index = ppe_index; err = mtk_foe_entry_commit(eth->ppe[entry->ppe_index], entry); if (err < 0) @@ -497,6 +499,7 @@ static int mtk_flow_offload_stats(struct mtk_eth *eth, struct flow_cls_offload *f) { struct mtk_flow_entry *entry; + struct mtk_foe_accounting diff; u32 idle; entry = rhashtable_lookup(&eth->flow_table, &f->cookie, @@ -507,30 +510,27 @@ mtk_flow_offload_stats(struct mtk_eth *eth, struct flow_cls_offload *f) idle = mtk_foe_entry_idle_time(eth->ppe[entry->ppe_index], entry); f->stats.lastused = jiffies - idle * HZ; + if (entry->hash != 0xFFFF && + mtk_foe_entry_get_mib(eth->ppe[entry->ppe_index], entry->hash, + &diff)) { + f->stats.pkts += diff.packets; + f->stats.bytes += diff.bytes; + } + return 0; } static DEFINE_MUTEX(mtk_flow_offload_mutex); -static int -mtk_eth_setup_tc_block_cb(enum tc_setup_type type, void *type_data, void *cb_priv) +int mtk_flow_offload_cmd(struct mtk_eth *eth, struct flow_cls_offload *cls, + int ppe_index) { - struct flow_cls_offload *cls = type_data; - struct net_device *dev = cb_priv; - struct mtk_mac *mac = netdev_priv(dev); - struct mtk_eth *eth = mac->hw; int err; - if (!tc_can_offload(dev)) - return -EOPNOTSUPP; - - if (type != 
TC_SETUP_CLSFLOWER) - return -EOPNOTSUPP; - mutex_lock(&mtk_flow_offload_mutex); switch (cls->command) { case FLOW_CLS_REPLACE: - err = mtk_flow_offload_replace(eth, cls); + err = mtk_flow_offload_replace(eth, cls, ppe_index); break; case FLOW_CLS_DESTROY: err = mtk_flow_offload_destroy(eth, cls); @@ -548,6 +548,26 @@ mtk_eth_setup_tc_block_cb(enum tc_setup_type type, void *type_data, void *cb_pri } static int +mtk_eth_setup_tc_block_cb(enum tc_setup_type type, void *type_data, void *cb_priv) +{ + struct flow_cls_offload *cls = type_data; + struct net_device *dev = cb_priv; + struct mtk_mac *mac; + struct mtk_eth *eth; + + mac = netdev_priv(dev); + eth = mac->hw; + + if (!tc_can_offload(dev)) + return -EOPNOTSUPP; + + if (type != TC_SETUP_CLSFLOWER) + return -EOPNOTSUPP; + + return mtk_flow_offload_cmd(eth, cls, 0); +} + +static int mtk_eth_setup_tc_block(struct net_device *dev, struct flow_block_offload *f) { struct mtk_mac *mac = netdev_priv(dev); diff --git a/drivers/net/ethernet/mediatek/mtk_ppe_regs.h b/drivers/net/ethernet/mediatek/mtk_ppe_regs.h index 0fdb983b0a88..a2e61b3eb006 100644 --- a/drivers/net/ethernet/mediatek/mtk_ppe_regs.h +++ b/drivers/net/ethernet/mediatek/mtk_ppe_regs.h @@ -149,6 +149,20 @@ enum { #define MTK_PPE_MIB_TB_BASE 0x338 +#define MTK_PPE_MIB_SER_CR 0x33C +#define MTK_PPE_MIB_SER_CR_ST BIT(16) +#define MTK_PPE_MIB_SER_CR_ADDR GENMASK(13, 0) + +#define MTK_PPE_MIB_SER_R0 0x340 +#define MTK_PPE_MIB_SER_R0_BYTE_CNT_LOW GENMASK(31, 0) + +#define MTK_PPE_MIB_SER_R1 0x344 +#define MTK_PPE_MIB_SER_R1_PKT_CNT_LOW GENMASK(31, 16) +#define MTK_PPE_MIB_SER_R1_BYTE_CNT_HIGH GENMASK(15, 0) + +#define MTK_PPE_MIB_SER_R2 0x348 +#define MTK_PPE_MIB_SER_R2_PKT_CNT_HIGH GENMASK(23, 0) + #define MTK_PPE_MIB_CACHE_CTL 0x350 #define MTK_PPE_MIB_CACHE_CTL_EN BIT(0) #define MTK_PPE_MIB_CACHE_CTL_FLUSH BIT(2) diff --git a/drivers/net/ethernet/mediatek/mtk_sgmii.c b/drivers/net/ethernet/mediatek/mtk_sgmii.c deleted file mode 100644 index 
83976dc86887..000000000000 --- a/drivers/net/ethernet/mediatek/mtk_sgmii.c +++ /dev/null @@ -1,207 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -// Copyright (c) 2018-2019 MediaTek Inc. - -/* A library for MediaTek SGMII circuit - * - * Author: Sean Wang <sean.wang@mediatek.com> - * - */ - -#include <linux/mfd/syscon.h> -#include <linux/of.h> -#include <linux/phylink.h> -#include <linux/regmap.h> - -#include "mtk_eth_soc.h" - -static struct mtk_pcs *pcs_to_mtk_pcs(struct phylink_pcs *pcs) -{ - return container_of(pcs, struct mtk_pcs, pcs); -} - -static void mtk_pcs_get_state(struct phylink_pcs *pcs, - struct phylink_link_state *state) -{ - struct mtk_pcs *mpcs = pcs_to_mtk_pcs(pcs); - unsigned int bm, adv; - - /* Read the BMSR and LPA */ - regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &bm); - regmap_read(mpcs->regmap, SGMSYS_PCS_ADVERTISE, &adv); - - phylink_mii_c22_pcs_decode_state(state, FIELD_GET(SGMII_BMSR, bm), - FIELD_GET(SGMII_LPA, adv)); -} - -static int mtk_pcs_config(struct phylink_pcs *pcs, unsigned int mode, - phy_interface_t interface, - const unsigned long *advertising, - bool permit_pause_to_mac) -{ - bool mode_changed = false, changed, use_an; - struct mtk_pcs *mpcs = pcs_to_mtk_pcs(pcs); - unsigned int rgc3, sgm_mode, bmcr; - int advertise, link_timer; - - advertise = phylink_mii_c22_pcs_encode_advertisement(interface, - advertising); - if (advertise < 0) - return advertise; - - /* Clearing IF_MODE_BIT0 switches the PCS to BASE-X mode, and - * we assume that fixes it's speed at bitrate = line rate (in - * other words, 1000Mbps or 2500Mbps). 
- */ - if (interface == PHY_INTERFACE_MODE_SGMII) { - sgm_mode = SGMII_IF_MODE_SGMII; - if (phylink_autoneg_inband(mode)) { - sgm_mode |= SGMII_REMOTE_FAULT_DIS | - SGMII_SPEED_DUPLEX_AN; - use_an = true; - } else { - use_an = false; - } - } else if (phylink_autoneg_inband(mode)) { - /* 1000base-X or 2500base-X autoneg */ - sgm_mode = SGMII_REMOTE_FAULT_DIS; - use_an = linkmode_test_bit(ETHTOOL_LINK_MODE_Autoneg_BIT, - advertising); - } else { - /* 1000base-X or 2500base-X without autoneg */ - sgm_mode = 0; - use_an = false; - } - - if (use_an) { - bmcr = SGMII_AN_ENABLE; - } else { - bmcr = 0; - } - - if (mpcs->interface != interface) { - link_timer = phylink_get_link_timer_ns(interface); - if (link_timer < 0) - return link_timer; - - /* PHYA power down */ - regmap_update_bits(mpcs->regmap, SGMSYS_QPHY_PWR_STATE_CTRL, - SGMII_PHYA_PWD, SGMII_PHYA_PWD); - - /* Reset SGMII PCS state */ - regmap_update_bits(mpcs->regmap, SGMII_RESERVED_0, - SGMII_SW_RESET, SGMII_SW_RESET); - - if (interface == PHY_INTERFACE_MODE_2500BASEX) - rgc3 = RG_PHY_SPEED_3_125G; - else - rgc3 = 0; - - /* Configure the underlying interface speed */ - regmap_update_bits(mpcs->regmap, mpcs->ana_rgc3, - RG_PHY_SPEED_3_125G, rgc3); - - /* Setup the link timer */ - regmap_write(mpcs->regmap, SGMSYS_PCS_LINK_TIMER, link_timer / 2 / 8); - - mpcs->interface = interface; - mode_changed = true; - } - - /* Update the advertisement, noting whether it has changed */ - regmap_update_bits_check(mpcs->regmap, SGMSYS_PCS_ADVERTISE, - SGMII_ADVERTISE, advertise, &changed); - - /* Update the sgmsys mode register */ - regmap_update_bits(mpcs->regmap, SGMSYS_SGMII_MODE, - SGMII_REMOTE_FAULT_DIS | SGMII_SPEED_DUPLEX_AN | - SGMII_IF_MODE_SGMII, sgm_mode); - - /* Update the BMCR */ - regmap_update_bits(mpcs->regmap, SGMSYS_PCS_CONTROL_1, - SGMII_AN_ENABLE, bmcr); - - /* Release PHYA power down state - * Only removing bit SGMII_PHYA_PWD isn't enough. 
- * There are cases when the SGMII_PHYA_PWD register contains 0x9 which - * prevents SGMII from working. The SGMII still shows link but no traffic - * can flow. Writing 0x0 to the PHYA_PWD register fix the issue. 0x0 was - * taken from a good working state of the SGMII interface. - * Unknown how much the QPHY needs but it is racy without a sleep. - * Tested on mt7622 & mt7986. - */ - usleep_range(50, 100); - regmap_write(mpcs->regmap, SGMSYS_QPHY_PWR_STATE_CTRL, 0); - - return changed || mode_changed; -} - -static void mtk_pcs_restart_an(struct phylink_pcs *pcs) -{ - struct mtk_pcs *mpcs = pcs_to_mtk_pcs(pcs); - - regmap_update_bits(mpcs->regmap, SGMSYS_PCS_CONTROL_1, - SGMII_AN_RESTART, SGMII_AN_RESTART); -} - -static void mtk_pcs_link_up(struct phylink_pcs *pcs, unsigned int mode, - phy_interface_t interface, int speed, int duplex) -{ - struct mtk_pcs *mpcs = pcs_to_mtk_pcs(pcs); - unsigned int sgm_mode; - - if (!phylink_autoneg_inband(mode)) { - /* Force the speed and duplex setting */ - if (speed == SPEED_10) - sgm_mode = SGMII_SPEED_10; - else if (speed == SPEED_100) - sgm_mode = SGMII_SPEED_100; - else - sgm_mode = SGMII_SPEED_1000; - - if (duplex != DUPLEX_FULL) - sgm_mode |= SGMII_DUPLEX_HALF; - - regmap_update_bits(mpcs->regmap, SGMSYS_SGMII_MODE, - SGMII_DUPLEX_HALF | SGMII_SPEED_MASK, - sgm_mode); - } -} - -static const struct phylink_pcs_ops mtk_pcs_ops = { - .pcs_get_state = mtk_pcs_get_state, - .pcs_config = mtk_pcs_config, - .pcs_an_restart = mtk_pcs_restart_an, - .pcs_link_up = mtk_pcs_link_up, -}; - -int mtk_sgmii_init(struct mtk_sgmii *ss, struct device_node *r, u32 ana_rgc3) -{ - struct device_node *np; - int i; - - for (i = 0; i < MTK_MAX_DEVS; i++) { - np = of_parse_phandle(r, "mediatek,sgmiisys", i); - if (!np) - break; - - ss->pcs[i].ana_rgc3 = ana_rgc3; - ss->pcs[i].regmap = syscon_node_to_regmap(np); - of_node_put(np); - if (IS_ERR(ss->pcs[i].regmap)) - return PTR_ERR(ss->pcs[i].regmap); - - ss->pcs[i].pcs.ops = &mtk_pcs_ops; - 
ss->pcs[i].pcs.poll = true; - ss->pcs[i].interface = PHY_INTERFACE_MODE_NA; - } - - return 0; -} - -struct phylink_pcs *mtk_sgmii_select_pcs(struct mtk_sgmii *ss, int id) -{ - if (!ss->pcs[id].regmap) - return NULL; - - return &ss->pcs[id].pcs; -} diff --git a/drivers/net/ethernet/mediatek/mtk_wed.c b/drivers/net/ethernet/mediatek/mtk_wed.c index 95d890870984..4c205afbd230 100644 --- a/drivers/net/ethernet/mediatek/mtk_wed.c +++ b/drivers/net/ethernet/mediatek/mtk_wed.c @@ -13,6 +13,8 @@ #include <linux/mfd/syscon.h> #include <linux/debugfs.h> #include <linux/soc/mediatek/mtk_wed.h> +#include <net/flow_offload.h> +#include <net/pkt_cls.h> #include "mtk_eth_soc.h" #include "mtk_wed_regs.h" #include "mtk_wed.h" @@ -41,6 +43,11 @@ static struct mtk_wed_hw *hw_list[2]; static DEFINE_MUTEX(hw_lock); +struct mtk_wed_flow_block_priv { + struct mtk_wed_hw *hw; + struct net_device *dev; +}; + static void wed_m32(struct mtk_wed_device *dev, u32 reg, u32 mask, u32 val) { @@ -1745,6 +1752,99 @@ out: mutex_unlock(&hw_lock); } +static int +mtk_wed_setup_tc_block_cb(enum tc_setup_type type, void *type_data, void *cb_priv) +{ + struct mtk_wed_flow_block_priv *priv = cb_priv; + struct flow_cls_offload *cls = type_data; + struct mtk_wed_hw *hw = priv->hw; + + if (!tc_can_offload(priv->dev)) + return -EOPNOTSUPP; + + if (type != TC_SETUP_CLSFLOWER) + return -EOPNOTSUPP; + + return mtk_flow_offload_cmd(hw->eth, cls, hw->index); +} + +static int +mtk_wed_setup_tc_block(struct mtk_wed_hw *hw, struct net_device *dev, + struct flow_block_offload *f) +{ + struct mtk_wed_flow_block_priv *priv; + static LIST_HEAD(block_cb_list); + struct flow_block_cb *block_cb; + struct mtk_eth *eth = hw->eth; + flow_setup_cb_t *cb; + + if (!eth->soc->offload_version) + return -EOPNOTSUPP; + + if (f->binder_type != FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS) + return -EOPNOTSUPP; + + cb = mtk_wed_setup_tc_block_cb; + f->driver_block_list = &block_cb_list; + + switch (f->command) { + case FLOW_BLOCK_BIND: + 
block_cb = flow_block_cb_lookup(f->block, cb, dev); + if (block_cb) { + flow_block_cb_incref(block_cb); + return 0; + } + + priv = kzalloc(sizeof(*priv), GFP_KERNEL); + if (!priv) + return -ENOMEM; + + priv->hw = hw; + priv->dev = dev; + block_cb = flow_block_cb_alloc(cb, dev, priv, NULL); + if (IS_ERR(block_cb)) { + kfree(priv); + return PTR_ERR(block_cb); + } + + flow_block_cb_incref(block_cb); + flow_block_cb_add(block_cb, f); + list_add_tail(&block_cb->driver_list, &block_cb_list); + return 0; + case FLOW_BLOCK_UNBIND: + block_cb = flow_block_cb_lookup(f->block, cb, dev); + if (!block_cb) + return -ENOENT; + + if (!flow_block_cb_decref(block_cb)) { + flow_block_cb_remove(block_cb, f); + list_del(&block_cb->driver_list); + kfree(block_cb->cb_priv); + } + return 0; + default: + return -EOPNOTSUPP; + } +} + +static int +mtk_wed_setup_tc(struct mtk_wed_device *wed, struct net_device *dev, + enum tc_setup_type type, void *type_data) +{ + struct mtk_wed_hw *hw = wed->hw; + + if (hw->version < 2) + return -EOPNOTSUPP; + + switch (type) { + case TC_SETUP_BLOCK: + case TC_SETUP_FT: + return mtk_wed_setup_tc_block(hw, dev, type_data); + default: + return -EOPNOTSUPP; + } +} + void mtk_wed_add_hw(struct device_node *np, struct mtk_eth *eth, void __iomem *wdma, phys_addr_t wdma_phy, int index) @@ -1764,6 +1864,7 @@ void mtk_wed_add_hw(struct device_node *np, struct mtk_eth *eth, .irq_set_mask = mtk_wed_irq_set_mask, .detach = mtk_wed_detach, .ppe_check = mtk_wed_ppe_check, + .setup_tc = mtk_wed_setup_tc, }; struct device_node *eth_np = eth->dev->of_node; struct platform_device *pdev; diff --git a/drivers/net/ethernet/mediatek/mtk_wed_debugfs.c b/drivers/net/ethernet/mediatek/mtk_wed_debugfs.c index 56f663439721..b244c02c5b51 100644 --- a/drivers/net/ethernet/mediatek/mtk_wed_debugfs.c +++ b/drivers/net/ethernet/mediatek/mtk_wed_debugfs.c @@ -252,8 +252,6 @@ void mtk_wed_hw_add_debugfs(struct mtk_wed_hw *hw) snprintf(hw->dirname, sizeof(hw->dirname), "wed%d", hw->index); 
dir = debugfs_create_dir(hw->dirname, NULL); - if (!dir) - return; hw->debugfs_dir = dir; debugfs_create_u32("regidx", 0600, dir, &hw->debugfs_reg); diff --git a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c index 6bad0d262f28..071ed3dea860 100644 --- a/drivers/net/ethernet/mediatek/mtk_wed_mcu.c +++ b/drivers/net/ethernet/mediatek/mtk_wed_mcu.c @@ -326,7 +326,11 @@ mtk_wed_mcu_load_firmware(struct mtk_wed_wo *wo) wo->hw->index + 1); /* load firmware */ - fw_name = wo->hw->index ? MT7986_FIRMWARE_WO1 : MT7986_FIRMWARE_WO0; + if (of_device_is_compatible(wo->hw->node, "mediatek,mt7981-wed")) + fw_name = MT7981_FIRMWARE_WO; + else + fw_name = wo->hw->index ? MT7986_FIRMWARE_WO1 : MT7986_FIRMWARE_WO0; + ret = request_firmware(&fw, fw_name, wo->hw->dev); if (ret) return ret; @@ -386,5 +390,6 @@ int mtk_wed_mcu_init(struct mtk_wed_wo *wo) 100, MTK_FW_DL_TIMEOUT); } +MODULE_FIRMWARE(MT7981_FIRMWARE_WO); MODULE_FIRMWARE(MT7986_FIRMWARE_WO0); MODULE_FIRMWARE(MT7986_FIRMWARE_WO1); diff --git a/drivers/net/ethernet/mediatek/mtk_wed_wo.h b/drivers/net/ethernet/mediatek/mtk_wed_wo.h index dbcf42ce9173..7a1a2a28f1ac 100644 --- a/drivers/net/ethernet/mediatek/mtk_wed_wo.h +++ b/drivers/net/ethernet/mediatek/mtk_wed_wo.h @@ -88,6 +88,7 @@ enum mtk_wed_dummy_cr_idx { MTK_WED_DUMMY_CR_WO_STATUS, }; +#define MT7981_FIRMWARE_WO "mediatek/mt7981_wo.bin" #define MT7986_FIRMWARE_WO0 "mediatek/mt7986_wo_0.bin" #define MT7986_FIRMWARE_WO1 "mediatek/mt7986_wo_1.bin" diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c index 2f79378fbf6e..65cb63f6c465 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c @@ -228,7 +228,9 @@ void mlx4_en_deactivate_tx_ring(struct mlx4_en_priv *priv, static inline bool mlx4_en_is_tx_ring_full(struct mlx4_en_tx_ring *ring) { - return ring->prod - ring->cons > ring->full_size; + u32 used = READ_ONCE(ring->prod) - 
READ_ONCE(ring->cons); + + return used > ring->full_size; } static void mlx4_en_stamp_wqe(struct mlx4_en_priv *priv, @@ -1083,7 +1085,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev) op_own |= cpu_to_be32(MLX4_WQE_CTRL_IIP); } - ring->prod += nr_txbb; + WRITE_ONCE(ring->prod, ring->prod + nr_txbb); /* If we used a bounce buffer then copy descriptor back into place */ if (unlikely(bounce)) @@ -1214,7 +1216,7 @@ netdev_tx_t mlx4_en_xmit_frame(struct mlx4_en_rx_ring *rx_ring, rx_ring->xdp_tx++; - ring->prod += MLX4_EN_XDP_TX_NRTXBB; + WRITE_ONCE(ring->prod, ring->prod + MLX4_EN_XDP_TX_NRTXBB); /* Ensure new descriptor hits memory * before setting ownership of this descriptor to HW diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h index 4ac4d883047b..321f801c1d7c 100644 --- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h +++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h @@ -323,7 +323,7 @@ struct mlx4_en_tx_ring { struct mlx4_en_rx_desc { /* actual number of entries depends on rx ring stride */ - struct mlx4_wqe_data_seg data[0]; + DECLARE_FLEX_ARRAY(struct mlx4_wqe_data_seg, data); }; struct mlx4_en_rx_ring { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile index 8d4e25cc54ea..ddf1e352f51d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile @@ -16,7 +16,7 @@ mlx5_core-y := main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \ transobj.o vport.o sriov.o fs_cmd.o fs_core.o pci_irq.o \ fs_counters.o fs_ft_pool.o rl.o lag/debugfs.o lag/lag.o dev.o events.o wq.o lib/gid.o \ lib/devcom.o lib/pci_vsc.o lib/dm.o lib/fs_ttc.o diag/fs_tracepoint.o \ - diag/fw_tracer.o diag/crdump.o devlink.o diag/rsc_dump.o \ + diag/fw_tracer.o diag/crdump.o devlink.o diag/rsc_dump.o diag/reporter_vnic.o \ fw_reset.o qos.o lib/tout.o lib/aso.o # @@ -69,14 +69,15 @@ 
mlx5_core-$(CONFIG_MLX5_TC_SAMPLE) += en/tc/sample.o # mlx5_core-$(CONFIG_MLX5_ESWITCH) += eswitch.o eswitch_offloads.o eswitch_offloads_termtbl.o \ ecpf.o rdma.o esw/legacy.o \ - esw/debugfs.o esw/devlink_port.o esw/vporttbl.o esw/qos.o + esw/devlink_port.o esw/vporttbl.o esw/qos.o mlx5_core-$(CONFIG_MLX5_ESWITCH) += esw/acl/helper.o \ esw/acl/egress_lgcy.o esw/acl/egress_ofld.o \ esw/acl/ingress_lgcy.o esw/acl/ingress_ofld.o -mlx5_core-$(CONFIG_MLX5_BRIDGE) += esw/bridge.o en/rep/bridge.o +mlx5_core-$(CONFIG_MLX5_BRIDGE) += esw/bridge.o esw/bridge_mcast.o en/rep/bridge.o +mlx5_core-$(CONFIG_THERMAL) += thermal.o mlx5_core-$(CONFIG_MLX5_MPFS) += lib/mpfs.o mlx5_core-$(CONFIG_VXLAN) += lib/vxlan.o mlx5_core-$(CONFIG_PTP_1588_CLOCK) += lib/clock.o @@ -111,8 +112,8 @@ mlx5_core-$(CONFIG_MLX5_SW_STEERING) += steering/dr_domain.o steering/dr_table.o steering/dr_ste_v2.o \ steering/dr_cmd.o steering/dr_fw.o \ steering/dr_action.o steering/fs_dr.o \ - steering/dr_definer.o \ - steering/dr_dbg.o lib/smfs.o + steering/dr_definer.o steering/dr_ptrn.o \ + steering/dr_arg.o steering/dr_dbg.o lib/smfs.o # # SF device # diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c index b00e33ed05e9..d53de39539a8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c @@ -1802,7 +1802,7 @@ static struct mlx5_cmd_msg *alloc_msg(struct mlx5_core_dev *dev, int in_size, if (in_size <= 16) goto cache_miss; - for (i = 0; i < MLX5_NUM_COMMAND_CACHES; i++) { + for (i = 0; i < dev->profile.num_cmd_caches; i++) { ch = &cmd->cache[i]; if (in_size > ch->max_inbox_size) continue; @@ -2097,7 +2097,7 @@ static void destroy_msg_cache(struct mlx5_core_dev *dev) struct mlx5_cmd_msg *n; int i; - for (i = 0; i < MLX5_NUM_COMMAND_CACHES; i++) { + for (i = 0; i < dev->profile.num_cmd_caches; i++) { ch = &dev->cmd.cache[i]; list_for_each_entry_safe(msg, n, &ch->head, list) { list_del(&msg->list); @@ 
-2127,7 +2127,7 @@ static void create_msg_cache(struct mlx5_core_dev *dev) int k; /* Initialize and fill the caches with initial entries */ - for (k = 0; k < MLX5_NUM_COMMAND_CACHES; k++) { + for (k = 0; k < dev->profile.num_cmd_caches; k++) { ch = &cmd->cache[k]; spin_lock_init(&ch->lock); INIT_LIST_HEAD(&ch->head); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/dev.c index 2e7806001fdc..1b33533b15de 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/dev.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/dev.c @@ -35,6 +35,7 @@ #include <linux/mlx5/mlx5_ifc_vdpa.h> #include <linux/mlx5/vport.h> #include "mlx5_core.h" +#include "devlink.h" /* intf dev list mutex */ static DEFINE_MUTEX(mlx5_intf_mutex); @@ -106,17 +107,6 @@ bool mlx5_eth_supported(struct mlx5_core_dev *dev) return true; } -static bool is_eth_enabled(struct mlx5_core_dev *dev) -{ - union devlink_param_value val; - int err; - - err = devl_param_driverinit_value_get(priv_to_devlink(dev), - DEVLINK_PARAM_GENERIC_ID_ENABLE_ETH, - &val); - return err ? 
false : val.vbool; -} - bool mlx5_vnet_supported(struct mlx5_core_dev *dev) { if (!IS_ENABLED(CONFIG_MLX5_VDPA_NET)) @@ -245,7 +235,7 @@ static const struct mlx5_adev_device { .is_enabled = &is_ib_enabled }, [MLX5_INTERFACE_PROTOCOL_ETH] = { .suffix = "eth", .is_supported = &mlx5_eth_supported, - .is_enabled = &is_eth_enabled }, + .is_enabled = &mlx5_core_is_eth_enabled }, [MLX5_INTERFACE_PROTOCOL_ETH_REP] = { .suffix = "eth-rep", .is_supported = &is_eth_rep_supported }, [MLX5_INTERFACE_PROTOCOL_IB_REP] = { .suffix = "rdma-rep", diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c index c5d2fdcabd56..4b607785d694 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.c @@ -202,7 +202,7 @@ static int mlx5_devlink_reload_up(struct devlink *devlink, enum devlink_reload_a break; /* On fw_activate action, also driver is reloaded and reinit performed */ *actions_performed |= BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT); - ret = mlx5_load_one_devl_locked(dev, false); + ret = mlx5_load_one_devl_locked(dev, true); break; default: /* Unsupported action should not get to this function */ @@ -494,6 +494,61 @@ static int mlx5_devlink_eq_depth_validate(struct devlink *devlink, u32 id, return (val.vu32 >= 64 && val.vu32 <= 4096) ? 0 : -EINVAL; } +static int +mlx5_devlink_hairpin_num_queues_validate(struct devlink *devlink, u32 id, + union devlink_param_value val, + struct netlink_ext_ack *extack) +{ + return val.vu32 ? 
0 : -EINVAL; +} + +static int +mlx5_devlink_hairpin_queue_size_validate(struct devlink *devlink, u32 id, + union devlink_param_value val, + struct netlink_ext_ack *extack) +{ + struct mlx5_core_dev *dev = devlink_priv(devlink); + u32 val32 = val.vu32; + + if (!is_power_of_2(val32)) { + NL_SET_ERR_MSG_MOD(extack, "Value is not power of two"); + return -EINVAL; + } + + if (val32 > BIT(MLX5_CAP_GEN(dev, log_max_hairpin_num_packets))) { + NL_SET_ERR_MSG_FMT_MOD( + extack, "Maximum hairpin queue size is %lu", + BIT(MLX5_CAP_GEN(dev, log_max_hairpin_num_packets))); + return -EINVAL; + } + + return 0; +} + +static void mlx5_devlink_hairpin_params_init_values(struct devlink *devlink) +{ + struct mlx5_core_dev *dev = devlink_priv(devlink); + union devlink_param_value value; + u32 link_speed = 0; + u64 link_speed64; + + /* set hairpin pair per each 50Gbs share of the link */ + mlx5_port_max_linkspeed(dev, &link_speed); + link_speed = max_t(u32, link_speed, 50000); + link_speed64 = link_speed; + do_div(link_speed64, 50000); + + value.vu32 = link_speed64; + devl_param_driverinit_value_set( + devlink, MLX5_DEVLINK_PARAM_ID_HAIRPIN_NUM_QUEUES, value); + + value.vu32 = + BIT(min_t(u32, 16 - MLX5_MPWRQ_MIN_LOG_STRIDE_SZ(dev), + MLX5_CAP_GEN(dev, log_max_hairpin_num_packets))); + devl_param_driverinit_value_set( + devlink, MLX5_DEVLINK_PARAM_ID_HAIRPIN_QUEUE_SIZE, value); +} + static const struct devlink_param mlx5_devlink_params[] = { DEVLINK_PARAM_GENERIC(ENABLE_ROCE, BIT(DEVLINK_PARAM_CMODE_DRIVERINIT), NULL, NULL, mlx5_devlink_enable_roce_validate), @@ -547,6 +602,14 @@ static void mlx5_devlink_set_params_init_values(struct devlink *devlink) static const struct devlink_param mlx5_devlink_eth_params[] = { DEVLINK_PARAM_GENERIC(ENABLE_ETH, BIT(DEVLINK_PARAM_CMODE_DRIVERINIT), NULL, NULL, NULL), + DEVLINK_PARAM_DRIVER(MLX5_DEVLINK_PARAM_ID_HAIRPIN_NUM_QUEUES, + "hairpin_num_queues", DEVLINK_PARAM_TYPE_U32, + BIT(DEVLINK_PARAM_CMODE_DRIVERINIT), NULL, NULL, + 
mlx5_devlink_hairpin_num_queues_validate), + DEVLINK_PARAM_DRIVER(MLX5_DEVLINK_PARAM_ID_HAIRPIN_QUEUE_SIZE, + "hairpin_queue_size", DEVLINK_PARAM_TYPE_U32, + BIT(DEVLINK_PARAM_CMODE_DRIVERINIT), NULL, NULL, + mlx5_devlink_hairpin_queue_size_validate), }; static int mlx5_devlink_eth_params_register(struct devlink *devlink) @@ -567,6 +630,9 @@ static int mlx5_devlink_eth_params_register(struct devlink *devlink) devl_param_driverinit_value_set(devlink, DEVLINK_PARAM_GENERIC_ID_ENABLE_ETH, value); + + mlx5_devlink_hairpin_params_init_values(devlink); + return 0; } @@ -805,6 +871,11 @@ int mlx5_devlink_params_register(struct devlink *devlink) { int err; + /* Here only the driver init params should be registered. + * Runtime params should be registered by the code which + * behaviour they configure. + */ + err = devl_params_register(devlink, mlx5_devlink_params, ARRAY_SIZE(mlx5_devlink_params)); if (err) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/devlink.h b/drivers/net/ethernet/mellanox/mlx5/core/devlink.h index 212b12424146..defba5bd91d9 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/devlink.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/devlink.h @@ -12,6 +12,8 @@ enum mlx5_devlink_param_id { MLX5_DEVLINK_PARAM_ID_ESW_LARGE_GROUP_NUM, MLX5_DEVLINK_PARAM_ID_ESW_PORT_METADATA, MLX5_DEVLINK_PARAM_ID_ESW_MULTIPORT, + MLX5_DEVLINK_PARAM_ID_HAIRPIN_NUM_QUEUES, + MLX5_DEVLINK_PARAM_ID_HAIRPIN_QUEUE_SIZE, }; struct mlx5_trap_ctx { @@ -44,4 +46,15 @@ void mlx5_devlink_free(struct devlink *devlink); int mlx5_devlink_params_register(struct devlink *devlink); void mlx5_devlink_params_unregister(struct devlink *devlink); +static inline bool mlx5_core_is_eth_enabled(struct mlx5_core_dev *dev) +{ + union devlink_param_value val; + int err; + + err = devl_param_driverinit_value_get(priv_to_devlink(dev), + DEVLINK_PARAM_GENERIC_ID_ENABLE_ETH, + &val); + return err ? 
false : val.vbool; +} + #endif /* __MLX5_DEVLINK_H__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/reporter_vnic.c b/drivers/net/ethernet/mellanox/mlx5/core/diag/reporter_vnic.c new file mode 100644 index 000000000000..9114661cd967 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/reporter_vnic.c @@ -0,0 +1,125 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. */ + +#include "reporter_vnic.h" +#include "devlink.h" + +#define VNIC_ENV_GET64(vnic_env_stats, c) \ + MLX5_GET64(query_vnic_env_out, (vnic_env_stats)->query_vnic_env_out, \ + vport_env.c) + +struct mlx5_vnic_diag_stats { + __be64 query_vnic_env_out[MLX5_ST_SZ_QW(query_vnic_env_out)]; +}; + +int mlx5_reporter_vnic_diagnose_counters(struct mlx5_core_dev *dev, + struct devlink_fmsg *fmsg, + u16 vport_num, bool other_vport) +{ + u32 in[MLX5_ST_SZ_DW(query_vnic_env_in)] = {}; + struct mlx5_vnic_diag_stats vnic; + int err; + + MLX5_SET(query_vnic_env_in, in, opcode, MLX5_CMD_OP_QUERY_VNIC_ENV); + MLX5_SET(query_vnic_env_in, in, vport_number, vport_num); + MLX5_SET(query_vnic_env_in, in, other_vport, !!other_vport); + + err = mlx5_cmd_exec_inout(dev, query_vnic_env, in, &vnic.query_vnic_env_out); + if (err) + return err; + + err = devlink_fmsg_pair_nest_start(fmsg, "vNIC env counters"); + if (err) + return err; + + err = devlink_fmsg_obj_nest_start(fmsg); + if (err) + return err; + + err = devlink_fmsg_u64_pair_put(fmsg, "total_error_queues", + VNIC_ENV_GET64(&vnic, total_error_queues)); + if (err) + return err; + + err = devlink_fmsg_u64_pair_put(fmsg, "send_queue_priority_update_flow", + VNIC_ENV_GET64(&vnic, send_queue_priority_update_flow)); + if (err) + return err; + + err = devlink_fmsg_u64_pair_put(fmsg, "comp_eq_overrun", + VNIC_ENV_GET64(&vnic, comp_eq_overrun)); + if (err) + return err; + + err = devlink_fmsg_u64_pair_put(fmsg, "async_eq_overrun", + VNIC_ENV_GET64(&vnic, async_eq_overrun)); + if (err) + return 
err; + + err = devlink_fmsg_u64_pair_put(fmsg, "cq_overrun", + VNIC_ENV_GET64(&vnic, cq_overrun)); + if (err) + return err; + + err = devlink_fmsg_u64_pair_put(fmsg, "invalid_command", + VNIC_ENV_GET64(&vnic, invalid_command)); + if (err) + return err; + + err = devlink_fmsg_u64_pair_put(fmsg, "quota_exceeded_command", + VNIC_ENV_GET64(&vnic, quota_exceeded_command)); + if (err) + return err; + + err = devlink_fmsg_u64_pair_put(fmsg, "nic_receive_steering_discard", + VNIC_ENV_GET64(&vnic, nic_receive_steering_discard)); + if (err) + return err; + + err = devlink_fmsg_obj_nest_end(fmsg); + if (err) + return err; + + err = devlink_fmsg_pair_nest_end(fmsg); + if (err) + return err; + + return 0; +} + +static int mlx5_reporter_vnic_diagnose(struct devlink_health_reporter *reporter, + struct devlink_fmsg *fmsg, + struct netlink_ext_ack *extack) +{ + struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter); + + return mlx5_reporter_vnic_diagnose_counters(dev, fmsg, 0, false); +} + +static const struct devlink_health_reporter_ops mlx5_reporter_vnic_ops = { + .name = "vnic", + .diagnose = mlx5_reporter_vnic_diagnose, +}; + +void mlx5_reporter_vnic_create(struct mlx5_core_dev *dev) +{ + struct mlx5_core_health *health = &dev->priv.health; + struct devlink *devlink = priv_to_devlink(dev); + + health->vnic_reporter = + devlink_health_reporter_create(devlink, + &mlx5_reporter_vnic_ops, + 0, dev); + if (IS_ERR(health->vnic_reporter)) + mlx5_core_warn(dev, + "Failed to create vnic reporter, err = %ld\n", + PTR_ERR(health->vnic_reporter)); +} + +void mlx5_reporter_vnic_destroy(struct mlx5_core_dev *dev) +{ + struct mlx5_core_health *health = &dev->priv.health; + + if (!IS_ERR_OR_NULL(health->vnic_reporter)) + devlink_health_reporter_destroy(health->vnic_reporter); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/diag/reporter_vnic.h b/drivers/net/ethernet/mellanox/mlx5/core/diag/reporter_vnic.h new file mode 100644 index 000000000000..eba87a39e9b1 --- /dev/null 
+++ b/drivers/net/ethernet/mellanox/mlx5/core/diag/reporter_vnic.h @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB + * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. + */ +#ifndef __MLX5_REPORTER_VNIC_H +#define __MLX5_REPORTER_VNIC_H + +#include "mlx5_core.h" + +void mlx5_reporter_vnic_create(struct mlx5_core_dev *dev); +void mlx5_reporter_vnic_destroy(struct mlx5_core_dev *dev); + +int mlx5_reporter_vnic_diagnose_counters(struct mlx5_core_dev *dev, + struct devlink_fmsg *fmsg, + u16 vport_num, bool other_vport); + +#endif /* __MLX5_REPORTER_VNIC_H */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h index 4a19ef4a9811..b8987a404d75 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h @@ -335,15 +335,20 @@ static inline u8 mlx5e_get_dcb_num_tc(struct mlx5e_params *params) params->mqprio.num_tc : 1; } +/* Keep this enum consistent with the corresponding strings array + * declared in en/reporter_rx.c + */ enum { - MLX5E_RQ_STATE_ENABLED, + MLX5E_RQ_STATE_ENABLED = 0, MLX5E_RQ_STATE_RECOVERING, - MLX5E_RQ_STATE_AM, + MLX5E_RQ_STATE_DIM, MLX5E_RQ_STATE_NO_CSUM_COMPLETE, MLX5E_RQ_STATE_CSUM_FULL, /* cqe_csum_full hw bit is set */ MLX5E_RQ_STATE_MINI_CQE_HW_STRIDX, /* set when mini_cqe_resp_stride_index cap is used */ MLX5E_RQ_STATE_SHAMPO, /* set when SHAMPO cap is used */ MLX5E_RQ_STATE_MINI_CQE_ENHANCED, /* set when enhanced mini_cqe_cap is used */ + MLX5E_RQ_STATE_XSK, /* set to indicate an xsk rq */ + MLX5E_NUM_RQ_STATES, /* Must be kept last */ }; struct mlx5e_cq { @@ -384,16 +389,20 @@ struct mlx5e_sq_dma { enum mlx5e_dma_map_type type; }; +/* Keep this enum consistent with with the corresponding strings array + * declared in en/reporter_tx.c + */ enum { - MLX5E_SQ_STATE_ENABLED, + MLX5E_SQ_STATE_ENABLED = 0, MLX5E_SQ_STATE_MPWQE, MLX5E_SQ_STATE_RECOVERING, MLX5E_SQ_STATE_IPSEC, - MLX5E_SQ_STATE_AM, + MLX5E_SQ_STATE_DIM, 
MLX5E_SQ_STATE_VLAN_NEED_L2_INLINE, MLX5E_SQ_STATE_PENDING_XSK_TX, MLX5E_SQ_STATE_PENDING_TLS_RX_RESYNC, MLX5E_SQ_STATE_XDP_MULTIBUF, + MLX5E_NUM_SQ_STATES, /* Must be kept last */ }; struct mlx5e_tx_mpwqe { @@ -466,64 +475,18 @@ struct mlx5e_txqsq { cqe_ts_to_ns ptp_cyc2time; } ____cacheline_aligned_in_smp; -union mlx5e_alloc_unit { - struct page *page; - struct xdp_buff *xsk; -}; - -/* XDP packets can be transmitted in different ways. On completion, we need to - * distinguish between them to clean up things in a proper way. - */ -enum mlx5e_xdp_xmit_mode { - /* An xdp_frame was transmitted due to either XDP_REDIRECT from another - * device or XDP_TX from an XSK RQ. The frame has to be unmapped and - * returned. - */ - MLX5E_XDP_XMIT_MODE_FRAME, - - /* The xdp_frame was created in place as a result of XDP_TX from a - * regular RQ. No DMA remapping happened, and the page belongs to us. - */ - MLX5E_XDP_XMIT_MODE_PAGE, - - /* No xdp_frame was created at all, the transmit happened from a UMEM - * page. The UMEM Completion Ring producer pointer has to be increased. 
-	 */
-	MLX5E_XDP_XMIT_MODE_XSK,
-};
-
-struct mlx5e_xdp_info {
-	enum mlx5e_xdp_xmit_mode mode;
-	union {
-		struct {
-			struct xdp_frame *xdpf;
-			dma_addr_t dma_addr;
-		} frame;
-		struct {
-			struct mlx5e_rq *rq;
-			struct page *page;
-		} page;
-	};
-};
-
-struct mlx5e_xmit_data {
-	dma_addr_t dma_addr;
-	void *data;
-	u32 len;
-};
-
 struct mlx5e_xdp_info_fifo {
-	struct mlx5e_xdp_info *xi;
+	union mlx5e_xdp_info *xi;
 	u32 *cc;
 	u32 *pc;
 	u32 mask;
 };
 
 struct mlx5e_xdpsq;
+struct mlx5e_xmit_data;
 typedef int (*mlx5e_fp_xmit_xdp_frame_check)(struct mlx5e_xdpsq *);
 typedef bool (*mlx5e_fp_xmit_xdp_frame)(struct mlx5e_xdpsq *,
 					struct mlx5e_xmit_data *,
-					struct skb_shared_info *,
 					int);
 
 struct mlx5e_xdpsq {
@@ -596,16 +559,36 @@ struct mlx5e_icosq {
 	struct work_struct recover_work;
 } ____cacheline_aligned_in_smp;
 
+struct mlx5e_frag_page {
+	struct page *page;
+	u16 frags;
+};
+
+enum mlx5e_wqe_frag_flag {
+	MLX5E_WQE_FRAG_LAST_IN_PAGE,
+	MLX5E_WQE_FRAG_SKIP_RELEASE,
+};
+
 struct mlx5e_wqe_frag_info {
-	union mlx5e_alloc_unit *au;
+	union {
+		struct mlx5e_frag_page *frag_page;
+		struct xdp_buff **xskp;
+	};
 	u32 offset;
-	bool last_in_page;
+	u8 flags;
+};
+
+union mlx5e_alloc_units {
+	DECLARE_FLEX_ARRAY(struct mlx5e_frag_page, frag_pages);
+	DECLARE_FLEX_ARRAY(struct page *, pages);
+	DECLARE_FLEX_ARRAY(struct xdp_buff *, xsk_buffs);
 };
 
 struct mlx5e_mpw_info {
 	u16 consumed_strides;
-	DECLARE_BITMAP(xdp_xmit_bitmap, MLX5_MPWRQ_MAX_PAGES_PER_WQE);
-	union mlx5e_alloc_unit alloc_units[];
+	DECLARE_BITMAP(skip_release_bitmap, MLX5_MPWRQ_MAX_PAGES_PER_WQE);
+	struct mlx5e_frag_page linear_page;
+	union mlx5e_alloc_units alloc_units;
 };
 
 #define MLX5E_MAX_RX_FRAGS 4
@@ -616,11 +599,6 @@ struct mlx5e_mpw_info {
 #define MLX5E_CACHE_UNIT (MLX5_MPWRQ_MAX_PAGES_PER_WQE > NAPI_POLL_WEIGHT ? \
 			  MLX5_MPWRQ_MAX_PAGES_PER_WQE : NAPI_POLL_WEIGHT)
 #define MLX5E_CACHE_SIZE (4 * roundup_pow_of_two(MLX5E_CACHE_UNIT))
-struct mlx5e_page_cache {
-	u32 head;
-	u32 tail;
-	struct page *page_cache[MLX5E_CACHE_SIZE];
-};
 
 struct mlx5e_rq;
 typedef void (*mlx5e_fp_handle_rx_cqe)(struct mlx5e_rq*, struct mlx5_cqe64*);
@@ -652,19 +630,24 @@ struct mlx5e_rq_frags_info {
 	struct mlx5e_rq_frag_info arr[MLX5E_MAX_RX_FRAGS];
 	u8 num_frags;
 	u8 log_num_frags;
-	u8 wqe_bulk;
+	u16 wqe_bulk;
+	u16 refill_unit;
 	u8 wqe_index_mask;
 };
 
 struct mlx5e_dma_info {
 	dma_addr_t addr;
-	struct page *page;
+	union {
+		struct mlx5e_frag_page *frag_page;
+		struct page *page;
+	};
 };
 
 struct mlx5e_shampo_hd {
 	u32 mkey;
 	struct mlx5e_dma_info *info;
-	struct page *last_page;
+	struct mlx5e_frag_page *pages;
+	u16 curr_page_index;
 	u16 hd_per_wq;
 	u16 hd_per_wqe;
 	unsigned long *bitmap;
@@ -693,7 +676,7 @@ struct mlx5e_rq {
 		struct {
 			struct mlx5_wq_cyc wq;
 			struct mlx5e_wqe_frag_info *frags;
-			union mlx5e_alloc_unit *alloc_units;
+			union mlx5e_alloc_units *alloc_units;
 			struct mlx5e_rq_frags_info info;
 			mlx5e_fp_skb_from_cqe skb_from_cqe;
 		} wqe;
@@ -729,7 +712,6 @@ struct mlx5e_rq {
 	struct mlx5e_rq_stats *stats;
 	struct mlx5e_cq cq;
 	struct mlx5e_cq_decomp cqd;
-	struct mlx5e_page_cache page_cache;
 	struct hwtstamp_config *tstamp;
 	struct mlx5_clock *clock;
 	struct mlx5e_icosq *icosq;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
index a21bd1179477..ef546ed8b4d9 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
@@ -253,17 +253,20 @@ static u32 mlx5e_rx_get_linear_stride_sz(struct mlx5_core_dev *mdev,
 					 struct mlx5e_xsk_param *xsk,
 					 bool mpwqe)
 {
+	u32 sz;
+
 	/* XSK frames are mapped as individual pages, because frames may come in
 	 * an arbitrary order from random locations in the UMEM.
 	 */
 	if (xsk)
 		return mpwqe ? 1 << mlx5e_mpwrq_page_shift(mdev, xsk) : PAGE_SIZE;
 
-	/* XDP in mlx5e doesn't support multiple packets per page. */
-	if (params->xdp_prog)
-		return PAGE_SIZE;
+	sz = roundup_pow_of_two(mlx5e_rx_get_linear_sz_skb(params, false));
 
-	return roundup_pow_of_two(mlx5e_rx_get_linear_sz_skb(params, false));
+	/* XDP in mlx5e doesn't support multiple packets per page.
+	 * Do not assume sz <= PAGE_SIZE if params->xdp_prog is set.
+	 */
+	return params->xdp_prog && sz < PAGE_SIZE ? PAGE_SIZE : sz;
 }
 
 static u8 mlx5e_mpwqe_log_pkts_per_wqe(struct mlx5_core_dev *mdev,
@@ -320,6 +323,20 @@ static bool mlx5e_verify_rx_mpwqe_strides(struct mlx5_core_dev *mdev,
 	return log_num_strides >= MLX5_MPWQE_LOG_NUM_STRIDES_BASE;
 }
 
+bool mlx5e_verify_params_rx_mpwqe_strides(struct mlx5_core_dev *mdev,
+					  struct mlx5e_params *params,
+					  struct mlx5e_xsk_param *xsk)
+{
+	u8 log_wqe_num_of_strides = mlx5e_mpwqe_get_log_num_strides(mdev, params, xsk);
+	u8 log_wqe_stride_size = mlx5e_mpwqe_get_log_stride_size(mdev, params, xsk);
+	enum mlx5e_mpwrq_umr_mode umr_mode = mlx5e_mpwrq_umr_mode(mdev, xsk);
+	u8 page_shift = mlx5e_mpwrq_page_shift(mdev, xsk);
+
+	return mlx5e_verify_rx_mpwqe_strides(mdev, log_wqe_stride_size,
+					     log_wqe_num_of_strides,
+					     page_shift, umr_mode);
+}
+
 bool mlx5e_rx_mpwqe_is_linear_skb(struct mlx5_core_dev *mdev,
 				  struct mlx5e_params *params,
 				  struct mlx5e_xsk_param *xsk)
@@ -402,6 +419,10 @@ u8 mlx5e_mpwqe_get_log_stride_size(struct mlx5_core_dev *mdev,
 	if (mlx5e_rx_mpwqe_is_linear_skb(mdev, params, xsk))
 		return order_base_2(mlx5e_rx_get_linear_stride_sz(mdev, params, xsk, true));
 
+	/* XDP in mlx5e doesn't support multiple packets per page. */
+	if (params->xdp_prog)
+		return PAGE_SHIFT;
+
 	return MLX5_MPWRQ_DEF_LOG_STRIDE_SZ(mdev);
 }
 
@@ -553,7 +574,7 @@ bool slow_pci_heuristic(struct mlx5_core_dev *mdev)
 	u32 link_speed = 0;
 	u32 pci_bw = 0;
 
-	mlx5e_port_max_linkspeed(mdev, &link_speed);
+	mlx5_port_max_linkspeed(mdev, &link_speed);
 	pci_bw = pcie_bandwidth_available(mdev->pdev, NULL, NULL, NULL);
 	mlx5_core_dbg_once(mdev, "Max link speed = %d, PCI BW = %d\n",
 			   link_speed, pci_bw);
@@ -572,9 +593,6 @@ int mlx5e_mpwrq_validate_regular(struct mlx5_core_dev *mdev, struct mlx5e_params
 	if (!mlx5e_check_fragmented_striding_rq_cap(mdev, page_shift, umr_mode))
 		return -EOPNOTSUPP;
 
-	if (params->xdp_prog && !mlx5e_rx_mpwqe_is_linear_skb(mdev, params, NULL))
-		return -EINVAL;
-
 	return 0;
 }
 
@@ -667,6 +685,48 @@ static int mlx5e_max_nonlinear_mtu(int first_frag_size, int frag_size, bool xdp)
 	return first_frag_size + (MLX5E_MAX_RX_FRAGS - 2) * frag_size + PAGE_SIZE;
 }
 
+static void mlx5e_rx_compute_wqe_bulk_params(struct mlx5e_params *params,
+					     struct mlx5e_rq_frags_info *info)
+{
+	u16 bulk_bound_rq_size = (1 << params->log_rq_mtu_frames) / 4;
+	u32 bulk_bound_rq_size_in_bytes;
+	u32 sum_frag_strides = 0;
+	u32 wqe_bulk_in_bytes;
+	u16 split_factor;
+	u32 wqe_bulk;
+	int i;
+
+	for (i = 0; i < info->num_frags; i++)
+		sum_frag_strides += info->arr[i].frag_stride;
+
+	/* For MTUs larger than PAGE_SIZE, align to PAGE_SIZE to reflect
+	 * amount of consumed pages per wqe in bytes.
+	 */
+	if (sum_frag_strides > PAGE_SIZE)
+		sum_frag_strides = ALIGN(sum_frag_strides, PAGE_SIZE);
+
+	bulk_bound_rq_size_in_bytes = bulk_bound_rq_size * sum_frag_strides;
+
+#define MAX_WQE_BULK_BYTES(xdp) ((xdp ? 256 : 512) * 1024)
+
+	/* A WQE bulk should not exceed min(512KB, 1/4 of rq size). For XDP
+	 * keep bulk size smaller to avoid filling the page_pool cache on
+	 * every bulk refill.
+	 */
+	wqe_bulk_in_bytes = min_t(u32, MAX_WQE_BULK_BYTES(params->xdp_prog),
+				  bulk_bound_rq_size_in_bytes);
+	wqe_bulk = DIV_ROUND_UP(wqe_bulk_in_bytes, sum_frag_strides);
+
+	/* Make sure that allocations don't start when the page is still used
+	 * by older WQEs.
+	 */
+	info->wqe_bulk = max_t(u16, info->wqe_index_mask + 1, wqe_bulk);
+
+	split_factor = DIV_ROUND_UP(MAX_WQE_BULK_BYTES(params->xdp_prog),
+				    PP_ALLOC_CACHE_REFILL * PAGE_SIZE);
+	info->refill_unit = DIV_ROUND_UP(info->wqe_bulk, split_factor);
+}
+
 #define DEFAULT_FRAG_SIZE (2048)
 
 static int mlx5e_build_rq_frags_info(struct mlx5_core_dev *mdev,
@@ -774,11 +834,14 @@ static int mlx5e_build_rq_frags_info(struct mlx5_core_dev *mdev,
 	}
 
 out:
-	/* Bulking optimization to skip allocation until at least 8 WQEs can be
-	 * allocated in a row. At the same time, never start allocation when
-	 * the page is still used by older WQEs.
+	/* Bulking optimization to skip allocation until a large enough number
+	 * of WQEs can be allocated in a row. Bulking also influences how well
+	 * deferred page release works.
*/ - info->wqe_bulk = max_t(u8, info->wqe_index_mask + 1, 8); + mlx5e_rx_compute_wqe_bulk_params(params, info); + + mlx5_core_dbg(mdev, "%s: wqe_bulk = %u, wqe_bulk_refill_unit = %u\n", + __func__, info->wqe_bulk, info->refill_unit); info->log_num_frags = order_base_2(info->num_frags); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.h b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h index c9be6eb88012..a5d20f6d6d9c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.h @@ -153,6 +153,9 @@ int mlx5e_build_channel_param(struct mlx5_core_dev *mdev, u16 mlx5e_calc_sq_stop_room(struct mlx5_core_dev *mdev, struct mlx5e_params *params); int mlx5e_validate_params(struct mlx5_core_dev *mdev, struct mlx5e_params *params); +bool mlx5e_verify_params_rx_mpwqe_strides(struct mlx5_core_dev *mdev, + struct mlx5e_params *params, + struct mlx5e_xsk_param *xsk); static inline void mlx5e_params_print_info(struct mlx5_core_dev *mdev, struct mlx5e_params *params) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/port.c b/drivers/net/ethernet/mellanox/mlx5/core/en/port.c index 505ba41195b9..dbe2b19a9570 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/port.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/port.c @@ -32,101 +32,6 @@ #include "port.h" -/* speed in units of 1Mb */ -static const u32 mlx5e_link_speed[MLX5E_LINK_MODES_NUMBER] = { - [MLX5E_1000BASE_CX_SGMII] = 1000, - [MLX5E_1000BASE_KX] = 1000, - [MLX5E_10GBASE_CX4] = 10000, - [MLX5E_10GBASE_KX4] = 10000, - [MLX5E_10GBASE_KR] = 10000, - [MLX5E_20GBASE_KR2] = 20000, - [MLX5E_40GBASE_CR4] = 40000, - [MLX5E_40GBASE_KR4] = 40000, - [MLX5E_56GBASE_R4] = 56000, - [MLX5E_10GBASE_CR] = 10000, - [MLX5E_10GBASE_SR] = 10000, - [MLX5E_10GBASE_ER] = 10000, - [MLX5E_40GBASE_SR4] = 40000, - [MLX5E_40GBASE_LR4] = 40000, - [MLX5E_50GBASE_SR2] = 50000, - [MLX5E_100GBASE_CR4] = 100000, - [MLX5E_100GBASE_SR4] = 100000, - [MLX5E_100GBASE_KR4] = 
100000, - [MLX5E_100GBASE_LR4] = 100000, - [MLX5E_100BASE_TX] = 100, - [MLX5E_1000BASE_T] = 1000, - [MLX5E_10GBASE_T] = 10000, - [MLX5E_25GBASE_CR] = 25000, - [MLX5E_25GBASE_KR] = 25000, - [MLX5E_25GBASE_SR] = 25000, - [MLX5E_50GBASE_CR2] = 50000, - [MLX5E_50GBASE_KR2] = 50000, -}; - -static const u32 mlx5e_ext_link_speed[MLX5E_EXT_LINK_MODES_NUMBER] = { - [MLX5E_SGMII_100M] = 100, - [MLX5E_1000BASE_X_SGMII] = 1000, - [MLX5E_5GBASE_R] = 5000, - [MLX5E_10GBASE_XFI_XAUI_1] = 10000, - [MLX5E_40GBASE_XLAUI_4_XLPPI_4] = 40000, - [MLX5E_25GAUI_1_25GBASE_CR_KR] = 25000, - [MLX5E_50GAUI_2_LAUI_2_50GBASE_CR2_KR2] = 50000, - [MLX5E_50GAUI_1_LAUI_1_50GBASE_CR_KR] = 50000, - [MLX5E_CAUI_4_100GBASE_CR4_KR4] = 100000, - [MLX5E_100GAUI_2_100GBASE_CR2_KR2] = 100000, - [MLX5E_200GAUI_4_200GBASE_CR4_KR4] = 200000, - [MLX5E_400GAUI_8] = 400000, - [MLX5E_100GAUI_1_100GBASE_CR_KR] = 100000, - [MLX5E_200GAUI_2_200GBASE_CR2_KR2] = 200000, - [MLX5E_400GAUI_4_400GBASE_CR4_KR4] = 400000, -}; - -bool mlx5e_ptys_ext_supported(struct mlx5_core_dev *mdev) -{ - struct mlx5e_port_eth_proto eproto; - int err; - - if (MLX5_CAP_PCAM_FEATURE(mdev, ptys_extended_ethernet)) - return true; - - err = mlx5_port_query_eth_proto(mdev, 1, true, &eproto); - if (err) - return false; - - return !!eproto.cap; -} - -static void mlx5e_port_get_speed_arr(struct mlx5_core_dev *mdev, - const u32 **arr, u32 *size, - bool force_legacy) -{ - bool ext = force_legacy ? false : mlx5e_ptys_ext_supported(mdev); - - *size = ext ? ARRAY_SIZE(mlx5e_ext_link_speed) : - ARRAY_SIZE(mlx5e_link_speed); - *arr = ext ? 
mlx5e_ext_link_speed : mlx5e_link_speed; -} - -int mlx5_port_query_eth_proto(struct mlx5_core_dev *dev, u8 port, bool ext, - struct mlx5e_port_eth_proto *eproto) -{ - u32 out[MLX5_ST_SZ_DW(ptys_reg)]; - int err; - - if (!eproto) - return -EINVAL; - - err = mlx5_query_port_ptys(dev, out, sizeof(out), MLX5_PTYS_EN, port); - if (err) - return err; - - eproto->cap = MLX5_GET_ETH_PROTO(ptys_reg, out, ext, - eth_proto_capability); - eproto->admin = MLX5_GET_ETH_PROTO(ptys_reg, out, ext, eth_proto_admin); - eproto->oper = MLX5_GET_ETH_PROTO(ptys_reg, out, ext, eth_proto_oper); - return 0; -} - void mlx5_port_query_eth_autoneg(struct mlx5_core_dev *dev, u8 *an_status, u8 *an_disable_cap, u8 *an_disable_admin) { @@ -172,30 +77,14 @@ int mlx5_port_set_eth_ptys(struct mlx5_core_dev *dev, bool an_disable, sizeof(out), MLX5_REG_PTYS, 0, 1); } -u32 mlx5e_port_ptys2speed(struct mlx5_core_dev *mdev, u32 eth_proto_oper, - bool force_legacy) -{ - unsigned long temp = eth_proto_oper; - const u32 *table; - u32 speed = 0; - u32 max_size; - int i; - - mlx5e_port_get_speed_arr(mdev, &table, &max_size, force_legacy); - i = find_first_bit(&temp, max_size); - if (i < max_size) - speed = table[i]; - return speed; -} - int mlx5e_port_linkspeed(struct mlx5_core_dev *mdev, u32 *speed) { - struct mlx5e_port_eth_proto eproto; + struct mlx5_port_eth_proto eproto; bool force_legacy = false; bool ext; int err; - ext = mlx5e_ptys_ext_supported(mdev); + ext = mlx5_ptys_ext_supported(mdev); err = mlx5_port_query_eth_proto(mdev, 1, ext, &eproto); if (err) goto out; @@ -205,7 +94,7 @@ int mlx5e_port_linkspeed(struct mlx5_core_dev *mdev, u32 *speed) if (err) goto out; } - *speed = mlx5e_port_ptys2speed(mdev, eproto.oper, force_legacy); + *speed = mlx5_port_ptys2speed(mdev, eproto.oper, force_legacy); if (!(*speed)) err = -EINVAL; @@ -213,46 +102,6 @@ out: return err; } -int mlx5e_port_max_linkspeed(struct mlx5_core_dev *mdev, u32 *speed) -{ - struct mlx5e_port_eth_proto eproto; - u32 max_speed = 0; - 
const u32 *table; - u32 max_size; - bool ext; - int err; - int i; - - ext = mlx5e_ptys_ext_supported(mdev); - err = mlx5_port_query_eth_proto(mdev, 1, ext, &eproto); - if (err) - return err; - - mlx5e_port_get_speed_arr(mdev, &table, &max_size, false); - for (i = 0; i < max_size; ++i) - if (eproto.cap & MLX5E_PROT_MASK(i)) - max_speed = max(max_speed, table[i]); - - *speed = max_speed; - return 0; -} - -u32 mlx5e_port_speed2linkmodes(struct mlx5_core_dev *mdev, u32 speed, - bool force_legacy) -{ - u32 link_modes = 0; - const u32 *table; - u32 max_size; - int i; - - mlx5e_port_get_speed_arr(mdev, &table, &max_size, force_legacy); - for (i = 0; i < max_size; ++i) { - if (table[i] == speed) - link_modes |= MLX5E_PROT_MASK(i); - } - return link_modes; -} - int mlx5e_port_query_pbmc(struct mlx5_core_dev *mdev, void *out) { int sz = MLX5_ST_SZ_BYTES(pbmc_reg); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/port.h b/drivers/net/ethernet/mellanox/mlx5/core/en/port.h index 3f474e370828..d1da225f35da 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/port.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/port.h @@ -36,25 +36,11 @@ #include <linux/mlx5/driver.h> #include "en.h" -struct mlx5e_port_eth_proto { - u32 cap; - u32 admin; - u32 oper; -}; - -int mlx5_port_query_eth_proto(struct mlx5_core_dev *dev, u8 port, bool ext, - struct mlx5e_port_eth_proto *eproto); void mlx5_port_query_eth_autoneg(struct mlx5_core_dev *dev, u8 *an_status, u8 *an_disable_cap, u8 *an_disable_admin); int mlx5_port_set_eth_ptys(struct mlx5_core_dev *dev, bool an_disable, u32 proto_admin, bool ext); -u32 mlx5e_port_ptys2speed(struct mlx5_core_dev *mdev, u32 eth_proto_oper, - bool force_legacy); int mlx5e_port_linkspeed(struct mlx5_core_dev *mdev, u32 *speed); -int mlx5e_port_max_linkspeed(struct mlx5_core_dev *mdev, u32 *speed); -u32 mlx5e_port_speed2linkmodes(struct mlx5_core_dev *mdev, u32 speed, - bool force_legacy); -bool mlx5e_ptys_ext_supported(struct mlx5_core_dev *mdev); int 
mlx5e_port_query_pbmc(struct mlx5_core_dev *mdev, void *out); int mlx5e_port_set_pbmc(struct mlx5_core_dev *mdev, void *in); int mlx5e_port_query_sbpr(struct mlx5_core_dev *mdev, u32 desc, u8 dir, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c index eb5aeba3addf..eb5abd0e55d9 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.c @@ -81,23 +81,23 @@ void mlx5e_skb_cb_hwtstamp_handler(struct sk_buff *skb, int hwtstamp_type, #define PTP_WQE_CTR2IDX(val) ((val) & ptpsq->ts_cqe_ctr_mask) -static bool mlx5e_ptp_ts_cqe_drop(struct mlx5e_ptpsq *ptpsq, u16 skb_cc, u16 skb_id) +static bool mlx5e_ptp_ts_cqe_drop(struct mlx5e_ptpsq *ptpsq, u16 skb_ci, u16 skb_id) { - return (ptpsq->ts_cqe_ctr_mask && (skb_cc != skb_id)); + return (ptpsq->ts_cqe_ctr_mask && (skb_ci != skb_id)); } static bool mlx5e_ptp_ts_cqe_ooo(struct mlx5e_ptpsq *ptpsq, u16 skb_id) { - u16 skb_cc = PTP_WQE_CTR2IDX(ptpsq->skb_fifo_cc); - u16 skb_pc = PTP_WQE_CTR2IDX(ptpsq->skb_fifo_pc); + u16 skb_ci = PTP_WQE_CTR2IDX(ptpsq->skb_fifo_cc); + u16 skb_pi = PTP_WQE_CTR2IDX(ptpsq->skb_fifo_pc); - if (PTP_WQE_CTR2IDX(skb_id - skb_cc) >= PTP_WQE_CTR2IDX(skb_pc - skb_cc)) + if (PTP_WQE_CTR2IDX(skb_id - skb_ci) >= PTP_WQE_CTR2IDX(skb_pi - skb_ci)) return true; return false; } -static void mlx5e_ptp_skb_fifo_ts_cqe_resync(struct mlx5e_ptpsq *ptpsq, u16 skb_cc, +static void mlx5e_ptp_skb_fifo_ts_cqe_resync(struct mlx5e_ptpsq *ptpsq, u16 skb_ci, u16 skb_id, int budget) { struct skb_shared_hwtstamps hwts = {}; @@ -105,13 +105,13 @@ static void mlx5e_ptp_skb_fifo_ts_cqe_resync(struct mlx5e_ptpsq *ptpsq, u16 skb_ ptpsq->cq_stats->resync_event++; - while (skb_cc != skb_id) { + while (skb_ci != skb_id) { skb = mlx5e_skb_fifo_pop(&ptpsq->skb_fifo); hwts.hwtstamp = mlx5e_skb_cb_get_hwts(skb)->cqe_hwtstamp; skb_tstamp_tx(skb, &hwts); ptpsq->cq_stats->resync_cqe++; napi_consume_skb(skb, budget); - skb_cc 
= PTP_WQE_CTR2IDX(ptpsq->skb_fifo_cc); + skb_ci = PTP_WQE_CTR2IDX(ptpsq->skb_fifo_cc); } } @@ -120,7 +120,7 @@ static void mlx5e_ptp_handle_ts_cqe(struct mlx5e_ptpsq *ptpsq, int budget) { u16 skb_id = PTP_WQE_CTR2IDX(be16_to_cpu(cqe->wqe_counter)); - u16 skb_cc = PTP_WQE_CTR2IDX(ptpsq->skb_fifo_cc); + u16 skb_ci = PTP_WQE_CTR2IDX(ptpsq->skb_fifo_cc); struct mlx5e_txqsq *sq = &ptpsq->txqsq; struct sk_buff *skb; ktime_t hwtstamp; @@ -131,13 +131,13 @@ static void mlx5e_ptp_handle_ts_cqe(struct mlx5e_ptpsq *ptpsq, goto out; } - if (mlx5e_ptp_ts_cqe_drop(ptpsq, skb_cc, skb_id)) { + if (mlx5e_ptp_ts_cqe_drop(ptpsq, skb_ci, skb_id)) { if (mlx5e_ptp_ts_cqe_ooo(ptpsq, skb_id)) { /* already handled by a previous resync */ ptpsq->cq_stats->ooo_cqe_drop++; return; } - mlx5e_ptp_skb_fifo_ts_cqe_resync(ptpsq, skb_cc, skb_id, budget); + mlx5e_ptp_skb_fifo_ts_cqe_resync(ptpsq, skb_ci, skb_id, budget); } skb = mlx5e_skb_fifo_pop(&ptpsq->skb_fifo); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c index ce85b48d327d..fd191925ab4b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/bridge.c @@ -220,6 +220,7 @@ mlx5_esw_bridge_port_obj_add(struct net_device *dev, struct netlink_ext_ack *extack = switchdev_notifier_info_to_extack(&port_obj_info->info); const struct switchdev_obj *obj = port_obj_info->obj; const struct switchdev_obj_port_vlan *vlan; + const struct switchdev_obj_port_mdb *mdb; u16 vport_num, esw_owner_vhca_id; int err; @@ -235,6 +236,11 @@ mlx5_esw_bridge_port_obj_add(struct net_device *dev, err = mlx5_esw_bridge_port_vlan_add(vport_num, esw_owner_vhca_id, vlan->vid, vlan->flags, br_offloads, extack); break; + case SWITCHDEV_OBJ_ID_PORT_MDB: + mdb = SWITCHDEV_OBJ_PORT_MDB(obj); + err = mlx5_esw_bridge_port_mdb_add(dev, vport_num, esw_owner_vhca_id, mdb->addr, + mdb->vid, br_offloads, extack); + break; default: return 
-EOPNOTSUPP; } @@ -248,6 +254,7 @@ mlx5_esw_bridge_port_obj_del(struct net_device *dev, { const struct switchdev_obj *obj = port_obj_info->obj; const struct switchdev_obj_port_vlan *vlan; + const struct switchdev_obj_port_mdb *mdb; u16 vport_num, esw_owner_vhca_id; if (!mlx5_esw_bridge_rep_vport_num_vhca_id_get(dev, br_offloads->esw, &vport_num, @@ -261,6 +268,11 @@ mlx5_esw_bridge_port_obj_del(struct net_device *dev, vlan = SWITCHDEV_OBJ_PORT_VLAN(obj); mlx5_esw_bridge_port_vlan_del(vport_num, esw_owner_vhca_id, vlan->vid, br_offloads); break; + case SWITCHDEV_OBJ_ID_PORT_MDB: + mdb = SWITCHDEV_OBJ_PORT_MDB(obj); + mlx5_esw_bridge_port_mdb_del(dev, vport_num, esw_owner_vhca_id, mdb->addr, mdb->vid, + br_offloads); + break; default: return -EOPNOTSUPP; } @@ -306,6 +318,10 @@ mlx5_esw_bridge_port_obj_attr_set(struct net_device *dev, attr->u.vlan_protocol, br_offloads); break; + case SWITCHDEV_ATTR_ID_BRIDGE_MC_DISABLED: + err = mlx5_esw_bridge_mcast_set(vport_num, esw_owner_vhca_id, + !attr->u.mc_disabled, br_offloads); + break; default: err = -EOPNOTSUPP; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c index 8f7452dc00ee..b5c773ffc763 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c @@ -426,39 +426,58 @@ static bool mlx5e_rep_macvlan_mode_supported(const struct net_device *dev) return macvlan->mode == MACVLAN_MODE_PASSTHRU; } -static int -mlx5e_rep_indr_setup_block(struct net_device *netdev, struct Qdisc *sch, - struct mlx5e_rep_priv *rpriv, - struct flow_block_offload *f, - flow_setup_cb_t *setup_cb, - void *data, - void (*cleanup)(struct flow_block_cb *block_cb)) +static bool +mlx5e_rep_check_indr_block_supported(struct mlx5e_rep_priv *rpriv, + struct net_device *netdev, + struct flow_block_offload *f) { struct mlx5e_priv *priv = netdev_priv(rpriv->netdev); struct mlx5_eswitch *esw = priv->mdev->priv.eswitch; - bool 
is_ovs_int_port = netif_is_ovs_master(netdev); - struct mlx5e_rep_indr_block_priv *indr_priv; - struct flow_block_cb *block_cb; + struct net_device *macvlan_real_dev; - if (!mlx5e_tc_tun_device_to_offload(priv, netdev) && - !(is_vlan_dev(netdev) && vlan_dev_real_dev(netdev) == rpriv->netdev) && - !is_ovs_int_port) { - if (!(netif_is_macvlan(netdev) && macvlan_dev_real_dev(netdev) == rpriv->netdev)) - return -EOPNOTSUPP; + if (f->binder_type != FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS && + f->binder_type != FLOW_BLOCK_BINDER_TYPE_CLSACT_EGRESS) + return false; + + if (mlx5e_tc_tun_device_to_offload(priv, netdev)) + return true; + + if (is_vlan_dev(netdev) && vlan_dev_real_dev(netdev) == rpriv->netdev) + return true; + + if (netif_is_macvlan(netdev)) { if (!mlx5e_rep_macvlan_mode_supported(netdev)) { netdev_warn(netdev, "Offloading ingress filter is supported only with macvlan passthru mode"); - return -EOPNOTSUPP; + return false; } + + macvlan_real_dev = macvlan_dev_real_dev(netdev); + + if (macvlan_real_dev == rpriv->netdev) + return true; + if (netif_is_bond_master(macvlan_real_dev)) + return true; } - if (f->binder_type != FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS && - f->binder_type != FLOW_BLOCK_BINDER_TYPE_CLSACT_EGRESS) - return -EOPNOTSUPP; + if (netif_is_ovs_master(netdev) && f->binder_type == FLOW_BLOCK_BINDER_TYPE_CLSACT_EGRESS && + mlx5e_tc_int_port_supported(esw)) + return true; - if (f->binder_type == FLOW_BLOCK_BINDER_TYPE_CLSACT_EGRESS && !is_ovs_int_port) - return -EOPNOTSUPP; + return false; +} - if (is_ovs_int_port && !mlx5e_tc_int_port_supported(esw)) +static int +mlx5e_rep_indr_setup_block(struct net_device *netdev, struct Qdisc *sch, + struct mlx5e_rep_priv *rpriv, + struct flow_block_offload *f, + flow_setup_cb_t *setup_cb, + void *data, + void (*cleanup)(struct flow_block_cb *block_cb)) +{ + struct mlx5e_rep_indr_block_priv *indr_priv; + struct flow_block_cb *block_cb; + + if (!mlx5e_rep_check_indr_block_supported(rpriv, netdev, f)) return 
-EOPNOTSUPP; f->unlocked_driver_cb = true; @@ -715,5 +734,6 @@ forward: return; free_skb: + dev_put(tc_priv.fwd_dev); dev_kfree_skb_any(skb); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c index c462fe76495b..e8eea9ffd5eb 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c @@ -8,6 +8,19 @@ #include "ptp.h" #include "lib/tout.h" +/* Keep this string array consistent with the MLX5E_RQ_STATE_* enums in en.h */ +static const char * const rq_sw_state_type_name[] = { + [MLX5E_RQ_STATE_ENABLED] = "enabled", + [MLX5E_RQ_STATE_RECOVERING] = "recovering", + [MLX5E_RQ_STATE_DIM] = "dim", + [MLX5E_RQ_STATE_NO_CSUM_COMPLETE] = "no_csum_complete", + [MLX5E_RQ_STATE_CSUM_FULL] = "csum_full", + [MLX5E_RQ_STATE_MINI_CQE_HW_STRIDX] = "mini_cqe_hw_stridx", + [MLX5E_RQ_STATE_SHAMPO] = "shampo", + [MLX5E_RQ_STATE_MINI_CQE_ENHANCED] = "mini_cqe_enhanced", + [MLX5E_RQ_STATE_XSK] = "xsk", +}; + static int mlx5e_query_rq_state(struct mlx5_core_dev *dev, u32 rqn, u8 *state) { int outlen = MLX5_ST_SZ_BYTES(query_rq_out); @@ -108,9 +121,9 @@ static int mlx5e_rx_reporter_err_icosq_cqe_recover(void *ctx) mlx5e_reset_icosq_cc_pc(icosq); - mlx5e_free_rx_in_progress_descs(rq); + mlx5e_free_rx_missing_descs(rq); if (xskrq) - mlx5e_free_rx_in_progress_descs(xskrq); + mlx5e_free_rx_missing_descs(xskrq); clear_bit(MLX5E_SQ_STATE_RECOVERING, &icosq->state); mlx5e_activate_icosq(icosq); @@ -239,6 +252,27 @@ static int mlx5e_reporter_icosq_diagnose(struct mlx5e_icosq *icosq, u8 hw_state, return mlx5e_health_fmsg_named_obj_nest_end(fmsg); } +static int mlx5e_health_rq_put_sw_state(struct devlink_fmsg *fmsg, struct mlx5e_rq *rq) +{ + int err; + int i; + + BUILD_BUG_ON_MSG(ARRAY_SIZE(rq_sw_state_type_name) != MLX5E_NUM_RQ_STATES, + "rq_sw_state_type_name string array must be consistent with MLX5E_RQ_STATE_* enum in en.h"); + err = 
mlx5e_health_fmsg_named_obj_nest_start(fmsg, "SW State"); + if (err) + return err; + + for (i = 0; i < ARRAY_SIZE(rq_sw_state_type_name); ++i) { + err = devlink_fmsg_u32_pair_put(fmsg, rq_sw_state_type_name[i], + test_bit(i, &rq->state)); + if (err) + return err; + } + + return mlx5e_health_fmsg_named_obj_nest_end(fmsg); +} + static int mlx5e_rx_reporter_build_diagnose_output_rq_common(struct mlx5e_rq *rq, struct devlink_fmsg *fmsg) @@ -265,10 +299,6 @@ mlx5e_rx_reporter_build_diagnose_output_rq_common(struct mlx5e_rq *rq, if (err) return err; - err = devlink_fmsg_u8_pair_put(fmsg, "SW state", rq->state); - if (err) - return err; - err = devlink_fmsg_u32_pair_put(fmsg, "WQE counter", wqe_counter); if (err) return err; @@ -281,6 +311,10 @@ mlx5e_rx_reporter_build_diagnose_output_rq_common(struct mlx5e_rq *rq, if (err) return err; + err = mlx5e_health_rq_put_sw_state(fmsg, rq); + if (err) + return err; + err = mlx5e_health_cq_diag_fmsg(&rq->cq, fmsg); if (err) return err; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c index 34666e2b3871..b35ff289af49 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c @@ -6,6 +6,19 @@ #include "en/devlink.h" #include "lib/tout.h" +/* Keep this string array consistent with the MLX5E_SQ_STATE_* enums in en.h */ +static const char * const sq_sw_state_type_name[] = { + [MLX5E_SQ_STATE_ENABLED] = "enabled", + [MLX5E_SQ_STATE_MPWQE] = "mpwqe", + [MLX5E_SQ_STATE_RECOVERING] = "recovering", + [MLX5E_SQ_STATE_IPSEC] = "ipsec", + [MLX5E_SQ_STATE_DIM] = "dim", + [MLX5E_SQ_STATE_VLAN_NEED_L2_INLINE] = "vlan_need_l2_inline", + [MLX5E_SQ_STATE_PENDING_XSK_TX] = "pending_xsk_tx", + [MLX5E_SQ_STATE_PENDING_TLS_RX_RESYNC] = "pending_tls_rx_resync", + [MLX5E_SQ_STATE_XDP_MULTIBUF] = "xdp_multibuf", +}; + static int mlx5e_wait_for_sq_flush(struct mlx5e_txqsq *sq) { struct mlx5_core_dev *dev = 
sq->mdev; @@ -37,6 +50,27 @@ static void mlx5e_reset_txqsq_cc_pc(struct mlx5e_txqsq *sq) sq->pc = 0; } +static int mlx5e_health_sq_put_sw_state(struct devlink_fmsg *fmsg, struct mlx5e_txqsq *sq) +{ + int err; + int i; + + BUILD_BUG_ON_MSG(ARRAY_SIZE(sq_sw_state_type_name) != MLX5E_NUM_SQ_STATES, + "sq_sw_state_type_name string array must be consistent with MLX5E_SQ_STATE_* enum in en.h"); + err = mlx5e_health_fmsg_named_obj_nest_start(fmsg, "SW State"); + if (err) + return err; + + for (i = 0; i < ARRAY_SIZE(sq_sw_state_type_name); ++i) { + err = devlink_fmsg_u32_pair_put(fmsg, sq_sw_state_type_name[i], + test_bit(i, &sq->state)); + if (err) + return err; + } + + return mlx5e_health_fmsg_named_obj_nest_end(fmsg); +} + static int mlx5e_tx_reporter_err_cqe_recover(void *ctx) { struct mlx5_core_dev *mdev; @@ -190,6 +224,10 @@ mlx5e_tx_reporter_build_diagnose_output_sq_common(struct devlink_fmsg *fmsg, if (err) return err; + err = mlx5e_health_sq_put_sw_state(fmsg, sq); + if (err) + return err; + err = mlx5e_health_cq_diag_fmsg(&sq->cq, fmsg); if (err) return err; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/accept.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/accept.c index a278f52d52b0..9db1b5307a8d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/accept.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/accept.c @@ -4,15 +4,6 @@ #include "act.h" #include "en/tc_priv.h" -static bool -tc_act_can_offload_accept(struct mlx5e_tc_act_parse_state *parse_state, - const struct flow_action_entry *act, - int act_index, - struct mlx5_flow_attr *attr) -{ - return true; -} - static int tc_act_parse_accept(struct mlx5e_tc_act_parse_state *parse_state, const struct flow_action_entry *act, @@ -26,7 +17,6 @@ tc_act_parse_accept(struct mlx5e_tc_act_parse_state *parse_state, } struct mlx5e_tc_act mlx5e_tc_act_accept = { - .can_offload = tc_act_can_offload_accept, .parse_action = tc_act_parse_accept, .is_terminating_action = true, }; diff 
--git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.c index eba0c8698926..fc923a99b6a4 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.c @@ -82,26 +82,6 @@ mlx5e_tc_act_init_parse_state(struct mlx5e_tc_act_parse_state *parse_state, parse_state->flow_action = flow_action; } -void -mlx5e_tc_act_reorder_flow_actions(struct flow_action *flow_action, - struct mlx5e_tc_flow_action *flow_action_reorder) -{ - struct flow_action_entry *act; - int i, j = 0; - - flow_action_for_each(i, act, flow_action) { - /* Add CT action to be first. */ - if (act->id == FLOW_ACTION_CT) - flow_action_reorder->entries[j++] = act; - } - - flow_action_for_each(i, act, flow_action) { - if (act->id == FLOW_ACTION_CT) - continue; - flow_action_reorder->entries[j++] = act; - } -} - int mlx5e_tc_act_post_parse(struct mlx5e_tc_act_parse_state *parse_state, struct flow_action *flow_action, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.h b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.h index 8346557eeaf6..0e6e1872ac62 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.h @@ -17,8 +17,6 @@ struct mlx5e_tc_act_parse_state { struct mlx5e_tc_flow *flow; struct netlink_ext_ack *extack; u32 actions; - bool ct; - bool ct_clear; bool encap; bool decap; bool mpls_push; @@ -56,6 +54,8 @@ struct mlx5e_tc_act { const struct flow_action_entry *act, struct mlx5_flow_attr *attr); + bool (*is_missable)(const struct flow_action_entry *act); + int (*offload_action)(struct mlx5e_priv *priv, struct flow_offload_action *fl_act, struct flow_action_entry *act); @@ -110,10 +110,6 @@ mlx5e_tc_act_init_parse_state(struct mlx5e_tc_act_parse_state *parse_state, struct flow_action *flow_action, struct netlink_ext_ack *extack); -void -mlx5e_tc_act_reorder_flow_actions(struct 
flow_action *flow_action, - struct mlx5e_tc_flow_action *flow_action_reorder); - int mlx5e_tc_act_post_parse(struct mlx5e_tc_act_parse_state *parse_state, struct flow_action *flow_action, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ct.c index a829c94289c1..92d3952dfa8b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ct.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ct.c @@ -5,53 +5,22 @@ #include "en/tc_priv.h" #include "en/tc_ct.h" -static bool -tc_act_can_offload_ct(struct mlx5e_tc_act_parse_state *parse_state, - const struct flow_action_entry *act, - int act_index, - struct mlx5_flow_attr *attr) -{ - bool clear_action = act->ct.action & TCA_CT_ACT_CLEAR; - struct netlink_ext_ack *extack = parse_state->extack; - - if (parse_state->ct && !clear_action) { - NL_SET_ERR_MSG_MOD(extack, "Multiple CT actions are not supported"); - return false; - } - - return true; -} - static int tc_act_parse_ct(struct mlx5e_tc_act_parse_state *parse_state, const struct flow_action_entry *act, struct mlx5e_priv *priv, struct mlx5_flow_attr *attr) { - bool clear_action = act->ct.action & TCA_CT_ACT_CLEAR; int err; - /* It's redundant to do ct clear more than once. 
*/ - if (clear_action && parse_state->ct_clear) - return 0; - - err = mlx5_tc_ct_parse_action(parse_state->ct_priv, attr, - &attr->parse_attr->mod_hdr_acts, - act, parse_state->extack); + err = mlx5_tc_ct_parse_action(parse_state->ct_priv, attr, act, parse_state->extack); if (err) return err; - if (mlx5e_is_eswitch_flow(parse_state->flow)) attr->esw_attr->split_count = attr->esw_attr->out_count; - if (clear_action) { - parse_state->ct_clear = true; - } else { - attr->flags |= MLX5_ATTR_FLAG_CT; - flow_flag_set(parse_state->flow, CT); - parse_state->ct = true; - } + attr->flags |= MLX5_ATTR_FLAG_CT; return 0; } @@ -61,27 +30,10 @@ tc_act_post_parse_ct(struct mlx5e_tc_act_parse_state *parse_state, struct mlx5e_priv *priv, struct mlx5_flow_attr *attr) { - struct mlx5e_tc_mod_hdr_acts *mod_acts = &attr->parse_attr->mod_hdr_acts; - int err; - - /* If ct action exist, we can ignore previous ct_clear actions */ - if (parse_state->ct) + if (!(attr->flags & MLX5_ATTR_FLAG_CT)) return 0; - if (parse_state->ct_clear) { - err = mlx5_tc_ct_set_ct_clear_regs(parse_state->ct_priv, mod_acts); - if (err) { - NL_SET_ERR_MSG_MOD(parse_state->extack, - "Failed to set registers for ct clear"); - return err; - } - attr->action |= MLX5_FLOW_CONTEXT_ACTION_MOD_HDR; - - /* Prevent handling of additional, redundant clear actions */ - parse_state->ct_clear = false; - } - - return 0; + return mlx5_tc_ct_flow_offload(parse_state->ct_priv, attr); } static bool @@ -95,10 +47,16 @@ tc_act_is_multi_table_act_ct(struct mlx5e_priv *priv, return true; } +static bool +tc_act_is_missable_ct(const struct flow_action_entry *act) +{ + return !(act->ct.action & TCA_CT_ACT_CLEAR); +} + struct mlx5e_tc_act mlx5e_tc_act_ct = { - .can_offload = tc_act_can_offload_ct, .parse_action = tc_act_parse_ct, - .is_multi_table_act = tc_act_is_multi_table_act_ct, .post_parse = tc_act_post_parse_ct, + .is_multi_table_act = tc_act_is_multi_table_act_ct, + .is_missable = tc_act_is_missable_ct, }; diff --git 
a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/drop.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/drop.c
index 7d16aeabb119..5dc81715d625 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/drop.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/drop.c
@@ -4,15 +4,6 @@
 #include "act.h"
 #include "en/tc_priv.h"
 
-static bool
-tc_act_can_offload_drop(struct mlx5e_tc_act_parse_state *parse_state,
-			const struct flow_action_entry *act,
-			int act_index,
-			struct mlx5_flow_attr *attr)
-{
-	return true;
-}
-
 static int
 tc_act_parse_drop(struct mlx5e_tc_act_parse_state *parse_state,
 		  const struct flow_action_entry *act,
@@ -25,7 +16,6 @@ tc_act_parse_drop(struct mlx5e_tc_act_parse_state *parse_state,
 }
 
 struct mlx5e_tc_act mlx5e_tc_act_drop = {
-	.can_offload = tc_act_can_offload_drop,
 	.parse_action = tc_act_parse_drop,
 	.is_terminating_action = true,
 };
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/mirred.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/mirred.c
index 07cc65596f89..291193f7120d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/mirred.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/mirred.c
@@ -234,6 +234,9 @@ parse_mirred(struct mlx5e_tc_act_parse_state *parse_state,
 	if (mlx5_lag_mpesw_do_mirred(priv->mdev, out_dev, extack))
 		return -EOPNOTSUPP;
 
+	if (netif_is_macvlan(out_dev))
+		out_dev = macvlan_dev_real_dev(out_dev);
+
 	out_dev = get_fdb_out_dev(uplink_dev, out_dev);
 	if (!out_dev)
 		return -ENODEV;
@@ -250,9 +253,6 @@ parse_mirred(struct mlx5e_tc_act_parse_state *parse_state,
 			return err;
 	}
 
-	if (netif_is_macvlan(out_dev))
-		out_dev = macvlan_dev_real_dev(out_dev);
-
 	err = verify_uplink_forwarding(priv, attr, out_dev, extack);
 	if (err)
 		return err;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/pedit.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/pedit.c
index 47597c524e59..3b272bbf4c53 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/pedit.c
+++
b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/pedit.c
@@ -78,15 +78,6 @@ out_err:
 	return err;
 }
 
-static bool
-tc_act_can_offload_pedit(struct mlx5e_tc_act_parse_state *parse_state,
-			 const struct flow_action_entry *act,
-			 int act_index,
-			 struct mlx5_flow_attr *attr)
-{
-	return true;
-}
-
 static int
 tc_act_parse_pedit(struct mlx5e_tc_act_parse_state *parse_state,
 		   const struct flow_action_entry *act,
@@ -114,6 +105,5 @@ tc_act_parse_pedit(struct mlx5e_tc_act_parse_state *parse_state,
 }
 
 struct mlx5e_tc_act mlx5e_tc_act_pedit = {
-	.can_offload = tc_act_can_offload_pedit,
 	.parse_action = tc_act_parse_pedit,
 };
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ptype.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ptype.c
index 6454b031ff7a..80b4bc64380a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ptype.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ptype.c
@@ -4,15 +4,6 @@
 #include "act.h"
 #include "en/tc_priv.h"
 
-static bool
-tc_act_can_offload_ptype(struct mlx5e_tc_act_parse_state *parse_state,
-			 const struct flow_action_entry *act,
-			 int act_index,
-			 struct mlx5_flow_attr *attr)
-{
-	return true;
-}
-
 static int
 tc_act_parse_ptype(struct mlx5e_tc_act_parse_state *parse_state,
 		   const struct flow_action_entry *act,
@@ -31,6 +22,5 @@ tc_act_parse_ptype(struct mlx5e_tc_act_parse_state *parse_state,
 }
 
 struct mlx5e_tc_act mlx5e_tc_act_ptype = {
-	.can_offload = tc_act_can_offload_ptype,
 	.parse_action = tc_act_parse_ptype,
 };
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/sample.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/sample.c
index 2c0196431302..2df02f99cecf 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/sample.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/sample.c
@@ -6,25 +6,6 @@
 #include "en/tc_priv.h"
 #include "en/tc/act/sample.h"
 
-static bool
-tc_act_can_offload_sample(struct mlx5e_tc_act_parse_state *parse_state,
-			  const struct flow_action_entry *act,
-
int act_index,
-			  struct mlx5_flow_attr *attr)
-{
-	struct netlink_ext_ack *extack = parse_state->extack;
-	bool ct_nat;
-
-	ct_nat = attr->ct_attr.ct_action & TCA_CT_ACT_NAT;
-
-	if (flow_flag_test(parse_state->flow, CT) && ct_nat) {
-		NL_SET_ERR_MSG_MOD(extack, "Sample action with CT NAT is not supported");
-		return false;
-	}
-
-	return true;
-}
-
 static int
 tc_act_parse_sample(struct mlx5e_tc_act_parse_state *parse_state,
 		    const struct flow_action_entry *act,
@@ -65,7 +46,6 @@ tc_act_is_multi_table_act_sample(struct mlx5e_priv *priv,
 }
 
 struct mlx5e_tc_act mlx5e_tc_act_sample = {
-	.can_offload = tc_act_can_offload_sample,
 	.parse_action = tc_act_parse_sample,
 	.is_multi_table_act = tc_act_is_multi_table_act_sample,
 };
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/trap.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/trap.c
index 915ce201aeb2..1b78bd9c106a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/trap.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/trap.c
@@ -5,15 +5,6 @@
 #include "en/tc_priv.h"
 #include "eswitch.h"
 
-static bool
-tc_act_can_offload_trap(struct mlx5e_tc_act_parse_state *parse_state,
-			const struct flow_action_entry *act,
-			int act_index,
-			struct mlx5_flow_attr *attr)
-{
-	return true;
-}
-
 static int
 tc_act_parse_trap(struct mlx5e_tc_act_parse_state *parse_state,
 		  const struct flow_action_entry *act,
@@ -27,6 +18,5 @@ tc_act_parse_trap(struct mlx5e_tc_act_parse_state *parse_state,
 }
 
 struct mlx5e_tc_act mlx5e_tc_act_trap = {
-	.can_offload = tc_act_can_offload_trap,
 	.parse_action = tc_act_parse_trap,
 };
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/tun.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/tun.c
index b4fa2de9711d..f1cae21c2c37 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/tun.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/tun.c
@@ -32,15 +32,6 @@ tc_act_parse_tun_encap(struct mlx5e_tc_act_parse_state *parse_state,
 	return 0;
 }
 
-static
bool
-tc_act_can_offload_tun_decap(struct mlx5e_tc_act_parse_state *parse_state,
-			     const struct flow_action_entry *act,
-			     int act_index,
-			     struct mlx5_flow_attr *attr)
-{
-	return true;
-}
-
 static int
 tc_act_parse_tun_decap(struct mlx5e_tc_act_parse_state *parse_state,
 		       const struct flow_action_entry *act,
@@ -58,6 +49,5 @@ struct mlx5e_tc_act mlx5e_tc_act_tun_encap = {
 };
 
 struct mlx5e_tc_act mlx5e_tc_act_tun_decap = {
-	.can_offload = tc_act_can_offload_tun_decap,
 	.parse_action = tc_act_parse_tun_decap,
 };
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/vlan.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/vlan.c
index 2e0d88b513aa..c8a3eaf189f6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/vlan.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/vlan.c
@@ -141,15 +141,6 @@ mlx5e_tc_act_vlan_add_pop_action(struct mlx5e_priv *priv,
 	return err;
 }
 
-static bool
-tc_act_can_offload_vlan(struct mlx5e_tc_act_parse_state *parse_state,
-			const struct flow_action_entry *act,
-			int act_index,
-			struct mlx5_flow_attr *attr)
-{
-	return true;
-}
-
 static int
 tc_act_parse_vlan(struct mlx5e_tc_act_parse_state *parse_state,
 		  const struct flow_action_entry *act,
@@ -205,7 +196,6 @@ tc_act_post_parse_vlan(struct mlx5e_tc_act_parse_state *parse_state,
 }
 
 struct mlx5e_tc_act mlx5e_tc_act_vlan = {
-	.can_offload = tc_act_can_offload_vlan,
 	.parse_action = tc_act_parse_vlan,
 	.post_parse = tc_act_post_parse_vlan,
 };
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/vlan_mangle.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/vlan_mangle.c
index 9a8a1a6bd99e..310b99230760 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/vlan_mangle.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/vlan_mangle.c
@@ -50,15 +50,6 @@ mlx5e_tc_act_vlan_add_rewrite_action(struct mlx5e_priv *priv, int namespace,
 	return err;
 }
 
-static bool
-tc_act_can_offload_vlan_mangle(struct mlx5e_tc_act_parse_state *parse_state,
-			       const struct
flow_action_entry *act,
-			       int act_index,
-			       struct mlx5_flow_attr *attr)
-{
-	return true;
-}
-
 static int
 tc_act_parse_vlan_mangle(struct mlx5e_tc_act_parse_state *parse_state,
 			 const struct flow_action_entry *act,
@@ -81,6 +72,5 @@ tc_act_parse_vlan_mangle(struct mlx5e_tc_act_parse_state *parse_state,
 }
 
 struct mlx5e_tc_act mlx5e_tc_act_vlan_mangle = {
-	.can_offload = tc_act_can_offload_vlan_mangle,
 	.parse_action = tc_act_parse_vlan_mangle,
 };
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/post_act.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/post_act.c
index 4e48946c4c2a..0290e0dea539 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/post_act.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/post_act.c
@@ -106,22 +106,17 @@ err_rule:
 }
 
 struct mlx5e_post_act_handle *
-mlx5e_tc_post_act_add(struct mlx5e_post_act *post_act, struct mlx5_flow_attr *attr)
+mlx5e_tc_post_act_add(struct mlx5e_post_act *post_act, struct mlx5_flow_attr *post_attr)
 {
-	u32 attr_sz = ns_to_attr_sz(post_act->ns_type);
 	struct mlx5e_post_act_handle *handle;
-	struct mlx5_flow_attr *post_attr;
 	int err;
 
 	handle = kzalloc(sizeof(*handle), GFP_KERNEL);
-	post_attr = mlx5_alloc_flow_attr(post_act->ns_type);
-	if (!handle || !post_attr) {
-		kfree(post_attr);
+	if (!handle) {
 		kfree(handle);
 		return ERR_PTR(-ENOMEM);
 	}
 
-	memcpy(post_attr, attr, attr_sz);
 	post_attr->chain = 0;
 	post_attr->prio = 0;
 	post_attr->ft = post_act->ft;
@@ -145,7 +140,6 @@ mlx5e_tc_post_act_add(struct mlx5e_post_act *post_act, struct mlx5_flow_attr *at
 	return handle;
 
 err_xarray:
-	kfree(post_attr);
 	kfree(handle);
 	return ERR_PTR(err);
 }
@@ -164,7 +158,6 @@ mlx5e_tc_post_act_del(struct mlx5e_post_act *post_act, struct mlx5e_post_act_han
 	if (!IS_ERR_OR_NULL(handle->rule))
 		mlx5e_tc_post_act_unoffload(post_act, handle);
 	xa_erase(&post_act->ids, handle->id);
-	kfree(handle->attr);
 	kfree(handle);
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/post_act.h
b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/post_act.h
index f476774c0b75..40b8df184af5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/post_act.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/post_act.h
@@ -19,7 +19,7 @@ void
 mlx5e_tc_post_act_destroy(struct mlx5e_post_act *post_act);
 
 struct mlx5e_post_act_handle *
-mlx5e_tc_post_act_add(struct mlx5e_post_act *post_act, struct mlx5_flow_attr *attr);
+mlx5e_tc_post_act_add(struct mlx5e_post_act *post_act, struct mlx5_flow_attr *post_attr);
 
 void
 mlx5e_tc_post_act_del(struct mlx5e_post_act *post_act, struct mlx5e_post_act_handle *handle);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/sample.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/sample.c
index 558a776359af..5db239cae814 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/sample.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/sample.c
@@ -14,10 +14,10 @@
 
 #define MLX5_ESW_VPORT_TBL_SIZE_SAMPLE (64 * 1024)
 
-static const struct esw_vport_tbl_namespace mlx5_esw_vport_tbl_sample_ns = {
+static struct esw_vport_tbl_namespace mlx5_esw_vport_tbl_sample_ns = {
 	.max_fte = MLX5_ESW_VPORT_TBL_SIZE_SAMPLE,
 	.max_num_groups = 0, /* default num of groups */
-	.flags = MLX5_FLOW_TABLE_TUNNEL_EN_REFORMAT | MLX5_FLOW_TABLE_TUNNEL_EN_DECAP,
+	.flags = 0,
 };
 
 struct mlx5e_tc_psample {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
index 314983bc6f08..ead38ef69483 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
@@ -83,12 +83,6 @@ struct mlx5_tc_ct_priv {
 	struct mlx5_tc_ct_debugfs debugfs;
 };
 
-struct mlx5_ct_flow {
-	struct mlx5_flow_attr *pre_ct_attr;
-	struct mlx5_flow_handle *pre_ct_rule;
-	struct mlx5_ct_ft *ft;
-};
-
 struct mlx5_ct_zone_rule {
 	struct mlx5_ct_fs_rule *rule;
 	struct mlx5e_mod_hdr_handle *mh;
@@ -598,12 +592,6 @@ mlx5_tc_ct_entry_set_registers(struct mlx5_tc_ct_priv *ct_priv,
 	return 0;
 }
 
-int mlx5_tc_ct_set_ct_clear_regs(struct mlx5_tc_ct_priv *priv,
-				 struct mlx5e_tc_mod_hdr_acts *mod_acts)
-{
-	return mlx5_tc_ct_entry_set_registers(priv, mod_acts, 0, 0, 0, 0);
-}
-
 static int
 mlx5_tc_ct_parse_mangle_to_mod_act(struct flow_action_entry *act,
 				   char *modact)
@@ -920,6 +908,7 @@ mlx5_tc_ct_entry_replace_rule(struct mlx5_tc_ct_priv *ct_priv,
 	zone_rule->rule = rule;
 	mlx5_tc_ct_entry_destroy_mod_hdr(ct_priv, old_attr, zone_rule->mh);
 	zone_rule->mh = mh;
+	mlx5_put_label_mapping(ct_priv, old_attr->ct_attr.ct_labels_id);
 
 	kfree(old_attr);
 	kvfree(spec);
@@ -1545,7 +1534,6 @@ mlx5_tc_ct_match_add(struct mlx5_tc_ct_priv *priv,
 int
 mlx5_tc_ct_parse_action(struct mlx5_tc_ct_priv *priv,
 			struct mlx5_flow_attr *attr,
-			struct mlx5e_tc_mod_hdr_acts *mod_acts,
 			const struct flow_action_entry *act,
 			struct netlink_ext_ack *extack)
 {
@@ -1555,8 +1543,8 @@ mlx5_tc_ct_parse_action(struct mlx5_tc_ct_priv *priv,
 		return -EOPNOTSUPP;
 	}
 
+	attr->ct_attr.ct_action |= act->ct.action; /* So we can have clear + ct */
 	attr->ct_attr.zone = act->ct.zone;
-	attr->ct_attr.ct_action = act->ct.action;
 	attr->ct_attr.nf_ft = act->ct.flow_table;
 	attr->ct_attr.act_miss_cookie = act->miss_cookie;
 
@@ -1892,14 +1880,14 @@ mlx5_tc_ct_del_ft_cb(struct mlx5_tc_ct_priv *ct_priv, struct mlx5_ct_ft *ft)
 
 /* We translate the tc filter with CT action to the following HW model:
  *
- *	+---------------------+
- *	+ ft prio (tc chain)  +
- *	+ original match      +
- *	+---------------------+
+ *	+-----------------------+
+ *	+ rule (either original +
+ *	+ or post_act rule)     +
+ *	+-----------------------+
  *		 | set act_miss_cookie mapping
  *		 | set fte_id
  *		 | set tunnel_id
- *		 | do decap
+ *		 | rest of actions before the CT action (for this orig/post_act rule)
  *		 |
  * +-------------+
  * | Chain 0	 |
@@ -1924,32 +1912,21 @@ mlx5_tc_ct_del_ft_cb(struct mlx5_tc_ct_priv *ct_priv, struct mlx5_ct_ft *ft)
  *		 | do nat (if needed)
  *		 v
  *	+--------------+
- *	+ post_act     + original filter actions
+ *	+ post_act     + rest of
parsed filter's actions
  *	+ fte_id match +------------------------>
  *	+--------------+
  *
  */
-static struct mlx5_flow_handle *
+static int
 __mlx5_tc_ct_flow_offload(struct mlx5_tc_ct_priv *ct_priv,
-			  struct mlx5_flow_spec *orig_spec,
 			  struct mlx5_flow_attr *attr)
 {
 	bool nat = attr->ct_attr.ct_action & TCA_CT_ACT_NAT;
 	struct mlx5e_priv *priv = netdev_priv(ct_priv->netdev);
-	struct mlx5e_tc_mod_hdr_acts *pre_mod_acts;
-	u32 attr_sz = ns_to_attr_sz(ct_priv->ns_type);
-	struct mlx5_flow_attr *pre_ct_attr;
-	struct mlx5_modify_hdr *mod_hdr;
-	struct mlx5_ct_flow *ct_flow;
 	int act_miss_mapping = 0, err;
 	struct mlx5_ct_ft *ft;
 	u16 zone;
 
-	ct_flow = kzalloc(sizeof(*ct_flow), GFP_KERNEL);
-	if (!ct_flow) {
-		return ERR_PTR(-ENOMEM);
-	}
-
 	/* Register for CT established events */
 	ft = mlx5_tc_ct_add_ft_cb(ct_priv, attr->ct_attr.zone,
 				  attr->ct_attr.nf_ft);
@@ -1958,23 +1935,7 @@ __mlx5_tc_ct_flow_offload(struct mlx5_tc_ct_priv *ct_priv,
 		ct_dbg("Failed to register to ft callback");
 		goto err_ft;
 	}
-	ct_flow->ft = ft;
-
-	/* Base flow attributes of both rules on original rule attribute */
-	ct_flow->pre_ct_attr = mlx5_alloc_flow_attr(ct_priv->ns_type);
-	if (!ct_flow->pre_ct_attr) {
-		err = -ENOMEM;
-		goto err_alloc_pre;
-	}
-
-	pre_ct_attr = ct_flow->pre_ct_attr;
-	memcpy(pre_ct_attr, attr, attr_sz);
-	pre_mod_acts = &pre_ct_attr->parse_attr->mod_hdr_acts;
-
-	/* Modify the original rule's action to fwd and modify, leave decap */
-	pre_ct_attr->action = attr->action & MLX5_FLOW_CONTEXT_ACTION_DECAP;
-	pre_ct_attr->action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST |
-			       MLX5_FLOW_CONTEXT_ACTION_MOD_HDR;
+	attr->ct_attr.ft = ft;
 
 	err = mlx5e_tc_action_miss_mapping_get(ct_priv->priv, attr, attr->ct_attr.act_miss_cookie,
 					       &act_miss_mapping);
@@ -1982,136 +1943,89 @@ __mlx5_tc_ct_flow_offload(struct mlx5_tc_ct_priv *ct_priv,
 		ct_dbg("Failed to get register mapping for act miss");
 		goto err_get_act_miss;
 	}
-	attr->ct_attr.act_miss_mapping = act_miss_mapping;
 
-	err =
mlx5e_tc_match_to_reg_set(priv->mdev, pre_mod_acts, ct_priv->ns_type,
-					MAPPED_OBJ_TO_REG, act_miss_mapping);
+	err = mlx5e_tc_match_to_reg_set(priv->mdev, &attr->parse_attr->mod_hdr_acts,
+					ct_priv->ns_type, MAPPED_OBJ_TO_REG, act_miss_mapping);
 	if (err) {
 		ct_dbg("Failed to set act miss register mapping");
 		goto err_mapping;
 	}
 
-	/* If original flow is decap, we do it before going into ct table
-	 * so add a rewrite for the tunnel match_id.
-	 */
-	if ((pre_ct_attr->action & MLX5_FLOW_CONTEXT_ACTION_DECAP) &&
-	    attr->chain == 0) {
-		err = mlx5e_tc_match_to_reg_set(priv->mdev, pre_mod_acts,
-						ct_priv->ns_type,
-						TUNNEL_TO_REG,
-						attr->tunnel_id);
-		if (err) {
-			ct_dbg("Failed to set tunnel register mapping");
-			goto err_mapping;
-		}
-	}
-
-	/* Change original rule point to ct table
-	 * Chain 0 sets the zone and jumps to ct table
+	/* Chain 0 sets the zone and jumps to ct table
 	 * Other chains jump to pre_ct table to align with act_ct cached logic
 	 */
-	pre_ct_attr->dest_chain = 0;
 	if (!attr->chain) {
 		zone = ft->zone & MLX5_CT_ZONE_MASK;
-		err = mlx5e_tc_match_to_reg_set(priv->mdev, pre_mod_acts, ct_priv->ns_type,
-						ZONE_TO_REG, zone);
+		err = mlx5e_tc_match_to_reg_set(priv->mdev, &attr->parse_attr->mod_hdr_acts,
+						ct_priv->ns_type, ZONE_TO_REG, zone);
 		if (err) {
 			ct_dbg("Failed to set zone register mapping");
 			goto err_mapping;
 		}
 
-		pre_ct_attr->dest_ft = nat ? ct_priv->ct_nat : ct_priv->ct;
+		attr->dest_ft = nat ? ct_priv->ct_nat : ct_priv->ct;
 	} else {
-		pre_ct_attr->dest_ft = nat ?
ft->pre_ct_nat.ft : ft->pre_ct.ft;
-	}
-
-	mod_hdr = mlx5_modify_header_alloc(priv->mdev, ct_priv->ns_type,
-					   pre_mod_acts->num_actions,
-					   pre_mod_acts->actions);
-	if (IS_ERR(mod_hdr)) {
-		err = PTR_ERR(mod_hdr);
-		ct_dbg("Failed to create pre ct mod hdr");
-		goto err_mapping;
-	}
-	pre_ct_attr->modify_hdr = mod_hdr;
-	ct_flow->pre_ct_rule = mlx5_tc_rule_insert(priv, orig_spec,
-						   pre_ct_attr);
-	if (IS_ERR(ct_flow->pre_ct_rule)) {
-		err = PTR_ERR(ct_flow->pre_ct_rule);
-		ct_dbg("Failed to add pre ct rule");
-		goto err_insert_orig;
+		attr->dest_ft = nat ? ft->pre_ct_nat.ft : ft->pre_ct.ft;
 	}
 
-	attr->ct_attr.ct_flow = ct_flow;
-	mlx5e_mod_hdr_dealloc(pre_mod_acts);
+	attr->action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST | MLX5_FLOW_CONTEXT_ACTION_MOD_HDR;
+	attr->ct_attr.act_miss_mapping = act_miss_mapping;
 
-	return ct_flow->pre_ct_rule;
+	return 0;
 
-err_insert_orig:
-	mlx5_modify_header_dealloc(priv->mdev, pre_ct_attr->modify_hdr);
 err_mapping:
-	mlx5e_mod_hdr_dealloc(pre_mod_acts);
 	mlx5e_tc_action_miss_mapping_put(ct_priv->priv, attr, act_miss_mapping);
 err_get_act_miss:
-	kfree(ct_flow->pre_ct_attr);
-err_alloc_pre:
 	mlx5_tc_ct_del_ft_cb(ct_priv, ft);
 err_ft:
-	kfree(ct_flow);
 	netdev_warn(priv->netdev, "Failed to offload ct flow, err %d\n", err);
-	return ERR_PTR(err);
+	return err;
 }
 
-struct mlx5_flow_handle *
-mlx5_tc_ct_flow_offload(struct mlx5_tc_ct_priv *priv,
-			struct mlx5_flow_spec *spec,
-			struct mlx5_flow_attr *attr,
-			struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts)
+int
+mlx5_tc_ct_flow_offload(struct mlx5_tc_ct_priv *priv, struct mlx5_flow_attr *attr)
 {
-	struct mlx5_flow_handle *rule;
+	int err;
 
 	if (!priv)
-		return ERR_PTR(-EOPNOTSUPP);
+		return -EOPNOTSUPP;
+
+	if (attr->ct_attr.ct_action & TCA_CT_ACT_CLEAR) {
+		err = mlx5_tc_ct_entry_set_registers(priv, &attr->parse_attr->mod_hdr_acts,
+						     0, 0, 0, 0);
+		if (err)
+			return err;
+
+		attr->action |= MLX5_FLOW_CONTEXT_ACTION_MOD_HDR;
+	}
+
+	if (!attr->ct_attr.nf_ft) /* means only ct clear action, and not
ct_clear,ct() */
+		return 0;
 
 	mutex_lock(&priv->control_lock);
-	rule = __mlx5_tc_ct_flow_offload(priv, spec, attr);
+	err = __mlx5_tc_ct_flow_offload(priv, attr);
 	mutex_unlock(&priv->control_lock);
 
-	return rule;
+	return err;
 }
 
 static void
 __mlx5_tc_ct_delete_flow(struct mlx5_tc_ct_priv *ct_priv,
-			 struct mlx5_ct_flow *ct_flow,
 			 struct mlx5_flow_attr *attr)
 {
-	struct mlx5_flow_attr *pre_ct_attr = ct_flow->pre_ct_attr;
-	struct mlx5e_priv *priv = netdev_priv(ct_priv->netdev);
-
-	mlx5_tc_rule_delete(priv, ct_flow->pre_ct_rule, pre_ct_attr);
-	mlx5_modify_header_dealloc(priv->mdev, pre_ct_attr->modify_hdr);
-
 	mlx5e_tc_action_miss_mapping_put(ct_priv->priv, attr, attr->ct_attr.act_miss_mapping);
-	mlx5_tc_ct_del_ft_cb(ct_priv, ct_flow->ft);
-
-	kfree(ct_flow->pre_ct_attr);
-	kfree(ct_flow);
+	mlx5_tc_ct_del_ft_cb(ct_priv, attr->ct_attr.ft);
 }
 
 void
 mlx5_tc_ct_delete_flow(struct mlx5_tc_ct_priv *priv,
 		       struct mlx5_flow_attr *attr)
 {
-	struct mlx5_ct_flow *ct_flow = attr->ct_attr.ct_flow;
-
-	/* We are called on error to clean up stuff from parsing
-	 * but we don't have anything for now
-	 */
-	if (!ct_flow)
+	if (!attr->ct_attr.nf_ft) /* means only ct clear action, and not ct_clear,ct() */
 		return;
 
 	mutex_lock(&priv->control_lock);
-	__mlx5_tc_ct_delete_flow(priv, ct_flow, attr);
+	__mlx5_tc_ct_delete_flow(priv, attr);
 	mutex_unlock(&priv->control_lock);
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h
index 5c5ddaa83055..8e9316fa46d4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h
@@ -25,11 +25,11 @@ struct nf_flowtable;
 struct mlx5_ct_attr {
 	u16 zone;
 	u16 ct_action;
-	struct mlx5_ct_flow *ct_flow;
 	struct nf_flowtable *nf_ft;
 	u32 ct_labels_id;
 	u32 act_miss_mapping;
 	u64 act_miss_cookie;
+	struct mlx5_ct_ft *ft;
 };
 
 #define zone_to_reg_ct {\
@@ -113,15 +113,12 @@ int mlx5_tc_ct_add_no_trk_match(struct mlx5_flow_spec *spec);
 int
 mlx5_tc_ct_parse_action(struct mlx5_tc_ct_priv *priv,
 			struct mlx5_flow_attr *attr,
-			struct mlx5e_tc_mod_hdr_acts *mod_acts,
 			const struct flow_action_entry *act,
 			struct netlink_ext_ack *extack);
 
-struct mlx5_flow_handle *
-mlx5_tc_ct_flow_offload(struct mlx5_tc_ct_priv *priv,
-			struct mlx5_flow_spec *spec,
-			struct mlx5_flow_attr *attr,
-			struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts);
+int
+mlx5_tc_ct_flow_offload(struct mlx5_tc_ct_priv *priv, struct mlx5_flow_attr *attr);
+
 void
 mlx5_tc_ct_delete_flow(struct mlx5_tc_ct_priv *priv,
 		       struct mlx5_flow_attr *attr);
@@ -130,10 +127,6 @@ bool
 mlx5e_tc_ct_restore_flow(struct mlx5_tc_ct_priv *ct_priv,
 			 struct sk_buff *skb, u8 zone_restore_id);
 
-int
-mlx5_tc_ct_set_ct_clear_regs(struct mlx5_tc_ct_priv *priv,
-			     struct mlx5e_tc_mod_hdr_acts *mod_acts);
-
 #else /* CONFIG_MLX5_TC_CT */
 
 static inline struct mlx5_tc_ct_priv *
@@ -176,16 +169,8 @@ mlx5_tc_ct_add_no_trk_match(struct mlx5_flow_spec *spec)
 }
 
 static inline int
-mlx5_tc_ct_set_ct_clear_regs(struct mlx5_tc_ct_priv *priv,
-			     struct mlx5e_tc_mod_hdr_acts *mod_acts)
-{
-	return -EOPNOTSUPP;
-}
-
-static inline int
 mlx5_tc_ct_parse_action(struct mlx5_tc_ct_priv *priv,
 			struct mlx5_flow_attr *attr,
-			struct mlx5e_tc_mod_hdr_acts *mod_acts,
 			const struct flow_action_entry *act,
 			struct netlink_ext_ack *extack)
 {
@@ -193,13 +178,11 @@ mlx5_tc_ct_parse_action(struct mlx5_tc_ct_priv *priv,
 	return -EOPNOTSUPP;
 }
 
-static inline struct mlx5_flow_handle *
+static inline int
 mlx5_tc_ct_flow_offload(struct mlx5_tc_ct_priv *priv,
-			struct mlx5_flow_spec *spec,
-			struct mlx5_flow_attr *attr,
-			struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts)
+			struct mlx5_flow_attr *attr)
 {
-	return ERR_PTR(-EOPNOTSUPP);
+	return -EOPNOTSUPP;
 }
 
 static inline void
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_priv.h b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_priv.h
index 451fd4342a5a..ba2b1f24ff14 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_priv.h
+++
b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_priv.h
@@ -25,12 +25,11 @@ enum {
 	MLX5E_TC_FLOW_FLAG_DUP = MLX5E_TC_FLOW_BASE + 4,
 	MLX5E_TC_FLOW_FLAG_NOT_READY = MLX5E_TC_FLOW_BASE + 5,
 	MLX5E_TC_FLOW_FLAG_DELETED = MLX5E_TC_FLOW_BASE + 6,
-	MLX5E_TC_FLOW_FLAG_CT = MLX5E_TC_FLOW_BASE + 7,
-	MLX5E_TC_FLOW_FLAG_L3_TO_L2_DECAP = MLX5E_TC_FLOW_BASE + 8,
-	MLX5E_TC_FLOW_FLAG_TUN_RX = MLX5E_TC_FLOW_BASE + 9,
-	MLX5E_TC_FLOW_FLAG_FAILED = MLX5E_TC_FLOW_BASE + 10,
-	MLX5E_TC_FLOW_FLAG_SAMPLE = MLX5E_TC_FLOW_BASE + 11,
-	MLX5E_TC_FLOW_FLAG_USE_ACT_STATS = MLX5E_TC_FLOW_BASE + 12,
+	MLX5E_TC_FLOW_FLAG_L3_TO_L2_DECAP = MLX5E_TC_FLOW_BASE + 7,
+	MLX5E_TC_FLOW_FLAG_TUN_RX = MLX5E_TC_FLOW_BASE + 8,
+	MLX5E_TC_FLOW_FLAG_FAILED = MLX5E_TC_FLOW_BASE + 9,
+	MLX5E_TC_FLOW_FLAG_SAMPLE = MLX5E_TC_FLOW_BASE + 10,
+	MLX5E_TC_FLOW_FLAG_USE_ACT_STATS = MLX5E_TC_FLOW_BASE + 11,
 };
 
 struct mlx5e_tc_flow_parse_attr {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h
index b38f693bbb52..92065568bb19 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h
@@ -115,6 +115,9 @@ int mlx5e_tc_tun_parse_udp_ports(struct mlx5e_priv *priv,
 
 bool mlx5e_tc_tun_encap_info_equal_generic(struct mlx5e_encap_key *a,
 					   struct mlx5e_encap_key *b);
+bool mlx5e_tc_tun_encap_info_equal_options(struct mlx5e_encap_key *a,
+					   struct mlx5e_encap_key *b,
+					   __be16 tun_flags);
 #endif /* CONFIG_MLX5_ESWITCH */
 
 #endif //__MLX5_EN_TC_TUNNEL_H__
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
index 780224fd67a1..20c2d2ecaf93 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
@@ -3,6 +3,7 @@
 
 #include <net/fib_notifier.h>
 #include <net/nexthop.h>
+#include <net/ip_tunnels.h>
 #include "tc_tun_encap.h"
 #include "en_tc.h"
 #include
"tc_tun.h"
@@ -97,7 +98,6 @@ int mlx5e_tc_set_attr_rx_tun(struct mlx5e_tc_flow *flow,
 #if IS_ENABLED(CONFIG_INET) && IS_ENABLED(CONFIG_IPV6)
 	else if (ip_version == 6) {
 		int ipv6_size = MLX5_FLD_SZ_BYTES(ipv6_layout, ipv6);
-		struct in6_addr zerov6 = {};
 
 		daddr = MLX5_ADDR_OF(fte_match_param, spec->match_value,
 				     outer_headers.dst_ipv4_dst_ipv6.ipv6_layout.ipv6);
@@ -105,8 +105,8 @@ int mlx5e_tc_set_attr_rx_tun(struct mlx5e_tc_flow *flow,
 				     outer_headers.src_ipv4_src_ipv6.ipv6_layout.ipv6);
 		memcpy(&tun_attr->dst_ip.v6, daddr, ipv6_size);
 		memcpy(&tun_attr->src_ip.v6, saddr, ipv6_size);
-		if (!memcmp(&tun_attr->dst_ip.v6, &zerov6, sizeof(zerov6)) ||
-		    !memcmp(&tun_attr->src_ip.v6, &zerov6, sizeof(zerov6)))
+		if (ipv6_addr_any(&tun_attr->dst_ip.v6) ||
+		    ipv6_addr_any(&tun_attr->src_ip.v6))
 			return 0;
 	}
 #endif
@@ -571,6 +571,37 @@ bool mlx5e_tc_tun_encap_info_equal_generic(struct mlx5e_encap_key *a,
 		a->tc_tunnel->tunnel_type == b->tc_tunnel->tunnel_type;
 }
 
+bool mlx5e_tc_tun_encap_info_equal_options(struct mlx5e_encap_key *a,
+					   struct mlx5e_encap_key *b,
+					   __be16 tun_flags)
+{
+	struct ip_tunnel_info *a_info;
+	struct ip_tunnel_info *b_info;
+	bool a_has_opts, b_has_opts;
+
+	if (!mlx5e_tc_tun_encap_info_equal_generic(a, b))
+		return false;
+
+	a_has_opts = !!(a->ip_tun_key->tun_flags & tun_flags);
+	b_has_opts = !!(b->ip_tun_key->tun_flags & tun_flags);
+
+	/* keys are equal when both don't have any options attached */
+	if (!a_has_opts && !b_has_opts)
+		return true;
+
+	if (a_has_opts != b_has_opts)
+		return false;
+
+	/* options stored in memory next to ip_tunnel_info struct */
+	a_info = container_of(a->ip_tun_key, struct ip_tunnel_info, key);
+	b_info = container_of(b->ip_tun_key, struct ip_tunnel_info, key);
+
+	return a_info->options_len == b_info->options_len &&
+	       !memcmp(ip_tunnel_info_opts(a_info),
+		       ip_tunnel_info_opts(b_info),
+		       a_info->options_len);
+}
+
 static int cmp_decap_info(struct mlx5e_decap_key *a,
 			  struct mlx5e_decap_key *b)
 {
diff --git
a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_geneve.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_geneve.c
index 054d80c4e65c..2bcd10b6d653 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_geneve.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_geneve.c
@@ -337,29 +337,7 @@ static int mlx5e_tc_tun_parse_geneve(struct mlx5e_priv *priv,
 static bool mlx5e_tc_tun_encap_info_equal_geneve(struct mlx5e_encap_key *a,
 						 struct mlx5e_encap_key *b)
 {
-	struct ip_tunnel_info *a_info;
-	struct ip_tunnel_info *b_info;
-	bool a_has_opts, b_has_opts;
-
-	if (!mlx5e_tc_tun_encap_info_equal_generic(a, b))
-		return false;
-
-	a_has_opts = !!(a->ip_tun_key->tun_flags & TUNNEL_GENEVE_OPT);
-	b_has_opts = !!(b->ip_tun_key->tun_flags & TUNNEL_GENEVE_OPT);
-
-	/* keys are equal when both don't have any options attached */
-	if (!a_has_opts && !b_has_opts)
-		return true;
-
-	if (a_has_opts != b_has_opts)
-		return false;
-
-	/* geneve options stored in memory next to ip_tunnel_info struct */
-	a_info = container_of(a->ip_tun_key, struct ip_tunnel_info, key);
-	b_info = container_of(b->ip_tun_key, struct ip_tunnel_info, key);
-
-	return a_info->options_len == b_info->options_len &&
-		memcmp(a_info + 1, b_info + 1, a_info->options_len) == 0;
+	return mlx5e_tc_tun_encap_info_equal_options(a, b, TUNNEL_GENEVE_OPT);
 }
 
 struct mlx5e_tc_tunnel geneve_tunnel = {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_vxlan.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_vxlan.c
index 1f62c702b625..a184d739d5f8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_vxlan.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_vxlan.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
 /* Copyright (c) 2018 Mellanox Technologies.
*/
 
+#include <net/ip_tunnels.h>
 #include <net/vxlan.h>
 #include "lib/vxlan.h"
 #include "en/tc_tun.h"
@@ -86,9 +87,11 @@ static int mlx5e_gen_ip_tunnel_header_vxlan(char buf[],
 	const struct ip_tunnel_key *tun_key = &e->tun_info->key;
 	__be32 tun_id = tunnel_id_to_key32(tun_key->tun_id);
 	struct udphdr *udp = (struct udphdr *)(buf);
+	const struct vxlan_metadata *md;
 	struct vxlanhdr *vxh;
 
-	if (tun_key->tun_flags & TUNNEL_VXLAN_OPT)
+	if ((tun_key->tun_flags & TUNNEL_VXLAN_OPT) &&
+	    e->tun_info->options_len != sizeof(*md))
 		return -EOPNOTSUPP;
 	vxh = (struct vxlanhdr *)((char *)udp + sizeof(struct udphdr));
 	*ip_proto = IPPROTO_UDP;
@@ -96,6 +99,57 @@ static int mlx5e_gen_ip_tunnel_header_vxlan(char buf[],
 	udp->dest = tun_key->tp_dst;
 	vxh->vx_flags = VXLAN_HF_VNI;
 	vxh->vx_vni = vxlan_vni_field(tun_id);
+	if (tun_key->tun_flags & TUNNEL_VXLAN_OPT) {
+		md = ip_tunnel_info_opts(e->tun_info);
+		vxlan_build_gbp_hdr(vxh, md);
+	}
+
+	return 0;
+}
+
+static int mlx5e_tc_tun_parse_vxlan_gbp_option(struct mlx5e_priv *priv,
+					       struct mlx5_flow_spec *spec,
+					       struct flow_cls_offload *f)
+{
+	struct flow_rule *rule = flow_cls_offload_flow_rule(f);
+	struct netlink_ext_ack *extack = f->common.extack;
+	struct flow_match_enc_opts enc_opts;
+	void *misc5_c, *misc5_v;
+	u32 *gbp, *gbp_mask;
+
+	flow_rule_match_enc_opts(rule, &enc_opts);
+
+	if (memchr_inv(&enc_opts.mask->data, 0, sizeof(enc_opts.mask->data)) &&
+	    !MLX5_CAP_ESW_FT_FIELD_SUPPORT_2(priv->mdev, tunnel_header_0_1)) {
+		NL_SET_ERR_MSG_MOD(extack, "Matching on VxLAN GBP is not supported");
+		return -EOPNOTSUPP;
+	}
+
+	if (enc_opts.key->dst_opt_type != TUNNEL_VXLAN_OPT) {
+		NL_SET_ERR_MSG_MOD(extack, "Wrong VxLAN option type: not GBP");
+		return -EOPNOTSUPP;
+	}
+
+	if (enc_opts.key->len != sizeof(*gbp) ||
+	    enc_opts.mask->len != sizeof(*gbp_mask)) {
+		NL_SET_ERR_MSG_MOD(extack, "VxLAN GBP option/mask len is not 32 bits");
+		return -EINVAL;
+	}
+
+	gbp = (u32 *)&enc_opts.key->data[0];
+	gbp_mask = (u32 *)&enc_opts.mask->data[0];
+
+	if (*gbp_mask & ~VXLAN_GBP_MASK) {
+		NL_SET_ERR_MSG_FMT_MOD(extack, "Wrong VxLAN GBP mask(0x%08X)\n", *gbp_mask);
+		return -EINVAL;
+	}
+
+	misc5_c = MLX5_ADDR_OF(fte_match_param, spec->match_criteria, misc_parameters_5);
+	misc5_v = MLX5_ADDR_OF(fte_match_param, spec->match_value, misc_parameters_5);
+	MLX5_SET(fte_match_set_misc5, misc5_c, tunnel_header_0, *gbp_mask);
+	MLX5_SET(fte_match_set_misc5, misc5_v, tunnel_header_0, *gbp);
+
+	spec->match_criteria_enable |= MLX5_MATCH_MISC_PARAMETERS_5;
 
 	return 0;
 }
@@ -122,6 +176,14 @@ static int mlx5e_tc_tun_parse_vxlan(struct mlx5e_priv *priv,
 	if (!enc_keyid.mask->keyid)
 		return 0;
 
+	if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_ENC_OPTS)) {
+		int err;
+
+		err = mlx5e_tc_tun_parse_vxlan_gbp_option(priv, spec, f);
+		if (err)
+			return err;
+	}
+
 	/* match on VNI is required */
 
 	if (!MLX5_CAP_ESW_FLOWTABLE_FDB(priv->mdev,
@@ -143,6 +205,12 @@ static int mlx5e_tc_tun_parse_vxlan(struct mlx5e_priv *priv,
 	return 0;
 }
 
+static bool mlx5e_tc_tun_encap_info_equal_vxlan(struct mlx5e_encap_key *a,
+						struct mlx5e_encap_key *b)
+{
+	return mlx5e_tc_tun_encap_info_equal_options(a, b, TUNNEL_VXLAN_OPT);
+}
+
 static int mlx5e_tc_tun_get_remote_ifindex(struct net_device *mirred_dev)
 {
 	const struct vxlan_dev *vxlan = netdev_priv(mirred_dev);
@@ -160,6 +228,6 @@ struct mlx5e_tc_tunnel vxlan_tunnel = {
 	.generate_ip_tun_hdr = mlx5e_gen_ip_tunnel_header_vxlan,
 	.parse_udp_ports = mlx5e_tc_tun_parse_udp_ports_vxlan,
 	.parse_tunnel = mlx5e_tc_tun_parse_vxlan,
-	.encap_info_equal = mlx5e_tc_tun_encap_info_equal_generic,
+	.encap_info_equal = mlx5e_tc_tun_encap_info_equal_vxlan,
 	.get_remote_ifindex = mlx5e_tc_tun_get_remote_ifindex,
 };
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
index b9c2f67d3794..47381e949f1f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
@@ -65,13 +65,11 @@ int mlx5e_napi_poll(struct
napi_struct *napi, int budget); int mlx5e_poll_ico_cq(struct mlx5e_cq *cq); /* RX */ -void mlx5e_page_dma_unmap(struct mlx5e_rq *rq, struct page *page); -void mlx5e_page_release_dynamic(struct mlx5e_rq *rq, struct page *page, bool recycle); INDIRECT_CALLABLE_DECLARE(bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq)); INDIRECT_CALLABLE_DECLARE(bool mlx5e_post_rx_mpwqes(struct mlx5e_rq *rq)); int mlx5e_poll_rx_cq(struct mlx5e_cq *cq, int budget); void mlx5e_free_rx_descs(struct mlx5e_rq *rq); -void mlx5e_free_rx_in_progress_descs(struct mlx5e_rq *rq); +void mlx5e_free_rx_missing_descs(struct mlx5e_rq *rq); static inline bool mlx5e_rx_hw_stamp(struct hwtstamp_config *config) { @@ -79,6 +77,19 @@ static inline bool mlx5e_rx_hw_stamp(struct hwtstamp_config *config) } /* TX */ +struct mlx5e_xmit_data { + dma_addr_t dma_addr; + void *data; + u32 len : 31; + u32 has_frags : 1; +}; + +struct mlx5e_xmit_data_frags { + struct mlx5e_xmit_data xd; + struct skb_shared_info *sinfo; + dma_addr_t *dma_arr; +}; + netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev); bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget); void mlx5e_free_txqsq_descs(struct mlx5e_txqsq *sq); @@ -86,7 +97,7 @@ void mlx5e_free_txqsq_descs(struct mlx5e_txqsq *sq); static inline bool mlx5e_skb_fifo_has_room(struct mlx5e_skb_fifo *fifo) { - return (u16)(*fifo->pc - *fifo->cc) < fifo->mask; + return (u16)(*fifo->pc - *fifo->cc) <= fifo->mask; } static inline bool @@ -489,7 +500,7 @@ static inline bool mlx5e_icosq_can_post_wqe(struct mlx5e_icosq *sq, u16 wqe_size static inline struct mlx5e_mpw_info *mlx5e_get_mpw_info(struct mlx5e_rq *rq, int i) { - size_t isz = struct_size(rq->mpwqe.info, alloc_units, rq->mpwqe.pages_per_wqe); + size_t isz = struct_size(rq->mpwqe.info, alloc_units.frag_pages, rq->mpwqe.pages_per_wqe); return (struct mlx5e_mpw_info *)((char *)rq->mpwqe.info + array_size(i, isz)); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c 
b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c index d9d3b9e1f15a..f0e6095809fa 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c @@ -61,9 +61,8 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq, struct xdp_buff *xdp) { struct page *page = virt_to_page(xdp->data); - struct skb_shared_info *sinfo = NULL; - struct mlx5e_xmit_data xdptxd; - struct mlx5e_xdp_info xdpi; + struct mlx5e_xmit_data_frags xdptxdf = {}; + struct mlx5e_xmit_data *xdptxd; struct xdp_frame *xdpf; dma_addr_t dma_addr; int i; @@ -72,8 +71,10 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq, if (unlikely(!xdpf)) return false; - xdptxd.data = xdpf->data; - xdptxd.len = xdpf->len; + xdptxd = &xdptxdf.xd; + xdptxd->data = xdpf->data; + xdptxd->len = xdpf->len; + xdptxd->has_frags = xdp_frame_has_frags(xdpf); if (xdp->rxq->mem.type == MEM_TYPE_XSK_BUFF_POOL) { /* The xdp_buff was in the UMEM and was copied into a newly @@ -88,24 +89,29 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq, */ __set_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags); /* non-atomic */ - xdpi.mode = MLX5E_XDP_XMIT_MODE_FRAME; + if (unlikely(xdptxd->has_frags)) + return false; - dma_addr = dma_map_single(sq->pdev, xdptxd.data, xdptxd.len, + dma_addr = dma_map_single(sq->pdev, xdptxd->data, xdptxd->len, DMA_TO_DEVICE); if (dma_mapping_error(sq->pdev, dma_addr)) { xdp_return_frame(xdpf); return false; } - xdptxd.dma_addr = dma_addr; - xdpi.frame.xdpf = xdpf; - xdpi.frame.dma_addr = dma_addr; + xdptxd->dma_addr = dma_addr; if (unlikely(!INDIRECT_CALL_2(sq->xmit_xdp_frame, mlx5e_xmit_xdp_frame_mpwqe, - mlx5e_xmit_xdp_frame, sq, &xdptxd, NULL, 0))) + mlx5e_xmit_xdp_frame, sq, xdptxd, 0))) return false; - mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, &xdpi); + /* xmit_mode == MLX5E_XDP_XMIT_MODE_FRAME */ + mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, + (union mlx5e_xdp_info) { .mode = MLX5E_XDP_XMIT_MODE_FRAME }); + 
mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, + (union mlx5e_xdp_info) { .frame.xdpf = xdpf }); + mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, + (union mlx5e_xdp_info) { .frame.dma_addr = dma_addr }); return true; } @@ -115,17 +121,15 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq, * mode. */ - xdpi.mode = MLX5E_XDP_XMIT_MODE_PAGE; - xdpi.page.rq = rq; - dma_addr = page_pool_get_dma_addr(page) + (xdpf->data - (void *)xdpf); - dma_sync_single_for_device(sq->pdev, dma_addr, xdptxd.len, DMA_BIDIRECTIONAL); + dma_sync_single_for_device(sq->pdev, dma_addr, xdptxd->len, DMA_BIDIRECTIONAL); - if (unlikely(xdp_frame_has_frags(xdpf))) { - sinfo = xdp_get_shared_info_from_frame(xdpf); + if (xdptxd->has_frags) { + xdptxdf.sinfo = xdp_get_shared_info_from_frame(xdpf); + xdptxdf.dma_arr = NULL; - for (i = 0; i < sinfo->nr_frags; i++) { - skb_frag_t *frag = &sinfo->frags[i]; + for (i = 0; i < xdptxdf.sinfo->nr_frags; i++) { + skb_frag_t *frag = &xdptxdf.sinfo->frags[i]; dma_addr_t addr; u32 len; @@ -137,22 +141,34 @@ mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_rq *rq, } } - xdptxd.dma_addr = dma_addr; + xdptxd->dma_addr = dma_addr; if (unlikely(!INDIRECT_CALL_2(sq->xmit_xdp_frame, mlx5e_xmit_xdp_frame_mpwqe, - mlx5e_xmit_xdp_frame, sq, &xdptxd, sinfo, 0))) + mlx5e_xmit_xdp_frame, sq, xdptxd, 0))) return false; - xdpi.page.page = page; - mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, &xdpi); - - if (unlikely(xdp_frame_has_frags(xdpf))) { - for (i = 0; i < sinfo->nr_frags; i++) { - skb_frag_t *frag = &sinfo->frags[i]; - - xdpi.page.page = skb_frag_page(frag); - mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, &xdpi); + /* xmit_mode == MLX5E_XDP_XMIT_MODE_PAGE */ + mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, + (union mlx5e_xdp_info) { .mode = MLX5E_XDP_XMIT_MODE_PAGE }); + + if (xdptxd->has_frags) { + mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, + (union mlx5e_xdp_info) + { .page.num = 1 + xdptxdf.sinfo->nr_frags }); + mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, + (union mlx5e_xdp_info) { 
.page.page = page }); + for (i = 0; i < xdptxdf.sinfo->nr_frags; i++) { + skb_frag_t *frag = &xdptxdf.sinfo->frags[i]; + + mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, + (union mlx5e_xdp_info) + { .page.page = skb_frag_page(frag) }); } + } else { + mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, + (union mlx5e_xdp_info) { .page.num = 1 }); + mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, + (union mlx5e_xdp_info) { .page.page = page }); } return true; @@ -268,8 +284,6 @@ bool mlx5e_xdp_handle(struct mlx5e_rq *rq, goto xdp_abort; __set_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags); __set_bit(MLX5E_RQ_FLAG_XDP_REDIRECT, rq->flags); - if (xdp->rxq->mem.type != MEM_TYPE_XSK_BUFF_POOL) - mlx5e_page_dma_unmap(rq, virt_to_page(xdp->data)); rq->stats->xdp_redirect++; return true; default: @@ -383,26 +397,43 @@ INDIRECT_CALLABLE_SCOPE int mlx5e_xmit_xdp_frame_check_mpwqe(struct mlx5e_xdpsq INDIRECT_CALLABLE_SCOPE bool mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, - struct skb_shared_info *sinfo, int check_result); + int check_result); INDIRECT_CALLABLE_SCOPE bool mlx5e_xmit_xdp_frame_mpwqe(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, - struct skb_shared_info *sinfo, int check_result) + int check_result) { struct mlx5e_tx_mpwqe *session = &sq->mpwqe; struct mlx5e_xdpsq_stats *stats = sq->stats; + struct mlx5e_xmit_data *p = xdptxd; + struct mlx5e_xmit_data tmp; - if (unlikely(sinfo)) { - /* MPWQE is enabled, but a multi-buffer packet is queued for - * transmission. MPWQE can't send fragmented packets, so close - * the current session and fall back to a regular WQE. - */ - if (unlikely(sq->mpwqe.wqe)) - mlx5e_xdp_mpwqe_complete(sq); - return mlx5e_xmit_xdp_frame(sq, xdptxd, sinfo, 0); + if (xdptxd->has_frags) { + struct mlx5e_xmit_data_frags *xdptxdf = + container_of(xdptxd, struct mlx5e_xmit_data_frags, xd); + + if (!!xdptxd->len + xdptxdf->sinfo->nr_frags > 1) { + /* MPWQE is enabled, but a multi-buffer packet is queued for + * transmission. 
MPWQE can't send fragmented packets, so close + * the current session and fall back to a regular WQE. + */ + if (unlikely(sq->mpwqe.wqe)) + mlx5e_xdp_mpwqe_complete(sq); + return mlx5e_xmit_xdp_frame(sq, xdptxd, 0); + } + if (!xdptxd->len) { + skb_frag_t *frag = &xdptxdf->sinfo->frags[0]; + + tmp.data = skb_frag_address(frag); + tmp.len = skb_frag_size(frag); + tmp.dma_addr = xdptxdf->dma_arr ? xdptxdf->dma_arr[0] : + page_pool_get_dma_addr(skb_frag_page(frag)) + + skb_frag_off(frag); + p = &tmp; + } } - if (unlikely(xdptxd->len > sq->hw_mtu)) { + if (unlikely(p->len > sq->hw_mtu)) { stats->err++; return false; } @@ -420,7 +451,7 @@ mlx5e_xmit_xdp_frame_mpwqe(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptx mlx5e_xdp_mpwqe_session_start(sq); } - mlx5e_xdp_mpwqe_add_dseg(sq, xdptxd, stats); + mlx5e_xdp_mpwqe_add_dseg(sq, p, stats); if (unlikely(mlx5e_xdp_mpwqe_is_full(session, sq->max_sq_mpw_wqebbs))) mlx5e_xdp_mpwqe_complete(sq); @@ -448,8 +479,10 @@ INDIRECT_CALLABLE_SCOPE int mlx5e_xmit_xdp_frame_check(struct mlx5e_xdpsq *sq) INDIRECT_CALLABLE_SCOPE bool mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, - struct skb_shared_info *sinfo, int check_result) + int check_result) { + struct mlx5e_xmit_data_frags *xdptxdf = + container_of(xdptxd, struct mlx5e_xmit_data_frags, xd); struct mlx5_wq_cyc *wq = &sq->wq; struct mlx5_wqe_ctrl_seg *cseg; struct mlx5_wqe_data_seg *dseg; @@ -461,26 +494,34 @@ mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, u16 ds_cnt, inline_hdr_sz; u8 num_wqebbs = 1; int num_frags = 0; + bool inline_ok; + bool linear; u16 pi; struct mlx5e_xdpsq_stats *stats = sq->stats; - if (unlikely(dma_len < MLX5E_XDP_MIN_INLINE || sq->hw_mtu < dma_len)) { + inline_ok = sq->min_inline_mode == MLX5_INLINE_MODE_NONE || + dma_len >= MLX5E_XDP_MIN_INLINE; + + if (unlikely(!inline_ok || sq->hw_mtu < dma_len)) { stats->err++; return false; } - ds_cnt = MLX5E_TX_WQE_EMPTY_DS_COUNT + 1; + inline_hdr_sz = 0; 
if (sq->min_inline_mode != MLX5_INLINE_MODE_NONE) - ds_cnt++; + inline_hdr_sz = MLX5E_XDP_MIN_INLINE; + + linear = !!(dma_len - inline_hdr_sz); + ds_cnt = MLX5E_TX_WQE_EMPTY_DS_COUNT + linear + !!inline_hdr_sz; /* check_result must be 0 if sinfo is passed. */ if (!check_result) { int stop_room = 1; - if (unlikely(sinfo)) { - ds_cnt += sinfo->nr_frags; - num_frags = sinfo->nr_frags; + if (xdptxd->has_frags) { + ds_cnt += xdptxdf->sinfo->nr_frags; + num_frags = xdptxdf->sinfo->nr_frags; num_wqebbs = DIV_ROUND_UP(ds_cnt, MLX5_SEND_WQEBB_NUM_DS); /* Assuming MLX5_CAP_GEN(mdev, max_wqe_sz_sq) is big * enough to hold all fragments. @@ -501,53 +542,53 @@ mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, eseg = &wqe->eth; dseg = wqe->data; - inline_hdr_sz = 0; - /* copy the inline part if required */ - if (sq->min_inline_mode != MLX5_INLINE_MODE_NONE) { + if (inline_hdr_sz) { memcpy(eseg->inline_hdr.start, xdptxd->data, sizeof(eseg->inline_hdr.start)); memcpy(dseg, xdptxd->data + sizeof(eseg->inline_hdr.start), - MLX5E_XDP_MIN_INLINE - sizeof(eseg->inline_hdr.start)); - dma_len -= MLX5E_XDP_MIN_INLINE; - dma_addr += MLX5E_XDP_MIN_INLINE; - inline_hdr_sz = MLX5E_XDP_MIN_INLINE; + inline_hdr_sz - sizeof(eseg->inline_hdr.start)); + dma_len -= inline_hdr_sz; + dma_addr += inline_hdr_sz; dseg++; } /* write the dma part */ - dseg->addr = cpu_to_be64(dma_addr); - dseg->byte_count = cpu_to_be32(dma_len); + if (linear) { + dseg->addr = cpu_to_be64(dma_addr); + dseg->byte_count = cpu_to_be32(dma_len); + dseg->lkey = sq->mkey_be; + dseg++; + } cseg->opmod_idx_opcode = cpu_to_be32((sq->pc << 8) | MLX5_OPCODE_SEND); - if (unlikely(test_bit(MLX5E_SQ_STATE_XDP_MULTIBUF, &sq->state))) { - u8 num_pkts = 1 + num_frags; + if (test_bit(MLX5E_SQ_STATE_XDP_MULTIBUF, &sq->state)) { int i; memset(&cseg->trailer, 0, sizeof(cseg->trailer)); memset(eseg, 0, sizeof(*eseg) - sizeof(eseg->trailer)); eseg->inline_hdr.sz = cpu_to_be16(inline_hdr_sz); - dseg->lkey = sq->mkey_be; 
for (i = 0; i < num_frags; i++) { - skb_frag_t *frag = &sinfo->frags[i]; + skb_frag_t *frag = &xdptxdf->sinfo->frags[i]; dma_addr_t addr; - addr = page_pool_get_dma_addr(skb_frag_page(frag)) + + addr = xdptxdf->dma_arr ? xdptxdf->dma_arr[i] : + page_pool_get_dma_addr(skb_frag_page(frag)) + skb_frag_off(frag); - dseg++; dseg->addr = cpu_to_be64(addr); dseg->byte_count = cpu_to_be32(skb_frag_size(frag)); dseg->lkey = sq->mkey_be; + dseg++; } cseg->qpn_ds = cpu_to_be32((sq->sqn << 8) | ds_cnt); sq->db.wqe_info[pi] = (struct mlx5e_xdp_wqe_info) { .num_wqebbs = num_wqebbs, - .num_pkts = num_pkts, + .num_pkts = 1, }; sq->pc += num_wqebbs; @@ -566,26 +607,67 @@ mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq, struct mlx5e_xmit_data *xdptxd, static void mlx5e_free_xdpsq_desc(struct mlx5e_xdpsq *sq, struct mlx5e_xdp_wqe_info *wi, u32 *xsk_frames, - bool recycle, struct xdp_frame_bulk *bq) { struct mlx5e_xdp_info_fifo *xdpi_fifo = &sq->db.xdpi_fifo; u16 i; for (i = 0; i < wi->num_pkts; i++) { - struct mlx5e_xdp_info xdpi = mlx5e_xdpi_fifo_pop(xdpi_fifo); + union mlx5e_xdp_info xdpi = mlx5e_xdpi_fifo_pop(xdpi_fifo); switch (xdpi.mode) { - case MLX5E_XDP_XMIT_MODE_FRAME: + case MLX5E_XDP_XMIT_MODE_FRAME: { /* XDP_TX from the XSK RQ and XDP_REDIRECT */ - dma_unmap_single(sq->pdev, xdpi.frame.dma_addr, - xdpi.frame.xdpf->len, DMA_TO_DEVICE); - xdp_return_frame_bulk(xdpi.frame.xdpf, bq); + struct xdp_frame *xdpf; + dma_addr_t dma_addr; + + xdpi = mlx5e_xdpi_fifo_pop(xdpi_fifo); + xdpf = xdpi.frame.xdpf; + xdpi = mlx5e_xdpi_fifo_pop(xdpi_fifo); + dma_addr = xdpi.frame.dma_addr; + + dma_unmap_single(sq->pdev, dma_addr, + xdpf->len, DMA_TO_DEVICE); + if (xdp_frame_has_frags(xdpf)) { + struct skb_shared_info *sinfo; + int j; + + sinfo = xdp_get_shared_info_from_frame(xdpf); + for (j = 0; j < sinfo->nr_frags; j++) { + skb_frag_t *frag = &sinfo->frags[j]; + + xdpi = mlx5e_xdpi_fifo_pop(xdpi_fifo); + dma_addr = xdpi.frame.dma_addr; + + dma_unmap_single(sq->pdev, dma_addr, + 
skb_frag_size(frag), DMA_TO_DEVICE); + } + } + xdp_return_frame_bulk(xdpf, bq); break; - case MLX5E_XDP_XMIT_MODE_PAGE: + } + case MLX5E_XDP_XMIT_MODE_PAGE: { /* XDP_TX from the regular RQ */ - mlx5e_page_release_dynamic(xdpi.page.rq, xdpi.page.page, recycle); + u8 num, n = 0; + + xdpi = mlx5e_xdpi_fifo_pop(xdpi_fifo); + num = xdpi.page.num; + + do { + struct page *page; + + xdpi = mlx5e_xdpi_fifo_pop(xdpi_fifo); + page = xdpi.page.page; + + /* No need to check ((page->pp_magic & ~0x3UL) == PP_SIGNATURE) + * as we know this is a page_pool page. + */ + page_pool_put_defragged_page(page->pp, + page, -1, true); + } while (++n < num); + break; + } case MLX5E_XDP_XMIT_MODE_XSK: /* AF_XDP send */ (*xsk_frames)++; @@ -638,7 +720,7 @@ bool mlx5e_poll_xdpsq_cq(struct mlx5e_cq *cq) sqcc += wi->num_wqebbs; - mlx5e_free_xdpsq_desc(sq, wi, &xsk_frames, true, &bq); + mlx5e_free_xdpsq_desc(sq, wi, &xsk_frames, &bq); } while (!last_wqe); if (unlikely(get_cqe_opcode(cqe) != MLX5_CQE_REQ)) { @@ -685,7 +767,7 @@ void mlx5e_free_xdpsq_descs(struct mlx5e_xdpsq *sq) sq->cc += wi->num_wqebbs; - mlx5e_free_xdpsq_desc(sq, wi, &xsk_frames, false, &bq); + mlx5e_free_xdpsq_desc(sq, wi, &xsk_frames, &bq); } xdp_flush_frame_bulk(&bq); @@ -719,34 +801,79 @@ int mlx5e_xdp_xmit(struct net_device *dev, int n, struct xdp_frame **frames, sq = &priv->channels.c[sq_num]->xdpsq; for (i = 0; i < n; i++) { + struct mlx5e_xmit_data_frags xdptxdf = {}; struct xdp_frame *xdpf = frames[i]; - struct mlx5e_xmit_data xdptxd; - struct mlx5e_xdp_info xdpi; + dma_addr_t dma_arr[MAX_SKB_FRAGS]; + struct mlx5e_xmit_data *xdptxd; bool ret; - xdptxd.data = xdpf->data; - xdptxd.len = xdpf->len; - xdptxd.dma_addr = dma_map_single(sq->pdev, xdptxd.data, - xdptxd.len, DMA_TO_DEVICE); + xdptxd = &xdptxdf.xd; + xdptxd->data = xdpf->data; + xdptxd->len = xdpf->len; + xdptxd->has_frags = xdp_frame_has_frags(xdpf); + xdptxd->dma_addr = dma_map_single(sq->pdev, xdptxd->data, + xdptxd->len, DMA_TO_DEVICE); - if 
(unlikely(dma_mapping_error(sq->pdev, xdptxd.dma_addr))) + if (unlikely(dma_mapping_error(sq->pdev, xdptxd->dma_addr))) break; - xdpi.mode = MLX5E_XDP_XMIT_MODE_FRAME; - xdpi.frame.xdpf = xdpf; - xdpi.frame.dma_addr = xdptxd.dma_addr; + if (xdptxd->has_frags) { + int j; + + xdptxdf.sinfo = xdp_get_shared_info_from_frame(xdpf); + xdptxdf.dma_arr = dma_arr; + for (j = 0; j < xdptxdf.sinfo->nr_frags; j++) { + skb_frag_t *frag = &xdptxdf.sinfo->frags[j]; + + dma_arr[j] = dma_map_single(sq->pdev, skb_frag_address(frag), + skb_frag_size(frag), DMA_TO_DEVICE); + + if (!dma_mapping_error(sq->pdev, dma_arr[j])) + continue; + /* mapping error */ + while (--j >= 0) + dma_unmap_single(sq->pdev, dma_arr[j], + skb_frag_size(&xdptxdf.sinfo->frags[j]), + DMA_TO_DEVICE); + goto out; + } + } ret = INDIRECT_CALL_2(sq->xmit_xdp_frame, mlx5e_xmit_xdp_frame_mpwqe, - mlx5e_xmit_xdp_frame, sq, &xdptxd, NULL, 0); + mlx5e_xmit_xdp_frame, sq, xdptxd, 0); if (unlikely(!ret)) { - dma_unmap_single(sq->pdev, xdptxd.dma_addr, - xdptxd.len, DMA_TO_DEVICE); + int j; + + dma_unmap_single(sq->pdev, xdptxd->dma_addr, + xdptxd->len, DMA_TO_DEVICE); + if (!xdptxd->has_frags) + break; + for (j = 0; j < xdptxdf.sinfo->nr_frags; j++) + dma_unmap_single(sq->pdev, dma_arr[j], + skb_frag_size(&xdptxdf.sinfo->frags[j]), + DMA_TO_DEVICE); break; } - mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, &xdpi); + + /* xmit_mode == MLX5E_XDP_XMIT_MODE_FRAME */ + mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, + (union mlx5e_xdp_info) { .mode = MLX5E_XDP_XMIT_MODE_FRAME }); + mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, + (union mlx5e_xdp_info) { .frame.xdpf = xdpf }); + mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, + (union mlx5e_xdp_info) { .frame.dma_addr = xdptxd->dma_addr }); + if (xdptxd->has_frags) { + int j; + + for (j = 0; j < xdptxdf.sinfo->nr_frags; j++) + mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, + (union mlx5e_xdp_info) + { .frame.dma_addr = dma_arr[j] }); + } nxmit++; } +out: if (flags & XDP_XMIT_FLUSH) { if (sq->mpwqe.wqe) 
 			mlx5e_xdp_mpwqe_complete(sq);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
index 10bcfa6f88c1..9e8e6184f9e4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h
@@ -50,6 +50,53 @@ struct mlx5e_xdp_buff {
 	struct mlx5e_rq *rq;
 };
 
+/* XDP packets can be transmitted in different ways. On completion, we need to
+ * distinguish between them to clean up things in a proper way.
+ */
+enum mlx5e_xdp_xmit_mode {
+	/* An xdp_frame was transmitted due to either XDP_REDIRECT from another
+	 * device or XDP_TX from an XSK RQ. The frame has to be unmapped and
+	 * returned.
+	 */
+	MLX5E_XDP_XMIT_MODE_FRAME,
+
+	/* The xdp_frame was created in place as a result of XDP_TX from a
+	 * regular RQ. No DMA remapping happened, and the page belongs to us.
+	 */
+	MLX5E_XDP_XMIT_MODE_PAGE,
+
+	/* No xdp_frame was created at all, the transmit happened from a UMEM
+	 * page. The UMEM Completion Ring producer pointer has to be increased.
+	 */
+	MLX5E_XDP_XMIT_MODE_XSK,
+};
+
+/* xmit_mode entry is pushed to the fifo per packet, followed by multiple
+ * entries, as follows:
+ *
+ * MLX5E_XDP_XMIT_MODE_FRAME:
+ *    xdpf, dma_addr_1, dma_addr_2, ... , dma_addr_num.
+ *    'num' is derived from xdpf.
+ *
+ * MLX5E_XDP_XMIT_MODE_PAGE:
+ *    num, page_1, page_2, ... , page_num.
+ *
+ * MLX5E_XDP_XMIT_MODE_XSK:
+ *    none.
+ */
+union mlx5e_xdp_info {
+	enum mlx5e_xdp_xmit_mode mode;
+	union {
+		struct xdp_frame *xdpf;
+		dma_addr_t dma_addr;
+	} frame;
+	union {
+		struct mlx5e_rq *rq;
+		u8 num;
+		struct page *page;
+	} page;
+};
+
 struct mlx5e_xsk_param;
 int mlx5e_xdp_max_mtu(struct mlx5e_params *params, struct mlx5e_xsk_param *xsk);
 bool mlx5e_xdp_handle(struct mlx5e_rq *rq,
@@ -66,11 +113,9 @@ extern const struct xdp_metadata_ops mlx5e_xdp_metadata_ops;
 
 INDIRECT_CALLABLE_DECLARE(bool mlx5e_xmit_xdp_frame_mpwqe(struct mlx5e_xdpsq *sq,
 							  struct mlx5e_xmit_data *xdptxd,
-							  struct skb_shared_info *sinfo,
 							  int check_result));
 INDIRECT_CALLABLE_DECLARE(bool mlx5e_xmit_xdp_frame(struct mlx5e_xdpsq *sq,
 						    struct mlx5e_xmit_data *xdptxd,
-						    struct skb_shared_info *sinfo,
 						    int check_result));
 INDIRECT_CALLABLE_DECLARE(int mlx5e_xmit_xdp_frame_check_mpwqe(struct mlx5e_xdpsq *sq));
 INDIRECT_CALLABLE_DECLARE(int mlx5e_xmit_xdp_frame_check(struct mlx5e_xdpsq *sq));
@@ -179,14 +224,14 @@ mlx5e_xdp_mpwqe_add_dseg(struct mlx5e_xdpsq *sq,
 
 static inline void
 mlx5e_xdpi_fifo_push(struct mlx5e_xdp_info_fifo *fifo,
-		     struct mlx5e_xdp_info *xi)
+		     union mlx5e_xdp_info xi)
 {
 	u32 i = (*fifo->pc)++ & fifo->mask;
 
-	fifo->xi[i] = *xi;
+	fifo->xi[i] = xi;
 }
 
-static inline struct mlx5e_xdp_info
+static inline union mlx5e_xdp_info
 mlx5e_xdpi_fifo_pop(struct mlx5e_xdp_info_fifo *fifo)
 {
 	return fifo->xi[(*fifo->cc)++ & fifo->mask];
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
index fab787600459..d97e6df66f45 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c
@@ -22,6 +22,7 @@ int mlx5e_xsk_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
 	struct mlx5e_icosq *icosq = rq->icosq;
 	struct mlx5_wq_cyc *wq = &icosq->wq;
 	struct mlx5e_umr_wqe *umr_wqe;
+	struct xdp_buff **xsk_buffs;
 	int batch, i;
 	u32 offset; /* 17-bit value with MTT.
*/ u16 pi; @@ -29,9 +30,9 @@ int mlx5e_xsk_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) if (unlikely(!xsk_buff_can_alloc(rq->xsk_pool, rq->mpwqe.pages_per_wqe))) goto err; - BUILD_BUG_ON(sizeof(wi->alloc_units[0]) != sizeof(wi->alloc_units[0].xsk)); XSK_CHECK_PRIV_TYPE(struct mlx5e_xdp_buff); - batch = xsk_buff_alloc_batch(rq->xsk_pool, (struct xdp_buff **)wi->alloc_units, + xsk_buffs = (struct xdp_buff **)wi->alloc_units.xsk_buffs; + batch = xsk_buff_alloc_batch(rq->xsk_pool, xsk_buffs, rq->mpwqe.pages_per_wqe); /* If batch < pages_per_wqe, either: @@ -41,8 +42,8 @@ int mlx5e_xsk_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) * the first error, which will mean there are no more valid descriptors. */ for (; batch < rq->mpwqe.pages_per_wqe; batch++) { - wi->alloc_units[batch].xsk = xsk_buff_alloc(rq->xsk_pool); - if (unlikely(!wi->alloc_units[batch].xsk)) + xsk_buffs[batch] = xsk_buff_alloc(rq->xsk_pool); + if (unlikely(!xsk_buffs[batch])) goto err_reuse_batch; } @@ -52,8 +53,8 @@ int mlx5e_xsk_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) if (likely(rq->mpwqe.umr_mode == MLX5E_MPWRQ_UMR_MODE_ALIGNED)) { for (i = 0; i < batch; i++) { - struct mlx5e_xdp_buff *mxbuf = xsk_buff_to_mxbuf(wi->alloc_units[i].xsk); - dma_addr_t addr = xsk_buff_xdp_get_frame_dma(wi->alloc_units[i].xsk); + struct mlx5e_xdp_buff *mxbuf = xsk_buff_to_mxbuf(xsk_buffs[i]); + dma_addr_t addr = xsk_buff_xdp_get_frame_dma(xsk_buffs[i]); umr_wqe->inline_mtts[i] = (struct mlx5_mtt) { .ptag = cpu_to_be64(addr | MLX5_EN_WR), @@ -62,8 +63,8 @@ int mlx5e_xsk_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) } } else if (unlikely(rq->mpwqe.umr_mode == MLX5E_MPWRQ_UMR_MODE_UNALIGNED)) { for (i = 0; i < batch; i++) { - struct mlx5e_xdp_buff *mxbuf = xsk_buff_to_mxbuf(wi->alloc_units[i].xsk); - dma_addr_t addr = xsk_buff_xdp_get_frame_dma(wi->alloc_units[i].xsk); + struct mlx5e_xdp_buff *mxbuf = xsk_buff_to_mxbuf(xsk_buffs[i]); + dma_addr_t addr = xsk_buff_xdp_get_frame_dma(xsk_buffs[i]); umr_wqe->inline_ksms[i] = 
(struct mlx5_ksm) { .key = rq->mkey_be, @@ -75,8 +76,8 @@ int mlx5e_xsk_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) u32 mapping_size = 1 << (rq->mpwqe.page_shift - 2); for (i = 0; i < batch; i++) { - struct mlx5e_xdp_buff *mxbuf = xsk_buff_to_mxbuf(wi->alloc_units[i].xsk); - dma_addr_t addr = xsk_buff_xdp_get_frame_dma(wi->alloc_units[i].xsk); + struct mlx5e_xdp_buff *mxbuf = xsk_buff_to_mxbuf(xsk_buffs[i]); + dma_addr_t addr = xsk_buff_xdp_get_frame_dma(xsk_buffs[i]); umr_wqe->inline_ksms[i << 2] = (struct mlx5_ksm) { .key = rq->mkey_be, @@ -102,8 +103,8 @@ int mlx5e_xsk_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) __be32 frame_size = cpu_to_be32(rq->xsk_pool->chunk_size); for (i = 0; i < batch; i++) { - struct mlx5e_xdp_buff *mxbuf = xsk_buff_to_mxbuf(wi->alloc_units[i].xsk); - dma_addr_t addr = xsk_buff_xdp_get_frame_dma(wi->alloc_units[i].xsk); + struct mlx5e_xdp_buff *mxbuf = xsk_buff_to_mxbuf(xsk_buffs[i]); + dma_addr_t addr = xsk_buff_xdp_get_frame_dma(xsk_buffs[i]); umr_wqe->inline_klms[i << 1] = (struct mlx5_klm) { .key = rq->mkey_be, @@ -119,7 +120,7 @@ int mlx5e_xsk_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) } } - bitmap_zero(wi->xdp_xmit_bitmap, rq->mpwqe.pages_per_wqe); + bitmap_zero(wi->skip_release_bitmap, rq->mpwqe.pages_per_wqe); wi->consumed_strides = 0; umr_wqe->ctrl.opmod_idx_opcode = @@ -149,7 +150,7 @@ int mlx5e_xsk_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) err_reuse_batch: while (--batch >= 0) - xsk_buff_free(wi->alloc_units[batch].xsk); + xsk_buff_free(xsk_buffs[batch]); err: rq->stats->buff_alloc_err++; @@ -163,13 +164,10 @@ int mlx5e_xsk_alloc_rx_wqes_batched(struct mlx5e_rq *rq, u16 ix, int wqe_bulk) u32 contig, alloc; int i; - /* mlx5e_init_frags_partition creates a 1:1 mapping between - * rq->wqe.frags and rq->wqe.alloc_units, which allows us to - * allocate XDP buffers straight into alloc_units. + /* Each rq->wqe.frags->xskp is 1:1 mapped to an element inside the + * rq->wqe.alloc_units->xsk_buffs array allocated here. 
*/ - BUILD_BUG_ON(sizeof(rq->wqe.alloc_units[0]) != - sizeof(rq->wqe.alloc_units[0].xsk)); - buffs = (struct xdp_buff **)rq->wqe.alloc_units; + buffs = rq->wqe.alloc_units->xsk_buffs; contig = mlx5_wq_cyc_get_size(wq) - ix; if (wqe_bulk <= contig) { alloc = xsk_buff_alloc_batch(rq->xsk_pool, buffs + ix, wqe_bulk); @@ -189,8 +187,9 @@ int mlx5e_xsk_alloc_rx_wqes_batched(struct mlx5e_rq *rq, u16 ix, int wqe_bulk) /* Assumes log_num_frags == 0. */ frag = &rq->wqe.frags[j]; - addr = xsk_buff_xdp_get_frame_dma(frag->au->xsk); + addr = xsk_buff_xdp_get_frame_dma(*frag->xskp); wqe->data[0].addr = cpu_to_be64(addr + rq->buff.headroom); + frag->flags &= ~BIT(MLX5E_WQE_FRAG_SKIP_RELEASE); } return alloc; @@ -211,12 +210,13 @@ int mlx5e_xsk_alloc_rx_wqes(struct mlx5e_rq *rq, u16 ix, int wqe_bulk) /* Assumes log_num_frags == 0. */ frag = &rq->wqe.frags[j]; - frag->au->xsk = xsk_buff_alloc(rq->xsk_pool); - if (unlikely(!frag->au->xsk)) + *frag->xskp = xsk_buff_alloc(rq->xsk_pool); + if (unlikely(!*frag->xskp)) return i; - addr = xsk_buff_xdp_get_frame_dma(frag->au->xsk); + addr = xsk_buff_xdp_get_frame_dma(*frag->xskp); wqe->data[0].addr = cpu_to_be64(addr + rq->buff.headroom); + frag->flags &= ~BIT(MLX5E_WQE_FRAG_SKIP_RELEASE); } return wqe_bulk; @@ -251,7 +251,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, u32 head_offset, u32 page_idx) { - struct mlx5e_xdp_buff *mxbuf = xsk_buff_to_mxbuf(wi->alloc_units[page_idx].xsk); + struct mlx5e_xdp_buff *mxbuf = xsk_buff_to_mxbuf(wi->alloc_units.xsk_buffs[page_idx]); struct bpf_prog *prog; /* Check packet size. 
Note LRO doesn't use linear SKB */ @@ -291,7 +291,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, prog = rcu_dereference(rq->xdp_prog); if (likely(prog && mlx5e_xdp_handle(rq, prog, mxbuf))) { if (likely(__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags))) - __set_bit(page_idx, wi->xdp_xmit_bitmap); /* non-atomic */ + __set_bit(page_idx, wi->skip_release_bitmap); /* non-atomic */ return NULL; /* page/packet was consumed by XDP */ } @@ -306,7 +306,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe, u32 cqe_bcnt) { - struct mlx5e_xdp_buff *mxbuf = xsk_buff_to_mxbuf(wi->au->xsk); + struct mlx5e_xdp_buff *mxbuf = xsk_buff_to_mxbuf(*wi->xskp); struct bpf_prog *prog; /* wi->offset is not used in this function, because xdp->data and the diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c index 81a567e17264..ed279f450976 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c @@ -93,13 +93,19 @@ static int mlx5e_open_xsk_rq(struct mlx5e_channel *c, struct mlx5e_params *param struct mlx5e_rq_param *rq_params, struct xsk_buff_pool *pool, struct mlx5e_xsk_param *xsk) { + struct mlx5e_rq *xskrq = &c->xskrq; int err; - err = mlx5e_init_xsk_rq(c, params, pool, xsk, &c->xskrq); + err = mlx5e_init_xsk_rq(c, params, pool, xsk, xskrq); if (err) return err; - return mlx5e_open_rq(params, rq_params, xsk, cpu_to_node(c->cpu), &c->xskrq); + err = mlx5e_open_rq(params, rq_params, xsk, cpu_to_node(c->cpu), xskrq); + if (err) + return err; + + __set_bit(MLX5E_RQ_STATE_XSK, &xskrq->state); + return 0; } int mlx5e_open_xsk(struct mlx5e_priv *priv, struct mlx5e_params *params, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c index 367a9505ca4f..597f319d4770 100644 --- 
a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.c @@ -44,7 +44,7 @@ int mlx5e_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags) * same. */ static void mlx5e_xsk_tx_post_err(struct mlx5e_xdpsq *sq, - struct mlx5e_xdp_info *xdpi) + union mlx5e_xdp_info *xdpi) { u16 pi = mlx5_wq_cyc_ctr2ix(&sq->wq, sq->pc); struct mlx5e_xdp_wqe_info *wi = &sq->db.wqe_info[pi]; @@ -54,15 +54,14 @@ static void mlx5e_xsk_tx_post_err(struct mlx5e_xdpsq *sq, wi->num_pkts = 1; nopwqe = mlx5e_post_nop(&sq->wq, sq->sqn, &sq->pc); - mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, xdpi); + mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, *xdpi); sq->doorbell_cseg = &nopwqe->ctrl; } bool mlx5e_xsk_tx(struct mlx5e_xdpsq *sq, unsigned int budget) { struct xsk_buff_pool *pool = sq->xsk_pool; - struct mlx5e_xmit_data xdptxd; - struct mlx5e_xdp_info xdpi; + union mlx5e_xdp_info xdpi; bool work_done = true; bool flush = false; @@ -73,6 +72,7 @@ bool mlx5e_xsk_tx(struct mlx5e_xdpsq *sq, unsigned int budget) mlx5e_xmit_xdp_frame_check_mpwqe, mlx5e_xmit_xdp_frame_check, sq); + struct mlx5e_xmit_data xdptxd = {}; struct xdp_desc desc; bool ret; @@ -97,7 +97,7 @@ bool mlx5e_xsk_tx(struct mlx5e_xdpsq *sq, unsigned int budget) xsk_buff_raw_dma_sync_for_device(pool, xdptxd.dma_addr, xdptxd.len); ret = INDIRECT_CALL_2(sq->xmit_xdp_frame, mlx5e_xmit_xdp_frame_mpwqe, - mlx5e_xmit_xdp_frame, sq, &xdptxd, NULL, + mlx5e_xmit_xdp_frame, sq, &xdptxd, check_result); if (unlikely(!ret)) { if (sq->mpwqe.wqe) @@ -105,7 +105,7 @@ bool mlx5e_xsk_tx(struct mlx5e_xdpsq *sq, unsigned int budget) mlx5e_xsk_tx_post_err(sq, &xdpi); } else { - mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, &xdpi); + mlx5e_xdpi_fifo_push(&sq->db.xdpi_fifo, xdpi); } flush = true; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c index 7b0d3de0ec6c..55b38544422f 100644 --- 
a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c
@@ -35,11 +35,15 @@
 #include <crypto/aead.h>
 #include <linux/inetdevice.h>
 #include <linux/netdevice.h>
+#include <net/netevent.h>

 #include "en.h"
 #include "ipsec.h"
 #include "ipsec_rxtx.h"

+#define MLX5_IPSEC_RESCHED msecs_to_jiffies(1000)
+#define MLX5E_IPSEC_TUNNEL_SA XA_MARK_1
+
 static struct mlx5e_ipsec_sa_entry *to_ipsec_sa_entry(struct xfrm_state *x)
 {
 	return (struct mlx5e_ipsec_sa_entry *)x->xso.offload_handle;
@@ -50,27 +54,71 @@ static struct mlx5e_ipsec_pol_entry *to_ipsec_pol_entry(struct xfrm_policy *x)
 	return (struct mlx5e_ipsec_pol_entry *)x->xdo.offload_handle;
 }

+static void mlx5e_ipsec_handle_tx_limit(struct work_struct *_work)
+{
+	struct mlx5e_ipsec_dwork *dwork =
+		container_of(_work, struct mlx5e_ipsec_dwork, dwork.work);
+	struct mlx5e_ipsec_sa_entry *sa_entry = dwork->sa_entry;
+	struct xfrm_state *x = sa_entry->x;
+
+	spin_lock(&x->lock);
+	xfrm_state_check_expire(x);
+	if (x->km.state == XFRM_STATE_EXPIRED) {
+		sa_entry->attrs.drop = true;
+		mlx5e_accel_ipsec_fs_modify(sa_entry);
+	}
+	spin_unlock(&x->lock);
+
+	if (sa_entry->attrs.drop)
+		return;
+
+	queue_delayed_work(sa_entry->ipsec->wq, &dwork->dwork,
+			   MLX5_IPSEC_RESCHED);
+}
+
 static bool mlx5e_ipsec_update_esn_state(struct mlx5e_ipsec_sa_entry *sa_entry)
 {
-	struct xfrm_replay_state_esn *replay_esn;
+	struct xfrm_state *x = sa_entry->x;
 	u32 seq_bottom = 0;
+	u32 esn, esn_msb;
 	u8 overlap;

-	if (!(sa_entry->x->props.flags & XFRM_STATE_ESN)) {
-		sa_entry->esn_state.trigger = 0;
+	switch (x->xso.type) {
+	case XFRM_DEV_OFFLOAD_PACKET:
+		switch (x->xso.dir) {
+		case XFRM_DEV_OFFLOAD_IN:
+			esn = x->replay_esn->seq;
+			esn_msb = x->replay_esn->seq_hi;
+			break;
+		case XFRM_DEV_OFFLOAD_OUT:
+			esn = x->replay_esn->oseq;
+			esn_msb = x->replay_esn->oseq_hi;
+			break;
+		default:
+			WARN_ON(true);
+			return false;
+		}
+		break;
+	case XFRM_DEV_OFFLOAD_CRYPTO:
+		/* Already parsed by XFRM core */
+		esn = x->replay_esn->seq;
+		break;
+	default:
+		WARN_ON(true);
 		return false;
 	}

-	replay_esn = sa_entry->x->replay_esn;
-	if (replay_esn->seq >= replay_esn->replay_window)
-		seq_bottom = replay_esn->seq - replay_esn->replay_window + 1;
-
 	overlap = sa_entry->esn_state.overlap;

-	sa_entry->esn_state.esn = xfrm_replay_seqhi(sa_entry->x,
-						    htonl(seq_bottom));
+	if (esn >= x->replay_esn->replay_window)
+		seq_bottom = esn - x->replay_esn->replay_window + 1;
+
+	if (x->xso.type == XFRM_DEV_OFFLOAD_CRYPTO)
+		esn_msb = xfrm_replay_seqhi(x, htonl(seq_bottom));
+
+	sa_entry->esn_state.esn = esn;
+	sa_entry->esn_state.esn_msb = esn_msb;

-	sa_entry->esn_state.trigger = 1;
 	if (unlikely(overlap && seq_bottom < MLX5E_IPSEC_ESN_SCOPE_MID)) {
 		sa_entry->esn_state.overlap = 0;
 		return true;
@@ -87,25 +135,161 @@ static void mlx5e_ipsec_init_limits(struct mlx5e_ipsec_sa_entry *sa_entry,
 				    struct mlx5_accel_esp_xfrm_attrs *attrs)
 {
 	struct xfrm_state *x = sa_entry->x;
+	s64 start_value, n;

-	attrs->hard_packet_limit = x->lft.hard_packet_limit;
+	attrs->lft.hard_packet_limit = x->lft.hard_packet_limit;
+	attrs->lft.soft_packet_limit = x->lft.soft_packet_limit;
 	if (x->lft.soft_packet_limit == XFRM_INF)
 		return;

-	/* Hardware decrements hard_packet_limit counter through
-	 * the operation. While fires an event when soft_packet_limit
-	 * is reached. It emans that we need substitute the numbers
-	 * in order to properly count soft limit.
+	/* Compute hard limit initial value and number of rounds.
+	 *
+	 * The counting pattern of hardware counter goes:
+	 *                value  -> 2^31-1
+	 *      2^31  | (2^31-1) -> 2^31-1
+	 *      2^31  | (2^31-1) -> 2^31-1
+	 *      [..]
+	 *      2^31  | (2^31-1) -> 0
 	 *
-	 * As an example:
-	 * XFRM user sets soft limit is 2 and hard limit is 9 and
-	 * expects to see soft event after 2 packets and hard event
-	 * after 9 packets. In our case, the hard limit will be set
-	 * to 9 and soft limit is comparator to 7 so user gets the
-	 * soft event after 2 packeta
+	 * The pattern is created by using an ASO operation to atomically set
+	 * bit 31 after the down counter clears bit 31. This is effectively an
+	 * atomic addition of 2**31 to the counter.
+	 *
+	 * We wish to configure the counter, within the above pattern, so that
+	 * when it reaches 0, it has hit the hard limit. This is defined by this
+	 * system of equations:
+	 *
+	 *      hard_limit == start_value + n * 2^31
+	 *      n >= 0
+	 *      start_value < 2^32, start_value >= 0
+	 *
+	 * These equations are not single-solution, there are often two choices:
+	 *      hard_limit == start_value + n * 2^31
+	 *      hard_limit == (start_value+2^31) + (n-1) * 2^31
+	 *
+	 * The algorithm selects the solution that keeps the counter value
+	 * above 2^31 until the final iteration.
+	 */
+
+	/* Start by estimating n and compute start_value */
+	n = attrs->lft.hard_packet_limit / BIT_ULL(31);
+	start_value = attrs->lft.hard_packet_limit - n * BIT_ULL(31);
+
+	/* Choose the best of the two solutions: */
+	if (n >= 1)
+		n -= 1;
+
+	/* Computed values solve the system of equations: */
+	start_value = attrs->lft.hard_packet_limit - n * BIT_ULL(31);
+
+	/* The best solution means: when there are multiple iterations we must
+	 * start above 2^31 and count down to 2**31 to get the interrupt.
+	 */
+	attrs->lft.hard_packet_limit = lower_32_bits(start_value);
+	attrs->lft.numb_rounds_hard = (u64)n;
+
+	/* Compute soft limit initial value and number of rounds.
+	 *
+	 * The soft_limit is achieved by adjusting the counter's
+	 * interrupt_value. This is embedded in the counting pattern created by
+	 * hard packet calculations above.
+	 *
+	 * We wish to compute the interrupt_value for the soft_limit. This is
+	 * defined by this system of equations:
+	 *
+	 *      soft_limit == start_value - soft_value + n * 2^31
+	 *      n >= 0
+	 *      soft_value < 2^32, soft_value >= 0
+	 *      for n == 0 start_value > soft_value
+	 *
+	 * As with compute_hard_n_value() the equations are not single-solution.
+	 * The algorithm selects the solution that has:
+	 *      2^30 <= soft_limit < 2^31 + 2^30
+	 * for the interior iterations, which guarantees a large guard band
+	 * around the counter hard limit and next interrupt.
+	 */
+
+	/* Start by estimating n and compute soft_value */
+	n = (x->lft.soft_packet_limit - attrs->lft.hard_packet_limit) / BIT_ULL(31);
+	start_value = attrs->lft.hard_packet_limit + n * BIT_ULL(31) -
+		      x->lft.soft_packet_limit;
+
+	/* Compare against constraints and adjust n */
+	if (n < 0)
+		n = 0;
+	else if (start_value >= BIT_ULL(32))
+		n -= 1;
+	else if (start_value < 0)
+		n += 1;
+
+	/* Choose the best of the two solutions: */
+	start_value = attrs->lft.hard_packet_limit + n * BIT_ULL(31) - start_value;
+	if (n != attrs->lft.numb_rounds_hard && start_value < BIT_ULL(30))
+		n += 1;
+
+	/* Note that the upper limit of soft_value happens naturally because we
+	 * always select the lowest soft_value.
 	 */
-	attrs->soft_packet_limit =
-		x->lft.hard_packet_limit - x->lft.soft_packet_limit;
+
+	/* Computed values solve the system of equations: */
+	start_value = attrs->lft.hard_packet_limit + n * BIT_ULL(31) - start_value;
+
+	/* The best solution means: when there are multiple iterations we must
+	 * not fall below 2^30 as that would get too close to the false
+	 * hard_limit and when we reach an interior iteration for soft_limit it
+	 * has to be far away from 2**32-1 which is the counter reset point
+	 * after the +2^31 to accommodate latency.
+	 */
+	attrs->lft.soft_packet_limit = lower_32_bits(start_value);
+	attrs->lft.numb_rounds_soft = (u64)n;
+}
+
+static void mlx5e_ipsec_init_macs(struct mlx5e_ipsec_sa_entry *sa_entry,
+				  struct mlx5_accel_esp_xfrm_attrs *attrs)
+{
+	struct mlx5_core_dev *mdev = mlx5e_ipsec_sa2dev(sa_entry);
+	struct xfrm_state *x = sa_entry->x;
+	struct net_device *netdev;
+	struct neighbour *n;
+	u8 addr[ETH_ALEN];
+	const void *pkey;
+	u8 *dst, *src;
+
+	if (attrs->mode != XFRM_MODE_TUNNEL ||
+	    attrs->type != XFRM_DEV_OFFLOAD_PACKET)
+		return;
+
+	netdev = x->xso.real_dev;
+
+	mlx5_query_mac_address(mdev, addr);
+	switch (attrs->dir) {
+	case XFRM_DEV_OFFLOAD_IN:
+		src = attrs->dmac;
+		dst = attrs->smac;
+		pkey = &attrs->saddr.a4;
+		break;
+	case XFRM_DEV_OFFLOAD_OUT:
+		src = attrs->smac;
+		dst = attrs->dmac;
+		pkey = &attrs->daddr.a4;
+		break;
+	default:
+		return;
+	}
+
+	ether_addr_copy(src, addr);
+	n = neigh_lookup(&arp_tbl, pkey, netdev);
+	if (!n) {
+		n = neigh_create(&arp_tbl, pkey, netdev);
+		if (IS_ERR(n))
+			return;
+		neigh_event_send(n, NULL);
+		attrs->drop = true;
+	} else {
+		neigh_ha_snapshot(addr, n, netdev);
+		ether_addr_copy(dst, addr);
+	}
+	neigh_release(n);
 }

 void mlx5e_ipsec_build_accel_xfrm_attrs(struct mlx5e_ipsec_sa_entry *sa_entry,
@@ -141,11 +325,11 @@ void mlx5e_ipsec_build_accel_xfrm_attrs(struct mlx5e_ipsec_sa_entry *sa_entry,
 	aes_gcm->icv_len = x->aead->alg_icv_len;

 	/* esn */
-	if (sa_entry->esn_state.trigger) {
-		attrs->esn_trigger = true;
-		attrs->esn = sa_entry->esn_state.esn;
-		attrs->esn_overlap = sa_entry->esn_state.overlap;
-		attrs->replay_window = x->replay_esn->replay_window;
+	if (x->props.flags & XFRM_STATE_ESN) {
+		attrs->replay_esn.trigger = true;
+		attrs->replay_esn.esn = sa_entry->esn_state.esn;
+		attrs->replay_esn.esn_msb = sa_entry->esn_state.esn_msb;
+		attrs->replay_esn.overlap = sa_entry->esn_state.overlap;
 	}

 	attrs->dir = x->xso.dir;
@@ -163,8 +347,10 @@ void mlx5e_ipsec_build_accel_xfrm_attrs(struct mlx5e_ipsec_sa_entry *sa_entry,
attrs->upspec.sport = ntohs(x->sel.sport); attrs->upspec.sport_mask = ntohs(x->sel.sport_mask); attrs->upspec.proto = x->sel.proto; + attrs->mode = x->props.mode; mlx5e_ipsec_init_limits(sa_entry, attrs); + mlx5e_ipsec_init_macs(sa_entry, attrs); } static int mlx5e_xfrm_validate_state(struct mlx5_core_dev *mdev, @@ -233,6 +419,11 @@ static int mlx5e_xfrm_validate_state(struct mlx5_core_dev *mdev, return -EINVAL; } + if (x->props.mode != XFRM_MODE_TRANSPORT && x->props.mode != XFRM_MODE_TUNNEL) { + NL_SET_ERR_MSG_MOD(extack, "Only transport and tunnel xfrm states may be offloaded"); + return -EINVAL; + } + switch (x->xso.type) { case XFRM_DEV_OFFLOAD_CRYPTO: if (!(mlx5_ipsec_device_caps(mdev) & MLX5_IPSEC_CAP_CRYPTO)) { @@ -240,11 +431,6 @@ static int mlx5e_xfrm_validate_state(struct mlx5_core_dev *mdev, return -EINVAL; } - if (x->props.mode != XFRM_MODE_TRANSPORT && - x->props.mode != XFRM_MODE_TUNNEL) { - NL_SET_ERR_MSG_MOD(extack, "Only transport and tunnel xfrm states may be offloaded"); - return -EINVAL; - } break; case XFRM_DEV_OFFLOAD_PACKET: if (!(mlx5_ipsec_device_caps(mdev) & @@ -253,8 +439,9 @@ static int mlx5e_xfrm_validate_state(struct mlx5_core_dev *mdev, return -EINVAL; } - if (x->props.mode != XFRM_MODE_TRANSPORT) { - NL_SET_ERR_MSG_MOD(extack, "Only transport xfrm states may be offloaded in packet mode"); + if (x->props.mode == XFRM_MODE_TUNNEL && + !(mlx5_ipsec_device_caps(mdev) & MLX5_IPSEC_CAP_TUNNEL)) { + NL_SET_ERR_MSG_MOD(extack, "Packet offload is not supported for tunnel mode"); return -EINVAL; } @@ -283,6 +470,11 @@ static int mlx5e_xfrm_validate_state(struct mlx5_core_dev *mdev, NL_SET_ERR_MSG_MOD(extack, "Hard packet limit must be greater than soft one"); return -EINVAL; } + + if (!x->lft.soft_packet_limit || !x->lft.hard_packet_limit) { + NL_SET_ERR_MSG_MOD(extack, "Soft/hard packet limits can't be 0"); + return -EINVAL; + } break; default: NL_SET_ERR_MSG_MOD(extack, "Unsupported xfrm offload type"); @@ -291,14 +483,134 @@ static int 
mlx5e_xfrm_validate_state(struct mlx5_core_dev *mdev, return 0; } -static void _update_xfrm_state(struct work_struct *work) +static void mlx5e_ipsec_modify_state(struct work_struct *_work) +{ + struct mlx5e_ipsec_work *work = + container_of(_work, struct mlx5e_ipsec_work, work); + struct mlx5e_ipsec_sa_entry *sa_entry = work->sa_entry; + struct mlx5_accel_esp_xfrm_attrs *attrs; + + attrs = &((struct mlx5e_ipsec_sa_entry *)work->data)->attrs; + + mlx5_accel_esp_modify_xfrm(sa_entry, attrs); +} + +static void mlx5e_ipsec_set_esn_ops(struct mlx5e_ipsec_sa_entry *sa_entry) +{ + struct xfrm_state *x = sa_entry->x; + + if (x->xso.type != XFRM_DEV_OFFLOAD_CRYPTO || + x->xso.dir != XFRM_DEV_OFFLOAD_OUT) + return; + + if (x->props.flags & XFRM_STATE_ESN) { + sa_entry->set_iv_op = mlx5e_ipsec_set_iv_esn; + return; + } + + sa_entry->set_iv_op = mlx5e_ipsec_set_iv; +} + +static void mlx5e_ipsec_handle_netdev_event(struct work_struct *_work) +{ + struct mlx5e_ipsec_work *work = + container_of(_work, struct mlx5e_ipsec_work, work); + struct mlx5e_ipsec_sa_entry *sa_entry = work->sa_entry; + struct mlx5e_ipsec_netevent_data *data = work->data; + struct mlx5_accel_esp_xfrm_attrs *attrs; + + attrs = &sa_entry->attrs; + + switch (attrs->dir) { + case XFRM_DEV_OFFLOAD_IN: + ether_addr_copy(attrs->smac, data->addr); + break; + case XFRM_DEV_OFFLOAD_OUT: + ether_addr_copy(attrs->dmac, data->addr); + break; + default: + WARN_ON_ONCE(true); + } + attrs->drop = false; + mlx5e_accel_ipsec_fs_modify(sa_entry); +} + +static int mlx5_ipsec_create_work(struct mlx5e_ipsec_sa_entry *sa_entry) +{ + struct xfrm_state *x = sa_entry->x; + struct mlx5e_ipsec_work *work; + void *data = NULL; + + switch (x->xso.type) { + case XFRM_DEV_OFFLOAD_CRYPTO: + if (!(x->props.flags & XFRM_STATE_ESN)) + return 0; + break; + case XFRM_DEV_OFFLOAD_PACKET: + if (x->props.mode != XFRM_MODE_TUNNEL) + return 0; + break; + default: + break; + } + + work = kzalloc(sizeof(*work), GFP_KERNEL); + if (!work) + return 
-ENOMEM; + + switch (x->xso.type) { + case XFRM_DEV_OFFLOAD_CRYPTO: + data = kzalloc(sizeof(*sa_entry), GFP_KERNEL); + if (!data) + goto free_work; + + INIT_WORK(&work->work, mlx5e_ipsec_modify_state); + break; + case XFRM_DEV_OFFLOAD_PACKET: + data = kzalloc(sizeof(struct mlx5e_ipsec_netevent_data), + GFP_KERNEL); + if (!data) + goto free_work; + + INIT_WORK(&work->work, mlx5e_ipsec_handle_netdev_event); + break; + default: + break; + } + + work->data = data; + work->sa_entry = sa_entry; + sa_entry->work = work; + return 0; + +free_work: + kfree(work); + return -ENOMEM; +} + +static int mlx5e_ipsec_create_dwork(struct mlx5e_ipsec_sa_entry *sa_entry) { - struct mlx5e_ipsec_modify_state_work *modify_work = - container_of(work, struct mlx5e_ipsec_modify_state_work, work); - struct mlx5e_ipsec_sa_entry *sa_entry = container_of( - modify_work, struct mlx5e_ipsec_sa_entry, modify_work); + struct xfrm_state *x = sa_entry->x; + struct mlx5e_ipsec_dwork *dwork; + + if (x->xso.type != XFRM_DEV_OFFLOAD_PACKET) + return 0; + + if (x->xso.dir != XFRM_DEV_OFFLOAD_OUT) + return 0; + + if (x->lft.soft_packet_limit == XFRM_INF && + x->lft.hard_packet_limit == XFRM_INF) + return 0; - mlx5_accel_esp_modify_xfrm(sa_entry, &modify_work->attrs); + dwork = kzalloc(sizeof(*dwork), GFP_KERNEL); + if (!dwork) + return -ENOMEM; + + dwork->sa_entry = sa_entry; + INIT_DELAYED_WORK(&dwork->dwork, mlx5e_ipsec_handle_tx_limit); + sa_entry->dwork = dwork; + return 0; } static int mlx5e_xfrm_add_state(struct xfrm_state *x, @@ -308,6 +620,7 @@ static int mlx5e_xfrm_add_state(struct xfrm_state *x, struct net_device *netdev = x->xso.real_dev; struct mlx5e_ipsec *ipsec; struct mlx5e_priv *priv; + gfp_t gfp; int err; priv = netdev_priv(netdev); @@ -315,30 +628,52 @@ static int mlx5e_xfrm_add_state(struct xfrm_state *x, return -EOPNOTSUPP; ipsec = priv->ipsec; - err = mlx5e_xfrm_validate_state(priv->mdev, x, extack); - if (err) - return err; - - sa_entry = kzalloc(sizeof(*sa_entry), GFP_KERNEL); + gfp = 
(x->xso.flags & XFRM_DEV_OFFLOAD_FLAG_ACQ) ? GFP_ATOMIC : GFP_KERNEL; + sa_entry = kzalloc(sizeof(*sa_entry), gfp); if (!sa_entry) return -ENOMEM; sa_entry->x = x; sa_entry->ipsec = ipsec; + /* Check if this SA is originated from acquire flow temporary SA */ + if (x->xso.flags & XFRM_DEV_OFFLOAD_FLAG_ACQ) + goto out; + + err = mlx5e_xfrm_validate_state(priv->mdev, x, extack); + if (err) + goto err_xfrm; /* check esn */ - mlx5e_ipsec_update_esn_state(sa_entry); + if (x->props.flags & XFRM_STATE_ESN) + mlx5e_ipsec_update_esn_state(sa_entry); mlx5e_ipsec_build_accel_xfrm_attrs(sa_entry, &sa_entry->attrs); + + err = mlx5_ipsec_create_work(sa_entry); + if (err) + goto err_xfrm; + + err = mlx5e_ipsec_create_dwork(sa_entry); + if (err) + goto release_work; + /* create hw context */ err = mlx5_ipsec_create_sa_ctx(sa_entry); if (err) - goto err_xfrm; + goto release_dwork; err = mlx5e_accel_ipsec_fs_add_rule(sa_entry); if (err) goto err_hw_ctx; + if (x->props.mode == XFRM_MODE_TUNNEL && + x->xso.type == XFRM_DEV_OFFLOAD_PACKET && + !mlx5e_ipsec_fs_tunnel_enabled(sa_entry)) { + NL_SET_ERR_MSG_MOD(extack, "Packet offload tunnel mode is disabled due to encap settings"); + err = -EINVAL; + goto err_add_rule; + } + /* We use *_bh() variant because xfrm_timer_handler(), which runs * in softirq context, can reach our state delete logic and we need * xa_erase_bh() there. @@ -348,11 +683,18 @@ static int mlx5e_xfrm_add_state(struct xfrm_state *x, if (err) goto err_add_rule; - if (x->xso.dir == XFRM_DEV_OFFLOAD_OUT) - sa_entry->set_iv_op = (x->props.flags & XFRM_STATE_ESN) ? 
- mlx5e_ipsec_set_iv_esn : mlx5e_ipsec_set_iv; + mlx5e_ipsec_set_esn_ops(sa_entry); + + if (sa_entry->dwork) + queue_delayed_work(ipsec->wq, &sa_entry->dwork->dwork, + MLX5_IPSEC_RESCHED); - INIT_WORK(&sa_entry->modify_work.work, _update_xfrm_state); + if (x->xso.type == XFRM_DEV_OFFLOAD_PACKET && + x->props.mode == XFRM_MODE_TUNNEL) + xa_set_mark(&ipsec->sadb, sa_entry->ipsec_obj_id, + MLX5E_IPSEC_TUNNEL_SA); + +out: x->xso.offload_handle = (unsigned long)sa_entry; return 0; @@ -360,32 +702,101 @@ err_add_rule: mlx5e_accel_ipsec_fs_del_rule(sa_entry); err_hw_ctx: mlx5_ipsec_free_sa_ctx(sa_entry); +release_dwork: + kfree(sa_entry->dwork); +release_work: + if (sa_entry->work) + kfree(sa_entry->work->data); + kfree(sa_entry->work); err_xfrm: kfree(sa_entry); - NL_SET_ERR_MSG_MOD(extack, "Device failed to offload this policy"); + NL_SET_ERR_MSG_WEAK_MOD(extack, "Device failed to offload this state"); return err; } static void mlx5e_xfrm_del_state(struct xfrm_state *x) { struct mlx5e_ipsec_sa_entry *sa_entry = to_ipsec_sa_entry(x); + struct mlx5_accel_esp_xfrm_attrs *attrs = &sa_entry->attrs; struct mlx5e_ipsec *ipsec = sa_entry->ipsec; struct mlx5e_ipsec_sa_entry *old; + if (x->xso.flags & XFRM_DEV_OFFLOAD_FLAG_ACQ) + return; + old = xa_erase_bh(&ipsec->sadb, sa_entry->ipsec_obj_id); WARN_ON(old != sa_entry); + + if (attrs->mode == XFRM_MODE_TUNNEL && + attrs->type == XFRM_DEV_OFFLOAD_PACKET) + /* Make sure that no ARP requests are running in parallel */ + flush_workqueue(ipsec->wq); + } static void mlx5e_xfrm_free_state(struct xfrm_state *x) { struct mlx5e_ipsec_sa_entry *sa_entry = to_ipsec_sa_entry(x); - cancel_work_sync(&sa_entry->modify_work.work); + if (x->xso.flags & XFRM_DEV_OFFLOAD_FLAG_ACQ) + goto sa_entry_free; + + if (sa_entry->work) + cancel_work_sync(&sa_entry->work->work); + + if (sa_entry->dwork) + cancel_delayed_work_sync(&sa_entry->dwork->dwork); + mlx5e_accel_ipsec_fs_del_rule(sa_entry); mlx5_ipsec_free_sa_ctx(sa_entry); + kfree(sa_entry->dwork); + 
if (sa_entry->work) + kfree(sa_entry->work->data); + kfree(sa_entry->work); +sa_entry_free: kfree(sa_entry); } +static int mlx5e_ipsec_netevent_event(struct notifier_block *nb, + unsigned long event, void *ptr) +{ + struct mlx5_accel_esp_xfrm_attrs *attrs; + struct mlx5e_ipsec_netevent_data *data; + struct mlx5e_ipsec_sa_entry *sa_entry; + struct mlx5e_ipsec *ipsec; + struct neighbour *n = ptr; + struct net_device *netdev; + struct xfrm_state *x; + unsigned long idx; + + if (event != NETEVENT_NEIGH_UPDATE || !(n->nud_state & NUD_VALID)) + return NOTIFY_DONE; + + ipsec = container_of(nb, struct mlx5e_ipsec, netevent_nb); + xa_for_each_marked(&ipsec->sadb, idx, sa_entry, MLX5E_IPSEC_TUNNEL_SA) { + attrs = &sa_entry->attrs; + + if (attrs->family == AF_INET) { + if (!neigh_key_eq32(n, &attrs->saddr.a4) && + !neigh_key_eq32(n, &attrs->daddr.a4)) + continue; + } else { + if (!neigh_key_eq128(n, &attrs->saddr.a4) && + !neigh_key_eq128(n, &attrs->daddr.a4)) + continue; + } + + x = sa_entry->x; + netdev = x->xso.real_dev; + data = sa_entry->work->data; + + neigh_ha_snapshot(data->addr, n, netdev); + queue_work(ipsec->wq, &sa_entry->work->work); + } + + return NOTIFY_DONE; +} + void mlx5e_ipsec_init(struct mlx5e_priv *priv) { struct mlx5e_ipsec *ipsec; @@ -402,8 +813,8 @@ void mlx5e_ipsec_init(struct mlx5e_priv *priv) xa_init_flags(&ipsec->sadb, XA_FLAGS_ALLOC); ipsec->mdev = priv->mdev; - ipsec->wq = alloc_ordered_workqueue("mlx5e_ipsec: %s", 0, - priv->netdev->name); + ipsec->wq = alloc_workqueue("mlx5e_ipsec: %s", WQ_UNBOUND, 0, + priv->netdev->name); if (!ipsec->wq) goto err_wq; @@ -414,6 +825,13 @@ void mlx5e_ipsec_init(struct mlx5e_priv *priv) goto err_aso; } + if (mlx5_ipsec_device_caps(priv->mdev) & MLX5_IPSEC_CAP_TUNNEL) { + ipsec->netevent_nb.notifier_call = mlx5e_ipsec_netevent_event; + ret = register_netevent_notifier(&ipsec->netevent_nb); + if (ret) + goto clear_aso; + } + ret = mlx5e_accel_ipsec_fs_init(ipsec); if (ret) goto err_fs_init; @@ -424,6 +842,9 @@ 
void mlx5e_ipsec_init(struct mlx5e_priv *priv) return; err_fs_init: + if (mlx5_ipsec_device_caps(priv->mdev) & MLX5_IPSEC_CAP_TUNNEL) + unregister_netevent_notifier(&ipsec->netevent_nb); +clear_aso: if (mlx5_ipsec_device_caps(priv->mdev) & MLX5_IPSEC_CAP_PACKET_OFFLOAD) mlx5e_ipsec_aso_cleanup(ipsec); err_aso: @@ -442,6 +863,8 @@ void mlx5e_ipsec_cleanup(struct mlx5e_priv *priv) return; mlx5e_accel_ipsec_fs_cleanup(ipsec); + if (mlx5_ipsec_device_caps(priv->mdev) & MLX5_IPSEC_CAP_TUNNEL) + unregister_netevent_notifier(&ipsec->netevent_nb); if (mlx5_ipsec_device_caps(priv->mdev) & MLX5_IPSEC_CAP_PACKET_OFFLOAD) mlx5e_ipsec_aso_cleanup(ipsec); destroy_workqueue(ipsec->wq); @@ -467,41 +890,43 @@ static bool mlx5e_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *x) static void mlx5e_xfrm_advance_esn_state(struct xfrm_state *x) { struct mlx5e_ipsec_sa_entry *sa_entry = to_ipsec_sa_entry(x); - struct mlx5e_ipsec_modify_state_work *modify_work = - &sa_entry->modify_work; + struct mlx5e_ipsec_work *work = sa_entry->work; + struct mlx5e_ipsec_sa_entry *sa_entry_shadow; bool need_update; need_update = mlx5e_ipsec_update_esn_state(sa_entry); if (!need_update) return; - mlx5e_ipsec_build_accel_xfrm_attrs(sa_entry, &modify_work->attrs); - queue_work(sa_entry->ipsec->wq, &modify_work->work); + sa_entry_shadow = work->data; + memset(sa_entry_shadow, 0x00, sizeof(*sa_entry_shadow)); + mlx5e_ipsec_build_accel_xfrm_attrs(sa_entry, &sa_entry_shadow->attrs); + queue_work(sa_entry->ipsec->wq, &work->work); } static void mlx5e_xfrm_update_curlft(struct xfrm_state *x) { struct mlx5e_ipsec_sa_entry *sa_entry = to_ipsec_sa_entry(x); - int err; + struct mlx5e_ipsec_rule *ipsec_rule = &sa_entry->ipsec_rule; + u64 packets, bytes, lastuse; - lockdep_assert_held(&x->lock); + lockdep_assert(lockdep_is_held(&x->lock) || + lockdep_is_held(&dev_net(x->xso.real_dev)->xfrm.xfrm_cfg_mutex)); - if (sa_entry->attrs.soft_packet_limit == XFRM_INF) - /* Limits are not configured, as soft limit - * 
must be lowever than hard limit. - */ + if (x->xso.flags & XFRM_DEV_OFFLOAD_FLAG_ACQ) return; - err = mlx5e_ipsec_aso_query(sa_entry, NULL); - if (err) - return; - - mlx5e_ipsec_aso_update_curlft(sa_entry, &x->curlft.packets); + mlx5_fc_query_cached(ipsec_rule->fc, &bytes, &packets, &lastuse); + x->curlft.packets += packets; + x->curlft.bytes += bytes; } -static int mlx5e_xfrm_validate_policy(struct xfrm_policy *x, +static int mlx5e_xfrm_validate_policy(struct mlx5_core_dev *mdev, + struct xfrm_policy *x, struct netlink_ext_ack *extack) { + struct xfrm_selector *sel = &x->selector; + if (x->type != XFRM_POLICY_TYPE_MAIN) { NL_SET_ERR_MSG_MOD(extack, "Cannot offload non-main policy types"); return -EINVAL; @@ -519,8 +944,9 @@ static int mlx5e_xfrm_validate_policy(struct xfrm_policy *x, return -EINVAL; } - if (!x->xfrm_vec[0].reqid) { - NL_SET_ERR_MSG_MOD(extack, "Cannot offload policy without reqid"); + if (!x->xfrm_vec[0].reqid && sel->proto == IPPROTO_IP && + addr6_all_zero(sel->saddr.a6) && addr6_all_zero(sel->daddr.a6)) { + NL_SET_ERR_MSG_MOD(extack, "Unsupported policy with reqid 0 without at least one of upper protocol or ip addr(s) different than 0"); return -EINVAL; } @@ -529,12 +955,24 @@ static int mlx5e_xfrm_validate_policy(struct xfrm_policy *x, return -EINVAL; } - if (x->selector.proto != IPPROTO_IP && - (x->selector.proto != IPPROTO_UDP || x->xdo.dir != XFRM_DEV_OFFLOAD_OUT)) { + if (sel->proto != IPPROTO_IP && + (sel->proto != IPPROTO_UDP || x->xdo.dir != XFRM_DEV_OFFLOAD_OUT)) { NL_SET_ERR_MSG_MOD(extack, "Device does not support upper protocol other than UDP, and only Tx direction"); return -EINVAL; } + if (x->priority) { + if (!(mlx5_ipsec_device_caps(mdev) & MLX5_IPSEC_CAP_PRIO)) { + NL_SET_ERR_MSG_MOD(extack, "Device does not support policy priority"); + return -EINVAL; + } + + if (x->priority == U32_MAX) { + NL_SET_ERR_MSG_MOD(extack, "Device does not support requested policy priority"); + return -EINVAL; + } + } + return 0; } @@ -560,6 +998,7 
@@ mlx5e_ipsec_build_accel_pol_attrs(struct mlx5e_ipsec_pol_entry *pol_entry, attrs->upspec.sport = ntohs(sel->sport); attrs->upspec.sport_mask = ntohs(sel->sport_mask); attrs->upspec.proto = sel->proto; + attrs->prio = x->priority; } static int mlx5e_xfrm_add_policy(struct xfrm_policy *x, @@ -576,7 +1015,7 @@ static int mlx5e_xfrm_add_policy(struct xfrm_policy *x, return -EOPNOTSUPP; } - err = mlx5e_xfrm_validate_policy(x, extack); + err = mlx5e_xfrm_validate_policy(priv->mdev, x, extack); if (err) return err; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h index 12f044330639..4e9887171508 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.h @@ -60,10 +60,24 @@ struct upspec { u8 proto; }; +struct mlx5_ipsec_lft { + u64 hard_packet_limit; + u64 soft_packet_limit; + u64 numb_rounds_hard; + u64 numb_rounds_soft; +}; + +struct mlx5_replay_esn { + u32 replay_window; + u32 esn; + u32 esn_msb; + u8 overlap : 1; + u8 trigger : 1; +}; + struct mlx5_accel_esp_xfrm_attrs { - u32 esn; u32 spi; - u32 flags; + u32 mode; struct aes_gcm_keymat aes_gcm; union { @@ -78,15 +92,15 @@ struct mlx5_accel_esp_xfrm_attrs { struct upspec upspec; u8 dir : 2; - u8 esn_overlap : 1; - u8 esn_trigger : 1; u8 type : 2; + u8 drop : 1; u8 family; - u32 replay_window; + struct mlx5_replay_esn replay_esn; u32 authsize; u32 reqid; - u64 hard_packet_limit; - u64 soft_packet_limit; + struct mlx5_ipsec_lft lft; + u8 smac[ETH_ALEN]; + u8 dmac[ETH_ALEN]; }; enum mlx5_ipsec_cap { @@ -94,6 +108,8 @@ enum mlx5_ipsec_cap { MLX5_IPSEC_CAP_ESN = 1 << 1, MLX5_IPSEC_CAP_PACKET_OFFLOAD = 1 << 2, MLX5_IPSEC_CAP_ROCE = 1 << 3, + MLX5_IPSEC_CAP_PRIO = 1 << 4, + MLX5_IPSEC_CAP_TUNNEL = 1 << 5, }; struct mlx5e_priv; @@ -124,8 +140,17 @@ struct mlx5e_ipsec_tx; struct mlx5e_ipsec_work { struct work_struct work; - struct mlx5e_ipsec *ipsec; - u32 id; + struct 
mlx5e_ipsec_sa_entry *sa_entry; + void *data; +}; + +struct mlx5e_ipsec_netevent_data { + u8 addr[ETH_ALEN]; +}; + +struct mlx5e_ipsec_dwork { + struct delayed_work dwork; + struct mlx5e_ipsec_sa_entry *sa_entry; }; struct mlx5e_ipsec_aso { @@ -148,12 +173,13 @@ struct mlx5e_ipsec { struct mlx5e_ipsec_tx *tx; struct mlx5e_ipsec_aso *aso; struct notifier_block nb; + struct notifier_block netevent_nb; struct mlx5_ipsec_fs *roce; }; struct mlx5e_ipsec_esn_state { u32 esn; - u8 trigger: 1; + u32 esn_msb; u8 overlap: 1; }; @@ -161,11 +187,13 @@ struct mlx5e_ipsec_rule { struct mlx5_flow_handle *rule; struct mlx5_modify_hdr *modify_hdr; struct mlx5_pkt_reformat *pkt_reformat; + struct mlx5_fc *fc; }; -struct mlx5e_ipsec_modify_state_work { - struct work_struct work; - struct mlx5_accel_esp_xfrm_attrs attrs; +struct mlx5e_ipsec_limits { + u64 round; + u8 soft_limit_hit : 1; + u8 fix_limit : 1; }; struct mlx5e_ipsec_sa_entry { @@ -178,7 +206,9 @@ struct mlx5e_ipsec_sa_entry { u32 ipsec_obj_id; u32 enc_key_id; struct mlx5e_ipsec_rule ipsec_rule; - struct mlx5e_ipsec_modify_state_work modify_work; + struct mlx5e_ipsec_work *work; + struct mlx5e_ipsec_dwork *dwork; + struct mlx5e_ipsec_limits limits; }; struct mlx5_accel_pol_xfrm_attrs { @@ -198,6 +228,7 @@ struct mlx5_accel_pol_xfrm_attrs { u8 type : 2; u8 dir : 2; u32 reqid; + u32 prio; }; struct mlx5e_ipsec_pol_entry { @@ -219,6 +250,8 @@ int mlx5e_accel_ipsec_fs_add_rule(struct mlx5e_ipsec_sa_entry *sa_entry); void mlx5e_accel_ipsec_fs_del_rule(struct mlx5e_ipsec_sa_entry *sa_entry); int mlx5e_accel_ipsec_fs_add_pol(struct mlx5e_ipsec_pol_entry *pol_entry); void mlx5e_accel_ipsec_fs_del_pol(struct mlx5e_ipsec_pol_entry *pol_entry); +void mlx5e_accel_ipsec_fs_modify(struct mlx5e_ipsec_sa_entry *sa_entry); +bool mlx5e_ipsec_fs_tunnel_enabled(struct mlx5e_ipsec_sa_entry *sa_entry); int mlx5_ipsec_create_sa_ctx(struct mlx5e_ipsec_sa_entry *sa_entry); void mlx5_ipsec_free_sa_ctx(struct mlx5e_ipsec_sa_entry *sa_entry); @@ 
-233,9 +266,6 @@ void mlx5e_ipsec_aso_cleanup(struct mlx5e_ipsec *ipsec); int mlx5e_ipsec_aso_query(struct mlx5e_ipsec_sa_entry *sa_entry, struct mlx5_wqe_aso_ctrl_seg *data); -void mlx5e_ipsec_aso_update_curlft(struct mlx5e_ipsec_sa_entry *sa_entry, - u64 *packets); - void mlx5e_accel_ipsec_fs_read_stats(struct mlx5e_priv *priv, void *ipsec_stats); @@ -252,6 +282,13 @@ mlx5e_ipsec_pol2dev(struct mlx5e_ipsec_pol_entry *pol_entry) { return pol_entry->ipsec->mdev; } + +static inline bool addr6_all_zero(__be32 *addr6) +{ + static const __be32 zaddr6[4] = {}; + + return !memcmp(addr6, zaddr6, sizeof(zaddr6)); +} #else static inline void mlx5e_ipsec_init(struct mlx5e_priv *priv) { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c index 9871ba1b25ff..dbe87bf89c0d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c @@ -4,11 +4,15 @@ #include <linux/netdevice.h> #include "en.h" #include "en/fs.h" +#include "eswitch.h" #include "ipsec.h" #include "fs_core.h" #include "lib/ipsec_fs_roce.h" +#include "lib/fs_chains.h" #define NUM_IPSEC_FTE BIT(15) +#define MLX5_REFORMAT_TYPE_ADD_ESP_TRANSPORT_SIZE 16 +#define IPSEC_TUNNEL_DEFAULT_TTL 0x40 struct mlx5e_ipsec_fc { struct mlx5_fc *cnt; @@ -34,13 +38,18 @@ struct mlx5e_ipsec_rx { struct mlx5e_ipsec_miss sa; struct mlx5e_ipsec_rule status; struct mlx5e_ipsec_fc *fc; + struct mlx5_fs_chains *chains; + u8 allow_tunnel_mode : 1; }; struct mlx5e_ipsec_tx { struct mlx5e_ipsec_ft ft; struct mlx5e_ipsec_miss pol; + struct mlx5e_ipsec_rule status; struct mlx5_flow_namespace *ns; struct mlx5e_ipsec_fc *fc; + struct mlx5_fs_chains *chains; + u8 allow_tunnel_mode : 1; }; /* IPsec RX flow steering */ @@ -51,9 +60,70 @@ static enum mlx5_traffic_types family2tt(u32 family) return MLX5_TT_IPV6_IPSEC_ESP; } +static struct mlx5e_ipsec_rx *ipsec_rx(struct mlx5e_ipsec *ipsec, u32 
family)
+{
+	if (family == AF_INET)
+		return ipsec->rx_ipv4;
+
+	return ipsec->rx_ipv6;
+}
+
+static struct mlx5_fs_chains *
+ipsec_chains_create(struct mlx5_core_dev *mdev, struct mlx5_flow_table *miss_ft,
+		    enum mlx5_flow_namespace_type ns, int base_prio,
+		    int base_level, struct mlx5_flow_table **root_ft)
+{
+	struct mlx5_chains_attr attr = {};
+	struct mlx5_fs_chains *chains;
+	struct mlx5_flow_table *ft;
+	int err;
+
+	attr.flags = MLX5_CHAINS_AND_PRIOS_SUPPORTED |
+		     MLX5_CHAINS_IGNORE_FLOW_LEVEL_SUPPORTED;
+	attr.max_grp_num = 2;
+	attr.default_ft = miss_ft;
+	attr.ns = ns;
+	attr.fs_base_prio = base_prio;
+	attr.fs_base_level = base_level;
+	chains = mlx5_chains_create(mdev, &attr);
+	if (IS_ERR(chains))
+		return chains;
+
+	/* Create chain 0, prio 1, level 0 to connect chains to prev in fs_core */
+	ft = mlx5_chains_get_table(chains, 0, 1, 0);
+	if (IS_ERR(ft)) {
+		err = PTR_ERR(ft);
+		goto err_chains_get;
+	}
+
+	*root_ft = ft;
+	return chains;
+
+err_chains_get:
+	mlx5_chains_destroy(chains);
+	return ERR_PTR(err);
+}
+
+static void ipsec_chains_destroy(struct mlx5_fs_chains *chains)
+{
+	mlx5_chains_put_table(chains, 0, 1, 0);
+	mlx5_chains_destroy(chains);
+}
+
+static struct mlx5_flow_table *
+ipsec_chains_get_table(struct mlx5_fs_chains *chains, u32 prio)
+{
+	return mlx5_chains_get_table(chains, 0, prio + 1, 0);
+}
+
+static void ipsec_chains_put_table(struct mlx5_fs_chains *chains, u32 prio)
+{
+	mlx5_chains_put_table(chains, 0, prio + 1, 0);
+}
+
 static struct mlx5_flow_table *ipsec_ft_create(struct mlx5_flow_namespace *ns,
 					       int level, int prio,
-					       int max_num_groups)
+					       int max_num_groups, u32 flags)
 {
 	struct mlx5_flow_table_attr ft_attr = {};
 
@@ -62,6 +132,7 @@ static struct mlx5_flow_table *ipsec_ft_create(struct mlx5_flow_namespace *ns,
 	ft_attr.max_fte = NUM_IPSEC_FTE;
 	ft_attr.level = level;
 	ft_attr.prio = prio;
+	ft_attr.flags = flags;
 
 	return mlx5_create_auto_grouped_flow_table(ns, &ft_attr);
 }
@@ -170,14 +241,24 @@ out:
 static void rx_destroy(struct mlx5_core_dev *mdev, struct mlx5e_ipsec *ipsec,
 		       struct mlx5e_ipsec_rx *rx, u32 family)
 {
-	mlx5_del_flow_rules(rx->pol.rule);
-	mlx5_destroy_flow_group(rx->pol.group);
-	mlx5_destroy_flow_table(rx->ft.pol);
+	struct mlx5_ttc_table *ttc = mlx5e_fs_get_ttc(ipsec->fs, false);
+
+	/* disconnect */
+	mlx5_ttc_fwd_default_dest(ttc, family2tt(family));
+
+	if (rx->chains) {
+		ipsec_chains_destroy(rx->chains);
+	} else {
+		mlx5_del_flow_rules(rx->pol.rule);
+		mlx5_destroy_flow_group(rx->pol.group);
+		mlx5_destroy_flow_table(rx->ft.pol);
+	}
 
 	mlx5_del_flow_rules(rx->sa.rule);
 	mlx5_destroy_flow_group(rx->sa.group);
 	mlx5_destroy_flow_table(rx->ft.sa);
-
+	if (rx->allow_tunnel_mode)
+		mlx5_eswitch_unblock_encap(mdev);
 	mlx5_del_flow_rules(rx->status.rule);
 	mlx5_modify_header_dealloc(mdev, rx->status.modify_hdr);
 	mlx5_destroy_flow_table(rx->ft.status);
@@ -193,6 +274,7 @@ static int rx_create(struct mlx5_core_dev *mdev, struct mlx5e_ipsec *ipsec,
 	struct mlx5_flow_destination default_dest;
 	struct mlx5_flow_destination dest[2];
 	struct mlx5_flow_table *ft;
+	u32 flags = 0;
 	int err;
 
 	default_dest = mlx5_ttc_get_default_dest(ttc, family2tt(family));
@@ -203,7 +285,7 @@ static int rx_create(struct mlx5_core_dev *mdev, struct mlx5e_ipsec *ipsec,
 		return err;
 
 	ft = ipsec_ft_create(ns, MLX5E_ACCEL_FS_ESP_FT_ERR_LEVEL,
-			     MLX5E_NIC_PRIO, 1);
+			     MLX5E_NIC_PRIO, 1, 0);
 	if (IS_ERR(ft)) {
 		err = PTR_ERR(ft);
 		goto err_fs_ft_status;
@@ -226,8 +308,12 @@ static int rx_create(struct mlx5_core_dev *mdev, struct mlx5e_ipsec *ipsec,
 		goto err_add;
 
 	/* Create FT */
-	ft = ipsec_ft_create(ns, MLX5E_ACCEL_FS_ESP_FT_LEVEL, MLX5E_NIC_PRIO,
-			     2);
+	if (mlx5_ipsec_device_caps(mdev) & MLX5_IPSEC_CAP_TUNNEL)
+		rx->allow_tunnel_mode = mlx5_eswitch_block_encap(mdev);
+	if (rx->allow_tunnel_mode)
+		flags = MLX5_FLOW_TABLE_TUNNEL_EN_REFORMAT;
+	ft = ipsec_ft_create(ns, MLX5E_ACCEL_FS_ESP_FT_LEVEL, MLX5E_NIC_PRIO, 2,
+			     flags);
 	if (IS_ERR(ft)) {
 		err = PTR_ERR(ft);
 		goto err_fs_ft;
@@ -238,8 +324,22 @@ static int rx_create(struct mlx5_core_dev *mdev, struct mlx5e_ipsec *ipsec,
 	if (err)
 		goto err_fs;
 
+	if (mlx5_ipsec_device_caps(mdev) & MLX5_IPSEC_CAP_PRIO) {
+		rx->chains = ipsec_chains_create(mdev, rx->ft.sa,
+						 MLX5_FLOW_NAMESPACE_KERNEL,
+						 MLX5E_NIC_PRIO,
+						 MLX5E_ACCEL_FS_POL_FT_LEVEL,
+						 &rx->ft.pol);
+		if (IS_ERR(rx->chains)) {
+			err = PTR_ERR(rx->chains);
+			goto err_pol_ft;
+		}
+
+		goto connect;
+	}
+
 	ft = ipsec_ft_create(ns, MLX5E_ACCEL_FS_POL_FT_LEVEL, MLX5E_NIC_PRIO,
-			     2);
+			     2, 0);
 	if (IS_ERR(ft)) {
 		err = PTR_ERR(ft);
 		goto err_pol_ft;
@@ -252,6 +352,12 @@ static int rx_create(struct mlx5_core_dev *mdev, struct mlx5e_ipsec *ipsec,
 	if (err)
 		goto err_pol_miss;
 
+connect:
+	/* connect */
+	memset(dest, 0x00, sizeof(*dest));
+	dest[0].type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE;
+	dest[0].ft = rx->ft.pol;
+	mlx5_ttc_fwd_dest(ttc, family2tt(family), &dest[0]);
 	return 0;
 
 err_pol_miss:
@@ -262,6 +368,8 @@ err_pol_ft:
 err_fs:
 	mlx5_destroy_flow_table(rx->ft.sa);
 err_fs_ft:
+	if (rx->allow_tunnel_mode)
+		mlx5_eswitch_unblock_encap(mdev);
 	mlx5_del_flow_rules(rx->status.rule);
 	mlx5_modify_header_dealloc(mdev, rx->status.modify_hdr);
 err_add:
@@ -271,83 +379,191 @@ err_fs_ft_status:
 	return err;
 }
 
-static struct mlx5e_ipsec_rx *rx_ft_get(struct mlx5_core_dev *mdev,
-					struct mlx5e_ipsec *ipsec, u32 family)
+static int rx_get(struct mlx5_core_dev *mdev, struct mlx5e_ipsec *ipsec,
+		  struct mlx5e_ipsec_rx *rx, u32 family)
 {
-	struct mlx5_ttc_table *ttc = mlx5e_fs_get_ttc(ipsec->fs, false);
-	struct mlx5_flow_destination dest = {};
-	struct mlx5e_ipsec_rx *rx;
-	int err = 0;
-
-	if (family == AF_INET)
-		rx = ipsec->rx_ipv4;
-	else
-		rx = ipsec->rx_ipv6;
+	int err;
 
-	mutex_lock(&rx->ft.mutex);
 	if (rx->ft.refcnt)
 		goto skip;
 
-	/* create FT */
 	err = rx_create(mdev, ipsec, rx, family);
 	if (err)
-		goto out;
-
-	/* connect */
-	dest.type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE;
-	dest.ft = rx->ft.pol;
-	mlx5_ttc_fwd_dest(ttc, family2tt(family), &dest);
+		return err;
 
 skip:
 	rx->ft.refcnt++;
-out:
+	return 0;
+}
+
+static void rx_put(struct mlx5e_ipsec *ipsec, struct mlx5e_ipsec_rx *rx,
+		   u32 family)
+{
+	if (--rx->ft.refcnt)
+		return;
+
+	rx_destroy(ipsec->mdev, ipsec, rx, family);
+}
+
+static struct mlx5e_ipsec_rx *rx_ft_get(struct mlx5_core_dev *mdev,
+					struct mlx5e_ipsec *ipsec, u32 family)
+{
+	struct mlx5e_ipsec_rx *rx = ipsec_rx(ipsec, family);
+	int err;
+
+	mutex_lock(&rx->ft.mutex);
+	err = rx_get(mdev, ipsec, rx, family);
 	mutex_unlock(&rx->ft.mutex);
 	if (err)
 		return ERR_PTR(err);
+
 	return rx;
 }
 
-static void rx_ft_put(struct mlx5_core_dev *mdev, struct mlx5e_ipsec *ipsec,
-		      u32 family)
+static struct mlx5_flow_table *rx_ft_get_policy(struct mlx5_core_dev *mdev,
+						struct mlx5e_ipsec *ipsec,
+						u32 family, u32 prio)
 {
-	struct mlx5_ttc_table *ttc = mlx5e_fs_get_ttc(ipsec->fs, false);
-	struct mlx5e_ipsec_rx *rx;
+	struct mlx5e_ipsec_rx *rx = ipsec_rx(ipsec, family);
+	struct mlx5_flow_table *ft;
+	int err;
 
-	if (family == AF_INET)
-		rx = ipsec->rx_ipv4;
-	else
-		rx = ipsec->rx_ipv6;
+	mutex_lock(&rx->ft.mutex);
+	err = rx_get(mdev, ipsec, rx, family);
+	if (err)
+		goto err_get;
+
+	ft = rx->chains ? ipsec_chains_get_table(rx->chains, prio) : rx->ft.pol;
+	if (IS_ERR(ft)) {
+		err = PTR_ERR(ft);
+		goto err_get_ft;
+	}
+
+	mutex_unlock(&rx->ft.mutex);
+	return ft;
+
+err_get_ft:
+	rx_put(ipsec, rx, family);
+err_get:
+	mutex_unlock(&rx->ft.mutex);
+	return ERR_PTR(err);
+}
+
+static void rx_ft_put(struct mlx5e_ipsec *ipsec, u32 family)
+{
+	struct mlx5e_ipsec_rx *rx = ipsec_rx(ipsec, family);
 
 	mutex_lock(&rx->ft.mutex);
-	rx->ft.refcnt--;
-	if (rx->ft.refcnt)
-		goto out;
+	rx_put(ipsec, rx, family);
+	mutex_unlock(&rx->ft.mutex);
+}
 
-	/* disconnect */
-	mlx5_ttc_fwd_default_dest(ttc, family2tt(family));
+static void rx_ft_put_policy(struct mlx5e_ipsec *ipsec, u32 family, u32 prio)
+{
+	struct mlx5e_ipsec_rx *rx = ipsec_rx(ipsec, family);
 
-	/* remove FT */
-	rx_destroy(mdev, ipsec, rx, family);
+	mutex_lock(&rx->ft.mutex);
+	if (rx->chains)
+		ipsec_chains_put_table(rx->chains, prio);
 
-out:
+	rx_put(ipsec, rx, family);
 	mutex_unlock(&rx->ft.mutex);
 }
 
+static int ipsec_counter_rule_tx(struct mlx5_core_dev *mdev, struct mlx5e_ipsec_tx *tx)
+{
+	struct mlx5_flow_destination dest = {};
+	struct mlx5_flow_act flow_act = {};
+	struct mlx5_flow_handle *fte;
+	struct mlx5_flow_spec *spec;
+	int err;
+
+	spec = kvzalloc(sizeof(*spec), GFP_KERNEL);
+	if (!spec)
+		return -ENOMEM;
+
+	/* create fte */
+	flow_act.action = MLX5_FLOW_CONTEXT_ACTION_ALLOW |
+			  MLX5_FLOW_CONTEXT_ACTION_COUNT;
+	dest.type = MLX5_FLOW_DESTINATION_TYPE_COUNTER;
+	dest.counter_id = mlx5_fc_id(tx->fc->cnt);
+	fte = mlx5_add_flow_rules(tx->ft.status, spec, &flow_act, &dest, 1);
+	if (IS_ERR(fte)) {
+		err = PTR_ERR(fte);
+		mlx5_core_err(mdev, "Fail to add ipsec tx counter rule err=%d\n", err);
+		goto err_rule;
+	}
+
+	kvfree(spec);
+	tx->status.rule = fte;
+	return 0;
+
+err_rule:
+	kvfree(spec);
+	return err;
+}
+
 /* IPsec TX flow steering */
+static void tx_destroy(struct mlx5_core_dev *mdev, struct mlx5e_ipsec_tx *tx,
+		       struct mlx5_ipsec_fs *roce)
+{
+	mlx5_ipsec_fs_roce_tx_destroy(roce);
+	if (tx->chains) {
+		ipsec_chains_destroy(tx->chains);
+	} else {
+		mlx5_del_flow_rules(tx->pol.rule);
+		mlx5_destroy_flow_group(tx->pol.group);
+		mlx5_destroy_flow_table(tx->ft.pol);
+	}
+
+	mlx5_destroy_flow_table(tx->ft.sa);
+	if (tx->allow_tunnel_mode)
+		mlx5_eswitch_unblock_encap(mdev);
+	mlx5_del_flow_rules(tx->status.rule);
+	mlx5_destroy_flow_table(tx->ft.status);
+}
+
 static int tx_create(struct mlx5_core_dev *mdev, struct mlx5e_ipsec_tx *tx,
 		     struct mlx5_ipsec_fs *roce)
 {
 	struct mlx5_flow_destination dest = {};
 	struct mlx5_flow_table *ft;
+	u32 flags = 0;
 	int err;
 
-	ft = ipsec_ft_create(tx->ns, 1, 0, 4);
+	ft = ipsec_ft_create(tx->ns, 2, 0, 1, 0);
 	if (IS_ERR(ft))
 		return PTR_ERR(ft);
+	tx->ft.status = ft;
+
+	err = ipsec_counter_rule_tx(mdev, tx);
+	if (err)
+		goto err_status_rule;
 
+	if (mlx5_ipsec_device_caps(mdev) & MLX5_IPSEC_CAP_TUNNEL)
+		tx->allow_tunnel_mode = mlx5_eswitch_block_encap(mdev);
+	if (tx->allow_tunnel_mode)
+		flags = MLX5_FLOW_TABLE_TUNNEL_EN_REFORMAT;
+	ft = ipsec_ft_create(tx->ns, 1, 0, 4, flags);
+	if (IS_ERR(ft)) {
+		err = PTR_ERR(ft);
+		goto err_sa_ft;
+	}
 	tx->ft.sa = ft;
 
-	ft = ipsec_ft_create(tx->ns, 0, 0, 2);
+	if (mlx5_ipsec_device_caps(mdev) & MLX5_IPSEC_CAP_PRIO) {
+		tx->chains = ipsec_chains_create(
+			mdev, tx->ft.sa, MLX5_FLOW_NAMESPACE_EGRESS_IPSEC, 0, 0,
+			&tx->ft.pol);
+		if (IS_ERR(tx->chains)) {
+			err = PTR_ERR(tx->chains);
+			goto err_pol_ft;
+		}
+
+		goto connect_roce;
+	}
+
+	ft = ipsec_ft_create(tx->ns, 0, 0, 2, 0);
 	if (IS_ERR(ft)) {
 		err = PTR_ERR(ft);
 		goto err_pol_ft;
@@ -356,44 +572,102 @@ static int tx_create(struct mlx5_core_dev *mdev, struct mlx5e_ipsec_tx *tx,
 	dest.type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE;
 	dest.ft = tx->ft.sa;
 	err = ipsec_miss_create(mdev, tx->ft.pol, &tx->pol, &dest);
-	if (err)
-		goto err_pol_miss;
+	if (err) {
+		mlx5_destroy_flow_table(tx->ft.pol);
+		goto err_pol_ft;
+	}
 
+connect_roce:
 	err = mlx5_ipsec_fs_roce_tx_create(mdev, roce, tx->ft.pol);
 	if (err)
 		goto err_roce;
 	return 0;
 
 err_roce:
-	mlx5_del_flow_rules(tx->pol.rule);
-	mlx5_destroy_flow_group(tx->pol.group);
-err_pol_miss:
-	mlx5_destroy_flow_table(tx->ft.pol);
+	if (tx->chains) {
+		ipsec_chains_destroy(tx->chains);
+	} else {
+		mlx5_del_flow_rules(tx->pol.rule);
+		mlx5_destroy_flow_group(tx->pol.group);
+		mlx5_destroy_flow_table(tx->ft.pol);
+	}
 err_pol_ft:
 	mlx5_destroy_flow_table(tx->ft.sa);
+err_sa_ft:
+	if (tx->allow_tunnel_mode)
+		mlx5_eswitch_unblock_encap(mdev);
+	mlx5_del_flow_rules(tx->status.rule);
+err_status_rule:
+	mlx5_destroy_flow_table(tx->ft.status);
 	return err;
 }
 
-static struct mlx5e_ipsec_tx *tx_ft_get(struct mlx5_core_dev *mdev,
-					struct mlx5e_ipsec *ipsec)
+static int tx_get(struct mlx5_core_dev *mdev, struct mlx5e_ipsec *ipsec,
+		  struct mlx5e_ipsec_tx *tx)
 {
-	struct mlx5e_ipsec_tx *tx = ipsec->tx;
-	int err = 0;
+	int err;
 
-	mutex_lock(&tx->ft.mutex);
 	if (tx->ft.refcnt)
 		goto skip;
 
 	err = tx_create(mdev, tx, ipsec->roce);
 	if (err)
-		goto out;
+		return err;
 
 skip:
 	tx->ft.refcnt++;
-out:
+	return 0;
+}
+
+static void tx_put(struct mlx5e_ipsec *ipsec, struct mlx5e_ipsec_tx *tx)
+{
+	if (--tx->ft.refcnt)
+		return;
+
+	tx_destroy(ipsec->mdev, tx, ipsec->roce);
+}
+
+static struct mlx5_flow_table *tx_ft_get_policy(struct mlx5_core_dev *mdev,
+						struct mlx5e_ipsec *ipsec,
+						u32 prio)
+{
+	struct mlx5e_ipsec_tx *tx = ipsec->tx;
+	struct mlx5_flow_table *ft;
+	int err;
+
+	mutex_lock(&tx->ft.mutex);
+	err = tx_get(mdev, ipsec, tx);
+	if (err)
+		goto err_get;
+
+	ft = tx->chains ? ipsec_chains_get_table(tx->chains, prio) : tx->ft.pol;
+	if (IS_ERR(ft)) {
+		err = PTR_ERR(ft);
+		goto err_get_ft;
+	}
+
+	mutex_unlock(&tx->ft.mutex);
+	return ft;
+
+err_get_ft:
+	tx_put(ipsec, tx);
+err_get:
+	mutex_unlock(&tx->ft.mutex);
+	return ERR_PTR(err);
+}
+
+static struct mlx5e_ipsec_tx *tx_ft_get(struct mlx5_core_dev *mdev,
+					struct mlx5e_ipsec *ipsec)
+{
+	struct mlx5e_ipsec_tx *tx = ipsec->tx;
+	int err;
+
+	mutex_lock(&tx->ft.mutex);
+	err = tx_get(mdev, ipsec, tx);
 	mutex_unlock(&tx->ft.mutex);
 	if (err)
 		return ERR_PTR(err);
+
 	return tx;
 }
 
@@ -402,53 +676,72 @@ static void tx_ft_put(struct mlx5e_ipsec *ipsec)
 	struct mlx5e_ipsec_tx *tx = ipsec->tx;
 
 	mutex_lock(&tx->ft.mutex);
-	tx->ft.refcnt--;
-	if (tx->ft.refcnt)
-		goto out;
+	tx_put(ipsec, tx);
+	mutex_unlock(&tx->ft.mutex);
+}
 
-	mlx5_ipsec_fs_roce_tx_destroy(ipsec->roce);
-	mlx5_del_flow_rules(tx->pol.rule);
-	mlx5_destroy_flow_group(tx->pol.group);
-	mlx5_destroy_flow_table(tx->ft.pol);
-	mlx5_destroy_flow_table(tx->ft.sa);
-out:
+static void tx_ft_put_policy(struct mlx5e_ipsec *ipsec, u32 prio)
+{
+	struct mlx5e_ipsec_tx *tx = ipsec->tx;
+
+	mutex_lock(&tx->ft.mutex);
+	if (tx->chains)
+		ipsec_chains_put_table(tx->chains, prio);
+
+	tx_put(ipsec, tx);
 	mutex_unlock(&tx->ft.mutex);
 }
 
 static void setup_fte_addr4(struct mlx5_flow_spec *spec, __be32 *saddr,
 			    __be32 *daddr)
 {
+	if (!*saddr && !*daddr)
+		return;
+
 	spec->match_criteria_enable |= MLX5_MATCH_OUTER_HEADERS;
 
 	MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria, outer_headers.ip_version);
 	MLX5_SET(fte_match_param, spec->match_value, outer_headers.ip_version, 4);
 
-	memcpy(MLX5_ADDR_OF(fte_match_param, spec->match_value,
-			    outer_headers.src_ipv4_src_ipv6.ipv4_layout.ipv4), saddr, 4);
-	memcpy(MLX5_ADDR_OF(fte_match_param, spec->match_value,
-			    outer_headers.dst_ipv4_dst_ipv6.ipv4_layout.ipv4), daddr, 4);
-	MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria,
-			 outer_headers.src_ipv4_src_ipv6.ipv4_layout.ipv4);
-	MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria,
-			 outer_headers.dst_ipv4_dst_ipv6.ipv4_layout.ipv4);
+	if (*saddr) {
+		memcpy(MLX5_ADDR_OF(fte_match_param, spec->match_value,
+				    outer_headers.src_ipv4_src_ipv6.ipv4_layout.ipv4), saddr, 4);
+		MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria,
+				 outer_headers.src_ipv4_src_ipv6.ipv4_layout.ipv4);
+	}
+
+	if (*daddr) {
+		memcpy(MLX5_ADDR_OF(fte_match_param, spec->match_value,
+				    outer_headers.dst_ipv4_dst_ipv6.ipv4_layout.ipv4), daddr, 4);
+		MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria,
+				 outer_headers.dst_ipv4_dst_ipv6.ipv4_layout.ipv4);
+	}
 }
 
 static void setup_fte_addr6(struct mlx5_flow_spec *spec, __be32 *saddr,
 			    __be32 *daddr)
 {
+	if (addr6_all_zero(saddr) && addr6_all_zero(daddr))
+		return;
+
 	spec->match_criteria_enable |= MLX5_MATCH_OUTER_HEADERS;
 
 	MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria, outer_headers.ip_version);
 	MLX5_SET(fte_match_param, spec->match_value, outer_headers.ip_version, 6);
 
-	memcpy(MLX5_ADDR_OF(fte_match_param, spec->match_value,
-			    outer_headers.src_ipv4_src_ipv6.ipv6_layout.ipv6), saddr, 16);
-	memcpy(MLX5_ADDR_OF(fte_match_param, spec->match_value,
-			    outer_headers.dst_ipv4_dst_ipv6.ipv6_layout.ipv6), daddr, 16);
-	memset(MLX5_ADDR_OF(fte_match_param, spec->match_criteria,
-			    outer_headers.src_ipv4_src_ipv6.ipv6_layout.ipv6), 0xff, 16);
-	memset(MLX5_ADDR_OF(fte_match_param, spec->match_criteria,
-			    outer_headers.dst_ipv4_dst_ipv6.ipv6_layout.ipv6), 0xff, 16);
+	if (!addr6_all_zero(saddr)) {
+		memcpy(MLX5_ADDR_OF(fte_match_param, spec->match_value,
+				    outer_headers.src_ipv4_src_ipv6.ipv6_layout.ipv6), saddr, 16);
+		memset(MLX5_ADDR_OF(fte_match_param, spec->match_criteria,
+				    outer_headers.src_ipv4_src_ipv6.ipv6_layout.ipv6), 0xff, 16);
+	}
+
+	if (!addr6_all_zero(daddr)) {
+		memcpy(MLX5_ADDR_OF(fte_match_param, spec->match_value,
+				    outer_headers.dst_ipv4_dst_ipv6.ipv6_layout.ipv6), daddr, 16);
+		memset(MLX5_ADDR_OF(fte_match_param, spec->match_criteria,
+				    outer_headers.dst_ipv4_dst_ipv6.ipv6_layout.ipv6), 0xff, 16);
+	}
 }
 
 static void setup_fte_esp(struct mlx5_flow_spec *spec)
@@ -560,40 +853,181 @@ static int setup_modify_header(struct mlx5_core_dev *mdev, u32 val, u8 dir,
 	return 0;
 }
 
+static int
+setup_pkt_tunnel_reformat(struct mlx5_core_dev *mdev,
+			  struct mlx5_accel_esp_xfrm_attrs *attrs,
+			  struct mlx5_pkt_reformat_params *reformat_params)
+{
+	struct ip_esp_hdr *esp_hdr;
+	struct ipv6hdr *ipv6hdr;
+	struct ethhdr *eth_hdr;
+	struct iphdr *iphdr;
+	char *reformatbf;
+	size_t bfflen;
+	void *hdr;
+
+	bfflen = sizeof(*eth_hdr);
+
+	if (attrs->dir == XFRM_DEV_OFFLOAD_OUT) {
+		bfflen += sizeof(*esp_hdr) + 8;
+
+		switch (attrs->family) {
+		case AF_INET:
+			bfflen += sizeof(*iphdr);
+			break;
+		case AF_INET6:
+			bfflen += sizeof(*ipv6hdr);
+			break;
+		default:
+			return -EINVAL;
+		}
+	}
+
+	reformatbf = kzalloc(bfflen, GFP_KERNEL);
+	if (!reformatbf)
+		return -ENOMEM;
+
+	eth_hdr = (struct ethhdr *)reformatbf;
+	switch (attrs->family) {
+	case AF_INET:
+		eth_hdr->h_proto = htons(ETH_P_IP);
+		break;
+	case AF_INET6:
+		eth_hdr->h_proto = htons(ETH_P_IPV6);
+		break;
+	default:
+		goto free_reformatbf;
+	}
+
+	ether_addr_copy(eth_hdr->h_dest, attrs->dmac);
+	ether_addr_copy(eth_hdr->h_source, attrs->smac);
+
+	switch (attrs->dir) {
+	case XFRM_DEV_OFFLOAD_IN:
+		reformat_params->type = MLX5_REFORMAT_TYPE_L3_ESP_TUNNEL_TO_L2;
+		break;
+	case XFRM_DEV_OFFLOAD_OUT:
+		reformat_params->type = MLX5_REFORMAT_TYPE_L2_TO_L3_ESP_TUNNEL;
+		reformat_params->param_0 = attrs->authsize;
+
+		hdr = reformatbf + sizeof(*eth_hdr);
+		switch (attrs->family) {
+		case AF_INET:
+			iphdr = (struct iphdr *)hdr;
+			memcpy(&iphdr->saddr, &attrs->saddr.a4, 4);
+			memcpy(&iphdr->daddr, &attrs->daddr.a4, 4);
+			iphdr->version = 4;
+			iphdr->ihl = 5;
+			iphdr->ttl = IPSEC_TUNNEL_DEFAULT_TTL;
+			iphdr->protocol = IPPROTO_ESP;
+			hdr += sizeof(*iphdr);
+			break;
+		case AF_INET6:
+			ipv6hdr = (struct ipv6hdr *)hdr;
+			memcpy(&ipv6hdr->saddr, &attrs->saddr.a6, 16);
+			memcpy(&ipv6hdr->daddr, &attrs->daddr.a6, 16);
+			ipv6hdr->nexthdr = IPPROTO_ESP;
+			ipv6hdr->version = 6;
+			ipv6hdr->hop_limit = IPSEC_TUNNEL_DEFAULT_TTL;
+			hdr += sizeof(*ipv6hdr);
+			break;
+		default:
+			goto free_reformatbf;
+		}
+
+		esp_hdr = (struct ip_esp_hdr *)hdr;
+		esp_hdr->spi = htonl(attrs->spi);
+		break;
+	default:
+		goto free_reformatbf;
+	}
+
+	reformat_params->size = bfflen;
+	reformat_params->data = reformatbf;
+	return 0;
+
+free_reformatbf:
+	kfree(reformatbf);
+	return -EINVAL;
+}
+
+static int
+setup_pkt_transport_reformat(struct mlx5_accel_esp_xfrm_attrs *attrs,
+			     struct mlx5_pkt_reformat_params *reformat_params)
+{
+	u8 *reformatbf;
+	__be32 spi;
+
+	switch (attrs->dir) {
+	case XFRM_DEV_OFFLOAD_IN:
+		reformat_params->type = MLX5_REFORMAT_TYPE_DEL_ESP_TRANSPORT;
+		break;
+	case XFRM_DEV_OFFLOAD_OUT:
+		if (attrs->family == AF_INET)
+			reformat_params->type =
+				MLX5_REFORMAT_TYPE_ADD_ESP_TRANSPORT_OVER_IPV4;
+		else
+			reformat_params->type =
+				MLX5_REFORMAT_TYPE_ADD_ESP_TRANSPORT_OVER_IPV6;
+
+		reformatbf = kzalloc(MLX5_REFORMAT_TYPE_ADD_ESP_TRANSPORT_SIZE,
+				     GFP_KERNEL);
+		if (!reformatbf)
+			return -ENOMEM;
+
+		/* convert to network format */
+		spi = htonl(attrs->spi);
+		memcpy(reformatbf, &spi, sizeof(spi));
+
+		reformat_params->param_0 = attrs->authsize;
+		reformat_params->size =
+			MLX5_REFORMAT_TYPE_ADD_ESP_TRANSPORT_SIZE;
+		reformat_params->data = reformatbf;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static int setup_pkt_reformat(struct mlx5_core_dev *mdev,
 			      struct mlx5_accel_esp_xfrm_attrs *attrs,
 			      struct mlx5_flow_act *flow_act)
 {
-	enum mlx5_flow_namespace_type ns_type = MLX5_FLOW_NAMESPACE_EGRESS;
 	struct mlx5_pkt_reformat_params reformat_params = {};
 	struct mlx5_pkt_reformat *pkt_reformat;
-	u8 reformatbf[16] = {};
-	__be32 spi;
+	enum mlx5_flow_namespace_type ns_type;
+	int ret;
 
-	if (attrs->dir == XFRM_DEV_OFFLOAD_IN) {
-		reformat_params.type = MLX5_REFORMAT_TYPE_DEL_ESP_TRANSPORT;
+	switch (attrs->dir) {
+	case XFRM_DEV_OFFLOAD_IN:
 		ns_type = MLX5_FLOW_NAMESPACE_KERNEL;
-		goto cmd;
+		break;
+	case XFRM_DEV_OFFLOAD_OUT:
+		ns_type = MLX5_FLOW_NAMESPACE_EGRESS;
+		break;
+	default:
+		return -EINVAL;
 	}
 
-	if (attrs->family == AF_INET)
-		reformat_params.type =
-			MLX5_REFORMAT_TYPE_ADD_ESP_TRANSPORT_OVER_IPV4;
-	else
-		reformat_params.type =
-			MLX5_REFORMAT_TYPE_ADD_ESP_TRANSPORT_OVER_IPV6;
-
-	/* convert to network format */
-	spi = htonl(attrs->spi);
-	memcpy(reformatbf, &spi, 4);
+	switch (attrs->mode) {
+	case XFRM_MODE_TRANSPORT:
+		ret = setup_pkt_transport_reformat(attrs, &reformat_params);
+		break;
+	case XFRM_MODE_TUNNEL:
+		ret = setup_pkt_tunnel_reformat(mdev, attrs, &reformat_params);
+		break;
+	default:
+		ret = -EINVAL;
+	}
 
-	reformat_params.param_0 = attrs->authsize;
-	reformat_params.size = sizeof(reformatbf);
-	reformat_params.data = &reformatbf;
+	if (ret)
+		return ret;
 
-cmd:
 	pkt_reformat =
 		mlx5_packet_reformat_alloc(mdev, &reformat_params, ns_type);
+	kfree(reformat_params.data);
 	if (IS_ERR(pkt_reformat))
 		return PTR_ERR(pkt_reformat);
 
@@ -607,11 +1041,12 @@ static int rx_add_rule(struct mlx5e_ipsec_sa_entry *sa_entry)
 	struct mlx5_accel_esp_xfrm_attrs *attrs = &sa_entry->attrs;
 	struct mlx5_core_dev *mdev = mlx5e_ipsec_sa2dev(sa_entry);
 	struct mlx5e_ipsec *ipsec = sa_entry->ipsec;
-	struct mlx5_flow_destination dest = {};
+	struct mlx5_flow_destination dest[2];
 	struct mlx5_flow_act flow_act = {};
 	struct mlx5_flow_handle *rule;
 	struct mlx5_flow_spec *spec;
 	struct mlx5e_ipsec_rx *rx;
+	struct mlx5_fc *counter;
 	int err;
 
 	rx = rx_ft_get(mdev, ipsec, attrs->family);
@@ -648,14 +1083,25 @@ static int rx_add_rule(struct mlx5e_ipsec_sa_entry *sa_entry)
 		break;
 	}
 
+	counter = mlx5_fc_create(mdev, true);
+	if (IS_ERR(counter)) {
+		err = PTR_ERR(counter);
+		goto err_add_cnt;
+	}
 	flow_act.crypto.type = MLX5_FLOW_CONTEXT_ENCRYPT_DECRYPT_TYPE_IPSEC;
 	flow_act.crypto.obj_id = sa_entry->ipsec_obj_id;
 	flow_act.flags |= FLOW_ACT_NO_APPEND;
-	flow_act.action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST |
-			   MLX5_FLOW_CONTEXT_ACTION_CRYPTO_DECRYPT;
-	dest.type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE;
-	dest.ft = rx->ft.status;
-	rule = mlx5_add_flow_rules(rx->ft.sa, spec, &flow_act, &dest, 1);
+	flow_act.action |= MLX5_FLOW_CONTEXT_ACTION_CRYPTO_DECRYPT |
+			   MLX5_FLOW_CONTEXT_ACTION_COUNT;
+	if (attrs->drop)
+		flow_act.action |= MLX5_FLOW_CONTEXT_ACTION_DROP;
+	else
+		flow_act.action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
+	dest[0].type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE;
+	dest[0].ft = rx->ft.status;
+	dest[1].type = MLX5_FLOW_DESTINATION_TYPE_COUNTER;
+	dest[1].counter_id = mlx5_fc_id(counter);
+	rule = mlx5_add_flow_rules(rx->ft.sa, spec, &flow_act, dest, 2);
 	if (IS_ERR(rule)) {
 		err = PTR_ERR(rule);
 		mlx5_core_err(mdev, "fail to add RX ipsec rule err=%d\n", err);
@@ -665,10 +1111,13 @@ static int rx_add_rule(struct mlx5e_ipsec_sa_entry *sa_entry)
 
 	sa_entry->ipsec_rule.rule = rule;
 	sa_entry->ipsec_rule.modify_hdr = flow_act.modify_hdr;
+	sa_entry->ipsec_rule.fc = counter;
 	sa_entry->ipsec_rule.pkt_reformat = flow_act.pkt_reformat;
 	return 0;
 
 err_add_flow:
+	mlx5_fc_destroy(mdev, counter);
+err_add_cnt:
 	if (flow_act.pkt_reformat)
 		mlx5_packet_reformat_dealloc(mdev, flow_act.pkt_reformat);
 err_pkt_reformat:
@@ -676,7 +1125,7 @@ err_pkt_reformat:
 err_mod_header:
 	kvfree(spec);
 err_alloc:
-	rx_ft_put(mdev, ipsec, attrs->family);
+	rx_ft_put(ipsec, attrs->family);
 	return err;
 }
 
@@ -685,12 +1134,13 @@ static int tx_add_rule(struct mlx5e_ipsec_sa_entry *sa_entry)
 	struct mlx5_accel_esp_xfrm_attrs *attrs = &sa_entry->attrs;
 	struct mlx5_core_dev *mdev = mlx5e_ipsec_sa2dev(sa_entry);
 	struct mlx5e_ipsec *ipsec = sa_entry->ipsec;
-	struct mlx5_flow_destination dest = {};
+	struct mlx5_flow_destination dest[2];
 	struct mlx5_flow_act flow_act = {};
 	struct mlx5_flow_handle *rule;
 	struct mlx5_flow_spec *spec;
 	struct mlx5e_ipsec_tx *tx;
-	int err = 0;
+	struct mlx5_fc *counter;
+	int err;
 
 	tx = tx_ft_get(mdev, ipsec);
 	if (IS_ERR(tx))
@@ -717,7 +1167,8 @@ static int tx_add_rule(struct mlx5e_ipsec_sa_entry *sa_entry)
 		setup_fte_reg_a(spec);
 		break;
 	case XFRM_DEV_OFFLOAD_PACKET:
-		setup_fte_reg_c0(spec, attrs->reqid);
+		if (attrs->reqid)
+			setup_fte_reg_c0(spec, attrs->reqid);
 		err = setup_pkt_reformat(mdev, attrs, &flow_act);
 		if (err)
 			goto err_pkt_reformat;
@@ -726,15 +1177,27 @@ static int tx_add_rule(struct mlx5e_ipsec_sa_entry *sa_entry)
 		break;
 	}
 
+	counter = mlx5_fc_create(mdev, true);
+	if (IS_ERR(counter)) {
+		err = PTR_ERR(counter);
+		goto err_add_cnt;
+	}
+
 	flow_act.crypto.type = MLX5_FLOW_CONTEXT_ENCRYPT_DECRYPT_TYPE_IPSEC;
 	flow_act.crypto.obj_id = sa_entry->ipsec_obj_id;
 	flow_act.flags |= FLOW_ACT_NO_APPEND;
-	flow_act.action |= MLX5_FLOW_CONTEXT_ACTION_ALLOW |
-			   MLX5_FLOW_CONTEXT_ACTION_CRYPTO_ENCRYPT |
+	flow_act.action |= MLX5_FLOW_CONTEXT_ACTION_CRYPTO_ENCRYPT |
 			   MLX5_FLOW_CONTEXT_ACTION_COUNT;
-	dest.type = MLX5_FLOW_DESTINATION_TYPE_COUNTER;
-	dest.counter_id = mlx5_fc_id(tx->fc->cnt);
-	rule = mlx5_add_flow_rules(tx->ft.sa, spec, &flow_act, &dest, 1);
+	if (attrs->drop)
+		flow_act.action |= MLX5_FLOW_CONTEXT_ACTION_DROP;
+	else
+		flow_act.action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
+
+	dest[0].ft = tx->ft.status;
+	dest[0].type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE;
+	dest[1].type = MLX5_FLOW_DESTINATION_TYPE_COUNTER;
+	dest[1].counter_id = mlx5_fc_id(counter);
+	rule = mlx5_add_flow_rules(tx->ft.sa, spec, &flow_act, dest, 2);
 	if (IS_ERR(rule)) {
 		err = PTR_ERR(rule);
 		mlx5_core_err(mdev, "fail to add TX ipsec rule err=%d\n", err);
@@ -743,10 +1206,13 @@ static int tx_add_rule(struct mlx5e_ipsec_sa_entry *sa_entry)
 
 	kvfree(spec);
 	sa_entry->ipsec_rule.rule = rule;
+	sa_entry->ipsec_rule.fc = counter;
 	sa_entry->ipsec_rule.pkt_reformat = flow_act.pkt_reformat;
 	return 0;
 
 err_add_flow:
+	mlx5_fc_destroy(mdev, counter);
+err_add_cnt:
 	if (flow_act.pkt_reformat)
 		mlx5_packet_reformat_dealloc(mdev, flow_act.pkt_reformat);
 err_pkt_reformat:
@@ -760,16 +1226,17 @@ static int tx_add_policy(struct mlx5e_ipsec_pol_entry *pol_entry)
 {
 	struct mlx5_accel_pol_xfrm_attrs *attrs = &pol_entry->attrs;
 	struct mlx5_core_dev *mdev = mlx5e_ipsec_pol2dev(pol_entry);
+	struct mlx5e_ipsec_tx *tx = pol_entry->ipsec->tx;
 	struct mlx5_flow_destination dest[2] = {};
 	struct mlx5_flow_act flow_act = {};
 	struct mlx5_flow_handle *rule;
 	struct mlx5_flow_spec *spec;
-	struct mlx5e_ipsec_tx *tx;
+	struct mlx5_flow_table *ft;
 	int err, dstn = 0;
 
-	tx = tx_ft_get(mdev, pol_entry->ipsec);
-	if (IS_ERR(tx))
-		return PTR_ERR(tx);
+	ft = tx_ft_get_policy(mdev, pol_entry->ipsec, attrs->prio);
+	if (IS_ERR(ft))
+		return PTR_ERR(ft);
 
 	spec = kvzalloc(sizeof(*spec), GFP_KERNEL);
 	if (!spec) {
@@ -785,14 +1252,16 @@ static int tx_add_policy(struct mlx5e_ipsec_pol_entry *pol_entry)
 	setup_fte_no_frags(spec);
 	setup_fte_upper_proto_match(spec, &attrs->upspec);
 
-	err = setup_modify_header(mdev, attrs->reqid, XFRM_DEV_OFFLOAD_OUT,
-				  &flow_act);
-	if (err)
-		goto err_mod_header;
-
 	switch (attrs->action) {
 	case XFRM_POLICY_ALLOW:
 		flow_act.action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
+		if (!attrs->reqid)
+			break;
+
+		err = setup_modify_header(mdev, attrs->reqid,
+					  XFRM_DEV_OFFLOAD_OUT, &flow_act);
+		if (err)
+			goto err_mod_header;
 		break;
 	case XFRM_POLICY_BLOCK:
 		flow_act.action |= MLX5_FLOW_CONTEXT_ACTION_DROP |
@@ -804,14 +1273,14 @@ static int tx_add_policy(struct mlx5e_ipsec_pol_entry *pol_entry)
 	default:
 		WARN_ON(true);
 		err = -EINVAL;
-		goto err_action;
+		goto err_mod_header;
 	}
 
 	flow_act.flags |= FLOW_ACT_NO_APPEND;
 	dest[dstn].ft = tx->ft.sa;
 	dest[dstn].type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE;
 	dstn++;
-	rule = mlx5_add_flow_rules(tx->ft.pol, spec, &flow_act, dest, dstn);
+	rule = mlx5_add_flow_rules(ft, spec, &flow_act, dest, dstn);
 	if (IS_ERR(rule)) {
 		err = PTR_ERR(rule);
 		mlx5_core_err(mdev, "fail to add TX ipsec rule err=%d\n", err);
@@ -824,11 +1293,12 @@ static int tx_add_policy(struct mlx5e_ipsec_pol_entry *pol_entry)
 	return 0;
 
 err_action:
-	mlx5_modify_header_dealloc(mdev, flow_act.modify_hdr);
+	if (flow_act.modify_hdr)
+		mlx5_modify_header_dealloc(mdev, flow_act.modify_hdr);
 err_mod_header:
 	kvfree(spec);
 err_alloc:
-	tx_ft_put(pol_entry->ipsec);
+	tx_ft_put_policy(pol_entry->ipsec, attrs->prio);
 	return err;
 }
 
@@ -840,12 +1310,15 @@ static int rx_add_policy(struct mlx5e_ipsec_pol_entry *pol_entry)
 	struct mlx5_flow_act flow_act = {};
 	struct mlx5_flow_handle *rule;
 	struct mlx5_flow_spec *spec;
+	struct mlx5_flow_table *ft;
 	struct mlx5e_ipsec_rx *rx;
 	int err, dstn = 0;
 
-	rx = rx_ft_get(mdev, pol_entry->ipsec, attrs->family);
-	if (IS_ERR(rx))
-		return PTR_ERR(rx);
+	ft = rx_ft_get_policy(mdev, pol_entry->ipsec, attrs->family, attrs->prio);
+	if (IS_ERR(ft))
+		return PTR_ERR(ft);
+
+	rx = ipsec_rx(pol_entry->ipsec, attrs->family);
 
 	spec = kvzalloc(sizeof(*spec), GFP_KERNEL);
 	if (!spec) {
@@ -880,7 +1353,7 @@ static int rx_add_policy(struct mlx5e_ipsec_pol_entry *pol_entry)
 	dest[dstn].type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE;
 	dest[dstn].ft = rx->ft.sa;
 	dstn++;
-	rule = mlx5_add_flow_rules(rx->ft.pol, spec, &flow_act, dest, dstn);
+	rule = mlx5_add_flow_rules(ft, spec, &flow_act, dest, dstn);
 	if (IS_ERR(rule)) {
 		err = PTR_ERR(rule);
 		mlx5_core_err(mdev, "Fail to add RX IPsec policy rule err=%d\n", err);
@@ -894,7 +1367,7 @@ static int rx_add_policy(struct mlx5e_ipsec_pol_entry *pol_entry)
 err_action:
 	kvfree(spec);
 err_alloc:
-	rx_ft_put(mdev, pol_entry->ipsec, attrs->family);
+	rx_ft_put_policy(pol_entry->ipsec, attrs->family, attrs->prio);
 	return err;
 }
 
@@ -1022,7 +1495,7 @@ void mlx5e_accel_ipsec_fs_del_rule(struct mlx5e_ipsec_sa_entry *sa_entry)
 	struct mlx5_core_dev *mdev = mlx5e_ipsec_sa2dev(sa_entry);
 
 	mlx5_del_flow_rules(ipsec_rule->rule);
-
+	mlx5_fc_destroy(mdev, ipsec_rule->fc);
 	if (ipsec_rule->pkt_reformat)
 		mlx5_packet_reformat_dealloc(mdev, ipsec_rule->pkt_reformat);
 
@@ -1032,7 +1505,7 @@ void mlx5e_accel_ipsec_fs_del_rule(struct mlx5e_ipsec_sa_entry *sa_entry)
 	}
 
 	mlx5_modify_header_dealloc(mdev, ipsec_rule->modify_hdr);
-	rx_ft_put(mdev, sa_entry->ipsec, sa_entry->attrs.family);
+	rx_ft_put(sa_entry->ipsec, sa_entry->attrs.family);
 }
 
 int mlx5e_accel_ipsec_fs_add_pol(struct mlx5e_ipsec_pol_entry *pol_entry)
@@ -1051,12 +1524,15 @@ void mlx5e_accel_ipsec_fs_del_pol(struct mlx5e_ipsec_pol_entry *pol_entry)
 	mlx5_del_flow_rules(ipsec_rule->rule);
 
 	if (pol_entry->attrs.dir == XFRM_DEV_OFFLOAD_IN) {
-		rx_ft_put(mdev, pol_entry->ipsec, pol_entry->attrs.family);
+		rx_ft_put_policy(pol_entry->ipsec, pol_entry->attrs.family,
+				 pol_entry->attrs.prio);
 		return;
 	}
 
-	mlx5_modify_header_dealloc(mdev, ipsec_rule->modify_hdr);
-	tx_ft_put(pol_entry->ipsec);
+	if (ipsec_rule->modify_hdr)
+		mlx5_modify_header_dealloc(mdev, ipsec_rule->modify_hdr);
+
+	tx_ft_put_policy(pol_entry->ipsec, pol_entry->attrs.prio);
 }
 
 void mlx5e_accel_ipsec_fs_cleanup(struct mlx5e_ipsec *ipsec)
@@ -1126,3 +1602,31 @@ err_rx_ipv4:
 	kfree(ipsec->tx);
 	return err;
 }
+
+void mlx5e_accel_ipsec_fs_modify(struct mlx5e_ipsec_sa_entry *sa_entry)
+{
+	struct mlx5e_ipsec_sa_entry sa_entry_shadow = {};
+	int err;
+
+	memcpy(&sa_entry_shadow, sa_entry, sizeof(*sa_entry));
+	memset(&sa_entry_shadow.ipsec_rule, 0x00, sizeof(sa_entry->ipsec_rule));
+
+	err = mlx5e_accel_ipsec_fs_add_rule(&sa_entry_shadow);
+	if (err)
+		return;
+
+	mlx5e_accel_ipsec_fs_del_rule(sa_entry);
+	memcpy(sa_entry, &sa_entry_shadow, sizeof(*sa_entry));
+}
+
+bool mlx5e_ipsec_fs_tunnel_enabled(struct mlx5e_ipsec_sa_entry *sa_entry)
+{
+	struct mlx5e_ipsec_rx *rx =
+		ipsec_rx(sa_entry->ipsec, sa_entry->attrs.family);
+	struct mlx5e_ipsec_tx *tx = sa_entry->ipsec->tx;
+
+	if (sa_entry->attrs.dir == XFRM_DEV_OFFLOAD_OUT)
+		return tx->allow_tunnel_mode;
+
+	return rx->allow_tunnel_mode;
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_offload.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_offload.c
index 5fa7a4c40429..df90e19066bc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_offload.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_offload.c
@@ -8,6 +8,7 @@
 
 enum {
 	MLX5_IPSEC_ASO_REMOVE_FLOW_PKT_CNT_OFFSET,
+	MLX5_IPSEC_ASO_REMOVE_FLOW_SOFT_LFT_OFFSET,
 };
 
 u32 mlx5_ipsec_device_caps(struct mlx5_core_dev *mdev)
@@ -36,11 +37,24 @@ u32 mlx5_ipsec_device_caps(struct mlx5_core_dev *mdev)
 	    MLX5_CAP_ETH(mdev, insert_trailer) && MLX5_CAP_ETH(mdev, swp))
 		caps |= MLX5_IPSEC_CAP_CRYPTO;
 
-	if (MLX5_CAP_IPSEC(mdev, ipsec_full_offload) &&
-	    MLX5_CAP_FLOWTABLE_NIC_TX(mdev, reformat_add_esp_trasport) &&
-	    MLX5_CAP_FLOWTABLE_NIC_RX(mdev, reformat_del_esp_trasport) &&
-	    MLX5_CAP_FLOWTABLE_NIC_RX(mdev, decap))
-		caps |= MLX5_IPSEC_CAP_PACKET_OFFLOAD;
+	if (MLX5_CAP_IPSEC(mdev, ipsec_full_offload)) {
+		if (MLX5_CAP_FLOWTABLE_NIC_TX(mdev,
+					      reformat_add_esp_trasport) &&
+		    MLX5_CAP_FLOWTABLE_NIC_RX(mdev,
+					      reformat_del_esp_trasport) &&
+		    MLX5_CAP_FLOWTABLE_NIC_RX(mdev, decap))
+			caps |= MLX5_IPSEC_CAP_PACKET_OFFLOAD;
+
+		if (MLX5_CAP_FLOWTABLE_NIC_TX(mdev, ignore_flow_level) &&
+		    MLX5_CAP_FLOWTABLE_NIC_RX(mdev, ignore_flow_level))
+			caps |= MLX5_IPSEC_CAP_PRIO;
+
+		if (MLX5_CAP_FLOWTABLE_NIC_TX(mdev,
+					      reformat_l2_to_l3_esp_tunnel) &&
+		    MLX5_CAP_FLOWTABLE_NIC_RX(mdev,
+					      reformat_l3_esp_tunnel_to_l2))
+			caps |= MLX5_IPSEC_CAP_TUNNEL;
+	}
 
 	if (mlx5_get_roce_state(mdev) &&
 	    MLX5_CAP_GEN_2(mdev, flow_table_type_2_type) & MLX5_FT_NIC_RX_2_NIC_RX_RDMA &&
@@ -68,15 +82,17 @@ static void mlx5e_ipsec_packet_setup(void *obj, u32 pdn,
 	void *aso_ctx;
 
 	aso_ctx = MLX5_ADDR_OF(ipsec_obj, obj, ipsec_aso);
-	if (attrs->esn_trigger) {
+	if (attrs->replay_esn.trigger) {
 		MLX5_SET(ipsec_aso, aso_ctx, esn_event_arm, 1);
 
 		if (attrs->dir == XFRM_DEV_OFFLOAD_IN) {
 			MLX5_SET(ipsec_aso, aso_ctx, window_sz,
-				 attrs->replay_window / 64);
+				 attrs->replay_esn.replay_window / 64);
 			MLX5_SET(ipsec_aso, aso_ctx, mode,
 				 MLX5_IPSEC_ASO_REPLAY_PROTECTION);
-		}
+		}
+		MLX5_SET(ipsec_aso, aso_ctx, mode_parameter,
+			 attrs->replay_esn.esn);
 	}
 
 	/* ASO context */
@@ -93,15 +109,15 @@ static void mlx5e_ipsec_packet_setup(void *obj, u32 pdn,
 	if (attrs->dir == XFRM_DEV_OFFLOAD_OUT)
 		MLX5_SET(ipsec_aso, aso_ctx, mode, MLX5_IPSEC_ASO_INC_SN);
 
-	if (attrs->hard_packet_limit != XFRM_INF) {
+	if (attrs->lft.hard_packet_limit != XFRM_INF) {
 		MLX5_SET(ipsec_aso, aso_ctx, remove_flow_pkt_cnt,
-			 lower_32_bits(attrs->hard_packet_limit));
+			 attrs->lft.hard_packet_limit);
 		MLX5_SET(ipsec_aso, aso_ctx, hard_lft_arm, 1);
 	}
 
-	if (attrs->soft_packet_limit != XFRM_INF) {
+	if (attrs->lft.soft_packet_limit != XFRM_INF) {
 		MLX5_SET(ipsec_aso, aso_ctx, remove_flow_soft_lft,
-			 lower_32_bits(attrs->soft_packet_limit));
+			 attrs->lft.soft_packet_limit);
 		MLX5_SET(ipsec_aso, aso_ctx, soft_lft_arm, 1);
 	}
 
@@ -128,10 +144,10 @@ static int mlx5_create_ipsec_obj(struct mlx5e_ipsec_sa_entry *sa_entry)
 	salt_iv_p = MLX5_ADDR_OF(ipsec_obj, obj, implicit_iv);
 	memcpy(salt_iv_p, &aes_gcm->seq_iv, sizeof(aes_gcm->seq_iv));
 	/* esn */
-	if (attrs->esn_trigger) {
+	if (attrs->replay_esn.trigger) {
 		MLX5_SET(ipsec_obj, obj, esn_en, 1);
-		MLX5_SET(ipsec_obj, obj, esn_msb, attrs->esn);
-		MLX5_SET(ipsec_obj, obj, esn_overlap, attrs->esn_overlap);
+		MLX5_SET(ipsec_obj, obj, esn_msb, attrs->replay_esn.esn_msb);
+		MLX5_SET(ipsec_obj, obj, esn_overlap, attrs->replay_esn.overlap);
 	}
 
 	MLX5_SET(ipsec_obj, obj, dekn, sa_entry->enc_key_id);
@@ -217,9 +233,6 @@ static int mlx5_modify_ipsec_obj(struct mlx5e_ipsec_sa_entry *sa_entry,
 	void *obj;
 	int err;
 
-	if (!attrs->esn_trigger)
-		return 0;
-
 	general_obj_types = MLX5_CAP_GEN_64(mdev, general_obj_types);
 	if (!(general_obj_types & MLX5_HCA_CAP_GENERAL_OBJECT_TYPES_IPSEC))
 		return -EINVAL;
@@ -247,8 +260,8 @@ static int mlx5_modify_ipsec_obj(struct mlx5e_ipsec_sa_entry *sa_entry,
 	MLX5_SET64(ipsec_obj, obj, modify_field_select,
 		   MLX5_MODIFY_IPSEC_BITMASK_ESN_OVERLAP |
 			   MLX5_MODIFY_IPSEC_BITMASK_ESN_MSB);
-	MLX5_SET(ipsec_obj, obj, esn_msb, attrs->esn);
-	MLX5_SET(ipsec_obj, obj, esn_overlap, attrs->esn_overlap);
+	MLX5_SET(ipsec_obj, obj, esn_msb, attrs->replay_esn.esn_msb);
+	MLX5_SET(ipsec_obj, obj, esn_overlap, attrs->replay_esn.overlap);
 
 	/* general object fields set */
 	MLX5_SET(general_obj_in_cmd_hdr, in, opcode, MLX5_CMD_OP_MODIFY_GENERAL_OBJECT);
@@ -268,29 +281,24 @@ void mlx5_accel_esp_modify_xfrm(struct mlx5e_ipsec_sa_entry *sa_entry,
 	memcpy(&sa_entry->attrs, attrs, sizeof(sa_entry->attrs));
 }
 
-static void
-mlx5e_ipsec_aso_update_esn(struct mlx5e_ipsec_sa_entry *sa_entry,
-			   const struct mlx5_accel_esp_xfrm_attrs *attrs)
+static void mlx5e_ipsec_aso_update(struct mlx5e_ipsec_sa_entry *sa_entry,
+				   struct mlx5_wqe_aso_ctrl_seg *data)
 {
-	struct mlx5_wqe_aso_ctrl_seg data = {};
+	data->data_mask_mode = MLX5_ASO_DATA_MASK_MODE_BITWISE_64BIT << 6;
+	data->condition_1_0_operand = MLX5_ASO_ALWAYS_TRUE |
+				      MLX5_ASO_ALWAYS_TRUE << 4;
 
-	data.data_mask_mode = MLX5_ASO_DATA_MASK_MODE_BITWISE_64BIT << 6;
-	data.condition_1_0_operand = MLX5_ASO_ALWAYS_TRUE | MLX5_ASO_ALWAYS_TRUE
-								    << 4;
-	data.data_offset_condition_operand = MLX5_IPSEC_ASO_REMOVE_FLOW_PKT_CNT_OFFSET;
-	data.bitwise_data = cpu_to_be64(BIT_ULL(54));
-	data.data_mask = data.bitwise_data;
-
-	mlx5e_ipsec_aso_query(sa_entry, &data);
+	mlx5e_ipsec_aso_query(sa_entry, data);
 }
 
 static void mlx5e_ipsec_update_esn_state(struct mlx5e_ipsec_sa_entry *sa_entry,
 					 u32 mode_param)
 {
 	struct mlx5_accel_esp_xfrm_attrs attrs = {};
+	struct mlx5_wqe_aso_ctrl_seg data = {};
 
 	if (mode_param < MLX5E_IPSEC_ESN_SCOPE_MID) {
-		sa_entry->esn_state.esn++;
+		sa_entry->esn_state.esn_msb++;
 		sa_entry->esn_state.overlap = 0;
 	} else {
 		sa_entry->esn_state.overlap = 1;
@@ -298,25 +306,129 @@ static void mlx5e_ipsec_update_esn_state(struct mlx5e_ipsec_sa_entry *sa_entry,
 
 	mlx5e_ipsec_build_accel_xfrm_attrs(sa_entry, &attrs);
 	mlx5_accel_esp_modify_xfrm(sa_entry, &attrs);
-	mlx5e_ipsec_aso_update_esn(sa_entry, &attrs);
+
+	data.data_offset_condition_operand =
+		MLX5_IPSEC_ASO_REMOVE_FLOW_PKT_CNT_OFFSET;
+	data.bitwise_data = cpu_to_be64(BIT_ULL(54));
+	data.data_mask = data.bitwise_data;
+
+	mlx5e_ipsec_aso_update(sa_entry, &data);
+}
+
+static void mlx5e_ipsec_aso_update_hard(struct mlx5e_ipsec_sa_entry *sa_entry)
+{
+	struct mlx5_wqe_aso_ctrl_seg data = {};
+
+	data.data_offset_condition_operand =
+		MLX5_IPSEC_ASO_REMOVE_FLOW_PKT_CNT_OFFSET;
+	data.bitwise_data = cpu_to_be64(BIT_ULL(57) + BIT_ULL(31));
+	data.data_mask = data.bitwise_data;
+	mlx5e_ipsec_aso_update(sa_entry, &data);
+}
+
+static void mlx5e_ipsec_aso_update_soft(struct mlx5e_ipsec_sa_entry *sa_entry,
+					u32 val)
+{
+	struct mlx5_wqe_aso_ctrl_seg data = {};
+
+	data.data_offset_condition_operand =
+		MLX5_IPSEC_ASO_REMOVE_FLOW_SOFT_LFT_OFFSET;
+	data.bitwise_data = cpu_to_be64(val);
+	data.data_mask = cpu_to_be64(U32_MAX);
+	mlx5e_ipsec_aso_update(sa_entry, &data);
+}
+
+static void mlx5e_ipsec_handle_limits(struct mlx5e_ipsec_sa_entry *sa_entry)
+{
+	struct mlx5_accel_esp_xfrm_attrs *attrs = &sa_entry->attrs;
+	struct mlx5e_ipsec *ipsec = sa_entry->ipsec;
+	struct mlx5e_ipsec_aso *aso = ipsec->aso;
+	bool soft_arm, hard_arm;
+	u64 hard_cnt;
+
+	lockdep_assert_held(&sa_entry->x->lock);
+
+	soft_arm = !MLX5_GET(ipsec_aso, aso->ctx, soft_lft_arm);
+	hard_arm = !MLX5_GET(ipsec_aso, aso->ctx, hard_lft_arm);
+	if (!soft_arm && !hard_arm)
+		/* It is not lifetime event */
+		return;
+
+	hard_cnt = MLX5_GET(ipsec_aso, aso->ctx, remove_flow_pkt_cnt);
+	if (!hard_cnt || hard_arm) {
+		/* It is possible to see packet counter equal to zero without
+		 * hard limit event armed. Such situation can be if packet
+		 * decreased, while we handled soft limit event.
+		 *
+		 * However it will be HW/FW bug if hard limit event is raised
+		 * and packet counter is not zero.
+		 */
+		WARN_ON_ONCE(hard_arm && hard_cnt);
+
+		/* Notify about hard limit */
+		xfrm_state_check_expire(sa_entry->x);
+		return;
+	}
+
+	/* We are in soft limit event.
*/ + if (!sa_entry->limits.soft_limit_hit && + sa_entry->limits.round == attrs->lft.numb_rounds_soft) { + sa_entry->limits.soft_limit_hit = true; + /* Notify about soft limit */ + xfrm_state_check_expire(sa_entry->x); + + if (sa_entry->limits.round == attrs->lft.numb_rounds_hard) + goto hard; + + if (attrs->lft.soft_packet_limit > BIT_ULL(31)) { + /* We cannot avoid a soft_value that might have the high + * bit set. For instance soft_value=2^31+1 cannot be + * adjusted to the low bit clear version of soft_value=1 + * because it is too close to 0. + * + * Thus we have this corner case where we can hit the + * soft_limit with the high bit set, but cannot adjust + * the counter. Thus we set a temporary interrupt_value + * at least 2^30 away from here and do the adjustment + * then. + */ + mlx5e_ipsec_aso_update_soft(sa_entry, + BIT_ULL(31) - BIT_ULL(30)); + sa_entry->limits.fix_limit = true; + return; + } + + sa_entry->limits.fix_limit = true; + } + +hard: + if (sa_entry->limits.round == attrs->lft.numb_rounds_hard) { + mlx5e_ipsec_aso_update_soft(sa_entry, 0); + attrs->lft.soft_packet_limit = XFRM_INF; + return; + } + + mlx5e_ipsec_aso_update_hard(sa_entry); + sa_entry->limits.round++; + if (sa_entry->limits.round == attrs->lft.numb_rounds_soft) + mlx5e_ipsec_aso_update_soft(sa_entry, + attrs->lft.soft_packet_limit); + if (sa_entry->limits.fix_limit) { + sa_entry->limits.fix_limit = false; + mlx5e_ipsec_aso_update_soft(sa_entry, BIT_ULL(31) - 1); + } } static void mlx5e_ipsec_handle_event(struct work_struct *_work) { struct mlx5e_ipsec_work *work = container_of(_work, struct mlx5e_ipsec_work, work); + struct mlx5e_ipsec_sa_entry *sa_entry = work->data; struct mlx5_accel_esp_xfrm_attrs *attrs; - struct mlx5e_ipsec_sa_entry *sa_entry; struct mlx5e_ipsec_aso *aso; - struct mlx5e_ipsec *ipsec; int ret; - sa_entry = xa_load(&work->ipsec->sadb, work->id); - if (!sa_entry) - goto out; - - ipsec = sa_entry->ipsec; - aso = ipsec->aso; + aso = sa_entry->ipsec->aso; attrs = 
&sa_entry->attrs; spin_lock(&sa_entry->x->lock); @@ -324,21 +436,18 @@ static void mlx5e_ipsec_handle_event(struct work_struct *_work) if (ret) goto unlock; - if (attrs->esn_trigger && + if (attrs->replay_esn.trigger && !MLX5_GET(ipsec_aso, aso->ctx, esn_event_arm)) { u32 mode_param = MLX5_GET(ipsec_aso, aso->ctx, mode_parameter); mlx5e_ipsec_update_esn_state(sa_entry, mode_param); } - if (attrs->soft_packet_limit != XFRM_INF) - if (!MLX5_GET(ipsec_aso, aso->ctx, soft_lft_arm) || - !MLX5_GET(ipsec_aso, aso->ctx, hard_lft_arm)) - xfrm_state_check_expire(sa_entry->x); + if (attrs->lft.soft_packet_limit != XFRM_INF) + mlx5e_ipsec_handle_limits(sa_entry); unlock: spin_unlock(&sa_entry->x->lock); -out: kfree(work); } @@ -346,6 +455,7 @@ static int mlx5e_ipsec_event(struct notifier_block *nb, unsigned long event, void *data) { struct mlx5e_ipsec *ipsec = container_of(nb, struct mlx5e_ipsec, nb); + struct mlx5e_ipsec_sa_entry *sa_entry; struct mlx5_eqe_obj_change *object; struct mlx5e_ipsec_work *work; struct mlx5_eqe *eqe = data; @@ -360,13 +470,16 @@ static int mlx5e_ipsec_event(struct notifier_block *nb, unsigned long event, if (type != MLX5_GENERAL_OBJECT_TYPES_IPSEC) return NOTIFY_DONE; + sa_entry = xa_load(&ipsec->sadb, be32_to_cpu(object->obj_id)); + if (!sa_entry) + return NOTIFY_DONE; + work = kmalloc(sizeof(*work), GFP_ATOMIC); if (!work) return NOTIFY_DONE; INIT_WORK(&work->work, mlx5e_ipsec_handle_event); - work->ipsec = ipsec; - work->id = be32_to_cpu(object->obj_id); + work->data = sa_entry; queue_work(ipsec->wq, &work->work); return NOTIFY_OK; @@ -457,6 +570,7 @@ int mlx5e_ipsec_aso_query(struct mlx5e_ipsec_sa_entry *sa_entry, struct mlx5_wqe_aso_ctrl_seg *ctrl; struct mlx5e_hw_objs *res; struct mlx5_aso_wqe *wqe; + unsigned long expires; u8 ds_cnt; int ret; @@ -478,22 +592,12 @@ int mlx5e_ipsec_aso_query(struct mlx5e_ipsec_sa_entry *sa_entry, mlx5e_ipsec_aso_copy(ctrl, data); mlx5_aso_post_wqe(aso->aso, false, &wqe->ctrl); - ret = 
mlx5_aso_poll_cq(aso->aso, false); + expires = jiffies + msecs_to_jiffies(10); + do { + ret = mlx5_aso_poll_cq(aso->aso, false); + if (ret) + usleep_range(2, 10); + } while (ret && time_is_after_jiffies(expires)); spin_unlock_bh(&aso->lock); return ret; } - -void mlx5e_ipsec_aso_update_curlft(struct mlx5e_ipsec_sa_entry *sa_entry, - u64 *packets) -{ - struct mlx5e_ipsec *ipsec = sa_entry->ipsec; - struct mlx5e_ipsec_aso *aso = ipsec->aso; - u64 hard_cnt; - - hard_cnt = MLX5_GET(ipsec_aso, aso->ctx, remove_flow_pkt_cnt); - /* HW decresases the limit till it reaches zero to fire an avent. - * We need to fix the calculations, so the returned count is a total - * number of passed packets and not how much left. - */ - *packets = sa_entry->attrs.hard_packet_limit - hard_cnt; -} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c index 51f1cd8364c2..6b7b563f844a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec.c @@ -4,6 +4,7 @@ #include <linux/mlx5/device.h> #include <linux/mlx5/mlx5_ifc.h> #include <linux/xarray.h> +#include <linux/if_vlan.h> #include "en.h" #include "lib/aso.h" @@ -348,12 +349,21 @@ static void mlx5e_macsec_cleanup_sa(struct mlx5e_macsec *macsec, sa->macsec_rule = NULL; } +static struct mlx5e_priv *macsec_netdev_priv(const struct net_device *dev) +{ +#if IS_ENABLED(CONFIG_VLAN_8021Q) + if (is_vlan_dev(dev)) + return netdev_priv(vlan_dev_priv(dev)->real_dev); +#endif + return netdev_priv(dev); +} + static int mlx5e_macsec_init_sa(struct macsec_context *ctx, struct mlx5e_macsec_sa *sa, bool encrypt, bool is_tx) { - struct mlx5e_priv *priv = netdev_priv(ctx->netdev); + struct mlx5e_priv *priv = macsec_netdev_priv(ctx->netdev); struct mlx5e_macsec *macsec = priv->macsec; struct mlx5_macsec_rule_attrs rule_attrs; struct mlx5_core_dev *mdev = priv->mdev; @@ -427,7 +437,7 @@ static int 
macsec_rx_sa_active_update(struct macsec_context *ctx, struct mlx5e_macsec_sa *rx_sa, bool active) { - struct mlx5e_priv *priv = netdev_priv(ctx->netdev); + struct mlx5e_priv *priv = macsec_netdev_priv(ctx->netdev); struct mlx5e_macsec *macsec = priv->macsec; int err = 0; @@ -508,9 +518,9 @@ static void update_macsec_epn(struct mlx5e_macsec_sa *sa, const struct macsec_ke static int mlx5e_macsec_add_txsa(struct macsec_context *ctx) { + struct mlx5e_priv *priv = macsec_netdev_priv(ctx->netdev); const struct macsec_tx_sc *tx_sc = &ctx->secy->tx_sc; const struct macsec_tx_sa *ctx_tx_sa = ctx->sa.tx_sa; - struct mlx5e_priv *priv = netdev_priv(ctx->netdev); const struct macsec_secy *secy = ctx->secy; struct mlx5e_macsec_device *macsec_device; struct mlx5_core_dev *mdev = priv->mdev; @@ -583,9 +593,9 @@ out: static int mlx5e_macsec_upd_txsa(struct macsec_context *ctx) { + struct mlx5e_priv *priv = macsec_netdev_priv(ctx->netdev); const struct macsec_tx_sc *tx_sc = &ctx->secy->tx_sc; const struct macsec_tx_sa *ctx_tx_sa = ctx->sa.tx_sa; - struct mlx5e_priv *priv = netdev_priv(ctx->netdev); struct mlx5e_macsec_device *macsec_device; u8 assoc_num = ctx->sa.assoc_num; struct mlx5e_macsec_sa *tx_sa; @@ -645,7 +655,7 @@ out: static int mlx5e_macsec_del_txsa(struct macsec_context *ctx) { - struct mlx5e_priv *priv = netdev_priv(ctx->netdev); + struct mlx5e_priv *priv = macsec_netdev_priv(ctx->netdev); struct mlx5e_macsec_device *macsec_device; u8 assoc_num = ctx->sa.assoc_num; struct mlx5e_macsec_sa *tx_sa; @@ -696,7 +706,7 @@ static u32 mlx5e_macsec_get_sa_from_hashtable(struct rhashtable *sci_hash, sci_t static int mlx5e_macsec_add_rxsc(struct macsec_context *ctx) { struct mlx5e_macsec_rx_sc_xarray_element *sc_xarray_element; - struct mlx5e_priv *priv = netdev_priv(ctx->netdev); + struct mlx5e_priv *priv = macsec_netdev_priv(ctx->netdev); const struct macsec_rx_sc *ctx_rx_sc = ctx->rx_sc; struct mlx5e_macsec_device *macsec_device; struct mlx5e_macsec_rx_sc *rx_sc; @@ -776,7 
+786,7 @@ out: static int mlx5e_macsec_upd_rxsc(struct macsec_context *ctx) { - struct mlx5e_priv *priv = netdev_priv(ctx->netdev); + struct mlx5e_priv *priv = macsec_netdev_priv(ctx->netdev); const struct macsec_rx_sc *ctx_rx_sc = ctx->rx_sc; struct mlx5e_macsec_device *macsec_device; struct mlx5e_macsec_rx_sc *rx_sc; @@ -854,7 +864,7 @@ static void macsec_del_rxsc_ctx(struct mlx5e_macsec *macsec, struct mlx5e_macsec static int mlx5e_macsec_del_rxsc(struct macsec_context *ctx) { - struct mlx5e_priv *priv = netdev_priv(ctx->netdev); + struct mlx5e_priv *priv = macsec_netdev_priv(ctx->netdev); struct mlx5e_macsec_device *macsec_device; struct mlx5e_macsec_rx_sc *rx_sc; struct mlx5e_macsec *macsec; @@ -890,8 +900,8 @@ out: static int mlx5e_macsec_add_rxsa(struct macsec_context *ctx) { + struct mlx5e_priv *priv = macsec_netdev_priv(ctx->netdev); const struct macsec_rx_sa *ctx_rx_sa = ctx->sa.rx_sa; - struct mlx5e_priv *priv = netdev_priv(ctx->netdev); struct mlx5e_macsec_device *macsec_device; struct mlx5_core_dev *mdev = priv->mdev; u8 assoc_num = ctx->sa.assoc_num; @@ -976,8 +986,8 @@ out: static int mlx5e_macsec_upd_rxsa(struct macsec_context *ctx) { + struct mlx5e_priv *priv = macsec_netdev_priv(ctx->netdev); const struct macsec_rx_sa *ctx_rx_sa = ctx->sa.rx_sa; - struct mlx5e_priv *priv = netdev_priv(ctx->netdev); struct mlx5e_macsec_device *macsec_device; u8 assoc_num = ctx->sa.assoc_num; struct mlx5e_macsec_rx_sc *rx_sc; @@ -1033,7 +1043,7 @@ out: static int mlx5e_macsec_del_rxsa(struct macsec_context *ctx) { - struct mlx5e_priv *priv = netdev_priv(ctx->netdev); + struct mlx5e_priv *priv = macsec_netdev_priv(ctx->netdev); struct mlx5e_macsec_device *macsec_device; sci_t sci = ctx->sa.rx_sa->sc->sci; struct mlx5e_macsec_rx_sc *rx_sc; @@ -1085,7 +1095,7 @@ out: static int mlx5e_macsec_add_secy(struct macsec_context *ctx) { - struct mlx5e_priv *priv = netdev_priv(ctx->netdev); + struct mlx5e_priv *priv = macsec_netdev_priv(ctx->netdev); const struct net_device 
*dev = ctx->secy->netdev; const struct net_device *netdev = ctx->netdev; struct mlx5e_macsec_device *macsec_device; @@ -1137,7 +1147,7 @@ out: static int macsec_upd_secy_hw_address(struct macsec_context *ctx, struct mlx5e_macsec_device *macsec_device) { - struct mlx5e_priv *priv = netdev_priv(ctx->netdev); + struct mlx5e_priv *priv = macsec_netdev_priv(ctx->netdev); const struct net_device *dev = ctx->secy->netdev; struct mlx5e_macsec *macsec = priv->macsec; struct mlx5e_macsec_rx_sc *rx_sc, *tmp; @@ -1184,8 +1194,8 @@ out: */ static int mlx5e_macsec_upd_secy(struct macsec_context *ctx) { + struct mlx5e_priv *priv = macsec_netdev_priv(ctx->netdev); const struct macsec_tx_sc *tx_sc = &ctx->secy->tx_sc; - struct mlx5e_priv *priv = netdev_priv(ctx->netdev); const struct net_device *dev = ctx->secy->netdev; struct mlx5e_macsec_device *macsec_device; struct mlx5e_macsec_sa *tx_sa; @@ -1240,7 +1250,7 @@ out: static int mlx5e_macsec_del_secy(struct macsec_context *ctx) { - struct mlx5e_priv *priv = netdev_priv(ctx->netdev); + struct mlx5e_priv *priv = macsec_netdev_priv(ctx->netdev); struct mlx5e_macsec_device *macsec_device; struct mlx5e_macsec_rx_sc *rx_sc, *tmp; struct mlx5e_macsec_sa *tx_sa; @@ -1741,7 +1751,7 @@ void mlx5e_macsec_offload_handle_rx_skb(struct net_device *netdev, { struct mlx5e_macsec_rx_sc_xarray_element *sc_xarray_element; u32 macsec_meta_data = be32_to_cpu(cqe->ft_metadata); - struct mlx5e_priv *priv = netdev_priv(netdev); + struct mlx5e_priv *priv = macsec_netdev_priv(netdev); struct mlx5e_macsec_rx_sc *rx_sc; struct mlx5e_macsec *macsec; u32 fs_id; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec_fs.c index 5b658a5588c6..7fc901a6ec5f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec_fs.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/macsec_fs.c @@ -4,6 +4,7 @@ #include <net/macsec.h> #include <linux/netdevice.h> #include <linux/mlx5/qp.h> 
+#include <linux/if_vlan.h>
 #include "fs_core.h"
 #include "en/fs.h"
 #include "en_accel/macsec_fs.h"
@@ -292,8 +293,6 @@ static int macsec_fs_tx_create(struct mlx5e_macsec_fs *macsec_fs)
 	}
 	/* Tx crypto table MKE rule - MKE packets shouldn't be offloaded */
-	memset(&flow_act, 0, sizeof(flow_act));
-	memset(spec, 0, sizeof(*spec));
 	spec->match_criteria_enable = MLX5_MATCH_OUTER_HEADERS;
 	MLX5_SET_TO_ONES(fte_match_param, spec->match_criteria, outer_headers.ethertype);
@@ -510,6 +509,8 @@ static void macsec_fs_tx_del_rule(struct mlx5e_macsec_fs *macsec_fs,
 	macsec_fs_tx_ft_put(macsec_fs);
 }
+#define MLX5_REFORMAT_PARAM_ADD_MACSEC_OFFSET_4_BYTES 1
+
 static union mlx5e_macsec_rule *
 macsec_fs_tx_add_rule(struct mlx5e_macsec_fs *macsec_fs,
 		      const struct macsec_context *macsec_ctx,
@@ -555,6 +556,10 @@ macsec_fs_tx_add_rule(struct mlx5e_macsec_fs *macsec_fs,
 	reformat_params.type = MLX5_REFORMAT_TYPE_ADD_MACSEC;
 	reformat_params.size = reformat_size;
 	reformat_params.data = reformatbf;
+
+	if (is_vlan_dev(macsec_ctx->netdev))
+		reformat_params.param_0 = MLX5_REFORMAT_PARAM_ADD_MACSEC_OFFSET_4_BYTES;
+
 	flow_act.pkt_reformat = mlx5_packet_reformat_alloc(macsec_fs->mdev,
 							   &reformat_params,
 							   MLX5_FLOW_NAMESPACE_EGRESS_MACSEC);
@@ -1109,7 +1114,6 @@ static void macsec_fs_rx_setup_fte(struct mlx5_flow_spec *spec,
 static union mlx5e_macsec_rule *
 macsec_fs_rx_add_rule(struct mlx5e_macsec_fs *macsec_fs,
-		      const struct macsec_context *macsec_ctx,
 		      struct mlx5_macsec_rule_attrs *attrs,
 		      u32 fs_id)
 {
@@ -1334,7 +1338,7 @@ mlx5e_macsec_fs_add_rule(struct mlx5e_macsec_fs *macsec_fs,
 {
 	return (attrs->action == MLX5_ACCEL_MACSEC_ACTION_ENCRYPT) ?
 		macsec_fs_tx_add_rule(macsec_fs, macsec_ctx, attrs, sa_fs_id) :
-		macsec_fs_rx_add_rule(macsec_fs, macsec_ctx, attrs, *sa_fs_id);
+		macsec_fs_rx_add_rule(macsec_fs, attrs, *sa_fs_id);
 }
 void mlx5e_macsec_fs_del_rule(struct mlx5e_macsec_fs *macsec_fs,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 79fd21ecb9cb..1f5a2110d31f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -220,7 +220,7 @@ static void mlx5e_ethtool_get_speed_arr(struct mlx5_core_dev *mdev,
 					struct ptys2ethtool_config **arr,
 					u32 *size)
 {
-	bool ext = mlx5e_ptys_ext_supported(mdev);
+	bool ext = mlx5_ptys_ext_supported(mdev);
 	*arr = ext ? ptys2ext_ethtool_table : ptys2legacy_ethtool_table;
 	*size = ext ? ARRAY_SIZE(ptys2ext_ethtool_table) :
@@ -895,7 +895,7 @@ static void get_speed_duplex(struct net_device *netdev,
 	if (!netif_carrier_ok(netdev))
 		goto out;
-	speed = mlx5e_port_ptys2speed(priv->mdev, eth_proto_oper, force_legacy);
+	speed = mlx5_port_ptys2speed(priv->mdev, eth_proto_oper, force_legacy);
 	if (!speed) {
 		if (data_rate_oper)
 			speed = 100 * data_rate_oper;
@@ -980,7 +980,7 @@ static void get_lp_advertising(struct mlx5_core_dev *mdev, u32 eth_proto_lp,
 			       struct ethtool_link_ksettings *link_ksettings)
 {
 	unsigned long *lp_advertising = link_ksettings->link_modes.lp_advertising;
-	bool ext = mlx5e_ptys_ext_supported(mdev);
+	bool ext = mlx5_ptys_ext_supported(mdev);
 	ptys2ethtool_adver_link(lp_advertising, eth_proto_lp, ext);
 }
@@ -1160,7 +1160,7 @@ int mlx5e_ethtool_set_link_ksettings(struct mlx5e_priv *priv,
 				     const struct ethtool_link_ksettings *link_ksettings)
 {
 	struct mlx5_core_dev *mdev = priv->mdev;
-	struct mlx5e_port_eth_proto eproto;
+	struct mlx5_port_eth_proto eproto;
 	const unsigned long *adver;
 	bool an_changes = false;
 	u8 an_disable_admin;
@@ -1180,7 +1180,7 @@ int mlx5e_ethtool_set_link_ksettings(struct mlx5e_priv *priv,
 	autoneg = link_ksettings->base.autoneg;
 	speed = link_ksettings->base.speed;
-	ext_supported = mlx5e_ptys_ext_supported(mdev);
+	ext_supported = mlx5_ptys_ext_supported(mdev);
 	ext = ext_requested(autoneg, adver, ext_supported);
 	if (!ext_supported && ext)
 		return -EOPNOTSUPP;
@@ -1194,7 +1194,7 @@ int mlx5e_ethtool_set_link_ksettings(struct mlx5e_priv *priv,
 		goto out;
 	}
 	link_modes = autoneg == AUTONEG_ENABLE ? ethtool2ptys_adver_func(adver) :
-		mlx5e_port_speed2linkmodes(mdev, speed, !ext);
+		mlx5_port_speed2linkmodes(mdev, speed, !ext);
 	err = mlx5e_speed_validate(priv->netdev, ext, link_modes, autoneg);
 	if (err)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
index 05796f8b1d7c..33bfe4d7338b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
@@ -783,6 +783,7 @@ static int mlx5e_create_promisc_table(struct mlx5e_flow_steering *fs)
 	ft->t = mlx5_create_auto_grouped_flow_table(fs->ns, &ft_attr);
 	if (IS_ERR(ft->t)) {
 		err = PTR_ERR(ft->t);
+		ft->t = NULL;
 		fs_err(fs, "fail to create promisc table err=%d\n", err);
 		return err;
 	}
@@ -810,7 +811,7 @@ static void mlx5e_del_promisc_rule(struct mlx5e_flow_steering *fs)
 static void mlx5e_destroy_promisc_table(struct mlx5e_flow_steering *fs)
 {
-	if (WARN(!fs->promisc.ft.t, "Trying to remove non-existing promiscuous table"))
+	if (!fs->promisc.ft.t)
 		return;
 	mlx5e_del_promisc_rule(fs);
 	mlx5_destroy_flow_table(fs->promisc.ft.t);
@@ -1490,6 +1491,8 @@ err:
 void mlx5e_fs_cleanup(struct mlx5e_flow_steering *fs)
 {
+	if (!fs)
+		return;
 	debugfs_remove_recursive(fs->dfs_root);
 	mlx5e_fs_ethtool_free(fs);
 	mlx5e_fs_tc_free(fs);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 7ca7e9b57607..2944691f06ad 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -262,23 +262,30 @@ static int mlx5e_rq_shampo_hd_info_alloc(struct mlx5e_rq *rq, int node)
 	shampo->bitmap = bitmap_zalloc_node(shampo->hd_per_wq, GFP_KERNEL,
 					    node);
-	if (!shampo->bitmap)
-		return -ENOMEM;
-
 	shampo->info = kvzalloc_node(array_size(shampo->hd_per_wq,
 						sizeof(*shampo->info)),
 				     GFP_KERNEL, node);
-	if (!shampo->info) {
-		kvfree(shampo->bitmap);
-		return -ENOMEM;
-	}
+	shampo->pages = kvzalloc_node(array_size(shampo->hd_per_wq,
+						 sizeof(*shampo->pages)),
+				      GFP_KERNEL, node);
+	if (!shampo->bitmap || !shampo->info || !shampo->pages)
+		goto err_nomem;
+
 	return 0;
+
+err_nomem:
+	kvfree(shampo->info);
+	kvfree(shampo->bitmap);
+	kvfree(shampo->pages);
+
+	return -ENOMEM;
 }
 static void mlx5e_rq_shampo_hd_info_free(struct mlx5e_rq *rq)
 {
 	kvfree(rq->mpwqe.shampo->bitmap);
 	kvfree(rq->mpwqe.shampo->info);
+	kvfree(rq->mpwqe.shampo->pages);
 }
 static int mlx5e_rq_alloc_mpwqe_info(struct mlx5e_rq *rq, int node)
@@ -286,13 +293,23 @@ static int mlx5e_rq_alloc_mpwqe_info(struct mlx5e_rq *rq, int node)
 	int wq_sz = mlx5_wq_ll_get_size(&rq->mpwqe.wq);
 	size_t alloc_size;
-	alloc_size = array_size(wq_sz, struct_size(rq->mpwqe.info, alloc_units,
+	alloc_size = array_size(wq_sz, struct_size(rq->mpwqe.info,
+						   alloc_units.frag_pages,
 						   rq->mpwqe.pages_per_wqe));
 	rq->mpwqe.info = kvzalloc_node(alloc_size, GFP_KERNEL, node);
 	if (!rq->mpwqe.info)
 		return -ENOMEM;
+	/* For deferred page release (release right before alloc), make sure
+	 * that on first round release is not called.
+	 */
+	for (int i = 0; i < wq_sz; i++) {
+		struct mlx5e_mpw_info *wi = mlx5e_get_mpw_info(rq, i);
+
+		bitmap_fill(wi->skip_release_bitmap, rq->mpwqe.pages_per_wqe);
+	}
+
 	mlx5e_build_umr_wqe(rq, rq->icosq, &rq->mpwqe.umr_wqe);
 	return 0;
@@ -499,14 +516,12 @@ static void mlx5e_init_frags_partition(struct mlx5e_rq *rq)
 	struct mlx5e_wqe_frag_info *prev = NULL;
 	int i;
-	if (rq->xsk_pool) {
-		/* Assumptions used by XSK batched allocator. */
-		WARN_ON(rq->wqe.info.num_frags != 1);
-		WARN_ON(rq->wqe.info.log_num_frags != 0);
-		WARN_ON(rq->wqe.info.arr[0].frag_stride != PAGE_SIZE);
-	}
+	WARN_ON(rq->xsk_pool);
+
+	next_frag.frag_page = &rq->wqe.alloc_units->frag_pages[0];
-	next_frag.au = &rq->wqe.alloc_units[0];
+	/* Skip first release due to deferred release. */
+	next_frag.flags = BIT(MLX5E_WQE_FRAG_SKIP_RELEASE);
 	for (i = 0; i < mlx5_wq_cyc_get_size(&rq->wqe.wq); i++) {
 		struct mlx5e_rq_frag_info *frag_info = &rq->wqe.info.arr[0];
@@ -516,10 +531,11 @@ static void mlx5e_init_frags_partition(struct mlx5e_rq *rq)
 		for (f = 0; f < rq->wqe.info.num_frags; f++, frag++) {
 			if (next_frag.offset + frag_info[f].frag_stride > PAGE_SIZE) {
-				next_frag.au++;
+				/* Pages are assigned at runtime. */
+				next_frag.frag_page++;
 				next_frag.offset = 0;
 				if (prev)
-					prev->last_in_page = true;
+					prev->flags |= BIT(MLX5E_WQE_FRAG_LAST_IN_PAGE);
 			}
 			*frag = next_frag;
@@ -530,25 +546,68 @@ static void mlx5e_init_frags_partition(struct mlx5e_rq *rq)
 	}
 	if (prev)
-		prev->last_in_page = true;
+		prev->flags |= BIT(MLX5E_WQE_FRAG_LAST_IN_PAGE);
+}
+
+static void mlx5e_init_xsk_buffs(struct mlx5e_rq *rq)
+{
+	int i;
+
+	/* Assumptions used by XSK batched allocator. */
+	WARN_ON(rq->wqe.info.num_frags != 1);
+	WARN_ON(rq->wqe.info.log_num_frags != 0);
+	WARN_ON(rq->wqe.info.arr[0].frag_stride != PAGE_SIZE);
+
+	/* Considering the above assumptions a fragment maps to a single
+	 * xsk_buff.
+	 */
+	for (i = 0; i < mlx5_wq_cyc_get_size(&rq->wqe.wq); i++) {
+		rq->wqe.frags[i].xskp = &rq->wqe.alloc_units->xsk_buffs[i];
+
+		/* Skip first release due to deferred release as WQES are
+		 * not allocated yet.
+		 */
+		rq->wqe.frags[i].flags |= BIT(MLX5E_WQE_FRAG_SKIP_RELEASE);
+	}
 }
-static int mlx5e_init_au_list(struct mlx5e_rq *rq, int wq_sz, int node)
+static int mlx5e_init_wqe_alloc_info(struct mlx5e_rq *rq, int node)
 {
+	int wq_sz = mlx5_wq_cyc_get_size(&rq->wqe.wq);
 	int len = wq_sz << rq->wqe.info.log_num_frags;
+	struct mlx5e_wqe_frag_info *frags;
+	union mlx5e_alloc_units *aus;
+	int aus_sz;
-	rq->wqe.alloc_units = kvzalloc_node(array_size(len, sizeof(*rq->wqe.alloc_units)),
-					    GFP_KERNEL, node);
-	if (!rq->wqe.alloc_units)
+	if (rq->xsk_pool)
+		aus_sz = sizeof(*aus->xsk_buffs);
+	else
+		aus_sz = sizeof(*aus->frag_pages);
+
+	aus = kvzalloc_node(array_size(len, aus_sz), GFP_KERNEL, node);
+	if (!aus)
 		return -ENOMEM;
-	mlx5e_init_frags_partition(rq);
+	frags = kvzalloc_node(array_size(len, sizeof(*frags)), GFP_KERNEL, node);
+	if (!frags) {
+		kvfree(aus);
+		return -ENOMEM;
+	}
+
+	rq->wqe.alloc_units = aus;
+	rq->wqe.frags = frags;
+
+	if (rq->xsk_pool)
+		mlx5e_init_xsk_buffs(rq);
+	else
+		mlx5e_init_frags_partition(rq);
 	return 0;
 }
-static void mlx5e_free_au_list(struct mlx5e_rq *rq)
+static void mlx5e_free_wqe_alloc_info(struct mlx5e_rq *rq)
 {
+	kvfree(rq->wqe.frags);
 	kvfree(rq->wqe.alloc_units);
 }
@@ -693,7 +752,6 @@ static int mlx5e_alloc_rq(struct mlx5e_params *params,
 			  struct mlx5e_rq_param *rqp,
 			  int node, struct mlx5e_rq *rq)
 {
-	struct page_pool_params pp_params = { 0 };
 	struct mlx5_core_dev *mdev = rq->mdev;
 	void *rqc = rqp->rqc;
 	void *rqc_wq = MLX5_ADDR_OF(rqc, rqc, wq);
@@ -745,6 +803,9 @@ static int mlx5e_alloc_rq(struct mlx5e_params *params,
 		pool_size = rq->mpwqe.pages_per_wqe <<
 			mlx5e_mpwqe_get_log_rq_size(mdev, params, xsk);
+		if (!mlx5e_rx_mpwqe_is_linear_skb(mdev, params, xsk) && params->xdp_prog)
+			pool_size *= 2; /* additional page per packet for the linear part */
+
 		rq->mpwqe.log_stride_sz = mlx5e_mpwqe_get_log_stride_size(mdev, params, xsk);
 		rq->mpwqe.num_strides =
 			BIT(mlx5e_mpwqe_get_log_num_strides(mdev, params, xsk));
@@ -778,18 +839,9 @@ static int mlx5e_alloc_rq(struct mlx5e_params *params,
 		rq->wqe.info = rqp->frags_info;
 		rq->buff.frame0_sz = rq->wqe.info.arr[0].frag_stride;
-		rq->wqe.frags =
-			kvzalloc_node(array_size(sizeof(*rq->wqe.frags),
-					(wq_sz << rq->wqe.info.log_num_frags)),
-				      GFP_KERNEL, node);
-		if (!rq->wqe.frags) {
-			err = -ENOMEM;
-			goto err_rq_wq_destroy;
-		}
-
-		err = mlx5e_init_au_list(rq, wq_sz, node);
+		err = mlx5e_init_wqe_alloc_info(rq, node);
 		if (err)
-			goto err_rq_frags;
+			goto err_rq_wq_destroy;
 	}
 	if (xsk) {
@@ -798,12 +850,16 @@ static int mlx5e_alloc_rq(struct mlx5e_params *params,
 		xsk_pool_set_rxq_info(rq->xsk_pool, &rq->xdp_rxq);
 	} else {
 		/* Create a page_pool and register it with rxq */
+		struct page_pool_params pp_params = { 0 };
+
 		pp_params.order = 0;
-		pp_params.flags = 0; /* No-internal DMA mapping in page_pool */
+		pp_params.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV | PP_FLAG_PAGE_FRAG;
 		pp_params.pool_size = pool_size;
 		pp_params.nid = node;
 		pp_params.dev = rq->pdev;
+		pp_params.napi = rq->cq.napi;
 		pp_params.dma_dir = rq->buff.map_dir;
+		pp_params.max_len = PAGE_SIZE;
 		/* page_pool can be used even when there is no rq->xdp_prog,
 		 * given page_pool does not handle DMA mapping there is no
@@ -869,9 +925,6 @@ static int mlx5e_alloc_rq(struct mlx5e_params *params,
 		rq->dim.mode = DIM_CQ_PERIOD_MODE_START_FROM_EQE;
 	}
-	rq->page_cache.head = 0;
-	rq->page_cache.tail = 0;
-
 	return 0;
 err_destroy_page_pool:
@@ -888,9 +941,7 @@ err_rq_drop_page:
 		mlx5e_free_mpwqe_rq_drop_page(rq);
 		break;
 	default: /* MLX5_WQ_TYPE_CYCLIC */
-		mlx5e_free_au_list(rq);
-err_rq_frags:
-		kvfree(rq->wqe.frags);
+		mlx5e_free_wqe_alloc_info(rq);
 	}
 err_rq_wq_destroy:
 	mlx5_wq_destroy(&rq->wq_ctrl);
@@ -904,7 +955,6 @@ err_rq_xdp_prog:
 static void mlx5e_free_rq(struct mlx5e_rq *rq)
 {
 	struct bpf_prog *old_prog;
-	int i;
 	if (xdp_rxq_info_is_reg(&rq->xdp_rxq)) {
 		old_prog = rcu_dereference_protected(rq->xdp_prog,
@@ -921,17 +971,7 @@ static void mlx5e_free_rq(struct mlx5e_rq *rq)
 		mlx5e_rq_free_shampo(rq);
 		break;
 	default: /* MLX5_WQ_TYPE_CYCLIC */
-		kvfree(rq->wqe.frags);
-		mlx5e_free_au_list(rq);
-	}
-
-	for (i = rq->page_cache.head; i != rq->page_cache.tail;
-	     i = (i + 1) & (MLX5E_CACHE_SIZE - 1)) {
-		/* With AF_XDP, page_cache is not used, so this loop is not
-		 * entered, and it's safe to call mlx5e_page_release_dynamic
-		 * directly.
-		 */
-		mlx5e_page_release_dynamic(rq, rq->page_cache.page_cache[i], false);
+		mlx5e_free_wqe_alloc_info(rq);
 	}
 	xdp_rxq_info_unreg(&rq->xdp_rxq);
@@ -1094,7 +1134,7 @@ int mlx5e_wait_for_min_rx_wqes(struct mlx5e_rq *rq, int wait_time)
 	return -ETIMEDOUT;
 }
-void mlx5e_free_rx_in_progress_descs(struct mlx5e_rq *rq)
+void mlx5e_free_rx_missing_descs(struct mlx5e_rq *rq)
 {
 	struct mlx5_wq_ll *wq;
 	u16 head;
@@ -1106,8 +1146,12 @@ void mlx5e_free_rx_in_progress_descs(struct mlx5e_rq *rq)
 	wq = &rq->mpwqe.wq;
 	head = wq->head;
-	/* Outstanding UMR WQEs (in progress) start at wq->head */
-	for (i = 0; i < rq->mpwqe.umr_in_progress; i++) {
+	/* Release WQEs that are in missing state: they have been
+	 * popped from the list after completion but were not freed
+	 * due to deferred release.
+	 * Also free the linked-list reserved entry, hence the "+ 1".
+	 */
+	for (i = 0; i < mlx5_wq_ll_missing(wq) + 1; i++) {
 		rq->dealloc_wqe(rq, head);
 		head = mlx5_wq_ll_get_wqe_next_ix(wq, head);
 	}
@@ -1134,7 +1178,7 @@ void mlx5e_free_rx_descs(struct mlx5e_rq *rq)
 	if (rq->wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ) {
 		struct mlx5_wq_ll *wq = &rq->mpwqe.wq;
-		mlx5e_free_rx_in_progress_descs(rq);
+		mlx5e_free_rx_missing_descs(rq);
 		while (!mlx5_wq_ll_is_empty(wq)) {
 			struct mlx5e_rx_wqe_ll *wqe;
@@ -1152,12 +1196,21 @@ void mlx5e_free_rx_descs(struct mlx5e_rq *rq)
 					       0, true);
 	} else {
 		struct mlx5_wq_cyc *wq = &rq->wqe.wq;
+		u16 missing = mlx5_wq_cyc_missing(wq);
+		u16 head = mlx5_wq_cyc_get_head(wq);
 		while (!mlx5_wq_cyc_is_empty(wq)) {
 			wqe_ix = mlx5_wq_cyc_get_tail(wq);
 			rq->dealloc_wqe(rq, wqe_ix);
 			mlx5_wq_cyc_pop(wq);
 		}
+		/* Missing slots might also contain unreleased pages due to
+		 * deferred release.
+		 */
+		while (missing--) {
+			wqe_ix = mlx5_wq_cyc_ctr2ix(wq, head++);
+			rq->dealloc_wqe(rq, wqe_ix);
+		}
 	}
 }
@@ -1188,7 +1241,7 @@ int mlx5e_open_rq(struct mlx5e_params *params, struct mlx5e_rq_param *param,
 		__set_bit(MLX5E_RQ_STATE_CSUM_FULL, &rq->state);
 	if (params->rx_dim_enabled)
-		__set_bit(MLX5E_RQ_STATE_AM, &rq->state);
+		__set_bit(MLX5E_RQ_STATE_DIM, &rq->state);
 	/* We disable csum_complete when XDP is enabled since
 	 * XDP programs might manipulate packets which will render
@@ -1251,17 +1304,19 @@ static int mlx5e_alloc_xdpsq_fifo(struct mlx5e_xdpsq *sq, int numa)
 {
 	struct mlx5e_xdp_info_fifo *xdpi_fifo = &sq->db.xdpi_fifo;
 	int wq_sz = mlx5_wq_cyc_get_size(&sq->wq);
-	int dsegs_per_wq = wq_sz * MLX5_SEND_WQEBB_NUM_DS;
+	int entries = wq_sz * MLX5_SEND_WQEBB_NUM_DS * 2; /* upper bound for maximum num of
+							    * entries of all xmit_modes.
+							    */
 	size_t size;
-	size = array_size(sizeof(*xdpi_fifo->xi), dsegs_per_wq);
+	size = array_size(sizeof(*xdpi_fifo->xi), entries);
 	xdpi_fifo->xi = kvzalloc_node(size, GFP_KERNEL, numa);
 	if (!xdpi_fifo->xi)
 		return -ENOMEM;
 	xdpi_fifo->pc = &sq->xdpi_fifo_pc;
 	xdpi_fifo->cc = &sq->xdpi_fifo_cc;
-	xdpi_fifo->mask = dsegs_per_wq - 1;
+	xdpi_fifo->mask = entries - 1;
 	return 0;
 }
@@ -1664,7 +1719,7 @@ int mlx5e_open_txqsq(struct mlx5e_channel *c, u32 tisn, int txq_ix,
 		mlx5e_set_sq_maxrate(c->netdev, sq, tx_rate);
 	if (params->tx_dim_enabled)
-		sq->state |= BIT(MLX5E_SQ_STATE_AM);
+		sq->state |= BIT(MLX5E_SQ_STATE_DIM);
 	return 0;
@@ -1811,11 +1866,7 @@ int mlx5e_open_xdpsq(struct mlx5e_channel *c, struct mlx5e_params *params,
 	csp.min_inline_mode = sq->min_inline_mode;
 	set_bit(MLX5E_SQ_STATE_ENABLED, &sq->state);
-	/* Don't enable multi buffer on XDP_REDIRECT SQ, as it's not yet
-	 * supported by upstream, and there is no defined trigger to allow
-	 * transmitting redirected multi-buffer frames.
-	 */
-	if (param->is_xdp_mb && !is_redirect)
+	if (param->is_xdp_mb)
 		set_bit(MLX5E_SQ_STATE_XDP_MULTIBUF, &sq->state);
 	err = mlx5e_create_sq_rdy(c->mdev, param, &csp, 0, &sq->sqn);
@@ -1839,7 +1890,6 @@ int mlx5e_open_xdpsq(struct mlx5e_channel *c, struct mlx5e_params *params,
 			struct mlx5e_tx_wqe *wqe = mlx5_wq_cyc_get_wqe(&sq->wq, i);
 			struct mlx5_wqe_ctrl_seg *cseg = &wqe->ctrl;
 			struct mlx5_wqe_eth_seg *eseg = &wqe->eth;
-			struct mlx5_wqe_data_seg *dseg;
 			sq->db.wqe_info[i] = (struct mlx5e_xdp_wqe_info) {
 				.num_wqebbs = 1,
@@ -1848,9 +1898,6 @@ int mlx5e_open_xdpsq(struct mlx5e_channel *c, struct mlx5e_params *params,
 			cseg->qpn_ds = cpu_to_be32((sq->sqn << 8) | ds_cnt);
 			eseg->inline_hdr.sz = cpu_to_be16(inline_hdr_sz);
-
-			dseg = (struct mlx5_wqe_data_seg *)cseg + (ds_cnt - 1);
-			dseg->lkey = sq->mkey_be;
 		}
 	}
@@ -4017,9 +4064,9 @@ void mlx5e_set_xdp_feature(struct net_device *netdev)
 	val = NETDEV_XDP_ACT_BASIC | NETDEV_XDP_ACT_REDIRECT |
 	      NETDEV_XDP_ACT_XSK_ZEROCOPY |
-	      NETDEV_XDP_ACT_NDO_XMIT;
-	if (params->rq_wq_type == MLX5_WQ_TYPE_CYCLIC)
-		val |= NETDEV_XDP_ACT_RX_SG;
+	      NETDEV_XDP_ACT_RX_SG |
+	      NETDEV_XDP_ACT_NDO_XMIT |
+	      NETDEV_XDP_ACT_NDO_XMIT_SG;
 	xdp_set_features_flag(netdev, val);
 }
@@ -4213,19 +4260,24 @@ static bool mlx5e_params_validate_xdp(struct net_device *netdev,
 	/* No XSK params: AF_XDP can't be enabled yet at the point of setting
 	 * the XDP program.
 	 */
-	is_linear = mlx5e_rx_is_linear_skb(mdev, params, NULL);
-
-	if (!is_linear && params->rq_wq_type != MLX5_WQ_TYPE_CYCLIC) {
-		netdev_warn(netdev, "XDP is not allowed with striding RQ and MTU(%d) > %d\n",
-			    params->sw_mtu,
-			    mlx5e_xdp_max_mtu(params, NULL));
-		return false;
-	}
-	if (!is_linear && !params->xdp_prog->aux->xdp_has_frags) {
-		netdev_warn(netdev, "MTU(%d) > %d, too big for an XDP program not aware of multi buffer\n",
-			    params->sw_mtu,
-			    mlx5e_xdp_max_mtu(params, NULL));
-		return false;
+	is_linear = params->rq_wq_type == MLX5_WQ_TYPE_CYCLIC ?
+		mlx5e_rx_is_linear_skb(mdev, params, NULL) :
+		mlx5e_rx_mpwqe_is_linear_skb(mdev, params, NULL);
+
+	if (!is_linear) {
+		if (!params->xdp_prog->aux->xdp_has_frags) {
+			netdev_warn(netdev, "MTU(%d) > %d, too big for an XDP program not aware of multi buffer\n",
+				    params->sw_mtu,
+				    mlx5e_xdp_max_mtu(params, NULL));
+			return false;
+		}
+		if (params->rq_wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ &&
+		    !mlx5e_verify_params_rx_mpwqe_strides(mdev, params, NULL)) {
+			netdev_warn(netdev, "XDP is not allowed with striding RQ and MTU(%d) > %d\n",
+				    params->sw_mtu,
+				    mlx5e_xdp_max_mtu(params, NULL));
+			return false;
+		}
 	}
 	return true;
@@ -4717,20 +4769,15 @@ static void mlx5e_tx_timeout(struct net_device *dev, unsigned int txqueue)
 	queue_work(priv->wq, &priv->tx_timeout_work);
 }
-static int mlx5e_xdp_allowed(struct mlx5e_priv *priv, struct bpf_prog *prog)
+static int mlx5e_xdp_allowed(struct net_device *netdev, struct mlx5_core_dev *mdev,
+			     struct mlx5e_params *params)
 {
-	struct net_device *netdev = priv->netdev;
-	struct mlx5e_params new_params;
-
-	if (priv->channels.params.packet_merge.type != MLX5E_PACKET_MERGE_NONE) {
+	if (params->packet_merge.type != MLX5E_PACKET_MERGE_NONE) {
 		netdev_warn(netdev, "can't set XDP while HW-GRO/LRO is on, disable them first\n");
 		return -EINVAL;
 	}
-	new_params = priv->channels.params;
-	new_params.xdp_prog = prog;
-
-	if (!mlx5e_params_validate_xdp(netdev, priv->mdev, &new_params))
+	if (!mlx5e_params_validate_xdp(netdev, mdev, params))
 		return -EINVAL;
 	return 0;
@@ -4757,8 +4804,11 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog)
 	mutex_lock(&priv->state_lock);
+	new_params = priv->channels.params;
+	new_params.xdp_prog = prog;
+
 	if (prog) {
-		err = mlx5e_xdp_allowed(priv, prog);
+		err = mlx5e_xdp_allowed(netdev, priv->mdev, &new_params);
 		if (err)
 			goto unlock;
 	}
@@ -4766,22 +4816,6 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog)
 	/* no need for full reset when exchanging programs */
 	reset = (!priv->channels.params.xdp_prog || !prog);
-	new_params = priv->channels.params;
-	new_params.xdp_prog = prog;
-
-	/* XDP affects striding RQ parameters. Block XDP if striding RQ won't be
-	 * supported with the new parameters: if PAGE_SIZE is bigger than
-	 * MLX5_MPWQE_LOG_STRIDE_SZ_MAX, striding RQ can't be used, even though
-	 * the MTU is small enough for the linear mode, because XDP uses strides
-	 * of PAGE_SIZE on regular RQs.
-	 */
-	if (reset && MLX5E_GET_PFLAG(&new_params, MLX5E_PFLAG_RX_STRIDING_RQ)) {
-		/* Checking for regular RQs here; XSK RQs were checked on XSK bind.
*/ - err = mlx5e_mpwrq_validate_regular(priv->mdev, &new_params); - if (err) - goto unlock; - } - old_prog = priv->channels.params.xdp_prog; err = mlx5e_safe_switch_params(priv, &new_params, NULL, NULL, reset); @@ -5076,6 +5110,7 @@ static void mlx5e_build_nic_netdev(struct net_device *netdev) netdev->vlan_features |= NETIF_F_SG; netdev->vlan_features |= NETIF_F_HW_CSUM; + netdev->vlan_features |= NETIF_F_HW_MACSEC; netdev->vlan_features |= NETIF_F_GRO; netdev->vlan_features |= NETIF_F_TSO; netdev->vlan_features |= NETIF_F_TSO6; @@ -5270,6 +5305,7 @@ static void mlx5e_nic_cleanup(struct mlx5e_priv *priv) mlx5e_health_destroy_reporters(priv); mlx5e_ktls_cleanup(priv); mlx5e_fs_cleanup(priv->fs); + priv->fs = NULL; } static int mlx5e_init_nic_rx(struct mlx5e_priv *priv) @@ -5725,8 +5761,8 @@ int mlx5e_attach_netdev(struct mlx5e_priv *priv) /* Validate the max_wqe_size_sq capability. */ if (WARN_ON_ONCE(mlx5e_get_max_sq_wqebbs(priv->mdev) < MLX5E_MAX_TX_WQEBBS)) { - mlx5_core_warn(priv->mdev, "MLX5E: Max SQ WQEBBs firmware capability: %u, needed %lu\n", - mlx5e_get_max_sq_wqebbs(priv->mdev), MLX5E_MAX_TX_WQEBBS); + mlx5_core_warn(priv->mdev, "MLX5E: Max SQ WQEBBs firmware capability: %u, needed %u\n", + mlx5e_get_max_sq_wqebbs(priv->mdev), (unsigned int)MLX5E_MAX_TX_WQEBBS); return -EIO; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c index 8ff654b4e9e1..1fc386eccaf8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c @@ -53,6 +53,7 @@ #include "lib/vxlan.h" #define CREATE_TRACE_POINTS #include "diag/en_rep_tracepoint.h" +#include "diag/reporter_vnic.h" #include "en_accel/ipsec.h" #include "en/tc/int_port.h" #include "en/ptp.h" @@ -828,6 +829,7 @@ static int mlx5e_init_ul_rep(struct mlx5_core_dev *mdev, static void mlx5e_cleanup_rep(struct mlx5e_priv *priv) { mlx5e_fs_cleanup(priv->fs); + priv->fs = NULL; } static int 
mlx5e_create_rep_ttc_table(struct mlx5e_priv *priv) @@ -994,6 +996,7 @@ err_close_drop_rq: priv->rx_res = NULL; err_free_fs: mlx5e_fs_cleanup(priv->fs); + priv->fs = NULL; return err; } @@ -1294,6 +1297,50 @@ static unsigned int mlx5e_ul_rep_stats_grps_num(struct mlx5e_priv *priv) return ARRAY_SIZE(mlx5e_ul_rep_stats_grps); } +static int +mlx5e_rep_vnic_reporter_diagnose(struct devlink_health_reporter *reporter, + struct devlink_fmsg *fmsg, + struct netlink_ext_ack *extack) +{ + struct mlx5e_rep_priv *rpriv = devlink_health_reporter_priv(reporter); + struct mlx5_eswitch_rep *rep = rpriv->rep; + + return mlx5_reporter_vnic_diagnose_counters(rep->esw->dev, fmsg, + rep->vport, true); +} + +static const struct devlink_health_reporter_ops mlx5_rep_vnic_reporter_ops = { + .name = "vnic", + .diagnose = mlx5e_rep_vnic_reporter_diagnose, +}; + +static void mlx5e_rep_vnic_reporter_create(struct mlx5e_priv *priv, + struct devlink_port *dl_port) +{ + struct mlx5e_rep_priv *rpriv = priv->ppriv; + struct devlink_health_reporter *reporter; + + reporter = devl_port_health_reporter_create(dl_port, + &mlx5_rep_vnic_reporter_ops, + 0, rpriv); + if (IS_ERR(reporter)) { + mlx5_core_err(priv->mdev, + "Failed to create representor vnic reporter, err = %ld\n", + PTR_ERR(reporter)); + return; + } + + rpriv->rep_vnic_reporter = reporter; +} + +static void mlx5e_rep_vnic_reporter_destroy(struct mlx5e_priv *priv) +{ + struct mlx5e_rep_priv *rpriv = priv->ppriv; + + if (!IS_ERR_OR_NULL(rpriv->rep_vnic_reporter)) + devl_health_reporter_destroy(rpriv->rep_vnic_reporter); +} + static const struct mlx5e_profile mlx5e_rep_profile = { .init = mlx5e_init_rep, .cleanup = mlx5e_cleanup_rep, @@ -1394,8 +1441,10 @@ mlx5e_vport_vf_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep) dl_port = mlx5_esw_offloads_devlink_port(dev->priv.eswitch, rpriv->rep->vport); - if (dl_port) + if (dl_port) { SET_NETDEV_DEVLINK_PORT(netdev, dl_port); + mlx5e_rep_vnic_reporter_create(priv, dl_port); + } err = 
register_netdev(netdev); if (err) { @@ -1408,8 +1457,8 @@ mlx5e_vport_vf_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep) return 0; err_detach_netdev: + mlx5e_rep_vnic_reporter_destroy(priv); mlx5e_detach_netdev(netdev_priv(netdev)); - err_cleanup_profile: priv->profile->cleanup(priv); @@ -1458,6 +1507,7 @@ mlx5e_vport_rep_unload(struct mlx5_eswitch_rep *rep) } unregister_netdev(netdev); + mlx5e_rep_vnic_reporter_destroy(priv); mlx5e_detach_netdev(priv); priv->profile->cleanup(priv); mlx5e_destroy_netdev(priv); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h index dcfad0bf0f45..80b7f5079a5a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h @@ -118,6 +118,7 @@ struct mlx5e_rep_priv { struct rtnl_link_stats64 prev_vf_vport_stats; struct mlx5_flow_handle *send_to_vport_meta_rule; struct rhashtable tc_ht; + struct devlink_health_reporter *rep_vnic_reporter; }; static inline diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c index 3f7b63d6616b..69634829558e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -271,98 +271,35 @@ static inline u32 mlx5e_decompress_cqes_start(struct mlx5e_rq *rq, return mlx5e_decompress_cqes_cont(rq, wq, 1, budget_rem); } -static inline bool mlx5e_rx_cache_put(struct mlx5e_rq *rq, struct page *page) -{ - struct mlx5e_page_cache *cache = &rq->page_cache; - u32 tail_next = (cache->tail + 1) & (MLX5E_CACHE_SIZE - 1); - struct mlx5e_rq_stats *stats = rq->stats; - - if (tail_next == cache->head) { - stats->cache_full++; - return false; - } - - if (!dev_page_is_reusable(page)) { - stats->cache_waive++; - return false; - } - - cache->page_cache[cache->tail] = page; - cache->tail = tail_next; - return true; -} - -static inline bool mlx5e_rx_cache_get(struct mlx5e_rq *rq, union 
mlx5e_alloc_unit *au) -{ - struct mlx5e_page_cache *cache = &rq->page_cache; - struct mlx5e_rq_stats *stats = rq->stats; - dma_addr_t addr; - - if (unlikely(cache->head == cache->tail)) { - stats->cache_empty++; - return false; - } - - if (page_ref_count(cache->page_cache[cache->head]) != 1) { - stats->cache_busy++; - return false; - } - - au->page = cache->page_cache[cache->head]; - cache->head = (cache->head + 1) & (MLX5E_CACHE_SIZE - 1); - stats->cache_reuse++; +#define MLX5E_PAGECNT_BIAS_MAX (PAGE_SIZE / 64) - addr = page_pool_get_dma_addr(au->page); - /* Non-XSK always uses PAGE_SIZE. */ - dma_sync_single_for_device(rq->pdev, addr, PAGE_SIZE, rq->buff.map_dir); - return true; -} - -static inline int mlx5e_page_alloc_pool(struct mlx5e_rq *rq, union mlx5e_alloc_unit *au) +static int mlx5e_page_alloc_fragmented(struct mlx5e_rq *rq, + struct mlx5e_frag_page *frag_page) { - dma_addr_t addr; + struct page *page; - if (mlx5e_rx_cache_get(rq, au)) - return 0; - - au->page = page_pool_dev_alloc_pages(rq->page_pool); - if (unlikely(!au->page)) + page = page_pool_dev_alloc_pages(rq->page_pool); + if (unlikely(!page)) return -ENOMEM; - /* Non-XSK always uses PAGE_SIZE. 
*/ - addr = dma_map_page(rq->pdev, au->page, 0, PAGE_SIZE, rq->buff.map_dir); - if (unlikely(dma_mapping_error(rq->pdev, addr))) { - page_pool_recycle_direct(rq->page_pool, au->page); - au->page = NULL; - return -ENOMEM; - } - page_pool_set_dma_addr(au->page, addr); + page_pool_fragment_page(page, MLX5E_PAGECNT_BIAS_MAX); + + *frag_page = (struct mlx5e_frag_page) { + .page = page, + .frags = 0, + }; return 0; } -void mlx5e_page_dma_unmap(struct mlx5e_rq *rq, struct page *page) +static void mlx5e_page_release_fragmented(struct mlx5e_rq *rq, + struct mlx5e_frag_page *frag_page) { - dma_addr_t dma_addr = page_pool_get_dma_addr(page); + u16 drain_count = MLX5E_PAGECNT_BIAS_MAX - frag_page->frags; + struct page *page = frag_page->page; - dma_unmap_page_attrs(rq->pdev, dma_addr, PAGE_SIZE, rq->buff.map_dir, - DMA_ATTR_SKIP_CPU_SYNC); - page_pool_set_dma_addr(page, 0); -} - -void mlx5e_page_release_dynamic(struct mlx5e_rq *rq, struct page *page, bool recycle) -{ - if (likely(recycle)) { - if (mlx5e_rx_cache_put(rq, page)) - return; - - mlx5e_page_dma_unmap(rq, page); - page_pool_recycle_direct(rq->page_pool, page); - } else { - mlx5e_page_dma_unmap(rq, page); - page_pool_release_page(rq->page_pool, page); - put_page(page); - } + if (page_pool_defrag_page(page, drain_count) == 0) + page_pool_put_defragged_page(rq->page_pool, page, -1, true); } static inline int mlx5e_get_rx_frag(struct mlx5e_rq *rq, @@ -371,22 +308,31 @@ static inline int mlx5e_get_rx_frag(struct mlx5e_rq *rq, int err = 0; if (!frag->offset) - /* On first frag (offset == 0), replenish page (alloc_unit actually). - * Other frags that point to the same alloc_unit (with a different + /* On first frag (offset == 0), replenish page. + * Other frags that point to the same page (with a different * offset) should just use the new one without replenishing again * by themselves. 
*/ - err = mlx5e_page_alloc_pool(rq, frag->au); + err = mlx5e_page_alloc_fragmented(rq, frag->frag_page); return err; } +static bool mlx5e_frag_can_release(struct mlx5e_wqe_frag_info *frag) +{ +#define CAN_RELEASE_MASK \ + (BIT(MLX5E_WQE_FRAG_LAST_IN_PAGE) | BIT(MLX5E_WQE_FRAG_SKIP_RELEASE)) + +#define CAN_RELEASE_VALUE BIT(MLX5E_WQE_FRAG_LAST_IN_PAGE) + + return (frag->flags & CAN_RELEASE_MASK) == CAN_RELEASE_VALUE; +} + static inline void mlx5e_put_rx_frag(struct mlx5e_rq *rq, - struct mlx5e_wqe_frag_info *frag, - bool recycle) + struct mlx5e_wqe_frag_info *frag) { - if (frag->last_in_page) - mlx5e_page_release_dynamic(rq, frag->au->page, recycle); + if (mlx5e_frag_can_release(frag)) + mlx5e_page_release_fragmented(rq, frag->frag_page); } static inline struct mlx5e_wqe_frag_info *get_frag(struct mlx5e_rq *rq, u16 ix) @@ -409,8 +355,10 @@ static int mlx5e_alloc_rx_wqe(struct mlx5e_rq *rq, struct mlx5e_rx_wqe_cyc *wqe, if (unlikely(err)) goto free_frags; + frag->flags &= ~BIT(MLX5E_WQE_FRAG_SKIP_RELEASE); + headroom = i == 0 ? rq->buff.headroom : 0; - addr = page_pool_get_dma_addr(frag->au->page); + addr = page_pool_get_dma_addr(frag->frag_page->page); wqe->data[i].addr = cpu_to_be64(addr + frag->offset + headroom); } @@ -418,35 +366,66 @@ static int mlx5e_alloc_rx_wqe(struct mlx5e_rq *rq, struct mlx5e_rx_wqe_cyc *wqe, free_frags: while (--i >= 0) - mlx5e_put_rx_frag(rq, --frag, true); + mlx5e_put_rx_frag(rq, --frag); return err; } static inline void mlx5e_free_rx_wqe(struct mlx5e_rq *rq, - struct mlx5e_wqe_frag_info *wi, - bool recycle) + struct mlx5e_wqe_frag_info *wi) { int i; - if (rq->xsk_pool) { - /* The `recycle` parameter is ignored, and the page is always - * put into the Reuse Ring, because there is no way to return - * the page to the userspace when the interface goes down. 
- */ - xsk_buff_free(wi->au->xsk); - return; - } - for (i = 0; i < rq->wqe.info.num_frags; i++, wi++) - mlx5e_put_rx_frag(rq, wi, recycle); + mlx5e_put_rx_frag(rq, wi); +} + +static void mlx5e_xsk_free_rx_wqe(struct mlx5e_wqe_frag_info *wi) +{ + if (!(wi->flags & BIT(MLX5E_WQE_FRAG_SKIP_RELEASE))) + xsk_buff_free(*wi->xskp); } static void mlx5e_dealloc_rx_wqe(struct mlx5e_rq *rq, u16 ix) { struct mlx5e_wqe_frag_info *wi = get_frag(rq, ix); - mlx5e_free_rx_wqe(rq, wi, false); + if (rq->xsk_pool) + mlx5e_xsk_free_rx_wqe(wi); + else + mlx5e_free_rx_wqe(rq, wi); +} + +static void mlx5e_xsk_free_rx_wqes(struct mlx5e_rq *rq, u16 ix, int wqe_bulk) +{ + struct mlx5_wq_cyc *wq = &rq->wqe.wq; + int i; + + for (i = 0; i < wqe_bulk; i++) { + int j = mlx5_wq_cyc_ctr2ix(wq, ix + i); + struct mlx5e_wqe_frag_info *wi; + + wi = get_frag(rq, j); + /* The page is always put into the Reuse Ring, because there + * is no way to return the page to the userspace when the + * interface goes down. + */ + mlx5e_xsk_free_rx_wqe(wi); + } +} + +static void mlx5e_free_rx_wqes(struct mlx5e_rq *rq, u16 ix, int wqe_bulk) +{ + struct mlx5_wq_cyc *wq = &rq->wqe.wq; + int i; + + for (i = 0; i < wqe_bulk; i++) { + int j = mlx5_wq_cyc_ctr2ix(wq, ix + i); + struct mlx5e_wqe_frag_info *wi; + + wi = get_frag(rq, j); + mlx5e_free_rx_wqe(rq, wi); + } } static int mlx5e_alloc_rx_wqes(struct mlx5e_rq *rq, u16 ix, int wqe_bulk) @@ -467,18 +446,71 @@ static int mlx5e_alloc_rx_wqes(struct mlx5e_rq *rq, u16 ix, int wqe_bulk) return i; } +static int mlx5e_refill_rx_wqes(struct mlx5e_rq *rq, u16 ix, int wqe_bulk) +{ + int remaining = wqe_bulk; + int i = 0; + + /* The WQE bulk is split into smaller bulks that are sized + * according to the page pool cache refill size to avoid overflowing + * the page pool cache due to too many page releases at once. 
+ */ + do { + int refill = min_t(u16, rq->wqe.info.refill_unit, remaining); + int alloc_count; + + mlx5e_free_rx_wqes(rq, ix + i, refill); + alloc_count = mlx5e_alloc_rx_wqes(rq, ix + i, refill); + i += alloc_count; + if (unlikely(alloc_count != refill)) + break; + + remaining -= refill; + } while (remaining); + + return i; +} + +static void +mlx5e_add_skb_shared_info_frag(struct mlx5e_rq *rq, struct skb_shared_info *sinfo, + struct xdp_buff *xdp, struct mlx5e_frag_page *frag_page, + u32 frag_offset, u32 len) +{ + skb_frag_t *frag; + + dma_addr_t addr = page_pool_get_dma_addr(frag_page->page); + + dma_sync_single_for_cpu(rq->pdev, addr + frag_offset, len, rq->buff.map_dir); + if (!xdp_buff_has_frags(xdp)) { + /* Init on the first fragment to avoid cold cache access + * when possible. + */ + sinfo->nr_frags = 0; + sinfo->xdp_frags_size = 0; + xdp_buff_set_frags_flag(xdp); + } + + frag = &sinfo->frags[sinfo->nr_frags++]; + __skb_frag_set_page(frag, frag_page->page); + skb_frag_off_set(frag, frag_offset); + skb_frag_size_set(frag, len); + + if (page_is_pfmemalloc(frag_page->page)) + xdp_buff_set_frag_pfmemalloc(xdp); + sinfo->xdp_frags_size += len; +} + static inline void mlx5e_add_skb_frag(struct mlx5e_rq *rq, struct sk_buff *skb, - union mlx5e_alloc_unit *au, u32 frag_offset, u32 len, + struct page *page, u32 frag_offset, u32 len, unsigned int truesize) { - dma_addr_t addr = page_pool_get_dma_addr(au->page); + dma_addr_t addr = page_pool_get_dma_addr(page); dma_sync_single_for_cpu(rq->pdev, addr + frag_offset, len, rq->buff.map_dir); - page_ref_inc(au->page); skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, - au->page, frag_offset, len, truesize); + page, frag_offset, len, truesize); } static inline void @@ -496,30 +528,36 @@ mlx5e_copy_skb_header(struct mlx5e_rq *rq, struct sk_buff *skb, } static void -mlx5e_free_rx_mpwqe(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, bool recycle) +mlx5e_free_rx_mpwqe(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi) { - union 
mlx5e_alloc_unit *alloc_units = wi->alloc_units; bool no_xdp_xmit; int i; /* A common case for AF_XDP. */ - if (bitmap_full(wi->xdp_xmit_bitmap, rq->mpwqe.pages_per_wqe)) + if (bitmap_full(wi->skip_release_bitmap, rq->mpwqe.pages_per_wqe)) return; - no_xdp_xmit = bitmap_empty(wi->xdp_xmit_bitmap, rq->mpwqe.pages_per_wqe); + no_xdp_xmit = bitmap_empty(wi->skip_release_bitmap, rq->mpwqe.pages_per_wqe); if (rq->xsk_pool) { - /* The `recycle` parameter is ignored, and the page is always - * put into the Reuse Ring, because there is no way to return - * the page to the userspace when the interface goes down. + struct xdp_buff **xsk_buffs = wi->alloc_units.xsk_buffs; + + /* The page is always put into the Reuse Ring, because there + * is no way to return the page to userspace when the interface + * goes down. */ for (i = 0; i < rq->mpwqe.pages_per_wqe; i++) - if (no_xdp_xmit || !test_bit(i, wi->xdp_xmit_bitmap)) - xsk_buff_free(alloc_units[i].xsk); + if (no_xdp_xmit || !test_bit(i, wi->skip_release_bitmap)) + xsk_buff_free(xsk_buffs[i]); } else { - for (i = 0; i < rq->mpwqe.pages_per_wqe; i++) - if (no_xdp_xmit || !test_bit(i, wi->xdp_xmit_bitmap)) - mlx5e_page_release_dynamic(rq, alloc_units[i].page, recycle); + for (i = 0; i < rq->mpwqe.pages_per_wqe; i++) { + if (no_xdp_xmit || !test_bit(i, wi->skip_release_bitmap)) { + struct mlx5e_frag_page *frag_page; + + frag_page = &wi->alloc_units.frag_pages[i]; + mlx5e_page_release_fragmented(rq, frag_page); + } + } } } @@ -583,7 +621,8 @@ static int mlx5e_build_shampo_hd_umr(struct mlx5e_rq *rq, struct mlx5e_shampo_hd *shampo = rq->mpwqe.shampo; u16 entries, pi, header_offset, err, wqe_bbs, new_entries; u32 lkey = rq->mdev->mlx5e_res.hw_objs.mkey; - struct page *page = shampo->last_page; + u16 page_index = shampo->curr_page_index; + struct mlx5e_frag_page *frag_page; u64 addr = shampo->last_addr; struct mlx5e_dma_info *dma_info; struct mlx5e_umr_wqe *umr_wqe; @@ -597,6 +636,8 @@ static int mlx5e_build_shampo_hd_umr(struct 
mlx5e_rq *rq, umr_wqe = mlx5_wq_cyc_get_wqe(&sq->wq, pi); build_klm_umr(sq, umr_wqe, shampo->key, index, entries, wqe_bbs); + frag_page = &shampo->pages[page_index]; + for (i = 0; i < entries; i++, index++) { dma_info = &shampo->info[index]; if (i >= klm_entries || (index < shampo->pi && shampo->pi - index < @@ -605,16 +646,20 @@ static int mlx5e_build_shampo_hd_umr(struct mlx5e_rq *rq, header_offset = (index & (MLX5E_SHAMPO_WQ_HEADER_PER_PAGE - 1)) << MLX5E_SHAMPO_LOG_MAX_HEADER_ENTRY_SIZE; if (!(header_offset & (PAGE_SIZE - 1))) { - union mlx5e_alloc_unit au; + page_index = (page_index + 1) & (shampo->hd_per_wq - 1); + frag_page = &shampo->pages[page_index]; - err = mlx5e_page_alloc_pool(rq, &au); + err = mlx5e_page_alloc_fragmented(rq, frag_page); if (unlikely(err)) goto err_unmap; - page = dma_info->page = au.page; - addr = dma_info->addr = page_pool_get_dma_addr(au.page); + + addr = page_pool_get_dma_addr(frag_page->page); + + dma_info->addr = addr; + dma_info->frag_page = frag_page; } else { dma_info->addr = addr + header_offset; - dma_info->page = page; + dma_info->frag_page = frag_page; } update_klm: @@ -632,7 +677,7 @@ update_klm: }; shampo->pi = (shampo->pi + new_entries) & (shampo->hd_per_wq - 1); - shampo->last_page = page; + shampo->curr_page_index = page_index; shampo->last_addr = addr; sq->pc += wqe_bbs; sq->doorbell_cseg = &umr_wqe->ctrl; @@ -644,7 +689,7 @@ err_unmap: dma_info = &shampo->info[--index]; if (!(i & (MLX5E_SHAMPO_WQ_HEADER_PER_PAGE - 1))) { dma_info->addr = ALIGN_DOWN(dma_info->addr, PAGE_SIZE); - mlx5e_page_release_dynamic(rq, dma_info->page, true); + mlx5e_page_release_fragmented(rq, dma_info->frag_page); } } rq->stats->buff_alloc_err++; @@ -693,8 +738,8 @@ static int mlx5e_alloc_rx_hd_mpwqe(struct mlx5e_rq *rq) static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) { struct mlx5e_mpw_info *wi = mlx5e_get_mpw_info(rq, ix); - union mlx5e_alloc_unit *au = &wi->alloc_units[0]; struct mlx5e_icosq *sq = rq->icosq; + struct 
mlx5e_frag_page *frag_page; struct mlx5_wq_cyc *wq = &sq->wq; struct mlx5e_umr_wqe *umr_wqe; u32 offset; /* 17-bit value with MTT. */ @@ -712,13 +757,15 @@ static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) umr_wqe = mlx5_wq_cyc_get_wqe(wq, pi); memcpy(umr_wqe, &rq->mpwqe.umr_wqe, sizeof(struct mlx5e_umr_wqe)); - for (i = 0; i < rq->mpwqe.pages_per_wqe; i++, au++) { + frag_page = &wi->alloc_units.frag_pages[0]; + + for (i = 0; i < rq->mpwqe.pages_per_wqe; i++, frag_page++) { dma_addr_t addr; - err = mlx5e_page_alloc_pool(rq, au); + err = mlx5e_page_alloc_fragmented(rq, frag_page); if (unlikely(err)) goto err_unmap; - addr = page_pool_get_dma_addr(au->page); + addr = page_pool_get_dma_addr(frag_page->page); umr_wqe->inline_mtts[i] = (struct mlx5_mtt) { .ptag = cpu_to_be64(addr | MLX5_EN_WR), }; @@ -735,7 +782,7 @@ static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) sizeof(*umr_wqe->inline_mtts) * pad); } - bitmap_zero(wi->xdp_xmit_bitmap, rq->mpwqe.pages_per_wqe); + bitmap_zero(wi->skip_release_bitmap, rq->mpwqe.pages_per_wqe); wi->consumed_strides = 0; umr_wqe->ctrl.opmod_idx_opcode = @@ -759,8 +806,8 @@ static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) err_unmap: while (--i >= 0) { - au--; - mlx5e_page_release_dynamic(rq, au->page, true); + frag_page--; + mlx5e_page_release_fragmented(rq, frag_page); } err: @@ -778,8 +825,8 @@ err: void mlx5e_shampo_dealloc_hd(struct mlx5e_rq *rq, u16 len, u16 start, bool close) { struct mlx5e_shampo_hd *shampo = rq->mpwqe.shampo; + struct mlx5e_frag_page *deleted_page = NULL; int hd_per_wq = shampo->hd_per_wq; - struct page *deleted_page = NULL; struct mlx5e_dma_info *hd_info; int i, index = start; @@ -792,10 +839,12 @@ void mlx5e_shampo_dealloc_hd(struct mlx5e_rq *rq, u16 len, u16 start, bool close hd_info = &shampo->info[index]; hd_info->addr = ALIGN_DOWN(hd_info->addr, PAGE_SIZE); - if (hd_info->page != deleted_page) { - deleted_page = hd_info->page; - mlx5e_page_release_dynamic(rq, hd_info->page, 
false); + if (hd_info->frag_page && hd_info->frag_page != deleted_page) { + deleted_page = hd_info->frag_page; + mlx5e_page_release_fragmented(rq, hd_info->frag_page); } + + hd_info->frag_page = NULL; } if (start + len > hd_per_wq) { @@ -810,8 +859,13 @@ void mlx5e_shampo_dealloc_hd(struct mlx5e_rq *rq, u16 len, u16 start, bool close static void mlx5e_dealloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix) { struct mlx5e_mpw_info *wi = mlx5e_get_mpw_info(rq, ix); - /* Don't recycle, this function is called on rq/netdev close */ - mlx5e_free_rx_mpwqe(rq, wi, false); + /* This function is called on rq/netdev close. */ + mlx5e_free_rx_mpwqe(rq, wi); + + /* Avoid a second release of the wqe pages: dealloc is called also + * for missing wqes on an already flushed RQ. + */ + bitmap_fill(wi->skip_release_bitmap, rq->mpwqe.pages_per_wqe); } INDIRECT_CALLABLE_SCOPE bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq) @@ -838,17 +892,20 @@ INDIRECT_CALLABLE_SCOPE bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq) */ wqe_bulk -= (head + wqe_bulk) & rq->wqe.info.wqe_index_mask; - if (!rq->xsk_pool) - count = mlx5e_alloc_rx_wqes(rq, head, wqe_bulk); - else if (likely(!rq->xsk_pool->dma_need_sync)) + if (!rq->xsk_pool) { + count = mlx5e_refill_rx_wqes(rq, head, wqe_bulk); + } else if (likely(!rq->xsk_pool->dma_need_sync)) { + mlx5e_xsk_free_rx_wqes(rq, head, wqe_bulk); count = mlx5e_xsk_alloc_rx_wqes_batched(rq, head, wqe_bulk); - else + } else { + mlx5e_xsk_free_rx_wqes(rq, head, wqe_bulk); /* If dma_need_sync is true, it's more efficient to call * xsk_buff_alloc in a loop, rather than xsk_buff_alloc_batch, * because the latter does the same check and returns only one * frame. 
*/ count = mlx5e_xsk_alloc_rx_wqes(rq, head, wqe_bulk); + } mlx5_wq_cyc_push_n(wq, count); if (unlikely(count != wqe_bulk)) { @@ -1029,6 +1086,11 @@ INDIRECT_CALLABLE_SCOPE bool mlx5e_post_rx_mpwqes(struct mlx5e_rq *rq) head = rq->mpwqe.actual_wq_head; i = missing; do { + struct mlx5e_mpw_info *wi = mlx5e_get_mpw_info(rq, head); + + /* Deferred free for better page pool cache usage. */ + mlx5e_free_rx_mpwqe(rq, wi); + alloc_err = rq->xsk_pool ? mlx5e_xsk_alloc_rx_mpwqe(rq, head) : mlx5e_alloc_rx_mpwqe(rq, head); @@ -1133,7 +1195,7 @@ static void *mlx5e_shampo_get_packet_hd(struct mlx5e_rq *rq, u16 header_index) struct mlx5e_dma_info *last_head = &rq->mpwqe.shampo->info[header_index]; u16 head_offset = (last_head->addr & (PAGE_SIZE - 1)) + rq->buff.headroom; - return page_address(last_head->page) + head_offset; + return page_address(last_head->frag_page->page) + head_offset; } static void mlx5e_shampo_update_ipv4_udp_hdr(struct mlx5e_rq *rq, struct iphdr *ipv4) @@ -1573,10 +1635,10 @@ struct sk_buff *mlx5e_build_linear_skb(struct mlx5e_rq *rq, void *va, } static void mlx5e_fill_mxbuf(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe, - void *va, u16 headroom, u32 len, + void *va, u16 headroom, u32 frame_sz, u32 len, struct mlx5e_xdp_buff *mxbuf) { - xdp_init_buff(&mxbuf->xdp, rq->buff.frame0_sz, &rq->xdp_rxq); + xdp_init_buff(&mxbuf->xdp, frame_sz, &rq->xdp_rxq); xdp_prepare_buff(&mxbuf->xdp, va, headroom, len, true); mxbuf->cqe = cqe; mxbuf->rq = rq; @@ -1586,7 +1648,7 @@ static struct sk_buff * mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi, struct mlx5_cqe64 *cqe, u32 cqe_bcnt) { - union mlx5e_alloc_unit *au = wi->au; + struct mlx5e_frag_page *frag_page = wi->frag_page; u16 rx_headroom = rq->buff.headroom; struct bpf_prog *prog; struct sk_buff *skb; @@ -1595,11 +1657,11 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi, dma_addr_t addr; u32 frag_size; - va = page_address(au->page) + wi->offset; + va = 
page_address(frag_page->page) + wi->offset; data = va + rx_headroom; frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt); - addr = page_pool_get_dma_addr(au->page); + addr = page_pool_get_dma_addr(frag_page->page); dma_sync_single_range_for_cpu(rq->pdev, addr, wi->offset, frag_size, rq->buff.map_dir); net_prefetch(data); @@ -1609,7 +1671,8 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi, struct mlx5e_xdp_buff mxbuf; net_prefetchw(va); /* xdp_frame data area */ - mlx5e_fill_mxbuf(rq, cqe, va, rx_headroom, cqe_bcnt, &mxbuf); + mlx5e_fill_mxbuf(rq, cqe, va, rx_headroom, rq->buff.frame0_sz, + cqe_bcnt, &mxbuf); if (mlx5e_xdp_handle(rq, prog, &mxbuf)) return NULL; /* page/packet was consumed by XDP */ @@ -1623,7 +1686,8 @@ mlx5e_skb_from_cqe_linear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi, return NULL; /* queue up for recycling/reuse */ - page_ref_inc(au->page); + skb_mark_for_recycle(skb); + frag_page->frags++; return skb; } @@ -1634,8 +1698,8 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi { struct mlx5e_rq_frag_info *frag_info = &rq->wqe.info.arr[0]; struct mlx5e_wqe_frag_info *head_wi = wi; - union mlx5e_alloc_unit *au = wi->au; u16 rx_headroom = rq->buff.headroom; + struct mlx5e_frag_page *frag_page; struct skb_shared_info *sinfo; struct mlx5e_xdp_buff mxbuf; u32 frag_consumed_bytes; @@ -1645,16 +1709,19 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi u32 truesize; void *va; - va = page_address(au->page) + wi->offset; + frag_page = wi->frag_page; + + va = page_address(frag_page->page) + wi->offset; frag_consumed_bytes = min_t(u32, frag_info->frag_size, cqe_bcnt); - addr = page_pool_get_dma_addr(au->page); + addr = page_pool_get_dma_addr(frag_page->page); dma_sync_single_range_for_cpu(rq->pdev, addr, wi->offset, rq->buff.frame0_sz, rq->buff.map_dir); net_prefetchw(va); /* xdp_frame data area */ net_prefetch(va + rx_headroom); - mlx5e_fill_mxbuf(rq, 
cqe, va, rx_headroom, frag_consumed_bytes, &mxbuf); + mlx5e_fill_mxbuf(rq, cqe, va, rx_headroom, rq->buff.frame0_sz, + frag_consumed_bytes, &mxbuf); sinfo = xdp_get_shared_info_from_buff(&mxbuf.xdp); truesize = 0; @@ -1663,34 +1730,12 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi wi++; while (cqe_bcnt) { - skb_frag_t *frag; - - au = wi->au; + frag_page = wi->frag_page; frag_consumed_bytes = min_t(u32, frag_info->frag_size, cqe_bcnt); - addr = page_pool_get_dma_addr(au->page); - dma_sync_single_for_cpu(rq->pdev, addr + wi->offset, - frag_consumed_bytes, rq->buff.map_dir); - - if (!xdp_buff_has_frags(&mxbuf.xdp)) { - /* Init on the first fragment to avoid cold cache access - * when possible. - */ - sinfo->nr_frags = 0; - sinfo->xdp_frags_size = 0; - xdp_buff_set_frags_flag(&mxbuf.xdp); - } - - frag = &sinfo->frags[sinfo->nr_frags++]; - __skb_frag_set_page(frag, au->page); - skb_frag_off_set(frag, wi->offset); - skb_frag_size_set(frag, frag_consumed_bytes); - - if (page_is_pfmemalloc(au->page)) - xdp_buff_set_frag_pfmemalloc(&mxbuf.xdp); - - sinfo->xdp_frags_size += frag_consumed_bytes; + mlx5e_add_skb_shared_info_frag(rq, sinfo, &mxbuf.xdp, frag_page, + wi->offset, frag_consumed_bytes); truesize += frag_info->frag_stride; cqe_bcnt -= frag_consumed_bytes; @@ -1701,10 +1746,10 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi prog = rcu_dereference(rq->xdp_prog); if (prog && mlx5e_xdp_handle(rq, prog, &mxbuf)) { if (test_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { - int i; + struct mlx5e_wqe_frag_info *pwi; - for (i = wi - head_wi; i < rq->wqe.info.num_frags; i++) - mlx5e_put_rx_frag(rq, &head_wi[i], true); + for (pwi = head_wi; pwi < wi; pwi++) + pwi->flags |= BIT(MLX5E_WQE_FRAG_SKIP_RELEASE); } return NULL; /* page/packet was consumed by XDP */ } @@ -1716,21 +1761,17 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi if (unlikely(!skb)) return NULL; - 
page_ref_inc(head_wi->au->page); + skb_mark_for_recycle(skb); + head_wi->frag_page->frags++; if (xdp_buff_has_frags(&mxbuf.xdp)) { - int i; - /* sinfo->nr_frags is reset by build_skb, calculate again. */ xdp_update_skb_shared_info(skb, wi - head_wi - 1, sinfo->xdp_frags_size, truesize, xdp_buff_is_frag_pfmemalloc(&mxbuf.xdp)); - for (i = 0; i < sinfo->nr_frags; i++) { - skb_frag_t *frag = &sinfo->frags[i]; - - page_ref_inc(skb_frag_page(frag)); - } + for (struct mlx5e_wqe_frag_info *pwi = head_wi + 1; pwi < wi; pwi++) + pwi->frag_page->frags++; } return skb; @@ -1768,7 +1809,7 @@ static void mlx5e_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe) if (unlikely(MLX5E_RX_ERR_CQE(cqe))) { mlx5e_handle_rx_err_cqe(rq, cqe); - goto free_wqe; + goto wq_cyc_pop; } skb = INDIRECT_CALL_3(rq->wqe.skb_from_cqe, @@ -1782,9 +1823,9 @@ static void mlx5e_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe) /* do not return page to cache, * it will be returned on XDP_TX completion. */ - goto wq_cyc_pop; + wi->flags |= BIT(MLX5E_WQE_FRAG_SKIP_RELEASE); } - goto free_wqe; + goto wq_cyc_pop; } mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb); @@ -1792,13 +1833,11 @@ static void mlx5e_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe) if (mlx5e_cqe_regb_chain(cqe)) if (!mlx5e_tc_update_skb_nic(cqe, skb)) { dev_kfree_skb_any(skb); - goto free_wqe; + goto wq_cyc_pop; } napi_gro_receive(rq->cq.napi, skb); -free_wqe: - mlx5e_free_rx_wqe(rq, wi, true); wq_cyc_pop: mlx5_wq_cyc_pop(wq); } @@ -1822,7 +1861,7 @@ static void mlx5e_handle_rx_cqe_rep(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe) if (unlikely(MLX5E_RX_ERR_CQE(cqe))) { mlx5e_handle_rx_err_cqe(rq, cqe); - goto free_wqe; + goto wq_cyc_pop; } skb = INDIRECT_CALL_2(rq->wqe.skb_from_cqe, @@ -1835,9 +1874,9 @@ static void mlx5e_handle_rx_cqe_rep(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe) /* do not return page to cache, * it will be returned on XDP_TX completion. 
*/ - goto wq_cyc_pop; + wi->flags |= BIT(MLX5E_WQE_FRAG_SKIP_RELEASE); } - goto free_wqe; + goto wq_cyc_pop; } mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb); @@ -1847,8 +1886,6 @@ static void mlx5e_handle_rx_cqe_rep(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe) mlx5e_rep_tc_receive(cqe, rq, skb); -free_wqe: - mlx5e_free_rx_wqe(rq, wi, true); wq_cyc_pop: mlx5_wq_cyc_pop(wq); } @@ -1901,7 +1938,6 @@ mpwrq_cqe_out: wq = &rq->mpwqe.wq; wqe = mlx5_wq_ll_get_wqe(wq, wqe_id); - mlx5e_free_rx_mpwqe(rq, wi, true); mlx5_wq_ll_pop(wq, cqe->wqe_id, &wqe->next.next_wqe_index); } @@ -1913,7 +1949,8 @@ const struct mlx5e_rx_handlers mlx5e_rx_handlers_rep = { static void mlx5e_fill_skb_data(struct sk_buff *skb, struct mlx5e_rq *rq, - union mlx5e_alloc_unit *au, u32 data_bcnt, u32 data_offset) + struct mlx5e_frag_page *frag_page, + u32 data_bcnt, u32 data_offset) { net_prefetchw(skb->data); @@ -1927,12 +1964,13 @@ mlx5e_fill_skb_data(struct sk_buff *skb, struct mlx5e_rq *rq, else truesize = ALIGN(pg_consumed_bytes, BIT(rq->mpwqe.log_stride_sz)); - mlx5e_add_skb_frag(rq, skb, au, data_offset, + frag_page->frags++; + mlx5e_add_skb_frag(rq, skb, frag_page->page, data_offset, pg_consumed_bytes, truesize); data_bcnt -= pg_consumed_bytes; data_offset = 0; - au++; + frag_page++; } } @@ -1941,37 +1979,142 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, u32 page_idx) { - union mlx5e_alloc_unit *au = &wi->alloc_units[page_idx]; + struct mlx5e_frag_page *frag_page = &wi->alloc_units.frag_pages[page_idx]; u16 headlen = min_t(u16, MLX5E_RX_MAX_HEAD, cqe_bcnt); - u32 frag_offset = head_offset + headlen; - u32 byte_cnt = cqe_bcnt - headlen; - union mlx5e_alloc_unit *head_au = au; + struct mlx5e_frag_page *head_page = frag_page; + u32 frag_offset = head_offset; + u32 byte_cnt = cqe_bcnt; + struct skb_shared_info *sinfo; + struct mlx5e_xdp_buff mxbuf; + unsigned int truesize = 0; + struct bpf_prog *prog; 
struct sk_buff *skb; - dma_addr_t addr; + u32 linear_frame_sz; + u16 linear_data_len; + u16 linear_hr; + void *va; - skb = napi_alloc_skb(rq->cq.napi, - ALIGN(MLX5E_RX_MAX_HEAD, sizeof(long))); - if (unlikely(!skb)) { - rq->stats->buff_alloc_err++; - return NULL; + prog = rcu_dereference(rq->xdp_prog); + + if (prog) { + /* area for bpf_xdp_[store|load]_bytes */ + net_prefetchw(page_address(frag_page->page) + frag_offset); + if (unlikely(mlx5e_page_alloc_fragmented(rq, &wi->linear_page))) { + rq->stats->buff_alloc_err++; + return NULL; + } + va = page_address(wi->linear_page.page); + net_prefetchw(va); /* xdp_frame data area */ + linear_hr = XDP_PACKET_HEADROOM; + linear_data_len = 0; + linear_frame_sz = MLX5_SKB_FRAG_SZ(linear_hr + MLX5E_RX_MAX_HEAD); + } else { + skb = napi_alloc_skb(rq->cq.napi, + ALIGN(MLX5E_RX_MAX_HEAD, sizeof(long))); + if (unlikely(!skb)) { + rq->stats->buff_alloc_err++; + return NULL; + } + skb_mark_for_recycle(skb); + va = skb->head; + net_prefetchw(va); /* xdp_frame data area */ + net_prefetchw(skb->data); + + frag_offset += headlen; + byte_cnt -= headlen; + linear_hr = skb_headroom(skb); + linear_data_len = headlen; + linear_frame_sz = MLX5_SKB_FRAG_SZ(skb_end_offset(skb)); + if (unlikely(frag_offset >= PAGE_SIZE)) { + frag_page++; + frag_offset -= PAGE_SIZE; + } } - net_prefetchw(skb->data); + mlx5e_fill_mxbuf(rq, cqe, va, linear_hr, linear_frame_sz, linear_data_len, &mxbuf); + + sinfo = xdp_get_shared_info_from_buff(&mxbuf.xdp); + + while (byte_cnt) { + /* Non-linear mode, hence non-XSK, which always uses PAGE_SIZE. */ + u32 pg_consumed_bytes = min_t(u32, PAGE_SIZE - frag_offset, byte_cnt); - /* Non-linear mode, hence non-XSK, which always uses PAGE_SIZE. 
*/ - if (unlikely(frag_offset >= PAGE_SIZE)) { - au++; - frag_offset -= PAGE_SIZE; + if (test_bit(MLX5E_RQ_STATE_SHAMPO, &rq->state)) + truesize += pg_consumed_bytes; + else + truesize += ALIGN(pg_consumed_bytes, BIT(rq->mpwqe.log_stride_sz)); + + mlx5e_add_skb_shared_info_frag(rq, sinfo, &mxbuf.xdp, frag_page, frag_offset, + pg_consumed_bytes); + byte_cnt -= pg_consumed_bytes; + frag_offset = 0; + frag_page++; } - mlx5e_fill_skb_data(skb, rq, au, byte_cnt, frag_offset); - /* copy header */ - addr = page_pool_get_dma_addr(head_au->page); - mlx5e_copy_skb_header(rq, skb, head_au->page, addr, - head_offset, head_offset, headlen); - /* skb linear part was allocated with headlen and aligned to long */ - skb->tail += headlen; - skb->len += headlen; + if (prog) { + if (mlx5e_xdp_handle(rq, prog, &mxbuf)) { + if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) { + int i; + + for (i = 0; i < sinfo->nr_frags; i++) + /* non-atomic */ + __set_bit(page_idx + i, wi->skip_release_bitmap); + return NULL; + } + mlx5e_page_release_fragmented(rq, &wi->linear_page); + return NULL; /* page/packet was consumed by XDP */ + } + + skb = mlx5e_build_linear_skb(rq, mxbuf.xdp.data_hard_start, + linear_frame_sz, + mxbuf.xdp.data - mxbuf.xdp.data_hard_start, 0, + mxbuf.xdp.data - mxbuf.xdp.data_meta); + if (unlikely(!skb)) { + mlx5e_page_release_fragmented(rq, &wi->linear_page); + return NULL; + } + + skb_mark_for_recycle(skb); + wi->linear_page.frags++; + mlx5e_page_release_fragmented(rq, &wi->linear_page); + + if (xdp_buff_has_frags(&mxbuf.xdp)) { + struct mlx5e_frag_page *pagep; + + /* sinfo->nr_frags is reset by build_skb, calculate again. 
*/ + xdp_update_skb_shared_info(skb, frag_page - head_page, + sinfo->xdp_frags_size, truesize, + xdp_buff_is_frag_pfmemalloc(&mxbuf.xdp)); + + pagep = head_page; + do + pagep->frags++; + while (++pagep < frag_page); + } + __pskb_pull_tail(skb, headlen); + } else { + dma_addr_t addr; + + if (xdp_buff_has_frags(&mxbuf.xdp)) { + struct mlx5e_frag_page *pagep; + + xdp_update_skb_shared_info(skb, sinfo->nr_frags, + sinfo->xdp_frags_size, truesize, + xdp_buff_is_frag_pfmemalloc(&mxbuf.xdp)); + + pagep = frag_page - sinfo->nr_frags; + do + pagep->frags++; + while (++pagep < frag_page); + } + /* copy header */ + addr = page_pool_get_dma_addr(head_page->page); + mlx5e_copy_skb_header(rq, skb, head_page->page, addr, + head_offset, head_offset, headlen); + /* skb linear part was allocated with headlen and aligned to long */ + skb->tail += headlen; + skb->len += headlen; + } return skb; } @@ -1981,7 +2124,7 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, struct mlx5_cqe64 *cqe, u16 cqe_bcnt, u32 head_offset, u32 page_idx) { - union mlx5e_alloc_unit *au = &wi->alloc_units[page_idx]; + struct mlx5e_frag_page *frag_page = &wi->alloc_units.frag_pages[page_idx]; u16 rx_headroom = rq->buff.headroom; struct bpf_prog *prog; struct sk_buff *skb; @@ -1996,11 +2139,11 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, return NULL; } - va = page_address(au->page) + head_offset; + va = page_address(frag_page->page) + head_offset; data = va + rx_headroom; frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + cqe_bcnt); - addr = page_pool_get_dma_addr(au->page); + addr = page_pool_get_dma_addr(frag_page->page); dma_sync_single_range_for_cpu(rq->pdev, addr, head_offset, frag_size, rq->buff.map_dir); net_prefetch(data); @@ -2010,10 +2153,11 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, struct mlx5e_xdp_buff mxbuf; net_prefetchw(va); /* xdp_frame data area */ - mlx5e_fill_mxbuf(rq, cqe, va, rx_headroom, 
cqe_bcnt, &mxbuf); + mlx5e_fill_mxbuf(rq, cqe, va, rx_headroom, rq->buff.frame0_sz, + cqe_bcnt, &mxbuf); if (mlx5e_xdp_handle(rq, prog, &mxbuf)) { if (__test_and_clear_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) - __set_bit(page_idx, wi->xdp_xmit_bitmap); /* non-atomic */ + __set_bit(page_idx, wi->skip_release_bitmap); /* non-atomic */ return NULL; /* page/packet was consumed by XDP */ } @@ -2027,7 +2171,8 @@ mlx5e_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, return NULL; /* queue up for recycling/reuse */ - page_ref_inc(au->page); + skb_mark_for_recycle(skb); + frag_page->frags++; return skb; } @@ -2044,7 +2189,7 @@ mlx5e_skb_from_cqe_shampo(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, void *hdr, *data; u32 frag_size; - hdr = page_address(head->page) + head_offset; + hdr = page_address(head->frag_page->page) + head_offset; data = hdr + rx_headroom; frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + head_size); @@ -2058,9 +2203,7 @@ mlx5e_skb_from_cqe_shampo(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, if (unlikely(!skb)) return NULL; - /* queue up for recycling/reuse */ - page_ref_inc(head->page); - + head->frag_page->frags++; } else { /* allocate SKB and copy header for large header */ rq->stats->gro_large_hds++; @@ -2072,13 +2215,17 @@ mlx5e_skb_from_cqe_shampo(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi, } prefetchw(skb->data); - mlx5e_copy_skb_header(rq, skb, head->page, head->addr, + mlx5e_copy_skb_header(rq, skb, head->frag_page->page, head->addr, head_offset + rx_headroom, rx_headroom, head_size); /* skb linear part was allocated with headlen and aligned to long */ skb->tail += head_size; skb->len += head_size; } + + /* queue up for recycling/reuse */ + skb_mark_for_recycle(skb); + return skb; } @@ -2123,8 +2270,10 @@ mlx5e_free_rx_shampo_hd_entry(struct mlx5e_rq *rq, u16 header_index) u64 addr = shampo->info[header_index].addr; if (((header_index + 1) & (MLX5E_SHAMPO_WQ_HEADER_PER_PAGE - 1)) == 0) { - 
shampo->info[header_index].addr = ALIGN_DOWN(addr, PAGE_SIZE); - mlx5e_page_release_dynamic(rq, shampo->info[header_index].page, true); + struct mlx5e_dma_info *dma_info = &shampo->info[header_index]; + + dma_info->addr = ALIGN_DOWN(addr, PAGE_SIZE); + mlx5e_page_release_fragmented(rq, dma_info->frag_page); } bitmap_clear(shampo->bitmap, header_index, 1); } @@ -2145,7 +2294,6 @@ static void mlx5e_handle_rx_cqe_mpwrq_shampo(struct mlx5e_rq *rq, struct mlx5_cq bool match = cqe->shampo.match; struct mlx5e_rq_stats *stats = rq->stats; struct mlx5e_rx_wqe_ll *wqe; - union mlx5e_alloc_unit *au; struct mlx5e_mpw_info *wi; struct mlx5_wq_ll *wq; @@ -2195,8 +2343,10 @@ static void mlx5e_handle_rx_cqe_mpwrq_shampo(struct mlx5e_rq *rq, struct mlx5_cq } if (likely(head_size)) { - au = &wi->alloc_units[page_idx]; - mlx5e_fill_skb_data(*skb, rq, au, data_bcnt, data_offset); + struct mlx5e_frag_page *frag_page; + + frag_page = &wi->alloc_units.frag_pages[page_idx]; + mlx5e_fill_skb_data(*skb, rq, frag_page, data_bcnt, data_offset); } mlx5e_shampo_complete_rx_cqe(rq, cqe, cqe_bcnt, *skb); @@ -2210,7 +2360,6 @@ mpwrq_cqe_out: wq = &rq->mpwqe.wq; wqe = mlx5_wq_ll_get_wqe(wq, wqe_id); - mlx5e_free_rx_mpwqe(rq, wi, true); mlx5_wq_ll_pop(wq, cqe->wqe_id, &wqe->next.next_wqe_index); } @@ -2270,7 +2419,6 @@ mpwrq_cqe_out: wq = &rq->mpwqe.wq; wqe = mlx5_wq_ll_get_wqe(wq, wqe_id); - mlx5e_free_rx_mpwqe(rq, wi, true); mlx5_wq_ll_pop(wq, cqe->wqe_id, &wqe->next.next_wqe_index); } @@ -2489,7 +2637,7 @@ static void mlx5i_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe) if (unlikely(MLX5E_RX_ERR_CQE(cqe))) { rq->stats->wqe_err++; - goto wq_free_wqe; + goto wq_cyc_pop; } skb = INDIRECT_CALL_2(rq->wqe.skb_from_cqe, @@ -2497,17 +2645,16 @@ static void mlx5i_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe) mlx5e_skb_from_cqe_nonlinear, rq, wi, cqe, cqe_bcnt); if (!skb) - goto wq_free_wqe; + goto wq_cyc_pop; mlx5i_complete_rx_cqe(rq, cqe, cqe_bcnt, skb); if (unlikely(!skb->dev)) 
{ dev_kfree_skb_any(skb); - goto wq_free_wqe; + goto wq_cyc_pop; } napi_gro_receive(rq->cq.napi, skb); -wq_free_wqe: - mlx5e_free_rx_wqe(rq, wi, true); +wq_cyc_pop: mlx5_wq_cyc_pop(wq); } @@ -2582,12 +2729,12 @@ static void mlx5e_trap_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe if (unlikely(MLX5E_RX_ERR_CQE(cqe))) { rq->stats->wqe_err++; - goto free_wqe; + goto wq_cyc_pop; } skb = mlx5e_skb_from_cqe_nonlinear(rq, wi, cqe, cqe_bcnt); if (!skb) - goto free_wqe; + goto wq_cyc_pop; mlx5e_complete_rx_cqe(rq, cqe, cqe_bcnt, skb); skb_push(skb, ETH_HLEN); @@ -2596,8 +2743,7 @@ static void mlx5e_trap_handle_rx_cqe(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe rq->netdev->devlink_port); dev_kfree_skb_any(skb); -free_wqe: - mlx5e_free_rx_wqe(rq, wi, false); +wq_cyc_pop: mlx5_wq_cyc_pop(wq); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c index 4478223c1720..f1d9596905c6 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c @@ -179,11 +179,6 @@ static const struct counter_desc sw_stats_desc[] = { { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_buff_alloc_err) }, { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cqe_compress_blks) }, { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cqe_compress_pkts) }, - { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_reuse) }, - { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_full) }, - { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_empty) }, - { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_busy) }, - { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_cache_waive) }, { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_congst_umr) }, { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_arfs_err) }, { MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_recover) }, @@ -358,11 +353,6 @@ static void mlx5e_stats_grp_sw_update_stats_rq_stats(struct mlx5e_sw_stats *s, s->rx_buff_alloc_err += 
rq_stats->buff_alloc_err; s->rx_cqe_compress_blks += rq_stats->cqe_compress_blks; s->rx_cqe_compress_pkts += rq_stats->cqe_compress_pkts; - s->rx_cache_reuse += rq_stats->cache_reuse; - s->rx_cache_full += rq_stats->cache_full; - s->rx_cache_empty += rq_stats->cache_empty; - s->rx_cache_busy += rq_stats->cache_busy; - s->rx_cache_waive += rq_stats->cache_waive; s->rx_congst_umr += rq_stats->congst_umr; s->rx_arfs_err += rq_stats->arfs_err; s->rx_recover += rq_stats->recover; @@ -1978,11 +1968,6 @@ static const struct counter_desc rq_stats_desc[] = { { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, buff_alloc_err) }, { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cqe_compress_blks) }, { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cqe_compress_pkts) }, - { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cache_reuse) }, - { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cache_full) }, - { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cache_empty) }, - { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cache_busy) }, - { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, cache_waive) }, { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, congst_umr) }, { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, arfs_err) }, { MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, recover) }, @@ -2163,11 +2148,6 @@ static const struct counter_desc ptp_rq_stats_desc[] = { { MLX5E_DECLARE_PTP_RQ_STAT(struct mlx5e_rq_stats, buff_alloc_err) }, { MLX5E_DECLARE_PTP_RQ_STAT(struct mlx5e_rq_stats, cqe_compress_blks) }, { MLX5E_DECLARE_PTP_RQ_STAT(struct mlx5e_rq_stats, cqe_compress_pkts) }, - { MLX5E_DECLARE_PTP_RQ_STAT(struct mlx5e_rq_stats, cache_reuse) }, - { MLX5E_DECLARE_PTP_RQ_STAT(struct mlx5e_rq_stats, cache_full) }, - { MLX5E_DECLARE_PTP_RQ_STAT(struct mlx5e_rq_stats, cache_empty) }, - { MLX5E_DECLARE_PTP_RQ_STAT(struct mlx5e_rq_stats, cache_busy) }, - { MLX5E_DECLARE_PTP_RQ_STAT(struct mlx5e_rq_stats, cache_waive) }, { MLX5E_DECLARE_PTP_RQ_STAT(struct mlx5e_rq_stats, congst_umr) }, { 
MLX5E_DECLARE_PTP_RQ_STAT(struct mlx5e_rq_stats, arfs_err) }, { MLX5E_DECLARE_PTP_RQ_STAT(struct mlx5e_rq_stats, recover) }, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h index b77100b60b50..1ff8a06027dc 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h @@ -193,11 +193,6 @@ struct mlx5e_sw_stats { u64 rx_buff_alloc_err; u64 rx_cqe_compress_blks; u64 rx_cqe_compress_pkts; - u64 rx_cache_reuse; - u64 rx_cache_full; - u64 rx_cache_empty; - u64 rx_cache_busy; - u64 rx_cache_waive; u64 rx_congst_umr; u64 rx_arfs_err; u64 rx_recover; @@ -362,11 +357,6 @@ struct mlx5e_rq_stats { u64 buff_alloc_err; u64 cqe_compress_blks; u64 cqe_compress_pkts; - u64 cache_reuse; - u64 cache_full; - u64 cache_empty; - u64 cache_busy; - u64 cache_waive; u64 congst_umr; u64 arfs_err; u64 recover; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index 87a2850b32d0..728b82ce4031 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -44,6 +44,7 @@ #include <net/bareudp.h> #include <net/bonding.h> #include <net/dst_metadata.h> +#include "devlink.h" #include "en.h" #include "en/tc/post_act.h" #include "en/tc/act_stats.h" @@ -73,12 +74,6 @@ #define MLX5E_TC_TABLE_NUM_GROUPS 4 #define MLX5E_TC_TABLE_MAX_GROUP_SIZE BIT(18) -struct mlx5e_hairpin_params { - struct mlx5_core_dev *mdev; - u32 num_queues; - u32 queue_size; -}; - struct mlx5e_tc_table { /* Protects the dynamic assignment of the t parameter * which is the nic tc root table. 
@@ -101,7 +96,6 @@ struct mlx5e_tc_table { struct mlx5_tc_ct_priv *ct; struct mapping_ctx *mapping; - struct mlx5e_hairpin_params hairpin_params; struct dentry *dfs_root; /* tc action stats */ @@ -183,7 +177,8 @@ static struct lock_class_key tc_ht_wq_key; static void mlx5e_put_flow_tunnel_id(struct mlx5e_tc_flow *flow); static void free_flow_post_acts(struct mlx5e_tc_flow *flow); -static void mlx5_free_flow_attr(struct mlx5e_tc_flow *flow, struct mlx5_flow_attr *attr); +static void mlx5_free_flow_attr_actions(struct mlx5e_tc_flow *flow, + struct mlx5_flow_attr *attr); void mlx5e_tc_match_to_reg_match(struct mlx5_flow_spec *spec, @@ -493,15 +488,6 @@ mlx5e_tc_rule_offload(struct mlx5e_priv *priv, struct mlx5_eswitch *esw = priv->mdev->priv.eswitch; int err; - if (attr->flags & MLX5_ATTR_FLAG_CT) { - struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts = - &attr->parse_attr->mod_hdr_acts; - - return mlx5_tc_ct_flow_offload(get_ct_priv(priv), - spec, attr, - mod_hdr_acts); - } - if (!is_mdev_switchdev_mode(priv->mdev)) return mlx5e_add_offloaded_nic_rule(priv, spec, attr); @@ -524,11 +510,6 @@ mlx5e_tc_rule_unoffload(struct mlx5e_priv *priv, { struct mlx5_eswitch *esw = priv->mdev->priv.eswitch; - if (attr->flags & MLX5_ATTR_FLAG_CT) { - mlx5_tc_ct_delete_flow(get_ct_priv(priv), attr); - return; - } - if (!is_mdev_switchdev_mode(priv->mdev)) { mlx5e_del_offloaded_nic_rule(priv, rule, attr); return; @@ -589,6 +570,7 @@ struct mlx5e_hairpin { struct mlx5e_tir direct_tir; int num_channels; + u8 log_num_packets; struct mlx5e_rqt indir_rqt; struct mlx5e_tir indir_tir[MLX5E_NUM_INDIR_TIRS]; struct mlx5_ttc_table *ttc; @@ -935,6 +917,7 @@ mlx5e_hairpin_create(struct mlx5e_priv *priv, struct mlx5_hairpin_params *params hp->func_mdev = func_mdev; hp->func_priv = priv; hp->num_channels = params->num_channels; + hp->log_num_packets = params->log_num_packets; err = mlx5e_hairpin_create_transport(hp); if (err) @@ -1076,9 +1059,11 @@ static int debugfs_hairpin_table_dump_show(struct seq_file 
*file, void *priv) mutex_lock(&tc->hairpin_tbl_lock); hash_for_each(tc->hairpin_tbl, bkt, hpe, hairpin_hlist) - seq_printf(file, "Hairpin peer_vhca_id %u prio %u refcnt %u\n", + seq_printf(file, + "Hairpin peer_vhca_id %u prio %u refcnt %u num_channels %u num_packets %lu\n", hpe->peer_vhca_id, hpe->prio, - refcount_read(&hpe->refcnt)); + refcount_read(&hpe->refcnt), hpe->hp->num_channels, + BIT(hpe->hp->log_num_packets)); mutex_unlock(&tc->hairpin_tbl_lock); return 0; @@ -1099,33 +1084,15 @@ static void mlx5e_tc_debugfs_init(struct mlx5e_tc_table *tc, &debugfs_hairpin_table_dump_fops); } -static void -mlx5e_hairpin_params_init(struct mlx5e_hairpin_params *hairpin_params, - struct mlx5_core_dev *mdev) -{ - u32 link_speed = 0; - u64 link_speed64; - - hairpin_params->mdev = mdev; - /* set hairpin pair per each 50Gbs share of the link */ - mlx5e_port_max_linkspeed(mdev, &link_speed); - link_speed = max_t(u32, link_speed, 50000); - link_speed64 = link_speed; - do_div(link_speed64, 50000); - hairpin_params->num_queues = link_speed64; - - hairpin_params->queue_size = - BIT(min_t(u32, 16 - MLX5_MPWRQ_MIN_LOG_STRIDE_SZ(mdev), - MLX5_CAP_GEN(mdev, log_max_hairpin_num_packets))); -} - static int mlx5e_hairpin_flow_add(struct mlx5e_priv *priv, struct mlx5e_tc_flow *flow, struct mlx5e_tc_flow_parse_attr *parse_attr, struct netlink_ext_ack *extack) { struct mlx5e_tc_table *tc = mlx5e_fs_get_tc(priv->fs); + struct devlink *devlink = priv_to_devlink(priv->mdev); int peer_ifindex = parse_attr->mirred_ifindex[0]; + union devlink_param_value val = {}; struct mlx5_hairpin_params params; struct mlx5_core_dev *peer_mdev; struct mlx5e_hairpin_entry *hpe; @@ -1182,7 +1149,14 @@ static int mlx5e_hairpin_flow_add(struct mlx5e_priv *priv, hash_hairpin_info(peer_id, match_prio)); mutex_unlock(&tc->hairpin_tbl_lock); - params.log_num_packets = ilog2(tc->hairpin_params.queue_size); + err = devl_param_driverinit_value_get( + devlink, MLX5_DEVLINK_PARAM_ID_HAIRPIN_QUEUE_SIZE, &val); + if (err) { 
+ err = -ENOMEM; + goto out_err; + } + + params.log_num_packets = ilog2(val.vu32); params.log_data_size = clamp_t(u32, params.log_num_packets + @@ -1191,7 +1165,14 @@ static int mlx5e_hairpin_flow_add(struct mlx5e_priv *priv, MLX5_CAP_GEN(priv->mdev, log_max_hairpin_wq_data_sz)); params.q_counter = priv->q_counter; - params.num_channels = tc->hairpin_params.num_queues; + err = devl_param_driverinit_value_get( + devlink, MLX5_DEVLINK_PARAM_ID_HAIRPIN_NUM_QUEUES, &val); + if (err) { + err = -ENOMEM; + goto out_err; + } + + params.num_channels = val.vu32; hp = mlx5e_hairpin_create(priv, &params, peer_ifindex); hpe->hp = hp; @@ -1401,13 +1382,7 @@ mlx5e_tc_add_nic_flow(struct mlx5e_priv *priv, return err; } - if (attr->flags & MLX5_ATTR_FLAG_CT) - flow->rule[0] = mlx5_tc_ct_flow_offload(get_ct_priv(priv), &parse_attr->spec, - attr, &parse_attr->mod_hdr_acts); - else - flow->rule[0] = mlx5e_add_offloaded_nic_rule(priv, &parse_attr->spec, - attr); - + flow->rule[0] = mlx5e_add_offloaded_nic_rule(priv, &parse_attr->spec, attr); return PTR_ERR_OR_ZERO(flow->rule[0]); } @@ -1438,9 +1413,7 @@ static void mlx5e_tc_del_nic_flow(struct mlx5e_priv *priv, flow_flag_clear(flow, OFFLOADED); - if (attr->flags & MLX5_ATTR_FLAG_CT) - mlx5_tc_ct_delete_flow(get_ct_priv(flow->priv), attr); - else if (!IS_ERR_OR_NULL(flow->rule[0])) + if (!IS_ERR_OR_NULL(flow->rule[0])) mlx5e_del_offloaded_nic_rule(priv, flow->rule[0], attr); /* Remove root table if no rules are left to avoid @@ -1791,8 +1764,7 @@ out: static void clean_encap_dests(struct mlx5e_priv *priv, struct mlx5e_tc_flow *flow, - struct mlx5_flow_attr *attr, - bool *vf_tun) + struct mlx5_flow_attr *attr) { struct mlx5_esw_flow_attr *esw_attr; int out_index; @@ -1801,17 +1773,11 @@ clean_encap_dests(struct mlx5e_priv *priv, return; esw_attr = attr->esw_attr; - *vf_tun = false; for (out_index = 0; out_index < MLX5_MAX_FLOW_FWD_VPORTS; out_index++) { if (!(esw_attr->dests[out_index].flags & MLX5_ESW_DEST_ENCAP)) continue; - if 
(esw_attr->dests[out_index].flags & - MLX5_ESW_DEST_CHAIN_WITH_SRC_PORT_CHANGE && - !esw_attr->dest_int_port) - *vf_tun = true; - mlx5e_detach_encap(priv, flow, attr, out_index); kfree(attr->parse_attr->tun_info[out_index]); } @@ -2034,7 +2000,7 @@ static void free_branch_attr(struct mlx5e_tc_flow *flow, struct mlx5_flow_attr * if (!attr) return; - mlx5_free_flow_attr(flow, attr); + mlx5_free_flow_attr_actions(flow, attr); kvfree(attr->parse_attr); kfree(attr); } @@ -2045,7 +2011,6 @@ static void mlx5e_tc_del_fdb_flow(struct mlx5e_priv *priv, struct mlx5_eswitch *esw = priv->mdev->priv.eswitch; struct mlx5_flow_attr *attr = flow->attr; struct mlx5_esw_flow_attr *esw_attr; - bool vf_tun; esw_attr = attr->esw_attr; mlx5e_put_flow_tunnel_id(flow); @@ -2067,18 +2032,8 @@ static void mlx5e_tc_del_fdb_flow(struct mlx5e_priv *priv, if (flow->decap_route) mlx5e_detach_decap_route(priv, flow); - clean_encap_dests(priv, flow, attr, &vf_tun); - mlx5_tc_ct_match_del(get_ct_priv(priv), &flow->attr->ct_attr); - if (attr->action & MLX5_FLOW_CONTEXT_ACTION_MOD_HDR) { - mlx5e_mod_hdr_dealloc(&attr->parse_attr->mod_hdr_acts); - mlx5e_tc_detach_mod_hdr(priv, flow, attr); - } - - if (attr->action & MLX5_FLOW_CONTEXT_ACTION_COUNT) - mlx5_fc_destroy(esw_attr->counter_dev, attr->counter); - if (esw_attr->int_port) mlx5e_tc_int_port_put(mlx5e_get_int_port_priv(priv), esw_attr->int_port); @@ -2091,8 +2046,7 @@ static void mlx5e_tc_del_fdb_flow(struct mlx5e_priv *priv, mlx5e_tc_act_stats_del_flow(get_act_stats_handle(priv), flow); free_flow_post_acts(flow); - free_branch_attr(flow, attr->branch_true); - free_branch_attr(flow, attr->branch_false); + mlx5_free_flow_attr_actions(flow, attr); kvfree(attr->esw_attr->rx_tun_attr); kvfree(attr->parse_attr); @@ -3469,114 +3423,59 @@ struct ipv6_hoplimit_word { }; static bool -is_action_keys_supported(const struct flow_action_entry *act, bool ct_flow, - bool *modify_ip_header, bool *modify_tuple, - struct netlink_ext_ack *extack) 
+is_flow_action_modify_ip_header(struct flow_action *flow_action) { + const struct flow_action_entry *act; u32 mask, offset; u8 htype; + int i; - htype = act->mangle.htype; - offset = act->mangle.offset; - mask = ~act->mangle.mask; /* For IPv4 & IPv6 header check 4 byte word, * to determine that modified fields * are NOT ttl & hop_limit only. */ - if (htype == FLOW_ACT_MANGLE_HDR_TYPE_IP4) { - struct ip_ttl_word *ttl_word = - (struct ip_ttl_word *)&mask; - - if (offset != offsetof(struct iphdr, ttl) || - ttl_word->protocol || - ttl_word->check) { - *modify_ip_header = true; - } - - if (offset >= offsetof(struct iphdr, saddr)) - *modify_tuple = true; - - if (ct_flow && *modify_tuple) { - NL_SET_ERR_MSG_MOD(extack, - "can't offload re-write of ipv4 address with action ct"); - return false; - } - } else if (htype == FLOW_ACT_MANGLE_HDR_TYPE_IP6) { - struct ipv6_hoplimit_word *hoplimit_word = - (struct ipv6_hoplimit_word *)&mask; - - if (offset != offsetof(struct ipv6hdr, payload_len) || - hoplimit_word->payload_len || - hoplimit_word->nexthdr) { - *modify_ip_header = true; - } - - if (ct_flow && offset >= offsetof(struct ipv6hdr, saddr)) - *modify_tuple = true; + flow_action_for_each(i, act, flow_action) { + if (act->id != FLOW_ACTION_MANGLE && + act->id != FLOW_ACTION_ADD) + continue; - if (ct_flow && *modify_tuple) { - NL_SET_ERR_MSG_MOD(extack, - "can't offload re-write of ipv6 address with action ct"); - return false; - } - } else if (htype == FLOW_ACT_MANGLE_HDR_TYPE_TCP || - htype == FLOW_ACT_MANGLE_HDR_TYPE_UDP) { - *modify_tuple = true; - if (ct_flow) { - NL_SET_ERR_MSG_MOD(extack, - "can't offload re-write of transport header ports with action ct"); - return false; + htype = act->mangle.htype; + offset = act->mangle.offset; + mask = ~act->mangle.mask; + + if (htype == FLOW_ACT_MANGLE_HDR_TYPE_IP4) { + struct ip_ttl_word *ttl_word = + (struct ip_ttl_word *)&mask; + + if (offset != offsetof(struct iphdr, ttl) || + ttl_word->protocol || + ttl_word->check) + 
return true; + } else if (htype == FLOW_ACT_MANGLE_HDR_TYPE_IP6) { + struct ipv6_hoplimit_word *hoplimit_word = + (struct ipv6_hoplimit_word *)&mask; + + if (offset != offsetof(struct ipv6hdr, payload_len) || + hoplimit_word->payload_len || + hoplimit_word->nexthdr) + return true; } } - return true; -} - -static bool modify_tuple_supported(bool modify_tuple, bool ct_clear, - bool ct_flow, struct netlink_ext_ack *extack, - struct mlx5e_priv *priv, - struct mlx5_flow_spec *spec) -{ - if (!modify_tuple || ct_clear) - return true; - - if (ct_flow) { - NL_SET_ERR_MSG_MOD(extack, - "can't offload tuple modification with non-clear ct()"); - netdev_info(priv->netdev, - "can't offload tuple modification with non-clear ct()"); - return false; - } - - /* Add ct_state=-trk match so it will be offloaded for non ct flows - * (or after clear action), as otherwise, since the tuple is changed, - * we can't restore ct state - */ - if (mlx5_tc_ct_add_no_trk_match(spec)) { - NL_SET_ERR_MSG_MOD(extack, - "can't offload tuple modification with ct matches and no ct(clear) action"); - netdev_info(priv->netdev, - "can't offload tuple modification with ct matches and no ct(clear) action"); - return false; - } - - return true; + return false; } static bool modify_header_match_supported(struct mlx5e_priv *priv, struct mlx5_flow_spec *spec, struct flow_action *flow_action, - u32 actions, bool ct_flow, - bool ct_clear, + u32 actions, struct netlink_ext_ack *extack) { - const struct flow_action_entry *act; - bool modify_ip_header, modify_tuple; + bool modify_ip_header; void *headers_c; void *headers_v; u16 ethertype; u8 ip_proto; - int i; headers_c = mlx5e_get_match_headers_criteria(actions, spec); headers_v = mlx5e_get_match_headers_value(actions, spec); @@ -3587,23 +3486,7 @@ static bool modify_header_match_supported(struct mlx5e_priv *priv, ethertype != ETH_P_IP && ethertype != ETH_P_IPV6) goto out_ok; - modify_ip_header = false; - modify_tuple = false; - flow_action_for_each(i, act, 
flow_action) { - if (act->id != FLOW_ACTION_MANGLE && - act->id != FLOW_ACTION_ADD) - continue; - - if (!is_action_keys_supported(act, ct_flow, - &modify_ip_header, - &modify_tuple, extack)) - return false; - } - - if (!modify_tuple_supported(modify_tuple, ct_clear, ct_flow, extack, - priv, spec)) - return false; - + modify_ip_header = is_flow_action_modify_ip_header(flow_action); ip_proto = MLX5_GET(fte_match_set_lyr_2_4, headers_v, ip_protocol); if (modify_ip_header && ip_proto != IPPROTO_TCP && ip_proto != IPPROTO_UDP && ip_proto != IPPROTO_ICMP) { @@ -3624,19 +3507,6 @@ actions_match_supported_fdb(struct mlx5e_priv *priv, struct netlink_ext_ack *extack) { struct mlx5_esw_flow_attr *esw_attr = flow->attr->esw_attr; - bool ct_flow, ct_clear; - - ct_clear = flow->attr->ct_attr.ct_action & TCA_CT_ACT_CLEAR; - ct_flow = flow_flag_test(flow, CT) && !ct_clear; - - if (esw_attr->split_count && ct_flow && - !MLX5_CAP_GEN(esw_attr->in_mdev, reg_c_preserve)) { - /* All registers used by ct are cleared when using - * split rules. 
-		 */
-		NL_SET_ERR_MSG_MOD(extack, "Can't offload mirroring with action ct");
-		return false;
-	}

 	if (esw_attr->split_count > 0 && !mlx5_esw_has_fwd_fdb(priv->mdev)) {
 		NL_SET_ERR_MSG_MOD(extack,
@@ -3657,14 +3527,9 @@ actions_match_supported(struct mlx5e_priv *priv,
 			struct mlx5e_tc_flow *flow,
 			struct netlink_ext_ack *extack)
 {
-	bool ct_flow, ct_clear;
-
-	ct_clear = flow->attr->ct_attr.ct_action & TCA_CT_ACT_CLEAR;
-	ct_flow = flow_flag_test(flow, CT) && !ct_clear;
-
 	if (actions & MLX5_FLOW_CONTEXT_ACTION_MOD_HDR &&
-	    !modify_header_match_supported(priv, &parse_attr->spec, flow_action,
-					   actions, ct_flow, ct_clear, extack))
+	    !modify_header_match_supported(priv, &parse_attr->spec, flow_action, actions,
+					   extack))
 		return false;

 	if (mlx5e_is_eswitch_flow(flow) &&
@@ -3758,6 +3623,7 @@ mlx5e_clone_flow_attr_for_post_act(struct mlx5_flow_attr *attr,
 	attr2->dest_chain = 0;
 	attr2->dest_ft = NULL;
 	attr2->act_id_restore_rule = NULL;
+	memset(&attr2->ct_attr, 0, sizeof(attr2->ct_attr));

 	if (ns_type == MLX5_FLOW_NAMESPACE_FDB) {
 		attr2->esw_attr->out_count = 0;
@@ -3811,9 +3677,7 @@ free_flow_post_acts(struct mlx5e_tc_flow *flow)
 		if (list_is_last(&attr->list, &flow->attrs))
 			break;

-		mlx5_free_flow_attr(flow, attr);
-		free_branch_attr(flow, attr->branch_true);
-		free_branch_attr(flow, attr->branch_false);
+		mlx5_free_flow_attr_actions(flow, attr);

 		list_del(&attr->list);
 		kvfree(attr->parse_attr);
@@ -4071,76 +3935,79 @@ parse_tc_actions(struct mlx5e_tc_act_parse_state *parse_state,
 		 struct flow_action *flow_action)
 {
 	struct netlink_ext_ack *extack = parse_state->extack;
-	struct mlx5e_tc_flow_action flow_action_reorder;
 	struct mlx5e_tc_flow *flow = parse_state->flow;
 	struct mlx5e_tc_jump_state jump_state = {};
 	struct mlx5_flow_attr *attr = flow->attr;
 	enum mlx5_flow_namespace_type ns_type;
 	struct mlx5e_priv *priv = flow->priv;
-	struct flow_action_entry *act, **_act;
+	struct mlx5_flow_attr *prev_attr;
+	struct flow_action_entry *act;
 	struct mlx5e_tc_act *tc_act;
+	bool is_missable;
 	int err, i;

-	flow_action_reorder.num_entries = flow_action->num_entries;
-	flow_action_reorder.entries = kcalloc(flow_action->num_entries,
-					      sizeof(flow_action), GFP_KERNEL);
-	if (!flow_action_reorder.entries)
-		return -ENOMEM;
-
-	mlx5e_tc_act_reorder_flow_actions(flow_action, &flow_action_reorder);
-
 	ns_type = mlx5e_get_flow_namespace(flow);
 	list_add(&attr->list, &flow->attrs);

-	flow_action_for_each(i, _act, &flow_action_reorder) {
+	flow_action_for_each(i, act, flow_action) {
 		jump_state.jump_target = false;
-		act = *_act;
+		is_missable = false;
+		prev_attr = attr;
+
 		tc_act = mlx5e_tc_act_get(act->id, ns_type);
 		if (!tc_act) {
 			NL_SET_ERR_MSG_MOD(extack, "Not implemented offload action");
 			err = -EOPNOTSUPP;
-			goto out_free;
+			goto out_free_post_acts;
 		}

-		if (!tc_act->can_offload(parse_state, act, i, attr)) {
+		if (tc_act->can_offload && !tc_act->can_offload(parse_state, act, i, attr)) {
 			err = -EOPNOTSUPP;
-			goto out_free;
+			goto out_free_post_acts;
 		}

 		err = tc_act->parse_action(parse_state, act, priv, attr);
 		if (err)
-			goto out_free;
+			goto out_free_post_acts;

 		dec_jump_count(act, tc_act, attr, priv, &jump_state);

 		err = parse_branch_ctrl(act, tc_act, flow, attr, &jump_state, extack);
 		if (err)
-			goto out_free;
+			goto out_free_post_acts;

 		parse_state->actions |= attr->action;
-		if (!tc_act->stats_action)
-			attr->tc_act_cookies[attr->tc_act_cookies_count++] = act->cookie;

 		/* Split attr for multi table act if not the last act. */
 		if (jump_state.jump_target ||
 		    (tc_act->is_multi_table_act &&
 		    tc_act->is_multi_table_act(priv, act, attr) &&
-		    i < flow_action_reorder.num_entries - 1)) {
+		    i < flow_action->num_entries - 1)) {
+			is_missable = tc_act->is_missable ? tc_act->is_missable(act) : false;
+
 			err = mlx5e_tc_act_post_parse(parse_state, flow_action, attr, ns_type);
 			if (err)
-				goto out_free;
+				goto out_free_post_acts;

 			attr = mlx5e_clone_flow_attr_for_post_act(flow->attr, ns_type);
 			if (!attr) {
 				err = -ENOMEM;
-				goto out_free;
+				goto out_free_post_acts;
 			}

 			list_add(&attr->list, &flow->attrs);
 		}
-	}

-	kfree(flow_action_reorder.entries);
+		if (is_missable) {
+			/* Add counter to prev, and assign act to new (next) attr */
+			prev_attr->action |= MLX5_FLOW_CONTEXT_ACTION_COUNT;
+			flow_flag_set(flow, USE_ACT_STATS);
+
+			attr->tc_act_cookies[attr->tc_act_cookies_count++] = act->cookie;
+		} else if (!tc_act->stats_action) {
+			prev_attr->tc_act_cookies[prev_attr->tc_act_cookies_count++] = act->cookie;
+		}
+	}

 	err = mlx5e_tc_act_post_parse(parse_state, flow_action, attr, ns_type);
 	if (err)
@@ -4152,8 +4019,6 @@ parse_tc_actions(struct mlx5e_tc_act_parse_state *parse_state,

 	return 0;

-out_free:
-	kfree(flow_action_reorder.entries);
 out_free_post_acts:
 	free_flow_post_acts(flow);

@@ -4448,10 +4313,9 @@ mlx5_alloc_flow_attr(enum mlx5_flow_namespace_type type)
 }

 static void
-mlx5_free_flow_attr(struct mlx5e_tc_flow *flow, struct mlx5_flow_attr *attr)
+mlx5_free_flow_attr_actions(struct mlx5e_tc_flow *flow, struct mlx5_flow_attr *attr)
 {
 	struct mlx5_core_dev *counter_dev = get_flow_counter_dev(flow);
-	bool vf_tun;

 	if (!attr)
 		return;
@@ -4459,7 +4323,7 @@ mlx5_free_flow_attr(struct mlx5e_tc_flow *flow, struct mlx5_flow_attr *attr)
 	if (attr->post_act_handle)
 		mlx5e_tc_post_act_del(get_post_action(flow->priv), attr->post_act_handle);

-	clean_encap_dests(flow->priv, flow, attr, &vf_tun);
+	clean_encap_dests(flow->priv, flow, attr);

 	if (attr->action & MLX5_FLOW_CONTEXT_ACTION_COUNT)
 		mlx5_fc_destroy(counter_dev, attr->counter);
@@ -4468,6 +4332,11 @@ mlx5_free_flow_attr(struct mlx5e_tc_flow *flow, struct mlx5_flow_attr *attr)
 		mlx5e_mod_hdr_dealloc(&attr->parse_attr->mod_hdr_acts);
 		mlx5e_tc_detach_mod_hdr(flow->priv, flow, attr);
 	}
+
+	mlx5_tc_ct_delete_flow(get_ct_priv(flow->priv), attr);
+
+	free_branch_attr(flow, attr->branch_true);
+	free_branch_attr(flow, attr->branch_false);
 }

 static int
@@ -4929,7 +4798,7 @@ int mlx5e_stats_flower(struct net_device *dev, struct mlx5e_priv *priv,
 		goto errout;
 	}

-	if (mlx5e_is_offloaded_flow(flow) || flow_flag_test(flow, CT)) {
+	if (mlx5e_is_offloaded_flow(flow)) {
 		if (flow_flag_test(flow, USE_ACT_STATS)) {
 			f->use_act_stats = true;
 		} else {
@@ -5187,22 +5056,6 @@ static int mlx5e_tc_netdev_event(struct notifier_block *this,
 	return NOTIFY_DONE;
 }

-static int mlx5e_tc_nic_get_ft_size(struct mlx5_core_dev *dev)
-{
-	int tc_grp_size, tc_tbl_size;
-	u32 max_flow_counter;
-
-	max_flow_counter = (MLX5_CAP_GEN(dev, max_flow_counter_31_16) << 16) |
-			    MLX5_CAP_GEN(dev, max_flow_counter_15_0);
-
-	tc_grp_size = min_t(int, max_flow_counter, MLX5E_TC_TABLE_MAX_GROUP_SIZE);
-
-	tc_tbl_size = min_t(int, tc_grp_size * MLX5E_TC_TABLE_NUM_GROUPS,
-			    BIT(MLX5_CAP_FLOWTABLE_NIC_RX(dev, log_max_ft_size)));
-
-	return tc_tbl_size;
-}
-
 static int mlx5e_tc_nic_create_miss_table(struct mlx5e_priv *priv)
 {
 	struct mlx5e_tc_table *tc = mlx5e_fs_get_tc(priv->fs);
@@ -5275,10 +5128,10 @@ int mlx5e_tc_nic_init(struct mlx5e_priv *priv)
 		attr.flags = MLX5_CHAINS_AND_PRIOS_SUPPORTED |
 			MLX5_CHAINS_IGNORE_FLOW_LEVEL_SUPPORTED;
 	attr.ns = MLX5_FLOW_NAMESPACE_KERNEL;
-	attr.max_ft_sz = mlx5e_tc_nic_get_ft_size(dev);
 	attr.max_grp_num = MLX5E_TC_TABLE_NUM_GROUPS;
 	attr.default_ft = tc->miss_t;
 	attr.mapping = chains_mapping;
+	attr.fs_base_prio = MLX5E_TC_PRIO;

 	tc->chains = mlx5_chains_create(dev, &attr);
 	if (IS_ERR(tc->chains)) {
@@ -5286,12 +5139,12 @@ int mlx5e_tc_nic_init(struct mlx5e_priv *priv)
 		goto err_miss;
 	}

+	mlx5_chains_print_info(tc->chains);
+
 	tc->post_act = mlx5e_tc_post_act_init(priv, tc->chains, MLX5_FLOW_NAMESPACE_KERNEL);
 	tc->ct = mlx5_tc_ct_init(priv, tc->chains, &tc->mod_hdr,
 				 MLX5_FLOW_NAMESPACE_KERNEL, tc->post_act);

-	mlx5e_hairpin_params_init(&tc->hairpin_params, dev);
-
 	tc->netdevice_nb.notifier_call = mlx5e_tc_netdev_event;
 	err = register_netdevice_notifier_dev_net(priv->netdev,
 						  &tc->netdevice_nb,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
index 9a458a5d9853..a50bfda18e96 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c
@@ -51,7 +51,7 @@ static void mlx5e_handle_tx_dim(struct mlx5e_txqsq *sq)
 	struct mlx5e_sq_stats *stats = sq->stats;
 	struct dim_sample dim_sample = {};

-	if (unlikely(!test_bit(MLX5E_SQ_STATE_AM, &sq->state)))
+	if (unlikely(!test_bit(MLX5E_SQ_STATE_DIM, &sq->state)))
 		return;

 	dim_update_sample(sq->cq.event_ctr, stats->packets, stats->bytes, &dim_sample);
@@ -63,7 +63,7 @@ static void mlx5e_handle_rx_dim(struct mlx5e_rq *rq)
 	struct mlx5e_rq_stats *stats = rq->stats;
 	struct dim_sample dim_sample = {};

-	if (unlikely(!test_bit(MLX5E_RQ_STATE_AM, &rq->state)))
+	if (unlikely(!test_bit(MLX5E_RQ_STATE_DIM, &rq->state)))
 		return;

 	dim_update_sample(rq->cq.event_ctr, stats->packets, stats->bytes, &dim_sample);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 38b32e98f3bd..1c35d721a31d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -18,6 +18,7 @@
 #include "lib/clock.h"
 #include "diag/fw_tracer.h"
 #include "mlx5_irq.h"
+#include "pci_irq.h"
 #include "devlink.h"
 #include "en_accel/ipsec.h"

@@ -61,9 +62,7 @@ struct mlx5_eq_table {
 	struct mlx5_irq_table	*irq_table;
 	struct mlx5_irq		**comp_irqs;
 	struct mlx5_irq		*ctrl_irq;
-#ifdef CONFIG_RFS_ACCEL
 	struct cpu_rmap		*rmap;
-#endif
 };

 #define MLX5_ASYNC_EVENT_MASK ((1ull << MLX5_EVENT_TYPE_PATH_MIG)	    | \
@@ -637,6 +636,7 @@ static u16 async_eq_depth_devlink_param_get(struct mlx5_core_dev *dev)
 	mlx5_core_dbg(dev, "Failed to get param. using default. err = %d\n", err);
 	return MLX5_NUM_ASYNC_EQE;
 }
+
 static int create_async_eqs(struct mlx5_core_dev *dev)
 {
 	struct mlx5_eq_table *table = dev->priv.eq_table;
@@ -803,44 +803,28 @@ void mlx5_eq_update_ci(struct mlx5_eq *eq, u32 cc, bool arm)
 }
 EXPORT_SYMBOL(mlx5_eq_update_ci);

-static void comp_irqs_release(struct mlx5_core_dev *dev)
+static void comp_irqs_release_pci(struct mlx5_core_dev *dev)
 {
 	struct mlx5_eq_table *table = dev->priv.eq_table;

-	if (mlx5_core_is_sf(dev))
-		mlx5_irq_affinity_irqs_release(dev, table->comp_irqs, table->num_comp_eqs);
-	else
-		mlx5_irqs_release_vectors(table->comp_irqs, table->num_comp_eqs);
-	kfree(table->comp_irqs);
+	mlx5_irqs_release_vectors(table->comp_irqs, table->num_comp_eqs);
 }

-static int comp_irqs_request(struct mlx5_core_dev *dev)
+static int comp_irqs_request_pci(struct mlx5_core_dev *dev)
 {
 	struct mlx5_eq_table *table = dev->priv.eq_table;
 	const struct cpumask *prev = cpu_none_mask;
 	const struct cpumask *mask;
-	int ncomp_eqs = table->num_comp_eqs;
+	int ncomp_eqs;
 	u16 *cpus;
 	int ret;
 	int cpu;
 	int i;

 	ncomp_eqs = table->num_comp_eqs;
-	table->comp_irqs = kcalloc(ncomp_eqs, sizeof(*table->comp_irqs), GFP_KERNEL);
-	if (!table->comp_irqs)
-		return -ENOMEM;
-	if (mlx5_core_is_sf(dev)) {
-		ret = mlx5_irq_affinity_irqs_request_auto(dev, ncomp_eqs, table->comp_irqs);
-		if (ret < 0)
-			goto free_irqs;
-		return ret;
-	}
-
 	cpus = kcalloc(ncomp_eqs, sizeof(*cpus), GFP_KERNEL);
-	if (!cpus) {
+	if (!cpus)
 		ret = -ENOMEM;
-		goto free_irqs;
-	}

 	i = 0;
 	rcu_read_lock();
@@ -854,17 +838,89 @@ static int comp_irqs_request(struct mlx5_core_dev *dev)
 	}
 spread_done:
 	rcu_read_unlock();
-	ret = mlx5_irqs_request_vectors(dev, cpus, ncomp_eqs, table->comp_irqs);
+	ret = mlx5_irqs_request_vectors(dev, cpus, ncomp_eqs, table->comp_irqs, &table->rmap);
 	kfree(cpus);
-	if (ret < 0)
-		goto free_irqs;
 	return ret;
+}
+
+static void comp_irqs_release_sf(struct mlx5_core_dev *dev)
+{
+	struct mlx5_eq_table *table = dev->priv.eq_table;
+
+	mlx5_irq_affinity_irqs_release(dev, table->comp_irqs, table->num_comp_eqs);
+}
+
+static int comp_irqs_request_sf(struct mlx5_core_dev *dev)
+{
+	struct mlx5_eq_table *table = dev->priv.eq_table;
+	int ncomp_eqs = table->num_comp_eqs;
+
+	return mlx5_irq_affinity_irqs_request_auto(dev, ncomp_eqs, table->comp_irqs);
+}
+
+static void comp_irqs_release(struct mlx5_core_dev *dev)
+{
+	struct mlx5_eq_table *table = dev->priv.eq_table;
+
+	mlx5_core_is_sf(dev) ? comp_irqs_release_sf(dev) :
+			       comp_irqs_release_pci(dev);

-free_irqs:
 	kfree(table->comp_irqs);
+}
+
+static int comp_irqs_request(struct mlx5_core_dev *dev)
+{
+	struct mlx5_eq_table *table = dev->priv.eq_table;
+	int ncomp_eqs;
+	int ret;
+
+	ncomp_eqs = table->num_comp_eqs;
+	table->comp_irqs = kcalloc(ncomp_eqs, sizeof(*table->comp_irqs), GFP_KERNEL);
+	if (!table->comp_irqs)
+		return -ENOMEM;
+
+	ret = mlx5_core_is_sf(dev) ? comp_irqs_request_sf(dev) :
+				     comp_irqs_request_pci(dev);
+	if (ret < 0)
+		kfree(table->comp_irqs);
+
 	return ret;
 }

+#ifdef CONFIG_RFS_ACCEL
+static int alloc_rmap(struct mlx5_core_dev *mdev)
+{
+	struct mlx5_eq_table *eq_table = mdev->priv.eq_table;
+
+	/* rmap is a mapping between irq number and queue number.
+	 * Each irq can be assigned only to a single rmap.
+	 * Since SFs share IRQs, rmap mapping cannot function correctly
+	 * for irqs that are shared between different core/netdev RX rings.
+	 * Hence we don't allow netdev rmap for SFs.
+	 */
+	if (mlx5_core_is_sf(mdev))
+		return 0;
+
+	eq_table->rmap = alloc_irq_cpu_rmap(eq_table->num_comp_eqs);
+	if (!eq_table->rmap)
+		return -ENOMEM;
+	return 0;
+}
+
+static void free_rmap(struct mlx5_core_dev *mdev)
+{
+	struct mlx5_eq_table *eq_table = mdev->priv.eq_table;
+
+	if (eq_table->rmap) {
+		free_irq_cpu_rmap(eq_table->rmap);
+		eq_table->rmap = NULL;
+	}
+}
+#else
+static int alloc_rmap(struct mlx5_core_dev *mdev) { return 0; }
+static void free_rmap(struct mlx5_core_dev *mdev) {}
+#endif
+
 static void destroy_comp_eqs(struct mlx5_core_dev *dev)
 {
 	struct mlx5_eq_table *table = dev->priv.eq_table;
@@ -880,6 +936,7 @@ static void destroy_comp_eqs(struct mlx5_core_dev *dev)
 		kfree(eq);
 	}
 	comp_irqs_release(dev);
+	free_rmap(dev);
 }

 static u16 comp_eq_depth_devlink_param_get(struct mlx5_core_dev *dev)
@@ -906,9 +963,16 @@ static int create_comp_eqs(struct mlx5_core_dev *dev)
 	int err;
 	int i;

+	err = alloc_rmap(dev);
+	if (err)
+		return err;
+
 	ncomp_eqs = comp_irqs_request(dev);
-	if (ncomp_eqs < 0)
-		return ncomp_eqs;
+	if (ncomp_eqs < 0) {
+		err = ncomp_eqs;
+		goto err_irqs_req;
+	}
+
 	INIT_LIST_HEAD(&table->comp_eqs_list);
 	nent = comp_eq_depth_devlink_param_get(dev);

@@ -953,6 +1017,8 @@ clean_eq:
 	kfree(eq);
 clean:
 	destroy_comp_eqs(dev);
+err_irqs_req:
+	free_rmap(dev);
 	return err;
 }

@@ -1004,10 +1070,11 @@ mlx5_comp_irq_get_affinity_mask(struct mlx5_core_dev *dev, int vector)

 	list_for_each_entry(eq, &table->comp_eqs_list, list) {
 		if (i++ == vector)
-			break;
+			return mlx5_irq_get_affinity_mask(eq->core.irq);
 	}

-	return mlx5_irq_get_affinity_mask(eq->core.irq);
+	WARN_ON_ONCE(1);
+	return NULL;
 }
 EXPORT_SYMBOL(mlx5_comp_irq_get_affinity_mask);

@@ -1031,55 +1098,12 @@ struct mlx5_eq_comp *mlx5_eqn2comp_eq(struct mlx5_core_dev *dev, int eqn)
 	return ERR_PTR(-ENOENT);
 }

-static void clear_rmap(struct mlx5_core_dev *dev)
-{
-#ifdef CONFIG_RFS_ACCEL
-	struct mlx5_eq_table *eq_table = dev->priv.eq_table;
-
-	free_irq_cpu_rmap(eq_table->rmap);
-#endif
-}
-
-static int set_rmap(struct mlx5_core_dev *mdev)
-{
-	int err = 0;
-#ifdef CONFIG_RFS_ACCEL
-	struct mlx5_eq_table *eq_table = mdev->priv.eq_table;
-	int vecidx;
-
-	eq_table->rmap = alloc_irq_cpu_rmap(eq_table->num_comp_eqs);
-	if (!eq_table->rmap) {
-		err = -ENOMEM;
-		mlx5_core_err(mdev, "Failed to allocate cpu_rmap. err %d", err);
-		goto err_out;
-	}
-
-	for (vecidx = 0; vecidx < eq_table->num_comp_eqs; vecidx++) {
-		err = irq_cpu_rmap_add(eq_table->rmap,
-				       pci_irq_vector(mdev->pdev, vecidx));
-		if (err) {
-			mlx5_core_err(mdev, "irq_cpu_rmap_add failed. err %d",
-				      err);
-			goto err_irq_cpu_rmap_add;
-		}
-	}
-	return 0;
-
-err_irq_cpu_rmap_add:
-	clear_rmap(mdev);
-err_out:
-#endif
-	return err;
-}
-
 /* This function should only be called after mlx5_cmd_force_teardown_hca */
 void mlx5_core_eq_free_irqs(struct mlx5_core_dev *dev)
 {
 	struct mlx5_eq_table *table = dev->priv.eq_table;

 	mutex_lock(&table->lock); /* sync with create/destroy_async_eq */
-	if (!mlx5_core_is_sf(dev))
-		clear_rmap(dev);
 	mlx5_irq_table_destroy(dev);
 	mutex_unlock(&table->lock);
 }
@@ -1090,44 +1114,47 @@ void mlx5_core_eq_free_irqs(struct mlx5_core_dev *dev)
 #define MLX5_MAX_ASYNC_EQS 3
 #endif

-int mlx5_eq_table_create(struct mlx5_core_dev *dev)
+static int get_num_eqs(struct mlx5_core_dev *dev)
 {
 	struct mlx5_eq_table *eq_table = dev->priv.eq_table;
-	int num_eqs = MLX5_CAP_GEN(dev, max_num_eqs) ?
+	int max_dev_eqs;
+	int max_eqs_sf;
+	int num_eqs;
+
+	/* If ethernet is disabled we use just a single completion vector to
+	 * have the other vectors available for other drivers using mlx5_core. For
+	 * example, mlx5_vdpa
+	 */
+	if (!mlx5_core_is_eth_enabled(dev) && mlx5_eth_supported(dev))
+		return 1;
+
+	max_dev_eqs = MLX5_CAP_GEN(dev, max_num_eqs) ?
 		      MLX5_CAP_GEN(dev, max_num_eqs) :
 		      1 << MLX5_CAP_GEN(dev, log_max_eq);
-	int max_eqs_sf;
-	int err;

-	eq_table->num_comp_eqs =
-		min_t(int,
-		      mlx5_irq_table_get_num_comp(eq_table->irq_table),
-		      num_eqs - MLX5_MAX_ASYNC_EQS);
+	num_eqs = min_t(int, mlx5_irq_table_get_num_comp(eq_table->irq_table),
+			max_dev_eqs - MLX5_MAX_ASYNC_EQS);
 	if (mlx5_core_is_sf(dev)) {
 		max_eqs_sf = min_t(int, MLX5_COMP_EQS_PER_SF,
 				   mlx5_irq_table_get_sfs_vec(eq_table->irq_table));
-		eq_table->num_comp_eqs = min_t(int, eq_table->num_comp_eqs,
-					       max_eqs_sf);
+		num_eqs = min_t(int, num_eqs, max_eqs_sf);
 	}

+	return num_eqs;
+}
+
+int mlx5_eq_table_create(struct mlx5_core_dev *dev)
+{
+	struct mlx5_eq_table *eq_table = dev->priv.eq_table;
+	int err;
+
+	eq_table->num_comp_eqs = get_num_eqs(dev);
 	err = create_async_eqs(dev);
 	if (err) {
 		mlx5_core_err(dev, "Failed to create async EQs\n");
 		goto err_async_eqs;
 	}

-	if (!mlx5_core_is_sf(dev)) {
-		/* rmap is a mapping between irq number and queue number.
-		 * each irq can be assign only to a single rmap.
-		 * since SFs share IRQs, rmap mapping cannot function correctly
-		 * for irqs that are shared for different core/netdev RX rings.
-		 * Hence we don't allow netdev rmap for SFs
-		 */
-		err = set_rmap(dev);
-		if (err)
-			goto err_rmap;
-	}
-
 	err = create_comp_eqs(dev);
 	if (err) {
 		mlx5_core_err(dev, "Failed to create completion EQs\n");
@@ -1135,10 +1162,8 @@ int mlx5_eq_table_create(struct mlx5_core_dev *dev)
 	}

 	return 0;
+
 err_comp_eqs:
-	if (!mlx5_core_is_sf(dev))
-		clear_rmap(dev);
-err_rmap:
 	destroy_async_eqs(dev);
 err_async_eqs:
 	return err;
@@ -1146,8 +1171,6 @@ err_async_eqs:

 void mlx5_eq_table_destroy(struct mlx5_core_dev *dev)
 {
-	if (!mlx5_core_is_sf(dev))
-		clear_rmap(dev);
 	destroy_comp_eqs(dev);
 	destroy_async_eqs(dev);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
index 3cdcb0e0b20f..1ba03e219111 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.c
@@ -13,66 +13,6 @@
 #define CREATE_TRACE_POINTS
 #include "diag/bridge_tracepoint.h"

-#define MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_SIZE 12000
-#define MLX5_ESW_BRIDGE_INGRESS_TABLE_UNTAGGED_GRP_SIZE 16000
-#define MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_FROM 0
-#define MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_TO \
-	(MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_SIZE - 1)
-#define MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_FILTER_GRP_IDX_FROM \
-	(MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_TO + 1)
-#define MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_FILTER_GRP_IDX_TO \
-	(MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_FILTER_GRP_IDX_FROM + \
-	 MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_SIZE - 1)
-#define MLX5_ESW_BRIDGE_INGRESS_TABLE_QINQ_GRP_IDX_FROM \
-	(MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_FILTER_GRP_IDX_TO + 1)
-#define MLX5_ESW_BRIDGE_INGRESS_TABLE_QINQ_GRP_IDX_TO \
-	(MLX5_ESW_BRIDGE_INGRESS_TABLE_QINQ_GRP_IDX_FROM + \
-	 MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_SIZE - 1)
-#define MLX5_ESW_BRIDGE_INGRESS_TABLE_QINQ_FILTER_GRP_IDX_FROM \
-	(MLX5_ESW_BRIDGE_INGRESS_TABLE_QINQ_GRP_IDX_TO + 1)
-#define MLX5_ESW_BRIDGE_INGRESS_TABLE_QINQ_FILTER_GRP_IDX_TO \
-	(MLX5_ESW_BRIDGE_INGRESS_TABLE_QINQ_FILTER_GRP_IDX_FROM + \
-	 MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_SIZE - 1)
-#define MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_FROM \
-	(MLX5_ESW_BRIDGE_INGRESS_TABLE_QINQ_FILTER_GRP_IDX_TO + 1)
-#define MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_TO \
-	(MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_FROM + \
-	 MLX5_ESW_BRIDGE_INGRESS_TABLE_UNTAGGED_GRP_SIZE - 1)
-#define MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE \
-	(MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_TO + 1)
-static_assert(MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE == 64000);
-
-#define MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_SIZE 16000
-#define MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_SIZE (32000 - 1)
-#define MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_IDX_FROM 0
-#define MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_IDX_TO \
-	(MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_SIZE - 1)
-#define MLX5_ESW_BRIDGE_EGRESS_TABLE_QINQ_GRP_IDX_FROM \
-	(MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_IDX_TO + 1)
-#define MLX5_ESW_BRIDGE_EGRESS_TABLE_QINQ_GRP_IDX_TO \
-	(MLX5_ESW_BRIDGE_EGRESS_TABLE_QINQ_GRP_IDX_FROM + \
-	 MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_SIZE - 1)
-#define MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_FROM \
-	(MLX5_ESW_BRIDGE_EGRESS_TABLE_QINQ_GRP_IDX_TO + 1)
-#define MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_TO \
-	(MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_FROM + \
-	 MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_SIZE - 1)
-#define MLX5_ESW_BRIDGE_EGRESS_TABLE_MISS_GRP_IDX_FROM \
-	(MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_TO + 1)
-#define MLX5_ESW_BRIDGE_EGRESS_TABLE_MISS_GRP_IDX_TO \
-	MLX5_ESW_BRIDGE_EGRESS_TABLE_MISS_GRP_IDX_FROM
-#define MLX5_ESW_BRIDGE_EGRESS_TABLE_SIZE \
-	(MLX5_ESW_BRIDGE_EGRESS_TABLE_MISS_GRP_IDX_TO + 1)
-static_assert(MLX5_ESW_BRIDGE_EGRESS_TABLE_SIZE == 64000);
-
-#define MLX5_ESW_BRIDGE_SKIP_TABLE_SIZE 0
-
-enum {
-	MLX5_ESW_BRIDGE_LEVEL_INGRESS_TABLE,
-	MLX5_ESW_BRIDGE_LEVEL_EGRESS_TABLE,
-	MLX5_ESW_BRIDGE_LEVEL_SKIP_TABLE,
-};
-
 static const struct rhashtable_params fdb_ht_params = {
 	.key_offset = offsetof(struct mlx5_esw_bridge_fdb_entry, key),
 	.key_len = sizeof(struct mlx5_esw_bridge_fdb_key),
@@ -80,31 +20,6 @@ static const struct rhashtable_params fdb_ht_params = {
 	.automatic_shrinking = true,
 };

-enum {
-	MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG = BIT(0),
-};
-
-struct mlx5_esw_bridge {
-	int ifindex;
-	int refcnt;
-	struct list_head list;
-	struct mlx5_esw_bridge_offloads *br_offloads;
-
-	struct list_head fdb_list;
-	struct rhashtable fdb_ht;
-
-	struct mlx5_flow_table *egress_ft;
-	struct mlx5_flow_group *egress_vlan_fg;
-	struct mlx5_flow_group *egress_qinq_fg;
-	struct mlx5_flow_group *egress_mac_fg;
-	struct mlx5_flow_group *egress_miss_fg;
-	struct mlx5_pkt_reformat *egress_miss_pkt_reformat;
-	struct mlx5_flow_handle *egress_miss_handle;
-	unsigned long ageing_time;
-	u32 flags;
-	u16 vlan_proto;
-};
-
 static void
 mlx5_esw_bridge_fdb_offload_notify(struct net_device *dev, const unsigned char *addr, u16 vid,
 				   unsigned long val)
@@ -146,7 +61,7 @@ mlx5_esw_bridge_pkt_reformat_vlan_pop_create(struct mlx5_eswitch *esw)
 	return mlx5_packet_reformat_alloc(esw->dev, &reformat_params, MLX5_FLOW_NAMESPACE_FDB);
 }

-static struct mlx5_flow_table *
+struct mlx5_flow_table *
 mlx5_esw_bridge_table_create(int max_fte, u32 level, struct mlx5_eswitch *esw)
 {
 	struct mlx5_flow_table_attr ft_attr = {};
@@ -925,6 +840,10 @@ static struct mlx5_esw_bridge *mlx5_esw_bridge_create(int ifindex,
 	if (err)
 		goto err_fdb_ht;

+	err = mlx5_esw_bridge_mdb_init(bridge);
+	if (err)
+		goto err_mdb_ht;
+
 	INIT_LIST_HEAD(&bridge->fdb_list);
 	bridge->ifindex = ifindex;
 	bridge->refcnt = 1;
@@ -934,6 +853,8 @@ static struct mlx5_esw_bridge *mlx5_esw_bridge_create(int ifindex,

 	return bridge;

+err_mdb_ht:
+	rhashtable_destroy(&bridge->fdb_ht);
 err_fdb_ht:
 	mlx5_esw_bridge_egress_table_cleanup(bridge);
 err_egress_tbl:
@@ -953,7 +874,9 @@ static void mlx5_esw_bridge_put(struct mlx5_esw_bridge_offloads *br_offloads,
 		return;

 	mlx5_esw_bridge_egress_table_cleanup(bridge);
+	mlx5_esw_bridge_mcast_disable(bridge);
 	list_del(&bridge->list);
+	mlx5_esw_bridge_mdb_cleanup(bridge);
 	rhashtable_destroy(&bridge->fdb_ht);
 	kvfree(bridge);

@@ -993,7 +916,7 @@ static unsigned long mlx5_esw_bridge_port_key_from_data(u16 vport_num, u16 esw_o
 	return vport_num | (unsigned long)esw_owner_vhca_id << sizeof(vport_num) * BITS_PER_BYTE;
 }

-static unsigned long mlx5_esw_bridge_port_key(struct mlx5_esw_bridge_port *port)
+unsigned long mlx5_esw_bridge_port_key(struct mlx5_esw_bridge_port *port)
 {
 	return mlx5_esw_bridge_port_key_from_data(port->vport_num, port->esw_owner_vhca_id);
 }
@@ -1018,6 +941,19 @@ static void mlx5_esw_bridge_port_erase(struct mlx5_esw_bridge_port *port,
 	xa_erase(&br_offloads->ports, mlx5_esw_bridge_port_key(port));
 }

+static struct mlx5_esw_bridge *
+mlx5_esw_bridge_from_port_lookup(u16 vport_num, u16 esw_owner_vhca_id,
+				 struct mlx5_esw_bridge_offloads *br_offloads)
+{
+	struct mlx5_esw_bridge_port *port;
+
+	port = mlx5_esw_bridge_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
+	if (!port)
+		return NULL;
+
+	return port->bridge;
+}
+
 static void mlx5_esw_bridge_fdb_entry_refresh(struct mlx5_esw_bridge_fdb_entry *entry)
 {
 	trace_mlx5_esw_bridge_fdb_entry_refresh(entry);
@@ -1166,8 +1102,21 @@ mlx5_esw_bridge_vlan_push_mark_cleanup(struct mlx5_esw_bridge_vlan *vlan, struct
 }

 static int
-mlx5_esw_bridge_vlan_push_pop_create(u16 vlan_proto, u16 flags, struct mlx5_esw_bridge_vlan *vlan,
-				     struct mlx5_eswitch *esw)
+mlx5_esw_bridge_vlan_push_pop_fhs_create(u16 vlan_proto, struct mlx5_esw_bridge_port *port,
+					 struct mlx5_esw_bridge_vlan *vlan)
+{
+	return mlx5_esw_bridge_vlan_mcast_init(vlan_proto, port, vlan);
+}
+
+static void
+mlx5_esw_bridge_vlan_push_pop_fhs_cleanup(struct mlx5_esw_bridge_vlan *vlan)
+{
+	mlx5_esw_bridge_vlan_mcast_cleanup(vlan);
+}
+
+static int
+mlx5_esw_bridge_vlan_push_pop_create(u16 vlan_proto, u16 flags, struct mlx5_esw_bridge_port *port,
+				     struct mlx5_esw_bridge_vlan *vlan, struct mlx5_eswitch *esw)
 {
 	int err;

@@ -1185,10 +1134,16 @@ mlx5_esw_bridge_vlan_push_pop_create(u16 vlan_proto, u16 flags, struct mlx5_esw_
 		err = mlx5_esw_bridge_vlan_pop_create(vlan, esw);
 		if (err)
 			goto err_vlan_pop;
+
+		err = mlx5_esw_bridge_vlan_push_pop_fhs_create(vlan_proto, port, vlan);
+		if (err)
+			goto err_vlan_pop_fhs;
 	}

 	return 0;

+err_vlan_pop_fhs:
+	mlx5_esw_bridge_vlan_pop_cleanup(vlan, esw);
 err_vlan_pop:
 	if (vlan->pkt_mod_hdr_push_mark)
 		mlx5_esw_bridge_vlan_push_mark_cleanup(vlan, esw);
@@ -1213,7 +1168,7 @@ mlx5_esw_bridge_vlan_create(u16 vlan_proto, u16 vid, u16 flags, struct mlx5_esw_
 	vlan->flags = flags;
 	INIT_LIST_HEAD(&vlan->fdb_list);

-	err = mlx5_esw_bridge_vlan_push_pop_create(vlan_proto, flags, vlan, esw);
+	err = mlx5_esw_bridge_vlan_push_pop_create(vlan_proto, flags, port, vlan, esw);
 	if (err)
 		goto err_vlan_push_pop;

@@ -1225,6 +1180,8 @@ mlx5_esw_bridge_vlan_create(u16 vlan_proto, u16 vid, u16 flags, struct mlx5_esw_
 	return vlan;

 err_xa_insert:
+	if (vlan->mcast_handle)
+		mlx5_esw_bridge_vlan_push_pop_fhs_cleanup(vlan);
 	if (vlan->pkt_reformat_pop)
 		mlx5_esw_bridge_vlan_pop_cleanup(vlan, esw);
 	if (vlan->pkt_mod_hdr_push_mark)
@@ -1242,7 +1199,8 @@ static void mlx5_esw_bridge_vlan_erase(struct mlx5_esw_bridge_port *port,
 	xa_erase(&port->vlans, vlan->vid);
 }

-static void mlx5_esw_bridge_vlan_flush(struct mlx5_esw_bridge_vlan *vlan,
+static void mlx5_esw_bridge_vlan_flush(struct mlx5_esw_bridge_port *port,
+				       struct mlx5_esw_bridge_vlan *vlan,
 				       struct mlx5_esw_bridge *bridge)
 {
 	struct mlx5_eswitch *esw = bridge->br_offloads->esw;
@@ -1250,7 +1208,10 @@ static void mlx5_esw_bridge_vlan_flush(struct mlx5_esw_bridge_vlan *vlan,

 	list_for_each_entry_safe(entry, tmp, &vlan->fdb_list, vlan_list)
 		mlx5_esw_bridge_fdb_entry_notify_and_cleanup(entry, bridge);
+	mlx5_esw_bridge_port_mdb_vlan_flush(port, vlan);

+	if (vlan->mcast_handle)
+		mlx5_esw_bridge_vlan_push_pop_fhs_cleanup(vlan);
 	if (vlan->pkt_reformat_pop)
 		mlx5_esw_bridge_vlan_pop_cleanup(vlan, esw);
 	if (vlan->pkt_mod_hdr_push_mark)
@@ -1264,7 +1225,7 @@ static void mlx5_esw_bridge_vlan_cleanup(struct mlx5_esw_bridge_port *port,
 					 struct mlx5_esw_bridge *bridge)
 {
 	trace_mlx5_esw_bridge_vlan_cleanup(vlan);
-	mlx5_esw_bridge_vlan_flush(vlan, bridge);
+	mlx5_esw_bridge_vlan_flush(port, vlan, bridge);
 	mlx5_esw_bridge_vlan_erase(port, vlan);
 	kvfree(vlan);
 }
@@ -1288,9 +1249,9 @@ static int mlx5_esw_bridge_port_vlans_recreate(struct mlx5_esw_bridge_port *port
 	int err;

 	xa_for_each(&port->vlans, i, vlan) {
-		mlx5_esw_bridge_vlan_flush(vlan, bridge);
-		err = mlx5_esw_bridge_vlan_push_pop_create(bridge->vlan_proto, vlan->flags, vlan,
-							   br_offloads->esw);
+		mlx5_esw_bridge_vlan_flush(port, vlan, bridge);
+		err = mlx5_esw_bridge_vlan_push_pop_create(bridge->vlan_proto, vlan->flags, port,
+							   vlan, br_offloads->esw);
 		if (err) {
 			esw_warn(br_offloads->esw->dev,
 				 "Failed to create VLAN=%u(proto=%x) push/pop actions (vport=%u,err=%d)\n",
@@ -1473,33 +1434,32 @@ err_ingress_fc_create:
 int mlx5_esw_bridge_ageing_time_set(u16 vport_num, u16 esw_owner_vhca_id, unsigned long ageing_time,
 				    struct mlx5_esw_bridge_offloads *br_offloads)
 {
-	struct mlx5_esw_bridge_port *port;
+	struct mlx5_esw_bridge *bridge;

-	port = mlx5_esw_bridge_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
-	if (!port)
+	bridge = mlx5_esw_bridge_from_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
+	if (!bridge)
 		return -EINVAL;

-	port->bridge->ageing_time = clock_t_to_jiffies(ageing_time);
+	bridge->ageing_time = clock_t_to_jiffies(ageing_time);
 	return 0;
 }

 int mlx5_esw_bridge_vlan_filtering_set(u16 vport_num, u16 esw_owner_vhca_id, bool enable,
 				       struct mlx5_esw_bridge_offloads *br_offloads)
 {
-	struct mlx5_esw_bridge_port *port;
 	struct mlx5_esw_bridge *bridge;
 	bool filtering;

-	port = mlx5_esw_bridge_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
-	if (!port)
+	bridge = mlx5_esw_bridge_from_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
+	if (!bridge)
 		return -EINVAL;

-	bridge = port->bridge;
 	filtering = bridge->flags & MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG;
 	if (filtering == enable)
 		return 0;

 	mlx5_esw_bridge_fdb_flush(bridge);
+	mlx5_esw_bridge_mdb_flush(bridge);
 	if (enable)
 		bridge->flags |= MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG;
 	else
@@ -1511,15 +1471,13 @@ int mlx5_esw_bridge_vlan_filtering_set(u16 vport_num, u16 esw_owner_vhca_id, boo
 int mlx5_esw_bridge_vlan_proto_set(u16 vport_num, u16 esw_owner_vhca_id, u16 proto,
 				   struct mlx5_esw_bridge_offloads *br_offloads)
 {
-	struct mlx5_esw_bridge_port *port;
 	struct mlx5_esw_bridge *bridge;

-	port = mlx5_esw_bridge_port_lookup(vport_num, esw_owner_vhca_id,
-					   br_offloads);
-	if (!port)
+	bridge = mlx5_esw_bridge_from_port_lookup(vport_num, esw_owner_vhca_id,
+						  br_offloads);
+	if (!bridge)
 		return -EINVAL;

-	bridge = port->bridge;
 	if (bridge->vlan_proto == proto)
 		return 0;
 	if (proto != ETH_P_8021Q && proto != ETH_P_8021AD) {
@@ -1528,12 +1486,43 @@ int mlx5_esw_bridge_vlan_proto_set(u16 vport_num, u16 esw_owner_vhca_id, u16 pro
 	}

 	mlx5_esw_bridge_fdb_flush(bridge);
+	mlx5_esw_bridge_mdb_flush(bridge);
 	bridge->vlan_proto = proto;
 	mlx5_esw_bridge_vlans_recreate(bridge);

 	return 0;
 }

+int mlx5_esw_bridge_mcast_set(u16 vport_num, u16 esw_owner_vhca_id, bool enable,
+			      struct mlx5_esw_bridge_offloads *br_offloads)
+{
+	struct mlx5_eswitch *esw = br_offloads->esw;
+	struct mlx5_esw_bridge *bridge;
+	int err = 0;
+	bool mcast;
+
+	if (!(MLX5_CAP_ESW_FLOWTABLE((esw)->dev, fdb_multi_path_any_table) ||
+	      MLX5_CAP_ESW_FLOWTABLE((esw)->dev, fdb_multi_path_any_table_limit_regc)) ||
+	    !MLX5_CAP_ESW_FLOWTABLE((esw)->dev, fdb_uplink_hairpin) ||
+	    !MLX5_CAP_ESW_FLOWTABLE_FDB((esw)->dev, ignore_flow_level))
+		return -EOPNOTSUPP;
+
+	bridge = mlx5_esw_bridge_from_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
+	if (!bridge)
+		return -EINVAL;
+
+	mcast = bridge->flags & MLX5_ESW_BRIDGE_MCAST_FLAG;
+	if (mcast == enable)
+		return 0;
+
+	if (enable)
+		err = mlx5_esw_bridge_mcast_enable(bridge);
+	else
+		mlx5_esw_bridge_mcast_disable(bridge);
+
+	return err;
+}
+
 static int mlx5_esw_bridge_vport_init(u16 vport_num, u16 esw_owner_vhca_id, u16 flags,
 				      struct mlx5_esw_bridge_offloads *br_offloads,
 				      struct mlx5_esw_bridge *bridge)
@@ -1551,6 +1540,15 @@ static int mlx5_esw_bridge_vport_init(u16 vport_num, u16 esw_owner_vhca_id, u16
 	port->bridge = bridge;
 	port->flags |= flags;
 	xa_init(&port->vlans);
+
+	err = mlx5_esw_bridge_port_mcast_init(port);
+	if (err) {
+		esw_warn(esw->dev,
+			 "Failed to initialize port multicast (vport=%u,esw_owner_vhca_id=%u,err=%d)\n",
+			 port->vport_num, port->esw_owner_vhca_id, err);
+		goto err_port_mcast;
+	}
+
 	err = mlx5_esw_bridge_port_insert(port, br_offloads);
 	if (err) {
 		esw_warn(esw->dev,
@@ -1563,6 +1561,8 @@ static int mlx5_esw_bridge_vport_init(u16 vport_num, u16 esw_owner_vhca_id, u16
 	return 0;

 err_port_insert:
+	mlx5_esw_bridge_port_mcast_cleanup(port);
+err_port_mcast:
 	kvfree(port);
 	return err;
 }
@@ -1580,6 +1580,7 @@ static int mlx5_esw_bridge_vport_cleanup(struct mlx5_esw_bridge_offloads *br_off

 	trace_mlx5_esw_bridge_vport_cleanup(port);
 	mlx5_esw_bridge_port_vlans_flush(port, bridge);
+	mlx5_esw_bridge_port_mcast_cleanup(port);
 	mlx5_esw_bridge_port_erase(port, br_offloads);
 	kvfree(port);
 	mlx5_esw_bridge_put(br_offloads, bridge);
@@ -1711,14 +1712,12 @@ void mlx5_esw_bridge_fdb_update_used(struct net_device *dev, u16 vport_num, u16
 				     struct switchdev_notifier_fdb_info *fdb_info)
 {
 	struct mlx5_esw_bridge_fdb_entry *entry;
-	struct mlx5_esw_bridge_port *port;
 	struct mlx5_esw_bridge *bridge;

-	port = mlx5_esw_bridge_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
-	if (!port)
+	bridge = mlx5_esw_bridge_from_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
+	if (!bridge)
 		return;

-	bridge = port->bridge;
 	entry = mlx5_esw_bridge_fdb_lookup(bridge, fdb_info->addr, fdb_info->vid);
 	if (!entry) {
 		esw_debug(br_offloads->esw->dev,
@@ -1765,14 +1764,12 @@ void mlx5_esw_bridge_fdb_remove(struct net_device *dev, u16 vport_num, u16 esw_o
 {
 	struct mlx5_eswitch *esw = br_offloads->esw;
 	struct mlx5_esw_bridge_fdb_entry *entry;
-	struct mlx5_esw_bridge_port *port;
 	struct mlx5_esw_bridge *bridge;

-	port = mlx5_esw_bridge_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
-	if (!port)
+	bridge = mlx5_esw_bridge_from_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
+	if (!bridge)
 		return;

-	bridge = port->bridge;
 	entry = mlx5_esw_bridge_fdb_lookup(bridge, fdb_info->addr, fdb_info->vid);
 	if (!entry) {
 		esw_debug(esw->dev,
@@ -1806,6 +1803,64 @@ void mlx5_esw_bridge_update(struct mlx5_esw_bridge_offloads *br_offloads)
 	}
 }

+int mlx5_esw_bridge_port_mdb_add(struct net_device *dev, u16 vport_num, u16 esw_owner_vhca_id,
+				 const unsigned char *addr, u16 vid,
+				 struct mlx5_esw_bridge_offloads *br_offloads,
+				 struct netlink_ext_ack *extack)
+{
+	struct mlx5_esw_bridge_vlan *vlan;
+	struct mlx5_esw_bridge_port *port;
+	struct mlx5_esw_bridge *bridge;
+	int err;
+
+	port = mlx5_esw_bridge_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
+	if (!port) {
+		esw_warn(br_offloads->esw->dev,
+			 "Failed to lookup bridge port to add MDB (MAC=%pM,vport=%u)\n",
+			 addr, vport_num);
+		NL_SET_ERR_MSG_FMT_MOD(extack,
+				       "Failed to lookup bridge port to add MDB (MAC=%pM,vport=%u)\n",
+				       addr, vport_num);
+		return -EINVAL;
+	}
+
+	bridge = port->bridge;
+	if (bridge->flags & MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG && vid) {
+		vlan = mlx5_esw_bridge_vlan_lookup(vid, port);
+		if (!vlan) {
+			esw_warn(br_offloads->esw->dev,
+				 "Failed to lookup bridge port vlan metadata to create MDB (MAC=%pM,vid=%u,vport=%u)\n",
+				 addr, vid, vport_num);
+			NL_SET_ERR_MSG_FMT_MOD(extack,
+					       "Failed to lookup bridge port vlan metadata to create MDB (MAC=%pM,vid=%u,vport=%u)\n",
+					       addr, vid, vport_num);
+			return -EINVAL;
+		}
+	}
+
+	err = mlx5_esw_bridge_port_mdb_attach(dev, port, addr, vid);
+	if (err) {
+		NL_SET_ERR_MSG_FMT_MOD(extack, "Failed to add MDB (MAC=%pM,vid=%u,vport=%u)\n",
+				       addr, vid, vport_num);
+		return err;
+	}
+
+	return 0;
+}
+
+void mlx5_esw_bridge_port_mdb_del(struct net_device *dev, u16 vport_num, u16 esw_owner_vhca_id,
+				  const unsigned char *addr, u16 vid,
+				  struct mlx5_esw_bridge_offloads *br_offloads)
+{
+	struct mlx5_esw_bridge_port *port;
+
+	port = mlx5_esw_bridge_port_lookup(vport_num, esw_owner_vhca_id, br_offloads);
+	if (!port)
+		return;
+
+	mlx5_esw_bridge_port_mdb_detach(dev, port, addr, vid);
+}
+
 static void mlx5_esw_bridge_flush(struct mlx5_esw_bridge_offloads *br_offloads)
 {
 	struct mlx5_esw_bridge_port *port;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
index 10851a515bca..a9dd18c73d6a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge.h
@@ -25,12 +25,19 @@ struct mlx5_esw_bridge_offloads {
 	struct delayed_work update_work;

 	struct mlx5_flow_table *ingress_ft;
+	struct mlx5_flow_group *ingress_igmp_fg;
+	struct mlx5_flow_group *ingress_mld_fg;
 	struct mlx5_flow_group *ingress_vlan_fg;
 	struct mlx5_flow_group *ingress_vlan_filter_fg;
 	struct mlx5_flow_group *ingress_qinq_fg;
 	struct mlx5_flow_group *ingress_qinq_filter_fg;
 	struct mlx5_flow_group *ingress_mac_fg;

+	struct mlx5_flow_handle *igmp_handle;
+	struct mlx5_flow_handle *mld_query_handle;
+	struct mlx5_flow_handle *mld_report_handle;
+	struct mlx5_flow_handle *mld_done_handle;
+
 	struct mlx5_flow_table *skip_ft;
 };

@@ -64,10 +71,20 @@ int mlx5_esw_bridge_vlan_filtering_set(u16 vport_num, u16 esw_owner_vhca_id, boo
 				       struct mlx5_esw_bridge_offloads *br_offloads);
 int mlx5_esw_bridge_vlan_proto_set(u16 vport_num, u16 esw_owner_vhca_id, u16 proto,
 				   struct mlx5_esw_bridge_offloads *br_offloads);
+int mlx5_esw_bridge_mcast_set(u16 vport_num, u16 esw_owner_vhca_id, bool enable,
+			      struct mlx5_esw_bridge_offloads *br_offloads);
 int mlx5_esw_bridge_port_vlan_add(u16 vport_num, u16 esw_owner_vhca_id, u16 vid, u16 flags,
 				  struct mlx5_esw_bridge_offloads *br_offloads,
 				  struct netlink_ext_ack *extack);
 void mlx5_esw_bridge_port_vlan_del(u16 vport_num, u16 esw_owner_vhca_id, u16 vid,
 				   struct mlx5_esw_bridge_offloads *br_offloads);

+int mlx5_esw_bridge_port_mdb_add(struct net_device *dev, u16 vport_num, u16 esw_owner_vhca_id,
+				 const unsigned char *addr, u16 vid,
+				 struct mlx5_esw_bridge_offloads *br_offloads,
+				 struct netlink_ext_ack *extack);
+void mlx5_esw_bridge_port_mdb_del(struct net_device *dev, u16 vport_num, u16 esw_owner_vhca_id,
+				  const unsigned char *addr, u16 vid,
+				  struct mlx5_esw_bridge_offloads *br_offloads);
+
 #endif /* __MLX5_ESW_BRIDGE_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_mcast.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_mcast.c
new file mode 100644
index 000000000000..2eae594a5e80
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_mcast.c
@@ -0,0 +1,1126 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
*/ + +#include "lib/devcom.h" +#include "bridge.h" +#include "eswitch.h" +#include "bridge_priv.h" +#include "diag/bridge_tracepoint.h" + +static const struct rhashtable_params mdb_ht_params = { + .key_offset = offsetof(struct mlx5_esw_bridge_mdb_entry, key), + .key_len = sizeof(struct mlx5_esw_bridge_mdb_key), + .head_offset = offsetof(struct mlx5_esw_bridge_mdb_entry, ht_node), + .automatic_shrinking = true, +}; + +int mlx5_esw_bridge_mdb_init(struct mlx5_esw_bridge *bridge) +{ + INIT_LIST_HEAD(&bridge->mdb_list); + return rhashtable_init(&bridge->mdb_ht, &mdb_ht_params); +} + +void mlx5_esw_bridge_mdb_cleanup(struct mlx5_esw_bridge *bridge) +{ + rhashtable_destroy(&bridge->mdb_ht); +} + +static struct mlx5_esw_bridge_port * +mlx5_esw_bridge_mdb_port_lookup(struct mlx5_esw_bridge_port *port, + struct mlx5_esw_bridge_mdb_entry *entry) +{ + return xa_load(&entry->ports, mlx5_esw_bridge_port_key(port)); +} + +static int mlx5_esw_bridge_mdb_port_insert(struct mlx5_esw_bridge_port *port, + struct mlx5_esw_bridge_mdb_entry *entry) +{ + int err = xa_insert(&entry->ports, mlx5_esw_bridge_port_key(port), port, GFP_KERNEL); + + if (!err) + entry->num_ports++; + return err; +} + +static void mlx5_esw_bridge_mdb_port_remove(struct mlx5_esw_bridge_port *port, + struct mlx5_esw_bridge_mdb_entry *entry) +{ + xa_erase(&entry->ports, mlx5_esw_bridge_port_key(port)); + entry->num_ports--; +} + +static struct mlx5_flow_handle * +mlx5_esw_bridge_mdb_flow_create(u16 esw_owner_vhca_id, struct mlx5_esw_bridge_mdb_entry *entry, + struct mlx5_esw_bridge *bridge) +{ + struct mlx5_flow_act flow_act = { + .action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST, + .flags = FLOW_ACT_NO_APPEND | FLOW_ACT_IGNORE_FLOW_LEVEL, + }; + int num_dests = entry->num_ports, i = 0; + struct mlx5_flow_destination *dests; + struct mlx5_esw_bridge_port *port; + struct mlx5_flow_spec *rule_spec; + struct mlx5_flow_handle *handle; + u8 *dmac_v, *dmac_c; + unsigned long idx; + + rule_spec = kvzalloc(sizeof(*rule_spec), 
GFP_KERNEL); + if (!rule_spec) + return ERR_PTR(-ENOMEM); + + dests = kvcalloc(num_dests, sizeof(*dests), GFP_KERNEL); + if (!dests) { + kvfree(rule_spec); + return ERR_PTR(-ENOMEM); + } + + xa_for_each(&entry->ports, idx, port) { + dests[i].type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE; + dests[i].ft = port->mcast.ft; + i++; + } + + rule_spec->match_criteria_enable = MLX5_MATCH_OUTER_HEADERS; + dmac_v = MLX5_ADDR_OF(fte_match_param, rule_spec->match_value, outer_headers.dmac_47_16); + ether_addr_copy(dmac_v, entry->key.addr); + dmac_c = MLX5_ADDR_OF(fte_match_param, rule_spec->match_criteria, outer_headers.dmac_47_16); + eth_broadcast_addr(dmac_c); + + if (entry->key.vid) { + if (bridge->vlan_proto == ETH_P_8021Q) { + MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria, + outer_headers.cvlan_tag); + MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_value, + outer_headers.cvlan_tag); + } else if (bridge->vlan_proto == ETH_P_8021AD) { + MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria, + outer_headers.svlan_tag); + MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_value, + outer_headers.svlan_tag); + } + MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria, + outer_headers.first_vid); + MLX5_SET(fte_match_param, rule_spec->match_value, outer_headers.first_vid, + entry->key.vid); + } + + handle = mlx5_add_flow_rules(bridge->egress_ft, rule_spec, &flow_act, dests, num_dests); + + kvfree(dests); + kvfree(rule_spec); + return handle; +} + +static int +mlx5_esw_bridge_port_mdb_offload(struct mlx5_esw_bridge_port *port, + struct mlx5_esw_bridge_mdb_entry *entry) +{ + struct mlx5_flow_handle *handle; + + handle = mlx5_esw_bridge_mdb_flow_create(port->esw_owner_vhca_id, entry, port->bridge); + if (entry->egress_handle) { + mlx5_del_flow_rules(entry->egress_handle); + entry->egress_handle = NULL; + } + if (IS_ERR(handle)) + return PTR_ERR(handle); + + entry->egress_handle = handle; + return 0; +} + +static struct mlx5_esw_bridge_mdb_entry * 
+mlx5_esw_bridge_mdb_lookup(struct mlx5_esw_bridge *bridge, + const unsigned char *addr, u16 vid) +{ + struct mlx5_esw_bridge_mdb_key key = {}; + + ether_addr_copy(key.addr, addr); + key.vid = vid; + return rhashtable_lookup_fast(&bridge->mdb_ht, &key, mdb_ht_params); +} + +static struct mlx5_esw_bridge_mdb_entry * +mlx5_esw_bridge_port_mdb_entry_init(struct mlx5_esw_bridge_port *port, + const unsigned char *addr, u16 vid) +{ + struct mlx5_esw_bridge *bridge = port->bridge; + struct mlx5_esw_bridge_mdb_entry *entry; + int err; + + entry = kvzalloc(sizeof(*entry), GFP_KERNEL); + if (!entry) + return ERR_PTR(-ENOMEM); + + ether_addr_copy(entry->key.addr, addr); + entry->key.vid = vid; + xa_init(&entry->ports); + err = rhashtable_insert_fast(&bridge->mdb_ht, &entry->ht_node, mdb_ht_params); + if (err) + goto err_ht_insert; + + list_add(&entry->list, &bridge->mdb_list); + + return entry; + +err_ht_insert: + xa_destroy(&entry->ports); + kvfree(entry); + return ERR_PTR(err); +} + +static void mlx5_esw_bridge_port_mdb_entry_cleanup(struct mlx5_esw_bridge *bridge, + struct mlx5_esw_bridge_mdb_entry *entry) +{ + if (entry->egress_handle) + mlx5_del_flow_rules(entry->egress_handle); + list_del(&entry->list); + rhashtable_remove_fast(&bridge->mdb_ht, &entry->ht_node, mdb_ht_params); + xa_destroy(&entry->ports); + kvfree(entry); +} + +int mlx5_esw_bridge_port_mdb_attach(struct net_device *dev, struct mlx5_esw_bridge_port *port, + const unsigned char *addr, u16 vid) +{ + struct mlx5_esw_bridge *bridge = port->bridge; + struct mlx5_esw_bridge_mdb_entry *entry; + int err; + + if (!(bridge->flags & MLX5_ESW_BRIDGE_MCAST_FLAG)) + return -EOPNOTSUPP; + + entry = mlx5_esw_bridge_mdb_lookup(bridge, addr, vid); + if (entry) { + if (mlx5_esw_bridge_mdb_port_lookup(port, entry)) { + esw_warn(bridge->br_offloads->esw->dev, "MDB attach entry is already attached to port (MAC=%pM,vid=%u,vport=%u)\n", + addr, vid, port->vport_num); + return 0; + } + } else { + entry = 
mlx5_esw_bridge_port_mdb_entry_init(port, addr, vid); + if (IS_ERR(entry)) { + err = PTR_ERR(entry); + esw_warn(bridge->br_offloads->esw->dev, "MDB attach failed to init entry (MAC=%pM,vid=%u,vport=%u,err=%d)\n", + addr, vid, port->vport_num, err); + return err; + } + } + + err = mlx5_esw_bridge_mdb_port_insert(port, entry); + if (err) { + if (!entry->num_ports) + mlx5_esw_bridge_port_mdb_entry_cleanup(bridge, entry); /* new mdb entry */ + esw_warn(bridge->br_offloads->esw->dev, + "MDB attach failed to insert port (MAC=%pM,vid=%u,vport=%u,err=%d)\n", + addr, vid, port->vport_num, err); + return err; + } + + err = mlx5_esw_bridge_port_mdb_offload(port, entry); + if (err) + /* Single mdb can be used by multiple ports, so just log the + * error and continue. + */ + esw_warn(bridge->br_offloads->esw->dev, "MDB attach failed to offload (MAC=%pM,vid=%u,vport=%u,err=%d)\n", + addr, vid, port->vport_num, err); + + trace_mlx5_esw_bridge_port_mdb_attach(dev, entry); + return 0; +} + +static void mlx5_esw_bridge_port_mdb_entry_detach(struct mlx5_esw_bridge_port *port, + struct mlx5_esw_bridge_mdb_entry *entry) +{ + struct mlx5_esw_bridge *bridge = port->bridge; + int err; + + mlx5_esw_bridge_mdb_port_remove(port, entry); + if (!entry->num_ports) { + mlx5_esw_bridge_port_mdb_entry_cleanup(bridge, entry); + return; + } + + err = mlx5_esw_bridge_port_mdb_offload(port, entry); + if (err) + /* Single mdb can be used by multiple ports, so just log the + * error and continue. 
+ */ + esw_warn(bridge->br_offloads->esw->dev, "MDB detach failed to offload (MAC=%pM,vid=%u,vport=%u)\n", + entry->key.addr, entry->key.vid, port->vport_num); +} + +void mlx5_esw_bridge_port_mdb_detach(struct net_device *dev, struct mlx5_esw_bridge_port *port, + const unsigned char *addr, u16 vid) +{ + struct mlx5_esw_bridge *bridge = port->bridge; + struct mlx5_esw_bridge_mdb_entry *entry; + + entry = mlx5_esw_bridge_mdb_lookup(bridge, addr, vid); + if (!entry) { + esw_debug(bridge->br_offloads->esw->dev, + "MDB detach entry not found (MAC=%pM,vid=%u,vport=%u)\n", + addr, vid, port->vport_num); + return; + } + + if (!mlx5_esw_bridge_mdb_port_lookup(port, entry)) { + esw_debug(bridge->br_offloads->esw->dev, + "MDB detach entry not attached to the port (MAC=%pM,vid=%u,vport=%u)\n", + addr, vid, port->vport_num); + return; + } + + trace_mlx5_esw_bridge_port_mdb_detach(dev, entry); + mlx5_esw_bridge_port_mdb_entry_detach(port, entry); +} + +void mlx5_esw_bridge_port_mdb_vlan_flush(struct mlx5_esw_bridge_port *port, + struct mlx5_esw_bridge_vlan *vlan) +{ + struct mlx5_esw_bridge *bridge = port->bridge; + struct mlx5_esw_bridge_mdb_entry *entry, *tmp; + + list_for_each_entry_safe(entry, tmp, &bridge->mdb_list, list) + if (entry->key.vid == vlan->vid && mlx5_esw_bridge_mdb_port_lookup(port, entry)) + mlx5_esw_bridge_port_mdb_entry_detach(port, entry); +} + +static void mlx5_esw_bridge_port_mdb_flush(struct mlx5_esw_bridge_port *port) +{ + struct mlx5_esw_bridge *bridge = port->bridge; + struct mlx5_esw_bridge_mdb_entry *entry, *tmp; + + list_for_each_entry_safe(entry, tmp, &bridge->mdb_list, list) + if (mlx5_esw_bridge_mdb_port_lookup(port, entry)) + mlx5_esw_bridge_port_mdb_entry_detach(port, entry); +} + +void mlx5_esw_bridge_mdb_flush(struct mlx5_esw_bridge *bridge) +{ + struct mlx5_esw_bridge_mdb_entry *entry, *tmp; + + list_for_each_entry_safe(entry, tmp, &bridge->mdb_list, list) + mlx5_esw_bridge_port_mdb_entry_cleanup(bridge, entry); +} +static int 
mlx5_esw_bridge_port_mcast_fts_init(struct mlx5_esw_bridge_port *port, + struct mlx5_esw_bridge *bridge) +{ + struct mlx5_eswitch *esw = bridge->br_offloads->esw; + struct mlx5_flow_table *mcast_ft; + + mcast_ft = mlx5_esw_bridge_table_create(MLX5_ESW_BRIDGE_MCAST_TABLE_SIZE, + MLX5_ESW_BRIDGE_LEVEL_MCAST_TABLE, + esw); + if (IS_ERR(mcast_ft)) + return PTR_ERR(mcast_ft); + + port->mcast.ft = mcast_ft; + return 0; +} + +static void mlx5_esw_bridge_port_mcast_fts_cleanup(struct mlx5_esw_bridge_port *port) +{ + if (port->mcast.ft) + mlx5_destroy_flow_table(port->mcast.ft); + port->mcast.ft = NULL; +} + +static struct mlx5_flow_group * +mlx5_esw_bridge_mcast_filter_fg_create(struct mlx5_eswitch *esw, + struct mlx5_flow_table *mcast_ft) +{ + int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in); + struct mlx5_flow_group *fg; + u32 *in, *match; + + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) + return ERR_PTR(-ENOMEM); + + MLX5_SET(create_flow_group_in, in, match_criteria_enable, MLX5_MATCH_MISC_PARAMETERS_2); + match = MLX5_ADDR_OF(create_flow_group_in, in, match_criteria); + + MLX5_SET(fte_match_param, match, misc_parameters_2.metadata_reg_c_0, + mlx5_eswitch_get_vport_metadata_mask()); + + MLX5_SET(create_flow_group_in, in, start_flow_index, + MLX5_ESW_BRIDGE_MCAST_TABLE_FILTER_GRP_IDX_FROM); + MLX5_SET(create_flow_group_in, in, end_flow_index, + MLX5_ESW_BRIDGE_MCAST_TABLE_FILTER_GRP_IDX_TO); + + fg = mlx5_create_flow_group(mcast_ft, in); + kvfree(in); + if (IS_ERR(fg)) + esw_warn(esw->dev, + "Failed to create filter flow group for bridge mcast table (err=%pe)\n", + fg); + + return fg; +} + +static struct mlx5_flow_group * +mlx5_esw_bridge_mcast_vlan_proto_fg_create(unsigned int from, unsigned int to, u16 vlan_proto, + struct mlx5_eswitch *esw, + struct mlx5_flow_table *mcast_ft) +{ + int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in); + struct mlx5_flow_group *fg; + u32 *in, *match; + + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) + return ERR_PTR(-ENOMEM); + + 
MLX5_SET(create_flow_group_in, in, match_criteria_enable, MLX5_MATCH_OUTER_HEADERS); + match = MLX5_ADDR_OF(create_flow_group_in, in, match_criteria); + + if (vlan_proto == ETH_P_8021Q) + MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.cvlan_tag); + else if (vlan_proto == ETH_P_8021AD) + MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.svlan_tag); + MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.first_vid); + + MLX5_SET(create_flow_group_in, in, start_flow_index, from); + MLX5_SET(create_flow_group_in, in, end_flow_index, to); + + fg = mlx5_create_flow_group(mcast_ft, in); + kvfree(in); + if (IS_ERR(fg)) + esw_warn(esw->dev, + "Failed to create VLAN(proto=%x) flow group for bridge mcast table (err=%pe)\n", + vlan_proto, fg); + + return fg; +} + +static struct mlx5_flow_group * +mlx5_esw_bridge_mcast_vlan_fg_create(struct mlx5_eswitch *esw, struct mlx5_flow_table *mcast_ft) +{ + unsigned int from = MLX5_ESW_BRIDGE_MCAST_TABLE_VLAN_GRP_IDX_FROM; + unsigned int to = MLX5_ESW_BRIDGE_MCAST_TABLE_VLAN_GRP_IDX_TO; + + return mlx5_esw_bridge_mcast_vlan_proto_fg_create(from, to, ETH_P_8021Q, esw, mcast_ft); +} + +static struct mlx5_flow_group * +mlx5_esw_bridge_mcast_qinq_fg_create(struct mlx5_eswitch *esw, + struct mlx5_flow_table *mcast_ft) +{ + unsigned int from = MLX5_ESW_BRIDGE_MCAST_TABLE_QINQ_GRP_IDX_FROM; + unsigned int to = MLX5_ESW_BRIDGE_MCAST_TABLE_QINQ_GRP_IDX_TO; + + return mlx5_esw_bridge_mcast_vlan_proto_fg_create(from, to, ETH_P_8021AD, esw, mcast_ft); +} + +static struct mlx5_flow_group * +mlx5_esw_bridge_mcast_fwd_fg_create(struct mlx5_eswitch *esw, + struct mlx5_flow_table *mcast_ft) +{ + int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in); + struct mlx5_flow_group *fg; + u32 *in; + + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) + return ERR_PTR(-ENOMEM); + + MLX5_SET(create_flow_group_in, in, start_flow_index, + MLX5_ESW_BRIDGE_MCAST_TABLE_FWD_GRP_IDX_FROM); + MLX5_SET(create_flow_group_in, in, end_flow_index, + 
MLX5_ESW_BRIDGE_MCAST_TABLE_FWD_GRP_IDX_TO); + + fg = mlx5_create_flow_group(mcast_ft, in); + kvfree(in); + if (IS_ERR(fg)) + esw_warn(esw->dev, + "Failed to create forward flow group for bridge mcast table (err=%pe)\n", + fg); + + return fg; +} + +static int mlx5_esw_bridge_port_mcast_fgs_init(struct mlx5_esw_bridge_port *port) +{ + struct mlx5_flow_group *fwd_fg, *qinq_fg, *vlan_fg, *filter_fg; + struct mlx5_eswitch *esw = port->bridge->br_offloads->esw; + struct mlx5_flow_table *mcast_ft = port->mcast.ft; + int err; + + filter_fg = mlx5_esw_bridge_mcast_filter_fg_create(esw, mcast_ft); + if (IS_ERR(filter_fg)) + return PTR_ERR(filter_fg); + + vlan_fg = mlx5_esw_bridge_mcast_vlan_fg_create(esw, mcast_ft); + if (IS_ERR(vlan_fg)) { + err = PTR_ERR(vlan_fg); + goto err_vlan_fg; + } + + qinq_fg = mlx5_esw_bridge_mcast_qinq_fg_create(esw, mcast_ft); + if (IS_ERR(qinq_fg)) { + err = PTR_ERR(qinq_fg); + goto err_qinq_fg; + } + + fwd_fg = mlx5_esw_bridge_mcast_fwd_fg_create(esw, mcast_ft); + if (IS_ERR(fwd_fg)) { + err = PTR_ERR(fwd_fg); + goto err_fwd_fg; + } + + port->mcast.filter_fg = filter_fg; + port->mcast.vlan_fg = vlan_fg; + port->mcast.qinq_fg = qinq_fg; + port->mcast.fwd_fg = fwd_fg; + + return 0; + +err_fwd_fg: + mlx5_destroy_flow_group(qinq_fg); +err_qinq_fg: + mlx5_destroy_flow_group(vlan_fg); +err_vlan_fg: + mlx5_destroy_flow_group(filter_fg); + return err; +} + +static void mlx5_esw_bridge_port_mcast_fgs_cleanup(struct mlx5_esw_bridge_port *port) +{ + if (port->mcast.fwd_fg) + mlx5_destroy_flow_group(port->mcast.fwd_fg); + port->mcast.fwd_fg = NULL; + if (port->mcast.qinq_fg) + mlx5_destroy_flow_group(port->mcast.qinq_fg); + port->mcast.qinq_fg = NULL; + if (port->mcast.vlan_fg) + mlx5_destroy_flow_group(port->mcast.vlan_fg); + port->mcast.vlan_fg = NULL; + if (port->mcast.filter_fg) + mlx5_destroy_flow_group(port->mcast.filter_fg); + port->mcast.filter_fg = NULL; +} + +static struct mlx5_flow_handle * +mlx5_esw_bridge_mcast_flow_with_esw_create(struct 
mlx5_esw_bridge_port *port, + struct mlx5_eswitch *esw) +{ + struct mlx5_flow_act flow_act = { + .action = MLX5_FLOW_CONTEXT_ACTION_DROP, + .flags = FLOW_ACT_NO_APPEND, + }; + struct mlx5_flow_spec *rule_spec; + struct mlx5_flow_handle *handle; + + rule_spec = kvzalloc(sizeof(*rule_spec), GFP_KERNEL); + if (!rule_spec) + return ERR_PTR(-ENOMEM); + + rule_spec->match_criteria_enable = MLX5_MATCH_MISC_PARAMETERS_2; + + MLX5_SET(fte_match_param, rule_spec->match_criteria, + misc_parameters_2.metadata_reg_c_0, mlx5_eswitch_get_vport_metadata_mask()); + MLX5_SET(fte_match_param, rule_spec->match_value, misc_parameters_2.metadata_reg_c_0, + mlx5_eswitch_get_vport_metadata_for_match(esw, port->vport_num)); + + handle = mlx5_add_flow_rules(port->mcast.ft, rule_spec, &flow_act, NULL, 0); + + kvfree(rule_spec); + return handle; +} + +static struct mlx5_flow_handle * +mlx5_esw_bridge_mcast_filter_flow_create(struct mlx5_esw_bridge_port *port) +{ + return mlx5_esw_bridge_mcast_flow_with_esw_create(port, port->bridge->br_offloads->esw); +} + +static struct mlx5_flow_handle * +mlx5_esw_bridge_mcast_filter_flow_peer_create(struct mlx5_esw_bridge_port *port) +{ + struct mlx5_devcom *devcom = port->bridge->br_offloads->esw->dev->priv.devcom; + static struct mlx5_flow_handle *handle; + struct mlx5_eswitch *peer_esw; + + peer_esw = mlx5_devcom_get_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS); + if (!peer_esw) + return ERR_PTR(-ENODEV); + + handle = mlx5_esw_bridge_mcast_flow_with_esw_create(port, peer_esw); + + mlx5_devcom_release_peer_data(devcom, MLX5_DEVCOM_ESW_OFFLOADS); + return handle; +} + +static struct mlx5_flow_handle * +mlx5_esw_bridge_mcast_vlan_flow_create(u16 vlan_proto, struct mlx5_esw_bridge_port *port, + struct mlx5_esw_bridge_vlan *vlan) +{ + struct mlx5_flow_act flow_act = { + .action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST, + .flags = FLOW_ACT_NO_APPEND, + }; + struct mlx5_flow_destination dest = { + .type = MLX5_FLOW_DESTINATION_TYPE_VPORT, + .vport.num = 
port->vport_num, + }; + struct mlx5_esw_bridge *bridge = port->bridge; + struct mlx5_flow_spec *rule_spec; + struct mlx5_flow_handle *handle; + + rule_spec = kvzalloc(sizeof(*rule_spec), GFP_KERNEL); + if (!rule_spec) + return ERR_PTR(-ENOMEM); + + if (MLX5_CAP_ESW_FLOWTABLE(bridge->br_offloads->esw->dev, flow_source) && + port->vport_num == MLX5_VPORT_UPLINK) + rule_spec->flow_context.flow_source = + MLX5_FLOW_CONTEXT_FLOW_SOURCE_LOCAL_VPORT; + rule_spec->match_criteria_enable = MLX5_MATCH_OUTER_HEADERS; + + flow_act.action |= MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT; + flow_act.pkt_reformat = vlan->pkt_reformat_pop; + + if (vlan_proto == ETH_P_8021Q) { + MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria, + outer_headers.cvlan_tag); + MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_value, + outer_headers.cvlan_tag); + } else if (vlan_proto == ETH_P_8021AD) { + MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria, + outer_headers.svlan_tag); + MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_value, + outer_headers.svlan_tag); + } + MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria, outer_headers.first_vid); + MLX5_SET(fte_match_param, rule_spec->match_value, outer_headers.first_vid, vlan->vid); + + if (MLX5_CAP_ESW(bridge->br_offloads->esw->dev, merged_eswitch)) { + dest.vport.flags = MLX5_FLOW_DEST_VPORT_VHCA_ID; + dest.vport.vhca_id = port->esw_owner_vhca_id; + } + handle = mlx5_add_flow_rules(port->mcast.ft, rule_spec, &flow_act, &dest, 1); + + kvfree(rule_spec); + return handle; +} + +int mlx5_esw_bridge_vlan_mcast_init(u16 vlan_proto, struct mlx5_esw_bridge_port *port, + struct mlx5_esw_bridge_vlan *vlan) +{ + struct mlx5_flow_handle *handle; + + if (!(port->bridge->flags & MLX5_ESW_BRIDGE_MCAST_FLAG)) + return 0; + + handle = mlx5_esw_bridge_mcast_vlan_flow_create(vlan_proto, port, vlan); + if (IS_ERR(handle)) + return PTR_ERR(handle); + + vlan->mcast_handle = handle; + return 0; +} + +void 
mlx5_esw_bridge_vlan_mcast_cleanup(struct mlx5_esw_bridge_vlan *vlan) +{ + if (vlan->mcast_handle) + mlx5_del_flow_rules(vlan->mcast_handle); + vlan->mcast_handle = NULL; +} + +static struct mlx5_flow_handle * +mlx5_esw_bridge_mcast_fwd_flow_create(struct mlx5_esw_bridge_port *port) +{ + struct mlx5_flow_act flow_act = { + .action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST, + .flags = FLOW_ACT_NO_APPEND, + }; + struct mlx5_flow_destination dest = { + .type = MLX5_FLOW_DESTINATION_TYPE_VPORT, + .vport.num = port->vport_num, + }; + struct mlx5_esw_bridge *bridge = port->bridge; + struct mlx5_flow_spec *rule_spec; + struct mlx5_flow_handle *handle; + + rule_spec = kvzalloc(sizeof(*rule_spec), GFP_KERNEL); + if (!rule_spec) + return ERR_PTR(-ENOMEM); + + if (MLX5_CAP_ESW_FLOWTABLE(bridge->br_offloads->esw->dev, flow_source) && + port->vport_num == MLX5_VPORT_UPLINK) + rule_spec->flow_context.flow_source = + MLX5_FLOW_CONTEXT_FLOW_SOURCE_LOCAL_VPORT; + + if (MLX5_CAP_ESW(bridge->br_offloads->esw->dev, merged_eswitch)) { + dest.vport.flags = MLX5_FLOW_DEST_VPORT_VHCA_ID; + dest.vport.vhca_id = port->esw_owner_vhca_id; + } + handle = mlx5_add_flow_rules(port->mcast.ft, rule_spec, &flow_act, &dest, 1); + + kvfree(rule_spec); + return handle; +} + +static int mlx5_esw_bridge_port_mcast_fhs_init(struct mlx5_esw_bridge_port *port) +{ + struct mlx5_flow_handle *filter_handle, *fwd_handle; + struct mlx5_esw_bridge_vlan *vlan, *failed; + unsigned long index; + int err; + + + filter_handle = (port->flags & MLX5_ESW_BRIDGE_PORT_FLAG_PEER) ? 
+ mlx5_esw_bridge_mcast_filter_flow_peer_create(port) : + mlx5_esw_bridge_mcast_filter_flow_create(port); + if (IS_ERR(filter_handle)) + return PTR_ERR(filter_handle); + + fwd_handle = mlx5_esw_bridge_mcast_fwd_flow_create(port); + if (IS_ERR(fwd_handle)) { + err = PTR_ERR(fwd_handle); + goto err_fwd; + } + + xa_for_each(&port->vlans, index, vlan) { + err = mlx5_esw_bridge_vlan_mcast_init(port->bridge->vlan_proto, port, vlan); + if (err) { + failed = vlan; + goto err_vlan; + } + } + + port->mcast.filter_handle = filter_handle; + port->mcast.fwd_handle = fwd_handle; + + return 0; + +err_vlan: + xa_for_each(&port->vlans, index, vlan) { + if (vlan == failed) + break; + + mlx5_esw_bridge_vlan_mcast_cleanup(vlan); + } + mlx5_del_flow_rules(fwd_handle); +err_fwd: + mlx5_del_flow_rules(filter_handle); + return err; +} + +static void mlx5_esw_bridge_port_mcast_fhs_cleanup(struct mlx5_esw_bridge_port *port) +{ + struct mlx5_esw_bridge_vlan *vlan; + unsigned long index; + + xa_for_each(&port->vlans, index, vlan) + mlx5_esw_bridge_vlan_mcast_cleanup(vlan); + + if (port->mcast.fwd_handle) + mlx5_del_flow_rules(port->mcast.fwd_handle); + port->mcast.fwd_handle = NULL; + if (port->mcast.filter_handle) + mlx5_del_flow_rules(port->mcast.filter_handle); + port->mcast.filter_handle = NULL; +} + +int mlx5_esw_bridge_port_mcast_init(struct mlx5_esw_bridge_port *port) +{ + struct mlx5_esw_bridge *bridge = port->bridge; + int err; + + if (!(bridge->flags & MLX5_ESW_BRIDGE_MCAST_FLAG)) + return 0; + + err = mlx5_esw_bridge_port_mcast_fts_init(port, bridge); + if (err) + return err; + + err = mlx5_esw_bridge_port_mcast_fgs_init(port); + if (err) + goto err_fgs; + + err = mlx5_esw_bridge_port_mcast_fhs_init(port); + if (err) + goto err_fhs; + return err; + +err_fhs: + mlx5_esw_bridge_port_mcast_fgs_cleanup(port); +err_fgs: + mlx5_esw_bridge_port_mcast_fts_cleanup(port); + return err; +} + +void mlx5_esw_bridge_port_mcast_cleanup(struct mlx5_esw_bridge_port *port) +{ + 
mlx5_esw_bridge_port_mdb_flush(port); + mlx5_esw_bridge_port_mcast_fhs_cleanup(port); + mlx5_esw_bridge_port_mcast_fgs_cleanup(port); + mlx5_esw_bridge_port_mcast_fts_cleanup(port); +} + +static struct mlx5_flow_group * +mlx5_esw_bridge_ingress_igmp_fg_create(struct mlx5_eswitch *esw, + struct mlx5_flow_table *ingress_ft) +{ + int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in); + struct mlx5_flow_group *fg; + u32 *in, *match; + + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) + return ERR_PTR(-ENOMEM); + + MLX5_SET(create_flow_group_in, in, match_criteria_enable, MLX5_MATCH_OUTER_HEADERS); + match = MLX5_ADDR_OF(create_flow_group_in, in, match_criteria); + + MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.ip_version); + MLX5_SET_TO_ONES(fte_match_param, match, outer_headers.ip_protocol); + + MLX5_SET(create_flow_group_in, in, start_flow_index, + MLX5_ESW_BRIDGE_INGRESS_TABLE_IGMP_GRP_IDX_FROM); + MLX5_SET(create_flow_group_in, in, end_flow_index, + MLX5_ESW_BRIDGE_INGRESS_TABLE_IGMP_GRP_IDX_TO); + + fg = mlx5_create_flow_group(ingress_ft, in); + kvfree(in); + if (IS_ERR(fg)) + esw_warn(esw->dev, + "Failed to create IGMP flow group for bridge ingress table (err=%pe)\n", + fg); + + return fg; +} + +static struct mlx5_flow_group * +mlx5_esw_bridge_ingress_mld_fg_create(struct mlx5_eswitch *esw, + struct mlx5_flow_table *ingress_ft) +{ + int inlen = MLX5_ST_SZ_BYTES(create_flow_group_in); + struct mlx5_flow_group *fg; + u32 *in, *match; + + if (!(MLX5_CAP_GEN(esw->dev, flex_parser_protocols) & MLX5_FLEX_PROTO_ICMPV6)) { + esw_warn(esw->dev, + "Can't create MLD flow group due to missing hardware ICMPv6 parsing support\n"); + return NULL; + } + + in = kvzalloc(inlen, GFP_KERNEL); + if (!in) + return ERR_PTR(-ENOMEM); + + MLX5_SET(create_flow_group_in, in, match_criteria_enable, + MLX5_MATCH_OUTER_HEADERS | MLX5_MATCH_MISC_PARAMETERS_3); + match = MLX5_ADDR_OF(create_flow_group_in, in, match_criteria); + + MLX5_SET_TO_ONES(fte_match_param, match, 
outer_headers.ip_version); + MLX5_SET_TO_ONES(fte_match_param, match, misc_parameters_3.icmpv6_type); + + MLX5_SET(create_flow_group_in, in, start_flow_index, + MLX5_ESW_BRIDGE_INGRESS_TABLE_MLD_GRP_IDX_FROM); + MLX5_SET(create_flow_group_in, in, end_flow_index, + MLX5_ESW_BRIDGE_INGRESS_TABLE_MLD_GRP_IDX_TO); + + fg = mlx5_create_flow_group(ingress_ft, in); + kvfree(in); + if (IS_ERR(fg)) + esw_warn(esw->dev, + "Failed to create MLD flow group for bridge ingress table (err=%pe)\n", + fg); + + return fg; +} + +static int +mlx5_esw_bridge_ingress_mcast_fgs_init(struct mlx5_esw_bridge_offloads *br_offloads) +{ + struct mlx5_flow_table *ingress_ft = br_offloads->ingress_ft; + struct mlx5_eswitch *esw = br_offloads->esw; + struct mlx5_flow_group *igmp_fg, *mld_fg; + + igmp_fg = mlx5_esw_bridge_ingress_igmp_fg_create(esw, ingress_ft); + if (IS_ERR(igmp_fg)) + return PTR_ERR(igmp_fg); + + mld_fg = mlx5_esw_bridge_ingress_mld_fg_create(esw, ingress_ft); + if (IS_ERR(mld_fg)) { + mlx5_destroy_flow_group(igmp_fg); + return PTR_ERR(mld_fg); + } + + br_offloads->ingress_igmp_fg = igmp_fg; + br_offloads->ingress_mld_fg = mld_fg; + return 0; +} + +static void +mlx5_esw_bridge_ingress_mcast_fgs_cleanup(struct mlx5_esw_bridge_offloads *br_offloads) +{ + if (br_offloads->ingress_mld_fg) + mlx5_destroy_flow_group(br_offloads->ingress_mld_fg); + br_offloads->ingress_mld_fg = NULL; + if (br_offloads->ingress_igmp_fg) + mlx5_destroy_flow_group(br_offloads->ingress_igmp_fg); + br_offloads->ingress_igmp_fg = NULL; +} + +static struct mlx5_flow_handle * +mlx5_esw_bridge_ingress_igmp_fh_create(struct mlx5_flow_table *ingress_ft, + struct mlx5_flow_table *skip_ft) +{ + struct mlx5_flow_destination dest = { + .type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE, + .ft = skip_ft, + }; + struct mlx5_flow_act flow_act = { + .action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST, + .flags = FLOW_ACT_NO_APPEND, + }; + struct mlx5_flow_spec *rule_spec; + struct mlx5_flow_handle *handle; + + rule_spec = 
kvzalloc(sizeof(*rule_spec), GFP_KERNEL); + if (!rule_spec) + return ERR_PTR(-ENOMEM); + + rule_spec->match_criteria_enable = MLX5_MATCH_OUTER_HEADERS; + + MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria, outer_headers.ip_version); + MLX5_SET(fte_match_param, rule_spec->match_value, outer_headers.ip_version, 4); + MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria, outer_headers.ip_protocol); + MLX5_SET(fte_match_param, rule_spec->match_value, outer_headers.ip_protocol, IPPROTO_IGMP); + + handle = mlx5_add_flow_rules(ingress_ft, rule_spec, &flow_act, &dest, 1); + + kvfree(rule_spec); + return handle; +} + +static struct mlx5_flow_handle * +mlx5_esw_bridge_ingress_mld_fh_create(u8 type, struct mlx5_flow_table *ingress_ft, + struct mlx5_flow_table *skip_ft) +{ + struct mlx5_flow_destination dest = { + .type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE, + .ft = skip_ft, + }; + struct mlx5_flow_act flow_act = { + .action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST, + .flags = FLOW_ACT_NO_APPEND, + }; + struct mlx5_flow_spec *rule_spec; + struct mlx5_flow_handle *handle; + + rule_spec = kvzalloc(sizeof(*rule_spec), GFP_KERNEL); + if (!rule_spec) + return ERR_PTR(-ENOMEM); + + rule_spec->match_criteria_enable = MLX5_MATCH_OUTER_HEADERS | MLX5_MATCH_MISC_PARAMETERS_3; + + MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria, outer_headers.ip_version); + MLX5_SET(fte_match_param, rule_spec->match_value, outer_headers.ip_version, 6); + MLX5_SET_TO_ONES(fte_match_param, rule_spec->match_criteria, misc_parameters_3.icmpv6_type); + MLX5_SET(fte_match_param, rule_spec->match_value, misc_parameters_3.icmpv6_type, type); + + handle = mlx5_add_flow_rules(ingress_ft, rule_spec, &flow_act, &dest, 1); + + kvfree(rule_spec); + return handle; +} + +static int +mlx5_esw_bridge_ingress_mcast_fhs_create(struct mlx5_esw_bridge_offloads *br_offloads) +{ + struct mlx5_flow_handle *igmp_handle, *mld_query_handle, *mld_report_handle, + *mld_done_handle; + struct mlx5_flow_table 
*ingress_ft = br_offloads->ingress_ft, + *skip_ft = br_offloads->skip_ft; + int err; + + igmp_handle = mlx5_esw_bridge_ingress_igmp_fh_create(ingress_ft, skip_ft); + if (IS_ERR(igmp_handle)) + return PTR_ERR(igmp_handle); + + if (br_offloads->ingress_mld_fg) { + mld_query_handle = mlx5_esw_bridge_ingress_mld_fh_create(ICMPV6_MGM_QUERY, + ingress_ft, + skip_ft); + if (IS_ERR(mld_query_handle)) { + err = PTR_ERR(mld_query_handle); + goto err_mld_query; + } + + mld_report_handle = mlx5_esw_bridge_ingress_mld_fh_create(ICMPV6_MGM_REPORT, + ingress_ft, + skip_ft); + if (IS_ERR(mld_report_handle)) { + err = PTR_ERR(mld_report_handle); + goto err_mld_report; + } + + mld_done_handle = mlx5_esw_bridge_ingress_mld_fh_create(ICMPV6_MGM_REDUCTION, + ingress_ft, + skip_ft); + if (IS_ERR(mld_done_handle)) { + err = PTR_ERR(mld_done_handle); + goto err_mld_done; + } + } else { + mld_query_handle = NULL; + mld_report_handle = NULL; + mld_done_handle = NULL; + } + + br_offloads->igmp_handle = igmp_handle; + br_offloads->mld_query_handle = mld_query_handle; + br_offloads->mld_report_handle = mld_report_handle; + br_offloads->mld_done_handle = mld_done_handle; + + return 0; + +err_mld_done: + mlx5_del_flow_rules(mld_report_handle); +err_mld_report: + mlx5_del_flow_rules(mld_query_handle); +err_mld_query: + mlx5_del_flow_rules(igmp_handle); + return err; +} + +static void +mlx5_esw_bridge_ingress_mcast_fhs_cleanup(struct mlx5_esw_bridge_offloads *br_offloads) +{ + if (br_offloads->mld_done_handle) + mlx5_del_flow_rules(br_offloads->mld_done_handle); + br_offloads->mld_done_handle = NULL; + if (br_offloads->mld_report_handle) + mlx5_del_flow_rules(br_offloads->mld_report_handle); + br_offloads->mld_report_handle = NULL; + if (br_offloads->mld_query_handle) + mlx5_del_flow_rules(br_offloads->mld_query_handle); + br_offloads->mld_query_handle = NULL; + if (br_offloads->igmp_handle) + mlx5_del_flow_rules(br_offloads->igmp_handle); + br_offloads->igmp_handle = NULL; +} + +static int 
mlx5_esw_brige_mcast_init(struct mlx5_esw_bridge *bridge) +{ + struct mlx5_esw_bridge_offloads *br_offloads = bridge->br_offloads; + struct mlx5_esw_bridge_port *port, *failed; + unsigned long i; + int err; + + xa_for_each(&br_offloads->ports, i, port) { + if (port->bridge != bridge) + continue; + + err = mlx5_esw_bridge_port_mcast_init(port); + if (err) { + failed = port; + goto err_port; + } + } + return 0; + +err_port: + xa_for_each(&br_offloads->ports, i, port) { + if (port == failed) + break; + if (port->bridge != bridge) + continue; + + mlx5_esw_bridge_port_mcast_cleanup(port); + } + return err; +} + +static void mlx5_esw_brige_mcast_cleanup(struct mlx5_esw_bridge *bridge) +{ + struct mlx5_esw_bridge_offloads *br_offloads = bridge->br_offloads; + struct mlx5_esw_bridge_port *port; + unsigned long i; + + xa_for_each(&br_offloads->ports, i, port) { + if (port->bridge != bridge) + continue; + + mlx5_esw_bridge_port_mcast_cleanup(port); + } +} + +static int mlx5_esw_brige_mcast_global_enable(struct mlx5_esw_bridge_offloads *br_offloads) +{ + int err; + + if (br_offloads->ingress_igmp_fg) + return 0; /* already enabled by another bridge */ + + err = mlx5_esw_bridge_ingress_mcast_fgs_init(br_offloads); + if (err) { + esw_warn(br_offloads->esw->dev, + "Failed to create global multicast flow groups (err=%d)\n", + err); + return err; + } + + err = mlx5_esw_bridge_ingress_mcast_fhs_create(br_offloads); + if (err) { + esw_warn(br_offloads->esw->dev, + "Failed to create global multicast flows (err=%d)\n", + err); + goto err_fhs; + } + + return 0; + +err_fhs: + mlx5_esw_bridge_ingress_mcast_fgs_cleanup(br_offloads); + return err; +} + +static void mlx5_esw_brige_mcast_global_disable(struct mlx5_esw_bridge_offloads *br_offloads) +{ + struct mlx5_esw_bridge *br; + + list_for_each_entry(br, &br_offloads->bridges, list) { + /* Ingress table is global, so only disable snooping when all + * bridges on esw have multicast disabled. 
+ */ + if (br->flags & MLX5_ESW_BRIDGE_MCAST_FLAG) + return; + } + + mlx5_esw_bridge_ingress_mcast_fhs_cleanup(br_offloads); + mlx5_esw_bridge_ingress_mcast_fgs_cleanup(br_offloads); +} + +int mlx5_esw_bridge_mcast_enable(struct mlx5_esw_bridge *bridge) +{ + int err; + + err = mlx5_esw_brige_mcast_global_enable(bridge->br_offloads); + if (err) + return err; + + bridge->flags |= MLX5_ESW_BRIDGE_MCAST_FLAG; + + err = mlx5_esw_brige_mcast_init(bridge); + if (err) { + esw_warn(bridge->br_offloads->esw->dev, "Failed to enable multicast (err=%d)\n", + err); + bridge->flags &= ~MLX5_ESW_BRIDGE_MCAST_FLAG; + mlx5_esw_brige_mcast_global_disable(bridge->br_offloads); + } + return err; +} + +void mlx5_esw_bridge_mcast_disable(struct mlx5_esw_bridge *bridge) +{ + mlx5_esw_brige_mcast_cleanup(bridge); + bridge->flags &= ~MLX5_ESW_BRIDGE_MCAST_FLAG; + mlx5_esw_brige_mcast_global_disable(bridge->br_offloads); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h index 878311fe950a..c9595801bdb4 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/bridge_priv.h @@ -12,11 +12,124 @@ #include <linux/xarray.h> #include "fs_core.h" +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_IGMP_GRP_SIZE 1 +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_MLD_GRP_SIZE 3 +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_SIZE 131072 +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_UNTAGGED_GRP_SIZE \ + (524288 - MLX5_ESW_BRIDGE_INGRESS_TABLE_IGMP_GRP_SIZE - \ + MLX5_ESW_BRIDGE_INGRESS_TABLE_MLD_GRP_SIZE) + +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_IGMP_GRP_IDX_FROM 0 +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_IGMP_GRP_IDX_TO \ + (MLX5_ESW_BRIDGE_INGRESS_TABLE_IGMP_GRP_SIZE - 1) +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_MLD_GRP_IDX_FROM \ + (MLX5_ESW_BRIDGE_INGRESS_TABLE_IGMP_GRP_IDX_TO + 1) +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_MLD_GRP_IDX_TO \ + (MLX5_ESW_BRIDGE_INGRESS_TABLE_MLD_GRP_IDX_FROM 
+ \ + MLX5_ESW_BRIDGE_INGRESS_TABLE_MLD_GRP_SIZE - 1) +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_FROM \ + (MLX5_ESW_BRIDGE_INGRESS_TABLE_MLD_GRP_IDX_TO + 1) +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_TO \ + (MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_FROM + \ + MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_SIZE - 1) +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_FILTER_GRP_IDX_FROM \ + (MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_IDX_TO + 1) +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_FILTER_GRP_IDX_TO \ + (MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_FILTER_GRP_IDX_FROM + \ + MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_SIZE - 1) +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_QINQ_GRP_IDX_FROM \ + (MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_FILTER_GRP_IDX_TO + 1) +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_QINQ_GRP_IDX_TO \ + (MLX5_ESW_BRIDGE_INGRESS_TABLE_QINQ_GRP_IDX_FROM + \ + MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_SIZE - 1) +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_QINQ_FILTER_GRP_IDX_FROM \ + (MLX5_ESW_BRIDGE_INGRESS_TABLE_QINQ_GRP_IDX_TO + 1) +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_QINQ_FILTER_GRP_IDX_TO \ + (MLX5_ESW_BRIDGE_INGRESS_TABLE_QINQ_FILTER_GRP_IDX_FROM + \ + MLX5_ESW_BRIDGE_INGRESS_TABLE_VLAN_GRP_SIZE - 1) +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_FROM \ + (MLX5_ESW_BRIDGE_INGRESS_TABLE_QINQ_FILTER_GRP_IDX_TO + 1) +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_TO \ + (MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_FROM + \ + MLX5_ESW_BRIDGE_INGRESS_TABLE_UNTAGGED_GRP_SIZE - 1) +#define MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE \ + (MLX5_ESW_BRIDGE_INGRESS_TABLE_MAC_GRP_IDX_TO + 1) +static_assert(MLX5_ESW_BRIDGE_INGRESS_TABLE_SIZE == 1048576); + +#define MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_SIZE 131072 +#define MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_SIZE (262144 - 1) +#define MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_IDX_FROM 0 +#define MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_IDX_TO \ + (MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_SIZE - 1) +#define MLX5_ESW_BRIDGE_EGRESS_TABLE_QINQ_GRP_IDX_FROM \ 
+ (MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_IDX_TO + 1) +#define MLX5_ESW_BRIDGE_EGRESS_TABLE_QINQ_GRP_IDX_TO \ + (MLX5_ESW_BRIDGE_EGRESS_TABLE_QINQ_GRP_IDX_FROM + \ + MLX5_ESW_BRIDGE_EGRESS_TABLE_VLAN_GRP_SIZE - 1) +#define MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_FROM \ + (MLX5_ESW_BRIDGE_EGRESS_TABLE_QINQ_GRP_IDX_TO + 1) +#define MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_TO \ + (MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_FROM + \ + MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_SIZE - 1) +#define MLX5_ESW_BRIDGE_EGRESS_TABLE_MISS_GRP_IDX_FROM \ + (MLX5_ESW_BRIDGE_EGRESS_TABLE_MAC_GRP_IDX_TO + 1) +#define MLX5_ESW_BRIDGE_EGRESS_TABLE_MISS_GRP_IDX_TO \ + MLX5_ESW_BRIDGE_EGRESS_TABLE_MISS_GRP_IDX_FROM +#define MLX5_ESW_BRIDGE_EGRESS_TABLE_SIZE \ + (MLX5_ESW_BRIDGE_EGRESS_TABLE_MISS_GRP_IDX_TO + 1) +static_assert(MLX5_ESW_BRIDGE_EGRESS_TABLE_SIZE == 524288); + +#define MLX5_ESW_BRIDGE_SKIP_TABLE_SIZE 0 + +#define MLX5_ESW_BRIDGE_MCAST_TABLE_FILTER_GRP_SIZE 1 +#define MLX5_ESW_BRIDGE_MCAST_TABLE_FWD_GRP_SIZE 1 +#define MLX5_ESW_BRIDGE_MCAST_TABLE_VLAN_GRP_SIZE 4095 +#define MLX5_ESW_BRIDGE_MCAST_TABLE_QINQ_GRP_SIZE MLX5_ESW_BRIDGE_MCAST_TABLE_VLAN_GRP_SIZE +#define MLX5_ESW_BRIDGE_MCAST_TABLE_FILTER_GRP_IDX_FROM 0 +#define MLX5_ESW_BRIDGE_MCAST_TABLE_FILTER_GRP_IDX_TO \ + (MLX5_ESW_BRIDGE_MCAST_TABLE_FILTER_GRP_SIZE - 1) +#define MLX5_ESW_BRIDGE_MCAST_TABLE_VLAN_GRP_IDX_FROM \ + (MLX5_ESW_BRIDGE_MCAST_TABLE_FILTER_GRP_IDX_TO + 1) +#define MLX5_ESW_BRIDGE_MCAST_TABLE_VLAN_GRP_IDX_TO \ + (MLX5_ESW_BRIDGE_MCAST_TABLE_VLAN_GRP_IDX_FROM + \ + MLX5_ESW_BRIDGE_MCAST_TABLE_VLAN_GRP_SIZE - 1) +#define MLX5_ESW_BRIDGE_MCAST_TABLE_QINQ_GRP_IDX_FROM \ + (MLX5_ESW_BRIDGE_MCAST_TABLE_VLAN_GRP_IDX_TO + 1) +#define MLX5_ESW_BRIDGE_MCAST_TABLE_QINQ_GRP_IDX_TO \ + (MLX5_ESW_BRIDGE_MCAST_TABLE_QINQ_GRP_IDX_FROM + \ + MLX5_ESW_BRIDGE_MCAST_TABLE_QINQ_GRP_SIZE - 1) +#define MLX5_ESW_BRIDGE_MCAST_TABLE_FWD_GRP_IDX_FROM \ + (MLX5_ESW_BRIDGE_MCAST_TABLE_QINQ_GRP_IDX_TO + 1) +#define 
MLX5_ESW_BRIDGE_MCAST_TABLE_FWD_GRP_IDX_TO \ + (MLX5_ESW_BRIDGE_MCAST_TABLE_FWD_GRP_IDX_FROM + \ + MLX5_ESW_BRIDGE_MCAST_TABLE_FWD_GRP_SIZE - 1) + +#define MLX5_ESW_BRIDGE_MCAST_TABLE_SIZE \ + (MLX5_ESW_BRIDGE_MCAST_TABLE_FWD_GRP_IDX_TO + 1) +static_assert(MLX5_ESW_BRIDGE_MCAST_TABLE_SIZE == 8192); + +enum { + MLX5_ESW_BRIDGE_LEVEL_INGRESS_TABLE, + MLX5_ESW_BRIDGE_LEVEL_EGRESS_TABLE, + MLX5_ESW_BRIDGE_LEVEL_MCAST_TABLE, + MLX5_ESW_BRIDGE_LEVEL_SKIP_TABLE, +}; + +enum { + MLX5_ESW_BRIDGE_VLAN_FILTERING_FLAG = BIT(0), + MLX5_ESW_BRIDGE_MCAST_FLAG = BIT(1), +}; + struct mlx5_esw_bridge_fdb_key { unsigned char addr[ETH_ALEN]; u16 vid; }; +struct mlx5_esw_bridge_mdb_key { + unsigned char addr[ETH_ALEN]; + u16 vid; +}; + enum { MLX5_ESW_BRIDGE_FLAG_ADDED_BY_USER = BIT(0), MLX5_ESW_BRIDGE_FLAG_PEER = BIT(1), @@ -43,6 +156,16 @@ struct mlx5_esw_bridge_fdb_entry { struct mlx5_flow_handle *filter_handle; }; +struct mlx5_esw_bridge_mdb_entry { + struct mlx5_esw_bridge_mdb_key key; + struct rhash_head ht_node; + struct list_head list; + struct xarray ports; + int num_ports; + + struct mlx5_flow_handle *egress_handle; +}; + struct mlx5_esw_bridge_vlan { u16 vid; u16 flags; @@ -50,6 +173,7 @@ struct mlx5_esw_bridge_vlan { struct mlx5_pkt_reformat *pkt_reformat_push; struct mlx5_pkt_reformat *pkt_reformat_pop; struct mlx5_modify_hdr *pkt_mod_hdr_push_mark; + struct mlx5_flow_handle *mcast_handle; }; struct mlx5_esw_bridge_port { @@ -58,6 +182,63 @@ struct mlx5_esw_bridge_port { u16 flags; struct mlx5_esw_bridge *bridge; struct xarray vlans; + struct { + struct mlx5_flow_table *ft; + struct mlx5_flow_group *filter_fg; + struct mlx5_flow_group *vlan_fg; + struct mlx5_flow_group *qinq_fg; + struct mlx5_flow_group *fwd_fg; + + struct mlx5_flow_handle *filter_handle; + struct mlx5_flow_handle *fwd_handle; + } mcast; }; +struct mlx5_esw_bridge { + int ifindex; + int refcnt; + struct list_head list; + struct mlx5_esw_bridge_offloads *br_offloads; + + struct list_head fdb_list; + struct 
rhashtable fdb_ht; + + struct list_head mdb_list; + struct rhashtable mdb_ht; + + struct mlx5_flow_table *egress_ft; + struct mlx5_flow_group *egress_vlan_fg; + struct mlx5_flow_group *egress_qinq_fg; + struct mlx5_flow_group *egress_mac_fg; + struct mlx5_flow_group *egress_miss_fg; + struct mlx5_pkt_reformat *egress_miss_pkt_reformat; + struct mlx5_flow_handle *egress_miss_handle; + unsigned long ageing_time; + u32 flags; + u16 vlan_proto; +}; + +struct mlx5_flow_table *mlx5_esw_bridge_table_create(int max_fte, u32 level, + struct mlx5_eswitch *esw); +unsigned long mlx5_esw_bridge_port_key(struct mlx5_esw_bridge_port *port); + +int mlx5_esw_bridge_port_mcast_init(struct mlx5_esw_bridge_port *port); +void mlx5_esw_bridge_port_mcast_cleanup(struct mlx5_esw_bridge_port *port); +int mlx5_esw_bridge_vlan_mcast_init(u16 vlan_proto, struct mlx5_esw_bridge_port *port, + struct mlx5_esw_bridge_vlan *vlan); +void mlx5_esw_bridge_vlan_mcast_cleanup(struct mlx5_esw_bridge_vlan *vlan); + +int mlx5_esw_bridge_mcast_enable(struct mlx5_esw_bridge *bridge); +void mlx5_esw_bridge_mcast_disable(struct mlx5_esw_bridge *bridge); + +int mlx5_esw_bridge_mdb_init(struct mlx5_esw_bridge *bridge); +void mlx5_esw_bridge_mdb_cleanup(struct mlx5_esw_bridge *bridge); +int mlx5_esw_bridge_port_mdb_attach(struct net_device *dev, struct mlx5_esw_bridge_port *port, + const unsigned char *addr, u16 vid); +void mlx5_esw_bridge_port_mdb_detach(struct net_device *dev, struct mlx5_esw_bridge_port *port, + const unsigned char *addr, u16 vid); +void mlx5_esw_bridge_port_mdb_vlan_flush(struct mlx5_esw_bridge_port *port, + struct mlx5_esw_bridge_vlan *vlan); +void mlx5_esw_bridge_mdb_flush(struct mlx5_esw_bridge *bridge); + #endif /* _MLX5_ESW_BRIDGE_PRIVATE_ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/debugfs.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/debugfs.c deleted file mode 100644 index 3d0bbcca1cb9..000000000000 --- a/drivers/net/ethernet/mellanox/mlx5/core/esw/debugfs.c +++ 
/dev/null @@ -1,198 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB -/* Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */ - -#include <linux/debugfs.h> -#include "eswitch.h" - -enum vnic_diag_counter { - MLX5_VNIC_DIAG_TOTAL_Q_UNDER_PROCESSOR_HANDLE, - MLX5_VNIC_DIAG_SEND_QUEUE_PRIORITY_UPDATE_FLOW, - MLX5_VNIC_DIAG_COMP_EQ_OVERRUN, - MLX5_VNIC_DIAG_ASYNC_EQ_OVERRUN, - MLX5_VNIC_DIAG_CQ_OVERRUN, - MLX5_VNIC_DIAG_INVALID_COMMAND, - MLX5_VNIC_DIAG_QOUTA_EXCEEDED_COMMAND, - MLX5_VNIC_DIAG_RX_STEERING_DISCARD, -}; - -static int mlx5_esw_query_vnic_diag(struct mlx5_vport *vport, enum vnic_diag_counter counter, - u64 *val) -{ - u32 out[MLX5_ST_SZ_DW(query_vnic_env_out)] = {}; - u32 in[MLX5_ST_SZ_DW(query_vnic_env_in)] = {}; - struct mlx5_core_dev *dev = vport->dev; - u16 vport_num = vport->vport; - void *vnic_diag_out; - int err; - - MLX5_SET(query_vnic_env_in, in, opcode, MLX5_CMD_OP_QUERY_VNIC_ENV); - MLX5_SET(query_vnic_env_in, in, vport_number, vport_num); - if (!mlx5_esw_is_manager_vport(dev->priv.eswitch, vport_num)) - MLX5_SET(query_vnic_env_in, in, other_vport, 1); - - err = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out)); - if (err) - return err; - - vnic_diag_out = MLX5_ADDR_OF(query_vnic_env_out, out, vport_env); - switch (counter) { - case MLX5_VNIC_DIAG_TOTAL_Q_UNDER_PROCESSOR_HANDLE: - *val = MLX5_GET(vnic_diagnostic_statistics, vnic_diag_out, total_error_queues); - break; - case MLX5_VNIC_DIAG_SEND_QUEUE_PRIORITY_UPDATE_FLOW: - *val = MLX5_GET(vnic_diagnostic_statistics, vnic_diag_out, - send_queue_priority_update_flow); - break; - case MLX5_VNIC_DIAG_COMP_EQ_OVERRUN: - *val = MLX5_GET(vnic_diagnostic_statistics, vnic_diag_out, comp_eq_overrun); - break; - case MLX5_VNIC_DIAG_ASYNC_EQ_OVERRUN: - *val = MLX5_GET(vnic_diagnostic_statistics, vnic_diag_out, async_eq_overrun); - break; - case MLX5_VNIC_DIAG_CQ_OVERRUN: - *val = MLX5_GET(vnic_diagnostic_statistics, vnic_diag_out, cq_overrun); - break; - case 
MLX5_VNIC_DIAG_INVALID_COMMAND: - *val = MLX5_GET(vnic_diagnostic_statistics, vnic_diag_out, invalid_command); - break; - case MLX5_VNIC_DIAG_QOUTA_EXCEEDED_COMMAND: - *val = MLX5_GET(vnic_diagnostic_statistics, vnic_diag_out, quota_exceeded_command); - break; - case MLX5_VNIC_DIAG_RX_STEERING_DISCARD: - *val = MLX5_GET64(vnic_diagnostic_statistics, vnic_diag_out, - nic_receive_steering_discard); - break; - } - - return 0; -} - -static int __show_vnic_diag(struct seq_file *file, struct mlx5_vport *vport, - enum vnic_diag_counter type) -{ - u64 val = 0; - int ret; - - ret = mlx5_esw_query_vnic_diag(vport, type, &val); - if (ret) - return ret; - - seq_printf(file, "%llu\n", val); - return 0; -} - -static int total_q_under_processor_handle_show(struct seq_file *file, void *priv) -{ - return __show_vnic_diag(file, file->private, MLX5_VNIC_DIAG_TOTAL_Q_UNDER_PROCESSOR_HANDLE); -} - -static int send_queue_priority_update_flow_show(struct seq_file *file, void *priv) -{ - return __show_vnic_diag(file, file->private, - MLX5_VNIC_DIAG_SEND_QUEUE_PRIORITY_UPDATE_FLOW); -} - -static int comp_eq_overrun_show(struct seq_file *file, void *priv) -{ - return __show_vnic_diag(file, file->private, MLX5_VNIC_DIAG_COMP_EQ_OVERRUN); -} - -static int async_eq_overrun_show(struct seq_file *file, void *priv) -{ - return __show_vnic_diag(file, file->private, MLX5_VNIC_DIAG_ASYNC_EQ_OVERRUN); -} - -static int cq_overrun_show(struct seq_file *file, void *priv) -{ - return __show_vnic_diag(file, file->private, MLX5_VNIC_DIAG_CQ_OVERRUN); -} - -static int invalid_command_show(struct seq_file *file, void *priv) -{ - return __show_vnic_diag(file, file->private, MLX5_VNIC_DIAG_INVALID_COMMAND); -} - -static int quota_exceeded_command_show(struct seq_file *file, void *priv) -{ - return __show_vnic_diag(file, file->private, MLX5_VNIC_DIAG_QOUTA_EXCEEDED_COMMAND); -} - -static int rx_steering_discard_show(struct seq_file *file, void *priv) -{ - return __show_vnic_diag(file, file->private, 
MLX5_VNIC_DIAG_RX_STEERING_DISCARD); -} - -DEFINE_SHOW_ATTRIBUTE(total_q_under_processor_handle); -DEFINE_SHOW_ATTRIBUTE(send_queue_priority_update_flow); -DEFINE_SHOW_ATTRIBUTE(comp_eq_overrun); -DEFINE_SHOW_ATTRIBUTE(async_eq_overrun); -DEFINE_SHOW_ATTRIBUTE(cq_overrun); -DEFINE_SHOW_ATTRIBUTE(invalid_command); -DEFINE_SHOW_ATTRIBUTE(quota_exceeded_command); -DEFINE_SHOW_ATTRIBUTE(rx_steering_discard); - -void mlx5_esw_vport_debugfs_destroy(struct mlx5_eswitch *esw, u16 vport_num) -{ - struct mlx5_vport *vport = mlx5_eswitch_get_vport(esw, vport_num); - - debugfs_remove_recursive(vport->dbgfs); - vport->dbgfs = NULL; -} - -/* vnic diag dir name is "pf", "ecpf" or "{vf/sf}_xxxx" */ -#define VNIC_DIAG_DIR_NAME_MAX_LEN 8 - -void mlx5_esw_vport_debugfs_create(struct mlx5_eswitch *esw, u16 vport_num, bool is_sf, u16 sf_num) -{ - struct mlx5_vport *vport = mlx5_eswitch_get_vport(esw, vport_num); - struct dentry *vnic_diag; - char dir_name[VNIC_DIAG_DIR_NAME_MAX_LEN]; - int err; - - if (!MLX5_CAP_GEN(esw->dev, vport_group_manager)) - return; - - if (vport_num == MLX5_VPORT_PF) { - strcpy(dir_name, "pf"); - } else if (vport_num == MLX5_VPORT_ECPF) { - strcpy(dir_name, "ecpf"); - } else { - err = snprintf(dir_name, VNIC_DIAG_DIR_NAME_MAX_LEN, "%s_%d", is_sf ? "sf" : "vf", - is_sf ? 
sf_num : vport_num - MLX5_VPORT_FIRST_VF); - if (WARN_ON(err < 0)) - return; - } - - vport->dbgfs = debugfs_create_dir(dir_name, esw->dbgfs); - vnic_diag = debugfs_create_dir("vnic_diag", vport->dbgfs); - - if (MLX5_CAP_GEN(esw->dev, vnic_env_queue_counters)) { - debugfs_create_file("total_q_under_processor_handle", 0444, vnic_diag, vport, - &total_q_under_processor_handle_fops); - debugfs_create_file("send_queue_priority_update_flow", 0444, vnic_diag, vport, - &send_queue_priority_update_flow_fops); - } - - if (MLX5_CAP_GEN(esw->dev, eq_overrun_count)) { - debugfs_create_file("comp_eq_overrun", 0444, vnic_diag, vport, - &comp_eq_overrun_fops); - debugfs_create_file("async_eq_overrun", 0444, vnic_diag, vport, - &async_eq_overrun_fops); - } - - if (MLX5_CAP_GEN(esw->dev, vnic_env_cq_overrun)) - debugfs_create_file("cq_overrun", 0444, vnic_diag, vport, &cq_overrun_fops); - - if (MLX5_CAP_GEN(esw->dev, invalid_command_count)) - debugfs_create_file("invalid_command", 0444, vnic_diag, vport, - &invalid_command_fops); - - if (MLX5_CAP_GEN(esw->dev, quota_exceeded_count)) - debugfs_create_file("quota_exceeded_command", 0444, vnic_diag, vport, - &quota_exceeded_command_fops); - - if (MLX5_CAP_GEN(esw->dev, nic_receive_steering_discard)) - debugfs_create_file("rx_steering_discard", 0444, vnic_diag, vport, - &rx_steering_discard_fops); - -} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h b/drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h index 51ac24e6ec3c..1808da214094 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/diag/bridge_tracepoint.h @@ -110,6 +110,41 @@ DEFINE_EVENT(mlx5_esw_bridge_port_template, TP_ARGS(port) ); +DECLARE_EVENT_CLASS(mlx5_esw_bridge_mdb_port_change_template, + TP_PROTO(const struct net_device *dev, + const struct mlx5_esw_bridge_mdb_entry *mdb), + TP_ARGS(dev, mdb), + TP_STRUCT__entry( + __array(char, dev_name, 
IFNAMSIZ) + __array(unsigned char, addr, ETH_ALEN) + __field(u16, vid) + __field(int, num_ports) + __field(bool, offloaded)), + TP_fast_assign( + strscpy(__entry->dev_name, netdev_name(dev), IFNAMSIZ); + memcpy(__entry->addr, mdb->key.addr, ETH_ALEN); + __entry->vid = mdb->key.vid; + __entry->num_ports = mdb->num_ports; + __entry->offloaded = mdb->egress_handle;), + TP_printk("net_device=%s addr=%pM vid=%u num_ports=%d offloaded=%d", + __entry->dev_name, + __entry->addr, + __entry->vid, + __entry->num_ports, + __entry->offloaded)); + +DEFINE_EVENT(mlx5_esw_bridge_mdb_port_change_template, + mlx5_esw_bridge_port_mdb_attach, + TP_PROTO(const struct net_device *dev, + const struct mlx5_esw_bridge_mdb_entry *mdb), + TP_ARGS(dev, mdb)); + +DEFINE_EVENT(mlx5_esw_bridge_mdb_port_change_template, + mlx5_esw_bridge_port_mdb_detach, + TP_PROTO(const struct net_device *dev, + const struct mlx5_esw_bridge_mdb_entry *mdb), + TP_ARGS(dev, mdb)); + #endif /* This part must be outside protection */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c index 75015d370922..7c79476cc5f9 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/qos.c @@ -744,7 +744,7 @@ static int esw_qos_devlink_rate_to_mbps(struct mlx5_core_dev *mdev, const char * u64 value; int err; - err = mlx5e_port_max_linkspeed(mdev, &link_speed_max); + err = mlx5_port_max_linkspeed(mdev, &link_speed_max); if (err) { NL_SET_ERR_MSG_MOD(extack, "Failed to get link maximum speed"); return err; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/vporttbl.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/vporttbl.c index 9e72118f2e4c..749c3957a128 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/esw/vporttbl.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/vporttbl.c @@ -11,7 +11,7 @@ struct mlx5_vport_key { u16 prio; u16 vport; u16 vhca_id; - const struct esw_vport_tbl_namespace *vport_ns; 
+ struct esw_vport_tbl_namespace *vport_ns; } __packed; struct mlx5_vport_table { @@ -21,6 +21,14 @@ struct mlx5_vport_table { struct mlx5_vport_key key; }; +static void +esw_vport_tbl_init(struct mlx5_eswitch *esw, struct esw_vport_tbl_namespace *ns) +{ + if (esw->offloads.encap != DEVLINK_ESWITCH_ENCAP_MODE_NONE) + ns->flags |= (MLX5_FLOW_TABLE_TUNNEL_EN_REFORMAT | + MLX5_FLOW_TABLE_TUNNEL_EN_DECAP); +} + static struct mlx5_flow_table * esw_vport_tbl_create(struct mlx5_eswitch *esw, struct mlx5_flow_namespace *ns, const struct esw_vport_tbl_namespace *vport_ns) @@ -80,6 +88,7 @@ mlx5_esw_vporttbl_get(struct mlx5_eswitch *esw, struct mlx5_vport_tbl_attr *attr u32 hkey; mutex_lock(&esw->fdb_table.offloads.vports.lock); + esw_vport_tbl_init(esw, attr->vport_ns); hkey = flow_attr_to_vport_key(esw, attr, &skey); e = esw_vport_tbl_lookup(esw, &skey, hkey); if (e) { @@ -127,6 +136,7 @@ mlx5_esw_vporttbl_put(struct mlx5_eswitch *esw, struct mlx5_vport_tbl_attr *attr u32 hkey; mutex_lock(&esw->fdb_table.offloads.vports.lock); + esw_vport_tbl_init(esw, attr->vport_ns); hkey = flow_attr_to_vport_key(esw, attr, &key); e = esw_vport_tbl_lookup(esw, &key, hkey); if (!e || --e->num_rules) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c index 19fed514fc17..901c53751b0a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c @@ -36,7 +36,6 @@ #include <linux/mlx5/vport.h> #include <linux/mlx5/fs.h> #include <linux/mlx5/mpfs.h> -#include <linux/debugfs.h> #include "esw/acl/lgcy.h" #include "esw/legacy.h" #include "esw/qos.h" @@ -1056,7 +1055,6 @@ int mlx5_eswitch_load_vport(struct mlx5_eswitch *esw, u16 vport_num, if (err) return err; - mlx5_esw_vport_debugfs_create(esw, vport_num, false, 0); err = esw_offloads_load_rep(esw, vport_num); if (err) goto err_rep; @@ -1064,7 +1062,6 @@ int mlx5_eswitch_load_vport(struct mlx5_eswitch *esw, u16 vport_num, 
return err; err_rep: - mlx5_esw_vport_debugfs_destroy(esw, vport_num); mlx5_esw_vport_disable(esw, vport_num); return err; } @@ -1072,7 +1069,6 @@ err_rep: void mlx5_eswitch_unload_vport(struct mlx5_eswitch *esw, u16 vport_num) { esw_offloads_unload_rep(esw, vport_num); - mlx5_esw_vport_debugfs_destroy(esw, vport_num); mlx5_esw_vport_disable(esw, vport_num); } @@ -1510,7 +1506,7 @@ out_free: return err; } -static int mlx5_esw_vport_alloc(struct mlx5_eswitch *esw, struct mlx5_core_dev *dev, +static int mlx5_esw_vport_alloc(struct mlx5_eswitch *esw, int index, u16 vport_num) { struct mlx5_vport *vport; @@ -1564,7 +1560,7 @@ static int mlx5_esw_vports_init(struct mlx5_eswitch *esw) xa_init(&esw->vports); - err = mlx5_esw_vport_alloc(esw, dev, idx, MLX5_VPORT_PF); + err = mlx5_esw_vport_alloc(esw, idx, MLX5_VPORT_PF); if (err) goto err; if (esw->first_host_vport == MLX5_VPORT_PF) @@ -1572,7 +1568,7 @@ static int mlx5_esw_vports_init(struct mlx5_eswitch *esw) idx++; for (i = 0; i < mlx5_core_max_vfs(dev); i++) { - err = mlx5_esw_vport_alloc(esw, dev, idx, idx); + err = mlx5_esw_vport_alloc(esw, idx, idx); if (err) goto err; xa_set_mark(&esw->vports, idx, MLX5_ESW_VPT_VF); @@ -1581,7 +1577,7 @@ static int mlx5_esw_vports_init(struct mlx5_eswitch *esw) } base_sf_num = mlx5_sf_start_function_id(dev); for (i = 0; i < mlx5_sf_max_functions(dev); i++) { - err = mlx5_esw_vport_alloc(esw, dev, idx, base_sf_num + i); + err = mlx5_esw_vport_alloc(esw, idx, base_sf_num + i); if (err) goto err; xa_set_mark(&esw->vports, base_sf_num + i, MLX5_ESW_VPT_SF); @@ -1592,7 +1588,7 @@ static int mlx5_esw_vports_init(struct mlx5_eswitch *esw) if (err) goto err; for (i = 0; i < max_host_pf_sfs; i++) { - err = mlx5_esw_vport_alloc(esw, dev, idx, base_sf_num + i); + err = mlx5_esw_vport_alloc(esw, idx, base_sf_num + i); if (err) goto err; xa_set_mark(&esw->vports, base_sf_num + i, MLX5_ESW_VPT_SF); @@ -1600,12 +1596,12 @@ static int mlx5_esw_vports_init(struct mlx5_eswitch *esw) } if 
(mlx5_ecpf_vport_exists(dev)) { - err = mlx5_esw_vport_alloc(esw, dev, idx, MLX5_VPORT_ECPF); + err = mlx5_esw_vport_alloc(esw, idx, MLX5_VPORT_ECPF); if (err) goto err; idx++; } - err = mlx5_esw_vport_alloc(esw, dev, idx, MLX5_VPORT_UPLINK); + err = mlx5_esw_vport_alloc(esw, idx, MLX5_VPORT_UPLINK); if (err) goto err; return 0; @@ -1672,7 +1668,6 @@ int mlx5_eswitch_init(struct mlx5_core_dev *dev) dev->priv.eswitch = esw; BLOCKING_INIT_NOTIFIER_HEAD(&esw->n_head); - esw->dbgfs = debugfs_create_dir("esw", mlx5_debugfs_get_dev_root(esw->dev)); esw_info(dev, "Total vports %d, per vport: max uc(%d) max mc(%d)\n", esw->total_vports, @@ -1696,7 +1691,6 @@ void mlx5_eswitch_cleanup(struct mlx5_eswitch *esw) esw_info(esw->dev, "cleanup\n"); - debugfs_remove_recursive(esw->dbgfs); esw->dev->priv.eswitch = NULL; destroy_workqueue(esw->work_queue); WARN_ON(refcount_read(&esw->qos.refcnt)); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h index 19e9a77c4633..1a042c981713 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h @@ -195,7 +195,6 @@ struct mlx5_vport { enum mlx5_eswitch_vport_event enabled_events; int index; struct devlink_port *dl_port; - struct dentry *dbgfs; }; struct mlx5_esw_indir_table; @@ -263,6 +262,7 @@ struct mlx5_esw_offload { const struct mlx5_eswitch_rep_ops *rep_ops[NUM_REP_TYPES]; u8 inline_mode; atomic64_t num_flows; + u64 num_block_encap; enum devlink_eswitch_encap_mode encap; struct ida vport_metadata_ida; unsigned int host_number; /* ECPF supports one external host */ @@ -342,7 +342,6 @@ struct mlx5_eswitch { u32 large_group_num; } params; struct blocking_notifier_head n_head; - struct dentry *dbgfs; }; void esw_offloads_disable(struct mlx5_eswitch *esw); @@ -355,7 +354,6 @@ mlx5_eswitch_add_send_to_vport_meta_rule(struct mlx5_eswitch *esw, u16 vport_num void mlx5_eswitch_del_send_to_vport_meta_rule(struct 
mlx5_flow_handle *rule); bool mlx5_esw_vport_match_metadata_supported(const struct mlx5_eswitch *esw); -int mlx5_esw_offloads_vport_metadata_set(struct mlx5_eswitch *esw, bool enable); u32 mlx5_esw_match_metadata_alloc(struct mlx5_eswitch *esw); void mlx5_esw_match_metadata_free(struct mlx5_eswitch *esw, u32 metadata); @@ -674,7 +672,7 @@ struct mlx5_vport_tbl_attr { u32 chain; u16 prio; u16 vport; - const struct esw_vport_tbl_namespace *vport_ns; + struct esw_vport_tbl_namespace *vport_ns; }; struct mlx5_flow_table * @@ -703,9 +701,6 @@ int mlx5_esw_offloads_devlink_port_register(struct mlx5_eswitch *esw, u16 vport_ void mlx5_esw_offloads_devlink_port_unregister(struct mlx5_eswitch *esw, u16 vport_num); struct devlink_port *mlx5_esw_offloads_devlink_port(struct mlx5_eswitch *esw, u16 vport_num); -void mlx5_esw_vport_debugfs_create(struct mlx5_eswitch *esw, u16 vport_num, bool is_sf, u16 sf_num); -void mlx5_esw_vport_debugfs_destroy(struct mlx5_eswitch *esw, u16 vport_num); - int mlx5_esw_devlink_sf_port_register(struct mlx5_eswitch *esw, struct devlink_port *dl_port, u16 vport_num, u32 controller, u32 sfnum); void mlx5_esw_devlink_sf_port_unregister(struct mlx5_eswitch *esw, u16 vport_num); @@ -748,6 +743,9 @@ void mlx5_eswitch_offloads_destroy_single_fdb(struct mlx5_eswitch *master_esw, struct mlx5_eswitch *slave_esw); int mlx5_eswitch_reload_reps(struct mlx5_eswitch *esw); +bool mlx5_eswitch_block_encap(struct mlx5_core_dev *dev); +void mlx5_eswitch_unblock_encap(struct mlx5_core_dev *dev); + static inline int mlx5_eswitch_num_vfs(struct mlx5_eswitch *esw) { if (mlx5_esw_allowed(esw)) @@ -761,6 +759,7 @@ mlx5_eswitch_get_slow_fdb(struct mlx5_eswitch *esw) { return esw->fdb_table.offloads.slow_fdb; } + #else /* CONFIG_MLX5_ESWITCH */ /* eswitch API stubs */ static inline int mlx5_eswitch_init(struct mlx5_core_dev *dev) { return 0; } @@ -805,6 +804,15 @@ mlx5_eswitch_reload_reps(struct mlx5_eswitch *esw) { return 0; } + +static inline bool 
mlx5_eswitch_block_encap(struct mlx5_core_dev *dev)
+{
+	return true;
+}
+
+static inline void mlx5_eswitch_unblock_encap(struct mlx5_core_dev *dev)
+{
+}
 #endif /* CONFIG_MLX5_ESWITCH */
 #endif /* __MLX5_ESWITCH_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 25a8076a77bf..69215ffb9999 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -73,7 +73,7 @@
 #define MLX5_ESW_FT_OFFLOADS_DROP_RULE (1)
-static const struct esw_vport_tbl_namespace mlx5_esw_vport_tbl_mirror_ns = {
+static struct esw_vport_tbl_namespace mlx5_esw_vport_tbl_mirror_ns = {
 	.max_fte = MLX5_ESW_VPORT_TBL_SIZE,
 	.max_num_groups = MLX5_ESW_VPORT_TBL_NUM_GROUPS,
 	.flags = 0,
@@ -760,7 +760,6 @@ mlx5_eswitch_add_fwd_rule(struct mlx5_eswitch *esw,
 	kfree(dest);
 	return rule;
 err_chain_src_rewrite:
-	esw_put_dest_tables_loop(esw, attr, 0, i);
 	mlx5_esw_vporttbl_put(esw, &fwd_attr);
 err_get_fwd:
 	mlx5_chains_put_table(chains, attr->chain, attr->prio, 0);
@@ -803,7 +802,6 @@ __mlx5_eswitch_del_rule(struct mlx5_eswitch *esw,
 	if (fwd_rule) {
 		mlx5_esw_vporttbl_put(esw, &fwd_attr);
 		mlx5_chains_put_table(chains, attr->chain, attr->prio, 0);
-		esw_put_dest_tables_loop(esw, attr, 0, esw_attr->split_count);
 	} else {
 		if (split)
 			mlx5_esw_vporttbl_put(esw, &fwd_attr);
@@ -1374,14 +1372,11 @@ esw_chains_create(struct mlx5_eswitch *esw, struct mlx5_flow_table *miss_fdb)
 	struct mlx5_flow_table *nf_ft, *ft;
 	struct mlx5_chains_attr attr = {};
 	struct mlx5_fs_chains *chains;
-	u32 fdb_max;
 	int err;
-	fdb_max = 1 << MLX5_CAP_ESW_FLOWTABLE_FDB(dev, log_max_ft_size);
-
 	esw_init_chains_offload_flags(esw, &attr.flags);
 	attr.ns = MLX5_FLOW_NAMESPACE_FDB;
-	attr.max_ft_sz = fdb_max;
+	attr.fs_base_prio = FDB_TC_OFFLOAD;
 	attr.max_grp_num = esw->params.large_group_num;
 	attr.default_ft = miss_fdb;
 	attr.mapping = esw->offloads.reg_c0_obj_pool;
@@ -1392,6 +1387,7 @@ esw_chains_create(struct mlx5_eswitch *esw, struct mlx5_flow_table *miss_fdb)
 		esw_warn(dev, "Failed to create fdb chains err(%d)\n", err);
 		return err;
 	}
+	mlx5_chains_print_info(chains);
 	esw->fdb_table.offloads.esw_chains_priv = chains;
@@ -2941,28 +2937,6 @@ metadata_err:
 	return err;
 }
-int mlx5_esw_offloads_vport_metadata_set(struct mlx5_eswitch *esw, bool enable)
-{
-	int err = 0;
-
-	down_write(&esw->mode_lock);
-	if (mlx5_esw_is_fdb_created(esw)) {
-		err = -EBUSY;
-		goto done;
-	}
-	if (!mlx5_esw_vport_match_metadata_supported(esw)) {
-		err = -EOPNOTSUPP;
-		goto done;
-	}
-	if (enable)
-		esw->flags |= MLX5_ESWITCH_VPORT_MATCH_METADATA;
-	else
-		esw->flags &= ~MLX5_ESWITCH_VPORT_MATCH_METADATA;
-done:
-	up_write(&esw->mode_lock);
-	return err;
-}
-
 int esw_vport_create_offloads_acl_tables(struct mlx5_eswitch *esw,
 					  struct mlx5_vport *vport)
@@ -3588,6 +3562,47 @@ int mlx5_devlink_eswitch_inline_mode_get(struct devlink *devlink, u8 *mode)
 	return err;
 }
+bool mlx5_eswitch_block_encap(struct mlx5_core_dev *dev)
+{
+	struct devlink *devlink = priv_to_devlink(dev);
+	struct mlx5_eswitch *esw;
+
+	devl_lock(devlink);
+	esw = mlx5_devlink_eswitch_get(devlink);
+	if (IS_ERR(esw)) {
+		devl_unlock(devlink);
+		/* Failure means no eswitch => not possible to change encap */
+		return true;
+	}
+
+	down_write(&esw->mode_lock);
+	if (esw->mode != MLX5_ESWITCH_LEGACY &&
+	    esw->offloads.encap != DEVLINK_ESWITCH_ENCAP_MODE_NONE) {
+		up_write(&esw->mode_lock);
+		devl_unlock(devlink);
+		return false;
+	}
+
+	esw->offloads.num_block_encap++;
+	up_write(&esw->mode_lock);
+	devl_unlock(devlink);
+	return true;
+}
+
+void mlx5_eswitch_unblock_encap(struct mlx5_core_dev *dev)
+{
+	struct devlink *devlink = priv_to_devlink(dev);
+	struct mlx5_eswitch *esw;
+
+	esw = mlx5_devlink_eswitch_get(devlink);
+	if (IS_ERR(esw))
+		return;
+
+	down_write(&esw->mode_lock);
+	esw->offloads.num_block_encap--;
+	up_write(&esw->mode_lock);
+}
+
 int mlx5_devlink_eswitch_encap_mode_set(struct devlink *devlink,
 					enum devlink_eswitch_encap_mode encap,
 					struct netlink_ext_ack *extack)
@@ -3629,6 +3644,13 @@ int mlx5_devlink_eswitch_encap_mode_set(struct devlink *devlink,
 		goto unlock;
 	}
+	if (esw->offloads.num_block_encap) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Can't set encapsulation when IPsec SA and/or policies are configured");
+		err = -EOPNOTSUPP;
+		goto unlock;
+	}
+
 	esw_destroy_offloads_fdb_tables(esw);
 	esw->offloads.encap = encap;
@@ -3782,14 +3804,12 @@ int mlx5_esw_offloads_sf_vport_enable(struct mlx5_eswitch *esw, struct devlink_p
 	if (err)
 		goto devlink_err;
-	mlx5_esw_vport_debugfs_create(esw, vport_num, true, sfnum);
 	err = mlx5_esw_offloads_rep_load(esw, vport_num);
 	if (err)
 		goto rep_err;
 	return 0;
 rep_err:
-	mlx5_esw_vport_debugfs_destroy(esw, vport_num);
 	mlx5_esw_devlink_sf_port_unregister(esw, vport_num);
 devlink_err:
 	mlx5_esw_vport_disable(esw, vport_num);
@@ -3799,7 +3819,6 @@ devlink_err:
 void mlx5_esw_offloads_sf_vport_disable(struct mlx5_eswitch *esw, u16 vport_num)
 {
 	mlx5_esw_offloads_rep_unload(esw, vport_num);
-	mlx5_esw_vport_debugfs_destroy(esw, vport_num);
 	mlx5_esw_devlink_sf_port_unregister(esw, vport_num);
 	mlx5_esw_vport_disable(esw, vport_num);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c
index 3a9a6bb9158d..edd910258314 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.c
@@ -210,18 +210,6 @@ static bool mlx5_eswitch_offload_is_uplink_port(const struct mlx5_eswitch *esw,
 	return (port_mask & port_value) == MLX5_VPORT_UPLINK;
 }
-static bool
-mlx5_eswitch_is_push_vlan_no_cap(struct mlx5_eswitch *esw,
-				 struct mlx5_flow_act *flow_act)
-{
-	if (flow_act->action & MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH &&
-	    !(mlx5_fs_get_capabilities(esw->dev, MLX5_FLOW_NAMESPACE_FDB) &
-	      MLX5_FLOW_STEERING_CAP_VLAN_PUSH_ON_RX))
-		return true;
-
-	return false;
-}
-
 bool
 mlx5_eswitch_termtbl_required(struct mlx5_eswitch *esw,
 			      struct mlx5_flow_attr *attr,
@@ -237,7 +225,10 @@ mlx5_eswitch_termtbl_required(struct mlx5_eswitch *esw,
 	    (!mlx5_eswitch_offload_is_uplink_port(esw, spec) && !esw_attr->int_port))
 		return false;
-	if (mlx5_eswitch_is_push_vlan_no_cap(esw, flow_act))
+	/* push vlan on RX */
+	if (flow_act->action & MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH &&
+	    !(mlx5_fs_get_capabilities(esw->dev, MLX5_FLOW_NAMESPACE_FDB) &
+	      MLX5_FLOW_STEERING_CAP_VLAN_PUSH_ON_RX))
 		return true;
 	/* hairpin */
@@ -261,31 +252,19 @@ mlx5_eswitch_add_termtbl_rule(struct mlx5_eswitch *esw,
 	struct mlx5_flow_act term_tbl_act = {};
 	struct mlx5_flow_handle *rule = NULL;
 	bool term_table_created = false;
-	bool is_push_vlan_on_rx;
 	int num_vport_dests = 0;
 	int i, curr_dest;
-	is_push_vlan_on_rx = mlx5_eswitch_is_push_vlan_no_cap(esw, flow_act);
 	mlx5_eswitch_termtbl_actions_move(flow_act, &term_tbl_act);
 	term_tbl_act.action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
 	for (i = 0; i < num_dest; i++) {
 		struct mlx5_termtbl_handle *tt;
-		bool hairpin = false;
 		/* only vport destinations can be terminated */
 		if (dest[i].type != MLX5_FLOW_DESTINATION_TYPE_VPORT)
 			continue;
-		if (attr->dests[num_vport_dests].rep &&
-		    attr->dests[num_vport_dests].rep->vport == MLX5_VPORT_UPLINK)
-			hairpin = true;
-
-		if (!is_push_vlan_on_rx && !hairpin) {
-			num_vport_dests++;
-			continue;
-		}
-
 		if (attr->dests[num_vport_dests].flags & MLX5_ESW_DEST_ENCAP) {
 			term_tbl_act.action |= MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT;
 			term_tbl_act.pkt_reformat = attr->dests[num_vport_dests].pkt_reformat;
@@ -333,9 +312,6 @@ revert_changes:
 	for (curr_dest = 0; curr_dest < num_vport_dests; curr_dest++) {
 		struct mlx5_termtbl_handle *tt = attr->dests[curr_dest].termtbl;
-		if (!tt)
-			continue;
-
 		attr->dests[curr_dest].termtbl = NULL;
 		/* search for the destination associated with the
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 731acbe22dc7..19da02c41616 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -137,7 +137,7 @@
 #define LAG_MIN_LEVEL (OFFLOADS_MIN_LEVEL + KERNEL_RX_MACSEC_MIN_LEVEL + 1)
 #define KERNEL_TX_IPSEC_NUM_PRIOS 1
-#define KERNEL_TX_IPSEC_NUM_LEVELS 2
+#define KERNEL_TX_IPSEC_NUM_LEVELS 3
 #define KERNEL_TX_IPSEC_MIN_LEVEL (KERNEL_TX_IPSEC_NUM_LEVELS)
 #define KERNEL_TX_MACSEC_NUM_PRIOS 1
@@ -1762,7 +1762,8 @@ static bool dest_is_valid(struct mlx5_flow_destination *dest,
 	if (ignore_level) {
 		if (ft->type != FS_FT_FDB &&
-		    ft->type != FS_FT_NIC_RX)
+		    ft->type != FS_FT_NIC_RX &&
+		    ft->type != FS_FT_NIC_TX)
 			return false;
 		if (dest->type == MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE &&
@@ -2996,7 +2997,7 @@ static int init_fdb_root_ns(struct mlx5_flow_steering *steering)
 		goto out_err;
 	}
-	maj_prio = fs_create_prio(&steering->fdb_root_ns->ns, FDB_BR_OFFLOAD, 3);
+	maj_prio = fs_create_prio(&steering->fdb_root_ns->ns, FDB_BR_OFFLOAD, 4);
 	if (IS_ERR(maj_prio)) {
 		err = PTR_ERR(maj_prio);
 		goto out_err;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
index 4c2dad9d7cfb..50022e7565f1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
@@ -167,7 +167,7 @@ static void mlx5_fw_reset_complete_reload(struct mlx5_core_dev *dev)
 		if (mlx5_health_wait_pci_up(dev))
 			mlx5_core_err(dev, "reset reload flow aborted, PCI reads still not working\n");
 		else
-			mlx5_load_one(dev);
+			mlx5_load_one(dev, true);
 		devlink_remote_reload_actions_performed(priv_to_devlink(dev), 0,
 							BIT(DEVLINK_RELOAD_ACTION_DRIVER_REINIT) |
 							BIT(DEVLINK_RELOAD_ACTION_FW_ACTIVATE));
@@ -499,7 +499,7 @@ int mlx5_fw_reset_wait_reset_done(struct mlx5_core_dev *dev)
 	err = fw_reset->ret;
 	if (test_and_clear_bit(MLX5_FW_RESET_FLAGS_RELOAD_REQUIRED, &fw_reset->reset_flags)) {
 		mlx5_unload_one_devl_locked(dev, false);
-		mlx5_load_one_devl_locked(dev, false);
+		mlx5_load_one_devl_locked(dev, true);
 	}
 out:
 	clear_bit(MLX5_FW_RESET_FLAGS_PENDING_COMP, &fw_reset->reset_flags);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index f9438d4e43ca..871c32dda66e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -42,6 +42,7 @@
 #include "lib/pci_vsc.h"
 #include "lib/tout.h"
 #include "diag/fw_tracer.h"
+#include "diag/reporter_vnic.h"
 enum {
 	MAX_MISSES = 3,
@@ -325,6 +326,10 @@ int mlx5_health_wait_pci_up(struct mlx5_core_dev *dev)
 	while (sensor_pci_not_working(dev)) {
 		if (time_after(jiffies, end))
 			return -ETIMEDOUT;
+		if (test_bit(MLX5_BREAK_FW_WAIT, &dev->intf_state)) {
+			mlx5_core_warn(dev, "device is being removed, stop waiting for PCI\n");
+			return -ENODEV;
+		}
 		msleep(100);
 	}
 	return 0;
@@ -894,6 +899,7 @@ void mlx5_health_cleanup(struct mlx5_core_dev *dev)
 	cancel_delayed_work_sync(&health->update_fw_log_ts_work);
 	destroy_workqueue(health->wq);
+	mlx5_reporter_vnic_destroy(dev);
 	mlx5_fw_reporters_destroy(dev);
 }
@@ -903,6 +909,7 @@ int mlx5_health_init(struct mlx5_core_dev *dev)
 	char *name;
 	mlx5_fw_reporters_create(dev);
+	mlx5_reporter_vnic_create(dev);
 	health = &dev->priv.health;
 	name = kmalloc(64, GFP_KERNEL);
@@ -922,6 +929,7 @@ int mlx5_health_init(struct mlx5_core_dev *dev)
 	return 0;
 out_err:
+	mlx5_reporter_vnic_destroy(dev);
 	mlx5_fw_reporters_destroy(dev);
 	return -ENOMEM;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c b/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c
index 380a208ab137..fa467335526e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/irq_affinity.c
@@ -45,30 +45,28 @@ static int cpu_get_least_loaded(struct mlx5_irq_pool *pool,
 /* Creating an IRQ from irq_pool */
 static struct mlx5_irq *
-irq_pool_request_irq(struct mlx5_irq_pool *pool, const struct cpumask *req_mask)
+irq_pool_request_irq(struct mlx5_irq_pool *pool, struct irq_affinity_desc *af_desc)
 {
-	cpumask_var_t auto_mask;
-	struct mlx5_irq *irq;
+	struct irq_affinity_desc auto_desc = {};
 	u32 irq_index;
 	int err;
-	if (!zalloc_cpumask_var(&auto_mask, GFP_KERNEL))
-		return ERR_PTR(-ENOMEM);
 	err = xa_alloc(&pool->irqs, &irq_index, NULL, pool->xa_num_irqs, GFP_KERNEL);
 	if (err)
 		return ERR_PTR(err);
 	if (pool->irqs_per_cpu) {
-		if (cpumask_weight(req_mask) > 1)
+		if (cpumask_weight(&af_desc->mask) > 1)
 			/* if req_mask contain more then one CPU, set the least loadad CPU
 			 * of req_mask
 			 */
-			cpumask_set_cpu(cpu_get_least_loaded(pool, req_mask), auto_mask);
+			cpumask_set_cpu(cpu_get_least_loaded(pool, &af_desc->mask),
+					&auto_desc.mask);
 		else
-			cpu_get(pool, cpumask_first(req_mask));
+			cpu_get(pool, cpumask_first(&af_desc->mask));
 	}
-	irq = mlx5_irq_alloc(pool, irq_index, cpumask_empty(auto_mask) ? req_mask : auto_mask);
-	free_cpumask_var(auto_mask);
-	return irq;
+	return mlx5_irq_alloc(pool, irq_index,
+			      cpumask_empty(&auto_desc.mask) ? af_desc : &auto_desc,
+			      NULL);
 }
 /* Looking for the IRQ with the smallest refcount that fits req_mask.
@@ -115,22 +113,22 @@ irq_pool_find_least_loaded(struct mlx5_irq_pool *pool, const struct cpumask *req
 /**
  * mlx5_irq_affinity_request - request an IRQ according to the given mask.
  * @pool: IRQ pool to request from.
- * @req_mask: cpumask requested for this IRQ.
+ * @af_desc: affinity descriptor for this IRQ.
  *
  * This function returns a pointer to IRQ, or ERR_PTR in case of error.
  */
 struct mlx5_irq *
-mlx5_irq_affinity_request(struct mlx5_irq_pool *pool, const struct cpumask *req_mask)
+mlx5_irq_affinity_request(struct mlx5_irq_pool *pool, struct irq_affinity_desc *af_desc)
 {
 	struct mlx5_irq *least_loaded_irq, *new_irq;
 	mutex_lock(&pool->lock);
-	least_loaded_irq = irq_pool_find_least_loaded(pool, req_mask);
+	least_loaded_irq = irq_pool_find_least_loaded(pool, &af_desc->mask);
 	if (least_loaded_irq &&
 	    mlx5_irq_read_locked(least_loaded_irq) < pool->min_threshold)
 		goto out;
 	/* We didn't find an IRQ with less than min_thres, try to allocate a new IRQ */
-	new_irq = irq_pool_request_irq(pool, req_mask);
+	new_irq = irq_pool_request_irq(pool, af_desc);
 	if (IS_ERR(new_irq)) {
 		if (!least_loaded_irq) {
 			/* We failed to create an IRQ and we didn't find an IRQ */
@@ -194,32 +192,30 @@ int mlx5_irq_affinity_irqs_request_auto(struct mlx5_core_dev *dev, int nirqs,
 					struct mlx5_irq **irqs)
 {
 	struct mlx5_irq_pool *pool = mlx5_irq_pool_get(dev);
-	cpumask_var_t req_mask;
+	struct irq_affinity_desc af_desc = {};
 	struct mlx5_irq *irq;
 	int i = 0;
-	if (!zalloc_cpumask_var(&req_mask, GFP_KERNEL))
-		return -ENOMEM;
-	cpumask_copy(req_mask, cpu_online_mask);
+	af_desc.is_managed = 1;
+	cpumask_copy(&af_desc.mask, cpu_online_mask);
 	for (i = 0; i < nirqs; i++) {
 		if (mlx5_irq_pool_is_sf_pool(pool))
-			irq = mlx5_irq_affinity_request(pool, req_mask);
+			irq = mlx5_irq_affinity_request(pool, &af_desc);
 		else
 			/* In case SF pool doesn't exists, fallback to the PF IRQs.
 			 * The PF IRQs are already allocated and binded to CPU
 			 * at this point. Hence, only an index is needed.
 			 */
-			irq = mlx5_irq_request(dev, i, NULL);
+			irq = mlx5_irq_request(dev, i, NULL, NULL);
 		if (IS_ERR(irq))
 			break;
 		irqs[i] = irq;
-		cpumask_clear_cpu(cpumask_first(mlx5_irq_get_affinity_mask(irq)), req_mask);
+		cpumask_clear_cpu(cpumask_first(mlx5_irq_get_affinity_mask(irq)), &af_desc.mask);
 		mlx5_core_dbg(pool->dev, "IRQ %u mapped to cpu %*pbl, %u EQs on this irq\n",
 			      pci_irq_vector(dev->pdev, mlx5_irq_get_index(irq)),
 			      cpumask_pr_args(mlx5_irq_get_affinity_mask(irq)),
 			      mlx5_irq_read_locked(irq) / MLX5_EQ_REFS_PER_IRQ);
 	}
-	free_cpumask_var(req_mask);
 	if (!i)
 		return PTR_ERR(irq);
 	return i;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
index 4c9a40211059..932fbc843c69 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
@@ -39,7 +39,7 @@
 #include "clock.h"
 enum {
-	MLX5_CYCLES_SHIFT = 23
+	MLX5_CYCLES_SHIFT = 31
 };
 enum {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_chains.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_chains.c
index 81ed91fee59b..db9df9798ffa 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_chains.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_chains.c
@@ -14,10 +14,8 @@
 #define chains_lock(chains) ((chains)->lock)
 #define chains_ht(chains) ((chains)->chains_ht)
 #define prios_ht(chains) ((chains)->prios_ht)
-#define tc_default_ft(chains) ((chains)->tc_default_ft)
-#define tc_end_ft(chains) ((chains)->tc_end_ft)
-#define ns_to_chains_fs_prio(ns) ((ns) == MLX5_FLOW_NAMESPACE_FDB ? \
-				  FDB_TC_OFFLOAD : MLX5E_TC_PRIO)
+#define chains_default_ft(chains) ((chains)->chains_default_ft)
+#define chains_end_ft(chains) ((chains)->chains_end_ft)
 #define FT_TBL_SZ (64 * 1024)
 struct mlx5_fs_chains {
@@ -28,13 +26,15 @@ struct mlx5_fs_chains {
 	/* Protects above chains_ht and prios_ht */
 	struct mutex lock;
-	struct mlx5_flow_table *tc_default_ft;
-	struct mlx5_flow_table *tc_end_ft;
+	struct mlx5_flow_table *chains_default_ft;
+	struct mlx5_flow_table *chains_end_ft;
 	struct mapping_ctx *chains_mapping;
 	enum mlx5_flow_namespace_type ns;
 	u32 group_num;
 	u32 flags;
+	int fs_base_prio;
+	int fs_base_level;
 };
 struct fs_chain {
@@ -145,7 +145,7 @@ void
 mlx5_chains_set_end_ft(struct mlx5_fs_chains *chains,
 		       struct mlx5_flow_table *ft)
 {
-	tc_end_ft(chains) = ft;
+	chains_end_ft(chains) = ft;
 }
 static struct mlx5_flow_table *
@@ -164,11 +164,11 @@ mlx5_chains_create_table(struct mlx5_fs_chains *chains,
 	sz = (chain == mlx5_chains_get_nf_ft_chain(chains)) ? FT_TBL_SZ : POOL_NEXT_SIZE;
 	ft_attr.max_fte = sz;
-	/* We use tc_default_ft(chains) as the table's next_ft till
+	/* We use chains_default_ft(chains) as the table's next_ft till
 	 * ignore_flow_level is allowed on FT creation and not just for FTEs.
 	 * Instead caller should add an explicit miss rule if needed.
 	 */
-	ft_attr.next_ft = tc_default_ft(chains);
+	ft_attr.next_ft = chains_default_ft(chains);
 	/* The root table(chain 0, prio 1, level 0) is required to be
 	 * connected to the previous fs_core managed prio.
@@ -177,22 +177,22 @@ mlx5_chains_create_table(struct mlx5_fs_chains *chains,
 	 */
 	if (!mlx5_chains_ignore_flow_level_supported(chains) ||
 	    (chain == 0 && prio == 1 && level == 0)) {
-		ft_attr.level = level;
-		ft_attr.prio = prio - 1;
+		ft_attr.level = chains->fs_base_level;
+		ft_attr.prio = chains->fs_base_prio;
 		ns = (chains->ns == MLX5_FLOW_NAMESPACE_FDB) ?
 			mlx5_get_fdb_sub_ns(chains->dev, chain) :
 			mlx5_get_flow_namespace(chains->dev, chains->ns);
 	} else {
 		ft_attr.flags |= MLX5_FLOW_TABLE_UNMANAGED;
-		ft_attr.prio = ns_to_chains_fs_prio(chains->ns);
+		ft_attr.prio = chains->fs_base_prio;
 		/* Firmware doesn't allow us to create another level 0 table,
-		 * so we create all unmanaged tables as level 1.
+		 * so we create all unmanaged tables as level 1 (base + 1).
 		 *
 		 * To connect them, we use explicit miss rules with
 		 * ignore_flow_level. Caller is responsible to create
 		 * these rules (if needed).
 		 */
-		ft_attr.level = 1;
+		ft_attr.level = chains->fs_base_level + 1;
 		ns = mlx5_get_flow_namespace(chains->dev, chains->ns);
 	}
@@ -220,7 +220,8 @@ create_chain_restore(struct fs_chain *chain)
 	int err;
 	if (chain->chain == mlx5_chains_get_nf_ft_chain(chains) ||
-	    !mlx5_chains_prios_supported(chains))
+	    !mlx5_chains_prios_supported(chains) ||
+	    !chains->chains_mapping)
 		return 0;
 	err = mlx5_chains_get_chain_mapping(chains, chain->chain, &index);
@@ -380,7 +381,7 @@ mlx5_chains_add_miss_rule(struct fs_chain *chain,
 	dest.type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE;
 	dest.ft = next_ft;
-	if (next_ft == tc_end_ft(chains) &&
+	if (chains->chains_mapping && next_ft == chains_end_ft(chains) &&
 	    chain->chain != mlx5_chains_get_nf_ft_chain(chains) &&
 	    mlx5_chains_prios_supported(chains)) {
 		act.modify_hdr = chain->miss_modify_hdr;
@@ -494,8 +495,8 @@ mlx5_chains_create_prio(struct mlx5_fs_chains *chains,
 	/* Default miss for each chain: */
 	next_ft = (chain == mlx5_chains_get_nf_ft_chain(chains)) ?
-		  tc_default_ft(chains) :
-		  tc_end_ft(chains);
+		  chains_default_ft(chains) :
+		  chains_end_ft(chains);
 	list_for_each(pos, &chain_s->prios_list) {
 		struct prio *p = list_entry(pos, struct prio, list);
@@ -681,7 +682,7 @@ err_get_prio:
 struct mlx5_flow_table *
 mlx5_chains_get_tc_end_ft(struct mlx5_fs_chains *chains)
 {
-	return tc_end_ft(chains);
+	return chains_end_ft(chains);
 }
 struct mlx5_flow_table *
@@ -718,48 +719,38 @@ mlx5_chains_destroy_global_table(struct mlx5_fs_chains *chains,
 static struct mlx5_fs_chains *
 mlx5_chains_init(struct mlx5_core_dev *dev, struct mlx5_chains_attr *attr)
 {
-	struct mlx5_fs_chains *chains_priv;
-	u32 max_flow_counter;
+	struct mlx5_fs_chains *chains;
 	int err;
-	chains_priv = kzalloc(sizeof(*chains_priv), GFP_KERNEL);
-	if (!chains_priv)
+	chains = kzalloc(sizeof(*chains), GFP_KERNEL);
+	if (!chains)
 		return ERR_PTR(-ENOMEM);
-	max_flow_counter = (MLX5_CAP_GEN(dev, max_flow_counter_31_16) << 16) |
-			    MLX5_CAP_GEN(dev, max_flow_counter_15_0);
-
-	mlx5_core_dbg(dev,
-		      "Init flow table chains, max counters(%d), groups(%d), max flow table size(%d)\n",
-		      max_flow_counter, attr->max_grp_num, attr->max_ft_sz);
-
-	chains_priv->dev = dev;
-	chains_priv->flags = attr->flags;
-	chains_priv->ns = attr->ns;
-	chains_priv->group_num = attr->max_grp_num;
-	chains_priv->chains_mapping = attr->mapping;
-	tc_default_ft(chains_priv) = tc_end_ft(chains_priv) = attr->default_ft;
+	chains->dev = dev;
+	chains->flags = attr->flags;
+	chains->ns = attr->ns;
+	chains->group_num = attr->max_grp_num;
+	chains->chains_mapping = attr->mapping;
+	chains->fs_base_prio = attr->fs_base_prio;
+	chains->fs_base_level = attr->fs_base_level;
+	chains_default_ft(chains) = chains_end_ft(chains) = attr->default_ft;
-	mlx5_core_info(dev, "Supported tc offload range - chains: %u, prios: %u\n",
-		       mlx5_chains_get_chain_range(chains_priv),
-		       mlx5_chains_get_prio_range(chains_priv));
-
-	err = rhashtable_init(&chains_ht(chains_priv), &chain_params);
+	err = rhashtable_init(&chains_ht(chains), &chain_params);
 	if (err)
 		goto init_chains_ht_err;
-	err = rhashtable_init(&prios_ht(chains_priv), &prio_params);
+	err = rhashtable_init(&prios_ht(chains), &prio_params);
 	if (err)
 		goto init_prios_ht_err;
-	mutex_init(&chains_lock(chains_priv));
+	mutex_init(&chains_lock(chains));
-	return chains_priv;
+	return chains;
 init_prios_ht_err:
-	rhashtable_destroy(&chains_ht(chains_priv));
+	rhashtable_destroy(&chains_ht(chains));
 init_chains_ht_err:
-	kfree(chains_priv);
+	kfree(chains);
 	return ERR_PTR(err);
 }
@@ -808,3 +799,9 @@ mlx5_chains_put_chain_mapping(struct mlx5_fs_chains *chains, u32 chain_mapping)
 	return mapping_remove(ctx, chain_mapping);
 }
+
+void
+mlx5_chains_print_info(struct mlx5_fs_chains *chains)
+{
+	mlx5_core_dbg(chains->dev, "Flow table chains groups(%d)\n", chains->group_num);
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_chains.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_chains.h
index d50bdb226cef..8972fe05723a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_chains.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/fs_chains.h
@@ -17,8 +17,9 @@ enum mlx5_chains_flags {
 struct mlx5_chains_attr {
 	enum mlx5_flow_namespace_type ns;
+	int fs_base_prio;
+	int fs_base_level;
 	u32 flags;
-	u32 max_ft_sz;
 	u32 max_grp_num;
 	struct mlx5_flow_table *default_ft;
 	struct mapping_ctx *mapping;
@@ -68,6 +69,8 @@ void mlx5_chains_destroy(struct mlx5_fs_chains *chains);
 void
 mlx5_chains_set_end_ft(struct mlx5_fs_chains *chains,
 		       struct mlx5_flow_table *ft);
+void
+mlx5_chains_print_info(struct mlx5_fs_chains *chains);
 #else /* CONFIG_MLX5_CLS_ACT */
@@ -89,7 +92,9 @@ static inline struct mlx5_fs_chains *
 mlx5_chains_create(struct mlx5_core_dev *dev, struct mlx5_chains_attr *attr)
 { return NULL; }
 static inline void
-mlx5_chains_destroy(struct mlx5_fs_chains *chains) {};
+mlx5_chains_destroy(struct mlx5_fs_chains *chains) {}
+static inline void
+mlx5_chains_print_info(struct mlx5_fs_chains *chains) {}
 #endif /* CONFIG_MLX5_CLS_ACT */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index f1de152a6113..edc738e86cac 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -46,12 +46,10 @@
 #include <linux/kmod.h>
 #include <linux/mlx5/mlx5_ifc.h>
 #include <linux/mlx5/vport.h>
-#ifdef CONFIG_RFS_ACCEL
-#include <linux/cpu_rmap.h>
-#endif
 #include <linux/version.h>
 #include <net/devlink.h>
 #include "mlx5_core.h"
+#include "thermal.h"
 #include "lib/eq.h"
 #include "fs_core.h"
 #include "lib/mpfs.h"
@@ -102,15 +100,19 @@ enum {
 static struct mlx5_profile profile[] = {
 	[0] = {
 		.mask = 0,
+		.num_cmd_caches = MLX5_NUM_COMMAND_CACHES,
 	},
 	[1] = {
 		.mask = MLX5_PROF_MASK_QP_SIZE,
 		.log_max_qp = 12,
+		.num_cmd_caches = MLX5_NUM_COMMAND_CACHES,
+
 	},
 	[2] = {
 		.mask = MLX5_PROF_MASK_QP_SIZE |
 			MLX5_PROF_MASK_MR_CACHE,
 		.log_max_qp = LOG_MAX_SUPPORTED_QPS,
+		.num_cmd_caches = MLX5_NUM_COMMAND_CACHES,
 		.mr_cache[0] = {
 			.size = 500,
 			.limit = 250
@@ -176,6 +178,11 @@ static struct mlx5_profile profile[] = {
 			.limit = 4
 		},
 	},
+	[3] = {
+		.mask = MLX5_PROF_MASK_QP_SIZE,
+		.log_max_qp = LOG_MAX_SUPPORTED_QPS,
+		.num_cmd_caches = 0,
+	},
 };
 static int wait_fw_init(struct mlx5_core_dev *dev, u32 max_wait_mili,
@@ -191,7 +198,7 @@ static int wait_fw_init(struct mlx5_core_dev *dev, u32 max_wait_mili,
 		if (!(fw_initializing >> 31))
 			break;
 		if (time_after(jiffies, end) ||
-		    test_and_clear_bit(MLX5_BREAK_FW_WAIT, &dev->intf_state)) {
+		    test_bit(MLX5_BREAK_FW_WAIT, &dev->intf_state)) {
 			err = -EBUSY;
 			break;
 		}
@@ -710,7 +717,7 @@ static int handle_hca_cap_port_selection(struct mlx5_core_dev *dev,
 	       MLX5_ST_SZ_BYTES(port_selection_cap));
 	MLX5_SET(port_selection_cap, set_hca_cap, port_select_flow_table_bypass, 1);
-	err = set_caps(dev, set_ctx, MLX5_SET_HCA_CAP_OP_MODE_PORT_SELECTION);
+	err = set_caps(dev, set_ctx, MLX5_SET_HCA_CAP_OP_MOD_PORT_SELECTION);
 	return err;
 }
@@ -917,7 +924,6 @@ static int mlx5_pci_init(struct mlx5_core_dev *dev, struct pci_dev *pdev,
 	return 0;
 err_clr_master:
-	pci_clear_master(dev->pdev);
 	release_bar(dev->pdev);
 err_disable:
 	mlx5_pci_disable_device(dev);
@@ -932,7 +938,6 @@ static void mlx5_pci_close(struct mlx5_core_dev *dev)
 	 */
 	mlx5_drain_health_wq(dev);
 	iounmap(dev->iseg);
-	pci_clear_master(dev->pdev);
 	release_bar(dev->pdev);
 	mlx5_pci_disable_device(dev);
 }
@@ -1402,16 +1407,16 @@ int mlx5_init_one(struct mlx5_core_dev *dev)
 		goto function_teardown;
 	}
+	err = mlx5_devlink_params_register(priv_to_devlink(dev));
+	if (err)
+		goto err_devlink_params_reg;
+
 	err = mlx5_load(dev);
 	if (err)
 		goto err_load;
 	set_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state);
-	err = mlx5_devlink_params_register(priv_to_devlink(dev));
-	if (err)
-		goto err_devlink_params_reg;
-
 	err = mlx5_register_device(dev);
 	if (err)
 		goto err_register;
@@ -1421,11 +1426,11 @@ int mlx5_init_one(struct mlx5_core_dev *dev)
 	return 0;
 err_register:
-	mlx5_devlink_params_unregister(priv_to_devlink(dev));
-err_devlink_params_reg:
 	clear_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state);
 	mlx5_unload(dev);
 err_load:
+	mlx5_devlink_params_unregister(priv_to_devlink(dev));
+err_devlink_params_reg:
 	mlx5_cleanup_once(dev);
 function_teardown:
 	mlx5_function_teardown(dev, true);
@@ -1444,7 +1449,6 @@ void mlx5_uninit_one(struct mlx5_core_dev *dev)
 	mutex_lock(&dev->intf_state_mutex);
 	mlx5_unregister_device(dev);
-	mlx5_devlink_params_unregister(priv_to_devlink(dev));
 	if (!test_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state)) {
 		mlx5_core_warn(dev, "%s: interface is down, NOP\n",
@@ -1455,6 +1459,7 @@ void mlx5_uninit_one(struct mlx5_core_dev *dev)
 	clear_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state);
 	mlx5_unload(dev);
+	mlx5_devlink_params_unregister(priv_to_devlink(dev));
 	mlx5_cleanup_once(dev);
 	mlx5_function_teardown(dev, true);
 out:
@@ -1509,13 +1514,13 @@ out:
 	return err;
 }
-int mlx5_load_one(struct mlx5_core_dev *dev)
+int mlx5_load_one(struct mlx5_core_dev *dev, bool recovery)
 {
 	struct devlink *devlink = priv_to_devlink(dev);
 	int ret;
 	devl_lock(devlink);
-	ret = mlx5_load_one_devl_locked(dev, false);
+	ret = mlx5_load_one_devl_locked(dev, recovery);
 	devl_unlock(devlink);
 	return ret;
 }
@@ -1768,6 +1773,10 @@ static int probe_one(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (err)
 		dev_err(&pdev->dev, "mlx5_crdump_enable failed with error code %d\n", err);
+	err = mlx5_thermal_init(dev);
+	if (err)
+		dev_err(&pdev->dev, "mlx5_thermal_init failed with error code %d\n", err);
+
 	pci_save_state(pdev);
 	devlink_register(devlink);
 	return 0;
@@ -1796,6 +1805,7 @@ static void remove_one(struct pci_dev *pdev)
 	mlx5_drain_fw_reset(dev);
 	devlink_unregister(devlink);
 	mlx5_sriov_disable(pdev);
+	mlx5_thermal_uninit(dev);
 	mlx5_crdump_disable(dev);
 	mlx5_drain_health_wq(dev);
 	mlx5_uninit_one(dev);
@@ -1912,7 +1922,8 @@ static void mlx5_pci_resume(struct pci_dev *pdev)
 	mlx5_pci_trace(dev, "Enter, loading driver..\n");
-	err = mlx5_load_one(dev);
+	err = mlx5_load_one(dev, false);
+
 	if (!err)
 		devlink_health_reporter_state_update(dev->priv.health.fw_fatal_reporter,
 						     DEVLINK_HEALTH_REPORTER_STATE_HEALTHY);
@@ -2003,7 +2014,7 @@ static int mlx5_resume(struct pci_dev *pdev)
 {
 	struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
-	return mlx5_load_one(dev);
+	return mlx5_load_one(dev, false);
 }
 static const struct pci_device_id mlx5_core_pci_table[] = {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index be0785f83083..1d879374acaa 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -142,6 +142,7 @@ enum mlx5_semaphore_space_address {
 };
 #define MLX5_DEFAULT_PROF 2
+#define MLX5_SF_PROF 3
 static inline int mlx5_flexible_inlen(struct mlx5_core_dev *dev, size_t fixed,
 				      size_t item_size, size_t num_items,
@@ -321,7 +322,7 @@ int mlx5_init_one(struct mlx5_core_dev *dev);
 void mlx5_uninit_one(struct mlx5_core_dev *dev);
 void mlx5_unload_one(struct mlx5_core_dev *dev, bool suspend);
 void mlx5_unload_one_devl_locked(struct mlx5_core_dev *dev, bool suspend);
-int mlx5_load_one(struct mlx5_core_dev *dev);
+int mlx5_load_one(struct mlx5_core_dev *dev, bool recovery);
 int mlx5_load_one_devl_locked(struct mlx5_core_dev *dev, bool recovery);
 int mlx5_vport_set_other_func_cap(struct mlx5_core_dev *dev, const void *hca_cap, u16 function_id,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_irq.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_irq.h
index 23cb63fa4588..efd0c299c5c7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_irq.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_irq.h
@@ -9,6 +9,7 @@
 #define MLX5_COMP_EQS_PER_SF 8
 struct mlx5_irq;
+struct cpu_rmap;
 int mlx5_irq_table_init(struct mlx5_core_dev *dev);
 void mlx5_irq_table_cleanup(struct mlx5_core_dev *dev);
@@ -25,9 +26,10 @@ int mlx5_get_default_msix_vec_count(struct mlx5_core_dev *dev, int num_vfs);
 struct mlx5_irq *mlx5_ctrl_irq_request(struct mlx5_core_dev *dev);
 void mlx5_ctrl_irq_release(struct mlx5_irq *ctrl_irq);
 struct mlx5_irq *mlx5_irq_request(struct mlx5_core_dev *dev, u16 vecidx,
-				  struct cpumask *affinity);
+				  struct irq_affinity_desc *af_desc,
+				  struct cpu_rmap **rmap);
 int mlx5_irqs_request_vectors(struct mlx5_core_dev *dev, u16 *cpus, int nirqs,
-			      struct mlx5_irq **irqs);
+			      struct mlx5_irq **irqs, struct cpu_rmap **rmap);
 void mlx5_irqs_release_vectors(struct mlx5_irq **irqs, int nirqs);
 int mlx5_irq_attach_nb(struct mlx5_irq *irq, struct notifier_block *nb);
 int mlx5_irq_detach_nb(struct mlx5_irq *irq, struct notifier_block *nb);
@@ -39,7 +41,7 @@ struct mlx5_irq_pool;
 int mlx5_irq_affinity_irqs_request_auto(struct mlx5_core_dev *dev, int nirqs,
 					struct mlx5_irq **irqs);
 struct mlx5_irq *mlx5_irq_affinity_request(struct mlx5_irq_pool *pool,
-					   const struct cpumask *req_mask);
+					   struct irq_affinity_desc *af_desc);
 void mlx5_irq_affinity_irqs_release(struct mlx5_core_dev *dev, struct mlx5_irq **irqs,
 				    int num_irqs);
 #else
@@ -50,7 +52,7 @@ static inline int mlx5_irq_affinity_irqs_request_auto(struct mlx5_core_dev *dev,
 }
 static inline struct mlx5_irq *
-mlx5_irq_affinity_request(struct mlx5_irq_pool *pool, const struct cpumask *req_mask)
+mlx5_irq_affinity_request(struct mlx5_irq_pool *pool, struct irq_affinity_desc *af_desc)
 {
 	return ERR_PTR(-EOPNOTSUPP);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
index 6bde18bcd42f..2245d3b2f393 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
 /* Copyright (c) 2019 Mellanox Technologies. */
+#include <linux/pci.h>
 #include <linux/interrupt.h>
 #include <linux/notifier.h>
 #include <linux/mlx5/driver.h>
@@ -9,6 +10,7 @@
 #include "mlx5_irq.h"
 #include "pci_irq.h"
 #include "lib/sf.h"
+#include "lib/eq.h"
 #ifdef CONFIG_RFS_ACCEL
 #include <linux/cpu_rmap.h>
 #endif
@@ -29,12 +31,11 @@ struct mlx5_irq {
 	char name[MLX5_MAX_IRQ_NAME];
 	struct mlx5_irq_pool *pool;
 	int refcount;
-	u32 index;
-	int irqn;
+	struct msi_map map;
 };
 struct mlx5_irq_table {
-	struct mlx5_irq_pool *pf_pool;
+	struct mlx5_irq_pool *pcif_pool;
 	struct mlx5_irq_pool *sf_ctrl_pool;
 	struct mlx5_irq_pool *sf_comp_pool;
 };
@@ -127,15 +128,26 @@ out:
 static void irq_release(struct mlx5_irq *irq)
 {
 	struct mlx5_irq_pool *pool = irq->pool;
+#ifdef CONFIG_RFS_ACCEL
+	struct cpu_rmap *rmap;
+#endif
-	xa_erase(&pool->irqs, irq->index);
-	/* free_irq requires that affinity_hint and rmap will be cleared
-	 * before calling it. This is why there is asymmetry with set_rmap
-	 * which should be called after alloc_irq but before request_irq.
+	xa_erase(&pool->irqs, irq->map.index);
+	/* free_irq requires that affinity_hint and rmap will be cleared before
+	 * calling it. To satisfy this requirement, we call
+	 * irq_cpu_rmap_remove() to remove the notifier
 	 */
-	irq_update_affinity_hint(irq->irqn, NULL);
+	irq_update_affinity_hint(irq->map.virq, NULL);
+#ifdef CONFIG_RFS_ACCEL
+	rmap = mlx5_eq_table_get_rmap(pool->dev);
+	if (rmap && irq->map.index)
+		irq_cpu_rmap_remove(rmap, irq->map.virq);
+#endif
+
 	free_cpumask_var(irq->mask);
-	free_irq(irq->irqn, &irq->nh);
+	free_irq(irq->map.virq, &irq->nh);
+	if (irq->map.index && pci_msix_can_alloc_dyn(pool->dev->pdev))
+		pci_msix_free_irq(pool->dev->pdev, irq->map);
 	kfree(irq);
 }
@@ -198,7 +210,7 @@ static void irq_set_name(struct mlx5_irq_pool *pool, char *name, int vecidx)
 		return;
 	}
-	if (vecidx == pool->xa_num_irqs.max) {
+	if (!vecidx) {
 		snprintf(name, MLX5_MAX_IRQ_NAME, "mlx5_async%d", vecidx);
 		return;
 	}
@@ -207,7 +219,8 @@ static void irq_set_name(struct mlx5_irq_pool *pool, char *name, int vecidx)
 }
 struct mlx5_irq *mlx5_irq_alloc(struct mlx5_irq_pool *pool, int i,
-				const struct cpumask *affinity)
+				struct irq_affinity_desc *af_desc,
+				struct cpu_rmap **rmap)
 {
 	struct mlx5_core_dev *dev = pool->dev;
 	char name[MLX5_MAX_IRQ_NAME];
@@ -217,7 +230,28 @@ struct mlx5_irq *mlx5_irq_alloc(struct mlx5_irq_pool *pool, int i,
 	irq = kzalloc(sizeof(*irq), GFP_KERNEL);
 	if (!irq)
 		return ERR_PTR(-ENOMEM);
-	irq->irqn = pci_irq_vector(dev->pdev, i);
+	if (!i || !pci_msix_can_alloc_dyn(dev->pdev)) {
+		/* The vector at index 0 was already allocated.
+		 * Just get the irq number. If dynamic irq is not supported
+		 * vectors have also been allocated.
+		 */
+		irq->map.virq = pci_irq_vector(dev->pdev, i);
+		irq->map.index = 0;
+	} else {
+		irq->map = pci_msix_alloc_irq_at(dev->pdev, MSI_ANY_INDEX, af_desc);
+		if (!irq->map.virq) {
+			err = irq->map.index;
+			goto err_alloc_irq;
+		}
+	}
+
+	if (i && rmap && *rmap) {
+#ifdef CONFIG_RFS_ACCEL
+		err = irq_cpu_rmap_add(*rmap, irq->map.virq);
+		if (err)
+			goto err_irq_rmap;
+#endif
+	}
 	if (!mlx5_irq_pool_is_sf_pool(pool))
 		irq_set_name(pool, name, i);
 	else
@@ -225,7 +259,7 @@ struct mlx5_irq *mlx5_irq_alloc(struct mlx5_irq_pool *pool, int i,
 	ATOMIC_INIT_NOTIFIER_HEAD(&irq->nh);
 	snprintf(irq->name, MLX5_MAX_IRQ_NAME,
 		 "%s@pci:%s", name, pci_name(dev->pdev));
-	err = request_irq(irq->irqn, irq_int_handler, 0, irq->name,
+	err = request_irq(irq->map.virq, irq_int_handler, 0, irq->name,
 			  &irq->nh);
 	if (err) {
 		mlx5_core_err(dev, "Failed to request irq. err = %d\n", err);
@@ -236,26 +270,37 @@ struct mlx5_irq *mlx5_irq_alloc(struct mlx5_irq_pool *pool, int i,
 		err = -ENOMEM;
 		goto err_cpumask;
 	}
-	if (affinity) {
-		cpumask_copy(irq->mask, affinity);
-		irq_set_affinity_and_hint(irq->irqn, irq->mask);
+	if (af_desc) {
+		cpumask_copy(irq->mask, &af_desc->mask);
+		irq_set_affinity_and_hint(irq->map.virq, irq->mask);
 	}
 	irq->pool = pool;
 	irq->refcount = 1;
-	irq->index = i;
-	err = xa_err(xa_store(&pool->irqs, irq->index, irq, GFP_KERNEL));
+	irq->map.index = i;
+	err = xa_err(xa_store(&pool->irqs, irq->map.index, irq, GFP_KERNEL));
 	if (err) {
 		mlx5_core_err(dev, "Failed to alloc xa entry for irq(%u). err = %d\n",
-			      irq->index, err);
+			      irq->map.index, err);
 		goto err_xa;
 	}
 	return irq;
 err_xa:
-	irq_update_affinity_hint(irq->irqn, NULL);
+	if (af_desc)
+		irq_update_affinity_hint(irq->map.virq, NULL);
 	free_cpumask_var(irq->mask);
 err_cpumask:
-	free_irq(irq->irqn, &irq->nh);
+	free_irq(irq->map.virq, &irq->nh);
 err_req_irq:
+#ifdef CONFIG_RFS_ACCEL
+	if (i && rmap && *rmap) {
+		free_irq_cpu_rmap(*rmap);
+		*rmap = NULL;
+	}
+err_irq_rmap:
+#endif
+	if (i && pci_msix_can_alloc_dyn(dev->pdev))
+		pci_msix_free_irq(dev->pdev, irq->map);
+err_alloc_irq:
 	kfree(irq);
 	return ERR_PTR(err);
 }
@@ -292,7 +337,7 @@ struct cpumask *mlx5_irq_get_affinity_mask(struct mlx5_irq *irq)
 int mlx5_irq_get_index(struct mlx5_irq *irq)
 {
-	return irq->index;
+	return irq->map.index;
 }
 /* irq_pool API */
@@ -300,7 +345,8 @@ int mlx5_irq_get_index(struct mlx5_irq *irq)
 /* requesting an irq from a given pool according to given index */
 static struct mlx5_irq *
 irq_pool_request_vector(struct mlx5_irq_pool *pool, int vecidx,
-			struct cpumask *affinity)
+			struct irq_affinity_desc *af_desc,
+			struct cpu_rmap **rmap)
 {
 	struct mlx5_irq *irq;
@@ -310,7 +356,7 @@ irq_pool_request_vector(struct mlx5_irq_pool *pool, int vecidx,
 		mlx5_irq_get_locked(irq);
 		goto unlock;
 	}
-	irq = mlx5_irq_alloc(pool, vecidx, affinity);
+	irq = mlx5_irq_alloc(pool, vecidx, af_desc, rmap);
 unlock:
 	mutex_unlock(&pool->lock);
 	return irq;
@@ -337,7 +383,7 @@ struct mlx5_irq_pool *mlx5_irq_pool_get(struct mlx5_core_dev *dev)
 	/* In some configs, there won't be a pool of SFs IRQs. Hence, returning
 	 * the PF IRQs pool in case the SF pool doesn't exist.
 	 */
-	return pool ? pool : irq_table->pf_pool;
+	return pool ? pool : irq_table->pcif_pool;
 }
 static struct mlx5_irq_pool *ctrl_irq_pool_get(struct mlx5_core_dev *dev)
@@ -351,7 +397,7 @@ static struct mlx5_irq_pool *ctrl_irq_pool_get(struct mlx5_core_dev *dev)
 	/* In some configs, there won't be a pool of SFs IRQs. Hence, returning
 	 * the PF IRQs pool in case the SF pool doesn't exist.
 	 */
-	return pool ? pool : irq_table->pf_pool;
+	return pool ? pool : irq_table->pcif_pool;
 }
 /**
@@ -364,7 +410,7 @@ static void mlx5_irqs_release(struct mlx5_irq **irqs, int nirqs)
 	int i;
 	for (i = 0; i < nirqs; i++) {
-		synchronize_irq(irqs[i]->irqn);
+		synchronize_irq(irqs[i]->map.virq);
 		mlx5_irq_put(irqs[i]);
 	}
 }
@@ -387,26 +433,26 @@ void mlx5_ctrl_irq_release(struct mlx5_irq *ctrl_irq)
 struct mlx5_irq *mlx5_ctrl_irq_request(struct mlx5_core_dev *dev)
 {
 	struct mlx5_irq_pool *pool = ctrl_irq_pool_get(dev);
-	cpumask_var_t req_mask;
+	struct irq_affinity_desc af_desc;
 	struct mlx5_irq *irq;
-	if (!zalloc_cpumask_var(&req_mask, GFP_KERNEL))
-		return ERR_PTR(-ENOMEM);
-	cpumask_copy(req_mask, cpu_online_mask);
+	cpumask_copy(&af_desc.mask, cpu_online_mask);
+	af_desc.is_managed = false;
 	if (!mlx5_irq_pool_is_sf_pool(pool)) {
-		/* In case we are allocating a control IRQ for PF/VF */
+		/* In case we are allocating a control IRQ from a pci device's pool.
+		 * This can happen also for a SF if the SFs pool is empty.
+		 */
 		if (!pool->xa_num_irqs.max) {
-			cpumask_clear(req_mask);
+			cpumask_clear(&af_desc.mask);
 			/* In case we only have a single IRQ for PF/VF */
-			cpumask_set_cpu(cpumask_first(cpu_online_mask), req_mask);
+			cpumask_set_cpu(cpumask_first(cpu_online_mask), &af_desc.mask);
 		}
-		/* Allocate the IRQ in the last index of the pool */
-		irq = irq_pool_request_vector(pool, pool->xa_num_irqs.max, req_mask);
+		/* Allocate the IRQ in index 0. The vector was already allocated */
+		irq = irq_pool_request_vector(pool, 0, &af_desc, NULL);
 	} else {
-		irq = mlx5_irq_affinity_request(pool, req_mask);
+		irq = mlx5_irq_affinity_request(pool, &af_desc);
 	}
-	free_cpumask_var(req_mask);
 	return irq;
 }
@@ -415,28 +461,82 @@ struct mlx5_irq *mlx5_ctrl_irq_request(struct mlx5_core_dev *dev)
  * @dev: mlx5 device that requesting the IRQ.
  * @vecidx: vector index of the IRQ. This argument is ignore if affinity is
  * provided.
- * @affinity: cpumask requested for this IRQ.
+ * @af_desc: affinity descriptor for this IRQ. + * @rmap: pointer to reverse map pointer for completion interrupts * * This function returns a pointer to IRQ, or ERR_PTR in case of error. */ struct mlx5_irq *mlx5_irq_request(struct mlx5_core_dev *dev, u16 vecidx, - struct cpumask *affinity) + struct irq_affinity_desc *af_desc, + struct cpu_rmap **rmap) { struct mlx5_irq_table *irq_table = mlx5_irq_table_get(dev); struct mlx5_irq_pool *pool; struct mlx5_irq *irq; - pool = irq_table->pf_pool; - irq = irq_pool_request_vector(pool, vecidx, affinity); + pool = irq_table->pcif_pool; + irq = irq_pool_request_vector(pool, vecidx, af_desc, rmap); if (IS_ERR(irq)) return irq; mlx5_core_dbg(dev, "irq %u mapped to cpu %*pbl, %u EQs on this irq\n", - irq->irqn, cpumask_pr_args(affinity), + irq->map.virq, cpumask_pr_args(&af_desc->mask), irq->refcount / MLX5_EQ_REFS_PER_IRQ); return irq; } /** + * mlx5_msix_alloc - allocate msix interrupt + * @dev: mlx5 device from which to request + * @handler: interrupt handler + * @affdesc: affinity descriptor + * @name: interrupt name + * + * Returns: struct msi_map with result encoded. + * Note: the caller must make sure to release the irq by calling + * mlx5_msix_free() if shutdown was initiated. 
+ */ +struct msi_map mlx5_msix_alloc(struct mlx5_core_dev *dev, + irqreturn_t (*handler)(int, void *), + const struct irq_affinity_desc *affdesc, + const char *name) +{ + struct msi_map map; + int err; + + if (!dev->pdev) { + map.virq = 0; + map.index = -EINVAL; + return map; + } + + map = pci_msix_alloc_irq_at(dev->pdev, MSI_ANY_INDEX, affdesc); + if (!map.virq) + return map; + + err = request_irq(map.virq, handler, 0, name, NULL); + if (err) { + mlx5_core_warn(dev, "err %d\n", err); + pci_msix_free_irq(dev->pdev, map); + map.virq = 0; + map.index = -ENOMEM; + } + return map; +} +EXPORT_SYMBOL(mlx5_msix_alloc); + +/** + * mlx5_msix_free - free a previously allocated msix interrupt + * @dev: mlx5 device associated with interrupt + * @map: map previously returned by mlx5_msix_alloc() + */ +void mlx5_msix_free(struct mlx5_core_dev *dev, struct msi_map map) +{ + free_irq(map.virq, NULL); + pci_msix_free_irq(dev->pdev, map); +} +EXPORT_SYMBOL(mlx5_msix_free); + +/** * mlx5_irqs_release_vectors - release one or more IRQs back to the system. * @irqs: IRQs to be released. * @nirqs: number of IRQs to be released. @@ -452,6 +552,7 @@ void mlx5_irqs_release_vectors(struct mlx5_irq **irqs, int nirqs) * @cpus: CPUs array for binding the IRQs * @nirqs: number of IRQs to request. * @irqs: an output array of IRQs pointers. + * @rmap: pointer to reverse map pointer for completion interrupts * * Each IRQ is bound to at most 1 CPU. * This function is requests nirqs IRQs, starting from @vecidx. @@ -460,24 +561,22 @@ void mlx5_irqs_release_vectors(struct mlx5_irq **irqs, int nirqs) * @nirqs), if successful, or a negative error code in case of an error. 
*/ int mlx5_irqs_request_vectors(struct mlx5_core_dev *dev, u16 *cpus, int nirqs, - struct mlx5_irq **irqs) + struct mlx5_irq **irqs, struct cpu_rmap **rmap) { - cpumask_var_t req_mask; + struct irq_affinity_desc af_desc; struct mlx5_irq *irq; int i; - if (!zalloc_cpumask_var(&req_mask, GFP_KERNEL)) - return -ENOMEM; + af_desc.is_managed = 1; for (i = 0; i < nirqs; i++) { - cpumask_set_cpu(cpus[i], req_mask); - irq = mlx5_irq_request(dev, i, req_mask); + cpumask_set_cpu(cpus[i], &af_desc.mask); + irq = mlx5_irq_request(dev, i + 1, &af_desc, rmap); if (IS_ERR(irq)) break; - cpumask_clear(req_mask); + cpumask_clear(&af_desc.mask); irqs[i] = irq; } - free_cpumask_var(req_mask); return i ? i : PTR_ERR(irq); } @@ -521,7 +620,7 @@ static void irq_pool_free(struct mlx5_irq_pool *pool) kvfree(pool); } -static int irq_pools_init(struct mlx5_core_dev *dev, int sf_vec, int pf_vec) +static int irq_pools_init(struct mlx5_core_dev *dev, int sf_vec, int pcif_vec) { struct mlx5_irq_table *table = dev->priv.irq_table; int num_sf_ctrl_by_msix; @@ -529,12 +628,12 @@ static int irq_pools_init(struct mlx5_core_dev *dev, int sf_vec, int pf_vec) int num_sf_ctrl; int err; - /* init pf_pool */ - table->pf_pool = irq_pool_alloc(dev, 0, pf_vec, NULL, - MLX5_EQ_SHARE_IRQ_MIN_COMP, - MLX5_EQ_SHARE_IRQ_MAX_COMP); - if (IS_ERR(table->pf_pool)) - return PTR_ERR(table->pf_pool); + /* init pcif_pool */ + table->pcif_pool = irq_pool_alloc(dev, 0, pcif_vec, NULL, + MLX5_EQ_SHARE_IRQ_MIN_COMP, + MLX5_EQ_SHARE_IRQ_MAX_COMP); + if (IS_ERR(table->pcif_pool)) + return PTR_ERR(table->pcif_pool); if (!mlx5_sf_max_functions(dev)) return 0; if (sf_vec < MLX5_IRQ_VEC_COMP_BASE_SF) { @@ -548,7 +647,7 @@ static int irq_pools_init(struct mlx5_core_dev *dev, int sf_vec, int pf_vec) MLX5_SFS_PER_CTRL_IRQ); num_sf_ctrl = min_t(int, num_sf_ctrl_by_msix, num_sf_ctrl_by_sfs); num_sf_ctrl = min_t(int, MLX5_IRQ_CTRL_SF_MAX, num_sf_ctrl); - table->sf_ctrl_pool = irq_pool_alloc(dev, pf_vec, num_sf_ctrl, + 
table->sf_ctrl_pool = irq_pool_alloc(dev, pcif_vec, num_sf_ctrl, "mlx5_sf_ctrl", MLX5_EQ_SHARE_IRQ_MIN_CTRL, MLX5_EQ_SHARE_IRQ_MAX_CTRL); @@ -557,7 +656,7 @@ static int irq_pools_init(struct mlx5_core_dev *dev, int sf_vec, int pf_vec) goto err_pf; } /* init sf_comp_pool */ - table->sf_comp_pool = irq_pool_alloc(dev, pf_vec + num_sf_ctrl, + table->sf_comp_pool = irq_pool_alloc(dev, pcif_vec + num_sf_ctrl, sf_vec - num_sf_ctrl, "mlx5_sf_comp", MLX5_EQ_SHARE_IRQ_MIN_COMP, MLX5_EQ_SHARE_IRQ_MAX_COMP); @@ -579,7 +678,7 @@ err_irqs_per_cpu: err_sf_ctrl: irq_pool_free(table->sf_ctrl_pool); err_pf: - irq_pool_free(table->pf_pool); + irq_pool_free(table->pcif_pool); return err; } @@ -589,7 +688,7 @@ static void irq_pools_destroy(struct mlx5_irq_table *table) irq_pool_free(table->sf_comp_pool); irq_pool_free(table->sf_ctrl_pool); } - irq_pool_free(table->pf_pool); + irq_pool_free(table->pcif_pool); } /* irq_table API */ @@ -620,9 +719,9 @@ void mlx5_irq_table_cleanup(struct mlx5_core_dev *dev) int mlx5_irq_table_get_num_comp(struct mlx5_irq_table *table) { - if (!table->pf_pool->xa_num_irqs.max) + if (!table->pcif_pool->xa_num_irqs.max) return 1; - return table->pf_pool->xa_num_irqs.max - table->pf_pool->xa_num_irqs.min; + return table->pcif_pool->xa_num_irqs.max - table->pcif_pool->xa_num_irqs.min; } int mlx5_irq_table_create(struct mlx5_core_dev *dev) @@ -631,26 +730,30 @@ int mlx5_irq_table_create(struct mlx5_core_dev *dev) MLX5_CAP_GEN(dev, max_num_eqs) : 1 << MLX5_CAP_GEN(dev, log_max_eq); int total_vec; - int pf_vec; + int pcif_vec; + int req_vec; int err; + int n; if (mlx5_core_is_sf(dev)) return 0; - pf_vec = MLX5_CAP_GEN(dev, num_ports) * num_online_cpus() + 1; - pf_vec = min_t(int, pf_vec, num_eqs); + pcif_vec = MLX5_CAP_GEN(dev, num_ports) * num_online_cpus() + 1; + pcif_vec = min_t(int, pcif_vec, num_eqs); - total_vec = pf_vec; + total_vec = pcif_vec; if (mlx5_sf_max_functions(dev)) total_vec += MLX5_IRQ_CTRL_SF_MAX + MLX5_COMP_EQS_PER_SF * 
mlx5_sf_max_functions(dev); + total_vec = min_t(int, total_vec, pci_msix_vec_count(dev->pdev)); + pcif_vec = min_t(int, pcif_vec, pci_msix_vec_count(dev->pdev)); - total_vec = pci_alloc_irq_vectors(dev->pdev, 1, total_vec, PCI_IRQ_MSIX); - if (total_vec < 0) - return total_vec; - pf_vec = min(pf_vec, total_vec); + req_vec = pci_msix_can_alloc_dyn(dev->pdev) ? 1 : total_vec; + n = pci_alloc_irq_vectors(dev->pdev, 1, req_vec, PCI_IRQ_MSIX); + if (n < 0) + return n; - err = irq_pools_init(dev, total_vec - pf_vec, pf_vec); + err = irq_pools_init(dev, total_vec - pcif_vec, pcif_vec); if (err) pci_free_irq_vectors(dev->pdev); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.h b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.h index 5c7e68bee43a..d3a77a0ab848 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.h @@ -12,6 +12,7 @@ #define MLX5_EQ_REFS_PER_IRQ (2) struct mlx5_irq; +struct cpu_rmap; struct mlx5_irq_pool { char name[MLX5_MAX_IRQ_NAME - MLX5_MAX_IRQ_IDX_CHARS]; @@ -31,7 +32,8 @@ static inline bool mlx5_irq_pool_is_sf_pool(struct mlx5_irq_pool *pool) } struct mlx5_irq *mlx5_irq_alloc(struct mlx5_irq_pool *pool, int i, - const struct cpumask *affinity); + struct irq_affinity_desc *af_desc, + struct cpu_rmap **rmap); int mlx5_irq_get_locked(struct mlx5_irq *irq); int mlx5_irq_read_locked(struct mlx5_irq *irq); int mlx5_irq_put(struct mlx5_irq *irq); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/port.c b/drivers/net/ethernet/mellanox/mlx5/core/port.c index a1548e6bfb35..0daeb4b72cca 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/port.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/port.c @@ -1054,3 +1054,154 @@ out: kfree(out); return err; } + +/* speed in units of 1Mb */ +static const u32 mlx5e_link_speed[MLX5E_LINK_MODES_NUMBER] = { + [MLX5E_1000BASE_CX_SGMII] = 1000, + [MLX5E_1000BASE_KX] = 1000, + [MLX5E_10GBASE_CX4] = 10000, + [MLX5E_10GBASE_KX4] = 10000, + 
[MLX5E_10GBASE_KR] = 10000, + [MLX5E_20GBASE_KR2] = 20000, + [MLX5E_40GBASE_CR4] = 40000, + [MLX5E_40GBASE_KR4] = 40000, + [MLX5E_56GBASE_R4] = 56000, + [MLX5E_10GBASE_CR] = 10000, + [MLX5E_10GBASE_SR] = 10000, + [MLX5E_10GBASE_ER] = 10000, + [MLX5E_40GBASE_SR4] = 40000, + [MLX5E_40GBASE_LR4] = 40000, + [MLX5E_50GBASE_SR2] = 50000, + [MLX5E_100GBASE_CR4] = 100000, + [MLX5E_100GBASE_SR4] = 100000, + [MLX5E_100GBASE_KR4] = 100000, + [MLX5E_100GBASE_LR4] = 100000, + [MLX5E_100BASE_TX] = 100, + [MLX5E_1000BASE_T] = 1000, + [MLX5E_10GBASE_T] = 10000, + [MLX5E_25GBASE_CR] = 25000, + [MLX5E_25GBASE_KR] = 25000, + [MLX5E_25GBASE_SR] = 25000, + [MLX5E_50GBASE_CR2] = 50000, + [MLX5E_50GBASE_KR2] = 50000, +}; + +static const u32 mlx5e_ext_link_speed[MLX5E_EXT_LINK_MODES_NUMBER] = { + [MLX5E_SGMII_100M] = 100, + [MLX5E_1000BASE_X_SGMII] = 1000, + [MLX5E_5GBASE_R] = 5000, + [MLX5E_10GBASE_XFI_XAUI_1] = 10000, + [MLX5E_40GBASE_XLAUI_4_XLPPI_4] = 40000, + [MLX5E_25GAUI_1_25GBASE_CR_KR] = 25000, + [MLX5E_50GAUI_2_LAUI_2_50GBASE_CR2_KR2] = 50000, + [MLX5E_50GAUI_1_LAUI_1_50GBASE_CR_KR] = 50000, + [MLX5E_CAUI_4_100GBASE_CR4_KR4] = 100000, + [MLX5E_100GAUI_2_100GBASE_CR2_KR2] = 100000, + [MLX5E_200GAUI_4_200GBASE_CR4_KR4] = 200000, + [MLX5E_400GAUI_8] = 400000, + [MLX5E_100GAUI_1_100GBASE_CR_KR] = 100000, + [MLX5E_200GAUI_2_200GBASE_CR2_KR2] = 200000, + [MLX5E_400GAUI_4_400GBASE_CR4_KR4] = 400000, +}; + +int mlx5_port_query_eth_proto(struct mlx5_core_dev *dev, u8 port, bool ext, + struct mlx5_port_eth_proto *eproto) +{ + u32 out[MLX5_ST_SZ_DW(ptys_reg)]; + int err; + + if (!eproto) + return -EINVAL; + + err = mlx5_query_port_ptys(dev, out, sizeof(out), MLX5_PTYS_EN, port); + if (err) + return err; + + eproto->cap = MLX5_GET_ETH_PROTO(ptys_reg, out, ext, + eth_proto_capability); + eproto->admin = MLX5_GET_ETH_PROTO(ptys_reg, out, ext, eth_proto_admin); + eproto->oper = MLX5_GET_ETH_PROTO(ptys_reg, out, ext, eth_proto_oper); + return 0; +} + +bool mlx5_ptys_ext_supported(struct 
mlx5_core_dev *mdev) +{ + struct mlx5_port_eth_proto eproto; + int err; + + if (MLX5_CAP_PCAM_FEATURE(mdev, ptys_extended_ethernet)) + return true; + + err = mlx5_port_query_eth_proto(mdev, 1, true, &eproto); + if (err) + return false; + + return !!eproto.cap; +} + +static void mlx5e_port_get_speed_arr(struct mlx5_core_dev *mdev, + const u32 **arr, u32 *size, + bool force_legacy) +{ + bool ext = force_legacy ? false : mlx5_ptys_ext_supported(mdev); + + *size = ext ? ARRAY_SIZE(mlx5e_ext_link_speed) : + ARRAY_SIZE(mlx5e_link_speed); + *arr = ext ? mlx5e_ext_link_speed : mlx5e_link_speed; +} + +u32 mlx5_port_ptys2speed(struct mlx5_core_dev *mdev, u32 eth_proto_oper, + bool force_legacy) +{ + unsigned long temp = eth_proto_oper; + const u32 *table; + u32 speed = 0; + u32 max_size; + int i; + + mlx5e_port_get_speed_arr(mdev, &table, &max_size, force_legacy); + i = find_first_bit(&temp, max_size); + if (i < max_size) + speed = table[i]; + return speed; +} + +u32 mlx5_port_speed2linkmodes(struct mlx5_core_dev *mdev, u32 speed, + bool force_legacy) +{ + u32 link_modes = 0; + const u32 *table; + u32 max_size; + int i; + + mlx5e_port_get_speed_arr(mdev, &table, &max_size, force_legacy); + for (i = 0; i < max_size; ++i) { + if (table[i] == speed) + link_modes |= MLX5E_PROT_MASK(i); + } + return link_modes; +} + +int mlx5_port_max_linkspeed(struct mlx5_core_dev *mdev, u32 *speed) +{ + struct mlx5_port_eth_proto eproto; + u32 max_speed = 0; + const u32 *table; + u32 max_size; + bool ext; + int err; + int i; + + ext = mlx5_ptys_ext_supported(mdev); + err = mlx5_port_query_eth_proto(mdev, 1, ext, &eproto); + if (err) + return err; + + mlx5e_port_get_speed_arr(mdev, &table, &max_size, false); + for (i = 0; i < max_size; ++i) + if (eproto.cap & MLX5E_PROT_MASK(i)) + max_speed = max(max_speed, table[i]); + + *speed = max_speed; + return 0; +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c index 
a7377619ba6f..e2f26d0bc615 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c @@ -28,7 +28,7 @@ static int mlx5_sf_dev_probe(struct auxiliary_device *adev, const struct auxilia mdev->priv.adev_idx = adev->id; sf_dev->mdev = mdev; - err = mlx5_mdev_init(mdev, MLX5_DEFAULT_PROF); + err = mlx5_mdev_init(mdev, MLX5_SF_PROF); if (err) { mlx5_core_warn(mdev, "mlx5_mdev_init on err=%d\n", err); goto mdev_err; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c index ee104cf04392..0eb9a8d7f282 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c @@ -819,14 +819,34 @@ int mlx5dr_actions_build_ste_arr(struct mlx5dr_matcher *matcher, case DR_ACTION_TYP_TNL_L2_TO_L2: break; case DR_ACTION_TYP_TNL_L3_TO_L2: - attr.decap_index = action->rewrite->index; - attr.decap_actions = action->rewrite->num_of_actions; - attr.decap_with_vlan = - attr.decap_actions == WITH_VLAN_NUM_HW_ACTIONS; + if (action->rewrite->ptrn && action->rewrite->arg) { + attr.decap_index = mlx5dr_arg_get_obj_id(action->rewrite->arg); + attr.decap_actions = action->rewrite->ptrn->num_of_actions; + attr.decap_pat_idx = action->rewrite->ptrn->index; + } else { + attr.decap_index = action->rewrite->index; + attr.decap_actions = action->rewrite->num_of_actions; + attr.decap_with_vlan = + attr.decap_actions == WITH_VLAN_NUM_HW_ACTIONS; + attr.decap_pat_idx = MLX5DR_INVALID_PATTERN_INDEX; + } break; case DR_ACTION_TYP_MODIFY_HDR: - attr.modify_index = action->rewrite->index; - attr.modify_actions = action->rewrite->num_of_actions; + if (action->rewrite->single_action_opt) { + attr.modify_actions = action->rewrite->num_of_actions; + attr.single_modify_action = action->rewrite->data; + } else { + if (action->rewrite->ptrn && action->rewrite->arg) { + attr.modify_index = + 
mlx5dr_arg_get_obj_id(action->rewrite->arg); + attr.modify_actions = action->rewrite->ptrn->num_of_actions; + attr.modify_pat_idx = action->rewrite->ptrn->index; + } else { + attr.modify_index = action->rewrite->index; + attr.modify_actions = action->rewrite->num_of_actions; + attr.modify_pat_idx = MLX5DR_INVALID_PATTERN_INDEX; + } + } if (action->rewrite->modify_ttl) dr_action_modify_ttl_adjust(dmn, &attr, rx_rule, &recalc_cs_required); @@ -1365,8 +1385,6 @@ out_err: return -EINVAL; } -#define ACTION_CACHE_LINE_SIZE 64 - static int dr_action_create_reformat_action(struct mlx5dr_domain *dmn, u8 reformat_param_0, u8 reformat_param_1, @@ -1403,36 +1421,25 @@ dr_action_create_reformat_action(struct mlx5dr_domain *dmn, } case DR_ACTION_TYP_TNL_L3_TO_L2: { - u8 hw_actions[ACTION_CACHE_LINE_SIZE] = {}; + u8 hw_actions[DR_ACTION_CACHE_LINE_SIZE] = {}; int ret; ret = mlx5dr_ste_set_action_decap_l3_list(dmn->ste_ctx, data, data_sz, hw_actions, - ACTION_CACHE_LINE_SIZE, + DR_ACTION_CACHE_LINE_SIZE, &action->rewrite->num_of_actions); if (ret) { mlx5dr_dbg(dmn, "Failed creating decap l3 action list\n"); return ret; } - action->rewrite->chunk = mlx5dr_icm_alloc_chunk(dmn->action_icm_pool, - DR_CHUNK_SIZE_8); - if (!action->rewrite->chunk) { - mlx5dr_dbg(dmn, "Failed allocating modify header chunk\n"); - return -ENOMEM; - } - - action->rewrite->data = (void *)hw_actions; - action->rewrite->index = (mlx5dr_icm_pool_get_chunk_icm_addr - (action->rewrite->chunk) - - dmn->info.caps.hdr_modify_icm_addr) / - ACTION_CACHE_LINE_SIZE; + action->rewrite->data = hw_actions; + action->rewrite->dmn = dmn; - ret = mlx5dr_send_postsend_action(dmn, action); + ret = mlx5dr_ste_alloc_modify_hdr(action); if (ret) { - mlx5dr_dbg(dmn, "Writing decap l3 actions to ICM failed\n"); - mlx5dr_icm_free_chunk(action->rewrite->chunk); + mlx5dr_dbg(dmn, "Failed preparing reformat data\n"); return ret; } return 0; @@ -1963,7 +1970,6 @@ static int dr_action_create_modify_action(struct mlx5dr_domain *dmn, 
__be64 actions[], struct mlx5dr_action *action) { - struct mlx5dr_icm_chunk *chunk; u32 max_hw_actions; u32 num_hw_actions; u32 num_sw_actions; @@ -1980,15 +1986,9 @@ static int dr_action_create_modify_action(struct mlx5dr_domain *dmn, return -EINVAL; } - chunk = mlx5dr_icm_alloc_chunk(dmn->action_icm_pool, DR_CHUNK_SIZE_16); - if (!chunk) - return -ENOMEM; - hw_actions = kcalloc(1, max_hw_actions * DR_MODIFY_ACTION_SIZE, GFP_KERNEL); - if (!hw_actions) { - ret = -ENOMEM; - goto free_chunk; - } + if (!hw_actions) + return -ENOMEM; ret = dr_actions_convert_modify_header(action, max_hw_actions, @@ -2000,24 +2000,24 @@ static int dr_action_create_modify_action(struct mlx5dr_domain *dmn, if (ret) goto free_hw_actions; - action->rewrite->chunk = chunk; action->rewrite->modify_ttl = modify_ttl; action->rewrite->data = (u8 *)hw_actions; action->rewrite->num_of_actions = num_hw_actions; - action->rewrite->index = (mlx5dr_icm_pool_get_chunk_icm_addr(chunk) - - dmn->info.caps.hdr_modify_icm_addr) / - ACTION_CACHE_LINE_SIZE; - ret = mlx5dr_send_postsend_action(dmn, action); - if (ret) - goto free_hw_actions; + if (num_hw_actions == 1 && + dmn->info.caps.sw_format_ver >= MLX5_STEERING_FORMAT_CONNECTX_6DX) { + action->rewrite->single_action_opt = true; + } else { + action->rewrite->single_action_opt = false; + ret = mlx5dr_ste_alloc_modify_hdr(action); + if (ret) + goto free_hw_actions; + } return 0; free_hw_actions: kfree(hw_actions); -free_chunk: - mlx5dr_icm_free_chunk(chunk); return ret; } @@ -2162,7 +2162,8 @@ int mlx5dr_action_destroy(struct mlx5dr_action *action) refcount_dec(&action->reformat->dmn->refcount); break; case DR_ACTION_TYP_TNL_L3_TO_L2: - mlx5dr_icm_free_chunk(action->rewrite->chunk); + mlx5dr_ste_free_modify_hdr(action); + kfree(action->rewrite->data); refcount_dec(&action->rewrite->dmn->refcount); break; case DR_ACTION_TYP_L2_TO_TNL_L2: @@ -2173,7 +2174,8 @@ int mlx5dr_action_destroy(struct mlx5dr_action *action) 
refcount_dec(&action->reformat->dmn->refcount); break; case DR_ACTION_TYP_MODIFY_HDR: - mlx5dr_icm_free_chunk(action->rewrite->chunk); + if (!action->rewrite->single_action_opt) + mlx5dr_ste_free_modify_hdr(action); kfree(action->rewrite->data); refcount_dec(&action->rewrite->dmn->refcount); break; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_arg.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_arg.c new file mode 100644 index 000000000000..01ed6442095d --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_arg.c @@ -0,0 +1,273 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +// Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. + +#include "dr_types.h" + +#define DR_ICM_MODIFY_HDR_GRANULARITY_4K 12 + +/* modify-header arg pool */ +enum dr_arg_chunk_size { + DR_ARG_CHUNK_SIZE_1, + DR_ARG_CHUNK_SIZE_MIN = DR_ARG_CHUNK_SIZE_1, /* keep updated when changing */ + DR_ARG_CHUNK_SIZE_2, + DR_ARG_CHUNK_SIZE_3, + DR_ARG_CHUNK_SIZE_4, + DR_ARG_CHUNK_SIZE_MAX, +}; + +/* argument pool area */ +struct dr_arg_pool { + enum dr_arg_chunk_size log_chunk_size; + struct mlx5dr_domain *dmn; + struct list_head free_list; + struct mutex mutex; /* protect arg pool */ +}; + +struct mlx5dr_arg_mgr { + struct mlx5dr_domain *dmn; + struct dr_arg_pool *pools[DR_ARG_CHUNK_SIZE_MAX]; +}; + +static int dr_arg_pool_alloc_objs(struct dr_arg_pool *pool) +{ + struct mlx5dr_arg_obj *arg_obj, *tmp_arg; + struct list_head cur_list; + u16 object_range; + int num_of_objects; + u32 obj_id = 0; + int i, ret; + + INIT_LIST_HEAD(&cur_list); + + object_range = + pool->dmn->info.caps.log_header_modify_argument_granularity; + + object_range = + max_t(u32, pool->dmn->info.caps.log_header_modify_argument_granularity, + DR_ICM_MODIFY_HDR_GRANULARITY_4K); + object_range = + min_t(u32, pool->dmn->info.caps.log_header_modify_argument_max_alloc, + object_range); + + if (pool->log_chunk_size > object_range) { + mlx5dr_err(pool->dmn, "Required 
chunk size (%d) is not supported\n", + pool->log_chunk_size); + return -ENOMEM; + } + + num_of_objects = (1 << (object_range - pool->log_chunk_size)); + /* Only one devx object per range */ + ret = mlx5dr_cmd_create_modify_header_arg(pool->dmn->mdev, + object_range, + pool->dmn->pdn, + &obj_id); + if (ret) { + mlx5dr_err(pool->dmn, "failed allocating object with range: %d:\n", + object_range); + return -EAGAIN; + } + + for (i = 0; i < num_of_objects; i++) { + arg_obj = kzalloc(sizeof(*arg_obj), GFP_KERNEL); + if (!arg_obj) { + ret = -ENOMEM; + goto clean_arg_obj; + } + + arg_obj->log_chunk_size = pool->log_chunk_size; + + list_add_tail(&arg_obj->list_node, &cur_list); + + arg_obj->obj_id = obj_id; + arg_obj->obj_offset = i * (1 << pool->log_chunk_size); + } + list_splice_tail_init(&cur_list, &pool->free_list); + + return 0; + +clean_arg_obj: + mlx5dr_cmd_destroy_modify_header_arg(pool->dmn->mdev, obj_id); + list_for_each_entry_safe(arg_obj, tmp_arg, &cur_list, list_node) { + list_del(&arg_obj->list_node); + kfree(arg_obj); + } + return ret; +} + +static struct mlx5dr_arg_obj *dr_arg_pool_get_arg_obj(struct dr_arg_pool *pool) +{ + struct mlx5dr_arg_obj *arg_obj = NULL; + int ret; + + mutex_lock(&pool->mutex); + if (list_empty(&pool->free_list)) { + ret = dr_arg_pool_alloc_objs(pool); + if (ret) + goto out; + } + + arg_obj = list_first_entry_or_null(&pool->free_list, + struct mlx5dr_arg_obj, + list_node); + WARN(!arg_obj, "couldn't get dr arg obj from pool"); + + if (arg_obj) + list_del_init(&arg_obj->list_node); + +out: + mutex_unlock(&pool->mutex); + return arg_obj; +} + +static void dr_arg_pool_put_arg_obj(struct dr_arg_pool *pool, + struct mlx5dr_arg_obj *arg_obj) +{ + mutex_lock(&pool->mutex); + list_add(&arg_obj->list_node, &pool->free_list); + mutex_unlock(&pool->mutex); +} + +static struct dr_arg_pool *dr_arg_pool_create(struct mlx5dr_domain *dmn, + enum dr_arg_chunk_size chunk_size) +{ + struct dr_arg_pool *pool; + + pool = kzalloc(sizeof(*pool), 
GFP_KERNEL); + if (!pool) + return NULL; + + pool->dmn = dmn; + + INIT_LIST_HEAD(&pool->free_list); + mutex_init(&pool->mutex); + + pool->log_chunk_size = chunk_size; + if (dr_arg_pool_alloc_objs(pool)) + goto free_pool; + + return pool; + +free_pool: + kfree(pool); + + return NULL; +} + +static void dr_arg_pool_destroy(struct dr_arg_pool *pool) +{ + struct mlx5dr_arg_obj *arg_obj, *tmp_arg; + + list_for_each_entry_safe(arg_obj, tmp_arg, &pool->free_list, list_node) { + list_del(&arg_obj->list_node); + if (!arg_obj->obj_offset) /* the first in range */ + mlx5dr_cmd_destroy_modify_header_arg(pool->dmn->mdev, arg_obj->obj_id); + kfree(arg_obj); + } + + mutex_destroy(&pool->mutex); + kfree(pool); +} + +static enum dr_arg_chunk_size dr_arg_get_chunk_size(u16 num_of_actions) +{ + if (num_of_actions <= 8) + return DR_ARG_CHUNK_SIZE_1; + if (num_of_actions <= 16) + return DR_ARG_CHUNK_SIZE_2; + if (num_of_actions <= 32) + return DR_ARG_CHUNK_SIZE_3; + if (num_of_actions <= 64) + return DR_ARG_CHUNK_SIZE_4; + + return DR_ARG_CHUNK_SIZE_MAX; +} + +u32 mlx5dr_arg_get_obj_id(struct mlx5dr_arg_obj *arg_obj) +{ + return (arg_obj->obj_id + arg_obj->obj_offset); +} + +struct mlx5dr_arg_obj *mlx5dr_arg_get_obj(struct mlx5dr_arg_mgr *mgr, + u16 num_of_actions, + u8 *data) +{ + u32 size = dr_arg_get_chunk_size(num_of_actions); + struct mlx5dr_arg_obj *arg_obj; + int ret; + + if (size >= DR_ARG_CHUNK_SIZE_MAX) + return NULL; + + arg_obj = dr_arg_pool_get_arg_obj(mgr->pools[size]); + if (!arg_obj) { + mlx5dr_err(mgr->dmn, "Failed allocating args object for modify header\n"); + return NULL; + } + + /* write it into the hw */ + ret = mlx5dr_send_postsend_args(mgr->dmn, + mlx5dr_arg_get_obj_id(arg_obj), + num_of_actions, data); + if (ret) { + mlx5dr_err(mgr->dmn, "Failed writing args object\n"); + goto put_obj; + } + + return arg_obj; + +put_obj: + mlx5dr_arg_put_obj(mgr, arg_obj); + return NULL; +} + +void mlx5dr_arg_put_obj(struct mlx5dr_arg_mgr *mgr, + struct mlx5dr_arg_obj *arg_obj) 
+{ + dr_arg_pool_put_arg_obj(mgr->pools[arg_obj->log_chunk_size], arg_obj); +} + +struct mlx5dr_arg_mgr* +mlx5dr_arg_mgr_create(struct mlx5dr_domain *dmn) +{ + struct mlx5dr_arg_mgr *pool_mgr; + int i; + + if (!mlx5dr_domain_is_support_ptrn_arg(dmn)) + return NULL; + + pool_mgr = kzalloc(sizeof(*pool_mgr), GFP_KERNEL); + if (!pool_mgr) + return NULL; + + pool_mgr->dmn = dmn; + + for (i = 0; i < DR_ARG_CHUNK_SIZE_MAX; i++) { + pool_mgr->pools[i] = dr_arg_pool_create(dmn, i); + if (!pool_mgr->pools[i]) + goto clean_pools; + } + + return pool_mgr; + +clean_pools: + for (i--; i >= 0; i--) + dr_arg_pool_destroy(pool_mgr->pools[i]); + + kfree(pool_mgr); + return NULL; +} + +void mlx5dr_arg_mgr_destroy(struct mlx5dr_arg_mgr *mgr) +{ + struct dr_arg_pool **pools; + int i; + + if (!mgr) + return; + + pools = mgr->pools; + for (i = 0; i < DR_ARG_CHUNK_SIZE_MAX; i++) + dr_arg_pool_destroy(pools[i]); + + kfree(mgr); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c index 07b6a6dcb92f..3835ba3f4dda 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.c @@ -132,6 +132,17 @@ int mlx5dr_cmd_query_device(struct mlx5_core_dev *mdev, caps->isolate_vl_tc = MLX5_CAP_GEN(mdev, isolate_vl_tc_new); + caps->support_modify_argument = + MLX5_CAP_GEN_64(mdev, general_obj_types) & + MLX5_GENERAL_OBJ_TYPES_CAP_HEADER_MODIFY_ARGUMENT; + + if (caps->support_modify_argument) { + caps->log_header_modify_argument_granularity = + MLX5_CAP_GEN(mdev, log_header_modify_argument_granularity); + caps->log_header_modify_argument_max_alloc = + MLX5_CAP_GEN(mdev, log_header_modify_argument_max_alloc); + } + /* geneve_tlv_option_0_exist is the indication of * STE support for lookup type flex_parser_ok */ @@ -200,6 +211,12 @@ int mlx5dr_cmd_query_device(struct mlx5_core_dev *mdev, caps->hdr_modify_icm_addr = MLX5_CAP64_DEV_MEM(mdev, 
header_modify_sw_icm_start_address); + caps->log_modify_pattern_icm_size = + MLX5_CAP_DEV_MEM(mdev, log_header_modify_pattern_sw_icm_size); + + caps->hdr_modify_pattern_icm_addr = + MLX5_CAP64_DEV_MEM(mdev, header_modify_pattern_sw_icm_start_address); + caps->roce_min_src_udp = MLX5_CAP_ROCE(mdev, r_roce_min_src_udp_port); caps->is_ecpf = mlx5_core_is_ecpf_esw_manager(mdev); @@ -676,6 +693,49 @@ int mlx5dr_cmd_query_gid(struct mlx5_core_dev *mdev, u8 vhca_port_num, return 0; } +int mlx5dr_cmd_create_modify_header_arg(struct mlx5_core_dev *dev, + u16 log_obj_range, u32 pd, + u32 *obj_id) +{ + u32 in[MLX5_ST_SZ_DW(create_modify_header_arg_in)] = {}; + u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {}; + void *attr; + int ret; + + attr = MLX5_ADDR_OF(create_modify_header_arg_in, in, hdr); + MLX5_SET(general_obj_in_cmd_hdr, attr, opcode, + MLX5_CMD_OP_CREATE_GENERAL_OBJECT); + MLX5_SET(general_obj_in_cmd_hdr, attr, obj_type, + MLX5_OBJ_TYPE_HEADER_MODIFY_ARGUMENT); + MLX5_SET(general_obj_in_cmd_hdr, attr, + op_param.create.log_obj_range, log_obj_range); + + attr = MLX5_ADDR_OF(create_modify_header_arg_in, in, arg); + MLX5_SET(modify_header_arg, attr, access_pd, pd); + + ret = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out)); + if (ret) + return ret; + + *obj_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id); + return 0; +} + +void mlx5dr_cmd_destroy_modify_header_arg(struct mlx5_core_dev *dev, + u32 obj_id) +{ + u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)] = {}; + u32 in[MLX5_ST_SZ_DW(general_obj_in_cmd_hdr)] = {}; + + MLX5_SET(general_obj_in_cmd_hdr, in, opcode, + MLX5_CMD_OP_DESTROY_GENERAL_OBJECT); + MLX5_SET(general_obj_in_cmd_hdr, in, obj_type, + MLX5_OBJ_TYPE_HEADER_MODIFY_ARGUMENT); + MLX5_SET(general_obj_in_cmd_hdr, in, obj_id, obj_id); + + mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out)); +} + static int mlx5dr_cmd_set_extended_dest(struct mlx5_core_dev *dev, struct mlx5dr_cmd_fte_info *fte, bool *extended_dest) diff --git 
a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c index db81d881d38e..7e36e1062139 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c @@ -4,6 +4,7 @@ #include <linux/debugfs.h> #include <linux/kernel.h> #include <linux/seq_file.h> +#include <linux/version.h> #include "dr_types.h" #define DR_DBG_PTR_TO_ID(p) ((u64)(uintptr_t)(p) & 0xFFFFFFFFULL) @@ -140,10 +141,33 @@ dr_dump_rule_action_mem(struct seq_file *file, const u64 rule_id, action->flow_tag->flow_tag); break; case DR_ACTION_TYP_MODIFY_HDR: - seq_printf(file, "%d,0x%llx,0x%llx,0x%x\n", + { + struct mlx5dr_ptrn_obj *ptrn = action->rewrite->ptrn; + struct mlx5dr_arg_obj *arg = action->rewrite->arg; + u8 *rewrite_data = action->rewrite->data; + bool ptrn_arg; + int i; + + ptrn_arg = !action->rewrite->single_action_opt && ptrn && arg; + + seq_printf(file, "%d,0x%llx,0x%llx,0x%x,%d,0x%x,0x%x,0x%x", DR_DUMP_REC_TYPE_ACTION_MODIFY_HDR, action_id, - rule_id, action->rewrite->index); + rule_id, action->rewrite->index, + action->rewrite->single_action_opt, + ptrn_arg ? action->rewrite->num_of_actions : 0, + ptrn_arg ? ptrn->index : 0, + ptrn_arg ? mlx5dr_arg_get_obj_id(arg) : 0); + + if (ptrn_arg) { + for (i = 0; i < action->rewrite->num_of_actions; i++) { + seq_printf(file, ",0x%016llx", + be64_to_cpu(((__be64 *)rewrite_data)[i])); + } + } + + seq_puts(file, "\n"); break; + } case DR_ACTION_TYP_VPORT: seq_printf(file, "%d,0x%llx,0x%llx,0x%x\n", DR_DUMP_REC_TYPE_ACTION_VPORT, action_id, rule_id, @@ -157,7 +181,10 @@ dr_dump_rule_action_mem(struct seq_file *file, const u64 rule_id, case DR_ACTION_TYP_TNL_L3_TO_L2: seq_printf(file, "%d,0x%llx,0x%llx,0x%x\n", DR_DUMP_REC_TYPE_ACTION_DECAP_L3, action_id, - rule_id, action->rewrite->index); + rule_id, + (action->rewrite->ptrn && action->rewrite->arg) ? 
+ mlx5dr_arg_get_obj_id(action->rewrite->arg) : + action->rewrite->index); break; case DR_ACTION_TYP_L2_TO_TNL_L2: seq_printf(file, "%d,0x%llx,0x%llx,0x%x\n", @@ -606,9 +633,18 @@ dr_dump_domain(struct seq_file *file, struct mlx5dr_domain *dmn) u64 domain_id = DR_DBG_PTR_TO_ID(dmn); int ret; - seq_printf(file, "%d,0x%llx,%d,0%x,%d,%s\n", DR_DUMP_REC_TYPE_DOMAIN, + seq_printf(file, "%d,0x%llx,%d,0%x,%d,%u.%u.%u,%s,%d,%u,%u,%u\n", + DR_DUMP_REC_TYPE_DOMAIN, domain_id, dmn->type, dmn->info.caps.gvmi, - dmn->info.supp_sw_steering, pci_name(dmn->mdev->pdev)); + dmn->info.supp_sw_steering, + /* package version */ + LINUX_VERSION_MAJOR, LINUX_VERSION_PATCHLEVEL, + LINUX_VERSION_SUBLEVEL, + pci_name(dmn->mdev->pdev), + 0, /* domain flags */ + dmn->num_buddies[DR_ICM_TYPE_STE], + dmn->num_buddies[DR_ICM_TYPE_MODIFY_ACTION], + dmn->num_buddies[DR_ICM_TYPE_MODIFY_HDR_PTRN]); ret = dr_dump_domain_info(file, &dmn->info, domain_id); if (ret < 0) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_domain.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_domain.c index 5b8bb2ca31e6..9a2dfe6ebe31 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_domain.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_domain.c @@ -10,6 +10,46 @@ ((dmn)->info.caps.dmn_type##_sw_owner_v2 && \ (dmn)->info.caps.sw_format_ver <= MLX5_STEERING_FORMAT_CONNECTX_7)) +bool mlx5dr_domain_is_support_ptrn_arg(struct mlx5dr_domain *dmn) +{ + return dmn->info.caps.sw_format_ver >= MLX5_STEERING_FORMAT_CONNECTX_6DX && + dmn->info.caps.support_modify_argument; +} + +static int dr_domain_init_modify_header_resources(struct mlx5dr_domain *dmn) +{ + if (!mlx5dr_domain_is_support_ptrn_arg(dmn)) + return 0; + + dmn->ptrn_mgr = mlx5dr_ptrn_mgr_create(dmn); + if (!dmn->ptrn_mgr) { + mlx5dr_err(dmn, "Couldn't create ptrn_mgr\n"); + return -ENOMEM; + } + + /* create argument pool */ + dmn->arg_mgr = mlx5dr_arg_mgr_create(dmn); + if (!dmn->arg_mgr) { + mlx5dr_err(dmn, "Couldn't 
create arg_mgr\n"); + goto free_modify_header_pattern; + } + + return 0; + +free_modify_header_pattern: + mlx5dr_ptrn_mgr_destroy(dmn->ptrn_mgr); + return -ENOMEM; +} + +static void dr_domain_destroy_modify_header_resources(struct mlx5dr_domain *dmn) +{ + if (!mlx5dr_domain_is_support_ptrn_arg(dmn)) + return; + + mlx5dr_arg_mgr_destroy(dmn->arg_mgr); + mlx5dr_ptrn_mgr_destroy(dmn->ptrn_mgr); +} + static void dr_domain_init_csum_recalc_fts(struct mlx5dr_domain *dmn) { /* Per vport cached FW FT for checksum recalculation, this @@ -149,14 +189,22 @@ static int dr_domain_init_resources(struct mlx5dr_domain *dmn) goto clean_uar; } + ret = dr_domain_init_modify_header_resources(dmn); + if (ret) { + mlx5dr_err(dmn, "Couldn't create modify-header-resources\n"); + goto clean_mem_resources; + } + ret = mlx5dr_send_ring_alloc(dmn); if (ret) { mlx5dr_err(dmn, "Couldn't create send-ring\n"); - goto clean_mem_resources; + goto clean_modify_hdr; } return 0; +clean_modify_hdr: + dr_domain_destroy_modify_header_resources(dmn); clean_mem_resources: dr_domain_uninit_mem_resources(dmn); clean_uar: @@ -170,6 +218,7 @@ clean_pd: static void dr_domain_uninit_resources(struct mlx5dr_domain *dmn) { mlx5dr_send_ring_free(dmn, dmn->send_ring); + dr_domain_destroy_modify_header_resources(dmn); dr_domain_uninit_mem_resources(dmn); mlx5_put_uars_page(dmn->mdev, dmn->uar); mlx5_core_dealloc_pd(dmn->mdev, dmn->pdn); @@ -215,7 +264,7 @@ static int dr_domain_query_vport(struct mlx5dr_domain *dmn, return 0; } -static int dr_domain_query_esw_mngr(struct mlx5dr_domain *dmn) +static int dr_domain_query_esw_mgr(struct mlx5dr_domain *dmn) { return dr_domain_query_vport(dmn, 0, false, &dmn->info.caps.vports.esw_manager_caps); @@ -321,7 +370,7 @@ static int dr_domain_query_fdb_caps(struct mlx5_core_dev *mdev, * vports (vport 0, VFs and SFs) will be queried dynamically. 
*/ - ret = dr_domain_query_esw_mngr(dmn); + ret = dr_domain_query_esw_mgr(dmn); if (ret) { mlx5dr_err(dmn, "Failed to query eswitch manager vport caps (err: %d)", ret); goto free_vports_caps_xa; @@ -435,6 +484,9 @@ mlx5dr_domain_create(struct mlx5_core_dev *mdev, enum mlx5dr_domain_type type) dmn->info.max_log_action_icm_sz = DR_CHUNK_SIZE_4K; dmn->info.max_log_sw_icm_sz = min_t(u32, DR_CHUNK_SIZE_1024K, dmn->info.caps.log_icm_size); + dmn->info.max_log_modify_hdr_pattern_icm_sz = + min_t(u32, DR_CHUNK_SIZE_4K, + dmn->info.caps.log_modify_pattern_icm_size); if (!dmn->info.supp_sw_steering) { mlx5dr_err(dmn, "SW steering is not supported\n"); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c index 3eb6719bc8eb..0b5af9f3f605 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.c @@ -4,7 +4,9 @@ #include "dr_types.h" #define DR_ICM_MODIFY_HDR_ALIGN_BASE 64 -#define DR_ICM_POOL_HOT_MEMORY_FRACTION 4 +#define DR_ICM_POOL_STE_HOT_MEM_PERCENT 25 +#define DR_ICM_POOL_MODIFY_HDR_PTRN_HOT_MEM_PERCENT 50 +#define DR_ICM_POOL_MODIFY_ACTION_HOT_MEM_PERCENT 90 struct mlx5dr_icm_hot_chunk { struct mlx5dr_icm_buddy_mem *buddy_mem; @@ -29,6 +31,8 @@ struct mlx5dr_icm_pool { struct mlx5dr_icm_hot_chunk *hot_chunks_arr; u32 hot_chunks_num; u64 hot_memory_size; + /* hot memory size threshold for triggering sync */ + u64 th; }; struct mlx5dr_icm_dm { @@ -107,9 +111,9 @@ static struct mlx5dr_icm_mr * dr_icm_pool_mr_create(struct mlx5dr_icm_pool *pool) { struct mlx5_core_dev *mdev = pool->dmn->mdev; - enum mlx5_sw_icm_type dm_type; + enum mlx5_sw_icm_type dm_type = 0; struct mlx5dr_icm_mr *icm_mr; - size_t log_align_base; + size_t log_align_base = 0; int err; icm_mr = kvzalloc(sizeof(*icm_mr), GFP_KERNEL); @@ -121,14 +125,25 @@ dr_icm_pool_mr_create(struct mlx5dr_icm_pool *pool) icm_mr->dm.length = 
mlx5dr_icm_pool_chunk_size_to_byte(pool->max_log_chunk_sz, pool->icm_type); - if (pool->icm_type == DR_ICM_TYPE_STE) { + switch (pool->icm_type) { + case DR_ICM_TYPE_STE: dm_type = MLX5_SW_ICM_TYPE_STEERING; log_align_base = ilog2(icm_mr->dm.length); - } else { + break; + case DR_ICM_TYPE_MODIFY_ACTION: dm_type = MLX5_SW_ICM_TYPE_HEADER_MODIFY; /* Align base is 64B */ log_align_base = ilog2(DR_ICM_MODIFY_HDR_ALIGN_BASE); + break; + case DR_ICM_TYPE_MODIFY_HDR_PTRN: + dm_type = MLX5_SW_ICM_TYPE_HEADER_MODIFY_PATTERN; + /* Align base is 64B */ + log_align_base = ilog2(DR_ICM_MODIFY_HDR_ALIGN_BASE); + break; + default: + WARN_ON(pool->icm_type); } + icm_mr->dm.type = dm_type; err = mlx5_dm_sw_icm_alloc(mdev, icm_mr->dm.type, icm_mr->dm.length, @@ -273,6 +288,8 @@ static int dr_icm_buddy_create(struct mlx5dr_icm_pool *pool) /* add it to the -start- of the list in order to search in it first */ list_add(&buddy->list_node, &pool->buddy_mem_list); + pool->dmn->num_buddies[pool->icm_type]++; + return 0; err_cleanup_buddy: @@ -286,13 +303,17 @@ free_mr: static void dr_icm_buddy_destroy(struct mlx5dr_icm_buddy_mem *buddy) { + enum mlx5dr_icm_type icm_type = buddy->pool->icm_type; + dr_icm_pool_mr_destroy(buddy->icm_mr); mlx5dr_buddy_cleanup(buddy); - if (buddy->pool->icm_type == DR_ICM_TYPE_STE) + if (icm_type == DR_ICM_TYPE_STE) dr_icm_buddy_cleanup_ste_cache(buddy); + buddy->pool->dmn->num_buddies[icm_type]--; + kvfree(buddy); } @@ -319,15 +340,7 @@ dr_icm_chunk_init(struct mlx5dr_icm_chunk *chunk, static bool dr_icm_pool_is_sync_required(struct mlx5dr_icm_pool *pool) { - int allow_hot_size; - - /* sync when hot memory reaches a certain fraction of the pool size */ - allow_hot_size = - mlx5dr_icm_pool_chunk_size_to_byte(pool->max_log_chunk_sz, - pool->icm_type) / - DR_ICM_POOL_HOT_MEMORY_FRACTION; - - return pool->hot_memory_size > allow_hot_size; + return pool->hot_memory_size > pool->th; } static void dr_icm_pool_clear_hot_chunks_arr(struct mlx5dr_icm_pool *pool) @@ 
-492,14 +505,9 @@ void mlx5dr_icm_pool_free_htbl(struct mlx5dr_icm_pool *pool, struct mlx5dr_ste_h struct mlx5dr_icm_pool *mlx5dr_icm_pool_create(struct mlx5dr_domain *dmn, enum mlx5dr_icm_type icm_type) { - u32 num_of_chunks, entry_size, max_hot_size; - enum mlx5dr_icm_chunk_size max_log_chunk_sz; + u32 num_of_chunks, entry_size; struct mlx5dr_icm_pool *pool; - - if (icm_type == DR_ICM_TYPE_STE) - max_log_chunk_sz = dmn->info.max_log_sw_icm_sz; - else - max_log_chunk_sz = dmn->info.max_log_action_icm_sz; + u32 max_hot_size = 0; pool = kvzalloc(sizeof(*pool), GFP_KERNEL); if (!pool) @@ -507,20 +515,38 @@ struct mlx5dr_icm_pool *mlx5dr_icm_pool_create(struct mlx5dr_domain *dmn, pool->dmn = dmn; pool->icm_type = icm_type; - pool->max_log_chunk_sz = max_log_chunk_sz; pool->chunks_kmem_cache = dmn->chunks_kmem_cache; INIT_LIST_HEAD(&pool->buddy_mem_list); - mutex_init(&pool->mutex); - entry_size = mlx5dr_icm_pool_dm_type_to_entry_size(pool->icm_type); + switch (icm_type) { + case DR_ICM_TYPE_STE: + pool->max_log_chunk_sz = dmn->info.max_log_sw_icm_sz; + max_hot_size = mlx5dr_icm_pool_chunk_size_to_byte(pool->max_log_chunk_sz, + pool->icm_type) * + DR_ICM_POOL_STE_HOT_MEM_PERCENT / 100; + break; + case DR_ICM_TYPE_MODIFY_ACTION: + pool->max_log_chunk_sz = dmn->info.max_log_action_icm_sz; + max_hot_size = mlx5dr_icm_pool_chunk_size_to_byte(pool->max_log_chunk_sz, + pool->icm_type) * + DR_ICM_POOL_MODIFY_ACTION_HOT_MEM_PERCENT / 100; + break; + case DR_ICM_TYPE_MODIFY_HDR_PTRN: + pool->max_log_chunk_sz = dmn->info.max_log_modify_hdr_pattern_icm_sz; + max_hot_size = mlx5dr_icm_pool_chunk_size_to_byte(pool->max_log_chunk_sz, + pool->icm_type) * + DR_ICM_POOL_MODIFY_HDR_PTRN_HOT_MEM_PERCENT / 100; + break; + default: + WARN_ON(icm_type); + } - max_hot_size = mlx5dr_icm_pool_chunk_size_to_byte(pool->max_log_chunk_sz, - pool->icm_type) / - DR_ICM_POOL_HOT_MEMORY_FRACTION; + entry_size = mlx5dr_icm_pool_dm_type_to_entry_size(pool->icm_type); num_of_chunks = 
DIV_ROUND_UP(max_hot_size, entry_size) + 1; + pool->th = max_hot_size; pool->hot_chunks_arr = kvcalloc(num_of_chunks, sizeof(struct mlx5dr_icm_hot_chunk), diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ptrn.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ptrn.c new file mode 100644 index 000000000000..13e06a6a6b22 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ptrn.c @@ -0,0 +1,241 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +// Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. + +#include "dr_types.h" +#include "mlx5_ifc_dr_ste_v1.h" + +enum dr_ptrn_modify_hdr_action_id { + DR_PTRN_MODIFY_HDR_ACTION_ID_NOP = 0x00, + DR_PTRN_MODIFY_HDR_ACTION_ID_COPY = 0x05, + DR_PTRN_MODIFY_HDR_ACTION_ID_SET = 0x06, + DR_PTRN_MODIFY_HDR_ACTION_ID_ADD = 0x07, + DR_PTRN_MODIFY_HDR_ACTION_ID_INSERT_INLINE = 0x0a, +}; + +struct mlx5dr_ptrn_mgr { + struct mlx5dr_domain *dmn; + struct mlx5dr_icm_pool *ptrn_icm_pool; + /* cache for modify_header ptrn */ + struct list_head ptrn_list; + struct mutex modify_hdr_mutex; /* protect the pattern cache */ +}; + +/* Cache structure and functions */ +static bool dr_ptrn_compare_modify_hdr(size_t cur_num_of_actions, + __be64 cur_hw_actions[], + size_t num_of_actions, + __be64 hw_actions[]) +{ + int i; + + if (cur_num_of_actions != num_of_actions) + return false; + + for (i = 0; i < num_of_actions; i++) { + u8 action_id = + MLX5_GET(ste_double_action_set_v1, &hw_actions[i], action_id); + + if (action_id == DR_PTRN_MODIFY_HDR_ACTION_ID_COPY) { + if (hw_actions[i] != cur_hw_actions[i]) + return false; + } else { + if ((__force __be32)hw_actions[i] != + (__force __be32)cur_hw_actions[i]) + return false; + } + } + + return true; +} + +static struct mlx5dr_ptrn_obj * +dr_ptrn_find_cached_pattern(struct mlx5dr_ptrn_mgr *mgr, + size_t num_of_actions, + __be64 hw_actions[]) +{ + struct mlx5dr_ptrn_obj *cached_pattern; + struct mlx5dr_ptrn_obj *tmp; + + 
list_for_each_entry_safe(cached_pattern, tmp, &mgr->ptrn_list, list) { + if (dr_ptrn_compare_modify_hdr(cached_pattern->num_of_actions, + (__be64 *)cached_pattern->data, + num_of_actions, + hw_actions)) { + /* Put this pattern in the head of the list, + * as we will probably use it more. + */ + list_del_init(&cached_pattern->list); + list_add(&cached_pattern->list, &mgr->ptrn_list); + return cached_pattern; + } + } + + return NULL; +} + +static struct mlx5dr_ptrn_obj * +dr_ptrn_alloc_pattern(struct mlx5dr_ptrn_mgr *mgr, + u16 num_of_actions, u8 *data) +{ + struct mlx5dr_ptrn_obj *pattern; + struct mlx5dr_icm_chunk *chunk; + u32 chunk_size; + u32 index; + + chunk_size = ilog2(num_of_actions); + /* HW modify action index granularity is at least 64B */ + chunk_size = max_t(u32, chunk_size, DR_CHUNK_SIZE_8); + + chunk = mlx5dr_icm_alloc_chunk(mgr->ptrn_icm_pool, chunk_size); + if (!chunk) + return NULL; + + index = (mlx5dr_icm_pool_get_chunk_icm_addr(chunk) - + mgr->dmn->info.caps.hdr_modify_pattern_icm_addr) / + DR_ACTION_CACHE_LINE_SIZE; + + pattern = kzalloc(sizeof(*pattern), GFP_KERNEL); + if (!pattern) + goto free_chunk; + + pattern->data = kzalloc(num_of_actions * DR_MODIFY_ACTION_SIZE * + sizeof(*pattern->data), GFP_KERNEL); + if (!pattern->data) + goto free_pattern; + + memcpy(pattern->data, data, num_of_actions * DR_MODIFY_ACTION_SIZE); + pattern->chunk = chunk; + pattern->index = index; + pattern->num_of_actions = num_of_actions; + + list_add(&pattern->list, &mgr->ptrn_list); + refcount_set(&pattern->refcount, 1); + + return pattern; + +free_pattern: + kfree(pattern); +free_chunk: + mlx5dr_icm_free_chunk(chunk); + return NULL; +} + +static void +dr_ptrn_free_pattern(struct mlx5dr_ptrn_obj *pattern) +{ + list_del(&pattern->list); + mlx5dr_icm_free_chunk(pattern->chunk); + kfree(pattern->data); + kfree(pattern); +} + +struct mlx5dr_ptrn_obj * +mlx5dr_ptrn_cache_get_pattern(struct mlx5dr_ptrn_mgr *mgr, + u16 num_of_actions, + u8 *data) +{ + struct 
mlx5dr_ptrn_obj *pattern; + u64 *hw_actions; + u8 action_id; + int i; + + mutex_lock(&mgr->modify_hdr_mutex); + pattern = dr_ptrn_find_cached_pattern(mgr, + num_of_actions, + (__be64 *)data); + if (!pattern) { + /* Alloc and add new pattern to cache */ + pattern = dr_ptrn_alloc_pattern(mgr, num_of_actions, data); + if (!pattern) + goto out_unlock; + + hw_actions = (u64 *)pattern->data; + /* Here we mask the pattern data to create a valid pattern + * since we do an OR operation between the arg and pattern + */ + for (i = 0; i < num_of_actions; i++) { + action_id = MLX5_GET(ste_double_action_set_v1, &hw_actions[i], action_id); + + if (action_id == DR_PTRN_MODIFY_HDR_ACTION_ID_SET || + action_id == DR_PTRN_MODIFY_HDR_ACTION_ID_ADD || + action_id == DR_PTRN_MODIFY_HDR_ACTION_ID_INSERT_INLINE) + MLX5_SET(ste_double_action_set_v1, &hw_actions[i], inline_data, 0); + } + + if (mlx5dr_send_postsend_pattern(mgr->dmn, pattern->chunk, + num_of_actions, pattern->data)) { + refcount_dec(&pattern->refcount); + goto free_pattern; + } + } else { + refcount_inc(&pattern->refcount); + } + + mutex_unlock(&mgr->modify_hdr_mutex); + + return pattern; + +free_pattern: + dr_ptrn_free_pattern(pattern); +out_unlock: + mutex_unlock(&mgr->modify_hdr_mutex); + return NULL; +} + +void +mlx5dr_ptrn_cache_put_pattern(struct mlx5dr_ptrn_mgr *mgr, + struct mlx5dr_ptrn_obj *pattern) +{ + mutex_lock(&mgr->modify_hdr_mutex); + + if (refcount_dec_and_test(&pattern->refcount)) + dr_ptrn_free_pattern(pattern); + + mutex_unlock(&mgr->modify_hdr_mutex); +} + +struct mlx5dr_ptrn_mgr *mlx5dr_ptrn_mgr_create(struct mlx5dr_domain *dmn) +{ + struct mlx5dr_ptrn_mgr *mgr; + + if (!mlx5dr_domain_is_support_ptrn_arg(dmn)) + return NULL; + + mgr = kzalloc(sizeof(*mgr), GFP_KERNEL); + if (!mgr) + return NULL; + + mgr->dmn = dmn; + mgr->ptrn_icm_pool = mlx5dr_icm_pool_create(dmn, DR_ICM_TYPE_MODIFY_HDR_PTRN); + if (!mgr->ptrn_icm_pool) { + mlx5dr_err(dmn, "Couldn't get modify-header-pattern memory\n"); + goto 
free_mgr; + } + + INIT_LIST_HEAD(&mgr->ptrn_list); + return mgr; + +free_mgr: + kfree(mgr); + return NULL; +} + +void mlx5dr_ptrn_mgr_destroy(struct mlx5dr_ptrn_mgr *mgr) +{ + struct mlx5dr_ptrn_obj *pattern; + struct mlx5dr_ptrn_obj *tmp; + + if (!mgr) + return; + + WARN_ON(!list_empty(&mgr->ptrn_list)); + + list_for_each_entry_safe(pattern, tmp, &mgr->ptrn_list, list) { + list_del(&pattern->list); + kfree(pattern->data); + kfree(pattern); + } + + mlx5dr_icm_pool_destroy(mgr->ptrn_icm_pool); + kfree(mgr); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c index fd2d31cdbcf9..4a5ae86e2b62 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c @@ -18,7 +18,13 @@ struct dr_data_seg { unsigned int send_flags; }; +enum send_info_type { + WRITE_ICM = 0, + GTA_ARG = 1, +}; + struct postsend_info { + enum send_info_type type; struct dr_data_seg write; struct dr_data_seg read; u64 remote_addr; @@ -261,9 +267,10 @@ static struct mlx5dr_qp *dr_create_rc_qp(struct mlx5_core_dev *mdev, dr_qp->rq.pc = 0; dr_qp->rq.cc = 0; - dr_qp->rq.wqe_cnt = 4; + dr_qp->rq.wqe_cnt = 256; dr_qp->sq.pc = 0; dr_qp->sq.cc = 0; + dr_qp->sq.head = 0; dr_qp->sq.wqe_cnt = roundup_pow_of_two(attr->max_send_wr); MLX5_SET(qpc, temp_qpc, log_rq_stride, ilog2(MLX5_SEND_WQE_DS) - 4); @@ -362,39 +369,113 @@ static void dr_cmd_notify_hw(struct mlx5dr_qp *dr_qp, void *ctrl) mlx5_write64(ctrl, dr_qp->uar->map + MLX5_BF_OFFSET); } -static void dr_rdma_segments(struct mlx5dr_qp *dr_qp, u64 remote_addr, - u32 rkey, struct dr_data_seg *data_seg, - u32 opcode, bool notify_hw) +static void +dr_rdma_handle_flow_access_arg_segments(struct mlx5_wqe_ctrl_seg *wq_ctrl, + u32 remote_addr, + struct dr_data_seg *data_seg, + int *size) { - struct mlx5_wqe_raddr_seg *wq_raddr; - struct mlx5_wqe_ctrl_seg *wq_ctrl; - struct mlx5_wqe_data_seg *wq_dseg; - unsigned 
int size; - unsigned int idx; + struct mlx5_wqe_header_modify_argument_update_seg *wq_arg_seg; + struct mlx5_wqe_flow_update_ctrl_seg *wq_flow_seg; - size = sizeof(*wq_ctrl) / 16 + sizeof(*wq_dseg) / 16 + - sizeof(*wq_raddr) / 16; + wq_ctrl->general_id = cpu_to_be32(remote_addr); + wq_flow_seg = (void *)(wq_ctrl + 1); - idx = dr_qp->sq.pc & (dr_qp->sq.wqe_cnt - 1); + /* mlx5_wqe_flow_update_ctrl_seg - all reserved */ + memset(wq_flow_seg, 0, sizeof(*wq_flow_seg)); + wq_arg_seg = (void *)(wq_flow_seg + 1); + + memcpy(wq_arg_seg->argument_list, + (void *)(uintptr_t)data_seg->addr, + data_seg->length); + + *size = (sizeof(*wq_ctrl) + /* WQE ctrl segment */ + sizeof(*wq_flow_seg) + /* WQE flow update ctrl seg - reserved */ + sizeof(*wq_arg_seg)) / /* WQE hdr modify arg seg - data */ + MLX5_SEND_WQE_DS; +} + +static void +dr_rdma_handle_icm_write_segments(struct mlx5_wqe_ctrl_seg *wq_ctrl, + u64 remote_addr, + u32 rkey, + struct dr_data_seg *data_seg, + unsigned int *size) +{ + struct mlx5_wqe_raddr_seg *wq_raddr; + struct mlx5_wqe_data_seg *wq_dseg; - wq_ctrl = mlx5_wq_cyc_get_wqe(&dr_qp->wq.sq, idx); - wq_ctrl->imm = 0; - wq_ctrl->fm_ce_se = (data_seg->send_flags) ? 
- MLX5_WQE_CTRL_CQ_UPDATE : 0; - wq_ctrl->opmod_idx_opcode = cpu_to_be32(((dr_qp->sq.pc & 0xffff) << 8) | - opcode); - wq_ctrl->qpn_ds = cpu_to_be32(size | dr_qp->qpn << 8); wq_raddr = (void *)(wq_ctrl + 1); + wq_raddr->raddr = cpu_to_be64(remote_addr); wq_raddr->rkey = cpu_to_be32(rkey); wq_raddr->reserved = 0; wq_dseg = (void *)(wq_raddr + 1); + wq_dseg->byte_count = cpu_to_be32(data_seg->length); wq_dseg->lkey = cpu_to_be32(data_seg->lkey); wq_dseg->addr = cpu_to_be64(data_seg->addr); - dr_qp->sq.wqe_head[idx] = dr_qp->sq.pc++; + *size = (sizeof(*wq_ctrl) + /* WQE ctrl segment */ + sizeof(*wq_dseg) + /* WQE data segment */ + sizeof(*wq_raddr)) / /* WQE remote addr segment */ + MLX5_SEND_WQE_DS; +} + +static void dr_set_ctrl_seg(struct mlx5_wqe_ctrl_seg *wq_ctrl, + struct dr_data_seg *data_seg) +{ + wq_ctrl->signature = 0; + wq_ctrl->rsvd[0] = 0; + wq_ctrl->rsvd[1] = 0; + wq_ctrl->fm_ce_se = data_seg->send_flags & IB_SEND_SIGNALED ? + MLX5_WQE_CTRL_CQ_UPDATE : 0; + wq_ctrl->imm = 0; +} + +static void dr_rdma_segments(struct mlx5dr_qp *dr_qp, u64 remote_addr, + u32 rkey, struct dr_data_seg *data_seg, + u32 opcode, bool notify_hw) +{ + struct mlx5_wqe_ctrl_seg *wq_ctrl; + int opcode_mod = 0; + unsigned int size; + unsigned int idx; + + idx = dr_qp->sq.pc & (dr_qp->sq.wqe_cnt - 1); + + wq_ctrl = mlx5_wq_cyc_get_wqe(&dr_qp->wq.sq, idx); + dr_set_ctrl_seg(wq_ctrl, data_seg); + + switch (opcode) { + case MLX5_OPCODE_RDMA_READ: + case MLX5_OPCODE_RDMA_WRITE: + dr_rdma_handle_icm_write_segments(wq_ctrl, remote_addr, + rkey, data_seg, &size); + break; + case MLX5_OPCODE_FLOW_TBL_ACCESS: + opcode_mod = MLX5_CMD_OP_MOD_UPDATE_HEADER_MODIFY_ARGUMENT; + dr_rdma_handle_flow_access_arg_segments(wq_ctrl, remote_addr, + data_seg, &size); + break; + default: + WARN(true, "illegal opcode %d", opcode); + return; + } + + /* -------------------------------------------------------- + * |opcode_mod (8 bit)|wqe_index (16 bits)| opcod (8 bits)| + * 
-------------------------------------------------------- + */ + wq_ctrl->opmod_idx_opcode = + cpu_to_be32((opcode_mod << 24) | + ((dr_qp->sq.pc & 0xffff) << 8) | + opcode); + wq_ctrl->qpn_ds = cpu_to_be32(size | dr_qp->qpn << 8); + + dr_qp->sq.pc += DIV_ROUND_UP(size * 16, MLX5_SEND_WQE_BB); + dr_qp->sq.wqe_head[idx] = dr_qp->sq.head++; if (notify_hw) dr_cmd_notify_hw(dr_qp, wq_ctrl); @@ -402,10 +483,16 @@ static void dr_rdma_segments(struct mlx5dr_qp *dr_qp, u64 remote_addr, static void dr_post_send(struct mlx5dr_qp *dr_qp, struct postsend_info *send_info) { - dr_rdma_segments(dr_qp, send_info->remote_addr, send_info->rkey, - &send_info->write, MLX5_OPCODE_RDMA_WRITE, false); - dr_rdma_segments(dr_qp, send_info->remote_addr, send_info->rkey, - &send_info->read, MLX5_OPCODE_RDMA_READ, true); + if (send_info->type == WRITE_ICM) { + dr_rdma_segments(dr_qp, send_info->remote_addr, send_info->rkey, + &send_info->write, MLX5_OPCODE_RDMA_WRITE, false); + dr_rdma_segments(dr_qp, send_info->remote_addr, send_info->rkey, + &send_info->read, MLX5_OPCODE_RDMA_READ, true); + } else { /* GTA_ARG */ + dr_rdma_segments(dr_qp, send_info->remote_addr, send_info->rkey, + &send_info->write, MLX5_OPCODE_FLOW_TBL_ACCESS, true); + } + } /** @@ -471,24 +558,54 @@ static int dr_handle_pending_wc(struct mlx5dr_domain *dmn, } else if (ne == 1) { send_ring->pending_wqe -= send_ring->signal_th; } - } while (is_drain && send_ring->pending_wqe); + } while (ne == 1 || + (is_drain && send_ring->pending_wqe >= send_ring->signal_th)); return 0; } -static void dr_fill_data_segs(struct mlx5dr_send_ring *send_ring, - struct postsend_info *send_info) +static void dr_fill_write_args_segs(struct mlx5dr_send_ring *send_ring, + struct postsend_info *send_info) { send_ring->pending_wqe++; if (send_ring->pending_wqe % send_ring->signal_th == 0) send_info->write.send_flags |= IB_SEND_SIGNALED; + else + send_info->write.send_flags = 0; +} + +static void dr_fill_write_icm_segs(struct mlx5dr_domain *dmn, + 
struct mlx5dr_send_ring *send_ring, + struct postsend_info *send_info) +{ + u32 buff_offset; + + if (send_info->write.length > dmn->info.max_inline_size) { + buff_offset = (send_ring->tx_head & + (dmn->send_ring->signal_th - 1)) * + send_ring->max_post_send_size; + /* Copy to ring mr */ + memcpy(send_ring->buf + buff_offset, + (void *)(uintptr_t)send_info->write.addr, + send_info->write.length); + send_info->write.addr = (uintptr_t)send_ring->mr->dma_addr + buff_offset; + send_info->write.lkey = send_ring->mr->mkey; + + send_ring->tx_head++; + } + + send_ring->pending_wqe++; + + if (send_ring->pending_wqe % send_ring->signal_th == 0) + send_info->write.send_flags |= IB_SEND_SIGNALED; send_ring->pending_wqe++; send_info->read.length = send_info->write.length; - /* Read into the same write area */ - send_info->read.addr = (uintptr_t)send_info->write.addr; - send_info->read.lkey = send_ring->mr->mkey; + + /* Read into dedicated sync buffer */ + send_info->read.addr = (uintptr_t)send_ring->sync_mr->dma_addr; + send_info->read.lkey = send_ring->sync_mr->mkey; if (send_ring->pending_wqe % send_ring->signal_th == 0) send_info->read.send_flags = IB_SEND_SIGNALED; @@ -496,11 +613,20 @@ static void dr_fill_data_segs(struct mlx5dr_send_ring *send_ring, send_info->read.send_flags = 0; } +static void dr_fill_data_segs(struct mlx5dr_domain *dmn, + struct mlx5dr_send_ring *send_ring, + struct postsend_info *send_info) +{ + if (send_info->type == WRITE_ICM) + dr_fill_write_icm_segs(dmn, send_ring, send_info); + else /* args */ + dr_fill_write_args_segs(send_ring, send_info); +} + static int dr_postsend_icm_data(struct mlx5dr_domain *dmn, struct postsend_info *send_info) { struct mlx5dr_send_ring *send_ring = dmn->send_ring; - u32 buff_offset; int ret; if (unlikely(dmn->mdev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR || @@ -517,20 +643,7 @@ static int dr_postsend_icm_data(struct mlx5dr_domain *dmn, if (ret) goto out_unlock; - if (send_info->write.length > dmn->info.max_inline_size) 
{ - buff_offset = (send_ring->tx_head & - (dmn->send_ring->signal_th - 1)) * - send_ring->max_post_send_size; - /* Copy to ring mr */ - memcpy(send_ring->buf + buff_offset, - (void *)(uintptr_t)send_info->write.addr, - send_info->write.length); - send_info->write.addr = (uintptr_t)send_ring->mr->dma_addr + buff_offset; - send_info->write.lkey = send_ring->mr->mkey; - } - - send_ring->tx_head++; - dr_fill_data_segs(send_ring, send_info); + dr_fill_data_segs(dmn, send_ring, send_info); dr_post_send(send_ring->qp, send_info); out_unlock: @@ -736,6 +849,59 @@ int mlx5dr_send_postsend_action(struct mlx5dr_domain *dmn, return dr_postsend_icm_data(dmn, &send_info); } +int mlx5dr_send_postsend_pattern(struct mlx5dr_domain *dmn, + struct mlx5dr_icm_chunk *chunk, + u16 num_of_actions, + u8 *data) +{ + struct postsend_info send_info = {}; + int ret; + + send_info.write.addr = (uintptr_t)data; + send_info.write.length = num_of_actions * DR_MODIFY_ACTION_SIZE; + send_info.remote_addr = mlx5dr_icm_pool_get_chunk_mr_addr(chunk); + send_info.rkey = mlx5dr_icm_pool_get_chunk_rkey(chunk); + + ret = dr_postsend_icm_data(dmn, &send_info); + if (ret) + return ret; + + return 0; +} + +int mlx5dr_send_postsend_args(struct mlx5dr_domain *dmn, u64 arg_id, + u16 num_of_actions, u8 *actions_data) +{ + int data_len, iter = 0, cur_sent; + u64 addr; + int ret; + + addr = (uintptr_t)actions_data; + data_len = num_of_actions * DR_MODIFY_ACTION_SIZE; + + do { + struct postsend_info send_info = {}; + + send_info.type = GTA_ARG; + send_info.write.addr = addr; + cur_sent = min_t(u32, data_len, DR_ACTION_CACHE_LINE_SIZE); + send_info.write.length = cur_sent; + send_info.write.lkey = 0; + send_info.remote_addr = arg_id + iter; + + ret = dr_postsend_icm_data(dmn, &send_info); + if (ret) + goto out; + + iter++; + addr += cur_sent; + data_len -= cur_sent; + } while (data_len > 0); + +out: + return ret; +} + static int dr_modify_qp_rst2init(struct mlx5_core_dev *mdev, struct mlx5dr_qp *dr_qp, int port) @@ 
-1123,16 +1289,25 @@ int mlx5dr_send_ring_alloc(struct mlx5dr_domain *dmn) goto free_mem; } + dmn->send_ring->sync_buff = kzalloc(dmn->send_ring->max_post_send_size, + GFP_KERNEL); + if (!dmn->send_ring->sync_buff) { + ret = -ENOMEM; + goto clean_mr; + } + dmn->send_ring->sync_mr = dr_reg_mr(dmn->mdev, dmn->pdn, dmn->send_ring->sync_buff, - MIN_READ_SYNC); + dmn->send_ring->max_post_send_size); if (!dmn->send_ring->sync_mr) { ret = -ENOMEM; - goto clean_mr; + goto free_sync_mem; } return 0; +free_sync_mem: + kfree(dmn->send_ring->sync_buff); clean_mr: dr_dereg_mr(dmn->mdev, dmn->send_ring->mr); free_mem: @@ -1155,6 +1330,7 @@ void mlx5dr_send_ring_free(struct mlx5dr_domain *dmn, dr_dereg_mr(dmn->mdev, send_ring->sync_mr); dr_dereg_mr(dmn->mdev, send_ring->mr); kfree(send_ring->buf); + kfree(send_ring->sync_buff); kfree(send_ring); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c index 1e15f605df6e..9413aaf51251 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.c @@ -633,6 +633,63 @@ int mlx5dr_ste_set_action_decap_l3_list(struct mlx5dr_ste_ctx *ste_ctx, used_hw_action_num); } +static int +dr_ste_alloc_modify_hdr_chunk(struct mlx5dr_action *action) +{ + struct mlx5dr_domain *dmn = action->rewrite->dmn; + u32 chunk_size; + int ret; + + chunk_size = ilog2(roundup_pow_of_two(action->rewrite->num_of_actions)); + + /* HW modify action index granularity is at least 64B */ + chunk_size = max_t(u32, chunk_size, DR_CHUNK_SIZE_8); + + action->rewrite->chunk = mlx5dr_icm_alloc_chunk(dmn->action_icm_pool, + chunk_size); + if (!action->rewrite->chunk) + return -ENOMEM; + + action->rewrite->index = (mlx5dr_icm_pool_get_chunk_icm_addr(action->rewrite->chunk) - + dmn->info.caps.hdr_modify_icm_addr) / + DR_ACTION_CACHE_LINE_SIZE; + + ret = mlx5dr_send_postsend_action(action->rewrite->dmn, action); + if (ret) + goto 
free_chunk; + + return 0; + +free_chunk: + mlx5dr_icm_free_chunk(action->rewrite->chunk); + return -ENOMEM; +} + +static void dr_ste_free_modify_hdr_chunk(struct mlx5dr_action *action) +{ + mlx5dr_icm_free_chunk(action->rewrite->chunk); +} + +int mlx5dr_ste_alloc_modify_hdr(struct mlx5dr_action *action) +{ + struct mlx5dr_domain *dmn = action->rewrite->dmn; + + if (mlx5dr_domain_is_support_ptrn_arg(dmn)) + return dmn->ste_ctx->alloc_modify_hdr_chunk(action); + + return dr_ste_alloc_modify_hdr_chunk(action); +} + +void mlx5dr_ste_free_modify_hdr(struct mlx5dr_action *action) +{ + struct mlx5dr_domain *dmn = action->rewrite->dmn; + + if (mlx5dr_domain_is_support_ptrn_arg(dmn)) + return dmn->ste_ctx->dealloc_modify_hdr_chunk(action); + + return dr_ste_free_modify_hdr_chunk(action); +} + static int dr_ste_build_pre_check_spec(struct mlx5dr_domain *dmn, struct mlx5dr_match_spec *spec) { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.h index 7075142bcfb6..54a6619c3ecb 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.h @@ -195,6 +195,8 @@ struct mlx5dr_ste_ctx { u8 *hw_action, u32 hw_action_sz, u16 *used_hw_action_num); + int (*alloc_modify_hdr_chunk)(struct mlx5dr_action *action); + void (*dealloc_modify_hdr_chunk)(struct mlx5dr_action *action); /* Send */ void (*prepare_for_postsend)(u8 *hw_ste_p, u32 ste_size); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c index 084145f18084..4c0704ad166b 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.c @@ -495,21 +495,66 @@ static void dr_ste_v1_set_rx_decap(u8 *hw_ste_p, u8 *s_action) dr_ste_v1_set_reparse(hw_ste_p); } -static void dr_ste_v1_set_rewrite_actions(u8 *hw_ste_p, - u8 *s_action, 
- u16 num_of_actions, - u32 re_write_index) +static void dr_ste_v1_set_accelerated_rewrite_actions(u8 *hw_ste_p, + u8 *d_action, + u16 num_of_actions, + u32 rewrite_pattern, + u32 rewrite_args, + u8 *action_data) +{ + if (action_data) { + memcpy(d_action, action_data, DR_MODIFY_ACTION_SIZE); + } else { + MLX5_SET(ste_double_action_accelerated_modify_action_list_v1, d_action, + action_id, DR_STE_V1_ACTION_ID_ACCELERATED_LIST); + MLX5_SET(ste_double_action_accelerated_modify_action_list_v1, d_action, + modify_actions_pattern_pointer, rewrite_pattern); + MLX5_SET(ste_double_action_accelerated_modify_action_list_v1, d_action, + number_of_modify_actions, num_of_actions); + MLX5_SET(ste_double_action_accelerated_modify_action_list_v1, d_action, + modify_actions_argument_pointer, rewrite_args); + } + + dr_ste_v1_set_reparse(hw_ste_p); +} + +static void dr_ste_v1_set_basic_rewrite_actions(u8 *hw_ste_p, + u8 *s_action, + u16 num_of_actions, + u32 rewrite_index) { MLX5_SET(ste_single_action_modify_list_v1, s_action, action_id, DR_STE_V1_ACTION_ID_MODIFY_LIST); MLX5_SET(ste_single_action_modify_list_v1, s_action, num_of_modify_actions, num_of_actions); MLX5_SET(ste_single_action_modify_list_v1, s_action, modify_actions_ptr, - re_write_index); + rewrite_index); dr_ste_v1_set_reparse(hw_ste_p); } +static void dr_ste_v1_set_rewrite_actions(u8 *hw_ste_p, + u8 *action, + u16 num_of_actions, + u32 rewrite_pattern, + u32 rewrite_args, + u8 *action_data) +{ + if (rewrite_pattern != MLX5DR_INVALID_PATTERN_INDEX) + return dr_ste_v1_set_accelerated_rewrite_actions(hw_ste_p, + action, + num_of_actions, + rewrite_pattern, + rewrite_args, + action_data); + + /* fall back to the code that doesn't support accelerated modify header */ + return dr_ste_v1_set_basic_rewrite_actions(hw_ste_p, + action, + num_of_actions, + rewrite_args); +} + static void dr_ste_v1_set_aso_flow_meter(u8 *d_action, u32 object_id, u32 offset, @@ -604,9 +649,6 @@ void dr_ste_v1_set_actions_tx(struct mlx5dr_domain 
*dmn, allow_modify_hdr = false; } - if (action_type_set[DR_ACTION_TYP_CTR]) - dr_ste_v1_set_counter_id(last_ste, attr->ctr_id); - if (action_type_set[DR_ACTION_TYP_MODIFY_HDR]) { if (!allow_modify_hdr || action_sz < DR_STE_ACTION_DOUBLE_SZ) { dr_ste_v1_arr_init_next_match(&last_ste, added_stes, @@ -617,7 +659,9 @@ void dr_ste_v1_set_actions_tx(struct mlx5dr_domain *dmn, } dr_ste_v1_set_rewrite_actions(last_ste, action, attr->modify_actions, - attr->modify_index); + attr->modify_pat_idx, + attr->modify_index, + attr->single_modify_action); action_sz -= DR_STE_ACTION_DOUBLE_SZ; action += DR_STE_ACTION_DOUBLE_SZ; allow_encap = false; @@ -724,6 +768,10 @@ void dr_ste_v1_set_actions_tx(struct mlx5dr_domain *dmn, attr->range.max); } + /* set counter ID on the last STE to adhere to DMFS behavior */ + if (action_type_set[DR_ACTION_TYP_CTR]) + dr_ste_v1_set_counter_id(last_ste, attr->ctr_id); + dr_ste_v1_set_hit_gvmi(last_ste, attr->hit_gvmi); dr_ste_v1_set_hit_addr(last_ste, attr->final_icm_addr, 1); } @@ -743,7 +791,9 @@ void dr_ste_v1_set_actions_rx(struct mlx5dr_domain *dmn, if (action_type_set[DR_ACTION_TYP_TNL_L3_TO_L2]) { dr_ste_v1_set_rewrite_actions(last_ste, action, attr->decap_actions, - attr->decap_index); + attr->decap_pat_idx, + attr->decap_index, + NULL); action_sz -= DR_STE_ACTION_DOUBLE_SZ; action += DR_STE_ACTION_DOUBLE_SZ; allow_modify_hdr = false; @@ -798,7 +848,9 @@ void dr_ste_v1_set_actions_rx(struct mlx5dr_domain *dmn, } dr_ste_v1_set_rewrite_actions(last_ste, action, attr->modify_actions, - attr->modify_index); + attr->modify_pat_idx, + attr->modify_index, + attr->single_modify_action); action_sz -= DR_STE_ACTION_DOUBLE_SZ; action += DR_STE_ACTION_DOUBLE_SZ; } @@ -2175,6 +2227,49 @@ dr_ste_v1_build_tnl_gtpu_flex_parser_1_init(struct mlx5dr_ste_build *sb, sb->ste_build_tag_func = &dr_ste_v1_build_tnl_gtpu_flex_parser_1_tag; } +int dr_ste_v1_alloc_modify_hdr_ptrn_arg(struct mlx5dr_action *action) +{ + struct mlx5dr_ptrn_mgr *ptrn_mgr; + int ret; + + 
ptrn_mgr = action->rewrite->dmn->ptrn_mgr; + if (!ptrn_mgr) + return -EOPNOTSUPP; + + action->rewrite->arg = mlx5dr_arg_get_obj(action->rewrite->dmn->arg_mgr, + action->rewrite->num_of_actions, + action->rewrite->data); + if (!action->rewrite->arg) { + mlx5dr_err(action->rewrite->dmn, "Failed allocating args for modify header\n"); + return -EAGAIN; + } + + action->rewrite->ptrn = + mlx5dr_ptrn_cache_get_pattern(ptrn_mgr, + action->rewrite->num_of_actions, + action->rewrite->data); + if (!action->rewrite->ptrn) { + mlx5dr_err(action->rewrite->dmn, "Failed to get pattern\n"); + ret = -EAGAIN; + goto put_arg; + } + + return 0; + +put_arg: + mlx5dr_arg_put_obj(action->rewrite->dmn->arg_mgr, + action->rewrite->arg); + return ret; +} + +void dr_ste_v1_free_modify_hdr_ptrn_arg(struct mlx5dr_action *action) +{ + mlx5dr_ptrn_cache_put_pattern(action->rewrite->dmn->ptrn_mgr, + action->rewrite->ptrn); + mlx5dr_arg_put_obj(action->rewrite->dmn->arg_mgr, + action->rewrite->arg); +} + static struct mlx5dr_ste_ctx ste_ctx_v1 = { /* Builders */ .build_eth_l2_src_dst_init = &dr_ste_v1_build_eth_l2_src_dst_init, @@ -2231,6 +2326,9 @@ static struct mlx5dr_ste_ctx ste_ctx_v1 = { .set_action_add = &dr_ste_v1_set_action_add, .set_action_copy = &dr_ste_v1_set_action_copy, .set_action_decap_l3_list = &dr_ste_v1_set_action_decap_l3_list, + .alloc_modify_hdr_chunk = &dr_ste_v1_alloc_modify_hdr_ptrn_arg, + .dealloc_modify_hdr_chunk = &dr_ste_v1_free_modify_hdr_ptrn_arg, + /* Send */ .prepare_for_postsend = &dr_ste_v1_prepare_for_postsend, }; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.h index b5c0f0f8392f..e2fc69867088 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.h @@ -31,6 +31,8 @@ void dr_ste_v1_set_action_copy(u8 *d_action, u8 dst_hw_field, u8 dst_shifter, u8 dst_len, u8 src_hw_field, u8 src_shifter); int 
dr_ste_v1_set_action_decap_l3_list(void *data, u32 data_sz, u8 *hw_action, u32 hw_action_sz, u16 *used_hw_action_num); +int dr_ste_v1_alloc_modify_hdr_ptrn_arg(struct mlx5dr_action *action); +void dr_ste_v1_free_modify_hdr_ptrn_arg(struct mlx5dr_action *action); void dr_ste_v1_build_eth_l2_src_dst_init(struct mlx5dr_ste_build *sb, struct mlx5dr_match_param *mask); void dr_ste_v1_build_eth_l3_ipv6_dst_init(struct mlx5dr_ste_build *sb, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v2.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v2.c index cf1a3c9a1cf4..808b013cf48c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v2.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v2.c @@ -221,6 +221,8 @@ static struct mlx5dr_ste_ctx ste_ctx_v2 = { .set_action_add = &dr_ste_v1_set_action_add, .set_action_copy = &dr_ste_v1_set_action_copy, .set_action_decap_l3_list = &dr_ste_v1_set_action_decap_l3_list, + .alloc_modify_hdr_chunk = &dr_ste_v1_alloc_modify_hdr_ptrn_arg, + .dealloc_modify_hdr_chunk = &dr_ste_v1_free_modify_hdr_ptrn_arg, /* Send */ .prepare_for_postsend = &dr_ste_v1_prepare_for_postsend, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h index 2b769dcbd453..678a993ab053 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_types.h @@ -21,11 +21,16 @@ #define DR_NUM_OF_FLEX_PARSERS 8 #define DR_STE_MAX_FLEX_0_ID 3 #define DR_STE_MAX_FLEX_1_ID 7 +#define DR_ACTION_CACHE_LINE_SIZE 64 #define mlx5dr_err(dmn, arg...) mlx5_core_err((dmn)->mdev, ##arg) #define mlx5dr_info(dmn, arg...) mlx5_core_info((dmn)->mdev, ##arg) #define mlx5dr_dbg(dmn, arg...) 
mlx5_core_dbg((dmn)->mdev, ##arg) +struct mlx5dr_ptrn_mgr; +struct mlx5dr_arg_mgr; +struct mlx5dr_arg_obj; + static inline bool dr_is_flex_parser_0_id(u8 parser_id) { return parser_id <= DR_STE_MAX_FLEX_0_ID; @@ -66,6 +71,8 @@ enum mlx5dr_icm_chunk_size { enum mlx5dr_icm_type { DR_ICM_TYPE_STE, DR_ICM_TYPE_MODIFY_ACTION, + DR_ICM_TYPE_MODIFY_HDR_PTRN, + DR_ICM_TYPE_MAX, }; static inline enum mlx5dr_icm_chunk_size @@ -255,11 +262,15 @@ u64 mlx5dr_ste_get_mr_addr(struct mlx5dr_ste *ste); struct list_head *mlx5dr_ste_get_miss_list(struct mlx5dr_ste *ste); #define MLX5DR_MAX_VLANS 2 +#define MLX5DR_INVALID_PATTERN_INDEX 0xffffffff struct mlx5dr_ste_actions_attr { u32 modify_index; + u32 modify_pat_idx; u16 modify_actions; + u8 *single_modify_action; u32 decap_index; + u32 decap_pat_idx; u16 decap_actions; u8 decap_with_vlan:1; u64 final_icm_addr; @@ -331,6 +342,8 @@ int mlx5dr_ste_set_action_decap_l3_list(struct mlx5dr_ste_ctx *ste_ctx, u8 *hw_action, u32 hw_action_sz, u16 *used_hw_action_num); +int mlx5dr_ste_alloc_modify_hdr(struct mlx5dr_action *action); +void mlx5dr_ste_free_modify_hdr(struct mlx5dr_action *action); const struct mlx5dr_ste_action_modify_field * mlx5dr_ste_conv_modify_hdr_sw_field(struct mlx5dr_ste_ctx *ste_ctx, u16 sw_field); @@ -861,6 +874,8 @@ struct mlx5dr_cmd_caps { u64 esw_tx_drop_address; u32 log_icm_size; u64 hdr_modify_icm_addr; + u32 log_modify_pattern_icm_size; + u64 hdr_modify_pattern_icm_addr; u32 flex_protocols; u8 flex_parser_id_icmp_dw0; u8 flex_parser_id_icmp_dw1; @@ -888,6 +903,9 @@ struct mlx5dr_cmd_caps { struct mlx5dr_vports vports; bool prio_tag_required; struct mlx5dr_roce_cap roce_caps; + u16 log_header_modify_argument_granularity; + u16 log_header_modify_argument_max_alloc; + bool support_modify_argument; u8 is_ecpf:1; u8 isolate_vl_tc:1; }; @@ -910,6 +928,7 @@ struct mlx5dr_domain_info { u32 max_send_wr; u32 max_log_sw_icm_sz; u32 max_log_action_icm_sz; + u32 max_log_modify_hdr_pattern_icm_sz; struct mlx5dr_domain_rx_tx rx; 
struct mlx5dr_domain_rx_tx tx; struct mlx5dr_cmd_caps caps; @@ -928,6 +947,8 @@ struct mlx5dr_domain { struct mlx5dr_send_info_pool *send_info_pool_tx; struct kmem_cache *chunks_kmem_cache; struct kmem_cache *htbls_kmem_cache; + struct mlx5dr_ptrn_mgr *ptrn_mgr; + struct mlx5dr_arg_mgr *arg_mgr; struct mlx5dr_send_ring *send_ring; struct mlx5dr_domain_info info; struct xarray csum_fts_xa; @@ -935,6 +956,8 @@ struct mlx5dr_domain { struct list_head dbg_tbl_list; struct mlx5dr_dbg_dump_info dump_info; struct xarray definers_xa; + /* memory management statistics */ + u32 num_buddies[DR_ICM_TYPE_MAX]; }; struct mlx5dr_table_rx_tx { @@ -994,15 +1017,34 @@ struct mlx5dr_ste_action_modify_field { u8 l4_type; }; +struct mlx5dr_ptrn_obj { + struct mlx5dr_icm_chunk *chunk; + u8 *data; + u16 num_of_actions; + u32 index; + refcount_t refcount; + struct list_head list; +}; + +struct mlx5dr_arg_obj { + u32 obj_id; + u32 obj_offset; + struct list_head list_node; + u32 log_chunk_size; +}; + struct mlx5dr_action_rewrite { struct mlx5dr_domain *dmn; struct mlx5dr_icm_chunk *chunk; u8 *data; u16 num_of_actions; u32 index; + u8 single_action_opt:1; u8 allow_rx:1; u8 allow_tx:1; u8 modify_ttl:1; + struct mlx5dr_ptrn_obj *ptrn; + struct mlx5dr_arg_obj *arg; }; struct mlx5dr_action_reformat { @@ -1334,6 +1376,12 @@ struct mlx5dr_cmd_gid_attr { int mlx5dr_cmd_query_gid(struct mlx5_core_dev *mdev, u8 vhca_port_num, u16 index, struct mlx5dr_cmd_gid_attr *attr); +int mlx5dr_cmd_create_modify_header_arg(struct mlx5_core_dev *dev, + u16 log_obj_range, u32 pd, + u32 *obj_id); +void mlx5dr_cmd_destroy_modify_header_arg(struct mlx5_core_dev *dev, + u32 obj_id); + struct mlx5dr_icm_pool *mlx5dr_icm_pool_create(struct mlx5dr_domain *dmn, enum mlx5dr_icm_type icm_type); void mlx5dr_icm_pool_destroy(struct mlx5dr_icm_pool *pool); @@ -1368,6 +1416,7 @@ struct mlx5dr_qp { struct mlx5_wq_ctrl wq_ctrl; u32 qpn; struct { + unsigned int head; unsigned int pc; unsigned int cc; unsigned int size; @@ -1399,9 
+1448,6 @@ struct mlx5dr_mr { size_t size; }; -#define MAX_SEND_CQE 64 -#define MIN_READ_SYNC 64 - struct mlx5dr_send_ring { struct mlx5dr_cq *cq; struct mlx5dr_qp *qp; @@ -1416,7 +1462,7 @@ struct mlx5dr_send_ring { u32 tx_head; void *buf; u32 buf_size; - u8 sync_buff[MIN_READ_SYNC]; + u8 *sync_buff; struct mlx5dr_mr *sync_mr; spinlock_t lock; /* Protect the data path of the send ring */ bool err_state; /* send_ring is not usable in err state */ @@ -1440,6 +1486,12 @@ int mlx5dr_send_postsend_formatted_htbl(struct mlx5dr_domain *dmn, bool update_hw_ste); int mlx5dr_send_postsend_action(struct mlx5dr_domain *dmn, struct mlx5dr_action *action); +int mlx5dr_send_postsend_pattern(struct mlx5dr_domain *dmn, + struct mlx5dr_icm_chunk *chunk, + u16 num_of_actions, + u8 *data); +int mlx5dr_send_postsend_args(struct mlx5dr_domain *dmn, u64 arg_id, + u16 num_of_actions, u8 *actions_data); int mlx5dr_send_info_pool_create(struct mlx5dr_domain *dmn); void mlx5dr_send_info_pool_destroy(struct mlx5dr_domain *dmn); @@ -1526,4 +1578,20 @@ static inline bool mlx5dr_supp_match_ranges(struct mlx5_core_dev *dev) (1ULL << MLX5_IFC_DEFINER_FORMAT_ID_SELECT)); } +bool mlx5dr_domain_is_support_ptrn_arg(struct mlx5dr_domain *dmn); +struct mlx5dr_ptrn_mgr *mlx5dr_ptrn_mgr_create(struct mlx5dr_domain *dmn); +void mlx5dr_ptrn_mgr_destroy(struct mlx5dr_ptrn_mgr *mgr); +struct mlx5dr_ptrn_obj *mlx5dr_ptrn_cache_get_pattern(struct mlx5dr_ptrn_mgr *mgr, + u16 num_of_actions, u8 *data); +void mlx5dr_ptrn_cache_put_pattern(struct mlx5dr_ptrn_mgr *mgr, + struct mlx5dr_ptrn_obj *pattern); +struct mlx5dr_arg_mgr *mlx5dr_arg_mgr_create(struct mlx5dr_domain *dmn); +void mlx5dr_arg_mgr_destroy(struct mlx5dr_arg_mgr *mgr); +struct mlx5dr_arg_obj *mlx5dr_arg_get_obj(struct mlx5dr_arg_mgr *mgr, + u16 num_of_actions, + u8 *data); +void mlx5dr_arg_put_obj(struct mlx5dr_arg_mgr *mgr, + struct mlx5dr_arg_obj *arg_obj); +u32 mlx5dr_arg_get_obj_id(struct mlx5dr_arg_obj *arg_obj); + #endif /* _DR_TYPES_H_ */ diff 
--git a/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5_ifc_dr_ste_v1.h b/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5_ifc_dr_ste_v1.h index 790a17d6207f..ca3b0f1453a7 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5_ifc_dr_ste_v1.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/mlx5_ifc_dr_ste_v1.h @@ -100,7 +100,7 @@ struct mlx5_ifc_ste_double_action_insert_with_ptr_v1_bits { u8 pointer[0x20]; }; -struct mlx5_ifc_ste_double_action_modify_action_list_v1_bits { +struct mlx5_ifc_ste_double_action_accelerated_modify_action_list_v1_bits { u8 action_id[0x8]; u8 modify_actions_pattern_pointer[0x18]; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/thermal.c b/drivers/net/ethernet/mellanox/mlx5/core/thermal.c new file mode 100644 index 000000000000..e47fa6fb836f --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/thermal.c @@ -0,0 +1,108 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +// Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. 
+ +#include <linux/kernel.h> +#include <linux/types.h> +#include <linux/device.h> +#include <linux/thermal.h> +#include <linux/err.h> +#include <linux/mlx5/driver.h> +#include "mlx5_core.h" +#include "thermal.h" + +#define MLX5_THERMAL_POLL_INT_MSEC 1000 +#define MLX5_THERMAL_NUM_TRIPS 0 +#define MLX5_THERMAL_ASIC_SENSOR_INDEX 0 + +/* Bit string indicating the writeablility of trip points if any */ +#define MLX5_THERMAL_TRIP_MASK (BIT(MLX5_THERMAL_NUM_TRIPS) - 1) + +struct mlx5_thermal { + struct mlx5_core_dev *mdev; + struct thermal_zone_device *tzdev; +}; + +static int mlx5_thermal_get_mtmp_temp(struct mlx5_core_dev *mdev, u32 id, int *p_temp) +{ + u32 mtmp_out[MLX5_ST_SZ_DW(mtmp_reg)] = {}; + u32 mtmp_in[MLX5_ST_SZ_DW(mtmp_reg)] = {}; + int err; + + MLX5_SET(mtmp_reg, mtmp_in, sensor_index, id); + + err = mlx5_core_access_reg(mdev, mtmp_in, sizeof(mtmp_in), + mtmp_out, sizeof(mtmp_out), + MLX5_REG_MTMP, 0, 0); + + if (err) + return err; + + *p_temp = MLX5_GET(mtmp_reg, mtmp_out, temperature); + + return 0; +} + +static int mlx5_thermal_get_temp(struct thermal_zone_device *tzdev, + int *p_temp) +{ + struct mlx5_thermal *thermal = tzdev->devdata; + struct mlx5_core_dev *mdev = thermal->mdev; + int err; + + err = mlx5_thermal_get_mtmp_temp(mdev, MLX5_THERMAL_ASIC_SENSOR_INDEX, p_temp); + + if (err) + return err; + + /* The unit of temp returned is in 0.125 C. The thermal + * framework expects the value in 0.001 C. 
+ */ + *p_temp *= 125; + + return 0; +} + +static struct thermal_zone_device_ops mlx5_thermal_ops = { + .get_temp = mlx5_thermal_get_temp, +}; + +int mlx5_thermal_init(struct mlx5_core_dev *mdev) +{ + struct mlx5_thermal *thermal; + struct thermal_zone_device *tzd; + const char *data = "mlx5"; + + tzd = thermal_zone_get_zone_by_name(data); + if (!IS_ERR(tzd)) + return 0; + + thermal = kzalloc(sizeof(*thermal), GFP_KERNEL); + if (!thermal) + return -ENOMEM; + + thermal->mdev = mdev; + thermal->tzdev = thermal_zone_device_register(data, + MLX5_THERMAL_NUM_TRIPS, + MLX5_THERMAL_TRIP_MASK, + thermal, + &mlx5_thermal_ops, + NULL, 0, MLX5_THERMAL_POLL_INT_MSEC); + if (IS_ERR(thermal->tzdev)) { + dev_err(mdev->device, "Failed to register thermal zone device (%s) %ld\n", + data, PTR_ERR(thermal->tzdev)); + kfree(thermal); + return -EINVAL; + } + + mdev->thermal = thermal; + return 0; +} + +void mlx5_thermal_uninit(struct mlx5_core_dev *mdev) +{ + if (!mdev->thermal) + return; + + thermal_zone_device_unregister(mdev->thermal->tzdev); + kfree(mdev->thermal); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/thermal.h b/drivers/net/ethernet/mellanox/mlx5/core/thermal.h new file mode 100644 index 000000000000..7d752c122192 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/thermal.h @@ -0,0 +1,20 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB + * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. 
+ */ +#ifndef __MLX5_THERMAL_DRIVER_H +#define __MLX5_THERMAL_DRIVER_H + +#if IS_ENABLED(CONFIG_THERMAL) +int mlx5_thermal_init(struct mlx5_core_dev *mdev); +void mlx5_thermal_uninit(struct mlx5_core_dev *mdev); +#else +static inline int mlx5_thermal_init(struct mlx5_core_dev *mdev) +{ + mdev->thermal = NULL; + return 0; +} + +static inline void mlx5_thermal_uninit(struct mlx5_core_dev *mdev) { } +#endif + +#endif /* __MLX5_THERMAL_DRIVER_H */ diff --git a/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c b/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c index 66dd42a8e72f..70d7fff24fa2 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c +++ b/drivers/net/ethernet/mellanox/mlxsw/core_thermal.c @@ -19,6 +19,9 @@ #define MLXSW_THERMAL_ASIC_TEMP_NORM 75000 /* 75C */ #define MLXSW_THERMAL_ASIC_TEMP_HIGH 85000 /* 85C */ #define MLXSW_THERMAL_ASIC_TEMP_HOT 105000 /* 105C */ +#define MLXSW_THERMAL_MODULE_TEMP_NORM 55000 /* 55C */ +#define MLXSW_THERMAL_MODULE_TEMP_HIGH 65000 /* 65C */ +#define MLXSW_THERMAL_MODULE_TEMP_HOT 80000 /* 80C */ #define MLXSW_THERMAL_HYSTERESIS_TEMP 5000 /* 5C */ #define MLXSW_THERMAL_MODULE_TEMP_SHIFT (MLXSW_THERMAL_HYSTERESIS_TEMP * 2) #define MLXSW_THERMAL_MAX_STATE 10 @@ -30,12 +33,6 @@ static char * const mlxsw_thermal_external_allowed_cdev[] = { "mlxreg_fan", }; -enum mlxsw_thermal_trips { - MLXSW_THERMAL_TEMP_TRIP_NORM, - MLXSW_THERMAL_TEMP_TRIP_HIGH, - MLXSW_THERMAL_TEMP_TRIP_HOT, -}; - struct mlxsw_cooling_states { int min_state; int max_state; @@ -59,6 +56,24 @@ static const struct thermal_trip default_thermal_trips[] = { }, }; +static const struct thermal_trip default_thermal_module_trips[] = { + { /* In range - 0-40% PWM */ + .type = THERMAL_TRIP_ACTIVE, + .temperature = MLXSW_THERMAL_MODULE_TEMP_NORM, + .hysteresis = MLXSW_THERMAL_HYSTERESIS_TEMP, + }, + { + /* In range - 40-100% PWM */ + .type = THERMAL_TRIP_ACTIVE, + .temperature = MLXSW_THERMAL_MODULE_TEMP_HIGH, + .hysteresis = MLXSW_THERMAL_HYSTERESIS_TEMP, + }, 
+ { /* Warning */ + .type = THERMAL_TRIP_HOT, + .temperature = MLXSW_THERMAL_MODULE_TEMP_HOT, + }, +}; + static const struct mlxsw_cooling_states default_cooling_states[] = { { .min_state = 0, @@ -140,63 +155,6 @@ static int mlxsw_get_cooling_device_idx(struct mlxsw_thermal *thermal, return -ENODEV; } -static void -mlxsw_thermal_module_trips_reset(struct mlxsw_thermal_module *tz) -{ - tz->trips[MLXSW_THERMAL_TEMP_TRIP_NORM].temperature = 0; - tz->trips[MLXSW_THERMAL_TEMP_TRIP_HIGH].temperature = 0; - tz->trips[MLXSW_THERMAL_TEMP_TRIP_HOT].temperature = 0; -} - -static int -mlxsw_thermal_module_trips_update(struct device *dev, struct mlxsw_core *core, - struct mlxsw_thermal_module *tz, - int crit_temp, int emerg_temp) -{ - int err; - - /* Do not try to query temperature thresholds directly from the module's - * EEPROM if we got valid thresholds from MTMP. - */ - if (!emerg_temp || !crit_temp) { - err = mlxsw_env_module_temp_thresholds_get(core, tz->slot_index, - tz->module, - SFP_TEMP_HIGH_WARN, - &crit_temp); - if (err) - return err; - - err = mlxsw_env_module_temp_thresholds_get(core, tz->slot_index, - tz->module, - SFP_TEMP_HIGH_ALARM, - &emerg_temp); - if (err) - return err; - } - - if (crit_temp > emerg_temp) { - dev_warn(dev, "%s : Critical threshold %d is above emergency threshold %d\n", - thermal_zone_device_type(tz->tzdev), crit_temp, emerg_temp); - return 0; - } - - /* According to the system thermal requirements, the thermal zones are - * defined with three trip points. The critical and emergency - * temperature thresholds, provided by QSFP module are set as "active" - * and "hot" trip points, "normal" trip point is derived from "active" - * by subtracting double hysteresis value. 
- */ - if (crit_temp >= MLXSW_THERMAL_MODULE_TEMP_SHIFT) - tz->trips[MLXSW_THERMAL_TEMP_TRIP_NORM].temperature = crit_temp - - MLXSW_THERMAL_MODULE_TEMP_SHIFT; - else - tz->trips[MLXSW_THERMAL_TEMP_TRIP_NORM].temperature = crit_temp; - tz->trips[MLXSW_THERMAL_TEMP_TRIP_HIGH].temperature = crit_temp; - tz->trips[MLXSW_THERMAL_TEMP_TRIP_HOT].temperature = emerg_temp; - - return 0; -} - static int mlxsw_thermal_bind(struct thermal_zone_device *tzdev, struct thermal_cooling_device *cdev) { @@ -325,59 +283,22 @@ static int mlxsw_thermal_module_unbind(struct thermal_zone_device *tzdev, return err; } -static void -mlxsw_thermal_module_temp_and_thresholds_get(struct mlxsw_core *core, - u8 slot_index, u16 sensor_index, - int *p_temp, int *p_crit_temp, - int *p_emerg_temp) -{ - char mtmp_pl[MLXSW_REG_MTMP_LEN]; - int err; - - /* Read module temperature and thresholds. */ - mlxsw_reg_mtmp_pack(mtmp_pl, slot_index, sensor_index, - false, false); - err = mlxsw_reg_query(core, MLXSW_REG(mtmp), mtmp_pl); - if (err) { - /* Set temperature and thresholds to zero to avoid passing - * uninitialized data back to the caller. - */ - *p_temp = 0; - *p_crit_temp = 0; - *p_emerg_temp = 0; - - return; - } - mlxsw_reg_mtmp_unpack(mtmp_pl, p_temp, NULL, p_crit_temp, p_emerg_temp, - NULL); -} - static int mlxsw_thermal_module_temp_get(struct thermal_zone_device *tzdev, int *p_temp) { struct mlxsw_thermal_module *tz = thermal_zone_device_priv(tzdev); struct mlxsw_thermal *thermal = tz->parent; - int temp, crit_temp, emerg_temp; - struct device *dev; + char mtmp_pl[MLXSW_REG_MTMP_LEN]; u16 sensor_index; + int err; - dev = thermal->bus_info->dev; sensor_index = MLXSW_REG_MTMP_MODULE_INDEX_MIN + tz->module; - - /* Read module temperature and thresholds. */ - mlxsw_thermal_module_temp_and_thresholds_get(thermal->core, - tz->slot_index, - sensor_index, &temp, - &crit_temp, &emerg_temp); - *p_temp = temp; - - if (!temp) - return 0; - - /* Update trip points. 
*/ - mlxsw_thermal_module_trips_update(dev, thermal->core, tz, - crit_temp, emerg_temp); - + mlxsw_reg_mtmp_pack(mtmp_pl, tz->slot_index, sensor_index, + false, false); + err = mlxsw_reg_query(thermal->core, MLXSW_REG(mtmp), mtmp_pl); + if (err) + return err; + mlxsw_reg_mtmp_unpack(mtmp_pl, p_temp, NULL, NULL, NULL, NULL); return 0; } @@ -521,36 +442,26 @@ static void mlxsw_thermal_module_tz_fini(struct thermal_zone_device *tzdev) thermal_zone_device_unregister(tzdev); } -static int +static void mlxsw_thermal_module_init(struct device *dev, struct mlxsw_core *core, struct mlxsw_thermal *thermal, struct mlxsw_thermal_area *area, u8 module) { struct mlxsw_thermal_module *module_tz; - int dummy_temp, crit_temp, emerg_temp; - u16 sensor_index; - sensor_index = MLXSW_REG_MTMP_MODULE_INDEX_MIN + module; module_tz = &area->tz_module_arr[module]; /* Skip if parent is already set (case of port split). */ if (module_tz->parent) - return 0; + return; module_tz->module = module; module_tz->slot_index = area->slot_index; module_tz->parent = thermal; - memcpy(module_tz->trips, default_thermal_trips, + BUILD_BUG_ON(ARRAY_SIZE(default_thermal_module_trips) != + MLXSW_THERMAL_NUM_TRIPS); + memcpy(module_tz->trips, default_thermal_module_trips, sizeof(thermal->trips)); memcpy(module_tz->cooling_states, default_cooling_states, sizeof(thermal->cooling_states)); - /* Initialize all trip point. */ - mlxsw_thermal_module_trips_reset(module_tz); - /* Read module temperature and thresholds. */ - mlxsw_thermal_module_temp_and_thresholds_get(core, area->slot_index, - sensor_index, &dummy_temp, - &crit_temp, &emerg_temp); - /* Update trip point according to the module data. 
*/ - return mlxsw_thermal_module_trips_update(dev, core, module_tz, - crit_temp, emerg_temp); } static void mlxsw_thermal_module_fini(struct mlxsw_thermal_module *module_tz) @@ -589,11 +500,8 @@ mlxsw_thermal_modules_init(struct device *dev, struct mlxsw_core *core, if (!area->tz_module_arr) return -ENOMEM; - for (i = 0; i < area->tz_module_num; i++) { - err = mlxsw_thermal_module_init(dev, core, thermal, area, i); - if (err) - goto err_thermal_module_init; - } + for (i = 0; i < area->tz_module_num; i++) + mlxsw_thermal_module_init(dev, core, thermal, area, i); for (i = 0; i < area->tz_module_num; i++) { module_tz = &area->tz_module_arr[i]; @@ -607,7 +515,6 @@ mlxsw_thermal_modules_init(struct device *dev, struct mlxsw_core *core, return 0; err_thermal_module_tz_init: -err_thermal_module_init: for (i = area->tz_module_num - 1; i >= 0; i--) mlxsw_thermal_module_fini(&area->tz_module_arr[i]); kfree(area->tz_module_arr); diff --git a/drivers/net/ethernet/micrel/ksz884x.c b/drivers/net/ethernet/micrel/ksz884x.c index e6acd1e7b263..c5aeeb964c17 100644 --- a/drivers/net/ethernet/micrel/ksz884x.c +++ b/drivers/net/ethernet/micrel/ksz884x.c @@ -1476,15 +1476,6 @@ static void hw_turn_on_intr(struct ksz_hw *hw, u32 bit) hw_set_intr(hw, hw->intr_mask); } -static inline void hw_ena_intr_bit(struct ksz_hw *hw, uint interrupt) -{ - u32 read_intr; - - read_intr = readl(hw->io + KS884X_INTERRUPTS_ENABLE); - hw->intr_set = read_intr | interrupt; - writel(hw->intr_set, hw->io + KS884X_INTERRUPTS_ENABLE); -} - static inline void hw_read_intr(struct ksz_hw *hw, uint *status) { *status = readl(hw->io + KS884X_INTERRUPTS_STATUS); @@ -1854,29 +1845,6 @@ static void port_init_cnt(struct ksz_hw *hw, int port) */ /** - * port_chk - check port register bits - * @hw: The hardware instance. - * @port: The port index. - * @offset: The offset of the port register. - * @bits: The data bits to check. - * - * This function checks whether the specified bits of the port register are set - * or not. 
- * - * Return 0 if the bits are not set. - */ -static int port_chk(struct ksz_hw *hw, int port, int offset, u16 bits) -{ - u32 addr; - u16 data; - - PORT_CTRL_ADDR(port, addr); - addr += offset; - data = readw(hw->io + addr); - return (data & bits) == bits; -} - -/** * port_cfg - set port register bits * @hw: The hardware instance. * @port: The port index. @@ -1903,53 +1871,6 @@ static void port_cfg(struct ksz_hw *hw, int port, int offset, u16 bits, } /** - * port_chk_shift - check port bit - * @hw: The hardware instance. - * @port: The port index. - * @addr: The offset of the register. - * @shift: Number of bits to shift. - * - * This function checks whether the specified port is set in the register or - * not. - * - * Return 0 if the port is not set. - */ -static int port_chk_shift(struct ksz_hw *hw, int port, u32 addr, int shift) -{ - u16 data; - u16 bit = 1 << port; - - data = readw(hw->io + addr); - data >>= shift; - return (data & bit) == bit; -} - -/** - * port_cfg_shift - set port bit - * @hw: The hardware instance. - * @port: The port index. - * @addr: The offset of the register. - * @shift: Number of bits to shift. - * @set: The flag indicating whether the port is to be set or not. - * - * This routine sets or resets the specified port in the register. - */ -static void port_cfg_shift(struct ksz_hw *hw, int port, u32 addr, int shift, - int set) -{ - u16 data; - u16 bits = 1 << port; - - data = readw(hw->io + addr); - bits <<= shift; - if (set) - data |= bits; - else - data &= ~bits; - writew(data, hw->io + addr); -} - -/** * port_r8 - read byte from port register * @hw: The hardware instance. * @port: The port index. 
@@ -2051,12 +1972,6 @@ static inline void port_cfg_broad_storm(struct ksz_hw *hw, int p, int set) KS8842_PORT_CTRL_1_OFFSET, PORT_BROADCAST_STORM, set); } -static inline int port_chk_broad_storm(struct ksz_hw *hw, int p) -{ - return port_chk(hw, p, - KS8842_PORT_CTRL_1_OFFSET, PORT_BROADCAST_STORM); -} - /* Driver set switch broadcast storm protection at 10% rate. */ #define BROADCAST_STORM_PROTECTION_RATE 10 @@ -2209,102 +2124,6 @@ static inline void port_cfg_back_pressure(struct ksz_hw *hw, int p, int set) KS8842_PORT_CTRL_2_OFFSET, PORT_BACK_PRESSURE, set); } -static inline void port_cfg_force_flow_ctrl(struct ksz_hw *hw, int p, int set) -{ - port_cfg(hw, p, - KS8842_PORT_CTRL_2_OFFSET, PORT_FORCE_FLOW_CTRL, set); -} - -static inline int port_chk_back_pressure(struct ksz_hw *hw, int p) -{ - return port_chk(hw, p, - KS8842_PORT_CTRL_2_OFFSET, PORT_BACK_PRESSURE); -} - -static inline int port_chk_force_flow_ctrl(struct ksz_hw *hw, int p) -{ - return port_chk(hw, p, - KS8842_PORT_CTRL_2_OFFSET, PORT_FORCE_FLOW_CTRL); -} - -/* Spanning Tree */ - -static inline void port_cfg_rx(struct ksz_hw *hw, int p, int set) -{ - port_cfg(hw, p, - KS8842_PORT_CTRL_2_OFFSET, PORT_RX_ENABLE, set); -} - -static inline void port_cfg_tx(struct ksz_hw *hw, int p, int set) -{ - port_cfg(hw, p, - KS8842_PORT_CTRL_2_OFFSET, PORT_TX_ENABLE, set); -} - -static inline void sw_cfg_fast_aging(struct ksz_hw *hw, int set) -{ - sw_cfg(hw, KS8842_SWITCH_CTRL_1_OFFSET, SWITCH_FAST_AGING, set); -} - -static inline void sw_flush_dyn_mac_table(struct ksz_hw *hw) -{ - if (!(hw->overrides & FAST_AGING)) { - sw_cfg_fast_aging(hw, 1); - mdelay(1); - sw_cfg_fast_aging(hw, 0); - } -} - -/* VLAN */ - -static inline void port_cfg_ins_tag(struct ksz_hw *hw, int p, int insert) -{ - port_cfg(hw, p, - KS8842_PORT_CTRL_1_OFFSET, PORT_INSERT_TAG, insert); -} - -static inline void port_cfg_rmv_tag(struct ksz_hw *hw, int p, int remove) -{ - port_cfg(hw, p, - KS8842_PORT_CTRL_1_OFFSET, PORT_REMOVE_TAG, remove); -} - 
-static inline int port_chk_ins_tag(struct ksz_hw *hw, int p) -{ - return port_chk(hw, p, - KS8842_PORT_CTRL_1_OFFSET, PORT_INSERT_TAG); -} - -static inline int port_chk_rmv_tag(struct ksz_hw *hw, int p) -{ - return port_chk(hw, p, - KS8842_PORT_CTRL_1_OFFSET, PORT_REMOVE_TAG); -} - -static inline void port_cfg_dis_non_vid(struct ksz_hw *hw, int p, int set) -{ - port_cfg(hw, p, - KS8842_PORT_CTRL_2_OFFSET, PORT_DISCARD_NON_VID, set); -} - -static inline void port_cfg_in_filter(struct ksz_hw *hw, int p, int set) -{ - port_cfg(hw, p, - KS8842_PORT_CTRL_2_OFFSET, PORT_INGRESS_VLAN_FILTER, set); -} - -static inline int port_chk_dis_non_vid(struct ksz_hw *hw, int p) -{ - return port_chk(hw, p, - KS8842_PORT_CTRL_2_OFFSET, PORT_DISCARD_NON_VID); -} - -static inline int port_chk_in_filter(struct ksz_hw *hw, int p) -{ - return port_chk(hw, p, - KS8842_PORT_CTRL_2_OFFSET, PORT_INGRESS_VLAN_FILTER); -} - /* Mirroring */ static inline void port_cfg_mirror_sniffer(struct ksz_hw *hw, int p, int set) @@ -2342,28 +2161,6 @@ static void sw_init_mirror(struct ksz_hw *hw) sw_cfg_mirror_rx_tx(hw, 0); } -static inline void sw_cfg_unk_def_deliver(struct ksz_hw *hw, int set) -{ - sw_cfg(hw, KS8842_SWITCH_CTRL_7_OFFSET, - SWITCH_UNK_DEF_PORT_ENABLE, set); -} - -static inline int sw_cfg_chk_unk_def_deliver(struct ksz_hw *hw) -{ - return sw_chk(hw, KS8842_SWITCH_CTRL_7_OFFSET, - SWITCH_UNK_DEF_PORT_ENABLE); -} - -static inline void sw_cfg_unk_def_port(struct ksz_hw *hw, int port, int set) -{ - port_cfg_shift(hw, port, KS8842_SWITCH_CTRL_7_OFFSET, 0, set); -} - -static inline int sw_chk_unk_def_port(struct ksz_hw *hw, int port) -{ - return port_chk_shift(hw, port, KS8842_SWITCH_CTRL_7_OFFSET, 0); -} - /* Priority */ static inline void port_cfg_diffserv(struct ksz_hw *hw, int p, int set) @@ -2390,30 +2187,6 @@ static inline void port_cfg_prio(struct ksz_hw *hw, int p, int set) KS8842_PORT_CTRL_1_OFFSET, PORT_PRIO_QUEUE_ENABLE, set); } -static inline int port_chk_diffserv(struct ksz_hw *hw, 
int p) -{ - return port_chk(hw, p, - KS8842_PORT_CTRL_1_OFFSET, PORT_DIFFSERV_ENABLE); -} - -static inline int port_chk_802_1p(struct ksz_hw *hw, int p) -{ - return port_chk(hw, p, - KS8842_PORT_CTRL_1_OFFSET, PORT_802_1P_ENABLE); -} - -static inline int port_chk_replace_vid(struct ksz_hw *hw, int p) -{ - return port_chk(hw, p, - KS8842_PORT_CTRL_2_OFFSET, PORT_USER_PRIORITY_CEILING); -} - -static inline int port_chk_prio(struct ksz_hw *hw, int p) -{ - return port_chk(hw, p, - KS8842_PORT_CTRL_1_OFFSET, PORT_PRIO_QUEUE_ENABLE); -} - /** * sw_dis_diffserv - disable switch DiffServ priority * @hw: The hardware instance. @@ -2614,23 +2387,6 @@ static void sw_cfg_port_base_vlan(struct ksz_hw *hw, int port, u8 member) } /** - * sw_get_addr - get the switch MAC address. - * @hw: The hardware instance. - * @mac_addr: Buffer to store the MAC address. - * - * This function retrieves the MAC address of the switch. - */ -static inline void sw_get_addr(struct ksz_hw *hw, u8 *mac_addr) -{ - int i; - - for (i = 0; i < 6; i += 2) { - mac_addr[i] = readb(hw->io + KS8842_MAC_ADDR_0_OFFSET + i); - mac_addr[1 + i] = readb(hw->io + KS8842_MAC_ADDR_1_OFFSET + i); - } -} - -/** * sw_set_addr - configure switch MAC address * @hw: The hardware instance. * @mac_addr: The MAC address. 
@@ -2828,56 +2584,6 @@ static inline void hw_w_phy_ctrl(struct ksz_hw *hw, int phy, u16 data) writew(data, hw->io + phy + KS884X_PHY_CTRL_OFFSET); } -static inline void hw_r_phy_link_stat(struct ksz_hw *hw, int phy, u16 *data) -{ - *data = readw(hw->io + phy + KS884X_PHY_STATUS_OFFSET); -} - -static inline void hw_r_phy_auto_neg(struct ksz_hw *hw, int phy, u16 *data) -{ - *data = readw(hw->io + phy + KS884X_PHY_AUTO_NEG_OFFSET); -} - -static inline void hw_w_phy_auto_neg(struct ksz_hw *hw, int phy, u16 data) -{ - writew(data, hw->io + phy + KS884X_PHY_AUTO_NEG_OFFSET); -} - -static inline void hw_r_phy_rem_cap(struct ksz_hw *hw, int phy, u16 *data) -{ - *data = readw(hw->io + phy + KS884X_PHY_REMOTE_CAP_OFFSET); -} - -static inline void hw_r_phy_crossover(struct ksz_hw *hw, int phy, u16 *data) -{ - *data = readw(hw->io + phy + KS884X_PHY_CTRL_OFFSET); -} - -static inline void hw_w_phy_crossover(struct ksz_hw *hw, int phy, u16 data) -{ - writew(data, hw->io + phy + KS884X_PHY_CTRL_OFFSET); -} - -static inline void hw_r_phy_polarity(struct ksz_hw *hw, int phy, u16 *data) -{ - *data = readw(hw->io + phy + KS884X_PHY_PHY_CTRL_OFFSET); -} - -static inline void hw_w_phy_polarity(struct ksz_hw *hw, int phy, u16 data) -{ - writew(data, hw->io + phy + KS884X_PHY_PHY_CTRL_OFFSET); -} - -static inline void hw_r_phy_link_md(struct ksz_hw *hw, int phy, u16 *data) -{ - *data = readw(hw->io + phy + KS884X_PHY_LINK_MD_OFFSET); -} - -static inline void hw_w_phy_link_md(struct ksz_hw *hw, int phy, u16 data) -{ - writew(data, hw->io + phy + KS884X_PHY_LINK_MD_OFFSET); -} - /** * hw_r_phy - read data from PHY register * @hw: The hardware instance. 
@@ -3213,7 +2919,6 @@ static void port_get_link_speed(struct ksz_port *port) u8 remote; int i; int p; - int change = 0; interrupt = hw_block_intr(hw); @@ -3260,17 +2965,14 @@ static void port_get_link_speed(struct ksz_port *port) port_cfg_back_pressure(hw, p, (1 == info->duplex)); } - change |= 1 << i; port_cfg_change(hw, port, info, status); } info->state = media_connected; } else { - if (media_disconnected != info->state) { - change |= 1 << i; - - /* Indicate the link just goes down. */ + /* Indicate the link just goes down. */ + if (media_disconnected != info->state) hw->port_mib[p].link_down = 1; - } + info->state = media_disconnected; } hw->port_mib[p].state = (u8) info->state; diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c index 7e0871b631e4..957d96a91a8a 100644 --- a/drivers/net/ethernet/microchip/lan743x_main.c +++ b/drivers/net/ethernet/microchip/lan743x_main.c @@ -1466,7 +1466,6 @@ static void lan743x_phy_close(struct lan743x_adapter *adapter) phy_stop(netdev->phydev); phy_disconnect(netdev->phydev); - netdev->phydev = NULL; } static void lan743x_phy_interface_select(struct lan743x_adapter *adapter) diff --git a/drivers/net/ethernet/microchip/lan966x/Kconfig b/drivers/net/ethernet/microchip/lan966x/Kconfig index 8bcd60f17d6d..571e6d4da1e9 100644 --- a/drivers/net/ethernet/microchip/lan966x/Kconfig +++ b/drivers/net/ethernet/microchip/lan966x/Kconfig @@ -6,7 +6,6 @@ config LAN966X_SWITCH depends on NET_SWITCHDEV depends on BRIDGE || BRIDGE=n select PHYLINK - select PACKING select PAGE_POOL select VCAP help diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c index 55b484b10562..bd72fbc2220f 100644 --- a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c @@ -390,6 +390,7 @@ static void lan966x_fdma_stop_netdev(struct lan966x *lan966x) static void 
lan966x_fdma_tx_clear_buf(struct lan966x *lan966x, int weight) { struct lan966x_tx *tx = &lan966x->tx; + struct lan966x_rx *rx = &lan966x->rx; struct lan966x_tx_dcb_buf *dcb_buf; struct xdp_frame_bulk bq; struct lan966x_db *db; @@ -432,7 +433,8 @@ static void lan966x_fdma_tx_clear_buf(struct lan966x *lan966x, int weight) if (dcb_buf->xdp_ndo) xdp_return_frame_bulk(dcb_buf->data.xdpf, &bq); else - xdp_return_frame_rx_napi(dcb_buf->data.xdpf); + page_pool_recycle_direct(rx->page_pool, + dcb_buf->data.page); } clear = true; @@ -517,7 +519,7 @@ static struct sk_buff *lan966x_fdma_rx_get_frame(struct lan966x_rx *rx, if (likely(!(skb->dev->features & NETIF_F_RXFCS))) skb_trim(skb, skb->len - ETH_FCS_LEN); - lan966x_ptp_rxtstamp(lan966x, skb, timestamp); + lan966x_ptp_rxtstamp(lan966x, skb, src_port, timestamp); skb->protocol = eth_type_trans(skb, skb->dev); if (lan966x->bridge_mask & BIT(src_port)) { @@ -699,15 +701,14 @@ static void lan966x_fdma_tx_start(struct lan966x_tx *tx, int next_to_use) tx->last_in_use = next_to_use; } -int lan966x_fdma_xmit_xdpf(struct lan966x_port *port, - struct xdp_frame *xdpf, - struct page *page, - bool dma_map) +int lan966x_fdma_xmit_xdpf(struct lan966x_port *port, void *ptr, u32 len) { struct lan966x *lan966x = port->lan966x; struct lan966x_tx_dcb_buf *next_dcb_buf; struct lan966x_tx *tx = &lan966x->tx; + struct xdp_frame *xdpf; dma_addr_t dma_addr; + struct page *page; int next_to_use; __be32 *ifh; int ret = 0; @@ -722,8 +723,13 @@ int lan966x_fdma_xmit_xdpf(struct lan966x_port *port, goto out; } + /* Get the next buffer */ + next_dcb_buf = &tx->dcbs_buf[next_to_use]; + /* Generate new IFH */ - if (dma_map) { + if (!len) { + xdpf = ptr; + if (xdpf->headroom < IFH_LEN_BYTES) { ret = NETDEV_TX_OK; goto out; @@ -743,11 +749,16 @@ int lan966x_fdma_xmit_xdpf(struct lan966x_port *port, goto out; } + next_dcb_buf->data.xdpf = xdpf; + next_dcb_buf->len = xdpf->len + IFH_LEN_BYTES; + /* Setup next dcb */ lan966x_fdma_tx_setup_dcb(tx, next_to_use, 
xdpf->len + IFH_LEN_BYTES, dma_addr); } else { + page = ptr; + ifh = page_address(page) + XDP_PACKET_HEADROOM; memset(ifh, 0x0, sizeof(__be32) * IFH_LEN); lan966x_ifh_set_bypass(ifh, 1); @@ -756,21 +767,21 @@ int lan966x_fdma_xmit_xdpf(struct lan966x_port *port, dma_addr = page_pool_get_dma_addr(page); dma_sync_single_for_device(lan966x->dev, dma_addr + XDP_PACKET_HEADROOM, - xdpf->len + IFH_LEN_BYTES, + len + IFH_LEN_BYTES, DMA_TO_DEVICE); + next_dcb_buf->data.page = page; + next_dcb_buf->len = len + IFH_LEN_BYTES; + /* Setup next dcb */ lan966x_fdma_tx_setup_dcb(tx, next_to_use, - xdpf->len + IFH_LEN_BYTES, + len + IFH_LEN_BYTES, dma_addr + XDP_PACKET_HEADROOM); } /* Fill up the buffer */ - next_dcb_buf = &tx->dcbs_buf[next_to_use]; next_dcb_buf->use_skb = false; - next_dcb_buf->data.xdpf = xdpf; - next_dcb_buf->xdp_ndo = dma_map; - next_dcb_buf->len = xdpf->len + IFH_LEN_BYTES; + next_dcb_buf->xdp_ndo = !len; next_dcb_buf->dma_addr = dma_addr; next_dcb_buf->used = true; next_dcb_buf->ptp = false; diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c index 685e8cd7658c..2b6e046e1d10 100644 --- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c @@ -7,7 +7,6 @@ #include <linux/ip.h> #include <linux/of_platform.h> #include <linux/of_net.h> -#include <linux/packing.h> #include <linux/phy/phy.h> #include <linux/reset.h> #include <net/addrconf.h> @@ -305,46 +304,57 @@ err: return NETDEV_TX_BUSY; } +static void lan966x_ifh_set(u8 *ifh, size_t val, size_t pos, size_t length) +{ + int i = 0; + + do { + u8 p = IFH_LEN_BYTES - (pos + i) / 8 - 1; + u8 v = val >> i & 0xff; + + /* There is no need to check for limits of the array, as these + * will never be written + */ + ifh[p] |= v << ((pos + i) % 8); + ifh[p - 1] |= v >> (8 - (pos + i) % 8); + + i += 8; + } while (i < length); +} + void lan966x_ifh_set_bypass(void *ifh, u64 bypass) { - 
packing(ifh, &bypass, IFH_POS_BYPASS + IFH_WID_BYPASS - 1, - IFH_POS_BYPASS, IFH_LEN * 4, PACK, 0); + lan966x_ifh_set(ifh, bypass, IFH_POS_BYPASS, IFH_WID_BYPASS); } -void lan966x_ifh_set_port(void *ifh, u64 bypass) +void lan966x_ifh_set_port(void *ifh, u64 port) { - packing(ifh, &bypass, IFH_POS_DSTS + IFH_WID_DSTS - 1, - IFH_POS_DSTS, IFH_LEN * 4, PACK, 0); + lan966x_ifh_set(ifh, port, IFH_POS_DSTS, IFH_WID_DSTS); } -static void lan966x_ifh_set_qos_class(void *ifh, u64 bypass) +static void lan966x_ifh_set_qos_class(void *ifh, u64 qos) { - packing(ifh, &bypass, IFH_POS_QOS_CLASS + IFH_WID_QOS_CLASS - 1, - IFH_POS_QOS_CLASS, IFH_LEN * 4, PACK, 0); + lan966x_ifh_set(ifh, qos, IFH_POS_QOS_CLASS, IFH_WID_QOS_CLASS); } -static void lan966x_ifh_set_ipv(void *ifh, u64 bypass) +static void lan966x_ifh_set_ipv(void *ifh, u64 ipv) { - packing(ifh, &bypass, IFH_POS_IPV + IFH_WID_IPV - 1, - IFH_POS_IPV, IFH_LEN * 4, PACK, 0); + lan966x_ifh_set(ifh, ipv, IFH_POS_IPV, IFH_WID_IPV); } static void lan966x_ifh_set_vid(void *ifh, u64 vid) { - packing(ifh, &vid, IFH_POS_TCI + IFH_WID_TCI - 1, - IFH_POS_TCI, IFH_LEN * 4, PACK, 0); + lan966x_ifh_set(ifh, vid, IFH_POS_TCI, IFH_WID_TCI); } static void lan966x_ifh_set_rew_op(void *ifh, u64 rew_op) { - packing(ifh, &rew_op, IFH_POS_REW_CMD + IFH_WID_REW_CMD - 1, - IFH_POS_REW_CMD, IFH_LEN * 4, PACK, 0); + lan966x_ifh_set(ifh, rew_op, IFH_POS_REW_CMD, IFH_WID_REW_CMD); } static void lan966x_ifh_set_timestamp(void *ifh, u64 timestamp) { - packing(ifh, &timestamp, IFH_POS_TIMESTAMP + IFH_WID_TIMESTAMP - 1, - IFH_POS_TIMESTAMP, IFH_LEN * 4, PACK, 0); + lan966x_ifh_set(ifh, timestamp, IFH_POS_TIMESTAMP, IFH_WID_TIMESTAMP); } static netdev_tx_t lan966x_port_xmit(struct sk_buff *skb, @@ -582,22 +592,38 @@ static int lan966x_rx_frame_word(struct lan966x *lan966x, u8 grp, u32 *rval) } } +static u64 lan966x_ifh_get(u8 *ifh, size_t pos, size_t length) +{ + u64 val = 0; + u8 v; + + for (int i = 0; i < length ; i++) { + int j = pos + i; + int k = j % 8; + 
+ if (i == 0 || k == 0) + v = ifh[IFH_LEN_BYTES - (j / 8) - 1]; + + if (v & (1 << k)) + val |= (1ULL << i); + } + + return val; +} + void lan966x_ifh_get_src_port(void *ifh, u64 *src_port) { - packing(ifh, src_port, IFH_POS_SRCPORT + IFH_WID_SRCPORT - 1, - IFH_POS_SRCPORT, IFH_LEN * 4, UNPACK, 0); + *src_port = lan966x_ifh_get(ifh, IFH_POS_SRCPORT, IFH_WID_SRCPORT); } static void lan966x_ifh_get_len(void *ifh, u64 *len) { - packing(ifh, len, IFH_POS_LEN + IFH_WID_LEN - 1, - IFH_POS_LEN, IFH_LEN * 4, UNPACK, 0); + *len = lan966x_ifh_get(ifh, IFH_POS_LEN, IFH_WID_LEN); } void lan966x_ifh_get_timestamp(void *ifh, u64 *timestamp) { - packing(ifh, timestamp, IFH_POS_TIMESTAMP + IFH_WID_TIMESTAMP - 1, - IFH_POS_TIMESTAMP, IFH_LEN * 4, UNPACK, 0); + *timestamp = lan966x_ifh_get(ifh, IFH_POS_TIMESTAMP, IFH_WID_TIMESTAMP); } static irqreturn_t lan966x_xtr_irq_handler(int irq, void *args) @@ -668,7 +694,7 @@ static irqreturn_t lan966x_xtr_irq_handler(int irq, void *args) *buf = val; } - lan966x_ptp_rxtstamp(lan966x, skb, timestamp); + lan966x_ptp_rxtstamp(lan966x, skb, src_port, timestamp); skb->protocol = eth_type_trans(skb, dev); if (lan966x->bridge_mask & BIT(src_port)) { diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h index 49f5159afbf3..c977c70abc3d 100644 --- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.h +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.h @@ -92,6 +92,11 @@ #define SE_IDX_QUEUE 0 /* 0-79 : Queue scheduler elements */ #define SE_IDX_PORT 80 /* 80-89 : Port schedular elements */ +#define LAN966X_VCAP_CID_IS1_L0 VCAP_CID_INGRESS_L0 /* IS1 lookup 0 */ +#define LAN966X_VCAP_CID_IS1_L1 VCAP_CID_INGRESS_L1 /* IS1 lookup 1 */ +#define LAN966X_VCAP_CID_IS1_L2 VCAP_CID_INGRESS_L2 /* IS1 lookup 2 */ +#define LAN966X_VCAP_CID_IS1_MAX (VCAP_CID_INGRESS_L3 - 1) /* IS1 Max */ + #define LAN966X_VCAP_CID_IS2_L0 VCAP_CID_INGRESS_STAGE2_L0 /* IS2 lookup 0 */ #define 
LAN966X_VCAP_CID_IS2_L1 VCAP_CID_INGRESS_STAGE2_L1 /* IS2 lookup 1 */ #define LAN966X_VCAP_CID_IS2_MAX (VCAP_CID_INGRESS_STAGE2_L2 - 1) /* IS2 Max */ @@ -139,6 +144,39 @@ enum vcap_is2_port_sel_ipv6 { VCAP_IS2_PS_IPV6_MAC_ETYPE, }; +enum vcap_is1_port_sel_other { + VCAP_IS1_PS_OTHER_NORMAL, + VCAP_IS1_PS_OTHER_7TUPLE, + VCAP_IS1_PS_OTHER_DBL_VID, + VCAP_IS1_PS_OTHER_DMAC_VID, +}; + +enum vcap_is1_port_sel_ipv4 { + VCAP_IS1_PS_IPV4_NORMAL, + VCAP_IS1_PS_IPV4_7TUPLE, + VCAP_IS1_PS_IPV4_5TUPLE_IP4, + VCAP_IS1_PS_IPV4_DBL_VID, + VCAP_IS1_PS_IPV4_DMAC_VID, +}; + +enum vcap_is1_port_sel_ipv6 { + VCAP_IS1_PS_IPV6_NORMAL, + VCAP_IS1_PS_IPV6_7TUPLE, + VCAP_IS1_PS_IPV6_5TUPLE_IP4, + VCAP_IS1_PS_IPV6_NORMAL_IP6, + VCAP_IS1_PS_IPV6_5TUPLE_IP6, + VCAP_IS1_PS_IPV6_DBL_VID, + VCAP_IS1_PS_IPV6_DMAC_VID, +}; + +enum vcap_is1_port_sel_rt { + VCAP_IS1_PS_RT_NORMAL, + VCAP_IS1_PS_RT_7TUPLE, + VCAP_IS1_PS_RT_DBL_VID, + VCAP_IS1_PS_RT_DMAC_VID, + VCAP_IS1_PS_RT_FOLLOW_OTHER = 7, +}; + struct lan966x_port; struct lan966x_db { @@ -205,6 +243,7 @@ struct lan966x_tx_dcb_buf { union { struct sk_buff *skb; struct xdp_frame *xdpf; + struct page *page; } data; u32 len; u32 used : 1; @@ -369,7 +408,8 @@ struct lan966x_port { struct phy *serdes; struct fwnode_handle *fwnode; - u8 ptp_cmd; + u8 ptp_tx_cmd; + bool ptp_rx_cmd; u16 ts_id; struct sk_buff_head tx_skbs; @@ -489,7 +529,7 @@ void lan966x_ptp_deinit(struct lan966x *lan966x); int lan966x_ptp_hwtstamp_set(struct lan966x_port *port, struct ifreq *ifr); int lan966x_ptp_hwtstamp_get(struct lan966x_port *port, struct ifreq *ifr); void lan966x_ptp_rxtstamp(struct lan966x *lan966x, struct sk_buff *skb, - u64 timestamp); + u64 src_port, u64 timestamp); int lan966x_ptp_txtstamp_request(struct lan966x_port *port, struct sk_buff *skb); void lan966x_ptp_txtstamp_release(struct lan966x_port *port, @@ -502,10 +542,7 @@ int lan966x_ptp_setup_traps(struct lan966x_port *port, struct ifreq *ifr); int lan966x_ptp_del_traps(struct lan966x_port *port); int 
lan966x_fdma_xmit(struct sk_buff *skb, __be32 *ifh, struct net_device *dev); -int lan966x_fdma_xmit_xdpf(struct lan966x_port *port, - struct xdp_frame *frame, - struct page *page, - bool dma_map); +int lan966x_fdma_xmit_xdpf(struct lan966x_port *port, void *ptr, u32 len); int lan966x_fdma_change_mtu(struct lan966x *lan966x); void lan966x_fdma_netdev_init(struct lan966x *lan966x, struct net_device *dev); void lan966x_fdma_netdev_deinit(struct lan966x *lan966x, struct net_device *dev); diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_police.c b/drivers/net/ethernet/microchip/lan966x/lan966x_police.c index 7d66fe75cd3b..7302df2300fd 100644 --- a/drivers/net/ethernet/microchip/lan966x/lan966x_police.c +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_police.c @@ -49,8 +49,7 @@ static int lan966x_police_add(struct lan966x_port *port, return 0; } -static int lan966x_police_del(struct lan966x_port *port, - u16 pol_idx) +static void lan966x_police_del(struct lan966x_port *port, u16 pol_idx) { struct lan966x *lan966x = port->lan966x; @@ -67,8 +66,6 @@ static int lan966x_police_del(struct lan966x_port *port, lan_wr(ANA_POL_PIR_CFG_PIR_RATE_SET(GENMASK(14, 0)) | ANA_POL_PIR_CFG_PIR_BURST_SET(0), lan966x, ANA_POL_PIR_CFG(pol_idx)); - - return 0; } static int lan966x_police_validate(struct lan966x_port *port, @@ -186,7 +183,6 @@ int lan966x_police_port_del(struct lan966x_port *port, struct netlink_ext_ack *extack) { struct lan966x *lan966x = port->lan966x; - int err; if (port->tc.police_id != police_id) { NL_SET_ERR_MSG_MOD(extack, @@ -194,12 +190,7 @@ int lan966x_police_port_del(struct lan966x_port *port, return -EINVAL; } - err = lan966x_police_del(port, POL_IDX_PORT + port->chip_port); - if (err) { - NL_SET_ERR_MSG_MOD(extack, - "Failed to add policer to port"); - return err; - } + lan966x_police_del(port, POL_IDX_PORT + port->chip_port); lan_rmw(ANA_POL_CFG_PORT_POL_ENA_SET(0) | ANA_POL_CFG_POL_ORDER_SET(POL_ORDER), diff --git 
a/drivers/net/ethernet/microchip/lan966x/lan966x_ptp.c b/drivers/net/ethernet/microchip/lan966x/lan966x_ptp.c index 931e37b9a0ad..266a21a2d124 100644 --- a/drivers/net/ethernet/microchip/lan966x/lan966x_ptp.c +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_ptp.c @@ -272,13 +272,13 @@ int lan966x_ptp_hwtstamp_set(struct lan966x_port *port, struct ifreq *ifr) switch (cfg.tx_type) { case HWTSTAMP_TX_ON: - port->ptp_cmd = IFH_REW_OP_TWO_STEP_PTP; + port->ptp_tx_cmd = IFH_REW_OP_TWO_STEP_PTP; break; case HWTSTAMP_TX_ONESTEP_SYNC: - port->ptp_cmd = IFH_REW_OP_ONE_STEP_PTP; + port->ptp_tx_cmd = IFH_REW_OP_ONE_STEP_PTP; break; case HWTSTAMP_TX_OFF: - port->ptp_cmd = IFH_REW_OP_NOOP; + port->ptp_tx_cmd = IFH_REW_OP_NOOP; break; default: return -ERANGE; @@ -286,6 +286,7 @@ int lan966x_ptp_hwtstamp_set(struct lan966x_port *port, struct ifreq *ifr) switch (cfg.rx_filter) { case HWTSTAMP_FILTER_NONE: + port->ptp_rx_cmd = false; break; case HWTSTAMP_FILTER_ALL: case HWTSTAMP_FILTER_PTP_V1_L4_EVENT: @@ -301,6 +302,7 @@ int lan966x_ptp_hwtstamp_set(struct lan966x_port *port, struct ifreq *ifr) case HWTSTAMP_FILTER_PTP_V2_SYNC: case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ: case HWTSTAMP_FILTER_NTP_ALL: + port->ptp_rx_cmd = true; cfg.rx_filter = HWTSTAMP_FILTER_ALL; break; default: @@ -332,7 +334,7 @@ static int lan966x_ptp_classify(struct lan966x_port *port, struct sk_buff *skb) u8 msgtype; int type; - if (port->ptp_cmd == IFH_REW_OP_NOOP) + if (port->ptp_tx_cmd == IFH_REW_OP_NOOP) return IFH_REW_OP_NOOP; type = ptp_classify_raw(skb); @@ -343,7 +345,7 @@ static int lan966x_ptp_classify(struct lan966x_port *port, struct sk_buff *skb) if (!header) return IFH_REW_OP_NOOP; - if (port->ptp_cmd == IFH_REW_OP_TWO_STEP_PTP) + if (port->ptp_tx_cmd == IFH_REW_OP_TWO_STEP_PTP) return IFH_REW_OP_TWO_STEP_PTP; /* If it is sync and run 1 step then set the correct operation, @@ -1009,9 +1011,6 @@ static int lan966x_ptp_phc_init(struct lan966x *lan966x, phc->index = index; phc->lan966x = lan966x; - 
/* PTP Rx stamping is always enabled. */ - phc->hwtstamp_config.rx_filter = HWTSTAMP_FILTER_PTP_V2_EVENT; - return 0; } @@ -1088,14 +1087,15 @@ void lan966x_ptp_deinit(struct lan966x *lan966x) } void lan966x_ptp_rxtstamp(struct lan966x *lan966x, struct sk_buff *skb, - u64 timestamp) + u64 src_port, u64 timestamp) { struct skb_shared_hwtstamps *shhwtstamps; struct lan966x_phc *phc; struct timespec64 ts; u64 full_ts_in_ns; - if (!lan966x->ptp) + if (!lan966x->ptp || + !lan966x->ports[src_port]->ptp_rx_cmd) return; phc = &lan966x->phc[LAN966X_PHC_PORT]; diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h b/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h index 9767b5a1c958..f99f88b5caa8 100644 --- a/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_regs.h @@ -316,6 +316,42 @@ enum lan966x_target { #define ANA_DROP_CFG_DROP_MC_SMAC_ENA_GET(x)\ FIELD_GET(ANA_DROP_CFG_DROP_MC_SMAC_ENA, x) +/* ANA:PORT:VCAP_CFG */ +#define ANA_VCAP_CFG(g) __REG(TARGET_ANA, 0, 1, 28672, g, 9, 128, 12, 0, 1, 4) + +#define ANA_VCAP_CFG_S1_ENA BIT(14) +#define ANA_VCAP_CFG_S1_ENA_SET(x)\ + FIELD_PREP(ANA_VCAP_CFG_S1_ENA, x) +#define ANA_VCAP_CFG_S1_ENA_GET(x)\ + FIELD_GET(ANA_VCAP_CFG_S1_ENA, x) + +/* ANA:PORT:VCAP_S1_KEY_CFG */ +#define ANA_VCAP_S1_CFG(g, r) __REG(TARGET_ANA, 0, 1, 28672, g, 9, 128, 16, r, 3, 4) + +#define ANA_VCAP_S1_CFG_KEY_RT_CFG GENMASK(11, 9) +#define ANA_VCAP_S1_CFG_KEY_RT_CFG_SET(x)\ + FIELD_PREP(ANA_VCAP_S1_CFG_KEY_RT_CFG, x) +#define ANA_VCAP_S1_CFG_KEY_RT_CFG_GET(x)\ + FIELD_GET(ANA_VCAP_S1_CFG_KEY_RT_CFG, x) + +#define ANA_VCAP_S1_CFG_KEY_IP6_CFG GENMASK(8, 6) +#define ANA_VCAP_S1_CFG_KEY_IP6_CFG_SET(x)\ + FIELD_PREP(ANA_VCAP_S1_CFG_KEY_IP6_CFG, x) +#define ANA_VCAP_S1_CFG_KEY_IP6_CFG_GET(x)\ + FIELD_GET(ANA_VCAP_S1_CFG_KEY_IP6_CFG, x) + +#define ANA_VCAP_S1_CFG_KEY_IP4_CFG GENMASK(5, 3) +#define ANA_VCAP_S1_CFG_KEY_IP4_CFG_SET(x)\ + FIELD_PREP(ANA_VCAP_S1_CFG_KEY_IP4_CFG, x) +#define 
ANA_VCAP_S1_CFG_KEY_IP4_CFG_GET(x)\ + FIELD_GET(ANA_VCAP_S1_CFG_KEY_IP4_CFG, x) + +#define ANA_VCAP_S1_CFG_KEY_OTHER_CFG GENMASK(2, 0) +#define ANA_VCAP_S1_CFG_KEY_OTHER_CFG_SET(x)\ + FIELD_PREP(ANA_VCAP_S1_CFG_KEY_OTHER_CFG, x) +#define ANA_VCAP_S1_CFG_KEY_OTHER_CFG_GET(x)\ + FIELD_GET(ANA_VCAP_S1_CFG_KEY_OTHER_CFG, x) + /* ANA:PORT:VCAP_S2_CFG */ #define ANA_VCAP_S2_CFG(g) __REG(TARGET_ANA, 0, 1, 28672, g, 9, 128, 28, 0, 1, 4) diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_tc_flower.c b/drivers/net/ethernet/microchip/lan966x/lan966x_tc_flower.c index f960727ecaee..47b2f7579dd2 100644 --- a/drivers/net/ethernet/microchip/lan966x/lan966x_tc_flower.c +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_tc_flower.c @@ -5,14 +5,34 @@ #include "vcap_api_client.h" #include "vcap_tc.h" -static bool lan966x_tc_is_known_etype(u16 etype) +static bool lan966x_tc_is_known_etype(struct vcap_tc_flower_parse_usage *st, + u16 etype) { - switch (etype) { - case ETH_P_ALL: - case ETH_P_ARP: - case ETH_P_IP: - case ETH_P_IPV6: - return true; + switch (st->admin->vtype) { + case VCAP_TYPE_IS1: + switch (etype) { + case ETH_P_ALL: + case ETH_P_ARP: + case ETH_P_IP: + case ETH_P_IPV6: + return true; + } + break; + case VCAP_TYPE_IS2: + switch (etype) { + case ETH_P_ALL: + case ETH_P_ARP: + case ETH_P_IP: + case ETH_P_IPV6: + case ETH_P_SNAP: + case ETH_P_802_2: + return true; + } + break; + default: + NL_SET_ERR_MSG_MOD(st->fco->common.extack, + "VCAP type not supported"); + return false; } return false; @@ -69,7 +89,7 @@ lan966x_tc_flower_handler_basic_usage(struct vcap_tc_flower_parse_usage *st) flow_rule_match_basic(st->frule, &match); if (match.mask->n_proto) { st->l3_proto = be16_to_cpu(match.key->n_proto); - if (!lan966x_tc_is_known_etype(st->l3_proto)) { + if (!lan966x_tc_is_known_etype(st, st->l3_proto)) { err = vcap_rule_add_key_u32(st->vrule, VCAP_KF_ETYPE, st->l3_proto, ~0); if (err) @@ -79,18 +99,61 @@ lan966x_tc_flower_handler_basic_usage(struct 
vcap_tc_flower_parse_usage *st) VCAP_BIT_1); if (err) goto out; + } else if (st->l3_proto == ETH_P_IPV6 && + st->admin->vtype == VCAP_TYPE_IS1) { + /* Don't set any keys in this case */ + } else if (st->l3_proto == ETH_P_SNAP && + st->admin->vtype == VCAP_TYPE_IS1) { + err = vcap_rule_add_key_bit(st->vrule, + VCAP_KF_ETYPE_LEN_IS, + VCAP_BIT_0); + if (err) + goto out; + + err = vcap_rule_add_key_bit(st->vrule, + VCAP_KF_IP_SNAP_IS, + VCAP_BIT_1); + if (err) + goto out; + } else if (st->admin->vtype == VCAP_TYPE_IS1) { + err = vcap_rule_add_key_bit(st->vrule, + VCAP_KF_ETYPE_LEN_IS, + VCAP_BIT_1); + if (err) + goto out; + + err = vcap_rule_add_key_u32(st->vrule, VCAP_KF_ETYPE, + st->l3_proto, ~0); + if (err) + goto out; } } if (match.mask->ip_proto) { st->l4_proto = match.key->ip_proto; if (st->l4_proto == IPPROTO_TCP) { + if (st->admin->vtype == VCAP_TYPE_IS1) { + err = vcap_rule_add_key_bit(st->vrule, + VCAP_KF_TCP_UDP_IS, + VCAP_BIT_1); + if (err) + goto out; + } + err = vcap_rule_add_key_bit(st->vrule, VCAP_KF_TCP_IS, VCAP_BIT_1); if (err) goto out; } else if (st->l4_proto == IPPROTO_UDP) { + if (st->admin->vtype == VCAP_TYPE_IS1) { + err = vcap_rule_add_key_bit(st->vrule, + VCAP_KF_TCP_UDP_IS, + VCAP_BIT_1); + if (err) + goto out; + } + err = vcap_rule_add_key_bit(st->vrule, VCAP_KF_TCP_IS, VCAP_BIT_0); @@ -113,11 +176,29 @@ out: } static int +lan966x_tc_flower_handler_cvlan_usage(struct vcap_tc_flower_parse_usage *st) +{ + if (st->admin->vtype != VCAP_TYPE_IS1) { + NL_SET_ERR_MSG_MOD(st->fco->common.extack, + "cvlan not supported in this VCAP"); + return -EINVAL; + } + + return vcap_tc_flower_handler_cvlan_usage(st); +} + +static int lan966x_tc_flower_handler_vlan_usage(struct vcap_tc_flower_parse_usage *st) { - return vcap_tc_flower_handler_vlan_usage(st, - VCAP_KF_8021Q_VID_CLS, - VCAP_KF_8021Q_PCP_CLS); + enum vcap_key_field vid_key = VCAP_KF_8021Q_VID_CLS; + enum vcap_key_field pcp_key = VCAP_KF_8021Q_PCP_CLS; + + if (st->admin->vtype == VCAP_TYPE_IS1) { + 
vid_key = VCAP_KF_8021Q_VID0; + pcp_key = VCAP_KF_8021Q_PCP0; + } + + return vcap_tc_flower_handler_vlan_usage(st, vid_key, pcp_key); } static int @@ -128,6 +209,7 @@ static int [FLOW_DISSECTOR_KEY_CONTROL] = lan966x_tc_flower_handler_control_usage, [FLOW_DISSECTOR_KEY_PORTS] = vcap_tc_flower_handler_portnum_usage, [FLOW_DISSECTOR_KEY_BASIC] = lan966x_tc_flower_handler_basic_usage, + [FLOW_DISSECTOR_KEY_CVLAN] = lan966x_tc_flower_handler_cvlan_usage, [FLOW_DISSECTOR_KEY_VLAN] = lan966x_tc_flower_handler_vlan_usage, [FLOW_DISSECTOR_KEY_TCP] = vcap_tc_flower_handler_tcp_usage, [FLOW_DISSECTOR_KEY_ARP] = vcap_tc_flower_handler_arp_usage, @@ -143,6 +225,7 @@ static int lan966x_tc_flower_use_dissectors(struct flow_cls_offload *f, .fco = f, .vrule = vrule, .l3_proto = ETH_P_ALL, + .admin = admin, }; int err = 0; @@ -221,6 +304,100 @@ static int lan966x_tc_flower_action_check(struct vcap_control *vctrl, return 0; } +/* Add the actionset that is the default for the VCAP type */ +static int lan966x_tc_set_actionset(struct vcap_admin *admin, + struct vcap_rule *vrule) +{ + enum vcap_actionfield_set aset; + int err = 0; + + switch (admin->vtype) { + case VCAP_TYPE_IS1: + aset = VCAP_AFS_S1; + break; + case VCAP_TYPE_IS2: + aset = VCAP_AFS_BASE_TYPE; + break; + default: + return -EINVAL; + } + + /* Do not overwrite any current actionset */ + if (vrule->actionset == VCAP_AFS_NO_VALUE) + err = vcap_set_rule_set_actionset(vrule, aset); + + return err; +} + +static int lan966x_tc_add_rule_link_target(struct vcap_admin *admin, + struct vcap_rule *vrule, + int target_cid) +{ + int link_val = target_cid % VCAP_CID_LOOKUP_SIZE; + int err; + + if (!link_val) + return 0; + + switch (admin->vtype) { + case VCAP_TYPE_IS1: + /* Choose IS1 specific NXT_IDX key (for chaining rules from IS1) */ + err = vcap_rule_add_key_u32(vrule, VCAP_KF_LOOKUP_GEN_IDX_SEL, + 1, ~0); + if (err) + return err; + + return vcap_rule_add_key_u32(vrule, VCAP_KF_LOOKUP_GEN_IDX, + link_val, ~0); + case 
VCAP_TYPE_IS2: + /* Add IS2 specific PAG key (for chaining rules from IS1) */ + return vcap_rule_add_key_u32(vrule, VCAP_KF_LOOKUP_PAG, + link_val, ~0); + default: + break; + } + return 0; +} + +static int lan966x_tc_add_rule_link(struct vcap_control *vctrl, + struct vcap_admin *admin, + struct vcap_rule *vrule, + struct flow_cls_offload *f, + int to_cid) +{ + struct vcap_admin *to_admin = vcap_find_admin(vctrl, to_cid); + int diff, err = 0; + + if (!to_admin) { + NL_SET_ERR_MSG_MOD(f->common.extack, + "Unknown destination chain"); + return -EINVAL; + } + + diff = vcap_chain_offset(vctrl, f->common.chain_index, to_cid); + if (!diff) + return 0; + + /* Between IS1 and IS2 the PAG value is used */ + if (admin->vtype == VCAP_TYPE_IS1 && to_admin->vtype == VCAP_TYPE_IS2) { + /* This works for IS1->IS2 */ + err = vcap_rule_add_action_u32(vrule, VCAP_AF_PAG_VAL, diff); + if (err) + return err; + + err = vcap_rule_add_action_u32(vrule, VCAP_AF_PAG_OVERRIDE_MASK, + 0xff); + if (err) + return err; + } else { + NL_SET_ERR_MSG_MOD(f->common.extack, + "Unsupported chain destination"); + return -EOPNOTSUPP; + } + + return err; +} + static int lan966x_tc_flower_add(struct lan966x_port *port, struct flow_cls_offload *f, struct vcap_admin *admin, @@ -248,11 +425,23 @@ static int lan966x_tc_flower_add(struct lan966x_port *port, if (err) goto out; + err = lan966x_tc_add_rule_link_target(admin, vrule, + f->common.chain_index); + if (err) + goto out; + frule = flow_cls_offload_flow_rule(f); flow_action_for_each(idx, act, &frule->action) { switch (act->id) { case FLOW_ACTION_TRAP: + if (admin->vtype != VCAP_TYPE_IS2) { + NL_SET_ERR_MSG_MOD(f->common.extack, + "Trap action not supported in this VCAP"); + err = -EOPNOTSUPP; + goto out; + } + err = vcap_rule_add_action_bit(vrule, VCAP_AF_CPU_COPY_ENA, VCAP_BIT_1); @@ -266,6 +455,16 @@ static int lan966x_tc_flower_add(struct lan966x_port *port, break; case FLOW_ACTION_GOTO: + err = lan966x_tc_set_actionset(admin, vrule); + if (err) + goto 
out; + + err = lan966x_tc_add_rule_link(port->lan966x->vcap_ctrl, + admin, vrule, + f, act->chain_index); + if (err) + goto out; + break; default: NL_SET_ERR_MSG_MOD(f->common.extack, diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_vcap_ag_api.c b/drivers/net/ethernet/microchip/lan966x/lan966x_vcap_ag_api.c index 928e711960e6..66400a082d02 100644 --- a/drivers/net/ethernet/microchip/lan966x/lan966x_vcap_ag_api.c +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_vcap_ag_api.c @@ -6,6 +6,965 @@ #include "lan966x_vcap_ag_api.h" /* keyfields */ +static const struct vcap_field is1_normal_keyfield[] = { + [VCAP_KF_TYPE] = { + .type = VCAP_FIELD_BIT, + .offset = 0, + .width = 1, + }, + [VCAP_KF_LOOKUP_INDEX] = { + .type = VCAP_FIELD_U32, + .offset = 1, + .width = 2, + }, + [VCAP_KF_IF_IGR_PORT_MASK] = { + .type = VCAP_FIELD_U32, + .offset = 3, + .width = 9, + }, + [VCAP_KF_L2_MC_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 12, + .width = 1, + }, + [VCAP_KF_L2_BC_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 13, + .width = 1, + }, + [VCAP_KF_IP_MC_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 14, + .width = 1, + }, + [VCAP_KF_8021CB_R_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 15, + .width = 1, + }, + [VCAP_KF_8021Q_VLAN_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 16, + .width = 1, + }, + [VCAP_KF_8021Q_VLAN_DBL_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 17, + .width = 1, + }, + [VCAP_KF_8021Q_TPID0] = { + .type = VCAP_FIELD_BIT, + .offset = 18, + .width = 1, + }, + [VCAP_KF_8021Q_VID0] = { + .type = VCAP_FIELD_U32, + .offset = 19, + .width = 12, + }, + [VCAP_KF_8021Q_DEI0] = { + .type = VCAP_FIELD_BIT, + .offset = 31, + .width = 1, + }, + [VCAP_KF_8021Q_PCP0] = { + .type = VCAP_FIELD_U32, + .offset = 32, + .width = 3, + }, + [VCAP_KF_L2_SMAC] = { + .type = VCAP_FIELD_U48, + .offset = 35, + .width = 48, + }, + [VCAP_KF_ETYPE_LEN_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 83, + .width = 1, + }, + [VCAP_KF_ETYPE] = { + .type = 
VCAP_FIELD_U32, + .offset = 84, + .width = 16, + }, + [VCAP_KF_IP_SNAP_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 100, + .width = 1, + }, + [VCAP_KF_IP4_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 101, + .width = 1, + }, + [VCAP_KF_L3_FRAGMENT] = { + .type = VCAP_FIELD_BIT, + .offset = 102, + .width = 1, + }, + [VCAP_KF_L3_FRAG_OFS_GT0] = { + .type = VCAP_FIELD_BIT, + .offset = 103, + .width = 1, + }, + [VCAP_KF_L3_OPTIONS_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 104, + .width = 1, + }, + [VCAP_KF_L3_DSCP] = { + .type = VCAP_FIELD_U32, + .offset = 105, + .width = 6, + }, + [VCAP_KF_L3_IP4_SIP] = { + .type = VCAP_FIELD_U32, + .offset = 111, + .width = 32, + }, + [VCAP_KF_TCP_UDP_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 143, + .width = 1, + }, + [VCAP_KF_TCP_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 144, + .width = 1, + }, + [VCAP_KF_L4_SPORT] = { + .type = VCAP_FIELD_U32, + .offset = 145, + .width = 16, + }, + [VCAP_KF_L4_RNG] = { + .type = VCAP_FIELD_U32, + .offset = 161, + .width = 8, + }, +}; + +static const struct vcap_field is1_5tuple_ip4_keyfield[] = { + [VCAP_KF_TYPE] = { + .type = VCAP_FIELD_BIT, + .offset = 0, + .width = 1, + }, + [VCAP_KF_LOOKUP_INDEX] = { + .type = VCAP_FIELD_U32, + .offset = 1, + .width = 2, + }, + [VCAP_KF_IF_IGR_PORT_MASK] = { + .type = VCAP_FIELD_U32, + .offset = 3, + .width = 9, + }, + [VCAP_KF_L2_MC_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 12, + .width = 1, + }, + [VCAP_KF_L2_BC_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 13, + .width = 1, + }, + [VCAP_KF_IP_MC_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 14, + .width = 1, + }, + [VCAP_KF_8021CB_R_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 15, + .width = 1, + }, + [VCAP_KF_8021Q_VLAN_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 16, + .width = 1, + }, + [VCAP_KF_8021Q_VLAN_DBL_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 17, + .width = 1, + }, + [VCAP_KF_8021Q_TPID0] = { + .type = VCAP_FIELD_BIT, + .offset = 18, + .width = 1, + }, 
+ [VCAP_KF_8021Q_VID0] = { + .type = VCAP_FIELD_U32, + .offset = 19, + .width = 12, + }, + [VCAP_KF_8021Q_DEI0] = { + .type = VCAP_FIELD_BIT, + .offset = 31, + .width = 1, + }, + [VCAP_KF_8021Q_PCP0] = { + .type = VCAP_FIELD_U32, + .offset = 32, + .width = 3, + }, + [VCAP_KF_8021Q_TPID1] = { + .type = VCAP_FIELD_BIT, + .offset = 35, + .width = 1, + }, + [VCAP_KF_8021Q_VID1] = { + .type = VCAP_FIELD_U32, + .offset = 36, + .width = 12, + }, + [VCAP_KF_8021Q_DEI1] = { + .type = VCAP_FIELD_BIT, + .offset = 48, + .width = 1, + }, + [VCAP_KF_8021Q_PCP1] = { + .type = VCAP_FIELD_U32, + .offset = 49, + .width = 3, + }, + [VCAP_KF_IP4_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 52, + .width = 1, + }, + [VCAP_KF_L3_FRAGMENT] = { + .type = VCAP_FIELD_BIT, + .offset = 53, + .width = 1, + }, + [VCAP_KF_L3_FRAG_OFS_GT0] = { + .type = VCAP_FIELD_BIT, + .offset = 54, + .width = 1, + }, + [VCAP_KF_L3_OPTIONS_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 55, + .width = 1, + }, + [VCAP_KF_L3_DSCP] = { + .type = VCAP_FIELD_U32, + .offset = 56, + .width = 6, + }, + [VCAP_KF_L3_IP4_DIP] = { + .type = VCAP_FIELD_U32, + .offset = 62, + .width = 32, + }, + [VCAP_KF_L3_IP4_SIP] = { + .type = VCAP_FIELD_U32, + .offset = 94, + .width = 32, + }, + [VCAP_KF_L3_IP_PROTO] = { + .type = VCAP_FIELD_U32, + .offset = 126, + .width = 8, + }, + [VCAP_KF_TCP_UDP_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 134, + .width = 1, + }, + [VCAP_KF_TCP_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 135, + .width = 1, + }, + [VCAP_KF_L4_RNG] = { + .type = VCAP_FIELD_U32, + .offset = 136, + .width = 8, + }, + [VCAP_KF_IP_PAYLOAD_5TUPLE] = { + .type = VCAP_FIELD_U32, + .offset = 144, + .width = 32, + }, +}; + +static const struct vcap_field is1_normal_ip6_keyfield[] = { + [VCAP_KF_TYPE] = { + .type = VCAP_FIELD_U32, + .offset = 0, + .width = 2, + }, + [VCAP_KF_LOOKUP_INDEX] = { + .type = VCAP_FIELD_U32, + .offset = 2, + .width = 2, + }, + [VCAP_KF_IF_IGR_PORT_MASK] = { + .type = VCAP_FIELD_U32, + .offset = 4, + 
.width = 9, + }, + [VCAP_KF_L2_MC_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 13, + .width = 1, + }, + [VCAP_KF_L2_BC_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 14, + .width = 1, + }, + [VCAP_KF_IP_MC_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 15, + .width = 1, + }, + [VCAP_KF_8021CB_R_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 16, + .width = 1, + }, + [VCAP_KF_8021Q_VLAN_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 17, + .width = 1, + }, + [VCAP_KF_8021Q_VLAN_DBL_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 18, + .width = 1, + }, + [VCAP_KF_8021Q_TPID0] = { + .type = VCAP_FIELD_BIT, + .offset = 19, + .width = 1, + }, + [VCAP_KF_8021Q_VID0] = { + .type = VCAP_FIELD_U32, + .offset = 20, + .width = 12, + }, + [VCAP_KF_8021Q_DEI0] = { + .type = VCAP_FIELD_BIT, + .offset = 32, + .width = 1, + }, + [VCAP_KF_8021Q_PCP0] = { + .type = VCAP_FIELD_U32, + .offset = 33, + .width = 3, + }, + [VCAP_KF_8021Q_TPID1] = { + .type = VCAP_FIELD_BIT, + .offset = 36, + .width = 1, + }, + [VCAP_KF_8021Q_VID1] = { + .type = VCAP_FIELD_U32, + .offset = 37, + .width = 12, + }, + [VCAP_KF_8021Q_DEI1] = { + .type = VCAP_FIELD_BIT, + .offset = 49, + .width = 1, + }, + [VCAP_KF_8021Q_PCP1] = { + .type = VCAP_FIELD_U32, + .offset = 50, + .width = 3, + }, + [VCAP_KF_L2_SMAC] = { + .type = VCAP_FIELD_U48, + .offset = 53, + .width = 48, + }, + [VCAP_KF_L3_DSCP] = { + .type = VCAP_FIELD_U32, + .offset = 101, + .width = 6, + }, + [VCAP_KF_L3_IP6_SIP] = { + .type = VCAP_FIELD_U128, + .offset = 107, + .width = 128, + }, + [VCAP_KF_L3_IP_PROTO] = { + .type = VCAP_FIELD_U32, + .offset = 235, + .width = 8, + }, + [VCAP_KF_TCP_UDP_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 243, + .width = 1, + }, + [VCAP_KF_L4_RNG] = { + .type = VCAP_FIELD_U32, + .offset = 244, + .width = 8, + }, + [VCAP_KF_IP_PAYLOAD_S1_IP6] = { + .type = VCAP_FIELD_U112, + .offset = 252, + .width = 112, + }, +}; + +static const struct vcap_field is1_7tuple_keyfield[] = { + [VCAP_KF_TYPE] = { + .type 
= VCAP_FIELD_U32, + .offset = 0, + .width = 2, + }, + [VCAP_KF_LOOKUP_INDEX] = { + .type = VCAP_FIELD_U32, + .offset = 2, + .width = 2, + }, + [VCAP_KF_IF_IGR_PORT_MASK] = { + .type = VCAP_FIELD_U32, + .offset = 4, + .width = 9, + }, + [VCAP_KF_L2_MC_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 13, + .width = 1, + }, + [VCAP_KF_L2_BC_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 14, + .width = 1, + }, + [VCAP_KF_IP_MC_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 15, + .width = 1, + }, + [VCAP_KF_8021CB_R_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 16, + .width = 1, + }, + [VCAP_KF_8021Q_VLAN_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 17, + .width = 1, + }, + [VCAP_KF_8021Q_VLAN_DBL_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 18, + .width = 1, + }, + [VCAP_KF_8021Q_TPID0] = { + .type = VCAP_FIELD_BIT, + .offset = 19, + .width = 1, + }, + [VCAP_KF_8021Q_VID0] = { + .type = VCAP_FIELD_U32, + .offset = 20, + .width = 12, + }, + [VCAP_KF_8021Q_DEI0] = { + .type = VCAP_FIELD_BIT, + .offset = 32, + .width = 1, + }, + [VCAP_KF_8021Q_PCP0] = { + .type = VCAP_FIELD_U32, + .offset = 33, + .width = 3, + }, + [VCAP_KF_8021Q_TPID1] = { + .type = VCAP_FIELD_BIT, + .offset = 36, + .width = 1, + }, + [VCAP_KF_8021Q_VID1] = { + .type = VCAP_FIELD_U32, + .offset = 37, + .width = 12, + }, + [VCAP_KF_8021Q_DEI1] = { + .type = VCAP_FIELD_BIT, + .offset = 49, + .width = 1, + }, + [VCAP_KF_8021Q_PCP1] = { + .type = VCAP_FIELD_U32, + .offset = 50, + .width = 3, + }, + [VCAP_KF_L2_DMAC] = { + .type = VCAP_FIELD_U48, + .offset = 53, + .width = 48, + }, + [VCAP_KF_L2_SMAC] = { + .type = VCAP_FIELD_U48, + .offset = 101, + .width = 48, + }, + [VCAP_KF_ETYPE_LEN_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 149, + .width = 1, + }, + [VCAP_KF_ETYPE] = { + .type = VCAP_FIELD_U32, + .offset = 150, + .width = 16, + }, + [VCAP_KF_IP_SNAP_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 166, + .width = 1, + }, + [VCAP_KF_IP4_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 167, 
+ .width = 1, + }, + [VCAP_KF_L3_FRAGMENT] = { + .type = VCAP_FIELD_BIT, + .offset = 168, + .width = 1, + }, + [VCAP_KF_L3_FRAG_OFS_GT0] = { + .type = VCAP_FIELD_BIT, + .offset = 169, + .width = 1, + }, + [VCAP_KF_L3_OPTIONS_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 170, + .width = 1, + }, + [VCAP_KF_L3_DSCP] = { + .type = VCAP_FIELD_U32, + .offset = 171, + .width = 6, + }, + [VCAP_KF_L3_IP6_DIP_MSB] = { + .type = VCAP_FIELD_U32, + .offset = 177, + .width = 16, + }, + [VCAP_KF_L3_IP6_DIP] = { + .type = VCAP_FIELD_U64, + .offset = 193, + .width = 64, + }, + [VCAP_KF_L3_IP6_SIP_MSB] = { + .type = VCAP_FIELD_U32, + .offset = 257, + .width = 16, + }, + [VCAP_KF_L3_IP6_SIP] = { + .type = VCAP_FIELD_U64, + .offset = 273, + .width = 64, + }, + [VCAP_KF_TCP_UDP_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 337, + .width = 1, + }, + [VCAP_KF_TCP_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 338, + .width = 1, + }, + [VCAP_KF_L4_SPORT] = { + .type = VCAP_FIELD_U32, + .offset = 339, + .width = 16, + }, + [VCAP_KF_L4_RNG] = { + .type = VCAP_FIELD_U32, + .offset = 355, + .width = 8, + }, +}; + +static const struct vcap_field is1_5tuple_ip6_keyfield[] = { + [VCAP_KF_TYPE] = { + .type = VCAP_FIELD_U32, + .offset = 0, + .width = 2, + }, + [VCAP_KF_LOOKUP_INDEX] = { + .type = VCAP_FIELD_U32, + .offset = 2, + .width = 2, + }, + [VCAP_KF_IF_IGR_PORT_MASK] = { + .type = VCAP_FIELD_U32, + .offset = 4, + .width = 9, + }, + [VCAP_KF_L2_MC_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 13, + .width = 1, + }, + [VCAP_KF_L2_BC_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 14, + .width = 1, + }, + [VCAP_KF_IP_MC_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 15, + .width = 1, + }, + [VCAP_KF_8021CB_R_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 16, + .width = 1, + }, + [VCAP_KF_8021Q_VLAN_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 17, + .width = 1, + }, + [VCAP_KF_8021Q_VLAN_DBL_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 18, + .width = 1, + }, + 
[VCAP_KF_8021Q_TPID0] = { + .type = VCAP_FIELD_BIT, + .offset = 19, + .width = 1, + }, + [VCAP_KF_8021Q_VID0] = { + .type = VCAP_FIELD_U32, + .offset = 20, + .width = 12, + }, + [VCAP_KF_8021Q_DEI0] = { + .type = VCAP_FIELD_BIT, + .offset = 32, + .width = 1, + }, + [VCAP_KF_8021Q_PCP0] = { + .type = VCAP_FIELD_U32, + .offset = 33, + .width = 3, + }, + [VCAP_KF_8021Q_TPID1] = { + .type = VCAP_FIELD_BIT, + .offset = 36, + .width = 1, + }, + [VCAP_KF_8021Q_VID1] = { + .type = VCAP_FIELD_U32, + .offset = 37, + .width = 12, + }, + [VCAP_KF_8021Q_DEI1] = { + .type = VCAP_FIELD_BIT, + .offset = 49, + .width = 1, + }, + [VCAP_KF_8021Q_PCP1] = { + .type = VCAP_FIELD_U32, + .offset = 50, + .width = 3, + }, + [VCAP_KF_L3_DSCP] = { + .type = VCAP_FIELD_U32, + .offset = 53, + .width = 6, + }, + [VCAP_KF_L3_IP6_DIP] = { + .type = VCAP_FIELD_U128, + .offset = 59, + .width = 128, + }, + [VCAP_KF_L3_IP6_SIP] = { + .type = VCAP_FIELD_U128, + .offset = 187, + .width = 128, + }, + [VCAP_KF_L3_IP_PROTO] = { + .type = VCAP_FIELD_U32, + .offset = 315, + .width = 8, + }, + [VCAP_KF_TCP_UDP_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 323, + .width = 1, + }, + [VCAP_KF_L4_RNG] = { + .type = VCAP_FIELD_U32, + .offset = 324, + .width = 8, + }, + [VCAP_KF_IP_PAYLOAD_5TUPLE] = { + .type = VCAP_FIELD_U32, + .offset = 332, + .width = 32, + }, +}; + +static const struct vcap_field is1_dbl_vid_keyfield[] = { + [VCAP_KF_TYPE] = { + .type = VCAP_FIELD_U32, + .offset = 0, + .width = 2, + }, + [VCAP_KF_LOOKUP_INDEX] = { + .type = VCAP_FIELD_U32, + .offset = 2, + .width = 2, + }, + [VCAP_KF_IF_IGR_PORT_MASK] = { + .type = VCAP_FIELD_U32, + .offset = 4, + .width = 9, + }, + [VCAP_KF_L2_MC_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 13, + .width = 1, + }, + [VCAP_KF_L2_BC_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 14, + .width = 1, + }, + [VCAP_KF_IP_MC_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 15, + .width = 1, + }, + [VCAP_KF_8021CB_R_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 16, + 
.width = 1, + }, + [VCAP_KF_8021Q_VLAN_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 17, + .width = 1, + }, + [VCAP_KF_8021Q_VLAN_DBL_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 18, + .width = 1, + }, + [VCAP_KF_8021Q_TPID0] = { + .type = VCAP_FIELD_BIT, + .offset = 19, + .width = 1, + }, + [VCAP_KF_8021Q_VID0] = { + .type = VCAP_FIELD_U32, + .offset = 20, + .width = 12, + }, + [VCAP_KF_8021Q_DEI0] = { + .type = VCAP_FIELD_BIT, + .offset = 32, + .width = 1, + }, + [VCAP_KF_8021Q_PCP0] = { + .type = VCAP_FIELD_U32, + .offset = 33, + .width = 3, + }, + [VCAP_KF_8021Q_TPID1] = { + .type = VCAP_FIELD_BIT, + .offset = 36, + .width = 1, + }, + [VCAP_KF_8021Q_VID1] = { + .type = VCAP_FIELD_U32, + .offset = 37, + .width = 12, + }, + [VCAP_KF_8021Q_DEI1] = { + .type = VCAP_FIELD_BIT, + .offset = 49, + .width = 1, + }, + [VCAP_KF_8021Q_PCP1] = { + .type = VCAP_FIELD_U32, + .offset = 50, + .width = 3, + }, + [VCAP_KF_ETYPE_LEN_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 53, + .width = 1, + }, + [VCAP_KF_ETYPE] = { + .type = VCAP_FIELD_U32, + .offset = 54, + .width = 16, + }, + [VCAP_KF_IP_SNAP_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 70, + .width = 1, + }, + [VCAP_KF_IP4_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 71, + .width = 1, + }, + [VCAP_KF_L3_FRAGMENT] = { + .type = VCAP_FIELD_BIT, + .offset = 72, + .width = 1, + }, + [VCAP_KF_L3_FRAG_OFS_GT0] = { + .type = VCAP_FIELD_BIT, + .offset = 73, + .width = 1, + }, + [VCAP_KF_L3_OPTIONS_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 74, + .width = 1, + }, + [VCAP_KF_L3_DSCP] = { + .type = VCAP_FIELD_U32, + .offset = 75, + .width = 6, + }, + [VCAP_KF_TCP_UDP_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 81, + .width = 1, + }, + [VCAP_KF_TCP_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 82, + .width = 1, + }, +}; + +static const struct vcap_field is1_rt_keyfield[] = { + [VCAP_KF_TYPE] = { + .type = VCAP_FIELD_U32, + .offset = 0, + .width = 2, + }, + [VCAP_KF_LOOKUP_FIRST_IS] = { + .type = VCAP_FIELD_BIT, 
+ .offset = 2, + .width = 1, + }, + [VCAP_KF_IF_IGR_PORT] = { + .type = VCAP_FIELD_U32, + .offset = 3, + .width = 3, + }, + [VCAP_KF_8021CB_R_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 6, + .width = 1, + }, + [VCAP_KF_8021Q_VLAN_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 7, + .width = 1, + }, + [VCAP_KF_8021Q_VLAN_DBL_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 8, + .width = 1, + }, + [VCAP_KF_L2_MAC] = { + .type = VCAP_FIELD_U48, + .offset = 9, + .width = 48, + }, + [VCAP_KF_RT_VLAN_IDX] = { + .type = VCAP_FIELD_U32, + .offset = 57, + .width = 3, + }, + [VCAP_KF_RT_TYPE] = { + .type = VCAP_FIELD_U32, + .offset = 60, + .width = 2, + }, + [VCAP_KF_RT_FRMID] = { + .type = VCAP_FIELD_U32, + .offset = 62, + .width = 32, + }, +}; + +static const struct vcap_field is1_dmac_vid_keyfield[] = { + [VCAP_KF_TYPE] = { + .type = VCAP_FIELD_U32, + .offset = 0, + .width = 2, + }, + [VCAP_KF_LOOKUP_INDEX] = { + .type = VCAP_FIELD_U32, + .offset = 2, + .width = 2, + }, + [VCAP_KF_IF_IGR_PORT_MASK] = { + .type = VCAP_FIELD_U32, + .offset = 4, + .width = 9, + }, + [VCAP_KF_8021CB_R_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 13, + .width = 1, + }, + [VCAP_KF_8021Q_VLAN_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 14, + .width = 1, + }, + [VCAP_KF_8021Q_VLAN_DBL_TAGGED_IS] = { + .type = VCAP_FIELD_BIT, + .offset = 15, + .width = 1, + }, + [VCAP_KF_8021Q_TPID0] = { + .type = VCAP_FIELD_BIT, + .offset = 16, + .width = 1, + }, + [VCAP_KF_8021Q_VID0] = { + .type = VCAP_FIELD_U32, + .offset = 17, + .width = 12, + }, + [VCAP_KF_8021Q_DEI0] = { + .type = VCAP_FIELD_BIT, + .offset = 29, + .width = 1, + }, + [VCAP_KF_8021Q_PCP0] = { + .type = VCAP_FIELD_U32, + .offset = 30, + .width = 3, + }, + [VCAP_KF_L2_DMAC] = { + .type = VCAP_FIELD_U48, + .offset = 33, + .width = 48, + }, +}; + static const struct vcap_field is2_mac_etype_keyfield[] = { [VCAP_KF_TYPE] = { .type = VCAP_FIELD_U32, @@ -1163,6 +2122,49 @@ static const struct vcap_field 
is2_smac_sip6_keyfield[] = { }; /* keyfield_set */ +static const struct vcap_set is1_keyfield_set[] = { + [VCAP_KFS_NORMAL] = { + .type_id = 0, + .sw_per_item = 2, + .sw_cnt = 2, + }, + [VCAP_KFS_5TUPLE_IP4] = { + .type_id = 1, + .sw_per_item = 2, + .sw_cnt = 2, + }, + [VCAP_KFS_NORMAL_IP6] = { + .type_id = 0, + .sw_per_item = 4, + .sw_cnt = 1, + }, + [VCAP_KFS_7TUPLE] = { + .type_id = 1, + .sw_per_item = 4, + .sw_cnt = 1, + }, + [VCAP_KFS_5TUPLE_IP6] = { + .type_id = 2, + .sw_per_item = 4, + .sw_cnt = 1, + }, + [VCAP_KFS_DBL_VID] = { + .type_id = 0, + .sw_per_item = 1, + .sw_cnt = 4, + }, + [VCAP_KFS_RT] = { + .type_id = 1, + .sw_per_item = 1, + .sw_cnt = 4, + }, + [VCAP_KFS_DMAC_VID] = { + .type_id = 2, + .sw_per_item = 1, + .sw_cnt = 4, + }, +}; + static const struct vcap_set is2_keyfield_set[] = { [VCAP_KFS_MAC_ETYPE] = { .type_id = 0, @@ -1227,6 +2229,17 @@ static const struct vcap_set is2_keyfield_set[] = { }; /* keyfield_set map */ +static const struct vcap_field *is1_keyfield_set_map[] = { + [VCAP_KFS_NORMAL] = is1_normal_keyfield, + [VCAP_KFS_5TUPLE_IP4] = is1_5tuple_ip4_keyfield, + [VCAP_KFS_NORMAL_IP6] = is1_normal_ip6_keyfield, + [VCAP_KFS_7TUPLE] = is1_7tuple_keyfield, + [VCAP_KFS_5TUPLE_IP6] = is1_5tuple_ip6_keyfield, + [VCAP_KFS_DBL_VID] = is1_dbl_vid_keyfield, + [VCAP_KFS_RT] = is1_rt_keyfield, + [VCAP_KFS_DMAC_VID] = is1_dmac_vid_keyfield, +}; + static const struct vcap_field *is2_keyfield_set_map[] = { [VCAP_KFS_MAC_ETYPE] = is2_mac_etype_keyfield, [VCAP_KFS_MAC_LLC] = is2_mac_llc_keyfield, @@ -1243,6 +2256,17 @@ static const struct vcap_field *is2_keyfield_set_map[] = { }; /* keyfield_set map sizes */ +static int is1_keyfield_set_map_size[] = { + [VCAP_KFS_NORMAL] = ARRAY_SIZE(is1_normal_keyfield), + [VCAP_KFS_5TUPLE_IP4] = ARRAY_SIZE(is1_5tuple_ip4_keyfield), + [VCAP_KFS_NORMAL_IP6] = ARRAY_SIZE(is1_normal_ip6_keyfield), + [VCAP_KFS_7TUPLE] = ARRAY_SIZE(is1_7tuple_keyfield), + [VCAP_KFS_5TUPLE_IP6] = ARRAY_SIZE(is1_5tuple_ip6_keyfield), + 
[VCAP_KFS_DBL_VID] = ARRAY_SIZE(is1_dbl_vid_keyfield), + [VCAP_KFS_RT] = ARRAY_SIZE(is1_rt_keyfield), + [VCAP_KFS_DMAC_VID] = ARRAY_SIZE(is1_dmac_vid_keyfield), +}; + static int is2_keyfield_set_map_size[] = { [VCAP_KFS_MAC_ETYPE] = ARRAY_SIZE(is2_mac_etype_keyfield), [VCAP_KFS_MAC_LLC] = ARRAY_SIZE(is2_mac_llc_keyfield), @@ -1259,6 +2283,154 @@ static int is2_keyfield_set_map_size[] = { }; /* actionfields */ +static const struct vcap_field is1_s1_actionfield[] = { + [VCAP_AF_TYPE] = { + .type = VCAP_FIELD_BIT, + .offset = 0, + .width = 1, + }, + [VCAP_AF_DSCP_ENA] = { + .type = VCAP_FIELD_BIT, + .offset = 1, + .width = 1, + }, + [VCAP_AF_DSCP_VAL] = { + .type = VCAP_FIELD_U32, + .offset = 2, + .width = 6, + }, + [VCAP_AF_QOS_ENA] = { + .type = VCAP_FIELD_BIT, + .offset = 8, + .width = 1, + }, + [VCAP_AF_QOS_VAL] = { + .type = VCAP_FIELD_U32, + .offset = 9, + .width = 3, + }, + [VCAP_AF_DP_ENA] = { + .type = VCAP_FIELD_BIT, + .offset = 12, + .width = 1, + }, + [VCAP_AF_DP_VAL] = { + .type = VCAP_FIELD_BIT, + .offset = 13, + .width = 1, + }, + [VCAP_AF_PAG_OVERRIDE_MASK] = { + .type = VCAP_FIELD_U32, + .offset = 14, + .width = 8, + }, + [VCAP_AF_PAG_VAL] = { + .type = VCAP_FIELD_U32, + .offset = 22, + .width = 8, + }, + [VCAP_AF_ISDX_REPLACE_ENA] = { + .type = VCAP_FIELD_BIT, + .offset = 30, + .width = 1, + }, + [VCAP_AF_ISDX_ADD_VAL] = { + .type = VCAP_FIELD_U32, + .offset = 31, + .width = 8, + }, + [VCAP_AF_VID_REPLACE_ENA] = { + .type = VCAP_FIELD_BIT, + .offset = 39, + .width = 1, + }, + [VCAP_AF_VID_VAL] = { + .type = VCAP_FIELD_U32, + .offset = 40, + .width = 12, + }, + [VCAP_AF_PCP_ENA] = { + .type = VCAP_FIELD_BIT, + .offset = 67, + .width = 1, + }, + [VCAP_AF_PCP_VAL] = { + .type = VCAP_FIELD_U32, + .offset = 68, + .width = 3, + }, + [VCAP_AF_DEI_ENA] = { + .type = VCAP_FIELD_BIT, + .offset = 71, + .width = 1, + }, + [VCAP_AF_DEI_VAL] = { + .type = VCAP_FIELD_BIT, + .offset = 72, + .width = 1, + }, + [VCAP_AF_VLAN_POP_CNT_ENA] = { + .type = VCAP_FIELD_BIT, 
+ .offset = 73, + .width = 1, + }, + [VCAP_AF_VLAN_POP_CNT] = { + .type = VCAP_FIELD_U32, + .offset = 74, + .width = 2, + }, + [VCAP_AF_CUSTOM_ACE_TYPE_ENA] = { + .type = VCAP_FIELD_U32, + .offset = 76, + .width = 4, + }, + [VCAP_AF_SFID_ENA] = { + .type = VCAP_FIELD_BIT, + .offset = 80, + .width = 1, + }, + [VCAP_AF_SFID_VAL] = { + .type = VCAP_FIELD_U32, + .offset = 81, + .width = 8, + }, + [VCAP_AF_SGID_ENA] = { + .type = VCAP_FIELD_BIT, + .offset = 89, + .width = 1, + }, + [VCAP_AF_SGID_VAL] = { + .type = VCAP_FIELD_U32, + .offset = 90, + .width = 8, + }, + [VCAP_AF_POLICE_ENA] = { + .type = VCAP_FIELD_BIT, + .offset = 98, + .width = 1, + }, + [VCAP_AF_POLICE_IDX] = { + .type = VCAP_FIELD_U32, + .offset = 99, + .width = 9, + }, + [VCAP_AF_OAM_SEL] = { + .type = VCAP_FIELD_U32, + .offset = 108, + .width = 3, + }, + [VCAP_AF_MRP_SEL] = { + .type = VCAP_FIELD_U32, + .offset = 111, + .width = 2, + }, + [VCAP_AF_DLR_SEL] = { + .type = VCAP_FIELD_U32, + .offset = 113, + .width = 2, + }, +}; + static const struct vcap_field is2_base_type_actionfield[] = { [VCAP_AF_HIT_ME_ONCE] = { .type = VCAP_FIELD_BIT, @@ -1351,6 +2523,14 @@ static const struct vcap_field is2_smac_sip_actionfield[] = { }; /* actionfield_set */ +static const struct vcap_set is1_actionfield_set[] = { + [VCAP_AFS_S1] = { + .type_id = 0, + .sw_per_item = 1, + .sw_cnt = 4, + }, +}; + static const struct vcap_set is2_actionfield_set[] = { [VCAP_AFS_BASE_TYPE] = { .type_id = -1, @@ -1365,18 +2545,73 @@ static const struct vcap_set is2_actionfield_set[] = { }; /* actionfield_set map */ +static const struct vcap_field *is1_actionfield_set_map[] = { + [VCAP_AFS_S1] = is1_s1_actionfield, +}; + static const struct vcap_field *is2_actionfield_set_map[] = { [VCAP_AFS_BASE_TYPE] = is2_base_type_actionfield, [VCAP_AFS_SMAC_SIP] = is2_smac_sip_actionfield, }; /* actionfield_set map size */ +static int is1_actionfield_set_map_size[] = { + [VCAP_AFS_S1] = ARRAY_SIZE(is1_s1_actionfield), +}; + static int 
is2_actionfield_set_map_size[] = { [VCAP_AFS_BASE_TYPE] = ARRAY_SIZE(is2_base_type_actionfield), [VCAP_AFS_SMAC_SIP] = ARRAY_SIZE(is2_smac_sip_actionfield), }; /* Type Groups */ +static const struct vcap_typegroup is1_x4_keyfield_set_typegroups[] = { + { + .offset = 0, + .width = 3, + .value = 4, + }, + { + .offset = 96, + .width = 1, + .value = 0, + }, + { + .offset = 192, + .width = 2, + .value = 0, + }, + { + .offset = 288, + .width = 1, + .value = 0, + }, + {} +}; + +static const struct vcap_typegroup is1_x2_keyfield_set_typegroups[] = { + { + .offset = 0, + .width = 2, + .value = 2, + }, + { + .offset = 96, + .width = 1, + .value = 0, + }, + {} +}; + +static const struct vcap_typegroup is1_x1_keyfield_set_typegroups[] = { + { + .offset = 0, + .width = 1, + .value = 1, + }, + {} +}; + static const struct vcap_typegroup is2_x4_keyfield_set_typegroups[] = { { .offset = 0, @@ -1424,6 +2659,13 @@ static const struct vcap_typegroup is2_x1_keyfield_set_typegroups[] = { {} }; +static const struct vcap_typegroup *is1_keyfield_set_typegroups[] = { + [4] = is1_x4_keyfield_set_typegroups, + [2] = is1_x2_keyfield_set_typegroups, + [1] = is1_x1_keyfield_set_typegroups, + [5] = NULL, +}; + static const struct vcap_typegroup *is2_keyfield_set_typegroups[] = { [4] = is2_x4_keyfield_set_typegroups, [2] = is2_x2_keyfield_set_typegroups, @@ -1431,6 +2673,10 @@ static const struct vcap_typegroup *is2_keyfield_set_typegroups[] = { [5] = NULL, }; +static const struct vcap_typegroup is1_x1_actionfield_set_typegroups[] = { + {} +}; + static const struct vcap_typegroup is2_x2_actionfield_set_typegroups[] = { { .offset = 0, @@ -1454,6 +2700,11 @@ static const struct vcap_typegroup is2_x1_actionfield_set_typegroups[] = { {} }; +static const struct vcap_typegroup *is1_actionfield_set_typegroups[] = { + [1] = is1_x1_actionfield_set_typegroups, + [5] = NULL, +}; + static const struct vcap_typegroup *is2_actionfield_set_typegroups[] = { [2] = is2_x2_actionfield_set_typegroups, [1] = 
is2_x1_actionfield_set_typegroups, @@ -1463,16 +2714,33 @@ static const struct vcap_typegroup *is2_actionfield_set_typegroups[] = { /* Keyfieldset names */ static const char * const vcap_keyfield_set_names[] = { [VCAP_KFS_NO_VALUE] = "(None)", + [VCAP_KFS_5TUPLE_IP4] = "VCAP_KFS_5TUPLE_IP4", + [VCAP_KFS_5TUPLE_IP6] = "VCAP_KFS_5TUPLE_IP6", + [VCAP_KFS_7TUPLE] = "VCAP_KFS_7TUPLE", [VCAP_KFS_ARP] = "VCAP_KFS_ARP", + [VCAP_KFS_DBL_VID] = "VCAP_KFS_DBL_VID", + [VCAP_KFS_DMAC_VID] = "VCAP_KFS_DMAC_VID", + [VCAP_KFS_ETAG] = "VCAP_KFS_ETAG", [VCAP_KFS_IP4_OTHER] = "VCAP_KFS_IP4_OTHER", [VCAP_KFS_IP4_TCP_UDP] = "VCAP_KFS_IP4_TCP_UDP", + [VCAP_KFS_IP4_VID] = "VCAP_KFS_IP4_VID", [VCAP_KFS_IP6_OTHER] = "VCAP_KFS_IP6_OTHER", [VCAP_KFS_IP6_STD] = "VCAP_KFS_IP6_STD", [VCAP_KFS_IP6_TCP_UDP] = "VCAP_KFS_IP6_TCP_UDP", + [VCAP_KFS_IP6_VID] = "VCAP_KFS_IP6_VID", + [VCAP_KFS_IP_7TUPLE] = "VCAP_KFS_IP_7TUPLE", + [VCAP_KFS_ISDX] = "VCAP_KFS_ISDX", + [VCAP_KFS_LL_FULL] = "VCAP_KFS_LL_FULL", [VCAP_KFS_MAC_ETYPE] = "VCAP_KFS_MAC_ETYPE", [VCAP_KFS_MAC_LLC] = "VCAP_KFS_MAC_LLC", [VCAP_KFS_MAC_SNAP] = "VCAP_KFS_MAC_SNAP", + [VCAP_KFS_NORMAL] = "VCAP_KFS_NORMAL", + [VCAP_KFS_NORMAL_5TUPLE_IP4] = "VCAP_KFS_NORMAL_5TUPLE_IP4", + [VCAP_KFS_NORMAL_7TUPLE] = "VCAP_KFS_NORMAL_7TUPLE", + [VCAP_KFS_NORMAL_IP6] = "VCAP_KFS_NORMAL_IP6", [VCAP_KFS_OAM] = "VCAP_KFS_OAM", + [VCAP_KFS_PURE_5TUPLE_IP4] = "VCAP_KFS_PURE_5TUPLE_IP4", + [VCAP_KFS_RT] = "VCAP_KFS_RT", [VCAP_KFS_SMAC_SIP4] = "VCAP_KFS_SMAC_SIP4", [VCAP_KFS_SMAC_SIP6] = "VCAP_KFS_SMAC_SIP6", }; @@ -1481,16 +2749,42 @@ static const char * const vcap_keyfield_set_names[] = { static const char * const vcap_actionfield_set_names[] = { [VCAP_AFS_NO_VALUE] = "(None)", [VCAP_AFS_BASE_TYPE] = "VCAP_AFS_BASE_TYPE", + [VCAP_AFS_CLASSIFICATION] = "VCAP_AFS_CLASSIFICATION", + [VCAP_AFS_CLASS_REDUCED] = "VCAP_AFS_CLASS_REDUCED", + [VCAP_AFS_FULL] = "VCAP_AFS_FULL", + [VCAP_AFS_S1] = "VCAP_AFS_S1", [VCAP_AFS_SMAC_SIP] = "VCAP_AFS_SMAC_SIP", }; /* Keyfield names 
*/ static const char * const vcap_keyfield_names[] = { [VCAP_KF_NO_VALUE] = "(None)", + [VCAP_KF_8021BR_ECID_BASE] = "8021BR_ECID_BASE", + [VCAP_KF_8021BR_ECID_EXT] = "8021BR_ECID_EXT", + [VCAP_KF_8021BR_E_TAGGED] = "8021BR_E_TAGGED", + [VCAP_KF_8021BR_GRP] = "8021BR_GRP", + [VCAP_KF_8021BR_IGR_ECID_BASE] = "8021BR_IGR_ECID_BASE", + [VCAP_KF_8021BR_IGR_ECID_EXT] = "8021BR_IGR_ECID_EXT", + [VCAP_KF_8021CB_R_TAGGED_IS] = "8021CB_R_TAGGED_IS", + [VCAP_KF_8021Q_DEI0] = "8021Q_DEI0", + [VCAP_KF_8021Q_DEI1] = "8021Q_DEI1", + [VCAP_KF_8021Q_DEI2] = "8021Q_DEI2", [VCAP_KF_8021Q_DEI_CLS] = "8021Q_DEI_CLS", + [VCAP_KF_8021Q_PCP0] = "8021Q_PCP0", + [VCAP_KF_8021Q_PCP1] = "8021Q_PCP1", + [VCAP_KF_8021Q_PCP2] = "8021Q_PCP2", [VCAP_KF_8021Q_PCP_CLS] = "8021Q_PCP_CLS", + [VCAP_KF_8021Q_TPID0] = "8021Q_TPID0", + [VCAP_KF_8021Q_TPID1] = "8021Q_TPID1", + [VCAP_KF_8021Q_TPID2] = "8021Q_TPID2", + [VCAP_KF_8021Q_VID0] = "8021Q_VID0", + [VCAP_KF_8021Q_VID1] = "8021Q_VID1", + [VCAP_KF_8021Q_VID2] = "8021Q_VID2", [VCAP_KF_8021Q_VID_CLS] = "8021Q_VID_CLS", + [VCAP_KF_8021Q_VLAN_DBL_TAGGED_IS] = "8021Q_VLAN_DBL_TAGGED_IS", [VCAP_KF_8021Q_VLAN_TAGGED_IS] = "8021Q_VLAN_TAGGED_IS", + [VCAP_KF_8021Q_VLAN_TAGS] = "8021Q_VLAN_TAGS", + [VCAP_KF_ACL_GRP_ID] = "ACL_GRP_ID", [VCAP_KF_ARP_ADDR_SPACE_OK_IS] = "ARP_ADDR_SPACE_OK_IS", [VCAP_KF_ARP_LEN_OK_IS] = "ARP_LEN_OK_IS", [VCAP_KF_ARP_OPCODE] = "ARP_OPCODE", @@ -1498,32 +2792,57 @@ static const char * const vcap_keyfield_names[] = { [VCAP_KF_ARP_PROTO_SPACE_OK_IS] = "ARP_PROTO_SPACE_OK_IS", [VCAP_KF_ARP_SENDER_MATCH_IS] = "ARP_SENDER_MATCH_IS", [VCAP_KF_ARP_TGT_MATCH_IS] = "ARP_TGT_MATCH_IS", + [VCAP_KF_COSID_CLS] = "COSID_CLS", + [VCAP_KF_ES0_ISDX_KEY_ENA] = "ES0_ISDX_KEY_ENA", [VCAP_KF_ETYPE] = "ETYPE", + [VCAP_KF_ETYPE_LEN_IS] = "ETYPE_LEN_IS", [VCAP_KF_HOST_MATCH] = "HOST_MATCH", + [VCAP_KF_IF_EGR_PORT_MASK] = "IF_EGR_PORT_MASK", + [VCAP_KF_IF_EGR_PORT_MASK_RNG] = "IF_EGR_PORT_MASK_RNG", [VCAP_KF_IF_IGR_PORT] = "IF_IGR_PORT", 
[VCAP_KF_IF_IGR_PORT_MASK] = "IF_IGR_PORT_MASK", + [VCAP_KF_IF_IGR_PORT_MASK_L3] = "IF_IGR_PORT_MASK_L3", + [VCAP_KF_IF_IGR_PORT_MASK_RNG] = "IF_IGR_PORT_MASK_RNG", + [VCAP_KF_IF_IGR_PORT_MASK_SEL] = "IF_IGR_PORT_MASK_SEL", + [VCAP_KF_IF_IGR_PORT_SEL] = "IF_IGR_PORT_SEL", [VCAP_KF_IP4_IS] = "IP4_IS", + [VCAP_KF_IP_MC_IS] = "IP_MC_IS", + [VCAP_KF_IP_PAYLOAD_5TUPLE] = "IP_PAYLOAD_5TUPLE", + [VCAP_KF_IP_PAYLOAD_S1_IP6] = "IP_PAYLOAD_S1_IP6", + [VCAP_KF_IP_SNAP_IS] = "IP_SNAP_IS", + [VCAP_KF_ISDX_CLS] = "ISDX_CLS", [VCAP_KF_ISDX_GT0_IS] = "ISDX_GT0_IS", [VCAP_KF_L2_BC_IS] = "L2_BC_IS", [VCAP_KF_L2_DMAC] = "L2_DMAC", [VCAP_KF_L2_FRM_TYPE] = "L2_FRM_TYPE", + [VCAP_KF_L2_FWD_IS] = "L2_FWD_IS", [VCAP_KF_L2_LLC] = "L2_LLC", + [VCAP_KF_L2_MAC] = "L2_MAC", [VCAP_KF_L2_MC_IS] = "L2_MC_IS", [VCAP_KF_L2_PAYLOAD0] = "L2_PAYLOAD0", [VCAP_KF_L2_PAYLOAD1] = "L2_PAYLOAD1", [VCAP_KF_L2_PAYLOAD2] = "L2_PAYLOAD2", + [VCAP_KF_L2_PAYLOAD_ETYPE] = "L2_PAYLOAD_ETYPE", [VCAP_KF_L2_SMAC] = "L2_SMAC", [VCAP_KF_L2_SNAP] = "L2_SNAP", [VCAP_KF_L3_DIP_EQ_SIP_IS] = "L3_DIP_EQ_SIP_IS", + [VCAP_KF_L3_DPL_CLS] = "L3_DPL_CLS", + [VCAP_KF_L3_DSCP] = "L3_DSCP", + [VCAP_KF_L3_DST_IS] = "L3_DST_IS", [VCAP_KF_L3_FRAGMENT] = "L3_FRAGMENT", + [VCAP_KF_L3_FRAGMENT_TYPE] = "L3_FRAGMENT_TYPE", + [VCAP_KF_L3_FRAG_INVLD_L4_LEN] = "L3_FRAG_INVLD_L4_LEN", [VCAP_KF_L3_FRAG_OFS_GT0] = "L3_FRAG_OFS_GT0", [VCAP_KF_L3_IP4_DIP] = "L3_IP4_DIP", [VCAP_KF_L3_IP4_SIP] = "L3_IP4_SIP", [VCAP_KF_L3_IP6_DIP] = "L3_IP6_DIP", + [VCAP_KF_L3_IP6_DIP_MSB] = "L3_IP6_DIP_MSB", [VCAP_KF_L3_IP6_SIP] = "L3_IP6_SIP", + [VCAP_KF_L3_IP6_SIP_MSB] = "L3_IP6_SIP_MSB", [VCAP_KF_L3_IP_PROTO] = "L3_IP_PROTO", [VCAP_KF_L3_OPTIONS_IS] = "L3_OPTIONS_IS", [VCAP_KF_L3_PAYLOAD] = "L3_PAYLOAD", + [VCAP_KF_L3_RT_IS] = "L3_RT_IS", [VCAP_KF_L3_TOS] = "L3_TOS", [VCAP_KF_L3_TTL_GT0] = "L3_TTL_GT0", [VCAP_KF_L4_1588_DOM] = "L4_1588_DOM", @@ -1531,6 +2850,7 @@ static const char * const vcap_keyfield_names[] = { [VCAP_KF_L4_ACK] = "L4_ACK", [VCAP_KF_L4_DPORT] = 
"L4_DPORT", [VCAP_KF_L4_FIN] = "L4_FIN", + [VCAP_KF_L4_PAYLOAD] = "L4_PAYLOAD", [VCAP_KF_L4_PSH] = "L4_PSH", [VCAP_KF_L4_RNG] = "L4_RNG", [VCAP_KF_L4_RST] = "L4_RST", @@ -1540,7 +2860,11 @@ static const char * const vcap_keyfield_names[] = { [VCAP_KF_L4_SYN] = "L4_SYN", [VCAP_KF_L4_URG] = "L4_URG", [VCAP_KF_LOOKUP_FIRST_IS] = "LOOKUP_FIRST_IS", + [VCAP_KF_LOOKUP_GEN_IDX] = "LOOKUP_GEN_IDX", + [VCAP_KF_LOOKUP_GEN_IDX_SEL] = "LOOKUP_GEN_IDX_SEL", + [VCAP_KF_LOOKUP_INDEX] = "LOOKUP_INDEX", [VCAP_KF_LOOKUP_PAG] = "LOOKUP_PAG", + [VCAP_KF_MIRROR_PROBE] = "MIRROR_PROBE", [VCAP_KF_OAM_CCM_CNTS_EQ0] = "OAM_CCM_CNTS_EQ0", [VCAP_KF_OAM_DETECTED] = "OAM_DETECTED", [VCAP_KF_OAM_FLAGS] = "OAM_FLAGS", @@ -1549,7 +2873,12 @@ static const char * const vcap_keyfield_names[] = { [VCAP_KF_OAM_OPCODE] = "OAM_OPCODE", [VCAP_KF_OAM_VER] = "OAM_VER", [VCAP_KF_OAM_Y1731_IS] = "OAM_Y1731_IS", + [VCAP_KF_PROT_ACTIVE] = "PROT_ACTIVE", + [VCAP_KF_RT_FRMID] = "RT_FRMID", + [VCAP_KF_RT_TYPE] = "RT_TYPE", + [VCAP_KF_RT_VLAN_IDX] = "RT_VLAN_IDX", [VCAP_KF_TCP_IS] = "TCP_IS", + [VCAP_KF_TCP_UDP_IS] = "TCP_UDP_IS", [VCAP_KF_TYPE] = "TYPE", }; @@ -1557,24 +2886,95 @@ static const char * const vcap_keyfield_names[] = { static const char * const vcap_actionfield_names[] = { [VCAP_AF_NO_VALUE] = "(None)", [VCAP_AF_ACL_ID] = "ACL_ID", + [VCAP_AF_CLS_VID_SEL] = "CLS_VID_SEL", + [VCAP_AF_CNT_ID] = "CNT_ID", + [VCAP_AF_COPY_PORT_NUM] = "COPY_PORT_NUM", + [VCAP_AF_COPY_QUEUE_NUM] = "COPY_QUEUE_NUM", [VCAP_AF_CPU_COPY_ENA] = "CPU_COPY_ENA", [VCAP_AF_CPU_QUEUE_NUM] = "CPU_QUEUE_NUM", + [VCAP_AF_CUSTOM_ACE_TYPE_ENA] = "CUSTOM_ACE_TYPE_ENA", + [VCAP_AF_DEI_ENA] = "DEI_ENA", + [VCAP_AF_DEI_VAL] = "DEI_VAL", + [VCAP_AF_DLR_SEL] = "DLR_SEL", + [VCAP_AF_DP_ENA] = "DP_ENA", + [VCAP_AF_DP_VAL] = "DP_VAL", + [VCAP_AF_DSCP_ENA] = "DSCP_ENA", + [VCAP_AF_DSCP_VAL] = "DSCP_VAL", + [VCAP_AF_ES2_REW_CMD] = "ES2_REW_CMD", [VCAP_AF_FWD_KILL_ENA] = "FWD_KILL_ENA", + [VCAP_AF_FWD_MODE] = "FWD_MODE", [VCAP_AF_HIT_ME_ONCE] = 
"HIT_ME_ONCE", [VCAP_AF_HOST_MATCH] = "HOST_MATCH", + [VCAP_AF_IGNORE_PIPELINE_CTRL] = "IGNORE_PIPELINE_CTRL", + [VCAP_AF_INTR_ENA] = "INTR_ENA", + [VCAP_AF_ISDX_ADD_REPLACE_SEL] = "ISDX_ADD_REPLACE_SEL", + [VCAP_AF_ISDX_ADD_VAL] = "ISDX_ADD_VAL", [VCAP_AF_ISDX_ENA] = "ISDX_ENA", + [VCAP_AF_ISDX_REPLACE_ENA] = "ISDX_REPLACE_ENA", + [VCAP_AF_ISDX_VAL] = "ISDX_VAL", [VCAP_AF_LRN_DIS] = "LRN_DIS", + [VCAP_AF_MAP_IDX] = "MAP_IDX", + [VCAP_AF_MAP_KEY] = "MAP_KEY", + [VCAP_AF_MAP_LOOKUP_SEL] = "MAP_LOOKUP_SEL", [VCAP_AF_MASK_MODE] = "MASK_MODE", + [VCAP_AF_MATCH_ID] = "MATCH_ID", + [VCAP_AF_MATCH_ID_MASK] = "MATCH_ID_MASK", [VCAP_AF_MIRROR_ENA] = "MIRROR_ENA", + [VCAP_AF_MIRROR_PROBE] = "MIRROR_PROBE", + [VCAP_AF_MIRROR_PROBE_ID] = "MIRROR_PROBE_ID", + [VCAP_AF_MRP_SEL] = "MRP_SEL", + [VCAP_AF_NXT_IDX] = "NXT_IDX", + [VCAP_AF_NXT_IDX_CTRL] = "NXT_IDX_CTRL", + [VCAP_AF_OAM_SEL] = "OAM_SEL", + [VCAP_AF_PAG_OVERRIDE_MASK] = "PAG_OVERRIDE_MASK", + [VCAP_AF_PAG_VAL] = "PAG_VAL", + [VCAP_AF_PCP_ENA] = "PCP_ENA", + [VCAP_AF_PCP_VAL] = "PCP_VAL", + [VCAP_AF_PIPELINE_FORCE_ENA] = "PIPELINE_FORCE_ENA", + [VCAP_AF_PIPELINE_PT] = "PIPELINE_PT", [VCAP_AF_POLICE_ENA] = "POLICE_ENA", [VCAP_AF_POLICE_IDX] = "POLICE_IDX", + [VCAP_AF_POLICE_REMARK] = "POLICE_REMARK", [VCAP_AF_POLICE_VCAP_ONLY] = "POLICE_VCAP_ONLY", [VCAP_AF_PORT_MASK] = "PORT_MASK", + [VCAP_AF_QOS_ENA] = "QOS_ENA", + [VCAP_AF_QOS_VAL] = "QOS_VAL", [VCAP_AF_REW_OP] = "REW_OP", + [VCAP_AF_RT_DIS] = "RT_DIS", + [VCAP_AF_SFID_ENA] = "SFID_ENA", + [VCAP_AF_SFID_VAL] = "SFID_VAL", + [VCAP_AF_SGID_ENA] = "SGID_ENA", + [VCAP_AF_SGID_VAL] = "SGID_VAL", + [VCAP_AF_TYPE] = "TYPE", + [VCAP_AF_VID_REPLACE_ENA] = "VID_REPLACE_ENA", + [VCAP_AF_VID_VAL] = "VID_VAL", + [VCAP_AF_VLAN_POP_CNT] = "VLAN_POP_CNT", + [VCAP_AF_VLAN_POP_CNT_ENA] = "VLAN_POP_CNT_ENA", }; /* VCAPs */ const struct vcap_info lan966x_vcaps[] = { + [VCAP_TYPE_IS1] = { + .name = "is1", + .rows = 192, + .sw_count = 4, + .sw_width = 96, + .sticky_width = 32, + .act_width 
= 123, + .default_cnt = 0, + .require_cnt_dis = 1, + .version = 1, + .keyfield_set = is1_keyfield_set, + .keyfield_set_size = ARRAY_SIZE(is1_keyfield_set), + .actionfield_set = is1_actionfield_set, + .actionfield_set_size = ARRAY_SIZE(is1_actionfield_set), + .keyfield_set_map = is1_keyfield_set_map, + .keyfield_set_map_size = is1_keyfield_set_map_size, + .actionfield_set_map = is1_actionfield_set_map, + .actionfield_set_map_size = is1_actionfield_set_map_size, + .keyfield_set_typegroups = is1_keyfield_set_typegroups, + .actionfield_set_typegroups = is1_actionfield_set_typegroups, + }, [VCAP_TYPE_IS2] = { .name = "is2", .rows = 64, @@ -1600,7 +3000,7 @@ const struct vcap_info lan966x_vcaps[] = { const struct vcap_statistics lan966x_vcap_stats = { .name = "lan966x", - .count = 1, + .count = 2, .keyfield_set_names = vcap_keyfield_set_names, .actionfield_set_names = vcap_actionfield_set_names, .keyfield_names = vcap_keyfield_names, diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_vcap_debugfs.c b/drivers/net/ethernet/microchip/lan966x/lan966x_vcap_debugfs.c index 7a0db58f5513..d90c08cfcf14 100644 --- a/drivers/net/ethernet/microchip/lan966x/lan966x_vcap_debugfs.c +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_vcap_debugfs.c @@ -5,9 +5,124 @@ #include "vcap_api.h" #include "vcap_api_client.h" -static void lan966x_vcap_port_keys(struct lan966x_port *port, - struct vcap_admin *admin, - struct vcap_output_print *out) +static void lan966x_vcap_is1_port_keys(struct lan966x_port *port, + struct vcap_admin *admin, + struct vcap_output_print *out) +{ + struct lan966x *lan966x = port->lan966x; + u32 val; + + out->prf(out->dst, " port[%d] (%s): ", port->chip_port, + netdev_name(port->dev)); + + val = lan_rd(lan966x, ANA_VCAP_CFG(port->chip_port)); + out->prf(out->dst, "\n state: "); + if (ANA_VCAP_CFG_S1_ENA_GET(val)) + out->prf(out->dst, "on"); + else + out->prf(out->dst, "off"); + + for (int l = 0; l < admin->lookups; ++l) { + out->prf(out->dst, "\n Lookup %d: 
", l); + + out->prf(out->dst, "\n other: "); + switch (ANA_VCAP_S1_CFG_KEY_OTHER_CFG_GET(val)) { + case VCAP_IS1_PS_OTHER_NORMAL: + out->prf(out->dst, "normal"); + break; + case VCAP_IS1_PS_OTHER_7TUPLE: + out->prf(out->dst, "7tuple"); + break; + case VCAP_IS1_PS_OTHER_DBL_VID: + out->prf(out->dst, "dbl_vid"); + break; + case VCAP_IS1_PS_OTHER_DMAC_VID: + out->prf(out->dst, "dmac_vid"); + break; + default: + out->prf(out->dst, "-"); + break; + } + + out->prf(out->dst, "\n ipv4: "); + switch (ANA_VCAP_S1_CFG_KEY_IP4_CFG_GET(val)) { + case VCAP_IS1_PS_IPV4_NORMAL: + out->prf(out->dst, "normal"); + break; + case VCAP_IS1_PS_IPV4_7TUPLE: + out->prf(out->dst, "7tuple"); + break; + case VCAP_IS1_PS_IPV4_5TUPLE_IP4: + out->prf(out->dst, "5tuple_ipv4"); + break; + case VCAP_IS1_PS_IPV4_DBL_VID: + out->prf(out->dst, "dbl_vid"); + break; + case VCAP_IS1_PS_IPV4_DMAC_VID: + out->prf(out->dst, "dmac_vid"); + break; + default: + out->prf(out->dst, "-"); + break; + } + + out->prf(out->dst, "\n ipv6: "); + switch (ANA_VCAP_S1_CFG_KEY_IP6_CFG_GET(val)) { + case VCAP_IS1_PS_IPV6_NORMAL: + out->prf(out->dst, "normal"); + break; + case VCAP_IS1_PS_IPV6_7TUPLE: + out->prf(out->dst, "7tuple"); + break; + case VCAP_IS1_PS_IPV6_5TUPLE_IP4: + out->prf(out->dst, "5tuple_ip4"); + break; + case VCAP_IS1_PS_IPV6_NORMAL_IP6: + out->prf(out->dst, "normal_ip6"); + break; + case VCAP_IS1_PS_IPV6_5TUPLE_IP6: + out->prf(out->dst, "5tuple_ip6"); + break; + case VCAP_IS1_PS_IPV6_DBL_VID: + out->prf(out->dst, "dbl_vid"); + break; + case VCAP_IS1_PS_IPV6_DMAC_VID: + out->prf(out->dst, "dmac_vid"); + break; + default: + out->prf(out->dst, "-"); + break; + } + + out->prf(out->dst, "\n rt: "); + switch (ANA_VCAP_S1_CFG_KEY_RT_CFG_GET(val)) { + case VCAP_IS1_PS_RT_NORMAL: + out->prf(out->dst, "normal"); + break; + case VCAP_IS1_PS_RT_7TUPLE: + out->prf(out->dst, "7tuple"); + break; + case VCAP_IS1_PS_RT_DBL_VID: + out->prf(out->dst, "dbl_vid"); + break; + case VCAP_IS1_PS_RT_DMAC_VID: + out->prf(out->dst, 
"dmac_vid"); + break; + case VCAP_IS1_PS_RT_FOLLOW_OTHER: + out->prf(out->dst, "follow_other"); + break; + default: + out->prf(out->dst, "-"); + break; + } + } + + out->prf(out->dst, "\n"); +} + +static void lan966x_vcap_is2_port_keys(struct lan966x_port *port, + struct vcap_admin *admin, + struct vcap_output_print *out) { struct lan966x *lan966x = port->lan966x; u32 val; @@ -88,7 +203,17 @@ int lan966x_vcap_port_info(struct net_device *dev, vcap = &vctrl->vcaps[admin->vtype]; out->prf(out->dst, "%s:\n", vcap->name); - lan966x_vcap_port_keys(port, admin, out); + switch (admin->vtype) { + case VCAP_TYPE_IS2: + lan966x_vcap_is2_port_keys(port, admin, out); + break; + case VCAP_TYPE_IS1: + lan966x_vcap_is1_port_keys(port, admin, out); + break; + default: + out->prf(out->dst, " no info\n"); + break; + } return 0; } diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_vcap_impl.c b/drivers/net/ethernet/microchip/lan966x/lan966x_vcap_impl.c index 68f9d69fd37b..7ea8e8633609 100644 --- a/drivers/net/ethernet/microchip/lan966x/lan966x_vcap_impl.c +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_vcap_impl.c @@ -8,6 +8,7 @@ #define STREAMSIZE (64 * 4) +#define LAN966X_IS1_LOOKUPS 3 #define LAN966X_IS2_LOOKUPS 2 static struct lan966x_vcap_inst { @@ -20,6 +21,15 @@ static struct lan966x_vcap_inst { bool ingress; /* is vcap in the ingress path */ } lan966x_vcap_inst_cfg[] = { { + .vtype = VCAP_TYPE_IS1, /* IS1-0 */ + .tgt_inst = 1, + .lookups = LAN966X_IS1_LOOKUPS, + .first_cid = LAN966X_VCAP_CID_IS1_L0, + .last_cid = LAN966X_VCAP_CID_IS1_MAX, + .count = 768, + .ingress = true, + }, + { .vtype = VCAP_TYPE_IS2, /* IS2-0 */ .tgt_inst = 2, .lookups = LAN966X_IS2_LOOKUPS, @@ -72,7 +82,21 @@ static void __lan966x_vcap_range_init(struct lan966x *lan966x, lan966x_vcap_wait_update(lan966x, admin->tgt_inst); } -static int lan966x_vcap_cid_to_lookup(int cid) +static int lan966x_vcap_is1_cid_to_lookup(int cid) +{ + int lookup = 0; + + if (cid >= LAN966X_VCAP_CID_IS1_L1 && + cid 
< LAN966X_VCAP_CID_IS1_L2) + lookup = 1; + else if (cid >= LAN966X_VCAP_CID_IS1_L2 && + cid < LAN966X_VCAP_CID_IS1_MAX) + lookup = 2; + + return lookup; +} + +static int lan966x_vcap_is2_cid_to_lookup(int cid) { if (cid >= LAN966X_VCAP_CID_IS2_L1 && cid < LAN966X_VCAP_CID_IS2_MAX) @@ -81,6 +105,67 @@ static int lan966x_vcap_cid_to_lookup(int cid) return 0; } +/* Return the list of keysets for the vcap port configuration */ +static int +lan966x_vcap_is1_get_port_keysets(struct net_device *ndev, int lookup, + struct vcap_keyset_list *keysetlist, + u16 l3_proto) +{ + struct lan966x_port *port = netdev_priv(ndev); + struct lan966x *lan966x = port->lan966x; + u32 val; + + val = lan_rd(lan966x, ANA_VCAP_S1_CFG(port->chip_port, lookup)); + + /* Collect all keysets for the port in a list */ + if (l3_proto == ETH_P_ALL || l3_proto == ETH_P_IP) { + switch (ANA_VCAP_S1_CFG_KEY_IP4_CFG_GET(val)) { + case VCAP_IS1_PS_IPV4_7TUPLE: + vcap_keyset_list_add(keysetlist, VCAP_KFS_7TUPLE); + break; + case VCAP_IS1_PS_IPV4_5TUPLE_IP4: + vcap_keyset_list_add(keysetlist, VCAP_KFS_5TUPLE_IP4); + break; + case VCAP_IS1_PS_IPV4_NORMAL: + vcap_keyset_list_add(keysetlist, VCAP_KFS_NORMAL); + break; + } + } + + if (l3_proto == ETH_P_ALL || l3_proto == ETH_P_IPV6) { + switch (ANA_VCAP_S1_CFG_KEY_IP6_CFG_GET(val)) { + case VCAP_IS1_PS_IPV6_NORMAL: + case VCAP_IS1_PS_IPV6_NORMAL_IP6: + vcap_keyset_list_add(keysetlist, VCAP_KFS_NORMAL); + vcap_keyset_list_add(keysetlist, VCAP_KFS_NORMAL_IP6); + break; + case VCAP_IS1_PS_IPV6_5TUPLE_IP6: + vcap_keyset_list_add(keysetlist, VCAP_KFS_5TUPLE_IP6); + break; + case VCAP_IS1_PS_IPV6_7TUPLE: + vcap_keyset_list_add(keysetlist, VCAP_KFS_7TUPLE); + break; + case VCAP_IS1_PS_IPV6_5TUPLE_IP4: + vcap_keyset_list_add(keysetlist, VCAP_KFS_5TUPLE_IP4); + break; + case VCAP_IS1_PS_IPV6_DMAC_VID: + vcap_keyset_list_add(keysetlist, VCAP_KFS_DMAC_VID); + break; + } + } + + switch (ANA_VCAP_S1_CFG_KEY_OTHER_CFG_GET(val)) { + case VCAP_IS1_PS_OTHER_7TUPLE: + 
vcap_keyset_list_add(keysetlist, VCAP_KFS_7TUPLE); + break; + case VCAP_IS1_PS_OTHER_NORMAL: + vcap_keyset_list_add(keysetlist, VCAP_KFS_NORMAL); + break; + } + + return 0; +} + static int lan966x_vcap_is2_get_port_keysets(struct net_device *dev, int lookup, struct vcap_keyset_list *keysetlist, @@ -180,11 +265,26 @@ lan966x_vcap_validate_keyset(struct net_device *dev, if (!kslist || kslist->cnt == 0) return VCAP_KFS_NO_VALUE; - lookup = lan966x_vcap_cid_to_lookup(rule->vcap_chain_id); keysetlist.max = ARRAY_SIZE(keysets); keysetlist.keysets = keysets; - err = lan966x_vcap_is2_get_port_keysets(dev, lookup, &keysetlist, - l3_proto); + + switch (admin->vtype) { + case VCAP_TYPE_IS1: + lookup = lan966x_vcap_is1_cid_to_lookup(rule->vcap_chain_id); + err = lan966x_vcap_is1_get_port_keysets(dev, lookup, &keysetlist, + l3_proto); + break; + case VCAP_TYPE_IS2: + lookup = lan966x_vcap_is2_cid_to_lookup(rule->vcap_chain_id); + err = lan966x_vcap_is2_get_port_keysets(dev, lookup, &keysetlist, + l3_proto); + break; + default: + pr_err("vcap type: %s not supported\n", + lan966x_vcaps[admin->vtype].name); + return VCAP_KFS_NO_VALUE; + } + if (err) return VCAP_KFS_NO_VALUE; @@ -197,17 +297,32 @@ lan966x_vcap_validate_keyset(struct net_device *dev, return VCAP_KFS_NO_VALUE; } -static bool lan966x_vcap_is_first_chain(struct vcap_rule *rule) +static bool lan966x_vcap_is2_is_first_chain(struct vcap_rule *rule) { return (rule->vcap_chain_id >= LAN966X_VCAP_CID_IS2_L0 && rule->vcap_chain_id < LAN966X_VCAP_CID_IS2_L1); } -static void lan966x_vcap_add_default_fields(struct net_device *dev, - struct vcap_admin *admin, - struct vcap_rule *rule) +static void lan966x_vcap_is1_add_default_fields(struct lan966x_port *port, + struct vcap_admin *admin, + struct vcap_rule *rule) +{ + u32 value, mask; + u32 lookup; + + if (vcap_rule_get_key_u32(rule, VCAP_KF_IF_IGR_PORT_MASK, + &value, &mask)) + vcap_rule_add_key_u32(rule, VCAP_KF_IF_IGR_PORT_MASK, 0, + ~BIT(port->chip_port)); + + lookup = 
lan966x_vcap_is1_cid_to_lookup(rule->vcap_chain_id); + vcap_rule_add_key_u32(rule, VCAP_KF_LOOKUP_INDEX, lookup, 0x3); +} + +static void lan966x_vcap_is2_add_default_fields(struct lan966x_port *port, + struct vcap_admin *admin, + struct vcap_rule *rule) { - struct lan966x_port *port = netdev_priv(dev); u32 value, mask; if (vcap_rule_get_key_u32(rule, VCAP_KF_IF_IGR_PORT_MASK, @@ -215,7 +330,7 @@ static void lan966x_vcap_add_default_fields(struct net_device *dev, vcap_rule_add_key_u32(rule, VCAP_KF_IF_IGR_PORT_MASK, 0, ~BIT(port->chip_port)); - if (lan966x_vcap_is_first_chain(rule)) + if (lan966x_vcap_is2_is_first_chain(rule)) vcap_rule_add_key_bit(rule, VCAP_KF_LOOKUP_FIRST_IS, VCAP_BIT_1); else @@ -223,6 +338,26 @@ static void lan966x_vcap_add_default_fields(struct net_device *dev, VCAP_BIT_0); } +static void lan966x_vcap_add_default_fields(struct net_device *dev, + struct vcap_admin *admin, + struct vcap_rule *rule) +{ + struct lan966x_port *port = netdev_priv(dev); + + switch (admin->vtype) { + case VCAP_TYPE_IS1: + lan966x_vcap_is1_add_default_fields(port, admin, rule); + break; + case VCAP_TYPE_IS2: + lan966x_vcap_is2_add_default_fields(port, admin, rule); + break; + default: + pr_err("vcap type: %s not supported\n", + lan966x_vcaps[admin->vtype].name); + break; + } +} + static void lan966x_vcap_cache_erase(struct vcap_admin *admin) { memset(admin->cache.keystream, 0, STREAMSIZE); @@ -464,8 +599,37 @@ static void lan966x_vcap_block_init(struct lan966x *lan966x, static void lan966x_vcap_port_key_deselection(struct lan966x *lan966x, struct vcap_admin *admin) { - for (int p = 0; p < lan966x->num_phys_ports; ++p) - lan_wr(0, lan966x, ANA_VCAP_S2_CFG(p)); + u32 val; + + switch (admin->vtype) { + case VCAP_TYPE_IS1: + val = ANA_VCAP_S1_CFG_KEY_IP6_CFG_SET(VCAP_IS1_PS_IPV6_5TUPLE_IP6) | + ANA_VCAP_S1_CFG_KEY_IP4_CFG_SET(VCAP_IS1_PS_IPV4_5TUPLE_IP4) | + ANA_VCAP_S1_CFG_KEY_OTHER_CFG_SET(VCAP_IS1_PS_OTHER_NORMAL); + + for (int p = 0; p < lan966x->num_phys_ports; ++p) { 
+ if (!lan966x->ports[p]) + continue; + + for (int l = 0; l < LAN966X_IS1_LOOKUPS; ++l) + lan_wr(val, lan966x, ANA_VCAP_S1_CFG(p, l)); + + lan_rmw(ANA_VCAP_CFG_S1_ENA_SET(true), + ANA_VCAP_CFG_S1_ENA, lan966x, + ANA_VCAP_CFG(p)); + } + + break; + case VCAP_TYPE_IS2: + for (int p = 0; p < lan966x->num_phys_ports; ++p) + lan_wr(0, lan966x, ANA_VCAP_S2_CFG(p)); + + break; + default: + pr_err("vcap type: %s not supported\n", + lan966x_vcaps[admin->vtype].name); + break; + } } int lan966x_vcap_init(struct lan966x *lan966x) @@ -506,6 +670,10 @@ int lan966x_vcap_init(struct lan966x *lan966x) lan_rmw(ANA_VCAP_S2_CFG_ENA_SET(true), ANA_VCAP_S2_CFG_ENA, lan966x, ANA_VCAP_S2_CFG(lan966x->ports[p]->chip_port)); + + lan_rmw(ANA_VCAP_CFG_S1_ENA_SET(true), + ANA_VCAP_CFG_S1_ENA, lan966x, + ANA_VCAP_CFG(lan966x->ports[p]->chip_port)); } } diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_xdp.c b/drivers/net/ethernet/microchip/lan966x/lan966x_xdp.c index 2e6f486ec67d..9ee61db8690b 100644 --- a/drivers/net/ethernet/microchip/lan966x/lan966x_xdp.c +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_xdp.c @@ -62,7 +62,7 @@ int lan966x_xdp_xmit(struct net_device *dev, struct xdp_frame *xdpf = frames[i]; int err; - err = lan966x_fdma_xmit_xdpf(port, xdpf, NULL, true); + err = lan966x_fdma_xmit_xdpf(port, xdpf, 0); if (err) break; @@ -76,7 +76,6 @@ int lan966x_xdp_run(struct lan966x_port *port, struct page *page, u32 data_len) { struct bpf_prog *xdp_prog = port->xdp_prog; struct lan966x *lan966x = port->lan966x; - struct xdp_frame *xdpf; struct xdp_buff xdp; u32 act; @@ -90,11 +89,8 @@ int lan966x_xdp_run(struct lan966x_port *port, struct page *page, u32 data_len) case XDP_PASS: return FDMA_PASS; case XDP_TX: - xdpf = xdp_convert_buff_to_frame(&xdp); - if (!xdpf) - return FDMA_DROP; - - return lan966x_fdma_xmit_xdpf(port, xdpf, page, false) ? + return lan966x_fdma_xmit_xdpf(port, page, + data_len - IFH_LEN_BYTES) ? 
FDMA_DROP : FDMA_TX; case XDP_REDIRECT: if (xdp_do_redirect(port->dev, &xdp, xdp_prog)) diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_main.c b/drivers/net/ethernet/microchip/sparx5/sparx5_main.c index 42b77ba9b572..a7edf524eedb 100644 --- a/drivers/net/ethernet/microchip/sparx5/sparx5_main.c +++ b/drivers/net/ethernet/microchip/sparx5/sparx5_main.c @@ -282,6 +282,7 @@ static int sparx5_create_port(struct sparx5 *sparx5, spx5_port->phylink_pcs.poll = true; spx5_port->phylink_pcs.ops = &sparx5_phylink_pcs_ops; spx5_port->is_mrouter = false; + INIT_LIST_HEAD(&spx5_port->tc_templates); sparx5->ports[config->portno] = spx5_port; err = sparx5_port_init(sparx5, spx5_port, &config->conf); diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_main.h b/drivers/net/ethernet/microchip/sparx5/sparx5_main.h index 72e7928912eb..62c85463b634 100644 --- a/drivers/net/ethernet/microchip/sparx5/sparx5_main.h +++ b/drivers/net/ethernet/microchip/sparx5/sparx5_main.h @@ -192,6 +192,7 @@ struct sparx5_port { u16 ts_id; struct sk_buff_head tx_skbs; bool is_mrouter; + struct list_head tc_templates; /* list of TC templates on this port */ }; enum sparx5_core_clockfreq { diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_tc_flower.c b/drivers/net/ethernet/microchip/sparx5/sparx5_tc_flower.c index b36819aafaca..3f87a5285a6d 100644 --- a/drivers/net/ethernet/microchip/sparx5/sparx5_tc_flower.c +++ b/drivers/net/ethernet/microchip/sparx5/sparx5_tc_flower.c @@ -28,6 +28,14 @@ struct sparx5_multiple_rules { struct sparx5_wildcard_rule rule[SPX5_MAX_RULE_SIZE]; }; +struct sparx5_tc_flower_template { + struct list_head list; /* for insertion in the list of templates */ + int cid; /* chain id */ + enum vcap_keyfield_set orig; /* keyset used before the template */ + enum vcap_keyfield_set keyset; /* new keyset used by template */ + u16 l3_proto; /* protocol specified in the template */ +}; + static int sparx5_tc_flower_es0_tpid(struct vcap_tc_flower_parse_usage *st) { @@ 
-382,7 +390,7 @@ static int sparx5_tc_select_protocol_keyset(struct net_device *ndev, /* Find the keysets that the rule can use */ matches.keysets = keysets; matches.max = ARRAY_SIZE(keysets); - if (vcap_rule_find_keysets(vrule, &matches) == 0) + if (!vcap_rule_find_keysets(vrule, &matches)) return -EINVAL; /* Find the keysets that the port configuration supports */ @@ -996,6 +1004,73 @@ static int sparx5_tc_action_vlan_push(struct vcap_admin *admin, return err; } +/* Remove rule keys that may prevent templates from matching a keyset */ +static void sparx5_tc_flower_simplify_rule(struct vcap_admin *admin, + struct vcap_rule *vrule, + u16 l3_proto) +{ + switch (admin->vtype) { + case VCAP_TYPE_IS0: + vcap_rule_rem_key(vrule, VCAP_KF_ETYPE); + switch (l3_proto) { + case ETH_P_IP: + break; + case ETH_P_IPV6: + vcap_rule_rem_key(vrule, VCAP_KF_IP_SNAP_IS); + break; + default: + break; + } + break; + case VCAP_TYPE_ES2: + switch (l3_proto) { + case ETH_P_IP: + if (vrule->keyset == VCAP_KFS_IP4_OTHER) + vcap_rule_rem_key(vrule, VCAP_KF_TCP_IS); + break; + case ETH_P_IPV6: + if (vrule->keyset == VCAP_KFS_IP6_STD) + vcap_rule_rem_key(vrule, VCAP_KF_TCP_IS); + vcap_rule_rem_key(vrule, VCAP_KF_IP4_IS); + break; + default: + break; + } + break; + case VCAP_TYPE_IS2: + switch (l3_proto) { + case ETH_P_IP: + case ETH_P_IPV6: + vcap_rule_rem_key(vrule, VCAP_KF_IP4_IS); + break; + default: + break; + } + break; + default: + break; + } +} + +static bool sparx5_tc_flower_use_template(struct net_device *ndev, + struct flow_cls_offload *fco, + struct vcap_admin *admin, + struct vcap_rule *vrule) +{ + struct sparx5_port *port = netdev_priv(ndev); + struct sparx5_tc_flower_template *ftp; + + list_for_each_entry(ftp, &port->tc_templates, list) { + if (ftp->cid != fco->common.chain_index) + continue; + + vcap_set_rule_set_keyset(vrule, ftp->keyset); + sparx5_tc_flower_simplify_rule(admin, vrule, ftp->l3_proto); + return true; + } + return false; +} + static int 
sparx5_tc_flower_replace(struct net_device *ndev, struct flow_cls_offload *fco, struct vcap_admin *admin, @@ -1122,12 +1197,14 @@ static int sparx5_tc_flower_replace(struct net_device *ndev, goto out; } - err = sparx5_tc_select_protocol_keyset(ndev, vrule, admin, - state.l3_proto, &multi); - if (err) { - NL_SET_ERR_MSG_MOD(fco->common.extack, - "No matching port keyset for filter protocol and keys"); - goto out; + if (!sparx5_tc_flower_use_template(ndev, fco, admin, vrule)) { + err = sparx5_tc_select_protocol_keyset(ndev, vrule, admin, + state.l3_proto, &multi); + if (err) { + NL_SET_ERR_MSG_MOD(fco->common.extack, + "No matching port keyset for filter protocol and keys"); + goto out; + } } /* provide the l3 protocol to guide the keyset selection */ @@ -1259,6 +1336,120 @@ static int sparx5_tc_flower_stats(struct net_device *ndev, return err; } +static int sparx5_tc_flower_template_create(struct net_device *ndev, + struct flow_cls_offload *fco, + struct vcap_admin *admin) +{ + struct sparx5_port *port = netdev_priv(ndev); + struct vcap_tc_flower_parse_usage state = { + .fco = fco, + .l3_proto = ETH_P_ALL, + .admin = admin, + }; + struct sparx5_tc_flower_template *ftp; + struct vcap_keyset_list kslist = {}; + enum vcap_keyfield_set keysets[10]; + struct vcap_control *vctrl; + struct vcap_rule *vrule; + int count, err; + + if (admin->vtype == VCAP_TYPE_ES0) { + pr_err("%s:%d: %s\n", __func__, __LINE__, + "VCAP does not support templates"); + return -EINVAL; + } + + count = vcap_admin_rule_count(admin, fco->common.chain_index); + if (count > 0) { + pr_err("%s:%d: %s\n", __func__, __LINE__, + "Filters are already present"); + return -EBUSY; + } + + ftp = kzalloc(sizeof(*ftp), GFP_KERNEL); + if (!ftp) + return -ENOMEM; + + ftp->cid = fco->common.chain_index; + ftp->orig = VCAP_KFS_NO_VALUE; + ftp->keyset = VCAP_KFS_NO_VALUE; + + vctrl = port->sparx5->vcap_ctrl; + vrule = vcap_alloc_rule(vctrl, ndev, fco->common.chain_index, + VCAP_USER_TC, fco->common.prio, 0); + if 
(IS_ERR(vrule)) { + err = PTR_ERR(vrule); + goto err_rule; + } + + state.vrule = vrule; + state.frule = flow_cls_offload_flow_rule(fco); + err = sparx5_tc_use_dissectors(&state, admin, vrule); + if (err) { + pr_err("%s:%d: key error: %d\n", __func__, __LINE__, err); + goto out; + } + + ftp->l3_proto = state.l3_proto; + + sparx5_tc_flower_simplify_rule(admin, vrule, state.l3_proto); + + /* Find the keysets that the rule can use */ + kslist.keysets = keysets; + kslist.max = ARRAY_SIZE(keysets); + if (!vcap_rule_find_keysets(vrule, &kslist)) { + pr_err("%s:%d: %s\n", __func__, __LINE__, + "Could not find a suitable keyset"); + err = -ENOENT; + goto out; + } + + ftp->keyset = vcap_select_min_rule_keyset(vctrl, admin->vtype, &kslist); + kslist.cnt = 0; + sparx5_vcap_set_port_keyset(ndev, admin, fco->common.chain_index, + state.l3_proto, + ftp->keyset, + &kslist); + + if (kslist.cnt > 0) + ftp->orig = kslist.keysets[0]; + + /* Store new template */ + list_add_tail(&ftp->list, &port->tc_templates); + vcap_free_rule(vrule); + return 0; + +out: + vcap_free_rule(vrule); +err_rule: + kfree(ftp); + return err; +} + +static int sparx5_tc_flower_template_destroy(struct net_device *ndev, + struct flow_cls_offload *fco, + struct vcap_admin *admin) +{ + struct sparx5_port *port = netdev_priv(ndev); + struct sparx5_tc_flower_template *ftp, *tmp; + int err = -ENOENT; + + /* Rules using the template are removed by the tc framework */ + list_for_each_entry_safe(ftp, tmp, &port->tc_templates, list) { + if (ftp->cid != fco->common.chain_index) + continue; + + sparx5_vcap_set_port_keyset(ndev, admin, + fco->common.chain_index, + ftp->l3_proto, ftp->orig, + NULL); + list_del(&ftp->list); + kfree(ftp); + break; + } + return err; +} + int sparx5_tc_flower(struct net_device *ndev, struct flow_cls_offload *fco, bool ingress) { @@ -1282,6 +1473,10 @@ int sparx5_tc_flower(struct net_device *ndev, struct flow_cls_offload *fco, return sparx5_tc_flower_destroy(ndev, fco, admin); case 
FLOW_CLS_STATS: return sparx5_tc_flower_stats(ndev, fco, admin); + case FLOW_CLS_TMPLT_CREATE: + return sparx5_tc_flower_template_create(ndev, fco, admin); + case FLOW_CLS_TMPLT_DESTROY: + return sparx5_tc_flower_template_destroy(ndev, fco, admin); default: return -EOPNOTSUPP; } diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_vcap_debugfs.c b/drivers/net/ethernet/microchip/sparx5/sparx5_vcap_debugfs.c index 07b472c84a47..12722f728ef7 100644 --- a/drivers/net/ethernet/microchip/sparx5/sparx5_vcap_debugfs.c +++ b/drivers/net/ethernet/microchip/sparx5/sparx5_vcap_debugfs.c @@ -198,7 +198,7 @@ static void sparx5_vcap_is2_port_keys(struct sparx5 *sparx5, out->prf(out->dst, "ip6_std"); break; case VCAP_IS2_PS_IPV6_MC_IP4_TCP_UDP_OTHER: - out->prf(out->dst, "ip4_tcp_udp ipv4_other"); + out->prf(out->dst, "ip4_tcp_udp ip4_other"); break; } out->prf(out->dst, "\n ipv6_uc: "); diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_vcap_impl.c b/drivers/net/ethernet/microchip/sparx5/sparx5_vcap_impl.c index d0d4e0385ac7..187efa1fc904 100644 --- a/drivers/net/ethernet/microchip/sparx5/sparx5_vcap_impl.c +++ b/drivers/net/ethernet/microchip/sparx5/sparx5_vcap_impl.c @@ -1519,6 +1519,276 @@ static struct vcap_operations sparx5_vcap_ops = { .port_info = sparx5_port_info, }; +static u32 sparx5_vcap_is0_keyset_to_etype_ps(enum vcap_keyfield_set keyset) +{ + switch (keyset) { + case VCAP_KFS_NORMAL_7TUPLE: + return VCAP_IS0_PS_ETYPE_NORMAL_7TUPLE; + case VCAP_KFS_NORMAL_5TUPLE_IP4: + return VCAP_IS0_PS_ETYPE_NORMAL_5TUPLE_IP4; + default: + return VCAP_IS0_PS_ETYPE_NORMAL_7TUPLE; + } +} + +static void sparx5_vcap_is0_set_port_keyset(struct net_device *ndev, int lookup, + enum vcap_keyfield_set keyset, + int l3_proto) +{ + struct sparx5_port *port = netdev_priv(ndev); + struct sparx5 *sparx5 = port->sparx5; + int portno = port->portno; + u32 value; + + switch (l3_proto) { + case ETH_P_IP: + value = sparx5_vcap_is0_keyset_to_etype_ps(keyset); + 
spx5_rmw(ANA_CL_ADV_CL_CFG_IP4_CLM_KEY_SEL_SET(value), + ANA_CL_ADV_CL_CFG_IP4_CLM_KEY_SEL, + sparx5, + ANA_CL_ADV_CL_CFG(portno, lookup)); + break; + case ETH_P_IPV6: + value = sparx5_vcap_is0_keyset_to_etype_ps(keyset); + spx5_rmw(ANA_CL_ADV_CL_CFG_IP6_CLM_KEY_SEL_SET(value), + ANA_CL_ADV_CL_CFG_IP6_CLM_KEY_SEL, + sparx5, + ANA_CL_ADV_CL_CFG(portno, lookup)); + break; + default: + value = sparx5_vcap_is0_keyset_to_etype_ps(keyset); + spx5_rmw(ANA_CL_ADV_CL_CFG_ETYPE_CLM_KEY_SEL_SET(value), + ANA_CL_ADV_CL_CFG_ETYPE_CLM_KEY_SEL, + sparx5, + ANA_CL_ADV_CL_CFG(portno, lookup)); + break; + } +} + +static u32 sparx5_vcap_is2_keyset_to_arp_ps(enum vcap_keyfield_set keyset) +{ + switch (keyset) { + case VCAP_KFS_ARP: + return VCAP_IS2_PS_ARP_ARP; + default: + return VCAP_IS2_PS_ARP_MAC_ETYPE; + } +} + +static u32 sparx5_vcap_is2_keyset_to_ipv4_ps(enum vcap_keyfield_set keyset) +{ + switch (keyset) { + case VCAP_KFS_MAC_ETYPE: + return VCAP_IS2_PS_IPV4_UC_MAC_ETYPE; + case VCAP_KFS_IP4_OTHER: + case VCAP_KFS_IP4_TCP_UDP: + return VCAP_IS2_PS_IPV4_UC_IP4_TCP_UDP_OTHER; + case VCAP_KFS_IP_7TUPLE: + return VCAP_IS2_PS_IPV4_UC_IP_7TUPLE; + default: + return VCAP_KFS_NO_VALUE; + } +} + +static u32 sparx5_vcap_is2_keyset_to_ipv6_uc_ps(enum vcap_keyfield_set keyset) +{ + switch (keyset) { + case VCAP_KFS_MAC_ETYPE: + return VCAP_IS2_PS_IPV6_UC_MAC_ETYPE; + case VCAP_KFS_IP4_OTHER: + case VCAP_KFS_IP4_TCP_UDP: + return VCAP_IS2_PS_IPV6_UC_IP4_TCP_UDP_OTHER; + case VCAP_KFS_IP_7TUPLE: + return VCAP_IS2_PS_IPV6_UC_IP_7TUPLE; + default: + return VCAP_KFS_NO_VALUE; + } +} + +static u32 sparx5_vcap_is2_keyset_to_ipv6_mc_ps(enum vcap_keyfield_set keyset) +{ + switch (keyset) { + case VCAP_KFS_MAC_ETYPE: + return VCAP_IS2_PS_IPV6_MC_MAC_ETYPE; + case VCAP_KFS_IP4_OTHER: + case VCAP_KFS_IP4_TCP_UDP: + return VCAP_IS2_PS_IPV6_MC_IP4_TCP_UDP_OTHER; + case VCAP_KFS_IP_7TUPLE: + return VCAP_IS2_PS_IPV6_MC_IP_7TUPLE; + default: + return VCAP_KFS_NO_VALUE; + } +} + +static void 
sparx5_vcap_is2_set_port_keyset(struct net_device *ndev, int lookup, + enum vcap_keyfield_set keyset, + int l3_proto) +{ + struct sparx5_port *port = netdev_priv(ndev); + struct sparx5 *sparx5 = port->sparx5; + int portno = port->portno; + u32 value; + + switch (l3_proto) { + case ETH_P_ARP: + value = sparx5_vcap_is2_keyset_to_arp_ps(keyset); + spx5_rmw(ANA_ACL_VCAP_S2_KEY_SEL_ARP_KEY_SEL_SET(value), + ANA_ACL_VCAP_S2_KEY_SEL_ARP_KEY_SEL, + sparx5, + ANA_ACL_VCAP_S2_KEY_SEL(portno, lookup)); + break; + case ETH_P_IP: + value = sparx5_vcap_is2_keyset_to_ipv4_ps(keyset); + spx5_rmw(ANA_ACL_VCAP_S2_KEY_SEL_IP4_UC_KEY_SEL_SET(value), + ANA_ACL_VCAP_S2_KEY_SEL_IP4_UC_KEY_SEL, + sparx5, + ANA_ACL_VCAP_S2_KEY_SEL(portno, lookup)); + spx5_rmw(ANA_ACL_VCAP_S2_KEY_SEL_IP4_MC_KEY_SEL_SET(value), + ANA_ACL_VCAP_S2_KEY_SEL_IP4_MC_KEY_SEL, + sparx5, + ANA_ACL_VCAP_S2_KEY_SEL(portno, lookup)); + break; + case ETH_P_IPV6: + value = sparx5_vcap_is2_keyset_to_ipv6_uc_ps(keyset); + spx5_rmw(ANA_ACL_VCAP_S2_KEY_SEL_IP6_UC_KEY_SEL_SET(value), + ANA_ACL_VCAP_S2_KEY_SEL_IP6_UC_KEY_SEL, + sparx5, + ANA_ACL_VCAP_S2_KEY_SEL(portno, lookup)); + value = sparx5_vcap_is2_keyset_to_ipv6_mc_ps(keyset); + spx5_rmw(ANA_ACL_VCAP_S2_KEY_SEL_IP6_MC_KEY_SEL_SET(value), + ANA_ACL_VCAP_S2_KEY_SEL_IP6_MC_KEY_SEL, + sparx5, + ANA_ACL_VCAP_S2_KEY_SEL(portno, lookup)); + break; + default: + value = VCAP_IS2_PS_NONETH_MAC_ETYPE; + spx5_rmw(ANA_ACL_VCAP_S2_KEY_SEL_NON_ETH_KEY_SEL_SET(value), + ANA_ACL_VCAP_S2_KEY_SEL_NON_ETH_KEY_SEL, + sparx5, + ANA_ACL_VCAP_S2_KEY_SEL(portno, lookup)); + break; + } +} + +static u32 sparx5_vcap_es2_keyset_to_arp_ps(enum vcap_keyfield_set keyset) +{ + switch (keyset) { + case VCAP_KFS_ARP: + return VCAP_ES2_PS_ARP_ARP; + default: + return VCAP_ES2_PS_ARP_MAC_ETYPE; + } +} + +static u32 sparx5_vcap_es2_keyset_to_ipv4_ps(enum vcap_keyfield_set keyset) +{ + switch (keyset) { + case VCAP_KFS_MAC_ETYPE: + return VCAP_ES2_PS_IPV4_MAC_ETYPE; + case VCAP_KFS_IP_7TUPLE: + return 
VCAP_ES2_PS_IPV4_IP_7TUPLE; + case VCAP_KFS_IP4_TCP_UDP: + return VCAP_ES2_PS_IPV4_IP4_TCP_UDP_OTHER; + case VCAP_KFS_IP4_OTHER: + return VCAP_ES2_PS_IPV4_IP4_OTHER; + default: + return VCAP_ES2_PS_IPV4_MAC_ETYPE; + } +} + +static u32 sparx5_vcap_es2_keyset_to_ipv6_ps(enum vcap_keyfield_set keyset) +{ + switch (keyset) { + case VCAP_KFS_MAC_ETYPE: + return VCAP_ES2_PS_IPV6_MAC_ETYPE; + case VCAP_KFS_IP4_TCP_UDP: + case VCAP_KFS_IP4_OTHER: + return VCAP_ES2_PS_IPV6_IP4_DOWNGRADE; + case VCAP_KFS_IP_7TUPLE: + return VCAP_ES2_PS_IPV6_IP_7TUPLE; + case VCAP_KFS_IP6_STD: + return VCAP_ES2_PS_IPV6_IP6_STD; + default: + return VCAP_ES2_PS_IPV6_MAC_ETYPE; + } +} + +static void sparx5_vcap_es2_set_port_keyset(struct net_device *ndev, int lookup, + enum vcap_keyfield_set keyset, + int l3_proto) +{ + struct sparx5_port *port = netdev_priv(ndev); + struct sparx5 *sparx5 = port->sparx5; + int portno = port->portno; + u32 value; + + switch (l3_proto) { + case ETH_P_IP: + value = sparx5_vcap_es2_keyset_to_ipv4_ps(keyset); + spx5_rmw(EACL_VCAP_ES2_KEY_SEL_IP4_KEY_SEL_SET(value), + EACL_VCAP_ES2_KEY_SEL_IP4_KEY_SEL, + sparx5, + EACL_VCAP_ES2_KEY_SEL(portno, lookup)); + break; + case ETH_P_IPV6: + value = sparx5_vcap_es2_keyset_to_ipv6_ps(keyset); + spx5_rmw(EACL_VCAP_ES2_KEY_SEL_IP6_KEY_SEL_SET(value), + EACL_VCAP_ES2_KEY_SEL_IP6_KEY_SEL, + sparx5, + EACL_VCAP_ES2_KEY_SEL(portno, lookup)); + break; + case ETH_P_ARP: + value = sparx5_vcap_es2_keyset_to_arp_ps(keyset); + spx5_rmw(EACL_VCAP_ES2_KEY_SEL_ARP_KEY_SEL_SET(value), + EACL_VCAP_ES2_KEY_SEL_ARP_KEY_SEL, + sparx5, + EACL_VCAP_ES2_KEY_SEL(portno, lookup)); + break; + } +} + +/* Change the port keyset for the lookup and protocol */ +void sparx5_vcap_set_port_keyset(struct net_device *ndev, + struct vcap_admin *admin, + int cid, + u16 l3_proto, + enum vcap_keyfield_set keyset, + struct vcap_keyset_list *orig) +{ + struct sparx5_port *port; + int lookup; + + switch (admin->vtype) { + case VCAP_TYPE_IS0: + lookup = 
sparx5_vcap_is0_cid_to_lookup(cid); + if (orig) + sparx5_vcap_is0_get_port_keysets(ndev, lookup, orig, + l3_proto); + sparx5_vcap_is0_set_port_keyset(ndev, lookup, keyset, l3_proto); + break; + case VCAP_TYPE_IS2: + lookup = sparx5_vcap_is2_cid_to_lookup(cid); + if (orig) + sparx5_vcap_is2_get_port_keysets(ndev, lookup, orig, + l3_proto); + sparx5_vcap_is2_set_port_keyset(ndev, lookup, keyset, l3_proto); + break; + case VCAP_TYPE_ES0: + break; + case VCAP_TYPE_ES2: + lookup = sparx5_vcap_es2_cid_to_lookup(cid); + if (orig) + sparx5_vcap_es2_get_port_keysets(ndev, lookup, orig, + l3_proto); + sparx5_vcap_es2_set_port_keyset(ndev, lookup, keyset, l3_proto); + break; + default: + port = netdev_priv(ndev); + sparx5_vcap_type_err(port->sparx5, admin, __func__); + break; + } +} + /* Enable IS0 lookups per port and set the keyset generation */ static void sparx5_vcap_is0_port_key_selection(struct sparx5 *sparx5, struct vcap_admin *admin) diff --git a/drivers/net/ethernet/microchip/sparx5/sparx5_vcap_impl.h b/drivers/net/ethernet/microchip/sparx5/sparx5_vcap_impl.h index 3260ab5e3a82..2684d9199b05 100644 --- a/drivers/net/ethernet/microchip/sparx5/sparx5_vcap_impl.h +++ b/drivers/net/ethernet/microchip/sparx5/sparx5_vcap_impl.h @@ -195,6 +195,12 @@ int sparx5_vcap_get_port_keyset(struct net_device *ndev, u16 l3_proto, struct vcap_keyset_list *kslist); +/* Change the port keyset for the lookup and protocol */ +void sparx5_vcap_set_port_keyset(struct net_device *ndev, + struct vcap_admin *admin, int cid, + u16 l3_proto, enum vcap_keyfield_set keyset, + struct vcap_keyset_list *orig); + /* Check if the ethertype is supported by the vcap port classification */ bool sparx5_vcap_is_known_etype(struct vcap_admin *admin, u16 etype); diff --git a/drivers/net/ethernet/microchip/vcap/vcap_ag_api.h b/drivers/net/ethernet/microchip/vcap/vcap_ag_api.h index 0844fcaeee68..a556c4419986 100644 --- a/drivers/net/ethernet/microchip/vcap/vcap_ag_api.h +++ 
b/drivers/net/ethernet/microchip/vcap/vcap_ag_api.h @@ -3,8 +3,8 @@ * Microchip VCAP API */ -/* This file is autogenerated by cml-utils 2023-02-10 11:15:56 +0100. - * Commit ID: c30fb4bf0281cd4a7133bdab6682f9e43c872ada +/* This file is autogenerated by cml-utils 2023-02-16 11:41:14 +0100. + * Commit ID: be85f176b3a151fa748dcaf97c8824a5c2e065f3 */ #ifndef __VCAP_AG_API__ @@ -14,6 +14,7 @@ enum vcap_type { VCAP_TYPE_ES0, VCAP_TYPE_ES2, VCAP_TYPE_IS0, + VCAP_TYPE_IS1, VCAP_TYPE_IS2, VCAP_TYPE_MAX }; @@ -21,7 +22,12 @@ enum vcap_type { /* Keyfieldset names with origin information */ enum vcap_keyfield_set { VCAP_KFS_NO_VALUE, /* initial value */ + VCAP_KFS_5TUPLE_IP4, /* lan966x is1 X2 */ + VCAP_KFS_5TUPLE_IP6, /* lan966x is1 X4 */ + VCAP_KFS_7TUPLE, /* lan966x is1 X4 */ VCAP_KFS_ARP, /* sparx5 is2 X6, sparx5 es2 X6, lan966x is2 X2 */ + VCAP_KFS_DBL_VID, /* lan966x is1 X1 */ + VCAP_KFS_DMAC_VID, /* lan966x is1 X1 */ VCAP_KFS_ETAG, /* sparx5 is0 X2 */ VCAP_KFS_IP4_OTHER, /* sparx5 is2 X6, sparx5 es2 X6, lan966x is2 X2 */ VCAP_KFS_IP4_TCP_UDP, /* sparx5 is2 X6, sparx5 es2 X6, lan966x is2 X2 */ @@ -36,10 +42,13 @@ enum vcap_keyfield_set { VCAP_KFS_MAC_ETYPE, /* sparx5 is2 X6, sparx5 es2 X6, lan966x is2 X2 */ VCAP_KFS_MAC_LLC, /* lan966x is2 X2 */ VCAP_KFS_MAC_SNAP, /* lan966x is2 X2 */ + VCAP_KFS_NORMAL, /* lan966x is1 X2 */ VCAP_KFS_NORMAL_5TUPLE_IP4, /* sparx5 is0 X6 */ VCAP_KFS_NORMAL_7TUPLE, /* sparx5 is0 X12 */ + VCAP_KFS_NORMAL_IP6, /* lan966x is1 X4 */ VCAP_KFS_OAM, /* lan966x is2 X2 */ VCAP_KFS_PURE_5TUPLE_IP4, /* sparx5 is0 X3 */ + VCAP_KFS_RT, /* lan966x is1 X1 */ VCAP_KFS_SMAC_SIP4, /* lan966x is2 X1 */ VCAP_KFS_SMAC_SIP6, /* lan966x is2 X2 */ }; @@ -61,17 +70,20 @@ enum vcap_keyfield_set { * Used by 802.1BR Bridge Port Extension in an E-Tag * VCAP_KF_8021BR_IGR_ECID_EXT: W8, sparx5: is0 * Used by 802.1BR Bridge Port Extension in an E-Tag - * VCAP_KF_8021Q_DEI0: W1, sparx5: is0 + * VCAP_KF_8021CB_R_TAGGED_IS: W1, lan966x: is1 + * Set if frame contains an RTAG: 
IEEE 802.1CB (FRER Redundancy tag, Ethertype + * 0xf1c1) + * VCAP_KF_8021Q_DEI0: W1, sparx5: is0, lan966x: is1 * First DEI in multiple vlan tags (outer tag or default port tag) - * VCAP_KF_8021Q_DEI1: W1, sparx5: is0 + * VCAP_KF_8021Q_DEI1: W1, sparx5: is0, lan966x: is1 * Second DEI in multiple vlan tags (inner tag) * VCAP_KF_8021Q_DEI2: W1, sparx5: is0 * Third DEI in multiple vlan tags (not always available) * VCAP_KF_8021Q_DEI_CLS: W1, sparx5: is2/es2, lan966x: is2 * Classified DEI - * VCAP_KF_8021Q_PCP0: W3, sparx5: is0 + * VCAP_KF_8021Q_PCP0: W3, sparx5: is0, lan966x: is1 * First PCP in multiple vlan tags (outer tag or default port tag) - * VCAP_KF_8021Q_PCP1: W3, sparx5: is0 + * VCAP_KF_8021Q_PCP1: W3, sparx5: is0, lan966x: is1 * Second PCP in multiple vlan tags (inner tag) * VCAP_KF_8021Q_PCP2: W3, sparx5: is0 * Third PCP in multiple vlan tags (not always available) @@ -79,22 +91,24 @@ enum vcap_keyfield_set { * Classified PCP * VCAP_KF_8021Q_TPID: W3, sparx5: es0 * TPID for outer tag: 0: Customer TPID 1: Service TPID (88A8 or programmable) - * VCAP_KF_8021Q_TPID0: W3, sparx5: is0 + * VCAP_KF_8021Q_TPID0: sparx5 is0 W3, lan966x is1 W1 * First TPIC in multiple vlan tags (outer tag or default port tag) - * VCAP_KF_8021Q_TPID1: W3, sparx5: is0 + * VCAP_KF_8021Q_TPID1: sparx5 is0 W3, lan966x is1 W1 * Second TPID in multiple vlan tags (inner tag) * VCAP_KF_8021Q_TPID2: W3, sparx5: is0 * Third TPID in multiple vlan tags (not always available) - * VCAP_KF_8021Q_VID0: W12, sparx5: is0 + * VCAP_KF_8021Q_VID0: W12, sparx5: is0, lan966x: is1 * First VID in multiple vlan tags (outer tag or default port tag) - * VCAP_KF_8021Q_VID1: W12, sparx5: is0 + * VCAP_KF_8021Q_VID1: W12, sparx5: is0, lan966x: is1 * Second VID in multiple vlan tags (inner tag) * VCAP_KF_8021Q_VID2: W12, sparx5: is0 * Third VID in multiple vlan tags (not always available) * VCAP_KF_8021Q_VID_CLS: sparx5 is2 W13, sparx5 es0 W13, sparx5 es2 W13, * lan966x is2 W12 * Classified VID - * 
VCAP_KF_8021Q_VLAN_TAGGED_IS: W1, sparx5: is2/es2, lan966x: is2 + * VCAP_KF_8021Q_VLAN_DBL_TAGGED_IS: W1, lan966x: is1 + * Set if frame has two or more Q-tags. Independent of port VLAN awareness + * VCAP_KF_8021Q_VLAN_TAGGED_IS: W1, sparx5: is2/es2, lan966x: is1/is2 * Sparx5: Set if frame was received with a VLAN tag, LAN966x: Set if frame has * one or more Q-tags. Independent of port VLAN awareness * VCAP_KF_8021Q_VLAN_TAGS: W3, sparx5: is0 @@ -120,9 +134,9 @@ enum vcap_keyfield_set { * Class of service * VCAP_KF_ES0_ISDX_KEY_ENA: W1, sparx5: es2 * The value taken from the IFH .FWD.ES0_ISDX_KEY_ENA - * VCAP_KF_ETYPE: W16, sparx5: is0/is2/es2, lan966x: is2 + * VCAP_KF_ETYPE: W16, sparx5: is0/is2/es2, lan966x: is1/is2 * Ethernet type - * VCAP_KF_ETYPE_LEN_IS: W1, sparx5: is0/is2/es2 + * VCAP_KF_ETYPE_LEN_IS: W1, sparx5: is0/is2/es2, lan966x: is1 * Set if frame has EtherType >= 0x600 * VCAP_KF_HOST_MATCH: W1, lan966x: is2 * The action from the SMAC_SIP4 or SMAC_SIP6 lookups. Used for IP source @@ -134,11 +148,12 @@ enum vcap_keyfield_set { * CPU queue) * VCAP_KF_IF_EGR_PORT_NO: W7, sparx5: es0 * Egress port number - * VCAP_KF_IF_IGR_PORT: sparx5 is0 W7, sparx5 es2 W9, lan966x is2 W4 + * VCAP_KF_IF_IGR_PORT: sparx5 is0 W7, sparx5 es2 W9, lan966x is1 W3, lan966x + * is2 W4 * Sparx5: Logical ingress port number retrieved from * ANA_CL::PORT_ID_CFG.LPORT_NUM or ERLEG, LAN966x: ingress port nunmber * VCAP_KF_IF_IGR_PORT_MASK: sparx5 is0 W65, sparx5 is2 W32, sparx5 is2 W65, - * lan966x is2 W9 + * lan966x is1 W9, lan966x is2 W9 * Ingress port mask, one bit per port/erleg * VCAP_KF_IF_IGR_PORT_MASK_L3: W1, sparx5: is2 * If set, IF_IGR_PORT_MASK, IF_IGR_PORT_MASK_RNG, and IF_IGR_PORT_MASK_SEL are @@ -151,24 +166,26 @@ enum vcap_keyfield_set { * Mapping: 0: DEFAULT 1: LOOPBACK 2: MASQUERADE 3: CPU_VD * VCAP_KF_IF_IGR_PORT_SEL: W1, sparx5: es2 * Selector for IF_IGR_PORT: physical port number or ERLEG - * VCAP_KF_IP4_IS: W1, sparx5: is0/is2/es2, lan966x: is2 + * VCAP_KF_IP4_IS: 
W1, sparx5: is0/is2/es2, lan966x: is1/is2 * Set if frame has EtherType = 0x800 and IP version = 4 - * VCAP_KF_IP_MC_IS: W1, sparx5: is0 + * VCAP_KF_IP_MC_IS: W1, sparx5: is0, lan966x: is1 * Set if frame is IPv4 frame and frame's destination MAC address is an IPv4 * multicast address (0x01005E0 /25). Set if frame is IPv6 frame and frame's * destination MAC address is an IPv6 multicast address (0x3333/16). - * VCAP_KF_IP_PAYLOAD_5TUPLE: W32, sparx5: is0 + * VCAP_KF_IP_PAYLOAD_5TUPLE: W32, sparx5: is0, lan966x: is1 * Payload bytes after IP header - * VCAP_KF_IP_SNAP_IS: W1, sparx5: is0 + * VCAP_KF_IP_PAYLOAD_S1_IP6: W112, lan966x: is1 + * Payload after IPv6 header + * VCAP_KF_IP_SNAP_IS: W1, sparx5: is0, lan966x: is1 * Set if frame is IPv4, IPv6, or SNAP frame * VCAP_KF_ISDX_CLS: W12, sparx5: is2/es0/es2 * Classified ISDX * VCAP_KF_ISDX_GT0_IS: W1, sparx5: is2/es0/es2, lan966x: is2 * Set if classified ISDX > 0 - * VCAP_KF_L2_BC_IS: W1, sparx5: is0/is2/es2, lan966x: is2 + * VCAP_KF_L2_BC_IS: W1, sparx5: is0/is2/es2, lan966x: is1/is2 * Set if frame's destination MAC address is the broadcast address * (FF-FF-FF-FF-FF-FF). - * VCAP_KF_L2_DMAC: W48, sparx5: is0/is2/es2, lan966x: is2 + * VCAP_KF_L2_DMAC: W48, sparx5: is0/is2/es2, lan966x: is1/is2 * Destination MAC address * VCAP_KF_L2_FRM_TYPE: W4, lan966x: is2 * Frame subtype for specific EtherTypes (MRP, DLR) @@ -176,7 +193,9 @@ enum vcap_keyfield_set { * Set if the frame is allowed to be forwarded to front ports * VCAP_KF_L2_LLC: W40, lan966x: is2 * LLC header and data after up to two VLAN tags and the type/length field - * VCAP_KF_L2_MC_IS: W1, sparx5: is0/is2/es2, lan966x: is2 + * VCAP_KF_L2_MAC: W48, lan966x: is1 + * MAC address (FIRST=1: SMAC, FIRST=0: DMAC) + * VCAP_KF_L2_MC_IS: W1, sparx5: is0/is2/es2, lan966x: is1/is2 * Set if frame's destination MAC address is a multicast address (bit 40 = 1). 
* VCAP_KF_L2_PAYLOAD0: W16, lan966x: is2 * Payload bytes 0-1 after the frame's EtherType @@ -188,7 +207,7 @@ enum vcap_keyfield_set { * specifically for PTP frames. * VCAP_KF_L2_PAYLOAD_ETYPE: W64, sparx5: is2/es2 * Byte 0-7 of L2 payload after Type/Len field and overloading for OAM - * VCAP_KF_L2_SMAC: W48, sparx5: is0/is2/es2, lan966x: is2 + * VCAP_KF_L2_SMAC: W48, sparx5: is0/is2/es2, lan966x: is1/is2 * Source MAC address * VCAP_KF_L2_SNAP: W40, lan966x: is2 * SNAP header after LLC header (AA-AA-03) @@ -196,32 +215,38 @@ enum vcap_keyfield_set { * Set if Src IP matches Dst IP address * VCAP_KF_L3_DPL_CLS: W1, sparx5: es0/es2 * The frames drop precedence level - * VCAP_KF_L3_DSCP: W6, sparx5: is0 + * VCAP_KF_L3_DSCP: W6, sparx5: is0, lan966x: is1 * Frame's DSCP value * VCAP_KF_L3_DST_IS: W1, sparx5: is2 * Set if lookup is done for egress router leg - * VCAP_KF_L3_FRAGMENT: W1, lan966x: is2 + * VCAP_KF_L3_FRAGMENT: W1, lan966x: is1/is2 * Set if IPv4 frame is fragmented * VCAP_KF_L3_FRAGMENT_TYPE: W2, sparx5: is0/is2/es2 * L3 Fragmentation type (none, initial, suspicious, valid follow up) * VCAP_KF_L3_FRAG_INVLD_L4_LEN: W1, sparx5: is0/is2 * Set if frame's L4 length is less than ANA_CL:COMMON:CLM_FRAGMENT_CFG.L4_MIN_L * EN - * VCAP_KF_L3_FRAG_OFS_GT0: W1, lan966x: is2 + * VCAP_KF_L3_FRAG_OFS_GT0: W1, lan966x: is1/is2 * Set if IPv4 frame is fragmented and it is not the first fragment - * VCAP_KF_L3_IP4_DIP: W32, sparx5: is0/is2/es2, lan966x: is2 + * VCAP_KF_L3_IP4_DIP: W32, sparx5: is0/is2/es2, lan966x: is1/is2 * Destination IPv4 Address - * VCAP_KF_L3_IP4_SIP: W32, sparx5: is0/is2/es2, lan966x: is2 + * VCAP_KF_L3_IP4_SIP: W32, sparx5: is0/is2/es2, lan966x: is1/is2 * Source IPv4 Address - * VCAP_KF_L3_IP6_DIP: W128, sparx5: is0/is2/es2, lan966x: is2 + * VCAP_KF_L3_IP6_DIP: sparx5 is0 W128, sparx5 is2 W128, sparx5 es2 W128, + * lan966x is1 W64, lan966x is1 W128, lan966x is2 W128 * Sparx5: Full IPv6 DIP, LAN966x: Either Full IPv6 DIP or a subset depending on * frame 
type - * VCAP_KF_L3_IP6_SIP: W128, sparx5: is0/is2/es2, lan966x: is2 + * VCAP_KF_L3_IP6_DIP_MSB: W16, lan966x: is1 + * MS 16bits of IPv6 DIP + * VCAP_KF_L3_IP6_SIP: sparx5 is0 W128, sparx5 is2 W128, sparx5 es2 W128, + * lan966x is1 W128, lan966x is1 W64, lan966x is2 W128 * Sparx5: Full IPv6 SIP, LAN966x: Either Full IPv6 SIP or a subset depending on * frame type - * VCAP_KF_L3_IP_PROTO: W8, sparx5: is0/is2/es2, lan966x: is2 + * VCAP_KF_L3_IP6_SIP_MSB: W16, lan966x: is1 + * MS 16bits of IPv6 DIP + * VCAP_KF_L3_IP_PROTO: W8, sparx5: is0/is2/es2, lan966x: is1/is2 * IPv4 frames: IP protocol. IPv6 frames: Next header, same as for IPV4 - * VCAP_KF_L3_OPTIONS_IS: W1, sparx5: is0/is2/es2, lan966x: is2 + * VCAP_KF_L3_OPTIONS_IS: W1, sparx5: is0/is2/es2, lan966x: is1/is2 * Set if IPv4 frame contains options (IP len > 5) * VCAP_KF_L3_PAYLOAD: sparx5 is2 W96, sparx5 is2 W40, sparx5 es2 W96, sparx5 * es2 W40, lan966x is2 W56 @@ -254,7 +279,8 @@ enum vcap_keyfield_set { * VCAP_KF_L4_PSH: W1, sparx5: is2/es2, lan966x: is2 * Sparx5: TCP flag PSH, LAN966x: TCP: TCP flag PSH. PTP over UDP: flagField bit * 1 (twoStepFlag) - * VCAP_KF_L4_RNG: sparx5 is0 W8, sparx5 is2 W16, sparx5 es2 W16, lan966x is2 W8 + * VCAP_KF_L4_RNG: sparx5 is0 W8, sparx5 is2 W16, sparx5 es2 W16, lan966x is1 + * W8, lan966x is2 W8 * Range checker bitmask (one for each range checker). 
Input into range checkers * is taken from classified results (VID, DSCP) and frame (SPORT, DPORT, ETYPE, * outer VID, inner VID) @@ -264,7 +290,7 @@ enum vcap_keyfield_set { * VCAP_KF_L4_SEQUENCE_EQ0_IS: W1, sparx5: is2/es2, lan966x: is2 * Set if TCP sequence number is 0, LAN966x: Overlayed with PTP over UDP: * messageType bit 0 - * VCAP_KF_L4_SPORT: W16, sparx5: is0/is2/es2, lan966x: is2 + * VCAP_KF_L4_SPORT: W16, sparx5: is0/is2/es2, lan966x: is1/is2 * TCP/UDP source port * VCAP_KF_L4_SPORT_EQ_DPORT_IS: W1, sparx5: is2/es2, lan966x: is2 * Set if UDP or TCP source port equals UDP or TCP destination port @@ -274,13 +300,16 @@ enum vcap_keyfield_set { * VCAP_KF_L4_URG: W1, sparx5: is2/es2, lan966x: is2 * Sparx5: TCP flag URG, LAN966x: TCP: TCP flag URG. PTP over UDP: flagField bit * 7 (reserved) - * VCAP_KF_LOOKUP_FIRST_IS: W1, sparx5: is0/is2/es2, lan966x: is2 + * VCAP_KF_LOOKUP_FIRST_IS: W1, sparx5: is0/is2/es2, lan966x: is1/is2 * Selects between entries relevant for first and second lookup. Set for first * lookup, cleared for second lookup. 
* VCAP_KF_LOOKUP_GEN_IDX: W12, sparx5: is0 * Generic index - for chaining CLM instances * VCAP_KF_LOOKUP_GEN_IDX_SEL: W2, sparx5: is0 * Select the mode of the Generic Index + * VCAP_KF_LOOKUP_INDEX: W2, lan966x: is1 + * 0: First lookup, 1: Second lookup, 2: Third lookup, Similar to VCAP_KF_FIRST + * but with extra info * VCAP_KF_LOOKUP_PAG: W8, sparx5: is2, lan966x: is2 * Classified Policy Association Group: chains rules from IS1/CLM to IS2 * VCAP_KF_MIRROR_PROBE: W2, sparx5: es2 @@ -303,14 +332,22 @@ enum vcap_keyfield_set { * Set if frame's EtherType = 0x8902 * VCAP_KF_PROT_ACTIVE: W1, sparx5: es0/es2 * Protection is active - * VCAP_KF_TCP_IS: W1, sparx5: is0/is2/es2, lan966x: is2 + * VCAP_KF_RT_FRMID: W32, lan966x: is1 + * Profinet or OPC-UA FrameId + * VCAP_KF_RT_TYPE: W2, lan966x: is1 + * Encoding of frame's EtherType: 0: Other, 1: Profinet, 2: OPC-UA, 3: Custom + * (ANA::RT_CUSTOM) + * VCAP_KF_RT_VLAN_IDX: W3, lan966x: is1 + * Real-time VLAN index from ANA::RT_VLAN_PCP + * VCAP_KF_TCP_IS: W1, sparx5: is0/is2/es2, lan966x: is1/is2 * Set if frame is IPv4 TCP frame (IP protocol = 6) or IPv6 TCP frames (Next * header = 6) - * VCAP_KF_TCP_UDP_IS: W1, sparx5: is0/is2/es2 + * VCAP_KF_TCP_UDP_IS: W1, sparx5: is0/is2/es2, lan966x: is1 * Set if frame is IPv4/IPv6 TCP or UDP frame (IP protocol/next header equals 6 * or 17) * VCAP_KF_TYPE: sparx5 is0 W2, sparx5 is0 W1, sparx5 is2 W4, sparx5 is2 W2, - * sparx5 es0 W1, sparx5 es2 W3, lan966x is2 W4, lan966x is2 W2 + * sparx5 es0 W1, sparx5 es2 W3, lan966x is1 W1, lan966x is1 W2, lan966x is2 W4, + * lan966x is2 W2 * Keyset type id - set by the API */ @@ -323,6 +360,7 @@ enum vcap_key_field { VCAP_KF_8021BR_GRP, VCAP_KF_8021BR_IGR_ECID_BASE, VCAP_KF_8021BR_IGR_ECID_EXT, + VCAP_KF_8021CB_R_TAGGED_IS, VCAP_KF_8021Q_DEI0, VCAP_KF_8021Q_DEI1, VCAP_KF_8021Q_DEI2, @@ -339,6 +377,7 @@ enum vcap_key_field { VCAP_KF_8021Q_VID1, VCAP_KF_8021Q_VID2, VCAP_KF_8021Q_VID_CLS, + VCAP_KF_8021Q_VLAN_DBL_TAGGED_IS, 
VCAP_KF_8021Q_VLAN_TAGGED_IS, VCAP_KF_8021Q_VLAN_TAGS, VCAP_KF_ACL_GRP_ID, @@ -366,6 +405,7 @@ enum vcap_key_field { VCAP_KF_IP4_IS, VCAP_KF_IP_MC_IS, VCAP_KF_IP_PAYLOAD_5TUPLE, + VCAP_KF_IP_PAYLOAD_S1_IP6, VCAP_KF_IP_SNAP_IS, VCAP_KF_ISDX_CLS, VCAP_KF_ISDX_GT0_IS, @@ -374,6 +414,7 @@ enum vcap_key_field { VCAP_KF_L2_FRM_TYPE, VCAP_KF_L2_FWD_IS, VCAP_KF_L2_LLC, + VCAP_KF_L2_MAC, VCAP_KF_L2_MC_IS, VCAP_KF_L2_PAYLOAD0, VCAP_KF_L2_PAYLOAD1, @@ -392,7 +433,9 @@ enum vcap_key_field { VCAP_KF_L3_IP4_DIP, VCAP_KF_L3_IP4_SIP, VCAP_KF_L3_IP6_DIP, + VCAP_KF_L3_IP6_DIP_MSB, VCAP_KF_L3_IP6_SIP, + VCAP_KF_L3_IP6_SIP_MSB, VCAP_KF_L3_IP_PROTO, VCAP_KF_L3_OPTIONS_IS, VCAP_KF_L3_PAYLOAD, @@ -416,6 +459,7 @@ enum vcap_key_field { VCAP_KF_LOOKUP_FIRST_IS, VCAP_KF_LOOKUP_GEN_IDX, VCAP_KF_LOOKUP_GEN_IDX_SEL, + VCAP_KF_LOOKUP_INDEX, VCAP_KF_LOOKUP_PAG, VCAP_KF_MIRROR_PROBE, VCAP_KF_OAM_CCM_CNTS_EQ0, @@ -427,6 +471,9 @@ enum vcap_key_field { VCAP_KF_OAM_VER, VCAP_KF_OAM_Y1731_IS, VCAP_KF_PROT_ACTIVE, + VCAP_KF_RT_FRMID, + VCAP_KF_RT_TYPE, + VCAP_KF_RT_VLAN_IDX, VCAP_KF_TCP_IS, VCAP_KF_TCP_UDP_IS, VCAP_KF_TYPE, @@ -440,6 +487,7 @@ enum vcap_actionfield_set { VCAP_AFS_CLASS_REDUCED, /* sparx5 is0 X1 */ VCAP_AFS_ES0, /* sparx5 es0 X1 */ VCAP_AFS_FULL, /* sparx5 is0 X3 */ + VCAP_AFS_S1, /* lan966x is1 X1 */ VCAP_AFS_SMAC_SIP, /* lan966x is2 X1 */ }; @@ -470,23 +518,31 @@ enum vcap_actionfield_set { * CPU extraction queue. Used when FWD_SEL >0 and PIPELINE_ACT = XTR. * VCAP_AF_CPU_QUEUE_NUM: W3, sparx5: is2/es2, lan966x: is2 * CPU queue number. Used when CPU_COPY_ENA is set. + * VCAP_AF_CUSTOM_ACE_TYPE_ENA: W4, lan966x: is1 + * Enables use of custom keys in IS2. Bits 3:2 control second lookup in IS2 + * while bits 1:0 control first lookup. Encoding per lookup: 0: Disabled. 1: + * Extract 40 bytes after position corresponding to the location of the IPv4 + * header and use as key. 2: Extract 40 bytes after SMAC and use as key * VCAP_AF_DEI_A_VAL: W1, sparx5: es0 * DEI used in ES0 tag A. 
See TAG_A_DEI_SEL. * VCAP_AF_DEI_B_VAL: W1, sparx5: es0 * DEI used in ES0 tag B. See TAG_B_DEI_SEL. * VCAP_AF_DEI_C_VAL: W1, sparx5: es0 * DEI used in ES0 tag C. See TAG_C_DEI_SEL. - * VCAP_AF_DEI_ENA: W1, sparx5: is0 + * VCAP_AF_DEI_ENA: W1, sparx5: is0, lan966x: is1 * If set, use DEI_VAL as classified DEI value. Otherwise, DEI from basic * classification is used - * VCAP_AF_DEI_VAL: W1, sparx5: is0 + * VCAP_AF_DEI_VAL: W1, sparx5: is0, lan966x: is1 * See DEI_ENA - * VCAP_AF_DP_ENA: W1, sparx5: is0 + * VCAP_AF_DLR_SEL: W2, lan966x: is1 + * 0: No changes to port-based selection in ANA:PORT:OAM_CFG.DLR_ENA. 1: Enable + * DLR frame processing 2: Disable DLR processing + * VCAP_AF_DP_ENA: W1, sparx5: is0, lan966x: is1 * If set, use DP_VAL as classified drop precedence level. Otherwise, drop * precedence level from basic classification is used. - * VCAP_AF_DP_VAL: W2, sparx5: is0 + * VCAP_AF_DP_VAL: sparx5 is0 W2, lan966x is1 W1 * See DP_ENA. - * VCAP_AF_DSCP_ENA: W1, sparx5: is0 + * VCAP_AF_DSCP_ENA: W1, sparx5: is0, lan966x: is1 * If set, use DSCP_VAL as classified DSCP value. Otherwise, DSCP value from * basic classification is used. * VCAP_AF_DSCP_SEL: W3, sparx5: es0 @@ -495,7 +551,7 @@ enum vcap_actionfield_set { * table 0, otherwise use DSCP_VAL. 5: Mapped using mapping table 1, otherwise * use mapping table 0. 6: Mapped using mapping table 2, otherwise use DSCP_VAL. * 7: Mapped using mapping table 3, otherwise use mapping table 2 - * VCAP_AF_DSCP_VAL: W6, sparx5: is0/es0 + * VCAP_AF_DSCP_VAL: W6, sparx5: is0/es0, lan966x: is1 * See DSCP_ENA. * VCAP_AF_ES2_REW_CMD: W3, sparx5: es2 * Command forwarded to REW: 0: No action. 1: SWAP MAC addresses. 2: Do L2CP @@ -529,9 +585,16 @@ enum vcap_actionfield_set { * VCAP_AF_ISDX_ADD_REPLACE_SEL: W1, sparx5: is0 * Controls the classified ISDX. 0: New ISDX = old ISDX + ISDX_VAL. 1: New ISDX * = ISDX_VAL. + * VCAP_AF_ISDX_ADD_VAL: W8, lan966x: is1 + * If ISDX_REPLACE_ENA is set, ISDX_ADD_VAL is used directly as the new ISDX. 
+ * Encoding: ISDX_REPLACE_ENA=0, ISDX_ADD_VAL=0: Disabled ISDX_EPLACE_ENA=0, + * ISDX_ADD_VAL>0: Add value to classified ISDX. ISDX_REPLACE_ENA=1: Replace + * with ISDX_ADD_VAL value. * VCAP_AF_ISDX_ENA: W1, lan966x: is2 * Setting this bit to 1 causes the classified ISDX to be set to the value of * POLICE_IDX[8:0]. + * VCAP_AF_ISDX_REPLACE_ENA: W1, lan966x: is1 + * If set, classified ISDX is set to ISDX_ADD_VAL. * VCAP_AF_ISDX_VAL: W12, sparx5: is0 * See isdx_add_replace_sel * VCAP_AF_LOOP_ENA: W1, sparx5: es0 @@ -572,14 +635,22 @@ enum vcap_actionfield_set { * VCAP_AF_MIRROR_PROBE_ID: W2, sparx5: es2 * Signals a mirror probe to be placed in the IFH. Only possible when FWD_MODE * is copy. 0: No mirroring. 1-3: Use mirror probe 0-2. + * VCAP_AF_MRP_SEL: W2, lan966x: is1 + * 0: No changes to port-based selection in ANA:PORT:OAM_CFG.MRP_ENA. 1: Enable + * MRP frame processing 2: Disable MRP processing * VCAP_AF_NXT_IDX: W12, sparx5: is0 * Index used as part of key (field G_IDX) in the next lookup. * VCAP_AF_NXT_IDX_CTRL: W3, sparx5: is0 * Controls the generation of the G_IDX used in the VCAP CLM next lookup - * VCAP_AF_PAG_OVERRIDE_MASK: W8, sparx5: is0 + * VCAP_AF_OAM_SEL: W3, lan966x: is1 + * 0: No changes to port-based selection in ANA:PORT:OAM_CFG.OAM_CFG 1: Enable + * OAM frame processing for untagged frames 2: Enable OAM frame processing for + * single frames 3: Enable OAM frame processing for double frames 4: Disable OAM + * frame processing + * VCAP_AF_PAG_OVERRIDE_MASK: W8, sparx5: is0, lan966x: is1 * Bits set in this mask will override PAG_VAL from port profile. New PAG = (PAG * (input) AND ~PAG_OVERRIDE_MASK) OR (PAG_VAL AND PAG_OVERRIDE_MASK) - * VCAP_AF_PAG_VAL: W8, sparx5: is0 + * VCAP_AF_PAG_VAL: W8, sparx5: is0, lan966x: is1 * See PAG_OVERRIDE_MASK. * VCAP_AF_PCP_A_VAL: W3, sparx5: es0 * PCP used in ES0 tag A. See TAG_A_PCP_SEL. @@ -587,10 +658,10 @@ enum vcap_actionfield_set { * PCP used in ES0 tag B. See TAG_B_PCP_SEL. 
* VCAP_AF_PCP_C_VAL: W3, sparx5: es0 * PCP used in ES0 tag C. See TAG_C_PCP_SEL. - * VCAP_AF_PCP_ENA: W1, sparx5: is0 + * VCAP_AF_PCP_ENA: W1, sparx5: is0, lan966x: is1 * If set, use PCP_VAL as classified PCP value. Otherwise, PCP from basic * classification is used. - * VCAP_AF_PCP_VAL: W3, sparx5: is0 + * VCAP_AF_PCP_VAL: W3, sparx5: is0, lan966x: is1 * See PCP_ENA. * VCAP_AF_PIPELINE_ACT: W1, sparx5: es0 * Pipeline action when FWD_SEL > 0. 0: XTR. CPU_QU selects CPU extraction queue @@ -600,11 +671,11 @@ enum vcap_actionfield_set { * PIPELINE_PT == NONE. Overrules previous settings of pipeline point. * VCAP_AF_PIPELINE_PT: sparx5 is2 W5, sparx5 es0 W2 * Pipeline point used if PIPELINE_FORCE_ENA is set - * VCAP_AF_POLICE_ENA: W1, sparx5: is2/es2, lan966x: is2 - * Setting this bit to 1 causes frames that hit this action to be policed by the - * ACL policer specified in POLICE_IDX. Only applies to the first lookup. - * VCAP_AF_POLICE_IDX: sparx5 is2 W6, sparx5 es2 W6, lan966x is2 W9 - * Selects VCAP policer used when policing frames (POLICE_ENA) + * VCAP_AF_POLICE_ENA: W1, sparx5: is2/es2, lan966x: is1/is2 + * If set, POLICE_IDX is used to lookup ANA::POL. + * VCAP_AF_POLICE_IDX: sparx5 is2 W6, sparx5 es2 W6, lan966x is1 W9, lan966x is2 + * W9 + * Policer index. * VCAP_AF_POLICE_REMARK: W1, sparx5: es2 * If set, frames exceeding policer rates are marked as yellow but not * discarded. @@ -628,16 +699,24 @@ enum vcap_actionfield_set { * port. 1: ES0 tag A: Push ES0 tag A. No port tag. 2: Force port tag: Always * push port tag. No ES0 tag A. 3: Force untag: Never push port tag or ES0 tag * A. - * VCAP_AF_QOS_ENA: W1, sparx5: is0 + * VCAP_AF_QOS_ENA: W1, sparx5: is0, lan966x: is1 * If set, use QOS_VAL as classified QoS class. Otherwise, QoS class from basic * classification is used. - * VCAP_AF_QOS_VAL: W3, sparx5: is0 + * VCAP_AF_QOS_VAL: W3, sparx5: is0, lan966x: is1 * See QOS_ENA. * VCAP_AF_REW_OP: W16, lan966x: is2 * Rewriter operation command. 
* VCAP_AF_RT_DIS: W1, sparx5: is2 * If set, routing is disallowed. Only applies when IS_INNER_ACL is 0. See also * IGR_ACL_ENA, EGR_ACL_ENA, and RLEG_STAT_IDX. + * VCAP_AF_SFID_ENA: W1, lan966x: is1 + * If set, SFID_VAL is used to lookup ANA::SFID. + * VCAP_AF_SFID_VAL: W8, lan966x: is1 + * Stream filter identifier. + * VCAP_AF_SGID_ENA: W1, lan966x: is1 + * If set, SGID_VAL is used to lookup ANA::SGID. + * VCAP_AF_SGID_VAL: W8, lan966x: is1 + * Stream gate identifier. * VCAP_AF_SWAP_MACS_ENA: W1, sparx5: es0 * This setting is only active when FWD_SEL = 1 or FWD_SEL = 2 and PIPELINE_ACT * = LBK_ASM. 0: No action. 1: Swap MACs and clear bit 40 in new SMAC. @@ -686,7 +765,7 @@ enum vcap_actionfield_set { * VCAP_AF_TAG_C_VID_SEL: W2, sparx5: es0 * Selects VID for ES0 tag C. The resulting VID is termed C-TAG.VID. 0: * Classified VID. 1: VID_C_VAL. 2: IFH.ENCAP.GVID. 3: Reserved. - * VCAP_AF_TYPE: W1, sparx5: is0 + * VCAP_AF_TYPE: W1, sparx5: is0, lan966x: is1 * Actionset type id - Set by the API * VCAP_AF_UNTAG_VID_ENA: W1, sparx5: es0 * Controls insertion of tag C. Untag or insert mode can be selected. See @@ -697,8 +776,19 @@ enum vcap_actionfield_set { * VID used in ES0 tag B. See TAG_B_VID_SEL. * VCAP_AF_VID_C_VAL: W12, sparx5: es0 * VID used in ES0 tag C. See TAG_C_VID_SEL. - * VCAP_AF_VID_VAL: W13, sparx5: is0 + * VCAP_AF_VID_REPLACE_ENA: W1, lan966x: is1 + * Controls the classified VID: VID_REPLACE_ENA=0: Add VID_ADD_VAL to basic + * classified VID and use result as new classified VID. VID_REPLACE_ENA = 1: + * Replace basic classified VID with VID_VAL value and use as new classified + * VID. + * VCAP_AF_VID_VAL: sparx5 is0 W13, lan966x is1 W12 * New VID Value + * VCAP_AF_VLAN_POP_CNT: W2, lan966x: is1 + * See VLAN_POP_CNT_ENA + * VCAP_AF_VLAN_POP_CNT_ENA: W1, lan966x: is1 + * If set, use VLAN_POP_CNT as the number of VLAN tags to pop from the incoming + * frame. This number is used by the Rewriter. 
Otherwise, VLAN_POP_CNT from + * ANA:PORT:VLAN_CFG.VLAN_POP_CNT is used */ /* Actionfield names */ @@ -712,11 +802,13 @@ enum vcap_action_field { VCAP_AF_CPU_COPY_ENA, VCAP_AF_CPU_QU, VCAP_AF_CPU_QUEUE_NUM, + VCAP_AF_CUSTOM_ACE_TYPE_ENA, VCAP_AF_DEI_A_VAL, VCAP_AF_DEI_B_VAL, VCAP_AF_DEI_C_VAL, VCAP_AF_DEI_ENA, VCAP_AF_DEI_VAL, + VCAP_AF_DLR_SEL, VCAP_AF_DP_ENA, VCAP_AF_DP_VAL, VCAP_AF_DSCP_ENA, @@ -732,7 +824,9 @@ enum vcap_action_field { VCAP_AF_IGNORE_PIPELINE_CTRL, VCAP_AF_INTR_ENA, VCAP_AF_ISDX_ADD_REPLACE_SEL, + VCAP_AF_ISDX_ADD_VAL, VCAP_AF_ISDX_ENA, + VCAP_AF_ISDX_REPLACE_ENA, VCAP_AF_ISDX_VAL, VCAP_AF_LOOP_ENA, VCAP_AF_LRN_DIS, @@ -745,8 +839,10 @@ enum vcap_action_field { VCAP_AF_MIRROR_ENA, VCAP_AF_MIRROR_PROBE, VCAP_AF_MIRROR_PROBE_ID, + VCAP_AF_MRP_SEL, VCAP_AF_NXT_IDX, VCAP_AF_NXT_IDX_CTRL, + VCAP_AF_OAM_SEL, VCAP_AF_PAG_OVERRIDE_MASK, VCAP_AF_PAG_VAL, VCAP_AF_PCP_A_VAL, @@ -770,6 +866,10 @@ enum vcap_action_field { VCAP_AF_QOS_VAL, VCAP_AF_REW_OP, VCAP_AF_RT_DIS, + VCAP_AF_SFID_ENA, + VCAP_AF_SFID_VAL, + VCAP_AF_SGID_ENA, + VCAP_AF_SGID_VAL, VCAP_AF_SWAP_MACS_ENA, VCAP_AF_TAG_A_DEI_SEL, VCAP_AF_TAG_A_PCP_SEL, @@ -788,7 +888,10 @@ enum vcap_action_field { VCAP_AF_VID_A_VAL, VCAP_AF_VID_B_VAL, VCAP_AF_VID_C_VAL, + VCAP_AF_VID_REPLACE_ENA, VCAP_AF_VID_VAL, + VCAP_AF_VLAN_POP_CNT, + VCAP_AF_VLAN_POP_CNT_ENA, }; #endif /* __VCAP_AG_API__ */ diff --git a/drivers/net/ethernet/microchip/vcap/vcap_api.c b/drivers/net/ethernet/microchip/vcap/vcap_api.c index 4847d0d99ec9..5675b0962bc3 100644 --- a/drivers/net/ethernet/microchip/vcap/vcap_api.c +++ b/drivers/net/ethernet/microchip/vcap/vcap_api.c @@ -976,6 +976,25 @@ int vcap_lookup_rule_by_cookie(struct vcap_control *vctrl, u64 cookie) } EXPORT_SYMBOL_GPL(vcap_lookup_rule_by_cookie); +/* Get number of rules in a vcap instance lookup chain id range */ +int vcap_admin_rule_count(struct vcap_admin *admin, int cid) +{ + int max_cid = roundup(cid + 1, VCAP_CID_LOOKUP_SIZE); + int min_cid = rounddown(cid, 
VCAP_CID_LOOKUP_SIZE); + struct vcap_rule_internal *elem; + int count = 0; + + list_for_each_entry(elem, &admin->rules, list) { + mutex_lock(&admin->lock); + if (elem->data.vcap_chain_id >= min_cid && + elem->data.vcap_chain_id < max_cid) + ++count; + mutex_unlock(&admin->lock); + } + return count; +} +EXPORT_SYMBOL_GPL(vcap_admin_rule_count); + /* Make a copy of the rule, shallow or full */ static struct vcap_rule_internal *vcap_dup_rule(struct vcap_rule_internal *ri, bool full) @@ -3403,6 +3422,25 @@ int vcap_rule_mod_key_u32(struct vcap_rule *rule, enum vcap_key_field key, } EXPORT_SYMBOL_GPL(vcap_rule_mod_key_u32); +/* Remove a key field with value and mask in the rule */ +int vcap_rule_rem_key(struct vcap_rule *rule, enum vcap_key_field key) +{ + struct vcap_rule_internal *ri = to_intrule(rule); + struct vcap_client_keyfield *field; + + field = vcap_find_keyfield(rule, key); + if (!field) { + pr_err("%s:%d: key %s is not in the rule\n", + __func__, __LINE__, vcap_keyfield_name(ri->vctrl, key)); + return -EINVAL; + } + /* Deallocate the key field */ + list_del(&field->ctrl.list); + kfree(field); + return 0; +} +EXPORT_SYMBOL_GPL(vcap_rule_rem_key); + static int vcap_rule_mod_action(struct vcap_rule *rule, enum vcap_action_field action, enum vcap_field_type ftype, @@ -3475,6 +3513,29 @@ int vcap_filter_rule_keys(struct vcap_rule *rule, } EXPORT_SYMBOL_GPL(vcap_filter_rule_keys); +/* Select the keyset from the list that results in the smallest rule size */ +enum vcap_keyfield_set +vcap_select_min_rule_keyset(struct vcap_control *vctrl, + enum vcap_type vtype, + struct vcap_keyset_list *kslist) +{ + enum vcap_keyfield_set ret = VCAP_KFS_NO_VALUE; + const struct vcap_set *kset; + int max = 100, idx; + + for (idx = 0; idx < kslist->cnt; ++idx) { + kset = vcap_keyfieldset(vctrl, vtype, kslist->keysets[idx]); + if (!kset) + continue; + if (kset->sw_per_item >= max) + continue; + max = kset->sw_per_item; + ret = kslist->keysets[idx]; + } + return ret; +} 
+EXPORT_SYMBOL_GPL(vcap_select_min_rule_keyset); + /* Make a full copy of an existing rule with a new rule id */ struct vcap_rule *vcap_copy_rule(struct vcap_rule *erule) { diff --git a/drivers/net/ethernet/microchip/vcap/vcap_api_client.h b/drivers/net/ethernet/microchip/vcap/vcap_api_client.h index 417af9754bcc..d9d1f7c9d762 100644 --- a/drivers/net/ethernet/microchip/vcap/vcap_api_client.h +++ b/drivers/net/ethernet/microchip/vcap/vcap_api_client.h @@ -201,6 +201,9 @@ int vcap_rule_add_action_bit(struct vcap_rule *rule, int vcap_rule_add_action_u32(struct vcap_rule *rule, enum vcap_action_field action, u32 value); +/* Get number of rules in a vcap instance lookup chain id range */ +int vcap_admin_rule_count(struct vcap_admin *admin, int cid); + /* VCAP rule counter operations */ int vcap_get_rule_count_by_cookie(struct vcap_control *vctrl, struct vcap_counter *ctr, u64 cookie); @@ -269,6 +272,14 @@ int vcap_rule_mod_action_u32(struct vcap_rule *rule, int vcap_rule_get_key_u32(struct vcap_rule *rule, enum vcap_key_field key, u32 *value, u32 *mask); +/* Remove a key field with value and mask in the rule */ +int vcap_rule_rem_key(struct vcap_rule *rule, enum vcap_key_field key); + +/* Select the keyset from the list that results in the smallest rule size */ +enum vcap_keyfield_set +vcap_select_min_rule_keyset(struct vcap_control *vctrl, enum vcap_type vtype, + struct vcap_keyset_list *kslist); + struct vcap_client_actionfield * vcap_find_actionfield(struct vcap_rule *rule, enum vcap_action_field act); #endif /* __VCAP_API_CLIENT__ */ diff --git a/drivers/net/ethernet/microchip/vcap/vcap_api_debugfs_kunit.c b/drivers/net/ethernet/microchip/vcap/vcap_api_debugfs_kunit.c index 0de3f677135a..b23c11b0647c 100644 --- a/drivers/net/ethernet/microchip/vcap/vcap_api_debugfs_kunit.c +++ b/drivers/net/ethernet/microchip/vcap/vcap_api_debugfs_kunit.c @@ -387,7 +387,7 @@ static const char * const test_admin_info_expect[] = { "default_cnt: 73\n", "require_cnt_dis: 0\n", 
"version: 1\n", - "vtype: 3\n", + "vtype: 4\n", "vinst: 0\n", "ingress: 1\n", "first_cid: 10000\n", @@ -435,7 +435,7 @@ static const char * const test_admin_expect[] = { "default_cnt: 73\n", "require_cnt_dis: 0\n", "version: 1\n", - "vtype: 3\n", + "vtype: 4\n", "vinst: 0\n", "ingress: 1\n", "first_cid: 8000000\n", diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c index f9b8f372ec8a..8f3f78b68592 100644 --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c @@ -1439,7 +1439,6 @@ free_gc: release_region: pci_release_regions(pdev); disable_dev: - pci_clear_master(pdev); pci_disable_device(pdev); dev_err(&pdev->dev, "gdma probe failed: err = %d\n", err); return err; @@ -1458,7 +1457,6 @@ static void mana_gd_remove(struct pci_dev *pdev) vfree(gc); pci_release_regions(pdev); - pci_clear_master(pdev); pci_disable_device(pdev); } diff --git a/drivers/net/ethernet/microsoft/mana/mana_bpf.c b/drivers/net/ethernet/microsoft/mana/mana_bpf.c index 3caea631229c..23b1521c0df9 100644 --- a/drivers/net/ethernet/microsoft/mana/mana_bpf.c +++ b/drivers/net/ethernet/microsoft/mana/mana_bpf.c @@ -133,12 +133,6 @@ out: return act; } -static unsigned int mana_xdp_fraglen(unsigned int len) -{ - return SKB_DATA_ALIGN(len) + - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); -} - struct bpf_prog *mana_xdp_get(struct mana_port_context *apc) { ASSERT_RTNL(); @@ -179,17 +173,18 @@ static int mana_xdp_set(struct net_device *ndev, struct bpf_prog *prog, { struct mana_port_context *apc = netdev_priv(ndev); struct bpf_prog *old_prog; - int buf_max; + struct gdma_context *gc; + + gc = apc->ac->gdma_dev->gdma_context; old_prog = mana_xdp_get(apc); if (!old_prog && !prog) return 0; - buf_max = XDP_PACKET_HEADROOM + mana_xdp_fraglen(ndev->mtu + ETH_HLEN); - if (prog && buf_max > PAGE_SIZE) { - netdev_err(ndev, "XDP: mtu:%u too large, buf_max:%u\n", - ndev->mtu, buf_max); + if (prog && 
ndev->mtu > MANA_XDP_MTU_MAX) { + netdev_err(ndev, "XDP: mtu:%u too large, mtu_max:%lu\n", + ndev->mtu, MANA_XDP_MTU_MAX); NL_SET_ERR_MSG_MOD(extack, "XDP: mtu too large"); return -EOPNOTSUPP; @@ -206,6 +201,11 @@ static int mana_xdp_set(struct net_device *ndev, struct bpf_prog *prog, if (apc->port_is_up) mana_chn_setxdp(apc, prog); + if (prog) + ndev->max_mtu = MANA_XDP_MTU_MAX; + else + ndev->max_mtu = gc->adapter_mtu - ETH_HLEN; + return 0; } diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c index 6120f2b6684f..06d6292e09b3 100644 --- a/drivers/net/ethernet/microsoft/mana/mana_en.c +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c @@ -156,6 +156,7 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev) struct mana_txq *txq; struct mana_cq *cq; int err, len; + u16 ihs; if (unlikely(!apc->port_is_up)) goto tx_drop; @@ -166,6 +167,7 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev) txq = &apc->tx_qp[txq_idx].txq; gdma_sq = txq->gdma_sq; cq = &apc->tx_qp[txq_idx].tx_cq; + tx_stats = &txq->stats; pkg.tx_oob.s_oob.vcq_num = cq->gdma_id; pkg.tx_oob.s_oob.vsq_frame = txq->vsq_frame; @@ -179,10 +181,17 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev) pkg.tx_oob.s_oob.pkt_fmt = pkt_fmt; - if (pkt_fmt == MANA_SHORT_PKT_FMT) + if (pkt_fmt == MANA_SHORT_PKT_FMT) { pkg.wqe_req.inline_oob_size = sizeof(struct mana_tx_short_oob); - else + u64_stats_update_begin(&tx_stats->syncp); + tx_stats->short_pkt_fmt++; + u64_stats_update_end(&tx_stats->syncp); + } else { pkg.wqe_req.inline_oob_size = sizeof(struct mana_tx_oob); + u64_stats_update_begin(&tx_stats->syncp); + tx_stats->long_pkt_fmt++; + u64_stats_update_end(&tx_stats->syncp); + } pkg.wqe_req.inline_oob_data = &pkg.tx_oob; pkg.wqe_req.flags = 0; @@ -232,9 +241,35 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev) &ipv6_hdr(skb)->daddr, 0, IPPROTO_TCP, 0); } + + if 
(skb->encapsulation) { + ihs = skb_inner_tcp_all_headers(skb); + u64_stats_update_begin(&tx_stats->syncp); + tx_stats->tso_inner_packets++; + tx_stats->tso_inner_bytes += skb->len - ihs; + u64_stats_update_end(&tx_stats->syncp); + } else { + if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4) { + ihs = skb_transport_offset(skb) + sizeof(struct udphdr); + } else { + ihs = skb_tcp_all_headers(skb); + if (ipv6_has_hopopt_jumbo(skb)) + ihs -= sizeof(struct hop_jumbo_hdr); + } + + u64_stats_update_begin(&tx_stats->syncp); + tx_stats->tso_packets++; + tx_stats->tso_bytes += skb->len - ihs; + u64_stats_update_end(&tx_stats->syncp); + } + } else if (skb->ip_summed == CHECKSUM_PARTIAL) { csum_type = mana_checksum_info(skb); + u64_stats_update_begin(&tx_stats->syncp); + tx_stats->csum_partial++; + u64_stats_update_end(&tx_stats->syncp); + if (csum_type == IPPROTO_TCP) { pkg.tx_oob.s_oob.is_outer_ipv4 = ipv4; pkg.tx_oob.s_oob.is_outer_ipv6 = ipv6; @@ -254,8 +289,12 @@ netdev_tx_t mana_start_xmit(struct sk_buff *skb, struct net_device *ndev) } } - if (mana_map_skb(skb, apc, &pkg)) + if (mana_map_skb(skb, apc, &pkg)) { + u64_stats_update_begin(&tx_stats->syncp); + tx_stats->mana_map_err++; + u64_stats_update_end(&tx_stats->syncp); goto free_sgl_ptr; + } skb_queue_tail(&txq->pending_skbs, skb); @@ -388,6 +427,199 @@ static u16 mana_select_queue(struct net_device *ndev, struct sk_buff *skb, return txq; } +/* Release pre-allocated RX buffers */ +static void mana_pre_dealloc_rxbufs(struct mana_port_context *mpc) +{ + struct device *dev; + int i; + + dev = mpc->ac->gdma_dev->gdma_context->dev; + + if (!mpc->rxbufs_pre) + goto out1; + + if (!mpc->das_pre) + goto out2; + + while (mpc->rxbpre_total) { + i = --mpc->rxbpre_total; + dma_unmap_single(dev, mpc->das_pre[i], mpc->rxbpre_datasize, + DMA_FROM_DEVICE); + put_page(virt_to_head_page(mpc->rxbufs_pre[i])); + } + + kfree(mpc->das_pre); + mpc->das_pre = NULL; + +out2: + kfree(mpc->rxbufs_pre); + mpc->rxbufs_pre = NULL; + +out1: + 
mpc->rxbpre_datasize = 0; + mpc->rxbpre_alloc_size = 0; + mpc->rxbpre_headroom = 0; +} + +/* Get a buffer from the pre-allocated RX buffers */ +static void *mana_get_rxbuf_pre(struct mana_rxq *rxq, dma_addr_t *da) +{ + struct net_device *ndev = rxq->ndev; + struct mana_port_context *mpc; + void *va; + + mpc = netdev_priv(ndev); + + if (!mpc->rxbufs_pre || !mpc->das_pre || !mpc->rxbpre_total) { + netdev_err(ndev, "No RX pre-allocated bufs\n"); + return NULL; + } + + /* Check sizes to catch unexpected coding error */ + if (mpc->rxbpre_datasize != rxq->datasize) { + netdev_err(ndev, "rxbpre_datasize mismatch: %u: %u\n", + mpc->rxbpre_datasize, rxq->datasize); + return NULL; + } + + if (mpc->rxbpre_alloc_size != rxq->alloc_size) { + netdev_err(ndev, "rxbpre_alloc_size mismatch: %u: %u\n", + mpc->rxbpre_alloc_size, rxq->alloc_size); + return NULL; + } + + if (mpc->rxbpre_headroom != rxq->headroom) { + netdev_err(ndev, "rxbpre_headroom mismatch: %u: %u\n", + mpc->rxbpre_headroom, rxq->headroom); + return NULL; + } + + mpc->rxbpre_total--; + + *da = mpc->das_pre[mpc->rxbpre_total]; + va = mpc->rxbufs_pre[mpc->rxbpre_total]; + mpc->rxbufs_pre[mpc->rxbpre_total] = NULL; + + /* Deallocate the array after all buffers are gone */ + if (!mpc->rxbpre_total) + mana_pre_dealloc_rxbufs(mpc); + + return va; +} + +/* Get RX buffer's data size, alloc size, XDP headroom based on MTU */ +static void mana_get_rxbuf_cfg(int mtu, u32 *datasize, u32 *alloc_size, + u32 *headroom) +{ + if (mtu > MANA_XDP_MTU_MAX) + *headroom = 0; /* no support for XDP */ + else + *headroom = XDP_PACKET_HEADROOM; + + *alloc_size = mtu + MANA_RXBUF_PAD + *headroom; + + *datasize = ALIGN(mtu + ETH_HLEN, MANA_RX_DATA_ALIGN); +} + +static int mana_pre_alloc_rxbufs(struct mana_port_context *mpc, int new_mtu) +{ + struct device *dev; + struct page *page; + dma_addr_t da; + int num_rxb; + void *va; + int i; + + mana_get_rxbuf_cfg(new_mtu, &mpc->rxbpre_datasize, + &mpc->rxbpre_alloc_size, &mpc->rxbpre_headroom); + + 
dev = mpc->ac->gdma_dev->gdma_context->dev; + + num_rxb = mpc->num_queues * RX_BUFFERS_PER_QUEUE; + + WARN(mpc->rxbufs_pre, "mana rxbufs_pre exists\n"); + mpc->rxbufs_pre = kmalloc_array(num_rxb, sizeof(void *), GFP_KERNEL); + if (!mpc->rxbufs_pre) + goto error; + + mpc->das_pre = kmalloc_array(num_rxb, sizeof(dma_addr_t), GFP_KERNEL); + if (!mpc->das_pre) + goto error; + + mpc->rxbpre_total = 0; + + for (i = 0; i < num_rxb; i++) { + if (mpc->rxbpre_alloc_size > PAGE_SIZE) { + va = netdev_alloc_frag(mpc->rxbpre_alloc_size); + if (!va) + goto error; + + page = virt_to_head_page(va); + /* Check if the frag falls back to single page */ + if (compound_order(page) < + get_order(mpc->rxbpre_alloc_size)) { + put_page(page); + goto error; + } + } else { + page = dev_alloc_page(); + if (!page) + goto error; + + va = page_to_virt(page); + } + + da = dma_map_single(dev, va + mpc->rxbpre_headroom, + mpc->rxbpre_datasize, DMA_FROM_DEVICE); + if (dma_mapping_error(dev, da)) { + put_page(virt_to_head_page(va)); + goto error; + } + + mpc->rxbufs_pre[i] = va; + mpc->das_pre[i] = da; + mpc->rxbpre_total = i + 1; + } + + return 0; + +error: + mana_pre_dealloc_rxbufs(mpc); + return -ENOMEM; +} + +static int mana_change_mtu(struct net_device *ndev, int new_mtu) +{ + struct mana_port_context *mpc = netdev_priv(ndev); + unsigned int old_mtu = ndev->mtu; + int err; + + /* Pre-allocate buffers to prevent failure in mana_attach later */ + err = mana_pre_alloc_rxbufs(mpc, new_mtu); + if (err) { + netdev_err(ndev, "Insufficient memory for new MTU\n"); + return err; + } + + err = mana_detach(ndev, false); + if (err) { + netdev_err(ndev, "mana_detach failed: %d\n", err); + goto out; + } + + ndev->mtu = new_mtu; + + err = mana_attach(ndev); + if (err) { + netdev_err(ndev, "mana_attach failed: %d\n", err); + ndev->mtu = old_mtu; + } + +out: + mana_pre_dealloc_rxbufs(mpc); + return err; +} + static const struct net_device_ops mana_devops = { .ndo_open = mana_open, .ndo_stop = mana_close, @@ -397,6 
+629,7 @@ static const struct net_device_ops mana_devops = { .ndo_get_stats64 = mana_get_stats64, .ndo_bpf = mana_bpf, .ndo_xdp_xmit = mana_xdp_xmit, + .ndo_change_mtu = mana_change_mtu, }; static void mana_cleanup_port_context(struct mana_port_context *apc) @@ -586,6 +819,9 @@ static int mana_query_device_cfg(struct mana_context *ac, u32 proto_major_ver, mana_gd_init_req_hdr(&req.hdr, MANA_QUERY_DEV_CONFIG, sizeof(req), sizeof(resp)); + + req.hdr.resp.msg_version = GDMA_MESSAGE_V2; + req.proto_major_ver = proto_major_ver; req.proto_minor_ver = proto_minor_ver; req.proto_micro_ver = proto_micro_ver; @@ -608,6 +844,11 @@ static int mana_query_device_cfg(struct mana_context *ac, u32 proto_major_ver, *max_num_vports = resp.max_num_vports; + if (resp.hdr.response.msg_version == GDMA_MESSAGE_V2) + gc->adapter_mtu = resp.adapter_mtu; + else + gc->adapter_mtu = ETH_FRAME_LEN; + return 0; } @@ -1038,6 +1279,8 @@ static void mana_poll_tx_cq(struct mana_cq *cq) if (comp_read < 1) return; + apc->eth_stats.tx_cqes = comp_read; + for (i = 0; i < comp_read; i++) { struct mana_tx_comp_oob *cqe_oob; @@ -1064,6 +1307,7 @@ static void mana_poll_tx_cq(struct mana_cq *cq) case CQE_TX_VLAN_TAGGING_VIOLATION: WARN_ONCE(1, "TX: CQE error %d: ignored.\n", cqe_oob->cqe_hdr.cqe_type); + apc->eth_stats.tx_cqe_err++; break; default: @@ -1072,6 +1316,7 @@ static void mana_poll_tx_cq(struct mana_cq *cq) */ WARN_ONCE(1, "TX: Unexpected CQE type %d: HW BUG?\n", cqe_oob->cqe_hdr.cqe_type); + apc->eth_stats.tx_cqe_unknown_type++; return; } @@ -1118,6 +1363,8 @@ static void mana_poll_tx_cq(struct mana_cq *cq) WARN_ON_ONCE(1); cq->work_done = pkt_transmitted; + + apc->eth_stats.tx_cqes -= pkt_transmitted; } static void mana_post_pkt_rxq(struct mana_rxq *rxq) @@ -1140,10 +1387,10 @@ static void mana_post_pkt_rxq(struct mana_rxq *rxq) WARN_ON_ONCE(recv_buf_oob->wqe_inf.wqe_size_in_bu != 1); } -static struct sk_buff *mana_build_skb(void *buf_va, uint pkt_len, - struct xdp_buff *xdp) +static struct 
sk_buff *mana_build_skb(struct mana_rxq *rxq, void *buf_va, + uint pkt_len, struct xdp_buff *xdp) { - struct sk_buff *skb = build_skb(buf_va, PAGE_SIZE); + struct sk_buff *skb = napi_build_skb(buf_va, rxq->alloc_size); if (!skb) return NULL; @@ -1151,11 +1398,12 @@ static struct sk_buff *mana_build_skb(void *buf_va, uint pkt_len, if (xdp->data_hard_start) { skb_reserve(skb, xdp->data - xdp->data_hard_start); skb_put(skb, xdp->data_end - xdp->data); - } else { - skb_reserve(skb, XDP_PACKET_HEADROOM); - skb_put(skb, pkt_len); + return skb; } + skb_reserve(skb, rxq->headroom); + skb_put(skb, pkt_len); + return skb; } @@ -1188,7 +1436,7 @@ static void mana_rx_skb(void *buf_va, struct mana_rxcomp_oob *cqe, if (act != XDP_PASS && act != XDP_TX) goto drop_xdp; - skb = mana_build_skb(buf_va, pkt_len, &xdp); + skb = mana_build_skb(rxq, buf_va, pkt_len, &xdp); if (!skb) goto drop; @@ -1237,14 +1485,77 @@ drop_xdp: u64_stats_update_end(&rx_stats->syncp); drop: - WARN_ON_ONCE(rxq->xdp_save_page); - rxq->xdp_save_page = virt_to_page(buf_va); + WARN_ON_ONCE(rxq->xdp_save_va); + /* Save for reuse */ + rxq->xdp_save_va = buf_va; ++ndev->stats.rx_dropped; return; } +static void *mana_get_rxfrag(struct mana_rxq *rxq, struct device *dev, + dma_addr_t *da, bool is_napi) +{ + struct page *page; + void *va; + + /* Reuse XDP dropped page if available */ + if (rxq->xdp_save_va) { + va = rxq->xdp_save_va; + rxq->xdp_save_va = NULL; + } else if (rxq->alloc_size > PAGE_SIZE) { + if (is_napi) + va = napi_alloc_frag(rxq->alloc_size); + else + va = netdev_alloc_frag(rxq->alloc_size); + + if (!va) + return NULL; + + page = virt_to_head_page(va); + /* Check if the frag falls back to single page */ + if (compound_order(page) < get_order(rxq->alloc_size)) { + put_page(page); + return NULL; + } + } else { + page = dev_alloc_page(); + if (!page) + return NULL; + + va = page_to_virt(page); + } + + *da = dma_map_single(dev, va + rxq->headroom, rxq->datasize, + DMA_FROM_DEVICE); + if 
(dma_mapping_error(dev, *da)) { + put_page(virt_to_head_page(va)); + return NULL; + } + + return va; +} + +/* Allocate frag for rx buffer, and save the old buf */ +static void mana_refill_rx_oob(struct device *dev, struct mana_rxq *rxq, + struct mana_recv_buf_oob *rxoob, void **old_buf) +{ + dma_addr_t da; + void *va; + + va = mana_get_rxfrag(rxq, dev, &da, true); + if (!va) + return; + + dma_unmap_single(dev, rxoob->sgl[0].address, rxq->datasize, + DMA_FROM_DEVICE); + *old_buf = rxoob->buf_va; + + rxoob->buf_va = va; + rxoob->sgl[0].address = da; +} + static void mana_process_rx_cqe(struct mana_rxq *rxq, struct mana_cq *cq, struct gdma_comp *cqe) { @@ -1252,11 +1563,12 @@ static void mana_process_rx_cqe(struct mana_rxq *rxq, struct mana_cq *cq, struct gdma_context *gc = rxq->gdma_rq->gdma_dev->gdma_context; struct net_device *ndev = rxq->ndev; struct mana_recv_buf_oob *rxbuf_oob; + struct mana_port_context *apc; struct device *dev = gc->dev; - void *new_buf, *old_buf; - struct page *new_page; + void *old_buf = NULL; u32 curr, pktlen; - dma_addr_t da; + + apc = netdev_priv(ndev); switch (oob->cqe_hdr.cqe_type) { case CQE_RX_OKAY: @@ -1270,6 +1582,7 @@ static void mana_process_rx_cqe(struct mana_rxq *rxq, struct mana_cq *cq, case CQE_RX_COALESCED_4: netdev_err(ndev, "RX coalescing is unsupported\n"); + apc->eth_stats.rx_coalesced_err++; return; case CQE_RX_OBJECT_FENCE: @@ -1279,6 +1592,7 @@ static void mana_process_rx_cqe(struct mana_rxq *rxq, struct mana_cq *cq, default: netdev_err(ndev, "Unknown RX CQE type = %d\n", oob->cqe_hdr.cqe_type); + apc->eth_stats.rx_cqe_unknown_type++; return; } @@ -1295,40 +1609,11 @@ static void mana_process_rx_cqe(struct mana_rxq *rxq, struct mana_cq *cq, rxbuf_oob = &rxq->rx_oobs[curr]; WARN_ON_ONCE(rxbuf_oob->wqe_inf.wqe_size_in_bu != 1); - /* Reuse XDP dropped page if available */ - if (rxq->xdp_save_page) { - new_page = rxq->xdp_save_page; - rxq->xdp_save_page = NULL; - } else { - new_page = alloc_page(GFP_ATOMIC); - } - - if 
(new_page) { - da = dma_map_page(dev, new_page, XDP_PACKET_HEADROOM, rxq->datasize, - DMA_FROM_DEVICE); - - if (dma_mapping_error(dev, da)) { - __free_page(new_page); - new_page = NULL; - } - } - - new_buf = new_page ? page_to_virt(new_page) : NULL; - - if (new_buf) { - dma_unmap_page(dev, rxbuf_oob->buf_dma_addr, rxq->datasize, - DMA_FROM_DEVICE); - - old_buf = rxbuf_oob->buf_va; - - /* refresh the rxbuf_oob with the new page */ - rxbuf_oob->buf_va = new_buf; - rxbuf_oob->buf_dma_addr = da; - rxbuf_oob->sgl[0].address = rxbuf_oob->buf_dma_addr; - } else { - old_buf = NULL; /* drop the packet if no memory */ - } + mana_refill_rx_oob(dev, rxq, rxbuf_oob, &old_buf); + /* Unsuccessful refill will have old_buf == NULL. + * In this case, mana_rx_skb() will drop the packet. + */ mana_rx_skb(old_buf, oob, rxq); drop: @@ -1341,11 +1626,15 @@ static void mana_poll_rx_cq(struct mana_cq *cq) { struct gdma_comp *comp = cq->gdma_comp_buf; struct mana_rxq *rxq = cq->rxq; + struct mana_port_context *apc; int comp_read, i; + apc = netdev_priv(rxq->ndev); + comp_read = mana_gd_poll_cq(cq->gdma_cq, comp, CQE_POLLING_BUFFER); WARN_ON_ONCE(comp_read > CQE_POLLING_BUFFER); + apc->eth_stats.rx_cqes = comp_read; rxq->xdp_flush = false; for (i = 0; i < comp_read; i++) { @@ -1357,6 +1646,8 @@ static void mana_poll_rx_cq(struct mana_cq *cq) return; mana_process_rx_cqe(rxq, cq, &comp[i]); + + apc->eth_stats.rx_cqes--; } if (rxq->xdp_flush) @@ -1603,8 +1894,8 @@ static void mana_destroy_rxq(struct mana_port_context *apc, mana_deinit_cq(apc, &rxq->rx_cq); - if (rxq->xdp_save_page) - __free_page(rxq->xdp_save_page); + if (rxq->xdp_save_va) + put_page(virt_to_head_page(rxq->xdp_save_va)); for (i = 0; i < rxq->num_rx_buf; i++) { rx_oob = &rxq->rx_oobs[i]; @@ -1612,10 +1903,10 @@ static void mana_destroy_rxq(struct mana_port_context *apc, if (!rx_oob->buf_va) continue; - dma_unmap_page(dev, rx_oob->buf_dma_addr, rxq->datasize, - DMA_FROM_DEVICE); + dma_unmap_single(dev, rx_oob->sgl[0].address, + 
rx_oob->sgl[0].size, DMA_FROM_DEVICE); - free_page((unsigned long)rx_oob->buf_va); + put_page(virt_to_head_page(rx_oob->buf_va)); rx_oob->buf_va = NULL; } @@ -1625,6 +1916,30 @@ static void mana_destroy_rxq(struct mana_port_context *apc, kfree(rxq); } +static int mana_fill_rx_oob(struct mana_recv_buf_oob *rx_oob, u32 mem_key, + struct mana_rxq *rxq, struct device *dev) +{ + struct mana_port_context *mpc = netdev_priv(rxq->ndev); + dma_addr_t da; + void *va; + + if (mpc->rxbufs_pre) + va = mana_get_rxbuf_pre(rxq, &da); + else + va = mana_get_rxfrag(rxq, dev, &da, false); + + if (!va) + return -ENOMEM; + + rx_oob->buf_va = va; + + rx_oob->sgl[0].address = da; + rx_oob->sgl[0].size = rxq->datasize; + rx_oob->sgl[0].mem_key = mem_key; + + return 0; +} + #define MANA_WQE_HEADER_SIZE 16 #define MANA_WQE_SGE_SIZE 16 @@ -1634,11 +1949,10 @@ static int mana_alloc_rx_wqe(struct mana_port_context *apc, struct gdma_context *gc = apc->ac->gdma_dev->gdma_context; struct mana_recv_buf_oob *rx_oob; struct device *dev = gc->dev; - struct page *page; - dma_addr_t da; u32 buf_idx; + int ret; - WARN_ON(rxq->datasize == 0 || rxq->datasize > PAGE_SIZE); + WARN_ON(rxq->datasize == 0); *rxq_size = 0; *cq_size = 0; @@ -1647,25 +1961,12 @@ static int mana_alloc_rx_wqe(struct mana_port_context *apc, rx_oob = &rxq->rx_oobs[buf_idx]; memset(rx_oob, 0, sizeof(*rx_oob)); - page = alloc_page(GFP_KERNEL); - if (!page) - return -ENOMEM; - - da = dma_map_page(dev, page, XDP_PACKET_HEADROOM, rxq->datasize, - DMA_FROM_DEVICE); - - if (dma_mapping_error(dev, da)) { - __free_page(page); - return -ENOMEM; - } - - rx_oob->buf_va = page_to_virt(page); - rx_oob->buf_dma_addr = da; - rx_oob->num_sge = 1; - rx_oob->sgl[0].address = rx_oob->buf_dma_addr; - rx_oob->sgl[0].size = rxq->datasize; - rx_oob->sgl[0].mem_key = apc->ac->gdma_dev->gpa_mkey; + + ret = mana_fill_rx_oob(rx_oob, apc->ac->gdma_dev->gpa_mkey, rxq, + dev); + if (ret) + return ret; rx_oob->wqe_req.sgl = rx_oob->sgl; rx_oob->wqe_req.num_sge = 
rx_oob->num_sge; @@ -1724,9 +2025,11 @@ static struct mana_rxq *mana_create_rxq(struct mana_port_context *apc, rxq->ndev = ndev; rxq->num_rx_buf = RX_BUFFERS_PER_QUEUE; rxq->rxq_idx = rxq_idx; - rxq->datasize = ALIGN(MAX_FRAME_SIZE, 64); rxq->rxobj = INVALID_MANA_HANDLE; + mana_get_rxbuf_cfg(ndev->mtu, &rxq->datasize, &rxq->alloc_size, + &rxq->headroom); + err = mana_alloc_rx_wqe(apc, rxq, &rq_size, &cq_size); if (err) goto out; @@ -2138,8 +2441,8 @@ static int mana_probe_port(struct mana_context *ac, int port_idx, ndev->netdev_ops = &mana_devops; ndev->ethtool_ops = &mana_ethtool_ops; ndev->mtu = ETH_DATA_LEN; - ndev->max_mtu = ndev->mtu; - ndev->min_mtu = ndev->mtu; + ndev->max_mtu = gc->adapter_mtu - ETH_HLEN; + ndev->min_mtu = ETH_MIN_MTU; ndev->needed_headroom = MANA_HEADROOM; ndev->dev_port = port_idx; SET_NETDEV_DEV(ndev, gc->dev); diff --git a/drivers/net/ethernet/microsoft/mana/mana_ethtool.c b/drivers/net/ethernet/microsoft/mana/mana_ethtool.c index 5b776a33a817..a64c81410dc1 100644 --- a/drivers/net/ethernet/microsoft/mana/mana_ethtool.c +++ b/drivers/net/ethernet/microsoft/mana/mana_ethtool.c @@ -13,6 +13,15 @@ static const struct { } mana_eth_stats[] = { {"stop_queue", offsetof(struct mana_ethtool_stats, stop_queue)}, {"wake_queue", offsetof(struct mana_ethtool_stats, wake_queue)}, + {"tx_cqes", offsetof(struct mana_ethtool_stats, tx_cqes)}, + {"tx_cq_err", offsetof(struct mana_ethtool_stats, tx_cqe_err)}, + {"tx_cqe_unknown_type", offsetof(struct mana_ethtool_stats, + tx_cqe_unknown_type)}, + {"rx_cqes", offsetof(struct mana_ethtool_stats, rx_cqes)}, + {"rx_coalesced_err", offsetof(struct mana_ethtool_stats, + rx_coalesced_err)}, + {"rx_cqe_unknown_type", offsetof(struct mana_ethtool_stats, + rx_cqe_unknown_type)}, }; static int mana_get_sset_count(struct net_device *ndev, int stringset) @@ -23,7 +32,8 @@ static int mana_get_sset_count(struct net_device *ndev, int stringset) if (stringset != ETH_SS_STATS) return -EINVAL; - return 
ARRAY_SIZE(mana_eth_stats) + num_queues * 8; + return ARRAY_SIZE(mana_eth_stats) + num_queues * + (MANA_STATS_RX_COUNT + MANA_STATS_TX_COUNT); } static void mana_get_strings(struct net_device *ndev, u32 stringset, u8 *data) @@ -61,6 +71,22 @@ static void mana_get_strings(struct net_device *ndev, u32 stringset, u8 *data) p += ETH_GSTRING_LEN; sprintf(p, "tx_%d_xdp_xmit", i); p += ETH_GSTRING_LEN; + sprintf(p, "tx_%d_tso_packets", i); + p += ETH_GSTRING_LEN; + sprintf(p, "tx_%d_tso_bytes", i); + p += ETH_GSTRING_LEN; + sprintf(p, "tx_%d_tso_inner_packets", i); + p += ETH_GSTRING_LEN; + sprintf(p, "tx_%d_tso_inner_bytes", i); + p += ETH_GSTRING_LEN; + sprintf(p, "tx_%d_long_pkt_fmt", i); + p += ETH_GSTRING_LEN; + sprintf(p, "tx_%d_short_pkt_fmt", i); + p += ETH_GSTRING_LEN; + sprintf(p, "tx_%d_csum_partial", i); + p += ETH_GSTRING_LEN; + sprintf(p, "tx_%d_mana_map_err", i); + p += ETH_GSTRING_LEN; } } @@ -78,6 +104,14 @@ static void mana_get_ethtool_stats(struct net_device *ndev, u64 xdp_xmit; u64 xdp_drop; u64 xdp_tx; + u64 tso_packets; + u64 tso_bytes; + u64 tso_inner_packets; + u64 tso_inner_bytes; + u64 long_pkt_fmt; + u64 short_pkt_fmt; + u64 csum_partial; + u64 mana_map_err; int q, i = 0; if (!apc->port_is_up) @@ -113,11 +147,27 @@ static void mana_get_ethtool_stats(struct net_device *ndev, packets = tx_stats->packets; bytes = tx_stats->bytes; xdp_xmit = tx_stats->xdp_xmit; + tso_packets = tx_stats->tso_packets; + tso_bytes = tx_stats->tso_bytes; + tso_inner_packets = tx_stats->tso_inner_packets; + tso_inner_bytes = tx_stats->tso_inner_bytes; + long_pkt_fmt = tx_stats->long_pkt_fmt; + short_pkt_fmt = tx_stats->short_pkt_fmt; + csum_partial = tx_stats->csum_partial; + mana_map_err = tx_stats->mana_map_err; } while (u64_stats_fetch_retry(&tx_stats->syncp, start)); data[i++] = packets; data[i++] = bytes; data[i++] = xdp_xmit; + data[i++] = tso_packets; + data[i++] = tso_bytes; + data[i++] = tso_inner_packets; + data[i++] = tso_inner_bytes; + data[i++] = 
long_pkt_fmt; + data[i++] = short_pkt_fmt; + data[i++] = csum_partial; + data[i++] = mana_map_err; } } diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c index 08acb7b89086..1f5f00b30441 100644 --- a/drivers/net/ethernet/mscc/ocelot.c +++ b/drivers/net/ethernet/mscc/ocelot.c @@ -7,6 +7,9 @@ #include <linux/dsa/ocelot.h> #include <linux/if_bridge.h> #include <linux/iopoll.h> +#include <linux/phy/phy.h> +#include <net/pkt_sched.h> +#include <soc/mscc/ocelot_hsio.h> #include <soc/mscc/ocelot_vcap.h> #include "ocelot.h" #include "ocelot_vcap.h" @@ -211,6 +214,36 @@ static void ocelot_mact_init(struct ocelot *ocelot) ocelot_write(ocelot, MACACCESS_CMD_INIT, ANA_TABLES_MACACCESS); } +void ocelot_pll5_init(struct ocelot *ocelot) +{ + /* Configure PLL5. This will need a proper CCF driver + * The values are coming from the VTSS API for Ocelot + */ + regmap_write(ocelot->targets[HSIO], HSIO_PLL5G_CFG4, + HSIO_PLL5G_CFG4_IB_CTRL(0x7600) | + HSIO_PLL5G_CFG4_IB_BIAS_CTRL(0x8)); + regmap_write(ocelot->targets[HSIO], HSIO_PLL5G_CFG0, + HSIO_PLL5G_CFG0_CORE_CLK_DIV(0x11) | + HSIO_PLL5G_CFG0_CPU_CLK_DIV(2) | + HSIO_PLL5G_CFG0_ENA_BIAS | + HSIO_PLL5G_CFG0_ENA_VCO_BUF | + HSIO_PLL5G_CFG0_ENA_CP1 | + HSIO_PLL5G_CFG0_SELCPI(2) | + HSIO_PLL5G_CFG0_LOOP_BW_RES(0xe) | + HSIO_PLL5G_CFG0_SELBGV820(4) | + HSIO_PLL5G_CFG0_DIV4 | + HSIO_PLL5G_CFG0_ENA_CLKTREE | + HSIO_PLL5G_CFG0_ENA_LANE); + regmap_write(ocelot->targets[HSIO], HSIO_PLL5G_CFG2, + HSIO_PLL5G_CFG2_EN_RESET_FRQ_DET | + HSIO_PLL5G_CFG2_EN_RESET_OVERRUN | + HSIO_PLL5G_CFG2_GAIN_TEST(0x8) | + HSIO_PLL5G_CFG2_ENA_AMPCTRL | + HSIO_PLL5G_CFG2_PWD_AMPCTRL_N | + HSIO_PLL5G_CFG2_AMPC_SEL(0x10)); +} +EXPORT_SYMBOL(ocelot_pll5_init); + static void ocelot_vcap_enable(struct ocelot *ocelot, int port) { ocelot_write_gix(ocelot, ANA_PORT_VCAP_S2_CFG_S2_ENA | @@ -778,6 +811,71 @@ static int ocelot_port_flush(struct ocelot *ocelot, int port) return err; } +int ocelot_port_configure_serdes(struct ocelot *ocelot, 
int port, + struct device_node *portnp) +{ + struct ocelot_port *ocelot_port = ocelot->ports[port]; + struct device *dev = ocelot->dev; + int err; + + /* Ensure clock signals and speed are set on all QSGMII links */ + if (ocelot_port->phy_mode == PHY_INTERFACE_MODE_QSGMII) + ocelot_port_rmwl(ocelot_port, 0, + DEV_CLOCK_CFG_MAC_TX_RST | + DEV_CLOCK_CFG_MAC_RX_RST, + DEV_CLOCK_CFG); + + if (ocelot_port->phy_mode != PHY_INTERFACE_MODE_INTERNAL) { + struct phy *serdes = of_phy_get(portnp, NULL); + + if (IS_ERR(serdes)) { + err = PTR_ERR(serdes); + dev_err_probe(dev, err, + "missing SerDes phys for port %d\n", + port); + return err; + } + + err = phy_set_mode_ext(serdes, PHY_MODE_ETHERNET, + ocelot_port->phy_mode); + of_phy_put(serdes); + if (err) { + dev_err(dev, "Could not SerDes mode on port %d: %pe\n", + port, ERR_PTR(err)); + return err; + } + } + + return 0; +} +EXPORT_SYMBOL_GPL(ocelot_port_configure_serdes); + +void ocelot_phylink_mac_config(struct ocelot *ocelot, int port, + unsigned int link_an_mode, + const struct phylink_link_state *state) +{ + struct ocelot_port *ocelot_port = ocelot->ports[port]; + + /* Disable HDX fast control */ + ocelot_port_writel(ocelot_port, DEV_PORT_MISC_HDX_FAST_DIS, + DEV_PORT_MISC); + + /* SGMII only for now */ + ocelot_port_writel(ocelot_port, PCS1G_MODE_CFG_SGMII_MODE_ENA, + PCS1G_MODE_CFG); + ocelot_port_writel(ocelot_port, PCS1G_SD_CFG_SD_SEL, PCS1G_SD_CFG); + + /* Enable PCS */ + ocelot_port_writel(ocelot_port, PCS1G_CFG_PCS_ENA, PCS1G_CFG); + + /* No aneg on SGMII */ + ocelot_port_writel(ocelot_port, 0, PCS1G_ANEG_CFG); + + /* No loopback */ + ocelot_port_writel(ocelot_port, 0, PCS1G_LB_CFG); +} +EXPORT_SYMBOL_GPL(ocelot_phylink_mac_config); + void ocelot_phylink_mac_link_down(struct ocelot *ocelot, int port, unsigned int link_an_mode, phy_interface_t interface, @@ -908,7 +1006,12 @@ void ocelot_phylink_mac_link_up(struct ocelot *ocelot, int port, */ if (ocelot->ops->cut_through_fwd) { mutex_lock(&ocelot->fwd_domain_lock); 
- ocelot->ops->cut_through_fwd(ocelot); + /* Workaround for hardware bug - FP doesn't work + * at all link speeds for all PHY modes. The function + * below also calls ocelot->ops->cut_through_fwd(), + * so we don't need to do it twice. + */ + ocelot_port_update_active_preemptible_tcs(ocelot, port); mutex_unlock(&ocelot->fwd_domain_lock); } @@ -2602,6 +2705,58 @@ void ocelot_port_mirror_del(struct ocelot *ocelot, int from, bool ingress) } EXPORT_SYMBOL_GPL(ocelot_port_mirror_del); +static void ocelot_port_reset_mqprio(struct ocelot *ocelot, int port) +{ + struct net_device *dev = ocelot->ops->port_to_netdev(ocelot, port); + + netdev_reset_tc(dev); + ocelot_port_change_fp(ocelot, port, 0); +} + +int ocelot_port_mqprio(struct ocelot *ocelot, int port, + struct tc_mqprio_qopt_offload *mqprio) +{ + struct net_device *dev = ocelot->ops->port_to_netdev(ocelot, port); + struct netlink_ext_ack *extack = mqprio->extack; + struct tc_mqprio_qopt *qopt = &mqprio->qopt; + int num_tc = qopt->num_tc; + int tc, err; + + if (!num_tc) { + ocelot_port_reset_mqprio(ocelot, port); + return 0; + } + + err = netdev_set_num_tc(dev, num_tc); + if (err) + return err; + + for (tc = 0; tc < num_tc; tc++) { + if (qopt->count[tc] != 1) { + NL_SET_ERR_MSG_MOD(extack, + "Only one TXQ per TC supported"); + return -EINVAL; + } + + err = netdev_set_tc_queue(dev, tc, 1, qopt->offset[tc]); + if (err) + goto err_reset_tc; + } + + err = netif_set_real_num_tx_queues(dev, num_tc); + if (err) + goto err_reset_tc; + + ocelot_port_change_fp(ocelot, port, mqprio->preemptible_tcs); + + return 0; + +err_reset_tc: + ocelot_port_reset_mqprio(ocelot, port); + return err; +} +EXPORT_SYMBOL_GPL(ocelot_port_mqprio); + void ocelot_init_port(struct ocelot *ocelot, int port) { struct ocelot_port *ocelot_port = ocelot->ports[port]; diff --git a/drivers/net/ethernet/mscc/ocelot.h b/drivers/net/ethernet/mscc/ocelot.h index e9a0179448bf..87f2055c242c 100644 --- a/drivers/net/ethernet/mscc/ocelot.h +++ 
b/drivers/net/ethernet/mscc/ocelot.h @@ -74,6 +74,15 @@ struct ocelot_multicast { struct ocelot_pgid *pgid; }; +static inline void ocelot_reg_to_target_addr(struct ocelot *ocelot, + enum ocelot_reg reg, + enum ocelot_target *target, + u32 *addr) +{ + *target = reg >> TARGET_OFFSET; + *addr = ocelot->map[*target][reg & REG_MASK]; +} + int ocelot_bridge_num_find(struct ocelot *ocelot, const struct net_device *bridge); @@ -85,9 +94,6 @@ int ocelot_mact_forget(struct ocelot *ocelot, struct net_device *ocelot_port_to_netdev(struct ocelot *ocelot, int port); int ocelot_netdev_to_port(struct net_device *dev); -u32 ocelot_port_readl(struct ocelot_port *port, u32 reg); -void ocelot_port_writel(struct ocelot_port *port, u32 val, u32 reg); - int ocelot_probe_port(struct ocelot *ocelot, int port, struct regmap *target, struct device_node *portnp); void ocelot_release_port(struct ocelot_port *ocelot_port); @@ -110,6 +116,9 @@ int ocelot_stats_init(struct ocelot *ocelot); void ocelot_stats_deinit(struct ocelot *ocelot); int ocelot_mm_init(struct ocelot *ocelot); +void ocelot_port_change_fp(struct ocelot *ocelot, int port, + unsigned long preemptible_tcs); +void ocelot_port_update_active_preemptible_tcs(struct ocelot *ocelot, int port); extern struct notifier_block ocelot_netdevice_nb; extern struct notifier_block ocelot_switchdev_nb; diff --git a/drivers/net/ethernet/mscc/ocelot_io.c b/drivers/net/ethernet/mscc/ocelot_io.c index 2067382d0ee1..3aa7dc29ebe1 100644 --- a/drivers/net/ethernet/mscc/ocelot_io.c +++ b/drivers/net/ethernet/mscc/ocelot_io.c @@ -10,57 +10,60 @@ #include "ocelot.h" -int __ocelot_bulk_read_ix(struct ocelot *ocelot, u32 reg, u32 offset, void *buf, - int count) +int __ocelot_bulk_read_ix(struct ocelot *ocelot, enum ocelot_reg reg, + u32 offset, void *buf, int count) { - u16 target = reg >> TARGET_OFFSET; + enum ocelot_target target; + u32 addr; + ocelot_reg_to_target_addr(ocelot, reg, &target, &addr); WARN_ON(!target); - return 
regmap_bulk_read(ocelot->targets[target], - ocelot->map[target][reg & REG_MASK] + offset, + return regmap_bulk_read(ocelot->targets[target], addr + offset, buf, count); } EXPORT_SYMBOL_GPL(__ocelot_bulk_read_ix); -u32 __ocelot_read_ix(struct ocelot *ocelot, u32 reg, u32 offset) +u32 __ocelot_read_ix(struct ocelot *ocelot, enum ocelot_reg reg, u32 offset) { - u16 target = reg >> TARGET_OFFSET; - u32 val; + enum ocelot_target target; + u32 addr, val; + ocelot_reg_to_target_addr(ocelot, reg, &target, &addr); WARN_ON(!target); - regmap_read(ocelot->targets[target], - ocelot->map[target][reg & REG_MASK] + offset, &val); + regmap_read(ocelot->targets[target], addr + offset, &val); return val; } EXPORT_SYMBOL_GPL(__ocelot_read_ix); -void __ocelot_write_ix(struct ocelot *ocelot, u32 val, u32 reg, u32 offset) +void __ocelot_write_ix(struct ocelot *ocelot, u32 val, enum ocelot_reg reg, + u32 offset) { - u16 target = reg >> TARGET_OFFSET; + enum ocelot_target target; + u32 addr; + ocelot_reg_to_target_addr(ocelot, reg, &target, &addr); WARN_ON(!target); - regmap_write(ocelot->targets[target], - ocelot->map[target][reg & REG_MASK] + offset, val); + regmap_write(ocelot->targets[target], addr + offset, val); } EXPORT_SYMBOL_GPL(__ocelot_write_ix); -void __ocelot_rmw_ix(struct ocelot *ocelot, u32 val, u32 mask, u32 reg, - u32 offset) +void __ocelot_rmw_ix(struct ocelot *ocelot, u32 val, u32 mask, + enum ocelot_reg reg, u32 offset) { - u16 target = reg >> TARGET_OFFSET; + enum ocelot_target target; + u32 addr; + ocelot_reg_to_target_addr(ocelot, reg, &target, &addr); WARN_ON(!target); - regmap_update_bits(ocelot->targets[target], - ocelot->map[target][reg & REG_MASK] + offset, - mask, val); + regmap_update_bits(ocelot->targets[target], addr + offset, mask, val); } EXPORT_SYMBOL_GPL(__ocelot_rmw_ix); -u32 ocelot_port_readl(struct ocelot_port *port, u32 reg) +u32 ocelot_port_readl(struct ocelot_port *port, enum ocelot_reg reg) { struct ocelot *ocelot = port->ocelot; u16 target = reg 
>> TARGET_OFFSET; @@ -73,7 +76,7 @@ u32 ocelot_port_readl(struct ocelot_port *port, u32 reg) } EXPORT_SYMBOL_GPL(ocelot_port_readl); -void ocelot_port_writel(struct ocelot_port *port, u32 val, u32 reg) +void ocelot_port_writel(struct ocelot_port *port, u32 val, enum ocelot_reg reg) { struct ocelot *ocelot = port->ocelot; u16 target = reg >> TARGET_OFFSET; @@ -84,7 +87,8 @@ void ocelot_port_writel(struct ocelot_port *port, u32 val, u32 reg) } EXPORT_SYMBOL_GPL(ocelot_port_writel); -void ocelot_port_rmwl(struct ocelot_port *port, u32 val, u32 mask, u32 reg) +void ocelot_port_rmwl(struct ocelot_port *port, u32 val, u32 mask, + enum ocelot_reg reg) { u32 cur = ocelot_port_readl(port, reg); diff --git a/drivers/net/ethernet/mscc/ocelot_mm.c b/drivers/net/ethernet/mscc/ocelot_mm.c index 0a8f21ae23f0..fb3145118d68 100644 --- a/drivers/net/ethernet/mscc/ocelot_mm.c +++ b/drivers/net/ethernet/mscc/ocelot_mm.c @@ -49,14 +49,68 @@ static enum ethtool_mm_verify_status ocelot_mm_verify_status(u32 val) } } -void ocelot_port_mm_irq(struct ocelot *ocelot, int port) +void ocelot_port_update_active_preemptible_tcs(struct ocelot *ocelot, int port) +{ + struct ocelot_port *ocelot_port = ocelot->ports[port]; + struct ocelot_mm_state *mm = &ocelot->mm[port]; + u32 val = 0; + + lockdep_assert_held(&ocelot->fwd_domain_lock); + + /* Only commit preemptible TCs when MAC Merge is active. + * On NXP LS1028A, when using QSGMII, the port hangs if transmitting + * preemptible frames at any other link speed than gigabit, so avoid + * preemption at lower speeds in this PHY mode. + */ + if ((ocelot_port->phy_mode != PHY_INTERFACE_MODE_QSGMII || + ocelot_port->speed == SPEED_1000) && mm->tx_active) + val = mm->preemptible_tcs; + + /* Cut through switching doesn't work for preemptible priorities, + * so first make sure it is disabled. 
+ */ + mm->active_preemptible_tcs = val; + ocelot->ops->cut_through_fwd(ocelot); + + dev_dbg(ocelot->dev, + "port %d %s/%s, MM TX %s, preemptible TCs 0x%x, active 0x%x\n", + port, phy_modes(ocelot_port->phy_mode), + phy_speed_to_str(ocelot_port->speed), + mm->tx_active ? "active" : "inactive", mm->preemptible_tcs, + mm->active_preemptible_tcs); + + ocelot_rmw_rix(ocelot, QSYS_PREEMPTION_CFG_P_QUEUES(val), + QSYS_PREEMPTION_CFG_P_QUEUES_M, + QSYS_PREEMPTION_CFG, port); +} + +void ocelot_port_change_fp(struct ocelot *ocelot, int port, + unsigned long preemptible_tcs) +{ + struct ocelot_mm_state *mm = &ocelot->mm[port]; + + mutex_lock(&ocelot->fwd_domain_lock); + + if (mm->preemptible_tcs == preemptible_tcs) + goto out_unlock; + + mm->preemptible_tcs = preemptible_tcs; + + ocelot_port_update_active_preemptible_tcs(ocelot, port); + +out_unlock: + mutex_unlock(&ocelot->fwd_domain_lock); +} + +static void ocelot_mm_update_port_status(struct ocelot *ocelot, int port) { struct ocelot_port *ocelot_port = ocelot->ports[port]; struct ocelot_mm_state *mm = &ocelot->mm[port]; enum ethtool_mm_verify_status verify_status; - u32 val; + u32 val, ack = 0; - mutex_lock(&mm->lock); + if (!mm->tx_enabled) + return; val = ocelot_port_readl(ocelot_port, DEV_MM_STATUS); @@ -73,25 +127,43 @@ void ocelot_port_mm_irq(struct ocelot *ocelot, int port) dev_dbg(ocelot->dev, "Port %d TX preemption %s\n", port, mm->tx_active ? 
"active" : "inactive"); + ocelot_port_update_active_preemptible_tcs(ocelot, port); + + ack |= DEV_MM_STAT_MM_STATUS_PRMPT_ACTIVE_STICKY; } if (val & DEV_MM_STAT_MM_STATUS_UNEXP_RX_PFRM_STICKY) { dev_err(ocelot->dev, "Unexpected P-frame received on port %d while verification was unsuccessful or not yet verified\n", port); + + ack |= DEV_MM_STAT_MM_STATUS_UNEXP_RX_PFRM_STICKY; } if (val & DEV_MM_STAT_MM_STATUS_UNEXP_TX_PFRM_STICKY) { dev_err(ocelot->dev, "Unexpected P-frame requested to be transmitted on port %d while verification was unsuccessful or not yet verified, or MM_TX_ENA=0\n", port); + + ack |= DEV_MM_STAT_MM_STATUS_UNEXP_TX_PFRM_STICKY; } - ocelot_port_writel(ocelot_port, val, DEV_MM_STATUS); + if (ack) + ocelot_port_writel(ocelot_port, ack, DEV_MM_STATUS); +} - mutex_unlock(&mm->lock); +void ocelot_mm_irq(struct ocelot *ocelot) +{ + int port; + + mutex_lock(&ocelot->fwd_domain_lock); + + for (port = 0; port < ocelot->num_phys_ports; port++) + ocelot_mm_update_port_status(ocelot, port); + + mutex_unlock(&ocelot->fwd_domain_lock); } -EXPORT_SYMBOL_GPL(ocelot_port_mm_irq); +EXPORT_SYMBOL_GPL(ocelot_mm_irq); int ocelot_port_set_mm(struct ocelot *ocelot, int port, struct ethtool_mm_cfg *cfg, @@ -121,7 +193,7 @@ int ocelot_port_set_mm(struct ocelot *ocelot, int port, if (!cfg->verify_enabled) verify_disable = DEV_MM_CONFIG_VERIF_CONFIG_PRM_VERIFY_DIS; - mutex_lock(&mm->lock); + mutex_lock(&ocelot->fwd_domain_lock); ocelot_port_rmwl(ocelot_port, mm_enable, DEV_MM_CONFIG_ENABLE_CONFIG_MM_TX_ENA | @@ -140,7 +212,20 @@ int ocelot_port_set_mm(struct ocelot *ocelot, int port, QSYS_PREEMPTION_CFG, port); - mutex_unlock(&mm->lock); + /* The switch will emit an IRQ when TX is disabled, to notify that it + * has become inactive. We optimize ocelot_mm_update_port_status() to + * not bother processing MM IRQs at all for ports with TX disabled, + * but we need to ACK this IRQ now, while mm->tx_enabled is still set, + * otherwise we get an IRQ storm. 
+ */ + if (mm->tx_enabled && !cfg->tx_enabled) { + ocelot_mm_update_port_status(ocelot, port); + WARN_ON(mm->tx_active); + } + + mm->tx_enabled = cfg->tx_enabled; + + mutex_unlock(&ocelot->fwd_domain_lock); return 0; } @@ -158,7 +243,7 @@ int ocelot_port_get_mm(struct ocelot *ocelot, int port, mm = &ocelot->mm[port]; - mutex_lock(&mm->lock); + mutex_lock(&ocelot->fwd_domain_lock); val = ocelot_port_readl(ocelot_port, DEV_MM_ENABLE_CONFIG); state->pmac_enabled = !!(val & DEV_MM_CONFIG_ENABLE_CONFIG_MM_RX_ENA); @@ -174,10 +259,11 @@ int ocelot_port_get_mm(struct ocelot *ocelot, int port, state->tx_min_frag_size = ethtool_mm_frag_size_add_to_min(add_frag_size); state->rx_min_frag_size = ETH_ZLEN; + ocelot_mm_update_port_status(ocelot, port); state->verify_status = mm->verify_status; state->tx_active = mm->tx_active; - mutex_unlock(&mm->lock); + mutex_unlock(&ocelot->fwd_domain_lock); return 0; } @@ -201,7 +287,6 @@ int ocelot_mm_init(struct ocelot *ocelot) u32 val; mm = &ocelot->mm[port]; - mutex_init(&mm->lock); ocelot_port = ocelot->ports[port]; /* Update initial status variable for the diff --git a/drivers/net/ethernet/mscc/ocelot_net.c b/drivers/net/ethernet/mscc/ocelot_net.c index ca4bde861397..21a87a3fc556 100644 --- a/drivers/net/ethernet/mscc/ocelot_net.c +++ b/drivers/net/ethernet/mscc/ocelot_net.c @@ -1675,25 +1675,10 @@ static void vsc7514_phylink_mac_config(struct phylink_config *config, { struct net_device *ndev = to_net_dev(config->dev); struct ocelot_port_private *priv = netdev_priv(ndev); - struct ocelot_port *ocelot_port = &priv->port; - - /* Disable HDX fast control */ - ocelot_port_writel(ocelot_port, DEV_PORT_MISC_HDX_FAST_DIS, - DEV_PORT_MISC); - - /* SGMII only for now */ - ocelot_port_writel(ocelot_port, PCS1G_MODE_CFG_SGMII_MODE_ENA, - PCS1G_MODE_CFG); - ocelot_port_writel(ocelot_port, PCS1G_SD_CFG_SD_SEL, PCS1G_SD_CFG); - - /* Enable PCS */ - ocelot_port_writel(ocelot_port, PCS1G_CFG_PCS_ENA, PCS1G_CFG); - - /* No aneg on SGMII */ - 
ocelot_port_writel(ocelot_port, 0, PCS1G_ANEG_CFG); + struct ocelot *ocelot = priv->port.ocelot; + int port = priv->port.index; - /* No loopback */ - ocelot_port_writel(ocelot_port, 0, PCS1G_LB_CFG); + ocelot_phylink_mac_config(ocelot, port, link_an_mode, state); } static void vsc7514_phylink_mac_link_down(struct phylink_config *config, @@ -1757,34 +1742,11 @@ static int ocelot_port_phylink_create(struct ocelot *ocelot, int port, return -EINVAL; } - /* Ensure clock signals and speed are set on all QSGMII links */ - if (phy_mode == PHY_INTERFACE_MODE_QSGMII) - ocelot_port_rmwl(ocelot_port, 0, - DEV_CLOCK_CFG_MAC_TX_RST | - DEV_CLOCK_CFG_MAC_RX_RST, - DEV_CLOCK_CFG); - ocelot_port->phy_mode = phy_mode; - if (phy_mode != PHY_INTERFACE_MODE_INTERNAL) { - struct phy *serdes = of_phy_get(portnp, NULL); - - if (IS_ERR(serdes)) { - err = PTR_ERR(serdes); - dev_err_probe(dev, err, - "missing SerDes phys for port %d\n", - port); - return err; - } - - err = phy_set_mode_ext(serdes, PHY_MODE_ETHERNET, phy_mode); - of_phy_put(serdes); - if (err) { - dev_err(dev, "Could not SerDes mode on port %d: %pe\n", - port, ERR_PTR(err)); - return err; - } - } + err = ocelot_port_configure_serdes(ocelot, port, portnp); + if (err) + return err; priv = container_of(ocelot_port, struct ocelot_port_private, port); diff --git a/drivers/net/ethernet/mscc/ocelot_stats.c b/drivers/net/ethernet/mscc/ocelot_stats.c index d0e6cd8dbe5c..5c55197c7327 100644 --- a/drivers/net/ethernet/mscc/ocelot_stats.c +++ b/drivers/net/ethernet/mscc/ocelot_stats.c @@ -145,7 +145,7 @@ enum ocelot_stat { }; struct ocelot_stat_layout { - u32 reg; + enum ocelot_reg reg; char name[ETH_GSTRING_LEN]; }; @@ -257,7 +257,7 @@ struct ocelot_stat_layout { struct ocelot_stats_region { struct list_head node; - u32 base; + enum ocelot_reg base; enum ocelot_stat first_stat; int count; u32 *buf; @@ -395,7 +395,7 @@ static void ocelot_check_stats_work(struct work_struct *work) void ocelot_get_strings(struct ocelot *ocelot, int port, 
u32 sset, u8 *data) { const struct ocelot_stat_layout *layout; - int i; + enum ocelot_stat i; if (sset != ETH_SS_STATS) return; @@ -442,7 +442,8 @@ out_unlock: int ocelot_get_sset_count(struct ocelot *ocelot, int port, int sset) { const struct ocelot_stat_layout *layout; - int i, num_stats = 0; + enum ocelot_stat i; + int num_stats = 0; if (sset != ETH_SS_STATS) return -EOPNOTSUPP; @@ -461,8 +462,8 @@ static void ocelot_port_ethtool_stats_cb(struct ocelot *ocelot, int port, void *priv) { const struct ocelot_stat_layout *layout; + enum ocelot_stat i; u64 *data = priv; - int i; layout = ocelot_get_stats_layout(ocelot); @@ -889,8 +890,8 @@ static int ocelot_prepare_stats_regions(struct ocelot *ocelot) { struct ocelot_stats_region *region = NULL; const struct ocelot_stat_layout *layout; - unsigned int last = 0; - int i; + enum ocelot_reg last = 0; + enum ocelot_stat i; INIT_LIST_HEAD(&ocelot->stats_regions); @@ -900,6 +901,17 @@ static int ocelot_prepare_stats_regions(struct ocelot *ocelot) if (!layout[i].reg) continue; + /* enum ocelot_stat must be kept sorted in the same order + * as the addresses behind layout[i].reg in order to have + * efficient bulking + */ + if (last) { + WARN(ocelot->map[SYS][last & REG_MASK] >= ocelot->map[SYS][layout[i].reg & REG_MASK], + "reg 0x%x had address 0x%x but reg 0x%x has address 0x%x, bulking broken!", + last, ocelot->map[SYS][last & REG_MASK], + layout[i].reg, ocelot->map[SYS][layout[i].reg & REG_MASK]); + } + if (region && ocelot->map[SYS][layout[i].reg & REG_MASK] == ocelot->map[SYS][last & REG_MASK] + 4) { region->count++; @@ -909,12 +921,6 @@ static int ocelot_prepare_stats_regions(struct ocelot *ocelot) if (!region) return -ENOMEM; - /* enum ocelot_stat must be kept sorted in the same - * order as layout[i].reg in order to have efficient - * bulking - */ - WARN_ON(last >= layout[i].reg); - region->base = layout[i].reg; region->first_stat = i; region->count = 1; @@ -925,6 +931,15 @@ static int 
ocelot_prepare_stats_regions(struct ocelot *ocelot) } list_for_each_entry(region, &ocelot->stats_regions, node) { + enum ocelot_target target; + u32 addr; + + ocelot_reg_to_target_addr(ocelot, region->base, &target, + &addr); + + dev_dbg(ocelot->dev, + "region of %d contiguous counters starting with SYS:STAT:CNT[0x%03x]\n", + region->count, addr / 4); region->buf = devm_kcalloc(ocelot->dev, region->count, sizeof(*region->buf), GFP_KERNEL); if (!region->buf) @@ -972,4 +987,3 @@ void ocelot_stats_deinit(struct ocelot *ocelot) cancel_delayed_work(&ocelot->stats_work); destroy_workqueue(ocelot->stats_queue); } - diff --git a/drivers/net/ethernet/mscc/ocelot_vsc7514.c b/drivers/net/ethernet/mscc/ocelot_vsc7514.c index 7388c3b0535c..97e90e2869d4 100644 --- a/drivers/net/ethernet/mscc/ocelot_vsc7514.c +++ b/drivers/net/ethernet/mscc/ocelot_vsc7514.c @@ -18,7 +18,6 @@ #include <soc/mscc/ocelot.h> #include <soc/mscc/ocelot_vcap.h> -#include <soc/mscc/ocelot_hsio.h> #include <soc/mscc/vsc7514_regs.h> #include "ocelot_fdma.h" #include "ocelot.h" @@ -26,35 +25,6 @@ #define VSC7514_VCAP_POLICER_BASE 128 #define VSC7514_VCAP_POLICER_MAX 191 -static void ocelot_pll5_init(struct ocelot *ocelot) -{ - /* Configure PLL5. 
This will need a proper CCF driver - * The values are coming from the VTSS API for Ocelot - */ - regmap_write(ocelot->targets[HSIO], HSIO_PLL5G_CFG4, - HSIO_PLL5G_CFG4_IB_CTRL(0x7600) | - HSIO_PLL5G_CFG4_IB_BIAS_CTRL(0x8)); - regmap_write(ocelot->targets[HSIO], HSIO_PLL5G_CFG0, - HSIO_PLL5G_CFG0_CORE_CLK_DIV(0x11) | - HSIO_PLL5G_CFG0_CPU_CLK_DIV(2) | - HSIO_PLL5G_CFG0_ENA_BIAS | - HSIO_PLL5G_CFG0_ENA_VCO_BUF | - HSIO_PLL5G_CFG0_ENA_CP1 | - HSIO_PLL5G_CFG0_SELCPI(2) | - HSIO_PLL5G_CFG0_LOOP_BW_RES(0xe) | - HSIO_PLL5G_CFG0_SELBGV820(4) | - HSIO_PLL5G_CFG0_DIV4 | - HSIO_PLL5G_CFG0_ENA_CLKTREE | - HSIO_PLL5G_CFG0_ENA_LANE); - regmap_write(ocelot->targets[HSIO], HSIO_PLL5G_CFG2, - HSIO_PLL5G_CFG2_EN_RESET_FRQ_DET | - HSIO_PLL5G_CFG2_EN_RESET_OVERRUN | - HSIO_PLL5G_CFG2_GAIN_TEST(0x8) | - HSIO_PLL5G_CFG2_ENA_AMPCTRL | - HSIO_PLL5G_CFG2_PWD_AMPCTRL_N | - HSIO_PLL5G_CFG2_AMPC_SEL(0x10)); -} - static int ocelot_chip_init(struct ocelot *ocelot, const struct ocelot_ops *ops) { int ret; diff --git a/drivers/net/ethernet/netronome/nfp/crypto/ipsec.c b/drivers/net/ethernet/netronome/nfp/crypto/ipsec.c index c0dcce8ae437..b1f026b81dea 100644 --- a/drivers/net/ethernet/netronome/nfp/crypto/ipsec.c +++ b/drivers/net/ethernet/netronome/nfp/crypto/ipsec.c @@ -269,7 +269,7 @@ static void set_sha2_512hmac(struct nfp_ipsec_cfg_add_sa *cfg, int *trunc_len) static int nfp_net_xfrm_add_state(struct xfrm_state *x, struct netlink_ext_ack *extack) { - struct net_device *netdev = x->xso.dev; + struct net_device *netdev = x->xso.real_dev; struct nfp_ipsec_cfg_mssg msg = {}; int i, key_len, trunc_len, err = 0; struct nfp_ipsec_cfg_add_sa *cfg; @@ -513,7 +513,7 @@ static void nfp_net_xfrm_del_state(struct xfrm_state *x) .cmd = NFP_IPSEC_CFG_MSSG_INV_SA, .sa_idx = x->xso.offload_handle - 1, }; - struct net_device *netdev = x->xso.dev; + struct net_device *netdev = x->xso.real_dev; struct nfp_net *nn; int err; diff --git a/drivers/net/ethernet/netronome/nfp/flower/conntrack.c 
b/drivers/net/ethernet/netronome/nfp/flower/conntrack.c index d23830b5bcb8..73032173ac4e 100644 --- a/drivers/net/ethernet/netronome/nfp/flower/conntrack.c +++ b/drivers/net/ethernet/netronome/nfp/flower/conntrack.c @@ -55,9 +55,21 @@ static void *get_hashentry(struct rhashtable *ht, void *key, bool is_pre_ct_flow(struct flow_cls_offload *flow) { + struct flow_rule *rule = flow_cls_offload_flow_rule(flow); + struct flow_dissector *dissector = rule->match.dissector; struct flow_action_entry *act; + struct flow_match_ct ct; int i; + if (dissector->used_keys & BIT(FLOW_DISSECTOR_KEY_CT)) { + flow_rule_match_ct(rule, &ct); + if (ct.key->ct_state) + return false; + } + + if (flow->common.chain_index) + return false; + flow_action_for_each(i, act, &flow->rule->action) { if (act->id == FLOW_ACTION_CT) { /* The pre_ct rule only have the ct or ct nat action, cannot @@ -82,24 +94,23 @@ bool is_post_ct_flow(struct flow_cls_offload *flow) struct flow_match_ct ct; int i; - /* post ct entry cannot contains any ct action except ct_clear. */ - flow_action_for_each(i, act, &flow->rule->action) { - if (act->id == FLOW_ACTION_CT) { - /* ignore ct clear action. */ - if (act->ct.action == TCA_CT_ACT_CLEAR) { - exist_ct_clear = true; - continue; - } - - return false; - } - } - if (dissector->used_keys & BIT(FLOW_DISSECTOR_KEY_CT)) { flow_rule_match_ct(rule, &ct); if (ct.key->ct_state & TCA_FLOWER_KEY_CT_FLAGS_ESTABLISHED) return true; } else { + /* post ct entry cannot contains any ct action except ct_clear. */ + flow_action_for_each(i, act, &flow->rule->action) { + if (act->id == FLOW_ACTION_CT) { + /* ignore ct clear action. */ + if (act->ct.action == TCA_CT_ACT_CLEAR) { + exist_ct_clear = true; + continue; + } + + return false; + } + } /* when do nat with ct, the post ct entry ignore the ct status, * will match the nat field(sip/dip) instead. In this situation, * the flow chain index is not zero and contains ct clear action. 
@@ -511,6 +522,21 @@ static int nfp_ct_check_vlan_merge(struct flow_action_entry *a_in, return 0; } +/* Extra check for multiple ct-zones merge + * currently surpport nft entries merge check in different zones + */ +static int nfp_ct_merge_extra_check(struct nfp_fl_ct_flow_entry *nft_entry, + struct nfp_fl_ct_tc_merge *tc_m_entry) +{ + struct nfp_fl_nft_tc_merge *prev_nft_m_entry; + struct nfp_fl_ct_flow_entry *pre_ct_entry; + + pre_ct_entry = tc_m_entry->pre_ct_parent; + prev_nft_m_entry = pre_ct_entry->prev_m_entries[pre_ct_entry->num_prev_m_entries - 1]; + + return nfp_ct_merge_check(prev_nft_m_entry->nft_parent, nft_entry); +} + static int nfp_ct_merge_act_check(struct nfp_fl_ct_flow_entry *pre_ct_entry, struct nfp_fl_ct_flow_entry *post_ct_entry, struct nfp_fl_ct_flow_entry *nft_entry) @@ -682,34 +708,34 @@ static void nfp_fl_get_csum_flag(struct flow_action_entry *a_in, u8 ip_proto, u3 static int nfp_fl_merge_actions_offload(struct flow_rule **rules, struct nfp_flower_priv *priv, struct net_device *netdev, - struct nfp_fl_payload *flow_pay) + struct nfp_fl_payload *flow_pay, + int num_rules) { enum flow_action_hw_stats tmp_stats = FLOW_ACTION_HW_STATS_DONT_CARE; struct flow_action_entry *a_in; - int i, j, num_actions, id; + int i, j, id, num_actions = 0; struct flow_rule *a_rule; int err = 0, offset = 0; - num_actions = rules[CT_TYPE_PRE_CT]->action.num_entries + - rules[CT_TYPE_NFT]->action.num_entries + - rules[CT_TYPE_POST_CT]->action.num_entries; + for (i = 0; i < num_rules; i++) + num_actions += rules[i]->action.num_entries; /* Add one action to make sure there is enough room to add an checksum action * when do nat. */ - a_rule = flow_rule_alloc(num_actions + 1); + a_rule = flow_rule_alloc(num_actions + (num_rules / 2)); if (!a_rule) return -ENOMEM; - /* Actions need a BASIC dissector. */ - a_rule->match = rules[CT_TYPE_PRE_CT]->match; /* post_ct entry have one action at least. 
*/ - if (rules[CT_TYPE_POST_CT]->action.num_entries != 0) { - tmp_stats = rules[CT_TYPE_POST_CT]->action.entries[0].hw_stats; - } + if (rules[num_rules - 1]->action.num_entries != 0) + tmp_stats = rules[num_rules - 1]->action.entries[0].hw_stats; + + /* Actions need a BASIC dissector. */ + a_rule->match = rules[0]->match; /* Copy actions */ - for (j = 0; j < _CT_TYPE_MAX; j++) { + for (j = 0; j < num_rules; j++) { u32 csum_updated = 0; u8 ip_proto = 0; @@ -747,8 +773,9 @@ static int nfp_fl_merge_actions_offload(struct flow_rule **rules, /* nft entry is generated by tc ct, which mangle action do not care * the stats, inherit the post entry stats to meet the * flow_action_hw_stats_check. + * nft entry flow rules are at odd array index. */ - if (j == CT_TYPE_NFT) { + if (j & 0x01) { if (a_in->hw_stats == FLOW_ACTION_HW_STATS_DONT_CARE) a_in->hw_stats = tmp_stats; nfp_fl_get_csum_flag(a_in, ip_proto, &csum_updated); @@ -784,32 +811,40 @@ static int nfp_fl_ct_add_offload(struct nfp_fl_nft_tc_merge *m_entry) { enum nfp_flower_tun_type tun_type = NFP_FL_TUNNEL_NONE; struct nfp_fl_ct_zone_entry *zt = m_entry->zt; + struct flow_rule *rules[NFP_MAX_ENTRY_RULES]; + struct nfp_fl_ct_flow_entry *pre_ct_entry; struct nfp_fl_key_ls key_layer, tmp_layer; struct nfp_flower_priv *priv = zt->priv; u16 key_map[_FLOW_PAY_LAYERS_MAX]; struct nfp_fl_payload *flow_pay; - - struct flow_rule *rules[_CT_TYPE_MAX]; u8 *key, *msk, *kdata, *mdata; struct nfp_port *port = NULL; + int num_rules, err, i, j = 0; struct net_device *netdev; bool qinq_sup; u32 port_id; u16 offset; - int i, err; netdev = m_entry->netdev; qinq_sup = !!(priv->flower_ext_feats & NFP_FL_FEATS_VLAN_QINQ); - rules[CT_TYPE_PRE_CT] = m_entry->tc_m_parent->pre_ct_parent->rule; - rules[CT_TYPE_NFT] = m_entry->nft_parent->rule; - rules[CT_TYPE_POST_CT] = m_entry->tc_m_parent->post_ct_parent->rule; + pre_ct_entry = m_entry->tc_m_parent->pre_ct_parent; + num_rules = pre_ct_entry->num_prev_m_entries * 2 + _CT_TYPE_MAX; + + for (i = 
0; i < pre_ct_entry->num_prev_m_entries; i++) { + rules[j++] = pre_ct_entry->prev_m_entries[i]->tc_m_parent->pre_ct_parent->rule; + rules[j++] = pre_ct_entry->prev_m_entries[i]->nft_parent->rule; + } + + rules[j++] = m_entry->tc_m_parent->pre_ct_parent->rule; + rules[j++] = m_entry->nft_parent->rule; + rules[j++] = m_entry->tc_m_parent->post_ct_parent->rule; memset(&key_layer, 0, sizeof(struct nfp_fl_key_ls)); memset(&key_map, 0, sizeof(key_map)); /* Calculate the resultant key layer and size for offload */ - for (i = 0; i < _CT_TYPE_MAX; i++) { + for (i = 0; i < num_rules; i++) { err = nfp_flower_calculate_key_layers(priv->app, m_entry->netdev, &tmp_layer, rules[i], @@ -875,7 +910,7 @@ static int nfp_fl_ct_add_offload(struct nfp_fl_nft_tc_merge *m_entry) * that the layer is not present. */ if (!qinq_sup) { - for (i = 0; i < _CT_TYPE_MAX; i++) { + for (i = 0; i < num_rules; i++) { offset = key_map[FLOW_PAY_META_TCI]; key = kdata + offset; msk = mdata + offset; @@ -889,7 +924,7 @@ static int nfp_fl_ct_add_offload(struct nfp_fl_nft_tc_merge *m_entry) offset = key_map[FLOW_PAY_MAC_MPLS]; key = kdata + offset; msk = mdata + offset; - for (i = 0; i < _CT_TYPE_MAX; i++) { + for (i = 0; i < num_rules; i++) { nfp_flower_compile_mac((struct nfp_flower_mac_mpls *)key, (struct nfp_flower_mac_mpls *)msk, rules[i]); @@ -905,7 +940,7 @@ static int nfp_fl_ct_add_offload(struct nfp_fl_nft_tc_merge *m_entry) offset = key_map[FLOW_PAY_IPV4]; key = kdata + offset; msk = mdata + offset; - for (i = 0; i < _CT_TYPE_MAX; i++) { + for (i = 0; i < num_rules; i++) { nfp_flower_compile_ipv4((struct nfp_flower_ipv4 *)key, (struct nfp_flower_ipv4 *)msk, rules[i]); @@ -916,7 +951,7 @@ static int nfp_fl_ct_add_offload(struct nfp_fl_nft_tc_merge *m_entry) offset = key_map[FLOW_PAY_IPV6]; key = kdata + offset; msk = mdata + offset; - for (i = 0; i < _CT_TYPE_MAX; i++) { + for (i = 0; i < num_rules; i++) { nfp_flower_compile_ipv6((struct nfp_flower_ipv6 *)key, (struct nfp_flower_ipv6 *)msk, 
rules[i]); @@ -927,7 +962,7 @@ static int nfp_fl_ct_add_offload(struct nfp_fl_nft_tc_merge *m_entry) offset = key_map[FLOW_PAY_L4]; key = kdata + offset; msk = mdata + offset; - for (i = 0; i < _CT_TYPE_MAX; i++) { + for (i = 0; i < num_rules; i++) { nfp_flower_compile_tport((struct nfp_flower_tp_ports *)key, (struct nfp_flower_tp_ports *)msk, rules[i]); @@ -938,7 +973,7 @@ static int nfp_fl_ct_add_offload(struct nfp_fl_nft_tc_merge *m_entry) offset = key_map[FLOW_PAY_QINQ]; key = kdata + offset; msk = mdata + offset; - for (i = 0; i < _CT_TYPE_MAX; i++) { + for (i = 0; i < num_rules; i++) { nfp_flower_compile_vlan((struct nfp_flower_vlan *)key, (struct nfp_flower_vlan *)msk, rules[i]); @@ -954,7 +989,7 @@ static int nfp_fl_ct_add_offload(struct nfp_fl_nft_tc_merge *m_entry) struct nfp_ipv6_addr_entry *entry; struct in6_addr *dst; - for (i = 0; i < _CT_TYPE_MAX; i++) { + for (i = 0; i < num_rules; i++) { nfp_flower_compile_ipv6_gre_tun((void *)key, (void *)msk, rules[i]); } @@ -971,7 +1006,7 @@ static int nfp_fl_ct_add_offload(struct nfp_fl_nft_tc_merge *m_entry) } else { __be32 dst; - for (i = 0; i < _CT_TYPE_MAX; i++) { + for (i = 0; i < num_rules; i++) { nfp_flower_compile_ipv4_gre_tun((void *)key, (void *)msk, rules[i]); } @@ -995,7 +1030,7 @@ static int nfp_fl_ct_add_offload(struct nfp_fl_nft_tc_merge *m_entry) struct nfp_ipv6_addr_entry *entry; struct in6_addr *dst; - for (i = 0; i < _CT_TYPE_MAX; i++) { + for (i = 0; i < num_rules; i++) { nfp_flower_compile_ipv6_udp_tun((void *)key, (void *)msk, rules[i]); } @@ -1012,7 +1047,7 @@ static int nfp_fl_ct_add_offload(struct nfp_fl_nft_tc_merge *m_entry) } else { __be32 dst; - for (i = 0; i < _CT_TYPE_MAX; i++) { + for (i = 0; i < num_rules; i++) { nfp_flower_compile_ipv4_udp_tun((void *)key, (void *)msk, rules[i]); } @@ -1029,13 +1064,13 @@ static int nfp_fl_ct_add_offload(struct nfp_fl_nft_tc_merge *m_entry) offset = key_map[FLOW_PAY_GENEVE_OPT]; key = kdata + offset; msk = mdata + offset; - for (i = 0; i < 
_CT_TYPE_MAX; i++) + for (i = 0; i < num_rules; i++) nfp_flower_compile_geneve_opt(key, msk, rules[i]); } } /* Merge actions into flow_pay */ - err = nfp_fl_merge_actions_offload(rules, priv, netdev, flow_pay); + err = nfp_fl_merge_actions_offload(rules, priv, netdev, flow_pay, num_rules); if (err) goto ct_offload_err; @@ -1168,6 +1203,12 @@ static int nfp_ct_do_nft_merge(struct nfp_fl_ct_zone_entry *zt, if (err) return err; + if (pre_ct_entry->num_prev_m_entries > 0) { + err = nfp_ct_merge_extra_check(nft_entry, tc_m_entry); + if (err) + return err; + } + /* Combine tc_merge and nft cookies for this cookie. */ new_cookie[0] = tc_m_entry->cookie[0]; new_cookie[1] = tc_m_entry->cookie[1]; @@ -1198,11 +1239,6 @@ static int nfp_ct_do_nft_merge(struct nfp_fl_ct_zone_entry *zt, list_add(&nft_m_entry->tc_merge_list, &tc_m_entry->children); list_add(&nft_m_entry->nft_flow_list, &nft_entry->children); - /* Generate offload structure and send to nfp */ - err = nfp_fl_ct_add_offload(nft_m_entry); - if (err) - goto err_nft_ct_offload; - err = rhashtable_insert_fast(&zt->nft_merge_tb, &nft_m_entry->hash_node, nfp_nft_ct_merge_params); if (err) @@ -1210,12 +1246,20 @@ static int nfp_ct_do_nft_merge(struct nfp_fl_ct_zone_entry *zt, zt->nft_merge_count++; + if (post_ct_entry->goto_chain_index > 0) + return nfp_fl_create_new_pre_ct(nft_m_entry); + + /* Generate offload structure and send to nfp */ + err = nfp_fl_ct_add_offload(nft_m_entry); + if (err) + goto err_nft_ct_offload; + return err; -err_nft_ct_merge_insert: +err_nft_ct_offload: nfp_fl_ct_del_offload(zt->priv->app, nft_m_entry->tc_flower_cookie, nft_m_entry->netdev); -err_nft_ct_offload: +err_nft_ct_merge_insert: list_del(&nft_m_entry->tc_merge_list); list_del(&nft_m_entry->nft_flow_list); kfree(nft_m_entry); @@ -1243,7 +1287,7 @@ static int nfp_ct_do_tc_merge(struct nfp_fl_ct_zone_entry *zt, /* Checks that the chain_index of the filter matches the * chain_index of the GOTO action. 
*/ - if (post_ct_entry->chain_index != pre_ct_entry->chain_index) + if (post_ct_entry->chain_index != pre_ct_entry->goto_chain_index) return -EINVAL; err = nfp_ct_merge_check(pre_ct_entry, post_ct_entry); @@ -1461,7 +1505,7 @@ nfp_fl_ct_flow_entry *nfp_fl_ct_add_flow(struct nfp_fl_ct_zone_entry *zt, entry->zt = zt; entry->netdev = netdev; - entry->cookie = flow->cookie; + entry->cookie = flow->cookie > 0 ? flow->cookie : (unsigned long)entry; entry->chain_index = flow->common.chain_index; entry->tun_offset = NFP_FL_CT_NO_TUN; @@ -1501,6 +1545,9 @@ nfp_fl_ct_flow_entry *nfp_fl_ct_add_flow(struct nfp_fl_ct_zone_entry *zt, INIT_LIST_HEAD(&entry->children); + if (flow->cookie == 0) + return entry; + /* Now add a ct map entry to flower-priv */ map = get_hashentry(&zt->priv->ct_map_table, &flow->cookie, nfp_ct_map_params, sizeof(*map)); @@ -1559,6 +1606,14 @@ static void cleanup_nft_merge_entry(struct nfp_fl_nft_tc_merge *m_entry) list_del(&m_entry->tc_merge_list); list_del(&m_entry->nft_flow_list); + if (m_entry->next_pre_ct_entry) { + struct nfp_fl_ct_map_entry pre_ct_map_ent; + + pre_ct_map_ent.ct_entry = m_entry->next_pre_ct_entry; + pre_ct_map_ent.cookie = 0; + nfp_fl_ct_del_flow(&pre_ct_map_ent); + } + kfree(m_entry); } @@ -1656,6 +1711,22 @@ void nfp_fl_ct_clean_flow_entry(struct nfp_fl_ct_flow_entry *entry) kfree(entry); } +static struct flow_action_entry *get_flow_act_ct(struct flow_rule *rule) +{ + struct flow_action_entry *act; + int i; + + /* More than one ct action may be present in a flow rule, + * Return the first one that is not a CT clear action + */ + flow_action_for_each(i, act, &rule->action) { + if (act->id == FLOW_ACTION_CT && act->ct.action != TCA_CT_ACT_CLEAR) + return act; + } + + return NULL; +} + static struct flow_action_entry *get_flow_act(struct flow_rule *rule, enum flow_action_id act_id) { @@ -1713,14 +1784,15 @@ nfp_ct_merge_nft_with_tc(struct nfp_fl_ct_flow_entry *nft_entry, int nfp_fl_ct_handle_pre_ct(struct nfp_flower_priv *priv, 
struct net_device *netdev, struct flow_cls_offload *flow, - struct netlink_ext_ack *extack) + struct netlink_ext_ack *extack, + struct nfp_fl_nft_tc_merge *m_entry) { struct flow_action_entry *ct_act, *ct_goto; struct nfp_fl_ct_flow_entry *ct_entry; struct nfp_fl_ct_zone_entry *zt; int err; - ct_act = get_flow_act(flow->rule, FLOW_ACTION_CT); + ct_act = get_flow_act_ct(flow->rule); if (!ct_act) { NL_SET_ERR_MSG_MOD(extack, "unsupported offload: Conntrack action empty in conntrack offload"); @@ -1756,7 +1828,22 @@ int nfp_fl_ct_handle_pre_ct(struct nfp_flower_priv *priv, if (IS_ERR(ct_entry)) return PTR_ERR(ct_entry); ct_entry->type = CT_TYPE_PRE_CT; - ct_entry->chain_index = ct_goto->chain_index; + ct_entry->chain_index = flow->common.chain_index; + ct_entry->goto_chain_index = ct_goto->chain_index; + + if (m_entry) { + struct nfp_fl_ct_flow_entry *pre_ct_entry; + int i; + + pre_ct_entry = m_entry->tc_m_parent->pre_ct_parent; + for (i = 0; i < pre_ct_entry->num_prev_m_entries; i++) + ct_entry->prev_m_entries[i] = pre_ct_entry->prev_m_entries[i]; + ct_entry->prev_m_entries[i++] = m_entry; + ct_entry->num_prev_m_entries = i; + + m_entry->next_pre_ct_entry = ct_entry; + } + list_add(&ct_entry->list_node, &zt->pre_ct_list); zt->pre_ct_count++; @@ -1779,6 +1866,7 @@ int nfp_fl_ct_handle_post_ct(struct nfp_flower_priv *priv, struct nfp_fl_ct_zone_entry *zt; bool wildcarded = false; struct flow_match_ct ct; + struct flow_action_entry *ct_goto; flow_rule_match_ct(rule, &ct); if (!ct.mask->ct_zone) { @@ -1803,6 +1891,8 @@ int nfp_fl_ct_handle_post_ct(struct nfp_flower_priv *priv, ct_entry->type = CT_TYPE_POST_CT; ct_entry->chain_index = flow->common.chain_index; + ct_goto = get_flow_act(flow->rule, FLOW_ACTION_GOTO); + ct_entry->goto_chain_index = ct_goto ? 
ct_goto->chain_index : 0; list_add(&ct_entry->list_node, &zt->post_ct_list); zt->post_ct_count++; @@ -1831,6 +1921,28 @@ int nfp_fl_ct_handle_post_ct(struct nfp_flower_priv *priv, return 0; } +int nfp_fl_create_new_pre_ct(struct nfp_fl_nft_tc_merge *m_entry) +{ + struct nfp_fl_ct_flow_entry *pre_ct_entry, *post_ct_entry; + struct flow_cls_offload new_pre_ct_flow; + int err; + + pre_ct_entry = m_entry->tc_m_parent->pre_ct_parent; + if (pre_ct_entry->num_prev_m_entries >= NFP_MAX_RECIRC_CT_ZONES - 1) + return -1; + + post_ct_entry = m_entry->tc_m_parent->post_ct_parent; + memset(&new_pre_ct_flow, 0, sizeof(struct flow_cls_offload)); + new_pre_ct_flow.rule = post_ct_entry->rule; + new_pre_ct_flow.common.chain_index = post_ct_entry->chain_index; + + err = nfp_fl_ct_handle_pre_ct(pre_ct_entry->zt->priv, + pre_ct_entry->netdev, + &new_pre_ct_flow, NULL, + m_entry); + return err; +} + static void nfp_fl_ct_sub_stats(struct nfp_fl_nft_tc_merge *nft_merge, enum ct_entry_type type, u64 *m_pkts, @@ -1876,6 +1988,32 @@ nfp_fl_ct_sub_stats(struct nfp_fl_nft_tc_merge *nft_merge, 0, priv->stats[ctx_id].used, FLOW_ACTION_HW_STATS_DELAYED); } + + /* Update previous pre_ct/post_ct/nft flow stats */ + if (nft_merge->tc_m_parent->pre_ct_parent->num_prev_m_entries > 0) { + struct nfp_fl_nft_tc_merge *tmp_nft_merge; + int i; + + for (i = 0; i < nft_merge->tc_m_parent->pre_ct_parent->num_prev_m_entries; i++) { + tmp_nft_merge = nft_merge->tc_m_parent->pre_ct_parent->prev_m_entries[i]; + flow_stats_update(&tmp_nft_merge->tc_m_parent->pre_ct_parent->stats, + priv->stats[ctx_id].bytes, + priv->stats[ctx_id].pkts, + 0, priv->stats[ctx_id].used, + FLOW_ACTION_HW_STATS_DELAYED); + flow_stats_update(&tmp_nft_merge->tc_m_parent->post_ct_parent->stats, + priv->stats[ctx_id].bytes, + priv->stats[ctx_id].pkts, + 0, priv->stats[ctx_id].used, + FLOW_ACTION_HW_STATS_DELAYED); + flow_stats_update(&tmp_nft_merge->nft_parent->stats, + priv->stats[ctx_id].bytes, + priv->stats[ctx_id].pkts, + 0, 
priv->stats[ctx_id].used, + FLOW_ACTION_HW_STATS_DELAYED); + } + } + /* Reset stats from the nfp */ priv->stats[ctx_id].pkts = 0; priv->stats[ctx_id].bytes = 0; @@ -2080,10 +2218,12 @@ int nfp_fl_ct_del_flow(struct nfp_fl_ct_map_entry *ct_map_ent) switch (ct_entry->type) { case CT_TYPE_PRE_CT: zt->pre_ct_count--; - rhashtable_remove_fast(m_table, &ct_map_ent->hash_node, - nfp_ct_map_params); + if (ct_map_ent->cookie > 0) + rhashtable_remove_fast(m_table, &ct_map_ent->hash_node, + nfp_ct_map_params); nfp_fl_ct_clean_flow_entry(ct_entry); - kfree(ct_map_ent); + if (ct_map_ent->cookie > 0) + kfree(ct_map_ent); if (!zt->pre_ct_count) { zt->nft = NULL; diff --git a/drivers/net/ethernet/netronome/nfp/flower/conntrack.h b/drivers/net/ethernet/netronome/nfp/flower/conntrack.h index 762c0b36e269..c4ec78358033 100644 --- a/drivers/net/ethernet/netronome/nfp/flower/conntrack.h +++ b/drivers/net/ethernet/netronome/nfp/flower/conntrack.h @@ -86,6 +86,9 @@ enum ct_entry_type { _CT_TYPE_MAX, }; +#define NFP_MAX_RECIRC_CT_ZONES 4 +#define NFP_MAX_ENTRY_RULES (NFP_MAX_RECIRC_CT_ZONES * 2 + 1) + enum nfp_nfp_layer_name { FLOW_PAY_META_TCI = 0, FLOW_PAY_INPORT, @@ -112,27 +115,33 @@ enum nfp_nfp_layer_name { * @cookie: Flow cookie, same as original TC flow, used as key * @list_node: Used by the list * @chain_index: Chain index of the original flow + * @goto_chain_index: goto chain index of the flow * @netdev: netdev structure. - * @type: Type of pre-entry from enum ct_entry_type * @zt: Reference to the zone table this belongs to * @children: List of tc_merge flows this flow forms part of * @rule: Reference to the original TC flow rule * @stats: Used to cache stats for updating + * @prev_m_entries: Array of all previous nft_tc_merge entries + * @num_prev_m_entries: The number of all previous nft_tc_merge entries * @tun_offset: Used to indicate tunnel action offset in action list * @flags: Used to indicate flow flag like NAT which used by merge. 
+ * @type: Type of ct-entry from enum ct_entry_type */ struct nfp_fl_ct_flow_entry { unsigned long cookie; struct list_head list_node; u32 chain_index; - enum ct_entry_type type; + u32 goto_chain_index; struct net_device *netdev; struct nfp_fl_ct_zone_entry *zt; struct list_head children; struct flow_rule *rule; struct flow_stats stats; + struct nfp_fl_nft_tc_merge *prev_m_entries[NFP_MAX_RECIRC_CT_ZONES - 1]; + u8 num_prev_m_entries; u8 tun_offset; // Set to NFP_FL_CT_NO_TUN if no tun u8 flags; + u8 type; }; /** @@ -169,6 +178,7 @@ struct nfp_fl_ct_tc_merge { * @nft_parent: The nft_entry parent * @tc_flower_cookie: The cookie of the flow offloaded to the nfp * @flow_pay: Reference to the offloaded flow struct + * @next_pre_ct_entry: Reference to the next ct zone pre ct entry */ struct nfp_fl_nft_tc_merge { struct net_device *netdev; @@ -181,6 +191,7 @@ struct nfp_fl_nft_tc_merge { struct nfp_fl_ct_flow_entry *nft_parent; unsigned long tc_flower_cookie; struct nfp_fl_payload *flow_pay; + struct nfp_fl_ct_flow_entry *next_pre_ct_entry; }; /** @@ -204,6 +215,7 @@ bool is_post_ct_flow(struct flow_cls_offload *flow); * @netdev: netdev structure. * @flow: TC flower classifier offload structure. * @extack: Extack pointer for errors + * @m_entry:previous nfp_fl_nft_tc_merge entry * * Adds a new entry to the relevant zone table and tries to * merge with other +trk+est entries and offload if possible. 
@@ -213,7 +225,8 @@ bool is_post_ct_flow(struct flow_cls_offload *flow);
 int nfp_fl_ct_handle_pre_ct(struct nfp_flower_priv *priv,
 			    struct net_device *netdev,
 			    struct flow_cls_offload *flow,
-			    struct netlink_ext_ack *extack);
+			    struct netlink_ext_ack *extack,
+			    struct nfp_fl_nft_tc_merge *m_entry);
 /**
  * nfp_fl_ct_handle_post_ct() - Handles +trk+est conntrack rules
  * @priv: Pointer to app priv
@@ -232,6 +245,19 @@ int nfp_fl_ct_handle_post_ct(struct nfp_flower_priv *priv,
 			     struct netlink_ext_ack *extack);
 
 /**
+ * nfp_fl_create_new_pre_ct() - create next ct_zone -trk conntrack rules
+ * @m_entry:previous nfp_fl_nft_tc_merge entry
+ *
+ * Create a new pre_ct entry from previous nfp_fl_nft_tc_merge entry
+ * to the next relevant zone table. Try to merge with other +trk+est
+ * entries and offload if possible. The created new pre_ct entry is
+ * linked to the previous nfp_fl_nft_tc_merge entry.
+ *
+ * Return: negative value on error, 0 if configured successfully.
+ */
+int nfp_fl_create_new_pre_ct(struct nfp_fl_nft_tc_merge *m_entry);
+
+/**
  * nfp_fl_ct_clean_flow_entry() - Free a nfp_fl_ct_flow_entry
  * @entry: Flow entry to cleanup
  */
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index 8593cafa6368..18328eb7f5c3 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -1344,7 +1344,7 @@ nfp_flower_add_offload(struct nfp_app *app, struct net_device *netdev,
 		port = nfp_port_from_netdev(netdev);
 
 	if (is_pre_ct_flow(flow))
-		return nfp_fl_ct_handle_pre_ct(priv, netdev, flow, extack);
+		return nfp_fl_ct_handle_pre_ct(priv, netdev, flow, extack, NULL);
 
 	if (is_post_ct_flow(flow))
 		return nfp_fl_ct_handle_post_ct(priv, netdev, flow, extack);
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_hwmon.c b/drivers/net/ethernet/netronome/nfp/nfp_hwmon.c
index 5cabb1aa9c0c..0d6c59d6d4ae 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_hwmon.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_hwmon.c
@@ -115,7 +115,7 @@ static const struct hwmon_channel_info nfp_power = {
 	.config = nfp_power_config,
 };
 
-static const struct hwmon_channel_info *nfp_hwmon_info[] = {
+static const struct hwmon_channel_info * const nfp_hwmon_info[] = {
 	&nfp_chip,
 	&nfp_temp,
 	&nfp_power,
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_port.c b/drivers/net/ethernet/netronome/nfp/nfp_port.c
index 4f2308570dcf..54640bcb70fb 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_port.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_port.c
@@ -189,6 +189,7 @@ int nfp_port_init_phy_port(struct nfp_pf *pf, struct nfp_app *app,
 
 	port->eth_port = &pf->eth_tbl->ports[id];
 	port->eth_id = pf->eth_tbl->ports[id].index;
+	port->netdev->dev_port = id;
 	if (pf->mac_stats_mem)
 		port->eth_stats =
 			pf->mac_stats_mem + port->eth_id * NFP_MAC_STATS_SIZE;
diff --git a/drivers/net/ethernet/ni/nixge.c b/drivers/net/ethernet/ni/nixge.c
index 56e02cba0b8a..0fd156286d4d 100644
--- a/drivers/net/ethernet/ni/nixge.c
+++ b/drivers/net/ethernet/ni/nixge.c
@@ -1422,7 +1422,7 @@ static struct platform_driver nixge_driver = {
 	.remove = nixge_remove,
 	.driver = {
 		.name = "nixge",
-		.of_match_table = of_match_ptr(nixge_dt_ids),
+		.of_match_table = nixge_dt_ids,
 	},
 };
 module_platform_driver(nixge_driver);
diff --git a/drivers/net/ethernet/pasemi/pasemi_mac.c b/drivers/net/ethernet/pasemi/pasemi_mac.c
index aaab590ef548..ed7dd0a04235 100644
--- a/drivers/net/ethernet/pasemi/pasemi_mac.c
+++ b/drivers/net/ethernet/pasemi/pasemi_mac.c
@@ -1423,7 +1423,7 @@ static void pasemi_mac_queue_csdesc(const struct sk_buff *skb,
 	write_dma_reg(PAS_DMA_TXCHAN_INCR(txring->chan.chno), 2);
 }
 
-static int pasemi_mac_start_tx(struct sk_buff *skb, struct net_device *dev)
+static netdev_tx_t pasemi_mac_start_tx(struct sk_buff *skb, struct net_device *dev)
 {
 	struct pasemi_mac * const mac = netdev_priv(dev);
 	struct pasemi_mac_txring * const txring = tx_ring(mac);
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_bus_pci.c b/drivers/net/ethernet/pensando/ionic/ionic_bus_pci.c
index e508f8eb43bf..b8678da1cce5 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_bus_pci.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_bus_pci.c
@@ -392,7 +392,6 @@ static void ionic_remove(struct pci_dev *pdev)
 	ionic_port_reset(ionic);
 	ionic_reset(ionic);
 	ionic_dev_teardown(ionic);
-	pci_clear_master(pdev);
 	ionic_unmap_bars(ionic);
 	pci_release_regions(pdev);
 	pci_disable_device(pdev);
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_phc.c b/drivers/net/ethernet/pensando/ionic/ionic_phc.c
index eac2f0e3576e..7505efdff8e9 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_phc.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_phc.c
@@ -579,11 +579,10 @@ void ionic_lif_alloc_phc(struct ionic_lif *lif)
 	diff |= diff >> 16;
 	diff |= diff >> 32;
 
-	/* constrain to the hardware bitmask, and use this as the bitmask */
+	/* constrain to the hardware bitmask */
 	diff &= phc->cc.mask;
-	phc->cc.mask = diff;
 
-	/* the wrap period is now defined by diff (or phc->cc.mask)
+	/* the wrap period is now defined by diff
 	 *
 	 * we will update the time basis at about 1/4 the wrap period, so
 	 * should not see a difference of more than +/- diff/4.
diff --git a/drivers/net/ethernet/qlogic/netxen/netxen_nic.h b/drivers/net/ethernet/qlogic/netxen/netxen_nic.h
index f13fa7396aef..3d36d23df0c6 100644
--- a/drivers/net/ethernet/qlogic/netxen/netxen_nic.h
+++ b/drivers/net/ethernet/qlogic/netxen/netxen_nic.h
@@ -854,7 +854,7 @@ typedef struct {
 	   The following is packed:
 	   - N cardrsp_rds_rings
 	   - N cardrs_sds_rings */
-	char data[0];
+	char data[];
 } nx_cardrsp_rx_ctx_t;
 
 #define SIZEOF_HOSTRQ_RX(HOSTRQ_RX, rds_rings, sds_rings) \
diff --git a/drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c b/drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c
index de8d54b23f73..1d1e183d3a8b 100644
--- a/drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c
+++ b/drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c
@@ -18,7 +18,6 @@
 #include <linux/ipv6.h>
 #include <linux/inetdevice.h>
 #include <linux/sysfs.h>
-#include <linux/aer.h>
 
 MODULE_DESCRIPTION("QLogic/NetXen (1/10) GbE Intelligent Ethernet Driver");
 MODULE_LICENSE("GPL");
@@ -1464,9 +1463,6 @@ netxen_nic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	if ((err = pci_request_regions(pdev, netxen_nic_driver_name)))
 		goto err_out_disable_pdev;
 
-	if (NX_IS_REVISION_P3(pdev->revision))
-		pci_enable_pcie_error_reporting(pdev);
-
 	pci_set_master(pdev);
 
 	netdev = alloc_etherdev(sizeof(struct netxen_adapter));
@@ -1603,8 +1599,6 @@ err_out_free_netdev:
 	free_netdev(netdev);
 
 err_out_free_res:
-	if (NX_IS_REVISION_P3(pdev->revision))
-		pci_disable_pcie_error_reporting(pdev);
 	pci_release_regions(pdev);
 
 err_out_disable_pdev:
@@ -1659,10 +1653,8 @@ static void netxen_nic_remove(struct pci_dev *pdev)
 
 	netxen_release_firmware(adapter);
 
-	if (NX_IS_REVISION_P3(pdev->revision)) {
+	if (NX_IS_REVISION_P3(pdev->revision))
 		netxen_cleanup_minidump(adapter);
-		pci_disable_pcie_error_reporting(pdev);
-	}
 
 	pci_release_regions(pdev);
 	pci_disable_device(pdev);
@@ -1862,7 +1854,7 @@ netxen_tso_check(struct net_device *netdev,
 
 	if (protocol == cpu_to_be16(ETH_P_8021Q)) {
 
-		vh = (struct vlan_ethhdr *)skb->data;
+		vh = skb_vlan_eth_hdr(skb);
 		protocol = vh->h_vlan_encapsulated_proto;
 		flags = FLAGS_VLAN_TAGGED;
 
diff --git a/drivers/net/ethernet/qlogic/qed/qed_ll2.c b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
index e5116a86cfbc..717a0b3f89bd 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_ll2.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_ll2.c
@@ -646,13 +646,13 @@ static int qed_ll2_lb_rxq_handler(struct qed_hwfn *p_hwfn,
 	struct qed_ll2_rx_queue *p_rx = &p_ll2_conn->rx_queue;
 	u16 packet_length = 0, parse_flags = 0, vlan = 0;
 	struct qed_ll2_rx_packet *p_pkt = NULL;
-	u32 num_ooo_add_to_peninsula = 0, cid;
 	union core_rx_cqe_union *cqe = NULL;
 	u16 cq_new_idx = 0, cq_old_idx = 0;
 	struct qed_ooo_buffer *p_buffer;
 	struct ooo_opaque *ooo_opq;
 	u8 placement_offset = 0;
 	u8 cqe_type;
+	u32 cid;
 
 	cq_new_idx = le16_to_cpu(*p_rx->p_fw_cons);
 	cq_old_idx = qed_chain_get_cons_idx(&p_rx->rcq_chain);
@@ -762,7 +762,6 @@ static int qed_ll2_lb_rxq_handler(struct qed_hwfn *p_hwfn,
 						  cid, ooo_opq->ooo_isle);
 			break;
 		case TCP_EVENT_ADD_PEN:
-			num_ooo_add_to_peninsula++;
 			qed_ooo_put_ready_buffer(p_hwfn,
 						 p_hwfn->p_ooo_info,
 						 p_buffer, true);
diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c b/drivers/net/ethernet/qlogic/qed/qed_main.c
index c91898be7c03..f5af83342856 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_main.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
@@ -23,7 +23,6 @@
 #include <linux/qed/qed_if.h>
 #include <linux/qed/qed_ll2_if.h>
 #include <net/devlink.h>
-#include <linux/aer.h>
 #include <linux/phylink.h>
 
 #include "qed.h"
@@ -259,8 +258,6 @@ static void qed_free_pci(struct qed_dev *cdev)
 {
 	struct pci_dev *pdev = cdev->pdev;
 
-	pci_disable_pcie_error_reporting(pdev);
-
 	if (cdev->doorbells && cdev->db_size)
 		iounmap(cdev->doorbells);
 	if (cdev->regview)
@@ -366,12 +363,6 @@ static int qed_init_pci(struct qed_dev *cdev, struct pci_dev *pdev)
 		return -ENOMEM;
 	}
 
-	/* AER (Advanced Error reporting) configuration */
-	rc = pci_enable_pcie_error_reporting(pdev);
-	if (rc)
-		DP_VERBOSE(cdev, NETIF_MSG_DRV,
-			   "Failed to configure PCIe AER [%d]\n", rc);
-
 	return 0;
 
 err2:
diff --git a/drivers/net/ethernet/qlogic/qede/qede.h b/drivers/net/ethernet/qlogic/qede/qede.h
index f90dcfe9ee68..f9931ecb7baa 100644
--- a/drivers/net/ethernet/qlogic/qede/qede.h
+++ b/drivers/net/ethernet/qlogic/qede/qede.h
@@ -6,8 +6,6 @@
 
 #ifndef _QEDE_H_
 #define _QEDE_H_
-#include <linux/compiler.h>
-#include <linux/version.h>
 #include <linux/workqueue.h>
 #include <linux/netdevice.h>
 #include <linux/interrupt.h>
diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
index 8034d812d5a0..374a86b875a3 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
@@ -4,7 +4,6 @@
  * Copyright (c) 2019-2020 Marvell International Ltd.
  */
 
-#include <linux/version.h>
 #include <linux/types.h>
 #include <linux/netdevice.h>
 #include <linux/etherdevice.h>
diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c
index 261f982ca40d..4c6c685820e3 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_main.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_main.c
@@ -35,7 +35,6 @@
 #include <net/ip6_checksum.h>
 #include <linux/bitops.h>
 #include <linux/vmalloc.h>
-#include <linux/aer.h>
 #include "qede.h"
 #include "qede_ptp.h"
 
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
index 2fd5c6fdb500..bcef8ab715bf 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
@@ -8,7 +8,6 @@
 #include <linux/ipv6.h>
 #include <linux/ethtool.h>
 #include <linux/interrupt.h>
-#include <linux/aer.h>
 
 #include "qlcnic.h"
 #include "qlcnic_sriov.h"
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_io.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_io.c
index 92930a055cbc..41894d154013 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_io.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_io.c
@@ -318,7 +318,7 @@ static void qlcnic_send_filter(struct qlcnic_adapter *adapter,
 
 	if (adapter->flags & QLCNIC_VLAN_FILTERING) {
 		if (protocol == ETH_P_8021Q) {
-			vh = (struct vlan_ethhdr *)skb->data;
+			vh = skb_vlan_eth_hdr(skb);
 			vlan_id = ntohs(vh->h_vlan_TCI);
 		} else if (skb_vlan_tag_present(skb)) {
 			vlan_id = skb_vlan_tag_get(skb);
@@ -468,7 +468,7 @@ static int qlcnic_tx_pkt(struct qlcnic_adapter *adapter,
 	u32 producer = tx_ring->producer;
 
 	if (protocol == ETH_P_8021Q) {
-		vh = (struct vlan_ethhdr *)skb->data;
+		vh = skb_vlan_eth_hdr(skb);
 		flags = QLCNIC_FLAGS_VLAN_TAGGED;
 		vlan_tci = ntohs(vh->h_vlan_TCI);
 		protocol = ntohs(vh->h_vlan_encapsulated_proto);
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
index 44dac3c0908e..90df4a0909fa 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
@@ -12,7 +12,6 @@
 #include <net/ip.h>
 #include <linux/ipv6.h>
 #include <linux/inetdevice.h>
-#include <linux/aer.h>
 #include <linux/log2.h>
 #include <linux/pci.h>
 #include <net/vxlan.h>
@@ -2445,7 +2444,6 @@ qlcnic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto err_out_disable_pdev;
 
 	pci_set_master(pdev);
-	pci_enable_pcie_error_reporting(pdev);
 
 	ahw = kzalloc(sizeof(struct qlcnic_hardware_context), GFP_KERNEL);
 	if (!ahw) {
@@ -2675,7 +2673,6 @@ err_out_free_hw_res:
 	kfree(ahw);
 
 err_out_free_res:
-	pci_disable_pcie_error_reporting(pdev);
 	pci_release_regions(pdev);
 
 err_out_disable_pdev:
@@ -2757,7 +2754,6 @@ static void qlcnic_remove(struct pci_dev *pdev)
 
 	qlcnic_release_firmware(adapter);
 
-	pci_disable_pcie_error_reporting(pdev);
 	pci_release_regions(pdev);
 	pci_disable_device(pdev);
 
diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c
index 5c2edb715d3e..74125188beb8 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_sysfs.c
@@ -12,7 +12,6 @@
 #include <linux/ipv6.h>
 #include <linux/inetdevice.h>
 #include <linux/sysfs.h>
-#include <linux/aer.h>
 #include <linux/log2.h>
 #ifdef CONFIG_QLCNIC_HWMON
 #include <linux/hwmon.h>
diff --git a/drivers/net/ethernet/qualcomm/Kconfig b/drivers/net/ethernet/qualcomm/Kconfig
index a4434eb38950..9210ff360fdc 100644
--- a/drivers/net/ethernet/qualcomm/Kconfig
+++ b/drivers/net/ethernet/qualcomm/Kconfig
@@ -52,6 +52,7 @@ config QCOM_EMAC
 	depends on HAS_DMA && HAS_IOMEM
 	select CRC32
 	select PHYLIB
+	select MDIO_DEVRES
 	help
 	  This driver supports the Qualcomm Technologies, Inc. Gigabit
 	  Ethernet Media Access Controller (EMAC). The controller
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 45147a1016be..a7e376e7e689 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -30,6 +30,7 @@
 #include <linux/ipv6.h>
 #include <asm/unaligned.h>
 #include <net/ip6_checksum.h>
+#include <net/netdev_queues.h>
 
 #include "r8169.h"
 #include "r8169_firmware.h"
@@ -68,6 +69,8 @@
 #define NUM_RX_DESC 256 /* Number of Rx descriptor registers */
 #define R8169_TX_RING_BYTES (NUM_TX_DESC * sizeof(struct TxDesc))
 #define R8169_RX_RING_BYTES (NUM_RX_DESC * sizeof(struct RxDesc))
+#define R8169_TX_STOP_THRS (MAX_SKB_FRAGS + 1)
+#define R8169_TX_START_THRS (2 * R8169_TX_STOP_THRS)
 
 #define OCP_STD_PHY_BASE 0xa400
 
@@ -613,8 +616,13 @@ struct rtl8169_private {
 		struct work_struct work;
 	} wk;
 
+	spinlock_t config25_lock;
+	spinlock_t mac_ocp_lock;
+
+	spinlock_t cfg9346_usage_lock;
+	int cfg9346_usage_count;
+
 	unsigned supports_gmii:1;
-	unsigned aspm_manageable:1;
 	dma_addr_t counters_phys_addr;
 	struct rtl8169_counters *counters;
 	struct rtl8169_tc_offsets tc_offset;
@@ -661,12 +669,22 @@ static inline struct device *tp_to_dev(struct rtl8169_private *tp)
 
 static void rtl_lock_config_regs(struct rtl8169_private *tp)
 {
-	RTL_W8(tp, Cfg9346, Cfg9346_Lock);
+	unsigned long flags;
+
+	spin_lock_irqsave(&tp->cfg9346_usage_lock, flags);
+	if (!--tp->cfg9346_usage_count)
+		RTL_W8(tp, Cfg9346, Cfg9346_Lock);
+	spin_unlock_irqrestore(&tp->cfg9346_usage_lock, flags);
 }
 
 static void rtl_unlock_config_regs(struct rtl8169_private *tp)
 {
-	RTL_W8(tp, Cfg9346, Cfg9346_Unlock);
+	unsigned long flags;
+
+	spin_lock_irqsave(&tp->cfg9346_usage_lock, flags);
+	if (!tp->cfg9346_usage_count++)
+		RTL_W8(tp, Cfg9346, Cfg9346_Unlock);
+	spin_unlock_irqrestore(&tp->cfg9346_usage_lock, flags);
 }
 
 static void rtl_pci_commit(struct rtl8169_private *tp)
@@ -675,6 +693,28 @@ static void rtl_pci_commit(struct rtl8169_private *tp)
 	RTL_R8(tp, ChipCmd);
 }
 
+static void rtl_mod_config2(struct rtl8169_private *tp, u8 clear, u8 set)
+{
+	unsigned long flags;
+	u8 val;
+
+	spin_lock_irqsave(&tp->config25_lock, flags);
+	val = RTL_R8(tp, Config2);
+	RTL_W8(tp, Config2, (val & ~clear) | set);
+	spin_unlock_irqrestore(&tp->config25_lock, flags);
+}
+
+static void rtl_mod_config5(struct rtl8169_private *tp, u8 clear, u8 set)
+{
+	unsigned long flags;
+	u8 val;
+
+	spin_lock_irqsave(&tp->config25_lock, flags);
+	val = RTL_R8(tp, Config5);
+	RTL_W8(tp, Config5, (val & ~clear) | set);
+	spin_unlock_irqrestore(&tp->config25_lock, flags);
+}
+
 static bool rtl_is_8125(struct rtl8169_private *tp)
 {
 	return tp->mac_version >= RTL_GIGA_MAC_VER_61;
@@ -847,7 +887,7 @@ static int r8168_phy_ocp_read(struct rtl8169_private *tp, u32 reg)
 		(RTL_R32(tp, GPHY_OCP) & 0xffff) : -ETIMEDOUT;
 }
 
-static void r8168_mac_ocp_write(struct rtl8169_private *tp, u32 reg, u32 data)
+static void __r8168_mac_ocp_write(struct rtl8169_private *tp, u32 reg, u32 data)
 {
 	if (rtl_ocp_reg_failure(reg))
 		return;
@@ -855,7 +895,16 @@ static void r8168_mac_ocp_write(struct rtl8169_private *tp, u32 reg, u32 data)
 	RTL_W32(tp, OCPDR, OCPAR_FLAG | (reg << 15) | data);
 }
 
-static u16 r8168_mac_ocp_read(struct rtl8169_private *tp, u32 reg)
+static void r8168_mac_ocp_write(struct rtl8169_private *tp, u32 reg, u32 data)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&tp->mac_ocp_lock, flags);
+	__r8168_mac_ocp_write(tp, reg, data);
+	spin_unlock_irqrestore(&tp->mac_ocp_lock, flags);
+}
+
+static u16 __r8168_mac_ocp_read(struct rtl8169_private *tp, u32 reg)
 {
 	if (rtl_ocp_reg_failure(reg))
 		return 0;
@@ -865,12 +914,28 @@ static u16 r8168_mac_ocp_read(struct rtl8169_private *tp, u32 reg)
 	return RTL_R32(tp, OCPDR);
 }
 
+static u16 r8168_mac_ocp_read(struct rtl8169_private *tp, u32 reg)
+{
+	unsigned long flags;
+	u16 val;
+
+	spin_lock_irqsave(&tp->mac_ocp_lock, flags);
+	val = __r8168_mac_ocp_read(tp, reg);
+	spin_unlock_irqrestore(&tp->mac_ocp_lock, flags);
+
+	return val;
+}
+
 static void r8168_mac_ocp_modify(struct rtl8169_private *tp, u32 reg, u16 mask,
 				 u16 set)
 {
-	u16 data = r8168_mac_ocp_read(tp, reg);
+	unsigned long flags;
+	u16 data;
 
-	r8168_mac_ocp_write(tp, reg, (data & ~mask) | set);
+	spin_lock_irqsave(&tp->mac_ocp_lock, flags);
+	data = __r8168_mac_ocp_read(tp, reg);
+	__r8168_mac_ocp_write(tp, reg, (data & ~mask) | set);
+	spin_unlock_irqrestore(&tp->mac_ocp_lock, flags);
 }
 
 /* Work around a hw issue with RTL8168g PHY, the quirk disables
@@ -1336,6 +1401,7 @@ static void __rtl8169_set_wol(struct rtl8169_private *tp, u32 wolopts)
 		{ WAKE_MAGIC, Config3, MagicPacket }
 	};
 	unsigned int i, tmp = ARRAY_SIZE(cfg);
+	unsigned long flags;
 	u8 options;
 
 	rtl_unlock_config_regs(tp);
@@ -1354,12 +1420,14 @@ static void __rtl8169_set_wol(struct rtl8169_private *tp, u32 wolopts)
 			r8168_mac_ocp_modify(tp, 0xc0b6, BIT(0), 0);
 	}
 
+	spin_lock_irqsave(&tp->config25_lock, flags);
 	for (i = 0; i < tmp; i++) {
 		options = RTL_R8(tp, cfg[i].reg) & ~cfg[i].mask;
 		if (wolopts & cfg[i].opt)
 			options |= cfg[i].mask;
 		RTL_W8(tp, cfg[i].reg, options);
 	}
+	spin_unlock_irqrestore(&tp->config25_lock, flags);
 
 	switch (tp->mac_version) {
 	case RTL_GIGA_MAC_VER_02 ... RTL_GIGA_MAC_VER_06:
@@ -1371,10 +1439,10 @@ static void __rtl8169_set_wol(struct rtl8169_private *tp, u32 wolopts)
 	case RTL_GIGA_MAC_VER_34:
 	case RTL_GIGA_MAC_VER_37:
 	case RTL_GIGA_MAC_VER_39 ... RTL_GIGA_MAC_VER_63:
-		options = RTL_R8(tp, Config2) & ~PME_SIGNAL;
 		if (wolopts)
-			options |= PME_SIGNAL;
-		RTL_W8(tp, Config2, options);
+			rtl_mod_config2(tp, 0, PME_SIGNAL);
+		else
+			rtl_mod_config2(tp, PME_SIGNAL, 0);
 		break;
 	default:
 		break;
@@ -2675,10 +2743,12 @@ static void rtl_disable_exit_l1(struct rtl8169_private *tp)
 
 static void rtl_hw_aspm_clkreq_enable(struct rtl8169_private *tp, bool enable)
 {
-	/* Don't enable ASPM in the chip if OS can't control ASPM */
-	if (enable && tp->aspm_manageable) {
-		RTL_W8(tp, Config5, RTL_R8(tp, Config5) | ASPM_en);
-		RTL_W8(tp, Config2, RTL_R8(tp, Config2) | ClkReqEn);
+	if (tp->mac_version < RTL_GIGA_MAC_VER_32)
+		return;
+
+	if (enable) {
+		rtl_mod_config5(tp, 0, ASPM_en);
+		rtl_mod_config2(tp, 0, ClkReqEn);
 
 		switch (tp->mac_version) {
 		case RTL_GIGA_MAC_VER_46 ... RTL_GIGA_MAC_VER_48:
@@ -2701,11 +2771,9 @@ static void rtl_hw_aspm_clkreq_enable(struct rtl8169_private *tp, bool enable)
 			break;
 		}
 
-		RTL_W8(tp, Config2, RTL_R8(tp, Config2) & ~ClkReqEn);
-		RTL_W8(tp, Config5, RTL_R8(tp, Config5) & ~ASPM_en);
+		rtl_mod_config2(tp, ClkReqEn, 0);
+		rtl_mod_config5(tp, ASPM_en, 0);
 	}
-
-	udelay(10);
 }
 
 static void rtl_set_fifo_size(struct rtl8169_private *tp, u16 rx_stat,
@@ -2863,7 +2931,7 @@ static void rtl_hw_start_8168e_1(struct rtl8169_private *tp)
 	RTL_W32(tp, MISC, RTL_R32(tp, MISC) | TXPLA_RST);
 	RTL_W32(tp, MISC, RTL_R32(tp, MISC) & ~TXPLA_RST);
 
-	RTL_W8(tp, Config5, RTL_R8(tp, Config5) & ~Spi_en);
+	rtl_mod_config5(tp, Spi_en, 0);
 }
 
 static void rtl_hw_start_8168e_2(struct rtl8169_private *tp)
@@ -2896,9 +2964,7 @@ static void rtl_hw_start_8168e_2(struct rtl8169_private *tp)
 
 	RTL_W8(tp, DLLPR, RTL_R8(tp, DLLPR) | PFM_EN);
 	RTL_W32(tp, MISC, RTL_R32(tp, MISC) | PWM_EN);
-	RTL_W8(tp, Config5, RTL_R8(tp, Config5) & ~Spi_en);
-
-	rtl_hw_aspm_clkreq_enable(tp, true);
+	rtl_mod_config5(tp, Spi_en, 0);
 }
 
 static void rtl_hw_start_8168f(struct rtl8169_private *tp)
@@ -2919,7 +2985,7 @@ static void rtl_hw_start_8168f(struct rtl8169_private *tp)
 	RTL_W8(tp, MCU, RTL_R8(tp, MCU) & ~NOW_IS_OOB);
 	RTL_W8(tp, DLLPR, RTL_R8(tp, DLLPR) | PFM_EN);
 	RTL_W32(tp, MISC, RTL_R32(tp, MISC) | PWM_EN);
-	RTL_W8(tp, Config5, RTL_R8(tp, Config5) & ~Spi_en);
+	rtl_mod_config5(tp, Spi_en, 0);
 
 	rtl8168_config_eee_mac(tp);
 }
@@ -2989,11 +3055,7 @@ static void rtl_hw_start_8168g_1(struct rtl8169_private *tp)
 	};
 
 	rtl_hw_start_8168g(tp);
-
-	/* disable aspm and clock request before access ephy */
-	rtl_hw_aspm_clkreq_enable(tp, false);
 	rtl_ephy_init(tp, e_info_8168g_1);
-	rtl_hw_aspm_clkreq_enable(tp, true);
 }
 
 static void rtl_hw_start_8168g_2(struct rtl8169_private *tp)
@@ -3011,9 +3073,6 @@ static void rtl_hw_start_8168g_2(struct rtl8169_private *tp)
 	};
 
 	rtl_hw_start_8168g(tp);
-
-	/* disable aspm and clock request before access ephy */
-	rtl_hw_aspm_clkreq_enable(tp, false);
 	rtl_ephy_init(tp, e_info_8168g_2);
 }
 
@@ -3034,8 +3093,6 @@ static void rtl_hw_start_8411_2(struct rtl8169_private *tp)
 
 	rtl_hw_start_8168g(tp);
 
-	/* disable aspm and clock request before access ephy */
-	rtl_hw_aspm_clkreq_enable(tp, false);
 	rtl_ephy_init(tp, e_info_8411_2);
 
 	/* The following Realtek-provided magic fixes an issue with the RX unit
@@ -3173,8 +3230,6 @@ static void rtl_hw_start_8411_2(struct rtl8169_private *tp)
 	r8168_mac_ocp_write(tp, 0xFC32, 0x0C25);
 	r8168_mac_ocp_write(tp, 0xFC34, 0x00A9);
 	r8168_mac_ocp_write(tp, 0xFC36, 0x012D);
-
-	rtl_hw_aspm_clkreq_enable(tp, true);
 }
 
 static void rtl_hw_start_8168h_1(struct rtl8169_private *tp)
@@ -3189,8 +3244,6 @@ static void rtl_hw_start_8168h_1(struct rtl8169_private *tp)
 	};
 	int rg_saw_cnt;
 
-	/* disable aspm and clock request before access ephy */
-	rtl_hw_aspm_clkreq_enable(tp, false);
 	rtl_ephy_init(tp, e_info_8168h_1);
 
 	rtl_set_fifo_size(tp, 0x08, 0x10, 0x02, 0x06);
@@ -3238,8 +3291,6 @@ static void rtl_hw_start_8168h_1(struct rtl8169_private *tp)
 	r8168_mac_ocp_write(tp, 0xe63e, 0x0000);
 	r8168_mac_ocp_write(tp, 0xc094, 0x0000);
 	r8168_mac_ocp_write(tp, 0xc09e, 0x0000);
-
-	rtl_hw_aspm_clkreq_enable(tp, true);
 }
 
 static void rtl_hw_start_8168ep(struct rtl8169_private *tp)
@@ -3278,8 +3329,6 @@ static void rtl_hw_start_8168ep_3(struct rtl8169_private *tp)
 		{ 0x1e, 0x0000, 0x2000 },
 	};
 
-	/* disable aspm and clock request before access ephy */
-	rtl_hw_aspm_clkreq_enable(tp, false);
 	rtl_ephy_init(tp, e_info_8168ep_3);
 
 	rtl_hw_start_8168ep(tp);
@@ -3290,8 +3339,6 @@ static void rtl_hw_start_8168ep_3(struct rtl8169_private *tp)
 	r8168_mac_ocp_modify(tp, 0xd3e2, 0x0fff, 0x0271);
 	r8168_mac_ocp_modify(tp, 0xd3e4, 0x00ff, 0x0000);
 	r8168_mac_ocp_modify(tp, 0xe860, 0x0000, 0x0080);
-
-	rtl_hw_aspm_clkreq_enable(tp, true);
 }
 
 static void rtl_hw_start_8117(struct rtl8169_private *tp)
@@ -3303,9 +3350,6 @@ static void rtl_hw_start_8117(struct rtl8169_private *tp)
 	int rg_saw_cnt;
 
 	rtl8168ep_stop_cmac(tp);
-
-	/* disable aspm and clock request before access ephy */
-	rtl_hw_aspm_clkreq_enable(tp, false);
 	rtl_ephy_init(tp, e_info_8117);
 
 	rtl_set_fifo_size(tp, 0x08, 0x10, 0x02, 0x06);
@@ -3355,8 +3399,6 @@ static void rtl_hw_start_8117(struct rtl8169_private *tp)
 
 	/* firmware is for MAC only */
 	r8169_apply_firmware(tp);
-
-	rtl_hw_aspm_clkreq_enable(tp, true);
 }
 
 static void rtl_hw_start_8102e_1(struct rtl8169_private *tp)
@@ -3479,8 +3521,6 @@ static void rtl_hw_start_8402(struct rtl8169_private *tp)
 
 static void rtl_hw_start_8106(struct rtl8169_private *tp)
 {
-	rtl_hw_aspm_clkreq_enable(tp, false);
-
 	/* Force LAN exit from ASPM if Rx/Tx are not idle */
 	RTL_W32(tp, FuncEvent, RTL_R32(tp, FuncEvent) | 0x002800);
 
@@ -3497,7 +3537,6 @@ static void rtl_hw_start_8106(struct rtl8169_private *tp)
 	rtl_eri_write(tp, 0x1b0, ERIAR_MASK_0011, 0x0000);
 
 	rtl_pcie_state_l2l3_disable(tp);
-	rtl_hw_aspm_clkreq_enable(tp, true);
 }
 
 DECLARE_RTL_COND(rtl_mac_ocp_e00e_cond)
@@ -3585,13 +3624,8 @@ static void rtl_hw_start_8125a_2(struct rtl8169_private *tp)
 	};
 
 	rtl_set_def_aspm_entry_latency(tp);
-
-	/* disable aspm and clock request before access ephy */
-	rtl_hw_aspm_clkreq_enable(tp, false);
 	rtl_ephy_init(tp, e_info_8125a_2);
-
 	rtl_hw_start_8125_common(tp);
-	rtl_hw_aspm_clkreq_enable(tp, true);
 }
 
 static void rtl_hw_start_8125b(struct rtl8169_private *tp)
@@ -3606,12 +3640,8 @@ static void rtl_hw_start_8125b(struct rtl8169_private *tp)
 	};
 
 	rtl_set_def_aspm_entry_latency(tp);
-	rtl_hw_aspm_clkreq_enable(tp, false);
-
 	rtl_ephy_init(tp, e_info_8125b);
 	rtl_hw_start_8125_common(tp);
-
-	rtl_hw_aspm_clkreq_enable(tp, true);
 }
 
 static void rtl_hw_config(struct rtl8169_private *tp)
@@ -3707,7 +3737,8 @@ static void rtl_hw_start_8169(struct rtl8169_private *tp)
 static void rtl_hw_start(struct rtl8169_private *tp)
 {
 	rtl_unlock_config_regs(tp);
-
+	/* disable aspm and clock request before ephy access */
+	rtl_hw_aspm_clkreq_enable(tp, false);
 	RTL_W16(tp, CPlusCmd, tp->cp_cmd);
 
 	if (tp->mac_version <= RTL_GIGA_MAC_VER_06)
@@ -3718,6 +3749,7 @@ static void rtl_hw_start(struct rtl8169_private *tp)
 		rtl_hw_start_8168(tp);
 
 	rtl_enable_exit_l1(tp);
+	rtl_hw_aspm_clkreq_enable(tp, true);
 	rtl_set_rx_max_size(tp);
 	rtl_set_rx_tx_desc_registers(tp);
 	rtl_lock_config_regs(tp);
@@ -4133,13 +4165,9 @@ static bool rtl8169_tso_csum_v2(struct rtl8169_private *tp,
 	return true;
 }
 
-static bool rtl_tx_slots_avail(struct rtl8169_private *tp)
+static unsigned int rtl_tx_slots_avail(struct rtl8169_private *tp)
 {
-	unsigned int slots_avail = READ_ONCE(tp->dirty_tx) + NUM_TX_DESC
-					- READ_ONCE(tp->cur_tx);
-
-	/* A skbuff with nr_frags needs nr_frags+1 entries in the tx queue */
-	return slots_avail > MAX_SKB_FRAGS;
+	return READ_ONCE(tp->dirty_tx) + NUM_TX_DESC - READ_ONCE(tp->cur_tx);
 }
 
 /* Versions RTL8102e and from RTL8168c onwards support csum_v2 */
@@ -4216,27 +4244,10 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
 
 	WRITE_ONCE(tp->cur_tx, tp->cur_tx + frags + 1);
 
-	stop_queue = !rtl_tx_slots_avail(tp);
-	if (unlikely(stop_queue)) {
-		/* Avoid wrongly optimistic queue wake-up: rtl_tx thread must
-		 * not miss a ring update when it notices a stopped queue.
-		 */
-		smp_wmb();
-		netif_stop_queue(dev);
-		/* Sync with rtl_tx:
-		 * - publish queue status and cur_tx ring index (write barrier)
-		 * - refresh dirty_tx ring index (read barrier).
-		 * May the current thread have a pessimistic view of the ring
-		 * status and forget to wake up queue, a racing rtl_tx thread
-		 * can't.
-		 */
-		smp_mb__after_atomic();
-		if (rtl_tx_slots_avail(tp))
-			netif_start_queue(dev);
-		door_bell = true;
-	}
-
-	if (door_bell)
+	stop_queue = !netif_subqueue_maybe_stop(dev, 0, rtl_tx_slots_avail(tp),
+						R8169_TX_STOP_THRS,
+						R8169_TX_START_THRS);
+	if (door_bell || stop_queue)
 		rtl8169_doorbell(tp);
 
 	return NETDEV_TX_OK;
@@ -4360,19 +4371,12 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
 	}
 
 	if (tp->dirty_tx != dirty_tx) {
-		netdev_completed_queue(dev, pkts_compl, bytes_compl);
 		dev_sw_netstats_tx_add(dev, pkts_compl, bytes_compl);
+		WRITE_ONCE(tp->dirty_tx, dirty_tx);
 
-		/* Sync with rtl8169_start_xmit:
-		 * - publish dirty_tx ring index (write barrier)
-		 * - refresh cur_tx ring index and queue status (read barrier)
-		 * May the current thread miss the stopped queue condition,
-		 * a racing xmit thread can only have a right view of the
-		 * ring status.
-		 */
-		smp_store_mb(tp->dirty_tx, dirty_tx);
-		if (netif_queue_stopped(dev) && rtl_tx_slots_avail(tp))
-			netif_wake_queue(dev);
+		netif_subqueue_completed_wake(dev, 0, pkts_compl, bytes_compl,
+					      rtl_tx_slots_avail(tp),
+					      R8169_TX_START_THRS);
 		/*
 		 * 8168 hack: TxPoll requests are lost when the Tx packets are
 		 * too close. Let's kick an extra TxPoll request when a burst
@@ -4510,6 +4514,10 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
 	}
 
 	if (napi_schedule_prep(&tp->napi)) {
+		rtl_unlock_config_regs(tp);
+		rtl_hw_aspm_clkreq_enable(tp, false);
+		rtl_lock_config_regs(tp);
+
 		rtl_irq_disable(tp);
 		__napi_schedule(&tp->napi);
 	}
@@ -4569,9 +4577,14 @@ static int rtl8169_poll(struct napi_struct *napi, int budget)
 
 	work_done = rtl_rx(dev, tp, budget);
 
-	if (work_done < budget && napi_complete_done(napi, work_done))
+	if (work_done < budget && napi_complete_done(napi, work_done)) {
 		rtl_irq_enable(tp);
 
+		rtl_unlock_config_regs(tp);
+		rtl_hw_aspm_clkreq_enable(tp, true);
+		rtl_lock_config_regs(tp);
+	}
+
 	return work_done;
 }
 
@@ -5145,16 +5158,6 @@ done:
 	rtl_rar_set(tp, mac_addr);
 }
 
-/* register is set if system vendor successfully tested ASPM 1.2 */
-static bool rtl_aspm_is_safe(struct rtl8169_private *tp)
-{
-	if (tp->mac_version >= RTL_GIGA_MAC_VER_61 &&
-	    r8168_mac_ocp_read(tp, 0xc0b2) & 0xf)
-		return true;
-
-	return false;
-}
-
 static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
 	struct rtl8169_private *tp;
@@ -5176,6 +5179,10 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 	tp->eee_adv = -1;
 	tp->ocp_base = OCP_STD_PHY_BASE;
 
+	spin_lock_init(&tp->cfg9346_usage_lock);
+	spin_lock_init(&tp->config25_lock);
+	spin_lock_init(&tp->mac_ocp_lock);
+
 	dev->tstats = devm_netdev_alloc_pcpu_stats(&pdev->dev,
 						   struct pcpu_sw_netstats);
 	if (!dev->tstats)
@@ -5222,19 +5229,6 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	tp->mac_version = chipset;
 
-	/* Disable ASPM L1 as that cause random device stop working
-	 * problems as well as full system hangs for some PCIe devices users.
-	 * Chips from RTL8168h partially have issues with L1.2, but seem
-	 * to work fine with L1 and L1.1.
-	 */
-	if (rtl_aspm_is_safe(tp))
-		rc = 0;
-	else if (tp->mac_version >= RTL_GIGA_MAC_VER_46)
-		rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L1_2);
-	else
-		rc = pci_disable_link_state(pdev, PCIE_LINK_STATE_L1);
-	tp->aspm_manageable = !rc;
-
 	tp->dash_type = rtl_check_dash(tp);
 
 	tp->cp_cmd = RTL_R16(tp, CPlusCmd) & CPCMD_MASK;
diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index 894e2690c643..4d6b3b7d6abb 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -28,7 +28,6 @@
 #include <linux/pm_runtime.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
-#include <linux/sys_soc.h>
 #include <linux/reset.h>
 #include <linux/math64.h>
 
@@ -1390,11 +1389,6 @@ static void ravb_adjust_link(struct net_device *ndev)
 		phy_print_status(phydev);
 }
 
-static const struct soc_device_attribute r8a7795es10[] = {
-	{ .soc_id = "r8a7795", .revision = "ES1.0", },
-	{ /* sentinel */ }
-};
-
 /* PHY init function */
 static int ravb_phy_init(struct net_device *ndev)
 {
@@ -1434,15 +1428,6 @@ static int ravb_phy_init(struct net_device *ndev)
 		goto err_deregister_fixed_link;
 	}
 
-	/* This driver only support 10/100Mbit speeds on R-Car H3 ES1.0
-	 * at this time.
-	 */
-	if (soc_device_match(r8a7795es10)) {
-		phy_set_max_speed(phydev, SPEED_100);
-
-		netdev_info(ndev, "limited PHY to 100Mbit/s\n");
-	}
-
 	if (!info->half_duplex) {
 		/* 10BASE, Pause and Asym Pause is not supported */
 		phy_remove_link_mode(phydev, ETHTOOL_LINK_MODE_10baseT_Half_BIT);
diff --git a/drivers/net/ethernet/renesas/rswitch.c b/drivers/net/ethernet/renesas/rswitch.c
index c4f93d24c6a4..29afaddb598d 100644
--- a/drivers/net/ethernet/renesas/rswitch.c
+++ b/drivers/net/ethernet/renesas/rswitch.c
@@ -1324,10 +1324,8 @@ out:
 
 static void rswitch_phy_device_deinit(struct rswitch_device *rdev)
 {
-	if (rdev->ndev->phydev) {
+	if (rdev->ndev->phydev)
 		phy_disconnect(rdev->ndev->phydev);
-		rdev->ndev->phydev = NULL;
-	}
 }
 
 static int rswitch_serdes_set_params(struct rswitch_device *rdev)
diff --git a/drivers/net/ethernet/samsung/sxgbe/sxgbe_platform.c b/drivers/net/ethernet/samsung/sxgbe/sxgbe_platform.c
index 926532466691..4e5526303f07 100644
--- a/drivers/net/ethernet/samsung/sxgbe/sxgbe_platform.c
+++ b/drivers/net/ethernet/samsung/sxgbe/sxgbe_platform.c
@@ -229,7 +229,7 @@ static struct platform_driver sxgbe_platform_driver = {
 	.driver = {
 		.name = SXGBE_RESOURCE_NAME,
 		.pm = &sxgbe_platform_pm_ops,
-		.of_match_table = of_match_ptr(sxgbe_dt_ids),
+		.of_match_table = sxgbe_dt_ids,
 	},
 };
 
diff --git a/drivers/net/ethernet/sfc/ef100.c b/drivers/net/ethernet/sfc/ef100.c
index 71aab3d0480f..6334992b0af4 100644
--- a/drivers/net/ethernet/sfc/ef100.c
+++ b/drivers/net/ethernet/sfc/ef100.c
@@ -11,7 +11,6 @@
 
 #include "net_driver.h"
 #include <linux/module.h>
-#include <linux/aer.h>
 #include "efx_common.h"
 #include "efx_channels.h"
 #include "io.h"
@@ -440,8 +439,6 @@ static void ef100_pci_remove(struct pci_dev *pci_dev)
 
 	pci_dbg(pci_dev, "shutdown successful\n");
 
-	pci_disable_pcie_error_reporting(pci_dev);
-
 	pci_set_drvdata(pci_dev, NULL);
 	efx_fini_struct(efx);
 	kfree(probe_data);
diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index 1eceffa02b55..a4f22d8e6ac7 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -18,7 +18,6 @@
 #include <linux/ethtool.h>
 #include <linux/topology.h>
 #include <linux/gfp.h>
-#include <linux/aer.h>
 #include <linux/interrupt.h>
 #include "net_driver.h"
 #include <net/gre.h>
@@ -891,8 +890,6 @@ static void efx_pci_remove(struct pci_dev *pci_dev)
 	free_netdev(efx->net_dev);
 	probe_data = container_of(efx, struct efx_probe_data, efx);
 	kfree(probe_data);
-
-	pci_disable_pcie_error_reporting(pci_dev);
 };
 
 /* NIC VPD information
@@ -1122,8 +1119,6 @@ static int efx_pci_probe(struct pci_dev *pci_dev,
 		netif_warn(efx, probe, efx->net_dev,
 			   "failed to create MTDs (%d)\n", rc);
 
-	(void)pci_enable_pcie_error_reporting(pci_dev);
-
 	if (efx->type->udp_tnl_push_ports)
 		efx->type->udp_tnl_push_ports(efx);
 
diff --git a/drivers/net/ethernet/sfc/falcon/efx.c b/drivers/net/ethernet/sfc/falcon/efx.c
index e151b0957751..e001f27085c6 100644
--- a/drivers/net/ethernet/sfc/falcon/efx.c
+++ b/drivers/net/ethernet/sfc/falcon/efx.c
@@ -17,7 +17,6 @@
 #include <linux/ethtool.h>
 #include <linux/topology.h>
 #include <linux/gfp.h>
-#include <linux/aer.h>
 #include <linux/interrupt.h>
 #include "net_driver.h"
 #include "efx.h"
@@ -2765,8 +2764,6 @@ static void ef4_pci_remove(struct pci_dev *pci_dev)
 
 	ef4_fini_struct(efx);
 	free_netdev(efx->net_dev);
-
-	pci_disable_pcie_error_reporting(pci_dev);
 };
 
 /* NIC VPD information
@@ -2927,12 +2924,6 @@ static int ef4_pci_probe(struct pci_dev *pci_dev,
 		netif_warn(efx, probe, efx->net_dev,
 			   "failed to create MTDs (%d)\n", rc);
 
-	rc = pci_enable_pcie_error_reporting(pci_dev);
-	if (rc && rc != -EINVAL)
-		netif_notice(efx, probe, efx->net_dev,
-			     "PCIE error reporting unavailable (%d).\n",
-			     rc);
-
 	return 0;
 
 fail4:
diff --git a/drivers/net/ethernet/sfc/mae.c b/drivers/net/ethernet/sfc/mae.c
index 2d32abe5f478..49706a7b94bf 100644
--- a/drivers/net/ethernet/sfc/mae.c
+++ b/drivers/net/ethernet/sfc/mae.c
@@ -241,6 +241,7 @@ static int efx_mae_get_basic_caps(struct efx_nic *efx, struct mae_caps *caps)
 	if (outlen < sizeof(outbuf))
 		return -EIO;
 	caps->match_field_count = MCDI_DWORD(outbuf, MAE_GET_CAPS_OUT_MATCH_FIELD_COUNT);
+	caps->encap_types = MCDI_DWORD(outbuf, MAE_GET_CAPS_OUT_ENCAP_TYPES_SUPPORTED);
 	caps->action_prios = MCDI_DWORD(outbuf, MAE_GET_CAPS_OUT_ACTION_PRIOS);
 	return 0;
 }
@@ -254,13 +255,23 @@ static int efx_mae_get_rule_fields(struct efx_nic *efx, u32 cmd,
 	size_t outlen;
 	int rc, i;
 
+	/* AR and OR caps MCDIs have identical layout, so we are using the
+	 * same code for both.
+	 */
+	BUILD_BUG_ON(MC_CMD_MAE_GET_AR_CAPS_OUT_LEN(MAE_NUM_FIELDS) <
+		     MC_CMD_MAE_GET_OR_CAPS_OUT_LEN(MAE_NUM_FIELDS));
 	BUILD_BUG_ON(MC_CMD_MAE_GET_AR_CAPS_IN_LEN);
+	BUILD_BUG_ON(MC_CMD_MAE_GET_OR_CAPS_IN_LEN);
 
 	rc = efx_mcdi_rpc(efx, cmd, NULL, 0, outbuf, sizeof(outbuf), &outlen);
 	if (rc)
 		return rc;
+	BUILD_BUG_ON(MC_CMD_MAE_GET_AR_CAPS_OUT_COUNT_OFST !=
+		     MC_CMD_MAE_GET_OR_CAPS_OUT_COUNT_OFST);
 	count = MCDI_DWORD(outbuf, MAE_GET_AR_CAPS_OUT_COUNT);
 	memset(field_support, MAE_FIELD_UNSUPPORTED, MAE_NUM_FIELDS);
+	BUILD_BUG_ON(MC_CMD_MAE_GET_AR_CAPS_OUT_FIELD_FLAGS_OFST !=
+		     MC_CMD_MAE_GET_OR_CAPS_OUT_FIELD_FLAGS_OFST);
 	caps = _MCDI_DWORD(outbuf, MAE_GET_AR_CAPS_OUT_FIELD_FLAGS);
 	/* We're only interested in the support status enum, not any other
 	 * flags, so just extract that from each entry.
@@ -278,8 +289,12 @@ int efx_mae_get_caps(struct efx_nic *efx, struct mae_caps *caps) rc = efx_mae_get_basic_caps(efx, caps); if (rc) return rc; - return efx_mae_get_rule_fields(efx, MC_CMD_MAE_GET_AR_CAPS, - caps->action_rule_fields); + rc = efx_mae_get_rule_fields(efx, MC_CMD_MAE_GET_AR_CAPS, + caps->action_rule_fields); + if (rc) + return rc; + return efx_mae_get_rule_fields(efx, MC_CMD_MAE_GET_OR_CAPS, + caps->outer_rule_fields); } /* Bit twiddling: @@ -432,11 +447,86 @@ int efx_mae_match_check_caps(struct efx_nic *efx, CHECK_BIT(IP_FIRST_FRAG, ip_firstfrag) || CHECK(RECIRC_ID, recirc_id)) return rc; + /* Matches on outer fields are done in a separate hardware table, + * the Outer Rule table. Thus the Action Rule merely does an + * exact match on Outer Rule ID if any outer field matches are + * present. The exception is the VNI/VSID (enc_keyid), which is + * available to the Action Rule match iff the Outer Rule matched + * (and thus identified the encap protocol to use to extract it). + */ + if (efx_tc_match_is_encap(mask)) { + rc = efx_mae_match_check_cap_typ( + supported_fields[MAE_FIELD_OUTER_RULE_ID], + MASK_ONES); + if (rc) { + NL_SET_ERR_MSG_MOD(extack, "No support for encap rule ID matches"); + return rc; + } + if (CHECK(ENC_VNET_ID, enc_keyid)) + return rc; + } else if (mask->enc_keyid) { + NL_SET_ERR_MSG_MOD(extack, "Match on enc_keyid requires other encap fields"); + return -EINVAL; + } return 0; } #undef CHECK_BIT #undef CHECK +#define CHECK(_mcdi) ({ \ + rc = efx_mae_match_check_cap_typ(supported_fields[MAE_FIELD_ ## _mcdi],\ + MASK_ONES); \ + if (rc) \ + NL_SET_ERR_MSG_FMT_MOD(extack, \ + "No support for field %s", #_mcdi); \ + rc; \ +}) +/* Checks that the fields needed for encap-rule matches are supported by the + * MAE. All the fields are exact-match. 
+ */ +int efx_mae_check_encap_match_caps(struct efx_nic *efx, bool ipv6, + struct netlink_ext_ack *extack) +{ + u8 *supported_fields = efx->tc->caps->outer_rule_fields; + int rc; + + if (CHECK(ENC_ETHER_TYPE)) + return rc; + if (ipv6) { + if (CHECK(ENC_SRC_IP6) || + CHECK(ENC_DST_IP6)) + return rc; + } else { + if (CHECK(ENC_SRC_IP4) || + CHECK(ENC_DST_IP4)) + return rc; + } + if (CHECK(ENC_L4_DPORT) || + CHECK(ENC_IP_PROTO)) + return rc; + return 0; +} +#undef CHECK + +int efx_mae_check_encap_type_supported(struct efx_nic *efx, enum efx_encap_type typ) +{ + unsigned int bit; + + switch (typ & EFX_ENCAP_TYPES_MASK) { + case EFX_ENCAP_TYPE_VXLAN: + bit = MC_CMD_MAE_GET_CAPS_OUT_ENCAP_TYPE_VXLAN_LBN; + break; + case EFX_ENCAP_TYPE_GENEVE: + bit = MC_CMD_MAE_GET_CAPS_OUT_ENCAP_TYPE_GENEVE_LBN; + break; + default: + return -EOPNOTSUPP; + } + if (efx->tc->caps->encap_types & BIT(bit)) + return 0; + return -EOPNOTSUPP; +} + int efx_mae_allocate_counter(struct efx_nic *efx, struct efx_tc_counter *cnt) { MCDI_DECLARE_BUF(outbuf, MC_CMD_MAE_COUNTER_ALLOC_OUT_LEN(1)); @@ -488,6 +578,20 @@ int efx_mae_free_counter(struct efx_nic *efx, struct efx_tc_counter *cnt) return 0; } +static int efx_mae_encap_type_to_mae_type(enum efx_encap_type type) +{ + switch (type & EFX_ENCAP_TYPES_MASK) { + case EFX_ENCAP_TYPE_NONE: + return MAE_MCDI_ENCAP_TYPE_NONE; + case EFX_ENCAP_TYPE_VXLAN: + return MAE_MCDI_ENCAP_TYPE_VXLAN; + case EFX_ENCAP_TYPE_GENEVE: + return MAE_MCDI_ENCAP_TYPE_GENEVE; + default: + return -EOPNOTSUPP; + } +} + int efx_mae_lookup_mport(struct efx_nic *efx, u32 vf_idx, u32 *id) { struct ef100_nic_data *nic_data = efx->nic_data; @@ -682,6 +786,11 @@ int efx_mae_alloc_action_set(struct efx_nic *efx, struct efx_tc_action_set *act) size_t outlen; int rc; + MCDI_POPULATE_DWORD_3(inbuf, MAE_ACTION_SET_ALLOC_IN_FLAGS, + MAE_ACTION_SET_ALLOC_IN_VLAN_PUSH, act->vlan_push, + MAE_ACTION_SET_ALLOC_IN_VLAN_POP, act->vlan_pop, + MAE_ACTION_SET_ALLOC_IN_DECAP, act->decap); + 
MCDI_SET_DWORD(inbuf, MAE_ACTION_SET_ALLOC_IN_SRC_MAC_ID, MC_CMD_MAE_MAC_ADDR_ALLOC_OUT_MAC_ID_NULL); MCDI_SET_DWORD(inbuf, MAE_ACTION_SET_ALLOC_IN_DST_MAC_ID, @@ -694,6 +803,18 @@ int efx_mae_alloc_action_set(struct efx_nic *efx, struct efx_tc_action_set *act) MC_CMD_MAE_COUNTER_ALLOC_OUT_COUNTER_ID_NULL); MCDI_SET_DWORD(inbuf, MAE_ACTION_SET_ALLOC_IN_COUNTER_LIST_ID, MC_CMD_MAE_COUNTER_LIST_ALLOC_OUT_COUNTER_LIST_ID_NULL); + if (act->vlan_push) { + MCDI_SET_WORD_BE(inbuf, MAE_ACTION_SET_ALLOC_IN_VLAN0_TCI_BE, + act->vlan_tci[0]); + MCDI_SET_WORD_BE(inbuf, MAE_ACTION_SET_ALLOC_IN_VLAN0_PROTO_BE, + act->vlan_proto[0]); + } + if (act->vlan_push >= 2) { + MCDI_SET_WORD_BE(inbuf, MAE_ACTION_SET_ALLOC_IN_VLAN1_TCI_BE, + act->vlan_tci[1]); + MCDI_SET_WORD_BE(inbuf, MAE_ACTION_SET_ALLOC_IN_VLAN1_PROTO_BE, + act->vlan_proto[1]); + } MCDI_SET_DWORD(inbuf, MAE_ACTION_SET_ALLOC_IN_ENCAP_HEADER_ID, MC_CMD_MAE_ENCAP_HEADER_ALLOC_OUT_ENCAP_HEADER_ID_NULL); if (act->deliver) @@ -829,6 +950,97 @@ int efx_mae_free_action_set_list(struct efx_nic *efx, return 0; } +int efx_mae_register_encap_match(struct efx_nic *efx, + struct efx_tc_encap_match *encap) +{ + MCDI_DECLARE_BUF(inbuf, MC_CMD_MAE_OUTER_RULE_INSERT_IN_LEN(MAE_ENC_FIELD_PAIRS_LEN)); + MCDI_DECLARE_BUF(outbuf, MC_CMD_MAE_OUTER_RULE_INSERT_OUT_LEN); + MCDI_DECLARE_STRUCT_PTR(match_crit); + size_t outlen; + int rc; + + rc = efx_mae_encap_type_to_mae_type(encap->tun_type); + if (rc < 0) + return rc; + match_crit = _MCDI_DWORD(inbuf, MAE_OUTER_RULE_INSERT_IN_FIELD_MATCH_CRITERIA); + /* The struct contains IP src and dst, and udp dport. + * So we actually need to filter on IP src and dst, L4 dport, and + * ipproto == udp. 
+ */ + MCDI_SET_DWORD(inbuf, MAE_OUTER_RULE_INSERT_IN_ENCAP_TYPE, rc); +#ifdef CONFIG_IPV6 + if (encap->src_ip | encap->dst_ip) { +#endif + MCDI_STRUCT_SET_DWORD_BE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_SRC_IP4_BE, + encap->src_ip); + MCDI_STRUCT_SET_DWORD_BE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_SRC_IP4_BE_MASK, + ~(__be32)0); + MCDI_STRUCT_SET_DWORD_BE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_DST_IP4_BE, + encap->dst_ip); + MCDI_STRUCT_SET_DWORD_BE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_DST_IP4_BE_MASK, + ~(__be32)0); + MCDI_STRUCT_SET_WORD_BE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_ETHER_TYPE_BE, + htons(ETH_P_IP)); +#ifdef CONFIG_IPV6 + } else { + memcpy(MCDI_STRUCT_PTR(match_crit, MAE_ENC_FIELD_PAIRS_ENC_SRC_IP6_BE), + &encap->src_ip6, sizeof(encap->src_ip6)); + memset(MCDI_STRUCT_PTR(match_crit, MAE_ENC_FIELD_PAIRS_ENC_SRC_IP6_BE_MASK), + 0xff, sizeof(encap->src_ip6)); + memcpy(MCDI_STRUCT_PTR(match_crit, MAE_ENC_FIELD_PAIRS_ENC_DST_IP6_BE), + &encap->dst_ip6, sizeof(encap->dst_ip6)); + memset(MCDI_STRUCT_PTR(match_crit, MAE_ENC_FIELD_PAIRS_ENC_DST_IP6_BE_MASK), + 0xff, sizeof(encap->dst_ip6)); + MCDI_STRUCT_SET_WORD_BE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_ETHER_TYPE_BE, + htons(ETH_P_IPV6)); + } +#endif + MCDI_STRUCT_SET_WORD_BE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_ETHER_TYPE_BE_MASK, + ~(__be16)0); + MCDI_STRUCT_SET_WORD_BE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_L4_DPORT_BE, + encap->udp_dport); + MCDI_STRUCT_SET_WORD_BE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_L4_DPORT_BE_MASK, + ~(__be16)0); + MCDI_STRUCT_SET_BYTE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_IP_PROTO, IPPROTO_UDP); + MCDI_STRUCT_SET_BYTE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_IP_PROTO_MASK, ~0); + rc = efx_mcdi_rpc(efx, MC_CMD_MAE_OUTER_RULE_INSERT, inbuf, + sizeof(inbuf), outbuf, sizeof(outbuf), &outlen); + if (rc) + return rc; + if (outlen < sizeof(outbuf)) + return -EIO; + encap->fw_id = MCDI_DWORD(outbuf, MAE_OUTER_RULE_INSERT_OUT_OR_ID); + return 0; +} + +int efx_mae_unregister_encap_match(struct efx_nic *efx, + struct 
efx_tc_encap_match *encap) +{ + MCDI_DECLARE_BUF(outbuf, MC_CMD_MAE_OUTER_RULE_REMOVE_OUT_LEN(1)); + MCDI_DECLARE_BUF(inbuf, MC_CMD_MAE_OUTER_RULE_REMOVE_IN_LEN(1)); + size_t outlen; + int rc; + + MCDI_SET_DWORD(inbuf, MAE_OUTER_RULE_REMOVE_IN_OR_ID, encap->fw_id); + rc = efx_mcdi_rpc(efx, MC_CMD_MAE_OUTER_RULE_REMOVE, inbuf, + sizeof(inbuf), outbuf, sizeof(outbuf), &outlen); + if (rc) + return rc; + if (outlen < sizeof(outbuf)) + return -EIO; + /* FW freed a different ID than we asked for, should also never happen. + * Warn because it means we've now got a different idea to the FW of + * what encap_mds exist, which could cause mayhem later. + */ + if (WARN_ON(MCDI_DWORD(outbuf, MAE_OUTER_RULE_REMOVE_OUT_REMOVED_OR_ID) != encap->fw_id)) + return -EIO; + /* We're probably about to free @encap, but let's just make sure its + * fw_id is blatted so that it won't look valid if it leaks out. + */ + encap->fw_id = MC_CMD_MAE_OUTER_RULE_INSERT_OUT_OUTER_RULE_ID_NULL; + return 0; +} + static int efx_mae_populate_match_criteria(MCDI_DECLARE_STRUCT_PTR(match_crit), const struct efx_tc_match *match) { @@ -925,6 +1137,29 @@ static int efx_mae_populate_match_criteria(MCDI_DECLARE_STRUCT_PTR(match_crit), match->value.tcp_flags); MCDI_STRUCT_SET_WORD_BE(match_crit, MAE_FIELD_MASK_VALUE_PAIRS_V2_TCP_FLAGS_BE_MASK, match->mask.tcp_flags); + /* enc-keys are handled indirectly, through encap_match ID */ + if (match->encap) { + MCDI_STRUCT_SET_DWORD(match_crit, MAE_FIELD_MASK_VALUE_PAIRS_V2_OUTER_RULE_ID, + match->encap->fw_id); + MCDI_STRUCT_SET_DWORD(match_crit, MAE_FIELD_MASK_VALUE_PAIRS_V2_OUTER_RULE_ID_MASK, + U32_MAX); + /* enc_keyid (VNI/VSID) is not part of the encap_match */ + MCDI_STRUCT_SET_DWORD_BE(match_crit, MAE_FIELD_MASK_VALUE_PAIRS_V2_ENC_VNET_ID_BE, + match->value.enc_keyid); + MCDI_STRUCT_SET_DWORD_BE(match_crit, MAE_FIELD_MASK_VALUE_PAIRS_V2_ENC_VNET_ID_BE_MASK, + match->mask.enc_keyid); + } else if (WARN_ON_ONCE(match->mask.enc_src_ip) || + 
WARN_ON_ONCE(match->mask.enc_dst_ip) || + WARN_ON_ONCE(!ipv6_addr_any(&match->mask.enc_src_ip6)) || + WARN_ON_ONCE(!ipv6_addr_any(&match->mask.enc_dst_ip6)) || + WARN_ON_ONCE(match->mask.enc_ip_tos) || + WARN_ON_ONCE(match->mask.enc_ip_ttl) || + WARN_ON_ONCE(match->mask.enc_sport) || + WARN_ON_ONCE(match->mask.enc_dport) || + WARN_ON_ONCE(match->mask.enc_keyid)) { + /* No enc-keys should appear in a rule without an encap_match */ + return -EOPNOTSUPP; + } return 0; } diff --git a/drivers/net/ethernet/sfc/mae.h b/drivers/net/ethernet/sfc/mae.h index bec293a06733..9226219491a0 100644 --- a/drivers/net/ethernet/sfc/mae.h +++ b/drivers/net/ethernet/sfc/mae.h @@ -70,8 +70,10 @@ void efx_mae_counters_grant_credits(struct work_struct *work); struct mae_caps { u32 match_field_count; + u32 encap_types; u32 action_prios; u8 action_rule_fields[MAE_NUM_FIELDS]; + u8 outer_rule_fields[MAE_NUM_FIELDS]; }; int efx_mae_get_caps(struct efx_nic *efx, struct mae_caps *caps); @@ -79,6 +81,10 @@ int efx_mae_get_caps(struct efx_nic *efx, struct mae_caps *caps); int efx_mae_match_check_caps(struct efx_nic *efx, const struct efx_tc_match_fields *mask, struct netlink_ext_ack *extack); +int efx_mae_check_encap_match_caps(struct efx_nic *efx, bool ipv6, + struct netlink_ext_ack *extack); +int efx_mae_check_encap_type_supported(struct efx_nic *efx, + enum efx_encap_type typ); int efx_mae_allocate_counter(struct efx_nic *efx, struct efx_tc_counter *cnt); int efx_mae_free_counter(struct efx_nic *efx, struct efx_tc_counter *cnt); @@ -91,6 +97,11 @@ int efx_mae_alloc_action_set_list(struct efx_nic *efx, int efx_mae_free_action_set_list(struct efx_nic *efx, struct efx_tc_action_set_list *acts); +int efx_mae_register_encap_match(struct efx_nic *efx, + struct efx_tc_encap_match *encap); +int efx_mae_unregister_encap_match(struct efx_nic *efx, + struct efx_tc_encap_match *encap); + int efx_mae_insert_rule(struct efx_nic *efx, const struct efx_tc_match *match, u32 prio, u32 acts_id, u32 *id); int 
efx_mae_delete_rule(struct efx_nic *efx, u32 id); diff --git a/drivers/net/ethernet/sfc/mcdi.h b/drivers/net/ethernet/sfc/mcdi.h index b139b76febff..454e9d51a4c2 100644 --- a/drivers/net/ethernet/sfc/mcdi.h +++ b/drivers/net/ethernet/sfc/mcdi.h @@ -233,6 +233,11 @@ void efx_mcdi_sensor_event(struct efx_nic *efx, efx_qword_t *ev); ((void)BUILD_BUG_ON_ZERO(_field ## _LEN != 2), \ le16_to_cpu(*(__force const __le16 *)MCDI_STRUCT_PTR(_buf, _field))) /* Write a 16-bit field defined in the protocol as being big-endian. */ +#define MCDI_SET_WORD_BE(_buf, _field, _value) do { \ + BUILD_BUG_ON(MC_CMD_ ## _field ## _LEN != 2); \ + BUILD_BUG_ON(MC_CMD_ ## _field ## _OFST & 1); \ + *(__force __be16 *)MCDI_PTR(_buf, _field) = (_value); \ + } while (0) #define MCDI_STRUCT_SET_WORD_BE(_buf, _field, _value) do { \ BUILD_BUG_ON(_field ## _LEN != 2); \ BUILD_BUG_ON(_field ## _OFST & 1); \ diff --git a/drivers/net/ethernet/sfc/ptp.c b/drivers/net/ethernet/sfc/ptp.c index 9f07e1ba7780..0c40571133cb 100644 --- a/drivers/net/ethernet/sfc/ptp.c +++ b/drivers/net/ethernet/sfc/ptp.c @@ -33,6 +33,7 @@ #include <linux/ip.h> #include <linux/udp.h> #include <linux/time.h> +#include <linux/errno.h> #include <linux/ktime.h> #include <linux/module.h> #include <linux/pps_kernel.h> @@ -74,6 +75,9 @@ /* How long an unmatched event or packet can be held */ #define PKT_EVENT_LIFETIME_MS 10 +/* How long unused unicast filters can be held */ +#define UCAST_FILTER_EXPIRY_JIFFIES msecs_to_jiffies(30000) + /* Offsets into PTP packet for identification. These offsets are from the * start of the IP header, not the MAC header. Note that neither PTP V1 nor * PTP V2 permit the use of IPV4 options. 
@@ -118,8 +122,6 @@ #define PTP_MIN_LENGTH 63 -#define PTP_RXFILTERS_LEN 5 - #define PTP_ADDR_IPV4 0xe0000181 /* 224.0.1.129 */ #define PTP_ADDR_IPV6 {0xff, 0x0e, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \ 0, 0x01, 0x81} /* ff0e::181 */ @@ -214,6 +216,24 @@ struct efx_ptp_timeset { }; /** + * struct efx_ptp_rxfilter - Filter for PTP packets + * @list: Node of the list where the filter is added + * @ether_type: Network protocol of the filter (ETHER_P_IP / ETHER_P_IPV6) + * @loc_port: UDP port of the filter (PTP_EVENT_PORT / PTP_GENERAL_PORT) + * @loc_host: IPv4/v6 address of the filter + * @expiry: time when the filter expires, in jiffies + * @handle: Handle ID for the MCDI filters table + */ +struct efx_ptp_rxfilter { + struct list_head list; + __be16 ether_type; + __be16 loc_port; + __be32 loc_host[4]; + unsigned long expiry; + int handle; +}; + +/** * struct efx_ptp_data - Precision Time Protocol (PTP) state * @efx: The NIC context * @channel: The PTP channel (Siena only) @@ -227,10 +247,11 @@ struct efx_ptp_timeset { * @rx_evts: Instantiated events (on evt_list and evt_free_list) * @workwq: Work queue for processing pending PTP operations * @work: Work task + * @cleanup_work: Work task for periodic cleanup * @reset_required: A serious error has occurred and the PTP task needs to be * reset (disable, enable). 
- * @rxfilters: Receive filters when operating - * @rxfilters_count: Num of installed rxfilters, should be == PTP_RXFILTERS_LEN + * @rxfilters_mcast: Receive filters for multicast PTP packets + * @rxfilters_ucast: Receive filters for unicast PTP packets * @config: Current timestamp configuration * @enabled: PTP operation enabled * @mode: Mode in which PTP operating (PTP version) @@ -298,9 +319,10 @@ struct efx_ptp_data { struct efx_ptp_event_rx rx_evts[MAX_RECEIVE_EVENTS]; struct workqueue_struct *workwq; struct work_struct work; + struct delayed_work cleanup_work; bool reset_required; - u32 rxfilters[PTP_RXFILTERS_LEN]; - size_t rxfilters_count; + struct list_head rxfilters_mcast; + struct list_head rxfilters_ucast; struct hwtstamp_config config; bool enabled; unsigned int mode; @@ -358,6 +380,8 @@ static int efx_phc_settime(struct ptp_clock_info *ptp, const struct timespec64 *e_ts); static int efx_phc_enable(struct ptp_clock_info *ptp, struct ptp_clock_request *request, int on); +static int efx_ptp_insert_unicast_filter(struct efx_nic *efx, + struct sk_buff *skb); bool efx_ptp_use_mac_tx_timestamps(struct efx_nic *efx) { @@ -1103,6 +1127,8 @@ static void efx_ptp_xmit_skb_queue(struct efx_nic *efx, struct sk_buff *skb) tx_queue = efx_channel_get_tx_queue(ptp_data->channel, type); if (tx_queue && tx_queue->timestamping) { + skb_get(skb); + /* This code invokes normal driver TX code which is always * protected from softirqs when called from generic TX code, * which in turn disables preemption. Look at __dev_queue_xmit @@ -1126,6 +1152,13 @@ static void efx_ptp_xmit_skb_queue(struct efx_nic *efx, struct sk_buff *skb) local_bh_disable(); efx_enqueue_skb(tx_queue, skb); local_bh_enable(); + + /* We need to add the filters after enqueuing the packet. 
+ * Otherwise, there's high latency in sending back the + * timestamp, causing ptp4l timeouts + */ + efx_ptp_insert_unicast_filter(efx, skb); + dev_consume_skb_any(skb); } else { WARN_ONCE(1, "PTP channel has no timestamped tx queue\n"); dev_kfree_skb_any(skb); @@ -1135,11 +1168,11 @@ static void efx_ptp_xmit_skb_queue(struct efx_nic *efx, struct sk_buff *skb) /* Transmit a PTP packet, via the MCDI interface, to the wire. */ static void efx_ptp_xmit_skb_mc(struct efx_nic *efx, struct sk_buff *skb) { + MCDI_DECLARE_BUF(txtime, MC_CMD_PTP_OUT_TRANSMIT_LEN); struct efx_ptp_data *ptp_data = efx->ptp_data; struct skb_shared_hwtstamps timestamps; - int rc = -EIO; - MCDI_DECLARE_BUF(txtime, MC_CMD_PTP_OUT_TRANSMIT_LEN); size_t len; + int rc; MCDI_SET_DWORD(ptp_data->txbuf, PTP_IN_OP, MC_CMD_PTP_OP_TRANSMIT); MCDI_SET_DWORD(ptp_data->txbuf, PTP_IN_PERIPH_ID, 0); @@ -1173,7 +1206,10 @@ static void efx_ptp_xmit_skb_mc(struct efx_nic *efx, struct sk_buff *skb) skb_tstamp_tx(skb, &timestamps); - rc = 0; + /* Add the filters after sending back the timestamp to avoid delaying it + * or ptp4l may timeout.
+ */ + efx_ptp_insert_unicast_filter(efx, skb); fail: dev_kfree_skb_any(skb); @@ -1289,15 +1325,37 @@ static inline void efx_ptp_process_rx(struct efx_nic *efx, struct sk_buff *skb) local_bh_enable(); } -static void efx_ptp_remove_multicast_filters(struct efx_nic *efx) +static struct efx_ptp_rxfilter * +efx_ptp_find_filter(struct list_head *filter_list, struct efx_filter_spec *spec) { - struct efx_ptp_data *ptp = efx->ptp_data; + struct efx_ptp_rxfilter *rxfilter; - while (ptp->rxfilters_count) { - ptp->rxfilters_count--; - efx_filter_remove_id_safe(efx, EFX_FILTER_PRI_REQUIRED, - ptp->rxfilters[ptp->rxfilters_count]); + list_for_each_entry(rxfilter, filter_list, list) { + if (rxfilter->ether_type == spec->ether_type && + rxfilter->loc_port == spec->loc_port && + !memcmp(rxfilter->loc_host, spec->loc_host, sizeof(spec->loc_host))) + return rxfilter; } + + return NULL; +} + +static void efx_ptp_remove_one_filter(struct efx_nic *efx, + struct efx_ptp_rxfilter *rxfilter) +{ + efx_filter_remove_id_safe(efx, EFX_FILTER_PRI_REQUIRED, + rxfilter->handle); + list_del(&rxfilter->list); + kfree(rxfilter); +} + +static void efx_ptp_remove_filters(struct efx_nic *efx, + struct list_head *filter_list) +{ + struct efx_ptp_rxfilter *rxfilter, *tmp; + + list_for_each_entry_safe(rxfilter, tmp, filter_list, list) + efx_ptp_remove_one_filter(efx, rxfilter); } static void efx_ptp_init_filter(struct efx_nic *efx, @@ -1311,48 +1369,80 @@ static void efx_ptp_init_filter(struct efx_nic *efx, } static int efx_ptp_insert_filter(struct efx_nic *efx, - struct efx_filter_spec *rxfilter) + struct list_head *filter_list, + struct efx_filter_spec *spec, + unsigned long expiry) { struct efx_ptp_data *ptp = efx->ptp_data; + struct efx_ptp_rxfilter *rxfilter; + int rc; + + rxfilter = efx_ptp_find_filter(filter_list, spec); + if (rxfilter) { + rxfilter->expiry = expiry; + return 0; + } + + rxfilter = kzalloc(sizeof(*rxfilter), GFP_KERNEL); + if (!rxfilter) + return -ENOMEM; - int rc = 
efx_filter_insert_filter(efx, rxfilter, true); + rc = efx_filter_insert_filter(efx, spec, true); if (rc < 0) - return rc; - ptp->rxfilters[ptp->rxfilters_count] = rc; - ptp->rxfilters_count++; + goto fail; + + rxfilter->handle = rc; + rxfilter->ether_type = spec->ether_type; + rxfilter->loc_port = spec->loc_port; + memcpy(rxfilter->loc_host, spec->loc_host, sizeof(spec->loc_host)); + rxfilter->expiry = expiry; + list_add(&rxfilter->list, filter_list); + + queue_delayed_work(ptp->workwq, &ptp->cleanup_work, + UCAST_FILTER_EXPIRY_JIFFIES + 1); + return 0; + +fail: + kfree(rxfilter); + return rc; } -static int efx_ptp_insert_ipv4_filter(struct efx_nic *efx, u16 port) +static int efx_ptp_insert_ipv4_filter(struct efx_nic *efx, + struct list_head *filter_list, + __be32 addr, u16 port, + unsigned long expiry) { - struct efx_filter_spec rxfilter; + struct efx_filter_spec spec; - efx_ptp_init_filter(efx, &rxfilter); - efx_filter_set_ipv4_local(&rxfilter, IPPROTO_UDP, htonl(PTP_ADDR_IPV4), - htons(port)); - return efx_ptp_insert_filter(efx, &rxfilter); + efx_ptp_init_filter(efx, &spec); + efx_filter_set_ipv4_local(&spec, IPPROTO_UDP, addr, htons(port)); + return efx_ptp_insert_filter(efx, filter_list, &spec, expiry); } -static int efx_ptp_insert_ipv6_filter(struct efx_nic *efx, u16 port) +static int efx_ptp_insert_ipv6_filter(struct efx_nic *efx, + struct list_head *filter_list, + struct in6_addr *addr, u16 port, + unsigned long expiry) { - const struct in6_addr addr = {{PTP_ADDR_IPV6}}; - struct efx_filter_spec rxfilter; + struct efx_filter_spec spec; - efx_ptp_init_filter(efx, &rxfilter); - efx_filter_set_ipv6_local(&rxfilter, IPPROTO_UDP, &addr, htons(port)); - return efx_ptp_insert_filter(efx, &rxfilter); + efx_ptp_init_filter(efx, &spec); + efx_filter_set_ipv6_local(&spec, IPPROTO_UDP, addr, htons(port)); + return efx_ptp_insert_filter(efx, filter_list, &spec, expiry); } -static int efx_ptp_insert_eth_filter(struct efx_nic *efx) +static int 
efx_ptp_insert_eth_multicast_filter(struct efx_nic *efx) { + struct efx_ptp_data *ptp = efx->ptp_data; const u8 addr[ETH_ALEN] = PTP_ADDR_ETHER; - struct efx_filter_spec rxfilter; + struct efx_filter_spec spec; - efx_ptp_init_filter(efx, &rxfilter); - efx_filter_set_eth_local(&rxfilter, EFX_FILTER_VID_UNSPEC, addr); - rxfilter.match_flags |= EFX_FILTER_MATCH_ETHER_TYPE; - rxfilter.ether_type = htons(ETH_P_1588); - return efx_ptp_insert_filter(efx, &rxfilter); + efx_ptp_init_filter(efx, &spec); + efx_filter_set_eth_local(&spec, EFX_FILTER_VID_UNSPEC, addr); + spec.match_flags |= EFX_FILTER_MATCH_ETHER_TYPE; + spec.ether_type = htons(ETH_P_1588); + return efx_ptp_insert_filter(efx, &ptp->rxfilters_mcast, &spec, 0); } static int efx_ptp_insert_multicast_filters(struct efx_nic *efx) @@ -1360,17 +1450,21 @@ static int efx_ptp_insert_multicast_filters(struct efx_nic *efx) struct efx_ptp_data *ptp = efx->ptp_data; int rc; - if (!ptp->channel || ptp->rxfilters_count) + if (!ptp->channel || !list_empty(&ptp->rxfilters_mcast)) return 0; /* Must filter on both event and general ports to ensure * that there is no packet re-ordering. 
*/ - rc = efx_ptp_insert_ipv4_filter(efx, PTP_EVENT_PORT); + rc = efx_ptp_insert_ipv4_filter(efx, &ptp->rxfilters_mcast, + htonl(PTP_ADDR_IPV4), PTP_EVENT_PORT, + 0); if (rc < 0) goto fail; - rc = efx_ptp_insert_ipv4_filter(efx, PTP_GENERAL_PORT); + rc = efx_ptp_insert_ipv4_filter(efx, &ptp->rxfilters_mcast, + htonl(PTP_ADDR_IPV4), PTP_GENERAL_PORT, + 0); if (rc < 0) goto fail; @@ -1378,15 +1472,19 @@ static int efx_ptp_insert_multicast_filters(struct efx_nic *efx) * PTP over IPv6 and Ethernet */ if (efx_ptp_use_mac_tx_timestamps(efx)) { - rc = efx_ptp_insert_ipv6_filter(efx, PTP_EVENT_PORT); + struct in6_addr ipv6_addr = {{PTP_ADDR_IPV6}}; + + rc = efx_ptp_insert_ipv6_filter(efx, &ptp->rxfilters_mcast, + &ipv6_addr, PTP_EVENT_PORT, 0); if (rc < 0) goto fail; - rc = efx_ptp_insert_ipv6_filter(efx, PTP_GENERAL_PORT); + rc = efx_ptp_insert_ipv6_filter(efx, &ptp->rxfilters_mcast, + &ipv6_addr, PTP_GENERAL_PORT, 0); if (rc < 0) goto fail; - rc = efx_ptp_insert_eth_filter(efx); + rc = efx_ptp_insert_eth_multicast_filter(efx); if (rc < 0) goto fail; } @@ -1394,7 +1492,64 @@ static int efx_ptp_insert_multicast_filters(struct efx_nic *efx) return 0; fail: - efx_ptp_remove_multicast_filters(efx); + efx_ptp_remove_filters(efx, &ptp->rxfilters_mcast); + return rc; +} + +static bool efx_ptp_valid_unicast_event_pkt(struct sk_buff *skb) +{ + if (skb->protocol == htons(ETH_P_IP)) { + return ip_hdr(skb)->daddr != htonl(PTP_ADDR_IPV4) && + ip_hdr(skb)->protocol == IPPROTO_UDP && + udp_hdr(skb)->source == htons(PTP_EVENT_PORT); + } else if (skb->protocol == htons(ETH_P_IPV6)) { + struct in6_addr mcast_addr = {{PTP_ADDR_IPV6}}; + + return !ipv6_addr_equal(&ipv6_hdr(skb)->daddr, &mcast_addr) && + ipv6_hdr(skb)->nexthdr == IPPROTO_UDP && + udp_hdr(skb)->source == htons(PTP_EVENT_PORT); + } + return false; +} + +static int efx_ptp_insert_unicast_filter(struct efx_nic *efx, + struct sk_buff *skb) +{ + struct efx_ptp_data *ptp = efx->ptp_data; + unsigned long expiry; + int rc; + + if 
(!efx_ptp_valid_unicast_event_pkt(skb)) + return -EINVAL; + + expiry = jiffies + UCAST_FILTER_EXPIRY_JIFFIES; + + if (skb->protocol == htons(ETH_P_IP)) { + __be32 addr = ip_hdr(skb)->saddr; + + rc = efx_ptp_insert_ipv4_filter(efx, &ptp->rxfilters_ucast, + addr, PTP_EVENT_PORT, expiry); + if (rc < 0) + goto out; + + rc = efx_ptp_insert_ipv4_filter(efx, &ptp->rxfilters_ucast, + addr, PTP_GENERAL_PORT, expiry); + } else if (efx_ptp_use_mac_tx_timestamps(efx)) { + /* IPv6 PTP only supported by devices with MAC hw timestamp */ + struct in6_addr *addr = &ipv6_hdr(skb)->saddr; + + rc = efx_ptp_insert_ipv6_filter(efx, &ptp->rxfilters_ucast, + addr, PTP_EVENT_PORT, expiry); + if (rc < 0) + goto out; + + rc = efx_ptp_insert_ipv6_filter(efx, &ptp->rxfilters_ucast, + addr, PTP_GENERAL_PORT, expiry); + } else { + return -EOPNOTSUPP; + } + +out: return rc; } @@ -1419,7 +1574,7 @@ static int efx_ptp_start(struct efx_nic *efx) return 0; fail: - efx_ptp_remove_multicast_filters(efx); + efx_ptp_remove_filters(efx, &ptp->rxfilters_mcast); return rc; } @@ -1435,7 +1590,8 @@ static int efx_ptp_stop(struct efx_nic *efx) rc = efx_ptp_disable(efx); - efx_ptp_remove_multicast_filters(efx); + efx_ptp_remove_filters(efx, &ptp->rxfilters_mcast); + efx_ptp_remove_filters(efx, &ptp->rxfilters_ucast); /* Make sure RX packets are really delivered */ efx_ptp_deliver_rx_queue(&efx->ptp_data->rxq); @@ -1499,6 +1655,23 @@ static void efx_ptp_worker(struct work_struct *work) efx_ptp_process_rx(efx, skb); } +static void efx_ptp_cleanup_worker(struct work_struct *work) +{ + struct efx_ptp_data *ptp = + container_of(work, struct efx_ptp_data, cleanup_work.work); + struct efx_ptp_rxfilter *rxfilter, *tmp; + + list_for_each_entry_safe(rxfilter, tmp, &ptp->rxfilters_ucast, list) { + if (time_is_before_jiffies(rxfilter->expiry)) + efx_ptp_remove_one_filter(ptp->efx, rxfilter); + } + + if (!list_empty(&ptp->rxfilters_ucast)) { + queue_delayed_work(ptp->workwq, &ptp->cleanup_work, + UCAST_FILTER_EXPIRY_JIFFIES 
+ 1); + } +} + static const struct ptp_clock_info efx_phc_clock_info = { .owner = THIS_MODULE, .name = "sfc", @@ -1557,6 +1730,7 @@ int efx_ptp_probe(struct efx_nic *efx, struct efx_channel *channel) } INIT_WORK(&ptp->work, efx_ptp_worker); + INIT_DELAYED_WORK(&ptp->cleanup_work, efx_ptp_cleanup_worker); ptp->config.flags = 0; ptp->config.tx_type = HWTSTAMP_TX_OFF; ptp->config.rx_filter = HWTSTAMP_FILTER_NONE; @@ -1566,6 +1740,9 @@ int efx_ptp_probe(struct efx_nic *efx, struct efx_channel *channel) for (pos = 0; pos < MAX_RECEIVE_EVENTS; pos++) list_add(&ptp->rx_evts[pos].link, &ptp->evt_free_list); + INIT_LIST_HEAD(&ptp->rxfilters_mcast); + INIT_LIST_HEAD(&ptp->rxfilters_ucast); + /* Get the NIC PTP attributes and set up time conversions */ rc = efx_ptp_get_attributes(efx); if (rc < 0) @@ -1645,6 +1822,7 @@ void efx_ptp_remove(struct efx_nic *efx) (void)efx_ptp_disable(efx); cancel_work_sync(&efx->ptp_data->work); + cancel_delayed_work_sync(&efx->ptp_data->cleanup_work); if (efx->ptp_data->pps_workwq) cancel_work_sync(&efx->ptp_data->pps_work); diff --git a/drivers/net/ethernet/sfc/siena/efx.c b/drivers/net/ethernet/sfc/siena/efx.c index ef52ec71d197..8c557f6a183c 100644 --- a/drivers/net/ethernet/sfc/siena/efx.c +++ b/drivers/net/ethernet/sfc/siena/efx.c @@ -18,7 +18,6 @@ #include <linux/ethtool.h> #include <linux/topology.h> #include <linux/gfp.h> -#include <linux/aer.h> #include <linux/interrupt.h> #include "net_driver.h" #include <net/gre.h> @@ -874,8 +873,6 @@ static void efx_pci_remove(struct pci_dev *pci_dev) efx_siena_fini_struct(efx); free_netdev(efx->net_dev); - - pci_disable_pcie_error_reporting(pci_dev); }; /* NIC VPD information @@ -1094,8 +1091,6 @@ static int efx_pci_probe(struct pci_dev *pci_dev, netif_warn(efx, probe, efx->net_dev, "failed to create MTDs (%d)\n", rc); - (void)pci_enable_pcie_error_reporting(pci_dev); - if (efx->type->udp_tnl_push_ports) efx->type->udp_tnl_push_ports(efx); diff --git a/drivers/net/ethernet/sfc/tc.c 
b/drivers/net/ethernet/sfc/tc.c index deeaab9ee761..0327639a628a 100644 --- a/drivers/net/ethernet/sfc/tc.c +++ b/drivers/net/ethernet/sfc/tc.c @@ -10,12 +10,24 @@ */ #include <net/pkt_cls.h> +#include <net/vxlan.h> +#include <net/geneve.h> #include "tc.h" #include "tc_bindings.h" #include "mae.h" #include "ef100_rep.h" #include "efx.h" +static enum efx_encap_type efx_tc_indr_netdev_type(struct net_device *net_dev) +{ + if (netif_is_vxlan(net_dev)) + return EFX_ENCAP_TYPE_VXLAN; + if (netif_is_geneve(net_dev)) + return EFX_ENCAP_TYPE_GENEVE; + + return EFX_ENCAP_TYPE_NONE; +} + #define EFX_EFV_PF NULL /* Look up the representor information (efv) for a device. * May return NULL for the PF (us), or an error pointer for a device that @@ -43,6 +55,20 @@ static struct efx_rep *efx_tc_flower_lookup_efv(struct efx_nic *efx, return efv; } +/* Convert a driver-internal vport ID into an internal device (PF or VF) */ +static s64 efx_tc_flower_internal_mport(struct efx_nic *efx, struct efx_rep *efv) +{ + u32 mport; + + if (IS_ERR(efv)) + return PTR_ERR(efv); + if (!efv) /* device is PF (us) */ + efx_mae_mport_uplink(efx, &mport); + else /* device is repr */ + efx_mae_mport_mport(efx, efv->mport, &mport); + return mport; +} + /* Convert a driver-internal vport ID into an external device (wire or VF) */ static s64 efx_tc_flower_external_mport(struct efx_nic *efx, struct efx_rep *efv) { @@ -57,6 +83,12 @@ static s64 efx_tc_flower_external_mport(struct efx_nic *efx, struct efx_rep *efv return mport; } +static const struct rhashtable_params efx_tc_encap_match_ht_params = { + .key_len = offsetof(struct efx_tc_encap_match, linkage), + .key_offset = 0, + .head_offset = offsetof(struct efx_tc_encap_match, linkage), +}; + static const struct rhashtable_params efx_tc_match_action_ht_params = { .key_len = sizeof(unsigned long), .key_offset = offsetof(struct efx_tc_flow_rule, cookie), @@ -66,7 +98,7 @@ static const struct rhashtable_params efx_tc_match_action_ht_params = { static void 
efx_tc_free_action_set(struct efx_nic *efx, struct efx_tc_action_set *act, bool in_hw) { - /* Failure paths calling this on the 'running action' set in_hw=false, + /* Failure paths calling this on the 'cursor' action set in_hw=false, * because if the alloc had succeeded we'd've put it in acts.list and * not still have it in act. */ @@ -100,15 +132,6 @@ static void efx_tc_free_action_set_list(struct efx_nic *efx, /* Don't kfree, as acts is embedded inside a struct efx_tc_flow_rule */ } -static void efx_tc_delete_rule(struct efx_nic *efx, struct efx_tc_flow_rule *rule) -{ - efx_mae_delete_rule(efx, rule->fw_id); - - /* Release entries in subsidiary tables */ - efx_tc_free_action_set_list(efx, &rule->acts, true); - rule->fw_id = MC_CMD_MAE_ACTION_RULE_INSERT_OUT_ACTION_RULE_ID_NULL; -} - static void efx_tc_flow_free(void *ptr, void *arg) { struct efx_tc_flow_rule *rule = ptr; @@ -193,6 +216,11 @@ static int efx_tc_flower_parse_match(struct efx_nic *efx, BIT(FLOW_DISSECTOR_KEY_IPV4_ADDRS) | BIT(FLOW_DISSECTOR_KEY_IPV6_ADDRS) | BIT(FLOW_DISSECTOR_KEY_PORTS) | + BIT(FLOW_DISSECTOR_KEY_ENC_KEYID) | + BIT(FLOW_DISSECTOR_KEY_ENC_IPV4_ADDRS) | + BIT(FLOW_DISSECTOR_KEY_ENC_IPV6_ADDRS) | + BIT(FLOW_DISSECTOR_KEY_ENC_PORTS) | + BIT(FLOW_DISSECTOR_KEY_ENC_CONTROL) | BIT(FLOW_DISSECTOR_KEY_TCP) | BIT(FLOW_DISSECTOR_KEY_IP))) { NL_SET_ERR_MSG_FMT_MOD(extack, "Unsupported flower keys %#x", @@ -280,12 +308,228 @@ static int efx_tc_flower_parse_match(struct efx_nic *efx, MAP_KEY_AND_MASK(PORTS, ports, src, l4_sport); MAP_KEY_AND_MASK(PORTS, ports, dst, l4_dport); MAP_KEY_AND_MASK(TCP, tcp, flags, tcp_flags); + if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_ENC_CONTROL)) { + struct flow_match_control fm; + flow_rule_match_enc_control(rule, &fm); + if (fm.mask->flags) { + NL_SET_ERR_MSG_FMT_MOD(extack, "Unsupported match on enc_control.flags %#x", + fm.mask->flags); + return -EOPNOTSUPP; + } + if (!IS_ALL_ONES(fm.mask->addr_type)) { + NL_SET_ERR_MSG_FMT_MOD(extack, "Unsupported enc 
addr_type mask %u (key %u)", + fm.mask->addr_type, + fm.key->addr_type); + return -EOPNOTSUPP; + } + switch (fm.key->addr_type) { + case FLOW_DISSECTOR_KEY_IPV4_ADDRS: + MAP_ENC_KEY_AND_MASK(IPV4_ADDRS, ipv4_addrs, enc_ipv4_addrs, + src, enc_src_ip); + MAP_ENC_KEY_AND_MASK(IPV4_ADDRS, ipv4_addrs, enc_ipv4_addrs, + dst, enc_dst_ip); + break; +#ifdef CONFIG_IPV6 + case FLOW_DISSECTOR_KEY_IPV6_ADDRS: + MAP_ENC_KEY_AND_MASK(IPV6_ADDRS, ipv6_addrs, enc_ipv6_addrs, + src, enc_src_ip6); + MAP_ENC_KEY_AND_MASK(IPV6_ADDRS, ipv6_addrs, enc_ipv6_addrs, + dst, enc_dst_ip6); + break; +#endif + default: + NL_SET_ERR_MSG_FMT_MOD(extack, + "Unsupported enc addr_type %u (supported are IPv4, IPv6)", + fm.key->addr_type); + return -EOPNOTSUPP; + } + MAP_ENC_KEY_AND_MASK(IP, ip, enc_ip, tos, enc_ip_tos); + MAP_ENC_KEY_AND_MASK(IP, ip, enc_ip, ttl, enc_ip_ttl); + MAP_ENC_KEY_AND_MASK(PORTS, ports, enc_ports, src, enc_sport); + MAP_ENC_KEY_AND_MASK(PORTS, ports, enc_ports, dst, enc_dport); + MAP_ENC_KEY_AND_MASK(KEYID, enc_keyid, enc_keyid, keyid, enc_keyid); + } else if (dissector->used_keys & + (BIT(FLOW_DISSECTOR_KEY_ENC_KEYID) | + BIT(FLOW_DISSECTOR_KEY_ENC_IPV4_ADDRS) | + BIT(FLOW_DISSECTOR_KEY_ENC_IPV6_ADDRS) | + BIT(FLOW_DISSECTOR_KEY_ENC_IP) | + BIT(FLOW_DISSECTOR_KEY_ENC_PORTS))) { + NL_SET_ERR_MSG_FMT_MOD(extack, "Flower enc keys require enc_control (keys: %#x)", + dissector->used_keys); + return -EOPNOTSUPP; + } + + return 0; +} + +static int efx_tc_flower_record_encap_match(struct efx_nic *efx, + struct efx_tc_match *match, + enum efx_encap_type type, + struct netlink_ext_ack *extack) +{ + struct efx_tc_encap_match *encap, *old; + bool ipv6 = false; + int rc; + + /* We require that the socket-defining fields (IP addrs and UDP dest + * port) are present and exact-match. Other fields are currently not + * allowed. 
This meets what OVS will ask for, and means that we don't + * need to handle difficult checks for overlapping matches as could + * come up if we allowed masks or varying sets of match fields. + */ + if (match->mask.enc_dst_ip | match->mask.enc_src_ip) { + if (!IS_ALL_ONES(match->mask.enc_dst_ip)) { + NL_SET_ERR_MSG_MOD(extack, + "Egress encap match is not exact on dst IP address"); + return -EOPNOTSUPP; + } + if (!IS_ALL_ONES(match->mask.enc_src_ip)) { + NL_SET_ERR_MSG_MOD(extack, + "Egress encap match is not exact on src IP address"); + return -EOPNOTSUPP; + } +#ifdef CONFIG_IPV6 + if (!ipv6_addr_any(&match->mask.enc_dst_ip6) || + !ipv6_addr_any(&match->mask.enc_src_ip6)) { + NL_SET_ERR_MSG_MOD(extack, + "Egress encap match on both IPv4 and IPv6, don't understand"); + return -EOPNOTSUPP; + } + } else { + ipv6 = true; + if (!efx_ipv6_addr_all_ones(&match->mask.enc_dst_ip6)) { + NL_SET_ERR_MSG_MOD(extack, + "Egress encap match is not exact on dst IP address"); + return -EOPNOTSUPP; + } + if (!efx_ipv6_addr_all_ones(&match->mask.enc_src_ip6)) { + NL_SET_ERR_MSG_MOD(extack, + "Egress encap match is not exact on src IP address"); + return -EOPNOTSUPP; + } +#endif + } + if (!IS_ALL_ONES(match->mask.enc_dport)) { + NL_SET_ERR_MSG_MOD(extack, "Egress encap match is not exact on dst UDP port"); + return -EOPNOTSUPP; + } + if (match->mask.enc_sport) { + NL_SET_ERR_MSG_MOD(extack, "Egress encap match on src UDP port not supported"); + return -EOPNOTSUPP; + } + if (match->mask.enc_ip_tos) { + NL_SET_ERR_MSG_MOD(extack, "Egress encap match on IP ToS not supported"); + return -EOPNOTSUPP; + } + if (match->mask.enc_ip_ttl) { + NL_SET_ERR_MSG_MOD(extack, "Egress encap match on IP TTL not supported"); + return -EOPNOTSUPP; + } + + rc = efx_mae_check_encap_match_caps(efx, ipv6, extack); + if (rc) { + NL_SET_ERR_MSG_FMT_MOD(extack, "MAE hw reports no support for IPv%d encap matches", + ipv6 ? 
6 : 4); + return -EOPNOTSUPP; + } + + encap = kzalloc(sizeof(*encap), GFP_USER); + if (!encap) + return -ENOMEM; + encap->src_ip = match->value.enc_src_ip; + encap->dst_ip = match->value.enc_dst_ip; +#ifdef CONFIG_IPV6 + encap->src_ip6 = match->value.enc_src_ip6; + encap->dst_ip6 = match->value.enc_dst_ip6; +#endif + encap->udp_dport = match->value.enc_dport; + encap->tun_type = type; + old = rhashtable_lookup_get_insert_fast(&efx->tc->encap_match_ht, + &encap->linkage, + efx_tc_encap_match_ht_params); + if (old) { + /* don't need our new entry */ + kfree(encap); + if (old->tun_type != type) { + NL_SET_ERR_MSG_FMT_MOD(extack, + "Egress encap match with conflicting tun_type %u != %u", + old->tun_type, type); + return -EEXIST; + } + if (!refcount_inc_not_zero(&old->ref)) + return -EAGAIN; + /* existing entry found */ + encap = old; + } else { + rc = efx_mae_register_encap_match(efx, encap); + if (rc) { + NL_SET_ERR_MSG_MOD(extack, "Failed to record egress encap match in HW"); + goto fail; + } + refcount_set(&encap->ref, 1); + } + match->encap = encap; return 0; +fail: + rhashtable_remove_fast(&efx->tc->encap_match_ht, &encap->linkage, + efx_tc_encap_match_ht_params); + kfree(encap); + return rc; +} + +static void efx_tc_flower_release_encap_match(struct efx_nic *efx, + struct efx_tc_encap_match *encap) +{ + int rc; + + if (!refcount_dec_and_test(&encap->ref)) + return; /* still in use */ + + rc = efx_mae_unregister_encap_match(efx, encap); + if (rc) + /* Display message but carry on and remove entry from our + * SW tables, because there's not much we can do about it. 
+ */ + netif_err(efx, drv, efx->net_dev, + "Failed to release encap match %#x, rc %d\n", + encap->fw_id, rc); + rhashtable_remove_fast(&efx->tc->encap_match_ht, &encap->linkage, + efx_tc_encap_match_ht_params); + kfree(encap); +} + +static void efx_tc_delete_rule(struct efx_nic *efx, struct efx_tc_flow_rule *rule) +{ + efx_mae_delete_rule(efx, rule->fw_id); + + /* Release entries in subsidiary tables */ + efx_tc_free_action_set_list(efx, &rule->acts, true); + if (rule->match.encap) + efx_tc_flower_release_encap_match(efx, rule->match.encap); + rule->fw_id = MC_CMD_MAE_ACTION_RULE_INSERT_OUT_ACTION_RULE_ID_NULL; +} + +static const char *efx_tc_encap_type_name(enum efx_encap_type typ) +{ + switch (typ) { + case EFX_ENCAP_TYPE_NONE: + return "none"; + case EFX_ENCAP_TYPE_VXLAN: + return "vxlan"; + case EFX_ENCAP_TYPE_GENEVE: + return "geneve"; + default: + pr_warn_once("Unknown efx_encap_type %d encountered\n", typ); + return "unknown"; + } } /* For details of action order constraints refer to SF-123102-TC-1§12.6.1 */ enum efx_tc_action_order { + EFX_TC_AO_DECAP, + EFX_TC_AO_VLAN_POP, + EFX_TC_AO_VLAN_PUSH, EFX_TC_AO_COUNT, EFX_TC_AO_DELIVER }; @@ -294,6 +538,24 @@ static bool efx_tc_flower_action_order_ok(const struct efx_tc_action_set *act, enum efx_tc_action_order new) { switch (new) { + case EFX_TC_AO_DECAP: + if (act->decap) + return false; + fallthrough; + case EFX_TC_AO_VLAN_POP: + if (act->vlan_pop >= 2) + return false; + /* If we've already pushed a VLAN, we can't then pop it; + * the hardware would instead try to pop an existing VLAN + * before pushing the new one. 
+ */ + if (act->vlan_push) + return false; + fallthrough; + case EFX_TC_AO_VLAN_PUSH: + if (act->vlan_push >= 2) + return false; + fallthrough; case EFX_TC_AO_COUNT: if (act->count) return false; @@ -307,6 +569,286 @@ static bool efx_tc_flower_action_order_ok(const struct efx_tc_action_set *act, } } +static int efx_tc_flower_replace_foreign(struct efx_nic *efx, + struct net_device *net_dev, + struct flow_cls_offload *tc) +{ + struct flow_rule *fr = flow_cls_offload_flow_rule(tc); + struct netlink_ext_ack *extack = tc->common.extack; + struct efx_tc_flow_rule *rule = NULL, *old = NULL; + struct efx_tc_action_set *act = NULL; + bool found = false, uplinked = false; + const struct flow_action_entry *fa; + struct efx_tc_match match; + struct efx_rep *to_efv; + s64 rc; + int i; + + /* Parse match */ + memset(&match, 0, sizeof(match)); + rc = efx_tc_flower_parse_match(efx, fr, &match, NULL); + if (rc) + return rc; + /* The rule as given to us doesn't specify a source netdevice. + * But, determining whether packets from a VF should match it is + * complicated, so leave those to the software slowpath: qualify + * the filter with source m-port == wire. + */ + rc = efx_tc_flower_external_mport(efx, EFX_EFV_PF); + if (rc < 0) { + NL_SET_ERR_MSG_MOD(extack, "Failed to identify ingress m-port for foreign filter"); + return rc; + } + match.value.ingress_port = rc; + match.mask.ingress_port = ~0; + + if (tc->common.chain_index) { + NL_SET_ERR_MSG_MOD(extack, "No support for nonzero chain_index"); + return -EOPNOTSUPP; + } + match.mask.recirc_id = 0xff; + + flow_action_for_each(i, fa, &fr->action) { + switch (fa->id) { + case FLOW_ACTION_REDIRECT: + case FLOW_ACTION_MIRRED: /* mirred means mirror here */ + to_efv = efx_tc_flower_lookup_efv(efx, fa->dev); + if (IS_ERR(to_efv)) + continue; + found = true; + break; + default: + break; + } + } + if (!found) { /* We don't care. 
*/ + netif_dbg(efx, drv, efx->net_dev, + "Ignoring foreign filter that doesn't egdev us\n"); + rc = -EOPNOTSUPP; + goto release; + } + + rc = efx_mae_match_check_caps(efx, &match.mask, NULL); + if (rc) + goto release; + + if (efx_tc_match_is_encap(&match.mask)) { + enum efx_encap_type type; + + type = efx_tc_indr_netdev_type(net_dev); + if (type == EFX_ENCAP_TYPE_NONE) { + NL_SET_ERR_MSG_MOD(extack, + "Egress encap match on unsupported tunnel device"); + rc = -EOPNOTSUPP; + goto release; + } + + rc = efx_mae_check_encap_type_supported(efx, type); + if (rc) { + NL_SET_ERR_MSG_FMT_MOD(extack, + "Firmware reports no support for %s encap match", + efx_tc_encap_type_name(type)); + goto release; + } + + rc = efx_tc_flower_record_encap_match(efx, &match, type, + extack); + if (rc) + goto release; + } else { + /* This is not a tunnel decap rule, ignore it */ + netif_dbg(efx, drv, efx->net_dev, + "Ignoring foreign filter without encap match\n"); + rc = -EOPNOTSUPP; + goto release; + } + + rule = kzalloc(sizeof(*rule), GFP_USER); + if (!rule) { + rc = -ENOMEM; + goto release; + } + INIT_LIST_HEAD(&rule->acts.list); + rule->cookie = tc->cookie; + old = rhashtable_lookup_get_insert_fast(&efx->tc->match_action_ht, + &rule->linkage, + efx_tc_match_action_ht_params); + if (old) { + netif_dbg(efx, drv, efx->net_dev, + "Ignoring already-offloaded rule (cookie %lx)\n", + tc->cookie); + rc = -EEXIST; + goto release; + } + + act = kzalloc(sizeof(*act), GFP_USER); + if (!act) { + rc = -ENOMEM; + goto release; + } + + /* Parse actions. For foreign rules we only support decap & redirect. + * See corresponding code in efx_tc_flower_replace() for theory of + * operation & how 'act' cursor is used. + */ + flow_action_for_each(i, fa, &fr->action) { + struct efx_tc_action_set save; + + switch (fa->id) { + case FLOW_ACTION_REDIRECT: + case FLOW_ACTION_MIRRED: + /* See corresponding code in efx_tc_flower_replace() for + * long explanations of what's going on here. 
+ */ + save = *act; + if (fa->hw_stats) { + struct efx_tc_counter_index *ctr; + + if (!(fa->hw_stats & FLOW_ACTION_HW_STATS_DELAYED)) { + NL_SET_ERR_MSG_FMT_MOD(extack, + "hw_stats_type %u not supported (only 'delayed')", + fa->hw_stats); + rc = -EOPNOTSUPP; + goto release; + } + if (!efx_tc_flower_action_order_ok(act, EFX_TC_AO_COUNT)) { + rc = -EOPNOTSUPP; + goto release; + } + + ctr = efx_tc_flower_get_counter_index(efx, + tc->cookie, + EFX_TC_COUNTER_TYPE_AR); + if (IS_ERR(ctr)) { + rc = PTR_ERR(ctr); + NL_SET_ERR_MSG_MOD(extack, "Failed to obtain a counter"); + goto release; + } + act->count = ctr; + } + + if (!efx_tc_flower_action_order_ok(act, EFX_TC_AO_DELIVER)) { + /* can't happen */ + rc = -EOPNOTSUPP; + NL_SET_ERR_MSG_MOD(extack, + "Deliver action violates action order (can't happen)"); + goto release; + } + to_efv = efx_tc_flower_lookup_efv(efx, fa->dev); + /* PF implies egdev is us, in which case we really + * want to deliver to the uplink (because this is an + * ingress filter). If we don't recognise the egdev + * at all, then we'd better trap so SW can handle it. 
+ */ + if (IS_ERR(to_efv)) + to_efv = EFX_EFV_PF; + if (to_efv == EFX_EFV_PF) { + if (uplinked) + break; + uplinked = true; + } + rc = efx_tc_flower_internal_mport(efx, to_efv); + if (rc < 0) { + NL_SET_ERR_MSG_MOD(extack, "Failed to identify egress m-port"); + goto release; + } + act->dest_mport = rc; + act->deliver = 1; + rc = efx_mae_alloc_action_set(efx, act); + if (rc) { + NL_SET_ERR_MSG_MOD(extack, + "Failed to write action set to hw (mirred)"); + goto release; + } + list_add_tail(&act->list, &rule->acts.list); + act = NULL; + if (fa->id == FLOW_ACTION_REDIRECT) + break; /* end of the line */ + /* Mirror, so continue on with saved act */ + act = kzalloc(sizeof(*act), GFP_USER); + if (!act) { + rc = -ENOMEM; + goto release; + } + *act = save; + break; + case FLOW_ACTION_TUNNEL_DECAP: + if (!efx_tc_flower_action_order_ok(act, EFX_TC_AO_DECAP)) { + rc = -EINVAL; + NL_SET_ERR_MSG_MOD(extack, "Decap action violates action order"); + goto release; + } + act->decap = 1; + /* If we previously delivered/trapped to uplink, now + * that we've decapped we'll want another copy if we + * try to deliver/trap to uplink again. 
+ */ + uplinked = false; + break; + default: + NL_SET_ERR_MSG_FMT_MOD(extack, "Unhandled action %u", + fa->id); + rc = -EOPNOTSUPP; + goto release; + } + } + + if (act) { + if (!uplinked) { + /* Not shot/redirected, so deliver to default dest (which is + * the uplink, as this is an ingress filter) + */ + efx_mae_mport_uplink(efx, &act->dest_mport); + act->deliver = 1; + } + rc = efx_mae_alloc_action_set(efx, act); + if (rc) { + NL_SET_ERR_MSG_MOD(extack, "Failed to write action set to hw (deliver)"); + goto release; + } + list_add_tail(&act->list, &rule->acts.list); + act = NULL; /* Prevent double-free in error path */ + } + + rule->match = match; + + netif_dbg(efx, drv, efx->net_dev, + "Successfully parsed foreign filter (cookie %lx)\n", + tc->cookie); + + rc = efx_mae_alloc_action_set_list(efx, &rule->acts); + if (rc) { + NL_SET_ERR_MSG_MOD(extack, "Failed to write action set list to hw"); + goto release; + } + rc = efx_mae_insert_rule(efx, &rule->match, EFX_TC_PRIO_TC, + rule->acts.fw_id, &rule->fw_id); + if (rc) { + NL_SET_ERR_MSG_MOD(extack, "Failed to insert rule in hw"); + goto release_acts; + } + return 0; + +release_acts: + efx_mae_free_action_set_list(efx, &rule->acts); +release: + /* We failed to insert the rule, so free up any entries we created in + * subsidiary tables. + */ + if (act) + efx_tc_free_action_set(efx, act, false); + if (rule) { + rhashtable_remove_fast(&efx->tc->match_action_ht, + &rule->linkage, + efx_tc_match_action_ht_params); + efx_tc_free_action_set_list(efx, &rule->acts, false); + } + kfree(rule); + if (match.encap) + efx_tc_flower_release_encap_match(efx, match.encap); + return rc; +} + static int efx_tc_flower_replace(struct efx_nic *efx, struct net_device *net_dev, struct flow_cls_offload *tc, @@ -331,10 +873,8 @@ static int efx_tc_flower_replace(struct efx_nic *efx, from_efv = efx_tc_flower_lookup_efv(efx, net_dev); if (IS_ERR(from_efv)) { - /* Might be a tunnel decap rule from an indirect block. 
- * Support for those not implemented yet. - */ - return -EOPNOTSUPP; + /* Not from our PF or representors, so probably a tunnel dev */ + return efx_tc_flower_replace_foreign(efx, net_dev, tc); } if (efv != from_efv) { @@ -357,6 +897,11 @@ static int efx_tc_flower_replace(struct efx_nic *efx, rc = efx_tc_flower_parse_match(efx, fr, &match, extack); if (rc) return rc; + if (efx_tc_match_is_encap(&match.mask)) { + NL_SET_ERR_MSG_MOD(extack, "Ingress enc_key matches not supported"); + rc = -EOPNOTSUPP; + goto release; + } if (tc->common.chain_index) { NL_SET_ERR_MSG_MOD(extack, "No support for nonzero chain_index"); @@ -391,8 +936,33 @@ static int efx_tc_flower_replace(struct efx_nic *efx, goto release; } + /** + * DOC: TC action translation + * + * Actions in TC are sequential and cumulative, with delivery actions + * potentially anywhere in the order. The EF100 MAE, however, takes + * an 'action set list' consisting of 'action sets', each of which is + * applied to the _original_ packet, and consists of a set of optional + * actions in a fixed order with delivery at the end. + * To translate between these two models, we maintain a 'cursor', @act, + * which describes the cumulative effect of all the packet-mutating + * actions encountered so far; on handling a delivery (mirred or drop) + * action, once the action-set has been inserted into hardware, we + * append @act to the action-set list (@rule->acts); if this is a pipe + * action (mirred mirror) we then allocate a new @act with a copy of + * the cursor state _before_ the delivery action, otherwise we set @act + * to %NULL. + * This ensures that every allocated action-set is either attached to + * @rule->acts or pointed to by @act (and never both), and that only + * those action-sets in @rule->acts exist in hardware. 
Consequently, + * in the failure path, @act only needs to be freed in memory, whereas + * for @rule->acts we remove each action-set from hardware before + * freeing it (efx_tc_free_action_set_list()), even if the action-set + * list itself is not in hardware. + */ flow_action_for_each(i, fa, &fr->action) { struct efx_tc_action_set save; + u16 tci; if (!act) { /* more actions after a non-pipe action */ @@ -494,6 +1064,31 @@ static int efx_tc_flower_replace(struct efx_nic *efx, } *act = save; break; + case FLOW_ACTION_VLAN_POP: + if (act->vlan_push) { + act->vlan_push--; + } else if (efx_tc_flower_action_order_ok(act, EFX_TC_AO_VLAN_POP)) { + act->vlan_pop++; + } else { + NL_SET_ERR_MSG_MOD(extack, + "More than two VLAN pops, or action order violated"); + rc = -EINVAL; + goto release; + } + break; + case FLOW_ACTION_VLAN_PUSH: + if (!efx_tc_flower_action_order_ok(act, EFX_TC_AO_VLAN_PUSH)) { + rc = -EINVAL; + NL_SET_ERR_MSG_MOD(extack, + "More than two VLAN pushes, or action order violated"); + goto release; + } + tci = fa->vlan.vid & VLAN_VID_MASK; + tci |= fa->vlan.prio << VLAN_PRIO_SHIFT; + act->vlan_tci[act->vlan_push] = cpu_to_be16(tci); + act->vlan_proto[act->vlan_push] = fa->vlan.proto; + act->vlan_push++; + break; default: NL_SET_ERR_MSG_FMT_MOD(extack, "Unhandled action %u", fa->id); @@ -847,6 +1442,18 @@ void efx_fini_tc(struct efx_nic *efx) efx->tc->up = false; } +/* At teardown time, all TC filter rules (and thus all resources they created) + * should already have been removed. If we find any in our hashtables, make a + * cursory attempt to clean up the software side. 
+ */ +static void efx_tc_encap_match_free(void *ptr, void *__unused) +{ + struct efx_tc_encap_match *encap = ptr; + + WARN_ON(refcount_read(&encap->ref)); + kfree(encap); +} + int efx_init_struct_tc(struct efx_nic *efx) { int rc; @@ -869,6 +1476,9 @@ int efx_init_struct_tc(struct efx_nic *efx) rc = efx_tc_init_counters(efx); if (rc < 0) goto fail_counters; + rc = rhashtable_init(&efx->tc->encap_match_ht, &efx_tc_encap_match_ht_params); + if (rc < 0) + goto fail_encap_match_ht; rc = rhashtable_init(&efx->tc->match_action_ht, &efx_tc_match_action_ht_params); if (rc < 0) goto fail_match_action_ht; @@ -881,6 +1491,8 @@ int efx_init_struct_tc(struct efx_nic *efx) efx->extra_channel_type[EFX_EXTRA_CHANNEL_TC] = &efx_tc_channel_type; return 0; fail_match_action_ht: + rhashtable_destroy(&efx->tc->encap_match_ht); +fail_encap_match_ht: efx_tc_destroy_counters(efx); fail_counters: mutex_destroy(&efx->tc->mutex); @@ -903,6 +1515,8 @@ void efx_fini_struct_tc(struct efx_nic *efx) MC_CMD_MAE_ACTION_RULE_INSERT_OUT_ACTION_RULE_ID_NULL); rhashtable_free_and_destroy(&efx->tc->match_action_ht, efx_tc_flow_free, efx); + rhashtable_free_and_destroy(&efx->tc->encap_match_ht, + efx_tc_encap_match_free, NULL); efx_tc_fini_counters(efx); mutex_unlock(&efx->tc->mutex); mutex_destroy(&efx->tc->mutex); diff --git a/drivers/net/ethernet/sfc/tc.h b/drivers/net/ethernet/sfc/tc.h index 418ce8c13a06..04cced6a2d39 100644 --- a/drivers/net/ethernet/sfc/tc.h +++ b/drivers/net/ethernet/sfc/tc.h @@ -18,8 +18,20 @@ #define IS_ALL_ONES(v) (!(typeof (v))~(v)) +#ifdef CONFIG_IPV6 +static inline bool efx_ipv6_addr_all_ones(struct in6_addr *addr) +{ + return !memchr_inv(addr, 0xff, sizeof(*addr)); +} +#endif + struct efx_tc_action_set { + u16 vlan_push:2; + u16 vlan_pop:2; + u16 decap:1; u16 deliver:1; + __be16 vlan_tci[2]; /* TCIs for vlan_push */ + __be16 vlan_proto[2]; /* Ethertypes for vlan_push */ struct efx_tc_counter_index *count; u32 dest_mport; u32 fw_id; /* index of this entry in firmware actions 
table */ @@ -44,11 +56,38 @@ struct efx_tc_match_fields { /* L4 */ __be16 l4_sport, l4_dport; /* Ports (UDP, TCP) */ __be16 tcp_flags; + /* Encap. The following are *outer* fields. Note that there are no + * outer eth (L2) fields; this is because TC doesn't have them. + */ + __be32 enc_src_ip, enc_dst_ip; + struct in6_addr enc_src_ip6, enc_dst_ip6; + u8 enc_ip_tos, enc_ip_ttl; + __be16 enc_sport, enc_dport; + __be32 enc_keyid; /* e.g. VNI, VSID */ +}; + +static inline bool efx_tc_match_is_encap(const struct efx_tc_match_fields *mask) +{ + return mask->enc_src_ip || mask->enc_dst_ip || + !ipv6_addr_any(&mask->enc_src_ip6) || + !ipv6_addr_any(&mask->enc_dst_ip6) || mask->enc_ip_tos || + mask->enc_ip_ttl || mask->enc_sport || mask->enc_dport; +} + +struct efx_tc_encap_match { + __be32 src_ip, dst_ip; + struct in6_addr src_ip6, dst_ip6; + __be16 udp_dport; + struct rhash_head linkage; + enum efx_encap_type tun_type; + refcount_t ref; + u32 fw_id; /* index of this entry in firmware encap match table */ }; struct efx_tc_match { struct efx_tc_match_fields value; struct efx_tc_match_fields mask; + struct efx_tc_encap_match *encap; }; struct efx_tc_action_set_list { @@ -78,6 +117,7 @@ enum efx_tc_rule_prios { * @mutex: Used to serialise operations on TC hashtables * @counter_ht: Hashtable of TC counters (FW IDs and counter values) * @counter_id_ht: Hashtable mapping TC counter cookies to counters + * @encap_match_ht: Hashtable of TC encap matches * @match_action_ht: Hashtable of TC match-action rules * @reps_mport_id: MAE port allocated for representor RX * @reps_filter_uc: VNIC filter for representor unicast RX (promisc) @@ -101,6 +141,7 @@ struct efx_tc_state { struct mutex mutex; struct rhashtable counter_ht; struct rhashtable counter_id_ht; + struct rhashtable encap_match_ht; struct rhashtable match_action_ht; u32 reps_mport_id, reps_mport_vport_id; s32 reps_filter_uc, reps_filter_mc; diff --git a/drivers/net/ethernet/sfc/tx_tso.c b/drivers/net/ethernet/sfc/tx_tso.c 
index 898e5c61d908..d381d8164f07 100644 --- a/drivers/net/ethernet/sfc/tx_tso.c +++ b/drivers/net/ethernet/sfc/tx_tso.c @@ -147,7 +147,7 @@ static __be16 efx_tso_check_protocol(struct sk_buff *skb) EFX_WARN_ON_ONCE_PARANOID(((struct ethhdr *)skb->data)->h_proto != protocol); if (protocol == htons(ETH_P_8021Q)) { - struct vlan_ethhdr *veh = (struct vlan_ethhdr *)skb->data; + struct vlan_ethhdr *veh = skb_vlan_eth_hdr(skb); protocol = veh->h_vlan_encapsulated_proto; } diff --git a/drivers/net/ethernet/smsc/smc91x.c b/drivers/net/ethernet/smsc/smc91x.c index 35e99bf0c401..032eccf8eb42 100644 --- a/drivers/net/ethernet/smsc/smc91x.c +++ b/drivers/net/ethernet/smsc/smc91x.c @@ -57,6 +57,7 @@ static const char version[] = #include <linux/kernel.h> #include <linux/sched.h> #include <linux/delay.h> +#include <linux/gpio/consumer.h> #include <linux/interrupt.h> #include <linux/irq.h> #include <linux/errno.h> @@ -69,7 +70,6 @@ static const char version[] = #include <linux/workqueue.h> #include <linux/of.h> #include <linux/of_device.h> -#include <linux/of_gpio.h> #include <linux/netdevice.h> #include <linux/etherdevice.h> diff --git a/drivers/net/ethernet/smsc/smsc911x.c b/drivers/net/ethernet/smsc/smsc911x.c index a690d139e177..174dc8908b72 100644 --- a/drivers/net/ethernet/smsc/smsc911x.c +++ b/drivers/net/ethernet/smsc/smsc911x.c @@ -1016,7 +1016,7 @@ static void smsc911x_phy_adjust_link(struct net_device *dev) static int smsc911x_mii_probe(struct net_device *dev) { struct smsc911x_data *pdata = netdev_priv(dev); - struct phy_device *phydev = NULL; + struct phy_device *phydev; int ret; /* find the first phy */ @@ -1744,7 +1744,6 @@ irq_stop_out: free_irq(dev->irq, dev); mii_free_out: phy_disconnect(dev->phydev); - dev->phydev = NULL; out: pm_runtime_put(dev->dev.parent); return retval; @@ -1775,7 +1774,6 @@ static int smsc911x_stop(struct net_device *dev) if (dev->phydev) { phy_stop(dev->phydev); phy_disconnect(dev->phydev); - dev->phydev = NULL; } netif_carrier_off(dev); 
pm_runtime_put(dev->dev.parent); diff --git a/drivers/net/ethernet/stmicro/stmmac/Kconfig b/drivers/net/ethernet/stmicro/stmmac/Kconfig index f77511fe4e87..5f5a997f21f3 100644 --- a/drivers/net/ethernet/stmicro/stmmac/Kconfig +++ b/drivers/net/ethernet/stmicro/stmmac/Kconfig @@ -165,6 +165,18 @@ config DWMAC_SOCFPGA for the stmmac device driver. This driver is used for arria5 and cyclone5 FPGA SoCs. +config DWMAC_STARFIVE + tristate "StarFive dwmac support" + depends on OF && (ARCH_STARFIVE || COMPILE_TEST) + select MFD_SYSCON + default m if ARCH_STARFIVE + help + Support for ethernet controllers on StarFive RISC-V SoCs + + This selects the StarFive platform specific glue layer support for + the stmmac device driver. This driver is used for StarFive JH7110 + ethernet controller. + config DWMAC_STI tristate "STi GMAC support" default ARCH_STI diff --git a/drivers/net/ethernet/stmicro/stmmac/Makefile b/drivers/net/ethernet/stmicro/stmmac/Makefile index 057e4bab5c08..8738fdbb4b2d 100644 --- a/drivers/net/ethernet/stmicro/stmmac/Makefile +++ b/drivers/net/ethernet/stmicro/stmmac/Makefile @@ -23,6 +23,7 @@ obj-$(CONFIG_DWMAC_OXNAS) += dwmac-oxnas.o obj-$(CONFIG_DWMAC_QCOM_ETHQOS) += dwmac-qcom-ethqos.o obj-$(CONFIG_DWMAC_ROCKCHIP) += dwmac-rk.o obj-$(CONFIG_DWMAC_SOCFPGA) += dwmac-altr-socfpga.o +obj-$(CONFIG_DWMAC_STARFIVE) += dwmac-starfive.o obj-$(CONFIG_DWMAC_STI) += dwmac-sti.o obj-$(CONFIG_DWMAC_STM32) += dwmac-stm32.o obj-$(CONFIG_DWMAC_SUNXI) += dwmac-sunxi.o diff --git a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c index 2e8744ac6b91..fb55efd52240 100644 --- a/drivers/net/ethernet/stmicro/stmmac/chain_mode.c +++ b/drivers/net/ethernet/stmicro/stmmac/chain_mode.c @@ -14,9 +14,9 @@ #include "stmmac.h" -static int jumbo_frm(void *p, struct sk_buff *skb, int csum) +static int jumbo_frm(struct stmmac_tx_queue *tx_q, struct sk_buff *skb, + int csum) { - struct stmmac_tx_queue *tx_q = (struct stmmac_tx_queue *)p; 
unsigned int nopaged_len = skb_headlen(skb); struct stmmac_priv *priv = tx_q->priv_data; unsigned int entry = tx_q->cur_tx; @@ -125,9 +125,8 @@ static void init_dma_chain(void *des, dma_addr_t phy_addr, } } -static void refill_desc3(void *priv_ptr, struct dma_desc *p) +static void refill_desc3(struct stmmac_rx_queue *rx_q, struct dma_desc *p) { - struct stmmac_rx_queue *rx_q = (struct stmmac_rx_queue *)priv_ptr; struct stmmac_priv *priv = rx_q->priv_data; if (priv->hwts_rx_en && !priv->extend_desc) @@ -141,9 +140,8 @@ static void refill_desc3(void *priv_ptr, struct dma_desc *p) sizeof(struct dma_desc))); } -static void clean_desc3(void *priv_ptr, struct dma_desc *p) +static void clean_desc3(struct stmmac_tx_queue *tx_q, struct dma_desc *p) { - struct stmmac_tx_queue *tx_q = (struct stmmac_tx_queue *)priv_ptr; struct stmmac_priv *priv = tx_q->priv_data; unsigned int entry = tx_q->dirty_tx; diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h index 54bb072aeb2d..4ad692c4116c 100644 --- a/drivers/net/ethernet/stmicro/stmmac/common.h +++ b/drivers/net/ethernet/stmicro/stmmac/common.h @@ -242,7 +242,7 @@ struct stmmac_safety_stats { #define SF_DMA_MODE 1 /* DMA STORE-AND-FORWARD Operation Mode */ -/* DAM HW feature register fields */ +/* DMA HW feature register fields */ #define DMA_HW_FEAT_MIISEL 0x00000001 /* 10/100 Mbps Support */ #define DMA_HW_FEAT_GMIISEL 0x00000002 /* 1000 Mbps Support */ #define DMA_HW_FEAT_HDSEL 0x00000004 /* Half-Duplex Support */ diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-anarion.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-anarion.c index dfbaea06d108..9354bf419112 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-anarion.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-anarion.c @@ -20,18 +20,18 @@ #define GMAC_CONFIG_INTF_RGMII (0x1 << 0) struct anarion_gmac { - uintptr_t ctl_block; + void __iomem *ctl_block; uint32_t phy_intf_sel; }; static uint32_t 
gmac_read_reg(struct anarion_gmac *gmac, uint8_t reg) { - return readl((void *)(gmac->ctl_block + reg)); + return readl(gmac->ctl_block + reg); }; static void gmac_write_reg(struct anarion_gmac *gmac, uint8_t reg, uint32_t val) { - writel(val, (void *)(gmac->ctl_block + reg)); + writel(val, gmac->ctl_block + reg); } static int anarion_gmac_init(struct platform_device *pdev, void *priv) @@ -68,16 +68,16 @@ static struct anarion_gmac *anarion_config_dt(struct platform_device *pdev) ctl_block = devm_platform_ioremap_resource(pdev, 1); if (IS_ERR(ctl_block)) { - dev_err(&pdev->dev, "Cannot get reset region (%ld)!\n", - PTR_ERR(ctl_block)); - return ctl_block; + err = PTR_ERR(ctl_block); + dev_err(&pdev->dev, "Cannot get reset region (%d)!\n", err); + return ERR_PTR(err); } gmac = devm_kzalloc(&pdev->dev, sizeof(*gmac), GFP_KERNEL); if (!gmac) return ERR_PTR(-ENOMEM); - gmac->ctl_block = (uintptr_t)ctl_block; + gmac->ctl_block = ctl_block; err = of_get_phy_mode(pdev->dev.of_node, &phy_mode); if (err) diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-generic.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-generic.c index 5e731a72cce8..ef8f3a940938 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-generic.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-generic.c @@ -91,7 +91,7 @@ static struct platform_driver dwmac_generic_driver = { .driver = { .name = STMMAC_RESOURCE_NAME, .pm = &stmmac_pltfr_pm_ops, - .of_match_table = of_match_ptr(dwmac_generic_match), + .of_match_table = dwmac_generic_match, }, }; module_platform_driver(dwmac_generic_driver); diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-imx.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-imx.c index 2a2be65d65a0..7c228bd0d099 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-imx.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-imx.c @@ -37,10 +37,15 @@ #define MX93_GPR_ENET_QOS_INTF_SEL_RGMII (0x1 << 1) #define MX93_GPR_ENET_QOS_CLK_GEN_EN (0x1 << 0) +#define DMA_BUS_MODE 0x00001000 +#define 
DMA_BUS_MODE_SFT_RESET (0x1 << 0) +#define RMII_RESET_SPEED (0x3 << 14) + struct imx_dwmac_ops { u32 addr_width; bool mac_rgmii_txclk_auto_adj; + int (*fix_soc_reset)(void *priv, void __iomem *ioaddr); int (*set_intf_mode)(struct plat_stmmacenet_data *plat_dat); }; @@ -207,6 +212,25 @@ static void imx_dwmac_fix_speed(void *priv, unsigned int speed) dev_err(dwmac->dev, "failed to set tx rate %lu\n", rate); } +static int imx_dwmac_mx93_reset(void *priv, void __iomem *ioaddr) +{ + struct plat_stmmacenet_data *plat_dat = priv; + u32 value = readl(ioaddr + DMA_BUS_MODE); + + /* DMA SW reset */ + value |= DMA_BUS_MODE_SFT_RESET; + writel(value, ioaddr + DMA_BUS_MODE); + + if (plat_dat->interface == PHY_INTERFACE_MODE_RMII) { + usleep_range(100, 200); + writel(RMII_RESET_SPEED, ioaddr + MAC_CTRL_REG); + } + + return readl_poll_timeout(ioaddr + DMA_BUS_MODE, value, + !(value & DMA_BUS_MODE_SFT_RESET), + 10000, 1000000); +} + static int imx_dwmac_parse_dt(struct imx_priv_data *dwmac, struct device *dev) { @@ -304,6 +328,8 @@ static int imx_dwmac_probe(struct platform_device *pdev) if (ret) goto err_dwmac_init; + dwmac->plat_dat->fix_soc_reset = dwmac->ops->fix_soc_reset; + ret = stmmac_dvr_probe(&pdev->dev, plat_dat, &stmmac_res); if (ret) goto err_drv_probe; @@ -337,6 +363,7 @@ static struct imx_dwmac_ops imx93_dwmac_data = { .addr_width = 32, .mac_rgmii_txclk_auto_adj = true, .set_intf_mode = imx93_set_intf_mode, + .fix_soc_reset = imx_dwmac_mx93_reset, }; static const struct of_device_id imx_dwmac_match[] = { diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c index e8b507f88fbc..f6754e3643f3 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-meson8b.c @@ -263,6 +263,11 @@ static int meson_axg_set_phy_mode(struct meson8b_dwmac *dwmac) return 0; } +static void meson8b_clk_disable_unprepare(void *data) +{ + clk_disable_unprepare(data); +} + 
static int meson8b_devm_clk_prepare_enable(struct meson8b_dwmac *dwmac, struct clk *clk) { @@ -273,8 +278,7 @@ static int meson8b_devm_clk_prepare_enable(struct meson8b_dwmac *dwmac, return ret; return devm_add_action_or_reset(dwmac->dev, - (void(*)(void *))clk_disable_unprepare, - clk); + meson8b_clk_disable_unprepare, clk); } static int meson8b_init_rgmii_delays(struct meson8b_dwmac *dwmac) diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c index 732774645c1a..16a8c361283b 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-qcom-ethqos.c @@ -11,6 +11,7 @@ #define RGMII_IO_MACRO_CONFIG 0x0 #define SDCC_HC_REG_DLL_CONFIG 0x4 +#define SDCC_TEST_CTL 0x8 #define SDCC_HC_REG_DDR_CONFIG 0xC #define SDCC_HC_REG_DLL_CONFIG2 0x10 #define SDC4_STATUS 0x14 @@ -49,6 +50,7 @@ #define SDCC_DDR_CONFIG_EXT_PRG_RCLK_DLY GENMASK(26, 21) #define SDCC_DDR_CONFIG_EXT_PRG_RCLK_DLY_CODE GENMASK(29, 27) #define SDCC_DDR_CONFIG_EXT_PRG_RCLK_DLY_EN BIT(30) +#define SDCC_DDR_CONFIG_TCXO_CYCLES_CNT GENMASK(11, 9) #define SDCC_DDR_CONFIG_PRG_RCLK_DLY GENMASK(8, 0) /* SDCC_HC_REG_DLL_CONFIG2 fields */ @@ -78,7 +80,9 @@ struct ethqos_emac_por { struct ethqos_emac_driver_data { const struct ethqos_emac_por *por; unsigned int num_por; - bool rgmii_config_looback_en; + bool rgmii_config_loopback_en; + bool has_emac3; + struct dwmac4_addrs dwmac4_addrs; }; struct qcom_ethqos { @@ -91,7 +95,8 @@ struct qcom_ethqos { const struct ethqos_emac_por *por; unsigned int num_por; - bool rgmii_config_looback_en; + bool rgmii_config_loopback_en; + bool has_emac3; }; static int rgmii_readl(struct qcom_ethqos *ethqos, unsigned int offset) @@ -183,7 +188,8 @@ static const struct ethqos_emac_por emac_v2_3_0_por[] = { static const struct ethqos_emac_driver_data emac_v2_3_0_data = { .por = emac_v2_3_0_por, .num_por = ARRAY_SIZE(emac_v2_3_0_por), - .rgmii_config_looback_en = 
true, + .rgmii_config_loopback_en = true, + .has_emac3 = false, }; static const struct ethqos_emac_por emac_v2_1_0_por[] = { @@ -198,7 +204,40 @@ static const struct ethqos_emac_por emac_v2_1_0_por[] = { static const struct ethqos_emac_driver_data emac_v2_1_0_data = { .por = emac_v2_1_0_por, .num_por = ARRAY_SIZE(emac_v2_1_0_por), - .rgmii_config_looback_en = false, + .rgmii_config_loopback_en = false, + .has_emac3 = false, +}; + +static const struct ethqos_emac_por emac_v3_0_0_por[] = { + { .offset = RGMII_IO_MACRO_CONFIG, .value = 0x40c01343 }, + { .offset = SDCC_HC_REG_DLL_CONFIG, .value = 0x2004642c }, + { .offset = SDCC_HC_REG_DDR_CONFIG, .value = 0x80040800 }, + { .offset = SDCC_HC_REG_DLL_CONFIG2, .value = 0x00200000 }, + { .offset = SDCC_USR_CTL, .value = 0x00010800 }, + { .offset = RGMII_IO_MACRO_CONFIG2, .value = 0x00002060 }, +}; + +static const struct ethqos_emac_driver_data emac_v3_0_0_data = { + .por = emac_v3_0_0_por, + .num_por = ARRAY_SIZE(emac_v3_0_0_por), + .rgmii_config_loopback_en = false, + .has_emac3 = true, + .dwmac4_addrs = { + .dma_chan = 0x00008100, + .dma_chan_offset = 0x1000, + .mtl_chan = 0x00008000, + .mtl_chan_offset = 0x1000, + .mtl_ets_ctrl = 0x00008010, + .mtl_ets_ctrl_offset = 0x1000, + .mtl_txq_weight = 0x00008018, + .mtl_txq_weight_offset = 0x1000, + .mtl_send_slp_cred = 0x0000801c, + .mtl_send_slp_cred_offset = 0x1000, + .mtl_high_cred = 0x00008020, + .mtl_high_cred_offset = 0x1000, + .mtl_low_cred = 0x00008024, + .mtl_low_cred_offset = 0x1000, + }, }; static int ethqos_dll_configure(struct qcom_ethqos *ethqos) @@ -222,11 +261,13 @@ static int ethqos_dll_configure(struct qcom_ethqos *ethqos) rgmii_updatel(ethqos, SDCC_DLL_CONFIG_DLL_EN, SDCC_DLL_CONFIG_DLL_EN, SDCC_HC_REG_DLL_CONFIG); - rgmii_updatel(ethqos, SDCC_DLL_MCLK_GATING_EN, - 0, SDCC_HC_REG_DLL_CONFIG); + if (!ethqos->has_emac3) { + rgmii_updatel(ethqos, SDCC_DLL_MCLK_GATING_EN, + 0, SDCC_HC_REG_DLL_CONFIG); - rgmii_updatel(ethqos, SDCC_DLL_CDR_FINE_PHASE, - 0, 
SDCC_HC_REG_DLL_CONFIG); + rgmii_updatel(ethqos, SDCC_DLL_CDR_FINE_PHASE, + 0, SDCC_HC_REG_DLL_CONFIG); + } /* Wait for CK_OUT_EN clear */ do { @@ -261,28 +302,48 @@ static int ethqos_dll_configure(struct qcom_ethqos *ethqos) rgmii_updatel(ethqos, SDCC_DLL_CONFIG2_DDR_CAL_EN, SDCC_DLL_CONFIG2_DDR_CAL_EN, SDCC_HC_REG_DLL_CONFIG2); - rgmii_updatel(ethqos, SDCC_DLL_CONFIG2_DLL_CLOCK_DIS, - 0, SDCC_HC_REG_DLL_CONFIG2); + if (!ethqos->has_emac3) { + rgmii_updatel(ethqos, SDCC_DLL_CONFIG2_DLL_CLOCK_DIS, + 0, SDCC_HC_REG_DLL_CONFIG2); - rgmii_updatel(ethqos, SDCC_DLL_CONFIG2_MCLK_FREQ_CALC, - 0x1A << 10, SDCC_HC_REG_DLL_CONFIG2); + rgmii_updatel(ethqos, SDCC_DLL_CONFIG2_MCLK_FREQ_CALC, + 0x1A << 10, SDCC_HC_REG_DLL_CONFIG2); - rgmii_updatel(ethqos, SDCC_DLL_CONFIG2_DDR_TRAFFIC_INIT_SEL, - BIT(2), SDCC_HC_REG_DLL_CONFIG2); + rgmii_updatel(ethqos, SDCC_DLL_CONFIG2_DDR_TRAFFIC_INIT_SEL, + BIT(2), SDCC_HC_REG_DLL_CONFIG2); - rgmii_updatel(ethqos, SDCC_DLL_CONFIG2_DDR_TRAFFIC_INIT_SW, - SDCC_DLL_CONFIG2_DDR_TRAFFIC_INIT_SW, - SDCC_HC_REG_DLL_CONFIG2); + rgmii_updatel(ethqos, SDCC_DLL_CONFIG2_DDR_TRAFFIC_INIT_SW, + SDCC_DLL_CONFIG2_DDR_TRAFFIC_INIT_SW, + SDCC_HC_REG_DLL_CONFIG2); + } return 0; } static int ethqos_rgmii_macro_init(struct qcom_ethqos *ethqos) { + int phase_shift; + int phy_mode; + int loopback; + + /* Determine if the PHY adds a 2 ns TX delay or the MAC handles it */ + phy_mode = device_get_phy_mode(&ethqos->pdev->dev); + if (phy_mode == PHY_INTERFACE_MODE_RGMII_ID || + phy_mode == PHY_INTERFACE_MODE_RGMII_TXID) + phase_shift = 0; + else + phase_shift = RGMII_CONFIG2_TX_CLK_PHASE_SHIFT_EN; + /* Disable loopback mode */ rgmii_updatel(ethqos, RGMII_CONFIG2_TX_TO_RX_LOOPBACK_EN, 0, RGMII_IO_MACRO_CONFIG2); + /* Determine if this platform wants loopback enabled after programming */ + if (ethqos->rgmii_config_loopback_en) + loopback = RGMII_CONFIG_LOOPBACK_EN; + else + loopback = 0; + /* Select RGMII, write 0 to interface select */ rgmii_updatel(ethqos, 
RGMII_CONFIG_INTF_SEL, 0, RGMII_IO_MACRO_CONFIG); @@ -300,27 +361,32 @@ static int ethqos_rgmii_macro_init(struct qcom_ethqos *ethqos) RGMII_CONFIG_PROG_SWAP, RGMII_IO_MACRO_CONFIG); rgmii_updatel(ethqos, RGMII_CONFIG2_DATA_DIVIDE_CLK_SEL, 0, RGMII_IO_MACRO_CONFIG2); + rgmii_updatel(ethqos, RGMII_CONFIG2_TX_CLK_PHASE_SHIFT_EN, - RGMII_CONFIG2_TX_CLK_PHASE_SHIFT_EN, - RGMII_IO_MACRO_CONFIG2); + phase_shift, RGMII_IO_MACRO_CONFIG2); rgmii_updatel(ethqos, RGMII_CONFIG2_RSVD_CONFIG15, 0, RGMII_IO_MACRO_CONFIG2); rgmii_updatel(ethqos, RGMII_CONFIG2_RX_PROG_SWAP, RGMII_CONFIG2_RX_PROG_SWAP, RGMII_IO_MACRO_CONFIG2); - /* Set PRG_RCLK_DLY to 57 for 1.8 ns delay */ - rgmii_updatel(ethqos, SDCC_DDR_CONFIG_PRG_RCLK_DLY, - 57, SDCC_HC_REG_DDR_CONFIG); + /* PRG_RCLK_DLY = TCXO period * TCXO_CYCLES_CNT / 2 * RX delay ns, + * in practice this becomes PRG_RCLK_DLY = 52 * 4 / 2 * RX delay ns + */ + if (ethqos->has_emac3) { + /* 0.9 ns */ + rgmii_updatel(ethqos, SDCC_DDR_CONFIG_PRG_RCLK_DLY, + 115, SDCC_HC_REG_DDR_CONFIG); + } else { + /* 1.8 ns */ + rgmii_updatel(ethqos, SDCC_DDR_CONFIG_PRG_RCLK_DLY, + 57, SDCC_HC_REG_DDR_CONFIG); + } rgmii_updatel(ethqos, SDCC_DDR_CONFIG_PRG_DLY_EN, SDCC_DDR_CONFIG_PRG_DLY_EN, SDCC_HC_REG_DDR_CONFIG); - if (ethqos->rgmii_config_looback_en) - rgmii_updatel(ethqos, RGMII_CONFIG_LOOPBACK_EN, - RGMII_CONFIG_LOOPBACK_EN, RGMII_IO_MACRO_CONFIG); - else - rgmii_updatel(ethqos, RGMII_CONFIG_LOOPBACK_EN, - 0, RGMII_IO_MACRO_CONFIG); + rgmii_updatel(ethqos, RGMII_CONFIG_LOOPBACK_EN, + loopback, RGMII_IO_MACRO_CONFIG); break; case SPEED_100: @@ -336,14 +402,20 @@ static int ethqos_rgmii_macro_init(struct qcom_ethqos *ethqos) rgmii_updatel(ethqos, RGMII_CONFIG2_DATA_DIVIDE_CLK_SEL, 0, RGMII_IO_MACRO_CONFIG2); rgmii_updatel(ethqos, RGMII_CONFIG2_TX_CLK_PHASE_SHIFT_EN, - RGMII_CONFIG2_TX_CLK_PHASE_SHIFT_EN, - RGMII_IO_MACRO_CONFIG2); + phase_shift, RGMII_IO_MACRO_CONFIG2); rgmii_updatel(ethqos, RGMII_CONFIG_MAX_SPD_PRG_2, BIT(6), RGMII_IO_MACRO_CONFIG); 
rgmii_updatel(ethqos, RGMII_CONFIG2_RSVD_CONFIG15, 0, RGMII_IO_MACRO_CONFIG2); - rgmii_updatel(ethqos, RGMII_CONFIG2_RX_PROG_SWAP, - 0, RGMII_IO_MACRO_CONFIG2); + + if (ethqos->has_emac3) + rgmii_updatel(ethqos, RGMII_CONFIG2_RX_PROG_SWAP, + RGMII_CONFIG2_RX_PROG_SWAP, + RGMII_IO_MACRO_CONFIG2); + else + rgmii_updatel(ethqos, RGMII_CONFIG2_RX_PROG_SWAP, + 0, RGMII_IO_MACRO_CONFIG2); + /* Write 0x5 to PRG_RCLK_DLY_CODE */ rgmii_updatel(ethqos, SDCC_DDR_CONFIG_EXT_PRG_RCLK_DLY_CODE, (BIT(29) | BIT(27)), SDCC_HC_REG_DDR_CONFIG); @@ -353,13 +425,8 @@ static int ethqos_rgmii_macro_init(struct qcom_ethqos *ethqos) rgmii_updatel(ethqos, SDCC_DDR_CONFIG_EXT_PRG_RCLK_DLY_EN, SDCC_DDR_CONFIG_EXT_PRG_RCLK_DLY_EN, SDCC_HC_REG_DDR_CONFIG); - if (ethqos->rgmii_config_looback_en) - rgmii_updatel(ethqos, RGMII_CONFIG_LOOPBACK_EN, - RGMII_CONFIG_LOOPBACK_EN, RGMII_IO_MACRO_CONFIG); - else - rgmii_updatel(ethqos, RGMII_CONFIG_LOOPBACK_EN, - 0, RGMII_IO_MACRO_CONFIG); - + rgmii_updatel(ethqos, RGMII_CONFIG_LOOPBACK_EN, + loopback, RGMII_IO_MACRO_CONFIG); break; case SPEED_10: @@ -375,14 +442,19 @@ static int ethqos_rgmii_macro_init(struct qcom_ethqos *ethqos) rgmii_updatel(ethqos, RGMII_CONFIG2_DATA_DIVIDE_CLK_SEL, 0, RGMII_IO_MACRO_CONFIG2); rgmii_updatel(ethqos, RGMII_CONFIG2_TX_CLK_PHASE_SHIFT_EN, - 0, RGMII_IO_MACRO_CONFIG2); + phase_shift, RGMII_IO_MACRO_CONFIG2); rgmii_updatel(ethqos, RGMII_CONFIG_MAX_SPD_PRG_9, BIT(12) | GENMASK(9, 8), RGMII_IO_MACRO_CONFIG); rgmii_updatel(ethqos, RGMII_CONFIG2_RSVD_CONFIG15, 0, RGMII_IO_MACRO_CONFIG2); - rgmii_updatel(ethqos, RGMII_CONFIG2_RX_PROG_SWAP, - 0, RGMII_IO_MACRO_CONFIG2); + if (ethqos->has_emac3) + rgmii_updatel(ethqos, RGMII_CONFIG2_RX_PROG_SWAP, + RGMII_CONFIG2_RX_PROG_SWAP, + RGMII_IO_MACRO_CONFIG2); + else + rgmii_updatel(ethqos, RGMII_CONFIG2_RX_PROG_SWAP, + 0, RGMII_IO_MACRO_CONFIG2); /* Write 0x5 to PRG_RCLK_DLY_CODE */ rgmii_updatel(ethqos, SDCC_DDR_CONFIG_EXT_PRG_RCLK_DLY_CODE, (BIT(29) | BIT(27)), SDCC_HC_REG_DDR_CONFIG); 
@@ -393,7 +465,7 @@ static int ethqos_rgmii_macro_init(struct qcom_ethqos *ethqos) SDCC_DDR_CONFIG_EXT_PRG_RCLK_DLY_EN, SDCC_HC_REG_DDR_CONFIG); rgmii_updatel(ethqos, RGMII_CONFIG_LOOPBACK_EN, - RGMII_CONFIG_LOOPBACK_EN, RGMII_IO_MACRO_CONFIG); + loopback, RGMII_IO_MACRO_CONFIG); break; default: dev_err(&ethqos->pdev->dev, @@ -425,6 +497,17 @@ static int ethqos_configure(struct qcom_ethqos *ethqos) rgmii_updatel(ethqos, SDCC_DLL_CONFIG_PDN, SDCC_DLL_CONFIG_PDN, SDCC_HC_REG_DLL_CONFIG); + if (ethqos->has_emac3) { + if (ethqos->speed == SPEED_1000) { + rgmii_writel(ethqos, 0x1800000, SDCC_TEST_CTL); + rgmii_writel(ethqos, 0x2C010800, SDCC_USR_CTL); + rgmii_writel(ethqos, 0xA001, SDCC_HC_REG_DLL_CONFIG2); + } else { + rgmii_writel(ethqos, 0x40010800, SDCC_USR_CTL); + rgmii_writel(ethqos, 0xA001, SDCC_HC_REG_DLL_CONFIG2); + } + } + /* Clear DLL_RST */ rgmii_updatel(ethqos, SDCC_DLL_CONFIG_DLL_RST, 0, SDCC_HC_REG_DLL_CONFIG); @@ -444,7 +527,9 @@ static int ethqos_configure(struct qcom_ethqos *ethqos) SDCC_HC_REG_DLL_CONFIG); /* Set USR_CTL bit 26 with mask of 3 bits */ - rgmii_updatel(ethqos, GENMASK(26, 24), BIT(26), SDCC_USR_CTL); + if (!ethqos->has_emac3) + rgmii_updatel(ethqos, GENMASK(26, 24), BIT(26), + SDCC_USR_CTL); /* wait for DLL LOCK */ do { @@ -538,7 +623,8 @@ static int qcom_ethqos_probe(struct platform_device *pdev) data = of_device_get_match_data(&pdev->dev); ethqos->por = data->por; ethqos->num_por = data->num_por; - ethqos->rgmii_config_looback_en = data->rgmii_config_looback_en; + ethqos->rgmii_config_loopback_en = data->rgmii_config_loopback_en; + ethqos->has_emac3 = data->has_emac3; ethqos->rgmii_clk = devm_clk_get(&pdev->dev, "rgmii"); if (IS_ERR(ethqos->rgmii_clk)) { @@ -558,6 +644,7 @@ static int qcom_ethqos_probe(struct platform_device *pdev) plat_dat->fix_mac_speed = ethqos_fix_mac_speed; plat_dat->dump_debug_regs = rgmii_dump; plat_dat->has_gmac4 = 1; + plat_dat->dwmac4_addrs = &data->dwmac4_addrs; plat_dat->pmt = 1; plat_dat->tso_en = 
of_property_read_bool(np, "snps,tso"); if (of_device_is_compatible(np, "qcom,qcs404-ethqos")) @@ -595,6 +682,7 @@ static int qcom_ethqos_remove(struct platform_device *pdev) static const struct of_device_id qcom_ethqos_match[] = { { .compatible = "qcom,qcs404-ethqos", .data = &emac_v2_3_0_data}, + { .compatible = "qcom,sc8280xp-ethqos", .data = &emac_v3_0_0_data}, { .compatible = "qcom,sm8150-ethqos", .data = &emac_v2_1_0_data}, { } }; @@ -606,7 +694,7 @@ static struct platform_driver qcom_ethqos_driver = { .driver = { .name = "qcom-ethqos", .pm = &stmmac_pltfr_pm_ops, - .of_match_table = of_match_ptr(qcom_ethqos_match), + .of_match_table = qcom_ethqos_match, }, }; module_platform_driver(qcom_ethqos_driver); diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c index 4b8fd11563e4..4ea31ccf24d0 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c @@ -39,6 +39,24 @@ struct rk_gmac_ops { u32 regs[]; }; +static const char * const rk_clocks[] = { + "aclk_mac", "pclk_mac", "mac_clk_tx", "clk_mac_speed", +}; + +static const char * const rk_rmii_clocks[] = { + "mac_clk_rx", "clk_mac_ref", "clk_mac_refout", +}; + +enum rk_clocks_index { + RK_ACLK_MAC = 0, + RK_PCLK_MAC, + RK_MAC_CLK_TX, + RK_CLK_MAC_SPEED, + RK_MAC_CLK_RX, + RK_CLK_MAC_REF, + RK_CLK_MAC_REFOUT, +}; + struct rk_priv_data { struct platform_device *pdev; phy_interface_t phy_iface; @@ -51,15 +69,9 @@ struct rk_priv_data { bool clock_input; bool integrated_phy; + struct clk_bulk_data *clks; + int num_clks; struct clk *clk_mac; - struct clk *gmac_clkin; - struct clk *mac_clk_rx; - struct clk *mac_clk_tx; - struct clk *clk_mac_ref; - struct clk *clk_mac_refout; - struct clk *clk_mac_speed; - struct clk *aclk_mac; - struct clk *pclk_mac; struct clk *clk_phy; struct reset_control *phy_reset; @@ -104,10 +116,11 @@ static void px30_set_to_rmii(struct rk_priv_data *bsp_priv) static void 
px30_set_rmii_speed(struct rk_priv_data *bsp_priv, int speed) { + struct clk *clk_mac_speed = bsp_priv->clks[RK_CLK_MAC_SPEED].clk; struct device *dev = &bsp_priv->pdev->dev; int ret; - if (IS_ERR(bsp_priv->clk_mac_speed)) { + if (!clk_mac_speed) { dev_err(dev, "%s: Missing clk_mac_speed clock\n", __func__); return; } @@ -116,7 +129,7 @@ static void px30_set_rmii_speed(struct rk_priv_data *bsp_priv, int speed) regmap_write(bsp_priv->grf, PX30_GRF_GMAC_CON1, PX30_GMAC_SPEED_10M); - ret = clk_set_rate(bsp_priv->clk_mac_speed, 2500000); + ret = clk_set_rate(clk_mac_speed, 2500000); if (ret) dev_err(dev, "%s: set clk_mac_speed rate 2500000 failed: %d\n", __func__, ret); @@ -124,7 +137,7 @@ static void px30_set_rmii_speed(struct rk_priv_data *bsp_priv, int speed) regmap_write(bsp_priv->grf, PX30_GRF_GMAC_CON1, PX30_GMAC_SPEED_100M); - ret = clk_set_rate(bsp_priv->clk_mac_speed, 25000000); + ret = clk_set_rate(clk_mac_speed, 25000000); if (ret) dev_err(dev, "%s: set clk_mac_speed rate 25000000 failed: %d\n", __func__, ret); @@ -1066,6 +1079,7 @@ static void rk3568_set_to_rmii(struct rk_priv_data *bsp_priv) static void rk3568_set_gmac_speed(struct rk_priv_data *bsp_priv, int speed) { + struct clk *clk_mac_speed = bsp_priv->clks[RK_CLK_MAC_SPEED].clk; struct device *dev = &bsp_priv->pdev->dev; unsigned long rate; int ret; @@ -1085,7 +1099,7 @@ static void rk3568_set_gmac_speed(struct rk_priv_data *bsp_priv, int speed) return; } - ret = clk_set_rate(bsp_priv->clk_mac_speed, rate); + ret = clk_set_rate(clk_mac_speed, rate); if (ret) dev_err(dev, "%s: set clk_mac_speed rate %ld failed %d\n", __func__, rate, ret); @@ -1371,6 +1385,7 @@ static void rv1126_set_to_rmii(struct rk_priv_data *bsp_priv) static void rv1126_set_rgmii_speed(struct rk_priv_data *bsp_priv, int speed) { + struct clk *clk_mac_speed = bsp_priv->clks[RK_CLK_MAC_SPEED].clk; struct device *dev = &bsp_priv->pdev->dev; unsigned long rate; int ret; @@ -1390,7 +1405,7 @@ static void rv1126_set_rgmii_speed(struct 
rk_priv_data *bsp_priv, int speed) return; } - ret = clk_set_rate(bsp_priv->clk_mac_speed, rate); + ret = clk_set_rate(clk_mac_speed, rate); if (ret) dev_err(dev, "%s: set clk_mac_speed rate %ld failed %d\n", __func__, rate, ret); @@ -1398,6 +1413,7 @@ static void rv1126_set_rgmii_speed(struct rk_priv_data *bsp_priv, int speed) static void rv1126_set_rmii_speed(struct rk_priv_data *bsp_priv, int speed) { + struct clk *clk_mac_speed = bsp_priv->clks[RK_CLK_MAC_SPEED].clk; struct device *dev = &bsp_priv->pdev->dev; unsigned long rate; int ret; @@ -1414,7 +1430,7 @@ static void rv1126_set_rmii_speed(struct rk_priv_data *bsp_priv, int speed) return; } - ret = clk_set_rate(bsp_priv->clk_mac_speed, rate); + ret = clk_set_rate(clk_mac_speed, rate); if (ret) dev_err(dev, "%s: set clk_mac_speed rate %ld failed %d\n", __func__, rate, ret); @@ -1475,68 +1491,50 @@ static int rk_gmac_clk_init(struct plat_stmmacenet_data *plat) { struct rk_priv_data *bsp_priv = plat->bsp_priv; struct device *dev = &bsp_priv->pdev->dev; - int ret; + int phy_iface = bsp_priv->phy_iface; + int i, j, ret; bsp_priv->clk_enabled = false; - bsp_priv->mac_clk_rx = devm_clk_get(dev, "mac_clk_rx"); - if (IS_ERR(bsp_priv->mac_clk_rx)) - dev_err(dev, "cannot get clock %s\n", - "mac_clk_rx"); + bsp_priv->num_clks = ARRAY_SIZE(rk_clocks); + if (phy_iface == PHY_INTERFACE_MODE_RMII) + bsp_priv->num_clks += ARRAY_SIZE(rk_rmii_clocks); - bsp_priv->mac_clk_tx = devm_clk_get(dev, "mac_clk_tx"); - if (IS_ERR(bsp_priv->mac_clk_tx)) - dev_err(dev, "cannot get clock %s\n", - "mac_clk_tx"); + bsp_priv->clks = devm_kcalloc(dev, bsp_priv->num_clks, + sizeof(*bsp_priv->clks), GFP_KERNEL); + if (!bsp_priv->clks) + return -ENOMEM; - bsp_priv->aclk_mac = devm_clk_get(dev, "aclk_mac"); - if (IS_ERR(bsp_priv->aclk_mac)) - dev_err(dev, "cannot get clock %s\n", - "aclk_mac"); + for (i = 0; i < ARRAY_SIZE(rk_clocks); i++) + bsp_priv->clks[i].id = rk_clocks[i]; - bsp_priv->pclk_mac = devm_clk_get(dev, "pclk_mac"); - if 
(IS_ERR(bsp_priv->pclk_mac)) - dev_err(dev, "cannot get clock %s\n", - "pclk_mac"); - - bsp_priv->clk_mac = devm_clk_get(dev, "stmmaceth"); - if (IS_ERR(bsp_priv->clk_mac)) - dev_err(dev, "cannot get clock %s\n", - "stmmaceth"); - - if (bsp_priv->phy_iface == PHY_INTERFACE_MODE_RMII) { - bsp_priv->clk_mac_ref = devm_clk_get(dev, "clk_mac_ref"); - if (IS_ERR(bsp_priv->clk_mac_ref)) - dev_err(dev, "cannot get clock %s\n", - "clk_mac_ref"); - - if (!bsp_priv->clock_input) { - bsp_priv->clk_mac_refout = - devm_clk_get(dev, "clk_mac_refout"); - if (IS_ERR(bsp_priv->clk_mac_refout)) - dev_err(dev, "cannot get clock %s\n", - "clk_mac_refout"); - } + if (phy_iface == PHY_INTERFACE_MODE_RMII) { + for (j = 0; j < ARRAY_SIZE(rk_rmii_clocks); j++) + bsp_priv->clks[i++].id = rk_rmii_clocks[j]; } - bsp_priv->clk_mac_speed = devm_clk_get(dev, "clk_mac_speed"); - if (IS_ERR(bsp_priv->clk_mac_speed)) - dev_err(dev, "cannot get clock %s\n", "clk_mac_speed"); + ret = devm_clk_bulk_get_optional(dev, bsp_priv->num_clks, + bsp_priv->clks); + if (ret) + return dev_err_probe(dev, ret, "Failed to get clocks\n"); + + /* "stmmaceth" will be enabled by the core */ + bsp_priv->clk_mac = devm_clk_get(dev, "stmmaceth"); + ret = PTR_ERR_OR_ZERO(bsp_priv->clk_mac); + if (ret) + return dev_err_probe(dev, ret, "Cannot get stmmaceth clock\n"); if (bsp_priv->clock_input) { dev_info(dev, "clock input from PHY\n"); - } else { - if (bsp_priv->phy_iface == PHY_INTERFACE_MODE_RMII) - clk_set_rate(bsp_priv->clk_mac, 50000000); + } else if (phy_iface == PHY_INTERFACE_MODE_RMII) { + clk_set_rate(bsp_priv->clk_mac, 50000000); } if (plat->phy_node && bsp_priv->integrated_phy) { bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0); - if (IS_ERR(bsp_priv->clk_phy)) { - ret = PTR_ERR(bsp_priv->clk_phy); - dev_err(dev, "Cannot get PHY clock: %d\n", ret); - return -EINVAL; - } + ret = PTR_ERR_OR_ZERO(bsp_priv->clk_phy); + if (ret) + return dev_err_probe(dev, ret, "Cannot get PHY clock\n"); 
clk_set_rate(bsp_priv->clk_phy, 50000000); } @@ -1545,77 +1543,36 @@ static int rk_gmac_clk_init(struct plat_stmmacenet_data *plat) static int gmac_clk_enable(struct rk_priv_data *bsp_priv, bool enable) { - int phy_iface = bsp_priv->phy_iface; + int ret; if (enable) { if (!bsp_priv->clk_enabled) { - if (phy_iface == PHY_INTERFACE_MODE_RMII) { - if (!IS_ERR(bsp_priv->mac_clk_rx)) - clk_prepare_enable( - bsp_priv->mac_clk_rx); - - if (!IS_ERR(bsp_priv->clk_mac_ref)) - clk_prepare_enable( - bsp_priv->clk_mac_ref); - - if (!IS_ERR(bsp_priv->clk_mac_refout)) - clk_prepare_enable( - bsp_priv->clk_mac_refout); - } - - if (!IS_ERR(bsp_priv->clk_phy)) - clk_prepare_enable(bsp_priv->clk_phy); + ret = clk_bulk_prepare_enable(bsp_priv->num_clks, + bsp_priv->clks); + if (ret) + return ret; - if (!IS_ERR(bsp_priv->aclk_mac)) - clk_prepare_enable(bsp_priv->aclk_mac); - - if (!IS_ERR(bsp_priv->pclk_mac)) - clk_prepare_enable(bsp_priv->pclk_mac); - - if (!IS_ERR(bsp_priv->mac_clk_tx)) - clk_prepare_enable(bsp_priv->mac_clk_tx); - - if (!IS_ERR(bsp_priv->clk_mac_speed)) - clk_prepare_enable(bsp_priv->clk_mac_speed); + ret = clk_prepare_enable(bsp_priv->clk_phy); + if (ret) + return ret; if (bsp_priv->ops && bsp_priv->ops->set_clock_selection) bsp_priv->ops->set_clock_selection(bsp_priv, bsp_priv->clock_input, true); - /** - * if (!IS_ERR(bsp_priv->clk_mac)) - * clk_prepare_enable(bsp_priv->clk_mac); - */ mdelay(5); bsp_priv->clk_enabled = true; } } else { if (bsp_priv->clk_enabled) { - if (phy_iface == PHY_INTERFACE_MODE_RMII) { - clk_disable_unprepare(bsp_priv->mac_clk_rx); - - clk_disable_unprepare(bsp_priv->clk_mac_ref); - - clk_disable_unprepare(bsp_priv->clk_mac_refout); - } - + clk_bulk_disable_unprepare(bsp_priv->num_clks, + bsp_priv->clks); clk_disable_unprepare(bsp_priv->clk_phy); - clk_disable_unprepare(bsp_priv->aclk_mac); - - clk_disable_unprepare(bsp_priv->pclk_mac); - - clk_disable_unprepare(bsp_priv->mac_clk_tx); - - clk_disable_unprepare(bsp_priv->clk_mac_speed); - 
if (bsp_priv->ops && bsp_priv->ops->set_clock_selection) bsp_priv->ops->set_clock_selection(bsp_priv, bsp_priv->clock_input, false); - /** - * if (!IS_ERR(bsp_priv->clk_mac)) - * clk_disable_unprepare(bsp_priv->clk_mac); - */ + bsp_priv->clk_enabled = false; } } @@ -1629,9 +1586,6 @@ static int phy_power_on(struct rk_priv_data *bsp_priv, bool enable) int ret; struct device *dev = &bsp_priv->pdev->dev; - if (!ldo) - return 0; - if (enable) { ret = regulator_enable(ldo); if (ret) @@ -1679,14 +1633,11 @@ static struct rk_priv_data *rk_gmac_setup(struct platform_device *pdev, } } - bsp_priv->regulator = devm_regulator_get_optional(dev, "phy"); + bsp_priv->regulator = devm_regulator_get(dev, "phy"); if (IS_ERR(bsp_priv->regulator)) { - if (PTR_ERR(bsp_priv->regulator) == -EPROBE_DEFER) { - dev_err(dev, "phy regulator is not available yet, deferred probing\n"); - return ERR_PTR(-EPROBE_DEFER); - } - dev_err(dev, "no regulator found\n"); - bsp_priv->regulator = NULL; + ret = PTR_ERR(bsp_priv->regulator); + dev_err_probe(dev, ret, "failed to get phy regulator\n"); + return ERR_PTR(ret); } ret = of_property_read_string(dev->of_node, "clock_in_out", &strings); diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-starfive.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-starfive.c new file mode 100644 index 000000000000..4f51a7889642 --- /dev/null +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-starfive.c @@ -0,0 +1,171 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* + * StarFive DWMAC platform driver + * + * Copyright (C) 2021 Emil Renner Berthing <kernel@esmil.dk> + * Copyright (C) 2022 StarFive Technology Co., Ltd. 
+ * + */ + +#include <linux/mfd/syscon.h> +#include <linux/of_device.h> +#include <linux/regmap.h> + +#include "stmmac_platform.h" + +#define STARFIVE_DWMAC_PHY_INFT_RGMII 0x1 +#define STARFIVE_DWMAC_PHY_INFT_RMII 0x4 +#define STARFIVE_DWMAC_PHY_INFT_FIELD 0x7U + +struct starfive_dwmac { + struct device *dev; + struct clk *clk_tx; +}; + +static void starfive_dwmac_fix_mac_speed(void *priv, unsigned int speed) +{ + struct starfive_dwmac *dwmac = priv; + unsigned long rate; + int err; + + rate = clk_get_rate(dwmac->clk_tx); + + switch (speed) { + case SPEED_1000: + rate = 125000000; + break; + case SPEED_100: + rate = 25000000; + break; + case SPEED_10: + rate = 2500000; + break; + default: + dev_err(dwmac->dev, "invalid speed %u\n", speed); + break; + } + + err = clk_set_rate(dwmac->clk_tx, rate); + if (err) + dev_err(dwmac->dev, "failed to set tx rate %lu\n", rate); +} + +static int starfive_dwmac_set_mode(struct plat_stmmacenet_data *plat_dat) +{ + struct starfive_dwmac *dwmac = plat_dat->bsp_priv; + struct regmap *regmap; + unsigned int args[2]; + unsigned int mode; + int err; + + switch (plat_dat->interface) { + case PHY_INTERFACE_MODE_RMII: + mode = STARFIVE_DWMAC_PHY_INFT_RMII; + break; + + case PHY_INTERFACE_MODE_RGMII: + case PHY_INTERFACE_MODE_RGMII_ID: + mode = STARFIVE_DWMAC_PHY_INFT_RGMII; + break; + + default: + dev_err(dwmac->dev, "unsupported interface %d\n", + plat_dat->interface); + return -EINVAL; + } + + regmap = syscon_regmap_lookup_by_phandle_args(dwmac->dev->of_node, + "starfive,syscon", + 2, args); + if (IS_ERR(regmap)) + return dev_err_probe(dwmac->dev, PTR_ERR(regmap), "getting the regmap failed\n"); + + /* args[0]:offset args[1]: shift */ + err = regmap_update_bits(regmap, args[0], + STARFIVE_DWMAC_PHY_INFT_FIELD << args[1], + mode << args[1]); + if (err) + return dev_err_probe(dwmac->dev, err, "error setting phy mode\n"); + + return 0; +} + +static int starfive_dwmac_probe(struct platform_device *pdev) +{ + struct plat_stmmacenet_data 
*plat_dat; + struct stmmac_resources stmmac_res; + struct starfive_dwmac *dwmac; + struct clk *clk_gtx; + int err; + + err = stmmac_get_platform_resources(pdev, &stmmac_res); + if (err) + return dev_err_probe(&pdev->dev, err, + "failed to get resources\n"); + + plat_dat = stmmac_probe_config_dt(pdev, stmmac_res.mac); + if (IS_ERR(plat_dat)) + return dev_err_probe(&pdev->dev, PTR_ERR(plat_dat), + "dt configuration failed\n"); + + dwmac = devm_kzalloc(&pdev->dev, sizeof(*dwmac), GFP_KERNEL); + if (!dwmac) + return -ENOMEM; + + dwmac->clk_tx = devm_clk_get_enabled(&pdev->dev, "tx"); + if (IS_ERR(dwmac->clk_tx)) + return dev_err_probe(&pdev->dev, PTR_ERR(dwmac->clk_tx), + "error getting tx clock\n"); + + clk_gtx = devm_clk_get_enabled(&pdev->dev, "gtx"); + if (IS_ERR(clk_gtx)) + return dev_err_probe(&pdev->dev, PTR_ERR(clk_gtx), + "error getting gtx clock\n"); + + /* Generally, the rgmii_tx clock is provided by the internal clock, + * which needs to match the corresponding clock frequency according + * to different speeds. If the rgmii_tx clock is provided by the + * external rgmii_rxin, there is no need to configure the clock + * internally, because rgmii_rxin will be adaptively adjusted. 
+ */ + if (!device_property_read_bool(&pdev->dev, "starfive,tx-use-rgmii-clk")) + plat_dat->fix_mac_speed = starfive_dwmac_fix_mac_speed; + + dwmac->dev = &pdev->dev; + plat_dat->bsp_priv = dwmac; + plat_dat->dma_cfg->dche = true; + + err = starfive_dwmac_set_mode(plat_dat); + if (err) + return err; + + err = stmmac_dvr_probe(&pdev->dev, plat_dat, &stmmac_res); + if (err) { + stmmac_remove_config_dt(pdev, plat_dat); + return err; + } + + return 0; +} + +static const struct of_device_id starfive_dwmac_match[] = { + { .compatible = "starfive,jh7110-dwmac" }, + { /* sentinel */ } +}; +MODULE_DEVICE_TABLE(of, starfive_dwmac_match); + +static struct platform_driver starfive_dwmac_driver = { + .probe = starfive_dwmac_probe, + .remove = stmmac_pltfr_remove, + .driver = { + .name = "starfive-dwmac", + .pm = &stmmac_pltfr_pm_ops, + .of_match_table = starfive_dwmac_match, + }, +}; +module_platform_driver(starfive_dwmac_driver); + +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("StarFive DWMAC platform driver"); +MODULE_AUTHOR("Emil Renner Berthing <kernel@esmil.dk>"); +MODULE_AUTHOR("Samin Guo <samin.guo@starfivetech.com>"); diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c index be3b1ebc06ab..465ce66ef9c1 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c @@ -35,7 +35,7 @@ #define IS_PHY_IF_MODE_GBIT(iface) (IS_PHY_IF_MODE_RGMII(iface) || \ iface == PHY_INTERFACE_MODE_GMII) -/* STiH4xx register definitions (STiH415/STiH416/STiH407/STiH410 families) +/* STiH4xx register definitions (STiH407/STiH410 families) * * Below table summarizes the clock requirement and clock sources for * supported phy interface modes with link speeds. 
@@ -75,27 +75,6 @@ #define STIH4XX_ETH_SEL_INTERNAL_NOTEXT_PHYCLK BIT(7) #define STIH4XX_ETH_SEL_TXCLK_NOT_CLK125 BIT(6) -/* STiD127 register definitions - *----------------------- - * src |BIT(6)| BIT(7)| - *----------------------- - * MII | 1 | n/a | - *----------------------- - * RMII | n/a | 1 | - * clkgen| | | - *----------------------- - * RMII | n/a | 0 | - * phyclk| | | - *----------------------- - * RGMII | 1 | n/a | - * clkgen| | | - *----------------------- - */ - -#define STID127_RETIME_SRC_MASK GENMASK(7, 6) -#define STID127_ETH_SEL_INTERNAL_NOTEXT_PHYCLK BIT(7) -#define STID127_ETH_SEL_INTERNAL_NOTEXT_TXCLK BIT(6) - #define ENMII_MASK GENMASK(5, 5) #define ENMII BIT(5) #define EN_MASK GENMASK(1, 1) @@ -194,36 +173,6 @@ static void stih4xx_fix_retime_src(void *priv, u32 spd) stih4xx_tx_retime_val[src]); } -static void stid127_fix_retime_src(void *priv, u32 spd) -{ - struct sti_dwmac *dwmac = priv; - u32 reg = dwmac->ctrl_reg; - u32 freq = 0; - u32 val = 0; - - if (dwmac->interface == PHY_INTERFACE_MODE_MII) { - val = STID127_ETH_SEL_INTERNAL_NOTEXT_TXCLK; - } else if (dwmac->interface == PHY_INTERFACE_MODE_RMII) { - if (!dwmac->ext_phyclk) { - val = STID127_ETH_SEL_INTERNAL_NOTEXT_PHYCLK; - freq = DWMAC_50MHZ; - } - } else if (IS_PHY_IF_MODE_RGMII(dwmac->interface)) { - val = STID127_ETH_SEL_INTERNAL_NOTEXT_TXCLK; - if (spd == SPEED_1000) - freq = DWMAC_125MHZ; - else if (spd == SPEED_100) - freq = DWMAC_25MHZ; - else if (spd == SPEED_10) - freq = DWMAC_2_5MHZ; - } - - if (freq) - clk_set_rate(dwmac->clk, freq); - - regmap_update_bits(dwmac->regmap, reg, STID127_RETIME_SRC_MASK, val); -} - static int sti_dwmac_set_mode(struct sti_dwmac *dwmac) { struct regmap *regmap = dwmac->regmap; @@ -408,14 +357,7 @@ static const struct sti_dwmac_of_data stih4xx_dwmac_data = { .fix_retime_src = stih4xx_fix_retime_src, }; -static const struct sti_dwmac_of_data stid127_dwmac_data = { - .fix_retime_src = stid127_fix_retime_src, -}; - static const struct of_device_id 
sti_dwmac_match[] = { - { .compatible = "st,stih415-dwmac", .data = &stih4xx_dwmac_data}, - { .compatible = "st,stih416-dwmac", .data = &stih4xx_dwmac_data}, - { .compatible = "st,stid127-dwmac", .data = &stid127_dwmac_data}, { .compatible = "st,stih407-dwmac", .data = &stih4xx_dwmac_data}, { } }; diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c index f834472599f7..c2c592ba0eb8 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c @@ -304,7 +304,8 @@ static void sun8i_dwmac_dma_init(void __iomem *ioaddr, writel(0x1FFFFFF, ioaddr + EMAC_INT_STA); } -static void sun8i_dwmac_dma_init_rx(void __iomem *ioaddr, +static void sun8i_dwmac_dma_init_rx(struct stmmac_priv *priv, + void __iomem *ioaddr, struct stmmac_dma_cfg *dma_cfg, dma_addr_t dma_rx_phy, u32 chan) { @@ -312,7 +313,8 @@ static void sun8i_dwmac_dma_init_rx(void __iomem *ioaddr, writel(lower_32_bits(dma_rx_phy), ioaddr + EMAC_RX_DESC_LIST); } -static void sun8i_dwmac_dma_init_tx(void __iomem *ioaddr, +static void sun8i_dwmac_dma_init_tx(struct stmmac_priv *priv, + void __iomem *ioaddr, struct stmmac_dma_cfg *dma_cfg, dma_addr_t dma_tx_phy, u32 chan) { @@ -324,7 +326,8 @@ static void sun8i_dwmac_dma_init_tx(void __iomem *ioaddr, * Called from stmmac_dma_ops->dump_regs * Used for ethtool */ -static void sun8i_dwmac_dump_regs(void __iomem *ioaddr, u32 *reg_space) +static void sun8i_dwmac_dump_regs(struct stmmac_priv *priv, + void __iomem *ioaddr, u32 *reg_space) { int i; @@ -352,7 +355,8 @@ static void sun8i_dwmac_dump_mac_regs(struct mac_device_info *hw, } } -static void sun8i_dwmac_enable_dma_irq(void __iomem *ioaddr, u32 chan, +static void sun8i_dwmac_enable_dma_irq(struct stmmac_priv *priv, + void __iomem *ioaddr, u32 chan, bool rx, bool tx) { u32 value = readl(ioaddr + EMAC_INT_EN); @@ -365,7 +369,8 @@ static void sun8i_dwmac_enable_dma_irq(void __iomem *ioaddr, u32 chan, 
writel(value, ioaddr + EMAC_INT_EN); } -static void sun8i_dwmac_disable_dma_irq(void __iomem *ioaddr, u32 chan, +static void sun8i_dwmac_disable_dma_irq(struct stmmac_priv *priv, + void __iomem *ioaddr, u32 chan, bool rx, bool tx) { u32 value = readl(ioaddr + EMAC_INT_EN); @@ -378,7 +383,8 @@ static void sun8i_dwmac_disable_dma_irq(void __iomem *ioaddr, u32 chan, writel(value, ioaddr + EMAC_INT_EN); } -static void sun8i_dwmac_dma_start_tx(void __iomem *ioaddr, u32 chan) +static void sun8i_dwmac_dma_start_tx(struct stmmac_priv *priv, + void __iomem *ioaddr, u32 chan) { u32 v; @@ -398,7 +404,8 @@ static void sun8i_dwmac_enable_dma_transmission(void __iomem *ioaddr) writel(v, ioaddr + EMAC_TX_CTL1); } -static void sun8i_dwmac_dma_stop_tx(void __iomem *ioaddr, u32 chan) +static void sun8i_dwmac_dma_stop_tx(struct stmmac_priv *priv, + void __iomem *ioaddr, u32 chan) { u32 v; @@ -407,7 +414,8 @@ static void sun8i_dwmac_dma_stop_tx(void __iomem *ioaddr, u32 chan) writel(v, ioaddr + EMAC_TX_CTL1); } -static void sun8i_dwmac_dma_start_rx(void __iomem *ioaddr, u32 chan) +static void sun8i_dwmac_dma_start_rx(struct stmmac_priv *priv, + void __iomem *ioaddr, u32 chan) { u32 v; @@ -417,7 +425,8 @@ static void sun8i_dwmac_dma_start_rx(void __iomem *ioaddr, u32 chan) writel(v, ioaddr + EMAC_RX_CTL1); } -static void sun8i_dwmac_dma_stop_rx(void __iomem *ioaddr, u32 chan) +static void sun8i_dwmac_dma_stop_rx(struct stmmac_priv *priv, + void __iomem *ioaddr, u32 chan) { u32 v; @@ -426,7 +435,8 @@ static void sun8i_dwmac_dma_stop_rx(void __iomem *ioaddr, u32 chan) writel(v, ioaddr + EMAC_RX_CTL1); } -static int sun8i_dwmac_dma_interrupt(void __iomem *ioaddr, +static int sun8i_dwmac_dma_interrupt(struct stmmac_priv *priv, + void __iomem *ioaddr, struct stmmac_extra_stats *x, u32 chan, u32 dir) { @@ -492,7 +502,8 @@ static int sun8i_dwmac_dma_interrupt(void __iomem *ioaddr, return ret; } -static void sun8i_dwmac_dma_operation_mode_rx(void __iomem *ioaddr, int mode, +static void 
sun8i_dwmac_dma_operation_mode_rx(struct stmmac_priv *priv, + void __iomem *ioaddr, int mode, u32 channel, int fifosz, u8 qmode) { u32 v; @@ -515,7 +526,8 @@ static void sun8i_dwmac_dma_operation_mode_rx(void __iomem *ioaddr, int mode, writel(v, ioaddr + EMAC_RX_CTL1); } -static void sun8i_dwmac_dma_operation_mode_tx(void __iomem *ioaddr, int mode, +static void sun8i_dwmac_dma_operation_mode_tx(struct stmmac_priv *priv, + void __iomem *ioaddr, int mode, u32 channel, int fifosz, u8 qmode) { u32 v; diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c index 0e00dd83d027..3927609abc44 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c @@ -414,7 +414,8 @@ static void dwmac1000_get_adv_lp(void __iomem *ioaddr, struct rgmii_adv *adv) dwmac_get_adv_lp(ioaddr, GMAC_PCS_BASE, adv); } -static void dwmac1000_debug(void __iomem *ioaddr, struct stmmac_extra_stats *x, +static void dwmac1000_debug(struct stmmac_priv *priv, void __iomem *ioaddr, + struct stmmac_extra_stats *x, u32 rx_queues, u32 tx_queues) { u32 value = readl(ioaddr + GMAC_DEBUG); diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c index f5581db0ba9b..daf79cdbd3ec 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c @@ -110,7 +110,8 @@ static void dwmac1000_dma_init(void __iomem *ioaddr, writel(DMA_INTR_DEFAULT_MASK, ioaddr + DMA_INTR_ENA); } -static void dwmac1000_dma_init_rx(void __iomem *ioaddr, +static void dwmac1000_dma_init_rx(struct stmmac_priv *priv, + void __iomem *ioaddr, struct stmmac_dma_cfg *dma_cfg, dma_addr_t dma_rx_phy, u32 chan) { @@ -118,7 +119,8 @@ static void dwmac1000_dma_init_rx(void __iomem *ioaddr, writel(lower_32_bits(dma_rx_phy), ioaddr + DMA_RCV_BASE_ADDR); } -static void dwmac1000_dma_init_tx(void __iomem *ioaddr, 
+static void dwmac1000_dma_init_tx(struct stmmac_priv *priv, + void __iomem *ioaddr, struct stmmac_dma_cfg *dma_cfg, dma_addr_t dma_tx_phy, u32 chan) { @@ -147,7 +149,8 @@ static u32 dwmac1000_configure_fc(u32 csr6, int rxfifosz) return csr6; } -static void dwmac1000_dma_operation_mode_rx(void __iomem *ioaddr, int mode, +static void dwmac1000_dma_operation_mode_rx(struct stmmac_priv *priv, + void __iomem *ioaddr, int mode, u32 channel, int fifosz, u8 qmode) { u32 csr6 = readl(ioaddr + DMA_CONTROL); @@ -175,7 +178,8 @@ static void dwmac1000_dma_operation_mode_rx(void __iomem *ioaddr, int mode, writel(csr6, ioaddr + DMA_CONTROL); } -static void dwmac1000_dma_operation_mode_tx(void __iomem *ioaddr, int mode, +static void dwmac1000_dma_operation_mode_tx(struct stmmac_priv *priv, + void __iomem *ioaddr, int mode, u32 channel, int fifosz, u8 qmode) { u32 csr6 = readl(ioaddr + DMA_CONTROL); @@ -208,7 +212,8 @@ static void dwmac1000_dma_operation_mode_tx(void __iomem *ioaddr, int mode, writel(csr6, ioaddr + DMA_CONTROL); } -static void dwmac1000_dump_dma_regs(void __iomem *ioaddr, u32 *reg_space) +static void dwmac1000_dump_dma_regs(struct stmmac_priv *priv, + void __iomem *ioaddr, u32 *reg_space) { int i; @@ -263,8 +268,8 @@ static int dwmac1000_get_hw_feature(void __iomem *ioaddr, return 0; } -static void dwmac1000_rx_watchdog(void __iomem *ioaddr, u32 riwt, - u32 queue) +static void dwmac1000_rx_watchdog(struct stmmac_priv *priv, + void __iomem *ioaddr, u32 riwt, u32 queue) { writel(riwt, ioaddr + DMA_RX_WATCHDOG); } diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c index 8f0d9bc7cab5..1c32b1788f02 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c @@ -29,7 +29,7 @@ static void dwmac100_dma_init(void __iomem *ioaddr, writel(DMA_INTR_DEFAULT_MASK, ioaddr + DMA_INTR_ENA); } -static void dwmac100_dma_init_rx(void __iomem *ioaddr, +static 
void dwmac100_dma_init_rx(struct stmmac_priv *priv, void __iomem *ioaddr, struct stmmac_dma_cfg *dma_cfg, dma_addr_t dma_rx_phy, u32 chan) { @@ -37,7 +37,7 @@ static void dwmac100_dma_init_rx(void __iomem *ioaddr, writel(lower_32_bits(dma_rx_phy), ioaddr + DMA_RCV_BASE_ADDR); } -static void dwmac100_dma_init_tx(void __iomem *ioaddr, +static void dwmac100_dma_init_tx(struct stmmac_priv *priv, void __iomem *ioaddr, struct stmmac_dma_cfg *dma_cfg, dma_addr_t dma_tx_phy, u32 chan) { @@ -50,7 +50,8 @@ static void dwmac100_dma_init_tx(void __iomem *ioaddr, * The transmit threshold can be programmed by setting the TTC bits in the DMA * control register. */ -static void dwmac100_dma_operation_mode_tx(void __iomem *ioaddr, int mode, +static void dwmac100_dma_operation_mode_tx(struct stmmac_priv *priv, + void __iomem *ioaddr, int mode, u32 channel, int fifosz, u8 qmode) { u32 csr6 = readl(ioaddr + DMA_CONTROL); @@ -65,7 +66,8 @@ static void dwmac100_dma_operation_mode_tx(void __iomem *ioaddr, int mode, writel(csr6, ioaddr + DMA_CONTROL); } -static void dwmac100_dump_dma_regs(void __iomem *ioaddr, u32 *reg_space) +static void dwmac100_dump_dma_regs(struct stmmac_priv *priv, + void __iomem *ioaddr, u32 *reg_space) { int i; @@ -80,10 +82,10 @@ static void dwmac100_dump_dma_regs(void __iomem *ioaddr, u32 *reg_space) } /* DMA controller has two counters to track the number of the missed frames. 
*/ -static void dwmac100_dma_diagnostic_fr(void *data, struct stmmac_extra_stats *x, +static void dwmac100_dma_diagnostic_fr(struct net_device_stats *stats, + struct stmmac_extra_stats *x, void __iomem *ioaddr) { - struct net_device_stats *stats = (struct net_device_stats *)data; u32 csr8 = readl(ioaddr + DMA_MISSED_FRAME_CTR); if (unlikely(csr8)) { diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4.h b/drivers/net/ethernet/stmicro/stmmac/dwmac4.h index ccd49346d3b3..4538f334df57 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4.h +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4.h @@ -336,14 +336,25 @@ enum power_event { #define MTL_CHAN_BASE_ADDR 0x00000d00 #define MTL_CHAN_BASE_OFFSET 0x40 -#define MTL_CHANX_BASE_ADDR(x) (MTL_CHAN_BASE_ADDR + \ - (x * MTL_CHAN_BASE_OFFSET)) -#define MTL_CHAN_TX_OP_MODE(x) MTL_CHANX_BASE_ADDR(x) -#define MTL_CHAN_TX_DEBUG(x) (MTL_CHANX_BASE_ADDR(x) + 0x8) -#define MTL_CHAN_INT_CTRL(x) (MTL_CHANX_BASE_ADDR(x) + 0x2c) -#define MTL_CHAN_RX_OP_MODE(x) (MTL_CHANX_BASE_ADDR(x) + 0x30) -#define MTL_CHAN_RX_DEBUG(x) (MTL_CHANX_BASE_ADDR(x) + 0x38) +static inline u32 mtl_chanx_base_addr(const struct dwmac4_addrs *addrs, + const u32 x) +{ + u32 addr; + + if (addrs) + addr = addrs->mtl_chan + (x * addrs->mtl_chan_offset); + else + addr = MTL_CHAN_BASE_ADDR + (x * MTL_CHAN_BASE_OFFSET); + + return addr; +} + +#define MTL_CHAN_TX_OP_MODE(addrs, x) mtl_chanx_base_addr(addrs, x) +#define MTL_CHAN_TX_DEBUG(addrs, x) (mtl_chanx_base_addr(addrs, x) + 0x8) +#define MTL_CHAN_INT_CTRL(addrs, x) (mtl_chanx_base_addr(addrs, x) + 0x2c) +#define MTL_CHAN_RX_OP_MODE(addrs, x) (mtl_chanx_base_addr(addrs, x) + 0x30) +#define MTL_CHAN_RX_DEBUG(addrs, x) (mtl_chanx_base_addr(addrs, x) + 0x38) #define MTL_OP_MODE_RSF BIT(5) #define MTL_OP_MODE_TXQEN_MASK GENMASK(3, 2) @@ -388,8 +399,19 @@ enum power_event { /* MTL ETS Control register */ #define MTL_ETS_CTRL_BASE_ADDR 0x00000d10 #define MTL_ETS_CTRL_BASE_OFFSET 0x40 -#define 
MTL_ETSX_CTRL_BASE_ADDR(x) (MTL_ETS_CTRL_BASE_ADDR + \ - ((x) * MTL_ETS_CTRL_BASE_OFFSET)) + +static inline u32 mtl_etsx_ctrl_base_addr(const struct dwmac4_addrs *addrs, + const u32 x) +{ + u32 addr; + + if (addrs) + addr = addrs->mtl_ets_ctrl + (x * addrs->mtl_ets_ctrl_offset); + else + addr = MTL_ETS_CTRL_BASE_ADDR + (x * MTL_ETS_CTRL_BASE_OFFSET); + + return addr; +} #define MTL_ETS_CTRL_CC BIT(3) #define MTL_ETS_CTRL_AVALG BIT(2) @@ -397,31 +419,76 @@ enum power_event { /* MTL Queue Quantum Weight */ #define MTL_TXQ_WEIGHT_BASE_ADDR 0x00000d18 #define MTL_TXQ_WEIGHT_BASE_OFFSET 0x40 -#define MTL_TXQX_WEIGHT_BASE_ADDR(x) (MTL_TXQ_WEIGHT_BASE_ADDR + \ - ((x) * MTL_TXQ_WEIGHT_BASE_OFFSET)) + +static inline u32 mtl_txqx_weight_base_addr(const struct dwmac4_addrs *addrs, + const u32 x) +{ + u32 addr; + + if (addrs) + addr = addrs->mtl_txq_weight + (x * addrs->mtl_txq_weight_offset); + else + addr = MTL_TXQ_WEIGHT_BASE_ADDR + (x * MTL_TXQ_WEIGHT_BASE_OFFSET); + + return addr; +} + #define MTL_TXQ_WEIGHT_ISCQW_MASK GENMASK(20, 0) /* MTL sendSlopeCredit register */ #define MTL_SEND_SLP_CRED_BASE_ADDR 0x00000d1c #define MTL_SEND_SLP_CRED_OFFSET 0x40 -#define MTL_SEND_SLP_CREDX_BASE_ADDR(x) (MTL_SEND_SLP_CRED_BASE_ADDR + \ - ((x) * MTL_SEND_SLP_CRED_OFFSET)) + +static inline u32 mtl_send_slp_credx_base_addr(const struct dwmac4_addrs *addrs, + const u32 x) +{ + u32 addr; + + if (addrs) + addr = addrs->mtl_send_slp_cred + (x * addrs->mtl_send_slp_cred_offset); + else + addr = MTL_SEND_SLP_CRED_BASE_ADDR + (x * MTL_SEND_SLP_CRED_OFFSET); + + return addr; +} #define MTL_SEND_SLP_CRED_SSC_MASK GENMASK(13, 0) /* MTL hiCredit register */ #define MTL_HIGH_CRED_BASE_ADDR 0x00000d20 #define MTL_HIGH_CRED_OFFSET 0x40 -#define MTL_HIGH_CREDX_BASE_ADDR(x) (MTL_HIGH_CRED_BASE_ADDR + \ - ((x) * MTL_HIGH_CRED_OFFSET)) + +static inline u32 mtl_high_credx_base_addr(const struct dwmac4_addrs *addrs, + const u32 x) +{ + u32 addr; + + if (addrs) + addr = addrs->mtl_high_cred + (x * 
addrs->mtl_high_cred_offset); + else + addr = MTL_HIGH_CRED_BASE_ADDR + (x * MTL_HIGH_CRED_OFFSET); + + return addr; +} #define MTL_HIGH_CRED_HC_MASK GENMASK(28, 0) /* MTL loCredit register */ #define MTL_LOW_CRED_BASE_ADDR 0x00000d24 #define MTL_LOW_CRED_OFFSET 0x40 -#define MTL_LOW_CREDX_BASE_ADDR(x) (MTL_LOW_CRED_BASE_ADDR + \ - ((x) * MTL_LOW_CRED_OFFSET)) + +static inline u32 mtl_low_credx_base_addr(const struct dwmac4_addrs *addrs, + const u32 x) +{ + u32 addr; + + if (addrs) + addr = addrs->mtl_low_cred + (x * addrs->mtl_low_cred_offset); + else + addr = MTL_LOW_CRED_BASE_ADDR + (x * MTL_LOW_CRED_OFFSET); + + return addr; +} #define MTL_HIGH_CRED_LC_MASK GENMASK(28, 0) diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c index 36251ec2589c..afaec3fb9ab6 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c @@ -198,15 +198,18 @@ static void dwmac4_prog_mtl_tx_algorithms(struct mac_device_info *hw, writel(value, ioaddr + MTL_OPERATION_MODE); } -static void dwmac4_set_mtl_tx_queue_weight(struct mac_device_info *hw, +static void dwmac4_set_mtl_tx_queue_weight(struct stmmac_priv *priv, + struct mac_device_info *hw, u32 weight, u32 queue) { + const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs; void __iomem *ioaddr = hw->pcsr; - u32 value = readl(ioaddr + MTL_TXQX_WEIGHT_BASE_ADDR(queue)); + u32 value = readl(ioaddr + mtl_txqx_weight_base_addr(dwmac4_addrs, + queue)); value &= ~MTL_TXQ_WEIGHT_ISCQW_MASK; value |= weight & MTL_TXQ_WEIGHT_ISCQW_MASK; - writel(value, ioaddr + MTL_TXQX_WEIGHT_BASE_ADDR(queue)); + writel(value, ioaddr + mtl_txqx_weight_base_addr(dwmac4_addrs, queue)); } static void dwmac4_map_mtl_dma(struct mac_device_info *hw, u32 queue, u32 chan) @@ -227,10 +230,12 @@ static void dwmac4_map_mtl_dma(struct mac_device_info *hw, u32 queue, u32 chan) } } -static void dwmac4_config_cbs(struct mac_device_info *hw, 
+static void dwmac4_config_cbs(struct stmmac_priv *priv, + struct mac_device_info *hw, u32 send_slope, u32 idle_slope, u32 high_credit, u32 low_credit, u32 queue) { + const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs; void __iomem *ioaddr = hw->pcsr; u32 value; @@ -241,31 +246,33 @@ static void dwmac4_config_cbs(struct mac_device_info *hw, pr_debug("\tlow_credit: 0x%08x\n", low_credit); /* enable AV algorithm */ - value = readl(ioaddr + MTL_ETSX_CTRL_BASE_ADDR(queue)); + value = readl(ioaddr + mtl_etsx_ctrl_base_addr(dwmac4_addrs, queue)); value |= MTL_ETS_CTRL_AVALG; value |= MTL_ETS_CTRL_CC; - writel(value, ioaddr + MTL_ETSX_CTRL_BASE_ADDR(queue)); + writel(value, ioaddr + mtl_etsx_ctrl_base_addr(dwmac4_addrs, queue)); /* configure send slope */ - value = readl(ioaddr + MTL_SEND_SLP_CREDX_BASE_ADDR(queue)); + value = readl(ioaddr + mtl_send_slp_credx_base_addr(dwmac4_addrs, + queue)); value &= ~MTL_SEND_SLP_CRED_SSC_MASK; value |= send_slope & MTL_SEND_SLP_CRED_SSC_MASK; - writel(value, ioaddr + MTL_SEND_SLP_CREDX_BASE_ADDR(queue)); + writel(value, ioaddr + mtl_send_slp_credx_base_addr(dwmac4_addrs, + queue)); /* configure idle slope (same register as tx weight) */ - dwmac4_set_mtl_tx_queue_weight(hw, idle_slope, queue); + dwmac4_set_mtl_tx_queue_weight(priv, hw, idle_slope, queue); /* configure high credit */ - value = readl(ioaddr + MTL_HIGH_CREDX_BASE_ADDR(queue)); + value = readl(ioaddr + mtl_high_credx_base_addr(dwmac4_addrs, queue)); value &= ~MTL_HIGH_CRED_HC_MASK; value |= high_credit & MTL_HIGH_CRED_HC_MASK; - writel(value, ioaddr + MTL_HIGH_CREDX_BASE_ADDR(queue)); + writel(value, ioaddr + mtl_high_credx_base_addr(dwmac4_addrs, queue)); /* configure high credit */ - value = readl(ioaddr + MTL_LOW_CREDX_BASE_ADDR(queue)); + value = readl(ioaddr + mtl_low_credx_base_addr(dwmac4_addrs, queue)); value &= ~MTL_HIGH_CRED_LC_MASK; value |= low_credit & MTL_HIGH_CRED_LC_MASK; - writel(value, ioaddr + MTL_LOW_CREDX_BASE_ADDR(queue)); + 
writel(value, ioaddr + mtl_low_credx_base_addr(dwmac4_addrs, queue)); } static void dwmac4_dump_regs(struct mac_device_info *hw, u32 *reg_space) @@ -759,8 +766,10 @@ static void dwmac4_phystatus(void __iomem *ioaddr, struct stmmac_extra_stats *x) } } -static int dwmac4_irq_mtl_status(struct mac_device_info *hw, u32 chan) +static int dwmac4_irq_mtl_status(struct stmmac_priv *priv, + struct mac_device_info *hw, u32 chan) { + const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs; void __iomem *ioaddr = hw->pcsr; u32 mtl_int_qx_status; int ret = 0; @@ -770,12 +779,13 @@ static int dwmac4_irq_mtl_status(struct mac_device_info *hw, u32 chan) /* Check MTL Interrupt */ if (mtl_int_qx_status & MTL_INT_QX(chan)) { /* read Queue x Interrupt status */ - u32 status = readl(ioaddr + MTL_CHAN_INT_CTRL(chan)); + u32 status = readl(ioaddr + MTL_CHAN_INT_CTRL(dwmac4_addrs, + chan)); if (status & MTL_RX_OVERFLOW_INT) { /* clear Interrupt */ writel(status | MTL_RX_OVERFLOW_INT, - ioaddr + MTL_CHAN_INT_CTRL(chan)); + ioaddr + MTL_CHAN_INT_CTRL(dwmac4_addrs, chan)); ret = CORE_IRQ_MTL_RX_OVERFLOW; } } @@ -833,14 +843,16 @@ static int dwmac4_irq_status(struct mac_device_info *hw, return ret; } -static void dwmac4_debug(void __iomem *ioaddr, struct stmmac_extra_stats *x, +static void dwmac4_debug(struct stmmac_priv *priv, void __iomem *ioaddr, + struct stmmac_extra_stats *x, u32 rx_queues, u32 tx_queues) { + const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs; u32 value; u32 queue; for (queue = 0; queue < tx_queues; queue++) { - value = readl(ioaddr + MTL_CHAN_TX_DEBUG(queue)); + value = readl(ioaddr + MTL_CHAN_TX_DEBUG(dwmac4_addrs, queue)); if (value & MTL_DEBUG_TXSTSFSTS) x->mtl_tx_status_fifo_full++; @@ -865,7 +877,7 @@ static void dwmac4_debug(void __iomem *ioaddr, struct stmmac_extra_stats *x, } for (queue = 0; queue < rx_queues; queue++) { - value = readl(ioaddr + MTL_CHAN_RX_DEBUG(queue)); + value = readl(ioaddr + MTL_CHAN_RX_DEBUG(dwmac4_addrs, 
queue)); if (value & MTL_DEBUG_RXFSTS_MASK) { u32 rxfsts = (value & MTL_DEBUG_RXFSTS_MASK) diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c index 8cc80b1db4cb..6a011d8633e8 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_descs.c @@ -13,11 +13,11 @@ #include "dwmac4.h" #include "dwmac4_descs.h" -static int dwmac4_wrback_get_tx_status(void *data, struct stmmac_extra_stats *x, +static int dwmac4_wrback_get_tx_status(struct net_device_stats *stats, + struct stmmac_extra_stats *x, struct dma_desc *p, void __iomem *ioaddr) { - struct net_device_stats *stats = (struct net_device_stats *)data; unsigned int tdes3; int ret = tx_done; @@ -73,10 +73,10 @@ static int dwmac4_wrback_get_tx_status(void *data, struct stmmac_extra_stats *x, return ret; } -static int dwmac4_wrback_get_rx_status(void *data, struct stmmac_extra_stats *x, +static int dwmac4_wrback_get_rx_status(struct net_device_stats *stats, + struct stmmac_extra_stats *x, struct dma_desc *p) { - struct net_device_stats *stats = (struct net_device_stats *)data; unsigned int rdes1 = le32_to_cpu(p->des1); unsigned int rdes2 = le32_to_cpu(p->des2); unsigned int rdes3 = le32_to_cpu(p->des3); diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c index d99fa028c646..84d3a8551b03 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c @@ -13,6 +13,7 @@ #include <linux/io.h> #include "dwmac4.h" #include "dwmac4_dma.h" +#include "stmmac.h" static void dwmac4_dma_axi(void __iomem *ioaddr, struct stmmac_axi *axi) { @@ -68,77 +69,87 @@ static void dwmac4_dma_axi(void __iomem *ioaddr, struct stmmac_axi *axi) writel(value, ioaddr + DMA_SYS_BUS_MODE); } -static void dwmac4_dma_init_rx_chan(void __iomem *ioaddr, +static void dwmac4_dma_init_rx_chan(struct stmmac_priv *priv, + void 
__iomem *ioaddr, struct stmmac_dma_cfg *dma_cfg, dma_addr_t dma_rx_phy, u32 chan) { + const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs; u32 value; u32 rxpbl = dma_cfg->rxpbl ?: dma_cfg->pbl; - value = readl(ioaddr + DMA_CHAN_RX_CONTROL(chan)); + value = readl(ioaddr + DMA_CHAN_RX_CONTROL(dwmac4_addrs, chan)); value = value | (rxpbl << DMA_BUS_MODE_RPBL_SHIFT); - writel(value, ioaddr + DMA_CHAN_RX_CONTROL(chan)); + writel(value, ioaddr + DMA_CHAN_RX_CONTROL(dwmac4_addrs, chan)); if (IS_ENABLED(CONFIG_ARCH_DMA_ADDR_T_64BIT) && likely(dma_cfg->eame)) writel(upper_32_bits(dma_rx_phy), - ioaddr + DMA_CHAN_RX_BASE_ADDR_HI(chan)); + ioaddr + DMA_CHAN_RX_BASE_ADDR_HI(dwmac4_addrs, chan)); - writel(lower_32_bits(dma_rx_phy), ioaddr + DMA_CHAN_RX_BASE_ADDR(chan)); + writel(lower_32_bits(dma_rx_phy), + ioaddr + DMA_CHAN_RX_BASE_ADDR(dwmac4_addrs, chan)); } -static void dwmac4_dma_init_tx_chan(void __iomem *ioaddr, +static void dwmac4_dma_init_tx_chan(struct stmmac_priv *priv, + void __iomem *ioaddr, struct stmmac_dma_cfg *dma_cfg, dma_addr_t dma_tx_phy, u32 chan) { + const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs; u32 value; u32 txpbl = dma_cfg->txpbl ?: dma_cfg->pbl; - value = readl(ioaddr + DMA_CHAN_TX_CONTROL(chan)); + value = readl(ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan)); value = value | (txpbl << DMA_BUS_MODE_PBL_SHIFT); /* Enable OSP to get best performance */ value |= DMA_CONTROL_OSP; - writel(value, ioaddr + DMA_CHAN_TX_CONTROL(chan)); + writel(value, ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan)); if (IS_ENABLED(CONFIG_ARCH_DMA_ADDR_T_64BIT) && likely(dma_cfg->eame)) writel(upper_32_bits(dma_tx_phy), - ioaddr + DMA_CHAN_TX_BASE_ADDR_HI(chan)); + ioaddr + DMA_CHAN_TX_BASE_ADDR_HI(dwmac4_addrs, chan)); - writel(lower_32_bits(dma_tx_phy), ioaddr + DMA_CHAN_TX_BASE_ADDR(chan)); + writel(lower_32_bits(dma_tx_phy), + ioaddr + DMA_CHAN_TX_BASE_ADDR(dwmac4_addrs, chan)); } -static void dwmac4_dma_init_channel(void __iomem 
*ioaddr, +static void dwmac4_dma_init_channel(struct stmmac_priv *priv, + void __iomem *ioaddr, struct stmmac_dma_cfg *dma_cfg, u32 chan) { + const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs; u32 value; /* common channel control register config */ - value = readl(ioaddr + DMA_CHAN_CONTROL(chan)); + value = readl(ioaddr + DMA_CHAN_CONTROL(dwmac4_addrs, chan)); if (dma_cfg->pblx8) value = value | DMA_BUS_MODE_PBL; - writel(value, ioaddr + DMA_CHAN_CONTROL(chan)); + writel(value, ioaddr + DMA_CHAN_CONTROL(dwmac4_addrs, chan)); /* Mask interrupts by writing to CSR7 */ writel(DMA_CHAN_INTR_DEFAULT_MASK, - ioaddr + DMA_CHAN_INTR_ENA(chan)); + ioaddr + DMA_CHAN_INTR_ENA(dwmac4_addrs, chan)); } -static void dwmac410_dma_init_channel(void __iomem *ioaddr, +static void dwmac410_dma_init_channel(struct stmmac_priv *priv, + void __iomem *ioaddr, struct stmmac_dma_cfg *dma_cfg, u32 chan) { + const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs; u32 value; /* common channel control register config */ - value = readl(ioaddr + DMA_CHAN_CONTROL(chan)); + value = readl(ioaddr + DMA_CHAN_CONTROL(dwmac4_addrs, chan)); if (dma_cfg->pblx8) value = value | DMA_BUS_MODE_PBL; - writel(value, ioaddr + DMA_CHAN_CONTROL(chan)); + writel(value, ioaddr + DMA_CHAN_CONTROL(dwmac4_addrs, chan)); /* Mask interrupts by writing to CSR7 */ writel(DMA_CHAN_INTR_DEFAULT_MASK_4_10, - ioaddr + DMA_CHAN_INTR_ENA(chan)); + ioaddr + DMA_CHAN_INTR_ENA(dwmac4_addrs, chan)); } static void dwmac4_dma_init(void __iomem *ioaddr, @@ -176,65 +187,78 @@ static void dwmac4_dma_init(void __iomem *ioaddr, } -static void _dwmac4_dump_dma_regs(void __iomem *ioaddr, u32 channel, +static void _dwmac4_dump_dma_regs(struct stmmac_priv *priv, + void __iomem *ioaddr, u32 channel, u32 *reg_space) { - reg_space[DMA_CHAN_CONTROL(channel) / 4] = - readl(ioaddr + DMA_CHAN_CONTROL(channel)); - reg_space[DMA_CHAN_TX_CONTROL(channel) / 4] = - readl(ioaddr + DMA_CHAN_TX_CONTROL(channel)); - 
reg_space[DMA_CHAN_RX_CONTROL(channel) / 4] = - readl(ioaddr + DMA_CHAN_RX_CONTROL(channel)); - reg_space[DMA_CHAN_TX_BASE_ADDR(channel) / 4] = - readl(ioaddr + DMA_CHAN_TX_BASE_ADDR(channel)); - reg_space[DMA_CHAN_RX_BASE_ADDR(channel) / 4] = - readl(ioaddr + DMA_CHAN_RX_BASE_ADDR(channel)); - reg_space[DMA_CHAN_TX_END_ADDR(channel) / 4] = - readl(ioaddr + DMA_CHAN_TX_END_ADDR(channel)); - reg_space[DMA_CHAN_RX_END_ADDR(channel) / 4] = - readl(ioaddr + DMA_CHAN_RX_END_ADDR(channel)); - reg_space[DMA_CHAN_TX_RING_LEN(channel) / 4] = - readl(ioaddr + DMA_CHAN_TX_RING_LEN(channel)); - reg_space[DMA_CHAN_RX_RING_LEN(channel) / 4] = - readl(ioaddr + DMA_CHAN_RX_RING_LEN(channel)); - reg_space[DMA_CHAN_INTR_ENA(channel) / 4] = - readl(ioaddr + DMA_CHAN_INTR_ENA(channel)); - reg_space[DMA_CHAN_RX_WATCHDOG(channel) / 4] = - readl(ioaddr + DMA_CHAN_RX_WATCHDOG(channel)); - reg_space[DMA_CHAN_SLOT_CTRL_STATUS(channel) / 4] = - readl(ioaddr + DMA_CHAN_SLOT_CTRL_STATUS(channel)); - reg_space[DMA_CHAN_CUR_TX_DESC(channel) / 4] = - readl(ioaddr + DMA_CHAN_CUR_TX_DESC(channel)); - reg_space[DMA_CHAN_CUR_RX_DESC(channel) / 4] = - readl(ioaddr + DMA_CHAN_CUR_RX_DESC(channel)); - reg_space[DMA_CHAN_CUR_TX_BUF_ADDR(channel) / 4] = - readl(ioaddr + DMA_CHAN_CUR_TX_BUF_ADDR(channel)); - reg_space[DMA_CHAN_CUR_RX_BUF_ADDR(channel) / 4] = - readl(ioaddr + DMA_CHAN_CUR_RX_BUF_ADDR(channel)); - reg_space[DMA_CHAN_STATUS(channel) / 4] = - readl(ioaddr + DMA_CHAN_STATUS(channel)); + const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs; + const struct dwmac4_addrs *default_addrs = NULL; + + /* Purposely save the registers in the "normal" layout, regardless of + * platform modifications, to keep reg_space size constant + */ + reg_space[DMA_CHAN_CONTROL(default_addrs, channel) / 4] = + readl(ioaddr + DMA_CHAN_CONTROL(dwmac4_addrs, channel)); + reg_space[DMA_CHAN_TX_CONTROL(default_addrs, channel) / 4] = + readl(ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, channel)); + 
reg_space[DMA_CHAN_RX_CONTROL(default_addrs, channel) / 4] = + readl(ioaddr + DMA_CHAN_RX_CONTROL(dwmac4_addrs, channel)); + reg_space[DMA_CHAN_TX_BASE_ADDR(default_addrs, channel) / 4] = + readl(ioaddr + DMA_CHAN_TX_BASE_ADDR(dwmac4_addrs, channel)); + reg_space[DMA_CHAN_RX_BASE_ADDR(default_addrs, channel) / 4] = + readl(ioaddr + DMA_CHAN_RX_BASE_ADDR(dwmac4_addrs, channel)); + reg_space[DMA_CHAN_TX_END_ADDR(default_addrs, channel) / 4] = + readl(ioaddr + DMA_CHAN_TX_END_ADDR(dwmac4_addrs, channel)); + reg_space[DMA_CHAN_RX_END_ADDR(default_addrs, channel) / 4] = + readl(ioaddr + DMA_CHAN_RX_END_ADDR(dwmac4_addrs, channel)); + reg_space[DMA_CHAN_TX_RING_LEN(default_addrs, channel) / 4] = + readl(ioaddr + DMA_CHAN_TX_RING_LEN(dwmac4_addrs, channel)); + reg_space[DMA_CHAN_RX_RING_LEN(default_addrs, channel) / 4] = + readl(ioaddr + DMA_CHAN_RX_RING_LEN(dwmac4_addrs, channel)); + reg_space[DMA_CHAN_INTR_ENA(default_addrs, channel) / 4] = + readl(ioaddr + DMA_CHAN_INTR_ENA(dwmac4_addrs, channel)); + reg_space[DMA_CHAN_RX_WATCHDOG(default_addrs, channel) / 4] = + readl(ioaddr + DMA_CHAN_RX_WATCHDOG(dwmac4_addrs, channel)); + reg_space[DMA_CHAN_SLOT_CTRL_STATUS(default_addrs, channel) / 4] = + readl(ioaddr + DMA_CHAN_SLOT_CTRL_STATUS(dwmac4_addrs, channel)); + reg_space[DMA_CHAN_CUR_TX_DESC(default_addrs, channel) / 4] = + readl(ioaddr + DMA_CHAN_CUR_TX_DESC(dwmac4_addrs, channel)); + reg_space[DMA_CHAN_CUR_RX_DESC(default_addrs, channel) / 4] = + readl(ioaddr + DMA_CHAN_CUR_RX_DESC(dwmac4_addrs, channel)); + reg_space[DMA_CHAN_CUR_TX_BUF_ADDR(default_addrs, channel) / 4] = + readl(ioaddr + DMA_CHAN_CUR_TX_BUF_ADDR(dwmac4_addrs, channel)); + reg_space[DMA_CHAN_CUR_RX_BUF_ADDR(default_addrs, channel) / 4] = + readl(ioaddr + DMA_CHAN_CUR_RX_BUF_ADDR(dwmac4_addrs, channel)); + reg_space[DMA_CHAN_STATUS(default_addrs, channel) / 4] = + readl(ioaddr + DMA_CHAN_STATUS(dwmac4_addrs, channel)); } -static void dwmac4_dump_dma_regs(void __iomem *ioaddr, u32 *reg_space) +static 
void dwmac4_dump_dma_regs(struct stmmac_priv *priv, void __iomem *ioaddr, + u32 *reg_space) { int i; for (i = 0; i < DMA_CHANNEL_NB_MAX; i++) - _dwmac4_dump_dma_regs(ioaddr, i, reg_space); + _dwmac4_dump_dma_regs(priv, ioaddr, i, reg_space); } -static void dwmac4_rx_watchdog(void __iomem *ioaddr, u32 riwt, u32 queue) +static void dwmac4_rx_watchdog(struct stmmac_priv *priv, void __iomem *ioaddr, + u32 riwt, u32 queue) { - writel(riwt, ioaddr + DMA_CHAN_RX_WATCHDOG(queue)); + const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs; + + writel(riwt, ioaddr + DMA_CHAN_RX_WATCHDOG(dwmac4_addrs, queue)); } -static void dwmac4_dma_rx_chan_op_mode(void __iomem *ioaddr, int mode, +static void dwmac4_dma_rx_chan_op_mode(struct stmmac_priv *priv, + void __iomem *ioaddr, int mode, u32 channel, int fifosz, u8 qmode) { + const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs; unsigned int rqs = fifosz / 256 - 1; u32 mtl_rx_op; - mtl_rx_op = readl(ioaddr + MTL_CHAN_RX_OP_MODE(channel)); + mtl_rx_op = readl(ioaddr + MTL_CHAN_RX_OP_MODE(dwmac4_addrs, channel)); if (mode == SF_DMA_MODE) { pr_debug("GMAC: enable RX store and forward mode\n"); @@ -292,13 +316,16 @@ static void dwmac4_dma_rx_chan_op_mode(void __iomem *ioaddr, int mode, mtl_rx_op |= rfa << MTL_OP_MODE_RFA_SHIFT; } - writel(mtl_rx_op, ioaddr + MTL_CHAN_RX_OP_MODE(channel)); + writel(mtl_rx_op, ioaddr + MTL_CHAN_RX_OP_MODE(dwmac4_addrs, channel)); } -static void dwmac4_dma_tx_chan_op_mode(void __iomem *ioaddr, int mode, +static void dwmac4_dma_tx_chan_op_mode(struct stmmac_priv *priv, + void __iomem *ioaddr, int mode, u32 channel, int fifosz, u8 qmode) { - u32 mtl_tx_op = readl(ioaddr + MTL_CHAN_TX_OP_MODE(channel)); + const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs; + u32 mtl_tx_op = readl(ioaddr + MTL_CHAN_TX_OP_MODE(dwmac4_addrs, + channel)); unsigned int tqs = fifosz / 256 - 1; if (mode == SF_DMA_MODE) { @@ -344,7 +371,7 @@ static void dwmac4_dma_tx_chan_op_mode(void __iomem 
*ioaddr, int mode, mtl_tx_op &= ~MTL_OP_MODE_TQS_MASK; mtl_tx_op |= tqs << MTL_OP_MODE_TQS_SHIFT; - writel(mtl_tx_op, ioaddr + MTL_CHAN_TX_OP_MODE(channel)); + writel(mtl_tx_op, ioaddr + MTL_CHAN_TX_OP_MODE(dwmac4_addrs, channel)); } static int dwmac4_get_hw_feature(void __iomem *ioaddr, @@ -442,26 +469,31 @@ static int dwmac4_get_hw_feature(void __iomem *ioaddr, } /* Enable/disable TSO feature and set MSS */ -static void dwmac4_enable_tso(void __iomem *ioaddr, bool en, u32 chan) +static void dwmac4_enable_tso(struct stmmac_priv *priv, void __iomem *ioaddr, + bool en, u32 chan) { + const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs; u32 value; if (en) { /* enable TSO */ - value = readl(ioaddr + DMA_CHAN_TX_CONTROL(chan)); + value = readl(ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan)); writel(value | DMA_CONTROL_TSE, - ioaddr + DMA_CHAN_TX_CONTROL(chan)); + ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan)); } else { /* enable TSO */ - value = readl(ioaddr + DMA_CHAN_TX_CONTROL(chan)); + value = readl(ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan)); writel(value & ~DMA_CONTROL_TSE, - ioaddr + DMA_CHAN_TX_CONTROL(chan)); + ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan)); } } -static void dwmac4_qmode(void __iomem *ioaddr, u32 channel, u8 qmode) +static void dwmac4_qmode(struct stmmac_priv *priv, void __iomem *ioaddr, + u32 channel, u8 qmode) { - u32 mtl_tx_op = readl(ioaddr + MTL_CHAN_TX_OP_MODE(channel)); + const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs; + u32 mtl_tx_op = readl(ioaddr + MTL_CHAN_TX_OP_MODE(dwmac4_addrs, + channel)); mtl_tx_op &= ~MTL_OP_MODE_TXQEN_MASK; if (qmode != MTL_QUEUE_AVB) @@ -469,47 +501,54 @@ static void dwmac4_qmode(void __iomem *ioaddr, u32 channel, u8 qmode) else mtl_tx_op |= MTL_OP_MODE_TXQEN_AV; - writel(mtl_tx_op, ioaddr + MTL_CHAN_TX_OP_MODE(channel)); + writel(mtl_tx_op, ioaddr + MTL_CHAN_TX_OP_MODE(dwmac4_addrs, channel)); } -static void dwmac4_set_bfsize(void __iomem *ioaddr, int bfsize, 
u32 chan) +static void dwmac4_set_bfsize(struct stmmac_priv *priv, void __iomem *ioaddr, + int bfsize, u32 chan) { - u32 value = readl(ioaddr + DMA_CHAN_RX_CONTROL(chan)); + const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs; + u32 value = readl(ioaddr + DMA_CHAN_RX_CONTROL(dwmac4_addrs, chan)); value &= ~DMA_RBSZ_MASK; value |= (bfsize << DMA_RBSZ_SHIFT) & DMA_RBSZ_MASK; - writel(value, ioaddr + DMA_CHAN_RX_CONTROL(chan)); + writel(value, ioaddr + DMA_CHAN_RX_CONTROL(dwmac4_addrs, chan)); } -static void dwmac4_enable_sph(void __iomem *ioaddr, bool en, u32 chan) +static void dwmac4_enable_sph(struct stmmac_priv *priv, void __iomem *ioaddr, + bool en, u32 chan) { + const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs; u32 value = readl(ioaddr + GMAC_EXT_CONFIG); value &= ~GMAC_CONFIG_HDSMS; value |= GMAC_CONFIG_HDSMS_256; /* Segment max 256 bytes */ writel(value, ioaddr + GMAC_EXT_CONFIG); - value = readl(ioaddr + DMA_CHAN_CONTROL(chan)); + value = readl(ioaddr + DMA_CHAN_CONTROL(dwmac4_addrs, chan)); if (en) value |= DMA_CONTROL_SPH; else value &= ~DMA_CONTROL_SPH; - writel(value, ioaddr + DMA_CHAN_CONTROL(chan)); + writel(value, ioaddr + DMA_CHAN_CONTROL(dwmac4_addrs, chan)); } -static int dwmac4_enable_tbs(void __iomem *ioaddr, bool en, u32 chan) +static int dwmac4_enable_tbs(struct stmmac_priv *priv, void __iomem *ioaddr, + bool en, u32 chan) { - u32 value = readl(ioaddr + DMA_CHAN_TX_CONTROL(chan)); + const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs; + u32 value = readl(ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan)); if (en) value |= DMA_CONTROL_EDSE; else value &= ~DMA_CONTROL_EDSE; - writel(value, ioaddr + DMA_CHAN_TX_CONTROL(chan)); + writel(value, ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan)); - value = readl(ioaddr + DMA_CHAN_TX_CONTROL(chan)) & DMA_CONTROL_EDSE; + value = readl(ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, + chan)) & DMA_CONTROL_EDSE; if (en && !value) return -EIO; diff --git 
a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
index 9321879b599c..358e7dcb6a9a 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.h
@@ -95,29 +95,41 @@
 /* Following DMA defines are chanels oriented */
 #define DMA_CHAN_BASE_ADDR		0x00001100
 #define DMA_CHAN_BASE_OFFSET		0x80
-#define DMA_CHANX_BASE_ADDR(x)		(DMA_CHAN_BASE_ADDR + \
-					(x * DMA_CHAN_BASE_OFFSET))
+
+static inline u32 dma_chanx_base_addr(const struct dwmac4_addrs *addrs,
+				      const u32 x)
+{
+	u32 addr;
+
+	if (addrs)
+		addr = addrs->dma_chan + (x * addrs->dma_chan_offset);
+	else
+		addr = DMA_CHAN_BASE_ADDR + (x * DMA_CHAN_BASE_OFFSET);
+
+	return addr;
+}
+
 #define DMA_CHAN_REG_NUMBER		17
 
-#define DMA_CHAN_CONTROL(x)		DMA_CHANX_BASE_ADDR(x)
-#define DMA_CHAN_TX_CONTROL(x)		(DMA_CHANX_BASE_ADDR(x) + 0x4)
-#define DMA_CHAN_RX_CONTROL(x)		(DMA_CHANX_BASE_ADDR(x) + 0x8)
-#define DMA_CHAN_TX_BASE_ADDR_HI(x)	(DMA_CHANX_BASE_ADDR(x) + 0x10)
-#define DMA_CHAN_TX_BASE_ADDR(x)	(DMA_CHANX_BASE_ADDR(x) + 0x14)
-#define DMA_CHAN_RX_BASE_ADDR_HI(x)	(DMA_CHANX_BASE_ADDR(x) + 0x18)
-#define DMA_CHAN_RX_BASE_ADDR(x)	(DMA_CHANX_BASE_ADDR(x) + 0x1c)
-#define DMA_CHAN_TX_END_ADDR(x)		(DMA_CHANX_BASE_ADDR(x) + 0x20)
-#define DMA_CHAN_RX_END_ADDR(x)		(DMA_CHANX_BASE_ADDR(x) + 0x28)
-#define DMA_CHAN_TX_RING_LEN(x)		(DMA_CHANX_BASE_ADDR(x) + 0x2c)
-#define DMA_CHAN_RX_RING_LEN(x)		(DMA_CHANX_BASE_ADDR(x) + 0x30)
-#define DMA_CHAN_INTR_ENA(x)		(DMA_CHANX_BASE_ADDR(x) + 0x34)
-#define DMA_CHAN_RX_WATCHDOG(x)		(DMA_CHANX_BASE_ADDR(x) + 0x38)
-#define DMA_CHAN_SLOT_CTRL_STATUS(x)	(DMA_CHANX_BASE_ADDR(x) + 0x3c)
-#define DMA_CHAN_CUR_TX_DESC(x)		(DMA_CHANX_BASE_ADDR(x) + 0x44)
-#define DMA_CHAN_CUR_RX_DESC(x)		(DMA_CHANX_BASE_ADDR(x) + 0x4c)
-#define DMA_CHAN_CUR_TX_BUF_ADDR(x)	(DMA_CHANX_BASE_ADDR(x) + 0x54)
-#define DMA_CHAN_CUR_RX_BUF_ADDR(x)	(DMA_CHANX_BASE_ADDR(x) + 0x5c)
-#define DMA_CHAN_STATUS(x)		(DMA_CHANX_BASE_ADDR(x)
+ 0x60)
+#define DMA_CHAN_CONTROL(addrs, x)	dma_chanx_base_addr(addrs, x)
+#define DMA_CHAN_TX_CONTROL(addrs, x)	(dma_chanx_base_addr(addrs, x) + 0x4)
+#define DMA_CHAN_RX_CONTROL(addrs, x)	(dma_chanx_base_addr(addrs, x) + 0x8)
+#define DMA_CHAN_TX_BASE_ADDR_HI(addrs, x)	(dma_chanx_base_addr(addrs, x) + 0x10)
+#define DMA_CHAN_TX_BASE_ADDR(addrs, x)	(dma_chanx_base_addr(addrs, x) + 0x14)
+#define DMA_CHAN_RX_BASE_ADDR_HI(addrs, x)	(dma_chanx_base_addr(addrs, x) + 0x18)
+#define DMA_CHAN_RX_BASE_ADDR(addrs, x)	(dma_chanx_base_addr(addrs, x) + 0x1c)
+#define DMA_CHAN_TX_END_ADDR(addrs, x)	(dma_chanx_base_addr(addrs, x) + 0x20)
+#define DMA_CHAN_RX_END_ADDR(addrs, x)	(dma_chanx_base_addr(addrs, x) + 0x28)
+#define DMA_CHAN_TX_RING_LEN(addrs, x)	(dma_chanx_base_addr(addrs, x) + 0x2c)
+#define DMA_CHAN_RX_RING_LEN(addrs, x)	(dma_chanx_base_addr(addrs, x) + 0x30)
+#define DMA_CHAN_INTR_ENA(addrs, x)	(dma_chanx_base_addr(addrs, x) + 0x34)
+#define DMA_CHAN_RX_WATCHDOG(addrs, x)	(dma_chanx_base_addr(addrs, x) + 0x38)
+#define DMA_CHAN_SLOT_CTRL_STATUS(addrs, x)	(dma_chanx_base_addr(addrs, x) + 0x3c)
+#define DMA_CHAN_CUR_TX_DESC(addrs, x)	(dma_chanx_base_addr(addrs, x) + 0x44)
+#define DMA_CHAN_CUR_RX_DESC(addrs, x)	(dma_chanx_base_addr(addrs, x) + 0x4c)
+#define DMA_CHAN_CUR_TX_BUF_ADDR(addrs, x)	(dma_chanx_base_addr(addrs, x) + 0x54)
+#define DMA_CHAN_CUR_RX_BUF_ADDR(addrs, x)	(dma_chanx_base_addr(addrs, x) + 0x5c)
+#define DMA_CHAN_STATUS(addrs, x)	(dma_chanx_base_addr(addrs, x) + 0x60)
 
 /* DMA Control X */
 #define DMA_CONTROL_SPH			BIT(24)
@@ -220,19 +232,31 @@
 #define DMA_CHAN0_DBG_STAT_RPS_SHIFT	8
 
 int dwmac4_dma_reset(void __iomem *ioaddr);
-void dwmac4_enable_dma_irq(void __iomem *ioaddr, u32 chan, bool rx, bool tx);
-void dwmac410_enable_dma_irq(void __iomem *ioaddr, u32 chan, bool rx, bool tx);
-void dwmac4_disable_dma_irq(void __iomem *ioaddr, u32 chan, bool rx, bool tx);
-void dwmac410_disable_dma_irq(void __iomem *ioaddr, u32 chan, bool rx, bool tx);
-void
dwmac4_dma_start_tx(void __iomem *ioaddr, u32 chan);
-void dwmac4_dma_stop_tx(void __iomem *ioaddr, u32 chan);
-void dwmac4_dma_start_rx(void __iomem *ioaddr, u32 chan);
-void dwmac4_dma_stop_rx(void __iomem *ioaddr, u32 chan);
-int dwmac4_dma_interrupt(void __iomem *ioaddr,
+void dwmac4_enable_dma_irq(struct stmmac_priv *priv, void __iomem *ioaddr,
+			   u32 chan, bool rx, bool tx);
+void dwmac410_enable_dma_irq(struct stmmac_priv *priv, void __iomem *ioaddr,
+			     u32 chan, bool rx, bool tx);
+void dwmac4_disable_dma_irq(struct stmmac_priv *priv, void __iomem *ioaddr,
+			    u32 chan, bool rx, bool tx);
+void dwmac410_disable_dma_irq(struct stmmac_priv *priv, void __iomem *ioaddr,
+			      u32 chan, bool rx, bool tx);
+void dwmac4_dma_start_tx(struct stmmac_priv *priv, void __iomem *ioaddr,
+			 u32 chan);
+void dwmac4_dma_stop_tx(struct stmmac_priv *priv, void __iomem *ioaddr,
+			u32 chan);
+void dwmac4_dma_start_rx(struct stmmac_priv *priv, void __iomem *ioaddr,
+			 u32 chan);
+void dwmac4_dma_stop_rx(struct stmmac_priv *priv, void __iomem *ioaddr,
+			u32 chan);
+int dwmac4_dma_interrupt(struct stmmac_priv *priv, void __iomem *ioaddr,
 			 struct stmmac_extra_stats *x, u32 chan, u32 dir);
-void dwmac4_set_rx_ring_len(void __iomem *ioaddr, u32 len, u32 chan);
-void dwmac4_set_tx_ring_len(void __iomem *ioaddr, u32 len, u32 chan);
-void dwmac4_set_rx_tail_ptr(void __iomem *ioaddr, u32 tail_ptr, u32 chan);
-void dwmac4_set_tx_tail_ptr(void __iomem *ioaddr, u32 tail_ptr, u32 chan);
+void dwmac4_set_rx_ring_len(struct stmmac_priv *priv, void __iomem *ioaddr,
+			    u32 len, u32 chan);
+void dwmac4_set_tx_ring_len(struct stmmac_priv *priv, void __iomem *ioaddr,
+			    u32 len, u32 chan);
+void dwmac4_set_rx_tail_ptr(struct stmmac_priv *priv, void __iomem *ioaddr,
+			    u32 tail_ptr, u32 chan);
+void dwmac4_set_tx_tail_ptr(struct stmmac_priv *priv, void __iomem *ioaddr,
+			    u32 tail_ptr, u32 chan);
 
 #endif /* __DWMAC4_DMA_H__ */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
index d1c605777985..df41eac54058 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_lib.c
@@ -11,6 +11,7 @@
 #include "common.h"
 #include "dwmac4_dma.h"
 #include "dwmac4.h"
+#include "stmmac.h"
 
 int dwmac4_dma_reset(void __iomem *ioaddr)
 {
@@ -25,120 +26,151 @@ int dwmac4_dma_reset(void __iomem *ioaddr)
 				 10000, 1000000);
 }
 
-void dwmac4_set_rx_tail_ptr(void __iomem *ioaddr, u32 tail_ptr, u32 chan)
+void dwmac4_set_rx_tail_ptr(struct stmmac_priv *priv, void __iomem *ioaddr,
+			    u32 tail_ptr, u32 chan)
 {
-	writel(tail_ptr, ioaddr + DMA_CHAN_RX_END_ADDR(chan));
+	const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs;
+
+	writel(tail_ptr, ioaddr + DMA_CHAN_RX_END_ADDR(dwmac4_addrs, chan));
 }
 
-void dwmac4_set_tx_tail_ptr(void __iomem *ioaddr, u32 tail_ptr, u32 chan)
+void dwmac4_set_tx_tail_ptr(struct stmmac_priv *priv, void __iomem *ioaddr,
+			    u32 tail_ptr, u32 chan)
 {
-	writel(tail_ptr, ioaddr + DMA_CHAN_TX_END_ADDR(chan));
+	const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs;
+
+	writel(tail_ptr, ioaddr + DMA_CHAN_TX_END_ADDR(dwmac4_addrs, chan));
 }
 
-void dwmac4_dma_start_tx(void __iomem *ioaddr, u32 chan)
+void dwmac4_dma_start_tx(struct stmmac_priv *priv, void __iomem *ioaddr,
+			 u32 chan)
 {
-	u32 value = readl(ioaddr + DMA_CHAN_TX_CONTROL(chan));
+	const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs;
+	u32 value = readl(ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan));
 
 	value |= DMA_CONTROL_ST;
-	writel(value, ioaddr + DMA_CHAN_TX_CONTROL(chan));
+	writel(value, ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan));
 
 	value = readl(ioaddr + GMAC_CONFIG);
 	value |= GMAC_CONFIG_TE;
 	writel(value, ioaddr + GMAC_CONFIG);
 }
 
-void dwmac4_dma_stop_tx(void __iomem *ioaddr, u32 chan)
+void dwmac4_dma_stop_tx(struct stmmac_priv *priv, void __iomem *ioaddr,
+			u32 chan)
 {
-	u32 value = readl(ioaddr + DMA_CHAN_TX_CONTROL(chan));
+	const
struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs;
+
+	u32 value = readl(ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan));
 
 	value &= ~DMA_CONTROL_ST;
-	writel(value, ioaddr + DMA_CHAN_TX_CONTROL(chan));
+	writel(value, ioaddr + DMA_CHAN_TX_CONTROL(dwmac4_addrs, chan));
 }
 
-void dwmac4_dma_start_rx(void __iomem *ioaddr, u32 chan)
+void dwmac4_dma_start_rx(struct stmmac_priv *priv, void __iomem *ioaddr,
+			 u32 chan)
 {
-	u32 value = readl(ioaddr + DMA_CHAN_RX_CONTROL(chan));
+	const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs;
+
+	u32 value = readl(ioaddr + DMA_CHAN_RX_CONTROL(dwmac4_addrs, chan));
 
 	value |= DMA_CONTROL_SR;
 
-	writel(value, ioaddr + DMA_CHAN_RX_CONTROL(chan));
+	writel(value, ioaddr + DMA_CHAN_RX_CONTROL(dwmac4_addrs, chan));
 
 	value = readl(ioaddr + GMAC_CONFIG);
 	value |= GMAC_CONFIG_RE;
 	writel(value, ioaddr + GMAC_CONFIG);
 }
 
-void dwmac4_dma_stop_rx(void __iomem *ioaddr, u32 chan)
+void dwmac4_dma_stop_rx(struct stmmac_priv *priv, void __iomem *ioaddr,
+			u32 chan)
 {
-	u32 value = readl(ioaddr + DMA_CHAN_RX_CONTROL(chan));
+	const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs;
+	u32 value = readl(ioaddr + DMA_CHAN_RX_CONTROL(dwmac4_addrs, chan));
 
 	value &= ~DMA_CONTROL_SR;
-	writel(value, ioaddr + DMA_CHAN_RX_CONTROL(chan));
+	writel(value, ioaddr + DMA_CHAN_RX_CONTROL(dwmac4_addrs, chan));
 }
 
-void dwmac4_set_tx_ring_len(void __iomem *ioaddr, u32 len, u32 chan)
+void dwmac4_set_tx_ring_len(struct stmmac_priv *priv, void __iomem *ioaddr,
+			    u32 len, u32 chan)
 {
-	writel(len, ioaddr + DMA_CHAN_TX_RING_LEN(chan));
+	const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs;
+
+	writel(len, ioaddr + DMA_CHAN_TX_RING_LEN(dwmac4_addrs, chan));
 }
 
-void dwmac4_set_rx_ring_len(void __iomem *ioaddr, u32 len, u32 chan)
+void dwmac4_set_rx_ring_len(struct stmmac_priv *priv, void __iomem *ioaddr,
+			    u32 len, u32 chan)
 {
-	writel(len, ioaddr + DMA_CHAN_RX_RING_LEN(chan));
+	const struct dwmac4_addrs *dwmac4_addrs =
priv->plat->dwmac4_addrs;
+
+	writel(len, ioaddr + DMA_CHAN_RX_RING_LEN(dwmac4_addrs, chan));
 }
 
-void dwmac4_enable_dma_irq(void __iomem *ioaddr, u32 chan, bool rx, bool tx)
+void dwmac4_enable_dma_irq(struct stmmac_priv *priv, void __iomem *ioaddr,
+			   u32 chan, bool rx, bool tx)
 {
-	u32 value = readl(ioaddr + DMA_CHAN_INTR_ENA(chan));
+	const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs;
+	u32 value = readl(ioaddr + DMA_CHAN_INTR_ENA(dwmac4_addrs, chan));
 
 	if (rx)
 		value |= DMA_CHAN_INTR_DEFAULT_RX;
 	if (tx)
 		value |= DMA_CHAN_INTR_DEFAULT_TX;
 
-	writel(value, ioaddr + DMA_CHAN_INTR_ENA(chan));
+	writel(value, ioaddr + DMA_CHAN_INTR_ENA(dwmac4_addrs, chan));
 }
 
-void dwmac410_enable_dma_irq(void __iomem *ioaddr, u32 chan, bool rx, bool tx)
+void dwmac410_enable_dma_irq(struct stmmac_priv *priv, void __iomem *ioaddr,
+			     u32 chan, bool rx, bool tx)
 {
-	u32 value = readl(ioaddr + DMA_CHAN_INTR_ENA(chan));
+	const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs;
+	u32 value = readl(ioaddr + DMA_CHAN_INTR_ENA(dwmac4_addrs, chan));
 
 	if (rx)
 		value |= DMA_CHAN_INTR_DEFAULT_RX_4_10;
 	if (tx)
 		value |= DMA_CHAN_INTR_DEFAULT_TX_4_10;
 
-	writel(value, ioaddr + DMA_CHAN_INTR_ENA(chan));
+	writel(value, ioaddr + DMA_CHAN_INTR_ENA(dwmac4_addrs, chan));
 }
 
-void dwmac4_disable_dma_irq(void __iomem *ioaddr, u32 chan, bool rx, bool tx)
+void dwmac4_disable_dma_irq(struct stmmac_priv *priv, void __iomem *ioaddr,
+			    u32 chan, bool rx, bool tx)
 {
-	u32 value = readl(ioaddr + DMA_CHAN_INTR_ENA(chan));
+	const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs;
+	u32 value = readl(ioaddr + DMA_CHAN_INTR_ENA(dwmac4_addrs, chan));
 
 	if (rx)
 		value &= ~DMA_CHAN_INTR_DEFAULT_RX;
 	if (tx)
 		value &= ~DMA_CHAN_INTR_DEFAULT_TX;
 
-	writel(value, ioaddr + DMA_CHAN_INTR_ENA(chan));
+	writel(value, ioaddr + DMA_CHAN_INTR_ENA(dwmac4_addrs, chan));
 }
 
-void dwmac410_disable_dma_irq(void __iomem *ioaddr, u32 chan, bool rx, bool tx)
+void dwmac410_disable_dma_irq(struct stmmac_priv
*priv, void __iomem *ioaddr,
+			      u32 chan, bool rx, bool tx)
 {
-	u32 value = readl(ioaddr + DMA_CHAN_INTR_ENA(chan));
+	const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs;
+	u32 value = readl(ioaddr + DMA_CHAN_INTR_ENA(dwmac4_addrs, chan));
 
 	if (rx)
 		value &= ~DMA_CHAN_INTR_DEFAULT_RX_4_10;
 	if (tx)
 		value &= ~DMA_CHAN_INTR_DEFAULT_TX_4_10;
 
-	writel(value, ioaddr + DMA_CHAN_INTR_ENA(chan));
+	writel(value, ioaddr + DMA_CHAN_INTR_ENA(dwmac4_addrs, chan));
 }
 
-int dwmac4_dma_interrupt(void __iomem *ioaddr,
+int dwmac4_dma_interrupt(struct stmmac_priv *priv, void __iomem *ioaddr,
 			 struct stmmac_extra_stats *x, u32 chan, u32 dir)
 {
-	u32 intr_status = readl(ioaddr + DMA_CHAN_STATUS(chan));
-	u32 intr_en = readl(ioaddr + DMA_CHAN_INTR_ENA(chan));
+	const struct dwmac4_addrs *dwmac4_addrs = priv->plat->dwmac4_addrs;
+	u32 intr_status = readl(ioaddr + DMA_CHAN_STATUS(dwmac4_addrs, chan));
+	u32 intr_en = readl(ioaddr + DMA_CHAN_INTR_ENA(dwmac4_addrs, chan));
 	int ret = 0;
 
 	if (dir == DMA_DIR_RX)
@@ -183,7 +215,8 @@ int dwmac4_dma_interrupt(void __iomem *ioaddr,
 	if (unlikely(intr_status & DMA_CHAN_STATUS_ERI))
 		x->rx_early_irq++;
 
-	writel(intr_status & intr_en, ioaddr + DMA_CHAN_STATUS(chan));
+	writel(intr_status & intr_en,
+	       ioaddr + DMA_CHAN_STATUS(dwmac4_addrs, chan));
 	return ret;
 }
 
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
index acd70b9a3173..72672391675f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_dma.h
@@ -153,14 +153,20 @@
 #define NUM_DWMAC4_DMA_REGS	27
 
 void dwmac_enable_dma_transmission(void __iomem *ioaddr);
-void dwmac_enable_dma_irq(void __iomem *ioaddr, u32 chan, bool rx, bool tx);
-void dwmac_disable_dma_irq(void __iomem *ioaddr, u32 chan, bool rx, bool tx);
-void dwmac_dma_start_tx(void __iomem *ioaddr, u32 chan);
-void dwmac_dma_stop_tx(void __iomem *ioaddr, u32 chan);
-void dwmac_dma_start_rx(void __iomem *ioaddr, u32
chan);
-void dwmac_dma_stop_rx(void __iomem *ioaddr, u32 chan);
-int dwmac_dma_interrupt(void __iomem *ioaddr, struct stmmac_extra_stats *x,
-			u32 chan, u32 dir);
+void dwmac_enable_dma_irq(struct stmmac_priv *priv, void __iomem *ioaddr,
+			  u32 chan, bool rx, bool tx);
+void dwmac_disable_dma_irq(struct stmmac_priv *priv, void __iomem *ioaddr,
+			   u32 chan, bool rx, bool tx);
+void dwmac_dma_start_tx(struct stmmac_priv *priv, void __iomem *ioaddr,
+			u32 chan);
+void dwmac_dma_stop_tx(struct stmmac_priv *priv, void __iomem *ioaddr,
+		       u32 chan);
+void dwmac_dma_start_rx(struct stmmac_priv *priv, void __iomem *ioaddr,
+			u32 chan);
+void dwmac_dma_stop_rx(struct stmmac_priv *priv, void __iomem *ioaddr,
+		       u32 chan);
+int dwmac_dma_interrupt(struct stmmac_priv *priv, void __iomem *ioaddr,
+			struct stmmac_extra_stats *x, u32 chan, u32 dir);
 int dwmac_dma_reset(void __iomem *ioaddr);
 
 #endif /* __DWMAC_DMA_H__ */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
index 9b6138b11776..0b6f999a8305 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
@@ -32,7 +32,8 @@ void dwmac_enable_dma_transmission(void __iomem *ioaddr)
 	writel(1, ioaddr + DMA_XMT_POLL_DEMAND);
 }
 
-void dwmac_enable_dma_irq(void __iomem *ioaddr, u32 chan, bool rx, bool tx)
+void dwmac_enable_dma_irq(struct stmmac_priv *priv, void __iomem *ioaddr,
+			  u32 chan, bool rx, bool tx)
 {
 	u32 value = readl(ioaddr + DMA_INTR_ENA);
 
@@ -44,7 +45,8 @@ void dwmac_enable_dma_irq(void __iomem *ioaddr, u32 chan, bool rx, bool tx)
 	writel(value, ioaddr + DMA_INTR_ENA);
 }
 
-void dwmac_disable_dma_irq(void __iomem *ioaddr, u32 chan, bool rx, bool tx)
+void dwmac_disable_dma_irq(struct stmmac_priv *priv, void __iomem *ioaddr,
+			   u32 chan, bool rx, bool tx)
 {
 	u32 value = readl(ioaddr + DMA_INTR_ENA);
 
@@ -56,28 +58,30 @@ void dwmac_disable_dma_irq(void __iomem *ioaddr, u32 chan, bool rx, bool tx)
 	writel(value, ioaddr +
DMA_INTR_ENA);
 }
 
-void dwmac_dma_start_tx(void __iomem *ioaddr, u32 chan)
+void dwmac_dma_start_tx(struct stmmac_priv *priv, void __iomem *ioaddr,
+			u32 chan)
 {
 	u32 value = readl(ioaddr + DMA_CONTROL);
 	value |= DMA_CONTROL_ST;
 	writel(value, ioaddr + DMA_CONTROL);
 }
 
-void dwmac_dma_stop_tx(void __iomem *ioaddr, u32 chan)
+void dwmac_dma_stop_tx(struct stmmac_priv *priv, void __iomem *ioaddr, u32 chan)
 {
 	u32 value = readl(ioaddr + DMA_CONTROL);
 	value &= ~DMA_CONTROL_ST;
 	writel(value, ioaddr + DMA_CONTROL);
 }
 
-void dwmac_dma_start_rx(void __iomem *ioaddr, u32 chan)
+void dwmac_dma_start_rx(struct stmmac_priv *priv, void __iomem *ioaddr,
+			u32 chan)
 {
 	u32 value = readl(ioaddr + DMA_CONTROL);
 	value |= DMA_CONTROL_SR;
 	writel(value, ioaddr + DMA_CONTROL);
 }
 
-void dwmac_dma_stop_rx(void __iomem *ioaddr, u32 chan)
+void dwmac_dma_stop_rx(struct stmmac_priv *priv, void __iomem *ioaddr, u32 chan)
 {
 	u32 value = readl(ioaddr + DMA_CONTROL);
 	value &= ~DMA_CONTROL_SR;
@@ -154,7 +158,7 @@ static void show_rx_process_state(unsigned int status)
 }
 #endif
 
-int dwmac_dma_interrupt(void __iomem *ioaddr,
+int dwmac_dma_interrupt(struct stmmac_priv *priv, void __iomem *ioaddr,
 			struct stmmac_extra_stats *x, u32 chan, u32 dir)
 {
 	int ret = 0;
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
index c6c4d7948fe5..a0c2ef8bb0ac 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
@@ -187,7 +187,8 @@ static void dwxgmac2_prog_mtl_tx_algorithms(struct mac_device_info *hw,
 	}
 }
 
-static void dwxgmac2_set_mtl_tx_queue_weight(struct mac_device_info *hw,
+static void dwxgmac2_set_mtl_tx_queue_weight(struct stmmac_priv *priv,
+					     struct mac_device_info *hw,
 					     u32 weight, u32 queue)
 {
 	void __iomem *ioaddr = hw->pcsr;
@@ -212,7 +213,8 @@ static void dwxgmac2_map_mtl_to_dma(struct mac_device_info *hw, u32 queue,
 	writel(value, ioaddr + reg);
 }
 
-static void
dwxgmac2_config_cbs(struct mac_device_info *hw,
+static void dwxgmac2_config_cbs(struct stmmac_priv *priv,
+				struct mac_device_info *hw,
 				u32 send_slope, u32 idle_slope,
 				u32 high_credit, u32 low_credit, u32 queue)
 {
@@ -276,7 +278,8 @@ static int dwxgmac2_host_irq_status(struct mac_device_info *hw,
 	return ret;
 }
 
-static int dwxgmac2_host_mtl_irq_status(struct mac_device_info *hw, u32 chan)
+static int dwxgmac2_host_mtl_irq_status(struct stmmac_priv *priv,
+					struct mac_device_info *hw, u32 chan)
 {
 	void __iomem *ioaddr = hw->pcsr;
 	int ret = 0;
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_descs.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_descs.c
index b1f0c3984a09..13c347ee8be9 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_descs.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_descs.c
@@ -8,7 +8,8 @@
 #include "common.h"
 #include "dwxgmac2.h"
 
-static int dwxgmac2_get_tx_status(void *data, struct stmmac_extra_stats *x,
+static int dwxgmac2_get_tx_status(struct net_device_stats *stats,
+				  struct stmmac_extra_stats *x,
 				  struct dma_desc *p, void __iomem *ioaddr)
 {
 	unsigned int tdes3 = le32_to_cpu(p->des3);
@@ -22,7 +23,8 @@ static int dwxgmac2_get_tx_status(void *data, struct stmmac_extra_stats *x,
 	return ret;
 }
 
-static int dwxgmac2_get_rx_status(void *data, struct stmmac_extra_stats *x,
+static int dwxgmac2_get_rx_status(struct net_device_stats *stats,
+				  struct stmmac_extra_stats *x,
 				  struct dma_desc *p)
 {
 	unsigned int rdes3 = le32_to_cpu(p->des3);
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c
index 5e98355f422b..dfd53264e036 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c
@@ -33,7 +33,8 @@ static void dwxgmac2_dma_init(void __iomem *ioaddr,
 	writel(value, ioaddr + XGMAC_DMA_SYSBUS_MODE);
 }
 
-static void dwxgmac2_dma_init_chan(void __iomem *ioaddr,
+static void dwxgmac2_dma_init_chan(struct stmmac_priv
*priv,
+				   void __iomem *ioaddr,
 				   struct stmmac_dma_cfg *dma_cfg, u32 chan)
 {
 	u32 value = readl(ioaddr + XGMAC_DMA_CH_CONTROL(chan));
@@ -45,7 +46,8 @@ static void dwxgmac2_dma_init_chan(void __iomem *ioaddr,
 	writel(XGMAC_DMA_INT_DEFAULT_EN, ioaddr + XGMAC_DMA_CH_INT_EN(chan));
 }
 
-static void dwxgmac2_dma_init_rx_chan(void __iomem *ioaddr,
+static void dwxgmac2_dma_init_rx_chan(struct stmmac_priv *priv,
+				      void __iomem *ioaddr,
 				      struct stmmac_dma_cfg *dma_cfg,
 				      dma_addr_t phy, u32 chan)
 {
@@ -61,7 +63,8 @@ static void dwxgmac2_dma_init_rx_chan(void __iomem *ioaddr,
 	writel(lower_32_bits(phy), ioaddr + XGMAC_DMA_CH_RxDESC_LADDR(chan));
 }
 
-static void dwxgmac2_dma_init_tx_chan(void __iomem *ioaddr,
+static void dwxgmac2_dma_init_tx_chan(struct stmmac_priv *priv,
+				      void __iomem *ioaddr,
 				      struct stmmac_dma_cfg *dma_cfg,
 				      dma_addr_t phy, u32 chan)
 {
@@ -131,7 +134,8 @@ static void dwxgmac2_dma_axi(void __iomem *ioaddr, struct stmmac_axi *axi)
 	writel(XGMAC_RDPS, ioaddr + XGMAC_RX_EDMA_CTRL);
 }
 
-static void dwxgmac2_dma_dump_regs(void __iomem *ioaddr, u32 *reg_space)
+static void dwxgmac2_dma_dump_regs(struct stmmac_priv *priv,
+				   void __iomem *ioaddr, u32 *reg_space)
 {
 	int i;
 
@@ -139,8 +143,8 @@ static void dwxgmac2_dma_dump_regs(void __iomem *ioaddr, u32 *reg_space)
 		reg_space[i] = readl(ioaddr + i * 4);
 }
 
-static void dwxgmac2_dma_rx_mode(void __iomem *ioaddr, int mode,
-				 u32 channel, int fifosz, u8 qmode)
+static void dwxgmac2_dma_rx_mode(struct stmmac_priv *priv, void __iomem *ioaddr,
+				 int mode, u32 channel, int fifosz, u8 qmode)
 {
 	u32 value = readl(ioaddr + XGMAC_MTL_RXQ_OPMODE(channel));
 	unsigned int rqs = fifosz / 256 - 1;
@@ -205,8 +209,8 @@ static void dwxgmac2_dma_rx_mode(void __iomem *ioaddr, int mode,
 	writel(value | XGMAC_RXOIE, ioaddr + XGMAC_MTL_QINTEN(channel));
 }
 
-static void dwxgmac2_dma_tx_mode(void __iomem *ioaddr, int mode,
-				 u32 channel, int fifosz, u8 qmode)
+static void dwxgmac2_dma_tx_mode(struct stmmac_priv *priv, void __iomem *ioaddr,
+				 int mode, u32
channel, int fifosz, u8 qmode)
 {
 	u32 value = readl(ioaddr + XGMAC_MTL_TXQ_OPMODE(channel));
 	unsigned int tqs = fifosz / 256 - 1;
@@ -248,7 +252,8 @@ static void dwxgmac2_dma_tx_mode(void __iomem *ioaddr, int mode,
 	writel(value, ioaddr + XGMAC_MTL_TXQ_OPMODE(channel));
 }
 
-static void dwxgmac2_enable_dma_irq(void __iomem *ioaddr, u32 chan,
+static void dwxgmac2_enable_dma_irq(struct stmmac_priv *priv,
+				    void __iomem *ioaddr, u32 chan,
 				    bool rx, bool tx)
 {
 	u32 value = readl(ioaddr + XGMAC_DMA_CH_INT_EN(chan));
@@ -261,7 +266,8 @@ static void dwxgmac2_enable_dma_irq(void __iomem *ioaddr, u32 chan,
 	writel(value, ioaddr + XGMAC_DMA_CH_INT_EN(chan));
 }
 
-static void dwxgmac2_disable_dma_irq(void __iomem *ioaddr, u32 chan,
+static void dwxgmac2_disable_dma_irq(struct stmmac_priv *priv,
+				     void __iomem *ioaddr, u32 chan,
 				     bool rx, bool tx)
 {
 	u32 value = readl(ioaddr + XGMAC_DMA_CH_INT_EN(chan));
@@ -274,7 +280,8 @@ static void dwxgmac2_disable_dma_irq(void __iomem *ioaddr, u32 chan,
 	writel(value, ioaddr + XGMAC_DMA_CH_INT_EN(chan));
 }
 
-static void dwxgmac2_dma_start_tx(void __iomem *ioaddr, u32 chan)
+static void dwxgmac2_dma_start_tx(struct stmmac_priv *priv,
+				  void __iomem *ioaddr, u32 chan)
 {
 	u32 value;
 
@@ -287,7 +294,8 @@ static void dwxgmac2_dma_start_tx(void __iomem *ioaddr, u32 chan)
 	writel(value, ioaddr + XGMAC_TX_CONFIG);
 }
 
-static void dwxgmac2_dma_stop_tx(void __iomem *ioaddr, u32 chan)
+static void dwxgmac2_dma_stop_tx(struct stmmac_priv *priv, void __iomem *ioaddr,
+				 u32 chan)
 {
 	u32 value;
 
@@ -300,7 +308,8 @@ static void dwxgmac2_dma_stop_tx(void __iomem *ioaddr, u32 chan)
 	writel(value, ioaddr + XGMAC_TX_CONFIG);
 }
 
-static void dwxgmac2_dma_start_rx(void __iomem *ioaddr, u32 chan)
+static void dwxgmac2_dma_start_rx(struct stmmac_priv *priv,
+				  void __iomem *ioaddr, u32 chan)
 {
 	u32 value;
 
@@ -313,7 +322,8 @@ static void dwxgmac2_dma_start_rx(void __iomem *ioaddr, u32 chan)
 	writel(value, ioaddr + XGMAC_RX_CONFIG);
 }
 
-static void dwxgmac2_dma_stop_rx(void __iomem
*ioaddr, u32 chan)
+static void dwxgmac2_dma_stop_rx(struct stmmac_priv *priv, void __iomem *ioaddr,
+				 u32 chan)
 {
 	u32 value;
 
@@ -322,7 +332,8 @@ static void dwxgmac2_dma_stop_rx(void __iomem *ioaddr, u32 chan)
 	writel(value, ioaddr + XGMAC_DMA_CH_RX_CONTROL(chan));
 }
 
-static int dwxgmac2_dma_interrupt(void __iomem *ioaddr,
+static int dwxgmac2_dma_interrupt(struct stmmac_priv *priv,
+				  void __iomem *ioaddr,
 				  struct stmmac_extra_stats *x, u32 chan,
 				  u32 dir)
 {
@@ -449,32 +460,38 @@ static int dwxgmac2_get_hw_feature(void __iomem *ioaddr,
 	return 0;
 }
 
-static void dwxgmac2_rx_watchdog(void __iomem *ioaddr, u32 riwt, u32 queue)
+static void dwxgmac2_rx_watchdog(struct stmmac_priv *priv, void __iomem *ioaddr,
+				 u32 riwt, u32 queue)
 {
 	writel(riwt & XGMAC_RWT, ioaddr + XGMAC_DMA_CH_Rx_WATCHDOG(queue));
 }
 
-static void dwxgmac2_set_rx_ring_len(void __iomem *ioaddr, u32 len, u32 chan)
+static void dwxgmac2_set_rx_ring_len(struct stmmac_priv *priv,
+				     void __iomem *ioaddr, u32 len, u32 chan)
 {
 	writel(len, ioaddr + XGMAC_DMA_CH_RxDESC_RING_LEN(chan));
 }
 
-static void dwxgmac2_set_tx_ring_len(void __iomem *ioaddr, u32 len, u32 chan)
+static void dwxgmac2_set_tx_ring_len(struct stmmac_priv *priv,
+				     void __iomem *ioaddr, u32 len, u32 chan)
 {
 	writel(len, ioaddr + XGMAC_DMA_CH_TxDESC_RING_LEN(chan));
 }
 
-static void dwxgmac2_set_rx_tail_ptr(void __iomem *ioaddr, u32 ptr, u32 chan)
+static void dwxgmac2_set_rx_tail_ptr(struct stmmac_priv *priv,
+				     void __iomem *ioaddr, u32 ptr, u32 chan)
 {
 	writel(ptr, ioaddr + XGMAC_DMA_CH_RxDESC_TAIL_LPTR(chan));
 }
 
-static void dwxgmac2_set_tx_tail_ptr(void __iomem *ioaddr, u32 ptr, u32 chan)
+static void dwxgmac2_set_tx_tail_ptr(struct stmmac_priv *priv,
+				     void __iomem *ioaddr, u32 ptr, u32 chan)
 {
 	writel(ptr, ioaddr + XGMAC_DMA_CH_TxDESC_TAIL_LPTR(chan));
 }
 
-static void dwxgmac2_enable_tso(void __iomem *ioaddr, bool en, u32 chan)
+static void dwxgmac2_enable_tso(struct stmmac_priv *priv, void __iomem *ioaddr,
+				bool en, u32 chan)
 {
 	u32 value =
readl(ioaddr + XGMAC_DMA_CH_TX_CONTROL(chan));
 
@@ -486,7 +503,8 @@ static void dwxgmac2_enable_tso(void __iomem *ioaddr, bool en, u32 chan)
 	writel(value, ioaddr + XGMAC_DMA_CH_TX_CONTROL(chan));
 }
 
-static void dwxgmac2_qmode(void __iomem *ioaddr, u32 channel, u8 qmode)
+static void dwxgmac2_qmode(struct stmmac_priv *priv, void __iomem *ioaddr,
+			   u32 channel, u8 qmode)
 {
 	u32 value = readl(ioaddr + XGMAC_MTL_TXQ_OPMODE(channel));
 	u32 flow = readl(ioaddr + XGMAC_RX_FLOW_CTRL);
@@ -503,7 +521,8 @@ static void dwxgmac2_qmode(void __iomem *ioaddr, u32 channel, u8 qmode)
 	writel(value, ioaddr + XGMAC_MTL_TXQ_OPMODE(channel));
 }
 
-static void dwxgmac2_set_bfsize(void __iomem *ioaddr, int bfsize, u32 chan)
+static void dwxgmac2_set_bfsize(struct stmmac_priv *priv, void __iomem *ioaddr,
+				int bfsize, u32 chan)
 {
 	u32 value;
 
@@ -513,7 +532,8 @@ static void dwxgmac2_set_bfsize(void __iomem *ioaddr, int bfsize, u32 chan)
 	writel(value, ioaddr + XGMAC_DMA_CH_RX_CONTROL(chan));
 }
 
-static void dwxgmac2_enable_sph(void __iomem *ioaddr, bool en, u32 chan)
+static void dwxgmac2_enable_sph(struct stmmac_priv *priv, void __iomem *ioaddr,
+				bool en, u32 chan)
 {
 	u32 value = readl(ioaddr + XGMAC_RX_CONFIG);
 
@@ -529,7 +549,8 @@ static void dwxgmac2_enable_sph(void __iomem *ioaddr, bool en, u32 chan)
 	writel(value, ioaddr + XGMAC_DMA_CH_CONTROL(chan));
 }
 
-static int dwxgmac2_enable_tbs(void __iomem *ioaddr, bool en, u32 chan)
+static int dwxgmac2_enable_tbs(struct stmmac_priv *priv, void __iomem *ioaddr,
+			       bool en, u32 chan)
 {
 	u32 value = readl(ioaddr + XGMAC_DMA_CH_TX_CONTROL(chan));
 
diff --git a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
index 1bcbbd724fb5..a91d8f13a931 100644
--- a/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/enh_desc.c
@@ -12,10 +12,10 @@
 #include "common.h"
 #include "descs_com.h"
 
-static int enh_desc_get_tx_status(void *data, struct stmmac_extra_stats *x,
+static int
enh_desc_get_tx_status(struct net_device_stats *stats,
+				  struct stmmac_extra_stats *x,
 				  struct dma_desc *p, void __iomem *ioaddr)
 {
-	struct net_device_stats *stats = (struct net_device_stats *)data;
 	unsigned int tdes0 = le32_to_cpu(p->des0);
 	int ret = tx_done;
 
@@ -117,7 +117,8 @@ static int enh_desc_coe_rdes0(int ipc_err, int type, int payload_err)
 	return ret;
 }
 
-static void enh_desc_get_ext_status(void *data, struct stmmac_extra_stats *x,
+static void enh_desc_get_ext_status(struct net_device_stats *stats,
+				    struct stmmac_extra_stats *x,
 				    struct dma_extended_desc *p)
 {
 	unsigned int rdes0 = le32_to_cpu(p->basic.des0);
@@ -181,10 +182,10 @@ static void enh_desc_get_ext_status(void *data, struct stmmac_extra_stats *x,
 	}
 }
 
-static int enh_desc_get_rx_status(void *data, struct stmmac_extra_stats *x,
+static int enh_desc_get_rx_status(struct net_device_stats *stats,
+				  struct stmmac_extra_stats *x,
 				  struct dma_desc *p)
 {
-	struct net_device_stats *stats = (struct net_device_stats *)data;
 	unsigned int rdes0 = le32_to_cpu(p->des0);
 	int ret = good_frame;
 
diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.c b/drivers/net/ethernet/stmicro/stmmac/hwif.c
index bb7114f970f8..b8ba8f2d8041 100644
--- a/drivers/net/ethernet/stmicro/stmmac/hwif.c
+++ b/drivers/net/ethernet/stmicro/stmmac/hwif.c
@@ -87,6 +87,19 @@ static int stmmac_dwxlgmac_quirks(struct stmmac_priv *priv)
 	return 0;
 }
 
+int stmmac_reset(struct stmmac_priv *priv, void __iomem *ioaddr)
+{
+	struct plat_stmmacenet_data *plat = priv ?
priv->plat : NULL;
+
+	if (!priv)
+		return -EINVAL;
+
+	if (plat && plat->fix_soc_reset)
+		return plat->fix_soc_reset(plat, ioaddr);
+
+	return stmmac_do_callback(priv, dma, reset, ioaddr);
+}
+
 static const struct stmmac_hwif_entry {
 	bool gmac;
 	bool gmac4;
diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.h b/drivers/net/ethernet/stmicro/stmmac/hwif.h
index 16a7421715cb..6ee7cf07cfd7 100644
--- a/drivers/net/ethernet/stmicro/stmmac/hwif.h
+++ b/drivers/net/ethernet/stmicro/stmmac/hwif.h
@@ -26,6 +26,7 @@
 })
 
 struct stmmac_extra_stats;
+struct stmmac_priv;
 struct stmmac_safety_stats;
 struct dma_desc;
 struct dma_extended_desc;
@@ -56,8 +57,9 @@ struct stmmac_desc_ops {
 	/* Last tx segment reports the transmit status */
 	int (*get_tx_ls)(struct dma_desc *p);
 	/* Return the transmit status looking at the TDES1 */
-	int (*tx_status)(void *data, struct stmmac_extra_stats *x,
-			 struct dma_desc *p, void __iomem *ioaddr);
+	int (*tx_status)(struct net_device_stats *stats,
+			 struct stmmac_extra_stats *x,
+			 struct dma_desc *p, void __iomem *ioaddr);
 	/* Get the buffer size from the descriptor */
 	int (*get_tx_len)(struct dma_desc *p);
 	/* Handle extra events on specific interrupts hw dependent */
@@ -65,10 +67,12 @@ struct stmmac_desc_ops {
 	/* Get the receive frame size */
 	int (*get_rx_frame_len)(struct dma_desc *p, int rx_coe_type);
 	/* Return the reception status looking at the RDES1 */
-	int (*rx_status)(void *data, struct stmmac_extra_stats *x,
-			 struct dma_desc *p);
-	void (*rx_extended_status)(void *data, struct stmmac_extra_stats *x,
-				   struct dma_extended_desc *p);
+	int (*rx_status)(struct net_device_stats *stats,
+			 struct stmmac_extra_stats *x,
+			 struct dma_desc *p);
+	void (*rx_extended_status)(struct net_device_stats *stats,
+				   struct stmmac_extra_stats *x,
+				   struct dma_extended_desc *p);
 	/* Set tx timestamp enable bit */
 	void (*enable_tx_timestamp) (struct dma_desc *p);
 	/* get tx timestamp status */
@@ -168,110 +172,125 @@ struct stmmac_dma_ops {
 	int (*reset)(void
__iomem *ioaddr);
 	void (*init)(void __iomem *ioaddr, struct stmmac_dma_cfg *dma_cfg,
 		     int atds);
-	void (*init_chan)(void __iomem *ioaddr,
+	void (*init_chan)(struct stmmac_priv *priv, void __iomem *ioaddr,
 			  struct stmmac_dma_cfg *dma_cfg, u32 chan);
-	void (*init_rx_chan)(void __iomem *ioaddr,
+	void (*init_rx_chan)(struct stmmac_priv *priv, void __iomem *ioaddr,
 			     struct stmmac_dma_cfg *dma_cfg,
 			     dma_addr_t phy, u32 chan);
-	void (*init_tx_chan)(void __iomem *ioaddr,
+	void (*init_tx_chan)(struct stmmac_priv *priv, void __iomem *ioaddr,
 			     struct stmmac_dma_cfg *dma_cfg,
 			     dma_addr_t phy, u32 chan);
 	/* Configure the AXI Bus Mode Register */
 	void (*axi)(void __iomem *ioaddr, struct stmmac_axi *axi);
 	/* Dump DMA registers */
-	void (*dump_regs)(void __iomem *ioaddr, u32 *reg_space);
-	void (*dma_rx_mode)(void __iomem *ioaddr, int mode, u32 channel,
-			    int fifosz, u8 qmode);
-	void (*dma_tx_mode)(void __iomem *ioaddr, int mode, u32 channel,
+	void (*dump_regs)(struct stmmac_priv *priv, void __iomem *ioaddr,
+			  u32 *reg_space);
+	void (*dma_rx_mode)(struct stmmac_priv *priv, void __iomem *ioaddr,
+			    int mode, u32 channel, int fifosz, u8 qmode);
+	void (*dma_tx_mode)(struct stmmac_priv *priv, void __iomem *ioaddr,
+			    int mode, u32 channel,
 			    int fifosz, u8 qmode);
 	/* To track extra statistic (if supported) */
-	void (*dma_diagnostic_fr) (void *data, struct stmmac_extra_stats *x,
-				   void __iomem *ioaddr);
+	void (*dma_diagnostic_fr)(struct net_device_stats *stats,
+				  struct stmmac_extra_stats *x,
+				  void __iomem *ioaddr);
 	void (*enable_dma_transmission) (void __iomem *ioaddr);
-	void (*enable_dma_irq)(void __iomem *ioaddr, u32 chan,
-			       bool rx, bool tx);
-	void (*disable_dma_irq)(void __iomem *ioaddr, u32 chan,
-				bool rx, bool tx);
-	void (*start_tx)(void __iomem *ioaddr, u32 chan);
-	void (*stop_tx)(void __iomem *ioaddr, u32 chan);
-	void (*start_rx)(void __iomem *ioaddr, u32 chan);
-	void (*stop_rx)(void __iomem *ioaddr, u32 chan);
-	int (*dma_interrupt) (void __iomem *ioaddr,
-			      struct
stmmac_extra_stats *x, u32 chan, u32 dir); + void (*enable_dma_irq)(struct stmmac_priv *priv, void __iomem *ioaddr, + u32 chan, bool rx, bool tx); + void (*disable_dma_irq)(struct stmmac_priv *priv, void __iomem *ioaddr, + u32 chan, bool rx, bool tx); + void (*start_tx)(struct stmmac_priv *priv, void __iomem *ioaddr, + u32 chan); + void (*stop_tx)(struct stmmac_priv *priv, void __iomem *ioaddr, + u32 chan); + void (*start_rx)(struct stmmac_priv *priv, void __iomem *ioaddr, + u32 chan); + void (*stop_rx)(struct stmmac_priv *priv, void __iomem *ioaddr, + u32 chan); + int (*dma_interrupt)(struct stmmac_priv *priv, void __iomem *ioaddr, + struct stmmac_extra_stats *x, u32 chan, u32 dir); /* If supported then get the optional core features */ int (*get_hw_feature)(void __iomem *ioaddr, struct dma_features *dma_cap); /* Program the HW RX Watchdog */ - void (*rx_watchdog)(void __iomem *ioaddr, u32 riwt, u32 queue); - void (*set_tx_ring_len)(void __iomem *ioaddr, u32 len, u32 chan); - void (*set_rx_ring_len)(void __iomem *ioaddr, u32 len, u32 chan); - void (*set_rx_tail_ptr)(void __iomem *ioaddr, u32 tail_ptr, u32 chan); - void (*set_tx_tail_ptr)(void __iomem *ioaddr, u32 tail_ptr, u32 chan); - void (*enable_tso)(void __iomem *ioaddr, bool en, u32 chan); - void (*qmode)(void __iomem *ioaddr, u32 channel, u8 qmode); - void (*set_bfsize)(void __iomem *ioaddr, int bfsize, u32 chan); - void (*enable_sph)(void __iomem *ioaddr, bool en, u32 chan); - int (*enable_tbs)(void __iomem *ioaddr, bool en, u32 chan); + void (*rx_watchdog)(struct stmmac_priv *priv, void __iomem *ioaddr, + u32 riwt, u32 queue); + void (*set_tx_ring_len)(struct stmmac_priv *priv, void __iomem *ioaddr, + u32 len, u32 chan); + void (*set_rx_ring_len)(struct stmmac_priv *priv, void __iomem *ioaddr, + u32 len, u32 chan); + void (*set_rx_tail_ptr)(struct stmmac_priv *priv, void __iomem *ioaddr, + u32 tail_ptr, u32 chan); + void (*set_tx_tail_ptr)(struct stmmac_priv *priv, void __iomem *ioaddr, + u32 tail_ptr, 
u32 chan); + void (*enable_tso)(struct stmmac_priv *priv, void __iomem *ioaddr, + bool en, u32 chan); + void (*qmode)(struct stmmac_priv *priv, void __iomem *ioaddr, + u32 channel, u8 qmode); + void (*set_bfsize)(struct stmmac_priv *priv, void __iomem *ioaddr, + int bfsize, u32 chan); + void (*enable_sph)(struct stmmac_priv *priv, void __iomem *ioaddr, + bool en, u32 chan); + int (*enable_tbs)(struct stmmac_priv *priv, void __iomem *ioaddr, + bool en, u32 chan); }; -#define stmmac_reset(__priv, __args...) \ - stmmac_do_callback(__priv, dma, reset, __args) #define stmmac_dma_init(__priv, __args...) \ stmmac_do_void_callback(__priv, dma, init, __args) #define stmmac_init_chan(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, init_chan, __args) + stmmac_do_void_callback(__priv, dma, init_chan, __priv, __args) #define stmmac_init_rx_chan(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, init_rx_chan, __args) + stmmac_do_void_callback(__priv, dma, init_rx_chan, __priv, __args) #define stmmac_init_tx_chan(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, init_tx_chan, __args) + stmmac_do_void_callback(__priv, dma, init_tx_chan, __priv, __args) #define stmmac_axi(__priv, __args...) \ stmmac_do_void_callback(__priv, dma, axi, __args) #define stmmac_dump_dma_regs(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, dump_regs, __args) + stmmac_do_void_callback(__priv, dma, dump_regs, __priv, __args) #define stmmac_dma_rx_mode(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, dma_rx_mode, __args) + stmmac_do_void_callback(__priv, dma, dma_rx_mode, __priv, __args) #define stmmac_dma_tx_mode(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, dma_tx_mode, __args) + stmmac_do_void_callback(__priv, dma, dma_tx_mode, __priv, __args) #define stmmac_dma_diagnostic_fr(__priv, __args...) \ stmmac_do_void_callback(__priv, dma, dma_diagnostic_fr, __args) #define stmmac_enable_dma_transmission(__priv, __args...) 
\ stmmac_do_void_callback(__priv, dma, enable_dma_transmission, __args) #define stmmac_enable_dma_irq(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, enable_dma_irq, __args) + stmmac_do_void_callback(__priv, dma, enable_dma_irq, __priv, __args) #define stmmac_disable_dma_irq(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, disable_dma_irq, __args) + stmmac_do_void_callback(__priv, dma, disable_dma_irq, __priv, __args) #define stmmac_start_tx(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, start_tx, __args) + stmmac_do_void_callback(__priv, dma, start_tx, __priv, __args) #define stmmac_stop_tx(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, stop_tx, __args) + stmmac_do_void_callback(__priv, dma, stop_tx, __priv, __args) #define stmmac_start_rx(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, start_rx, __args) + stmmac_do_void_callback(__priv, dma, start_rx, __priv, __args) #define stmmac_stop_rx(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, stop_rx, __args) + stmmac_do_void_callback(__priv, dma, stop_rx, __priv, __args) #define stmmac_dma_interrupt_status(__priv, __args...) \ - stmmac_do_callback(__priv, dma, dma_interrupt, __args) + stmmac_do_callback(__priv, dma, dma_interrupt, __priv, __args) #define stmmac_get_hw_feature(__priv, __args...) \ stmmac_do_callback(__priv, dma, get_hw_feature, __args) #define stmmac_rx_watchdog(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, rx_watchdog, __args) + stmmac_do_void_callback(__priv, dma, rx_watchdog, __priv, __args) #define stmmac_set_tx_ring_len(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, set_tx_ring_len, __args) + stmmac_do_void_callback(__priv, dma, set_tx_ring_len, __priv, __args) #define stmmac_set_rx_ring_len(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, set_rx_ring_len, __args) + stmmac_do_void_callback(__priv, dma, set_rx_ring_len, __priv, __args) #define stmmac_set_rx_tail_ptr(__priv, __args...) 
\ - stmmac_do_void_callback(__priv, dma, set_rx_tail_ptr, __args) + stmmac_do_void_callback(__priv, dma, set_rx_tail_ptr, __priv, __args) #define stmmac_set_tx_tail_ptr(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, set_tx_tail_ptr, __args) + stmmac_do_void_callback(__priv, dma, set_tx_tail_ptr, __priv, __args) #define stmmac_enable_tso(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, enable_tso, __args) + stmmac_do_void_callback(__priv, dma, enable_tso, __priv, __args) #define stmmac_dma_qmode(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, qmode, __args) + stmmac_do_void_callback(__priv, dma, qmode, __priv, __args) #define stmmac_set_dma_bfsize(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, set_bfsize, __args) + stmmac_do_void_callback(__priv, dma, set_bfsize, __priv, __args) #define stmmac_enable_sph(__priv, __args...) \ - stmmac_do_void_callback(__priv, dma, enable_sph, __args) + stmmac_do_void_callback(__priv, dma, enable_sph, __priv, __args) #define stmmac_enable_tbs(__priv, __args...) 
\ - stmmac_do_callback(__priv, dma, enable_tbs, __args) + stmmac_do_callback(__priv, dma, enable_tbs, __priv, __args) struct mac_device_info; struct net_device; @@ -303,21 +322,23 @@ struct stmmac_ops { /* Program TX Algorithms */ void (*prog_mtl_tx_algorithms)(struct mac_device_info *hw, u32 tx_alg); /* Set MTL TX queues weight */ - void (*set_mtl_tx_queue_weight)(struct mac_device_info *hw, + void (*set_mtl_tx_queue_weight)(struct stmmac_priv *priv, + struct mac_device_info *hw, u32 weight, u32 queue); /* RX MTL queue to RX dma mapping */ void (*map_mtl_to_dma)(struct mac_device_info *hw, u32 queue, u32 chan); /* Configure AV Algorithm */ - void (*config_cbs)(struct mac_device_info *hw, u32 send_slope, - u32 idle_slope, u32 high_credit, u32 low_credit, - u32 queue); + void (*config_cbs)(struct stmmac_priv *priv, struct mac_device_info *hw, + u32 send_slope, u32 idle_slope, u32 high_credit, + u32 low_credit, u32 queue); /* Dump MAC registers */ void (*dump_regs)(struct mac_device_info *hw, u32 *reg_space); /* Handle extra events on specific interrupts hw dependent */ int (*host_irq_status)(struct mac_device_info *hw, struct stmmac_extra_stats *x); /* Handle MTL interrupts */ - int (*host_mtl_irq_status)(struct mac_device_info *hw, u32 chan); + int (*host_mtl_irq_status)(struct stmmac_priv *priv, + struct mac_device_info *hw, u32 chan); /* Multicast filter setting */ void (*set_filter)(struct mac_device_info *hw, struct net_device *dev); /* Flow control setting */ @@ -337,8 +358,9 @@ struct stmmac_ops { void (*set_eee_lpi_entry_timer)(struct mac_device_info *hw, int et); void (*set_eee_timer)(struct mac_device_info *hw, int ls, int tw); void (*set_eee_pls)(struct mac_device_info *hw, int link); - void (*debug)(void __iomem *ioaddr, struct stmmac_extra_stats *x, - u32 rx_queues, u32 tx_queues); + void (*debug)(struct stmmac_priv *priv, void __iomem *ioaddr, + struct stmmac_extra_stats *x, u32 rx_queues, + u32 tx_queues); /* PCS calls */ void (*pcs_ctrl_ane)(void 
__iomem *ioaddr, bool ane, bool srgmi_ral, bool loopback); @@ -418,17 +440,17 @@ struct stmmac_ops { #define stmmac_prog_mtl_tx_algorithms(__priv, __args...) \ stmmac_do_void_callback(__priv, mac, prog_mtl_tx_algorithms, __args) #define stmmac_set_mtl_tx_queue_weight(__priv, __args...) \ - stmmac_do_void_callback(__priv, mac, set_mtl_tx_queue_weight, __args) + stmmac_do_void_callback(__priv, mac, set_mtl_tx_queue_weight, __priv, __args) #define stmmac_map_mtl_to_dma(__priv, __args...) \ stmmac_do_void_callback(__priv, mac, map_mtl_to_dma, __args) #define stmmac_config_cbs(__priv, __args...) \ - stmmac_do_void_callback(__priv, mac, config_cbs, __args) + stmmac_do_void_callback(__priv, mac, config_cbs, __priv, __args) #define stmmac_dump_mac_regs(__priv, __args...) \ stmmac_do_void_callback(__priv, mac, dump_regs, __args) #define stmmac_host_irq_status(__priv, __args...) \ stmmac_do_callback(__priv, mac, host_irq_status, __args) #define stmmac_host_mtl_irq_status(__priv, __args...) \ - stmmac_do_callback(__priv, mac, host_mtl_irq_status, __args) + stmmac_do_callback(__priv, mac, host_mtl_irq_status, __priv, __args) #define stmmac_set_filter(__priv, __args...) \ stmmac_do_void_callback(__priv, mac, set_filter, __args) #define stmmac_flow_ctrl(__priv, __args...) \ @@ -450,11 +472,11 @@ struct stmmac_ops { #define stmmac_set_eee_pls(__priv, __args...) \ stmmac_do_void_callback(__priv, mac, set_eee_pls, __args) #define stmmac_mac_debug(__priv, __args...) \ - stmmac_do_void_callback(__priv, mac, debug, __args) + stmmac_do_void_callback(__priv, mac, debug, __priv, __args) #define stmmac_pcs_ctrl_ane(__priv, __args...) \ stmmac_do_void_callback(__priv, mac, pcs_ctrl_ane, __args) #define stmmac_pcs_rane(__priv, __args...) \ - stmmac_do_void_callback(__priv, mac, pcs_rane, __args) + stmmac_do_void_callback(__priv, mac, pcs_rane, __priv, __args) #define stmmac_pcs_get_adv_lp(__priv, __args...) 
\ stmmac_do_void_callback(__priv, mac, pcs_get_adv_lp, __args) #define stmmac_safety_feat_config(__priv, __args...) \ @@ -502,8 +524,6 @@ struct stmmac_ops { #define stmmac_fpe_irq_status(__priv, __args...) \ stmmac_do_callback(__priv, mac, fpe_irq_status, __args) -struct stmmac_priv; - /* PTP and HW Timer helpers */ struct stmmac_hwtimestamp { void (*config_hw_tstamping) (void __iomem *ioaddr, u32 data); @@ -535,16 +555,20 @@ struct stmmac_hwtimestamp { #define stmmac_timestamp_interrupt(__priv, __args...) \ stmmac_do_void_callback(__priv, ptp, timestamp_interrupt, __args) +struct stmmac_tx_queue; +struct stmmac_rx_queue; + /* Helpers to manage the descriptors for chain and ring modes */ struct stmmac_mode_ops { void (*init) (void *des, dma_addr_t phy_addr, unsigned int size, unsigned int extend_desc); unsigned int (*is_jumbo_frm) (int len, int ehn_desc); - int (*jumbo_frm)(void *priv, struct sk_buff *skb, int csum); + int (*jumbo_frm)(struct stmmac_tx_queue *tx_q, struct sk_buff *skb, + int csum); int (*set_16kib_bfsize)(int mtu); void (*init_desc3)(struct dma_desc *p); - void (*refill_desc3) (void *priv, struct dma_desc *p); - void (*clean_desc3) (void *priv, struct dma_desc *p); + void (*refill_desc3)(struct stmmac_rx_queue *rx_q, struct dma_desc *p); + void (*clean_desc3)(struct stmmac_tx_queue *tx_q, struct dma_desc *p); }; #define stmmac_mode_init(__priv, __args...) 
\ @@ -640,6 +664,7 @@ extern const struct stmmac_mmc_ops dwxgmac_mmc_ops; #define GMAC_VERSION 0x00000020 /* GMAC CORE Version */ #define GMAC4_VERSION 0x00000110 /* GMAC4+ CORE Version */ +int stmmac_reset(struct stmmac_priv *priv, void __iomem *ioaddr); int stmmac_hwif_init(struct stmmac_priv *priv); #endif /* __STMMAC_HWIF_H__ */ diff --git a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c index e3da4da242ee..350e6670a576 100644 --- a/drivers/net/ethernet/stmicro/stmmac/norm_desc.c +++ b/drivers/net/ethernet/stmicro/stmmac/norm_desc.c @@ -12,10 +12,10 @@ #include "common.h" #include "descs_com.h" -static int ndesc_get_tx_status(void *data, struct stmmac_extra_stats *x, +static int ndesc_get_tx_status(struct net_device_stats *stats, + struct stmmac_extra_stats *x, struct dma_desc *p, void __iomem *ioaddr) { - struct net_device_stats *stats = (struct net_device_stats *)data; unsigned int tdes0 = le32_to_cpu(p->des0); unsigned int tdes1 = le32_to_cpu(p->des1); int ret = tx_done; @@ -70,12 +70,12 @@ static int ndesc_get_tx_len(struct dma_desc *p) * and, if required, updates the multicast statistics. * In case of success, it returns good_frame because the GMAC device * is supposed to be able to compute the csum in HW. 
*/ -static int ndesc_get_rx_status(void *data, struct stmmac_extra_stats *x, +static int ndesc_get_rx_status(struct net_device_stats *stats, + struct stmmac_extra_stats *x, struct dma_desc *p) { int ret = good_frame; unsigned int rdes0 = le32_to_cpu(p->des0); - struct net_device_stats *stats = (struct net_device_stats *)data; if (unlikely(rdes0 & RDES0_OWN)) return dma_own; diff --git a/drivers/net/ethernet/stmicro/stmmac/ring_mode.c b/drivers/net/ethernet/stmicro/stmmac/ring_mode.c index 2b5b17d8b8a0..d218412ca832 100644 --- a/drivers/net/ethernet/stmicro/stmmac/ring_mode.c +++ b/drivers/net/ethernet/stmicro/stmmac/ring_mode.c @@ -14,9 +14,9 @@ #include "stmmac.h" -static int jumbo_frm(void *p, struct sk_buff *skb, int csum) +static int jumbo_frm(struct stmmac_tx_queue *tx_q, struct sk_buff *skb, + int csum) { - struct stmmac_tx_queue *tx_q = (struct stmmac_tx_queue *)p; unsigned int nopaged_len = skb_headlen(skb); struct stmmac_priv *priv = tx_q->priv_data; unsigned int entry = tx_q->cur_tx; @@ -101,9 +101,8 @@ static unsigned int is_jumbo_frm(int len, int enh_desc) return ret; } -static void refill_desc3(void *priv_ptr, struct dma_desc *p) +static void refill_desc3(struct stmmac_rx_queue *rx_q, struct dma_desc *p) { - struct stmmac_rx_queue *rx_q = priv_ptr; struct stmmac_priv *priv = rx_q->priv_data; /* Fill DES3 in case of RING mode */ @@ -117,9 +116,8 @@ static void init_desc3(struct dma_desc *p) p->des3 = cpu_to_le32(le32_to_cpu(p->des2) + BUF_SIZE_8KiB); } -static void clean_desc3(void *priv_ptr, struct dma_desc *p) +static void clean_desc3(struct stmmac_tx_queue *tx_q, struct dma_desc *p) { - struct stmmac_tx_queue *tx_q = (struct stmmac_tx_queue *)priv_ptr; struct stmmac_priv *priv = tx_q->priv_data; unsigned int entry = tx_q->dirty_tx; diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h index 3d15e1e92e18..07ea5ab0a60b 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h +++ 
b/drivers/net/ethernet/stmicro/stmmac/stmmac.h @@ -92,6 +92,13 @@ struct stmmac_rx_buffer { dma_addr_t sec_addr; }; +struct stmmac_xdp_buff { + struct xdp_buff xdp; + struct stmmac_priv *priv; + struct dma_desc *desc; + struct dma_desc *ndesc; +}; + struct stmmac_rx_queue { u32 rx_count_frames; u32 queue_index; diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c index 35c8dd92d369..2ae73ab842d4 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c @@ -393,19 +393,10 @@ stmmac_ethtool_set_link_ksettings(struct net_device *dev, if (priv->hw->pcs & STMMAC_PCS_RGMII || priv->hw->pcs & STMMAC_PCS_SGMII) { - u32 mask = ADVERTISED_Autoneg | ADVERTISED_Pause; - /* Only support ANE */ if (cmd->base.autoneg != AUTONEG_ENABLE) return -EINVAL; - mask &= (ADVERTISED_1000baseT_Half | - ADVERTISED_1000baseT_Full | - ADVERTISED_100baseT_Half | - ADVERTISED_100baseT_Full | - ADVERTISED_10baseT_Half | - ADVERTISED_10baseT_Full); - mutex_lock(&priv->lock); stmmac_pcs_ctrl_ane(priv, priv->ioaddr, 1, priv->hw->ps, 0); mutex_unlock(&priv->lock); diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index d7fcab057032..0fca81507a77 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -1614,6 +1614,12 @@ static int stmmac_alloc_rx_buffers_zc(struct stmmac_priv *priv, struct stmmac_rx_queue *rx_q = &dma_conf->rx_queue[queue]; int i; + /* struct stmmac_xdp_buff is using cb field (maximum size of 24 bytes) + * in struct xdp_buff_xsk to stash driver specific information. Thus, + * use this macro to make sure no size violations. 
+ */ + XSK_CHECK_PRIV_TYPE(struct stmmac_xdp_buff); + for (i = 0; i < dma_conf->dma_rx_size; i++) { struct stmmac_rx_buffer *buf; dma_addr_t dma_addr; @@ -4563,13 +4569,10 @@ dma_map_err: static void stmmac_rx_vlan(struct net_device *dev, struct sk_buff *skb) { - struct vlan_ethhdr *veth; - __be16 vlan_proto; + struct vlan_ethhdr *veth = skb_vlan_eth_hdr(skb); + __be16 vlan_proto = veth->h_vlan_proto; u16 vlanid; - veth = (struct vlan_ethhdr *)skb->data; - vlan_proto = veth->h_vlan_proto; - if ((vlan_proto == htons(ETH_P_8021Q) && dev->features & NETIF_F_HW_VLAN_CTAG_RX) || (vlan_proto == htons(ETH_P_8021AD) && @@ -4998,6 +5001,16 @@ static bool stmmac_rx_refill_zc(struct stmmac_priv *priv, u32 queue, u32 budget) return ret; } +static struct stmmac_xdp_buff *xsk_buff_to_stmmac_ctx(struct xdp_buff *xdp) +{ + /* In XDP zero copy data path, xdp field in struct xdp_buff_xsk is used + * to represent incoming packet, whereas cb field in the same structure + * is used to store driver specific info. Thus, struct stmmac_xdp_buff + * is laid on top of xdp and cb fields of struct xdp_buff_xsk. 
+ */ + return (struct stmmac_xdp_buff *)xdp; +} + static int stmmac_rx_zc(struct stmmac_priv *priv, int limit, u32 queue) { struct stmmac_rx_queue *rx_q = &priv->dma_conf.rx_queue[queue]; @@ -5027,6 +5040,7 @@ static int stmmac_rx_zc(struct stmmac_priv *priv, int limit, u32 queue) } while (count < limit) { struct stmmac_rx_buffer *buf; + struct stmmac_xdp_buff *ctx; unsigned int buf1_len = 0; struct dma_desc *np, *p; int entry; @@ -5112,6 +5126,11 @@ read_again: goto read_again; } + ctx = xsk_buff_to_stmmac_ctx(buf->xdp); + ctx->priv = priv; + ctx->desc = p; + ctx->ndesc = np; + /* XDP ZC Frame only support primary buffers for now */ buf1_len = stmmac_rx_buf1_len(priv, p, status, len); len += buf1_len; @@ -5190,7 +5209,7 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue) enum dma_data_direction dma_dir; unsigned int desc_size; struct sk_buff *skb = NULL; - struct xdp_buff xdp; + struct stmmac_xdp_buff ctx; int xdp_status = 0; int buf_sz; @@ -5311,17 +5330,22 @@ read_again: dma_sync_single_for_cpu(priv->device, buf->addr, buf1_len, dma_dir); - xdp_init_buff(&xdp, buf_sz, &rx_q->xdp_rxq); - xdp_prepare_buff(&xdp, page_address(buf->page), - buf->page_offset, buf1_len, false); + xdp_init_buff(&ctx.xdp, buf_sz, &rx_q->xdp_rxq); + xdp_prepare_buff(&ctx.xdp, page_address(buf->page), + buf->page_offset, buf1_len, true); - pre_len = xdp.data_end - xdp.data_hard_start - + pre_len = ctx.xdp.data_end - ctx.xdp.data_hard_start - buf->page_offset; - skb = stmmac_xdp_run_prog(priv, &xdp); + + ctx.priv = priv; + ctx.desc = p; + ctx.ndesc = np; + + skb = stmmac_xdp_run_prog(priv, &ctx.xdp); /* Due xdp_adjust_tail: DMA sync for_device * cover max len CPU touch */ - sync_len = xdp.data_end - xdp.data_hard_start - + sync_len = ctx.xdp.data_end - ctx.xdp.data_hard_start - buf->page_offset; sync_len = max(sync_len, pre_len); @@ -5331,7 +5355,7 @@ read_again: if (xdp_res & STMMAC_XDP_CONSUMED) { page_pool_put_page(rx_q->page_pool, - virt_to_head_page(xdp.data), + 
virt_to_head_page(ctx.xdp.data), sync_len, true); buf->page = NULL; priv->dev->stats.rx_dropped++; @@ -5359,7 +5383,7 @@ read_again: if (!skb) { /* XDP program may expand or reduce tail */ - buf1_len = xdp.data_end - xdp.data; + buf1_len = ctx.xdp.data_end - ctx.xdp.data; skb = napi_alloc_skb(&ch->rx_napi, buf1_len); if (!skb) { @@ -5369,7 +5393,7 @@ read_again: } /* XDP program may adjust header */ - skb_copy_to_linear_data(skb, xdp.data, buf1_len); + skb_copy_to_linear_data(skb, ctx.xdp.data, buf1_len); skb_put(skb, buf1_len); /* Data payload copied into SKB, page ready for recycle */ @@ -6350,6 +6374,10 @@ static int stmmac_vlan_rx_add_vid(struct net_device *ndev, __be16 proto, u16 vid bool is_double = false; int ret; + ret = pm_runtime_resume_and_get(priv->device); + if (ret < 0) + return ret; + if (be16_to_cpu(proto) == ETH_P_8021AD) is_double = true; @@ -6357,16 +6385,18 @@ static int stmmac_vlan_rx_add_vid(struct net_device *ndev, __be16 proto, u16 vid ret = stmmac_vlan_update(priv, is_double); if (ret) { clear_bit(vid, priv->active_vlans); - return ret; + goto err_pm_put; } if (priv->hw->num_vlan) { ret = stmmac_add_hw_vlan_rx_fltr(priv, ndev, priv->hw, proto, vid); if (ret) - return ret; + goto err_pm_put; } +err_pm_put: + pm_runtime_put(priv->device); - return 0; + return ret; } static int stmmac_vlan_rx_kill_vid(struct net_device *ndev, __be16 proto, u16 vid) @@ -7060,6 +7090,37 @@ void stmmac_fpe_handshake(struct stmmac_priv *priv, bool enable) } } +static int stmmac_xdp_rx_timestamp(const struct xdp_md *_ctx, u64 *timestamp) +{ + const struct stmmac_xdp_buff *ctx = (void *)_ctx; + struct dma_desc *desc_contains_ts = ctx->desc; + struct stmmac_priv *priv = ctx->priv; + struct dma_desc *ndesc = ctx->ndesc; + struct dma_desc *desc = ctx->desc; + u64 ns = 0; + + if (!priv->hwts_rx_en) + return -ENODATA; + + /* For GMAC4, the valid timestamp is from CTX next desc. 
*/ + if (priv->plat->has_gmac4 || priv->plat->has_xgmac) + desc_contains_ts = ndesc; + + /* Check if timestamp is available */ + if (stmmac_get_rx_timestamp_status(priv, desc, ndesc, priv->adv_ts)) { + stmmac_get_timestamp(priv, desc_contains_ts, priv->adv_ts, &ns); + ns -= priv->plat->cdc_error_adj; + *timestamp = ns_to_ktime(ns); + return 0; + } + + return -ENODATA; +} + +static const struct xdp_metadata_ops stmmac_xdp_metadata_ops = { + .xmo_rx_timestamp = stmmac_xdp_rx_timestamp, +}; + /** * stmmac_dvr_probe * @device: device pointer @@ -7167,6 +7228,8 @@ int stmmac_dvr_probe(struct device *device, ndev->netdev_ops = &stmmac_netdev_ops; + ndev->xdp_metadata_ops = &stmmac_xdp_metadata_ops; + ndev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM | NETIF_F_RXCSUM; ndev->xdp_features = NETDEV_XDP_ACT_BASIC | NETDEV_XDP_ACT_REDIRECT | @@ -7253,6 +7316,10 @@ int stmmac_dvr_probe(struct device *device, if (priv->dma_cap.rssen && priv->plat->rss_en) ndev->features |= NETIF_F_RXHASH; + ndev->vlan_features |= ndev->features; + /* TSO doesn't work on VLANs yet */ + ndev->vlan_features &= ~NETIF_F_TSO; + /* MTU range: 46 - hw-specific max */ ndev->min_mtu = ETH_ZLEN - ETH_HLEN; if (priv->plat->has_xgmac) @@ -7275,6 +7342,8 @@ int stmmac_dvr_probe(struct device *device, if (flow_ctrl) priv->flow_ctrl = FLOW_AUTO; /* RX/TX pause on */ + ndev->priv_flags |= IFF_LIVE_ADDR_CHANGE; + /* Setup channels NAPI */ stmmac_napi_add(ndev); diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c index 21aaa2730ac8..6807c4c1a0a2 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c @@ -281,9 +281,8 @@ static int stmmac_mdio_read_c22(struct mii_bus *bus, int phyaddr, int phyreg) value |= (phyreg << priv->hw->mii.reg_shift) & priv->hw->mii.reg_mask; value |= (priv->clk_csr << priv->hw->mii.clk_csr_shift) & priv->hw->mii.clk_csr_mask; - if 
(priv->plat->has_gmac4) { + if (priv->plat->has_gmac4) value |= MII_GMAC4_READ; - } data = stmmac_mdio_read(priv, data, value); diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c index 067a40fe0a23..eb0b2898daa3 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c @@ -519,7 +519,8 @@ stmmac_probe_config_dt(struct platform_device *pdev, u8 *mac) if (of_device_is_compatible(np, "snps,dwmac-4.00") || of_device_is_compatible(np, "snps,dwmac-4.10a") || of_device_is_compatible(np, "snps,dwmac-4.20a") || - of_device_is_compatible(np, "snps,dwmac-5.10a")) { + of_device_is_compatible(np, "snps,dwmac-5.10a") || + of_device_is_compatible(np, "snps,dwmac-5.20")) { plat->has_gmac4 = 1; plat->has_gmac = 0; plat->pmt = 1; diff --git a/drivers/net/ethernet/sun/sunhme.c b/drivers/net/ethernet/sun/sunhme.c index b0c7ab74a82e..b93613cd1994 100644 --- a/drivers/net/ethernet/sun/sunhme.c +++ b/drivers/net/ethernet/sun/sunhme.c @@ -14,48 +14,44 @@ * argument : macaddr=0x00,0x10,0x20,0x30,0x40,0x50 */ -#include <linux/module.h> -#include <linux/kernel.h> -#include <linux/types.h> +#include <linux/bitops.h> +#include <linux/crc32.h> +#include <linux/delay.h> +#include <linux/dma-mapping.h> +#include <linux/errno.h> +#include <linux/etherdevice.h> +#include <linux/ethtool.h> #include <linux/fcntl.h> -#include <linux/interrupt.h> -#include <linux/ioport.h> #include <linux/in.h> -#include <linux/slab.h> -#include <linux/string.h> -#include <linux/delay.h> #include <linux/init.h> -#include <linux/ethtool.h> +#include <linux/interrupt.h> +#include <linux/io.h> +#include <linux/ioport.h> +#include <linux/kernel.h> #include <linux/mii.h> -#include <linux/crc32.h> -#include <linux/random.h> -#include <linux/errno.h> +#include <linux/mm.h> +#include <linux/module.h> #include <linux/netdevice.h> -#include <linux/etherdevice.h> +#include 
<linux/of_device.h> +#include <linux/of.h> +#include <linux/pci.h> +#include <linux/random.h> #include <linux/skbuff.h> -#include <linux/mm.h> -#include <linux/bitops.h> -#include <linux/dma-mapping.h> +#include <linux/slab.h> +#include <linux/string.h> +#include <linux/types.h> +#include <linux/uaccess.h> -#include <asm/io.h> -#include <asm/dma.h> #include <asm/byteorder.h> +#include <asm/dma.h> +#include <asm/irq.h> #ifdef CONFIG_SPARC -#include <linux/of.h> -#include <linux/of_device.h> +#include <asm/auxio.h> #include <asm/idprom.h> #include <asm/openprom.h> #include <asm/oplib.h> #include <asm/prom.h> -#include <asm/auxio.h> -#endif -#include <linux/uaccess.h> - -#include <asm/irq.h> - -#ifdef CONFIG_PCI -#include <linux/pci.h> #endif #include "sunhme.h" @@ -589,8 +585,6 @@ no_response: return 1; } -static int happy_meal_init(struct happy_meal *hp); - static int is_lucent_phy(struct happy_meal *hp) { void __iomem *tregs = hp->tcvregs; @@ -606,6 +600,124 @@ static int is_lucent_phy(struct happy_meal *hp) return ret; } +/* hp->happy_lock must be held */ +static void +happy_meal_begin_auto_negotiation(struct happy_meal *hp, + void __iomem *tregs, + const struct ethtool_link_ksettings *ep) +{ + int timeout; + + /* Read all of the registers we are interested in now. */ + hp->sw_bmsr = happy_meal_tcvr_read(hp, tregs, MII_BMSR); + hp->sw_bmcr = happy_meal_tcvr_read(hp, tregs, MII_BMCR); + hp->sw_physid1 = happy_meal_tcvr_read(hp, tregs, MII_PHYSID1); + hp->sw_physid2 = happy_meal_tcvr_read(hp, tregs, MII_PHYSID2); + + /* XXX Check BMSR_ANEGCAPABLE, should not be necessary though. */ + + hp->sw_advertise = happy_meal_tcvr_read(hp, tregs, MII_ADVERTISE); + if (!ep || ep->base.autoneg == AUTONEG_ENABLE) { + /* Advertise everything we can support. 
*/ + if (hp->sw_bmsr & BMSR_10HALF) + hp->sw_advertise |= (ADVERTISE_10HALF); + else + hp->sw_advertise &= ~(ADVERTISE_10HALF); + + if (hp->sw_bmsr & BMSR_10FULL) + hp->sw_advertise |= (ADVERTISE_10FULL); + else + hp->sw_advertise &= ~(ADVERTISE_10FULL); + if (hp->sw_bmsr & BMSR_100HALF) + hp->sw_advertise |= (ADVERTISE_100HALF); + else + hp->sw_advertise &= ~(ADVERTISE_100HALF); + if (hp->sw_bmsr & BMSR_100FULL) + hp->sw_advertise |= (ADVERTISE_100FULL); + else + hp->sw_advertise &= ~(ADVERTISE_100FULL); + happy_meal_tcvr_write(hp, tregs, MII_ADVERTISE, hp->sw_advertise); + + /* XXX Currently no Happy Meal cards I know off support 100BaseT4, + * XXX and this is because the DP83840 does not support it, changes + * XXX would need to be made to the tx/rx logic in the driver as well + * XXX so I completely skip checking for it in the BMSR for now. + */ + + ASD("Advertising [ %s%s%s%s]\n", + hp->sw_advertise & ADVERTISE_10HALF ? "10H " : "", + hp->sw_advertise & ADVERTISE_10FULL ? "10F " : "", + hp->sw_advertise & ADVERTISE_100HALF ? "100H " : "", + hp->sw_advertise & ADVERTISE_100FULL ? "100F " : ""); + + /* Enable Auto-Negotiation, this is usually on already... */ + hp->sw_bmcr |= BMCR_ANENABLE; + happy_meal_tcvr_write(hp, tregs, MII_BMCR, hp->sw_bmcr); + + /* Restart it to make sure it is going. */ + hp->sw_bmcr |= BMCR_ANRESTART; + happy_meal_tcvr_write(hp, tregs, MII_BMCR, hp->sw_bmcr); + + /* BMCR_ANRESTART self clears when the process has begun. */ + + timeout = 64; /* More than enough. */ + while (--timeout) { + hp->sw_bmcr = happy_meal_tcvr_read(hp, tregs, MII_BMCR); + if (!(hp->sw_bmcr & BMCR_ANRESTART)) + break; /* got it. 
*/ + udelay(10); + } + if (!timeout) { + netdev_err(hp->dev, + "Happy Meal would not start auto negotiation BMCR=0x%04x\n", + hp->sw_bmcr); + netdev_notice(hp->dev, + "Performing force link detection.\n"); + goto force_link; + } else { + hp->timer_state = arbwait; + } + } else { +force_link: + /* Force the link up, trying first a particular mode. + * Either we are here at the request of ethtool or + * because the Happy Meal would not start to autoneg. + */ + + /* Disable auto-negotiation in BMCR, enable the duplex and + * speed setting, init the timer state machine, and fire it off. + */ + if (!ep || ep->base.autoneg == AUTONEG_ENABLE) { + hp->sw_bmcr = BMCR_SPEED100; + } else { + if (ep->base.speed == SPEED_100) + hp->sw_bmcr = BMCR_SPEED100; + else + hp->sw_bmcr = 0; + if (ep->base.duplex == DUPLEX_FULL) + hp->sw_bmcr |= BMCR_FULLDPLX; + } + happy_meal_tcvr_write(hp, tregs, MII_BMCR, hp->sw_bmcr); + + if (!is_lucent_phy(hp)) { + /* OK, seems we need do disable the transceiver for the first + * tick to make sure we get an accurate link state at the + * second tick. + */ + hp->sw_csconfig = happy_meal_tcvr_read(hp, tregs, + DP83840_CSCONFIG); + hp->sw_csconfig &= ~(CSCONFIG_TCVDISAB); + happy_meal_tcvr_write(hp, tregs, DP83840_CSCONFIG, + hp->sw_csconfig); + } + hp->timer_state = ltrywait; + } + + hp->timer_ticks = 0; + hp->happy_timer.expires = jiffies + (12 * HZ)/10; /* 1.2 sec. */ + add_timer(&hp->happy_timer); +} + static void happy_meal_timer(struct timer_list *t) { struct happy_meal *hp = from_timer(hp, t, happy_timer); @@ -743,12 +855,7 @@ static void happy_meal_timer(struct timer_list *t) netdev_notice(hp->dev, "Link down, cable problem?\n"); - ret = happy_meal_init(hp); - if (ret) { - /* ho hum... 
*/ - netdev_err(hp->dev, - "Error, cannot re-init the Happy Meal.\n"); - } + happy_meal_begin_auto_negotiation(hp, tregs, NULL); goto out; } if (!is_lucent_phy(hp)) { @@ -874,32 +981,6 @@ static void happy_meal_get_counters(struct happy_meal *hp, void __iomem *bregs) hme_write32(hp, bregs + BMAC_LTCTR, 0); } -/* hp->happy_lock must be held */ -static void happy_meal_poll_stop(struct happy_meal *hp, void __iomem *tregs) -{ - /* If polling disabled or not polling already, nothing to do. */ - if ((hp->happy_flags & (HFLAG_POLLENABLE | HFLAG_POLL)) != - (HFLAG_POLLENABLE | HFLAG_POLL)) { - ASD("not polling, return\n"); - return; - } - - /* Shut up the MIF. */ - ASD("were polling, mif ints off, polling off\n"); - hme_write32(hp, tregs + TCVR_IMASK, 0xffff); - - /* Turn off polling. */ - hme_write32(hp, tregs + TCVR_CFG, - hme_read32(hp, tregs + TCVR_CFG) & ~(TCV_CFG_PENABLE)); - - /* We are no longer polling. */ - hp->happy_flags &= ~(HFLAG_POLL); - - /* Let the bits set. */ - udelay(200); - ASD("done\n"); -} - /* Only Sun can take such nice parts and fuck up the programming interface * like this. Good job guys... */ @@ -1004,57 +1085,26 @@ static int happy_meal_tcvr_reset(struct happy_meal *hp, void __iomem *tregs) static void happy_meal_transceiver_check(struct happy_meal *hp, void __iomem *tregs) { unsigned long tconfig = hme_read32(hp, tregs + TCVR_CFG); + u32 reread = hme_read32(hp, tregs + TCVR_CFG); ASD("tcfg=%08lx\n", tconfig); - if (hp->happy_flags & HFLAG_POLL) { - /* If we are polling, we must stop to get the transceiver type. 
*/ - if (hp->tcvr_type == internal) { - if (tconfig & TCV_CFG_MDIO1) { - happy_meal_poll_stop(hp, tregs); - hp->paddr = TCV_PADDR_ETX; - hp->tcvr_type = external; - tconfig &= ~(TCV_CFG_PENABLE); - tconfig |= TCV_CFG_PSELECT; - hme_write32(hp, tregs + TCVR_CFG, tconfig); - ASD("poll stop, internal->external\n"); - } - } else { - if (hp->tcvr_type == external) { - if (!(hme_read32(hp, tregs + TCVR_STATUS) >> 16)) { - happy_meal_poll_stop(hp, tregs); - hp->paddr = TCV_PADDR_ITX; - hp->tcvr_type = internal; - hme_write32(hp, tregs + TCVR_CFG, - hme_read32(hp, tregs + TCVR_CFG) & - ~(TCV_CFG_PSELECT)); - ASD("poll stop, external->internal\n"); - } - } else { - ASD("polling, none\n"); - } - } + if (reread & TCV_CFG_MDIO1) { + hme_write32(hp, tregs + TCVR_CFG, tconfig | TCV_CFG_PSELECT); + hp->paddr = TCV_PADDR_ETX; + hp->tcvr_type = external; + ASD("not polling, external\n"); } else { - u32 reread = hme_read32(hp, tregs + TCVR_CFG); - - /* Else we can just work off of the MDIO bits. */ - if (reread & TCV_CFG_MDIO1) { - hme_write32(hp, tregs + TCVR_CFG, tconfig | TCV_CFG_PSELECT); - hp->paddr = TCV_PADDR_ETX; - hp->tcvr_type = external; - ASD("not polling, external\n"); + if (reread & TCV_CFG_MDIO0) { + hme_write32(hp, tregs + TCVR_CFG, + tconfig & ~(TCV_CFG_PSELECT)); + hp->paddr = TCV_PADDR_ITX; + hp->tcvr_type = internal; + ASD("not polling, internal\n"); } else { - if (reread & TCV_CFG_MDIO0) { - hme_write32(hp, tregs + TCVR_CFG, - tconfig & ~(TCV_CFG_PSELECT)); - hp->paddr = TCV_PADDR_ITX; - hp->tcvr_type = internal; - ASD("not polling, internal\n"); - } else { - netdev_err(hp->dev, - "Transceiver and a coke please."); - hp->tcvr_type = none; /* Grrr... */ - ASD("not polling, none\n"); - } + netdev_err(hp->dev, + "Transceiver and a coke please."); + hp->tcvr_type = none; /* Grrr... 
*/ + ASD("not polling, none\n"); } } } @@ -1202,124 +1252,6 @@ static void happy_meal_init_rings(struct happy_meal *hp) } /* hp->happy_lock must be held */ -static void -happy_meal_begin_auto_negotiation(struct happy_meal *hp, - void __iomem *tregs, - const struct ethtool_link_ksettings *ep) -{ - int timeout; - - /* Read all of the registers we are interested in now. */ - hp->sw_bmsr = happy_meal_tcvr_read(hp, tregs, MII_BMSR); - hp->sw_bmcr = happy_meal_tcvr_read(hp, tregs, MII_BMCR); - hp->sw_physid1 = happy_meal_tcvr_read(hp, tregs, MII_PHYSID1); - hp->sw_physid2 = happy_meal_tcvr_read(hp, tregs, MII_PHYSID2); - - /* XXX Check BMSR_ANEGCAPABLE, should not be necessary though. */ - - hp->sw_advertise = happy_meal_tcvr_read(hp, tregs, MII_ADVERTISE); - if (!ep || ep->base.autoneg == AUTONEG_ENABLE) { - /* Advertise everything we can support. */ - if (hp->sw_bmsr & BMSR_10HALF) - hp->sw_advertise |= (ADVERTISE_10HALF); - else - hp->sw_advertise &= ~(ADVERTISE_10HALF); - - if (hp->sw_bmsr & BMSR_10FULL) - hp->sw_advertise |= (ADVERTISE_10FULL); - else - hp->sw_advertise &= ~(ADVERTISE_10FULL); - if (hp->sw_bmsr & BMSR_100HALF) - hp->sw_advertise |= (ADVERTISE_100HALF); - else - hp->sw_advertise &= ~(ADVERTISE_100HALF); - if (hp->sw_bmsr & BMSR_100FULL) - hp->sw_advertise |= (ADVERTISE_100FULL); - else - hp->sw_advertise &= ~(ADVERTISE_100FULL); - happy_meal_tcvr_write(hp, tregs, MII_ADVERTISE, hp->sw_advertise); - - /* XXX Currently no Happy Meal cards I know off support 100BaseT4, - * XXX and this is because the DP83840 does not support it, changes - * XXX would need to be made to the tx/rx logic in the driver as well - * XXX so I completely skip checking for it in the BMSR for now. - */ - - ASD("Advertising [ %s%s%s%s]\n", - hp->sw_advertise & ADVERTISE_10HALF ? "10H " : "", - hp->sw_advertise & ADVERTISE_10FULL ? "10F " : "", - hp->sw_advertise & ADVERTISE_100HALF ? "100H " : "", - hp->sw_advertise & ADVERTISE_100FULL ? 
"100F " : ""); - - /* Enable Auto-Negotiation, this is usually on already... */ - hp->sw_bmcr |= BMCR_ANENABLE; - happy_meal_tcvr_write(hp, tregs, MII_BMCR, hp->sw_bmcr); - - /* Restart it to make sure it is going. */ - hp->sw_bmcr |= BMCR_ANRESTART; - happy_meal_tcvr_write(hp, tregs, MII_BMCR, hp->sw_bmcr); - - /* BMCR_ANRESTART self clears when the process has begun. */ - - timeout = 64; /* More than enough. */ - while (--timeout) { - hp->sw_bmcr = happy_meal_tcvr_read(hp, tregs, MII_BMCR); - if (!(hp->sw_bmcr & BMCR_ANRESTART)) - break; /* got it. */ - udelay(10); - } - if (!timeout) { - netdev_err(hp->dev, - "Happy Meal would not start auto negotiation BMCR=0x%04x\n", - hp->sw_bmcr); - netdev_notice(hp->dev, - "Performing force link detection.\n"); - goto force_link; - } else { - hp->timer_state = arbwait; - } - } else { -force_link: - /* Force the link up, trying first a particular mode. - * Either we are here at the request of ethtool or - * because the Happy Meal would not start to autoneg. - */ - - /* Disable auto-negotiation in BMCR, enable the duplex and - * speed setting, init the timer state machine, and fire it off. - */ - if (!ep || ep->base.autoneg == AUTONEG_ENABLE) { - hp->sw_bmcr = BMCR_SPEED100; - } else { - if (ep->base.speed == SPEED_100) - hp->sw_bmcr = BMCR_SPEED100; - else - hp->sw_bmcr = 0; - if (ep->base.duplex == DUPLEX_FULL) - hp->sw_bmcr |= BMCR_FULLDPLX; - } - happy_meal_tcvr_write(hp, tregs, MII_BMCR, hp->sw_bmcr); - - if (!is_lucent_phy(hp)) { - /* OK, seems we need do disable the transceiver for the first - * tick to make sure we get an accurate link state at the - * second tick. - */ - hp->sw_csconfig = happy_meal_tcvr_read(hp, tregs, - DP83840_CSCONFIG); - hp->sw_csconfig &= ~(CSCONFIG_TCVDISAB); - happy_meal_tcvr_write(hp, tregs, DP83840_CSCONFIG, - hp->sw_csconfig); - } - hp->timer_state = ltrywait; - } - - hp->timer_ticks = 0; - hp->happy_timer.expires = jiffies + (12 * HZ)/10; /* 1.2 sec. 
*/ - add_timer(&hp->happy_timer); -} - -/* hp->happy_lock must be held */ static int happy_meal_init(struct happy_meal *hp) { const unsigned char *e = &hp->dev->dev_addr[0]; @@ -1341,10 +1273,6 @@ static int happy_meal_init(struct happy_meal *hp) happy_meal_get_counters(hp, bregs); } - /* Stop polling. */ - HMD("to happy_meal_poll_stop\n"); - happy_meal_poll_stop(hp, tregs); - /* Stop transmitter and receiver. */ HMD("to happy_meal_stop\n"); happy_meal_stop(hp, gregs); @@ -1353,11 +1281,6 @@ static int happy_meal_init(struct happy_meal *hp) HMD("to happy_meal_init_rings\n"); happy_meal_init_rings(hp); - /* Shut up the MIF. */ - HMD("Disable all MIF irqs (old[%08x])\n", - hme_read32(hp, tregs + TCVR_IMASK)); - hme_write32(hp, tregs + TCVR_IMASK, 0xffff); - /* See if we can enable the MIF frame on this card to speak to the DP83840. */ if (hp->happy_flags & HFLAG_FENABLE) { HMD("use frame old[%08x]\n", @@ -1612,7 +1535,6 @@ static void happy_meal_set_initial_advertisement(struct happy_meal *hp) void __iomem *gregs = hp->gregs; happy_meal_stop(hp, gregs); - hme_write32(hp, tregs + TCVR_IMASK, 0xffff); if (hp->happy_flags & HFLAG_FENABLE) hme_write32(hp, tregs + TCVR_CFG, hme_read32(hp, tregs + TCVR_CFG) & ~(TCV_CFG_BENABLE)); @@ -1770,34 +1692,6 @@ static int happy_meal_is_not_so_happy(struct happy_meal *hp, u32 status) } /* hp->happy_lock must be held */ -static void happy_meal_mif_interrupt(struct happy_meal *hp) -{ - void __iomem *tregs = hp->tcvregs; - - netdev_info(hp->dev, "Link status change.\n"); - hp->sw_bmcr = happy_meal_tcvr_read(hp, tregs, MII_BMCR); - hp->sw_lpa = happy_meal_tcvr_read(hp, tregs, MII_LPA); - - /* Use the fastest transmission protocol possible. 
*/ - if (hp->sw_lpa & LPA_100FULL) { - netdev_info(hp->dev, "Switching to 100Mbps at full duplex.\n"); - hp->sw_bmcr |= (BMCR_FULLDPLX | BMCR_SPEED100); - } else if (hp->sw_lpa & LPA_100HALF) { - netdev_info(hp->dev, "Switching to 100MBps at half duplex.\n"); - hp->sw_bmcr |= BMCR_SPEED100; - } else if (hp->sw_lpa & LPA_10FULL) { - netdev_info(hp->dev, "Switching to 10MBps at full duplex.\n"); - hp->sw_bmcr |= BMCR_FULLDPLX; - } else { - netdev_info(hp->dev, "Using 10Mbps at half duplex.\n"); - } - happy_meal_tcvr_write(hp, tregs, MII_BMCR, hp->sw_bmcr); - - /* Finally stop polling and shut up the MIF. */ - happy_meal_poll_stop(hp, tregs); -} - -/* hp->happy_lock must be held */ static void happy_meal_tx(struct happy_meal *hp) { struct happy_meal_txd *txbase = &hp->happy_block->happy_meal_txd[0]; @@ -1972,6 +1866,8 @@ static irqreturn_t happy_meal_interrupt(int irq, void *dev_id) u32 happy_status = hme_read32(hp, hp->gregs + GREG_STAT); HMD("status=%08x\n", happy_status); + if (!happy_status) + return IRQ_NONE; spin_lock(&hp->happy_lock); @@ -1980,9 +1876,6 @@ static irqreturn_t happy_meal_interrupt(int irq, void *dev_id) goto out; } - if (happy_status & GREG_STAT_MIFIRQ) - happy_meal_mif_interrupt(hp); - if (happy_status & GREG_STAT_TXALL) happy_meal_tx(hp); @@ -1996,66 +1889,16 @@ out: return IRQ_HANDLED; } -#ifdef CONFIG_SBUS -static irqreturn_t quattro_sbus_interrupt(int irq, void *cookie) -{ - struct quattro *qp = (struct quattro *) cookie; - int i; - - for (i = 0; i < 4; i++) { - struct net_device *dev = qp->happy_meals[i]; - struct happy_meal *hp = netdev_priv(dev); - u32 happy_status = hme_read32(hp, hp->gregs + GREG_STAT); - - HMD("status=%08x\n", happy_status); - - if (!(happy_status & (GREG_STAT_ERRORS | - GREG_STAT_MIFIRQ | - GREG_STAT_TXALL | - GREG_STAT_RXTOHOST))) - continue; - - spin_lock(&hp->happy_lock); - - if (happy_status & GREG_STAT_ERRORS) - if (happy_meal_is_not_so_happy(hp, happy_status)) - goto next; - - if (happy_status & 
GREG_STAT_MIFIRQ) - happy_meal_mif_interrupt(hp); - - if (happy_status & GREG_STAT_TXALL) - happy_meal_tx(hp); - - if (happy_status & GREG_STAT_RXTOHOST) - happy_meal_rx(hp, dev); - - next: - spin_unlock(&hp->happy_lock); - } - HMD("done\n"); - - return IRQ_HANDLED; -} -#endif - static int happy_meal_open(struct net_device *dev) { struct happy_meal *hp = netdev_priv(dev); int res; - /* On SBUS Quattro QFE cards, all hme interrupts are concentrated - * into a single source which we register handling at probe time. - */ - if ((hp->happy_flags & (HFLAG_QUATTRO|HFLAG_PCI)) != HFLAG_QUATTRO) { - res = request_irq(hp->irq, happy_meal_interrupt, IRQF_SHARED, - dev->name, dev); - if (res) { - HMD("EAGAIN\n"); - netdev_err(dev, "Can't order irq %d to go.\n", hp->irq); - - return -EAGAIN; - } + res = request_irq(hp->irq, happy_meal_interrupt, IRQF_SHARED, + dev->name, dev); + if (res) { + netdev_err(dev, "Can't order irq %d to go.\n", hp->irq); + return res; } HMD("to happy_meal_init\n"); @@ -2064,7 +1907,7 @@ static int happy_meal_open(struct net_device *dev) res = happy_meal_init(hp); spin_unlock_irq(&hp->happy_lock); - if (res && ((hp->happy_flags & (HFLAG_QUATTRO|HFLAG_PCI)) != HFLAG_QUATTRO)) + if (res) free_irq(hp->irq, dev); return res; } @@ -2082,12 +1925,7 @@ static int happy_meal_close(struct net_device *dev) spin_unlock_irq(&hp->happy_lock); - /* On Quattro QFE cards, all hme interrupts are concentrated - * into a single source which we register handling at probe - * time and never unregister. 
- */ - if ((hp->happy_flags & (HFLAG_QUATTRO|HFLAG_PCI)) != HFLAG_QUATTRO) - free_irq(hp->irq, dev); + free_irq(hp->irq, dev); return 0; } @@ -2420,59 +2258,6 @@ static struct quattro *quattro_sbus_find(struct platform_device *child) platform_set_drvdata(op, qp); return qp; } - -/* After all quattro cards have been probed, we call these functions - * to register the IRQ handlers for the cards that have been - * successfully probed and skip the cards that failed to initialize - */ -static int __init quattro_sbus_register_irqs(void) -{ - struct quattro *qp; - - for (qp = qfe_sbus_list; qp != NULL; qp = qp->next) { - struct platform_device *op = qp->quattro_dev; - int err, qfe_slot, skip = 0; - - for (qfe_slot = 0; qfe_slot < 4; qfe_slot++) { - if (!qp->happy_meals[qfe_slot]) - skip = 1; - } - if (skip) - continue; - - err = request_irq(op->archdata.irqs[0], - quattro_sbus_interrupt, - IRQF_SHARED, "Quattro", - qp); - if (err != 0) { - dev_err(&op->dev, - "Quattro HME: IRQ registration error %d.\n", - err); - return err; - } - } - - return 0; -} - -static void quattro_sbus_free_irqs(void) -{ - struct quattro *qp; - - for (qp = qfe_sbus_list; qp != NULL; qp = qp->next) { - struct platform_device *op = qp->quattro_dev; - int qfe_slot, skip = 0; - - for (qfe_slot = 0; qfe_slot < 4; qfe_slot++) { - if (!qp->happy_meals[qfe_slot]) - skip = 1; - } - if (skip) - continue; - - free_irq(op->archdata.irqs[0], qp); - } -} #endif /* CONFIG_SBUS */ #ifdef CONFIG_PCI @@ -2520,6 +2305,184 @@ static const struct net_device_ops hme_netdev_ops = { .ndo_validate_addr = eth_validate_addr, }; +#ifdef CONFIG_PCI +static int is_quattro_p(struct pci_dev *pdev) +{ + struct pci_dev *busdev = pdev->bus->self; + struct pci_dev *this_pdev; + int n_hmes; + + if (!busdev || busdev->vendor != PCI_VENDOR_ID_DEC || + busdev->device != PCI_DEVICE_ID_DEC_21153) + return 0; + + n_hmes = 0; + list_for_each_entry(this_pdev, &pdev->bus->devices, bus_list) { + if (this_pdev->vendor == PCI_VENDOR_ID_SUN && + 
this_pdev->device == PCI_DEVICE_ID_SUN_HAPPYMEAL) + n_hmes++; + } + + if (n_hmes != 4) + return 0; + + return 1; +} + +/* Fetch MAC address from vital product data of PCI ROM. */ +static int find_eth_addr_in_vpd(void __iomem *rom_base, int len, int index, unsigned char *dev_addr) +{ + int this_offset; + + for (this_offset = 0x20; this_offset < len; this_offset++) { + void __iomem *p = rom_base + this_offset; + + if (readb(p + 0) != 0x90 || + readb(p + 1) != 0x00 || + readb(p + 2) != 0x09 || + readb(p + 3) != 0x4e || + readb(p + 4) != 0x41 || + readb(p + 5) != 0x06) + continue; + + this_offset += 6; + p += 6; + + if (index == 0) { + for (int i = 0; i < 6; i++) + dev_addr[i] = readb(p + i); + return 1; + } + index--; + } + return 0; +} + +static void __maybe_unused get_hme_mac_nonsparc(struct pci_dev *pdev, + unsigned char *dev_addr) +{ + void __iomem *p; + size_t size; + + p = pci_map_rom(pdev, &size); + if (p) { + int index = 0; + int found; + + if (is_quattro_p(pdev)) + index = PCI_SLOT(pdev->devfn); + + found = readb(p) == 0x55 && + readb(p + 1) == 0xaa && + find_eth_addr_in_vpd(p, (64 * 1024), index, dev_addr); + pci_unmap_rom(pdev, p); + if (found) + return; + } + + /* Sun MAC prefix then 3 random bytes. */ + dev_addr[0] = 0x08; + dev_addr[1] = 0x00; + dev_addr[2] = 0x20; + get_random_bytes(&dev_addr[3], 3); +} +#endif + +static void happy_meal_addr_init(struct happy_meal *hp, + struct device_node *dp, int qfe_slot) +{ + int i; + + for (i = 0; i < 6; i++) { + if (macaddr[i] != 0) + break; + } + + if (i < 6) { /* a mac address was given */ + u8 addr[ETH_ALEN]; + + for (i = 0; i < 6; i++) + addr[i] = macaddr[i]; + eth_hw_addr_set(hp->dev, addr); + macaddr[5]++; + } else { +#ifdef CONFIG_SPARC + const unsigned char *addr; + int len; + + /* If user did not specify a MAC address specifically, use + * the Quattro local-mac-address property... 
+ */ + if (qfe_slot != -1) { + addr = of_get_property(dp, "local-mac-address", &len); + if (addr && len == 6) { + eth_hw_addr_set(hp->dev, addr); + return; + } + } + + eth_hw_addr_set(hp->dev, idprom->id_ethaddr); +#else + u8 addr[ETH_ALEN]; + + get_hme_mac_nonsparc(hp->happy_dev, addr); + eth_hw_addr_set(hp->dev, addr); +#endif + } +} + +static int happy_meal_common_probe(struct happy_meal *hp, + struct device_node *dp) +{ + struct net_device *dev = hp->dev; + int err; + +#ifdef CONFIG_SPARC + hp->hm_revision = of_getintprop_default(dp, "hm-rev", hp->hm_revision); +#endif + + /* Now enable the feature flags we can. */ + if (hp->hm_revision == 0x20 || hp->hm_revision == 0x21) + hp->happy_flags |= HFLAG_20_21; + else if (hp->hm_revision != 0xa0) + hp->happy_flags |= HFLAG_NOT_A0; + + hp->happy_block = dmam_alloc_coherent(hp->dma_dev, PAGE_SIZE, + &hp->hblock_dvma, GFP_KERNEL); + if (!hp->happy_block) + return -ENOMEM; + + /* Force check of the link first time we are brought up. */ + hp->linkcheck = 0; + + /* Force timer state to 'asleep' with count of zero. */ + hp->timer_state = asleep; + hp->timer_ticks = 0; + + timer_setup(&hp->happy_timer, happy_meal_timer, 0); + + dev->netdev_ops = &hme_netdev_ops; + dev->watchdog_timeo = 5 * HZ; + dev->ethtool_ops = &hme_ethtool_ops; + + /* Happy Meal can do it all... */ + dev->hw_features = NETIF_F_SG | NETIF_F_HW_CSUM; + dev->features |= dev->hw_features | NETIF_F_RXCSUM; + + + /* Grrr, Happy Meal comes up by default not advertising + * full duplex 100baseT capabilities, fix this. 
+ */ + spin_lock_irq(&hp->happy_lock); + happy_meal_set_initial_advertisement(hp); + spin_unlock_irq(&hp->happy_lock); + + err = devm_register_netdev(hp->dma_dev, dev); + if (err) + dev_err(hp->dma_dev, "Cannot register net device, aborting.\n"); + return err; +} + #ifdef CONFIG_SBUS static int happy_meal_sbus_probe_one(struct platform_device *op, int is_qfe) { @@ -2527,152 +2490,92 @@ static int happy_meal_sbus_probe_one(struct platform_device *op, int is_qfe) struct quattro *qp = NULL; struct happy_meal *hp; struct net_device *dev; - int i, qfe_slot = -1; - u8 addr[ETH_ALEN]; - int err = -ENODEV; + int qfe_slot = -1; + int err; sbus_dp = op->dev.parent->of_node; /* We can match PCI devices too, do not accept those here. */ if (!of_node_name_eq(sbus_dp, "sbus") && !of_node_name_eq(sbus_dp, "sbi")) - return err; + return -ENODEV; if (is_qfe) { qp = quattro_sbus_find(op); if (qp == NULL) - goto err_out; + return -ENODEV; for (qfe_slot = 0; qfe_slot < 4; qfe_slot++) if (qp->happy_meals[qfe_slot] == NULL) break; if (qfe_slot == 4) - goto err_out; + return -ENODEV; } - err = -ENOMEM; - dev = alloc_etherdev(sizeof(struct happy_meal)); + dev = devm_alloc_etherdev(&op->dev, sizeof(struct happy_meal)); if (!dev) - goto err_out; + return -ENOMEM; SET_NETDEV_DEV(dev, &op->dev); - /* If user did not specify a MAC address specifically, use - * the Quattro local-mac-address property... 
- */ - for (i = 0; i < 6; i++) { - if (macaddr[i] != 0) - break; - } - if (i < 6) { /* a mac address was given */ - for (i = 0; i < 6; i++) - addr[i] = macaddr[i]; - eth_hw_addr_set(dev, addr); - macaddr[5]++; - } else { - const unsigned char *addr; - int len; - - addr = of_get_property(dp, "local-mac-address", &len); - - if (qfe_slot != -1 && addr && len == ETH_ALEN) - eth_hw_addr_set(dev, addr); - else - eth_hw_addr_set(dev, idprom->id_ethaddr); - } - hp = netdev_priv(dev); - + hp->dev = dev; hp->happy_dev = op; hp->dma_dev = &op->dev; + happy_meal_addr_init(hp, dp, qfe_slot); spin_lock_init(&hp->happy_lock); - err = -ENODEV; if (qp != NULL) { hp->qfe_parent = qp; hp->qfe_ent = qfe_slot; qp->happy_meals[qfe_slot] = dev; } - hp->gregs = of_ioremap(&op->resource[0], 0, - GREG_REG_SIZE, "HME Global Regs"); - if (!hp->gregs) { + hp->gregs = devm_platform_ioremap_resource(op, 0); + if (IS_ERR(hp->gregs)) { dev_err(&op->dev, "Cannot map global registers.\n"); - goto err_out_free_netdev; + err = PTR_ERR(hp->gregs); + goto err_out_clear_quattro; } - hp->etxregs = of_ioremap(&op->resource[1], 0, - ETX_REG_SIZE, "HME TX Regs"); - if (!hp->etxregs) { + hp->etxregs = devm_platform_ioremap_resource(op, 1); + if (IS_ERR(hp->etxregs)) { dev_err(&op->dev, "Cannot map MAC TX registers.\n"); - goto err_out_iounmap; + err = PTR_ERR(hp->etxregs); + goto err_out_clear_quattro; } - hp->erxregs = of_ioremap(&op->resource[2], 0, - ERX_REG_SIZE, "HME RX Regs"); - if (!hp->erxregs) { + hp->erxregs = devm_platform_ioremap_resource(op, 2); + if (IS_ERR(hp->erxregs)) { dev_err(&op->dev, "Cannot map MAC RX registers.\n"); - goto err_out_iounmap; + err = PTR_ERR(hp->erxregs); + goto err_out_clear_quattro; } - hp->bigmacregs = of_ioremap(&op->resource[3], 0, - BMAC_REG_SIZE, "HME BIGMAC Regs"); - if (!hp->bigmacregs) { + hp->bigmacregs = devm_platform_ioremap_resource(op, 3); + if (IS_ERR(hp->bigmacregs)) { dev_err(&op->dev, "Cannot map BIGMAC registers.\n"); - goto err_out_iounmap; + err = 
PTR_ERR(hp->bigmacregs); + goto err_out_clear_quattro; } - hp->tcvregs = of_ioremap(&op->resource[4], 0, - TCVR_REG_SIZE, "HME Tranceiver Regs"); - if (!hp->tcvregs) { + hp->tcvregs = devm_platform_ioremap_resource(op, 4); + if (IS_ERR(hp->tcvregs)) { dev_err(&op->dev, "Cannot map TCVR registers.\n"); - goto err_out_iounmap; + err = PTR_ERR(hp->tcvregs); + goto err_out_clear_quattro; } - hp->hm_revision = of_getintprop_default(dp, "hm-rev", 0xff); - if (hp->hm_revision == 0xff) - hp->hm_revision = 0xa0; - - /* Now enable the feature flags we can. */ - if (hp->hm_revision == 0x20 || hp->hm_revision == 0x21) - hp->happy_flags = HFLAG_20_21; - else if (hp->hm_revision != 0xa0) - hp->happy_flags = HFLAG_NOT_A0; + hp->hm_revision = 0xa0; if (qp != NULL) hp->happy_flags |= HFLAG_QUATTRO; + hp->irq = op->archdata.irqs[0]; + /* Get the supported DVMA burst sizes from our Happy SBUS. */ hp->happy_bursts = of_getintprop_default(sbus_dp, "burst-sizes", 0x00); - hp->happy_block = dma_alloc_coherent(hp->dma_dev, - PAGE_SIZE, - &hp->hblock_dvma, - GFP_ATOMIC); - err = -ENOMEM; - if (!hp->happy_block) - goto err_out_iounmap; - - /* Force check of the link first time we are brought up. */ - hp->linkcheck = 0; - - /* Force timer state to 'asleep' with count of zero. */ - hp->timer_state = asleep; - hp->timer_ticks = 0; - - timer_setup(&hp->happy_timer, happy_meal_timer, 0); - - hp->dev = dev; - dev->netdev_ops = &hme_netdev_ops; - dev->watchdog_timeo = 5*HZ; - dev->ethtool_ops = &hme_ethtool_ops; - - /* Happy Meal can do it all... */ - dev->hw_features = NETIF_F_SG | NETIF_F_HW_CSUM; - dev->features |= dev->hw_features | NETIF_F_RXCSUM; - - hp->irq = op->archdata.irqs[0]; - -#if defined(CONFIG_SBUS) && defined(CONFIG_PCI) +#ifdef CONFIG_PCI /* Hook up SBUS register/descriptor accessors. 
*/ hp->read_desc32 = sbus_hme_read_desc32; hp->write_txd = sbus_hme_write_txd; @@ -2681,18 +2584,9 @@ static int happy_meal_sbus_probe_one(struct platform_device *op, int is_qfe) hp->write32 = sbus_hme_write32; #endif - /* Grrr, Happy Meal comes up by default not advertising - * full duplex 100baseT capabilities, fix this. - */ - spin_lock_irq(&hp->happy_lock); - happy_meal_set_initial_advertisement(hp); - spin_unlock_irq(&hp->happy_lock); - - err = register_netdev(hp->dev); - if (err) { - dev_err(&op->dev, "Cannot register net device, aborting.\n"); - goto err_out_free_coherent; - } + err = happy_meal_common_probe(hp, dp); + if (err) + goto err_out_clear_quattro; platform_set_drvdata(op, hp); @@ -2706,135 +2600,26 @@ static int happy_meal_sbus_probe_one(struct platform_device *op, int is_qfe) return 0; -err_out_free_coherent: - dma_free_coherent(hp->dma_dev, - PAGE_SIZE, - hp->happy_block, - hp->hblock_dvma); - -err_out_iounmap: - if (hp->gregs) - of_iounmap(&op->resource[0], hp->gregs, GREG_REG_SIZE); - if (hp->etxregs) - of_iounmap(&op->resource[1], hp->etxregs, ETX_REG_SIZE); - if (hp->erxregs) - of_iounmap(&op->resource[2], hp->erxregs, ERX_REG_SIZE); - if (hp->bigmacregs) - of_iounmap(&op->resource[3], hp->bigmacregs, BMAC_REG_SIZE); - if (hp->tcvregs) - of_iounmap(&op->resource[4], hp->tcvregs, TCVR_REG_SIZE); - +err_out_clear_quattro: if (qp) qp->happy_meals[qfe_slot] = NULL; - -err_out_free_netdev: - free_netdev(dev); - -err_out: return err; } #endif #ifdef CONFIG_PCI -#ifndef CONFIG_SPARC -static int is_quattro_p(struct pci_dev *pdev) -{ - struct pci_dev *busdev = pdev->bus->self; - struct pci_dev *this_pdev; - int n_hmes; - - if (busdev == NULL || - busdev->vendor != PCI_VENDOR_ID_DEC || - busdev->device != PCI_DEVICE_ID_DEC_21153) - return 0; - - n_hmes = 0; - list_for_each_entry(this_pdev, &pdev->bus->devices, bus_list) { - if (this_pdev->vendor == PCI_VENDOR_ID_SUN && - this_pdev->device == PCI_DEVICE_ID_SUN_HAPPYMEAL) - n_hmes++; - } - - if (n_hmes 
!= 4) - return 0; - - return 1; -} - -/* Fetch MAC address from vital product data of PCI ROM. */ -static int find_eth_addr_in_vpd(void __iomem *rom_base, int len, int index, unsigned char *dev_addr) -{ - int this_offset; - - for (this_offset = 0x20; this_offset < len; this_offset++) { - void __iomem *p = rom_base + this_offset; - - if (readb(p + 0) != 0x90 || - readb(p + 1) != 0x00 || - readb(p + 2) != 0x09 || - readb(p + 3) != 0x4e || - readb(p + 4) != 0x41 || - readb(p + 5) != 0x06) - continue; - - this_offset += 6; - p += 6; - - if (index == 0) { - int i; - - for (i = 0; i < 6; i++) - dev_addr[i] = readb(p + i); - return 1; - } - index--; - } - return 0; -} - -static void get_hme_mac_nonsparc(struct pci_dev *pdev, unsigned char *dev_addr) -{ - size_t size; - void __iomem *p = pci_map_rom(pdev, &size); - - if (p) { - int index = 0; - int found; - - if (is_quattro_p(pdev)) - index = PCI_SLOT(pdev->devfn); - - found = readb(p) == 0x55 && - readb(p + 1) == 0xaa && - find_eth_addr_in_vpd(p, (64 * 1024), index, dev_addr); - pci_unmap_rom(pdev, p); - if (found) - return; - } - - /* Sun MAC prefix then 3 random bytes. */ - dev_addr[0] = 0x08; - dev_addr[1] = 0x00; - dev_addr[2] = 0x20; - get_random_bytes(&dev_addr[3], 3); -} -#endif /* !(CONFIG_SPARC) */ - static int happy_meal_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent) { + struct device_node *dp = NULL; struct quattro *qp = NULL; -#ifdef CONFIG_SPARC - struct device_node *dp; -#endif struct happy_meal *hp; struct net_device *dev; void __iomem *hpreg_base; struct resource *hpreg_res; - int i, qfe_slot = -1; char prom_name[64]; - u8 addr[ETH_ALEN]; - int err; + int qfe_slot = -1; + int err = -ENODEV; /* Now make sure pci_dev cookie is there. 
*/ #ifdef CONFIG_SPARC @@ -2849,33 +2634,29 @@ static int happy_meal_pci_probe(struct pci_dev *pdev, err = pcim_enable_device(pdev); if (err) - goto err_out; + return err; pci_set_master(pdev); if (!strcmp(prom_name, "SUNW,qfe") || !strcmp(prom_name, "qfe")) { qp = quattro_pci_find(pdev); - if (IS_ERR(qp)) { - err = PTR_ERR(qp); - goto err_out; - } + if (IS_ERR(qp)) + return PTR_ERR(qp); for (qfe_slot = 0; qfe_slot < 4; qfe_slot++) if (!qp->happy_meals[qfe_slot]) break; if (qfe_slot == 4) - goto err_out; + return -ENODEV; } dev = devm_alloc_etherdev(&pdev->dev, sizeof(struct happy_meal)); - if (!dev) { - err = -ENOMEM; - goto err_out; - } + if (!dev) + return -ENOMEM; SET_NETDEV_DEV(dev, &pdev->dev); hp = netdev_priv(dev); - + hp->dev = dev; hp->happy_dev = pdev; hp->dma_dev = &pdev->dev; @@ -2911,35 +2692,7 @@ static int happy_meal_pci_probe(struct pci_dev *pdev, goto err_out_clear_quattro; } - for (i = 0; i < 6; i++) { - if (macaddr[i] != 0) - break; - } - if (i < 6) { /* a mac address was given */ - for (i = 0; i < 6; i++) - addr[i] = macaddr[i]; - eth_hw_addr_set(dev, addr); - macaddr[5]++; - } else { -#ifdef CONFIG_SPARC - const unsigned char *addr; - int len; - - if (qfe_slot != -1 && - (addr = of_get_property(dp, "local-mac-address", &len)) - != NULL && - len == 6) { - eth_hw_addr_set(dev, addr); - } else { - eth_hw_addr_set(dev, idprom->id_ethaddr); - } -#else - u8 addr[ETH_ALEN]; - - get_hme_mac_nonsparc(pdev, addr); - eth_hw_addr_set(dev, addr); -#endif - } + happy_meal_addr_init(hp, dp, qfe_slot); /* Layout registers. 
*/ hp->gregs = (hpreg_base + 0x0000UL); @@ -2948,20 +2701,10 @@ static int happy_meal_pci_probe(struct pci_dev *pdev, hp->bigmacregs = (hpreg_base + 0x6000UL); hp->tcvregs = (hpreg_base + 0x7000UL); -#ifdef CONFIG_SPARC - hp->hm_revision = of_getintprop_default(dp, "hm-rev", 0xff); - if (hp->hm_revision == 0xff) + if (IS_ENABLED(CONFIG_SPARC)) hp->hm_revision = 0xc0 | (pdev->revision & 0x0f); -#else - /* works with this on non-sparc hosts */ - hp->hm_revision = 0x20; -#endif - - /* Now enable the feature flags we can. */ - if (hp->hm_revision == 0x20 || hp->hm_revision == 0x21) - hp->happy_flags = HFLAG_20_21; - else if (hp->hm_revision != 0xa0 && hp->hm_revision != 0xc0) - hp->happy_flags = HFLAG_NOT_A0; + else + hp->hm_revision = 0x20; if (qp != NULL) hp->happy_flags |= HFLAG_QUATTRO; @@ -2973,31 +2716,9 @@ static int happy_meal_pci_probe(struct pci_dev *pdev, /* Assume PCI happy meals can handle all burst sizes. */ hp->happy_bursts = DMA_BURSTBITS; #endif - - hp->happy_block = dmam_alloc_coherent(&pdev->dev, PAGE_SIZE, - &hp->hblock_dvma, GFP_KERNEL); - if (!hp->happy_block) { - err = -ENOMEM; - goto err_out_clear_quattro; - } - - hp->linkcheck = 0; - hp->timer_state = asleep; - hp->timer_ticks = 0; - - timer_setup(&hp->happy_timer, happy_meal_timer, 0); - hp->irq = pdev->irq; - hp->dev = dev; - dev->netdev_ops = &hme_netdev_ops; - dev->watchdog_timeo = 5*HZ; - dev->ethtool_ops = &hme_ethtool_ops; - /* Happy Meal can do it all... */ - dev->hw_features = NETIF_F_SG | NETIF_F_HW_CSUM; - dev->features |= dev->hw_features | NETIF_F_RXCSUM; - -#if defined(CONFIG_SBUS) && defined(CONFIG_PCI) +#ifdef CONFIG_SBUS /* Hook up PCI register/descriptor accessors. */ hp->read_desc32 = pci_hme_read_desc32; hp->write_txd = pci_hme_write_txd; @@ -3006,18 +2727,9 @@ static int happy_meal_pci_probe(struct pci_dev *pdev, hp->write32 = pci_hme_write32; #endif - /* Grrr, Happy Meal comes up by default not advertising - * full duplex 100baseT capabilities, fix this. 
- */ - spin_lock_irq(&hp->happy_lock); - happy_meal_set_initial_advertisement(hp); - spin_unlock_irq(&hp->happy_lock); - - err = devm_register_netdev(&pdev->dev, dev); - if (err) { - dev_err(&pdev->dev, "Cannot register net device, aborting.\n"); + err = happy_meal_common_probe(hp, dp); + if (err) goto err_out_clear_quattro; - } pci_set_drvdata(pdev, hp); @@ -3048,8 +2760,6 @@ static int happy_meal_pci_probe(struct pci_dev *pdev, err_out_clear_quattro: if (qp != NULL) qp->happy_meals[qfe_slot] = NULL; - -err_out: return err; } @@ -3107,30 +2817,6 @@ static int hme_sbus_probe(struct platform_device *op) return happy_meal_sbus_probe_one(op, is_qfe); } -static int hme_sbus_remove(struct platform_device *op) -{ - struct happy_meal *hp = platform_get_drvdata(op); - struct net_device *net_dev = hp->dev; - - unregister_netdev(net_dev); - - /* XXX qfe parent interrupt... */ - - of_iounmap(&op->resource[0], hp->gregs, GREG_REG_SIZE); - of_iounmap(&op->resource[1], hp->etxregs, ETX_REG_SIZE); - of_iounmap(&op->resource[2], hp->erxregs, ERX_REG_SIZE); - of_iounmap(&op->resource[3], hp->bigmacregs, BMAC_REG_SIZE); - of_iounmap(&op->resource[4], hp->tcvregs, TCVR_REG_SIZE); - dma_free_coherent(hp->dma_dev, - PAGE_SIZE, - hp->happy_block, - hp->hblock_dvma); - - free_netdev(net_dev); - - return 0; -} - static const struct of_device_id hme_sbus_match[] = { { .name = "SUNW,hme", @@ -3154,24 +2840,16 @@ static struct platform_driver hme_sbus_driver = { .of_match_table = hme_sbus_match, }, .probe = hme_sbus_probe, - .remove = hme_sbus_remove, }; static int __init happy_meal_sbus_init(void) { - int err; - - err = platform_driver_register(&hme_sbus_driver); - if (!err) - err = quattro_sbus_register_irqs(); - - return err; + return platform_driver_register(&hme_sbus_driver); } static void happy_meal_sbus_exit(void) { platform_driver_unregister(&hme_sbus_driver); - quattro_sbus_free_irqs(); while (qfe_sbus_list) { struct quattro *qfe = qfe_sbus_list; diff --git 
a/drivers/net/ethernet/sun/sunhme.h b/drivers/net/ethernet/sun/sunhme.h index 9118c60c9426..258b4c7fe962 100644 --- a/drivers/net/ethernet/sun/sunhme.h +++ b/drivers/net/ethernet/sun/sunhme.h @@ -462,22 +462,20 @@ struct happy_meal { }; /* Here are the happy flags. */ -#define HFLAG_POLL 0x00000001 /* We are doing MIF polling */ #define HFLAG_FENABLE 0x00000002 /* The MII frame is enabled */ #define HFLAG_LANCE 0x00000004 /* We are using lance-mode */ #define HFLAG_RXENABLE 0x00000008 /* Receiver is enabled */ #define HFLAG_AUTO 0x00000010 /* Using auto-negotiation, 0 = force */ #define HFLAG_FULL 0x00000020 /* Full duplex enable */ #define HFLAG_MACFULL 0x00000040 /* Using full duplex in the MAC */ -#define HFLAG_POLLENABLE 0x00000080 /* Actually try MIF polling */ #define HFLAG_RXCV 0x00000100 /* XXX RXCV ENABLE */ #define HFLAG_INIT 0x00000200 /* Init called at least once */ #define HFLAG_LINKUP 0x00000400 /* 1 = Link is up */ #define HFLAG_PCI 0x00000800 /* PCI based Happy Meal */ #define HFLAG_QUATTRO 0x00001000 /* On QFE/Quattro card */ -#define HFLAG_20_21 (HFLAG_POLLENABLE | HFLAG_FENABLE) -#define HFLAG_NOT_A0 (HFLAG_POLLENABLE | HFLAG_FENABLE | HFLAG_LANCE | HFLAG_RXCV) +#define HFLAG_20_21 HFLAG_FENABLE +#define HFLAG_NOT_A0 (HFLAG_FENABLE | HFLAG_LANCE | HFLAG_RXCV) /* Support for QFE/Quattro cards. 
*/
 struct quattro {
diff --git a/drivers/net/ethernet/sunplus/spl2sw_phy.c b/drivers/net/ethernet/sunplus/spl2sw_phy.c
index 404f508a54d4..6f899e48f51d 100644
--- a/drivers/net/ethernet/sunplus/spl2sw_phy.c
+++ b/drivers/net/ethernet/sunplus/spl2sw_phy.c
@@ -84,9 +84,7 @@ void spl2sw_phy_remove(struct spl2sw_common *comm)
 	for (i = 0; i < MAX_NETDEV_NUM; i++)
 		if (comm->ndev[i]) {
 			ndev = comm->ndev[i];
-			if (ndev) {
+			if (ndev)
 				phy_disconnect(ndev->phydev);
-				ndev->phydev = NULL;
-			}
 		}
 }
diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
index bcea87b7151c..11cbcd9e2c72 100644
--- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c
+++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
@@ -76,6 +76,7 @@
 #define AM65_CPSW_PORTN_REG_TS_CTL_LTYPE2 0x31C
 #define AM65_CPSW_SGMII_CONTROL_REG 0x010
+#define AM65_CPSW_SGMII_MR_ADV_ABILITY_REG 0x018
 #define AM65_CPSW_SGMII_CONTROL_MR_AN_ENABLE BIT(0)
 #define AM65_CPSW_CTL_VLAN_AWARE BIT(1)
@@ -85,6 +86,7 @@
 /* AM65_CPSW_P0_REG_CTL */
 #define AM65_CPSW_P0_REG_CTL_RX_CHECKSUM_EN BIT(0)
+#define AM65_CPSW_P0_REG_CTL_RX_REMAP_VLAN BIT(16)
 /* AM65_CPSW_PORT_REG_PRI_CTL */
 #define AM65_CPSW_PORT_REG_PRI_CTL_RX_PTYPE_RROBIN BIT(8)
@@ -384,8 +386,8 @@ static int am65_cpsw_nuss_common_open(struct am65_cpsw_common *common)
 	/* set base flow_id */
 	writel(common->rx_flow_id_base, host_p->port_base + AM65_CPSW_PORT0_REG_FLOW_ID_OFFSET);
-	/* en tx crc offload */
-	writel(AM65_CPSW_P0_REG_CTL_RX_CHECKSUM_EN, host_p->port_base + AM65_CPSW_P0_REG_CTL);
+	writel(AM65_CPSW_P0_REG_CTL_RX_CHECKSUM_EN | AM65_CPSW_P0_REG_CTL_RX_REMAP_VLAN,
+	       host_p->port_base + AM65_CPSW_P0_REG_CTL);
 	am65_cpsw_nuss_set_p0_ptype(common);
@@ -427,6 +429,8 @@ static int am65_cpsw_nuss_common_open(struct am65_cpsw_common *common)
 	else
 		am65_cpsw_init_host_port_switch(common);
+	am65_cpsw_qos_tx_p0_rate_init(common);
+
 	for (i = 0; i < common->rx_chns.descs_num; i++) {
 		skb = __netdev_alloc_skb_ip_align(NULL, AM65_CPSW_MAX_PACKET_SIZE,
@@
-598,8 +602,12 @@ static int am65_cpsw_nuss_ndo_slave_open(struct net_device *ndev)
 		goto runtime_put;
 	}
-	for (i = 0; i < common->tx_ch_num; i++)
-		netdev_tx_reset_queue(netdev_get_tx_queue(ndev, i));
+	for (i = 0; i < common->tx_ch_num; i++) {
+		struct netdev_queue *txq = netdev_get_tx_queue(ndev, i);
+
+		netdev_tx_reset_queue(txq);
+		txq->tx_maxrate = common->tx_chns[i].rate_mbps;
+	}
 	ret = am65_cpsw_nuss_common_open(common);
 	if (ret)
@@ -1424,6 +1432,7 @@ static const struct net_device_ops am65_cpsw_nuss_netdev_ops = {
 	.ndo_vlan_rx_kill_vid = am65_cpsw_nuss_ndo_slave_kill_vid,
 	.ndo_eth_ioctl = am65_cpsw_nuss_ndo_slave_ioctl,
 	.ndo_setup_tc = am65_cpsw_qos_ndo_setup_tc,
+	.ndo_set_tx_maxrate = am65_cpsw_qos_ndo_tx_p0_set_maxrate,
 };
 static void am65_cpsw_disable_phy(struct phy *phy)
@@ -1466,15 +1475,13 @@ static void am65_cpsw_disable_serdes_phy(struct am65_cpsw_common *common)
 static int am65_cpsw_init_serdes_phy(struct device *dev, struct device_node *port_np, struct am65_cpsw_port *port)
 {
-	const char *name = "serdes-phy";
+	const char *name = "serdes";
 	struct phy *phy;
 	int ret;
-	phy = devm_of_phy_get(dev, port_np, name);
-	if (PTR_ERR(phy) == -ENODEV)
-		return 0;
-	if (IS_ERR(phy))
-		return PTR_ERR(phy);
+	phy = devm_of_phy_optional_get(dev, port_np, name);
+	if (IS_ERR_OR_NULL(phy))
+		return PTR_ERR_OR_ZERO(phy);
 	/* Serdes PHY exists. Store it.
*/
 	port->slave.serdes_phy = phy;
@@ -1498,9 +1505,26 @@ static void am65_cpsw_nuss_mac_config(struct phylink_config *config, unsigned in
 	struct am65_cpsw_port *port = container_of(slave, struct am65_cpsw_port, slave);
 	struct am65_cpsw_common *common = port->common;
-	if (common->pdata.extra_modes & BIT(state->interface))
+	if (common->pdata.extra_modes & BIT(state->interface)) {
+		if (state->interface == PHY_INTERFACE_MODE_SGMII) {
+			writel(ADVERTISE_SGMII,
+			       port->sgmii_base + AM65_CPSW_SGMII_MR_ADV_ABILITY_REG);
+			cpsw_sl_ctl_set(port->slave.mac_sl, CPSW_SL_CTL_EXT_EN);
+		} else {
+			cpsw_sl_ctl_clr(port->slave.mac_sl, CPSW_SL_CTL_EXT_EN);
+		}
+
+		if (state->interface == PHY_INTERFACE_MODE_USXGMII) {
+			cpsw_sl_ctl_set(port->slave.mac_sl,
+					CPSW_SL_CTL_XGIG | CPSW_SL_CTL_XGMII_EN);
+		} else {
+			cpsw_sl_ctl_clr(port->slave.mac_sl,
+					CPSW_SL_CTL_XGIG | CPSW_SL_CTL_XGMII_EN);
+		}
+
 		writel(AM65_CPSW_SGMII_CONTROL_MR_AN_ENABLE, port->sgmii_base + AM65_CPSW_SGMII_CONTROL_REG);
+	}
 }
 static void am65_cpsw_nuss_mac_link_down(struct phylink_config *config, unsigned int mode,
@@ -1511,6 +1535,7 @@ static void am65_cpsw_nuss_mac_link_down(struct phylink_config *config, unsigned
 	struct am65_cpsw_port *port = container_of(slave, struct am65_cpsw_port, slave);
 	struct am65_cpsw_common *common = port->common;
 	struct net_device *ndev = port->ndev;
+	u32 mac_control;
 	int tmo;
 	/* disable forwarding */
@@ -1522,7 +1547,14 @@ static void am65_cpsw_nuss_mac_link_down(struct phylink_config *config, unsigned
 	dev_dbg(common->dev, "down msc_sl %08x tmo %d\n", cpsw_sl_reg_read(port->slave.mac_sl, CPSW_SL_MACSTATUS), tmo);
-	cpsw_sl_ctl_reset(port->slave.mac_sl);
+	/* All the bits that am65_cpsw_nuss_mac_link_up() can possibly set */
+	mac_control = CPSW_SL_CTL_GMII_EN | CPSW_SL_CTL_GIG | CPSW_SL_CTL_IFCTL_A |
+		      CPSW_SL_CTL_FULLDUPLEX | CPSW_SL_CTL_RX_FLOW_EN | CPSW_SL_CTL_TX_FLOW_EN;
+	/* If interface mode is RGMII, CPSW_SL_CTL_EXT_EN might have been set for 10 Mbps */
+	if
(phy_interface_mode_is_rgmii(interface))
+		mac_control |= CPSW_SL_CTL_EXT_EN;
+	/* Only clear those bits that can be set by am65_cpsw_nuss_mac_link_up() */
+	cpsw_sl_ctl_clr(port->slave.mac_sl, mac_control);
 	am65_cpsw_qos_link_down(ndev);
 	netif_tx_stop_all_queues(ndev);
@@ -1539,8 +1571,12 @@ static void am65_cpsw_nuss_mac_link_up(struct phylink_config *config, struct phy
 	u32 mac_control = CPSW_SL_CTL_GMII_EN;
 	struct net_device *ndev = port->ndev;
+	/* Bring the port out of idle state */
+	cpsw_sl_ctl_clr(port->slave.mac_sl, CPSW_SL_CTL_CMD_IDLE);
+
 	if (speed == SPEED_1000)
 		mac_control |= CPSW_SL_CTL_GIG;
+	/* TODO: Verify whether in-band is necessary for 10 Mbps RGMII */
 	if (speed == SPEED_10 && phy_interface_mode_is_rgmii(interface))
 		/* Can be used with in band mode only */
 		mac_control |= CPSW_SL_CTL_EXT_EN;
@@ -1610,6 +1646,7 @@ void am65_cpsw_nuss_remove_tx_chns(struct am65_cpsw_common *common)
 	devm_remove_action(dev, am65_cpsw_nuss_free_tx_chns, common);
+	common->tx_ch_rate_msk = 0;
 	for (i = 0; i < common->tx_ch_num; i++) {
 		struct am65_cpsw_tx_chn *tx_chn = &common->tx_chns[i];
@@ -2142,18 +2179,36 @@ am65_cpsw_nuss_init_port_ndev(struct am65_cpsw_common *common, u32 port_idx)
 	/* Configuring Phylink */
 	port->slave.phylink_config.dev = &port->ndev->dev;
 	port->slave.phylink_config.type = PHYLINK_NETDEV;
-	port->slave.phylink_config.mac_capabilities = MAC_SYM_PAUSE | MAC_10 | MAC_100 | MAC_1000FD;
+	port->slave.phylink_config.mac_capabilities = MAC_SYM_PAUSE | MAC_10 | MAC_100 |
+						      MAC_1000FD | MAC_5000FD;
 	port->slave.phylink_config.mac_managed_pm = true; /* MAC does PM */
-	if (phy_interface_mode_is_rgmii(port->slave.phy_if)) {
+	switch (port->slave.phy_if) {
+	case PHY_INTERFACE_MODE_RGMII:
+	case PHY_INTERFACE_MODE_RGMII_ID:
+	case PHY_INTERFACE_MODE_RGMII_RXID:
+	case PHY_INTERFACE_MODE_RGMII_TXID:
 		phy_interface_set_rgmii(port->slave.phylink_config.supported_interfaces);
-	} else if (port->slave.phy_if == PHY_INTERFACE_MODE_RMII) {
+		break;
+
+	case
PHY_INTERFACE_MODE_RMII:
 		__set_bit(PHY_INTERFACE_MODE_RMII, port->slave.phylink_config.supported_interfaces);
-	} else if (common->pdata.extra_modes & BIT(port->slave.phy_if)) {
-		__set_bit(PHY_INTERFACE_MODE_QSGMII,
-			  port->slave.phylink_config.supported_interfaces);
-	} else {
+		break;
+
+	case PHY_INTERFACE_MODE_QSGMII:
+	case PHY_INTERFACE_MODE_SGMII:
+	case PHY_INTERFACE_MODE_USXGMII:
+		if (common->pdata.extra_modes & BIT(port->slave.phy_if)) {
+			__set_bit(port->slave.phy_if,
+				  port->slave.phylink_config.supported_interfaces);
+		} else {
+			dev_err(dev, "selected phy-mode is not supported\n");
+			return -EOPNOTSUPP;
+		}
+		break;
+
+	default:
 		dev_err(dev, "selected phy-mode is not supported\n");
 		return -EOPNOTSUPP;
 	}
@@ -2755,14 +2810,21 @@ static const struct am65_cpsw_pdata j7200_cpswxg_pdata = {
 	.quirks = 0,
 	.ale_dev_id = "am64-cpswxg",
 	.fdqring_mode = K3_RINGACC_RING_MODE_RING,
-	.extra_modes = BIT(PHY_INTERFACE_MODE_QSGMII),
+	.extra_modes = BIT(PHY_INTERFACE_MODE_QSGMII) | BIT(PHY_INTERFACE_MODE_SGMII),
 };
 static const struct am65_cpsw_pdata j721e_cpswxg_pdata = {
 	.quirks = 0,
 	.ale_dev_id = "am64-cpswxg",
 	.fdqring_mode = K3_RINGACC_RING_MODE_MESSAGE,
-	.extra_modes = BIT(PHY_INTERFACE_MODE_QSGMII),
+	.extra_modes = BIT(PHY_INTERFACE_MODE_QSGMII) | BIT(PHY_INTERFACE_MODE_SGMII),
+};
+
+static const struct am65_cpsw_pdata j784s4_cpswxg_pdata = {
+	.quirks = 0,
+	.ale_dev_id = "am64-cpswxg",
+	.fdqring_mode = K3_RINGACC_RING_MODE_MESSAGE,
+	.extra_modes = BIT(PHY_INTERFACE_MODE_QSGMII) | BIT(PHY_INTERFACE_MODE_USXGMII),
 };
 static const struct of_device_id am65_cpsw_nuss_of_mtable[] = {
@@ -2771,6 +2833,7 @@ static const struct of_device_id am65_cpsw_nuss_of_mtable[] = {
 	{ .compatible = "ti,am642-cpsw-nuss", .data = &am64x_cpswxg_pdata},
 	{ .compatible = "ti,j7200-cpswxg-nuss", .data = &j7200_cpswxg_pdata},
 	{ .compatible = "ti,j721e-cpswxg-nuss", .data = &j721e_cpswxg_pdata},
+	{ .compatible = "ti,j784s4-cpswxg-nuss", .data = &j784s4_cpswxg_pdata},
 	{ /* sentinel
*/ },
 };
 MODULE_DEVICE_TABLE(of, am65_cpsw_nuss_of_mtable);
diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.h b/drivers/net/ethernet/ti/am65-cpsw-nuss.h
index cad04662739c..bf40c88fbd9b 100644
--- a/drivers/net/ethernet/ti/am65-cpsw-nuss.h
+++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.h
@@ -79,6 +79,7 @@ struct am65_cpsw_tx_chn {
 	u32 id;
 	u32 descs_num;
 	char tx_chn_name[128];
+	u32 rate_mbps;
 };
 struct am65_cpsw_rx_chn {
@@ -126,6 +127,7 @@ struct am65_cpsw_common {
 	int usage_count; /* number of opened ports */
 	struct cpsw_ale *ale;
 	int tx_ch_num;
+	u32 tx_ch_rate_msk;
 	u32 rx_flow_id_base;
 	struct am65_cpsw_tx_chn tx_chns[AM65_CPSW_MAX_TX_QUEUES];
diff --git a/drivers/net/ethernet/ti/am65-cpsw-qos.c b/drivers/net/ethernet/ti/am65-cpsw-qos.c
index 8dc2c3085dcf..3a908db6e5b2 100644
--- a/drivers/net/ethernet/ti/am65-cpsw-qos.c
+++ b/drivers/net/ethernet/ti/am65-cpsw-qos.c
@@ -19,6 +19,7 @@
 #define AM65_CPSW_PN_REG_CTL 0x004
 #define AM65_CPSW_PN_REG_FIFO_STATUS 0x050
 #define AM65_CPSW_PN_REG_EST_CTL 0x060
+#define AM65_CPSW_PN_REG_PRI_CIR(pri) (0x140 + 4 * (pri))
 /* AM65_CPSW_REG_CTL register fields */
 #define AM65_CPSW_CTL_EST_EN BIT(18)
@@ -819,3 +820,115 @@ void am65_cpsw_qos_link_down(struct net_device *ndev)
 	port->qos.link_speed = SPEED_UNKNOWN;
 }
+
+static u32
+am65_cpsw_qos_tx_rate_calc(u32 rate_mbps, unsigned long bus_freq)
+{
+	u32 ir;
+
+	bus_freq /= 1000000;
+	ir = DIV_ROUND_UP(((u64)rate_mbps * 32768), bus_freq);
+	return ir;
+}
+
+static void
+am65_cpsw_qos_tx_p0_rate_apply(struct am65_cpsw_common *common,
+			       int tx_ch, u32 rate_mbps)
+{
+	struct am65_cpsw_host *host = am65_common_get_host(common);
+	u32 ch_cir;
+	int i;
+
+	ch_cir = am65_cpsw_qos_tx_rate_calc(rate_mbps, common->bus_freq);
+	writel(ch_cir, host->port_base + AM65_CPSW_PN_REG_PRI_CIR(tx_ch));
+
+	/* update rates for every port tx queues */
+	for (i = 0; i < common->port_num; i++) {
+		struct net_device *ndev = common->ports[i].ndev;
+
+		if (!ndev)
+			continue;
+		netdev_get_tx_queue(ndev,
tx_ch)->tx_maxrate = rate_mbps;
+	}
+}
+
+int am65_cpsw_qos_ndo_tx_p0_set_maxrate(struct net_device *ndev,
+					int queue, u32 rate_mbps)
+{
+	struct am65_cpsw_port *port = am65_ndev_to_port(ndev);
+	struct am65_cpsw_common *common = port->common;
+	struct am65_cpsw_tx_chn *tx_chn;
+	u32 ch_rate, tx_ch_rate_msk_new;
+	u32 ch_msk = 0;
+	int ret;
+
+	dev_dbg(common->dev, "apply TX%d rate limiting %uMbps tx_rate_msk%x\n",
+		queue, rate_mbps, common->tx_ch_rate_msk);
+
+	if (common->pf_p0_rx_ptype_rrobin) {
+		dev_err(common->dev, "TX Rate Limiting failed - rrobin mode\n");
+		return -EINVAL;
+	}
+
+	ch_rate = netdev_get_tx_queue(ndev, queue)->tx_maxrate;
+	if (ch_rate == rate_mbps)
+		return 0;
+
+	ret = pm_runtime_get_sync(common->dev);
+	if (ret < 0) {
+		pm_runtime_put_noidle(common->dev);
+		return ret;
+	}
+	ret = 0;
+
+	tx_ch_rate_msk_new = common->tx_ch_rate_msk;
+	if (rate_mbps && !(tx_ch_rate_msk_new & BIT(queue))) {
+		tx_ch_rate_msk_new |= BIT(queue);
+		ch_msk = GENMASK(common->tx_ch_num - 1, queue);
+		ch_msk = tx_ch_rate_msk_new ^ ch_msk;
+	} else if (!rate_mbps) {
+		tx_ch_rate_msk_new &= ~BIT(queue);
+		ch_msk = queue ?
GENMASK(queue - 1, 0) : 0;
+		ch_msk = tx_ch_rate_msk_new & ch_msk;
+	}
+
+	if (ch_msk) {
+		dev_err(common->dev, "TX rate limiting has to be enabled sequentially hi->lo tx_rate_msk:%x tx_rate_msk_new:%x\n",
+			common->tx_ch_rate_msk, tx_ch_rate_msk_new);
+		ret = -EINVAL;
+		goto exit_put;
+	}
+
+	tx_chn = &common->tx_chns[queue];
+	tx_chn->rate_mbps = rate_mbps;
+	common->tx_ch_rate_msk = tx_ch_rate_msk_new;
+
+	if (!common->usage_count)
+		/* will be applied on next netif up */
+		goto exit_put;
+
+	am65_cpsw_qos_tx_p0_rate_apply(common, queue, rate_mbps);
+
+exit_put:
+	pm_runtime_put(common->dev);
+	return ret;
+}
+
+void am65_cpsw_qos_tx_p0_rate_init(struct am65_cpsw_common *common)
+{
+	struct am65_cpsw_host *host = am65_common_get_host(common);
+	int tx_ch;
+
+	for (tx_ch = 0; tx_ch < common->tx_ch_num; tx_ch++) {
+		struct am65_cpsw_tx_chn *tx_chn = &common->tx_chns[tx_ch];
+		u32 ch_cir;
+
+		if (!tx_chn->rate_mbps)
+			continue;
+
+		ch_cir = am65_cpsw_qos_tx_rate_calc(tx_chn->rate_mbps,
+						    common->bus_freq);
+		writel(ch_cir,
+		       host->port_base + AM65_CPSW_PN_REG_PRI_CIR(tx_ch));
+	}
+}
diff --git a/drivers/net/ethernet/ti/am65-cpsw-qos.h b/drivers/net/ethernet/ti/am65-cpsw-qos.h
index fb223b43b196..0cc2a3b3d7f9 100644
--- a/drivers/net/ethernet/ti/am65-cpsw-qos.h
+++ b/drivers/net/ethernet/ti/am65-cpsw-qos.h
@@ -8,6 +8,8 @@
 #include <linux/netdevice.h>
 #include <net/pkt_sched.h>
+struct am65_cpsw_common;
+
 struct am65_cpsw_est {
 	int buf;
 	/* has to be the last one */
@@ -33,5 +35,7 @@ int am65_cpsw_qos_ndo_setup_tc(struct net_device *ndev, enum tc_setup_type type,
 			       void *type_data);
 void am65_cpsw_qos_link_up(struct net_device *ndev, int link_speed);
 void am65_cpsw_qos_link_down(struct net_device *ndev);
+int am65_cpsw_qos_ndo_tx_p0_set_maxrate(struct net_device *ndev, int queue, u32 rate_mbps);
+void am65_cpsw_qos_tx_p0_rate_init(struct am65_cpsw_common *common);
 #endif /* AM65_CPSW_QOS_H_ */
diff --git a/drivers/net/ethernet/ti/am65-cpts.c
b/drivers/net/ethernet/ti/am65-cpts.c
index 8caf85acbb6a..c66618d91c28 100644
--- a/drivers/net/ethernet/ti/am65-cpts.c
+++ b/drivers/net/ethernet/ti/am65-cpts.c
@@ -175,6 +175,7 @@ struct am65_cpts {
 	u64 timestamp;
 	u32 genf_enable;
 	u32 hw_ts_enable;
+	u32 estf_enable;
 	struct sk_buff_head txq;
 	bool pps_enabled;
 	bool pps_present;
@@ -405,13 +406,13 @@ static irqreturn_t am65_cpts_interrupt(int irq, void *dev_id)
 static int am65_cpts_ptp_adjfine(struct ptp_clock_info *ptp, long scaled_ppm)
 {
 	struct am65_cpts *cpts = container_of(ptp, struct am65_cpts, ptp_info);
-	u32 pps_ctrl_val = 0, pps_ppm_hi = 0, pps_ppm_low = 0;
+	u32 estf_ctrl_val = 0, estf_ppm_hi = 0, estf_ppm_low = 0;
 	s32 ppb = scaled_ppm_to_ppb(scaled_ppm);
 	int pps_index = cpts->pps_genf_idx;
 	u64 adj_period, pps_adj_period;
 	u32 ctrl_val, ppm_hi, ppm_low;
 	unsigned long flags;
-	int neg_adj = 0;
+	int neg_adj = 0, i;
 	if (ppb < 0) {
 		neg_adj = 1;
@@ -441,19 +442,19 @@ static int am65_cpts_ptp_adjfine(struct ptp_clock_info *ptp, long scaled_ppm)
 	ppm_low = lower_32_bits(adj_period);
 	if (cpts->pps_enabled) {
-		pps_ctrl_val = am65_cpts_read32(cpts, genf[pps_index].control);
+		estf_ctrl_val = am65_cpts_read32(cpts, genf[pps_index].control);
 		if (neg_adj)
-			pps_ctrl_val &= ~BIT(1);
+			estf_ctrl_val &= ~BIT(1);
 		else
-			pps_ctrl_val |= BIT(1);
+			estf_ctrl_val |= BIT(1);
 		/* GenF PPM will do correction using cpts refclk tick which is
 		 * (cpts->ts_add_val + 1) ns, so GenF length PPM adj period
 		 * need to be corrected.
*/
 		pps_adj_period = adj_period * (cpts->ts_add_val + 1);
-		pps_ppm_hi = upper_32_bits(pps_adj_period) & 0x3FF;
-		pps_ppm_low = lower_32_bits(pps_adj_period);
+		estf_ppm_hi = upper_32_bits(pps_adj_period) & 0x3FF;
+		estf_ppm_low = lower_32_bits(pps_adj_period);
 	}
 	spin_lock_irqsave(&cpts->lock, flags);
@@ -471,11 +472,18 @@ static int am65_cpts_ptp_adjfine(struct ptp_clock_info *ptp, long scaled_ppm)
 	am65_cpts_write32(cpts, ppm_low, ts_ppm_low);
 	if (cpts->pps_enabled) {
-		am65_cpts_write32(cpts, pps_ctrl_val, genf[pps_index].control);
-		am65_cpts_write32(cpts, pps_ppm_hi, genf[pps_index].ppm_hi);
-		am65_cpts_write32(cpts, pps_ppm_low, genf[pps_index].ppm_low);
+		am65_cpts_write32(cpts, estf_ctrl_val, genf[pps_index].control);
+		am65_cpts_write32(cpts, estf_ppm_hi, genf[pps_index].ppm_hi);
+		am65_cpts_write32(cpts, estf_ppm_low, genf[pps_index].ppm_low);
 	}
+	for (i = 0; i < AM65_CPTS_ESTF_MAX_NUM; i++) {
+		if (cpts->estf_enable & BIT(i)) {
+			am65_cpts_write32(cpts, estf_ctrl_val, estf[i].control);
+			am65_cpts_write32(cpts, estf_ppm_hi, estf[i].ppm_hi);
+			am65_cpts_write32(cpts, estf_ppm_low, estf[i].ppm_low);
+		}
+	}
 	/* All GenF/EstF can be updated here the same way */
 	spin_unlock_irqrestore(&cpts->lock, flags);
@@ -596,6 +604,11 @@ int am65_cpts_estf_enable(struct am65_cpts *cpts, int idx,
 	am65_cpts_write32(cpts, val, estf[idx].comp_lo);
 	val = lower_32_bits(cycles);
 	am65_cpts_write32(cpts, val, estf[idx].length);
+	am65_cpts_write32(cpts, 0, estf[idx].control);
+	am65_cpts_write32(cpts, 0, estf[idx].ppm_hi);
+	am65_cpts_write32(cpts, 0, estf[idx].ppm_low);
+
+	cpts->estf_enable |= BIT(idx);
 	dev_dbg(cpts->dev, "%s: ESTF:%u enabled\n", __func__, idx);
@@ -606,6 +619,7 @@ EXPORT_SYMBOL_GPL(am65_cpts_estf_enable);
 void am65_cpts_estf_disable(struct am65_cpts *cpts, int idx)
 {
 	am65_cpts_write32(cpts, 0, estf[idx].length);
+	cpts->estf_enable &= ~BIT(idx);
 	dev_dbg(cpts->dev, "%s: ESTF:%u disabled\n", __func__, idx);
 }
diff --git a/drivers/net/ethernet/ti/netcp_core.c
b/drivers/net/ethernet/ti/netcp_core.c
index 1bb596a9d8a2..d829113c16ee 100644
--- a/drivers/net/ethernet/ti/netcp_core.c
+++ b/drivers/net/ethernet/ti/netcp_core.c
@@ -2081,8 +2081,8 @@ static int netcp_create_interface(struct netcp_device *netcp_device,
 	netcp->tx_pool_region_id = temp[1];
 	if (netcp->tx_pool_size < MAX_SKB_FRAGS) {
-		dev_err(dev, "tx-pool size too small, must be at least %ld\n",
-			MAX_SKB_FRAGS);
+		dev_err(dev, "tx-pool size too small, must be at least %u\n",
+			(unsigned int)MAX_SKB_FRAGS);
 		ret = -ENODEV;
 		goto quit;
 	}
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_hw.c b/drivers/net/ethernet/wangxun/libwx/wx_hw.c
index 7db57f934a91..ca409b4054d0 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_hw.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_hw.c
@@ -4,6 +4,7 @@
 #include <linux/etherdevice.h>
 #include <linux/netdevice.h>
 #include <linux/if_ether.h>
+#include <linux/if_vlan.h>
 #include <linux/iopoll.h>
 #include <linux/pci.h>
@@ -1261,7 +1262,7 @@ static void wx_set_rx_buffer_len(struct wx *wx)
 	struct net_device *netdev = wx->netdev;
 	u32 mhadd, max_frame;
-	max_frame = netdev->mtu + ETH_HLEN + ETH_FCS_LEN;
+	max_frame = netdev->mtu + ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN;
 	/* adjust max frame to be at least the size of a standard frame */
 	if (max_frame < (ETH_FRAME_LEN + ETH_FCS_LEN))
 		max_frame = (ETH_FRAME_LEN + ETH_FCS_LEN);
@@ -1271,6 +1272,24 @@ static void wx_set_rx_buffer_len(struct wx *wx)
 	wr32(wx, WX_PSR_MAX_SZ, max_frame);
 }
+/**
+ * wx_change_mtu - Change the Maximum Transfer Unit
+ * @netdev: network interface device structure
+ * @new_mtu: new value for maximum frame size
+ *
+ * Returns 0 on success, negative on failure
+ **/
+int wx_change_mtu(struct net_device *netdev, int new_mtu)
+{
+	struct wx *wx = netdev_priv(netdev);
+
+	netdev->mtu = new_mtu;
+	wx_set_rx_buffer_len(wx);
+
+	return 0;
+}
+EXPORT_SYMBOL(wx_change_mtu);
+
 /* Disable the specified rx queue */
 void wx_disable_rx_queue(struct wx *wx, struct wx_ring *ring)
 {
diff
--git a/drivers/net/ethernet/wangxun/libwx/wx_hw.h b/drivers/net/ethernet/wangxun/libwx/wx_hw.h
index 44dfd6ea442a..c173c56f0ab5 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_hw.h
+++ b/drivers/net/ethernet/wangxun/libwx/wx_hw.h
@@ -23,6 +23,7 @@ void wx_flush_sw_mac_table(struct wx *wx);
 int wx_set_mac(struct net_device *netdev, void *p);
 void wx_disable_rx(struct wx *wx);
 void wx_set_rx_mode(struct net_device *netdev);
+int wx_change_mtu(struct net_device *netdev, int new_mtu);
 void wx_disable_rx_queue(struct wx *wx, struct wx_ring *ring);
 void wx_configure(struct wx *wx);
 int wx_disable_pcie_master(struct wx *wx);
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_lib.c b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
index eb89a274083e..1e8d8b7b0c62 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_lib.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
@@ -1798,10 +1798,13 @@ static int wx_setup_rx_resources(struct wx_ring *rx_ring)
 	ret = wx_alloc_page_pool(rx_ring);
 	if (ret < 0) {
 		dev_err(rx_ring->dev, "Page pool creation failed: %d\n", ret);
-		goto err;
+		goto err_desc;
 	}
 	return 0;
+
+err_desc:
+	dma_free_coherent(dev, rx_ring->size, rx_ring->desc, rx_ring->dma);
 err:
 	kvfree(rx_ring->rx_buffer_info);
 	rx_ring->rx_buffer_info = NULL;
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_type.h b/drivers/net/ethernet/wangxun/libwx/wx_type.h
index 97e2c1e13b80..32f952d93009 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_type.h
+++ b/drivers/net/ethernet/wangxun/libwx/wx_type.h
@@ -7,11 +7,6 @@
 #include <linux/bitfield.h>
 #include <linux/netdevice.h>
-/* Vendor ID */
-#ifndef PCI_VENDOR_ID_WANGXUN
-#define PCI_VENDOR_ID_WANGXUN 0x8088
-#endif
-
 #define WX_NCSI_SUP 0x8000
 #define WX_NCSI_MASK 0x8000
 #define WX_WOL_SUP 0x4000
@@ -300,6 +295,8 @@
 #define WX_MAX_RXD 8192
 #define WX_MAX_TXD 8192
+#define WX_MAX_JUMBO_FRAME_SIZE 9432 /* max payload 9414 */
+
 /* Supported Rx Buffer Sizes */
 #define WX_RXBUFFER_256 256 /* Used for skb receive header */
 #define
WX_RXBUFFER_2K 2048
diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
index 17412e5282de..df6b870aa871 100644
--- a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
+++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
@@ -6,10 +6,10 @@
 #include <linux/pci.h>
 #include <linux/netdevice.h>
 #include <linux/string.h>
-#include <linux/aer.h>
 #include <linux/etherdevice.h>
 #include <net/ip.h>
 #include <linux/phy.h>
+#include <linux/if_vlan.h>
 #include "../libwx/wx_type.h"
 #include "../libwx/wx_hw.h"
@@ -470,6 +470,7 @@ static void ngbe_shutdown(struct pci_dev *pdev)
 static const struct net_device_ops ngbe_netdev_ops = {
 	.ndo_open = ngbe_open,
 	.ndo_stop = ngbe_close,
+	.ndo_change_mtu = wx_change_mtu,
 	.ndo_start_xmit = wx_xmit_frame,
 	.ndo_set_rx_mode = wx_set_rx_mode,
 	.ndo_validate_addr = eth_validate_addr,
@@ -520,7 +521,6 @@ static int ngbe_probe(struct pci_dev *pdev,
 		goto err_pci_disable_dev;
 	}
-	pci_enable_pcie_error_reporting(pdev);
 	pci_set_master(pdev);
 	netdev = devm_alloc_etherdev_mqs(&pdev->dev,
@@ -562,7 +562,8 @@ static int ngbe_probe(struct pci_dev *pdev,
 	netdev->priv_flags |= IFF_SUPP_NOFCS;
 	netdev->min_mtu = ETH_MIN_MTU;
-	netdev->max_mtu = NGBE_MAX_JUMBO_FRAME_SIZE - (ETH_HLEN + ETH_FCS_LEN);
+	netdev->max_mtu = WX_MAX_JUMBO_FRAME_SIZE -
+			  (ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN);
 	wx->bd_number = func_nums;
 	/* setup the private structure */
@@ -669,7 +670,6 @@ err_clear_interrupt_scheme:
 err_free_mac_table:
 	kfree(wx->mac_table);
 err_pci_release_regions:
-	pci_disable_pcie_error_reporting(pdev);
 	pci_release_selected_regions(pdev,
 				     pci_select_bars(pdev, IORESOURCE_MEM));
 err_pci_disable_dev:
@@ -698,7 +698,6 @@ static void ngbe_remove(struct pci_dev *pdev)
 	kfree(wx->mac_table);
 	wx_clear_interrupt_scheme(wx);
-	pci_disable_pcie_error_reporting(pdev);
 	pci_disable_device(pdev);
 }
diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h b/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
index a2351349785e..373d5af628cd
100644
--- a/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
+++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_type.h
@@ -137,7 +137,6 @@ enum NGBE_MSCA_CMD_value {
 #define NGBE_RX_PB_SIZE 42
 #define NGBE_MC_TBL_SIZE 128
 #define NGBE_TDB_PB_SZ (20 * 1024) /* 160KB Packet Buffer */
-#define NGBE_MAX_JUMBO_FRAME_SIZE 9432 /* max payload 9414 */
 /* TX/RX descriptor defines */
 #define NGBE_DEFAULT_TXD 512 /* default ring size */
diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
index a58ce5463686..5b8a121fb496 100644
--- a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
+++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
@@ -6,9 +6,9 @@
 #include <linux/pci.h>
 #include <linux/netdevice.h>
 #include <linux/string.h>
-#include <linux/aer.h>
 #include <linux/etherdevice.h>
 #include <net/ip.h>
+#include <linux/if_vlan.h>
 #include "../libwx/wx_type.h"
 #include "../libwx/wx_lib.h"
@@ -488,6 +488,7 @@ static void txgbe_shutdown(struct pci_dev *pdev)
 static const struct net_device_ops txgbe_netdev_ops = {
 	.ndo_open = txgbe_open,
 	.ndo_stop = txgbe_close,
+	.ndo_change_mtu = wx_change_mtu,
 	.ndo_start_xmit = wx_xmit_frame,
 	.ndo_set_rx_mode = wx_set_rx_mode,
 	.ndo_validate_addr = eth_validate_addr,
@@ -539,7 +540,6 @@ static int txgbe_probe(struct pci_dev *pdev,
 		goto err_pci_disable_dev;
 	}
-	pci_enable_pcie_error_reporting(pdev);
 	pci_set_master(pdev);
 	netdev = devm_alloc_etherdev_mqs(&pdev->dev,
@@ -606,7 +606,8 @@ static int txgbe_probe(struct pci_dev *pdev,
 	netdev->priv_flags |= IFF_SUPP_NOFCS;
 	netdev->min_mtu = ETH_MIN_MTU;
-	netdev->max_mtu = TXGBE_MAX_JUMBO_FRAME_SIZE - (ETH_HLEN + ETH_FCS_LEN);
+	netdev->max_mtu = WX_MAX_JUMBO_FRAME_SIZE -
+			  (ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN);
 	/* make sure the EEPROM is good */
 	err = txgbe_validate_eeprom_checksum(wx, NULL);
@@ -699,7 +700,6 @@ err_release_hw:
 err_free_mac_table:
 	kfree(wx->mac_table);
 err_pci_release_regions:
-	pci_disable_pcie_error_reporting(pdev);
 	pci_release_selected_regions(pdev,
 				     pci_select_bars(pdev, IORESOURCE_MEM));
 err_pci_disable_dev:
@@ -730,8 +730,6 @@ static void txgbe_remove(struct pci_dev *pdev)
 	kfree(wx->mac_table);
 	wx_clear_interrupt_scheme(wx);
-	pci_disable_pcie_error_reporting(pdev);
-
 	pci_disable_device(pdev);
 }
diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h b/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h
index 563ea51deca6..63a1c733718d 100644
--- a/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h
+++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_type.h
@@ -79,7 +79,6 @@
 #define TXGBE_SP_MC_TBL_SIZE 128
 #define TXGBE_SP_RX_PB_SIZE 512
 #define TXGBE_SP_TDB_PB_SZ (160 * 1024) /* 160KB Packet Buffer */
-#define TXGBE_MAX_JUMBO_FRAME_SIZE 9432 /* max payload 9414 */
 /* TX/RX descriptor defines */
 #define TXGBE_DEFAULT_TXD 512 |
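The TX rate-limiting arithmetic that the am65-cpsw-qos.c hunks above add can be checked outside the kernel. The following is a minimal userspace sketch, not the driver code: `tx_rate_calc`, `rate_mask_update` and `genmask32` are hypothetical stand-ins for `am65_cpsw_qos_tx_rate_calc()`, the mask check inside `am65_cpsw_qos_ndo_tx_p0_set_maxrate()`, and the kernel's `GENMASK()`. It assumes a bus clock of at least 1 MHz (the same implicit assumption as the diff, which divides by the clock in MHz) and queue numbers below 32.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round-up integer division, mirroring the kernel's DIV_ROUND_UP(). */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Bits h..l set, for 0 <= l <= h <= 31 (stand-in for GENMASK()). */
static uint32_t genmask32(unsigned int h, unsigned int l)
{
	return (UINT32_MAX >> (31 - h)) & (UINT32_MAX << l);
}

/*
 * CIR register value for a rate in Mbit/s, given the bus clock in Hz.
 * Same formula as the diff: rate * 32768 / bus_freq_MHz, rounded up.
 * Assumes bus_freq_hz >= 1000000, otherwise the divisor would be zero.
 */
static uint32_t tx_rate_calc(uint32_t rate_mbps, unsigned long bus_freq_hz)
{
	uint64_t bus_freq_mhz = bus_freq_hz / 1000000;

	return (uint32_t)DIV_ROUND_UP((uint64_t)rate_mbps * 32768, bus_freq_mhz);
}

/*
 * Returns true when setting `queue` to `rate_mbps` keeps the rate-limited
 * queues contiguous from the highest queue downwards ("hi->lo" in the
 * driver's error message). On success *new_msk receives the updated mask.
 */
static bool rate_mask_update(uint32_t cur_msk, unsigned int tx_ch_num,
			     unsigned int queue, uint32_t rate_mbps,
			     uint32_t *new_msk)
{
	uint32_t msk = cur_msk;
	uint32_t bad = 0;

	if (rate_mbps && !(msk & (1u << queue))) {
		/* Enabling: mask must become exactly bits (n-1)..queue. */
		msk |= 1u << queue;
		bad = msk ^ genmask32(tx_ch_num - 1, queue);
	} else if (!rate_mbps) {
		/* Disabling: no lower queue may still be rate-limited. */
		msk &= ~(1u << queue);
		bad = queue ? (msk & genmask32(queue - 1, 0)) : 0;
	}

	if (bad)
		return false;
	*new_msk = msk;
	return true;
}
```

With 8 TX queues, enabling a limit on queue 7 and then queue 6 is accepted, while starting with queue 5 is rejected; disabling has to proceed in the opposite order, lowest limited queue first. Changing the rate of a queue that is already limited passes the check unchanged.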