summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds2020-01-291827-31388/+156150
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking updates from David Miller: 1) Add WireGuard 2) Add HE and TWT support to ath11k driver, from John Crispin. 3) Add ESP in TCP encapsulation support, from Sabrina Dubroca. 4) Add variable window congestion control to TIPC, from Jon Maloy. 5) Add BCM84881 PHY driver, from Russell King. 6) Start adding netlink support for ethtool operations, from Michal Kubecek. 7) Add XDP drop and TX action support to ena driver, from Sameeh Jubran. 8) Add new ipv4 route notifications so that mlxsw driver does not have to handle identical routes itself. From Ido Schimmel. 9) Add BPF dynamic program extensions, from Alexei Starovoitov. 10) Support RX and TX timestamping in igc, from Vinicius Costa Gomes. 11) Add support for macsec HW offloading, from Antoine Tenart. 12) Add initial support for MPTCP protocol, from Christoph Paasch, Matthieu Baerts, Florian Westphal, Peter Krystad, and many others. 13) Add Octeontx2 PF support, from Sunil Goutham, Geetha sowjanya, Linu Cherian, and others. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1469 commits) net: phy: add default ARCH_BCM_IPROC for MDIO_BCM_IPROC udp: segment looped gso packets correctly netem: change mailing list qed: FW 8.42.2.0 debug features qed: rt init valid initialization changed qed: Debug feature: ilt and mdump qed: FW 8.42.2.0 Add fw overlay feature qed: FW 8.42.2.0 HSI changes qed: FW 8.42.2.0 iscsi/fcoe changes qed: Add abstraction for different hsi values per chip qed: FW 8.42.2.0 Additional ll2 type qed: Use dmae to write to widebus registers in fw_funcs qed: FW 8.42.2.0 Parser offsets modified qed: FW 8.42.2.0 Queue Manager changes qed: FW 8.42.2.0 Expose new registers and change windows qed: FW 8.42.2.0 Internal ram offsets modifications MAINTAINERS: Add entry for Marvell OcteonTX2 Physical Function driver Documentation: net: octeontx2: Add RVU HW and drivers overview octeontx2-pf: ethtool RSS config support octeontx2-pf: Add basic ethtool support ...
| * net: phy: add default ARCH_BCM_IPROC for MDIO_BCM_IPROCScott Branden2020-01-281-0/+1
| | | | | | | | | | | | | | | | | | Add default MDIO_BCM_IPROC Kconfig setting such that it is default on for IPROC family of devices. Signed-off-by: Scott Branden <scott.branden@broadcom.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * udp: segment looped gso packets correctlyWillem de Bruijn2020-01-281-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Multicast and broadcast packets can be looped from egress to ingress pre segmentation with dev_loopback_xmit. That function unconditionally sets ip_summed to CHECKSUM_UNNECESSARY. udp_rcv_segment segments gso packets in the udp rx path. Segmentation usually executes on egress, and does not expect packets of this type. __udp_gso_segment interprets !CHECKSUM_PARTIAL as CHECKSUM_NONE. But the offsets are not correct for gso_make_checksum. UDP GSO packets are of type CHECKSUM_PARTIAL, with their uh->check set to the correct pseudo header checksum. Reset ip_summed to this type. (CHECKSUM_PARTIAL is allowed on ingress, see comments in skbuff.h) Reported-by: syzbot <syzkaller@googlegroups.com> Fixes: cf329aa42b66 ("udp: cope with UDP GRO packet misdirection") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * netem: change mailing listStephen Hemminger2020-01-281-1/+1
| | | | | | | | | | | | | | | | | | The old netem mailing list was inactive and recently was targeted by spammers. Switch to just using netdev mailing list which is where all the real change happens. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'qed-Utilize-FW-8.42.2.0'David S. Miller2020-01-2730-4069/+4242
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Michal Kalderon says: ==================== qed*: Utilize FW 8.42.2.0 This FW contains several fixes and features, main ones listed below. We have taken into consideration past comments on previous FW versions that were uploaded and tried to separate this one to smaller patches to ease review. - RoCE - SRIOV support - Fixes in following flows: - latency optimization flow for inline WQEs - iwarp OOO packed DDPs flow - tx-dif workaround calculations flow - XRC-SRQ exceed cache num - iSCSI - Fixes: - iSCSI TCP out-of-order handling. - iscsi retransmit flow - Fcoe - Fixes: - upload + cleanup flows - Debug - Better handling of extracting data during traffic - ILT Dump -> dumping host memory used by chip - MDUMP -> collect debug data on system crash and extract after reboot Patches prefixed with FW 8.42.2.0 are required to work with binary 8.42.2.0 FW where as the rest are FW related but do not require the binary. Changes from V2 --------------- - Move FW version to the start of the series to maintain minimal compatibility - Fix some kbuild errors: - frame size larger than 1024 (Queue Manager patch - remove redundant field from struct) - sparse warning on endianity (Dmae patch fix - wrong use of __le32 for field used only on host, should be u32) - static should be used for some functions (Debug feature ilt and mdump) Reported-by: kbuild test robot <lkp@intel.com> Changes from V1 --------------- - Remove epoch + kernel version from device debug dump - don't bump driver version ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * qed: FW 8.42.2.0 debug featuresMichal Kalderon2020-01-276-2536/+1528
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add to debug dump more information on the platform it was collected from (pci func, path id). Provide human readable reg fifo erros. Removed static debug arrays from HSI Functions, and move them to the hwfn. Some structures were slightly changed (removing reserved chip id for example) which lead to many long initializations being modified with one parameter less during initialization. This leads to some long diffs that don't really change anything. Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * qed: rt init valid initialization changedMichal Kalderon2020-01-273-21/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The QM phase init tool can be invoked multiple times during the driver lifetime. Part of the init comes from the runtime array. The logic for setting the values did not init all values, basically assuming the runtime array was all zeroes. But if it was invoked multiple times, nobody was zeroing it after the first time. In this change we zero the runtime array right after using it. Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * qed: Debug feature: ilt and mdumpMichal Kalderon2020-01-278-214/+821
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Part of the FW drop includes new debug capabilities implemented in the qed_debug file. This patch dumps additional information during ethtool -d for better debugging. The data dumped is the ilt (internal logical table) and information gathered by the management firmware incase there was a crash and driver was not able to extract the information (mdump). Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * qed: FW 8.42.2.0 Add fw overlay featureMichal Kalderon2020-01-275-1/+224
| | | | | | | | | | | | | | | | | | | | | | | | This feature enables the FW to page out FW code when required Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * qed: FW 8.42.2.0 HSI changesMichal Kalderon2020-01-277-154/+280
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch contains several HSI changes. The changes are part of features like RDMA VF and OVS, the patch also contains a fix to how the init code determines if the dmae is ready to be used. Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * qed: FW 8.42.2.0 iscsi/fcoe changesMichal Kalderon2020-01-276-78/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Remove struct iscsi_slow_path_hdr and field fw_cid from several structs - Remove struct iscsi_spe_func_dstry - Remove fields pbe_page_size_log and pbl_page_size_log from struct iscsi_conn_offload_param Signed-off-by: Manish Rangankar <manish.rangankar@marvell.com> Signed-off-by: Saurav Kashyap <saurav.kashyap@marvell.com> Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * qed: Add abstraction for different hsi values per chipMichal Kalderon2020-01-273-32/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The number of BTB blocks was modified to be different between the two chip flavors supported (BB/K2) as a result, this lead to a re-write of selecting the default hsi value based on the chip. This patch creates a lookup table for hsi values per chip rather than ask again and again for every value. Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * qed: FW 8.42.2.0 Additional ll2 typeMichal Kalderon2020-01-2711-37/+235
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LL2 queues were a limited resource due to FW constraints. This FW introduced a new resource which is a context based ll2 queue (memory on host). The additional ll2 queues are required for RDMA SRIOV. The code refers to the previous ll2 queues as ram-based or legacy, and the new queues as ctx-based. This change decreased the "legacy" ram-based queues therefore the first ll2 queue used for iWARP was converted to the ctx-based ll2 queue. This feature also exposed a bug in the DIRECT_REG_WR64 macro implementation which didn't have an effect in other use cases. Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * qed: Use dmae to write to widebus registers in fw_funcsMichal Kalderon2020-01-277-91/+142
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are several wide-bus registers written to by the fw_funcs that require using the dmae for atomicity. Therefore using the dmae channel functionality was added to the fw_funcs file, since the code is very similar to the previously used code, the structures used were moved to qed_hsi. Due to FW conventions, the names of the flags in the struct changed. Since this required slight modification in the places that set the flags the code was modified to use GET/SET FIELD macros. Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * qed: FW 8.42.2.0 Parser offsets modifiedMichal Kalderon2020-01-271-26/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert storm ram line to regpair rather than two distinct u32 to better represent the u64 width of the ram. Convert some defines to be hex instead of negative values these values also changed by FW from previous value. Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * qed: FW 8.42.2.0 Queue Manager changesMichal Kalderon2020-01-275-160/+149
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch contains changes in initialization and usage of the QM blocks. Instead of setting a rate limiter per vport the rate limiters are now a global resource and set independentaly. The patch also contains a field name change: vport_wfq which is part of vport_params was renamed to wfq as the vport prefix is redundant. Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * qed: FW 8.42.2.0 Expose new registers and change windowsMichal Kalderon2020-01-274-538/+488
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch contains register initialization related changes. - Modifications to the runtime offsets - these are defines used by the driver or firmware functions to set values that are used by the initialization functions to set device register values. - Global window values changes to provide different device register ranges. - Additional device registers addresses were added to the register file, used in later stages. Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * qed: FW 8.42.2.0 Internal ram offsets modificationsMichal Kalderon2020-01-273-193/+234
| |/ | | | | | | | | | | | | | | | | | | | | IRO stands for internal RAM offsets. Updating the FW binary produces different iro offsets. This file contains the different values, and a new representation of the values. Update the FW version Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'octeontx2-pf-Add-network-driver-for-physical-function'David S. Miller2020-01-2717-3/+5690
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sunil Goutham says: ==================== octeontx2-pf: Add network driver for physical function OcteonTX2 SOC's resource virtualization unit (RVU) supports multiple physical and virtual functions. Each of the PF's functionality is determined by what kind of resources are attached to it. If NPA and NIX blocks are attached to a PF it can function as a highly capable network device. This patch series add a network driver for the PF. Initial set of patches adds mailbox communication with admin function (RVU AF) and configuration of queues. Followed by Rx and tx pkts NAPI handler and then support for HW offloads like RSS, TSO, Rxhash etc. Ethtool support to extract stats, config RSS, queue sizes, queue count is also added. Added documentation to give a high level overview of HW and different drivers which will be upstreamed and how they interact. Changes from v5: * Fixed otx2_atomic64_add() non ARM64 fallback definition. - Suggested by David Miller Changes from v4: * Replaced pci_set_dma_mask and pci_set_consistent_dma_mask fn()s with dma_set_mask_and_coherent(). * Some additonal code cleanup. * Fixed receive buffer segmnetation logic in otx2_alloc_rbuf() * Removed all unused BIG_ENDIAN structure definitions. * Removed unnecessary memory barriers - Sugested by Jakub Kicinski * Fixed mailbox initalization failure handling * Removed unused function parameter in otx2_skb_add_frag() - Suggested by Maciej Fijalkowski Changes from v3: * Fixed receive side scaling reinitialization during interface DOWN and UP to retain user configured settings, if any. * Removed driver version from ethtool. * Fixed otx2_set_rss_hash_opts() to return error incase RSS is not enabled. - Sugested by Jakub Kicinski Changes from v2: * Removed frames, bytes, dropped packet stats from ethtool to avoid duplication of same stats in netlink and ethtool. - Sugested by Jakub Kicinski * Removed number of channels and ringparam upper bound checking in ethtool support. * Fixed RSS hash option setting to reject unsupported config. - Suggested by Michal Kubecek Changes from v1: * Made driver dependent on 64bit, to fix build errors related to non availability of writeq/readq APIs for 32bit platforms. - Reported by kbuild test robot ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * MAINTAINERS: Add entry for Marvell OcteonTX2 Physical Function driverSunil Goutham2020-01-271-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | Added maintainers entry for Marvell OcteonTX2 SOC's physical function NIC driver. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * Documentation: net: octeontx2: Add RVU HW and drivers overviewSunil Goutham2020-01-273-0/+161
| | | | | | | | | | | | | | | | | | | | | | | | Added high level overview of OcteonTx2 RVU HW and functionality of various drivers which will be upstreamed. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * octeontx2-pf: ethtool RSS config supportSunil Goutham2020-01-274-4/+269
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Added support to show or configure RSS hash key, indirection table, 2,4 tuple via ethtool. Also added debug msg_level support to dump messages when HW reports errors in packet received or transmitted. Signed-off-by: Prakash Brahmajyosyula <bprakash@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * octeontx2-pf: Add basic ethtool supportChristina Jacob2020-01-276-4/+525
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds ethtool support for - Driver stats, Tx/Rx perqueue and CGX LMAC stats - Set/show Rx/Tx queue count - Set/show Rx/Tx ring sizes - Set/show IRQ coalescing parameters Signed-off-by: Christina Jacob <cjacob@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * octeontx2-pf: Add ndo_get_stats64Geetha sowjanya2020-01-273-0/+97
| | | | | | | | | | | | | | | | | | | | | | | | Added ndo_get_stats64 which returns stats maintained by HW. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * octeontx2-pf: TCP segmentation offload supportSunil Goutham2020-01-275-4/+273
| | | | | | | | | | | | | | | | | | | | | | | | | | | Adds TCP segmentation offload (TSO) support. First version of the silicon didn't support TSO offload, for this driver level TSO support is added. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * octeontx2-pf: Receive side scaling supportSunil Goutham2020-01-274-1/+167
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adds receive side scaling (RSS) support to distribute pkts/flows across multiple queues. Sets up key, indirection table etc. Also added extraction of HW calculated rxhash and adding to same to SKB ie NETIF_F_RXHASH offload support. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * octeontx2-pf: Error handling supportGeetha sowjanya2020-01-276-3/+255
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | HW reports many errors on the receive and transmit paths. Such as incorrect queue configuration, pkt transmission errors, LMTST instruction errors, transmit queue full etc. These are reported via QINT interrupt. Most of the errors are fatal and needs reinitialization. Also added support to allocate receive buffers in non-atomic context when allocation fails in NAPI context. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Aleksey Makarov <amakarov@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * octeontx2-pf: MTU, MAC and RX mode config supportSunil Goutham2020-01-276-6/+242
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch addes support to change interface MTU, MAC address retrieval and config, RX mode ie unicast, multicast and promiscuous. Also added link loopback support Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * octeontx2-pf: Register and handle link notificationsLinu Cherian2020-01-273-0/+109
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PF and AF (admin function) shares 64KB of reserved memory region for communication. This region is shared for - Messages sent by PF and responses sent by AF. - Notifications sent by AF and ACKs sent by PF. This patch adds infrastructure to handle notifications sent by AF and adds handlers to process them. One of the main usecase of notifications from AF is physical link changes. So this patch adds registration of PF with AF to receive link status change notifications and also adds the handler for that notification. Signed-off-by: Linu Cherian <lcherian@marvell.com> Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * octeontx2-pf: Add packet transmission supportSunil Goutham2020-01-276-2/+470
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the packet transmission support. For a given skb prepares send queue descriptors (SQEs) and pushes them to HW. Here driver doesn't maintain it's own SQ rings, SQEs are pushed to HW using a silicon specific operations called LMTST. From the instuction HW derives the transmit queue number and queues the SQE to that queue. These LMTST instructions are designed to avoid queue maintenance in SW and lockless behavior ie when multiple cores are trying to add SQEs to same queue then HW will takecare of serialization, no need for SW to hold locks. Also supports scatter/gather. Co-developed-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * octeontx2-pf: Receive packet handling supportSunil Goutham2020-01-276-3/+349
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Added receive packet handling (NAPI) support, error stats, RX_ALL capability config option to passon error pkts to stack upon user request. In subsequent patches these error stats will be added to ethttool. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * octeontx2-pf: Setup interrupts and NAPI handlerSunil Goutham2020-01-276-10/+332
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Completion queue (CQ) is the one with which HW notifies SW on a packet reception or transmission. Each of the RQ and SQ are mapped to a unique CQ and again both CQs are mapped to same interrupt ie the CINT. So that each core has one interrupt source in whose handler both Rx and Tx notifications are processed. Also - Registered a NAPI handler for the CINT. - Setup coalescing parameters. - IRQ affinity hints etc Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * octeontx2-pf: Initialize and config queuesSunil Goutham2020-01-277-17/+1290
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch does the initialization of all queues ie the receive buffer pools, receive and transmit queues, completion or notification queues etc. Allocates all required resources (eg transmit schedulers, receive buffers etc) and configures them for proper functioning of queues. Also sets up receive queue's RED dropping levels. Co-developed-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * octeontx2-pf: Attach NIX and NPA block LFsSunil Goutham2020-01-274-4/+286
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For a PF to function as a NIC, NPA (for Rx buffers, Tx descriptors etc) and NIX (for rcv, send and completion queues) are the minimum resources needed. So request admin function (AF) to attach one each of NIX and NPA block LFs (local functions). Only AF can configure a LF's contexts, so request AF to allocate memory for NPA aura/pool and NIX RQ/SQ/CQ HW contexts. Upon receiving response, save some of the HW constants like number of pointers per stack page, size of send queue buffer (SQBs, where SQEs are queued by HW) e.t.c which are later used to initialize queues. A HW context here is like a state machine maintained for a descriptor queue. eg size, head/tail pointers, irq etc etc. HW maintains this in memory. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * octeontx2-pf: Mailbox communication with AFSunil Goutham2020-01-274-3/+552
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the resource virtualization unit (RVU) each of the PF and AF (admin function) share a 64KB of reserved memory region for communication. This patch initializes PF <=> AF mailbox IRQs, registers handlers for processing these communication messages. Also adds support to process these messages in both directions ie responses to PF initiated DOWN (PF => AF) messages and AF initiated UP messages (AF => PF). Mbox communication APIs and message formats are defined in AF driver (drivers/net/ethernet/marvell/octeontx2/af), mbox.h from AF driver is included here to avoid duplication. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Christina Jacob <cjacob@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Aleksey Makarov <amakarov@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * octeontx2-pf: Add Marvell OcteonTX2 NIC driverSunil Goutham2020-01-276-0/+362
| |/ | | | | | | | | | | | | | | | | This patch adds template for the Marvell's OcteonTX2 network controller's physical function driver. Just the probe, PCI specific initialization and netdev registration. Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2020-01-2724-104/+433
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf-next 2020-01-27 The following pull-request contains BPF updates for your *net-next* tree. We've added 20 non-merge commits during the last 5 day(s) which contain a total of 24 files changed, 433 insertions(+), 104 deletions(-). The main changes are: 1) Make BPF trampolines and dispatcher aware for the stack unwinder, from Jiri Olsa. 2) Improve handling of failed CO-RE relocations in libbpf, from Andrii Nakryiko. 3) Several fixes to BPF sockmap and reuseport selftests, from Lorenz Bauer. 4) Various cleanups in BPF devmap's XDP flush code, from John Fastabend. 5) Fix BPF flow dissector when used with port ranges, from Yoshiki Komachi. 6) Fix bpffs' map_seq_next callback to always inc position index, from Vasily Averin. 7) Allow overriding LLVM tooling for runqslower utility, from Andrey Ignatov. 8) Silence false-positive lockdep splats in devmap hash lookup, from Amol Grover. 9) Fix fentry/fexit selftests to initialize a variable before use, from John Sperbeck. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * Merge branch 'bpf-flow-dissector-fix-port-ranges'Daniel Borkmann2020-01-272-2/+23
| | |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Yoshiki Komachi says: ==================== When I tried a test based on the selftest program for BPF flow dissector (test_flow_dissector.sh), I observed unexpected result as below: $ tc filter add dev lo parent ffff: protocol ip pref 1337 flower ip_proto \ udp src_port 8-10 action drop $ tools/testing/selftests/bpf/test_flow_dissector -i 4 -f 9 -F inner.dest4: 127.0.0.1 inner.source4: 127.0.0.3 pkts: tx=10 rx=10 The last rx means the number of received packets. I expected rx=0 in this test (i.e., all received packets should have been dropped), but it resulted in acceptance. Although the previous commit 8ffb055beae5 ("cls_flower: Fix the behavior using port ranges with hw-offload") added new flag and field toward filtering based on port ranges with hw-offload, it missed applying for BPF flow dissector then. As a result, BPF flow dissector currently stores data extracted from packets in incorrect field used for exact match whenever packets are classified by filters based on port ranges. Thus, they never match rules in such cases because flow dissector gives rise to generating incorrect flow keys. This series fixes the issue by replacing incorrect flag and field with new ones in BPF flow dissector, and adds a test for filtering based on specified port ranges to the existing selftest program. Changes in v2: - set key_ports to NULL at the top of __skb_flow_bpf_to_target() ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
| | | * selftests/bpf: Add test based on port range for BPF flow dissectorYoshiki Komachi2020-01-271-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a simple test to make sure that a filter based on specified port range classifies packets correctly. Signed-off-by: Yoshiki Komachi <komachi.yoshiki@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Petar Penkov <ppenkov@google.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200117070533.402240-3-komachi.yoshiki@gmail.com
| | | * flow_dissector: Fix to use new variables for port ranges in bpf hookYoshiki Komachi2020-01-271-2/+9
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch applies new flag (FLOW_DISSECTOR_KEY_PORTS_RANGE) and field (tp_range) to BPF flow dissector to generate appropriate flow keys when classified by specified port ranges. Fixes: 8ffb055beae5 ("cls_flower: Fix the behavior using port ranges with hw-offload") Signed-off-by: Yoshiki Komachi <komachi.yoshiki@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Petar Penkov <ppenkov@google.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200117070533.402240-2-komachi.yoshiki@gmail.com
| | * bpf, xdp: Remove no longer required rcu_read_{un}lock()John Fastabend2020-01-272-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we depend on rcu_call() and synchronize_rcu() to also wait for preempt_disabled region to complete the rcu read critical section in __dev_map_flush() is no longer required. Except in a few special cases in drivers that need it for other reasons. These originally ensured the map reference was safe while a map was also being free'd. And additionally that bpf program updates via ndo_bpf did not happen while flush updates were in flight. But flush by new rules can only be called from preempt-disabled NAPI context. The synchronize_rcu from the map free path and the rcu_call from the delete path will ensure the reference there is safe. So lets remove the rcu_read_lock and rcu_read_unlock pair to avoid any confusion around how this is being protected. If the rcu_read_lock was required it would mean errors in the above logic and the original patch would also be wrong. Now that we have done above we put the rcu_read_lock in the driver code where it is needed in a driver dependent way. I think this helps readability of the code so we know where and why we are taking read locks. Most drivers will not need rcu_read_locks here and further XDP drivers already have rcu_read_locks in their code paths for reading xdp programs on RX side so this makes it symmetric where we don't have half of rcu critical sections define in driver and the other half in devmap. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/bpf/1580084042-11598-4-git-send-email-john.fastabend@gmail.com
| | * bpf, xdp: virtio_net use access ptr macro for xdp enable checkJohn Fastabend2020-01-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | virtio_net currently relies on rcu critical section to access the xdp program in its xdp_xmit handler. However, the pointer to the xdp program is only used to do a NULL pointer comparison to determine if xdp is enabled or not. Use rcu_access_pointer() instead of rcu_dereference() to reflect this. Then later when we drop rcu_read critical section virtio_net will not need in special handling. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/bpf/1580084042-11598-3-git-send-email-john.fastabend@gmail.com
| | * bpf, xdp: Update devmap comments to reflect napi/rcu usageJohn Fastabend2020-01-271-10/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we rely on synchronize_rcu and call_rcu waiting to exit perempt-disable regions (NAPI) lets update the comments to reflect this. Fixes: 0536b85239b84 ("xdp: Simplify devmap cleanup") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/1580084042-11598-2-git-send-email-john.fastabend@gmail.com
| | * bpf: map_seq_next should always increase position indexVasily Averin2020-01-271-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If seq_file .next fuction does not change position index, read after some lseek can generate an unexpected output. See also: https://bugzilla.kernel.org/show_bug.cgi?id=206283 v1 -> v2: removed missed increment in end of function Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/eca84fdd-c374-a154-d874-6c7b55fc3bc4@virtuozzo.com
| | * tools/bpf: Allow overriding llvm tools for runqslowerAndrey Ignatov2020-01-271-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tools/testing/selftests/bpf/Makefile supports overriding clang, llc and other tools so that custom ones can be used instead of those from PATH. It's convinient and heavily used by some users. Apply same rules to runqslower/Makefile. Signed-off-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200124224142.1833678-1-rdna@fb.com
| | * Merge branch 'trampoline-fixes'Alexei Starovoitov2020-01-257-13/+239
| | |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jiri Olsa says: ==================== hi, sending 2 fixes to fix kernel support for loading trampoline programs in bcc/bpftrace and allow to unwind through trampoline/dispatcher. Original rfc post [1]. Speedup output of perf bench while running klockstat.py on kprobes vs trampolines: Without: $ perf bench sched messaging -l 50000 ... Total time: 18.571 [sec] With current kprobe tracing: $ perf bench sched messaging -l 50000 ... Total time: 183.395 [sec] With kfunc tracing: $ perf bench sched messaging -l 50000 ... Total time: 39.773 [sec] v4 changes: - rebased on latest bpf-next/master - removed image tree mutex and use trampoline_mutex instead - checking directly for string pointer in patch 1 [Alexei] - skipped helpers patches, as they are no longer needed [Alexei] v3 changes: - added ack from John Fastabend for patch 1 - move out is_bpf_image_address from is_bpf_text_address call [David] v2 changes: - make the unwind work for dispatcher as well - added test for allowed trampolines count - used raw tp pt_regs nest-arrays for trampoline helpers thanks, jirka [1] https://lore.kernel.org/netdev/20191229143740.29143-1-jolsa@kernel.org/ ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
| | | * selftest/bpf: Add test for allowed trampolines countJiri Olsa2020-01-252-0/+133
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's limit of 40 programs tht can be attached to trampoline for one function. Adding test that tries to attach that many plus one extra that needs to fail. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200123161508.915203-4-jolsa@kernel.org
| | | * bpf: Allow to resolve bpf trampoline and dispatcher in unwindJiri Olsa2020-01-254-13/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When unwinding the stack we need to identify each address to successfully continue. Adding latch tree to keep trampolines for quick lookup during the unwind. The patch uses first 48 bytes for latch tree node, leaving 4048 bytes from the rest of the page for trampoline or dispatcher generated code. It's still enough not to affect trampoline and dispatcher progs maximum counts. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200123161508.915203-3-jolsa@kernel.org
| | | * bpf: Allow BTF ctx access for string pointersJiri Olsa2020-01-251-0/+16
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When accessing the context we allow access to arguments with scalar type and pointer to struct. But we deny access for pointer to scalar type, which is the case for many functions. Alexei suggested to take conservative approach and allow currently only string pointer access, which is the case for most functions now: Adding check if the pointer is to string type and allow access to it. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200123161508.915203-2-jolsa@kernel.org
| | * libbpf: Fix realloc usage in bpf_core_find_candsAndrii Nakryiko2020-01-241-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix bug requesting invalid size of reallocated array when constructing CO-RE relocation candidate list. This can cause problems if there are many potential candidates and a very fine-grained memory allocator bucket sizes are used. Fixes: ddc7c3042614 ("libbpf: implement BPF CO-RE offset relocation algorithm") Reported-by: William Smith <williampsmith@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200124201847.212528-1-andriin@fb.com