summaryrefslogtreecommitdiffstats
path: root/include/rdma (follow)
Commit message (Collapse)AuthorAgeFilesLines
* IB/core: Make device counter infrastructure dynamicChristoph Lameter2016-05-261-52/+74
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In practice, each RDMA device has a unique set of counters that the hardware implements. Having a central set of counters that they must all adhere to is limiting and causes many useful counters to not be available. Therefore we create a dynamic counter registration infrastructure. The driver must implement a stats structure allocation routine, in which the driver must place the directory name it wants, a list of names for all of the counters, an array of u64 counters themselves, plus a few generic configuration options. We then implement a core routine to create a sysfs file for each of the named stats elements, and a core routine to retrieve the stats when any of the sysfs attribute files are read. To avoid excessive beating on the stats generation routine in the drivers, the core code also caches the stats for a short period of time so that someone attempting to read all of the stats in a given device's directory will not result in a stats generation call per file read. Future work will attempt to standardize just the shared stats elements, and possibly add a method to get the stats via netlink in addition to sysfs. Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com> [ Add caching, make structure names more informative, add i40iw support, other significant rewrites from the original patch ]
* Merge branch 'hfi1-2' into k.o/for-4.7Doug Ledford2016-05-263-6/+17
|\
| * IB/rdamvt: Fix rdmavt s_ack_queue sizingMike Marciniszyn2016-05-262-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rdmavt allows the driver to specify the size of the ack queue, but only uses it for the modify QP limit testing for setting the atomic limit value. The driver dependent size is now used to size the s_ack_queue ring dynamicially. Since the driver knows its size, the driver will use its define for any ring size dependent code. Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * IB/rdmavt: Max atomic value should be a u8Mike Marciniszyn2016-05-261-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This matches the ib_qp_attr size and avoids a extremely large value when the lower level driver registers. As part of the patch, the u8 ordinals are moved to the end of the struct to reduce pahole noted excesses. Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * ib_pack.h: Add opcode definition for send with invalidateJianxin Xiong2016-05-141-0/+5
| | | | | | | | | | | | | | | | | | | | | | The opcode for "SEND Last with Invalidate" and "SEND Only with Invalidate" have been defined for RC in IBA Specification Vol 1 since Release 1.2. Add the definition to the header file in preparation of supporting these opcodes in rdmavt based drivers. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/SA Agent: Add support for SA agent get ClassPortInfoErez Shitrit2016-05-251-0/+12
| | | | | | | | | | | | | | | | | | | | New SA query function to return the ClassPortInfo struct from the SA. If the SM supports FullMemberSendOnly mode for MCG's, it sets a capability bit in the capability_mask2 field of the response. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/core: Introduce capabilitymask2 field in ClassPortInfo madErez Shitrit2016-05-251-2/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change struct ib_class_port_info to conform to IB Spec 1.3 That in order to get specific capability mask from ClassPortInfo mad. >From the IB Spec, ClassPortInfo section: "CapabilityMask2 Bits 0-26: Additional class-specific capabilities... RespTimeValue the rest 5 bits" The new struct now has one field for capabilitymask2 (previously was the reserved field) and the resp_time field. And it fixes up qib and srpt, use of the field repurposed to be used as capabilitymask2: IB/qib: Change pma_get_classportinfo IB/srpt: Adjust the use of ib_class_port_info Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| |
| \
*-. \ Merge branches 'cxgb4-2', 'i40iw-2', 'ipoib', 'misc-4.7' and 'mlx5-fcs' into ↵Doug Ledford2016-05-141-0/+2
|\ \ \ | | |/ | |/| | | | k.o/for-4.7
| | * IB/core: Add Scatter FCS create flagMajd Dibbiny2016-05-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Raw Packet QPs that were created with Scatter FCS flag, will scatter the FCS into the receive buffers. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | * IB/core: Add Raw Scatter FCS device capabilityMajd Dibbiny2016-05-141-0/+1
| |/ | | | | | | | | | | | | | | | | Raw Scatter FCS device capability is set when the NIC supports scattering the FCS to the receive buffers of Raw Packet QPs. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/core: Enhance ib_map_mr_sg()Bart Van Assche2016-05-131-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The SRP initiator allows to set max_sectors to a value that exceeds the largest amount of data that can be mapped at once with an mlx4 HCA using fast registration and a page size of 4 KB. Hence modify ib_map_mr_sg() such that it can map partial sg-elements. If an sg-element has been mapped partially, let the caller know which fraction has been mapped by adjusting *sg_offset. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Tested-by: Laurence Oberman <loberman@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/core: add RW API support for signature MRsChristoph Hellwig2016-05-132-0/+20
| | | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/core: generic RDMA READ/WRITE APIChristoph Hellwig2016-05-132-1/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | This supports both manual mapping of lots of SGEs, as well as using MRs from the QP's MR pool, for iWarp or other cases where it's more optimal. For now, MRs are only used for iWARP transports. The user of the RDMA-RW API must allocate the QP MR pool as well as size the SQ accordingly. Thanks to Steve Wise for testing, fixing and rewriting the iWarp support, and to Sagi Grimberg for ideas, reviews and fixes. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/core: add a need_inval flag to struct ib_mrSteve Wise2016-05-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | This is the first step toward moving MR invalidation decisions to the core. It will be needed by the upcoming RW API. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/core: add a simple MR poolChristoph Hellwig2016-05-132-1/+32
| | | | | | | | | | | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/core: add a helper to check for READ WITH INVALIDATE supportChristoph Hellwig2016-05-131-0/+12
| | | | | | | | | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
* | IB/core: Add passing an offset into the SG to ib_map_mr_sgChristoph Hellwig2016-05-131-14/+9
|/ | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/rdmavt,hfi1,qib: Fix memory leakJubin John2016-04-281-0/+1
| | | | | | | | | | | | | | | rdi->ports has memory allocated in rvt_alloc_device(), but does not get freed because the hfi1 and qib drivers drivers call ib_dealloc_device() directly instead of going through rdmavt. Add a rvt_dealloc_device() that frees rdi->ports and then calls ib_dealloc_device(). Switch hfi1 and qib drivers to calling rvt_dealloc_device() instead of ib_dealloc_device() directly. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/rdmavt: Fix adaptive pio hangMike Marciniszyn2016-04-281-2/+3
| | | | | | | | | | | | | The RVT_S_WAIT_PIO_DRAIN flag was missing from the set of flags indicating a qp is waiting on a resource. This caused the sleep/wakeup for adaptive pio drain to lose a wakeup "hanging" a QP. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
* IB/security: Restrict use of the write() interfaceJason Gunthorpe2016-04-281-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | The drivers/infiniband stack uses write() as a replacement for bi-directional ioctl(). This is not safe. There are ways to trigger write calls that result in the return structure that is normally written to user space being shunted off to user specified kernel memory instead. For the immediate repair, detect and deny suspicious accesses to the write API. For long term, update the user space libraries and the kernel API to something that doesn't present the same security vulnerabilities (likely a structured ioctl() interface). The impacted uAPI interfaces are generally only available if hardware from drivers/infiniband is installed in the system. Reported-by: Jann Horn <jann@thejh.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> [ Expanded check to all known write() entry points ] Cc: stable@vger.kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
* Merge tag 'for-linus' of ↵Linus Torvalds2016-03-226-2/+1194
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull more rdma updates from Doug Ledford: "Round two of 4.6 merge window patches. This is a monster pull request. I held off on the hfi1 driver updates (the hfi1 driver is intimately tied to the qib driver and the new rdmavt software library that was created to help both of them) in my first pull request. The hfi1/qib/rdmavt update is probably 90% of this pull request. The hfi1 driver is being left in staging so that it can be fixed up in regards to the API that Al and yourself didn't like. Intel has agreed to do the work, but in the meantime, this clears out 300+ patches in the backlog queue and brings my tree and their tree closer to sync. This also includes about 10 patches to the core and a few to mlx5 to create an infrastructure for configuring SRIOV ports on IB devices. That series includes one patch to the net core that we sent to netdev@ and Dave Miller with each of the three revisions to the series. We didn't get any response to the patch, so we took that as implicit approval. Finally, this series includes Intel's new iWARP driver for their x722 cards. It's not nearly the beast as the hfi1 driver. It also has a linux-next merge issue, but that has been resolved and it now passes just fine. Summary: - A few minor core fixups needed for the next patch series - The IB SRIOV series. This has bounced around for several versions. Of note is the fact that the first patch in this series effects the net core. It was directed to netdev and DaveM for each iteration of the series (three versions total). Dave did not object, but did not respond either. I've taken this as permission to move forward with the series. - The new Intel X722 iWARP driver - A huge set of updates to the Intel hfi1 driver. Of particular interest here is that we have left the driver in staging since it still has an API that people object to. Intel is working on a fix, but getting these patches in now helps keep me sane as the upstream and Intel's trees were over 300 patches apart" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (362 commits) IB/ipoib: Allow mcast packets from other VFs IB/mlx5: Implement callbacks for manipulating VFs net/mlx5_core: Implement modify HCA vport command net/mlx5_core: Add VF param when querying vport counter IB/ipoib: Add ndo operations for configuring VFs IB/core: Add interfaces to control VF attributes IB/core: Support accessing SA in virtualized environment IB/core: Add subnet prefix to port info IB/mlx5: Fix decision on using MAD_IFC net/core: Add support for configuring VF GUIDs IB/{core, ulp} Support above 32 possible device capability flags IB/core: Replace setting the zero values in ib_uverbs_ex_query_device net/mlx5_core: Introduce offload arithmetic hardware capabilities net/mlx5_core: Refactor device capability function net/mlx5_core: Fix caching ATOMIC endian mode capability ib_srpt: fix a WARN_ON() message i40iw: Replace the obsolete crypto hash interface with shash IB/hfi1: Add SDMA cache eviction algorithm IB/hfi1: Switch to using the pin query function IB/hfi1: Specify mm when releasing pages ...
| *-. Merge branches 'i40iw', 'sriov' and 'hfi1' into k.o/for-4.6Doug Ledford2016-03-218-6/+1219
| |\ \
| | | * IB/hfi1: Report pid in qp_stats to aid debugMike Marciniszyn2016-03-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tracking user/QP ownership is needed to debug issues with user ULPs like OpenMPI. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * staging/rdma/hfi1: Adaptive PIO for short messagesMike Marciniszyn2016-03-111-10/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The change requires a new pio_busy field in the iowait structure to track the number of outstanding pios. The new counter together with the sdma counter serve as the basis for a packet by packet decision as to which egress mechanism to use. Since packets given to different egress mechanisms are not ordered, this scheme will preserve the order. The iowait drain/wait mechanisms are extended for a pio case. An additional qp wait flag is added for the PIO drain wait case. Currently the only pio wait is for buffers, so the no_bufs_available() routine name is changed to pio_wait() and a third argument is passed with one of the two pio wait flags to generalize the routine. A module parameter is added to hold a configurable threshold. For now, the module parameter is zero. A heuristic routine is added to return the func pointer of the proper egress routine to use. The heuristic is as follows: - SMI always uses pio - GSI,UD qps <= threshold use pio - UD qps > threadhold use sdma o No coordination with sdma is required because order is not required and this qp pio count is not maintained for UD - RC/UC ONLY packets <= threshold chose as follows: o If sdmas pending, use SDMA o Otherwise use pio and enable the pio tracking count at the time the pio buffer is allocated - RC/UC ONLY packets > threshold use SDMA o If pio's are pending the pio_wait with the new wait flag is called to delay for pios to drain The threshold is potentially reduced by the QP's mtu. The sc_buffer_alloc() has two additional args (a callback, a void *) which are exploited by the RC/UC cases to pass a new complete routine and a qp *. When the shadow ring completes the credit associated with a packet, the new complete routine is called. The verbs_pio_complete() will then decrement the busy count and trigger any drain waiters in qp destroy or reset. Reviewed-by: Jubin John <jubin.john@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmvt: close send engine struct holesMike Marciniszyn2016-03-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pahole noted the wasted 4 bytes after s_lock and r_lock. Move s_flags and r_psn to fill the holes. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: Remove unnecessary exported functionsDennis Dalessandro2016-03-112-17/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove exported functions which are no longer required as the functionality has moved into rdmavt. This also requires re-ordering some of the functions since their prototype no longer appears in a header file. Rather than add forward declarations it is just cleaner to re-order some of the functions. Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: Remove signal_supported and commentsDennis Dalessandro2016-03-111-18/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Initially it was intended that rdmavt would support some signaling between the underlying driver and itself. However this turned out to be unnecessary for qib and hfi1. If we need to add something like this in later to support another driver we should do it then. As of now this essentially dead code so remove it. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: Remove RVT_FLAGsDennis Dalessandro2016-03-111-10/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While hfi1 and qib were still supporting bits and pieces of core verbs components there needed to be a way to convey if rdmavt should handle allocation and initialize of resources like the queue pair table. Now that all of this is moved into rdmavt there is no need for these flags. They are no longer used in the drivers. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/qib,rdmavt: Move smi_ah to qibDennis Dalessandro2016-03-111-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rdmavt adopted an smi_ah from qib which is not needed by hfi1. Move this back to qib and get it out of the common library. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: Add per verb driver callback checkingDennis Dalessandro2016-03-111-14/+109
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For each verb validate that all requirements for driver callbacks are met. If a function is called without checking for a valid pointer, it is a required function. Also document what each callback function does. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: Clean up comments and add more documentationDennis Dalessandro2016-03-111-15/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add, remove, and otherwise clean up existing comments that are leftover from the initial code postings of rdmavt. Many of the comments were added to provide an idea on the direction we were thinking of going. Now that the design is solidified make a pass over and clean everything up. Also add details where lacking. Ensure all non static functions have nano comments. Reviewed-by: Jubin John <jubin.john@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/qib, staging/rdma/hfi1: add s_hlock for use in post sendMike Marciniszyn2016-03-112-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds an additional lock to reduce contention on the s_lock. This lock is used in post_send() so that the post_send is not serialized with the send engine and other send related processing. To do this the s_next_psn is now maintained on post_send() while post_send() related fields are moved to a new cache line. There is an s_avail maintained for the post_send() to mitigate trading cache lines with the send engine. The lock is released/acquired around releasing the just built packet to the egress mechanism. Reviewed-by: Jubin John <jubin.john@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt, staging/rdma/hfi1: use qps to dynamically scale timeout valueVennila Megavannan2016-03-112-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A busy_jiffies variable is maintained and updated when rc qps are created and deleted. busy_jiffies is a scaled value of the number of rc qps in the device. busy_jiffies is incremented every rc qp scaling interval. busy_jiffies is added to the rc timeout in add_retry_timer and mod_retry_timer. The rc qp scaling interval is selected based on extensive performance evaluation of targeted workloads. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Vennila Megavannan <vennila.megavannan@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: remove unused qp fieldMike Marciniszyn2016-03-111-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The field is a vestige from ipath. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * staging/rdma/hfi1, IB/core: Fix LinkDownReason define for consistencyEaswar Hariharan2016-03-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | LinkDownReason LocalMediaNotInstalled lacked an underscore and was inconsistent with other defines in the same family. This patch fixes this. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: Add support for query_port, modify_port and get_port_immutableHarish Chegondi2016-03-111-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rvt_query_port calls into the driver through a call back function query_port_state to populate the rest of ib_port_attr elements. rvt_modify_port calls into the driver if needed through a call back function shut_down_port() Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: Add query gid support.Dennis Dalessandro2016-03-111-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Addin query gid support. Rdmavt still relies on the driver to maintain the gid table. Rdmavt simply calls into the driver to retrive the guid for a particular port. Reviewed-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: Clean up distinction between port number and indexDennis Dalessandro2016-03-111-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IB core uses 1 relative indexing for ports. All of our data structures use 0 based indexing. Add an inline function that we can use whenever we need to validate a legal value and try to convert a port number to a port index at the entrance into rdmavt. Try to follow the policy that when we are talking about a port from IB core point of view we refer to it as a port number. When port is an index into our arrays refer to it as a port index. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: Add hardware driver send work request checkIra Weiny2016-03-111-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some hardware drivers requires additional checks on send WRs. Create an optional call back to allow hardware drivers to reject a send WR. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: Add srq functionality to rdmavtJubin John2016-03-111-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fill in srq function stubs with code derived from hfi1 and qib. Move necessary functions and data structure members as well. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Jubin John <jubin.john@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: Fix copyright dateDennis Dalessandro2016-03-113-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update all files added by rdmavt which do not yet have 2016 as the copyright year. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: Add mad agents to rdmavtDennis Dalessandro2016-03-111-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds mad agent create and free to rdmavt. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: Add device structure allocationDennis Dalessandro2016-03-111-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds rdmavt device structure allocation in rdamvt. The ib_device alloc is now done in rdmavt instead of the driver. Drivers need to tell rdmavt the number of ports when calling. A side of effect of this patch is fixing a bug with port initialization where the device structure port array was allocated over top of an existing one. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: add modify queue pair driver helpersIra Weiny2016-03-111-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Low level drivers need to be able to check incoming attributes as well as be able to adjust their private data on queue pair modification. Add 2 driver callbacks, check_modify_qp and modify_qp, to facilitate this. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: Remove unused variable from Queue PairIra Weiny2016-03-111-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | s_sde should be in the low level driver QP private data. Remove the definition from rvt_qp. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: Add multicast functionsDennis Dalessandro2016-03-112-1/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds in the multicast add and remove functions as well as the ancillary infrastructure needed. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: Add modify qpDennis Dalessandro2016-03-113-0/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add modify qp and supporting functions. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: Add post send to rdmavtDennis Dalessandro2016-03-112-0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add in a post_send and post_one_send to rdmavt. The ULP will provide a WQE to rdmavt which will then walk and queue each element. Rdmavt will then queue the work to be done in the driver or kick the driver's progress routine. There needs to be a follow on patch which adds in another lock for the head of the queue so that it can be added to and read from in parallel. This will touch protocol handlers and require other changes in the drivers. This will be done separately. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: Add completion queue functionsDennis Dalessandro2016-03-112-0/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Brings in completion queue functionality. A kthread worker is added to the rvt_dev_info to serve as a worker for completion queues. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
| | | * IB/rdmavt: Support creating qps with GFP_NOIO flagMike Marciniszyn2016-03-111-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current code is problematic when the QP creation and ipoib is used to support NFS and NFS desires to do IO for paging purposes. In that case, the GFP_KERNEL allocation within create_qp causes a deadlock in tight memory situations. This fix adds support to create queue pair with GFP_NOIO flag for connected mode only to cleanly fail the create queue pair in those situations. This was previously fixed in qib but needed to get ported to hfi1. This patch handles that for both hardwares in the new rdmavt common layer. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>