summaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/hw (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'for-linus' of ↵Linus Torvalds2007-05-228-128/+250
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: IB/cm: Improve local id allocation IPoIB/cm: Fix SRQ WR leak IB/ipoib: Fix typos in error messages IB/mlx4: Check if SRQ is full when posting receive IB/mlx4: Pass send queue sizes from userspace to kernel IB/mlx4: Fix check of opcode in mlx4_ib_post_send() mlx4_core: Fix array overrun in dump_dev_cap_flags() IB/mlx4: Fix RESET to RESET and RESET to ERROR transitions IB/mthca: Fix RESET to ERROR transition IB/mlx4: Set GRH:HopLimit when sending globally routed MADs IB/mthca: Set GRH:HopLimit when building MLX headers IB/mlx4: Fix check of max_qp_dest_rdma in modify QP IB/mthca: Fix use-after-free on device restart IB/ehca: Return proper error code if register_mr fails IPoIB: Handle P_Key table reordering IB/core: Use start_port() and end_port() IB/core: Add helpers for uncached GID and P_Key searches IB/ipath: Fix potential deadlock with multicast spinlocks IB/core: Free umem when mm is already gone
| * IB/mlx4: Check if SRQ is full when posting receiveRoland Dreier2007-05-211-0/+6
| | | | | | | | | | | | Make mlx4_post_srq_recv() fail if the SRQ is full (head == tail). Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mlx4: Pass send queue sizes from userspace to kernelEli Cohen2007-05-202-17/+51
| | | | | | | | | | | | | | | | | | | | | | | | Pass the number of WQEs for the send queue and their size from userspace to the kernel to avoid having to keep the QP size calculations in sync between the kernel driver and libmlx4. This fixes a bug seen with the current mlx4_ib driver and current libmlx4 caused by a difference in the calculated sizes for SQ WQEs. Also, this gives more flexibility for userspace to experiment with using multiple WQE BBs for a single SQ WQE. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mlx4: Fix check of opcode in mlx4_ib_post_send()Roland Dreier2007-05-191-1/+1
| | | | | | | | | | | | | | | | | | wr->opcode is invalid if it's >= ARRAY_SIZE(mlx4_ib_opcode), not just strictly >. This was spotted by the Coverity checker (CID 1643). Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mlx4: Fix RESET to RESET and RESET to ERROR transitionsMichael S. Tsirkin2007-05-191-35/+80
| | | | | | | | | | | | | | | | | | | | | | | | According to the IB spec, a QP can be moved from RESET back to RESET or to the ERROR state, but mlx4 firmware does not support this and returns an error if we try. Fix the RESET to RESET transition by just returning 0 without doing anything, and fix RESET to ERROR by moving the QP from RESET to INIT with dummy parameters and then transitioning from INIT to ERROR. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mthca: Fix RESET to ERROR transitionMichael S. Tsirkin2007-05-191-60/+98
| | | | | | | | | | | | | | | | | | | | According to the IB spec, a QP can be moved from RESET to the ERROR state, but mthca firmware does not support this and returns an error if we try. Work around this FW limitation by moving the QP from RESET to INIT with dummy parameters and then transitioning from INIT to ERROR. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mlx4: Set GRH:HopLimit when sending globally routed MADsRoland Dreier2007-05-191-0/+1
| | | | | | | | | | | | | | This is the same issue discovered in mthca by Rolf Manderscheid <rvm@obsidianresearch.com>. Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mthca: Set GRH:HopLimit when building MLX headersRolf Manderscheid2007-05-191-0/+1
| | | | | | | | | | | | | | | | | | | | | | Global CM packets used by rmda_cm were being sent with a GRH:hopLimit of zero, causing them to be dropped by the router. The problem is a missing initialization of the hop_limit field in mthca_read_ah(), which was called by build_mlx_header() when sending a MAD on QP1. Signed-off-by: Rolf Manderscheid <rvm@obsidianresearch.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mlx4: Fix check of max_qp_dest_rdma in modify QPEli Cohen2007-05-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | max_qp_dest_rdma is already in natural units - no need to shift. This was discovered by a test that deliberately requests more outstanding atomic operation than the device supports. Found by Sagi Rotem at Mellanox. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mthca: Fix use-after-free on device restartAli Ayoub2007-05-191-1/+3
| | | | | | | | | | | | Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/ehca: Return proper error code if register_mr failsHoang-Nam Nguyen2007-05-191-5/+2
| | | | | | | | | | | | | | | | | | | | Set the return code of ehca_register_mr() to ENOMEM if the corresponding firmware call fails due to out of resources. Some other error codes were explicitly mapped to EINVAL -- just remove those cases so they get mapped to the default case, which already returns EINVAL anyway. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/ipath: Fix potential deadlock with multicast spinlocksRoland Dreier2007-05-191-9/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Lockdep found the following potential deadlock between mcast_lock and n_mcast_grps_lock: mcast_lock is taken from both interrupt context and process context, so spin_lock_irqsave() must be used to take it. n_mcast_grps_lock is only taken from process context, so at first it seems safe to take it with plain spin_lock(); however, it also nests inside mcast_lock, and hence we could deadlock: cpu A cpu B ipath_mcast_add(): spin_lock_irq(&mcast_lock); ipath_mcast_detach(): spin_lock(&n_mcast_grps_lock); <enter interrupt> ipath_mcast_find(): spin_lock_irqsave(&mcast_lock); spin_lock(&n_mcast_grps_lock); Fix this by using spin_lock_irq() to take n_mcast_grps_lock. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | Detach sched.h from mm.hAlexey Dobriyan2007-05-215-0/+5
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | First thing mm.h does is including sched.h solely for can_do_mlock() inline function which has "current" dereference inside. By dealing with can_do_mlock() mm.h can be detached from sched.h which is good. See below, why. This patch a) removes unconditional inclusion of sched.h from mm.h b) makes can_do_mlock() normal function in mm/mlock.c c) exports can_do_mlock() to not break compilation d) adds sched.h inclusions back to files that were getting it indirectly. e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were getting them indirectly Net result is: a) mm.h users would get less code to open, read, preprocess, parse, ... if they don't need sched.h b) sched.h stops being dependency for significant number of files: on x86_64 allmodconfig touching sched.h results in recompile of 4083 files, after patch it's only 3744 (-8.3%). Cross-compile tested on all arm defconfigs, all mips defconfigs, all powerpc defconfigs, alpha alpha-up arm i386 i386-up i386-defconfig i386-allnoconfig ia64 ia64-up m68k mips parisc parisc-up powerpc powerpc-up s390 s390-up sparc sparc-up sparc64 sparc64-up um-x86_64 x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig as well as my two usual configs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* IB/mthca: Set cleaned CQEs back to HW ownership when cleaning CQMichael S. Tsirkin2007-05-141-1/+3
| | | | | | | | | | | | mthca_cq_clean() updates the CQ consumer index without moving CQEs back to HW ownership. As a result, the same WRID might get reported twice, resulting in a use-after-free. This was observed in IPoIB CM. Fix by moving all freed CQEs to HW ownership. This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=617> Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix posting >255 recv WRs for TavorMichael S. Tsirkin2007-05-141-0/+1
| | | | | | | | | | | Fix posting lists of > 255 receive WRs for Tavor: rq.next_ind must be updated each doorbell, otherwise the next doorbell will use an incorrect index. Found by Ronni Zimmermann at Mellanox. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ehca: Disable scaling code by default, bump version numberJoachim Fenkes2007-05-141-3/+3
| | | | | | | | - Scaling code is still considered experimental, so disable it by default - Increase version to SVNEHCA_0023 Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ehca: Beautify sysfs attribute code and fix compiler warningsJoachim Fenkes2007-05-141-49/+37
| | | | | | | | | eHCA's sysfs attributes are now being created via sysfs_create_group(), making the process neatly table-driven. The return value is checked, thus fixing a few compiler warnings. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ehca: Remove _irqsave, move #ifdefJoachim Fenkes2007-05-141-4/+3
| | | | | | | | | | | - In ehca_process_eq(), we're IRQ safe throughout the whole function, so we don't need another _irqsave in the middle of flight. - take_over_work() is only called by comp_pool_callback(), so it can move into the same #ifdef block. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ehca: Fix AQP0/1 QP numberHoang-Nam Nguyen2007-05-141-2/+3
| | | | | | | | AQP0/1 should report qp_num={0|1} and the actual QP# should be stored in struct ehca_qp, not the other way round. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ehca: Correctly set GRH mask bit in ehca_modify_qp()Joachim Fenkes2007-05-141-4/+8
| | | | | | | | | | The driver needs to always supply the "GRH present" flag to the hypervisor, whether it's true or false. Not supplying it (i.e. not setting the corresponding mask bit) amounts to a "perhaps", which we don't want. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ehca: Serialize hypervisor calls in ehca_register_mr()Stefan Roscher2007-05-143-2/+14
| | | | | | | | Some pSeries hypervisor versions show a race condition in the allocate MR hCall. Serialize this call per adapter to circumvent this problem. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: Shadow the gpio_mask registerArthur Jones2007-05-144-14/+14
| | | | | | | | | | | | | | | | | | Once upon a time, GPIO interrupts were rare. But then a chip bug in the waldo series forced the use of a GPIO interrupt to signal packet reception. This greatly increased the frequency of GPIO interrupts which have the gpio_mask bits set on the waldo chips. Other bits in the gpio_status register are used for I2C clock and data lines, these bits are usually on. An "unlikely" annotation leftover from the old days was improperly applied to these bits, and an unnecessary chip mmio read was being accessed in the interrupt fast path on waldo. Remove the stagnant unlikely annotation in the interrupt handler and keep a shadow copy of the gpio_mask register to avoid the slow mmio read when testing for interruptable GPIO bits. Signed-off-by: Arthur Jones <arthur.jones@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mlx4: Fix uninitialized spinlock for 32-bit archsJack Morgenstein2007-05-141-0/+1
| | | | | | | | | uar_lock spinlock was used in mlx4_ib_cq_arm without being initialized (this only affects 32-bit archs, because uar_lock is not used on 64-bit archs and MLX4_INIT_DOORBELL_LOCK() is a NOP). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* Merge branch 'for-linus' of ↵Linus Torvalds2007-05-1023-79/+4180
|\ | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters IB: Put rlimit accounting struct in struct ib_umem IB/uverbs: Export ib_umem_get()/ib_umem_release() to modules
| * IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adaptersRoland Dreier2007-05-0912-0/+4032
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add an InfiniBand driver for Mellanox ConnectX adapters. Because these adapters can also be used as ethernet NICs and Fibre Channel HBAs, the driver is split into two modules: mlx4_core: Handles low-level things like device initialization and processing firmware commands. Also controls resource allocation so that the InfiniBand, ethernet and FC functions can share a device without stepping on each other. mlx4_ib: Handles InfiniBand-specific things; plugs into the InfiniBand midlayer. Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/uverbs: Export ib_umem_get()/ib_umem_release() to modulesRoland Dreier2007-05-0911-79/+148
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Export ib_umem_get()/ib_umem_release() and put low-level drivers in control of when to call ib_umem_get() to pin and DMA map userspace, rather than always calling it in ib_uverbs_reg_mr() before calling the low-level driver's reg_user_mr method. Also move these functions to be in the ib_core module instead of ib_uverbs, so that driver modules using them do not depend on ib_uverbs. This has a number of advantages: - It is better design from the standpoint of making generic code a library that can be used or overridden by device-specific code as the details of specific devices dictate. - Drivers that do not need to pin userspace memory regions do not need to take the performance hit of calling ib_mem_get(). For example, although I have not tried to implement it in this patch, the ipath driver should be able to avoid pinning memory and just use copy_{to,from}_user() to access userspace memory regions. - Buffers that need special mapping treatment can be identified by the low-level driver. For example, it may be possible to solve some Altix-specific memory ordering issues with mthca CQs in userspace by mapping CQ buffers with extra flags. - Drivers that need to pin and DMA map userspace memory for things other than memory regions can use ib_umem_get() directly, instead of hacks using extra parameters to their reg_phys_mr method. For example, the mlx4 driver that is pending being merged needs to pin and DMA map QP and CQ buffers, but it does not need to create a memory key for these buffers. So the cleanest solution is for mlx4 to call ib_umem_get() in the create_qp and create_cq methods. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | Add suspend-related notifications for CPU hotplugRafael J. Wysocki2007-05-091-0/+6
|/ | | | | | | | | | | | | | | | | | | | | Since nonboot CPUs are now disabled after tasks and devices have been frozen and the CPU hotplug infrastructure is used for this purpose, we need special CPU hotplug notifications that will help the CPU-hotplug-aware subsystems distinguish normal CPU hotplug events from CPU hotplug events related to a system-wide suspend or resume operation in progress. This patch introduces such notifications and causes them to be used during suspend and resume transitions. It also changes all of the CPU-hotplug-aware subsystems to take these notifications into consideration (for now they are handled in the same way as the corresponding "normal" ones). [oleg@tv-sign.ru: cleanups] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'master' of ↵Linus Torvalds2007-05-081-1/+1
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (77 commits) [POWERPC] Abolish powerpc_flash_init() [POWERPC] Early serial debug support for PPC44x [POWERPC] Support for the Ebony 440GP reference board in arch/powerpc [POWERPC] Add device tree for Ebony [POWERPC] Add powerpc/platforms/44x, disable platforms/4xx for now [POWERPC] MPIC U3/U4 MSI backend [POWERPC] MPIC MSI allocator [POWERPC] Enable MSI mappings for MPIC [POWERPC] Tell Phyp we support MSI [POWERPC] RTAS MSI implementation [POWERPC] PowerPC MSI infrastructure [POWERPC] Rip out the existing powerpc msi stubs [POWERPC] Remove use of 4level-fixup.h for ppc32 [POWERPC] Add powerpc PCI-E reset API implementation [POWERPC] Holly bootwrapper [POWERPC] Holly DTS [POWERPC] Holly defconfig [POWERPC] Add support for 750CL Holly board [POWERPC] Generalize tsi108 PCI setup [POWERPC] Generalize tsi108 PHY types ... Fixed conflict in include/asm-powerpc/kdebug.h manually Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * Merge branch 'linux-2.6'Paul Mackerras2007-05-0830-186/+337
| |\
| * | [POWERPC] Rename get_property to of_get_property: driversStephen Rothwell2007-05-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | These are all the remaining instances of get_property. Simple rename of get_property to of_get_property. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | | inode numbering: change libfs sb creation routines to avoid collisions with ↵Jeff Layton2007-05-081-1/+1
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | their root inodes This patch makes it so that simple_fill_super and get_sb_pseudo assign their root inodes to be number 1. It also fixes up a couple of callers of simple_fill_super that were passing in files arrays that had an index at number 1, and adds a warning for any caller that sends in such an array. It would have been nice to have made it so that it wasn't possible to make such a collision, but some callers need to be able to control what inode number their entries get, so I think this is the best that can be done. Signed-off-by: Jeff Layton <jlayton@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'for-linus' of ↵Linus Torvalds2007-05-0725-180/+337
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: IPoIB: Convert to NAPI IB: Return "maybe missed event" hint from ib_req_notify_cq() IB: Add CQ comp_vector support IB/ipath: Fix a race condition when generating ACKs IB/ipath: Fix two more spin lock problems IB/fmr_pool: Add prefix to all printks IB/srp: Set proc_name IB/srp: Add orig_dgid sysfs attribute to scsi_host IPoIB/cm: Don't crash if remote side uses one QP for both directions RDMA/cxgb3: Support for new abort logic RDMA/cxgb3: Initialize cpu_idx field in cpl_close_listserv_req message RDMA/cxgb3: Fail qp creation if the requested max_inline is too large RDMA/cxgb3: Fix TERM codes IPoIB/cm: Fix error handling in ipoib_cm_dev_open() IB/ipath: Don't corrupt pending mmap list when unmapped objects are freed IB/mthca: Work around kernel QP starvation IB/ipath: Don't put QP in timeout queue if waiting to send IB/ipath: Don't call spin_lock_irq() from interrupt context
| * | IB: Return "maybe missed event" hint from ib_req_notify_cq()Roland Dreier2007-05-0711-24/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The semantics defined by the InfiniBand specification say that completion events are only generated when a completions is added to a completion queue (CQ) after completion notification is requested. In other words, this means that the following race is possible: while (CQ is not empty) ib_poll_cq(CQ); // new completion is added after while loop is exited ib_req_notify_cq(CQ); // no event is generated for the existing completion To close this race, the IB spec recommends doing another poll of the CQ after requesting notification. However, it is not always possible to arrange code this way (for example, we have found that NAPI for IPoIB cannot poll after requesting notification). Also, some hardware (eg Mellanox HCAs) actually will generate an event for completions added before the call to ib_req_notify_cq() -- which is allowed by the spec, since there's no way for any upper-layer consumer to know exactly when a completion was really added -- so the extra poll of the CQ is just a waste. Motivated by this, we add a new flag "IB_CQ_REPORT_MISSED_EVENTS" for ib_req_notify_cq() so that it can return a hint about whether the a completion may have been added before the request for notification. The return value of ib_req_notify_cq() is extended so: < 0 means an error occurred while requesting notification == 0 means notification was requested successfully, and if IB_CQ_REPORT_MISSED_EVENTS was passed in, then no events were missed and it is safe to wait for another event. > 0 is only returned if IB_CQ_REPORT_MISSED_EVENTS was passed in. It means that the consumer must poll the CQ again to make sure it is empty to avoid the race described above. We add a flag to enable this behavior rather than turning it on unconditionally, because checking for missed events may incur significant overhead for some low-level drivers, and consumers that don't care about the results of this test shouldn't be forced to pay for the test. Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | IB: Add CQ comp_vector supportMichael S. Tsirkin2007-05-079-7/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a num_comp_vectors member to struct ib_device and extend ib_create_cq() to pass in a comp_vector parameter -- this parallels the userspace libibverbs API. Update all hardware drivers to set num_comp_vectors to 1 and have all ULPs pass 0 for the comp_vector value. Pass the value of num_comp_vectors to userspace rather than hard-coding a value of 1. We want multiple CQ event vector support (via MSI-X or similar for adapters that can generate multiple interrupts), but it's not clear how many vectors we want, or how we want to deal with policy issues such as how to decide which vector to use or how to set up interrupt affinity. This patch is useful for experimenting, since no core changes will be necessary when updating a driver to support multiple vectors, and we know that we want to make at least these changes anyway. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | IB/ipath: Fix a race condition when generating ACKsRalph Campbell2007-05-071-13/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a problem where simple ACKs can be sent ahead of RDMA read responses thus implicitly NAKing the RDMA read. Signed-off-by: Ralph Campbell <ralph.cambpell@qlogic.com> Signed-off-by: Robert Walsh <robert.walsh@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | IB/ipath: Fix two more spin lock problemsRalph Campbell2007-05-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Fix a missing unlock in ipath_rc_rcv_resp() and remove an extra unlock from ipath_rc_rcv_error(). Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | RDMA/cxgb3: Support for new abort logicSteve Wise2007-05-072-0/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The HW now posts 2 ABORT_RPL and/or PEER_ABORT_REQ messages. We need to handle them by silenty dropping the 1st but mark that we're ready for the final message. This plugs some close races between the uP and HW. Also update the minimum required firmware version. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | RDMA/cxgb3: Initialize cpu_idx field in cpl_close_listserv_req messageSteve Wise2007-05-011-0/+1
| | | | | | | | | | | | | | | Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | RDMA/cxgb3: Fail qp creation if the requested max_inline is too largeSteve Wise2007-05-012-0/+4
| | | | | | | | | | | | | | | Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | RDMA/cxgb3: Fix TERM codesSteve Wise2007-05-011-31/+38
| | | | | | | | | | | | | | | | | | | | | | | | Fix TERMINATE layer, type, and ecode values based on conformance testing. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | IB/ipath: Don't corrupt pending mmap list when unmapped objects are freedRobert Walsh2007-05-016-90/+153
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the pending mmap code so it doesn't corrupt the list of pending mmaps and crash the machine when pending mmaps are destroyed without first being mapped. Also, remove an unused variable, and use standard kernel lists instead of our own homebrewed linked list implementation to keep the pending mmap list. Signed-off-by: Robert Walsh <robert.walsh@qlogic.com> Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | IB/mthca: Work around kernel QP starvationMichael S. Tsirkin2007-05-011-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With mthca, RC QPs can starve each other and even UD QPs on the same hardware schedule queue. As a result, userspace MPI can starve e.g. IPoIB traffic, with netdev watchdog warnings getting printed out, and TCP connections getting stuck or failing. Reduce the chance of this happening by using three separate hardware schedule queues: one for userspace RC QPs, one for kernel RC QPs, and one for all other QPs. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | IB/ipath: Don't put QP in timeout queue if waiting to sendRalph Campbell2007-05-012-7/+2
| | | | | | | | | | | | | | | | | | | | | | | | This fixes a problem which causes too many RC timeouts and retransmits. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | IB/ipath: Don't call spin_lock_irq() from interrupt contextRalph Campbell2007-05-011-7/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the problem reported by Bernd Schubert <bs@q-leap.de> with kernel debug options enabled: BUG: at kernel/lockdep.c:1860 trace_hardirqs_on() This was caused by using spin_lock_irq()/spin_unlock_irq() from interrupt context. Fix all the places that might be called from interrupts to use spin_lock_irqsave()/spin_unlock_irqrestore(). Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | | PCI: Cleanup the includes of <linux/pci.h>Jean Delvare2007-05-035-6/+0
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I noticed that many source files include <linux/pci.h> while they do not appear to need it. Here is an attempt to clean it all up. In order to find all possibly affected files, I searched for all files including <linux/pci.h> but without any other occurence of "pci" or "PCI". I removed the include statement from all of these, then I compiled an allmodconfig kernel on both i386 and x86_64 and fixed the false positives manually. My tests covered 66% of the affected files, so there could be false positives remaining. Untested files are: arch/alpha/kernel/err_common.c arch/alpha/kernel/err_ev6.c arch/alpha/kernel/err_ev7.c arch/ia64/sn/kernel/huberror.c arch/ia64/sn/kernel/xpnet.c arch/m68knommu/kernel/dma.c arch/mips/lib/iomap.c arch/powerpc/platforms/pseries/ras.c arch/ppc/8260_io/enet.c arch/ppc/8260_io/fcc_enet.c arch/ppc/8xx_io/enet.c arch/ppc/syslib/ppc4xx_sgdma.c arch/sh64/mach-cayman/iomap.c arch/xtensa/kernel/xtensa_ksyms.c arch/xtensa/platform-iss/setup.c drivers/i2c/busses/i2c-at91.c drivers/i2c/busses/i2c-mpc.c drivers/media/video/saa711x.c drivers/misc/hdpuftrs/hdpu_cpustate.c drivers/misc/hdpuftrs/hdpu_nexus.c drivers/net/au1000_eth.c drivers/net/fec_8xx/fec_main.c drivers/net/fec_8xx/fec_mii.c drivers/net/fs_enet/fs_enet-main.c drivers/net/fs_enet/mac-fcc.c drivers/net/fs_enet/mac-fec.c drivers/net/fs_enet/mac-scc.c drivers/net/fs_enet/mii-bitbang.c drivers/net/fs_enet/mii-fec.c drivers/net/ibm_emac/ibm_emac_core.c drivers/net/lasi_82596.c drivers/parisc/hppb.c drivers/sbus/sbus.c drivers/video/g364fb.c drivers/video/platinumfb.c drivers/video/stifb.c drivers/video/valkyriefb.c include/asm-arm/arch-ixp4xx/dma.h sound/oss/au1550_ac97.c I would welcome test reports for these files. I am fine with removing the untested files from the patch if the general opinion is that these changes aren't safe. The tested part would still be nice to have. Note that this patch depends on another header fixup patch I submitted to LKML yesterday: [PATCH] scatterlist.h needs types.h http://lkml.org/lkml/2007/3/01/141 Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* | Merge branch 'linux-2.6' into for-2.6.22Paul Mackerras2007-04-3036-851/+1470
|\|
| * Merge branch 'for-linus' of ↵Linus Torvalds2007-04-2734-841/+1444
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: (49 commits) IB: Set class_dev->dev in core for nice device symlink IB/ehca: Implement modify_port IB/umad: Clarify documentation of transaction ID IPoIB/cm: spin_lock_irqsave() -> spin_lock_irq() replacements IB/mad: Change SMI to use enums rather than magic return codes IB/umad: Implement GRH handling for sent/received MADs IB/ipoib: Use ib_init_ah_from_path to initialize ah_attr IB/sa: Set src_path_bits correctly in ib_init_ah_from_path() IB/ucm: Simplify ib_ucm_event() RDMA/ucma: Simplify ucma_get_event() IB/mthca: Simplify CQ cleaning in mthca_free_qp() IB/mthca: Fix mthca_write_mtt() on HCAs with hidden memory IB/mthca: Update HCA firmware revisions IB/ipath: Fix WC format drift between user and kernel space IB/ipath: Check that a UD work request's address handle is valid IB/ipath: Remove duplicate stuff from ipath_verbs.h IB/ipath: Check reserved memory keys IB/ipath: Fix unit selection when all CPU affinity bits set IB/ipath: Don't allow QPs 0 and 1 to be opened multiple times IB/ipath: Disable IB link earlier in shutdown sequence ...
| | * IB: Set class_dev->dev in core for nice device symlinkJoachim Fenkes2007-04-254-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All RDMA drivers except ehca set class_dev->dev to their dma_device value (ehca leaves this unset). dma_device is the only value that makes any sense, so move this assignment to core/sysfs.c. This reduce the duplicated code in the rest of the drivers and gives ehca a nice /sys/class/infiniband/ehcaX/device symlink. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * IB/ehca: Implement modify_portJoachim Fenkes2007-04-255-2/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add "Modify Port" verb support to eHCA driver. The IB communication manager needs this to set the IsCM port capability bit when initializing. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * IB/mthca: Simplify CQ cleaning in mthca_free_qp()Roland Dreier2007-04-251-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mthca_free_qp() already has local variables to hold the QP's send_cq and recv_cq, so we can slightly clean up the calls to mthca_cq_clean() by using those local variables instead of expressions like to_mcq(qp->ibqp.send_cq). Also, by cleaning the recv_cq first, we can avoid worrying about whether the QP is attached to an SRQ for the second call, because we would only clean send_cq if send_cq is not equal to recv_cq, and that means send_cq cannot have any receive completions from the QP being destroyed. All this work even improves the generated code a bit: add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-5 (-5) function old new delta mthca_free_qp 510 505 -5 Signed-off-by: Roland Dreier <rolandd@cisco.com>