summaryrefslogtreecommitdiffstats
path: root/net/rds (follow)
Commit message (Collapse)AuthorAgeFilesLines
* RDS/IW: Remove page_shift variable from iwarp transportAndy Grover2009-07-204-22/+11
| | | | | | | | | The existing code treated page_shift as a variable, when in fact we always want to have the fastreg page size be the same as the arch's page size -- and it is, so this doesn't need to be a variable. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS/IB: Always use PAGE_SIZE for FMR page sizeAndy Grover2009-07-203-12/+6
| | | | | | | | | While FMRs allow significant flexibility in what size of pages they can use, we really just want FMR pages to match CPU page size. Roland says we can count on this always being supported, so this simplifies things. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: Fix completion notifications on blocking socketsAndy Grover2009-07-201-11/+13
| | | | | | | | Completion or congestion notifications were not being checked if the socket went to sleep. This patch fixes that. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS/IB: Drop connection when a fatal QP event is receivedAndy Grover2009-07-201-3/+3
| | | | | Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS/IB: Disable flow control in sysctl and explain whyAndy Grover2009-07-201-1/+11
| | | | | | | | | | | | Backwards compatibility with rds 3.0 causes protocol- based flow control to be disabled as a side-effect. I don't want to pull out FC support from the IB transport but I do want to document and keep the sysctl consistent if possible. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS/IB: Move tx/rx ring init and refill to laterAndy Grover2009-07-201-6/+12
| | | | | | | | | Since RDS 3.0 and 3.1 have different packet formats, we need to wait until after protocol negotiation is complete to layout the rx buffers. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: Don't set c_version in __rds_conn_create()Andy Grover2009-07-201-1/+0
| | | | | | | | Protocol negotiation is logically a property of the transports, so rds core need not set it. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS/IB: Rename byte_len to data_len to enhance readabilityAndy Grover2009-07-201-6/+6
| | | | | | | | Of course len is in bytes. Calling it data_len hopefully indicates a little better what the variable is actually for. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS/RDMA: Fix cut-n-paste errors in printks in rdma_transport.cAndy Grover2009-07-201-4/+4
| | | | | Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS/IB: Fix printk to indicate remote IP, not localAndy Grover2009-07-201-1/+1
| | | | | Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS/IB: Handle connections using RDS 3.0 wire protocolAndy Grover2009-07-203-6/+58
| | | | | | | | | | | | | | | The big differences between RDS 3.0 and 3.1 are protocol-level flow control, and with 3.1 the header is in front of the data. The header always ends up in the header buffer, and the data goes in the data page. In 3.0 our "header" is a trailer, and will end up either in the data page, the header buffer, or split across the two. Since 3.1 is backwards- compatible with 3.0, we need to continue to support these cases. This patch does that -- if using RDS 3.0 wire protocol, it will copy the header from wherever it ended up into the header buffer. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS/IB: Improve RDS protocol version checkingAndy Grover2009-07-201-6/+19
| | | | | | | | | RDS on IB uses privdata to do protocol version negotiation. Apparently the IB stack will return a larger privdata buffer than the struct we were expecting. Just to be extra-sure, this patch adds some checks in this area. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: Set retry_count to 2 and make modifiable via modparamAndy Grover2009-07-203-1/+7
| | | | | | | | This will be default cause IB connections to failover faster, but allow a longer retry count to be used if desired. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2009-05-191-1/+1
|\ | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/scsi/fcoe/fcoe.c
| * FRV: Fix the section attribute on UP DECLARE_PER_CPU()David Howells2009-04-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In non-SMP mode, the variable section attribute specified by DECLARE_PER_CPU() does not agree with that specified by DEFINE_PER_CPU(). This means that architectures that have a small data section references relative to a base register may throw up linkage errors due to too great a displacement between where the base register points and the per-CPU variable. On FRV, the .h declaration says that the variable is in the .sdata section, but the .c definition says it's actually in the .data section. The linker throws up the following errors: kernel/built-in.o: In function `release_task': kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o To fix this, DECLARE_PER_CPU() should simply apply the same section attribute as does DEFINE_PER_CPU(). However, this is made slightly more complex by virtue of the fact that there are several variants on DEFINE, so these need to be matched by variants on DECLARE. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | ERR_PTR() dereference in net/rds/ib.cDan Carpenter2009-04-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | rdma_create_id() doesn't return NULL, only ERR_PTR(). Found by smatch (http://repo.or.cz/w/smatch.git). Compile tested. regards, dan carpenter Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ERR_PTR() dereference in net/rds/iw.cDan Carpenter2009-04-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | rdma_create_id() returns ERR_PTR() not null. Found by smatch (http://repo.or.cz/w/smatch.git). Compile tested. regards, dan carpenter Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | rds: use kmem_cache_zalloc instead of kmem_cache_alloc/memsetWei Yongjun2009-04-101-3/+1
| | | | | | | | | | | | | | | | Use kmem_cache_zalloc instead of kmem_cache_alloc/memset. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | RDS: remove unused #include <version.h>Huang Weiyi2009-04-101-1/+0
| | | | | | | | | | | | | | | | Remove unused #include <version.h> in net/rds/af_rds.c. Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | RDS: use get_user_pages_fast()Andy Grover2009-04-102-8/+2
| | | | | | | | | | | | | | Use the new function that is simpler and faster. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | RDS: Establish connection before parsing CMSGsAndy Grover2009-04-101-5/+5
| | | | | | | | | | | | | | | | | | The first message to a remote node should prompt a new connection. Even an RDMA op via CMSG. Therefore move CMSG parsing to after connection establishment. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | RDS: Fix ordering in a conditionalAndy Grover2009-04-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | Putting the constant first is a supposed "best practice" that actually makes the code harder to read. Thanks to Roland Dreier for finding a bug in this "simple, obviously correct" patch. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | RDS/IW+IB: Allow max credit advertise window.Steve Wise2009-04-107-13/+13
| | | | | | | | | | | | | | | | Fix hack that restricts the credit advertisement to 127. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | RDS/IW+IB: Set the RDS_LL_SEND_FULL bit when we're throttled.Steve Wise2009-04-102-2/+2
| | | | | | | | | | | | | | | | | | | | The RDS_LL_SEND_FULL bit should be set when we stop transmitted due to flow control. Otherwise the send worker will keep trying as opposed to sleeping until we unthrottle. Saves CPU. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | RDS: Correct some iw references in rdma_transport.cAndy Grover2009-04-101-6/+6
| | | | | | | | | | | | | | | | Had some lingering instances of _iw_ variable names from when the listen code was centralized into rdma_transport.c Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | RDS/IW+IB: Set recv ring low water mark to 1/2 full.Steve Wise2009-04-102-2/+2
|/ | | | | | | | | | | | Currently the recv ring low water mark is 1/4 the depth. Performance measurements show that this limits iWARP throughput by flow controlling the rds-stress senders. Setting it to 1/2 seems to max the T3 performance. I tried even higher levels but that didn't help and it started to increase the rds thread cpu utilization. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: Use spinlock to protect 64b value update on 32b archsAndy Grover2009-04-027-24/+100
| | | | | | | | | | | | | | | We have a 64bit value that needs to be set atomically. This is easy and quick on all 64bit archs, and can also be done on x86/32 with set_64bit() (uses cmpxchg8b). However other 32b archs don't have this. I actually changed this to the current state in preparation for mainline because the old way (using a spinlock on 32b) resulted in unsightly #ifdefs in the code. But obviously, being correct takes precedence. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: Rewrite connection cleanup, fixing oops on rmmodAndy Grover2009-04-028-85/+109
| | | | | | | | | | | | | | | | | | | | | | This fixes a bug where a connection was unexpectedly not on *any* list while being destroyed. It also cleans up some code duplication and regularizes some function names. * Grab appropriate lock in conn_free() and explain in comment * Ensure via locking that a conn is never not on either a dev's list or the nodev list * Add rds_xx_remove_conn() to match rds_xx_add_conn() * Make rds_xx_add_conn() return void * Rename remove_{,nodev_}conns() to destroy_{,nodev_}conns() and unify their implementation in a helper function * Document lock ordering as nodev conn_lock before dev_conn_lock Reported-by: Yosef Etigin <yosefe@voltaire.com> Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: Fix m_rs_lock deadlockAndy Grover2009-04-021-3/+3
| | | | | | | | | | rs_send_drop_to() is called during socket close. If it takes m_rs_lock without disabling interrupts, then rds_send_remove_from_sock() can run from the rx completion handler and thus deadlock. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* rds: fix iband RDMA dependenciesRandy Dunlap2009-03-041-0/+1
| | | | | | | | | | | | | | | | | | | | | | | Fix RDS Infiniband dependencies for RDMA so that these build errors won't happen: ERROR: "rdma_accept" [net/rds/rds.ko] undefined! ERROR: "rdma_destroy_id" [net/rds/rds.ko] undefined! ERROR: "rdma_connect" [net/rds/rds.ko] undefined! ERROR: "rdma_destroy_qp" [net/rds/rds.ko] undefined! ERROR: "rdma_listen" [net/rds/rds.ko] undefined! ERROR: "rdma_notify" [net/rds/rds.ko] undefined! ERROR: "rdma_create_id" [net/rds/rds.ko] undefined! ERROR: "rdma_create_qp" [net/rds/rds.ko] undefined! ERROR: "rdma_bind_addr" [net/rds/rds.ko] undefined! ERROR: "rdma_resolve_route" [net/rds/rds.ko] undefined! ERROR: "rdma_disconnect" [net/rds/rds.ko] undefined! ERROR: "rdma_reject" [net/rds/rds.ko] undefined! ERROR: "rdma_resolve_addr" [net/rds/rds.ko] undefined! Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* rds: Fix build on powerpc.David S. Miller2009-03-021-0/+2
| | | | | | | | | | | | | | | As reported by Stephen Rothwell. > Today's linux-next build (powerpc allyesconfig) failed like this: > > net/rds/cong.c: In function 'rds_cong_set_bit': > net/rds/cong.c:284: error: implicit declaration of function 'generic___set_le_bit' > net/rds/cong.c: In function 'rds_cong_clear_bit': > net/rds/cong.c:298: error: implicit declaration of function 'generic___clear_le_bit' > net/rds/cong.c: In function 'rds_cong_test_bit': > net/rds/cong.c:309: error: implicit declaration of function 'generic_test_le_bit' Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: Kconfig and MakefileAndy Grover2009-02-272-0/+27
| | | | | | | | Add RDS Kconfig and Makefile, and modify net/'s to add us to the build. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: Common RDMA transport codeAndy Grover2009-02-272-0/+242
| | | | | | | | | | | Although most of IB and iWARP are separated from each other, there is some common code required to handle their shared CM listen port. This code listens for CM events and then dispatches the event to the appropriate transport, either IB or iWARP. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: Add iWARP supportAndy Grover2009-02-279-0/+4611
| | | | | | | | | | | | | | | | | Support for iWARP NICs is implemented as a separate RDS transport from IB. The code, however, is very similar to IB (it was forked, basically.) so let's keep it in one changeset. The reason for this duplicationis that despite its similarity to IB, there are a number of places where it has different semantics. iwarp zcopy support is still under development, and giving it its own sandbox ensures that IB code isn't disrupted while iwarp changes. Over time these transports will re-converge. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS/IB: Stats and sysctlsAndy Grover2009-02-272-0/+232
| | | | | | | IB-specific stats and sysctls. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS/IB: Receive datagrams via IBAndy Grover2009-02-271-0/+869
| | | | | | | | Header parsing, ring refill. It puts the incoming data into an rds_incoming struct, which is passed up to rds-core. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS/IB: Implement IB-specific datagram send.Andy Grover2009-02-271-0/+874
| | | | | | | | | Specific to IB is a credits-based flow control mechanism, in addition to the expected usage of the IB API to package outgoing data into work requests. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS/IB: Implement RDMA ops using FMRsAndy Grover2009-02-271-0/+641
| | | | | Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS/IB: Ring-handling code.Andy Grover2009-02-271-0/+168
| | | | | Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS/IB: Infiniband transportAndy Grover2009-02-273-0/+1416
| | | | | | | | Registers as an RDS transport and an IB client, and uses IB CM API to allocate ids, queue pairs, and the rest of that fun stuff. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: RDMA supportAndy Grover2009-02-272-0/+763
| | | | | | | | | Some transports may support RDMA features. This handles the non-transport-specific parts, like pinning user pages and tracking mapped regions. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: recv.cAndy Grover2009-02-271-0/+542
| | | | | | | | Upon receiving a datagram from the transport, RDS parses the headers and potentially queues an ACK. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: send.cAndy Grover2009-02-271-0/+1003
| | | | | | | This is the code to send an RDS datagram. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: Message parsingAndy Grover2009-02-272-0/+623
| | | | | | | | | | | Parsing of newly-received RDS message headers (including ext. headers) and copy-to/from-user routines. page.c implements a per-cpu page remainder cache, to reduce the number of allocations needed for small datagrams. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: sysctlsAndy Grover2009-02-271-0/+122
| | | | | | | RDS exposes a few tunable parameters via sysctls. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: loopbackAndy Grover2009-02-272-0/+197
| | | | | | | A simple rds transport to handle loopback connections. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: Connection handlingAndy Grover2009-02-272-0/+752
| | | | | | | | | | | | | | While arguably the fact that the underlying transport needs a connection to convey RDS's datagrame reliably is not important to rds proper, the transports implemented so far (IB and TCP) have both been connection-oriented, and so the connection state machine-related code is in the common rds code. This patch also includes several work items, to handle connecting, sending, receiving, and shutdown. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: Info and statsAndy Grover2009-02-273-0/+419
| | | | | | | | RDS currently generates a lot of stats that are accessible via the rds-info utility. This code implements the support for this. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: Transport codeAndy Grover2009-02-271-0/+117
| | | | | | | | | | | | RDS supports multiple transports. While this initial submission only supports Infiniband transport, this abstraction allows others to be added. We're working on an iWARP transport, and also see UDP over DCB as another possibility. This code handles transport registration. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: Congestion-handling codeAndy Grover2009-02-271-0/+402
| | | | | | | | | RDS handles per-socket congestion by updating peers with a complete congestion map (8KB). This code keeps track of these maps for itself and ones received from peers. Signed-off-by: Andy Grover <andy.grover@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>