summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* r8169: Fix warning in rtl8169_start_xmit().David S. Miller2009-09-071-1/+0
| | | | | | | | | | | | Reported by Stephen Rothwell: drivers/net/r8169.c: In function 'rtl8169_start_xmit': drivers/net/r8169.c:3421: warning: label 'out' defined but not used Introduced by commit 61357325f377889a1daffa14962d705dc814dd0e ("netdev: convert bulk of drivers to netdev_tx_t"). Signed-off-by: David S. Miller <davem@davemloft.net>
* net: fix hydra printk format warningRandy Dunlap2009-09-071-2/+2
| | | | | | | | m68k: drivers/net/hydra.c:178: warning: format '%08lx' expects type 'long unsigned int', but argument 3 has type 'resource_size_t' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* IXP42x HSS support for setting internal clock rateKrzysztof Halasa2009-09-072-4/+92
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | HSS usually uses external clocks, so it's not a big deal. Internal clock is used for direct DTE-DTE connections and when the DCE doesn't provide it's own clock. This also depends on the oscillator frequency. Intel seems to have calculated the clock register settings for 33.33 MHz (66.66 MHz timer base). Their settings seem quite suboptimal both in terms of average frequency (60 ppm is unacceptable for G.703 applications, their primary intended usage(?)) and jitter. Many (most?) platforms use a 33.333 MHz oscillator, a 10 ppm difference from Intel's base. Instead of creating static tables, I've created a procedure to program the HSS clock register. The register consists of 3 parts (A, B, C). The average frequency (= bit rate) is: 66.66x MHz / (A + (B + 1) / (C + 1)) The procedure aims at the closest average frequency, possibly at the cost of increased jitter. Nobody would be able to directly drive an unbufferred transmitter with a HSS anyway, and the frequency error is what it really counts. I've verified the above with an oscilloscope on IXP425. It seems IXP46x and possibly IXP43x use a bit different clock generation algorithm - it looks like the avg frequency is: (on IXP465) 66.66x MHz / (A + B / (C + 1)). Also they use much greater precomputed A and B - on IXP425 it would simply result in more jitter, but I don't know how does it work on IXP46x (perhaps 3 least significant bits aren't used?). Anyway it looks that they were aiming for exactly +60 ppm or -60 ppm, while <1 ppm is typically possible (with a synchronized clock, of course). The attached patch makes it possible to set almost any bit rate (my IXP425 533 MHz quits at > 22 Mb/s if a single port is used, and the minimum is ca. 65 Kb/s). This is independent of MVIP (multi-E1/T1 on one HSS) mode. Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
* WAN: remove deprecated PCI_DEVICE_ID from PCI200SYN driver.Krzysztof Halasa2009-09-071-11/+0
| | | | | | | | | | | | PCI200SYN has its own PCI subsystem device ID for 3+ years, now it's time to remove the generic PLX905[02] ID from the driver. Anyone with old EEPROM data will have to run the upgrade. Having the generic PLX905[02] (PCI-local bus bridge) ID is harmful as the driver tries to handle other devices based on these bridges. Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
* be2net: Code changes in Tx path to use skb_dma_map/skb_dma_unmapAjit Khaparde2009-09-071-30/+32
| | | | | | | | | | | Code changes to - In the tx completion processing, there were instances of unmapping a memory as a page which was originally mapped as single. This patch takes care of this by using skb_dma_map()/skb_dma_unmap() to map/unmap Tx buffers. - set gso_max_size to 65535. This was not done till now. Signed-off-by: Ajit Khaparde <ajitk@serverengines.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* be2net: Changes to support flashing of the be2 network adapterAjit Khaparde2009-09-076-4/+317
| | | | | | | | | | | Changes to support flashing of the be2 network adapter using the request_firmware() & ethtool infrastructure. The trigger to flash the device will come from ethtool utility. The driver will invoke request_firmware() to start the flash process. The file containing the flash image is expected to be available in /lib/firmware/ Signed-off-by: Ajit Khaparde <ajitk@serverengines.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* wan: dlci/sdla transmit return dehackingStephen Hemminger2009-09-073-47/+9
| | | | | | | | | | | | This is a brute force removal of the wierd slave interface done for DLCI -> SDLA transmit. Before it was using non-standard return values and freeing skb in caller. This changes it to using normal return values, and freeing in the callee. Luckly only one driver pair was doing this. Not tested on real hardware, in fact I wonder if this driver pair is even being used by any users. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netxen: update version to 4.0.50Dhananjay Phadke2009-09-071-2/+2
| | | | | Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netxen: refactor firmware info codeDhananjay Phadke2009-09-073-103/+98
| | | | | | | | | o Combine netxen_get_firmware_info(), netxen_check_options() so that they are updated every time firmware is reset. o Set dma mask everytime firmware is reset. Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netxen: pre calculate register addressesAmit Kumar Salecha2009-09-076-191/+210
| | | | | | | | | | | For registers accessed in fast path (interrupt / softirq) avoid expensive I/O address translation. These registers are directly mapped in PCI bar 0 and do not require any window checks. Signed-off-by: Amit Kumar Salecha <amit@netxen.com> Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netxen: fix ip addr hashing after firmware resetAmit Kumar Salecha2009-09-071-22/+39
| | | | | | | | | Reprogram local IP addresses after firmware is reset or after resuming from suspend. Signed-off-by: Amit Kumar Salecha <amit@netxen.com> Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netxen: firmware hang detectionDhananjay Phadke2009-09-074-88/+348
| | | | | | | | | | | | | | | | | | | | | | | | Implement state machine to detect firmware hung state and recover. Since firmware will be shared by all PCI functions that have different class drivers (NIC or FCOE or iSCSI), explicit hardware based serialization is required for initializing firmware. o Used global scratchpad register to maintain device reference count. Every probed pci function adds to ref count. o Implement timer (delayed work) for each pci func that checks firmware heartbit every 5 sec and detaches itself if firmware is dead. Last detaching function reloads firmware. Other functions wait for firmware init, and re-attach themselves. Heartbit is not supported by NX2031 firmware. Signed-off-by: Amit Kumar Salecha <amit@netxen.com> Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netxen: handle firmware load errorsDhananjay Phadke2009-09-073-4/+20
| | | | | | | | Unwind allocations and release file firmware when when firmware load fails. Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net_sched: add classful multiqueue dummy schedulerDavid S. Miller2009-09-065-13/+277
| | | | | | | | | | | | | | | | | | | This patch adds a classful dummy scheduler which can be used as root qdisc for multiqueue devices and exposes each device queue as a child class. This allows to address queues individually and graft them similar to regular classes. Additionally it presents an accumulated view of the statistics of all real root qdiscs in the dummy root. Two new callbacks are added to the qdisc_ops and qdisc_class_ops: - cl_ops->select_queue selects the tx queue number for new child classes. - qdisc_ops->attach() overrides root qdisc device grafting to attach non-shared qdiscs to the queues. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* net_sched: move dev_graft_qdisc() to sch_generic.cPatrick McHardy2009-09-063-26/+27
| | | | | | | It will be used in a following patch by the multiqueue qdisc. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* net_sched: reintroduce dev->qdisc for use by sch_apiPatrick McHardy2009-09-065-48/+34
| | | | | | | | | | | | | | | | | | | | | | | | | Currently the multiqueue integration with the qdisc API suffers from a few problems: - with multiple queues, all root qdiscs use the same handle. This means they can't be exposed to userspace in a backwards compatible fashion. - all API operations always refer to queue number 0. Newly created qdiscs are automatically shared between all queues, its not possible to address individual queues or restore multiqueue behaviour once a shared qdisc has been attached. - Dumps only contain the root qdisc of queue 0, in case of non-shared qdiscs this means the statistics are incomplete. This patch reintroduces dev->qdisc, which points to the (single) root qdisc from userspace's point of view. Currently it either points to the first (non-shared) default qdisc, or a qdisc shared between all queues. The following patches will introduce a classful dummy qdisc, which will be used as root qdisc and contain the per-queue qdiscs as children. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* net_sched: remove some unnecessary checks in classful schedulersPatrick McHardy2009-09-067-67/+37
| | | | | | | | | | The class argument to the ->graft(), ->leaf(), ->dump(), ->dump_stats() all originate from either ->get() or ->walk() and are always valid. Remove unnecessary checks. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* net_sched: make cls_ops->change and cls_ops->delete optionalPatrick McHardy2009-09-067-85/+6
| | | | | | | | | | | | | | | | | | | | Some schedulers don't support creating, changing or deleting classes. Make the respective callbacks optionally and consistently return -EOPNOTSUPP for unsupported operations, instead of currently either -EOPNOTSUPP, -ENOSYS or no error. In case of sch_prio and sch_multiq, the removed operations additionally checked for an invalid class. This is not necessary since the class argument can only orginate from ->get() or in case of ->change is 0 for creation of new classes, in which case ->change() incorrectly returned -ENOENT. As a side-effect, this patch fixes a possible (root-only) NULL pointer function call in sch_ingress, which didn't implement a so far mandatory ->delete() operation. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* net_sched: make cls_ops->tcf_chain() optionalPatrick McHardy2009-09-063-12/+5
| | | | | | | | Some qdiscs don't support attaching filters. Handle this centrally in cls_api and return a proper errno code (EOPNOTSUPP) instead of EINVAL. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* net_sched: fix class grafting errno codesPatrick McHardy2009-09-052-11/+4
| | | | | | | | | | | | If the parent qdisc doesn't support classes, use EOPNOTSUPP. If the parent class doesn't exist, use ENOENT. Currently EINVAL is returned in both cases. Additionally check whether grafting is supported and remove a now unnecessary graft function from sch_ingress. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* netlink: silence compiler warningBrian Haley2009-09-051-1/+1
| | | | | | | | | | | | CC net/netlink/genetlink.o net/netlink/genetlink.c: In function ‘genl_register_mc_group’: net/netlink/genetlink.c:139: warning: ‘err’ may be used uninitialized in this function From following the code 'err' is initialized, but set it to zero to silence the warning. Signed-off-by: Brian Haley <brian.haley@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: Catch bogus stream sequence numbersVlad Yasevich2009-09-051-2/+26
| | | | | | | | | | Since our TSN map is capable of holding at most a 4K chunk gap, there is no way that during this gap, a stream sequence number (unsigned short) can wrap such that the new number is smaller then the next expected one. If such a case is encountered, this is a protocol violation. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: remove dup code in net/sctp/output.cWei Yongjun2009-09-051-24/+13
| | | | | | | Use sctp_packet_reset() instead of dup code. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: turn flags in 'struct sctp_association' into bit fieldsWei Yongjun2009-09-051-12/+9
| | | | | | | This shrinks the size of struct sctp_association a little. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: Sysctl configuration for IPv4 Address ScopingBhaskar Dutta2009-09-056-6/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces a new sysctl option to make IPv4 Address Scoping configurable <draft-stewart-tsvwg-sctp-ipv4-00.txt>. In networking environments where DNAT rules in iptables prerouting chains convert destination IP's to link-local/private IP addresses, SCTP connections fail to establish as the INIT chunk is dropped by the kernel due to address scope match failure. For example to support overlapping IP addresses (same IP address with different vlan id) a Layer-5 application listens on link local IP's, and there is a DNAT rule that maps the destination IP to a link local IP. Such applications never get the SCTP INIT if the address-scoping draft is strictly followed. This sysctl configuration allows SCTP to function in such unconventional networking environments. Sysctl options: 0 - Disable IPv4 address scoping draft altogether 1 - Enable IPv4 address scoping (default, current behavior) 2 - Enable address scoping but allow IPv4 private addresses in init/init-ack 3 - Enable address scoping but allow IPv4 link local address in init/init-ack Signed-off-by: Bhaskar Dutta <bhaskar.dutta@globallogic.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: Get rid of an extra routing lookup when adding a transport.Vlad Yasevich2009-09-051-6/+8
| | | | | | | | | | | | | | We used to perform 2 routing lookups for a new transport: one just for path mtu detection, and one to actually route to destination and path mtu update when sending a packet. There is no point in doing both of them, especially since the first one just for path mtu doesn't take into account source address and sometimes gives the wrong route, causing path mtu updates anyway. We now do just the one call to do both route to destination and get path mtu updates. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: Turn flags in 'sctp_packet' into bit fieldsVlad Yasevich2009-09-051-16/+6
| | | | | | This shrinks the size of sctp_packet a little. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: Correctly track if AUTH has been bundled.Vlad Yasevich2009-09-051-1/+1
| | | | | | | | | We currently track if AUTH has been bundled using the 'auth' pointer to the chunk. However, AUTH is disallowed after DATA is already in the packet, so we need to instead use the 'has_auth' field. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: fix to reset packet information after packet transmitWei Yongjun2009-09-051-1/+12
| | | | | | | | | | The packet information does not reset after packet transmit, this may cause some problems such as following DATA chunk be sent without AUTH chunk, even if the authentication of DATA chunk has been requested by the peer. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: Failover transmitted list on transport deleteVlad Yasevich2009-09-053-13/+67
| | | | | | | | | Add-IP feature allows users to delete an active transport. If that transport has chunks in flight, those chunks need to be moved to another transport or association may get into unrecoverable state. Reported-by: Rafael Laufer <rlaufer@cisco.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: Fix SCTP_MAXSEG socket option to comply to spec.Vlad Yasevich2009-09-054-14/+11
| | | | | | | | | | | | | | | | We had a bug that we never stored the user-defined value for MAXSEG when setting the value on an association. Thus future PMTU events ended up re-writing the frag point and increasing it past user limit. Additionally, when setting the option on the socket/endpoint, we effect all current associations, which is against spec. Now, we store the user 'maxseg' value along with the computed 'frag_point'. We inherit 'maxseg' from the socket at association creation and use it as an upper limit for 'frag_point' when its set. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: Don't do NAGLE delay on large writes that were fragmented smallVlad Yasevich2009-09-053-2/+6
| | | | | | | | | | | | | SCTP will delay the last part of a large write due to NAGLE, if that part is smaller then MTU. Since we are doing large writes, we might as well send the last portion now instead of waiting untill the next large write happens. The small portion will be sent as is regardless, so it's better to not delay it. This is a result of much discussions with Wei Yongjun <yjwei@cn.fujitsu.com> and Doug Graham <dgraham@nortel.com>. Many thanks go out to them. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: Nagle delay should be based on path mtuVlad Yasevich2009-09-051-2/+3
| | | | | | | | | | | The decision to delay due to Nagle should be based on the path mtu and future packet size. We currently incorrectly base it on 'frag_point' which is the SCTP DATA segment size, and also we do not count DATA chunk header overhead in the computation. This actuall allows situations where a user can set low 'frag_point', and then send small messages without delay. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: Try not to change a_rwnd when faking a SACK from SHUTDOWN.Vlad Yasevich2009-09-051-3/+4
| | | | | | | | | | | | We currently set a_rwnd to 0 when faking a SACK from SHUTDOWN. This results in an hung association if the remote only uses SHUTDOWNs (which it's allowed to do) to acknowlege DATA when closing. The reason for that is that we simply honor the a_rwnd from the sack, but since we faked it to be 0, we enter 0-window probing. The fix is to use the peers old rwnd and add our flight size to it. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: drop a_rwnd to 0 when receive buffer overflows.Vlad Yasevich2009-09-052-2/+41
| | | | | | | | | | | | | | SCTP has a problem that when small chunks are used, it is possible to exhaust the receiver buffer without fully closing receive window. This happens due to all overhead that we have account for with small messages. To fix this, when receive buffer is exceeded, we'll drop the window to 0 and save the 'drop' portion. When application starts reading data and freeing up recevie buffer space, we'll wait until we've reached the 'drop' window and then add back this 'drop' one mtu at a time. This worked well in testing and under stress produced rather even recovery. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: Clear fast_recovery on the transport when T3 timer expires.Vlad Yasevich2009-09-051-0/+3
| | | | | | | | If T3 timer expires, we are retransmitting data due to timeout any any fast recovery is null and void. We can clear the fast recovery flag. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: Fix error count increments that were results of HEARTBEATSVlad Yasevich2009-09-052-5/+17
| | | | | | | | | | | | | | | | | | | | | | | SCTP RFC 4960 states that unacknowledged HEARTBEATS count as errors agains a given transport or endpoint. As such, we should increment the error counts for only for unacknowledged HB, otherwise we detect failure too soon. This goes for both the overall error count and the path error count. Now, there is a difference in how the detection is done between the two. The path error detection is done after the increment, so to detect it properly, we actually need to exceed the path threshold. The overall error detection is done _BEFORE_ the increment. Thus to detect the failure, it's enough for the error count to match the threshold. This is why all the state functions use '>=' to detect failure, while path detection uses '>'. Thanks goes to Chunbo Luo <chunbo.luo@windriver.com> who first proposed patches to fix this issue and made me re-read the spec and the code to figure out how this cruft really works. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: use proc_create()Alexey Dobriyan2009-09-051-3/+1
| | | | | | | create_proc_entry() is deprecated (not formally, though). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: fix check the chunk length of received HEARTBEAT-ACK chunkWei Yongjun2009-09-051-1/+2
| | | | | | | | | | | | | | | The receiver of the HEARTBEAT should respond with a HEARTBEAT ACK that contains the Heartbeat Information field copied from the received HEARTBEAT chunk. So the received HEARTBEAT-ACK chunk must have a length of: sizeof(sctp_chunkhdr_t) + sizeof(sctp_sender_hb_info_t) A badly formatted HB-ACK chunk, it is possible that we may access invalid memory. We should really make sure that the chunk format is what we expect, before attempting to touch the data. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: drop SHUTDOWN chunk if the TSN is less than the CTSNWei Yongjun2009-09-051-1/+15
| | | | | | | | If Cumulative TSN Ack field of SHUTDOWN chunk is less than the Cumulative TSN Ack Point then drop the SHUTDOWN chunk. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: Send user messages to the lower layer as oneVlad Yasevich2009-09-056-15/+61
| | | | | | | | | | | | | | | | | Currenlty, sctp breaks up user messages into fragments and sends each fragment to the lower layer by itself. This means that for each fragment we go all the way down the stack and back up. This also discourages bundling of multiple fragments when they can fit into a sigle packet (ex: due to user setting a low fragmentation threashold). We introduce a new command SCTP_CMD_SND_MSG and hand the whole message down state machine. The state machine and the side-effect parser will cork the queue, add all chunks from the message to the queue, and then un-cork the queue thus causing the chunks to get transmitted. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: Try to encourage SACK bundling with DATA.Vlad Yasevich2009-09-051-5/+16
| | | | | | | | | If the association has a SACK timer pending and now DATA queued to be send, we'll try to bundle the SACK with the next application send. As such, try encourage bundling by accounting for SACK in the size of the first chunk fragment. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: Generate SACKs when actually sending outbound DATAVlad Yasevich2009-09-051-57/+85
| | | | | | | | | | | | | | | We are now trying to bundle SACKs when we have outbound DATA to send. However, there are situations where this outbound DATA will not be sent (due to congestion or available window). In such cases it's ok to wait for the timer to expire. This patch refactors the sending code so that betfore attempting to bundle the SACK we check to see if the DATA will actually be transmitted. Based on eirlier works for Doug Graham <dgraham@nortel.com> and Wei Youngjun <yjwei@cn.fujitsu.com>. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: Fix data segmentation with small frag_sizeVlad Yasevich2009-09-051-9/+23
| | | | | | | | | | | | | | | Since an application may specify the maximum SCTP fragment size that all data should be fragmented to, we need to fix how we do segmentation. Right now, if a user specifies a small fragment size, the segment size can go negative in the presence of AUTH or COOKIE_ECHO bundling. What we need to do is track the largest possbile DATA chunk that can fit into the mtu. Then if the fragment size specified is bigger then this maximum length, we'll shrink it down. Otherwise, we just use the smaller segment size without changing it further. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: Disallow new connection on a closing socketVlad Yasevich2009-09-053-1/+11
| | | | | | | | | | | If a socket has a lot of association that are in the process of of being closed/aborted, it is possible for a remote to establish new associations during the time period that the old ones are shutting down. If this was a result of a close() call, there will be no socket and will cause a memory leak. We'll prevent this by setting the socket state to CLOSING and disallow new associations when in this state. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: Fix piggybacked ACKsDoug Graham2009-09-051-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch corrects the conditions under which a SACK will be piggybacked on a DATA packet. The previous condition was incorrect due to a misinterpretation of RFC 4960 and/or RFC 2960. Specifically, the following paragraph from section 6.2 had not been implemented correctly: Before an endpoint transmits a DATA chunk, if any received DATA chunks have not been acknowledged (e.g., due to delayed ack), the sender should create a SACK and bundle it with the outbound DATA chunk, as long as the size of the final SCTP packet does not exceed the current MTU. See Section 6.2. When about to send a DATA chunk, the code now checks to see if the SACK timer is running. If it is, we know we have a SACK to send to the peer, so we append the SACK (assuming available space in the packet) and turn off the timer. For a simple request-response scenario, this will result in the SACK being bundled with the response, meaning the the SACK is received quickly by the client, and also meaning that no separate SACK packet needs to be sent by the server to acknowledge the request. Prior to this patch, a separate SACK packet would have been sent by the server SCTP only after its delayed-ACK timer had expired (usually 200ms). This is wasteful of bandwidth, and can also have a major negative impact on performance due the interaction of delayed ACKs with the Nagle algorithm. Signed-off-by: Doug Graham <dgraham@nortel.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: remove unused union (sctp_cmsg_data_t) definitionRami Rosen2009-09-051-6/+0
| | | | | | | | This patch removes an unused union definition (sctp_cmsg_data_t) from include/net/sctp/user.h. Signed-off-by: Rami Rosen <rosenrami@gmail.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: release cached route when the transport goes down.Vlad Yasevich2009-09-051-2/+7
| | | | | | | | | When the sctp transport is marked down, we can release the cached route and force a new lookup when attempting to use this transport for anything. This way, if a better route or source address is available, we'll try to use it. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: update the route for non-active transports after addresses are addedWei Yongjun2009-09-051-0/+8
| | | | | | | | | Update the route and saddr entries for the non-active transports as some of the added addresses can be used as better source addresses, or may be there is a better route. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
* sctp: check the unrecognized ASCONF parameter before access itWei Yongjun2009-09-051-3/+5
| | | | | | | | This patch fix to check the unrecognized ASCONF parameter before access it. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>