summaryrefslogtreecommitdiffstats
path: root/net (follow)
Commit message (Collapse)AuthorAgeFilesLines
* ipsec: Use RCU-like construct for saved state within a walkHerbert Xu2008-09-101-13/+39
| | | | | | | | | | | | | | | | | Now that we save states within a walk we need synchronisation so that the list the saved state is on doesn't disappear from under us. As it stands this is done by keeping the state on the list which is bad because it gets in the way of the management of the state life-cycle. An alternative is to make our own pseudo-RCU system where we use counters to indicate which state can't be freed immediately as it may be referenced by an ongoing walk when that resumes. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'lvs-next-2.6' of ↵David S. Miller2008-09-1022-726/+2231
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/horms/lvs-2.6
| * Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 into ↵Simon Horman2008-09-1051-256/+415
| |\ | | | | | | | | | lvs-next-2.6
| * | ipvs: Embed user stats structure into kernel stats structureSven Wegener2008-09-093-67/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of duplicating the fields, integrate a user stats structure into the kernel stats structure. This is more robust when the members are changed, because they are now automatically kept in sync. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Reviewed-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | ipvs: Restrict connection table size via KconfigSven Wegener2008-09-091-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of checking the value in include/net/ip_vs.h, we can just restrict the range in our Kconfig file. This will prevent values outside of the range early. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Reviewed-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Remove incorrect ip_route_me_harder(), fix IPv6Julius Volz2008-09-091-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | Remove an incorrect ip_route_me_harder() that was probably a result of merging my IPv6 patches with the local client patches. With this, IPv6+NAT are working again. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | ipvs: handle PARTIAL_CHECKSUMSimon Horman2008-09-092-4/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that LVS can load balance locally generated traffic, packets may come from the loopback device and thus may have a partial checksum. The existing code allows for the case where there is no checksum at all for TCP, however Herbert Xu has confirmed that this is not legal. Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julius Volz <juliusv@google.com>
| * | IPVS: use ipv6_addr_copy()Simon Horman2008-09-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is standard to use ipv6_addr_copy() to fill in the in6 element of a union nf_inet_addr snet. Thanks to Julius Volz for pointing this out. Cc: Brian Haley <brian.haley@hp.com> Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julius Volz <juliusv@google.com>
| * | IPVS: fix bogus indentationSimon Horman2008-09-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Sorry, this was my error. Thanks to Julius Volz for pointing it out. Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Julius Volz <juliusv@google.com>
| * | ipvs: Reject ipv6 link-local addresses for destinationsSven Wegener2008-09-081-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can't use non-local link-local addresses for destinations, without knowing the interface on which we can reach the address. Reject them for now. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Acked-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | ipvs: Mark tcp/udp v4 and v6 debug functions staticSven Wegener2008-09-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | They are only used in this file, so they should be static Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Acked-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | ipvs: Return negative error values from ip_vs_edit_service()Sven Wegener2008-09-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | Like the other code in this function does. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Acked-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | ipvs: Use pointer to address from sync messageSven Wegener2008-09-081-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | We want a pointer to it, not the value casted to a pointer. Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Acked-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | ipvs: load balance ipv6 connections from a local processSimon Horman2008-09-051-50/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows IPVS to load balance IPv6 connections made by a local process. For example a proxy server running locally. External client --> pound:443 -> Local:443 --> IPVS:80 --> RealServer This is an extenstion to the IPv4 work done in this area by Siim Põder and Malcolm Turnbull. Cc: Siim Põder <siim@p6drad-teel.net> Cc: Malcolm Turnbull <malcolm@loadbalancer.org> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | ipvs: load balance IPv4 connections from a local processMalcolm Turnbull2008-09-052-94/+134
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows IPVS to load balance connections made by a local process. For example a proxy server running locally. External client --> pound:443 -> Local:443 --> IPVS:80 --> RealServer Signed-off-by: Siim Põder <siim@p6drad-teel.net> Signed-off-by: Malcolm Turnbull <malcolm@loadbalancer.org> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Allow adding IPv6 services from userspaceJulius Volz2008-09-051-5/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow adding IPv6 services through the genetlink interface and add checks to see if the chosen scheduler is supported with IPv6 and whether the supplied prefix length is sane. Make sure the service count exported via the sockopt interface only counts IPv4 services. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Activate IPv6 Netfilter hooksJulius Volz2008-09-051-0/+37
| | | | | | | | | | | | | | | | | | | | | | | | Register the previously defined or adapted netfilter hook functions for IPv6 as PF_INET6 hooks. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Adjust various debug outputs to use new macrosJulius Volz2008-09-054-63/+78
| | | | | | | | | | | | | | | | | | | | | | | | Adjust various debug outputs to use the new *_BUF macro variants for correct output of v4/v6 addresses. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Add function to determine if IPv6 address is localVince Busam2008-09-051-7/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add __ip_vs_addr_is_local_v6() to find out if an IPv6 address belongs to a local interface. Use this function to decide whether to set the IP_VS_CONN_F_LOCALNODE flag for IPv6 destinations. Signed-off-by: Vince Busam <vbusam@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Turn off FTP application helper for IPv6Julius Volz2008-09-051-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | Immediately return from FTP application helper and do nothing when dealing with IPv6 packets. IPv6 is not supported by this helper yet. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IVPS: Disable sync daemon for IPv6 connectionsJulius Volz2008-09-051-1/+2
| | | | | | | | | | | | | | | | | | | | | Disable the sync daemon for IPv6 connections, works only with IPv4 for now. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Convert procfs files for IPv6 entry outputVince Busam2008-09-052-18/+73
| | | | | | | | | | | | | | | | | | | | | Correctly output IPv6 connection/service/dest entries in procfs files. Signed-off-by: Vince Busam <vbusam@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Convert real server lookup functionsJulius Volz2008-09-054-34/+62
| | | | | | | | | | | | | | | | | | | | | | | | Convert functions for looking up destinations (real servers) to support IPv6 services/dests. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Add/adjust Netfilter hook functions and helpers for v6Julius Volz2008-09-051-36/+329
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add Netfilter hook functions or modify existing ones, if possible, to process IPv6 packets. Some support functions are also added/modified for this. ip_vs_nat_icmp_v6() was already added in the patch that added the v6 xmit functions, as it is called from one of them. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Extend scheduling functions for IPv6 supportJulius Volz2008-09-051-25/+31
| | | | | | | | | | | | | | | | | | | | | | | | Convert ip_vs_schedule() and ip_vs_sched_persist() to support scheduling of IPv6 connections. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Add and bind IPv6 xmit functionsJulius Volz2008-09-053-1/+453
| | | | | | | | | | | | | | | | | | | | | | | | Add xmit functions for IPv6. Also add the already needed __ip_vs_get_out_rt_v6() to ip_vs_core.c. Bind the new xmit functions to v6 connections. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Add IPv6 support to xmit() support functionsJulius Volz2008-09-051-7/+75
| | | | | | | | | | | | | | | | | | | | | | | | Add IPv6 support to IP_VS_XMIT() and to the xmit routing cache, introducing a new function __ip_vs_get_out_rt_v6(). Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Extend functions for getting/creating connectionsJulius Volz2008-09-057-158/+198
| | | | | | | | | | | | | | | | | | | | | | | | Extend functions for getting/creating connections and connection templates for IPv6 support and fix the callers. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Extend protocol DNAT/SNAT and state handlersJulius Volz2008-09-052-36/+131
| | | | | | | | | | | | | | | | | | | | | | | | Extend protocol DNAT/SNAT and state handlers to work with IPv6. Also change/introduce new checksumming helper functions for this. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Add protocol debug functions for IPv6Julius Volz2008-09-052-6/+93
| | | | | | | | | | | | | | | | | | | | | | | | Add protocol (TCP, UDP, AH, ESP) debug functions for IPv6 packet debug output. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Add 'af' args to protocol handler functionsJulius Volz2008-09-054-118/+162
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add 'af' arguments to conn_schedule(), conn_in_get(), conn_out_get() and csum_check() function pointers in struct ip_vs_protocol. Extend the respective functions for TCP, UDP, AH and ESP and adjust the callers. The changes in the callers need to be somewhat extensive, since they now need to pass a filled out struct ip_vs_iphdr * to the modified functions instead of a struct iphdr *. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Add IPv6 support flag to schedulersJulius Volz2008-09-0510-33/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add 'supports_ipv6' flag to struct ip_vs_scheduler to indicate whether a scheduler supports IPv6. Set the flag to 1 in schedulers that work with IPv6, 0 otherwise. This flag is checked in a later patch while trying to add a service with a specific scheduler. Adjust debug in v6-supporting schedulers to work with both address families. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Add v6 support to ip_vs_service_get()Julius Volz2008-09-053-18/+26
| | | | | | | | | | | | | | | | | | | | | | | | Add support for selecting services based on their address family to ip_vs_service_get() and adjust the callers. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Convert __ip_vs_svc_get() and __ip_vs_fwm_get()Julius Volz2008-09-051-32/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for getting services based on their address family to __ip_vs_service_get(), __ip_vs_fwm_get() and the helper hash function ip_vs_svc_hashkey(). Adjust the callers. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Add internal versions of sockopt interface structsJulius Volz2008-09-051-48/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add extended internal versions of struct ip_vs_service_user and struct ip_vs_dest_user (the originals can't be modified as they are part of the old sockopt interface). Adjust ip_vs_ctl.c to work with the new data structures and add some minor AF-awareness. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Change IPVS data structures to support IPv6 addressesJulius Volz2008-09-0518-108/+109
| | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce new 'af' fields into IPVS data structures for specifying an entry's address family. Convert IP addresses to be of type union nf_inet_addr. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
| * | IPVS: Add CONFIG_IP_VS_IPV6 option for IPv6 supportJulius Volz2008-09-051-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | Add boolean config option CONFIG_IP_VS_IPV6 for enabling experimental IPv6 support in IPVS. Only visible if IPv6 support is set to 'y' or both IPv6 and IPVS are modules. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
* | | Merge branch 'master' of ↵David S. Miller2008-09-108-50/+87
|\ \ \ | |_|/ |/| | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
| * | ipv6: Fix OOPS in ip6_dst_lookup_tail().Neil Horman2008-09-091-32/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes kernel bugzilla 11469: "TUN with 1024 neighbours: ip6_dst_lookup_tail NULL crash" dst->neighbour is not necessarily hooked up at this point in the processing path, so blindly dereferencing it is the wrong thing to do. This NULL check exists in other similar paths and this case was just an oversight. Also fix the completely wrong and confusing indentation here while we're at it. Based upon a patch by Evgeniy Polyakov. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ipsec: Restore larval states and socket policies in dumpHerbert Xu2008-09-092-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit commit 4c563f7669c10a12354b72b518c2287ffc6ebfb3 ("[XFRM]: Speed up xfrm_policy and xfrm_state walking") inadvertently removed larval states and socket policies from netlink dumps. This patch restores them. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge branch 'master' of ↵David S. Miller2008-09-095-18/+52
| |\ \ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6
| | * | [Bluetooth] Reject L2CAP connections on an insecure ACL linkMarcel Holtmann2008-09-093-5/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Security Mode 4 of the Bluetooth 2.1 specification has strict authentication and encryption requirements. It is the initiators job to create a secure ACL link. However in case of malicious devices, the acceptor has to make sure that the ACL is encrypted before allowing any kind of L2CAP connection. The only exception here is the PSM 1 for the service discovery protocol, because that is allowed to run on an insecure ACL link. Previously it was enough to reject a L2CAP connection during the connection setup phase, but with Bluetooth 2.1 it is forbidden to do any L2CAP protocol exchange on an insecure link (except SDP). The new hci_conn_check_link_mode() function can be used to check the integrity of an ACL link. This functions also takes care of the cases where Security Mode 4 is disabled or one of the devices is based on an older specification. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| | * | [Bluetooth] Enforce correct authentication requirementsMarcel Holtmann2008-09-093-6/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the introduction of Security Mode 4 and Simple Pairing from the Bluetooth 2.1 specification it became mandatory that the initiator requires authentication and encryption before any L2CAP channel can be established. The only exception here is PSM 1 for the service discovery protocol (SDP). It is meant to be used without any encryption since it contains only public information. This is how Bluetooth 2.0 and before handle connections on PSM 1. For Bluetooth 2.1 devices the pairing procedure differentiates between no bonding, general bonding and dedicated bonding. The L2CAP layer wrongly uses always general bonding when creating new connections, but it should not do this for SDP connections. In this case the authentication requirement should be no bonding and the just-works model should be used, but in case of non-SDP connection it is required to use general bonding. If the new connection requires man-in-the-middle (MITM) protection, it also first wrongly creates an unauthenticated link key and then later on requests an upgrade to an authenticated link key to provide full MITM protection. With Simple Pairing the link key generation is an expensive operation (compared to Bluetooth 2.0 and before) and doing this twice during a connection setup causes a noticeable delay when establishing a new connection. This should be avoided to not regress from the expected Bluetooth 2.0 connection times. The authentication requirements are known up-front and so enforce them. To fulfill these requirements the hci_connect() function has been extended with an authentication requirement parameter that will be stored inside the connection information and can be retrieved by userspace at any time. This allows the correct IO capabilities exchange and results in the expected behavior. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| | * | [Bluetooth] Fix reference counting during ACL config stageMarcel Holtmann2008-09-091-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ACL config stage keeps holding a reference count on incoming connections when requesting the extended features. This results in keeping an ACL link up without any users. The problem here is that the Bluetooth specification doesn't define an ownership of the ACL link and thus it can happen that the implementation on the initiator side doesn't care about disconnecting unused links. In this case the acceptor needs to take care of this. Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
* | | | ipsec: Make xfrm_larval_drop default to 1.David S. Miller2008-09-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The previous default behavior is definitely the least user friendly. Hanging there forever just because the keying daemon is wedged or the refreshing of the policy can't move forward is anti-social to say the least. Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | This reverts "Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/dccp_exp"Gerrit Renker2008-09-0933-3863/+2801
| | | | | | | | | | | | | | | | | | | | | | | | as it accentally contained the wrong set of patches. These will be submitted separately. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
* | | | Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/dccp_expDavid S. Miller2008-09-0933-2801/+3863
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: net/dccp/input.c net/dccp/options.c
| * | | | dccp ccid-3: Preventing OscillationsGerrit Renker2008-09-043-2/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This implements [RFC 3448, 4.5], which performs congestion avoidance behaviour by reducing the transmit rate as the queueing delay (measured in terms of long-term RTT) increases. Oscillation can be turned on/off via a module option (do_osc_prev) and via sysfs (using mode 0644), the default is off. Overflow analysis: ------------------ * oscillation prevention is done after update_x(), so that t_ipi <= 64000; * hence the multiplication "t_ipi * sqrt(R_sample)" needs 64 bits; * done using u64 for sqrt_sample and explicit typecast of t_ipi; * the divisor, R_sqmean, is non-zero because oscillation prevention is first called when receiving the second feedback packet, and tfrc_scaled_rtt() > 0. A detailed discussion of the algorithm (with plots) is on http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/ccid3/sender_notes/oscillation_prevention/ The algorithm has negative side effects: * when allowing to decrease t_ipi (leads to a large RTT) and * when using it during slow-start; both uses are therefore disabled. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
| * | | | dccp ccid-3: Simplify computing and range-checking of t_ipiGerrit Renker2008-09-041-10/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch simplifies the computation of t_ipi, avoiding expensive computations to enforce the minimum sending rate. Both RFC 3448 and rfc3448bis (revision #06), as well as RFC 4342 sec 5., require at various stages that at least one packet must be sent per t_mbi = 64 seconds. This requires frequent divisions of the type X_min = s/t_mbi, which are later converted back into an inter-packet-interval t_ipi_max = s/X_min = t_mbi. The patch removes the expensive indirection; in the unlikely case of having a sending rate less than one packet per 64 seconds, it also re-adjusts X. The following cases document conformance with RFC 3448 / rfc3448bis-06: 1) Time until receiving the first feedback packet: * if the sender has no initial RTT sample then X = s/1 Bps > s/t_mbi; * if the sender has an initial RTT sample or when the first feedback packet is received, X = W_init/R > s/t_mbi. 2) Slow-start (p == 0 and feedback packets come in): * RFC 3448 (current code) enforces a minimum of s/R > s/t_mbi; * rfc3448bis (future code) enforces an even higher minimum of W_init/R. 3) Congestion avoidance with no absence of feedback (p > 0): * when X_calc or X_recv/2 are too low, the minimum of X_min = s/t_mbi is enforced in update_x() when calling update_send_interval(); * update_send_interval() is, as before, only called when X changes (i.e. either when increasing or decreasing, not when in equilibrium). 4) Reduction of X without prior feedback or during slow-start (p==0): * both RFC 3448 and rfc3448bis here halve X directly; * the associated constraint X >= s/t_mbi is nforced here by send_interval(). 5) Reduction of X when p > 0: * X is modified indirectly via X_recv (RFC 3448) or X_recv_set (rfc3448bis); * in both cases, control goes back to section 4.3 (in both documents); * since p > 0, both documents use X = max(min(...), s/t_mbi), which is enforced in this patch by calling send_interval() from update_x(). I think that this analysis is exhaustive. Should I have forgotten a case, the worst-case consideration arises when X sinks below s/t_mbi, and is then increased back up to this minimum value. Even under this assumption, the behaviour is correct, since all lower limits of X in RFC 3448 / rfc3448bis are either equal to or greater than s/t_mbi. Note on the condition X >= s/t_mbi <==> t_ipi = s/X <= t_mbi: since X is scaled by 64, and all time units are in microseconds, the coded condition is: t_ipi = s * 64 * 10^6 usec / X <= 64 * 10^6 usec This simplifies to s / X <= 1 second <==> X * 1 second >= s > 0. (A zero `s' is not allowed by the CCID-3 code). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
| * | | | dccp ccid-3: Measuring the packet size s with regard to rfc3448bis-06Gerrit Renker2008-09-042-12/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rfc3448bis allows three different ways of tracking the packet size `s': 1. using the MSS/MPS (at initialisation, 4.2, and in 4.1 (1)); 2. using the average of `s' (in 4.1); 3. using the maximum of `s' (in 4.2). Instead of hard-coding a single interpretation of rfc3448bis, this implements a choice of all three alternatives and suggests the first as default, since it is the option which is most consistent with other parts of the specification. The patch further deprecates the update of t_ipi whenever `s' changes. The gains of doing this are only small since a change of s takes effect at the next instant X is updated: * when the next feedback comes in (within one RTT or less); * when the nofeedback timer expires (within at most 4 RTTs). Further, there are complications caused by updating t_ipi whenever s changes: * if t_ipi had previously been updated to effect oscillation prevention (4.5), then it is impossible to make the same adjustment to t_ipi again, thus counter-acting the algorithm; * s may be updated any time and a modification of t_ipi depends on the current state (e.g. no oscillation prevention is done in the absence of feedback); * in rev-06 of rfc3448bis, there are more possible cases, depending on whether the sender is in slow-start (t_ipi <= R/W_init), or in congestion-avoidance, limited by X_recv or the throughput equation (t_ipi <= t_mbi). Thus there are side effects of always updating t_ipi as s changes. These may not be desirable. The only case I can think of where such an update makes sense is to recompute X_calc when p > 0 and when s changes (not done by this patch). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>