linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	ipv4: Use flowi4 in public route lookup interfaces.	David S. Miller	2011-03-13	9	-127/+134
\| \| \| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv4: Use struct flowi4 internally in routing lookups.	David S. Miller	2011-03-13	1	-115/+115
\| \| \| \| \| \|	We will change the externally visible APIs next. Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv4: Pass ipv4 flow objects into fib_lookup() paths.	David S. Miller	2011-03-13	5	-17/+17
\| \| \| \| \| \| \| \|	To start doing these conversions, we need to add some temporary flow4_* macros which will eventually go away when all the protocol code paths are changed to work on AF specific flowi objects. Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: Break struct flowi out into AF specific instances.	David S. Miller	2011-03-13	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	Now we have struct flowi4, flowi6, and flowidn for each address family. And struct flowi is just a union of them all. It might have been troublesome to convert flow_cache_uli_match() but as it turns out this function is completely unused and therefore can be simply removed. Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: Make flowi ports AF dependent.	David S. Miller	2011-03-13	9	-28/+28
\| \| \| \| \| \| \| \| \| \| \| \|	Create two sets of port member accessors, one set prefixed by fl4_* and the other prefixed by fl6_* This will let us to create AF optimal flow instances. It will work because every context in which we access the ports, we have to be fully aware of which AF the flowi is anyways. Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: Put flowi_* prefix on AF independent members of struct flowi	David S. Miller	2011-03-13	14	-107/+116
\| \| \| \| \| \| \| \| \| \|	I intend to turn struct flowi into a union of AF specific flowi structs. There will be a common structure that each variant includes first, much like struct sock_common. This is the first step to move in that direction. Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv4: Create and use route lookup helpers.	David S. Miller	2011-03-13	7	-133/+75
\| \| \| \| \| \| \|	The idea here is this minimizes the number of places one has to edit in order to make changes to how flows are defined and used. Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv4: Kill flowi arg to fib_select_multipath()	David S. Miller	2011-03-11	2	-3/+3
\| \| \| \| \| \|	Completely unused. Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv4: Remove unnecessary test from ip_mkroute_input()	David S. Miller	2011-03-11	1	-1/+1
\| \| \| \| \| \| \|	fl->oif will always be zero on the input path, so there is no reason to test for that. Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv4: Remove redundant RCU locking in ip_check_mc().	David S. Miller	2011-03-11	2	-7/+6
\| \| \| \| \| \| \| \|	All callers are under rcu_read_lock() protection already. Rename to ip_check_mc_rcu() to make it even more clear. Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'master' of ↵	David S. Miller	2011-03-10	3	-5/+5
\|\ \| \| \| \| \| \| \| \| \| \| \| \|	master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bnx2x/bnx2x_cmn.c
\| *	Merge branch 'master' of /home/davem/src/GIT/linux-2.6/	David S. Miller	2011-03-10	2	-2/+2
\| \|\
\| \| *	net: don't allow CAP_NET_ADMIN to load non-netdev kernel modules	Vasiliy Kulikov	2011-03-10	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since a8f80e8ff94ecba629542d9b4b5f5a8ee3eb565c any process with CAP_NET_ADMIN may load any module from /lib/modules/. This doesn't mean that CAP_NET_ADMIN is a superset of CAP_SYS_MODULE as modules are limited to /lib/modules/**. However, CAP_NET_ADMIN capability shouldn't allow anybody load any module not related to networking. This patch restricts an ability of autoloading modules to netdev modules with explicit aliases. This fixes CVE-2011-1019. Arnd Bergmann suggested to leave untouched the old pre-v2.6.32 behavior of loading netdev modules by name (without any prefix) for processes with CAP_SYS_MODULE to maintain the compatibility with network scripts that use autoloading netdev modules by aliases like "eth0", "wlan0". Currently there are only three users of the feature in the upstream kernel: ipip, ip_gre and sit. root@albatros:~# capsh --drop=$(seq -s, 0 11),$(seq -s, 13 34) -- root@albatros:~# grep Cap /proc/$$/status CapInh: 0000000000000000 CapPrm: fffffff800001000 CapEff: fffffff800001000 CapBnd: fffffff800001000 root@albatros:~# modprobe xfs FATAL: Error inserting xfs (/lib/modules/2.6.38-rc6-00001-g2bf4ca3/kernel/fs/xfs/xfs.ko): Operation not permitted root@albatros:~# lsmod \| grep xfs root@albatros:~# ifconfig xfs xfs: error fetching interface information: Device not found root@albatros:~# lsmod \| grep xfs root@albatros:~# lsmod \| grep sit root@albatros:~# ifconfig sit sit: error fetching interface information: Device not found root@albatros:~# lsmod \| grep sit root@albatros:~# ifconfig sit0 sit0 Link encap:IPv6-in-IPv4 NOARP MTU:1480 Metric:1 root@albatros:~# lsmod \| grep sit sit 10457 0 tunnel4 2957 1 sit For CAP_SYS_MODULE module loading is still relaxed: root@albatros:~# grep Cap /proc/$$/status CapInh: 0000000000000000 CapPrm: ffffffffffffffff CapEff: ffffffffffffffff CapBnd: ffffffffffffffff root@albatros:~# ifconfig xfs xfs: error fetching interface information: Device not found root@albatros:~# lsmod \| grep xfs xfs 745319 0 Reference: https://lkml.org/lkml/2011/2/24/203 Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Kees Cook <kees.cook@canonical.com> Signed-off-by: James Morris <jmorris@namei.org>
\| * \|	ipv4: Fix erroneous uses of ifa_address.	David S. Miller	2011-03-09	1	-3/+3
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In usual cases ifa_address == ifa_local, but in the case where SIOCSIFDSTADDR sets the destination address on a point-to-point link, ifa_address gets set to that destination address. Therefore we should use ifa_local when we want the local interface address. There were two cases where the selection was done incorrectly: 1) When devinet_ioctl() does matching, it checks ifa_address even though gifconf correct reported ifa_local to the user 2) IN_DEV_ARP_NOTIFY handling sends a gratuitous ARP using ifa_address instead of ifa_local. Reported-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: mark tcp_congestion_ops read_mostly	Stephen Hemminger	2011-03-10	12	-12/+12
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Optimize flow initialization in fib_validate_source().	David S. Miller	2011-03-10	1	-7/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Like in commit 44713b67db10c774f14280c129b0d5fd13c70cf2 ("ipv4: Optimize flow initialization in output route lookup." we can optimize the on-stack flow setup to only initialize the members which are actually used. Otherwise we bzero the entire structure, then initialize explicitly the first half of it. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Optimize flow initialization in input route lookup.	David S. Miller	2011-03-10	1	-6/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Like in commit 44713b67db10c774f14280c129b0d5fd13c70cf2 ("ipv4: Optimize flow initialization in output route lookup." we can optimize the on-stack flow setup to only initialize the members which are actually used. Otherwise we bzero the entire structure, then initialize explicitly the first half of it. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	tcp: ioctl type SIOCOUTQNSD returns amount of data not sent	Mario Schuknecht	2011-03-09	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In contrast to SIOCOUTQ which returns the amount of data sent but not yet acknowledged plus data not yet sent this patch only returns the data not sent. For various methods of live streaming bitrate control it may be helpful to know how much data are in the tcp outqueue are not sent yet. Signed-off-by: Mario Schuknecht <m.schuknecht@dresearch.de> Signed-off-by: Steffen Sledz <sledz@dresearch.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Lookup multicast routes by rtable using helper.	David S. Miller	2011-03-09	1	-42/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Create a common helper for this operation, since we do it identically in three spots. Suggested by Eric Dumazet. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	inetpeer: Don't disable BH for initial fast RCU lookup.	David S. Miller	2011-03-08	1	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If modifications on other cpus are ok, then modifications to the tree during lookup done by the local cpu are ok too. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Fix scope value used in route src-address caching.	David S. Miller	2011-03-08	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have to use cfg->fc_scope not the final nh_scope value. Reported-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Cache source address in nexthop entries.	David S. Miller	2011-03-08	2	-7/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When doing output route lookups, we have to select the source address if the user has not specified an explicit one. First, if the route has an explicit preferred source address specified, then we use that. Otherwise we search the route's outgoing interface for a suitable address. This search can be precomputed and cached at route insertion time. The only missing part is that we have to refresh this precomputed value any time addresses are added or removed from the interface, and this is accomplished by fib_update_nh_saddrs(). Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Inline fib_semantic_match into check_leaf	David S. Miller	2011-03-08	3	-75/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This elimiates a lot of pure overhead due to parameter passing. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Validate route entry type at insert instead of every lookup.	David S. Miller	2011-03-07	1	-26/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	fib_semantic_match() requires that if the type doesn't signal an automatic error, it must be of type RTN_UNICAST, RTN_LOCAL, RTN_BROADCAST, RTN_ANYCAST, or RTN_MULTICAST. Checking this every route lookup is pointless work. Instead validate it during route insertion, via fib_create_info(). Also, there was nothing making sure the type value was less than RTN_MAX, so add that missing check while we're here. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Remove flowi from struct rtable.	David S. Miller	2011-03-05	4	-83/+131
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The only necessary parts are the src/dst addresses, the interface indexes, the TOS, and the mark. The rest is unnecessary bloat, which amounts to nearly 50 bytes on 64-bit. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Set rt->rt_iif more sanely on output routes.	David S. Miller	2011-03-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rt->rt_iif is only ever inspected on input routes, for example DCCP uses this to populate a route lookup flow key when generating replies to another packet. Therefore, setting it to anything other than zero on output routes makes no sense. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Get peer more cheaply in rt_init_metrics().	David S. Miller	2011-03-05	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We know this is a new route object, so doing atomics and stuff makes no sense at all. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Optimize flow initialization in output route lookup.	David S. Miller	2011-03-05	1	-8/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We burn a lot of useless cycles, cpu store buffer traffic, and memory operations memset()'ing the on-stack flow used to perform output route lookups in __ip_route_output_key(). Only the first half of the flow object members even matter for output route lookups in this context, specifically: FIB rules matching cares about: dst, src, tos, iif, oif, mark FIB trie lookup cares about: dst FIB semantic match cares about: tos, scope, oif Therefore only initialize these specific members and elide the memset entirely. On Niagara2 this kills about ~300 cycles from the output route lookup path. Likely, we can take things further, since all callers of output route lookups essentially throw away the on-stack flow they use. So they don't care if we use it as a scratch-pad to compute the final flow key. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
* \|	inetpeer: seqlock optimization	Eric Dumazet	2011-03-04	1	-40/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	David noticed : ------------------ Eric, I was profiling the non-routing-cache case and something that stuck out is the case of calling inet_getpeer() with create==0. If an entry is not found, we have to redo the lookup under a spinlock to make certain that a concurrent writer rebalancing the tree does not "hide" an existing entry from us. This makes the case of a create==0 lookup for a not-present entry really expensive. It is on the order of 600 cpu cycles on my Niagara2. I added a hack to not do the relookup under the lock when create==0 and it now costs less than 300 cycles. This is now a pretty common operation with the way we handle COW'd metrics, so I think it's definitely worth optimizing. ----------------- One solution is to use a seqlock instead of a spinlock to protect struct inet_peer_base. After a failed avl tree lookup, we can easily detect if a writer did some changes during our lookup. Taking the lock and redo the lookup is only necessary in this case. Note: Add one private rcu_deref_locked() macro to place in one spot the access to spinlock included in seqlock. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	Merge branch 'master' of ↵	David S. Miller	2011-03-04	2	-3/+4
\|\\| \| \| \| \| \| \| \| \| \| \| \| \|	master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bnx2x/bnx2x.h
\| *	tcp: undo_retrans counter fixes	Yuchung Cheng	2011-02-21	2	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix a bug that undo_retrans is incorrectly decremented when undo_marker is not set or undo_retrans is already 0. This happens when sender receives more DSACK ACKs than packets retransmitted during the current undo phase. This may also happen when sender receives DSACK after the undo operation is completed or cancelled. Fix another bug that undo_retrans is incorrectly incremented when sender retransmits an skb and tcp_skb_pcount(skb) > 1 (TSO). This case is rare but not impossible. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Fix __ip_dev_find() to use ifa_local instead of ifa_address.	David S. Miller	2011-03-03	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Reported-by: Stephen Hemminger <shemminger@vyatta.com> Reported-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Fix crash in dst_release when udp_sendmsg route lookup fails.	David S. Miller	2011-03-03	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As reported by Eric: [11483.697233] IP: [<c12b0638>] dst_release+0x18/0x60 ... [11483.697741] Call Trace: [11483.697764] [<c12fc9d2>] udp_sendmsg+0x282/0x6e0 [11483.697790] [<c12a1c01>] ? memcpy_toiovec+0x51/0x70 [11483.697818] [<c12dbd90>] ? ip_generic_getfrag+0x0/0xb0 The pointer passed to dst_release() is -EINVAL, that's because we leave an error pointer in the local variable "rt" by accident. NULL it out to fix the bug. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: ip_route_output_key() is better as an inline.	David S. Miller	2011-03-02	1	-6/+0
\| \| \| \| \| \| \| \| \| \| \| \|	This avoid a stack frame at zero cost. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Make output route lookup return rtable directly.	David S. Miller	2011-03-02	17	-138/+160
\| \| \| \| \| \| \| \| \| \| \| \|	Instead of on the stack. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	xfrm: Return dst directly from xfrm_lookup()	David S. Miller	2011-03-02	3	-25/+24
\| \| \| \| \| \| \| \| \| \| \| \|	Instead of on the stack. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	inet: Replace left-over references to inet->cork	Herbert Xu	2011-03-02	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The patch to replace inet->cork with cork left out two spots in __ip_append_data that can result in bogus packet construction. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Make icmp route lookup code a bit clearer.	David S. Miller	2011-03-02	1	-79/+96
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The route lookup code in icmp_send() is slightly tricky as a result of having to handle all of the requirements of RFC 4301 host relookups. Pull the route resolution into a seperate function, so that the error handling and route reference counting is hopefully easier to see and contained wholly within this new routine. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	xfrm: Handle blackhole route creation via afinfo.	David S. Miller	2011-03-01	2	-13/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	That way we don't have to potentially do this in every xfrm_lookup() caller. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	xfrm: Kill XFRM_LOOKUP_WAIT flag.	David S. Miller	2011-03-01	1	-3/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This can be determined from the flow flags instead. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Kill can_sleep arg to ip_route_output_flow()	David S. Miller	2011-03-01	6	-8/+9
\| \| \| \| \| \| \| \| \| \| \| \|	This boolean state is now available in the flow flags. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	net: Add FLOWI_FLAG_CAN_SLEEP.	David S. Miller	2011-03-01	2	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \|	And set is in contexts where the route resolution can sleep. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Make final arg to ip_route_output_flow to be boolean "can_sleep"	David S. Miller	2011-03-01	6	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \|	Since that is what the current vague "flags" argument means. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Can final ip_route_connect() arg to boolean "can_sleep".	David S. Miller	2011-03-01	3	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Since that's what the current vague "flags" thing means. Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	udp: Add lockless transmit path	Herbert Xu	2011-03-01	1	-1/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The UDP transmit path has been running under the socket lock for a long time because of the corking feature. This means that transmitting to the same socket in multiple threads does not scale at all. However, as most users don't actually use corking, the locking can be removed in the common case. This patch creates a lockless fast path where corking is not used. Please note that this does create a slight inaccuracy in the enforcement of socket send buffer limits. In particular, we may exceed the socket limit by up to (number of CPUs) * (packet size) because of the way the limit is computed. As the primary purpose of socket buffers is to indicate congestion, this should not be a great problem for now. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	udp: Switch to ip_finish_skb	Herbert Xu	2011-03-01	1	-33/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch converts UDP to use the new ip_finish_skb API. This would then allows us to more easily use ip_make_skb which allows UDP to run without a socket lock. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	inet: Add ip_make_skb and ip_finish_skb	Herbert Xu	2011-03-01	1	-14/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds the helper ip_make_skb which is like ip_append_data and ip_push_pending_frames all rolled into one, except that it does not send the skb produced. The sending part is carried out by ip_send_skb, which the transport protocol can call after it has tweaked the skb. It is meant to be called in cases where corking is not used should have a one-to-one correspondence to sendmsg. This patch also adds the helper ip_finish_skb which is meant to be replace ip_push_pending_frames when corking is required. Previously the protocol stack would peek at the socket write queue and add its header to the first packet. With ip_finish_skb, the protocol stack can directly operate on the final skb instead, just like the non-corking case with ip_make_skb. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	inet: Remove explicit write references to sk/inet in ip_append_data	Herbert Xu	2011-03-01	1	-98/+140
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order to allow simultaneous calls to ip_append_data on the same socket, it must not modify any shared state in sk or inet (other than those that are designed to allow that such as atomic counters). This patch abstracts out write references to sk and inet_sk in ip_append_data and its friends so that we may use the underlying code in parallel. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	inet: Remove unused sk_sndmsg_* from UFO	Herbert Xu	2011-03-01	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	UFO doesn't really use the sk_sndmsg_* parameters so touching them is pointless. It can't use them anyway since the whole point of UFO is to use the original pages without copying. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	ipv4: Rearrange how ip_route_newports() gets port keys.	David S. Miller	2011-02-24	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ip_route_newports() is the only place in the entire kernel that cares about the port members in the routing cache entry's lookup flow key. Therefore the only reason we store an entire flow inside of the struct rtentry is for this one special case. Rewrite ip_route_newports() such that: 1) The caller passes in the original port values, so we don't need to use the rth->fl.fl_ip_{s,d}port values to remember them. 2) The lookup flow is constructed by hand instead of being copied from the routing cache entry's flow. Signed-off-by: David S. Miller <davem@davemloft.net>