| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
It is not easy to use in4_pton() correctly without reading
its definition, so add some doc for it.
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
| |
It is not easy to use in6_pton() correctly without reading
its definition, so add some doc for it.
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
| |
Use be32_to_cpu instead of htonl to keep sparse happy.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After commit e2446eaa ("tcp_v4_send_reset: binding oif to iif in no
sock case").. tcp resets are always lost, when routing is asymmetric.
Yes, backing out that patch will result in misrouting of resets for
dead connections which used interface binding when were alive, but we
actually cannot do anything here. What's died that's died and correct
handling normal unbound connections is obviously a priority.
Comment to comment:
> This has few benefits:
> 1. tcp_v6_send_reset already did that.
It was done to route resets for IPv6 link local addresses. It was a
mistake to do so for global addresses. The patch fixes this as well.
Actually, the problem appears to be even more serious than guaranteed
loss of resets. As reported by Sergey Soloviev <sol@eqv.ru>, those
misrouted resets create a lot of arp traffic and huge amount of
unresolved arp entires putting down to knees NAT firewalls which use
asymmetric routing.
Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
|
|
|
|
|
|
|
|
| |
If the rc_dereference of tcp_fastopen_ctx ever fails then we copy 16 bytes
of kernel stack into the proc result.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Sparse complains about RTA_MARK which is should be host order according
to include file and usage in iproute.
net/ipv4/route.c:2223:46: warning: incorrect type in argument 3 (different base types)
net/ipv4/route.c:2223:46: expected restricted __be32 [usertype] value
net/ipv4/route.c:2223:46: got unsigned int [unsigned] [usertype] flowic_mark
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
| |
If lower layer driver leaves the ip header in the skb fragment, it needs to
be first pulled into skb->data before inspecting ip header length or ip version
number.
Signed-off-by: Sarveshwar Bandi <sarveshwar.bandi@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
| |
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
| |
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
| |
It is weird to display IPv4 address in %x format, what's more,
IPv6 address is disaplayed in human-readable format too. So,
make it human-readable.
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
| |
ETH_ZLEN is too small for IPv6, so this default value is not
suitable.
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For IPv6, sizeof(struct ipv6hdr) = 40, thus the following
expression will result negative:
datalen = pkt_dev->cur_pkt_size - 14 -
sizeof(struct ipv6hdr) - sizeof(struct udphdr) -
pkt_dev->pkt_overhead;
And, the check "if (datalen < sizeof(struct pktgen_hdr))" will be
passed as "datalen" is promoted to unsigned, therefore will cause
a crash later.
This is a quick fix by checking if "datalen" is negative. The following
patch will increase the default value of 'min_pkt_size' for IPv6.
This bug should exist for a long time, so Cc -stable too.
Cc: <stable@vger.kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:
====================
Here is a batch of fixes intended for 3.7...
Amitkumar Karwar provides a couple of mwifiex fixes to correctly
report some reason codes for certain connection failures. He also
provides a fix to cleanup after a scanning failure. Bing Zhao rounds
that out with another mwifiex scanning fix.
Daniel Golle gives us a fix for a copy/paste error in rt2x00.
Felix Fietkau brings a couple of ath9k fixes related to suspend/resume,
and a couple of fixes to prevent memory leaks in ath9k and mac80211.
Ronald Wahl sends a carl9170 fix for a sleep in softirq context.
Thomas Pedersen reorders some code to prevent drv_get_tsf from being
called while holding a spinlock, now that it can sleep.
Finally, Wei Yongjun prevents a NULL pointer dereference in the
ath5k driver.
Please let me know if there are problems!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
A few places free skbs using dev_kfree_skb even though they're called
after ieee80211_subif_start_xmit might have cloned it for tracking tx
status. Use ieee80211_free_txskb here to prevent skb leaks.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Cc: stable@vger.kernel.org
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The call to drv_get/set_tsf() was put on the workqueue to perform tsf
adjustments since that function might sleep. However it ended up inside
a spinlock, whose critical section must be atomic. Do tsf adjustment
outside the spinlock instead, and get rid of a warning.
Signed-off-by: Thomas Pedersen <thomas@cozybit.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This is the revised patch for fixing rds-ping spinlock recursion
according to Venkat's suggestions.
RDS ping/pong over TCP feature has been broken for years(2.6.39 to
3.6.0) since we have to set TCP cork and call kernel_sendmsg() between
ping/pong which both need to lock "struct sock *sk". However, this
lock has already been hold before rds_tcp_data_ready() callback is
triggerred. As a result, we always facing spinlock resursion which
would resulting in system panic.
Given that RDS ping is only used to test the connectivity and not for
serious performance measurements, we can queue the pong transmit to
rds_wq as a delayed response.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
CC: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
CC: David S. Miller <davem@davemloft.net>
CC: James Morris <james.l.morris@oracle.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | | |
Pulled mainline in order to get the UAPI infrastructure already
merged before I pull in David Howells's UAPI trees for networking.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Empty nodes have no color. We can make use of this property to simplify
the code emitted by the RB_EMPTY_NODE and RB_CLEAR_NODE macros. Also,
we can get rid of the rb_init_node function which had been introduced by
commit 88d19cf37952 ("timers: Add rb_init_node() to allow for stack
allocated rb nodes") to avoid some issue with the empty node's color not
being initialized.
I'm not sure what the RB_EMPTY_NODE checks in rb_prev() / rb_next() are
doing there, though. axboe introduced them in commit 10fd48f2376d
("rbtree: fixed reversed RB_EMPTY_NODE and rb_next/prev"). The way I
see it, the 'empty node' abstraction is only used by rbtree users to
flag nodes that they haven't inserted in any rbtree, so asking the
predecessor or successor of such nodes doesn't make any sense.
One final rb_init_node() caller was recently added in sysctl code to
implement faster sysctl name lookups. This code doesn't make use of
RB_EMPTY_NODE at all, and from what I could see it only called
rb_init_node() under the mistaken assumption that such initialization was
required before node insertion.
[sfr@canb.auug.org.au: fix net/ceph/osd_client.c build]
Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Santos <daniel.santos@pobox.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
| |\ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull ceph updates from Sage Weil:
"The bulk of this pull is a series from Alex that refactors and cleans
up the RBD code to lay the groundwork for supporting the new image
format and evolving feature set. There are also some cleanups in
libceph, and for ceph there's fixed validation of file striping
layouts and a bugfix in the code handling a shrinking MDS cluster."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (71 commits)
ceph: avoid 32-bit page index overflow
ceph: return EIO on invalid layout on GET_DATALOC ioctl
rbd: BUG on invalid layout
ceph: propagate layout error on osd request creation
libceph: check for invalid mapping
ceph: convert to use le32_add_cpu()
ceph: Fix oops when handling mdsmap that decreases max_mds
rbd: update remaining header fields for v2
rbd: get snapshot name for a v2 image
rbd: get the snapshot context for a v2 image
rbd: get image features for a v2 image
rbd: get the object prefix for a v2 rbd image
rbd: add code to get the size of a v2 rbd image
rbd: lay out header probe infrastructure
rbd: encapsulate code that gets snapshot info
rbd: add an rbd features field
rbd: don't use index in __rbd_add_snap_dev()
rbd: kill create_snap sysfs entry
rbd: define rbd_dev_image_id()
rbd: define some new format constants
...
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
If we are creating an osd request and get an invalid layout, return
an EINVAL to the caller. We switch up the return to have an error
code instead of NULL implying -ENOMEM.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
If we encounter an invalid (e.g., zeroed) mapping, return an error
and avoid a divide by zero.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Using list_move_tail() instead of list_del() + list_add_tail().
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Sage Weil <sage@inktank.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Make ceph_monc_do_poolop() static to remove the following sparse warning:
* net/ceph/mon_client.c:616:5: warning: symbol 'ceph_monc_do_poolop' was not
declared. Should it be static?
Also drops the 'ceph_monc_' prefix, now being a private function.
Signed-off-by: Iulius Curt <icurt@ixiacom.com>
Signed-off-by: Sage Weil <sage@inktank.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This is unused; use monc->client->have_fsid.
Signed-off-by: Sage Weil <sage@inktank.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
After the change "Make neigh lookups directly in output packet path"
(commit a263b30936) IPVS can not reach the real server for DR mode
because we resolve the destination address from IP header, not from
route neighbour. Use the new FLOWI_FLAG_KNOWN_NH flag to request
output routes with known nexthop, so that it has preference
on resolving.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Add flag to request that output route should be
returned with known rt_gateway, in case we want to use
it as nexthop for neighbour resolving.
The returned route can be cached as follows:
- in NH exception: because the cached routes are not shared
with other destinations
- in FIB NH: when using gateway because all destinations for
NH share same gateway
As last option, to return rt_gateway!=0 we have to
set DST_NOCACHE.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Add new flag to remember when route is via gateway.
We will use it to allow rt_gateway to contain address of
directly connected host for the cases when DST_NOCACHE is
used or when the NH exception caches per-destination route
without DST_NOCACHE flag, i.e. when routes are not used for
other destinations. By this way we force the neighbour
resolving to work with the routed destination but we
can use different address in the packet, feature needed
for IPVS-DR where original packet for virtual IP is routed
via route to real IP.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Avoid checking nh_pcpu_rth_output in fast path,
abort fib_info creation on alloc_percpu failure.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
After the change "Adjust semantics of rt->rt_gateway"
(commit f8126f1d51) rt_gateway can be 0 but ip_forward() compares
it directly with nexthop. What we want here is to check if traffic
is to directly connected nexthop and to fail if using gateway.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
After "Cache input routes in fib_info nexthops" (commit
d2d68ba9fe) and "Elide fib_validate_source() completely when possible"
(commit 7a9bc9b81a) we can not send ICMP redirects. It seems we
should not cache the RTCF_DOREDIRECT flag in nh_rth_input because
the same fib_info can be used for traffic that is not redirected,
eg. from other input devices or from sources that are not in same subnet.
As result, we have to disable the caching of RTCF_DOREDIRECT
flag and to force source validation for the case when forwarding
traffic to the input device. If traffic comes from directly connected
source we allow redirection as it was done before both changes.
Avoid setting RTCF_DOREDIRECT if IN_DEV_TX_REDIRECTS
is disabled, this can avoid source address validation and to
help caching the routes.
After the change "Adjust semantics of rt->rt_gateway"
(commit f8126f1d51) we should make sure our ICMP_REDIR_HOST messages
contain daddr instead of 0.0.0.0 when target is directly connected.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
It seems IPV6_GRO_CB(skb)->proto can be destroyed in skb_gro_receive()
if a new skb is allocated (to serve as an anchor for frag_list)
We copy NAPI_GRO_CB() only (not the IPV6 specific part) in :
*NAPI_GRO_CB(nskb) = *NAPI_GRO_CB(p);
So we leave IPV6_GRO_CB(nskb)->proto to 0 (fresh skb allocation) instead
of IPPROTO_TCP (6)
ipv6_gro_complete() isnt able to call ops->gro_complete()
[ tcp6_gro_complete() ]
Fix this by moving proto in NAPI_GRO_CB() and getting rid of
IPV6_GRO_CB
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
6a32e4f9dd9219261f8856f817e6655114cfec2f made the vlan code skip marking
vlan-tagged frames for not locally configured vlans as PACKET_OTHERHOST if
there was an rx_handler, as the rx_handler could cause the frame to be received
on a different (virtual) vlan-capable interface where that vlan might be
configured.
As rx_handlers do not necessarily return RX_HANDLER_ANOTHER, this could cause
frames for unknown vlans to be delivered to the protocol stack as if they had
been received untagged.
For example, if an ipv6 router advertisement that's tagged for a locally not
configured vlan is received on an interface with macvlan interfaces attached,
macvlan's rx_handler returns RX_HANDLER_PASS after delivering the frame to the
macvlan interfaces, which caused it to be passed to the protocol stack, leading
to ipv6 addresses for the announced prefix being configured even though those
are completely unusable on the underlying interface.
The fix moves marking as PACKET_OTHERHOST after the rx_handler so the
rx_handler, if there is one, sees the frame unchanged, but afterwards,
before the frame is delivered to the protocol stack, it gets marked whether
there is an rx_handler or not.
Signed-off-by: Florian Zumbiehl <florz@florz.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Current GRO can hold packets in gro_list for almost unlimited
time, in case napi->poll() handler consumes its budget over and over.
In this case, napi_complete()/napi_gro_flush() are not called.
Another problem is that gro_list is flushed in non friendly way :
We scan the list and complete packets in the reverse order.
(youngest packets first, oldest packets last)
This defeats priorities that sender could have cooked.
Since GRO currently only store TCP packets, we dont really notice the
bug because of retransmits, but this behavior can add unexpected
latencies, particularly on mice flows clamped by elephant flows.
This patch makes sure no packet can stay more than 1 ms in queue, and
only in stress situations.
It also complete packets in the right order to minimize latencies.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jesse Gross <jesse@nicira.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
We report cached pmtu values even if they are already expired.
Change this to not report these values after they are expired
and fix a race in the expire time calculation, as suggested by
Eric Dumazet.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
reported pmtu
When a local tool like tracepath tries to send packets bigger than
the device mtu, we create a nh exeption and set the pmtu to device
mtu. The device mtu does not expire, so check if the device mtu is
smaller than the reported pmtu and don't crerate a nh exeption in
that case.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Some protocols, like IPsec still cache routes. So we need to invalidate
the old route on pmtu events to avoid the reuse of stale routes.
We also need to update the mtu and expire time of the route if we already
use a nh exception route, otherwise we ignore newly learned pmtu values
after the first expiration.
With this patch we always invalidate or update the route on pmtu events.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Before accessing skb first fragment, better make sure there
is one.
This is probably not needed for old kernels, since an ethernet frame
cannot contain only an ethernet header, but the recent GRO addition
to tunnels makes this patch needed.
Also skb_gro_reset_offset() can be static, it actually allows
compiler to inline it.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
IPv4 side of the problem was addressed in commit a9e050f4e7f9d
(net: tcp: GRO should be ECN friendly)
This patch does the same, but for IPv6 : A Traffic Class mismatch
doesnt mean flows are different, but instead should force a flush
of previous packets.
This patch removes artificial packet reordering problem.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The retry loop in neigh_resolve_output() and neigh_connected_output()
call dev_hard_header() with out reseting the skb to network_header.
This causes the retry to fail with skb_under_panic. The fix is to
reset the network_header within the retry loop.
Signed-off-by: Ramesh Nagappa <ramesh.nagappa@ericsson.com>
Reviewed-by: Shawn Lu <shawn.lu@ericsson.com>
Reviewed-by: Robert Coulson <robert.coulson@ericsson.com>
Reviewed-by: Billie Alsup <billie.alsup@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Over time, skb recycling infrastructure got litle interest and
many bugs. Generic rx path skb allocation is now using page
fragments for efficient GRO / TCP coalescing, and recyling
a tx skb for rx path is not worth the pain.
Last identified bug is that fat skbs can be recycled
and it can endup using high order pages after few iterations.
With help from Maxime Bizon, who pointed out that commit
87151b8689d (net: allow pskb_expand_head() to get maximum tailroom)
introduced this regression for recycled skbs.
Instead of fixing this bug, lets remove skb recycling.
Drivers wanting really hot skbs should use build_skb() anyway,
to allocate/populate sk_buff right before netif_receive_skb()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|/ / /
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
I get a panic when I use ss -a and rmmod inet_diag at the
same time.
It's because netlink_dump uses inet_diag_dump which belongs to module
inet_diag.
I search the codes and find many modules have the same problem. We
need to add a reference to the module which the cb->dump belongs to.
Thanks for all help from Stephen,Jan,Eric,Steffen and Pablo.
Change From v3:
change netlink_dump_start to inline,suggestion from Pablo and
Eric.
Change From v2:
delete netlink_dump_done,and call module_put in netlink_dump
and netlink_sock_destruct.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\ \ \
| | |/
| |/|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Pull networking changes from David Miller:
"The most important bit in here is the fix for input route caching from
Eric Dumazet, it's a shame we couldn't fully analyze this in time for
3.6 as it's a 3.6 regression introduced by the routing cache removal.
Anyways, will send quickly to -stable after you pull this in.
Other changes of note:
1) Fix lockdep splats in team and bonding, from Eric Dumazet.
2) IPV6 adds link local route even when there is no link local
address, from Nicolas Dichtel.
3) Fix ixgbe PTP implementation, from Jacob Keller.
4) Fix excessive stack usage in cxgb4 driver, from Vipul Pandya.
5) MAC length computed improperly in VLAN demux, from Antonio
Quartulli."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
ipv6: release reference of ip6_null_entry's dst entry in __ip6_del_rt
Remove noisy printks from llcp_sock_connect
tipc: prevent dropped connections due to rcvbuf overflow
silence some noisy printks in irda
team: set qdisc_tx_busylock to avoid LOCKDEP splat
bonding: set qdisc_tx_busylock to avoid LOCKDEP splat
sctp: check src addr when processing SACK to update transport state
sctp: fix a typo in prototype of __sctp_rcv_lookup()
ipv4: add a fib_type to fib_info
can: mpc5xxx_can: fix section type conflict
can: peak_pcmcia: fix error return code
can: peak_pci: fix error return code
cxgb4: Fix build error due to missing linux/vmalloc.h include.
bnx2x: fix ring size for 10G functions
cxgb4: Dynamically allocate memory in t4_memory_rw() and get_vpd_params()
ixgbe: add support for X540-AT1
ixgbe: fix poll loop for FDIRCTRL.INIT_DONE bit
ixgbe: fix PTP ethtool timestamping function
ixgbe: (PTP) Fix PPS interrupt code
ixgbe: Fix PTP X540 SDP alignment code for PPS signal
...
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
as we hold dst_entry before we call __ip6_del_rt,
so we should alse call dst_release not only return
-ENOENT when the rt6_info is ip6_null_entry.
and we already hold the dst entry, so I think it's
safe to call dst_release out of the write-read lock.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Validation of userspace input shouldn't trigger dmesg spamming.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When large buffers are sent over connected TIPC sockets, it
is likely that the sk_backlog will be filled up on the
receiver side, but the TIPC flow control mechanism is happily
unaware of this since that is based on message count.
The sender will receive a TIPC_ERR_OVERLOAD message when this occurs
and drop it's side of the connection, leaving it stale on
the receiver end.
By increasing the sk_rcvbuf to a 'worst case' value, we avoid the
overload caused by a full backlog queue and the flow control
will work properly.
This worst case value is the max TIPC message size times
the flow control window, multiplied by two because a sender
will transmit up to double the window size before a port is marked
congested.
We multiply this by 2 to account for the sk_buff and other overheads.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Fuzzing causes these printks to spew constantly.
Changing them to DEBUG statements is consistent with other usage in the file,
and makes them disappear when CONFIG_IRDA_DEBUG is disabled.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Suppose we have an SCTP connection with two paths. After connection is
established, path1 is not available, thus this path is marked as inactive. Then
traffic goes through path2, but for some reasons packets are delayed (after
rto.max). Because packets are delayed, the retransmit mechanism will switch
again to path1. At this time, we receive a delayed SACK from path2. When we
update the state of the path in sctp_check_transmitted(), we do not take into
account the source address of the SACK, hence we update the wrong path.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Just to avoid confusion when people only reads this prototype.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
commit d2d68ba9fe8 (ipv4: Cache input routes in fib_info nexthops.)
introduced a regression for forwarding.
This was hard to reproduce but the symptom was that packets were
delivered to local host instead of being forwarded.
David suggested to add fib_type to fib_info so that we dont
inadvertently share same fib_info for different purposes.
With help from Julian Anastasov who provided very helpful
hints, reproduced here :
<quote>
Can it be a problem related to fib_info reuse
from different routes. For example, when local IP address
is created for subnet we have:
broadcast 192.168.0.255 dev DEV proto kernel scope link src
192.168.0.1
192.168.0.0/24 dev DEV proto kernel scope link src 192.168.0.1
local 192.168.0.1 dev DEV proto kernel scope host src 192.168.0.1
The "dev DEV proto kernel scope link src 192.168.0.1" is
a reused fib_info structure where we put cached routes.
The result can be same fib_info for 192.168.0.255 and
192.168.0.0/24. RTN_BROADCAST is cached only for input
routes. Incoming broadcast to 192.168.0.255 can be cached
and can cause problems for traffic forwarded to 192.168.0.0/24.
So, this patch should solve the problem because it
separates the broadcast from unicast traffic.
And the ip_route_input_slow caching will work for
local and broadcast input routes (above routes 1 and 3) just
because they differ in scope and use different fib_info.
</quote>
Many thanks to Chris Clayton for his patience and help.
Reported-by: Chris Clayton <chris2553@googlemail.com>
Bisected-by: Chris Clayton <chris2553@googlemail.com>
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Julian Anastasov <ja@ssi.bg>
Tested-by: Chris Clayton <chris2553@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
skb_reset_mac_len() relies on the value of the skb->network_header pointer,
therefore we must wait for such pointer to be recalculated before computing
the new mac_len value.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|