| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
Use strlcpy where possible to ensure the string is \0 terminated.
Use always sizeof(string) instead of 32, ETHTOOL_BUSINFO_LEN
and custom defines.
Use snprintf instead of sprint.
Remove unnecessary inits of ->fw_version
Remove unnecessary inits of drvinfo struct.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
| |
In case user passed address via netlink during create, NET_ADDR_PERM was set.
That is not correct so fix this by setting NET_ADDR_SET.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the size of skb allocated for NDISC is MAX_HEADER +
LL_RESERVED_SPACE(dev) + packet length + dev->needed_tailroom,
but only LL_RESERVED_SPACE(dev) bytes is "reserved" for headers.
As a result, the skb looks like this (after construction of the
message):
head data tail end
+--------------------------------------------------------------+
+ | | | |
+--------------------------------------------------------------+
|<-hlen---->|<---ipv6 packet------>|<--tlen-->|<--MAX_HEADER-->|
=LL_ = dev
RESERVED_ ->needed_
SPACE(dev) tailroom
As the name implies, "MAX_HEADER" is used for headers, and should
be "reserved" in prior to packet construction. Or, if some space
is really required at the tail of ther skb, it should be
explicitly documented.
We have several option after construction of NDISC message:
Option 1:
head data tail end
+---------------------------------------------+
+ | | |
+---------------------------------------------+
|<-hlen---->|<---ipv6 packet------>|<--tlen-->|
=LL_ = dev
RESERVED_ ->needed_
SPACE(dev) tailroom
Option 2:
head data tail end
+--------------------------------------------------+
+ | | |
+--------------------------------------------------+
|<--MAX_HEADER-->|<---ipv6 packet------>|<--tlen-->|
= dev
->needed_
tailroom
Option 3:
head data tail end
+--------------------------------------------------------------+
+ | | | |
+--------------------------------------------------------------+
|<--MAX_HEADER-->|<-hlen---->|<---ipv6 packet------>|<--tlen-->|
=LL_ = dev
RESERVED_ ->needed_
SPACE(dev) tailroom
Our tunnel drivers try expanding headroom and the space for tunnel
encapsulation was not a mandatory space -- so we are not seeing
bugs here --, but just for optimization for performance critial
situations.
Since NDISC messages are not performance critical unlike TCP,
and as we know outgoing device, LL_RESERVED_SPACE(dev) should be
just enough for the device in most (if not all) cases:
LL_RESERVED_SPACE(dev) <= LL_MAX_HEADER <= MAX_HEADER
Note that LL_RESERVED_SPACE(dev) is also enough for NDISC over
SIT (e.g., ISATAP).
So, I think Option 1 is just fine here.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
| |
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
| |
Benefit from new upper dev list and free bonding from dev->master usage.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
| |
Also, since all users call __vlan_find_dev_deep() with rcu_read_lock,
make no possibility to call this with rtnl mutex held only.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
| |
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
| |
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
| |
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
| |
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This lists are supposed to serve for storing pointers to all upper devices.
Eventually it will replace dev->master pointer which is used for
bonding, bridge, team but it cannot be used for vlan, macvlan where
there might be multiple upper present. In case the upper link is
replacement for dev->master, it is marked with "master" flag.
New upper device list resolves this limitation. Also, the information
stored in lists is used for preventing looping setups like
"bond->somethingelse->samebond"
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
| |
NET_ADDR_SET is set in dev_set_mac_address() no need to alter
dev->addr_assign_type value in drivers.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
| |
This is the way to indicate that mac address of a device has been set by
dev_set_mac_address()
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
| |
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
| |
Benefit from existence of dev_set_mac_address() and remove duplicate
code.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
| |
The bridge link detection should follow the operational state
of the lower device, rather than the carrier bit. This allows devices
like tunnels that are controlled by userspace control plane to work
with bridge STP link management.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reviewed-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, we return -EINVAL for malformed or wrong BPF filters.
However, this is not done for BPF_S_ANC* operations, which makes it
more difficult to detect if it's actually supported or not by the
BPF machine. Therefore, we should also return -EINVAL if K is within
the SKF_AD_OFF universe and the ancillary operation did not match.
Why exactly is it needed? If tools such as libpcap/tcpdump want to
make use of new ancillary operations (like filtering VLAN in kernel
space), there is currently no sane way to test if this feature /
BPF_S_ANC* op is present or not, since no error is returned. This
patch will make life easier for that and allow for a proper usage
for user space applications.
There was concern, if this patch will break userland. Short answer: Yes
and no. Long answer: It will "break" only for code that calls ...
{ BPF_LD | BPF_(W|H|B) | BPF_ABS, 0, 0, <K> },
... where <K> is in [0xfffff000, 0xffffffff] _and_ <K> is *not* an
ancillary. And here comes the BUT: assuming some *old* code will have
such an instruction where <K> is between [0xfffff000, 0xffffffff] and
it doesn't know ancillary operations, then this will give a
non-expected / unwanted behavior as well (since we do not return the
BPF machine with 0 after a failed load_pointer(), which was the case
before introducing ancillary operations, but load sth. into the
accumulator instead, and continue with the next instruction, for
instance). Thus, user space code would already have been broken by
introducing ancillary operations into the BPF machine per se. Code
that does such a direct load, e.g. "load word at packet offset
0xffffffff into accumulator" ("ld [0xffffffff]") is quite broken,
isn't it? The whole assumption of ancillary operations is that no-one
intentionally calls things like "ld [0xffffffff]" and expect this
word to be loaded from such a packet offset. Hence, we can also safely
make use of this feature testing patch and facilitate application
development. Therefore, at least from this patch onwards, we have
*for sure* a check whether current or in future implemented BPF_S_ANC*
ops are supported in the kernel. Patch was tested on x86_64.
(Thanks to Eric for the previous review.)
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Ani Sinha <ani@aristanetworks.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
| |
Sparse detected case where this local function should be static.
It may even allow some compiler optimizations.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
| |
Detected by sparse.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
| |
Fix sparse warning about local function that should be static.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use the new per task frag allocator in skb_append_datato_frags(),
to reduce number of frags and page allocator overhead.
Tested:
ifconfig lo mtu 16436
perf record netperf -t UDP_STREAM ; perf report
before :
Throughput: 32928 Mbit/s
51.79% netperf [kernel.kallsyms] [k] copy_user_generic_string
5.98% netperf [kernel.kallsyms] [k] __alloc_pages_nodemask
5.58% netperf [kernel.kallsyms] [k] get_page_from_freelist
5.01% netperf [kernel.kallsyms] [k] __rmqueue
3.74% netperf [kernel.kallsyms] [k] skb_append_datato_frags
1.87% netperf [kernel.kallsyms] [k] prep_new_page
1.42% netperf [kernel.kallsyms] [k] next_zones_zonelist
1.28% netperf [kernel.kallsyms] [k] __inc_zone_state
1.26% netperf [kernel.kallsyms] [k] alloc_pages_current
0.78% netperf [kernel.kallsyms] [k] sock_alloc_send_pskb
0.74% netperf [kernel.kallsyms] [k] udp_sendmsg
0.72% netperf [kernel.kallsyms] [k] zone_watermark_ok
0.68% netperf [kernel.kallsyms] [k] __cpuset_node_allowed_softwall
0.67% netperf [kernel.kallsyms] [k] fib_table_lookup
0.60% netperf [kernel.kallsyms] [k] memcpy_fromiovecend
0.55% netperf [kernel.kallsyms] [k] __udp4_lib_lookup
after:
Throughput: 47185 Mbit/s
61.74% netperf [kernel.kallsyms] [k] copy_user_generic_string
2.07% netperf [kernel.kallsyms] [k] prep_new_page
1.98% netperf [kernel.kallsyms] [k] skb_append_datato_frags
1.02% netperf [kernel.kallsyms] [k] sock_alloc_send_pskb
0.97% netperf [kernel.kallsyms] [k] enqueue_task_fair
0.97% netperf [kernel.kallsyms] [k] udp_sendmsg
0.91% netperf [kernel.kallsyms] [k] __ip_route_output_key
0.88% netperf [kernel.kallsyms] [k] __netif_receive_skb
0.87% netperf [kernel.kallsyms] [k] fib_table_lookup
0.85% netperf [kernel.kallsyms] [k] resched_task
0.78% netperf [kernel.kallsyms] [k] __udp4_lib_lookup
0.77% netperf [kernel.kallsyms] [k] _raw_spin_lock_irqsave
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
| |
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
| |
Make carrier writable
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
| |
This allows a driver to register change_carrier callback which will be
called whenever user will like to change carrier state. This is useful
for devices like dummy, gre, team and so on.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ip6gre_xmit2() incorrectly sets transport header to inner payload
instead of GRE header. It seems copy-and-pasted from ipip.c.
Set transport header to gre header.
(In ipip case the transport header is the inner ip header, so that's
correct.)
Found by inspection. In practice the incorrect transport header
doesn't matter because the skb usually is sent to another net_device
or socket, so the transport header isn't referenced.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ipgre_tunnel_xmit() incorrectly sets transport header to inner payload
instead of GRE header. It seems copy-and-pasted from ipip.c.
So set transport header to gre header.
(In ipip case the transport header is the inner ip header, so that's
correct.)
Found by inspection. In practice the incorrect transport header
doesn't matter because the skb usually is sent to another net_device
or socket, so the transport header isn't referenced.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
| |
Add an else to only print the incompatible protocol message
when version hasn't been established.
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
| |
0b088e00 ("RDS: Use page_remainder_alloc() for recv bufs")
added uses of sg_dma_len() and sg_dma_address(). This makes
RDS DOA with the qib driver.
IB ulps should use ib_sg_dma_len() and ib_sg_dma_address
respectively since some HCAs overload ib_sg_dma* operations.
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In commit 96e0bf4b5193d (tcp: Discard segments that ack data not yet
sent) John Dykstra enforced a check against ack sequences.
In commit 354e4aa391ed5 (tcp: RFC 5961 5.2 Blind Data Injection Attack
Mitigation) I added more safety tests.
But we missed fact that these tests are not performed if ACK bit is
not set.
RFC 793 3.9 mandates TCP should drop a frame without ACK flag set.
" fifth check the ACK field,
if the ACK bit is off drop the segment and return"
Not doing so permits an attacker to only guess an acceptable sequence
number, evading stronger checks.
Many thanks to Zhiyun Qian for bringing this issue to our attention.
See :
http://web.eecs.umich.edu/~zhiyunq/pub/ccs12_TCP_sequence_number_inference.pdf
Reported-by: Zhiyun Qian <zhiyunq@umich.edu>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: John Dykstra <john.dykstra1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
batadv_iv_ogm_emit_send_time() attempts to calculates a random integer
in the range of 'orig_interval +- BATADV_JITTER' by the below lines.
msecs = atomic_read(&bat_priv->orig_interval) - BATADV_JITTER;
msecs += (random32() % 2 * BATADV_JITTER);
But it actually gets 'orig_interval' or 'orig_interval - BATADV_JITTER'
because '%' and '*' have same precedence and associativity is
left-to-right.
This adds the parentheses at the appropriate position so that it matches
original intension.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Antonio Quartulli <ordex@autistici.org>
Cc: Marek Lindner <lindner_marek@yahoo.de>
Cc: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Cc: Antonio Quartulli <ordex@autistici.org>
Cc: b.a.t.m.a.n@lists.open-mesh.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Sedat reported the following commit caused a regression:
commit 9650388b5c56578fdccc79c57a8c82fb92b8e7f1
Author: Eric Dumazet <edumazet@google.com>
Date: Fri Dec 21 07:32:10 2012 +0000
ipv4: arp: fix a lockdep splat in arp_solicit
This is due to the 6th parameter of arp_send() needs to be NULL
for the broadcast case, the above commit changed it to an all-zero
array by mistake.
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
| |
Fixed integer overflow in function htb_dequeue
Signed-off-by: Stefan Hasko <hasko.stevo@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
| |
CONFIG_HOTPLUG is always enabled now, so remove the unused code that was
trying to be compiled out when this option was disabled, in the
networking core.
Cc: Bill Pemberton <wfp5p@virginia.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
| |
When netdev_set_master faild in br_add_if, we should
call br_netpoll_disable to do some cleanup jobs,such
as free the memory of struct netpoll which allocated
in br_netpoll_enable.
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Acked-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Yan Burman reported following lockdep warning :
=============================================
[ INFO: possible recursive locking detected ]
3.7.0+ #24 Not tainted
---------------------------------------------
swapper/1/0 is trying to acquire lock:
(&n->lock){++--..}, at: [<ffffffff8139f56e>] __neigh_event_send
+0x2e/0x2f0
but task is already holding lock:
(&n->lock){++--..}, at: [<ffffffff813f63f4>] arp_solicit+0x1d4/0x280
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&n->lock);
lock(&n->lock);
*** DEADLOCK ***
May be due to missing lock nesting notation
4 locks held by swapper/1/0:
#0: (((&n->timer))){+.-...}, at: [<ffffffff8104b350>]
call_timer_fn+0x0/0x1c0
#1: (&n->lock){++--..}, at: [<ffffffff813f63f4>] arp_solicit
+0x1d4/0x280
#2: (rcu_read_lock_bh){.+....}, at: [<ffffffff81395400>]
dev_queue_xmit+0x0/0x5d0
#3: (rcu_read_lock_bh){.+....}, at: [<ffffffff813cb41e>]
ip_finish_output+0x13e/0x640
stack backtrace:
Pid: 0, comm: swapper/1 Not tainted 3.7.0+ #24
Call Trace:
<IRQ> [<ffffffff8108c7ac>] validate_chain+0xdcc/0x11f0
[<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30
[<ffffffff81120565>] ? kmem_cache_free+0xe5/0x1c0
[<ffffffff8108d570>] __lock_acquire+0x440/0xc30
[<ffffffff813c3570>] ? inet_getpeer+0x40/0x600
[<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30
[<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0
[<ffffffff8108ddf5>] lock_acquire+0x95/0x140
[<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0
[<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30
[<ffffffff81448d4b>] _raw_write_lock_bh+0x3b/0x50
[<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0
[<ffffffff8139f56e>] __neigh_event_send+0x2e/0x2f0
[<ffffffff8139f99b>] neigh_resolve_output+0x16b/0x270
[<ffffffff813cb62d>] ip_finish_output+0x34d/0x640
[<ffffffff813cb41e>] ? ip_finish_output+0x13e/0x640
[<ffffffffa046f146>] ? vxlan_xmit+0x556/0xbec [vxlan]
[<ffffffff813cb9a0>] ip_output+0x80/0xf0
[<ffffffff813ca368>] ip_local_out+0x28/0x80
[<ffffffffa046f25a>] vxlan_xmit+0x66a/0xbec [vxlan]
[<ffffffffa046f146>] ? vxlan_xmit+0x556/0xbec [vxlan]
[<ffffffff81394a50>] ? skb_gso_segment+0x2b0/0x2b0
[<ffffffff81449355>] ? _raw_spin_unlock_irqrestore+0x65/0x80
[<ffffffff81394c57>] ? dev_queue_xmit_nit+0x207/0x270
[<ffffffff813950c8>] dev_hard_start_xmit+0x298/0x5d0
[<ffffffff813956f3>] dev_queue_xmit+0x2f3/0x5d0
[<ffffffff81395400>] ? dev_hard_start_xmit+0x5d0/0x5d0
[<ffffffff813f5788>] arp_xmit+0x58/0x60
[<ffffffff813f59db>] arp_send+0x3b/0x40
[<ffffffff813f6424>] arp_solicit+0x204/0x280
[<ffffffff813a1a70>] ? neigh_add+0x310/0x310
[<ffffffff8139f515>] neigh_probe+0x45/0x70
[<ffffffff813a1c10>] neigh_timer_handler+0x1a0/0x2a0
[<ffffffff8104b3cf>] call_timer_fn+0x7f/0x1c0
[<ffffffff8104b350>] ? detach_if_pending+0x120/0x120
[<ffffffff8104b748>] run_timer_softirq+0x238/0x2b0
[<ffffffff813a1a70>] ? neigh_add+0x310/0x310
[<ffffffff81043e51>] __do_softirq+0x101/0x280
[<ffffffff814518cc>] call_softirq+0x1c/0x30
[<ffffffff81003b65>] do_softirq+0x85/0xc0
[<ffffffff81043a7e>] irq_exit+0x9e/0xc0
[<ffffffff810264f8>] smp_apic_timer_interrupt+0x68/0xa0
[<ffffffff8145122f>] apic_timer_interrupt+0x6f/0x80
<EOI> [<ffffffff8100a054>] ? mwait_idle+0xa4/0x1c0
[<ffffffff8100a04b>] ? mwait_idle+0x9b/0x1c0
[<ffffffff8100a6a9>] cpu_idle+0x89/0xe0
[<ffffffff81441127>] start_secondary+0x1b2/0x1b6
Bug is from arp_solicit(), releasing the neigh lock after arp_send()
In case of vxlan, we eventually need to write lock a neigh lock later.
Its a false positive, but we can get rid of it without lockdep
annotations.
We can instead use neigh_ha_snapshot() helper.
Reported-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using a seqlock for devnet_rename_seq is not a good idea,
as device_rename() can sleep.
As we hold RTNL, we dont need a protection for writers,
and only need a seqcount so that readers can catch a change done
by a writer.
Bug added in commit c91f6df2db4972d3 (sockopt: Change getsockopt() of
SO_BINDTODEVICE to return an interface name)
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
| |
Once skb_realloc_headroom() is called, tiph might point to freed memory.
Cache tiph->ttl value before the reallocation, to avoid unexpected
behavior.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
| |
ipgre_tunnel_xmit() parses network header as IP unconditionally.
But transmitting packets are not always IP packet. For example such packet
can be sent by packet socket with sockaddr_ll.sll_protocol set.
So make the function check if skb->protocol is IP.
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Pull nfsd update from Bruce Fields:
"Included this time:
- more nfsd containerization work from Stanislav Kinsbursky: we're
not quite there yet, but should be by 3.9.
- NFSv4.1 progress: implementation of basic backchannel security
negotiation and the mandatory BACKCHANNEL_CTL operation. See
http://wiki.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues
for remaining TODO's
- Fixes for some bugs that could be triggered by unusual compounds.
Our xdr code wasn't designed with v4 compounds in mind, and it
shows. A more thorough rewrite is still a todo.
- If you've ever seen "RPC: multiple fragments per record not
supported" logged while using some sort of odd userland NFS client,
that should now be fixed.
- Further work from Jeff Layton on our mechanism for storing
information about NFSv4 clients across reboots.
- Further work from Bryan Schumaker on his fault-injection mechanism
(which allows us to discard selective NFSv4 state, to excercise
rarely-taken recovery code paths in the client.)
- The usual mix of miscellaneous bugs and cleanup.
Thanks to everyone who tested or contributed this cycle."
* 'for-3.8' of git://linux-nfs.org/~bfields/linux: (111 commits)
nfsd4: don't leave freed stateid hashed
nfsd4: free_stateid can use the current stateid
nfsd4: cleanup: replace rq_resused count by rq_next_page pointer
nfsd: warn on odd reply state in nfsd_vfs_read
nfsd4: fix oops on unusual readlike compound
nfsd4: disable zero-copy on non-final read ops
svcrpc: fix some printks
NFSD: Correct the size calculation in fault_inject_write
NFSD: Pass correct buffer size to rpc_ntop
nfsd: pass proper net to nfsd_destroy() from NFSd kthreads
nfsd: simplify service shutdown
nfsd: replace boolean nfsd_up flag by users counter
nfsd: simplify NFSv4 state init and shutdown
nfsd: introduce helpers for generic resources init and shutdown
nfsd: make NFSd service structure allocated per net
nfsd: make NFSd service boot time per-net
nfsd: per-net NFSd up flag introduced
nfsd: move per-net startup code to separated function
nfsd: pass net to __write_ports() and down
nfsd: pass net to nfsd_set_nrthreads()
...
|
| |
| |
| |
| |
| |
| |
| | |
It may be a matter of personal taste, but I find this makes the code
clearer.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| |
| |
| |
| |
| | |
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| |
| |
| |
| |
| |
| |
| | |
This is a cleanup patch.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Over TCP, RPC's are preceded by a single 4-byte field telling you how
long the rpc is (in bytes). The spec also allows you to send an RPC in
multiple such records (the high bit of the length field is used to tell
you whether this is the final record).
We've survived for years without supporting this because in practice the
clients we care about don't use it. But the userland rpc libraries do,
and every now and then an experimental client will run into this. (Most
recently I noticed it while trying to write a pynfs check.) And we're
really on the wrong side of the spec here--let's fix this.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Keep a separate field, sk_datalen, that tracks only the data contained
in a fragment, not including the fragment header.
For now, this is always just max(0, sk_tcplen - 4), but after we allow
multiple fragments sk_datalen will accumulate the total rpc data size
while sk_tcplen only tracks progress receiving the current fragment.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| |
| |
| |
| |
| |
| |
| | |
The full reclen doesn't include the fragment header, but sk_tcplen does.
Fix this to make it an apples-to-apples comparison.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Soon we want to support multiple fragments, in which case it may be
legal for a single fragment to be smaller than 8 bytes, so we'll want to
delay this check till we've reached the last fragment.
Also fix an outdated comment.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Byte-swapping in place is always a little dubious.
Let's instead define this field to always be big-endian, and do the
swapping on demand where we need it.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
| |
| |
| |
| |
| |
| |
| | |
In general I'd rather random bad behavior on the network won't trigger a
printk.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
Pull Ceph update from Sage Weil:
"There are a few different groups of commits here. The largest is
Alex's ongoing work to enable the coming RBD features (cloning,
striping). There is some cleanup in libceph that goes along with it.
Cyril and David have fixed some problems with NFS reexport (leaking
dentries and page locks), and there is a batch of patches from Yan
fixing problems with the fs client when running against a clustered
MDS. There are a few bug fixes mixed in for good measure, many of
which will be going to the stable trees once they're upstream.
My apologies for the late pull. There is still a gremlin in the rbd
map/unmap code and I was hoping to include the fix for that as well,
but we haven't been able to confirm the fix is correct yet; I'll send
that in a separate pull once it's nailed down."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (68 commits)
rbd: get rid of rbd_{get,put}_dev()
libceph: register request before unregister linger
libceph: don't use rb_init_node() in ceph_osdc_alloc_request()
libceph: init event->node in ceph_osdc_create_event()
libceph: init osd->o_node in create_osd()
libceph: report connection fault with warning
libceph: socket can close in any connection state
rbd: don't use ENOTSUPP
rbd: remove linger unconditionally
rbd: get rid of RBD_MAX_SEG_NAME_LEN
libceph: avoid using freed osd in __kick_osd_requests()
ceph: don't reference req after put
rbd: do not allow remove of mounted-on image
libceph: Unlock unprocessed pages in start_read() error path
ceph: call handle_cap_grant() for cap import message
ceph: Fix __ceph_do_pending_vmtruncate
ceph: Don't add dirty inode to dirty list if caps is in migration
ceph: Fix infinite loop in __wake_requests
ceph: Don't update i_max_size when handling non-auth cap
bdi_register: add __printf verification, fix arg mismatch
...
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
In kick_requests(), we need to register the request before we
unregister the linger request. Otherwise the unregister will
reset the request's osd pointer to NULL.
Signed-off-by: Alex Elder <elder@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
|