| Commit message (Collapse) | Author | Age | Files | Lines |
| |
When udp_recvmsg() is executed, on x86_64 and other archs, most skb
fields are on cold cachelines.
If the skb is linear and the kernel doesn't need to compute the udp
csum, only a handful of skb fields are required by udp_recvmsg().
Since we already use skb->dev_scratch to cache hot data, and
there are 32 bits unused on 64 bit archs, use that field to cache
as much data as we can, and try to prefetch on dequeue the relevant
fields that are left out.
This can save up to 2 cache misses per packet.
v1 -> v2:
- changed udp_dev_scratch field types to the u{32,16} variants,
replaced bitfield with bool
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
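The packing idea described above can be sketched in userspace C. This is a minimal illustration and not the kernel code: the struct name scratch_sketch, its field names and the two helpers are assumptions made for the example, and it assumes a common ABI where bool occupies one byte so the whole record fits in 64 bits.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: the few values the receive path needs, packed so
 * that they fit in a single 64-bit scratch word. */
struct scratch_sketch {
    uint32_t truesize;         /* memory accounted to the packet */
    uint16_t len;              /* payload length */
    bool     is_linear;        /* no paged fragments */
    bool     csum_unnecessary; /* checksum already validated */
};

/* Fails to compile if the record does not fit in the scratch word. */
_Static_assert(sizeof(struct scratch_sketch) <= sizeof(uint64_t),
               "scratch record must fit in 64 bits");

/* Store the record into one 64-bit word (done once, at enqueue time). */
static uint64_t scratch_pack(const struct scratch_sketch *s)
{
    uint64_t word = 0;

    memcpy(&word, s, sizeof(*s));
    return word;
}

/* Read the record back from the word (done at dequeue time, touching a
 * single hot location instead of several cold ones). */
static struct scratch_sketch scratch_unpack(uint64_t word)
{
    struct scratch_sketch s;

    memcpy(&s, &word, sizeof(s));
    return s;
}
```

The design point is that the reader only touches one word that is already hot, instead of chasing several fields spread over cold cachelines.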
| |
Since UDP no longer uses sk->destructor, we can completely clear
the skb head state before enqueuing. Amend and use
skb_release_head_state() for that.
All head states share a single cacheline, which is not
normally used/accessed on dequeue. We can avoid accessing that
cacheline entirely by implementing, and using in the UDP code, a
specialized skb free helper which ignores the skb head state.
This saves a cacheline miss at skb deallocation time.
v1 -> v2:
replaced secpath_reset() with skb_release_head_state()
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
The same code is replicated in 3 different places; move it to a
common helper.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
Allow for tc BPF programs to set a skb->hash, apart from clearing
and triggering a recalc that we have right now. It allows for BPF
to implement a custom hashing routine for skb_get_hash().
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
Since cg_skb_func_proto() doesn't do anything else than just calling
into sk_filter_func_proto(), remove it and set sk_filter_func_proto()
directly for .get_func_proto callback.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
After moving the skb->dev and skb->protocol initialization into
ip6_output, setting skb->dev inside ip6_fragment is unnecessary.
Fixes: 97a7a37a7b7b("ipv6: Initial skb->dev and skb->protocol in ip6_output")
Signed-off-by: Chenbo Feng <fengc@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
sctp_assoc_set_id already does the assoc id check at the beginning when
processing dupcookie, so there is no need to do the same check before
calling it.
v1->v2:
fix some typos Marcelo pointed out in the changelog.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
This patch is to use read_lock_bh instead of local_bh_disable
and read_lock in sctp_eps_seq_show.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
Dmitry got the following recursive locking report while running the
syzkaller fuzzer. The call trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0x2ee/0x3ef lib/dump_stack.c:52
print_deadlock_bug kernel/locking/lockdep.c:1729 [inline]
check_deadlock kernel/locking/lockdep.c:1773 [inline]
validate_chain kernel/locking/lockdep.c:2251 [inline]
__lock_acquire+0xef2/0x3430 kernel/locking/lockdep.c:3340
lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755
lock_sock_nested+0xcb/0x120 net/core/sock.c:2536
lock_sock include/net/sock.h:1460 [inline]
sctp_close+0xcd/0x9d0 net/sctp/socket.c:1497
inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
inet6_release+0x50/0x70 net/ipv6/af_inet6.c:432
sock_release+0x8d/0x1e0 net/socket.c:597
__sock_create+0x38b/0x870 net/socket.c:1226
sock_create+0x7f/0xa0 net/socket.c:1237
sctp_do_peeloff+0x1a2/0x440 net/sctp/socket.c:4879
sctp_getsockopt_peeloff net/sctp/socket.c:4914 [inline]
sctp_getsockopt+0x111a/0x67e0 net/sctp/socket.c:6628
sock_common_getsockopt+0x95/0xd0 net/core/sock.c:2690
SYSC_getsockopt net/socket.c:1817 [inline]
SyS_getsockopt+0x240/0x380 net/socket.c:1799
entry_SYSCALL_64_fastpath+0x1f/0xc2
This warning is triggered because the lock held by sctp_getsockopt() is on
one socket, while the other lock that sctp_close() takes later is on
the newly created (and failed) socket during the peeloff operation.
This patch avoids the warning by using lock_sock with subclass
SINGLE_DEPTH_NESTING, as suggested by Wang Cong and Marcelo.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
This patch removes unneeded forward declaration of tpacket_snd()
in net/packet/af_packet.c.
Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
There are currently two tcp_filter hooks in the tcp_ipv6 ingress path.
One is in tcp_v6_rcv and the other is in tcp_v6_do_rcv. The
tcp_filter() call inside tcp_v6_do_rcv appears redundant, and some packets
will be filtered twice in this situation. This causes trouble
when using eBPF filters to account traffic data.
Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
When the user tries to assign a specific nsid, idr_alloc() is called with
the range [nsid, nsid+1]. If this nsid is already used, idr_alloc() returns
ENOSPC (No space left on device). In our case, it's better to return
EEXIST to make it clear that the nsid is not available.
CC: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
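The error translation described above can be sketched as follows. This is a userspace illustration under stated assumptions: sketch_alloc_range() is a hypothetical stand-in for an ID allocator over a half-open range that reports -ENOSPC when the range is full, and sketch_alloc_nsid(), SKETCH_MAX_IDS and the -EINVAL bounds check are invented for the example rather than taken from the patch.

```c
#include <errno.h>

#define SKETCH_MAX_IDS 16

/* Hypothetical ID table: nonzero means the id is taken. */
static unsigned char sketch_used[SKETCH_MAX_IDS];

/* Stand-in allocator: first free id in [start, end), or -ENOSPC. */
static int sketch_alloc_range(int start, int end)
{
    for (int id = start; id < end && id < SKETCH_MAX_IDS; id++) {
        if (!sketch_used[id]) {
            sketch_used[id] = 1;
            return id;
        }
    }
    return -ENOSPC;
}

/* reqid >= 0 asks for that exact id; reqid < 0 asks for any free id. */
static int sketch_alloc_nsid(int reqid)
{
    int id;

    if (reqid >= SKETCH_MAX_IDS)
        return -EINVAL;

    if (reqid >= 0) {
        /* A one-element range can only be "full" if reqid is taken,
         * so report that as "already exists" instead of "no space". */
        id = sketch_alloc_range(reqid, reqid + 1);
        return id == -ENOSPC ? -EEXIST : id;
    }

    return sketch_alloc_range(0, SKETCH_MAX_IDS);
}
```

Only the specific-id path translates the error; a request for any free id still reports -ENOSPC when the table is genuinely full.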
| |
It helps the user to identify errors.
CC: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
Move the initialization of skb->dev and skb->protocol from
ip6_finish_output2 to ip6_output. This makes the skb->dev and
skb->protocol information available to the CGROUP eBPF filter.
Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
The ipvlan code already knows how to detect when a duplicate address is
about to be assigned to an ipvlan device. However, that failure is not
propagated outward and leads to a silent failure.
Introduce a validation step at ip address creation time and allow device
drivers to register to validate the incoming ip addresses. The ipvlan
code is the first consumer. If it detects an address in use, we can
return an error to the user before beginning to commit the new ifa in
the networking code.
This can be especially useful if it is necessary to provision many
ipvlans in containers. The provisioning software (or operator) can use
this to detect situations where an ip address is unexpectedly in use.
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
When inheriting tx_flags from one skbuff to another, always apply a
mask to avoid overwriting unrelated bits in the field.
The two SKBTX_SHARED_FRAG cases clear all other bits. In practice,
tx_flags are zero at this point now. But this is fragile. Timestamp
flags are set, for instance, in tcp_gso_segment, after this clear
in skb_segment.
The SKBTX_ANY_TSTAMP mask in __skb_tstamp_tx ensures that new
skbs do not accidentally inherit flags such as SKBTX_SHARED_FRAG.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
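The difference between overwriting and masked inheritance can be sketched in a few lines of userspace C. The flag names and bit values below are hypothetical placeholders chosen for the example, not the kernel's SKBTX_* definitions, and inherit_flags() is an illustrative helper.

```c
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define SK_TX_HW_TSTAMP   (1u << 0)
#define SK_TX_SW_TSTAMP   (1u << 1)
#define SK_TX_SHARED_FRAG (1u << 5)
#define SK_TX_ANY_TSTAMP  (SK_TX_HW_TSTAMP | SK_TX_SW_TSTAMP)

/* Fragile form: assigning the masked source discards whatever the
 * destination already carried. */
static uint8_t overwrite_flags(uint8_t dst, uint8_t src, uint8_t mask)
{
    (void)dst;
    return (uint8_t)(src & mask);
}

/* Robust form: copy only the bits selected by mask and keep every
 * unrelated bit already set in the destination. */
static uint8_t inherit_flags(uint8_t dst, uint8_t src, uint8_t mask)
{
    return (uint8_t)(dst | (src & mask));
}
```

With a destination that already has a timestamp bit set, the first helper loses it while the second keeps it, which is the fragility the message refers to.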
| |
Currently there's no way to dump the VIF table for an ipmr table other
than the default (via proc). This is a major issue when debugging ipmr
issues and in general it is good to know which interfaces are
configured. This patch adds support for RTM_GETLINK for the ipmr family
so we can dump the VIF table and the ipmr table's current config for
each table. We're protected by rtnl so no need to acquire RCU or
mrt_lock.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
Remove support for bridge bypass ndos from stacked devices. At this point
no driver which supports stacked device behavior offload supports operation
with the SELF flag. The case for the upper device is already taken care of
in both of the following cases:
1. FDB add/del - driver should check at the notification cb if the
stacked device contains its ports.
2. Port attribute - calls switchdev code directly which checks
for the case of a stacked device.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
When a new static FDB is added to the bridge, a notification is sent to
the driver for offload. In case of successful offload the driver should
notify the bridge back, which in turn should mark the FDB as offloaded.
Currently, externally learned is treated as equivalent to being offloaded,
which is not correct because FDBs which are added from user-space are
also marked as externally learned. In order to specify whether an FDB was
successfully offloaded, a new flag is introduced.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
Currently the bridge doesn't notify the underlying devices about newly
learned FDBs. The FDB sync is placed on the switchdev notifier chain
because devices may potentially learn FDBs that are not directly related
to their ports, for example:
1. Mixed SW/HW bridge - FDBs that point to the ASICs external devices
should be offloaded as CPU traps in order to
perform forwarding in slow path.
2. EVPN - Externally learned FDBs for the vtep device.
Notification is sent only about static FDB add/del. This is done due
to the fact that currently this is the only scenario supported by switch
drivers.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
In order to use the switchdev notifier chain for FDB sync with the
device it has to be changed to atomic. This is done because the bridge
can learn new FDBs in atomic context.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
This is done as a preparation to moving the switchdev notifier chain
to be atomic. The FDB external learning should be called under rtnl
or rcu.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
Currently the flood, learning and learning_sync port attributes are
offloaded by setting the SELF flag. Add support for offloading the
flood and learning attributes through the bridge code. In case of
setting an unsupported flag on an offloaded port the operation will
fail.
The learning_sync attribute doesn't have any software representation
and cannot be offloaded through the bridge code.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
In a multi-chip switch fabric, it is currently the responsibility of the
driver to add the CPU or DSA (interconnecting chips together) ports as
members of a new VLAN entry. This makes the drivers more complicated.
We want the DSA drivers to be stupid and the DSA core to be the one
responsible for caring about the abstracted switch logic and topology.
Make the DSA core program the CPU and DSA ports as part of the VLAN.
This makes all chips of the data path aware of VIDs spanning
the whole fabric and thus seamlessly adds support for cross-chip VLAN.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| |
Now that the VLAN object is propagated to every switch chip of the
switch fabric, we can easily ensure that they all support the required
VLAN operations before modifying an entry on a single switch.
To achieve that, remove the condition skipping other target switches,
and add a bitmap of VLAN members, eventually containing the target port,
if we are programming the switch target.
This will allow us to easily add other VLAN members, such as the DSA or
CPU ports (to introduce cross-chip VLAN support) or the other port
members if we want to reduce hardware accesses later.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Tx length parameter
Here's a set of patches that allows someone initiating a client call with
AF_RXRPC to indicate upfront the total amount of data that will be
transmitted. This will allow AF_RXRPC to encrypt directly from source
buffer to packet rather than having to copy into the buffer and only
encrypt when it's full (the encrypted portion of the packet starts with a
length and so we can't encrypt until we know what the length will be).
The three patches are:
(1) Provide a means of finding out what control message types are actually
supported. EINVAL is reported if an unsupported cmsg type is seen, so
we don't want to set the new cmsg unless we know it will be accepted.
(2) Consolidate some stuff into a struct to reduce the parameter count on
the function that parses the cmsg buffer.
(3) Introduce the RXRPC_TX_LENGTH cmsg. This can be provided on the first
sendmsg() that contributes data to a client call request or a service
call reply. If provided, the user must provide exactly that amount of
data or an error will be incurred.
Changes in version 2:
(*) struct rxrpc_send_params::tx_total_len should be s64 not u64. Thanks to
Julia Lawall for reporting this.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
| | |
Provide a control message that can be specified on the first sendmsg() of a
client call or the first sendmsg() of a service response to indicate the
total length of the data to be transmitted for that call.
Currently, because the length of the payload of an encrypted DATA packet is
encrypted in front of the data, the packet cannot be encrypted until we
know how much data it will hold.
By specifying the length at the beginning of the transmit phase, each DATA
packet length can be set before we start loading data from userspace (where
several sendmsg() calls may contribute to a particular packet).
An error will be returned if too little or too much data is presented in
the Tx phase.
Signed-off-by: David Howells <dhowells@redhat.com>
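The length accounting described above can be sketched as a small userspace state machine. This is only an illustration of the rule that too little or too much data is an error: struct call_sketch, call_queue_data() and the choice of -EMSGSIZE as the error code are assumptions made for the example and are not taken from the AF_RXRPC code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-call state. A negative value means the sender did
 * not declare a total length up front, so nothing is enforced. */
struct call_sketch {
    int64_t tx_total_len; /* bytes still expected, or -1 if undeclared */
};

/* Account for one chunk of data. 'more' says whether further chunks
 * will follow for this call. Returns 0 or a negative errno value. */
static int call_queue_data(struct call_sketch *c, int64_t len, bool more)
{
    if (c->tx_total_len < 0)
        return 0;               /* no declared length to check against */

    if (len > c->tx_total_len)
        return -EMSGSIZE;       /* more data than was declared */

    if (!more && len != c->tx_total_len)
        return -EMSGSIZE;       /* last chunk, but data is still missing */

    c->tx_total_len -= len;
    return 0;
}
```

A signed counter is used here so that "undeclared" can be represented as a negative value, which matches the note in the series about the field being s64 rather than u64.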
| | |
Consolidate the sendmsg control message parameters into a struct rather
than passing them individually through the argument list of
rxrpc_sendmsg_cmsg(). This makes it easier to add more parameters.
Signed-off-by: David Howells <dhowells@redhat.com>
| | |
Provide a getsockopt() call that can query what cmsg types are supported by
AF_RXRPC.
| | |
As the higher level communication only deals with "services",
a service directory is required to keep track of local and remote
services. In order for qrtr clients to be informed about when the
service directory implementation is available, some event needs to be
passed to them.
Rather than introducing support for broadcasting such a message in-band
to all open local sockets, we flag each socket with ENETRESET, as there
are no other expected operations that would benefit from having support
for locally broadcasting messages.
Cc: Courtney Cavin <ccavin@gmail.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
| | |
Per the QMUXv2 protocol specification, a DEL_CLIENT message should be
broadcast when an endpoint is disconnected.
The protocol specification does suggest that the router can keep track
of which nodes the endpoint has been communicating with to not wake up
sleeping remotes unnecessarily, but implementation of this suggestion is
left for the future.
Cc: Courtney Cavin <ccavin@gmail.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
| | |
Per the QMUX protocol specification, a terminating node can send a BYE
control message to signal that the link is going down. Upon receiving
this, all information about remote services should be discarded and local
clients should be notified.
In the event that the link was brought down abruptly, the router is
supposed to act as if a BYE message has arrived. As there is no harm in
receiving an extra BYE from the remote, this patch implements the latter
by injecting a BYE when the link to the remote is unregistered.
The name service will receive the BYE and can implement the notification
to the local clients.
Cc: Courtney Cavin <ccavin@gmail.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
| | |
Extract the allocation and filling in the control message header fields
to a separate function in order to reuse this in subsequent patches.
Cc: Courtney Cavin <ccavin@gmail.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
| | |
DRAM supply shortage and poor memory pressure tracking in the TCP
stack make any change in SO_SNDBUF/SO_RCVBUF (or equivalent autotuning
limits) and tcp_mem[] quite hazardous.
The TCPMemoryPressures SNMP counter is an indication of tcp_mem sysctl
limits being hit, but it only tracks the number of transitions.
If TCP stack behavior under stress were perfect:
1) It would maintain memory usage close to the limit.
2) Memory pressure state would be entered for short times.
We certainly prefer 100 events lasting 10ms compared to one event
lasting 200 seconds.
This patch adds a new SNMP counter tracking the cumulative duration of
memory pressure events, given in ms units.
$ cat /proc/sys/net/ipv4/tcp_mem
3088 4117 6176
$ grep TCP /proc/net/sockstat
TCP: inuse 180 orphan 0 tw 2 alloc 234 mem 4140
$ nstat -n ; sleep 10 ; nstat |grep Pressure
TcpExtTCPMemoryPressures 1700
TcpExtTCPMemoryPressuresChrono 5209
v2: Used EXPORT_SYMBOL_GPL() instead of EXPORT_SYMBOL() as David
instructed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
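The difference between counting transitions and accumulating duration can be sketched in userspace C. This is a minimal model, not the kernel implementation: struct pressure_sketch, its fields and the two helpers are hypothetical names, and timestamps are passed in explicitly as milliseconds instead of being read from a clock.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one memory pressure state. */
struct pressure_sketch {
    bool     active;      /* currently under pressure? */
    uint64_t since_ms;    /* when the current episode started */
    uint64_t transitions; /* number of times pressure was entered */
    uint64_t chrono_ms;   /* cumulative time spent under pressure */
};

/* Enter pressure: counts one transition and records the start time.
 * Repeated calls during an ongoing episode are ignored. */
static void pressure_enter(struct pressure_sketch *p, uint64_t now_ms)
{
    if (p->active)
        return;
    p->active = true;
    p->since_ms = now_ms;
    p->transitions++;
}

/* Leave pressure: adds the episode length to the cumulative counter. */
static void pressure_leave(struct pressure_sketch *p, uint64_t now_ms)
{
    if (!p->active)
        return;
    p->chrono_ms += now_ms - p->since_ms;
    p->active = false;
}
```

Reading both counters together distinguishes many short episodes from a few long ones, which the transition count alone cannot do.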
| | |
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| | |
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| | |
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| | |
We want to move some TCP sysctls to net namespaces in the future.
Since tcp_window_scaling, tcp_sack and tcp_timestamps are fetched
from tcp_parse_options(), we need to pass it an extra parameter.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|/
| |
We need to push the chain index down to the drivers, so they know
which chain the rule belongs to. For now, no driver supports
multichain offload, so only chain 0 is supported. This is needed to
prevent chain squashes during offload for now. Later this will be used
to implement multichain offload.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\
| | |
Just some simple overlapping changes in marvell PHY driver
and the DSA core code.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |\
| | | |
Pull networking fixes from David Miller:
1) Made TCP congestion control documentation match current reality,
from Anmol Sarma.
2) Various build warning and failure fixes from Arnd Bergmann.
3) Fix SKB list leak in ipv6_gso_segment().
4) Use after free in ravb driver, from Eugeniu Rosca.
5) Don't use udp_poll() in ping protocol driver, from Eric Dumazet.
6) Don't crash in PCI error recovery of cxgb4 driver, from Guilherme
Piccoli.
7) _SRC_NAT_DONE_BIT needs to be cleared using atomics, from Liping
Zhang.
8) Use after free in vxlan deletion, from Mark Bloch.
9) Fix ordering of NAPI poll enabled in ethoc driver, from Max
Filippov.
10) Fix stmmac hangs with TSO, from Niklas Cassel.
11) Fix crash in CALIPSO ipv6, from Richard Haines.
12) Clear nh_flags properly on mpls link up. From Roopa Prabhu.
13) Fix regression in sk_err socket error queue handling, noticed by
ping applications. From Soheil Hassas Yeganeh.
14) Update mlx4/mlx5 MAINTAINERS information.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (78 commits)
net: stmmac: fix a broken u32 less than zero check
net: stmmac: fix completely hung TX when using TSO
net: ethoc: enable NAPI before poll may be scheduled
net: bridge: fix a null pointer dereference in br_afspec
ravb: Fix use-after-free on `ifconfig eth0 down`
net/ipv6: Fix CALIPSO causing GPF with datagram support
net: stmmac: ensure jumbo_frm error return is correctly checked for -ve value
Revert "sit: reload iphdr in ipip6_rcv"
i40e/i40evf: proper update of the page_offset field
i40e: Fix state flags for bit set and clean operations of PF
iwlwifi: fix host command memory leaks
iwlwifi: fix min API version for 7265D, 3168, 8000 and 8265
iwlwifi: mvm: clear new beacon command template struct
iwlwifi: mvm: don't fail when removing a key from an inexisting sta
iwlwifi: pcie: only use d0i3 in suspend/resume if system_pm is set to d0i3
iwlwifi: mvm: fix firmware debug restart recording
iwlwifi: tt: move ucode_loaded check under mutex
iwlwifi: mvm: support ibss in dqa mode
iwlwifi: mvm: Fix command queue number on d0i3 flow
iwlwifi: mvm: rs: start using LQ command color
...
| | | |
We might call br_afspec() with p == NULL which is a valid use case if
the action is on the bridge device itself, but the bridge tunnel code
dereferences the p pointer without checking, so check if p is null
first.
Reported-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Fixes: efa5356b0d97 ("bridge: per vlan dst_metadata netlink support")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| | | |
When using CALIPSO with IPPROTO_UDP it is possible to trigger a GPF as the
IP header may have moved.
Also update the payload length after adding the CALIPSO option.
Signed-off-by: Richard Haines <richard_c_haines@btinternet.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| | | |
This reverts commit b699d0035836f6712917a41e7ae58d84359b8ff9.
As per Eric Dumazet, the pskb_may_pull() is a NOP in this
particular case, so the 'iph' reload is unnecessary.
Signed-off-by: David S. Miller <davem@davemloft.net>
| | | |
We must free the allocated skb when genlmsg_put() fails.
Fixes: 1555d204e743 ("devlink: Support for pipeline debug (dpipe)")
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| | | |
Since iptunnel_pull_header() can call pskb_may_pull(),
we must reload any pointer that was related to skb->head.
Fixes: a09a4c8dd1ec ("tunnels: Remove encapsulation offloads on decap")
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| | | |
Alexander reported various KASAN messages triggered in recent kernels.
The problem is that ping sockets should not use udp_poll() in the first
place, and recent changes in the UDP stack finally exposed this old bug.
Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind")
Fixes: 6d0bfe226116 ("net: ipv6: Add IPv6 support to the ping socket.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Sasha Levin <alexander.levin@verizon.com>
Cc: Solar Designer <solar@openwall.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Acked-By: Lorenzo Colitti <lorenzo@google.com>
Tested-By: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| | | |
Commit 9520ed8fb841 ("net: dsa: use cpu_switch instead of ds[0]")
replaced the use of dst->ds[0] with dst->cpu_switch since that is
functionally equivalent. However, we can now run into a use after free
scenario after unbinding then rebinding the switch driver.
The use after free happens because we do correctly initialize
dst->cpu_switch the first time we probe in dsa_cpu_parse(), then we
unbind the driver: dsa_dst_unapply() is called, and we rebind again.
dst->cpu_switch now points to a freed "ds" structure, and so when we
finally dereference it in dsa_cpu_port_ethtool_setup(), we oops.
To fix this, simply set dst->cpu_switch to NULL in dsa_dst_unapply()
which guarantees that we always correctly re-assign dst->cpu_switch in
dsa_cpu_parse().
Fixes: 9520ed8fb841 ("net: dsa: use cpu_switch instead of ds[0]")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
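The stale pointer pattern and its fix can be sketched in userspace C. The types and functions below (sw_sketch, tree_sketch, tree_parse(), tree_unapply()) are hypothetical stand-ins for the structures named in the message, and the "assign only when unset" behavior of the parse step is an assumption made for the illustration.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a switch and the tree that refers to it. */
struct sw_sketch {
    int index;
};

struct tree_sketch {
    struct sw_sketch *cpu_switch; /* cached pointer, set on first parse */
};

/* Parse step: cache the switch only if nothing is cached yet, then use
 * the cached pointer. If a stale pointer survived an unbind, this would
 * keep using the old (possibly freed) object. */
static int tree_parse(struct tree_sketch *t, struct sw_sketch *ds)
{
    if (!t->cpu_switch)
        t->cpu_switch = ds;
    return t->cpu_switch->index;
}

/* Unapply step: drop the cached pointer so the next parse re-assigns it
 * instead of dereferencing an object from the previous bind. */
static void tree_unapply(struct tree_sketch *t)
{
    t->cpu_switch = NULL;
}
```

Resetting the pointer at teardown is what guarantees that a later rebind caches the new object rather than reusing the old one.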
| | | |
If ip6_find_1stfragopt() fails and we return an error we have to free
up 'segs' because nobody else is going to.
Fixes: 2423496af35d ("ipv6: Prevent overrun when parsing v6 header options")
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
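The error path cleanup described above can be sketched with a plain linked list in userspace C. All names here (seg_sketch, seg_alloc_list(), seg_free_list(), segment_sketch(), the live_segs counter and the hdr_err parameter) are hypothetical and exist only to illustrate freeing a partially built result before returning an error.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical segment node. */
struct seg_sketch {
    struct seg_sketch *next;
};

/* Number of segments currently allocated, to make leaks observable. */
static int live_segs;

static void seg_free_list(struct seg_sketch *head)
{
    while (head) {
        struct seg_sketch *next = head->next;

        free(head);
        live_segs--;
        head = next;
    }
}

/* Build a list of n segments, or return NULL if allocation fails. */
static struct seg_sketch *seg_alloc_list(int n)
{
    struct seg_sketch *head = NULL;

    for (int i = 0; i < n; i++) {
        struct seg_sketch *s = malloc(sizeof(*s));

        if (!s) {
            seg_free_list(head);
            return NULL;
        }
        s->next = head;
        head = s;
        live_segs++;
    }
    return head;
}

/* hdr_err < 0 models a header parsing failure that happens after the
 * segments were created: the list must be freed here, because the
 * caller only sees the error and never gets the list. */
static struct seg_sketch *segment_sketch(int n, int hdr_err)
{
    struct seg_sketch *segs = seg_alloc_list(n);

    if (segs && hdr_err < 0) {
        seg_free_list(segs);
        return NULL;
    }
    return segs;
}
```

The point of the sketch is ownership: once the function decides to fail, it is the last holder of the list and has to release it.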
| | | |
Prior to f5f99309fa74 (sock: do not set sk_err in
sock_dequeue_err_skb), sk_err was reset to the error of
the skb on the head of the error queue.
Applications, most notably ping, are relying on this
behavior to reset sk_err for ICMP packets.
Set sk_err to the ICMP error when there is an ICMP packet
at the head of the error queue.
Fixes: f5f99309fa74 (sock: do not set sk_err in sock_dequeue_err_skb)
Reported-by: Cyril Hrubis <chrubis@suse.cz>
Tested-by: Cyril Hrubis <chrubis@suse.cz>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|