summaryrefslogtreecommitdiffstats
path: root/net (follow)
Commit message (Collapse)AuthorAgeFilesLines
* lwtunnel: remove unused but set variableRoopa Prabhu2017-03-141-2/+0
| | | | | | | | | | | silences the below warning: net/core/lwtunnel.c: In function ‘lwtunnel_valid_encap_type_attr’: net/core/lwtunnel.c:165:17: warning: variable ‘nla’ set but not used [-Wunused-but-set-variable] Fixes: 9ed59592e3e3 ("lwtunnel: fix autoload of lwt modules") Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* rds: ib: unmap the scatter/gather list when errorZhu Yanjun2017-03-141-7/+19
| | | | | | | | | | | When some errors occur, the scatter/gather list mapped to DMA addresses should be handled. Cc: Joe Jin <joe.jin@oracle.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* rds: ib: add the static type to the functionZhu Yanjun2017-03-142-4/+3
| | | | | | | | | | | The function rds_ib_map_fmr is used only in the ib_fmr.c file. As such, the static type is added to limit it in this file. Cc: Joe Jin <joe.jin@oracle.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* rds: ib: remove redundant ib_dealloc_fmrZhu Yanjun2017-03-141-5/+2
| | | | | | | | | | | | | The function ib_dealloc_fmr will never be called. As such, it should be removed. Cc: Joe Jin <joe.jin@oracle.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* rds: ib: drop unnecessary rdma_rejectZhu Yanjun2017-03-141-3/+2
| | | | | | | | | | | When rdma_accept fails, rdma_reject is called in it. As such, it is not necessary to execute rdma_reject again. Cc: Joe Jin <joe.jin@oracle.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* atm: remove an unnecessary loopFrancois Romieu2017-03-131-9/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Andrey reported this kernel warning: WARNING: CPU: 0 PID: 4114 at kernel/sched/core.c:7737 __might_sleep+0x149/0x1a0 do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff813fcb22>] prepare_to_wait+0x182/0x530 The deeply nested alloc_skb is a problem. Diagnosis: nesting is wrong. It makes zero sense. Fix it and the implicit task state change problem automagically goes away. alloc_skb() does not need to be in the "while" loop. alloc_skb() does not need to be in the {prepare_to_wait/add_wait_queue ... finish_wait/remove_wait_queue} block. I claim that: - alloc_tx() should only perform the "wait_for_decent_tx_drain" part - alloc_skb() ought to be done directly in vcc_sendmsg - alloc_skb() failure can be handled gracefully in vcc_sendmsg - alloc_skb() may use a (m->msg_flags & MSG_DONTWAIT) dependent GFP_{KERNEL / ATOMIC} flag Reported-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-and-Tested-by: Chas Williams <3chas3@gmail.com> Signed-off-by: Chas Williams <3chas3@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* mpls: allow TTL propagation from IP packets to be configuredRobert Shearman2017-03-132-13/+71
| | | | | | | | | | | | | | | | | | | | | | | Allow TTL propagation from IP packets to MPLS packets to be configured. Add a new optional LWT attribute, MPLS_IPTUNNEL_TTL, which allows the TTL to be set in the resulting MPLS packet, with the value of 0 having the semantics of enabling propagation of the TTL from the IP header (i.e. non-zero values disable propagation). Also allow the configuration to be overridden globally by reusing the same sysctl to control whether the TTL is propagated from IP packets into the MPLS header. If the per-LWT attribute is set then it overrides the global configuration. If the TTL isn't propagated then a default TTL value is used which can be configured via a new sysctl, "net.mpls.default_ttl". This is kept separate from the configuration of whether IP TTL propagation is enabled as it can be used in the future when non-IP payloads are supported (i.e. where there is no payload TTL that can be propagated). Signed-off-by: Robert Shearman <rshearma@brocade.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Tested-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* mpls: allow TTL propagation to IP packets to be configuredRobert Shearman2017-03-132-8/+86
| | | | | | | | | | | | | | | | | | | | | | Provide the ability to control on a per-route basis whether the TTL value from an MPLS packet is propagated to an IPv4/IPv6 packet when the last label is popped as per the theoretical model in RFC 3443 through a new route attribute, RTA_TTL_PROPAGATE which can be 0 to mean disable propagation and 1 to mean enable propagation. In order to provide the ability to change the behaviour for packets arriving with IPv4/IPv6 Explicit Null labels and to provide an easy way for a user to change the behaviour for all existing routes without having to reprogram them, a global knob is provided. This is done through the addition of a new per-namespace sysctl, "net.mpls.ip_ttl_propagate", which defaults to enabled. If the per-route attribute is set (either enabled or disabled) then it overrides the global configuration. Signed-off-by: Robert Shearman <rshearma@brocade.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Tested-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sch_tbf: Remove bogus semicolon in if() conditional.David S. Miller2017-03-131-1/+1
| | | | | | Fixes: 49b499718fa1 ("net: sched: make default fifo qdiscs appear in the dump") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* drop_monitor: use setup_timerGeliang Tang2017-03-131-3/+2
| | | | | | | Use setup_timer() instead of init_timer() to simplify the code. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Eliminate duplicated codes by creating one new function in_dev_select_addrGao Feng2017-03-131-14/+18
| | | | | | | | | There are two duplicated loops codes which used to select right address in current codes. Now eliminate these codes by creating one new function in_dev_select_addr. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: add get and set sockopt for reconf_enableXin Long2017-03-132-0/+88
| | | | | | | | | | | | | | This patchset is to add SCTP_RECONFIG_SUPPORTED sockopt, it would set and get asoc reconf_enable value when asoc_id is set, or it would set and get ep reconf_enalbe value if asoc_id is 0. It is also to add sysctl interface for users to set the default value for reconf_enable. After this patch, stream reconf will work. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: implement receiver-side procedures for the Reconf Response ParameterXin Long2017-03-132-3/+157
| | | | | | | | | | | | This patch is to implement Receiver-Side Procedures for the Re-configuration Response Parameter in rfc6525 section 5.2.7. sctp_process_strreset_resp would process the response for any kind of reconf request, and the stream reconf is applied only when the response result is success. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: implement receiver-side procedures for the Add Incoming Streams ↵Xin Long2017-03-132-12/+65
| | | | | | | | | | | | | | Request Parameter This patch is to implement Receiver-Side Procedures for the Add Incoming Streams Request Parameter described in rfc6525 section 5.2.6. It is also to fix that it shouldn't have add streams when sending addstrm in request, as the process in peer will handle it by sending a addstrm out request back. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: implement receiver-side procedures for the Add Outgoing Streams ↵Xin Long2017-03-132-10/+82
| | | | | | | | | | | | | Request Parameter This patch is to add Receiver-Side Procedures for the Add Outgoing Streams Request Parameter described in section 5.2.5. It is also to improve sctp_chunk_lookup_strreset_param, so that it can be used for processing addstrm_out request. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: add support for generating add stream change event notificationXin Long2017-03-131-0/+28
| | | | | | | | This patch is to add Stream Change Event described in rfc6525 section 6.1.3. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: implement receiver-side procedures for the SSN/TSN Reset Request ParameterXin Long2017-03-132-0/+82
| | | | | | | | | | | This patch is to implement Receiver-Side Procedures for the SSN/TSN Reset Request Parameter described in rfc6525 section 6.2.4. The process is kind of complicate, it's wonth having some comments from section 6.2.4 in the codes. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: add support for generating assoc reset event notificationXin Long2017-03-131-0/+28
| | | | | | | | This patch is to add Association Reset Event described in rfc6525 section 6.1.2. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ipv6: Add early demux handler for UDP unicastsubashab@codeaurora.org2017-03-131-0/+59
| | | | | | | | | | | | | | | | | | | | | | | While running a single stream UDPv6 test, we observed that amount of CPU spent in NET_RX softirq was much greater than UDPv4 for an equivalent receive rate. The test here was run on an ARM64 based Android system. On further analysis with perf, we found that UDPv6 was spending significant time in the statistics netfilter targets which did socket lookup per packet. These statistics rules perform a lookup when there is no socket associated with the skb. Since there are multiple instances of these rules based on UID, there will be equal number of lookups per skb. By introducing early demux for UDPv6, we avoid the redundant lookups. This also helped to improve the performance (800Mbps -> 870Mbps) on a CPU limited system in a single stream UDPv6 receive test with 1450 byte sized datagrams using iperf. v1->v2: Use IPv6 cookie to validate dst instead of 0 as suggested by Eric Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: sched: make default fifo qdiscs appear in the dumpJiri Kosina2017-03-1315-16/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | The original reason [1] for having hidden qdiscs (potential scalability issues in qdisc_match_from_root() with single linked list in case of large amount of qdiscs) has been invalidated by 59cc1f61f0 ("net: sched: convert qdisc linked list to hashtable"). This allows us for bringing more clarity and determinism into the dump by making default pfifo qdiscs visible. We're not turning this on by default though, at it was deemed [2] too intrusive / unnecessary change of default behavior towards userspace. Instead, TCA_DUMP_INVISIBLE netlink attribute is introduced, which allows applications to request complete qdisc hierarchy dump, including the ones that have always been implicit/invisible. Singleton noop_qdisc stays invisible, as teaching the whole infrastructure about singletons would require quite some surgery with very little gain (seeing no qdisc or seeing noop qdisc in the dump is probably setting the same user expectation). [1] http://lkml.kernel.org/r/1460732328.10638.74.camel@edumazet-glaptop3.roam.corp.google.com [2] http://lkml.kernel.org/r/20161021.105935.1907696543877061916.davem@davemloft.net Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: fib: Remove redundant argumentIdo Schimmel2017-03-103-14/+10
| | | | | | | | | | We always pass the same event type to fib_notify() and fib_rules_notify(), so we can safely drop this argument. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: fib: Move FIB notification code to a separate fileIdo Schimmel2017-03-104-96/+98
| | | | | | | | | | | | | | | Most of the code concerned with the FIB notification chain currently resides in fib_trie.c, but this isn't really appropriate, as the FIB notification chain is also used for FIB rules. Therefore, it makes sense to move the common FIB notification code to a separate file and have it export the relevant functions, which can be invoked by its different users (e.g., fib_trie.c, fib_rules.c). Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: rename *_sequence_number() to *_seq_and_tsoff()Alexey Kodanev2017-03-104-31/+30
| | | | | | | | | | | The functions that are returning tcp sequence number also setup TS offset value, so rename them to better describe their purpose. No functional changes in this patch. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: add CRC32 as an RSS hash functionJakub Kicinski2017-03-101-0/+1
| | | | | | | | CRC32 engines are usually easily available in hardware and generate OK spread for RSS hash. Add CRC32 RSS hash function to ethtool API. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/socket: use per af lockdep classes for sk queuesPaolo Abeni2017-03-101-18/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the sock queue's spin locks get their lockdep classes by the default init_spin_lock() initializer: all socket families get - usually, see below - a single class for rx, another specific class for tx, etc. This can lead to false positive lockdep splat, as reported by Andrey. Moreover there are two separate initialization points for the sock queues, one in sk_clone_lock() and one in sock_init_data(), so that e.g. the rx queue lock can get one of two possible, different classes, depending on the socket being cloned or not. This change tries to address the above, setting explicitly a per address family lockdep class for each queue's spinlock. Also, move the duplicated initialization code to a single location. v1 -> v2: - renamed the init helper rfc -> v1: - no changes, tested with several different workload Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* flow_dissector: Move GRE dissection into a separate functionJiri Pirko2017-03-091-110/+134
| | | | | | | | Make the main flow_dissect function a bit smaller and move the GRE dissection into a separate function. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* flow_dissector: rename "proto again" goto labelJiri Pirko2017-03-091-4/+4
| | | | | | | | Align with "ip_proto_again" label used in the same function and rename vague "again" to "proto_again". Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* flow_dissector: Fix GRE header error pathJiri Pirko2017-03-091-3/+3
| | | | | | | | | | Now, when an unexpected element in the GRE header appears, we break so the l4 ports are processed. But since the ports are processed unconditionally, there will be certainly random values dissected. Fix this by just bailing out in such situations. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* flow_dissector: Move MPLS dissection into a separate functionJiri Pirko2017-03-091-22/+34
| | | | | | | | | Make the main flow_dissect function a bit smaller and move the MPLS dissection into a separate function. Along with that, do the MPLS header processing only in case the flow dissection user requires it. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* flow_dissector: Move ARP dissection into a separate functionJiri Pirko2017-03-091-53/+67
| | | | | | | | | Make the main flow_dissect function a bit smaller and move the ARP dissection into a separate function. Along with that, do the ARP header processing only in case the flow dissection user requires it. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* decnet: Use TCP nagle macro instead of literal number in decnetGao Feng2017-03-071-6/+7
| | | | | | | | Use existing TCP nagle macro TCP_NAGLE_OFF and TCP_NAGLE_CORK instead of the literal number 1 and 2 in the current decnet codes. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6: Provide ipv6 version of "disable_policy" sysctlDavid Forster2017-03-071-0/+114
| | | | | | | | | This provides equivalent functionality to the existing ipv4 "disable_policy" systcl. ie. Allows IPsec processing to be skipped on terminating packets on a per-interface basis. Signed-off-by: David Forster <dforster@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2017-03-0542-197/+462
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) Fix double-free in batman-adv, from Sven Eckelmann. 2) Fix packet stats for fast-RX path, from Joannes Berg. 3) Netfilter's ip_route_me_harder() doesn't handle request sockets properly, fix from Florian Westphal. 4) Fix sendmsg deadlock in rxrpc, from David Howells. 5) Add missing RCU locking to transport hashtable scan, from Xin Long. 6) Fix potential packet loss in mlxsw driver, from Ido Schimmel. 7) Fix race in NAPI handling between poll handlers and busy polling, from Eric Dumazet. 8) TX path in vxlan and geneve need proper RCU locking, from Jakub Kicinski. 9) SYN processing in DCCP and TCP need to disable BH, from Eric Dumazet. 10) Properly handle net_enable_timestamp() being invoked from IRQ context, also from Eric Dumazet. 11) Fix crash on device-tree systems in xgene driver, from Alban Bedel. 12) Do not call sk_free() on a locked socket, from Arnaldo Carvalho de Melo. 13) Fix use-after-free in netvsc driver, from Dexuan Cui. 14) Fix max MTU setting in bonding driver, from WANG Cong. 15) xen-netback hash table can be allocated from softirq context, so use GFP_ATOMIC. From Anoob Soman. 16) Fix MAC address change bug in bgmac driver, from Hari Vyas. 17) strparser needs to destroy strp_wq on module exit, from WANG Cong. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (69 commits) strparser: destroy workqueue on module exit sfc: fix IPID endianness in TSOv2 sfc: avoid max() in array size rds: remove unnecessary returned value check rxrpc: Fix potential NULL-pointer exception nfp: correct DMA direction in XDP DMA sync nfp: don't tell FW about the reserved buffer space net: ethernet: bgmac: mac address change bug net: ethernet: bgmac: init sequence bug xen-netback: don't vfree() queues under spinlock xen-netback: keep a local pointer for vif in backend_disconnect() netfilter: nf_tables: don't call nfnetlink_set_err() if nfnetlink_send() fails netfilter: nft_set_rbtree: incorrect assumption on lower interval lookups netfilter: nf_conntrack_sip: fix wrong memory initialisation can: flexcan: fix typo in comment can: usb_8dev: Fix memory leak of priv->cmd_msg_buffer can: gs_usb: fix coding style can: gs_usb: Don't use stack memory for USB transfers ixgbe: Limit use of 2K buffers on architectures with 256B or larger cache lines ixgbe: update the rss key on h/w, when ethtool ask for it ...
| * strparser: destroy workqueue on module exitWANG Cong2017-03-041-0/+1
| | | | | | | | | | | | | | Fixes: 43a0c6751a32 ("strparser: Stream parser for messages") Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller2017-03-044-88/+63
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree, they are: 1) Missing check for full sock in ip_route_me_harder(), from Florian Westphal. 2) Incorrect sip helper structure initilization that breaks it when several ports are used, from Christophe Leroy. 3) Fix incorrect assumption when looking up for matching with adjacent intervals in the nft_set_rbtree. 4) Fix broken netlink event error reporting in nf_tables that results in misleading ESRCH errors propagated to userspace listeners. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * netfilter: nf_tables: don't call nfnetlink_set_err() if nfnetlink_send() failsPablo Neira Ayuso2017-03-031-78/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The underlying nlmsg_multicast() already sets sk->sk_err for us to notify socket overruns, so we should not do anything with this return value. So we just call nfnetlink_set_err() if: 1) We fail to allocate the netlink message. or 2) We don't have enough space in the netlink message to place attributes, which means that we likely need to allocate a larger message. Before this patch, the internal ESRCH netlink error code was propagated to userspace, which is quite misleading. Netlink semantics mandate that listeners just hit ENOBUFS if the socket buffer overruns. Reported-by: Alexander Alemayhu <alexander@alemayhu.com> Tested-by: Alexander Alemayhu <alexander@alemayhu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * netfilter: nft_set_rbtree: incorrect assumption on lower interval lookupsPablo Neira Ayuso2017-03-031-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case of adjacent ranges, we may indeed see either the high part of the range in first place or the low part of it. Remove this incorrect assumption, let's make sure we annotate the low part of the interval in case of we have adjacent interva intervals so we hit a matching in lookups. Reported-by: Simon Hanisch <hanisch@wh2.tu-dresden.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * netfilter: nf_conntrack_sip: fix wrong memory initialisationChristophe Leroy2017-03-031-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit 82de0be6862cd ("netfilter: Add helper array register/unregister functions"), struct nf_conntrack_helper sip[MAX_PORTS][4] was changed to sip[MAX_PORTS * 4], so the memory init should have been changed to memset(&sip[4 * i], 0, 4 * sizeof(sip[i])); But as the sip[] table is allocated in the BSS, it is already set to 0 Fixes: 82de0be6862cd ("netfilter: Add helper array register/unregister functions") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * netfilter: use skb_to_full_sk in ip_route_me_harderFlorian Westphal2017-02-281-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | inet_sk(skb->sk) is illegal in case skb is attached to request socket. Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener") Reported by: Daniel J Blueman <daniel@quora.org> Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Daniel J Blueman <daniel@quora.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | rds: remove unnecessary returned value checkZhu Yanjun2017-03-034-14/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function rds_trans_register always returns 0. As such, it is not necessary to check the returned value. Cc: Joe Jin <joe.jin@oracle.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | rxrpc: Fix potential NULL-pointer exceptionDavid Howells2017-03-031-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a potential NULL-pointer exception in rxrpc_do_sendmsg(). The call state check that I added should have gone into the else-body of the if-statement where we actually have a call to check. Found by CoverityScan CID#1414316 ("Dereference after null check"). Fixes: 540b1c48c37a ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg") Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge tag 'mac80211-for-davem-2017-03-02' of ↵David S. Miller2017-03-023-3/+3
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== This contains just the average.h change in order to get it into the tree before adding new users through -next trees. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | average: change to declare precision, not factorJohannes Berg2017-03-023-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Declaring the factor is counter-intuitive, and people are prone to using small(-ish) values even when that makes no sense. Change the DECLARE_EWMA() macro to take the fractional precision, in bits, rather than a factor, and update all users. While at it, add some more documentation. Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | | ipv6: ignore null_entry in inet6_rtm_getroute() tooWANG Cong2017-03-021-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Like commit 1f17e2f2c8a8 ("net: ipv6: ignore null_entry on route dumps"), we need to ignore null entry in inet6_rtm_getroute() too. Return -ENETUNREACH here to sync with IPv4 behavior, as suggested by David. Fixes: a1a22c1206 ("net: ipv6: Keep nexthop of multipath route on admin down") Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | tcp: fix potential double free issue for fastopen_reqWei Wang2017-03-021-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tp->fastopen_req could potentially be double freed if a malicious user does the following: 1. Enable TCP_FASTOPEN_CONNECT sockopt and do a connect() on the socket. 2. Call connect() with AF_UNSPEC to disconnect the socket. 3. Make this socket a listening socket by calling listen(). 4. Accept incoming connections and generate child sockets. All child sockets will get a copy of the pointer of fastopen_req. 5. Call close() on all sockets. fastopen_req will get freed multiple times. Fixes: 19f6d3f3c842 ("net/tcp-fastopen: Add new API support") Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: Introduce sk_clone_lock() error path routineArnaldo Carvalho de Melo2017-03-022-10/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When handling problems in cloning a socket with the sk_clone_locked() function we need to perform several steps that were open coded in it and its callers, so introduce a routine to avoid this duplication: sk_free_unlock_clone(). Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/n/net-ui6laqkotycunhtmqryl9bfx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | dccp: Unlock sock before calling sk_free()Arnaldo Carvalho de Melo2017-03-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code where sk_clone() came from created a new socket and locked it, but then, on the error path didn't unlock it. This problem stayed there for a long while, till b0691c8ee7c2 ("net: Unlock sock before calling sk_free()") fixed it, but unfortunately the callers of sk_clone() (now sk_clone_locked()) were not audited and the one in dccp_create_openreq_child() remained. Now in the age of the syskaller fuzzer, this was finally uncovered, as reported by Dmitry: ---- 8< ---- I've got the following report while running syzkaller fuzzer on 86292b33d4b7 ("Merge branch 'akpm' (patches from Andrew)") [ BUG: held lock freed! ] 4.10.0+ #234 Not tainted ------------------------- syz-executor6/6898 is freeing memory ffff88006286cac0-ffff88006286d3b7, with a lock still held there! (slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>] spin_lock include/linux/spinlock.h:299 [inline] (slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>] sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504 5 locks held by syz-executor6/6898: #0: (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff839a34b4>] lock_sock include/net/sock.h:1460 [inline] #0: (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff839a34b4>] inet_stream_connect+0x44/0xa0 net/ipv4/af_inet.c:681 #1: (rcu_read_lock){......}, at: [<ffffffff83bc1c2a>] inet6_csk_xmit+0x12a/0x5d0 net/ipv6/inet6_connection_sock.c:126 #2: (rcu_read_lock){......}, at: [<ffffffff8369b424>] __skb_unlink include/linux/skbuff.h:1767 [inline] #2: (rcu_read_lock){......}, at: [<ffffffff8369b424>] __skb_dequeue include/linux/skbuff.h:1783 [inline] #2: (rcu_read_lock){......}, at: [<ffffffff8369b424>] process_backlog+0x264/0x730 net/core/dev.c:4835 #3: (rcu_read_lock){......}, at: [<ffffffff83aeb5c0>] ip6_input_finish+0x0/0x1700 net/ipv6/ip6_input.c:59 #4: (slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>] spin_lock include/linux/spinlock.h:299 [inline] #4: (slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>] sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504 Fix it just like was done by b0691c8ee7c2 ("net: Unlock sock before calling sk_free()"). Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170301153510.GE15145@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Merge tag 'batadv-net-for-davem-20170301' of git://git.open-mesh.org/linux-mergeDavid S. Miller2017-03-021-9/+11
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Simon Wunderlich says: ==================== Here are two batman-adv bugfixes: - fix a potential double free when fragment merges fail, by Sven Eckelmann - fix failing tranmission of the 16th (last) fragment if that exists, by Linus Lüssing ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | batman-adv: Fix transmission of final, 16th fragmentLinus Lüssing2017-02-211-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Trying to split and transmit a unicast packet in 16 parts will fail for the final fragment: After having sent the 15th one with a frag_packet.no index of 14, we will increase the the index to 15 - and return with an error code immediately, even though one more fragment is due for transmission and allowed. Fixing this issue by moving the check before incrementing the index. While at it, adding an unlikely(), because the check is actually more of an assertion. Fixes: ee75ed88879a ("batman-adv: Fragment and send skbs larger than mtu") Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
| | * | | batman-adv: Fix double free during fragment merge errorSven Eckelmann2017-02-211-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function batadv_frag_skb_buffer was supposed not to consume the skbuff on errors. This was followed in the helper function batadv_frag_insert_packet when the skb would potentially be inserted in the fragment queue. But it could happen that the next helper function batadv_frag_merge_packets would try to merge the fragments and fail. This results in a kfree_skb of all the enqueued fragments (including the just inserted one). batadv_recv_frag_packet would detect the error in batadv_frag_skb_buffer and try to free the skb again. The behavior of batadv_frag_skb_buffer (and its helper batadv_frag_insert_packet) must therefore be changed to always consume the skbuff to have a common behavior and avoid the double kfree_skb. Fixes: 610bfc6bc99b ("batman-adv: Receive fragmented packets and merge") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>