summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* tcp/dccp: remove obsolete WARN_ON() in icmp handlersEric Dumazet2016-03-182-4/+0
| | | | | | | | | | | | | | | | | | Now SYN_RECV request sockets are installed in ehash table, an ICMP handler can find a request socket while another cpu handles an incoming packet transforming this SYN_RECV request socket into an ESTABLISHED socket. We need to remove the now obsolete WARN_ON(req->sk), since req->sk is set when a new child is created and added into listener accept queue. If this race happens, the ICMP will do nothing special. Fixes: 079096f103fa ("tcp/dccp: install syn_recv requests into ehash table") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Ben Lazarus <blazarus@google.com> Reported-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* vlan: propagate gso_max_segsEric Dumazet2016-03-184-0/+6
| | | | | | | | vlan drivers lack proper propagation of gso_max_segs from lower device. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'thunderx-mdio-fixes'David S. Miller2016-03-173-7/+25
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | David Daney says: ==================== net/phy: Fixes for Cavium Thunder MDIO code. Previous patch set: commit 5fc7cf179449 ("net: thunderx: Cleanup PHY probing code.") commit 1eefee901fca ("phy: mdio-octeon: Refactor into two files/modules") commit 379d7ac7ca31 ("phy: mdio-thunder: Add driver for Cavium Thunder SoC MDIO buses.") Had several problems. We try to fix them here. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: thunderx: Don't leak phy device references on -EPROBE_DEFER condition.David Daney2016-03-171-6/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | It is possible, although unlikely, that probing will find the phy_device for the first LMAC of a thunder BGX device, but then need to fail with -EPROBE_DEFER on a subsequent LMAC. In this case, we need to call put_device() on each of the phy_devices that were obtained, but will be unused due to returning -EPROBE_DEFER. Also, since we can break out of the probing loop early, we need to explicitly call of_node_put() outside of the loop. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: cavium: For Kconfig THUNDER_NIC_BGX, select MDIO_THUNDER.David Daney2016-03-171-1/+1
| | | | | | | | | | | | | | | | Previously we selected MDIO_OCTEON, which after creating the Thunder specific MDIO bus driver is much less useful. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * phy: mdio-cavium: Add missing MODULE_* annotations.David Daney2016-03-171-0/+4
|/ | | | | | | | | | When the code was factored out of mdio-octeon.c, the MODULE_DESCRIPTION, MODULE_AUTHOR and MODULE_LICENSE annotations were inadvertently omitted. Restore them so that we don't get kernel taint warnings upon module loading. Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ppp: ensure file->private_data can't be overriddenGuillaume Nault2016-03-171-14/+17
| | | | | | | | | | | | | | Locking ppp_mutex must be done before dereferencing file->private_data, otherwise it could be modified before ppp_unattached_ioctl() takes the lock. This could lead ppp_unattached_ioctl() to override ->private_data, thus leaking reference to the ppp_file previously pointed to. v2: lock all ppp_ioctl() instead of just checking private_data in ppp_unattached_ioctl(), to avoid ambiguous behaviour. Fixes: f3ff8a4d80e8 ("ppp: push BKL down into the driver") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'arc_emac-next'David S. Miller2016-03-1711-62/+201
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Caesar Wang says: ==================== arc_emac: fixes the emac issues and cleanup emac drivers This series patches are based on kernel 4.5-rc7+ version. Linux version 4.5.0-rc7-next-20160311+ (wxt@nb) (...) #45 SMP Sun Mar 13 16:17:56 The history patch in here: Patch-v1: https://lkml.org/lkml/2016/3/11/209 Patch-v2: https://lkml.org/lkml/2016/3/13/39 Verified on kylin board with my github. https://github.com/Caesar-github/rockchip/tree/kylin/next That's verified on kylin board with ubuntu os. This series patches are built all pass with Mr.robot on https://github.com/Caesar-github/linux/tree/build-emac-v3 How to test and verify? You can refer to the following wiki document. http://rockchip.wikidot.com/linux-develop-guide bootup log: [ 1.264740] rockchip_emac 10200000.ethernet: no regulator found [ 1.270908] rockchip_emac 10200000.ethernet: ARC EMAC detected with id: 0x7fd02 [ 1.278362] rockchip_emac 10200000.ethernet: IRQ is 29 [ 1.283747] rockchip_emac 10200000.ethernet: MAC address is now 06:5d:61:c7:39:41 [ 1.291314] rockchip_emac 10200000.ethernet: GPIO lookup for consumer phy-reset [ 1.291333] rockchip_emac 10200000.ethernet: using device tree for GPIO lookup [ 1.663155] rockchip_emac 10200000.ethernet: connected to Generic PHY phy with id 0xffffc816 [ 8.863448] rockchip_emac 10200000.ethernet eth0: Link is Up - 100Mbps/Full - flow control off root@localhost:/# busybox ping www.baidu.com PING www.baidu.com (14.215.177.38): 56 data bytes 64 bytes from 14.215.177.38: seq=0 ttl=48 time=35.046 ms 64 bytes from 14.215.177.38: seq=1 ttl=48 time=35.095 ms 64 bytes from 14.215.177.38: seq=2 ttl=48 time=34.203 ms 64 bytes from 14.215.177.38: seq=3 ttl=48 time=38.516 ms ... --- 1) This series has 6 patches: (1--->9) net: arc_emac: make the rockchip emac document more compatible net: arc_emac: add phy reset is optional for device tree net: arc_emac: support the phy reset for emac driver net: arc: trivial: cleanup the emac driver clk: rockchip: add node-id for rk3036 emac hclk clk: rockchip: associate the rk3036 HCLK_EMAC clock-id clk: rockchip: add clock-id for rk3036 emac pll source clock clk: rockchip: associate SCLK_MAC_PLL and disable reparenting on rk3036 ARM: dts: rockchip: add support emac for RK3036 2) This series patches have the following descriptions: Hi Rob, David: PATCH[1/9-2/9]: ====> net: arc_emac: make the rockchip emac document more compatible net: arc_emac: add phy reset is optional for device tree The patches change the rockchip emac document for more compatible and Add the phy reset property for document. --- Hi David PATCH[3/9]: ====> net: arc_emac: support the phy reset for emac driver The emac didn't work on kylin board since in some case the clocks parent changed. The kylin hardware connects the phy reset pin, we should use it with real world. As the previous patch discuss on https://patchwork.kernel.org/patch/8186801/ And as sergei/Heiko suggestions on https://patchwork.kernel.org/patch/8564571/ --- Hi David PATCH[4/9]: ====> net: arc: trivial: cleanup the emac driver The first time to look the emac drivers, I think that have to cleanup the drivers with scripts. Although it's the trivial things, in order to be more read. --- Hi Heiko,Michael,Stephen: PATCH[5/9-8/9]: ====> clk: rockchip: rk3036: fix and add node id for emac clock Four-part from https://patchwork.kernel.org/patch/8564581/ clk: rockchip: add node-id for rk3036 emac hclk clk: rockchip: associate the rk3036 HCLK_EMAC clock-id clk: rockchip: add clock-id for rk3036 emac pll source clock clk: rockchip: associate SCLK_MAC_PLL and disable reparenting on rk3036 Add the emac needed clocks for rk3036 SoCs --- Hi Heiko: PATCH[9/9]: ====> ARM: dts: rockchip: add support emac for RK3036 Add the emac needed main info for rk3036 dts. --- Thanks your reviewing! :) Changes in v3: - %s/he/the - Add the Cc people - As Sergei comments, the original name is better, so %s/reset-gpios/phy-reset-gpios - Add the Cc people. - Caused the build error since the missing include head file. - %s/reset/phy-reset to match the device tree. - Add the Cc people - Add the Cc people. - Add the Cc people. - Add the Cc people. - Add the Cc people. - Add the Cc people. - rename reset-gpio to phy-reset-gpios. - change the commit. - remove the pcfg_output_high, that's really not needed for emac. - Add the Cc people. - Fixes the 'zhengxing' to 'Xing Zheng'. Changes in v2: - change the commit and remove the repeat the name 'rockchip'. - %s/phy-reset-gpios/reset-gpios - As the pervious version, Sergei and Heiko comments on https://patchwork.kernel.org/patch/8564571/. - Nevermind, add signed-off since Heiko the original patch, refer the Heiko's test patch on https://github.com/mmind/linux-rockchip/commit/a943c588783438ff1c508dfa8c79f1709aa5775e :) - As the robot notice the build error since overflow in implicit constant conversion. - rename phy-reset-gpio to reset-gpios. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * ARM: dts: rockchip: add to support emac for rk3036 SoCsXing Zheng2016-03-173-0/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the emac device node for rk3036 SoCs. We need to let mac clock under the DPLL which is able to provide the accurate 50MHz what mac_ref need, since that will cause some unstable things if the cpufreq is working. Signed-off-by: Xing Zheng <zhengxing@rock-chips.com> Signed-off-by: Caesar Wang <wxt@rock-chips.com> Cc: linux-rockchip@lists.infradead.org Cc: Xing Zheng <zhengxing@rock-chips.com> Cc: Heiko Stuebner <heiko@sntech.de> Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: David S. Miller <davem@davemloft.net>
| * clk: rockchip: associate SCLK_MAC_PLL and disable reparenting on rk3036Heiko Stuebner2016-03-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The emac needs constant and very specific rate but the possible PLL-sources are very limited, so we expect the PLL source to be set manually on per board and don't want it to get changed in an automatic way later. So add the necessary clock-id and disable reparenting on set_rate calls. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Cc: Michael Turquette <mturquette@baylibre.com> Cc: Heiko Stuebner <heiko@sntech.de> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: linux-clk@vger.kernel.org Signed-off-by: Caesar Wang <wxt@rock-chips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * clk: rockchip: add clock-id for rk3036 emac pll source clockXing Zheng2016-03-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Suitable PLLs for the emac on the rk3036 are difficult to find and one of them is the (continuously changing) APLL. So in most cases it will be necessary to select a PLL manually. So add a clock-id for it. Signed-off-by: Xing Zheng <zhengxing@rock-chips.com> Signed-off-by: Caesar Wang <wxt@rock-chips.com> Cc: Xing Zheng <zhengxing@rock-chips.com> Cc: Michael Turquette <mturquette@baylibre.com> Cc: Heiko Stuebner <heiko@sntech.de> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: linux-clk@vger.kernel.org Cc: linux-rockchip@lists.infradead.org Signed-off-by: David S. Miller <davem@davemloft.net>
| * clk: rockchip: associate the rk3036 HCLK_EMAC clock-idXing Zheng2016-03-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Associate the new clock id the clock. Signed-off-by: Xing Zheng <zhengxing@rock-chips.com> Signed-off-by: Caesar Wang <wxt@rock-chips.com> Cc: Xing Zheng <zhengxing@rock-chips.com> Cc: Michael Turquette <mturquette@baylibre.com> Cc: Heiko Stuebner <heiko@sntech.de> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: linux-clk@vger.kernel.org Cc: linux-rockchip@lists.infradead.org Signed-off-by: David S. Miller <davem@davemloft.net>
| * clk: rockchip: add node-id for rk3036 emac hclkXing Zheng2016-03-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the node-id for the emac hclk to the binding header. Signed-off-by: Xing Zheng <zhengxing@rock-chips.com> Signed-off-by: Caesar Wang <wxt@rock-chips.com> Cc: Xing Zheng <zhengxing@rock-chips.com> Cc: Michael Turquette <mturquette@baylibre.com> Cc: Heiko Stuebner <heiko@sntech.de> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: linux-clk@vger.kernel.org Cc: linux-rockchip@lists.infradead.org Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: arc: trivial: cleanup the emac driverCaesar Wang2016-03-174-57/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch will make the driver more readability The emac has the error and warnings if you run 'scripts/checkpatch.pl -f --subjective xxx' to check. Let's clean up such trivial details. Signed-off-by: Caesar Wang <wxt@rock-chips.com> Cc: Jiri Kosina <trivial@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexander Kochetkov <al.kochet@gmail.com> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: arc_emac: support the phy reset for emac driverCaesar Wang2016-03-172-0/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds to support the emac phy reset. Different boards may require different phy reset duration. Add property phy-reset-duration for emac driver, so that the boards that need a longer reset duration can specify it in their device tree. Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Caesar Wang <wxt@rock-chips.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: netdev@vger.kernel.org Cc: Alexander Kochetkov <al.kochet@gmail.com> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: arc_emac: add phy reset is optional for device treeCaesar Wang2016-03-171-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the following property for arc_emac. 1) phy-reset-gpios: The phy-reset-gpio is an optional property for arc emac device tree boot. Change the binding document to match the driver code. 2) phy-reset-duration: Different boards may require different phy reset duration. Add property phy-reset-duration for device tree probe, so that the boards that need a longer reset duration can specify it in their device tree. Anyway, we can add the above property for arc emac. Signed-off-by: Caesar Wang <wxt@rock-chips.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: devicetree@vger.kernel.org Cc: netdev@vger.kernel.org Cc: "David S. Miller" <davem@davemloft.net> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Cc; Alexander Kochetkov <al.kochet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: arc_emac: make the rockchip emac document more compatibleCaesar Wang2016-03-171-3/+5
|/ | | | | | | | | | | | | | | | Add the rk3036 SoCs to match driver for document since the emac driver has supported the rk3036 SoCs. This patch adds the rk3036/rk3066/rk3188 SoCS to compatible for rockchip emac ducument. Also, that will suit for other SoCs in the future. Signed-off-by: Caesar Wang <wxt@rock-chips.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: devicetree@vger.kernel.org Cc: netdev@vger.kernel.org Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexander Kochetkov <al.kochet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethtool: Set cmd field in ETHTOOL_GLINKSETTINGS response to wrong nwordsBen Hutchings2016-03-171-1/+1
| | | | | | | | | | | | | | | When the ETHTOOL_GLINKSETTINGS implementation finds that userland is using the wrong number of words of link mode bitmaps (or is trying to find out the right numbers) it sets the cmd field to 0 in the response structure. This is inconsistent with the implementation of every other ethtool command, so let's remove that inconsistency before it gets into a stable release. Fixes: 3f1ac7a700d03 ("net: ethtool: add new ETHTOOL_xLINKSETTINGS API") Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
* sh_eth: do not call netif_start_queue() from sh_eth_dev_init()Sergei Shtylyov2016-03-171-2/+4
| | | | | | | | | | | Iff sh_eth_phy_start() call fails in sh_eth_open(), the netif_start_queue() call done by sh_eth_dev_init() is not undone. In order to deal with that, stop calling netif_start_queue() from there, so that it can be called only when the device is fully opened and sh_eth_dev_init() only deals with the hardware initialization, symmetrically to sh_eth_dev_exit()... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bnx2x: don't wait for Tx completion on recoveryYuval Mintz2016-03-171-2/+6
| | | | | | | | | | | | | When driver has hit a parity event, HW can no longer write to host memory. As a result, Tx completions cannot be written to the host SB memory, and waiting for Tx completions eventually timeout. As driver is willing to delay as much as 1-2 seconds per Tx queue for its draining and this delay is sequential, the time to recover might greatly lengthen needlessly in case the recovery is done under multi-connection traffic. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: consolidate local_bh_disable/enable + spin_lock/unlock to _bh variantNicholas Mc Guire2016-03-171-4/+2
| | | | | | | | | local_bh_disable() + spin_lock() is equivalent to spin_lock_bh(), same for the unlock/enable case, so replace the calls by the appropriate wrappers. Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller2016-03-1518-134/+795
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== Netfilter/IPVS/OVS updates for net-next The following patchset contains Netfilter/IPVS fixes and OVS NAT support, more specifically this batch is composed of: 1) Fix a crash in ipset when performing a parallel flush/dump with set:list type, from Jozsef Kadlecsik. 2) Make sure NFACCT_FILTER_* netlink attributes are in place before accessing them, from Phil Turnbull. 3) Check return error code from ip_vs_fill_iph_skb_off() in IPVS SIP helper, from Arnd Bergmann. 4) Add workaround to IPVS to reschedule existing connections to new destination server by dropping the packet and wait for retransmission of TCP syn packet, from Julian Anastasov. 5) Allow connection rescheduling in IPVS when in CLOSE state, also from Julian. 6) Fix wrong offset of SIP Call-ID in IPVS helper, from Marco Angaroni. 7) Validate IPSET_ATTR_ETHER netlink attribute length, from Jozsef. 8) Check match/targetinfo netlink attribute size in nft_compat, patch from Florian Westphal. 9) Check for integer overflow on 32-bit systems in x_tables, from Florian Westphal. Several patches from Jarno Rajahalme to prepare the introduction of NAT support to OVS based on the Netfilter infrastructure: 10) Schedule IP_CT_NEW_REPLY definition for removal in nf_conntrack_common.h. 11) Simplify checksumming recalculation in nf_nat. 12) Add comments to the openvswitch conntrack code, from Jarno. 13) Update the CT state key only after successful nf_conntrack_in() invocation. 14) Find existing conntrack entry after upcall. 15) Handle NF_REPEAT case due to templates in nf_conntrack_in(). 16) Call the conntrack helper functions once the conntrack has been confirmed. 17) And finally, add the NAT interface to OVS. The batch closes with: 18) Cleanup to use spin_unlock_wait() instead of spin_lock()/spin_unlock(), from Nicholas Mc Guire. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * netfilter: nf_conntrack: consolidate lock/unlock into unlock_waitNicholas Mc Guire2016-03-151-4/+2
| | | | | | | | | | | | | | | | | | The spin_lock()/spin_unlock() is synchronizing on the nf_conntrack_locks_all_lock which is equivalent to spin_unlock_wait() but the later should be more efficient. Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * openvswitch: Interface with NAT.Jarno Rajahalme2016-03-144-28/+551
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend OVS conntrack interface to cover NAT. New nested OVS_CT_ATTR_NAT attribute may be used to include NAT with a CT action. A bare OVS_CT_ATTR_NAT only mangles existing and expected connections. If OVS_NAT_ATTR_SRC or OVS_NAT_ATTR_DST is included within the nested attributes, new (non-committed/non-confirmed) connections are mangled according to the rest of the nested attributes. The corresponding OVS userspace patch series includes test cases (in tests/system-traffic.at) that also serve as example uses. This work extends on a branch by Thomas Graf at https://github.com/tgraf/ovs/tree/nat. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * openvswitch: Delay conntrack helper call for new connections.Jarno Rajahalme2016-03-141-5/+16
| | | | | | | | | | | | | | | | | | | | | | There is no need to help connections that are not confirmed, so we can delay helping new connections to the time when they are confirmed. This change is needed for NAT support, and having this as a separate patch will make the following NAT patch a bit easier to review. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * openvswitch: Handle NF_REPEAT in conntrack action.Jarno Rajahalme2016-03-141-2/+8
| | | | | | | | | | | | | | | | | | Repeat the nf_conntrack_in() call when it returns NF_REPEAT. This avoids dropping a SYN packet re-opening an existing TCP connection. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * openvswitch: Find existing conntrack entry after upcall.Jarno Rajahalme2016-03-141-13/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new function ovs_ct_find_existing() to find an existing conntrack entry for which this packet was already applied to. This is only to be called when there is evidence that the packet was already tracked and committed, but we lost the ct reference due to an userspace upcall. ovs_ct_find_existing() is called from skb_nfct_cached(), which can now hide the fact that the ct reference may have been lost due to an upcall. This allows ovs_ct_commit() to be simplified. This patch is needed by later "openvswitch: Interface with NAT" patch, as we need to be able to pass the packet through NAT using the original ct reference also after the reference is lost after an upcall. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * openvswitch: Update the CT state key only after nf_conntrack_in().Jarno Rajahalme2016-03-141-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | Only a successful nf_conntrack_in() call can effect a connection state change, so it suffices to update the key only after the nf_conntrack_in() returns. This change is needed for the later NAT patches. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * openvswitch: Add commentary to conntrack.cJarno Rajahalme2016-03-141-1/+20
| | | | | | | | | | | | | | | | | | This makes the code easier to understand and the following patches more focused. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: Allow calling into nat helper without skb_dst.Jarno Rajahalme2016-03-142-44/+16
| | | | | | | | | | | | | | | | | | | | | | NAT checksum recalculation code assumes existence of skb_dst, which becomes a problem for a later patch in the series ("openvswitch: Interface with NAT."). Simplify this by removing the check on skb_dst, as the checksum will be dealt with later in the stack. Suggested-by: Pravin Shelar <pshelar@nicira.com> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: Remove IP_CT_NEW_REPLY definition.Jarno Rajahalme2016-03-142-5/+9
| | | | | | | | | | | | | | | | | | Remove the definition of IP_CT_NEW_REPLY from the kernel as it does not make sense. This allows the definition of IP_CT_NUMBER to be simplified as well. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: x_tables: check for size overflowFlorian Westphal2016-03-121-0/+3
| | | | | | | | | | | | | | | | | | | | | | Ben Hawkes says: integer overflow in xt_alloc_table_info, which on 32-bit systems can lead to small structure allocation and a copy_from_user based heap corruption. Reported-by: Ben Hawkes <hawkes@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: nft_compat: check match/targetinfo attr sizeFlorian Westphal2016-03-111-0/+6
| | | | | | | | | | | | | | | | | | We copy according to ->target|matchsize, so check that the netlink attribute (which can include padding and might be larger) contains enough data. Reported-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * Merge tag 'ipvs-fixes-for-v4.5' of ↵Pablo Neira Ayuso2016-03-114-12/+52
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs Simon Horman says: ==================== please consider these IPVS fixes for v4.5 or if it is too late please consider them for v4.6. * Arnd Bergman has corrected an error whereby the SIP persistence engine may incorrectly access protocol fields * Julian Anastasov has corrected a problem reported by Jiri Bohac with the connection rescheduling mechanism added in 3.10 when new SYNs in connection to dead real server can be redirected to another real server. * Marco Angaroni resolved a problem in the SIP persistence engine whereby the Call-ID could not be found if it was at the beginning of a SIP message. ==================== Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * ipvs: correct initial offset of Call-ID header search in SIP persistence engineMarco Angaroni2016-03-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The IPVS SIP persistence engine is not able to parse the SIP header "Call-ID" when such header is inserted in the first positions of the SIP message. When IPVS is configured with "--pe sip" option, like for example: ipvsadm -A -u 1.2.3.4:5060 -s rr --pe sip -p 120 -o some particular messages (see below for details) do not create entries in the connection template table, which can be listed with: ipvsadm -Lcn --persistent-conn Problematic SIP messages are SIP responses having "Call-ID" header positioned just after message first line: SIP/2.0 200 OK [Call-ID header here] [rest of the headers] When "Call-ID" header is positioned down (after a few other headers) it is correctly recognized. This is due to the data offset used in get_callid function call inside ip_vs_pe_sip.c file: since dptr already points to the start of the SIP message, the value of dataoff should be initially 0. Otherwise the header is searched starting from some bytes after the first character of the SIP message. Fixes: 758ff0338722 ("IPVS: sip persistence engine") Signed-off-by: Marco Angaroni <marcoangaroni@gmail.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
| | * ipvs: allow rescheduling after RSTJulian Anastasov2016-03-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "RFC 5961, 4.2. Mitigation" describes a mechanism to request client to confirm with RST the restart of TCP connection before resending its SYN. As result, IPVS can see SYNs for existing connection in CLOSE state. Add check to allow rescheduling in this state. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
| | * ipvs: drop first packet to redirect conntrackJulian Anastasov2016-03-072-9/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jiri Bohac is reporting for a problem where the attempt to reschedule existing connection to another real server needs proper redirect for the conntrack used by the IPVS connection. For example, when IPVS connection is created to NAT-ed real server we alter the reply direction of conntrack. If we later decide to select different real server we can not alter again the conntrack. And if we expire the old connection, the new connection is left without conntrack. So, the only way to redirect both the IPVS connection and the Netfilter's conntrack is to drop the SYN packet that hits existing connection, to wait for the next jiffie to expire the old connection and its conntrack and to rely on client's retransmission to create new connection as usually. Jiri Bohac provided a fix that drops all SYNs on rescheduling, I extended his patch to do such drops only for connections that use conntrack. Here is the original report from Jiri Bohac: Since commit dc7b3eb900aa ("ipvs: Fix reuse connection if real server is dead"), new connections to dead servers are redistributed immediately to new servers. The old connection is expired using ip_vs_conn_expire_now() which sets the connection timer to expire immediately. However, before the timer callback, ip_vs_conn_expire(), is run to clean the connection's conntrack entry, the new redistributed connection may already be established and its conntrack removed instead. Fix this by dropping the first packet of the new connection instead, like we do when the destination server is not available. The timer will have deleted the old conntrack entry long before the first packet of the new connection is retransmitted. Fixes: dc7b3eb900aa ("ipvs: Fix reuse connection if real server is dead") Signed-off-by: Jiri Bohac <jbohac@suse.cz> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
| | * ipvs: handle ip_vs_fill_iph_skb_off failureArnd Bergmann2016-03-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ip_vs_fill_iph_skb_off() may not find an IP header, and gcc has determined that ip_vs_sip_fill_param() then incorrectly accesses the protocol fields: net/netfilter/ipvs/ip_vs_pe_sip.c: In function 'ip_vs_sip_fill_param': net/netfilter/ipvs/ip_vs_pe_sip.c:76:5: error: 'iph.protocol' may be used uninitialized in this function [-Werror=maybe-uninitialized] if (iph.protocol != IPPROTO_UDP) ^ net/netfilter/ipvs/ip_vs_pe_sip.c:81:10: error: 'iph.len' may be used uninitialized in this function [-Werror=maybe-uninitialized] dataoff = iph.len + sizeof(struct udphdr); ^ This adds a check for the ip_vs_fill_iph_skb_off() return code before looking at the ip header data returned from it. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: b0e010c527de ("ipvs: replace ip_vs_fill_ip4hdr with ip_vs_fill_iph_skb_off") Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
| | * netfilter: nfnetlink_acct: validate NFACCT_FILTER parametersPhil Turnbull2016-02-291-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | nfacct_filter_alloc doesn't validate the NFACCT_FILTER_MASK and NFACCT_FILTER_VALUE parameters which can trigger a NULL pointer dereference. CAP_NET_ADMIN is required to trigger the bug. Signed-off-by: Phil Turnbull <phil.turnbull@oracle.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | Merge branch 'master' of git://blackhole.kfki.hu/nfPablo Neira Ayuso2016-03-104-31/+32
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jozsef Kadlecsik says: ==================== Please apply the next patch against the nf tree: - Deniz Eren reported that parallel flush/dump of list:set type of sets can lead to kernel crash. The bug was due to non-RCU compatible flushing, listing in the set type, fixed by me. - Julia Lawall pointed out that IPSET_ATTR_ETHER netlink attribute length was not checked explicitly. The patch adds the missing checkings. ==================== Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * | netfilter: ipset: Check IPSET_ATTR_ETHER netlink attribute lengthJozsef Kadlecsik2016-03-082-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Julia Lawall pointed out that IPSET_ATTR_ETHER netlink attribute length was not checked explicitly, just for the maximum possible size. Malicious netlink clients could send shorter attribute and thus resulting a kernel read after the buffer. The patch adds the explicit length checkings. Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
| | * | netfilter: ipset: Fix set:list type crash when flush/dump set in parallelJozsef Kadlecsik2016-02-242-30/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Flushing/listing entries was not RCU safe, so parallel flush/dump could lead to kernel crash. Bug reported by Deniz Eren. Fixes netfilter bugzilla id #1050. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
* | | | net: diag: add a scheduling point in inet_diag_dump_icsk()Eric Dumazet2016-03-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On loaded TCP servers, looking at millions of sockets can hold cpu for many seconds, if the lookup condition is very narrow. (eg : ss dst 1.2.3.4 ) Better add a cond_resched() to allow other processes to access the cpu. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | smc91x: avoid self-comparison warningArnd Bergmann2016-03-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The smc91x driver defines a macro that compares its argument to itself, apparently to get a true result while using its argument to avoid a warning about unused local variables. Unfortunately, this triggers a warning with gcc-6, as the comparison is obviously useless: drivers/net/ethernet/smsc/smc91x.c: In function 'smc_hardware_send_pkt': drivers/net/ethernet/smsc/smc91x.c:563:14: error: self-comparison always evaluates to true [-Werror=tautological-compare] if (!smc_special_trylock(&lp->lock, flags)) { This replaces the macro with another one that behaves similarly, with a cast to (void) to ensure the argument is used, and using a literal 'true' as its value. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | Merge branch 'dsa-finers-bridging-control'David S. Miller2016-03-148-65/+56
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Vivien Didelot says: ==================== net: dsa: finer bridging control This patchset renames the bridging routines of the DSA layer, make the unbridging routine return void, and rework the DSA netdev notifier handler, similar to what the Mellanox Spectrum driver does. Changes RFC -> v1: - drop unused NETDEV_PRECHANGEUPPER case - add Andrew's Tested-by tag ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: dsa: refine netdev event notifierVivien Didelot2016-03-141-24/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rework the netdev event handler, similar to what the Mellanox Spectrum driver does, to easily welcome more events later (for example NETDEV_PRECHANGEUPPER) and use netdev helpers (such as netif_is_bridge_master). Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: dsa: make port_bridge_leave return voidVivien Didelot2016-03-145-30/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | netdev_upper_dev_unlink() which notifies NETDEV_CHANGEUPPER, returns void, as well as del_nbp(). So there's no advantage to catch an eventual error from the port_bridge_leave routine at the DSA level. Make this routine void for the DSA layer and its existing drivers. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: dsa: rename port_*_bridge routinesVivien Didelot2016-03-146-14/+14
|/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename DSA port_join_bridge and port_leave_bridge routines to respectively port_bridge_join and port_bridge_leave in order to respect an implicit Port::Bridge namespace. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | mISDN: Support DR6 indication in mISDNipac driverMaciej S. Szmigiero2016-03-142-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to figure 39 in PEB3086 data sheet, version 1.4 this indication replaces DR when layer 1 transition source state is F6. This fixes mISDN layer 1 getting stuck in F6 state in TE mode on Dialogic Diva 2.02 card (and possibly others) when NT deactivates it. Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Acked-by: Karsten Keil <keil@b1-systems.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | mISDN: Order IPAC register definesMaciej S. Szmigiero2016-03-141-20/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It looks like IPAC/ISAC chips register defines weren't in any particular order. Order them by their number to make it easier to spot holes. Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name> Acked-by: Karsten Keil <keil@b1-systems.de> Signed-off-by: David S. Miller <davem@davemloft.net>