summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* mdio: remove an unneed conditionDan Carpenter2016-01-121-4/+2
| | | | | | | | | | It used to be that mdio->irq was a pointer but after e7f4dc3536a4 ('mdio: Move allocation of interrupts into core') it's an array inside the mdio struct so it can never be NULL. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* mdio_bus: NULL dereference on allocation errorDan Carpenter2016-01-121-5/+6
| | | | | | | | | | | If bus = kzalloc() fails then we end up dereferencing bus when we do "bus->irq[i] = PHY_POLL;". The code is a little simpler if we reverse the NULL check and return directly on failure. Fixes: e7f4dc3536a4 ('mdio: Move allocation of interrupts into core') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2016-01-1230-73/+206
|\ | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/bonding/bond_main.c drivers/net/ethernet/mellanox/mlxsw/spectrum.h drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c The bond_main.c and mellanox switch conflicts were cases of overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
| * bonding: Prevent IPv6 link local address on enslaved devicesKarl Heiss2016-01-111-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 1f718f0f4f97 ("bonding: populate neighbour's private on enslave") undoes the fix provided by commit c2edacf80e15 ("bonding / ipv6: no addrconf for slaves separately from master") by effectively setting the slave flag after the slave has been opened. If the slave comes up quickly enough, it will go through the IPv6 addrconf before the slave flag has been set and will get a link local IPv6 address. In order to ensure that addrconf knows to ignore the slave devices on state change, set IFF_SLAVE before dev_open() during bonding enslavement. Fixes: 1f718f0f4f97 ("bonding: populate neighbour's private on enslave") Signed-off-by: Karl Heiss <kheiss@gmail.com> Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Reviewed-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * udp: disallow UFO for sockets with SO_NO_CHECK optionMichal Kubeček2016-01-112-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit acf8dd0a9d0b ("udp: only allow UFO for packets from SOCK_DGRAM sockets") disallows UFO for packets sent from raw sockets. We need to do the same also for SOCK_DGRAM sockets with SO_NO_CHECK options, even if for a bit different reason: while such socket would override the CHECKSUM_PARTIAL set by ip_ufo_append_data(), gso_size is still set and bad offloading flags warning is triggered in __skb_gso_segment(). In the IPv6 case, SO_NO_CHECK option is ignored but we need to disallow UFO for packets sent by sockets with UDP_NO_CHECK6_TX option. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Tested-by: Shannon Nelson <shannon.nelson@intel.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: pktgen: fix null ptr deref in skb allocationJohn Fastabend2016-01-111-1/+3
| | | | | | | | | | | | | | | | | | Fix possible null pointer dereference that may occur when calling skb_reserve() on a null skb. Fixes: 879c7220e82 ("net: pktgen: Observe needed_headroom of the device") Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'renesas-eth-fixes'David S. Miller2016-01-112-10/+4
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sergei Shtylyov says: ==================== Fix some dubious code in the Renesas Ethernet drivers Here's a set of 2 patches against DaveM's 'net.git' repo. While initializing EMAC the code tries to respect the duplex mode both programmed into ECMR and stored in its own private data -- this just can't be right. [1/2] ravb: stop reading ECMR in ravb_emac_init() [2/2] sh_eth: stop reading ECMR in sh_eth_dev_init() ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * sh_eth: stop reading ECMR in sh_eth_dev_init()Sergei Shtylyov2016-01-111-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code in sh_eth_dev_init() twiddling the ECMR bits always looked a bit strange to me: if one intends to respect 'mdp->duplex', why save old value of the ECMR.DM bit? As all the other bits are zeroed anyway, we don't really need to read ECMR before writing to it. Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * ravb: stop reading ECMR in ravb_emac_init()Sergei Shtylyov2016-01-111-5/+2
| |/ | | | | | | | | | | | | | | | | | | The code in ravb_emac_init() twiddling the ECMR bits always looked a bit strange to me: if one intends to respect 'priv->duplex', why save old value of the ECMR.DM bit? As all the other bits are zeroed anyway, we don't really need to read ECMR before writing to it. Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * sched,cls_flower: set key address type when presentJamal Hadi Salim2016-01-111-2/+8
| | | | | | | | | | | | | | | | | | only when user space passes the addresses should we consider their presence Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp_yeah: don't set ssthresh below 2Neal Cardwell2016-01-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For tcp_yeah, use an ssthresh floor of 2, the same floor used by Reno and CUBIC, per RFC 5681 (equation 4). tcp_yeah_ssthresh() was sometimes returning a 0 or negative ssthresh value if the intended reduction is as big or bigger than the current cwnd. Congestion control modules should never return a zero or negative ssthresh. A zero ssthresh generally results in a zero cwnd, causing the connection to stall. A negative ssthresh value will be interpreted as a u32 and will set a target cwnd for PRR near 4 billion. Oleksandr Natalenko reported that a system using tcp_yeah with ECN could see a warning about a prior_cwnd of 0 in tcp_cwnd_reduction(). Testing verified that this was due to tcp_yeah_ssthresh() misbehaving in this way. Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * sctp: fix use-after-free in pr_debug statementMarcelo Ricardo Leitner2016-01-112-18/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dmitry Vyukov reported a use-after-free in the code expanded by the macro debug_post_sfx, which is caused by the use of the asoc pointer after it was freed within sctp_side_effect() scope. This patch fixes it by allowing sctp_side_effect to clear that asoc pointer when the TCB is freed. As Vlad explained, we also have to cover the SCTP_DISPOSITION_ABORT case because it will trigger DELETE_TCB too on that same loop. Also, there were places issuing SCTP_CMD_INIT_FAILED and ASSOC_FAILED but returning SCTP_DISPOSITION_CONSUME, which would fool the scheme above. Fix it by returning SCTP_DISPOSITION_ABORT instead. The macro is already prepared to handle such NULL pointer. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum: Add FDB lock to prevent session interleavingIdo Schimmel2016-01-112-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dumping the FDB (invoked with a process context) or handling FDB notifications (polled periodicly in delayed work) might each entail multiple EMAD transcations due to the number of entries. While we only allow one EMAD transaction at a time, there is nothing stopping the dump and notification processing sessions from interleaving. However, this is forbidden by the hardware, so we need to make sure only one of these sessions can run at a time. Solve this by adding a mutex ('fdb_lock'), as both kernel threads can sleep while waiting for the response EMAD. Fixes: 56ade8fe3f ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * unix: properly account for FDs passed over unix socketswilly tarreau2016-01-113-9/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is possible for a process to allocate and accumulate far more FDs than the process' limit by sending them over a unix socket then closing them to keep the process' fd count low. This change addresses this problem by keeping track of the number of FDs in flight per user and preventing non-privileged processes from having more FDs in flight than their configured FD limit. Reported-by: socketpair@gmail.com Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Mitigates: CVE-2013-4312 (Linux 2.0+) Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ipv6: tcp: add rcu locking in tcp_v6_send_synack()Eric Dumazet2016-01-111-0/+2
| | | | | | | | | | | | | | | | | | | | | | When first SYNACK is sent, we already hold rcu_read_lock(), but this is not true if a SYNACK is retransmitted, as a timer (soft) interrupt does not hold rcu_read_lock() Fixes: 45f6fad84cc30 ("ipv6: add complete rcu protection around np->opt") Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: sctp: prevent writes to cookie_hmac_alg from accessing invalid memorySasha Levin2016-01-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | proc_dostring() needs an initialized destination string, while the one provided in proc_sctp_do_hmac_alg() contains stack garbage. Thus, writing to cookie_hmac_alg would strlen() that garbage and end up accessing invalid memory. Fixes: 3c68198e7 ("sctp: Make hmac algorithm selection for cookie generation dynamic") Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: qmi_wwan: Add SIMCom 7230EKristian Evensen2016-01-101-0/+1
| | | | | | | | | | | | | | | | | | | | SIMCom 7230E is a QMI LTE module with support for most "normal" bands. Manual testing has showed that only interface five works. Cc: Bjørn Mork <bjorn@mork.no> Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
| * udp: restrict offloads to one namespaceHannes Frederic Sowa2016-01-105-7/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | udp tunnel offloads tend to aggregate datagrams based on inner headers. gro engine gets notified by tunnel implementations about possible offloads. The match is solely based on the port number. Imagine a tunnel bound to port 53, the offloading will look into all DNS packets and tries to aggregate them based on the inner data found within. This could lead to data corruption and malformed DNS packets. While this patch minimizes the problem and helps an administrator to find the issue by querying ip tunnel/fou, a better way would be to match on the specific destination ip address so if a user space socket is bound to the same address it will conflict. Cc: Tom Herbert <tom@herbertland.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * vxlan: fix test which detect duplicate vxlan ifaceNicolas Dichtel2016-01-101-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a vxlan interface is created, the driver checks that there is not another vxlan interface with the same properties. To do this, it checks the existing vxlan udp socket. Since commit 1c51a9159dde, the creation of the vxlan socket is done only when the interface is set up, thus it breaks that test. Example: $ ip l a vxlan10 type vxlan id 10 group 239.0.0.10 dev eth0 dstport 0 $ ip l a vxlan11 type vxlan id 10 group 239.0.0.10 dev eth0 dstport 0 $ ip -br l | grep vxlan vxlan10 DOWN f2:55:1c:6a:fb:00 <BROADCAST,MULTICAST> vxlan11 DOWN 7a:cb:b9:38:59:0d <BROADCAST,MULTICAST> Instead of checking sockets, let's loop over the vxlan iface list. Fixes: 1c51a9159dde ("vxlan: fix race caused by dropping rtnl_unlock") Reported-by: Thomas Faivre <thomas.faivre@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * cdc-acm: fix NULL pointer referenceOliver Neukum2016-01-101-1/+7
| | | | | | | | | | | | | | | | | | | | | | The union descriptor must be checked. Its usage was conditional before the parser was introduced. This is important, because many RNDIS device, which also use the common parser, have bogus extra descriptors. Signed-off-by: Oliver Neukum <oneukum@suse.com> Tested-by: Vasily Galkin <galkin-vv@yandex.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
| * r8152: fix the wake eventhayeswang2016-01-091-1/+39
| | | | | | | | | | | | | | | | | | | | When the autosuspend is enabled and occurs before system suspend, we should wake the device before running system syspend. Then, we could change the wake event for system suspend. Otherwise, the device would resume the system when receiving any packet. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * phy: micrel: Fix finding PHY properties in MAC node for KSZ9031.Roosen Henri2016-01-091-2/+10
| | | | | | | | | | | | | | | | | | | | | | Commit 651df2183543 ("phy: micrel: Fix finding PHY properties in MAC node.") only fixes finding PHY properties in MAC node for KSZ9021. This commit applies the same fix for KSZ9031. Fixes: 8b63ec1837fa ("phylib: Make PHYs children of their MDIO bus, not the bus' parent.") Acked-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Henri Roosen <henri.roosen@ginzinger.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller2016-01-091-2/+4
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Antonio Quartulli says: ==================== Included change: - Fix invalid read while copying bat_iv.bcast_own by Sven Eckelmann ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * batman-adv: Fix invalid read while copying bat_iv.bcast_ownSven Eckelmann2016-01-071-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | batadv_iv_ogm_orig_del_if removes a part of the bcast_own which previously belonged to the now removed interface. This is done by copying all data which comes before the removed interface and then appending all the data which comes after the removed interface. The address calculation for the position of the data which comes after the removed interface assumed that the bat_iv.bcast_own is a pointer to a single byte datatype. But it is a pointer to unsigned long and thus the calculated position was wrong off factor sizeof(unsigned long). Fixes: 83a8342678a0 ("more basic routing code added (forwarding packets / bitarray added)") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <a@unstable.cc>
| * | irda: toim3232-sir: delete some dead codeDan Carpenter2016-01-081-10/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | I noticed this code because it has a typo in the module_param(), the "trs" at the end of "toim3232fliptrs" should be "rts". But it's dead code so instead of fixing it, I just deleted it. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | isdn: act200: fix MODULE_PARM_DESC() typoDan Carpenter2016-01-081-1/+1
| | | | | | | | | | | | | | | | | | | | | There is no "membase", it was "act_port" that was intended. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: lan78xx: Fix to write to OTP(One Time Programmable) per magic number.Woojung.Huh@microchip.com2016-01-071-1/+54
| |/ | | | | | | | | | | | | | | This patch fixes a bug writing to EEPROM in lan78xx_ethtool_set_eeprom() when asked to write to OTP. Signed-off-by: Woojung Huh <woojung.huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: Fix typo in netdev_intersect_featuresTom Herbert2016-01-121-2/+2
| | | | | | | | | | | | | | | | | | Obviously need to 'or in NETIF_F_IP_CSUM and NETIF_F_IPV6_CSUM. Fixes: c8cd0989bd151f ("net: Eliminate NETIF_F_GEN_CSUM and NETIF_F_V[46]_CSUM") Reported-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'mdio-build-failures'David S. Miller2016-01-122-2/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Andrew Lunn says: ==================== More mdio device build failure fixes These patches fix two build errors reported by Guenter Roeck ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: freescale: mac-fec: Fix build error from phy_device API changeAndrew Lunn2016-01-121-1/+1
| | | | | | | | | | | | | | | | | | | | | dev has moved inside the new mdio structure. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: freescale: ucc_geth: Fix build error from phy_device API changeAndrew Lunn2016-01-121-1/+1
|/ / | | | | | | | | | | | | dev has moved inside the new mdio structure. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'mlx5-enhanced-flow-steering'David S. Miller2016-01-119-76/+1233
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Or Gerlitz says: ==================== net/mlx5_core: Enhance flow steering support v0 --> v1 changes: - fixed improperly formatted comments. - compare value of ib_spec->eth.mask.ether_type in network byte order in ('IB/mlx5: Add flow steering utilities'). v1 --> v2 changes: - made sure that service functions added in the IB driver are only static-fied on the last commit, to make sure bisection with -Werror works fine. v2 --> v3 changes: - squashed patches 11 and 12 into one patch, s.t Dave's comment on unused static functions gcc complaints during bisection is correctly addressed. v3 has been generated against net-next commit c9c9931 "Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge" The series is signed by Matan who was revently assigned to a maintainer for the mlx5_core and IB drivers (this is a 4.5-rc1 change to the maintainers file coming from the rdma tree) -- as such I didn't see a neeed to add my signature (Or). This series adds three new functionalists to the driver flow-steering infrastructure: auto-grouped flow tables, chaining of flow tables and updates for the root flow table. 1. Auto-grouped flow tables - Flow table with auto grouping management. When a flow table is created, hints regarding the number of rule types and the number of rules are given in advance. Thus, a flow table is divided into #NUM_TYPES+1 groups each contains (#NUM_RULES)/(#NUM_TYPES+1) rules. The first #NUM_TYPES parts are groups which are filled if the added rule matches the group specification or the group is empty. The last part is filled by rules that can't fit any of the former groups. 2. Chaining flow tables - Flow tables from different priorities are chained together, if there is no match in flow table of priority i we continue searching for a match in priority i+1. This is both true if priorities i and i+1 belongs to the same namespace or not. 3. Updating the root flow table - the root flow table is the flow table with the lowest level. The hardware start searching for a match in the root flow table and continue according to the matches it find along the way. The first usage for the new functionality is flow steering for user-space ConnectX-4 offloaded HW Eth RX queues done through the mlx5 IB driver. When the mlx5 core driver is loaded, it opens three flow namespaces: 1. By-pass namespace (used by mlx5 IB driver). 2. Kernel namespace (used in order to get packets to the networking stack through mlx5 EN driver). 3. Leftovers namespace (used by mlx5 IB and future sniffer) The series is built as follows: Patch #1 introduces auto-grouped flow tables support. Patch #2 add utility functions for finding the next and the previous flow tables in different priorities. This is used in order to chain the flow tables in a downstream patch. Patch #3 introduces a firmware command for updating the root flow table. Patch #4 introduces modify flow table firmware command, this command is used when we want to change the next flow table of an existing flow table. This is used for chaining flow tables as well. Patch #5 connect/disconnect flow tables. This is actually the chaining process when we want to link flow tables. This means that if we couldn't find a match in the first flow table, we'll continue in the chained flow table. Patch #6 updates priority's attributes that is required for flow table level allocation. We update both the max_fts (the number of allowed FTs in the sub-tree of this priority) and the start_level (which is the first level we'll assign to the flow-tables created inside the priority). Patch #7 adds checking of required device capabilities. Some namespaces could be only created if the hardware supports certain attributes. This is especially true for the Bypass and leftovers namespaces. This adds a generic mechanism to check these required attributes. Patch #8 creates two additional namespaces: a. Bypass flow rules(has nine priorities) b. Leftovers packets(have one priority) - for unmatched packets. Patch #9 re-factors ipv4/ipv6 match fields in the mlx5 firmware interface header to be more clear. Patch #10 exports the flow steering API for mlx5_ib usage Patch #11 implements the required support in mlx5_ib in order to support the RDMA flow steering verbs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | IB/mlx5: Add flow steering supportMaor Gottlieb2016-01-113-1/+517
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding flow steering support by creating a flow-table per priority (if rules exist in the priority). mlx5_ib uses autogrouping and thus only creates the required destinations. Also includes adding of these flow steering utilities 1. Parsing verbs flow attributes hardware steering specs. 2. Check if flow is multicast - this is required in order to decide to which flow table will we add the steering rule. 3. Set outer headers in flow match criteria to zeros. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx5_core: Export flow steering APIMaor Gottlieb2016-01-111-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add exports to flow steering API for mlx5_ib usage. The following functions are exported: 1. mlx5_create_auto_grouped_flow_table - used to create flow table with auto flow grouping management (create and destroy flow groups). In auto-grouped flow tables, we create groups automatically if needed (if we don't find an existing flow group with same match criteria when we add new rule). 2. mlx5_destroy_flow_table - used to destroy a flow table. 3. mlx5_add_flow_rule - used to add flow rule into a flow table. 4. mlx5_del_flow_rule - used to delete flow rule from its flow table. 5. mlx5_get_flow_namespace - used to get a handle to the required namespace sub-tree. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx5_core: Make ipv4/ipv6 location more clearMaor Gottlieb2016-01-111-2/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change the mlx5 firmware interface header to make it more clear which bytes should be used by IPv4 or IPv6 addresses. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx5_core: Enable flow steering support for the IB driverMaor Gottlieb2016-01-113-8/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the driver is loaded, we create flow steering namespace for kernel bypass with nine priorities and another namespace for leftovers(in order to catch packets that weren't matched). Verbs applications will use these priorities. we found nine as a number that balances the requirements from the user and retains performance. The bypass namespace is used by verbs applications that want to bypass the kernel networking stack. The leftovers namespace is used by verbs applications and the sniffer in order to catch packets that weren't handled by any preceding rules. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx5_core: Initialize namespaces only when supported by deviceMaor Gottlieb2016-01-111-21/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before we create the sub tree of a steering namespaces(kernel, bypass, leftovers) we check that the device has the required capabilities in order to create this subtree. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx5_core: Set priority attributesMaor Gottlieb2016-01-112-19/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Each priority has two attributes: 1. max_ft - maximum allowed flow tables under this priority. 2. start_level - start level range of the flow tables in the priority. These attributes are set by traversing the tree nodes by DFS and set start level and max flow tables to each priority. Start level depends on the max flow tables of the prior priorities in the tree. The leaves of the trees have max_ft set in them. Each node accumulates the max_ft of its children and set it accordingly. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx5_core: Connect flow tablesMaor Gottlieb2016-01-113-10/+104
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Flow tables from different priorities should be chained together. When a packet arrives we search for a match in the by-pass flow tables (first we search for a match in priority 0 and if we don't find a match we move to the next priority). If we can't find a match in any of the bypass flow-tables, we continue searching in the flow-tables of the next priority, which are the kernel's flow tables. Setting the miss flow table in a new flow table to be the next one in the list is performed via create flow table API. If we want to change an existing flow table, for example in order to point from an existing flow table to the new next-in-list flow table, we use the modify flow table API. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx5_core: Introduce modify flow table commandMaor Gottlieb2016-01-113-4/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce the modify flow table command. This command is used when we want to change the next flow table of an existing flow table. The next flow table is defined as the table we search (in order to find a match), if we couldn't find a match in any of the flow table entries in the current flow table. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx5_core: Managing root flow tableMaor Gottlieb2016-01-115-10/+144
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The root Flow Table for each Flow Table Type is defined, by default, as the Flow Table with level 0. In order not to use an empty flow tables and introduce new hops, but still preserve space for flow-tables that have a priority greater(lower number) than the current flow table, we introduce this new set root flow table command. This command tells the HW to start matching packets from the assigned root flow table. This command is used when we create new flow table with level lower than the current lowest flow table or it is the first flow table. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx5_core: Add utilities to find next and prev flow-tablesMaor Gottlieb2016-01-111-0/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add two utility functions for find next and prev flow table. Find next flow table function gets priority and return the first flow table of the next priority in the tree. Find prev flow table return the last flow table of the previous priority in the tree. These utility functions are used for chaining flow table from different priorities. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx5_core: Introduce flow steering autogrouped flow tableMaor Gottlieb2016-01-113-19/+158
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When user add rule to autogrouped flow table, we search for flow group with the same match criteria, if we don't find such group then we create new flow group with the required match criteria and insert the rule to this group. We divide the flow table into required_groups + 1, in order to reserve a part of the flow table for rules which don't match any existing group. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'bpf-next'David S. Miller2016-01-112-22/+110
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== BPF update This set adds IPv6 support for bpf_skb_{set,get}_tunnel_key() helper. It also exports flags to user space that are being used in helpers and weren't exported thus far. For more details, please see the individual patches. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bpf: support ipv6 for bpf_skb_{set,get}_tunnel_keyDaniel Borkmann2016-01-112-8/+71
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After IPv6 support has recently been added to metadata dst and related encaps, add support for populating/reading it from an eBPF program. Commit d3aa45ce6b ("bpf: add helpers to access tunnel metadata") started with initial IPv4-only support back then (due to IPv6 metadata support not being available yet). To stay compatible with older programs, we need to test for the passed structure size. Also TOS and TTL support from the ip_tunnel_info key has been added. Tested with vxlan devs in collect meta data mode with IPv4, IPv6 and in compat mode over different network namespaces. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | bpf: export helper function flags and reject invalid onesDaniel Borkmann2016-01-112-14/+39
|/ / | | | | | | | | | | | | | | | | | | | | | | Export flags used by eBPF helper functions through UAPI, so they can be used by programs (instead of them redefining all flags each time or just using the hard-coded values). It also gives a better overview what flags are used where and we can further get rid of the extra macros defined in filter.c. Moreover, reject invalid flags. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: make mii_status sysfs node consistentJarod Wilson2016-01-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The spew in /proc/net/bonding/bond0 uses netif_carrier_ok() to determine mii_status, while /sys/class/net/bond0/bonding/mii_status looks at curr_active_slave, which doesn't actually seem to be set sometimes when the bond actually is up. A mode 4 bond configured via ifcfg-foo files on a Red Hat Enterprise Linux system, after boot, comes up clean and functional, but the sysfs node shows mii_status of down, while proc shows up. A simple enough fix here seems to be to use the same method for determining up or down in both places, and I'd opt for the one that seems to match reality. CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <gospo@cumulusnetworks.com> CC: netdev@vger.kernel.org Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net/rtnetlink: remove unused sz_idx variableAlexander Kuleshov2016-01-111-2/+1
| | | | | | | | | | | | | | | | The sz_idx variable is defined in the rtnetlink_rcv_msg(), but not used anywhere. Let's remove it. Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: bfin_mac: Use phy_find_first() instead of open-coding itGuenter Roeck2016-01-111-15/+2
| | | | | | | | | | | | | | | | | | Use phy_find_first() to find the first phy device instead of open-coding it. Cc: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'ovs-cleanups'David S. Miller2016-01-112-24/+5
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jean Sacren says: ==================== Trivial fix-ups for openvswitch This series does trivial fix-ups for openvswitch as follows: 1) Clean up the leftover of the unused function. 2) Fix up the twisted struct geneve_port member name. 3) Update the kernel doc to reflect the changes in struct vport. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>