Commit message  Author  Age  Files  Lines
* net: phy: marvell: Consolidate setting the phy-modeAndrew Lunn2017-08-011-48/+40
| | | | | | | The same code is repeated a few times. Refactor into a helper. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: phy: marvell: consolidate RGMII delay codeAndrew Lunn2017-08-011-32/+22
| | | | | | | | The same code is repeated for different PHY versions. Put it into a helper and call it when needed. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: phy: marvell: Use core genphy_soft_reset()Andrew Lunn2017-08-011-35/+12
| | | | | | | | Rather than using an open coded equivalent, use the core genphy_soft_reset() function. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: phy: marvell: tabificationAndrew Lunn2017-08-011-15/+15
| | | | | | | | Convert spaces to tabs where appropriate, and fix up some otherwise odd indentation. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bcmgenet: Add dependency on HAS_IOMEM && OFFlorian Fainelli2017-08-011-0/+1
| | | | | | | | | | | | | | The driver needs CONFIG_HAS_IOMEM and OF to be functional, but we still let it build with COMPILE_TEST. This fixes the unmet dependency after selecting MDIO_BCM_UNIMAC in commit mentioned below: warning: (NET_DSA_BCM_SF2 && BCMGENET) selects MDIO_BCM_UNIMAC which has unmet direct dependencies (NETDEVICES && MDIO_DEVICE && HAS_IOMEM && OF_MDIO) Fixes: 9a4e79697009 ("net: bcmgenet: utilize generic Broadcom UniMAC MDIO controller driver") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: add related fields into SCM_TIMESTAMPING_OPT_STATSWei Wang2017-08-012-1/+27
| | | | | | | | | | | | | | | | Add the following stats into SCM_TIMESTAMPING_OPT_STATS control msg: TCP_NLA_PACING_RATE TCP_NLA_DELIVERY_RATE TCP_NLA_SND_CWND TCP_NLA_REORDERING TCP_NLA_MIN_RTT TCP_NLA_RECUR_RETRANS TCP_NLA_DELIVERY_RATE_APP_LMT Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: extract the function to compute delivery rateWei Wang2017-08-011-7/+16
| | | | | | | | | | Refactor the code to extract the function to compute delivery rate. This function will be used in later commit. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: phy: Log only PHY state transitionsMarc Gonzalez2017-08-011-3/+4
| | | | | | | | In the current code, old and new PHY states are always logged. From now on, log only PHY state transitions. Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com> Signed-off-by: David S. Miller <davem@davemloft.net>
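The change described above amounts to comparing the old and new state before emitting a log line. A minimal Python sketch of that idea follows; it is not the kernel code, and the function names and state strings are illustrative assumptions.

```python
def log_phy_state(old_state, new_state, log):
    """Append a message to `log` only when the PHY state actually changed."""
    if old_state != new_state:
        log.append(f"PHY state change {old_state} -> {new_state}")
    return new_state


def run_state_sequence(states):
    """Feed a sequence of states through the logger and return the messages."""
    log = []
    current = states[0]
    for nxt in states[1:]:
        current = log_phy_state(current, nxt, log)
    return log
```

Repeated visits to the same state produce no output, so a PHY that stays in one state for a long time no longer floods the log.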
* Merge branch 'mlxsw-Various-small-fixes'David S. Miller2017-07-313-17/+15
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | Jiri Pirko says: ==================== mlxsw: Various small fixes This patch series is to contribute several fixes for nits that I noticed while working on mlxsw. The changes range from typo fixes to local improvements of the code and have little in common besides being small in scope. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_router: Simplify a piece of codePetr Machata2017-07-311-2/+2
| | | | | | | | | | | | | | | | | | Express the same logic more succinctly. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_router: Clarify a piece of codePetr Machata2017-07-311-1/+1
| | | | | | | | | | | | | | | | | | | | Prefer logical operator that expresses the intent to bitwise one that happens to give the same result. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_router: Simplify a piece of codePetr Machata2017-07-311-3/+1
| | | | | | | | | | | | | | Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: reg.h: Namespace IP2ME registersPetr Machata2017-07-311-4/+4
| | | | | | | | | | | | | | | | | | | | This renames IP2ME-specific registers reg_ralue_v and reg_ralue_tunnel_ptr to reg_ralue_ip2me_*. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: Update specification of reg_ritr_typePetr Machata2017-07-311-4/+4
| | | | | | | | | | | | | | | | | | The comments really belong to the individual enumerators. The comment at the register should instead reference the enum. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_router: Fix a typoPetr Machata2017-07-311-1/+1
| | | | | | | | | | | | | | Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: reg.h: Fix a typoPetr Machata2017-07-311-1/+1
| | | | | | | | | | | | | | Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_acl: Fix a typoPetr Machata2017-07-311-1/+1
|/ | | | | | | Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'bcmgenet-utilize-MDIO-unimac-driver'David S. Miller2017-07-316-180/+163
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Florian Fainelli says: ==================== net: bcmgenet: utilize MDIO unimac driver This patch series migrates the Broadcom GENET driver to use the mdio-bcm-unimac driver. This MDIO HW is the same as the one GENET internally embeds, yet for historical reasons the two drivers lived their own lives. Because of the GENET interrupt situation, we let it specify how it wants to signal MDIO operations completion using its driver-private waitqueue. The diffstat is not super impressive, but it's still negative! This would make it easier in the future to absorb possible workarounds/bugs/features within the same location. This was tested on BCM7260 (GENETv5, single instance), BCM7439 (GENETv4, triple instance) and BCM7445 (bcm_sf2 + mdio-bcm-unimac). We also now have a nice /proc/iomem output: f0b00000-f0b0fc4b : /rdb/ethernet@f0b00000 f0b00e14-f0b00e1c : unimac-mdio.0 f0b20000-f0b2fc4b : /rdb/ethernet@f0b20000 f0b20e14-f0b20e1c : unimac-mdio.1 f0b40000-f0b4fc4b : /rdb/ethernet@f0b40000 f0b40e14-f0b40e1c : unimac-mdio.2 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: bcmgenet: Utilize bcmgenet_mii_exit() for error pathFlorian Fainelli2017-07-311-6/+1
| | | | | | | | | | | | | | | | | | bcmgenet_mii_init() has an error path which is strictly identical to the unwinding that bcmgenet_mii_exit() does, so have bcmgenet_mii_init() utilize bcmgenet_mii_exit() for that. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: bcmgenet: Drop legacy MDIO codeFlorian Fainelli2017-07-311-125/+0
| | | | | | | | | | | | | | | | Now that we have fully migrated to the mdio-bcm-unimac driver, drop the legacy MDIO bus code which did duplicate a fair amount of code. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: bcmgenet: utilize generic Broadcom UniMAC MDIO controller driverFlorian Fainelli2017-07-313-34/+116
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Update the GENET driver to register an UniMAC MDIO bus controller for the GENET internal MDIO bus, update the platform data code to attach the PHY to the correct MDIO bus controller. The Device Tree portion of the code is mostly left unmodified since the lookup/binding is done via phandles and Device Tree nodes which are much more flexible in locating and binding PHYs to their respective MDIO bus controllers. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: phy: mdio-bcm-unimac: Allow specifying platform dataFlorian Fainelli2017-07-313-6/+36
| | | | | | | | | | | | | | In preparation for having the bcmgenet driver migrate over to the mdio-bcm-unimac driver, add a platform data structure which allows passing integration-specific details such as the bus name, the wait function used to complete MDIO operations, and the PHY mask. We also define what the platform device name contract is by defining UNIMAC_MDIO_DRV_NAME and moving it to the platform_data header. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: phy: mdio-bcm-unimac: Add debug print for PHY workaroundFlorian Fainelli2017-07-311-1/+3
| | | | | | | | | | In order to be strictly identical to what bcmgenet does, add a debug print when a PHY workaround during bus->reset() is executed. Preliminary change to moving bcmgenet towards mdio-bcm-unimac. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: phy: mdio-bcm-unimac: create unique bus namesFlorian Fainelli2017-07-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In preparation for having multiple GENET instances in a system (up to 3), make sure that we do include the bus instance number in the name of the MDIO bus such that we change it from "unimac-mdio" to "unimac-mdio-0" for instance. So far, the only user of this driver is using Device Tree, which uses a lookup/parenting based technique to map PHY devices to their respective MDIO bus controllers, hence causing no additional changes. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: phy: mdio-bcm-unimac: factor busy polling loopFlorian Fainelli2017-07-311-22/+21
|/ | | | | | | | Factor out the code that does the busy polling on the MDIO_BUSY bit since we will have different code paths for completion depending on whether we are using interrupts or polling. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
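The idea behind this series, a polling helper that can be swapped for a caller-supplied wait function, can be sketched in Python as below. This is a model of the control flow only, not the driver code: the bit position of `MDIO_BUSY`, the attempt count, and every function name are assumptions made for illustration.

```python
import time

MDIO_BUSY = 1 << 29  # assumed bit position, for illustration only


def poll_until_idle(read_cmd, busy_mask=MDIO_BUSY, attempts=1000, delay_s=0.0):
    """Return True once the busy bit clears, False if the attempts run out."""
    for _ in range(attempts):
        if not (read_cmd() & busy_mask):
            return True
        if delay_s:
            time.sleep(delay_s)
    return False


def wait_for_completion(read_cmd, wait_func=None):
    """Use a caller-supplied wait function if given, else fall back to polling."""
    if wait_func is not None:
        return wait_func()
    return poll_until_idle(read_cmd)
```

Keeping the polling loop in one helper means that read and write paths share the same timeout handling, and an interrupt-driven user only has to pass a different `wait_func`.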
* Merge branch 'tcp-remove-prequeue-and-header-prediction'David S. Miller2017-07-3114-566/+43
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Florian Westphal says: ==================== tcp: remove prequeue and header prediction During a hallway discussion with Eric Dumazet at Netdev 1.2 in Tokyo some maybe-not-so-useful-anymore TCP stack features came up, among these header prediction and prequeueing. In brief, TCP prequeue assumes a single-process-blocking-read design, which is not that common anymore. The most frequently used high-performance networking program that is an excellent fit for these features is netperf. The idea behind prequeueing is to move part of tcp processing, including retransmit queue cleaning, to process context. With (e)poll designs, prequeue is always skipped, so for such programs this is dead-code removal. Header prediction is also less useful nowadays. For packet trains, GRO will do packet aggregation so we no longer get the per-packet benefit that this had before GRO. Because of SACK, header prediction also will be ineffective once a connection suffers even light packet losses. Code removal aside, after this change processing always occurs in BH context, which makes it possible to experiment e.g. with doing bulk freeing of skb heads when incoming ACKs clean packets from the retransmit queue. There are no changes since the RFC, except in the last patch (I missed another no-longer-used mib counter). I also edited a few commit messages. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp: remove unused mib countersFlorian Westphal2017-07-312-18/+0
| | | | | | | | These were used by tcp prequeue and header prediction. TCPFORWARDRETRANS use was removed in January. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp: remove CA_ACK_SLOWPATHFlorian Westphal2017-07-313-49/+22
| | | | | | | | | | | | | | re-indent tcp_ack, and remove CA_ACK_SLOWPATH; it is always set now. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp: remove header predictionFlorian Westphal2017-07-316-219/+10
| | | | | | | | | | | | | | | | | | | | | | Like prequeue, I am not sure this is overly useful nowadays. If we receive a train of packets, GRO will aggregate them if the headers are the same (HP predates GRO by several years) so we don't get a per-packet benefit, only a per-aggregated-packet one. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp: remove low_latency sysctlFlorian Westphal2017-07-314-9/+4
| | | | | | | | | | | | | | Was only checked by the removed prequeue code. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp: reindent two spots after prequeue removalFlorian Westphal2017-07-311-27/+23
| | | | | | | | | | | | | | | | These two branches are now always true, remove the conditional. objdiff shows no changes. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tcp: remove prequeue supportFlorian Westphal2017-07-318-262/+2
|/ | | | | | | | | | | | | | | | | | | | | prequeue is a tcp receive optimization that moves part of rx processing from bh to process context. This only works if the socket being processed belongs to a process that is blocked in recv on that socket. In practice, this doesn't happen that often anymore because nowadays servers tend to use an event driven (epoll) model. Even normal client applications (web browsers) commonly use many tcp connections in parallel. This has measurable impact only in netperf (which uses plain recv and thus allows prequeue use) from host to locally running vm (~4%), however, there were no changes when using netperf between two physical hosts with ixgbe interfaces. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'net-sched-actions-improve-dump-performance'David S. Miller2017-07-315-13/+144
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jamal Hadi Salim says: ==================== net sched actions: improve dump performance Changes since v11: ------------------ 1) Jiri - renames: nla_value to value and nla_selector to selector 2) Jiri - rename: validate_nla_bitfield_32 to validate_nla_bitfield_32 3) Jiri - rename: NLA_BITFIELD_32 to NLA_BITFIELD32 4) Jiri - remove unnecessary break when we return in case statement 5) Jiri - rename and move nla_get_bitfield_32 to an earlier patch 6) Jiri - xmas tree alignment of var declaration 7) Jiri - rename all declarations of bitfield 32 vars to be consistent ("bf") 8) Jiri - improve validate_nla_bitfield32() validation to disallow valid bit values that are not selected by the selector Changes since v10: ----------------- 1) Jiri: move type->validate_content() to its own patch Jamal: decided to remove it altogether so we can get this patch set in. 2) Change name of NLA_FLAG_BITS to NLA_BITFIELD_32 based on discussions with D. Ahern and Jiri. D. Ahern suggests to make this a variable bitmap size. My analysis at this point is it too complex and i only need a few bit flags. If we run out of bits someone else can create a new NLA_BITFIELD_XXX and start using that. So please let this go. 
3) Jamal - Add Suggested-by: Jiri for type NLA_BITFIELD_32 4) Jiri: Change name allowed_flags to tcaa_root_flags_allowed 5) Jiri: Introduce nla_get_flag_bits_values() helper instead of using memcpy for retrieving nla_bitfield_32 fields. Changes since v9: ----------------- 1) General consensus: - remove again the use of BIT() to maintain uapi consistency ;-> 1) Jiri: - Add a new netlink type NLA_FLAG_BITS to check for valid bits and use it instead of inline vetting (patch 4/4 now) Changes since v8: ----------------- 1) Jiri: - Add back the use of BIT(). Eventually fix iproute2 instead - Rename VALID_TCA_FLAGS to VALID_TCA_ROOT_FLAGS Changes since v7: ----------------- Jamal: No changes. Patch 1 went out twice. Resend without two copies of patch 1 changes since v6: ----------------- 1) DaveM: New rules for netlink messages. From now on we are going to start checking for bits that are not used and rejecting anything we dont understand. In the future this is going to require major changes to user space code (tc etc). This is just a start. To quote, David: " Again, bits you aren't using now, make sure userspace doesn't set them. And if it does, reject. " Added checks for ensuring things work as above. 2) Jiri: a)Fix the commit message to properly use "Fixes" description b)Align assignments for nla_policy Changes since v5: ---------------- 0) Remove use of BIT() because it is kernel specific. Requires a separate patch (Jiri can submit that in his cleanups) 1)To paraphrase Eric D. "memcpy(nla_data(count_attr), &cb->args[1], sizeof(u32)); wont work on 64bit BE machines because cb->args[1] (which is 64 bit is larger in size than sizeof(u32))" Fixed 2) Jiri Pirko i) Spotted a bug fix mixed in the patch for wrong TLV fix. Add patch 1/3 to address this. Make part of this series because of dependencies. 
ii) Rename ACT_LARGE_DUMP_ON -> TCA_FLAG_LARGE_DUMP_ON iii) Satisfy Jiri's obsession against the noun "tcaa" a)Rename struct nlattr *tcaa --> struct nlattr *tb b)Rename TCAA_ACT_XXX -> TCA_ROOT_XXX Changes since v4: ----------------- 1) Eric D. pointed out that when all skb space is used up by the dump there will be no space to insert the TCAA_ACT_COUNT attribute. 2) Jiri: i) Change: enum { TCAA_UNSPEC, TCAA_ACT_TAB, TCAA_ACT_FLAGS, TCAA_ACT_COUNT, TCAA_ACT_TIME_FILTER, __TCAA_MAX }; to: enum { TCAA_UNSPEC, TCAA_ACT_TAB, TCAA_ACT_FLAGS, TCAA_ACT_COUNT, __TCAA_MAX, }; Jiri plans to followup with the rest of the code to make the style consistent. ii) Rename attribute TCAA_ACT_TIME_FILTER --> TCAA_ACT_TIME_DELTA iii) Rename variable jiffy_filter --> jiffy_since iv) Rename msecs_filter --> msecs_since v) get rid of unused cb->args[0] and rename cb->args[4] to cb->args[0] Earlier Changes ---------------- - Jiri mostly on names of things. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net sched actions: add time filter for action dumpingJamal Hadi Salim2017-07-312-1/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for filtering based on time since last used. When we are dumping a large number of actions it is useful to have the option of filtering based on when the action was last used to reduce the amount of data crossing to user space. With this patch the user space app sets the TCA_ROOT_TIME_DELTA attribute with the value in milliseconds with "time of interest since now". The kernel converts this to jiffies and does the filtering comparison matching entries that have seen activity since then and returns them to user space. Old kernels and old tc continue to work in legacy mode since they dont specify this attribute. Some example (we have 400 actions bound to 400 filters); at installation time. Using updated when tc setting the time of interest to 120 seconds earlier (we see 400 actions): prompt$ hackedtc actions ls action gact since 120000| grep index | wc -l 400 go get some coffee and wait for > 120 seconds and try again: prompt$ hackedtc actions ls action gact since 120000 | grep index | wc -l 0 Lets see a filter bound to one of these actions: .... filter pref 10 u32 filter pref 10 u32 fh 800: ht divisor 1 filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10 (rule hit 2 success 1) match 7f000002/ffffffff at 12 (success 1 ) action order 1: gact action pass random type none pass val 0 index 23 ref 2 bind 1 installed 1145 sec used 802 sec Action statistics: Sent 84 bytes 1 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 .... that coffee took long, no? It was good. 
Now lets ping -c 1 127.0.0.2, then run the actions again: prompt$ hackedtc actions ls action gact since 120 | grep index | wc -l 1 More details please: prompt$ hackedtc -s actions ls action gact since 120000 action order 0: gact action pass random type none pass val 0 index 23 ref 2 bind 1 installed 1270 sec used 30 sec Action statistics: Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 And the filter? filter pref 10 u32 filter pref 10 u32 fh 800: ht divisor 1 filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10 (rule hit 4 success 2) match 7f000002/ffffffff at 12 (success 2 ) action order 1: gact action pass random type none pass val 0 index 23 ref 2 bind 1 installed 1324 sec used 84 sec Action statistics: Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
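The filtering described in this commit message (convert the requested millisecond window to timer ticks, then keep only actions used within that window) can be modelled in a few lines of Python. This is a sketch under stated assumptions rather than the kernel implementation: the tick rate `hz`, the dictionary layout, and both function names are hypothetical.

```python
def msecs_to_ticks(msecs, hz=1000):
    """Convert milliseconds to timer ticks, rounding up (hz is an assumed tick rate)."""
    return (msecs * hz + 999) // 1000


def actions_used_since(actions, since_msecs, now_ticks, hz=1000):
    """Return the actions whose last use is no older than `since_msecs` before now."""
    cutoff = now_ticks - msecs_to_ticks(since_msecs, hz)
    return [a for a in actions if a["lastuse"] >= cutoff]
```

With a 120000 ms window, an action last used 802 seconds ago is filtered out, while the same action used 30 seconds ago is returned, matching the before and after counts in the example session above.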
| * net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batchJamal Hadi Salim2017-07-312-12/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When you dump hundreds of thousands of actions, getting only 32 per dump batch even when the socket buffer and memory allocations allow more is inefficient. With this change, the user will get as many as fit within the given constraints available to the kernel. The top level action TLV space is extended. An attribute TCA_ROOT_FLAGS is used to carry flags; flag TCA_FLAG_LARGE_DUMP_ON is set by the user indicating the user is capable of processing these large dumps. Older user space which doesn't set this flag doesn't get the larger (than 32) batches. The kernel uses the TCA_ROOT_COUNT attribute to tell the user how many actions are put in a single batch. As such, the user space app knows how long to iterate (independent of the type of action being dumped) instead of relying on a hardcoded maximum of 32, thus maintaining backward compat. Some results dumping 1.5M actions below: first an unpatched tc which doesn't understand these features... prompt$ time -p tc actions ls action gact | grep index | wc -l 1500000 real 1388.43 user 2.07 sys 1386.79 Now let's see a patched tc which sets the correct flags when requesting a dump: prompt$ time -p updatedtc actions ls action gact | grep index | wc -l 1500000 real 178.13 user 2.02 sys 176.96 That is about 8x performance improvement for tc app which sets its receive buffer to about 32K. Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
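The numbers quoted in this commit message can be checked with a little arithmetic. The Python sketch below uses only the figures given above (1.5M actions, 32 actions per legacy batch, and the two wall-clock times); the function names are illustrative.

```python
def batches_needed(total_actions, per_batch):
    """Number of dump batches required, rounding up."""
    return (total_actions + per_batch - 1) // per_batch


def speedup(before_s, after_s):
    """Ratio of the old wall-clock time to the new one."""
    return before_s / after_s


legacy_batches = batches_needed(1_500_000, 32)
measured_speedup = speedup(1388.43, 178.13)
```

At 32 actions per batch the legacy dump needs 46875 batches, and the measured ratio works out to roughly 7.8, which the message rounds to "about 8x".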
| * net sched actions: Use proper root attribute table for actionsJamal Hadi Salim2017-07-311-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | Bug fix for an issue which has been around for about a decade. We got away with it because the enumeration was larger than needed. Fixes: 7ba699c604ab ("[NET_SCHED]: Convert actions from rtnetlink to new netlink API") Suggested-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net netlink: Add new type NLA_BITFIELD32Jamal Hadi Salim2017-07-313-0/+63
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Generic bitflags attribute content sent to the kernel by user. With this netlink attr type the user can either set or unset a flag in the kernel. The value is a bitmap that defines the bit values being set. The selector is a bitmask that defines which value bits are to be considered. A check enforces the rule that a kernel subsystem only accepts bitflags the kernel already knows about, i.e. if the user tries to set a bit flag that is not understood then _it will be rejected_. In the most basic form, the user specifies the attribute policy as: [ATTR_GOO] = { .type = NLA_BITFIELD32, .validation_data = &myvalidflags }, where myvalidflags is the bit mask of the flags the kernel understands. If the user _does not_ provide myvalidflags then the attribute will also be rejected. Examples: value = 0x0, and selector = 0x1 implies we are selecting bit 1 and we want to set its value to 0. value = 0x2, and selector = 0x2 implies we are selecting bit 2 and we want to set its value to 1. Suggested-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
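The value/selector semantics above can be illustrated with a small Python model. It is a sketch of the rules as stated in the commit message and the series changelog (unknown flags are rejected, a missing valid-flags mask is rejected, and value bits not covered by the selector are rejected), not the kernel's validation function. The function names are hypothetical, and `apply_bitfield32` is an assumption about how a consumer might apply an accepted attribute.

```python
def validate_bitfield32(value, selector, valid_flags):
    """Return True if the (value, selector) pair is acceptable."""
    if valid_flags is None:
        return False  # no mask supplied by the subsystem: reject
    if selector & ~valid_flags:
        return False  # selects a flag the kernel does not know about
    if value & ~selector:
        return False  # sets a bit that the selector did not select
    return True


def apply_bitfield32(current, value, selector):
    """Update only the selected bits of `current`, leaving the others untouched."""
    return ((current & ~selector) | (value & selector)) & 0xFFFFFFFF
```

With a valid mask of 0x3, the two examples from the message pass validation: value 0x0 with selector 0x1 clears the first flag, and value 0x2 with selector 0x2 sets the second.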
* net: fec: Allow reception of frames bigger than 1522 bytesAndrew Lunn2017-07-311-3/+5
| | | | | | | | | | | | | | | | | | | | | | The FEC Receive Control Register has a 14 bit field indicating the longest frame that may be received. It is being set to 1522. Frames longer than this are discarded, but counted as being in error. When using DSA, frames from the switch have an additional header, either 4 or 8 bytes if a Marvell switch is used. Thus a full MTU frame of 1522 bytes received by the switch on a port becomes 1530 bytes when passed to the host via the FEC interface. Change the maximum receive size to 2048 - 64, where 64 is the maximum rx_alignment applied on the receive buffer for AVB capable FEC cores. Use this value also for the maximum receive buffer size. The driver is already allocating a receive SKB of 2048 bytes, so this change should not have any significant effects. Tested on imx51, imx6, vf610. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
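The size reasoning in this commit message is easy to verify numerically. The Python sketch below restates the figures from the message; the constant and function names are illustrative and are not taken from the driver.

```python
OLD_MAX_FRAME = 1522   # previous limit programmed into the receive control register
DSA_TAG_MAX = 8        # largest extra header mentioned for a Marvell switch
RX_BUFFER = 2048       # receive SKB size the driver already allocates
MAX_RX_ALIGN = 64      # maximum rx_alignment on AVB capable cores
FIELD_BITS = 14        # width of the maximum frame length field


def tagged_len(frame_len, tag_len=DSA_TAG_MAX):
    """Length of a frame after the switch adds its DSA header."""
    return frame_len + tag_len


NEW_MAX_FRAME = RX_BUFFER - MAX_RX_ALIGN
```

A full 1522 byte frame grows to 1530 bytes with an 8 byte tag, which exceeds the old limit but fits within the new limit of 1984 bytes, and 1984 still fits in the 14 bit register field.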
* net: fec: Issue error for missing but expected PHYAndrew Lunn2017-07-311-1/+3
| | | | | | | | | | | | | | | | | If the PHY is missing but expected, e.g. because of a typo in the dt file, it is not possible to open the interface. ip link returns: RTNETLINK answers: No such device It is not very obvious what the problem is. Add a netdev_err() in this case to make it easier to debug the issue. [ 21.409385] fec 2188000.ethernet eth0: Unable to connect to phy RTNETLINK answers: No such device Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Fugang Duan <fugang.duan@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'dsa-lan9303-Fix-MDIO-issues'David S. Miller2017-07-314-15/+64
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Egil Hjelmeland says: ==================== net: dsa: lan9303: Fix MDIO issues. This series fix the MDIO interface for the lan9303 DSA driver. Bugs found after testing on actual HW. This series is extracted from the first patch of my first large series. Significant changes from that version are: - use mdiobus_write_nested, mdiobus_read_nested. - EXPORT lan9303_indirect_phy_ops Unfortunately I do not have access to i2c based system for testing. Changes from first version: - Change EXPORT_SYMBOL to EXPORT_SYMBOL_GPL ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: dsa: lan9303: MDIO access phy registers directlyEgil Hjelmeland2017-07-314-7/+47
| | | | | | | | | | | | | Indirect access (PMI) to phy registers only works in I2C mode. In MDIO mode phy registers must be accessed directly. Introduced struct lan9303_phy_ops to handle the two modes. Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: dsa: lan9303: Renamed indirect phy access functionsEgil Hjelmeland2017-07-311-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Preparing for the following fix of MDIO phy access: Renamed functions that access PHY 1 and 2 indirectly through PMI registers. lan9303_port_phy_reg_wait_for_completion() to lan9303_indirect_phy_wait_for_completion() lan9303_port_phy_reg_read() to lan9303_indirect_phy_read() lan9303_port_phy_reg_write() to lan9303_indirect_phy_write() Also changed "val" parameter of lan9303_indirect_phy_write() to u16, for clarity. Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: dsa: lan9303: Multiply by 4 to get MDIO registerEgil Hjelmeland2017-07-312-0/+8
| | | | | | | | | | | | | | lan9303_mdio_write()/_read() must multiply register number by 4 to get offset. Added some comments to the register definitions. Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: dsa: lan9303: Fix lan9303_detect_phy_setup() for MDIOEgil Hjelmeland2017-07-311-1/+2
|/ | | | | | | | | Handle the case where an MDIO read with no response returns 0xffff. Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'ethtool-fec'David S. Miller2017-07-305-36/+302
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Roopa Prabhu says: ==================== ethtool: support for forward error correction mode setting on a link Forward Error Correction (FEC) modes i.e Base-R and Reed-Solomon modes are introduced in 25G/40G/100G standards for providing good BER at high speeds. Various networking devices which support 25G/40G/100G provides ability to manage supported FEC modes and the lack of FEC encoding control and reporting today is a source for interoperability issues for many vendors. FEC capability as well as specific FEC mode i.e. Base-R or RS modes can be requested or advertised through bits D44:47 of base link codeword. This patch set intends to provide option under ethtool to manage and report FEC encoding settings for networking devices as per IEEE 802.3 bj, bm and by specs. v2 : - minor patch format fixes and typos pointed out by Andrew - there was a pending discussion on the use of 'auto' vs 'automatic' for fec settings. I have left it as 'auto' because in most cases today auto is used in place of automatic to represent automatically generated values. We use it in other networking config too. I would prefer leaving it as auto. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * cxgb4: ethtool forward error correction management support  Casey Leedom  2017-07-30  1  -0/+100
| | | | | | | | | | Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * cxgb4: core hardware/firmware support for Forward Error Correction on a link  Casey Leedom  2017-07-30  1  -35/+117
| | | | | | | | | | Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: ethtool: add support for forward error correction modes  Vidya Sagar Ravipati  2017-07-30  3  -1/+85
|/
| Forward Error Correction (FEC) modes, i.e. Base-R and Reed-Solomon, are introduced in the 25G/40G/100G standards to provide good BER at high speeds. Various networking devices that support 25G/40G/100G provide the ability to manage supported FEC modes, and the lack of FEC encoding control and reporting today is a source of interoperability issues for many vendors. FEC capability, as well as a specific FEC mode (Base-R or RS), can be requested or advertised through bits D44:47 of the base link codeword.
|
| This patch set intends to provide an option under ethtool to manage and report FEC encoding settings for networking devices as per the IEEE 802.3 bj, bm and by specs.
|
| The set-fec/show-fec option(s) are designed to control and report the FEC encoding on the link.
|
| SET FEC option:
| root@tor: ethtool --set-fec swp1 encoding [off | RS | BaseR | auto]
|
| Encoding: Types of encoding
| Off   : Turning off any encoding
| RS    : enforcing RS-FEC encoding on supported speeds
| BaseR : enforcing Base R encoding on supported speeds
| Auto  : IEEE defaults for the speed/medium combination
|
| Here are a few examples of what we would expect if encoding=auto:
| - if autoneg is on, we expect FEC to be negotiated as on or off as long as the protocol supports it
| - if the hardware is capable of detecting the FEC encoding on its receiver, it will reconfigure its encoder to match
| - in the absence of the above, the configuration would be set to IEEE defaults.
|
| From our understanding, this is essentially what most hardware/driver combinations are doing today in the absence of a way for users to control the behavior.
|
| SHOW FEC option:
| root@tor: ethtool --show-fec swp1
| FEC parameters for swp1:
| Active FEC encodings: RS
| Configured FEC encodings: RS | BaseR
|
| ETHTOOL DEVNAME output modification:
|
| ethtool devname output:
| root@tor:~# ethtool swp1
| Settings for swp1:
| root@hpe-7712-03:~# ethtool swp18
| Settings for swp18:
|     Supported ports: [ FIBRE ]
|     Supported link modes:   40000baseCR4/Full
|                             40000baseSR4/Full
|                             40000baseLR4/Full
|                             100000baseSR4/Full
|                             100000baseCR4/Full
|                             100000baseLR4_ER4/Full
|     Supported pause frame use: No
|     Supports auto-negotiation: Yes
|     Supported FEC modes: [RS | BaseR | None | Not reported]
|     Advertised link modes:  Not reported
|     Advertised pause frame use: No
|     Advertised auto-negotiation: No
|     Advertised FEC modes: [RS | BaseR | None | Not reported] <<<< One or more FEC modes
|     Speed: 100000Mb/s
|     Duplex: Full
|     Port: FIBRE
|     PHYAD: 106
|     Transceiver: internal
|     Auto-negotiation: off
|     Link detected: yes
|
| This patch includes the following changes:
| a) New ETHTOOL_GFECPARAM/ETHTOOL_SFECPARAM API, handled by the new get_fecparam/set_fecparam callbacks, provides support for configuration of forward error correction modes.
| b) Link mode bits for FEC modes, i.e. None (No FEC mode), RS, BaseR/FC, are defined so that users can configure these FEC modes for the supported and advertising fields as part of link autonegotiation.
|
| Signed-off-by: Vidya Sagar Ravipati <vidya.chowdary@gmail.com>
| Signed-off-by: Dustin Byford <dustin@cumulusnetworks.com>
| Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
| Signed-off-by: David S. Miller <davem@davemloft.net>
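To illustrate how the `encoding [off | RS | BaseR | auto]` keywords from the commit message might be turned into a mode bitmask, here is a rough sketch. Everything named `sketch_*` / `SKETCH_*` is hypothetical, and the bit positions are made up for the example; they are not the kernel's or ethtool's definitions.

```c
#include <stdint.h>
#include <strings.h>	/* strcasecmp() (POSIX) */

/* Illustrative bit values only; NOT the kernel's FEC bit definitions. */
enum sketch_fec_bits {
	SKETCH_FEC_AUTO  = 1u << 0,	/* IEEE defaults for speed/medium */
	SKETCH_FEC_OFF   = 1u << 1,	/* no FEC encoding */
	SKETCH_FEC_RS    = 1u << 2,	/* Reed-Solomon FEC */
	SKETCH_FEC_BASER = 1u << 3,	/* Base-R FEC */
};

/* Map one encoding keyword (case-insensitive) to its mode bit.
 * Returns 0 when the keyword is not recognised. */
static inline uint32_t sketch_fec_parse(const char *s)
{
	if (!strcasecmp(s, "auto"))
		return SKETCH_FEC_AUTO;
	if (!strcasecmp(s, "off"))
		return SKETCH_FEC_OFF;
	if (!strcasecmp(s, "rs"))
		return SKETCH_FEC_RS;
	if (!strcasecmp(s, "baser"))
		return SKETCH_FEC_BASER;
	return 0;
}
```

Using one bit per mode lets a configured set such as "RS | BaseR" (as in the show-fec output above) be stored as the OR of two parsed values.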
* Merge branch 'netvsc-minor-fixes-and-optimization'  David S. Miller  2017-07-30  4  -248/+208
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stephen Hemminger says: ==================== netvsc: minor fixes and optimization This is a subset of an earlier submission with a few more fixes found during testing. There are two small optimizations: one better manages the receive completion ring, and the other removes one unneeded level of indirection. Will submit the improved VF support and buffer sizing in a later patch so they get more review. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * netvsc: signal host if receive ring is emptied  stephen hemminger  2017-07-30  1  -3/+8
| | | | | | | | | | | | | | | | | | Latency improvement related to the NAPI conversion. If all packets in the receive ring have been processed, the host needs to be signaled. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>