Commit message  Author  Age  Files  Lines
* MAINTAINERS: Add entry for Socionext ethernet driver  Jassi Brar  2018-01-10  1  -0/+7
    Add entry for the Socionext Netsec controller driver and DT bindings.

    Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* net: socionext: Add Synquacer NetSec driver  Jassi Brar  2018-01-10  3  -0/+1788
    This driver adds support for Socionext "netsec" IP Gigabit Ethernet + PHY IP used in the Synquacer SC2A11 SoC.

    Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* dt-bindings: net: Add DT bindings for Socionext Netsec  Jassi Brar  2018-01-10  1  -0/+53
    This patch adds documentation for Device-Tree bindings for the Socionext NetSec Controller driver.

    Reviewed-by: Rob Herring <robh@kernel.org>
    Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
    Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue  David S. Miller  2018-01-10  10  -177/+298
|\
    Jeff Kirsher says:

    ====================
    10GbE Intel Wired LAN Driver Updates 2018-01-09

    This series contains updates to ixgbe and ixgbevf only.

    Emil fixes an issue with "wake on LAN" (WoL) where we need to ensure we enable the reception of multicast packets so that WoL works for IPv6 magic packets. Cleaned up code no longer needed with the update to adaptive ITR.

    Paul updates the driver to advertise the highest capable link speed when a module gets inserted. Also extended the displaying of firmware version to include the iSCSI and OEM block in the EEPROM to better identify firmware versions/images.

    Tonghao Zhang cleans up a code comment that no longer applies since InterruptThrottleRate has been removed from the driver.

    Alex fixes SR-IOV and MACVLAN offload interaction, where the MACVLAN offload was incorrectly configuring several filters with the wrong pool value, which resulted in MACVLAN interfaces not being able to receive traffic that had to pass over the physical interface. Fixed transmit hangs and dropped receive frames when the number of VFs changed. Added support for RSS on MACVLAN pools for X550 devices. Fixed up the MACVLAN limitations so we can now support 63 offloaded devices. Cleaned up MACVLAN code that is no longer needed with the recent changes and fixes.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>
| * ixgbe: Drop l2_accel_priv data pointer from ring struct  Alexander Duyck  2018-01-09  2  -11/+13
    The l2 acceleration private pointer isn't needed in the ring struct. It isn't really used anywhere other than to test and see if we are supporting an offloaded macvlan netdev, and it is much easier to test netdev for not being ixgbe based to verify that.

    Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
    Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: Use ring values to test for Tx pending  Alexander Duyck  2018-01-09  1  -16/+4
    This patch simplifies the check for pending Tx traffic and makes it more holistic. Any difference between next_to_use and next_to_clean is much more informative than whether head and tail are equal, because it is possible for us either to not update tail or to not be notified of completed work, in which case next_to_clean would not be equal to head.

    In addition, the simplification means we no longer have to read hardware, which allows us to drop a number of variables that were previously being used in the call.

    Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
    Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
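The software-only check described in this commit message can be sketched in plain C. This is a simplified model under stated assumptions, not the driver code: `struct tx_ring` and `tx_ring_pending()` are hypothetical names, and only the `next_to_use` and `next_to_clean` field names come from the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified Tx ring: only the software indices matter here. */
struct tx_ring {
    uint16_t next_to_use;   /* next descriptor software will fill */
    uint16_t next_to_clean; /* next descriptor software will reclaim */
};

/* Work is pending whenever the two software indices differ.
 * No hardware head/tail register reads are involved. */
static bool tx_ring_pending(const struct tx_ring *ring)
{
    return ring->next_to_use != ring->next_to_clean;
}
```

A caller would pass each Tx ring in turn and treat the queue as busy if any ring reports pending work.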
| * ixgbe: Fix limitations on macvlan so we can support up to 63 offloaded devices  Alexander Duyck  2018-01-09  4  -43/+34
    This change is a fix of the macvlan offload so that we correctly handle macvlan offloaded devices. Specifically we were configuring our limits based on the assumption that we were going to max out the RSS indices for every mode. As a result when we went to 15 or more macvlan interfaces we were forced into the 2 queue RSS mode on VFs even though they could have still supported 4.

    This change splits the logic up so that we limit either the total number of macvlan instances if DCB is enabled, or limit the number of RSS queues used per macvlan (instead of per pool) if SR-IOV is enabled. By doing this we can make best use of the part.

    In addition I have increased the maximum number of supported interfaces to 63 with one queue per offloaded interface as this more closely reflects the actual values supported by the interface.

    Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
    Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: There is no need to update num_rx_pools in L2 fwd offload  Alexander Duyck  2018-01-09  2  -4/+1
    The num_rx_pools value is overwritten when we reinitialize the queue configuration. In reality we shouldn't need to be updating the value since it is redone every time we call into ixgbe_setup_tc, so for now just drop the spots where we were incrementing or decrementing the value.

    Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
    Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: Add support for macvlan offload RSS on X550 and clean-up pool handling  Alexander Duyck  2018-01-09  1  -37/+25
    In order for RSS to work on the macvlan pools of the X550 we need to populate the MRQC, RETA, and RSS key values for each pool. This patch makes it so that we now take care of that.

    In addition I have dropped the macvlan specific configuration of psrtype since it is redundant with the code that already exists for configuring this value.

    Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
    Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: Perform reinit any time number of VFs change  Alexander Duyck  2018-01-09  1  -16/+3
    If the number of VFs is changed we need to reinitialize the part since the offset for the device and the number of pools will be incorrect. Without this change we can end up seeing Tx hangs and dropped Rx frames for incoming traffic.

    In addition we should drop the code that is arbitrarily changing the default pool and queue configuration. Instead we should wait until the port is reset and reconfigured via ixgbe_sriov_reinit.

    Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
    Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: Fix interaction between SR-IOV and macvlan offload  Alexander Duyck  2018-01-09  1  -7/+5
    When SR-IOV was enabled the macvlan offload was configuring several filters with the wrong pool value. This would result in the macvlan interfaces not being able to receive traffic that had to pass over the physical interface.

    To fix it, wrap the pool argument in the VMDQ_P macro, which will add the necessary offset to get to the actual VMDq pool.

    Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
    Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
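The offset translation this fix relies on can be illustrated with a small sketch. It is an assumption-laden model rather than the driver's macro: `struct vmdq_cfg` and `vmdq_pool()` are hypothetical names, and the only behaviour taken from the message is that an offset is added to a relative pool index to reach the actual VMDq pool.

```c
/* Hypothetical stand-in for the adapter state that holds the pool offset. */
struct vmdq_cfg {
    unsigned int offset; /* first VMDq pool not reserved for other users */
};

/* Translate a relative pool index into an absolute VMDq pool index. */
static unsigned int vmdq_pool(const struct vmdq_cfg *cfg, unsigned int pool)
{
    return pool + cfg->offset;
}
```

Programming a filter with the bare relative index instead of the translated one is the kind of mistake the commit describes.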
| * ixgbevf: remove redundant setting of xcast_mode  Emil Tantilov  2018-01-09  1  -4/+0
    Removed leftover assignment of xcast_mode.

    Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
    Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: Remove an obsolete comment about ITR  Tonghao Zhang  2018-01-09  1  -2/+0
    InterruptThrottleRate has been removed from ixgbe, so remove the comment that still refers to it.

    Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
    Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: extend firmware version support  Paul Greenwalt  2018-01-09  7  -14/+198
    Extend FW version reporting by displaying information from the iSCSI or OEM block in the EEPROM. This will allow us to more accurately identify the FW.

    Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
    Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: advertise highest capable link speed  Paul Greenwalt  2018-01-09  1  -9/+8
    On module insert, advertise the highest capable link speed. If the module is capable of 10G, advertise 10G; otherwise advertise the module's capable link speeds.

    Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
    Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
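The selection rule in this message can be written as a short sketch. The flag names `DEMO_SPEED_1G` and `DEMO_SPEED_10G` and the function `speeds_to_advertise()` are hypothetical; the driver uses its own link-speed constants, which are not shown in this log.

```c
/* Hypothetical speed flags, one bit per supported link speed. */
#define DEMO_SPEED_1G  0x1u
#define DEMO_SPEED_10G 0x2u

/* If the module can do 10G, advertise only 10G;
 * otherwise advertise whatever the module is capable of. */
static unsigned int speeds_to_advertise(unsigned int capable)
{
    if (capable & DEMO_SPEED_10G)
        return DEMO_SPEED_10G;
    return capable;
}
```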
| * ixgbe: remove unused enum latency_range  Emil Tantilov  2018-01-09  1  -7/+0
    This enum is no longer needed after commit:

    b4ded8327fe ("ixgbe: Update adaptive ITR algorithm")

    Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
    Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * ixgbe: enable multicast on shutdown for WOL  Emil Tantilov  2018-01-09  1  -7/+7
    Previously we only enabled the reception of multicast packets when wake on multicast is set, but we also need this to allow waking with IPv6 magic packets.

    Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
    Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
* | Merge branch 'r8169-improve-runtime-pm'  David S. Miller  2018-01-09  1  -27/+17
|\ \
    Heiner Kallweit says:

    ====================
    r8169: improve runtime pm

    On my system with two network ports I found that runtime PM didn't suspend the unused port. Therefore I checked runtime pm in this driver in somewhat more detail and this series improves runtime pm in general and solves the mentioned issue.

    Tested on a system with RTL8168evl (MAC version 34).
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | r8169: improve runtime pm in general and suspend unused ports  Heiner Kallweit  2018-01-09  1  -8/+12
    So far rpm doesn't cover cases like unused ports which are never brought up. If they are active at probe time they remain in this state.

    Included in this patch:
    - Let the idle notification check whether we can suspend and let it schedule the suspend. This way we don't need to have calls to pm_schedule_suspend in different places.
    - At the end of rtl_open and rtl_init_one send an idle notification to allow suspending if the link is down. If a cable is plugged in, aneg is finished before the suspend timer expires and the suspend request is cancelled.
    - Change rtl8169_runtime_suspend to power down the chip if the interface is down.

    Successfully tested on a RTL8168evl (mac version 34).

    Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | r8169: improve runtime pm in rtl8169_check_link_status  Heiner Kallweit  2018-01-09  1  -15/+6
    This patch partially reverts commit e4fbce740f07 "r8169: Fix runtime power management" from 2010. At that time the suspend delay was 100ms and therefore suspending happened during initial aneg. Currently the suspend delay is 5s, so suspend starts after aneg and the issue doesn't exist any longer.

    On my system aneg takes almost 3s; to be on the safe side, let's increase the suspend delay to 10s.

    Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | r8169: remove unneeded rpm ops in rtl_shutdown  Heiner Kallweit  2018-01-09  1  -5/+0
|/ /
    This patch reverts commit 2a15cd2ff488 "r8169: runtime resume before shutdown" from 2012. A few months after this change the underlying issue was solved in the PCI core with commit 3ff2de9ba1a2 "PCI/PM: Resume device before shutdown".

    Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'tipc-improvements-to-group-messaging'  David S. Miller  2018-01-09  10  -243/+300
|\ \
    Jon Maloy says:

    ====================
    tipc: improvements to group messaging

    We make a number of simplifications and improvements to the group messaging service. They aim at readability/maintainability of the code as well as scalability.

    The series is based on commit f9c935db8086 ("tipc: fix problems with multipoint-to-point flow control") which has been applied to 'net' but not yet to 'net-next'.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | tipc: improve poll() for group member socket  Jon Maloy  2018-01-09  3  -33/+41
    The current criterion for returning POLLOUT from a group member socket is too simplistic. It basically returns POLLOUT as soon as the group has external destinations, something obviously leading to a lot of spinning during destination congestion situations. At the same time, the internal congestion handling is unnecessarily complex.

    We now change this as follows.

    - We introduce an 'open' flag in struct tipc_group. This flag is used only to help poll() get the setting of POLLOUT right, and *not* for congestion handling as such. This means that a user can choose to ignore an EAGAIN for a destination and go on sending messages to other destinations in the group if he wants to.
    - The flag is set to false every time we return EAGAIN on a send call.
    - The flag is set to true every time any member, i.e., not necessarily the member that caused EAGAIN, is removed from the small_win list.
    - We remove the group member 'usr_pending' flag. The size of the send window and presence in the 'small_win' list are sufficient criteria for recognizing congestion.

    This solution seems to be a reasonable compromise between 'anycast', which is normally not waiting for POLLOUT for a specific destination, and the other three send modes, which are.

    Acked-by: Ying Xue <ying.xue@windriver.com>
    Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
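The 'open' flag transitions listed in this message can be modelled in a few lines. This is only a sketch of the rules as stated, not the TIPC implementation: `struct group_model` and the three helper functions are hypothetical names, and the model tracks nothing but the flag.

```c
#include <stdbool.h>

/* Hypothetical model of the group state that poll() consults. */
struct group_model {
    bool open; /* true: POLLOUT may be reported */
};

/* A send call returned EAGAIN: stop reporting POLLOUT. */
static void group_on_send_eagain(struct group_model *grp)
{
    grp->open = false;
}

/* Any member left the small_win list: report POLLOUT again,
 * even if it is not the member that caused the EAGAIN. */
static void group_on_small_win_removal(struct group_model *grp)
{
    grp->open = true;
}

/* What poll() would report for writability in this model. */
static bool group_poll_writable(const struct group_model *grp)
{
    return grp->open;
}
```

Because the flag only steers POLLOUT, a sender in this model stays free to ignore it and keep sending to other destinations.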
| * | tipc: improve groupcast scope handling  Jon Maloy  2018-01-09  9  -75/+99
    When a member joins a group, it also indicates a binding scope. This makes it possible to create both node local groups, invisible to other nodes, as well as cluster global groups, visible everywhere.

    In order to avoid that different members end up having permanently differing views of group size and membership, we must inhibit locally and globally bound members from joining the same group.

    We do this by using the binding scope as an additional separator between groups. I.e., a member must ignore all membership events from sockets using a different scope than itself, and all lookups for message destinations must require an exact match between the message's lookup scope and the potential target's binding scope.

    Apart from making it possible to create local groups using the same identity on different nodes, a side effect of this is that it now also becomes possible to create a cluster global group with the same identity across the same nodes, without interfering with the local groups.

    Acked-by: Ying Xue <ying.xue@windriver.com>
    Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | tipc: add option to suppress PUBLISH events for pre-existing publications  Jon Maloy  2018-01-09  6  -15/+21
    Currently, when a user is subscribing for binding table publications, he will receive a PUBLISH event for all already existing matching items in the binding table.

    However, a group socket making a subscription doesn't need this initial status update from the binding table, because it has already scanned it during the join operation. Worse, the multiplicative effect of issuing mutual events for dozens or hundreds of group members within a short time frame puts a heavy load on the topology server, with the end result that scale out operations on a big group tend to take much longer than needed.

    We now add a new filter option, TIPC_SUB_NO_STATUS, for topology server subscriptions, so that this initial avalanche of events is suppressed. This change, along with the previous commit, significantly improves the range and speed of group scale out operations.

    We keep the new option internal for the tipc driver, at least for now.

    Acked-by: Ying Xue <ying.xue@windriver.com>
    Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | tipc: send out join messages as soon as new member is discovered  Jon Maloy  2018-01-09  4  -42/+68
    When a socket is joining a group, we look up in the binding table to find if there are already other members of the group present. This is used for being able to return EAGAIN instead of EHOSTUNREACH if the user proceeds directly to a send attempt.

    However, the information in the binding table can be used to directly set the created member in state MBR_PUBLISHED and send a JOIN message to the peer, instead of waiting for a topology PUBLISH event to do this. When there are many members in a group, the propagation time for such events can be significant, and we can save time during the join operation if we use the initial lookup result fully.

    In this commit, we eliminate the member state MBR_DISCOVERED which has been the result of the initial lookup, and do instead go directly to MBR_PUBLISHED, which initiates the setup.

    After this change, the tipc_member FSM looks as follows:

         +-----------+
    ---->| PUBLISHED |-----------------------------------------------+
    PUB- +-----------+                                LEAVE/WITHDRAW |
    LISH       |JOIN                                                 |
               |     +-------------------------------------------+   |
               |     |                            LEAVE/WITHDRAW |   |
               |     |                +------------+             |   |
               |     |   +----------->|  PENDING   |---------+   |   |
               |     |   |msg/maxactv +-+---+------+  LEAVE/ |   |   |
               |     |   |              |   |       WITHDRAW |   |   |
               |     |   |   +----------+   |                |   |   |
               |     |   |   |revert/maxactv|                |   |   |
               |     |   |   V              V                V   V   V
               |   +----------+  msg  +------------+       +-----------+
               +-->|  JOINED  |------>|   ACTIVE   |------>|  LEAVING  |--->
               |   +----------+       +-----+------+ LEAVE/+-----------+DOWN
               |        A   A               |      WITHDRAW A   A    A  EVT
               |        |   |               |RECLAIM        |   |    |
               |        |   |REMIT          V               |   |    |
               |        |   |== adv   +------------+        |   |    |
               |        |   +---------| RECLAIMING |--------+   |    |
               |        |             +-----+------+  LEAVE/    |    |
               |        |                   |REMIT    WITHDRAW  |    |
               |        |                   |< adv              |    |
               |        |msg/               V         LEAVE/    |    |
               |        |adv==ADV_IDLE+------------+  WITHDRAW  |    |
               |        +-------------|  REMITTED  |------------+    |
               |                      +------------+                 |
               |PUBLISH                                              |
    JOIN +-----------+                                LEAVE/WITHDRAW |
    ---->|  JOINING  |-----------------------------------------------+
         +-----------+

    Acked-by: Ying Xue <ying.xue@windriver.com>
    Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | tipc: simplify group LEAVE sequence  Jon Maloy  2018-01-09  1  -31/+9
    After the changes in the previous commit the group LEAVE sequence can be simplified.

    We now let the arrival of a LEAVE message unconditionally issue a group DOWN event to the user. When a topology WITHDRAW event is received, the member, if it is still there, is set to state LEAVING, but we only issue a group DOWN event when the link to the peer node is gone, so that no LEAVE message is to be expected.

    Acked-by: Ying Xue <ying.xue@windriver.com>
    Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | tipc: create group member event messages when they are needed  Jon Maloy  2018-01-09  3  -44/+56
    In the current implementation, a group socket receiving topology events about other members just converts the topology event message into a group event message and stores it until it reaches the right state to issue it to the user. This complicates the code unnecessarily, and becomes impractical when, in the coming commits, we will need to create and issue membership events independently.

    In this commit, we change this so that we just notice the type and origin of the incoming topology event, and then drop the buffer. Only when it is time to actually send a group event to the user do we explicitly create a new message and send it upwards.

    Acked-by: Ying Xue <ying.xue@windriver.com>
    Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | tipc: adjustment to group member FSM  Jon Maloy  2018-01-09  1  -3/+2
    Analysis reveals that the member state MBR_QUARANTINED in reality is unnecessary, and can be replaced by the state MBR_JOINING at all occurrences.

    Acked-by: Ying Xue <ying.xue@windriver.com>
    Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | tipc: let group member stay in JOINED mode if unable to reclaim  Jon Maloy  2018-01-09  1  -12/+22
    We handle a corner case in the function tipc_group_update_rcv_win(). During extreme pressure it might happen that a message receiver has all its active senders in RECLAIMING or REMITTED mode, meaning that there is nobody to reclaim advertisements from if an additional sender tries to go active.

    Currently we just set the new sender to ACTIVE anyway, hence at least theoretically opening up for a receiver queue overflow by exceeding the MAX_ACTIVE limit. The correct solution to this is to instead add the member to the pending queue, while letting the oldest member in that queue revert to JOINED state.

    In this commit we refactor the code for handling message arrival from a JOINED member, both to make it more comprehensible and to cover the case described above.

    Acked-by: Ying Xue <ying.xue@windriver.com>
    Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | tipc: a couple of cleanups  Jon Maloy  2018-01-09  1  -14/+8
|/ /
    - We remove the 'reclaiming' member list in struct tipc_group, since it doesn't serve any purpose.
    - We simplify the GRP_REMIT_MSG branch of tipc_group_protocol_rcv().

    Acked-by: Ying Xue <ying.xue@windriver.com>
    Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'ethtool-ringparam-upper-bound'  David S. Miller  2018-01-09  3  -21/+24
|\ \
    Tariq Toukan says:

    ====================
    ethtool ringparam upper bound

    This patchset by Jenny adds sanity checks in the ethtool ringparam operation for input upper bounds, similarly to what's done in ethtool_set_channels.

    The checks are added in patch 1, using a call to get_ringparam prior to calling the set_ringparam NDO.

    Patch 2 changes the function's behavior in mlx4_en, so that it returns an error for out-of-range input, instead of rounding it to the closest valid value, similar to mlx5e.

    Patch 3 removes the upper bound checks in mlx5e_ethtool_set_ringparam as they become redundant.

    Series generated against net-next commit:
    f66faae2f80a Merge branch 'ipv6-ipv4-nexthop-align'
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx5e: Remove redundant checks in set_ringparam  Eugenia Emantayev  2018-01-09  1  -15/+0
    Since the checks are done in the upper layer ethtool code, checks in the driver are not needed any more.

    Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
    Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/mlx4_en: Align behavior of set ring size flow via ethtool  Eugenia Emantayev  2018-01-09  1  -4/+13
    In the current implementation, any requested RX/TX ring size value that is less than the minimum is silently cast to the nearest valid value. Update this behavior to align with mlx5 behavior by printing a warning in dmesg and leaving the size unchanged. The kernel is responsible for verifying against the maximum.

    Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
    Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ethtool: Ensure new ring parameters are within bounds during SRINGPARAM  Eugenia Emantayev  2018-01-09  1  -2/+11
|/ /
    Add a sanity check to ensure that all requested ring parameters are within bounds, which should reduce errors in driver implementation.

    Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
    Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
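A minimal sketch of such a bounds check, assuming the maxima have already been fetched from the driver (the series description says this is done with a get_ringparam call before set_ringparam). `struct ring_params` and `check_ring_params()` are hypothetical and carry only two of the ring sizes; they are not the ethtool structures themselves.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical subset of the ring parameters exchanged with a driver. */
struct ring_params {
    uint32_t rx_pending;     /* requested Rx ring size */
    uint32_t tx_pending;     /* requested Tx ring size */
    uint32_t rx_max_pending; /* maximum reported by the driver */
    uint32_t tx_max_pending; /* maximum reported by the driver */
};

/* Return 0 if the request fits within the reported maxima, -EINVAL otherwise. */
static int check_ring_params(const struct ring_params *req,
                             const struct ring_params *max)
{
    if (req->rx_pending > max->rx_max_pending ||
        req->tx_pending > max->tx_max_pending)
        return -EINVAL;
    return 0;
}
```

Rejecting the request in common code means each driver no longer needs its own upper-bound test.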
* | ipv6: use ARRAY_SIZE for array sizing calculation on array seg6_action_table  Colin Ian King  2018-01-09  1  -1/+1
    Use the ARRAY_SIZE macro on array seg6_action_table to determine size of the array. Improvement suggested by coccinelle.

    Signed-off-by: Colin Ian King <colin.king@canonical.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
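For readers unfamiliar with the idiom, here is a minimal user-space sketch. The macro below is a simplified form (the kernel's version additionally rejects non-array arguments at compile time), and `example_table` is a hypothetical array, not seg6_action_table.

```c
#include <stddef.h>

/* Simplified element-count macro: total bytes divided by bytes per element. */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical table standing in for a driver's lookup array. */
static const int example_table[] = { 10, 20, 30, 40 };

/* Number of entries, computed at compile time from the array type. */
static size_t example_table_len(void)
{
    return ARRAY_SIZE(example_table);
}
```

The count follows the initializer automatically, so adding an entry never leaves a stale hand-written length behind.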
* | be2net: use ARRAY_SIZE for array sizing calculation on array cmd_priv_map  Colin Ian King  2018-01-09  1  -1/+1
    Use the ARRAY_SIZE macro on array cmd_priv_map to determine size of the array. Improvement suggested by coccinelle.

    Signed-off-by: Colin Ian King <colin.king@canonical.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* | virtio_net: propagate linkspeed/duplex settings from the hypervisor  Jason Baron  2018-01-09  2  -1/+35
    The ability to set speed and duplex for virtio_net is useful in various scenarios as described here:

    16032be virtio_net: add ethtool support for set and get of settings

    However, it would be nice to be able to set this from the hypervisor, such that virtio_net doesn't require custom guest ethtool commands.

    Introduce a new feature flag, VIRTIO_NET_F_SPEED_DUPLEX, which allows the hypervisor to export a linkspeed and duplex setting. The user can subsequently overwrite it if desired via 'ethtool -s'.

    Note that VIRTIO_NET_F_SPEED_DUPLEX is defined as bit 63. The intention is that device feature bits are to grow down from bit 63, since the transports are starting from bit 24 and growing up.

    Signed-off-by: Jason Baron <jbaron@akamai.com>
    Cc: "Michael S. Tsirkin" <mst@redhat.com>
    Cc: Jason Wang <jasowang@redhat.com>
    Cc: virtio-dev@lists.oasis-open.org
    Signed-off-by: David S. Miller <davem@davemloft.net>
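Because the flag sits at bit 63, any test for it has to use 64-bit arithmetic. The sketch below shows one way to do that in portable C; the helper names `has_feature()` and `set_feature()` are hypothetical, and only the bit number comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_F_SPEED_DUPLEX 63 /* bit number stated in the commit message */

/* Test a feature bit in a 64-bit feature word. */
static bool has_feature(uint64_t features, unsigned int bit)
{
    return ((features >> bit) & 1ULL) != 0;
}

/* Return the feature word with the given bit set.
 * The 1ULL constant keeps the shift 64 bits wide. */
static uint64_t set_feature(uint64_t features, unsigned int bit)
{
    return features | (1ULL << bit);
}
```

Shifting a plain `1` instead of `1ULL` would be undefined for bit 63 on platforms where `int` is 32 bits wide.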
* | macsec: Add support for GCM-AES-256 cipher suite  Felix Walter  2018-01-09  2  -16/+67
    This adds support for the GCM-AES-256 cipher suite as specified in IEEE 802.1AEbn-2011. The prepared cipher suite selection mechanism is used, with GCM-AES-128 being the default cipher suite as defined in the standard.

    Signed-off-by: Felix Walter <felix.walter@cloudandheat.com>
    Cc: Sabrina Dubroca <sd@queasysnail.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'XDP-transmission-for-tuntap'  David S. Miller  2018-01-09  5  -90/+269
|\ \
    Jason Wang says:

    ====================
    XDP transmission for tuntap

    This series tries to implement XDP transmission (ndo_xdp_xmit) for tuntap. The pointer ring was used for queuing both XDP buffers and sk_buff; this is done by encoding the type into the lowest bit of the pointer and storing XDP metadata in the headroom of the XDP buff.

    Tests get 3.05 Mpps when doing xdp_redirect_map from ixgbe to VM (testpmd + virtio-net in guest). This gives us ~20% improvement compared to using skb during redirect.

    Please review.

    Changes from V1:
    - silence warnings
    - fix typos
    - add skb mode number in the commit log
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | tuntap: XDP transmission  Jason Wang  2018-01-09  3  -33/+208
    This patch implements XDP transmission for TAP. Since we can't create new queues for TAP during XDP set, the existing ptr_ring was reused for queuing XDP buffers. To distinguish xdp_buff from sk_buff, TUN_XDP_FLAG (0x1UL) was encoded into the lowest bit of the xdp_buff pointer during ptr_ring_produce, and was decoded during consuming. XDP metadata was stored in the headroom of the packet, which should work in most cases since drivers usually reserve enough headroom. Very minor changes were done for vhost_net: it just needs to peek the length depending on the type of pointer.

    Tests were done on two Intel E5-2630 2.40GHz machines connected back to back through two 82599ES. Traffic was generated/received through MoonGen/testpmd(rxonly). It reports ~20% improvement when xdp_redirect_map is doing redirection from ixgbe to TAP (from 2.50Mpps to 3.05Mpps).

    Cc: Jesper Dangaard Brouer <brouer@redhat.com>
    Signed-off-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
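The pointer-tagging trick described here can be sketched in user-space C. This is an illustration under assumptions, not the tun driver's code: the helper names `xdp_to_ptr()`, `ptr_is_xdp()` and `ptr_to_xdp()` are hypothetical, and the sketch assumes every queued object is at least 2-byte aligned so that bit 0 of its address is free. Only the flag name and value come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define TUN_XDP_FLAG 0x1UL /* value stated in the commit message */

/* Producer side: mark a pointer as referring to an XDP buffer. */
static void *xdp_to_ptr(void *xdp)
{
    return (void *)((uintptr_t)xdp | TUN_XDP_FLAG);
}

/* Consumer side: check which kind of object a queued pointer refers to. */
static bool ptr_is_xdp(const void *ptr)
{
    return ((uintptr_t)ptr & TUN_XDP_FLAG) != 0;
}

/* Consumer side: strip the tag to recover the original address. */
static void *ptr_to_xdp(void *ptr)
{
    return (void *)((uintptr_t)ptr & ~(uintptr_t)TUN_XDP_FLAG);
}
```

Untagged pointers pass through the ring unchanged, which is what lets one queue carry both kinds of buffer.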
| * | tun/tap: use ptr_ring instead of skb_array  Jason Wang  2018-01-09  5  -64/+68
|/ /
    This patch switches to use ptr_ring instead of skb_array. This will be used to enqueue different types of pointers by encoding type into lower bits.

    Signed-off-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net  David S. Miller  2018-01-09  350  -1320/+3949
|\ \
| |/
|/|
| * Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net  Linus Torvalds  2018-01-09  53  -220/+474
| |\
    Pull networking fixes from David Miller:

     1) Frag and UDP handling fixes in i40e driver, from Amritha Nambiar and Alexander Duyck.
     2) Undo unintentional UAPI change in netfilter conntrack, from Florian Westphal.
     3) Revert a change to how error codes are returned from dev_get_valid_name(), it broke some apps.
     4) Cannot cache routes for ipv6 tunnels if the tunnel is ipv4/ipv6 dual-stack. From Eli Cooper.
     5) Fix missed PMTU updates in geneve, from Xin Long.
     6) Cure double free in macvlan, from Gao Feng.
     7) Fix heap out-of-bounds write in rds_message_alloc_sgs(), from Mohamed Ghannam.
     8) FEC bug fixes from Fugang Duan (mis-accounting of dev_id, missed deferral of probe when the regulator is not ready yet).
     9) Missing DMA mapping error checks in 3c59x, from Neil Horman.
    10) Turn off Broadcom tags for some b53 switches, from Florian Fainelli.
    11) Fix OOPS when get_target_net() is passed an SKB whose NETLINK_CB() isn't initialized. From Andrei Vagin.
    12) Fix crashes in fib6_add(), from Wei Wang.
    13) PMTU bug fixes in SCTP from Marcelo Ricardo Leitner.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (56 commits)
      sh_eth: fix TXALCR1 offsets
      mdio-sun4i: Fix a memory leak
      phylink: mark expected switch fall-throughs in phylink_mii_ioctl
      sctp: fix the handling of ICMP Frag Needed for too small MTUs
      sctp: do not retransmit upon FragNeeded if PMTU discovery is disabled
      xen-netfront: enable device after manual module load
      bnxt_en: Fix the 'Invalid VF' id check in bnxt_vf_ndo_prep routine.
      bnxt_en: Fix population of flow_type in bnxt_hwrm_cfa_flow_alloc()
      sh_eth: fix SH7757 GEther initialization
      net: fec: free/restore resource in related probe error pathes
      uapi/if_ether.h: prevent redefinition of struct ethhdr
      ipv6: fix general protection fault in fib6_add()
      RDS: null pointer dereference in rds_atomic_free_op
      sh_eth: fix TSU resource handling
      net: stmmac: enable EEE in MII, GMII or RGMII only
      rtnetlink: give a user socket to get_target_net()
      MAINTAINERS: Update my email address.
      can: ems_usb: improve error reporting for error warning and error passive
      can: flex_can: Correct the checking for frame length in flexcan_start_xmit()
      can: gs_usb: fix return value of the "set_bittiming" callback
      ...
| | * sh_eth: fix TXALCR1 offsets  Sergei Shtylyov  2018-01-08  1  -2/+2
| | | The TXALCR1 offsets are incorrect in the register offset tables, most
| | | probably due to copy&paste error. Luckily, the driver never uses this
| | | register. :-)
| | |
| | | Fixes: 4a55530f38e4 ("net: sh_eth: modify the definitions of register")
| | | Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | * mdio-sun4i: Fix a memory leak  Christophe JAILLET  2018-01-08  1  -2/+4
| | | If the probing of the regulator is deferred, the memory allocated by
| | | 'mdiobus_alloc_size()' will be leaking.
| | | It should be freed before the next call to 'sun4i_mdio_probe()' which
| | | will reallocate it.
| | |
| | | Fixes: 4bdcb1dd9feb ("net: Add MDIO bus driver for the Allwinner EMAC")
| | | Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
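The leak described above follows a common pattern: an allocation made early in a probe routine is not released on a later error path. A minimal userspace sketch of the fixed shape is shown here. It is not the driver's code; `bus_alloc()`, `bus_free()`, `get_regulator()`, `probe()` and the `live_allocations` counter are hypothetical stand-ins, and only the value 517 mirrors the kernel's `EPROBE_DEFER`.

```c
#include <stdlib.h>

#define ERR_PROBE_DEFER 517      /* same numeric value as the kernel's EPROBE_DEFER */

static int live_allocations;     /* hypothetical counter of buses not yet freed */

/* Hypothetical stand-in for the bus allocation done at the start of probe. */
void *bus_alloc(void)
{
    void *bus = malloc(64);

    if (bus)
        live_allocations++;
    return bus;
}

void bus_free(void *bus)
{
    free(bus);
    live_allocations--;
}

/* Hypothetical regulator lookup: asks for a deferral while not ready. */
int get_regulator(int ready)
{
    return ready ? 0 : -ERR_PROBE_DEFER;
}

int probe(int regulator_ready, void **out_bus)
{
    void *bus = bus_alloc();
    int ret;

    if (!bus)
        return -1;               /* allocation failure (illustrative error code) */

    ret = get_regulator(regulator_ready);
    if (ret) {
        /* Release the bus before bailing out: the retried probe
         * allocates a fresh one, so keeping this one would leak it. */
        bus_free(bus);
        return ret;
    }

    *out_bus = bus;
    return 0;
}
```

With this shape, a deferred probe leaves no allocation behind, and only a successful probe hands a bus to the caller.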
| | * phylink: mark expected switch fall-throughs in phylink_mii_ioctl  Gustavo A. R. Silva  2018-01-08  1  -0/+2
| | | In preparation to enabling -Wimplicit-fallthrough, mark switch cases
| | | where we are expecting to fall through.
| | |
| | | Addresses-Coverity-ID: 1463447 ("Missing break in switch")
| | | Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
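For readers unfamiliar with the annotation, the sketch below shows what marking an intentional fall-through looks like. It is an illustration only: the command codes and `handle_cmd()` are hypothetical and are not the ioctl numbers or code from `phylink_mii_ioctl`. At its default level, GCC's -Wimplicit-fallthrough accepts a comment such as `/* fall through */` placed directly before the next case label.

```c
/* Hypothetical command codes, not the real MII ioctl numbers. */
enum cmd { CMD_GET_PHY, CMD_READ_REG, CMD_WRITE_REG };

int handle_cmd(enum cmd c, int *phy_id_filled)
{
    switch (c) {
    case CMD_GET_PHY:
        *phy_id_filled = 1;      /* extra step needed only for this command */
        /* fall through */
    case CMD_READ_REG:
        return 1;                /* read path shared by both cases above */
    case CMD_WRITE_REG:
        return 2;                /* write path */
    }
    return -1;                   /* unknown command */
}
```

The comment changes nothing at run time. It only tells the compiler and static analyzers that the missing `break` is deliberate.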
| | * Merge branch 'SCTP-PMTU-discovery-fixes'  David S. Miller  2018-01-08  3  -23/+36
| | |\
| | | | Marcelo Ricardo Leitner says:
| | | |
| | | | ====================
| | | | SCTP PMTU discovery fixes
| | | |
| | | | This patchset fixes 2 issues with PMTU discovery that can lead to flood
| | | | of retransmissions.
| | | |
| | | | The first patch fixes the issue for when PMTUD is disabled by the
| | | | application, while the second fixes it for when it's enabled.
| | | |
| | | | Please consider these to stable.
| | | | ====================
| | | |
| | | | Acked-by: Neil Horman <nhorman@tuxdriver.com>
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * sctp: fix the handling of ICMP Frag Needed for too small MTUs  Marcelo Ricardo Leitner  2018-01-08  3  -13/+26
| | | | syzbot reported a hang involving SCTP, on which it kept flooding dmesg
| | | | with the message:
| | | | [ 246.742374] sctp: sctp_transport_update_pmtu: Reported pmtu 508 too
| | | | low, using default minimum of 512
| | | |
| | | | That happened because whenever SCTP hits an ICMP Frag Needed, it tries
| | | | to adjust to the new MTU and triggers an immediate retransmission. But
| | | | it didn't consider the fact that MTUs smaller than the SCTP minimum MTU
| | | | allowed (512) would not cause the PMTU to change, and issued the
| | | | retransmission anyway (thus leading to another ICMP Frag Needed, and so
| | | | on).
| | | |
| | | | As IPv4 (ip_rt_min_pmtu=556) and IPv6 (IPV6_MIN_MTU=1280) minimum MTU
| | | | are higher than that, sctp_transport_update_pmtu() is changed to
| | | | re-fetch the PMTU that got set after our request, and with that, detect
| | | | if there was an actual change or not.
| | | |
| | | | The fix, thus, skips the immediate retransmission if the received ICMP
| | | | resulted in no change, in the hope that SCTP will select another path.
| | | |
| | | | Note: The value being used for the minimum MTU (512,
| | | | SCTP_DEFAULT_MINSEGMENT) is not right and instead it should be (576,
| | | | SCTP_MIN_PMTU), but such change belongs to another patch.
| | | |
| | | | Changes from v1:
| | | | - do not disable PMTU discovery, in the light of commit
| | | |   06ad391919b2 ("[SCTP] Don't disable PMTU discovery when mtu is small")
| | | |   and as suggested by Xin Long.
| | | | - changed the way to break the rtx loop by detecting if the icmp
| | | |   resulted in a change or not
| | | | Changes from v2:
| | | | none
| | | |
| | | | See-also: https://lkml.org/lkml/2017/12/22/811
| | | | Reported-by: syzbot <syzkaller@googlegroups.com>
| | | | Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
| | | | Signed-off-by: David S. Miller <davem@davemloft.net>
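The change-detection idea in this message can be illustrated with a simplified userspace model. It is a sketch under stated assumptions, not the kernel implementation: `struct pmtu_state`, `route_update_pmtu()` and `update_pmtu_changed()` are hypothetical names, and the IP layer is assumed to simply clamp the requested value to its own minimum.

```c
#include <stdbool.h>

#define MODEL_SCTP_MIN_SEGMENT 512u   /* the SCTP floor quoted in the message */

/* Hypothetical, simplified view of a transport's cached path MTU. */
struct pmtu_state {
    unsigned int pathmtu;
};

/* Assumed model of the IP layer: it never stores less than its own minimum. */
unsigned int route_update_pmtu(unsigned int requested, unsigned int ip_min_pmtu)
{
    return requested < ip_min_pmtu ? ip_min_pmtu : requested;
}

/* Returns true only when the stored path MTU actually changed, which is
 * the condition for an immediate retransmission in this model. */
bool update_pmtu_changed(struct pmtu_state *t, unsigned int reported,
                         unsigned int ip_min_pmtu)
{
    unsigned int old = t->pathmtu;

    if (reported < MODEL_SCTP_MIN_SEGMENT)
        reported = MODEL_SCTP_MIN_SEGMENT;   /* "too low, using default minimum" */

    /* Re-fetch what was really set instead of trusting the request. */
    t->pathmtu = route_update_pmtu(reported, ip_min_pmtu);

    return t->pathmtu != old;
}
```

In this model, a reported MTU of 508 on a path already sitting at the IP minimum yields no change, so the caller would skip the immediate retransmission and the loop described above cannot form.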
| | | * sctp: do not retransmit upon FragNeeded if PMTU discovery is disabled  Marcelo Ricardo Leitner  2018-01-08  1  -12/+12
| | |/
| | | Currently, if PMTU discovery is disabled on a given transport, but the
| | | configured value is higher than the actual PMTU, it is likely that we
| | | will get some icmp Frag Needed. The issue is, if PMTU discovery is
| | | disabled, we won't update the information and will issue a
| | | retransmission immediately, which may very well trigger another ICMP,
| | | and another retransmission, leading to a loop.
| | |
| | | The fix is to simply not trigger immediate retransmissions if PMTU
| | | discovery is disabled on the given transport.
| | |
| | | Changes from v2:
| | | - updated stale comment, noticed by Xin Long
| | |
| | | Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
| | | Signed-off-by: David S. Miller <davem@davemloft.net>
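The decision this commit describes can be sketched as a small guard. This is a hedged illustration rather than the patched kernel function: `struct pmtud_transport` and `frag_needed_should_retransmit()` are hypothetical names, and the update step is reduced to a plain assignment.

```c
#include <stdbool.h>

/* Hypothetical, simplified transport: a PMTUD switch plus the MTU in use. */
struct pmtud_transport {
    bool pmtud_enabled;
    unsigned int pathmtu;
};

/* Returns true when an immediate retransmission should follow an
 * ICMP Frag Needed that reported the given MTU. */
bool frag_needed_should_retransmit(struct pmtud_transport *t,
                                   unsigned int reported)
{
    if (!t->pmtud_enabled) {
        /* Nothing is updated, so retransmitting now would only provoke
         * the same ICMP again. Leave it to the regular timers. */
        return false;
    }

    t->pathmtu = reported;       /* simplified update of the path MTU */
    return true;
}
```

With discovery disabled, the configured MTU is left untouched and no immediate retransmission is requested, which breaks the ICMP and retransmission loop.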