summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()Eli Cooper2016-10-291-0/+1
| | | | | | | | | | | | | | This patch updates skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit() when an IPv6 header is installed to a socket buffer. This is not a cosmetic change. Without updating this value, GSO packets transmitted through an ipip6 tunnel have the protocol of ETH_P_IP and skb_mac_gso_segment() will attempt to call gso_segment() for IPv4, which results in the packets being dropped. Fixes: b8921ca83eed ("ip4ip6: Support for GSO/GRO") Signed-off-by: Eli Cooper <elicooper@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bpf: fix samples to add fake KBUILD_MODNAMEDaniel Borkmann2016-10-296-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | Some of the sample files are causing issues when they are loaded with tc and cls_bpf, meaning tc bails out while trying to parse the resulting ELF file as program/map/etc sections are not present, which can be easily spotted with readelf(1). Currently, BPF samples are including some of the kernel headers and mid term we should change them to refrain from this, really. When dynamic debugging is enabled, we bail out due to undeclared KBUILD_MODNAME, which is easily overlooked in the build as clang spills this along with other noisy warnings from various header includes, and llc still generates an ELF file with mentioned characteristics. For just playing around with BPF examples, this can be a bit of a hurdle to take. Just add a fake KBUILD_MODNAME as a band-aid to fix the issue, same is done in xdp*_kern samples already. Fixes: 65d472fb007d ("samples/bpf: add 'pointer to packet' tests") Fixes: 6afb1e28b859 ("samples/bpf: Add tunnel set/get tests.") Fixes: a3f74617340b ("cgroup: bpf: Add an example to do cgroup checking in BPF") Reported-by: Chandrasekar Kannan <ckannan@console.to> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* inet: Fix missing return value in inet6_hashCraig Gallek2016-10-291-2/+4
| | | | | | | | | | | | | | | | | As part of a series to implement faster SO_REUSEPORT lookups, commit 086c653f5862 ("sock: struct proto hash function may error") added return values to protocol hash functions and commit 496611d7b5ea ("inet: create IPv6-equivalent inet_hash function") implemented a new hash function for IPv6. However, the latter does not respect the former's convention. This properly propagates the hash errors in the IPv6 case. Fixes: 496611d7b5ea ("inet: create IPv6-equivalent inet_hash function") Reported-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Craig Gallek <kraig@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'mlx5-fixes'David S. Miller2016-10-2914-69/+210
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Saeed Mahameed says: ==================== Mellanox 100G mlx5 fixes 2016-10-25 This series contains some bug fixes for the mlx5 core and mlx5e driver. From Daniel: - Cache line size determination at runtime, instead of using L1_CACHE_BYTES hard coded value, use cache_line_size() - Always Query HCA caps after setting them even on reset flow From Mohamad: - Reorder netdev cleanup to uregister netdev before detaching it for the kernel to not complain about open resources such as vlans - Change the acl enable prototype to return status, for better error resiliency - Clear health sick bit when starting health poll after reset flow - Fix race between PCI error handlers and health work - PCI error recovery health care simulation, in case when the kernel PCI error handlers are not triggered for some internal firmware errors From Noa: - Avoid passing dma address 0 to firmware when mapping system pages to the firmware From Paul: Some straight forward flow steering fixes - Keep autogroups list ordered - Fix autogroups groups num not decreasing - Correctly initialize last use of flow counters From Saeed: - Choose the nearest LRO timeout to the wanted one instead of blindly choosing "dev_cap.lro_timeout[2]" This series has no conflict with the for-next pull request posted earlier today ("Mellanox mlx5 core driver updates 2016-10-25"). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5: Avoid passing dma address 0 to firmwareNoa Osherovich2016-10-291-8/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the firmware can't work with a page with dma address 0. Passing such an address to the firmware will cause the give_pages command to fail. To avoid this, in case we get a 0 dma address of a page from the dma engine, we avoid passing it to FW by remapping to get an address other than 0. Fixes: bf0bf77f6519 ('mlx5: Support communicating arbitrary host...') Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5: PCI error recovery health care simulationMohamad Haj Yahia2016-10-294-9/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case that the kernel PCI error handlers are not called, we will trigger our own recovery flow. The health work will give priority to the kernel pci error handlers to recover the PCI by waiting for a small period, if the pci error handlers are not triggered the manual recovery flow will be executed. We don't save pci state in case of manual recovery because it will ruin the pci configuration space and we will lose dma sync. Fixes: 89d44f0a6c73 ('net/mlx5_core: Add pci error handlers to mlx5_core driver') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5: Fix race between PCI error handlers and health workMohamad Haj Yahia2016-10-293-5/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently there is a race between the health care work and the kernel pci error handlers because both of them detect the error, the first one to be called will do the error handling. There is a chance that health care will disable the pci after resuming pci slot. Also create a separate WQ because now we will have two types of health works, one for the error detection and one for the recovery. Fixes: 89d44f0a6c73 ('net/mlx5_core: Add pci error handlers to mlx5_core driver') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5: Clear health sick bit when starting health pollMohamad Haj Yahia2016-10-291-0/+1
| | | | | | | | | | | | | | | | | | | | | | The health sick status should be cleared when we start the health poll. This is crucial for driver reload (unload + load) in order to behave right in case of health issue. Fixes: fd76ee4da55a ('net/mlx5_core: Fix internal error detection conditions') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5: Change the acl enable prototype to return statusMohamad Haj Yahia2016-10-291-16/+34
| | | | | | | | | | | | | | | | | | | | The Ingress/Egress ACL enable function may fail and it should return status to its caller to avoid NULL pointer dereference. Fixes: f942380c1239 ('net/mlx5: E-Switch, Vport ingress/egress ACLs rules for spoofchk') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5e: Unregister netdev before detaching itMohamad Haj Yahia2016-10-292-1/+2
| | | | | | | | | | | | | | | | | | | | | | Detaching the netdev before unregistering it cause some netdev cleanup ndos to fail because they check presence of the netdev, so we need to unregister the netdev first. Fixes: 26e59d8077a3 ('net/mlx5e: Implement mlx5e interface attach/detach callbacks') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5e: Choose best nearest LRO timeoutSaeed Mahameed2016-10-292-3/+21
| | | | | | | | | | | | | | | | | | | | Instead of predicting the index of the wanted LRO timeout value from hardware capabilities, look for the nearest LRO timeout value. Fixes: 5c50368f3831 ('net/mlx5e: Light-weight netdev open/stop') Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5: Correctly initialize last use of flow countersPaul Blakey2016-10-291-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently, last use timestamp is initialized to zero. This is not the expected value by higher layers such as when we do TC action offloading. To fix that, set it to the current time, e.g when the counter/rule is offloaded. This is the same behaviour of non-offloaded TC actions. Fixes: 43a335e055bb ('mlx5_core: Flow counters infrastructure') Signed-off-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5: Fix autogroups groups num not decreasingPaul Blakey2016-10-291-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | Autogroups groups num is increased when creating a new flow group, but is never decreased. Now decreasing it when deleting a flow group. Fixes: f0d22d187473 ('net/mlx5_core: Introduce flow steering autogrouped flow table') Signed-off-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5: Keep autogroups list orderedPaul Blakey2016-10-291-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Finding a new autogroup range is done by going over a group list sorted by each group start index. The search is stopped after finding the first free range. Adding the newly created group to the list is wrongly added to the end of the list regardless of its start index as the parameter of where to insert it is ignored. This commit makes sure to use that unused parameter to insert it where requested. Fixes: f0d22d187473 ('net/mlx5_core: Introduce flow steering autogrouped flow table') Signed-off-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/mlx5: Always Query HCA caps after setting themDaniel Jurgens2016-10-291-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | Always query the HCA caps after setting them to update the capablities data structures. Not doing so results in incorrect capabilities being reported including max_dc, max_qp and several others. Fixes: 59211bd3b632 ("net/mlx5: Split the load/unload flow into hardware and software flows") Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * {net, ib}/mlx5: Make cache line size determination at runtime.Daniel Jurgens2016-10-294-18/+27
|/ | | | | | | | | | | | ARM 64B cache line systems have L1_CACHE_BYTES set to 128. cache_line_size() will return the correct size. Fixes: cf50b5efa2fe('net/mlx5_core/ib: New device capabilities handling.') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: validate chunk len before actually using itMarcelo Ricardo Leitner2016-10-291-6/+6
| | | | | | | | | | | | | | | | | | Andrey Konovalov reported that KASAN detected that SCTP was using a slab beyond the boundaries. It was caused because when handling out of the blue packets in function sctp_sf_ootb() it was checking the chunk len only after already processing the first chunk, validating only for the 2nd and subsequent ones. The fix is to just move the check upwards so it's also validated for the 1st chunk. Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'mlxsw-fixes'David S. Miller2016-10-281-1/+4
|\ | | | | | | | | | | | | | | | | | | | | | | Jiri Pirko says: ==================== mlxsw: Couple of fixes Couple of LPM tree management fixes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_router: Compare only trees which are in use during tree getJiri Pirko2016-10-281-1/+2
| | | | | | | | | | | | | | | | | | Only trees which are in use should be compared to requested prefix usage. Fixes: 53342023eed9 ("mlxsw: spectrum_router: Implement LPM trees management") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * mlxsw: spectrum_router: Save requested prefix bitlist when creating treeJiri Pirko2016-10-281-0/+2
|/ | | | | | | | | | | | Currently, the prefix bitlist is not saved for LPM trees, causing the compare to always fail which causes the tree to be destroyed and created for every inserted and removed FIB entry. So fix this by saving the bitlist as it should have been done from the very beginning. Fixes: 53342023eed9 ("mlxsw: spectrum_router: Implement LPM trees management") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net sched filters: fix notification of filter delete with proper handleJamal Hadi Salim2016-10-271-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel says: While trying out [1][2], I noticed that tc monitor doesn't show the correct handle on delete: $ tc monitor qdisc clsact ffff: dev eno1 parent ffff:fff1 filter dev eno1 ingress protocol all pref 49152 bpf handle 0x2a [...] deleted filter dev eno1 ingress protocol all pref 49152 bpf handle 0xf3be0c80 some context to explain the above: The user identity of any tc filter is represented by a 32-bit identifier encoded in tcm->tcm_handle. Example 0x2a in the bpf filter above. A user wishing to delete, get or even modify a specific filter uses this handle to reference it. Every classifier is free to provide its own semantics for the 32 bit handle. Example: classifiers like u32 use schemes like 800:1:801 to describe the semantics of their filters represented as hash table, bucket and node ids etc. Classifiers also have internal per-filter representation which is different from this externally visible identity. Most classifiers set this internal representation to be a pointer address (which allows fast retrieval of said filters in their implementations). This internal representation is referenced with the "fh" variable in the kernel control code. When a user successfuly deletes a specific filter, by specifying the correct tcm->tcm_handle, an event is generated to user space which indicates which specific filter was deleted. Before this patch, the "fh" value was sent to user space as the identity. As an example what is shown in the sample bpf filter delete event above is 0xf3be0c80. This is infact a 32-bit truncation of 0xffff8807f3be0c80 which happens to be a 64-bit memory address of the internal filter representation (address of the corresponding filter's struct cls_bpf_prog); After this patch the appropriate user identifiable handle as encoded in the originating request tcm->tcm_handle is generated in the event. One of the cardinal rules of netlink rules is to be able to take an event (such as a delete in this case) and reflect it back to the kernel and successfully delete the filter. This patch achieves that. Note, this issue has existed since the original TC action infrastructure code patch back in 2004 as found in: https://git.kernel.org/cgit/linux/kernel/git/history/history.git/commit/ [1] http://patchwork.ozlabs.org/patch/682828/ [2] http://patchwork.ozlabs.org/patch/682829/ Fixes: 4e54c4816bfe ("[NET]: Add tc extensions infrastructure.") Reported-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: bgmac: fix spelling mistake: "connecton" -> "connection"Colin Ian King2016-10-271-1/+1
| | | | | | | | trivial fix to spelling mistake in dev_err message Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Jon Mason <jon.mason@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* flow_dissector: fix vlan tag handlingArnd Bergmann2016-10-271-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | gcc warns about an uninitialized pointer dereference in the vlan priority handling: net/core/flow_dissector.c: In function '__skb_flow_dissect': net/core/flow_dissector.c:281:61: error: 'vlan' may be used uninitialized in this function [-Werror=maybe-uninitialized] As pointed out by Jiri Pirko, the variable is never actually used without being initialized first as the only way it end up uninitialized is with skb_vlan_tag_present(skb)==true, and that means it does not get accessed. However, the warning hints at some related issues that I'm addressing here: - the second check for the vlan tag is different from the first one that tests the skb for being NULL first, causing both the warning and a possible NULL pointer dereference that was not entirely fixed. - The same patch that introduced the NULL pointer check dropped an earlier optimization that skipped the repeated check of the protocol type - The local '_vlan' variable is referenced through the 'vlan' pointer but the variable has gone out of scope by the time that it is accessed, causing undefined behavior Caching the result of the 'skb && skb_vlan_tag_present(skb)' check in a local variable allows the compiler to further optimize the later check. With those changes, the warning also disappears. Fixes: 3805a938a6c2 ("flow_dissector: Check skb for VLAN only if skb specified.") Fixes: d5709f7ab776 ("flow_dissector: For stripped vlan, get vlan info from skb->vlan_tci") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Eric Garver <e@erig.me> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ipv6: Do not consider link state for nexthop validationDavid Ahern2016-10-272-2/+5
| | | | | | | | | | | | | | Similar to IPv4, do not consider link state when validating next hops. Currently, if the link is down default routes can fail to insert: $ ip -6 ro add vrf blue default via 2100:2::64 dev eth2 RTNETLINK answers: No route to host With this patch the command succeeds. Fixes: 8c14586fc320 ("net: ipv6: Use passed in table for nexthop lookups") Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ipv6: Fix processing of RAs in presence of VRFDavid Ahern2016-10-272-20/+50
| | | | | | | | | | | | | | | | | | | | | rt6_add_route_info and rt6_add_dflt_router were updated to pull the FIB table from the device index, but the corresponding rt6_get_route_info and rt6_get_dflt_router functions were not leading to the failure to process RA's: ICMPv6: RA: ndisc_router_discovery failed to add default route Fix the 'get' functions by using the table id associated with the device when applicable. Also, now that default routes can be added to tables other than the default table, rt6_purge_dflt_routers needs to be updated as well to look at all tables. To handle that efficiently, add a flag to the table denoting if it is has a default route via RA. Fixes: ca254490c8dfd ("net: Add VRF support to IPv6 stack") Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* kalmia: avoid potential uninitialized variable useArnd Bergmann2016-10-271-1/+1
| | | | | | | | | | | | | | | | | | | | | The kalmia_send_init_packet() returns zero or a negative return code, but gcc has no way of knowing that there cannot be a positive return code, so it determines that copying the ethernet address at the end of kalmia_bind() will access uninitialized data: drivers/net/usb/kalmia.c: In function ‘kalmia_bind’: arch/x86/include/asm/string_32.h:78:22: error: ‘*((void *)&ethernet_addr+4)’ may be used uninitialized in this function [-Werror=maybe-uninitialized] *((short *)to + 2) = *((short *)from + 2); ^ drivers/net/usb/kalmia.c:138:5: note: ‘*((void *)&ethernet_addr+4)’ was declared here This warning is harmless, but for consistency, we should make the check for the return code match what the driver does everywhere else and just progate it, which then gets rid of the warning. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* macsec: Fix header length if SCI is added if explicitly disabledTobias Brunner2016-10-271-8/+18
| | | | | | | | | | | | | | | | Even if sending SCIs is explicitly disabled, the code that creates the Security Tag might still decide to add it (e.g. if multiple RX SCs are defined on the MACsec interface). But because the header length so far only depended on the configuration option the SCI overwrote the original frame's contents (EtherType and e.g. the beginning of the IP header) and if encrypted did not visibly end up in the packet, while the SC flag in the TCI field of the Security Tag was still set, resulting in invalid MACsec frames. Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: Tobias Brunner <tobias@strongswan.org> Acked-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* at803x: double check SGMII side autonegZefir Kurtisi2016-10-271-0/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | In SGMII mode, we observed an autonegotiation issue after power-down-up cycles where the copper side reports successful link establishment but the SGMII side's link is down. This happened in a setup where the at8031 is connected over SGMII to a eTSEC (fsl gianfar), but so far could not be reproduced with other Ethernet device / driver combinations. This commit adds a wrapper function for at8031 that in case of operating in SGMII mode double checks SGMII link state when generic aneg_done() succeeds. It prints a warning on failure but intentionally does not try to recover from this state. As a result, if you ever see a warning '803x_aneg_done: SGMII link is not ok' you will end up having an Ethernet link up but won't get any data through. This should not happen, if it does, please contact the module maintainer. Signed-off-by: Zefir Kurtisi <zefir.kurtisi@neratec.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Revert "at803x: fix suspend/resume for SGMII link"Zefir Kurtisi2016-10-271-26/+0
| | | | | | | | | | | This reverts commit 98267311fe3b334ae7c107fa0e2413adcf3ba735. Suspending the SGMII alongside the copper side made the at803x inaccessable while powered down, e.g. it can't be re-probed after suspend. Signed-off-by: Zefir Kurtisi <zefir.kurtisi@neratec.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* MAINTAINERS: Update qlogic networking driversMintz, Yuval2016-10-271-19/+22
| | | | | | | | | | | | | Following Cavium's acquisition of qlogic we need to update all the qlogic drivers maintainer's entries to point to our new e-mail addresses, as well as update some of the driver's maintainers as those are no longer working for Cavium. I would like to thank Sony Chacko and Rajesh Borundia for their support and development of our various networking drivers. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* netvsc: fix incorrect receive checksum offloadingStephen Hemminger2016-10-271-8/+11
| | | | | | | | | | | | | | | | | The Hyper-V netvsc driver was looking at the incorrect status bits in the checksum info. It was setting the receive checksum unnecessary flag based on the IP header checksum being correct. The checksum flag is skb is about TCP and UDP checksum status. Because of this bug, any packet received with bad TCP checksum would be passed up the stack and to the application causing data corruption. The problem is reproducible via netcat and netem. This had a side effect of not doing receive checksum offload on IPv6. The driver was also also always doing checksum offload independent of the checksum setting done via ethtool. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* udp: fix IP_CHECKSUM handlingEric Dumazet2016-10-264-9/+11
| | | | | | | | | | | | | | | | | | | First bug was added in commit ad6f939ab193 ("ip: Add offset parameter to ip_cmsg_recv") : Tom missed that ipv4 udp messages could be received on AF_INET6 socket. ip_cmsg_recv(msg, skb) should have been replaced by ip_cmsg_recv_offset(msg, skb, sizeof(struct udphdr)); Then commit e6afc8ace6dd ("udp: remove headers from UDP packets before queueing") forgot to adjust the offsets now UDP headers are pulled before skb are put in receive queue. Fixes: ad6f939ab193 ("ip: Add offset parameter to ip_cmsg_recv") Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Sam Kumar <samanthakumar@google.com> Cc: Willem de Bruijn <willemb@google.com> Tested-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: fix the panic caused by route updateXin Long2016-10-261-1/+7
| | | | | | | | | | | | | | | | | | | | | | Commit 7303a1475008 ("sctp: identify chunks that need to be fragmented at IP level") made the chunk be fragmented at IP level in the next round if it's size exceed PMTU. But there still is another case, PMTU can be updated if transport's dst expires and transport's pmtu_pending is set in sctp_packet_transmit. If the new PMTU is less than the chunk, the same issue with that commit can be triggered. So we should drop this packet and let it retransmit in another round where it would be fragmented at IP level. This patch is to fix it by checking the chunk size after PMTU may be updated and dropping this packet if it's size exceed PMTU. Fixes: 90017accff61 ("sctp: Add GSO support") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@txudriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* doc: update docbook annotations for socket and skbStephen Hemminger2016-10-262-1/+4
| | | | | | | | The skbuff and sock structure both had missing parameter annotation values. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* rocker: fix error return code in rocker_world_check_init()Wei Yongjun2016-10-261-1/+1
| | | | | | | | | | Fix to return error code -EINVAL from the error handling case instead of 0, as done elsewhere in this function. Fixes: e420114eef4a ("rocker: introduce worlds infrastructure") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'for-upstream' of ↵David S. Miller2016-10-235-36/+51
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Johan Hedberg says: ==================== pull request: bluetooth 2016-10-21 Here are some more Bluetooth fixes for the 4.9 kernel: - Fix to btwilink driver probe function return value - Power management fix to hci_bcm - Fix to encoding name in scan response data Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * Bluetooth: btwilink: Fix probe return valueJacob Siverskog2016-10-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Probe functions should return 0 on success. This driver's probe returns the value returned by hci_register_dev(), which is the hci index. This works for systems with only one hci device (id = 0) but for systems where the btwilink device ends up with an id larger than 0, things will start to fall apart. Make the probe function return 0 on success. Signed-off-by: Jacob Siverskog <jacob@teenage.engineering> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| * Bluetooth: Fix append max 11 bytes of name to scan rsp dataMichał Narajowski2016-10-193-35/+42
| | | | | | | | | | | | | | | | | | | | | | Append maximum of 10 + 1 bytes of name to scan response data. Complete name is appended only if exists and is <= 10 characters. Else append short name if exists or shorten complete name if not. This makes sure name is consistent across multiple advertising instances. Signed-off-by: Michał Narajowski <michal.narajowski@codecoup.pl> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| * Bluetooth: hci_bcm: Fix autosuspend PM for Lenovo ThinkPad 8Jérôme de Bretagne2016-10-131-0/+8
| | | | | | | | | | | | | | | | | | ACPI table for BCM2E55 of Lenovo ThinkPad 8 is not correct. Set correctly IRQ polarity for this device, fixing the issue of bluetooth never resuming after autosuspend PM. Signed-off-by: Jérôme de Bretagne <jerome.debretagne@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
* | net: sctp, forbid negative lengthJiri Slaby2016-10-231-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most of getsockopt handlers in net/sctp/socket.c check len against sizeof some structure like: if (len < sizeof(int)) return -EINVAL; On the first look, the check seems to be correct. But since len is int and sizeof returns size_t, int gets promoted to unsigned size_t too. So the test returns false for negative lengths. Yes, (-1 < sizeof(long)) is false. Fix this in sctp by explicitly checking len < 0 before any getsockopt handler is called. Note that sctp_getsockopt_events already handled the negative case. Since we added the < 0 check elsewhere, this one can be removed. If not checked, this is the result: UBSAN: Undefined behaviour in ../mm/page_alloc.c:2722:19 shift exponent 52 is too large for 32-bit type 'int' CPU: 1 PID: 24535 Comm: syz-executor Not tainted 4.8.1-0-syzkaller #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014 0000000000000000 ffff88006d99f2a8 ffffffffb2f7bdea 0000000041b58ab3 ffffffffb4363c14 ffffffffb2f7bcde ffff88006d99f2d0 ffff88006d99f270 0000000000000000 0000000000000000 0000000000000034 ffffffffb5096422 Call Trace: [<ffffffffb3051498>] ? __ubsan_handle_shift_out_of_bounds+0x29c/0x300 ... [<ffffffffb273f0e4>] ? kmalloc_order+0x24/0x90 [<ffffffffb27416a4>] ? kmalloc_order_trace+0x24/0x220 [<ffffffffb2819a30>] ? __kmalloc+0x330/0x540 [<ffffffffc18c25f4>] ? sctp_getsockopt_local_addrs+0x174/0xca0 [sctp] [<ffffffffc18d2bcd>] ? sctp_getsockopt+0x10d/0x1b0 [sctp] [<ffffffffb37c1219>] ? sock_common_getsockopt+0xb9/0x150 [<ffffffffb37be2f5>] ? SyS_getsockopt+0x1a5/0x270 Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Vlad Yasevich <vyasevich@gmail.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-sctp@vger.kernel.org Cc: netdev@vger.kernel.org Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: fec: Call swap_buffer() prior to IP header alignmentFabio Estevam2016-10-231-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 3ac72b7b63d5 ("net: fec: align IP header in hardware") breaks networking on mx28. There is an erratum on mx28 (ENGR121613 - ENET big endian mode not compatible with ARM little endian) that requires an additional byte-swap operation to workaround this problem. So call swap_buffer() prior to performing the IP header alignment to restore network functionality on mx28. Fixes: 3ac72b7b63d5 ("net: fec: align IP header in hardware") Reported-and-tested-by: Henri Roosen <henri.roosen@ginzinger.com> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ipv6: do not increment mac header when it's unsetJason A. Donenfeld2016-10-231-1/+2
| | | | | | | | | | | | | | | | Otherwise we'll overflow the integer. This occurs when layer 3 tunneled packets are handed off to the IPv6 layer. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bnx2x: Use the correct divisor value for PHC clock readings.Sudarsana Reddy Kalluru2016-10-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Time Sync (PTP) implementation uses the divisor/shift value for converting the clock ticks to nanoseconds. Driver currently defines shift value as 1, this results in the nanoseconds value to be calculated as half the actual value. Hence the user application fails to synchronize the device clock value with the PTP master device clock. Need to use the 'shift' value of 0. Signed-off-by: Sony.Chacko <Sony.Chacko@cavium.com> Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'qed-fixes'David S. Miller2016-10-226-42/+75
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sudarsana Reddy Kalluru says: ==================== qed*: fix series. The patch series contains several minor bug fixes for qed/qede modules. Please consider applying this to 'net' branch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | qede: Fix incorrrect usage of APIs for un-mapping DMA memoryManish Chopra2016-10-222-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Driver uses incorrect APIs to unmap DMA memory which were mapped using dma_map_single(). This patch fixes it to use appropriate APIs for un-mapping DMA memory. Signed-off-by: Manish Chopra <manish.chopra@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | qed: Zero-out the buffer paased to dcbx_query() APISudarsana Reddy Kalluru2016-10-221-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | qed_dcbx_query_params() implementation populate the values to input buffer based on the dcbx mode and, the current negotiated state/params, the caller of this API need to memset the buffer to zero. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | qede: Reconfigure rss indirection direction table when rss count is updatedSudarsana Reddy Kalluru2016-10-221-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rx indirection table entries are in the range [0, (rss_count - 1)]. If user reduces the rss count, the table entries may not be in the ccorrect range. Need to reconfigure the table with new rss_count as a basis. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | qed*: Reduce the memory footprint for Rx pathSudarsana Reddy Kalluru2016-10-223-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the current default values for Rx path i.e., 8 queues of 8Kb entries each with 4Kb size, interface will consume 256Mb for Rx. The default values causing the driver probe to fail when the system memory is low. Based on the perforamnce results, rx-ring count value of 1Kb gives the comparable performance with Rx coalesce timeout of 12 seconds. Updating the default values. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | qede: Loopback implementation should ignore the normal trafficSudarsana Reddy Kalluru2016-10-223-35/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During the execution of loopback test, driver may receive the packets which are not originated by this test, loopback implementation need to skip those packets. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | qede: Do not allow RSS config for 100G devicesSudarsana Reddy Kalluru2016-10-221-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | RSS configuration is not supported for 100G adapters. Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>