summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* tcp/dccp: fix behavior of stale SYN_RECV request socketsEric Dumazet2015-10-144-18/+26
| | | | | | | | | | | | | | | | | | When a TCP/DCCP listener is closed, its pending SYN_RECV request sockets become stale, meaning 3WHS can not complete. But current behavior is wrong : incoming packets finding such stale sockets are dropped. We need instead to cleanup the request socket and perform another lookup : - Incoming ACK will give a RST answer, - SYN rtx might find another listener if available. - We expedite cleanup of request sockets and old listener socket. Fixes: 079096f103fa ("tcp/dccp: install syn_recv requests into ehash table") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'bridge-vlan'David S. Miller2015-10-137-71/+135
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Nikolay Aleksandrov says: ==================== bridge: vlan: cleanups & fixes (part 3) Patch 01 converts the vlgrp member to use rcu as it was already used in a similar way so better to make it official and use all the available RCU instrumentation. Patch 02 fixes a bug where the vlan_list can be traversed without rtnl or rcu held which could lead to using freed entries. Patch 03 removes some redundant code that isn't needed anymore. Patch 04 fixes a bug reported by Ido Schimmel about the vlan_flush order and switchdevs, it moves it back. v2: patch 03 and 04 are new, couldn't escape the second synchronize_rcu() since the rhtable destruction can sleep ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * bridge: vlan: move back vlan_flushNikolay Aleksandrov2015-10-133-10/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Ido Schimmel reported a problem with switchdev devices because of the order change of del_nbp operations, more specifically the move of nbp_vlan_flush() which deletes all vlans and frees vlgrp after the rx_handler has been unregistered. So in order to fix this move vlan_flush back where it was and make it destroy the rhtable after NULLing vlgrp and waiting a grace period to make sure noone can see it. Reported-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bridge: vlan: drop unnecessary flush codeNikolay Aleksandrov2015-10-131-8/+1
| | | | | | | | | | | | | | | | | | | | As Ido Schimmel pointed out the vlan_vid_del() code in nbp_vlan_flush is unnecessary (and is actually a remnant of the old vlan code) so we can remove it. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bridge: vlan: use rcu for vlan_list traversal in br_fill_ifinfoNikolay Aleksandrov2015-10-131-8/+13
| | | | | | | | | | | | | | | | | | | | | | br_fill_ifinfo is called by br_ifinfo_notify which can be called from many contexts with different locks held, sometimes it relies upon bridge's spinlock only which is a problem for the vlan code, so use explicitly rcu for that to avoid problems. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bridge: vlan: use proper rcu for the vlgrp memberNikolay Aleksandrov2015-10-136-53/+104
|/ | | | | | | | | | | | | The bridge and port's vlgrp member is already used in RCU way, currently we rely on the fact that it cannot disappear while the port exists but that is error-prone and we might miss places with improper locking (either RCU or RTNL must be held to walk the vlan_list). So make it official and use RCU for vlgrp to catch offenders. Introduce proper vlgrp accessors and use them consistently throughout the code. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'vrf-ipv6'David S. Miller2015-10-139-16/+386
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | David Ahern says: ==================== net: VRF support in IPv6 stack Initial support for VRF in IPv6 stack. Makes IPv6 functionality on par with IPv4 -- ping, tcp client/server and udp client/server all work fine. tcpdump on vrf device and external tap (e.g., host side tap device) shows all packets with proper addresses. IPv6 does not need the source address operation like IPv4. Verified vti6 works properly in my setup as does use of an IPv6 address on the VRF device. v3 - re-based to top of net-next (updates per net namespace changes by Eric) - fixed dst_entry typecasts as requested by Dave - added flags to inet6_rtm_getroute (IPv6 version of deaa0a6a930e) v2 - fixed CONFIG_IPV6 dependency as questioned by Cong - if IPV6 is a module, kbuild ensures VRF is a module - if IPV6 is disabled IPV6 functionality is compiled out of VRF module - addressed comments from Nik over IRC - removed duplicate call to netif_is_l3_master in l3mdev_rt6_dst_by_oif - changed allocation flag from GFP_ATOMIC to GFP_KERNEL since it is init time - added free of rt6i_pcpu - check_ipv6_frame returns false only if packet is NDISC type ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Add VRF support to IPv6 stackDavid Ahern2015-10-135-14/+63
| | | | | | | | | | | | | | | | | | | | As with IPv4 support for VRFs added to IPv6 stack by replacing hardcoded table ids with possibly device specific ones and manipulating the oif in the flowi6. The flow flags are used to skip oif compare in nexthop lookups if the device is enslaved to a VRF via the L3 master device. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Add IPv6 support to VRF deviceDavid Ahern2015-10-132-2/+275
| | | | | | | | | | | | | | | | Add support for IPv6 to VRF device driver. Implemenation parallels what has been done for IPv4. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Export fib6_get_table and nd_tblDavid Ahern2015-10-132-0/+2
| | | | | | | | | | Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Add IPv6 support to l3mdevDavid Ahern2015-10-131-0/+46
|/ | | | | | | | Add operations to retrieve cached IPv6 dst entry from l3mdev device and lookup IPv6 source address. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: fix gc_timer mod/del race conditionNikolay Aleksandrov2015-10-131-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit c62987bbd8a1 ("bridge: push bridge setting ageing_time down to switchdev") introduced a timer race condition because the gc_timer can get rearmed after it's supposedly stopped and flushed in br_dev_delete() leading to a use of freed memory. So take rtnl to sync with bridge destruction when setting ageing_timer. Here's the trace reproduced with these two commands running in parallel: while :; do echo 10000 > /sys/class/net/br0/bridge/ageing_timer; done; while :; do brctl addbr br0; ip l set br0 up; ip l set br0 down; brctl delbr br0; done; [ 300.000029] BUG: unable to handle kernel paging request at ffffffff811c59d3 [ 300.000263] IP: [<ffffffff810f168e>] __internal_add_timer+0x2e/0xd0 [ 300.000422] PGD 1a0f067 PUD 1a10063 PMD 10001e1 [ 300.000639] Oops: 0003 [#1] SMP [ 300.000793] Modules linked in: bridge stp llc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ppdev aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd snd_hda_codec_generic qxl drm_kms_helper psmouse pcspkr ttm snd_hda_intel 9pnet_virtio evdev serio_raw joydev snd_hda_codec 9pnet virtio_balloon drm snd_hwdep virtio_console snd_hda_core pvpanic snd_pcm i2c_piix4 snd_timer acpi_cpufreq parport_pc snd parport soundcore button processor i2c_core ipv6 autofs4 hid_generic usbhid hid ext4 crc16 mbcache jbd2 sg sr_mod cdrom ata_generic virtio_blk virtio_net e1000 ehci_pci uhci_hcd ehci_hcd usbcore usb_common floppy ata_piix libata virtio_pci virtio_ring virtio scsi_mod [ 300.004008] CPU: 1 PID: 1169 Comm: bash Not tainted 4.3.0-rc3+ #46 [ 300.004008] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 300.004008] task: ffff880035be2200 ti: ffff88003795c000 task.ti: ffff88003795c000 [ 300.004008] RIP: 0010:[<ffffffff810f168e>] [<ffffffff810f168e>] __internal_add_timer+0x2e/0xd0 [ 300.004008] RSP: 0018:ffff88003fd03e78 EFLAGS: 00010046 [ 300.004008] RAX: ffff88003fd0ef60 RBX: 840fc78949c08548 RCX: 00000001ffffffff [ 300.004008] RDX: 0000000000000000 RSI: ffffffff811c59d3 RDI: ffff88003fd0df00 [ 300.004008] RBP: ffff88003fd03e78 R08: 00000000ffffffff R09: 0000000000000000 [ 300.004008] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88003fd0df00 [ 300.004008] R13: 0000000000000000 R14: 0000000000000001 R15: ffffffff816032e0 [ 300.004008] FS: 00007fcbdd609700(0000) GS:ffff88003fd00000(0000) knlGS:0000000000000000 [ 300.004008] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 300.004008] CR2: ffffffff811c59d3 CR3: 0000000037879000 CR4: 00000000000406e0 [ 300.004008] Stack: [ 300.004008] ffff88003fd03ea8 ffffffff810f1775 ffff88003c8cb958 ffff88003fd0df00 [ 300.004008] 0000000000000000 0000000000000001 ffff88003fd03f18 ffffffff810f28c4 [ 300.004008] ffff88003fd0eb68 ffff88003fd0e968 ffff88003fd0e768 ffff88003fd0df68 [ 300.004008] Call Trace: [ 300.004008] <IRQ> [ 300.004008] [<ffffffff810f1775>] cascade+0x45/0x70 [ 300.004008] [<ffffffff810f28c4>] run_timer_softirq+0x2f4/0x340 [ 300.004008] [<ffffffff8107e380>] __do_softirq+0xd0/0x440 [ 300.004008] [<ffffffff8107e8a3>] irq_exit+0xb3/0xc0 [ 300.004008] [<ffffffff815c2032>] smp_apic_timer_interrupt+0x42/0x50 [ 300.004008] [<ffffffff815bfe37>] apic_timer_interrupt+0x87/0x90 [ 300.004008] <EOI> [ 300.004008] [<ffffffff811fb80c>] ? create_object+0x13c/0x2e0 [ 300.004008] [<ffffffff8109b23e>] ? __kernel_text_address+0x4e/0x70 [ 300.004008] [<ffffffff8109b23e>] ? __kernel_text_address+0x4e/0x70 [ 300.004008] [<ffffffff8101e17f>] print_context_stack+0x7f/0xf0 [ 300.004008] [<ffffffff8101d55b>] dump_trace+0x11b/0x300 [ 300.004008] [<ffffffff8102970b>] save_stack_trace+0x2b/0x50 [ 300.004008] [<ffffffff811fb80c>] create_object+0x13c/0x2e0 [ 300.004008] [<ffffffff815b2e8e>] kmemleak_alloc+0x4e/0xb0 [ 300.004008] [<ffffffff811e475d>] kmem_cache_alloc_trace+0x18d/0x2f0 [ 300.004008] [<ffffffff8128b139>] kernfs_fop_open+0xc9/0x380 [ 300.004008] [<ffffffff8120214f>] do_dentry_open+0x1ff/0x2f0 [ 300.004008] [<ffffffff8128b070>] ? kernfs_fop_release+0x70/0x70 [ 300.004008] [<ffffffff812034f9>] vfs_open+0x59/0x60 [ 300.004008] [<ffffffff812130de>] path_openat+0x1ce/0x1260 [ 300.004008] [<ffffffff812154ae>] do_filp_open+0x7e/0xe0 [ 300.004008] [<ffffffff812251ff>] ? __alloc_fd+0xaf/0x180 [ 300.004008] [<ffffffff8120387b>] do_sys_open+0x12b/0x210 [ 300.004008] [<ffffffff8120397e>] SyS_open+0x1e/0x20 [ 300.004008] [<ffffffff815bf0b6>] entry_SYSCALL_64_fastpath+0x16/0x7a [ 300.004008] Code: 66 90 48 8b 46 10 48 8b 4f 40 55 48 89 c2 48 89 e5 48 29 ca 48 81 fa ff 00 00 00 77 20 0f b6 c0 48 8d 44 c7 68 48 8b 10 48 85 d2 <48> 89 16 74 04 48 89 72 08 48 89 30 48 89 46 08 5d c3 48 81 fa [ 300.004008] RIP [<ffffffff810f168e>] __internal_add_timer+0x2e/0xd0 [ 300.004008] RSP <ffff88003fd03e78> [ 300.004008] CR2: ffffffff811c59d3 Fixes: c62987bbd8a1 ("bridge: push bridge setting ageing_time down to switchdev") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* switchdev: enforce no pvid flag in vlan rangesNikolay Aleksandrov2015-10-131-0/+3
| | | | | | | | | We shouldn't allow BRIDGE_VLAN_INFO_PVID flag in VLAN ranges. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Elad Raz <eladr@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'dsa-mv88e6xxx-fix-hardware-bridging'David S. Miller2015-10-135-203/+26
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Vivien Didelot says: ==================== net: dsa: mv88e6xxx: fix hardware bridging DSA and its drivers currently hook the NETDEV_CHANGEUPPER net_device event in order to configure the VLAN map of every port. This VLAN map is a feature of these switch chips to hardcode and restrict which output ports a given input port can egress frames to. A Linux bridge is a simple untagged VLAN propagated by the bridge code itself. With a proper 802.1Q support, a driver does not need this hook anymore, and will simply program the related VLAN object. This patchset improves the hardware bridging code in the mv88e6xxx driver with a strict 802.1Q mode. Ideally, the equivalent must be done for Broadcom Starfighter 2 and Rocker, before completely getting rid of this hook. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: dsa: mv88e6xxx: fix hardware bridgingVivien Didelot2015-10-134-50/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Playing with the VLAN map of every port to implement "hardware bridging" in the 88E6352 driver was a hack until full 802.1Q was supported. Indeed with 802.1Q port mode "Disabled" or "Fallback", this feature is used to restrict which output ports an input port can egress frames to. A Linux bridge is an untagged VLAN. With full 802.1Q support, we don't need this hack anymore and can use the "Secure" strict 802.1Q port mode. With this mode, the port-based VLAN map still needs to be configured, but all the logic is VTU-centric. This means that the switch only cares about rules described in its hardware VLAN table, which is exactly what Linux bridge expects and what we want. Note also that the hardware bridging was broken with the previous flexible "Fallback" 802.1Q port mode. Here's an example: Port0 and Port1 belong to the same bridge. If Port0 sends crafted tagged frames with VID 200 to Port1, Port1 receives it. Even if Port1 is in hardware VLAN 200, but not Port0, Port1 will still receive it, because Fallback mode doesn't care about invalid VID or non-member source port. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: dsa: do not warn unsupported bridge opsVivien Didelot2015-10-131-1/+1
| | | | | | | | | | | | | | | | A DSA driver may not provide the port_join_bridge and port_leave_bridge functions, so don't warn in such case. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: dsa: mv88e6xxx: do not support per-port FIDVivien Didelot2015-10-132-63/+11
| | | | | | | | | | | | | | | | | | | | | | Since we configure a switch chip through a Linux bridge, and a bridge is implemented as a VLAN, there is no need for per-port FID anymore. This patch gets rid of this and simplifies the driver code since we can now directly map all 4095 FIDs available to all VLANs. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: dsa: mv88e6xxx: bridges do not need an FIDVivien Didelot2015-10-132-110/+32
|/ | | | | | | | | | | | | | | | | With 88E6352 and similar switch chips, each port has a map to restrict which output port this input port can egress frames to. The current driver code implements hardware bridging using this feature, and assigns to a bridge group the FID of its first member. Now that 802.1Q is fully implemented in this driver, a Linux bridge which is a simple untagged VLAN, already gets its own FID. This patch gets rid of the per-bridge FID and explicits the usage of the port based VLAN map feature. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS-TCP: Reset tcp callbacks if re-using an outgoing socket in ↵Sowmini Varadhan2015-10-131-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rds_tcp_accept_one() Consider the following "duelling syn" sequence between two peers A and B: A B SYN1 --> <-- SYN2 SYN2ACK --> Note that the SYN/ACK has already been sent out by TCP before rds_tcp_accept_one() gets invoked as part of callbacks. If the inet_addr(A) is numerically less than inet_addr(B), the arbitration scheme in rds_tcp_accept_one() will prefer the TCP connection triggered by SYN1, and will send a CLOSE for the SYN2 (just after the SYN2ACK was sent). Since B also follows the same arbitration scheme, it will send the SYN-ACK for SYN1 that will set up a healthy ESTABLISHED connection on both sides. B will also get a CLOSE for SYN2, which should result in the cleanup of the TCP state machine for SYN2, but it should not trigger any stale RDS-TCP callbacks (such as ->writespace, ->state_change etc), that would disrupt the progress of the SYN2 based RDS-TCP connection. Thus the arbitration scheme in rds_tcp_accept_one() should restore rds_tcp callbacks for the winner before setting them up for the new accept socket, and also make sure that conn->c_outgoing is set to 0 so that we do not trigger any reconnect attempts on the passive side of the tcp socket in the future, in conformance with commit c82ac7e69efe ("net/rds: RDS-TCP: only initiate reconnect attempt on outgoing TCP socket.") Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* RDS: Invoke ->laddr_check() in rds_bind() for explicitly bound transports.Sowmini Varadhan2015-10-131-1/+8
| | | | | | | | | | | | | The IP address passed to rds_bind() should be vetted by the transport's ->laddr_check() for a previously bound transport. This needs to be done to avoid cases where, for example, the application has asked for an IB transport, but the IP address passed to bind is only usable on ethernet interfaces. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* qlcnic: constify qlcnic_mbx_ops structureJulia Lawall2015-10-132-3/+3
| | | | | | | | | | | | | | | | The only instance of a qlcnic_mbx_ops structure is never modified. Thus the declaration of the structure and all references to the structure type can be made const. In the definition of the qlcnic_mailbox structure, the ops field is no longer lined up with the other fields. This was left as is, to avoid a lot of trivial changes on the other lines. Done with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Acked-by: Sony Chacko <sony.chacko@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* bridge: vlan: enforce no pvid flag in vlan rangesNikolay Aleksandrov2015-10-131-0/+3
| | | | | | | | | | | | | Currently it's possible for someone to send a vlan range to the kernel with the pvid flag set which will result in the pvid bouncing from a vlan to vlan and isn't correct, it also introduces problems for hardware where it doesn't make sense having more than 1 pvid. iproute2 already enforces this, so let's enforce it on kernel-side as well. Reported-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* atm: iphase: fix misleading indentionTillmann Heidsieck2015-10-131-1/+1
| | | | | | | | | | | Fix a smatch warning: drivers/atm/iphase.c:1178 rx_pkt() warn: curly braces intended? The code is correct, the indention is misleading. In case the allocation of skb fails, we want to skip to the end. Signed-off-by: Tillmann Heidsieck <theidsieck@leenox.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* atm: iphase: return -ENOMEM instead of -1 in case of failed kmalloc()Tillmann Heidsieck2015-10-131-1/+2
| | | | | | | | | | Smatch complains about returning hard coded error codes, silence this warning. drivers/atm/iphase.c:115 ia_enque_rtn_q() warn: returning -1 instead of -ENOMEM is sloppy Signed-off-by: Tillmann Heidsieck <theidsieck@leenox.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv6 route: use err pointers instead of returning pointer by referenceRoopa Prabhu2015-10-131-15/+17
| | | | | | | | This patch makes ip6_route_info_create return err pointer instead of returning the rt pointer by reference as suggested by Dave Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: hns: fix the unknown phy_nterface_t type errorhuangdaode2015-10-131-0/+1
| | | | | | | | | | | | | | | This patch fix the building error reported by Jiri Pirko <jiri@resnulli.us> drivers/net/ethernet/hisilicon/hns/hnae.h:465:2: error: unknown type name 'phy_interface_t' phy_interface_t phy_if; ^ the full build log is on https://lists.01.org/pipermail/kbuild-all. Signed-off-by: huangdaode <huangdaode@hisilicon.com> Signed-off-by: yankejian <yankejian@huawei.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tun: use sk_fullsock() before reading sk->sk_tsflagsEric Dumazet2015-10-131-1/+1
| | | | | | | | | | | | | | | | | timewait or request sockets are small and do not contain sk->sk_tsflags Without this fix, we might read garbage, and crash later in __skb_complete_tx_timestamp() -> sock_queue_err_skb() (These pseudo sockets do not have an error queue either) Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'netns-defrag'David S. Miller2015-10-1311-26/+27
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Eric W. Biederman says: ==================== net: Pass net into defragmentation This is the next installment of my work to pass struct net through the output path so the code does not need to guess how to figure out which network namespace it is in, and ultimately routes can have output devices in another network namespace. In netfilter and af_packet we defragment packets in the output path, and there is the usual amount of confusion about how to compute which net we are processing the packets in. This patchset clears that confusion up by explicitly passing in struct net in ip_defrag, ip_check_defrag, and nf_ct_frag6_gather. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * ipv6: Pass struct net into nf_ct_frag6_gatherEric W. Biederman2015-10-134-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function nf_ct_frag6_gather is called on both the input and the output paths of the networking stack. In particular ipv6_defrag which calls nf_ct_frag6_gather is called from both the the PRE_ROUTING chain on input and the LOCAL_OUT chain on output. The addition of a net parameter makes it explicit which network namespace the packets are being reassembled in, and removes the need for nf_ct_frag6_gather to guess. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ipv4: Pass struct net into ip_defrag and ip_check_defragEric W. Biederman2015-10-138-19/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | The function ip_defrag is called on both the input and the output paths of the networking stack. In particular conntrack when it is tracking outbound packets from the local machine calls ip_defrag. So add a struct net parameter and stop making ip_defrag guess which network namespace it needs to defragment packets in. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ipv4: Only compute net once in ip_call_ra_chainEric W. Biederman2015-10-131-1/+2
|/ | | | | | | | | | | | | | ip_call_ra_chain is called early in the forwarding chain from ip_forward and ip_mr_input, which makes skb->dev the correct expression to get the input network device and dev_net(skb->dev) a correct expression for the network namespace the packet is being processed in. Compute the network namespace and store it in a variable to make the code clearer. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* packet: fix match_fanout_group()Eric Dumazet2015-10-131-3/+3
| | | | | | | | | | | | | | | | | | Recent TCP listener patches exposed a prior af_packet bug : match_fanout_group() blindly assumes it is always safe to cast sk to a packet socket to compare fanout with af_packet_priv But SYNACK packets can be sent while attached to request_sock, which are smaller than a "struct sock". We can read non existent memory and crash. Fixes: c0de08d04215 ("af_packet: don't emit packet on orig fanout group") Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Eric Leblond <eric@regit.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge tag 'wireless-drivers-next-for-davem-2015-10-09' of ↵David S. Miller2015-10-13117-1410/+2677
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Kalle Valo says: ==================== Major changes: iwlwifi * some debugfs improvements * fix signedness in beacon statistics * deinline some functions to reduce size when device tracing is enabled * filter beacons out in AP mode when no stations are associated * deprecate firmwares version -12 * fix a runtime PM vs. legacy suspend race * one-liner fix for a ToF bug * clean-ups in the rx code * small debugging improvement * fix WoWLAN with new firmware versions * more clean-ups towards multiple RX queues; * some rate scaling fixes and improvements; * some time-of-flight fixes; * other generic improvements and clean-ups; brcmfmac * rework code dealing with multiple interfaces * allow logging firmware console using debug level * support for BCM4350, BCM4365, and BCM4366 PCIE devices * fixed for legacy P2P and P2P device handling * correct set and get tx-power ath9k * add support for Outside Context of a BSS (OCB) mode mwifiex * add USB multichannel feature ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge tag 'iwlwifi-next-for-kalle-2015-10-05' of ↵Kalle Valo2015-10-0732-244/+643
| |\ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next * more clean-ups towards multiple RX queues; * some rate scaling fixes and improvements; * some time-of-flight fixes; * other generic improvements and clean-ups;
| | * iwlwifi: mvm: add minimal multi-RXQ infrastructureJohannes Berg2015-10-052-27/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the new multi-queue capability depends on a new firmware API, we can already add some code for it. If the new API is present, a new opmode ops struct is used that handles the new rx_rss method. For now, only restructure the RX handling to distinguish between the two. Future patches will convert the new infrastructure to actually use the new RX descriptor layout. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
| | * iwlwifi: size firmware flags memory correctlyJohannes Berg2015-10-053-7/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of relying on a hard-coded constant of a maximum of 64 API and capability bits, add a new enum value after the others that will then always track the number of used bits in the API/capabilities. We thus no longer need to maintain the maximum number, and on 32-bit platforms even (currently) reduce the number of bits kept in memory. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
| | * iwlwifi: mvm: make threshold temperatures unsignedJohannes Berg2015-10-051-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | There's no need to have negative threshold temperatures, so make them unsigned to avoid signedness warnings in debugfs. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
| | * iwlwifi: nvm: add nvm phy_sku section to debugfsMoshe Harel2015-10-053-0/+8
| | | | | | | | | | | | | | | | | | | | | The only NVM section not captured in debugfs. Signed-off-by: Moshe Harel <moshe.harel@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
| | * iwlwifi: mvm: fix signedness warnings in ToF debugfsJohannes Berg2015-10-051-6/+10
| | | | | | | | | | | | | | | | | | | | | | | | Using an int* instead of u32* as the kstrtou32() output argument obviously results in signedness warnings, change that. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
| | * iwlwifi: mvm: rs: dynamically switch between 80MHz and 20MHz in some scenariosEyal Shapira2015-10-052-0/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a tweak which has been shown to improve performance when moving away from the AP while working in 80Mhz. When RS decides to go down to 80MHz SISO MCS0 instead switch to 20MHz MCS4. Go back to 80MHz MCS1 if RS can sustain 20MHz MCS5. Signed-off-by: Eyal Shapira <eyalx.shapira@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
| | * iwlwifi: mvm: minor rx code cleanupJohannes Berg2015-10-051-2/+2
| | | | | | | | | | | | | | | | | | | | | Clean up variable initialisation slightly. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
| | * iwlwifi: remove IWL3165_UCODE_API_OK and _MINJohannes Berg2015-10-051-7/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As the 3165 device uses the same firmware as 7265-D and currently all 7000 series (including 3160/3165) use the same API versions remove IWL3165_UCODE_API_OK and _MIN. We might have to put them back if firmware support ever splits, but in that case might also have to add a different MODULE_FIRMWARE statement. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
| | * iwlwifi: mvm: rs: fix success ratio comparison in rs_get_best_rateEyal Shapira2015-10-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | success_ratio is actually 128 * SR in percentage while IWL_MVM_RS_SR_NO_DECREASE is 85%. Fix this by using RS_PERCENT(). This bug caused the if branch to be always executed. This in turn led to always selecting a rate, following a column switch, in which the expected throughput would exceed the best expected current throughput. In some scenarios where the success ratio isn't >85% such a rate could be too aggressive leading us to avoid the new column. This has the potential of causing sub optimal performance. Reported-by: Moshe Harel <moshe.harel@intel.com> Signed-off-by: Eyal Shapira <eyalx.shapira@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
| | * iwlwifi: mvm: rs: minor indentation fixEyal Shapira2015-10-051-2/+2
| | | | | | | | | | | | | | | | | | | | | Indentation was off a bit. Fix it. Signed-off-by: Eyal Shapira <eyalx.shapira@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
| | * iwlwifi: mvm: rs: remove overflowing debug messageEyal Shapira2015-10-051-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | This message isn't very useful and creates clutter. Remove it. Signed-off-by: Eyal Shapira <eyalx.shapira@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
| | * iwlwifi: mvm: rs: improve rate debug messagesEyal Shapira2015-10-051-9/+51
| | | | | | | | | | | | | | | | | | | | | Pretty print the rate full details to ease debugging. Signed-off-by: Eyal Shapira <eyalx.shapira@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
| | * iwlwifi: mvm: stop using DEVICE_POWER_FLAGS_CAM_MSKJohannes Berg2015-10-052-10/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The firmware has always treated these two bits to mean that powersave is enabled when POWER_SAVE_ENA is set and CAM is clear; it doesn't use them in any non-combined way. Therefore, it's pointless to send it two bits, and the API should be cleaned up. Prepare the driver by removing the CAM bit and using only POWER_SAVE_ENA to indicate whether PS is enabled or not. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
| | * iwlwifi: mvm: support enabling a queue with a given ssnLiad Kaufman2015-10-053-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When enabling a queue, the default SSN is 0. Allow determining what that SSN should be, if required. This can happen, for example, if a queue gets reconfigured. Signed-off-by: Liad Kaufman <liad.kaufman@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
| | * iwlwifi: mvm: support using multiple ACs on single HW queueLiad Kaufman2015-10-057-89/+249
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "DQA" is shorthand for "dynamic queue allocation", with the idea of allocating queues per-RA/TID on-demand rather than using shared queues statically allocated per vif. The goal of this is to enable future features (like GO PM) and to improve performance measurements of TX traffic. When RA/TID streams can't be neatly sorted into different AC queues, DQA allows sharing queues for the same RA. This means that DQA allows different ACs may reach the same HW queue. Update the code to allow such queue sharing by having a mapping between the HW queue and the mac80211 queues using it (as this could be more than one queue). Signed-off-by: Liad Kaufman <liad.kaufman@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
| | * iwlwifi: transport: track number of allocated queuesJohannes Berg2015-10-052-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | As the transport will decide how many queues (and MSI-X vectors) to allocate, add a field to indicate that to the op-mode so it can size/allocate its own data structures appropriately. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>