linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	ipv4: make ip_local_reserved_ports per netns	WANG Cong	2014-05-14	9	-37/+38
\| \| \| \| \| \| \| \| \| \| \| \|	ip_local_port_range is already per netns, so should ip_local_reserved_ports be. And since it is none by default we don't actually need it when we don't enable CONFIG_SYSCTL. By the way, rename inet_is_reserved_local_port() to inet_is_local_reserved_port() Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	irda: sh_irda: Enable driver compilation with COMPILE_TEST	Laurent Pinchart	2014-05-14	1	-1/+2
\| \| \| \| \| \| \| \|	This helps increasing build testing coverage. Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'tipc-next'	David S. Miller	2014-05-14	19	-351/+348
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Jon Maloy says: ==================== tipc: bug fixes and improvements Intensive and extensive testing has revealed some rather infrequent problems related to flow control, buffer handling and link establishment. Commits ##1 to 4 deal with these problems. The remaining four commits are just code improvments, aiming at making the code more comprehensible and maintainable. There are no functional enhancements in this series. v2: Fixed a typo in commit log #2. Otherwise no changes from v1. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tipc: merge port message reception into socket reception function	Jon Paul Maloy	2014-05-14	6	-59/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order to reduce complexity and save a call level during message reception at port/socket level, we remove the function tipc_port_rcv() and merge its functionality into tipc_sk_rcv(). Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tipc: clean up neigbor discovery message reception	Jon Paul Maloy	2014-05-14	1	-108/+111
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The function tipc_disc_rcv(), which is handling received neighbor discovery messages, is perceived as messy, and it is hard to verify its correctness by code inspection. The fact that the task it is set to resolve is fairly complex does not make the situation better. In this commit we try to take a more systematic approach to the problem. We define a decision machine which takes three state flags as input, and produces three action flags as output. We then walk through all permutations of the state flags, and for each of them we describe verbally what is going on, plus that we set zero or more of the action flags. The action flags indicate what should be done once the decision machine has finished its job, while the last part of the function deals with performing those actions. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tipc: improve and extend media address conversion functions	Jon Paul Maloy	2014-05-14	6	-76/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	TIPC currently handles two media specific addresses: Ethernet MAC addresses and InfiniBand addresses. Those are kept in three different formats: 1) A "raw" format as obtained from the device. This format is known only by the media specific adapter code in eth_media.c and ib_media.c. 2) A "generic" internal format, in the form of struct tipc_media_addr, which can be referenced and passed around by the generic media- unaware code. 3) A serialized version of the latter, to be conveyed in neighbor discovery messages. Conversion between the three formats can only be done by the media specific code, so we have function pointers for this purpose in struct tipc_media. Here, the media adapters can install their own conversion functions at startup. We now introduce a new such function, 'raw2addr()', whose purpose is to convert from format 1 to format 2 above. We also try to as far as possible uniform commenting, variable names and usage of these functions, with the purpose of making them more comprehensible. We can now also remove the function tipc_l2_media_addr_set(), whose job is done better by the new function. Finally, we expand the field for serialized addresses (format 3) in discovery messages from 20 to 32 bytes. This is permitted according to the spec, and reduces the risk of problems when we add new media in the future. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tipc: rename and move message reassembly function	Jon Paul Maloy	2014-05-14	8	-91/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The function tipc_link_frag_rcv() is in reality a re-entrant generic message reassemby function that has nothing in particular to do with the link, where it is defined now. This becomes obvious when we see the need to call the function from other places in the code. In this commit rename it to tipc_buf_append() and move it to the file msg.c. We also simplify its signature by moving the tail pointer to the control block of the head buffer, hence making the head buffer self-contained. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tipc: mark head of reassembly buffer as non-linear	Jon Paul Maloy	2014-05-14	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The message reassembly function does not update the 'len' and 'data_len' fields of the head skbuff correctly when fragments are chained to it. This may sometimes lead to obsure errors, such as fragment reordering when we receive fragments which are cloned buffers. This commit fixes this, by ensuring that the two fields are updated correctly. Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tipc: don't record link RESET or ACTIVATE messages as traffic	Jon Paul Maloy	2014-05-14	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the current code, all incoming LINK_PROTOCOL messages, irrespective of type, nudge the "last message received" checkpoint, informing the link state machine that a message was received from the peer since last supervision timeout event. This inhibits the link from starting probing the peer unnecessarily. However, not only STATE messages are recorded as legitimate incoming traffic this way, but even RESET and ACTIVATE messages, which in reality are there to inform the link that the peer endpoint has been reset. At the same time, some RESET messages may be dropped instead of causing a link reset. This happens when the link endpoint thinks it is fully up and working, and the session number of the RESET is lower than or equal to the current link session. In such cases the RESET is perceived as a delayed remnant from an earlier session, or the current one, and dropped. Now, if a TIPC module is removed and then immediately reinserted, e.g. when using a script, RESET messages may arrive at the peer link endpoint before this one has had time to discover the failure. The RESET may be dropped because of the session number, but only after it has been recorded as a legitimate traffic event. Hence, the receiving link will not start probing, and not discover that the peer endpoint is down, at the same time ignoring the periodic RESET messages coming from that endpoint. We have ended up in a stale state where a failed link cannot be re-established. In this commit, we remedy this by nudging the checkpoint only for received STATE messages, not for RESET or ACTIVATE messages. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tipc: compensate for double accounting in socket rcv buffer	Jon Paul Maloy	2014-05-14	2	-9/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The function net/core/sock.c::__release_sock() runs a tight loop to move buffers from the socket backlog queue to the receive queue. As a security measure, sk_backlog.len of the receiving socket is not set to zero until after the loop is finished, i.e., until the whole backlog queue has been transferred to the receive queue. During this transfer, the data that has already been moved is counted both in the backlog queue and the receive queue, hence giving an incorrect picture of the available queue space for new arriving buffers. This leads to unnecessary rejection of buffers by sk_add_backlog(), which in TIPC leads to unnecessarily broken connections. In this commit, we compensate for this double accounting by adding a counter that keeps track of it. The function socket.c::backlog_rcv() receives buffers one by one from __release_sock(), and adds them to the socket receive queue. If the transfer is successful, it increases a new atomic counter 'tipc_sock::dupl_rcvcnt' with 'truesize' of the transferred buffer. If a new buffer arrives during this transfer and finds the socket busy (owned), we attempt to add it to the backlog. However, when sk_add_backlog() is called, we adjust the 'limit' parameter with the value of the new counter, so that the risk of inadvertent rejection is eliminated. It should be noted that this change does not invalidate the original purpose of zeroing 'sk_backlog.len' after the full transfer. We set an upper limit for dupl_rcvcnt, so that if a 'wild' sender (i.e., one that doesn't respect the send window) keeps pumping in buffers to sk_add_backlog(), he will eventually reach an upper limit, (2 x TIPC_CONN_OVERLOAD_LIMIT). After that, no messages can be added to the backlog, and the connection will be broken. Ordinary, well- behaved senders will never reach this buffer limit at all. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tipc: decrease connection flow control window	Jon Paul Maloy	2014-05-14	3	-9/+11
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Memory overhead when allocating big buffers for data transfer may be quite significant. E.g., truesize of a 64 KB buffer turns out to be 132 KB, 2 x the requested size. This invalidates the "worst case" calculation we have been using to determine the default socket receive buffer limit, which is based on the assumption that 1024x64KB = 67MB buffers may be queued up on a socket. Since TIPC connections cannot survive hitting the buffer limit, we have to compensate for this overhead. We do that in this commit by dividing the fix connection flow control window from 1024 (2512) messages to 512 (2256). Since older version nodes send out acks at 512 message intervals, compatibility with such nodes is guaranteed, although performance may be non-optimal in such cases. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	bonding: alloc the structure ad_info dynamically in per slave	dingtianhong	2014-05-14	6	-37/+67
\| \| \| \| \| \| \| \| \| \| \| \| \|	The struct ad_slave_info is very huge, and only be used for 802.3ad mode, so alloc the structure dynamically could save 356 Bits for every slave in non 802.3ad mode. Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Acked-by: Veaceslav Falico <vfalico@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	sh_eth: replace devm_kzalloc() with devm_kmalloc_array()	Sergei Shtylyov	2014-05-14	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \|	When I was converting the driver to the managed device API, only devm_kzalloc() was available for memory allocation, so I had to use it, despite zeroing out the PHY IRQ array right before initializing all its entries to PHY_POLL was quite stupid. Now that devm_kmalloc_array() has become available, we can avoid the needless zeroing out... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'tg3-next'	David S. Miller	2014-05-14	2	-18/+35
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Michael Chan says: ==================== tg3: TSO related enhancements to prevent memory allocation failure Michael Chan (3): tg3: Don't modify ip header fields when doing GSO tg3: Prevent page allocation failure during TSO workaround tg3: Update copyright and version to 3.137 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tg3: Update copyright and version to 3.137	Michael Chan	2014-05-14	2	-4/+4
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tg3: Prevent page allocation failure during TSO workaround	Michael Chan	2014-05-14	1	-7/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If any TSO fragment hits hardware bug conditions (e.g. 4G boundary), the driver will workaround by calling skb_copy() to copy to a linear SKB. Users have reported page allocation failures as the TSO packet can be up to 64K. Copying such a large packet is also very inefficient. We fix this by using existing tg3_tso_bug() to transmit the packet using GSO. Signed-off-by: Prashant Sreedharan <prashant@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tg3: Don't modify ip header fields when doing GSO	Michael Chan	2014-05-14	1	-7/+5
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \|	tg3 uses GSO as workaround if the hardware cannot perform TSO on certain packets. We should not modify the ip header fields if we do GSO on the packet. It happens to work by accident because GSO recalculates the IP checksum and IP total length. Also fix the tg3_start_xmit comment to reflect that this is the only xmit function for all devices. Signed-off-by: Prashant Sreedharan <prashant@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'inet_fwmark_reflect'	David S. Miller	2014-05-14	18	-9/+79
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Lorenzo Colitti says: ==================== Make mark-based routing work better with multiple separate networks. Mark-based routing (ip rule fwmark 17 lookup 100) combined with either iptables marking (iptables -j MARK --set-mark 17) or application-based marking (the SO_MARK setsockopt) are a good way to deal with connecting simultaneously to multiple networks. Each network can be given a routing table, and ip rules can be configured to make different fwmarks select different networks. Applications can select networks them by setting appropriate socket marks, and iptables rules can be used to handle non-aware applications, enforce policy, etc. This patch series improves functionality when mark-based routing is used in this way. Current behaviour has the following limitations: 1. Kernel-originated replies that are not associated with a socket always use a mark of zero. This means that, for example, when the kernel sends a ping reply or a TCP reset, it does not send it on the network from which it received the original packet. 2. Path MTU discovery, which is triggered by incoming packets, does not always work correctly, because the routing lookups it uses to clone routes do not take the fwmark into account and thus can happen in the wrong routing table. 3. Application-based marking works well for outbound connections, but does not work well for incoming connections. Marking a listening socket causes that socket to only accept connections from a given network, and sockets that are returned by accept() are not marked (and are thus not routed correctly). sysctl. This causes route lookups for kernel-generated replies and PMTUD to use the fwmark of the packet that caused them. which causes TCP sockets returned by accept() to be marked with the same mark that sent the intial SYN packet. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: support marking accepting TCP sockets	Lorenzo Colitti	2014-05-14	9	-5/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When using mark-based routing, sockets returned from accept() may need to be marked differently depending on the incoming connection request. This is the case, for example, if different socket marks identify different networks: a listening socket may want to accept connections from all networks, but each connection should be marked with the network that the request came in on, so that subsequent packets are sent on the correct network. This patch adds a sysctl to mark TCP sockets based on the fwmark of the incoming SYN packet. If enabled, and an unmarked socket receives a SYN, then the SYN packet's fwmark is written to the connection's inet_request_sock, and later written back to the accepted socket when the connection is established. If the socket already has a nonzero mark, then the behaviour is the same as it is today, i.e., the listening socket's fwmark is used. Black-box tested using user-mode linux: - IPv4/IPv6 SYN+ACK, FIN, etc. packets are routed based on the mark of the incoming SYN packet. - The socket returned by accept() is marked with the mark of the incoming SYN packet. - Tested with syncookies=1 and syncookies=2. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: Use fwmark reflection in PMTU discovery.	Lorenzo Colitti	2014-05-14	2	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, routing lookups used for Path PMTU Discovery in absence of a socket or on unmarked sockets use a mark of 0. This causes PMTUD not to work when using routing based on netfilter fwmark mangling and fwmark ip rules, such as: iptables -j MARK --set-mark 17 ip rule add fwmark 17 lookup 100 This patch causes these route lookups to use the fwmark from the received ICMP error when the fwmark_reflect sysctl is enabled. This allows the administrator to make PMTUD work by configuring appropriate fwmark rules to mark the inbound ICMP packets. Black-box tested using user-mode linux by pointing different fwmarks at routing tables egressing on different interfaces, and using iptables mangling to mark packets inbound on each interface with the interface's fwmark. ICMPv4 and ICMPv6 PMTU discovery work as expected when mark reflection is enabled and fail when it is disabled. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: add a sysctl to reflect the fwmark on replies	Lorenzo Colitti	2014-05-14	10	-3/+41
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Kernel-originated IP packets that have no user socket associated with them (e.g., ICMP errors and echo replies, TCP RSTs, etc.) are emitted with a mark of zero. Add a sysctl to make them have the same mark as the packet they are replying to. This allows an administrator that wishes to do so to use mark-based routing, firewalling, etc. for these replies by marking the original packets inbound. Tested using user-mode linux: - ICMP/ICMPv6 echo replies and errors. - TCP RST packets (IPv4 and IPv6). Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'arc_emac-next'	David S. Miller	2014-05-14	1	-0/+49
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Beniamino Galvani says: ==================== arc_emac: promiscuous/multicast mode and netpoll support These patches add support for promiscuous mode, multicast filtering and netpoll to the ARC EMAC driver. They were both tested on a Radxa Rock board which uses a ARC EMAC IP core integrated in the Rockchip RK3188 SoC. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	arc_emac: add netpoll support	Beniamino Galvani	2014-05-14	1	-0/+12
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Beniamino Galvani <b.galvani@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	arc_emac: implement promiscuous mode and multicast filtering	Beniamino Galvani	2014-05-14	1	-0/+37
\|/ \| \| \| \| \| \| \| \|	This patch implements the set_rx_mode function to enable/disable promiscuous or all-multicast modes and to update the multicast filtering list of the device. Signed-off-by: Beniamino Galvani <b.galvani@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'tcp-fastopen-ipv6'	David S. Miller	2014-05-13	7	-309/+323
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Yuchung Cheng says: ==================== tcp: IPv6 support for fastopen server This patch series add IPv6 support for fastopen server. To minimize code duplication in IPv4 and IPv6, the current v4 only code is refactored and common code is moved into net/ipv4/tcp_fastopen.c. Also the current code uses a different function from tcp_v4_send_synack() to send the first SYN-ACK in fastopen. The new code eliminates this separate function by refactoring the child-socket and syn-ack creation code. After these refactoring in the first four patches, we can easily add the fastopen code in IPv6 by changing corresponding IPv6 functions. Note Fast Open client already supports IPv6. This patch is for the server-side (passive open) IPv6 support only. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tcp: IPv6 support for fastopen server	Daniel Lee	2014-05-13	2	-26/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After all the preparatory works, supporting IPv6 in Fast Open is now easy. We pretty much just mirror v4 code. The only difference is how we generate the Fast Open cookie for IPv6 sockets. Since Fast Open cookie is 128 bits and we use AES 128, we use CBC-MAC to encrypt both the source and destination IPv6 addresses since the cookie is a MAC tag. Signed-off-by: Daniel Lee <longinus00@gmail.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Jerry Chu <hkchu@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tcp: improve fastopen icmp handling	Yuchung Cheng	2014-05-13	2	-26/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a fast open socket is already accepted by the user, it should be treated like a connected socket to record the ICMP error in sk_softerr, so the user can fetch it. Do that in both tcp_v4_err and tcp_v6_err. Also refactor the sequence window check to improve readability (e.g., there were two local variables named 'req'). Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Daniel Lee <longinus00@gmail.com> Signed-off-by: Jerry Chu <hkchu@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tcp: use tcp_v4_send_synack on first SYN-ACK	Yuchung Cheng	2014-05-13	5	-105/+85
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To avoid large code duplication in IPv6, we need to first simplify the complicate SYN-ACK sending code in tcp_v4_conn_request(). To use tcp_v4(6)_send_synack() to send all SYN-ACKs, we need to initialize the mini socket's receive window before trying to create the child socket and/or building the SYN-ACK packet. So we move that initialization from tcp_make_synack() to tcp_v4_conn_request() as a new function tcp_openreq_init_req_rwin(). After this refactoring the SYN-ACK sending code is simpler and easier to implement Fast Open for IPv6. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Daniel Lee <longinus00@gmail.com> Signed-off-by: Jerry Chu <hkchu@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tcp: simplify fast open cookie processing	Yuchung Cheng	2014-05-13	5	-64/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Consolidate various cookie checking and generation code to simplify the fast open processing. The main goal is to reduce code duplication in tcp_v4_conn_request() for IPv6 support. Removes two experimental sysctl flags TFO_SERVER_ALWAYS and TFO_SERVER_COOKIE_NOT_CHKD used primarily for developmental debugging purposes. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Daniel Lee <longinus00@gmail.com> Signed-off-by: Jerry Chu <hkchu@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tcp: move fastopen functions to tcp_fastopen.c	Yuchung Cheng	2014-05-13	3	-185/+200
\|/ \| \| \| \| \| \| \| \| \| \| \|	Move common TFO functions that will be used by both v4 and v6 to tcp_fastopen.c. Create a helper tcp_fastopen_queue_check(). Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Daniel Lee <longinus00@gmail.com> Signed-off-by: Jerry Chu <hkchu@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'cdc_mbim-next'	David S. Miller	2014-05-13	4	-23/+464
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Bjørn Mork says: ==================== cdc_mbim: cleanups and new features This series depends on commit 6b5eeb7f874b ("net: cdc_mbim: handle unaccelerated VLAN tagged frames"), which is currently in "net" but not yet in "net-next". Patch 4 might have a minor context collision with the "cdc_ncm: add buffer tuning and stats using ethtool" series I just posted for review. Please let me know if I should submit an adjusted version in either direction. These two series' are otherwise completely independent of each other. The major new feature here is in patch 1, which I hope will solve some problems with the original design without changing the existing API, optionally allowing IP session 0 to be treated like any other MBIM session. The rest are some minor cleanups and finally some documentation of the current driver APIs, after this series has been applied. I started feeling a bit more mortal than usual lately, which probably is healthy, and realized that I should put some of the stuff in my head in a somewhat less volatile storage. v2: Fixed patch 1 so that it actually does what it claims to do. This time it is even tested for functionality, and not just build tested... ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: cdc_ncm/cdc_mbim: rework probing of NCM/MBIM functions	Bjørn Mork	2014-05-13	3	-17/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The NCM class match in the cdc_mbim driver is confusing and cause unexpected behaviour. The USB core guarantees that a USB interface is in altsetting 0 when probing starts. This means that devices implementing a NCM 1.0 backwards compatible MBIM function (a "NCM/MBIM function") always hit the NCM entry in the cdc_mbim driver match table. Such functions will never match any of the MBIM entries. This causes unexpeced behaviour for cases where the NCM and MBIM entries are differet, which is currently the case for all except Ericsson devices. Improve the probing of NCM/MBIM functions by looking up the device again in the cdc_mbim match table after switching to the MBIM identity. The shared altsetting selection is updated to better accommodate the new probing logic, returning the preferred altsetting for the control interface instead of the data interface. The control interface altsetting update is moved to the cdc_mbim driver. It is never necessary to change the control interface altsetting for NCM. Cc: Greg Suarez <gsuarez@smithmicro.com> Reported by: Yu-an Shih <yshih@nvidia.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: cdc_mbim: add driver documentation	Bjørn Mork	2014-05-13	1	-0/+339
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	An initial attempt on describing some of the odd APIs provided by this driver. Cc: Greg Suarez <gsuarez@smithmicro.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: cdc_mbim: reject IP packets on DSS VLANs	Bjørn Mork	2014-05-13	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	DSS VLANs are pseudo network interfaces representing arbitrary data streams, and specifically not IP. Preventing spurious IP packets can sometimes be a hassle. The kernel will for example send an IPv6 Router Solicit when the interface is brought up unless the user has been careful enough to disable IPv6 first. Such packets forwared to a MBIM DSS session will look like spurious noise to the device, and can cause it to log an error or even malfunction. Drop all IP packets on the designated DSS VLANs to prevent such unwanted spurious transmissions. Cc: Greg Suarez <gsuarez@smithmicro.com> Reported-by: Arnaud Desmier <adesmier@sequans.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: cdc_mbim: optionally use VLAN ID 4094 for IP session 0	Bjørn Mork	2014-05-13	1	-6/+68
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The cdc_mbim driver maps 802.1q VLANs to MBIM IP and DSS sessions. MBIM IP session 0 is handled as an exception and is mapped to untagged frames. This patch adds optional support for remapping MBIM IP session 0 to 802.1q VLAN ID 4094 instead. The default behaviour is not changed. The new behaviour is triggered by adding a link for this previously unsupported VLAN. The untagged mapping was chosen initially to support the assumed most common use case: Most current MBIM devices only support a single IP session (i.e. session 0 only), and using untagged frames lets the users completely ignore the additonal complexity of the multiplexing layer. But when the multiplexing features of MBIM are used, then this netdev gets a double meaning: It becomes the master interface for all the VLAN subdevs the additional sessions are mapped to, while still serving as the untagged IP interface for session 0. This can be problematic, especially when using Device Service Streams (DSS), as have become apparent recently with the availability of devices with real DSS support. Some use cases need to e.g set a MTU which is higher than allowed for IP Session 0. The dual role also leads to the situation where the IP Session 0 interface cannot be taken down without breaking unrelated IP or DSS sessions - a devastating side effect which applications managing a simple IP session cannot be expected to be aware of. A typical DHCP client will assume that it should bring the interface down after releasing the IP lease. These problems can be avoided by tagging IP session 0 packets too, making this session similar to all other multiplexed sessions. This redefines the main netdev as an upper master interface only. Cc: Greg Suarez <gsuarez@smithmicro.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: get rid of SET_ETHTOOL_OPS	Wilfried Klaebe	2014-05-13	108	-128/+118
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	net: get rid of SET_ETHTOOL_OPS Dave Miller mentioned he'd like to see SET_ETHTOOL_OPS gone. This does that. Mostly done via coccinelle script: @@ struct ethtool_ops ops; struct net_device dev; @@ - SET_ETHTOOL_OPS(dev, ops); + dev->ethtool_ops = ops; Compile tested only, but I'd seriously wonder if this broke anything. Suggested-by: Dave Miller <davem@davemloft.net> Signed-off-by: Wilfried Klaebe <w-lkml@lebenslange-mailadresse.de> Acked-by: Felipe Balbi <balbi@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: ptp: mark filter as __initdata	Mathias Krause	2014-05-13	1	-1/+1
\| \| \| \| \| \| \| \| \|	sk_unattached_filter_create() will copy the filter's instructions so we don't need to have the master copy hanging around after initialization. Signed-off-by: Mathias Krause <minipli@googlemail.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: fix test_bpf build to depend on NET	Randy Dunlap	2014-05-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix build when CONFIG_NET is not enabled. Fixes these build errors: WARNING: "sk_unattached_filter_destroy" [lib/test_bpf.ko] undefined! WARNING: "kfree_skb" [lib/test_bpf.ko] undefined! WARNING: "sk_unattached_filter_create" [lib/test_bpf.ko] undefined! WARNING: "sk_run_filter_int_skb" [lib/test_bpf.ko] undefined! WARNING: "__alloc_skb" [lib/test_bpf.ko] undefined! Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: filter: Fix redefinition warnings on x86-64.	David S. Miller	2014-05-13	1	-34/+37
\| \| \| \| \| \| \| \| \| \| \| \|	Do not collide with the x86-64 PTRACE user API namespace. net/core/filter.c:57:0: warning: "R8" redefined [enabled by default] arch/x86/include/uapi/asm/ptrace-abi.h:38:0: note: this is the location of the previous definition Fix by adding a BPF_ prefix to the register macros. Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	bnx2x: fix build when BNX2X_SRIOV is not enabled	Randy Dunlap	2014-05-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix build when BNX2X_SRIOV is not enabled. Change one parameter struct from bnx2 to bnx2x and don't return a value from a void function. drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h:576:48: warning: 'struct bnx2' declared inside parameter list [enabled by default] drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h:576:48: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default] drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h:576:60: warning: 'return' with a value, in function returning void [enabled by default] Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Ariel Elior <ariele@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'cpsw'	David S. Miller	2014-05-13	2	-2/+64
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Mugunthan V N says: ==================== Add DRA7xx and AM43xx platform support in cpsw-phy-sel driver Adding DRA7xx and AM43xx platform support to cpsw-phy-sel driver to select phy mode in control driver and fixing the uninitialized dev by initializing to platform device structure pointer. Changes from Initial version * Added back the missing patch (1/3) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	drivers: net: cpsw-phy-sel: add am43xx platform support	Mugunthan V N	2014-05-13	2	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	AM43xx phy mode selection is similar to AM33xx platform, so adding only the compatibility string to the driver Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	drivers: net: cpsw-phy-sel: add dra7xx support for phy sel	Mugunthan V N	2014-05-13	2	-2/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add dra7xx support for selecting the phy mode which is present in control module of dra7xx SoC Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	drivers: net: cpsw-phy-sel: initialize priv->dev	Mugunthan V N	2014-05-13	1	-0/+1
\|/ \| \| \| \| \| \|	priv->dev is uninitialized, initializing with pdev->dev Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	sch_hhf: fix comparison of qlen and limit	Yang Yingliang	2014-05-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	When I use the following command, eth0 cannot send any packets. #tc qdisc add dev eth0 root handle 1: hhf limit 1 Because qlen need be smaller than limit, all packets were dropped. Fix this by qlen <= limit. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	vlan: rename __vlan_find_dev_deep() to __vlan_find_dev_deep_rcu()	dingtianhong	2014-05-12	8	-16/+16
\| \| \| \| \| \| \| \| \|	The __vlan_find_dev_deep should always called in RCU, according David's suggestion, rename to __vlan_find_dev_deep_rcu looks more reasonable. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: rename local_df to ignore_df	WANG Cong	2014-05-12	17	-42/+42
\| \| \| \| \| \| \| \| \| \| \| \| \|	As suggested by several people, rename local_df to ignore_df, since it means "ignore df bit if it is set". Cc: Maciej Żenczykowski <maze@google.com> Cc: Florian Westphal <fw@strlen.de> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller	2014-05-12	600	-3513/+6816
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conflicts: drivers/net/ethernet/altera/altera_sgdma.c net/netlink/af_netlink.c net/sched/cls_api.c net/sched/sch_api.c The netlink conflict dealt with moving to netlink_capable() and netlink_ns_capable() in the 'net' tree vs. supporting 'tc' operations in non-init namespaces. These were simple transformations from netlink_capable to netlink_ns_capable. The Altera driver conflict was simply code removal overlapping some void pointer cast cleanups in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ptp: fix kconfig dependency warnings	Randy Dunlap	2014-05-12	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix kconfig warnings: PTP_1588_CLOCK selects NET_PTP_CLASSIFY, which depends on NET, so PTP_1588_CLOCK should also depend on NET. PTP_1588_CLOCK_PCH selects PTP_1588_CLOCK so the former should depend on NET. warning: (IXP4XX_ETH && PTP_1588_CLOCK) selects NET_PTP_CLASSIFY which has unmet direct dependencies (NET) warning: (SFC && TILE_NET && BFIN_MAC_USE_HWSTAMP && TIGON3 && FEC && E1000E && IGB && IXGBE && I40E && MLX4_EN && SXGBE_ETH && STMMAC_ETH && TI_CPTS && PTP_1588_CLOCK_GIANFAR && PTP_1588_CLOCK_IXP46X && DP83640_PHY && PTP_1588_CLOCK_PCH) selects PTP_1588_CLOCK which has unmet direct dependencies (NET) [This warning is caused by the new 'depends on NET' in PTP_1588_CLOCK.] Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	Merge branch 'for-davem' of ↵	David S. Miller	2014-05-09	7	-9/+30
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== pull request: wireless 2014-05-08 This one is all from Johannes: "Here are a few small fixes for the current cycle: radiotap TX flags were wrong (fix by Bob), Chun-Yeow fixes an SMPS issue with mesh interfaces, Eliad fixes a locking bug and a cfg80211 state problem and finally Henning sent me a fix for IBSS rate information." Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>