summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2018-10-2743-206/+205
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: "What better way to start off a weekend than with some networking bug fixes: 1) net namespace leak in dump filtering code of ipv4 and ipv6, fixed by David Ahern and Bjørn Mork. 2) Handle bad checksums from hardware when using CHECKSUM_COMPLETE properly in UDP, from Sean Tranchetti. 3) Remove TCA_OPTIONS from policy validation, it turns out we don't consistently use nested attributes for this across all packet schedulers. From David Ahern. 4) Fix SKB corruption in cadence driver, from Tristram Ha. 5) Fix broken WoL handling in r8169 driver, from Heiner Kallweit. 6) Fix OOPS in pneigh_dump_table(), from Eric Dumazet" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (28 commits) net/neigh: fix NULL deref in pneigh_dump_table() net: allow traceroute with a specified interface in a vrf bridge: do not add port to router list when receives query with source 0.0.0.0 net/smc: fix smc_buf_unuse to use the lgr pointer ipv6/ndisc: Preserve IPv6 control buffer if protocol error handlers are called net/{ipv4,ipv6}: Do not put target net if input nsid is invalid lan743x: Remove SPI dependency from Microchip group. drivers: net: remove <net/busy_poll.h> inclusion when not needed net: phy: genphy_10g_driver: Avoid NULL pointer dereference r8169: fix broken Wake-on-LAN from S5 (poweroff) octeontx2-af: Use GFP_ATOMIC under spin lock net: ethernet: cadence: fix socket buffer corruption problem net/ipv6: Allow onlink routes to have a device mismatch if it is the default route net: sched: Remove TCA_OPTIONS from policy ice: Poll for link status change ice: Allocate VF interrupts and set queue map ice: Introduce ice_dev_onetime_setup net: hns3: Fix for warning uninitialized symbol hw_err_lst3 octeontx2-af: Copy the right amount of memory net: udp: fix handling of CHECKSUM_COMPLETE packets ...
| * net/neigh: fix NULL deref in pneigh_dump_table()Eric Dumazet2018-10-271-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pneigh can have NULL device pointer, so we need to make neigh_master_filtered() and neigh_ifindex_filtered() more robust. syzbot report : kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 15867 Comm: syz-executor2 Not tainted 4.19.0+ #276 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__read_once_size include/linux/compiler.h:179 [inline] RIP: 0010:list_empty include/linux/list.h:203 [inline] RIP: 0010:netdev_master_upper_dev_get+0xa1/0x250 net/core/dev.c:6467 RSP: 0018:ffff8801bfaf7220 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 0000000000000001 RCX: ffffc90005e92000 RDX: 0000000000000016 RSI: ffffffff860b44d9 RDI: 0000000000000005 RBP: ffff8801bfaf72b0 R08: ffff8801c4c84080 R09: fffffbfff139a580 R10: fffffbfff139a580 R11: ffffffff89cd2c07 R12: 1ffff10037f5ee45 R13: 0000000000000000 R14: ffff8801bfaf7288 R15: 00000000000000b0 FS: 00007f65cc68d700(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b33a21000 CR3: 00000001c6116000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: neigh_master_filtered net/core/neighbour.c:2367 [inline] pneigh_dump_table net/core/neighbour.c:2456 [inline] neigh_dump_info+0x7a9/0x1ce0 net/core/neighbour.c:2577 netlink_dump+0x606/0x1080 net/netlink/af_netlink.c:2244 __netlink_dump_start+0x59a/0x7c0 net/netlink/af_netlink.c:2352 netlink_dump_start include/linux/netlink.h:216 [inline] rtnetlink_rcv_msg+0x809/0xc20 net/core/rtnetlink.c:4898 netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2477 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4953 netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline] netlink_unicast+0x5a5/0x760 net/netlink/af_netlink.c:1336 netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:621 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:631 sock_write_iter+0x35e/0x5c0 net/socket.c:900 call_write_iter include/linux/fs.h:1808 [inline] new_sync_write fs/read_write.c:474 [inline] __vfs_write+0x6b8/0x9f0 fs/read_write.c:487 vfs_write+0x1fc/0x560 fs/read_write.c:549 ksys_write+0x101/0x260 fs/read_write.c:598 __do_sys_write fs/read_write.c:610 [inline] __se_sys_write fs/read_write.c:607 [inline] __x64_sys_write+0x73/0xb0 fs/read_write.c:607 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457569 Fixes: 6f52f80e8530 ("net/neigh: Extend dump filter to proxy neighbor dumps") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Ahern <dsahern@gmail.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reviewed-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: allow traceroute with a specified interface in a vrfMike Manning2018-10-272-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Traceroute executed in a vrf succeeds if no device is given or if the vrf is given as the device, but fails if the interface is given as the device. This is for default UDP probes, it succeeds for TCP SYN or ICMP ECHO probes. As the skb bound dev is the interface and the sk dev is the vrf, sk lookup fails for ICMP_DEST_UNREACH and ICMP_TIME_EXCEEDED messages. The solution is for the secondary dev to be passed so that the interface is available for the device match to succeed, in the same way as is already done for non-error cases. Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * bridge: do not add port to router list when receives query with source 0.0.0.0Hangbin Liu2018-10-271-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on RFC 4541, 2.1.1. IGMP Forwarding Rules The switch supporting IGMP snooping must maintain a list of multicast routers and the ports on which they are attached. This list can be constructed in any combination of the following ways: a) This list should be built by the snooping switch sending Multicast Router Solicitation messages as described in IGMP Multicast Router Discovery [MRDISC]. It may also snoop Multicast Router Advertisement messages sent by and to other nodes. b) The arrival port for IGMP Queries (sent by multicast routers) where the source address is not 0.0.0.0. We should not add the port to router list when receives query with source 0.0.0.0. Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/smc: fix smc_buf_unuse to use the lgr pointerKarsten Graul2018-10-271-13/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | The pointer to the link group is unset in the smc connection structure right before the call to smc_buf_unuse. Provide the lgr pointer to smc_buf_unuse explicitly. And move the call to smc_lgr_schedule_free_work to the end of smc_conn_free. Fixes: a6920d1d130c ("net/smc: handle unregistered buffers") Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ipv6/ndisc: Preserve IPv6 control buffer if protocol error handlers are calledStefano Brivio2018-10-271-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit a61bbcf28a8c ("[NET]: Store skb->timestamp as offset to a base timestamp") introduces a neighbour control buffer and zeroes it out in ndisc_rcv(), as ndisc_recv_ns() uses it. Commit f2776ff04722 ("[IPV6]: Fix address/interface handling in UDP and DCCP, according to the scoping architecture.") introduces the usage of the IPv6 control buffer in protocol error handlers (e.g. inet6_iif() in present-day __udp6_lib_err()). Now, with commit b94f1c0904da ("ipv6: Use icmpv6_notify() to propagate redirect, instead of rt6_redirect()."), we call protocol error handlers from ndisc_redirect_rcv(), after the control buffer is already stolen and some parts are already zeroed out. This implies that inet6_iif() on this path will always return zero. This gives unexpected results on UDP socket lookup in __udp6_lib_err(), as we might actually need to match sockets for a given interface. Instead of always claiming the control buffer in ndisc_rcv(), do that only when needed. Fixes: b94f1c0904da ("ipv6: Use icmpv6_notify() to propagate redirect, instead of rt6_redirect().") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/{ipv4,ipv6}: Do not put target net if input nsid is invalidBjørn Mork2018-10-262-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | The cleanup path will put the target net when netnsid is set. So we must reset netnsid if the input is invalid. Fixes: d7e38611b81e ("net/ipv4: Put target net when address dump fails due to bad attributes") Fixes: 242afaa6968c ("net/ipv6: Put target net when address dump fails due to bad attributes") Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * lan743x: Remove SPI dependency from Microchip group.Bryan Whitehead2018-10-261-1/+0
| | | | | | | | | | | | | | | | The SPI dependency does not apply to lan743x driver, and other drivers in the group already state their dependence on SPI. Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * drivers: net: remove <net/busy_poll.h> inclusion when not neededEric Dumazet2018-10-269-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | Drivers using generic NAPI interface no longer need to include <net/busy_poll.h>, since busy polling was moved to core networking stack long ago. See commit 79e7fff47b7b ("net: remove support for per driver ndo_busy_poll()") for reference. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: phy: genphy_10g_driver: Avoid NULL pointer dereferenceAndrew Lunn2018-10-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | This driver got missed during the recent change of .features from a u32 to a pointer to a Linux bitmap. Change the initialisation from 0 to PHY_10GBIT_FEATURES so removing the danger of a NULL pointer dereference. Fixes: 719655a14971 ("net: phy: Replace phy driver features u32 with link_mode bitmap") Reported-by: Jose Abreu <jose.abreu@synopsys.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
| * r8169: fix broken Wake-on-LAN from S5 (poweroff)Heiner Kallweit2018-10-261-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It was reported that WoL from S5 is broken (WoL from S3 works) and the analysis showed that during system shutdown the network interface was brought down already when the actual kernel shutdown started. Therefore netif_running() returned false and as a consequence the PHY was suspended. Obviously WoL wasn't working then. To fix this the original patch needs to be effectively reverted. A side effect is that when normally bringing down the interface and WoL is enabled the PHY will remain powered on (like it was before the original patch). Fixes: fe87bef01f9b ("r8169: don't check WoL when powering down PHY and interface is down") Reported-by: Neil MacLeod <neil@nmacleod.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * octeontx2-af: Use GFP_ATOMIC under spin lockWei Yongjun2018-10-251-1/+1
| | | | | | | | | | | | | | | | | | | | The function nix_update_mce_list() is called from nix_update_bcast_mce_list(), and a spin lock is held here, so we should use GFP_ATOMIC instead. Fixes: 4b05528ebf0c ("octeontx2-af: Update bcast list upon NIXLF alloc/free") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: ethernet: cadence: fix socket buffer corruption problemTristram Ha2018-10-251-1/+1
| | | | | | | | | | | | | | Socket buffer is not re-created when headroom is 2 and tailroom is 1. Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch '100GbE' of ↵David S. Miller2018-10-259-138/+77
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Fixes 2018-10-24 This series contains fixes for the ice driver. Anirudh fixes a namespace issue which was introduced with a previous patch to remove ice_netpoll. Fixed up the device ID define names to align with the branding string names. Use the capability count returned by the firmware, instead of calculating the count. Introduced driver workarounds due to current firmware limitations. Fixed the queue mapping for a VF, which needs to be set in the config and scatter queue modes. Fixed the driver which is setup to handle link status events (LSE), even though the firmware does not have this feature yet, so add the ability to poll for link status changes while we wait for updated firmware. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * ice: Poll for link status changeAnirudh Venkataramanan2018-10-243-120/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the physical link goes up or down, the driver is supposed to receive a link status event (LSE). The driver currently has the code to handle LSEs but there is no firmware support for this feature yet. So this patch adds the ability for the driver to poll for link status changes. The polling itself is done in ice_watchdog_subtask. For namespace cleanliness, this patch also removes code that handles LSE. This code will be reintroduced once the feature is officially supported. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| | * ice: Allocate VF interrupts and set queue mapAnirudh Venkataramanan2018-10-242-4/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allocate VF interrupts using VPINT_ALLOC_PCI. Multiple interrupts are specified as a range from "first" to "last". Also, according to the spec, the queue mapping for a VF needs to be set in both contig and scatter queue modes. So make this change as well. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| | * ice: Introduce ice_dev_onetime_setupAnirudh Venkataramanan2018-10-244-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ice_dev_onetime_setup contains a couple of driver workarounds for current firmware limitations. These workarounds are expected to go away once these limitations are fixed in the firmware. On a firmware release that has these issues addressed, these workarounds (while unnecessary) will not break anything. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| | * ice: Use capability count returned by the firmwareAnirudh Venkataramanan2018-10-241-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | The firmware now returns the capability count in the command buffer. Use it. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| | * ice: Update expected FW versionAnirudh Venkataramanan2018-10-241-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update to the current firmware major and minor version which are 1 and 3 respectively. Also remove an empty comment line. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| | * ice: Change device ID define names to align with branding stringAnirudh Venkataramanan2018-10-242-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | Basically remove references to C810 and use E810C (from the branding string) instead. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| | * ice: Make ice_msix_clean_rings staticAnirudh Venkataramanan2018-10-242-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 158a08a694c4e ("ice: remove ndo_poll_controller") removed ice_netpoll and introduced a namespace warning for ice_msix_clean_rings. Fix the namespace warning by making ice_msix_clean_rings static. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
| * | net/ipv6: Allow onlink routes to have a device mismatch if it is the default ↵David Ahern2018-10-242-7/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | route The intent of ip6_route_check_nh_onlink is to make sure the gateway given for an onlink route is not actually on a connected route for a different interface (e.g., 2001:db8:1::/64 is on dev eth1 and then an onlink route has a via 2001:db8:1::1 dev eth2). If the gateway lookup hits the default route then it most likely will be a different interface than the onlink route which is ok. Update ip6_route_check_nh_onlink to disregard the device mismatch if the gateway lookup hits the default route. Turns out the existing onlink tests are passing because there is no default route or it is an unreachable default, so update the onlink tests to have a default route other than unreachable. Fixes: fc1e64e1092f6 ("net/ipv6: Add support for onlink flag") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: sched: Remove TCA_OPTIONS from policyDavid Ahern2018-10-241-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Marco reported an error with hfsc: root@Calimero:~# tc qdisc add dev eth0 root handle 1:0 hfsc default 1 Error: Attribute failed policy validation. Apparently a few implementations pass TCA_OPTIONS as a binary instead of nested attribute, so drop TCA_OPTIONS from the policy. Fixes: 8b4c3cdd9dd8 ("net: sched: Add policy validation for tc attributes") Reported-by: Marco Berizzi <pupilla@libero.it> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: hns3: Fix for warning uninitialized symbol hw_err_lst3Shiju Jose2018-10-241-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the smatch warning, drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_err.c:700 hclge_log_and_clear_ppp_error() error: uninitialized symbol 'hw_err_lst3' Link: https://lkml.org/lkml/2018/10/23/430 Fixes: da2d072a9ea7 ("net: hns3: Add enable and process hw errors from PPP") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | octeontx2-af: Copy the right amount of memoryDan Carpenter2018-10-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a copy and paste bug where we copied the sizeof() from the chunk before. We're copying more data than intended but the destination is a union so it doesn't cause memory corruption. Fixes: ffb0abd7e9cb ("octeontx2-af: NIX AQ instruction enqueue support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: udp: fix handling of CHECKSUM_COMPLETE packetsSean Tranchetti2018-10-243-6/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current handling of CHECKSUM_COMPLETE packets by the UDP stack is incorrect for any packet that has an incorrect checksum value. udp4/6_csum_init() will both make a call to __skb_checksum_validate_complete() to initialize/validate the csum field when receiving a CHECKSUM_COMPLETE packet. When this packet fails validation, skb->csum will be overwritten with the pseudoheader checksum so the packet can be fully validated by software, but the skb->ip_summed value will be left as CHECKSUM_COMPLETE so that way the stack can later warn the user about their hardware spewing bad checksums. Unfortunately, leaving the SKB in this state can cause problems later on in the checksum calculation. Since the the packet is still marked as CHECKSUM_COMPLETE, udp_csum_pull_header() will SUBTRACT the checksum of the UDP header from skb->csum instead of adding it, leaving us with a garbage value in that field. Once we try to copy the packet to userspace in the udp4/6_recvmsg(), we'll make a call to skb_copy_and_csum_datagram_msg() to checksum the packet data and add it in the garbage skb->csum value to perform our final validation check. Since the value we're validating is not the proper checksum, it's possible that the folded value could come out to 0, causing us not to drop the packet. Instead, we believe that the packet was checksummed incorrectly by hardware since skb->ip_summed is still CHECKSUM_COMPLETE, and we attempt to warn the user with netdev_rx_csum_fault(skb->dev); Unfortunately, since this is the UDP path, skb->dev has been overwritten by skb->dev_scratch and is no longer a valid pointer, so we end up reading invalid memory. This patch addresses this problem in two ways: 1) Do not use the dev pointer when calling netdev_rx_csum_fault() from skb_copy_and_csum_datagram_msg(). Since this gets called from the UDP path where skb->dev has been overwritten, we have no way of knowing if the pointer is still valid. Also for the sake of consistency with the other uses of netdev_rx_csum_fault(), don't attempt to call it if the packet was checksummed by software. 2) Add better CHECKSUM_COMPLETE handling to udp4/6_csum_init(). If we receive a packet that's CHECKSUM_COMPLETE that fails verification (i.e. skb->csum_valid == 0), check who performed the calculation. It's possible that the checksum was done in software by the network stack earlier (such as Netfilter's CONNTRACK module), and if that says the checksum is bad, we can drop the packet immediately instead of waiting until we try and copy it to userspace. Otherwise, we need to mark the SKB as CHECKSUM_NONE, since the skb->csum field no longer contains the full packet checksum after the call to __skb_checksum_validate_complete(). Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing") Fixes: c84d949057ca ("udp: copy skb->truesize in the first cache line") Cc: Sam Kumar <samanthakumar@google.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Sean Tranchetti <stranche@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge branch 'route-dump-filter-fixes'David S. Miller2018-10-248-13/+34
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | David Ahern says: ==================== net: Fixups for recent dump filtering changes Li RongQing noted that tgt_net is leaked in ipv4 due to the recent change to handle address dumps for a specific device. The report also applies to ipv6 and other error paths. Patches 1 and 2 fix those leaks. Patch 3 stops route dumps from erroring out when dumping across address families and a table id is given. This is needed in preparation for patch 4. Patch 4 updates the rtnl_dump_all to handle a failure in one of the dumpit functions. At the moment, if an address dump returns an error the dump all loop breaks but the error is dropped. The result can be no data is returned and no error either leaving the user wondering about the addresses. Patches were tested with a modified iproute2 to add invalid data to the dump request causing each specific failure path to be hit in addition to positive testing that it works as it should when given valid data. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net: rtnl_dump_all needs to propagate error from dumpit functionDavid Ahern2018-10-241-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If an address, route or netconf dump request is sent for AF_UNSPEC, then rtnl_dump_all is used to do the dump across all address families. If one of the dumpit functions fails (e.g., invalid attributes in the dump request) then rtnl_dump_all needs to propagate that error so the user gets an appropriate response instead of just getting no data. Fixes: effe67926624 ("net: Enable kernel side filtering of route dumps") Fixes: 5fcd266a9f64 ("net/ipv4: Add support for dumping addresses for a specific device") Fixes: 6371a71f3a3b ("net/ipv6: Add support for dumping addresses for a specific device") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net: Don't return invalid table id error when dumping all familiesDavid Ahern2018-10-245-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When doing a route dump across all address families, do not error out if the table does not exist. This allows a route dump for AF_UNSPEC with a table id that may only exist for some of the families. Do return the table does not exist error if dumping routes for a specific family and the table does not exist. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net/ipv6: Put target net when address dump fails due to bad attributesDavid Ahern2018-10-241-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If tgt_net is set based on IFA_TARGET_NETNSID attribute in the dump request, make sure all error paths call put_net. Fixes: 6371a71f3a3b ("net/ipv6: Add support for dumping addresses for a specific device") Fixes: ed6eff11790a ("net/ipv6: Update inet6_dump_addr for strict data checking") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | net/ipv4: Put target net when address dump fails due to bad attributesDavid Ahern2018-10-241-5/+8
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If tgt_net is set based on IFA_TARGET_NETNSID attribute in the dump request, make sure all error paths call put_net. Fixes: 5fcd266a9f64 ("net/ipv4: Add support for dumping addresses for a specific device") Fixes: c33078e3dfb1 ("net/ipv4: Update inet_dump_ifaddr for strict data checking") Reported-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds2018-10-277-11/+44
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull sparc fixes from David Miller: "Some more sparc fixups, mostly aimed at getting the allmodconfig build up and clean again" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: Rework xchg() definition to avoid warnings. sparc64: Export __node_distance. sparc64: Make corrupted user stacks more debuggable.
| * | | sparc64: Rework xchg() definition to avoid warnings.David S. Miller2018-10-271-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Such as: fs/ocfs2/file.c: In function ‘ocfs2_file_write_iter’: ./arch/sparc/include/asm/cmpxchg_64.h:55:22: warning: value computed is not used [-Wunused-value] #define xchg(ptr,x) ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr)))) and drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c: In function ‘ixgbevf_xdp_setup’: ./arch/sparc/include/asm/cmpxchg_64.h:55:22: warning: value computed is not used [-Wunused-value] #define xchg(ptr,x) ((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr)))) Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | sparc64: Export __node_distance.David S. Miller2018-10-271-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some drivers reference it via node_distance(), for example the NVME host driver core. ERROR: "__node_distance" [drivers/nvme/host/nvme-core.ko] undefined! make[1]: *** [scripts/Makefile.modpost:92: __modpost] Error 1 Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | sparc64: Make corrupted user stacks more debuggable.David Miller2018-10-275-10/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Right now if we get a corrupted user stack frame we do a do_exit(SIGILL) which is not helpful. If under a debugger, this behavior causes the inferior process to exit. So the register and other state cannot be examined at the time of the event. Instead, conditionally log a rate limited kernel log message and then force a SIGSEGV. With bits and ideas borrowed (as usual) from powerpc. Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | Merge tag 'mips_4.20' of ↵Linus Torvalds2018-10-2668-1017/+918
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS updates from Paul Burton: - kexec support for the generic MIPS platform when running on a CPU including the MIPS Coherence Manager & related hardware. - Improvements to the definition of memory barriers used around MMIO accesses, and fixes in their use. - Switch to CONFIG_NO_BOOTMEM from Mike Rapoport, finally dropping reliance on the old bootmem code. - A number of fixes & improvements for Loongson 3 systems. - DT & config updates for the Microsemi Ocelot platform. - Workaround to enable USB power on the Netgear WNDR3400v3. - Various cleanups & fixes. * tag 'mips_4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (51 commits) MIPS: Cleanup DSP ASE detection MIPS: dts: Change upper case to lower case MIPS: generic: Add Network, SPI and I2C to ocelot_defconfig MIPS: Loongson-3: Fix BRIDGE irq delivery problem MIPS: Loongson-3: Fix CPU UART irq delivery problem MIPS: Remove unused PREF, PREFE & PREFX macros MIPS: lib: Use kernel_pref & user_pref in memcpy() MIPS: Remove unused CAT macro MIPS: Add kernel_pref & user_pref helpers MIPS: Remove unused TTABLE macro MIPS: Remove unused PIC macros MIPS: Remove unused MOVN & MOVZ macros MIPS: Provide actually relaxed MMIO accessors MIPS: Enforce strong ordering for MMIO accessors MIPS: Correct `mmiowb' barrier for `wbflush' platforms MIPS: Define MMIO ordering barriers MIPS: mscc: add PCB120 to the ocelot fitImage MIPS: mscc: add DT for Ocelot PCB120 MIPS: memset: Limit excessive `noreorder' assembly mode use MIPS: memset: Fix CPU_DADDI_WORKAROUNDS `small_fixup' regression ...
| * | | | MIPS: Cleanup DSP ASE detectionPaul Burton2018-10-173-19/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we hardcode a list of files for which we specify that the toolchain has DSP ASE support when building for MIPSr2 only. This has a number of problems: 1) It doesn't actually ensure that the toolchain supports the DSP ASE at all. 2) It's fragile if we try to use DSP ASE macros in other files. 3) It makes no provision for MIPSr6 & later systems which also support the DSP ASE & end up using the .word directive implementation of the DSP macros. Fix this by detecting assembler support for the DSP ASE globally, not just for a small set of files, and not just for MIPSr2. This now exposes use of toolchain DSP support to kernel builds targeting MIPSr1 and older, so we add .set MIPS_ISA_LEVEL directives prior to all .set dsp directives in order to prevent the assembler from complaining that the DSP ASE is only supported with MIPSr2 & higher. Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20901/ Cc: linux-mips@linux-mips.org
| * | | | MIPS: dts: Change upper case to lower caseSongjun Wu2018-10-162-28/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All the upper case in unit-address and hex constants are changed to lower case according to the DT conventions. Signed-off-by: Songjun Wu <songjun.wu@linux.intel.com> Signed-off-by: Paul Burton <paul.burton@mips.com> Reviewed-by: Rob Herring <robh@kernel.org> Patchwork: https://patchwork.linux-mips.org/patch/20768/ Cc: yixin.zhu@linux.intel.com Cc: chuanhua.lei@linux.intel.com Cc: hauke.mehrtens@intel.com Cc: devicetree@vger.kernel.org Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ralf Baechle <ralf@linux-mips.org>
| * | | | MIPS: generic: Add Network, SPI and I2C to ocelot_defconfigAlexandre Belloni2018-10-161-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for the integrated switch, and the SPI and I2C controller found on MSCC Ocelot. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20345/ Cc: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-kernel@vger.kernel.org Cc: linux-mips@linux-mips.org
| * | | | MIPS: Loongson-3: Fix BRIDGE irq delivery problemHuacai Chen2018-10-162-11/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After commit e509bd7da149dc349160 ("genirq: Allow migration of chained interrupts by installing default action") Loongson-3 fails at here: setup_irq(LOONGSON_HT1_IRQ, &cascade_irqaction); This is because both chained_action and cascade_irqaction don't have IRQF_SHARED flag. This will cause Loongson-3 resume fails because HPET timer interrupt can't be delivered during S3. So we set the irqchip of the chained irq to loongson_irq_chip which doesn't disable the chained irq in CP0.Status. Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhc@lemote.com> Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20434/ Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com> Cc: Huacai Chen <chenhuacai@gmail.com>
| * | | | MIPS: Loongson-3: Fix CPU UART irq delivery problemHuacai Chen2018-10-161-40/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Masking/unmasking the CPU UART irq in CP0_Status (and redirecting it to other CPUs) may cause interrupts be lost, especially in multi-package machines (Package-0's UART irq cannot be delivered to others). So make mask_loongson_irq() and unmask_loongson_irq() be no-ops. The original problem (UART IRQ may deliver to any core) is also because of masking/unmasking the CPU UART irq in CP0_Status. So it is safe to remove all of the stuff. Signed-off-by: Huacai Chen <chenhc@lemote.com> Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20433/ Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: linux-mips@linux-mips.org Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com> Cc: Huacai Chen <chenhuacai@gmail.com>
| * | | | MIPS: Remove unused PREF, PREFE & PREFX macrosPaul Burton2018-10-161-36/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | asm/asm.h provides PREF(), PREFE() & PREFX() macros which are now entirely unused. Delete the dead code. Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20908/ Cc: linux-mips@linux-mips.org
| * | | | MIPS: lib: Use kernel_pref & user_pref in memcpy()Paul Burton2018-10-161-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | memcpy() is the only user of the PREF() & PREFE() macros from asm/asm.h. Switch to using the kernel_pref() & user_pref() macros from asm/asm-eva.h which fit more consistently with other abstractions of EVA vs non-EVA instructions. Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20907/ Cc: linux-mips@linux-mips.org
| * | | | MIPS: Remove unused CAT macroPaul Burton2018-10-161-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | asm/asm.h provides a CAT macro which is unused throughout the tree, and if anyone wanted it the generic CONCATENATE macro in linux/kernel.h provides the same functionality. Delete the dead code. Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20905/ Cc: linux-mips@linux-mips.org
| * | | | MIPS: Add kernel_pref & user_pref helpersPaul Burton2018-10-161-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add kernel_pref & user_pref macros to asm/asm-eva.h, providing an abstraction around EVA & non-EVA pref instructions consistent with the existing macros we have for cache & load/store instructions. Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20906/ Cc: linux-mips@linux-mips.org
| * | | | MIPS: Remove unused TTABLE macroPaul Burton2018-10-161-11/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | asm/asm.h contains a TTABLE macro to generate "text tables" which would appear to be arrays of pointers to strings. It is unused throughout the kernel tree, so delete the dead code. Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20904/ Cc: linux-mips@linux-mips.org
| * | | | MIPS: Remove unused PIC macrosPaul Burton2018-10-161-17/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | asm/asm.h contains CPRESTORE, CPADD & CPLOAD macros that are intended for use with position independent code, but are not used anywhere in the kernel - along with a comment to that effect. Remove the dead code. Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20903/ Cc: linux-mips@linux-mips.org
| * | | | MIPS: Remove unused MOVN & MOVZ macrosPaul Burton2018-10-161-43/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have macros in asm/asm.h to allow for use of the MOVN & MOVZ instructions with compare-and-branch sequences providing compatibility for ISA versions which don't include those instructions. However the macros are unused, and appear to have always been unused. Delete the dead code. Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20909/ Cc: linux-mips@linux-mips.org
| * | | | MIPS: Provide actually relaxed MMIO accessorsMaciej W. Rozycki2018-10-091-20/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Improve performance for the relevant systems and remove the DMA ordering barrier from `readX_relaxed' and `writeX_relaxed' MMIO accessors, where it is not needed according to our requirements[1]. For consistency make the same arrangement with low-level port I/O accessors, but do not actually provide any accessors making use of it. References: [1] "LINUX KERNEL MEMORY BARRIERS", Documentation/memory-barriers.txt, Section "KERNEL I/O BARRIER EFFECTS" Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20865/ Cc: Ralf Baechle <ralf@linux-mips.org>
| * | | | MIPS: Enforce strong ordering for MMIO accessorsMaciej W. Rozycki2018-10-091-8/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Architecturally the MIPS ISA does not specify ordering requirements for uncached bus accesses such as MMIO operations normally use and therefore explicit barriers have to be inserted between MMIO accesses where unspecified ordering of operations would cause unpredictable results. For example the R2020 write buffer implements write gathering and combining[1] and as used with the DECstation models 2100 and 3100 for MMIO accesses it bypasses the read buffer entirely, because conflicts are resolved by the memory controller for DRAM accesses only[2] (NB the R2020 and R3020 buffers are the same except for the maximum clock rate). Consequently if a device has say a 16-bit control register at offset 0, a 16-bit event mask register at offset 2 and a 16-bit reset register at offset 4, and the initial value of the control register is 0x1111, then in the absence of barriers a hypothetical code sequence like this: u16 init_dev(u16 __iomem *dev); u16 x; write16(dev + 2, 0xffff); write16(dev + 0, 0x2222); x = read16(dev + 0); write16(dev + 1, 0x3333); write16(dev + 0, 0x4444); return x; } will return 0x1111 and issue a single 32-bit write of 0x33334444 (in the little-endian bus configuration) to offset 0 on the system bus. This is because the read to set `x' from offset 0 bypasses the write of 0x2222 that is still in the write buffer pending the completion of the write of 0xffff to the reset register. Then the write of 0x3333 to the event mask register is merged with the preceding write to the control register as they share the same word address, making it a 32-bit write of 0x33332222 to offset 0. Finally the write of 0x4444 to the control register is combined with the outstanding 32-bit write of 0x33332222 to offset 0, because, again, it shares the same address. This is an example from a legacy system, given here because it is well documented and affects a machine we actually support. But likewise modern MIPS systems may implement weak MMIO ordering, possibly even without having it clearly documented except for being compliant with the architecture specification with respect to the currently defined SYNC instruction variants[3]. Considering the above and that we are required to implement MMIO accessors such that individual accesses made with them are strongly ordered with respect to each other[4], add the necessary barriers to our `inX', `outX', `readX' and `writeX' handlers, as well the associated special use variants. It's up to platforms then to possibly define the respective barriers so as to expand to nil if no ordering enforcement is actually needed for a given system; SYNC is supposed to be as cheap as a NOP on strongly ordered MIPS implementations though. Retain the option to generate weakly-ordered accessors, so that the arrangement for `war_io_reorder_wmb' is not lost in case we need it for fully raw accessors in the future. The reason for this is that it is unclear from commit 1e820da3c9af ("MIPS: Loongson-3: Introduce CONFIG_LOONGSON3_ENHANCEMENT") and especially commit 8faca49a6731 ("MIPS: Modify core io.h macros to account for the Octeon Errata Core-301.") why they are needed there under the previous assumption that these accessors can be weakly ordered. References: [1] "LR3020 Write Buffer", LSI Logic Corporation, September 1988, Section "Byte Gathering", pp. 6-7 [2] "DECstation 3100 Desktop Workstation Functional Specification", Digital Equipment Corporation, Revision 1.3, August 28, 1990, Section 6.1 "Processor", p. 4 [3] "MIPS Architecture For Programmers, Volume II-A: The MIPS32 Instruction Set Manual", Imagination Technologies LTD, Document Number: MD00086, Revision 6.06, December 15, 2016, Table 5.5 "Encodings of the Bits[10:6] of the SYNC instruction; the SType Field", p. 409 [4] "LINUX KERNEL MEMORY BARRIERS", Documentation/memory-barriers.txt, Section "KERNEL I/O BARRIER EFFECTS" Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> References: 8faca49a6731 ("MIPS: Modify core io.h macros to account for the Octeon Errata Core-301.") References: 1e820da3c9af ("MIPS: Loongson-3: Introduce CONFIG_LOONGSON3_ENHANCEMENT") Signed-off-by: Paul Burton <paul.burton@mips.com> Patchwork: https://patchwork.linux-mips.org/patch/20864/ Cc: Ralf Baechle <ralf@linux-mips.org>