summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'net-5.19-rc3' of ↵Linus Torvalds2022-06-1636-752/+433
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Mostly driver fixes. Current release - regressions: - Revert "net: Add a second bind table hashed by port and address", needs more work - amd-xgbe: use platform_irq_count(), static setup of IRQ resources had been removed from DT core - dts: at91: ksz9477_evb: add phy-mode to fix port/phy validation Current release - new code bugs: - hns3: modify the ring param print info Previous releases - always broken: - axienet: make the 64b addressable DMA depends on 64b architectures - iavf: fix issue with MAC address of VF shown as zero - ice: fix PTP TX timestamp offset calculation - usb: ax88179_178a needs FLAG_SEND_ZLP Misc: - document some net.sctp.* sysctls" * tag 'net-5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (31 commits) net: axienet: add missing error return code in axienet_probe() Revert "net: Add a second bind table hashed by port and address" net: ax25: Fix deadlock caused by skb_recv_datagram in ax25_recvmsg net: usb: ax88179_178a needs FLAG_SEND_ZLP MAINTAINERS: add include/dt-bindings/net to NETWORKING DRIVERS ARM: dts: at91: ksz9477_evb: fix port/phy validation net: bgmac: Fix an erroneous kfree() in bgmac_remove() ice: Fix memory corruption in VF driver ice: Fix queue config fail handling ice: Sync VLAN filtering features for DVM ice: Fix PTP TX timestamp offset calculation mlxsw: spectrum_cnt: Reorder counter pools docs: networking: phy: Fix a typo amd-xgbe: Use platform_irq_count() octeontx2-vf: Add support for adaptive interrupt coalescing xilinx: Fix build on x86. net: axienet: Use iowrite64 to write all 64b descriptor pointers net: axienet: make the 64b addresable DMA depends on 64b archectures net: hns3: fix tm port shapping of fibre port is incorrect after driver initialization net: hns3: fix PF rss size initialization bug ...
| * net: axienet: add missing error return code in axienet_probe()Yang Yingliang2022-06-161-0/+1
| | | | | | | | | | | | | | | | | | | | It should return error code in error path in axienet_probe(). Fixes: 00be43a74ca2 ("net: axienet: make the 64b addresable DMA depends on 64b archectures") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20220616062917.3601-1-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * Revert "net: Add a second bind table hashed by port and address"Joanne Koong2022-06-1610-611/+83
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts: commit d5a42de8bdbe ("net: Add a second bind table hashed by port and address") commit 538aaf9b2383 ("selftests: Add test for timing a bind request to a port with a populated bhash entry") Link: https://lore.kernel.org/netdev/20220520001834.2247810-1-kuba@kernel.org/ There are a few things that need to be fixed here: * Updating bhash2 in cases where the socket's rcv saddr changes * Adding bhash2 hashbucket locks Links to syzbot reports: https://lore.kernel.org/netdev/00000000000022208805e0df247a@google.com/ https://lore.kernel.org/netdev/0000000000003f33bc05dfaf44fe@google.com/ Fixes: d5a42de8bdbe ("net: Add a second bind table hashed by port and address") Reported-by: syzbot+015d756bbd1f8b5c8f09@syzkaller.appspotmail.com Reported-by: syzbot+98fd2d1422063b0f8c44@syzkaller.appspotmail.com Reported-by: syzbot+0a847a982613c6438fba@syzkaller.appspotmail.com Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Link: https://lore.kernel.org/r/20220615193213.2419568-1-joannelkoong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * net: ax25: Fix deadlock caused by skb_recv_datagram in ax25_recvmsgDuoming Zhou2022-06-151-5/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The skb_recv_datagram() in ax25_recvmsg() will hold lock_sock and block until it receives a packet from the remote. If the client doesn`t connect to server and calls read() directly, it will not receive any packets forever. As a result, the deadlock will happen. The fail log caused by deadlock is shown below: [ 369.606973] INFO: task ax25_deadlock:157 blocked for more than 245 seconds. [ 369.608919] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 369.613058] Call Trace: [ 369.613315] <TASK> [ 369.614072] __schedule+0x2f9/0xb20 [ 369.615029] schedule+0x49/0xb0 [ 369.615734] __lock_sock+0x92/0x100 [ 369.616763] ? destroy_sched_domains_rcu+0x20/0x20 [ 369.617941] lock_sock_nested+0x6e/0x70 [ 369.618809] ax25_bind+0xaa/0x210 [ 369.619736] __sys_bind+0xca/0xf0 [ 369.620039] ? do_futex+0xae/0x1b0 [ 369.620387] ? __x64_sys_futex+0x7c/0x1c0 [ 369.620601] ? fpregs_assert_state_consistent+0x19/0x40 [ 369.620613] __x64_sys_bind+0x11/0x20 [ 369.621791] do_syscall_64+0x3b/0x90 [ 369.622423] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 369.623319] RIP: 0033:0x7f43c8aa8af7 [ 369.624301] RSP: 002b:00007f43c8197ef8 EFLAGS: 00000246 ORIG_RAX: 0000000000000031 [ 369.625756] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f43c8aa8af7 [ 369.626724] RDX: 0000000000000010 RSI: 000055768e2021d0 RDI: 0000000000000005 [ 369.628569] RBP: 00007f43c8197f00 R08: 0000000000000011 R09: 00007f43c8198700 [ 369.630208] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff845e6afe [ 369.632240] R13: 00007fff845e6aff R14: 00007f43c8197fc0 R15: 00007f43c8198700 This patch replaces skb_recv_datagram() with an open-coded variant of it releasing the socket lock before the __skb_wait_for_more_packets() call and re-acquiring it after such call in order that other functions that need socket lock could be executed. what's more, the socket lock will be released only when recvmsg() will block and that should produce nicer overall behavior. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Suggested-by: Thomas Osterried <thomas@osterried.de> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Reported-by: Thomas Habets <thomas@@habets.se> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: usb: ax88179_178a needs FLAG_SEND_ZLPJose Alonso2022-06-151-13/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The extra byte inserted by usbnet.c when (length % dev->maxpacket == 0) is causing problems to device. This patch sets FLAG_SEND_ZLP to avoid this. Tested with: 0b95:1790 ASIX Electronics Corp. AX88179 Gigabit Ethernet Problems observed: ====================================================================== 1) Using ssh/sshfs. The remote sshd daemon can abort with the message: "message authentication code incorrect" This happens because the tcp message sent is corrupted during the USB "Bulk out". The device calculate the tcp checksum and send a valid tcp message to the remote sshd. Then the encryption detects the error and aborts. 2) NETDEV WATCHDOG: ... (ax88179_178a): transmit queue 0 timed out 3) Stop normal work without any log message. The "Bulk in" continue receiving packets normally. The host sends "Bulk out" and the device responds with -ECONNRESET. (The netusb.c code tx_complete ignore -ECONNRESET) Under normal conditions these errors take days to happen and in intense usage take hours. A test with ping gives packet loss, showing that something is wrong: ping -4 -s 462 {destination} # 462 = 512 - 42 - 8 Not all packets fail. My guess is that the device tries to find another packet starting at the extra byte and will fail or not depending on the next bytes (old buffer content). ====================================================================== Signed-off-by: Jose Alonso <joalonsof@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch '100GbE' of ↵David S. Miller2022-06-155-46/+94
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-06-14 This series contains updates to ice driver only. Michal fixes incorrect Tx timestamp offset calculation for E822 devices. Roman enforces required VLAN filtering settings for double VLAN mode. Przemyslaw fixes memory corruption issues with VFs by ensuring queues are disabled in the error path of VF queue configuration and to disabled VFs during reset. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * ice: Fix memory corruption in VF driverPrzemyslaw Patynowski2022-06-141-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Disable VF's RX/TX queues, when it's disabled. VF can have queues enabled, when it requests a reset. If PF driver assumes that VF is disabled, while VF still has queues configured, VF may unmap DMA resources. In such scenario device still can map packets to memory, which ends up silently corrupting it. Previously, VF driver could experience memory corruption, which lead to crash: [ 5119.170157] BUG: unable to handle kernel paging request at 00001b9780003237 [ 5119.170166] PGD 0 P4D 0 [ 5119.170173] Oops: 0002 [#1] PREEMPT_RT SMP PTI [ 5119.170181] CPU: 30 PID: 427592 Comm: kworker/u96:2 Kdump: loaded Tainted: G W I --------- - - 4.18.0-372.9.1.rt7.166.el8.x86_64 #1 [ 5119.170189] Hardware name: Dell Inc. PowerEdge R740/014X06, BIOS 2.3.10 08/15/2019 [ 5119.170193] Workqueue: iavf iavf_adminq_task [iavf] [ 5119.170219] RIP: 0010:__page_frag_cache_drain+0x5/0x30 [ 5119.170238] Code: 0f 0f b6 77 51 85 f6 74 07 31 d2 e9 05 df ff ff e9 90 fe ff ff 48 8b 05 49 db 33 01 eb b4 0f 1f 80 00 00 00 00 0f 1f 44 00 00 <f0> 29 77 34 74 01 c3 48 8b 07 f6 c4 80 74 0f 0f b6 77 51 85 f6 74 [ 5119.170244] RSP: 0018:ffffa43b0bdcfd78 EFLAGS: 00010282 [ 5119.170250] RAX: ffffffff896b3e40 RBX: ffff8fb282524000 RCX: 0000000000000002 [ 5119.170254] RDX: 0000000049000000 RSI: 0000000000000000 RDI: 00001b9780003203 [ 5119.170259] RBP: ffff8fb248217b00 R08: 0000000000000022 R09: 0000000000000009 [ 5119.170262] R10: 2b849d6300000000 R11: 0000000000000020 R12: 0000000000000000 [ 5119.170265] R13: 0000000000001000 R14: 0000000000000009 R15: 0000000000000000 [ 5119.170269] FS: 0000000000000000(0000) GS:ffff8fb1201c0000(0000) knlGS:0000000000000000 [ 5119.170274] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5119.170279] CR2: 00001b9780003237 CR3: 00000008f3e1a003 CR4: 00000000007726e0 [ 5119.170283] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 5119.170286] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 5119.170290] PKRU: 55555554 [ 5119.170292] Call Trace: [ 5119.170298] iavf_clean_rx_ring+0xad/0x110 [iavf] [ 5119.170324] iavf_free_rx_resources+0xe/0x50 [iavf] [ 5119.170342] iavf_free_all_rx_resources.part.51+0x30/0x40 [iavf] [ 5119.170358] iavf_virtchnl_completion+0xd8a/0x15b0 [iavf] [ 5119.170377] ? iavf_clean_arq_element+0x210/0x280 [iavf] [ 5119.170397] iavf_adminq_task+0x126/0x2e0 [iavf] [ 5119.170416] process_one_work+0x18f/0x420 [ 5119.170429] worker_thread+0x30/0x370 [ 5119.170437] ? process_one_work+0x420/0x420 [ 5119.170445] kthread+0x151/0x170 [ 5119.170452] ? set_kthread_struct+0x40/0x40 [ 5119.170460] ret_from_fork+0x35/0x40 [ 5119.170477] Modules linked in: iavf sctp ip6_udp_tunnel udp_tunnel mlx4_en mlx4_core nfp tls vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM ipt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink bridge stp llc rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc intel_rapl_msr iTCO_wdt iTCO_vendor_support dell_smbios wmi_bmof dell_wmi_descriptor dcdbas kvm_intel kvm irqbypass intel_rapl_common isst_if_common skx_edac irdma nfit libnvdimm x86_pkg_temp_thermal i40e intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ib_uverbs rapl ipmi_ssif intel_cstate intel_uncore mei_me pcspkr acpi_ipmi ib_core mei lpc_ich i2c_i801 ipmi_si ipmi_devintf wmi ipmi_msghandler acpi_power_meter xfs libcrc32c sd_mod t10_pi sg mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ice ahci drm libahci crc32c_intel libata tg3 megaraid_sas [ 5119.170613] i2c_algo_bit dm_mirror dm_region_hash dm_log dm_mod fuse [last unloaded: iavf] [ 5119.170627] CR2: 00001b9780003237 Fixes: ec4f5a436bdf ("ice: Check if VF is disabled for Opcode and other operations") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Co-developed-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| | * ice: Fix queue config fail handlingPrzemyslaw Patynowski2022-06-141-27/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Disable VF's RX/TX queues, when VIRTCHNL_OP_CONFIG_VSI_QUEUES fail. Not disabling them might lead to scenario, where PF driver leaves VF queues enabled, when VF's VSI failed queue config. In this scenario VF should not have RX/TX queues enabled. If PF failed to set up VF's queues, VF will reset due to TX timeouts in VF driver. Initialize iterator 'i' to -1, so if error happens prior to configuring queues then error path code will not disable queue 0. Loop that configures queues will is using same iterator, so error path code will only disable queues that were configured. Fixes: 77ca27c41705 ("ice: add support for virtchnl_queue_select.[tx|rx]_queues bitmap") Suggested-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| | * ice: Sync VLAN filtering features for DVMRoman Storozhenko2022-06-141-18/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | VLAN filtering features, that is C-Tag and S-Tag, in DVM mode must be both enabled or disabled. In case of turning off/on only one of the features, another feature must be turned off/on automatically with issuing an appropriate message to the kernel log. Fixes: 1babaf77f49d ("ice: Advertise 802.1ad VLAN filtering and offloads for PF netdev") Signed-off-by: Roman Storozhenko <roman.storozhenko@intel.com> Co-developed-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| | * ice: Fix PTP TX timestamp offset calculationMichal Michalik2022-06-142-1/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The offset was being incorrectly calculated for E822 - that led to collisions in choosing TX timestamp register location when more than one port was trying to use timestamping mechanism. In E822 one quad is being logically split between ports, so quad 0 is having trackers for ports 0-3, quad 1 ports 4-7 etc. Each port should have separate memory location for tracking timestamps. Due to error for example ports 1 and 2 had been assigned to quad 0 with same offset (0), while port 1 should have offset 0 and 1 offset 16. Fix it by correctly calculating quad offset. Fixes: 3a7496234d17 ("ice: implement basic E822 PTP support") Signed-off-by: Michal Michalik <michal.michalik@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| * | MAINTAINERS: add include/dt-bindings/net to NETWORKING DRIVERSLukas Bulwahn2022-06-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Maintainers of the directory Documentation/devicetree/bindings/net are also the maintainers of the corresponding directory include/dt-bindings/net. Add the file entry for include/dt-bindings/net to the appropriate section in MAINTAINERS. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Link: https://lore.kernel.org/r/20220613121826.11484-1-lukas.bulwahn@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | ARM: dts: at91: ksz9477_evb: fix port/phy validationOleksij Rempel2022-06-151-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Latest drivers version requires phy-mode to be set. Otherwise we will use "NA" mode and the switch driver will invalidate this port mode. Fixes: 65ac79e18120 ("net: dsa: microchip: add the phylink get_caps") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/r/20220610081621.584393-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * | net: bgmac: Fix an erroneous kfree() in bgmac_remove()Christophe JAILLET2022-06-151-1/+0
| |/ | | | | | | | | | | | | | | | | | | | | | | | | 'bgmac' is part of a managed resource allocated with bgmac_alloc(). It should not be freed explicitly. Remove the erroneous kfree() from the .remove() function. Fixes: 34a5102c3235 ("net: bgmac: allocate struct bgmac just once & don't copy it") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/a026153108dd21239036a032b95c25b5cece253b.1655153616.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * mlxsw: spectrum_cnt: Reorder counter poolsPetr Machata2022-06-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Both RIF and ACL flow counters use a 24-bit SW-managed counter address to communicate which counter they want to bind. In a number of Spectrum FW releases, binding a RIF counter is broken and slices the counter index to 16 bits. As a result, on Spectrum-2 and above, no more than about 410 RIF counters can be effectively used. This translates to 205 netdevices for which L3 HW stats can be enabled. (This does not happen on Spectrum-1, because there are fewer counters available overall and the counter index never exceeds 16 bits.) Binding counters to ACLs does not have this issue. Therefore reorder the counter allocation scheme so that RIF counters come first and therefore get lower indices that are below the 16-bit barrier. Fixes: 98e60dce4da1 ("Merge branch 'mlxsw-Introduce-initial-Spectrum-2-support'") Reported-by: Maksym Yaremchuk <maksymy@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/20220613125017.2018162-1-idosch@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
| * docs: networking: phy: Fix a typoJonathan Neuschäfer2022-06-141-1/+1
| | | | | | | | | | | | | | | | | | Write "to be operated" instead of "to be operate". Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220610072809.352962-1-j.neuschaefer@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * amd-xgbe: Use platform_irq_count()Jean-Philippe Brucker2022-06-141-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The AMD XGbE driver currently counts the number of interrupts assigned to the device by inspecting the pdev->resource array. Since commit a1a2b7125e10 ("of/platform: Drop static setup of IRQ resource from DT core") removed IRQs from this array, the driver now attempts to get all interrupts from 1 to -1U and gives up probing once it reaches an invalid interrupt index. Obtain the number of IRQs with platform_irq_count() instead. Fixes: a1a2b7125e10 ("of/platform: Drop static setup of IRQ resource from DT core") Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Acked-by: Rob Herring <robh@kernel.org> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20220609161457.69614-1-jean-philippe@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * octeontx2-vf: Add support for adaptive interrupt coalescingSuman Ghosh2022-06-131-1/+2
| | | | | | | | | | | | | | | | Fixes: 6e144b47f560 (octeontx2-pf: Add support for adaptive interrupt coalescing) Added support for VF interfaces as well. Signed-off-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * xilinx: Fix build on x86.David S. Miller2022-06-131-4/+4
| | | | | | | | | | | | | | | | | | | | CONFIG_64BIT is not sufficient for checking for availability of iowrite64() and friends. Also, the out_addr helpers need to be inline. Fixes: b690f8df6497 ("net: axienet: Use iowrite64 to write all 64b descriptor pointers") Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'axienet-fixes'David S. Miller2022-06-132-24/+55
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Andy Chiu says: ==================== net: axienet: fix DMA Tx error We ran into multiple DMA TX errors while writing files over a network block device running on top of a DMA-connected AXI Ethernet device on 64-bit RISC-V machines. The errors indicated that the DMA had fetched a null descriptor and we found that the reason for this is that AXI DMA had unexpectedly processed a partially updated tail descriptor pointer. To fix it, we suggest that the driver should use one 64-bit write instead of two 32-bit writes to perform such update if possible. For those archectures where double-word load/stores are unavailable, e.g. 32-bit archectures, force a driver probe failure if the driver finds 64-bit capability on DMA. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net: axienet: Use iowrite64 to write all 64b descriptor pointersAndy Chiu2022-06-131-3/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to commit f735c40ed93c ("net: axienet: Autodetect 64-bit DMA capability") and AXI-DMA spec (pg021), on 64-bit capable dma, only writing MSB part of tail descriptor pointer causes DMA engine to start fetching descriptors. However, we found that it is true only if dma is in idle state. In other words, dma would use a tailp even if it only has LSB updated, when the dma is running. The non-atomicity of this behavior could be problematic if enough delay were introduced in between the 2 writes. For example, if an interrupt comes right after the LSB write and the cpu spends long enough time in the handler for the dma to get back into idle state by completing descriptors, then the seconcd write to MSB would treat dma to start fetching descriptors again. Since the descriptor next to the one pointed by current tail pointer is not filled by the kernel yet, fetching a null descriptor here causes a dma internal error and halt the dma engine down. We suggest that the dma engine should start process a 64-bit MMIO write to the descriptor pointer only if ONE 32-bit part of it is written on all states. Or we should restrict the use of 64-bit addressable dma on 32-bit platforms, since those devices have no instruction to guarantee the write to LSB and MSB part of tail pointer occurs atomically to the dma. initial condition: curp = x-3; tailp = x-2; LSB = x; MSB = 0; cpu: |dma: iowrite32(LSB, tailp) | completes #(x-3) desc, curp = x-3 ... | tailp updated => irq | completes #(x-2) desc, curp = x-2 ... | completes #(x-1) desc, curp = x-1 ... | ... ... | completes #x desc, curp = tailp = x <= irqreturn | reaches tailp == curp = x, idle iowrite32(MSB, tailp + 4) | ... | tailp updated, starts fetching... | fetches #(x + 1) desc, sees cntrl = 0 | post Tx error, halt Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Reported-by: Max Hsu <max.hsu@sifive.com> Reviewed-by: Greentime Hu <greentime.hu@sifive.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net: axienet: make the 64b addresable DMA depends on 64b archecturesAndy Chiu2022-06-132-24/+40
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently it is not safe to config the IP as 64-bit addressable on 32-bit archectures, which cannot perform a double-word store on its descriptor pointers. The pointer is 64-bit wide if the IP is configured as 64-bit, and the device would process the partially updated pointer on some states if the pointer was updated via two store-words. To prevent such condition, we force a probe fail if we discover that the IP has 64-bit capability but it is not running on a 64-Bit kernel. This is a series of patch (1/2). The next patch must be applied in order to make 64b DMA safe on 64b archectures. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Reported-by: Max Hsu <max.hsu@sifive.com> Reviewed-by: Greentime Hu <greentime.hu@sifive.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'hns3-fixres'David S. Miller2022-06-135-37/+86
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Guangbin Huang says: ==================== net: hns3: add some fixes for -net This series adds some fixes for the HNS3 ethernet driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net: hns3: fix tm port shapping of fibre port is incorrect after driver ↵Guangbin Huang2022-06-133-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | initialization Currently in driver initialization process, driver will set shapping parameters of tm port to default speed read from firmware. However, the speed of SFP module may not be default speed, so shapping parameters of tm port may be incorrect. To fix this problem, driver sets new shapping parameters for tm port after getting exact speed of SFP module in this case. Fixes: 88d10bd6f730 ("net: hns3: add support for multiple media type") Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net: hns3: fix PF rss size initialization bugJie Wang2022-06-131-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently hns3 driver misuses the VF rss size to initialize the PF rss size in hclge_tm_vport_tc_info_update. So this patch fix it by checking the vport id before initialization. Fixes: 7347255ea389 ("net: hns3: refactor PF rss get APIs with new common rss get APIs") Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net: hns3: restore tm priority/qset to default settings when tc disabledGuangbin Huang2022-06-132-31/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, settings parameters of schedule mode, dwrr, shaper of tm priority or qset of one tc are only be set when tc is enabled, they are not restored to the default settings when tc is disabled. It confuses users when they cat tm_priority or tm_qset files of debugfs. So this patch fixes it. Fixes: 848440544b41 ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver") Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net: hns3: modify the ring param print infoJie Wang2022-06-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently tx push is also a ring param. So the original ring param print info in hns3_is_ringparam_changed should be adjusted. Fixes: 07fdc163ac88 ("net: hns3: refactor hns3_set_ringparam()") Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net: hns3: don't push link state to VF if unaliveJian Shen2022-06-131-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's unnecessary to push link state to unalive VF, and the VF will query link state from PF when it being start works. Fixes: 18b6e31f8bf4 ("net: hns3: PF add support for pushing link status to VFs") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * net: hns3: set port base vlan tbl_sta to false before removing old vlanGuangbin Huang2022-06-131-0/+1
| |/ | | | | | | | | | | | | | | | | When modify port base vlan, the port base vlan tbl_sta needs to set to false before removing old vlan, to indicate this operation is not finish. Fixes: c0f46de30c96 ("net: hns3: fix port base vlan add fail when concurrent with reset") Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'documentation-add-description-for-a-couple-of-sctp-sysctl-options'Jakub Kicinski2022-06-111-0/+37
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Xin Long says: ==================== Documentation: add description for a couple of sctp sysctl options These are a couple of sysctl options I recently added, but missed adding documents for them. Especially for net.sctp.intl_enable, it's hard for users to setup stream interleaving, as it also needs to call some socket options. This patchset is to add documents for them. ==================== Link: https://lore.kernel.org/r/cover.1654787716.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * Documentation: add description for net.sctp.ecn_enableXin Long2022-06-111-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | Describe it in networking/ip-sysctl.rst like other SCTP options. Fixes: 2f5268a9249b ("sctp: allow users to set netns ecn flag with sysctl") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * Documentation: add description for net.sctp.intl_enableXin Long2022-06-111-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Describe it in networking/ip-sysctl.rst like other SCTP options. We need to document this especially as when using the feature of User Message Interleaving, some socket options also needs to be set. Fixes: 463118c34a35 ("sctp: support sysctl to allow users to use stream interleave") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * Documentation: add description for net.sctp.reconf_enableXin Long2022-06-111-0/+11
| |/ | | | | | | | | | | | | | | | | Describe it in networking/ip-sysctl.rst like other SCTP options. Fixes: c0d8bab6ae51 ("sctp: add get and set sockopt for reconf_enable") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| * Merge branch '40GbE' of ↵Jakub Kicinski2022-06-114-10/+24
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-06-09 Grzegorz prevents addition of TC flower filters to TC0 and fixes queue iteration for VF ADQ to number of actual queues for i40e. Aleksandr prevents running of ethtool tests when device is being reset for i40e. Michal resolves an issue where iavf does not report its MAC address properly. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: iavf: Fix issue with MAC address of VF shown as zero i40e: Fix call trace in setup_tx_descriptors i40e: Fix calculating the number of queue pairs i40e: Fix adding ADQ filter to TC0 ==================== Link: https://lore.kernel.org/r/20220609162620.2619258-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
| | * iavf: Fix issue with MAC address of VF shown as zeroMichal Wilczynski2022-06-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After reinitialization of iavf, ice driver gets VIRTCHNL_OP_ADD_ETH_ADDR message with incorrectly set type of MAC address. Hardware address should have is_primary flag set as true. This way ice driver knows what it has to set as a MAC address. Check if the address is primary in iavf_add_filter function and set flag accordingly. To test set all-zero MAC on a VF. This triggers iavf re-initialization and VIRTCHNL_OP_ADD_ETH_ADDR message gets sent to PF. For example: ip link set dev ens785 vf 0 mac 00:00:00:00:00:00 This triggers re-initialization of iavf. New MAC should be assigned. Now check if MAC is non-zero: ip link show dev ens785 Fixes: a3e839d539e0 ("iavf: Add usage of new virtchnl format to set default MAC") Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| | * i40e: Fix call trace in setup_tx_descriptorsAleksandr Loktionov2022-06-091-8/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After PF reset and ethtool -t there was call trace in dmesg sometimes leading to panic. When there was some time, around 5 seconds, between reset and test there were no errors. Problem was that pf reset calls i40e_vsi_close in prep_for_reset and ethtool -t calls i40e_vsi_close in diag_test. If there was not enough time between those commands the second i40e_vsi_close starts before previous i40e_vsi_close was done which leads to crash. Add check to diag_test if pf is in reset and don't start offline tests if it is true. Add netif_info("testing failed") into unhappy path of i40e_diag_test() Fixes: e17bc411aea8 ("i40e: Disable offline diagnostics if VFs are enabled") Fixes: 510efb2682b3 ("i40e: Fix ethtool offline diagnostic with netqueues") Signed-off-by: Michal Jaron <michalx.jaron@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| | * i40e: Fix calculating the number of queue pairsGrzegorz Szczurek2022-06-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If ADQ is enabled for a VF, then actual number of queue pair is a number of currently available traffic classes for this VF. Without this change the configuration of the Rx/Tx queues fails with error. Fixes: d29e0d233e0d ("i40e: missing input validation on VF message handling by the PF") Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
| | * i40e: Fix adding ADQ filter to TC0Grzegorz Szczurek2022-06-091-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Procedure of configure tc flower filters erroneously allows to create filters on TC0 where unfiltered packets are also directed by default. Issue was caused by insufficient checks of hw_tc parameter specifying the hardware traffic class to pass matching packets to. Fix checking hw_tc parameter which blocks creation of filters on TC0. Fixes: 2f4b411a3d67 ("i40e: Enable cloud filters via tc-flower") Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
* | | Merge tag 'hardening-v5.19-rc3' of ↵Linus Torvalds2022-06-154-21/+30
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: - Correctly handle vm_map areas in hardened usercopy (Matthew Wilcox) - Adjust CFI RCU usage to avoid boot splats with cpuidle (Sami Tolvanen) * tag 'hardening-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: usercopy: Make usercopy resilient against ridiculously large copies usercopy: Cast pointer to an integer once usercopy: Handle vm_map_ram() areas cfi: Fix __cfi_slowpath_diag RCU usage with cpuidle
| * | | usercopy: Make usercopy resilient against ridiculously large copiesMatthew Wilcox (Oracle)2022-06-131-10/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If 'n' is so large that it's negative, we might wrap around and mistakenly think that the copy is OK when it's not. Such a copy would probably crash, but just doing the arithmetic in a more simple way lets us detect and refuse this case. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Tested-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220612213227.3881769-4-willy@infradead.org
| * | | usercopy: Cast pointer to an integer onceMatthew Wilcox (Oracle)2022-06-131-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Get rid of a lot of annoying casts by setting 'addr' once at the top of the function. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Tested-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220612213227.3881769-3-willy@infradead.org
| * | | usercopy: Handle vm_map_ram() areasMatthew Wilcox (Oracle)2022-06-133-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vmalloc does not allocate a vm_struct for vm_map_ram() areas. That causes us to deny usercopies from those areas. This affects XFS which uses vm_map_ram() for its directories. Fix this by calling find_vmap_area() instead of find_vm_area(). Fixes: 0aef499f3172 ("mm/usercopy: Detect vmalloc overruns") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Tested-by: Zorro Lang <zlang@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220612213227.3881769-2-willy@infradead.org
| * | | cfi: Fix __cfi_slowpath_diag RCU usage with cpuidleSami Tolvanen2022-06-131-6/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RCU_NONIDLE usage during __cfi_slowpath_diag can result in an invalid RCU state in the cpuidle code path: WARNING: CPU: 1 PID: 0 at kernel/rcu/tree.c:613 rcu_eqs_enter+0xe4/0x138 ... Call trace: rcu_eqs_enter+0xe4/0x138 rcu_idle_enter+0xa8/0x100 cpuidle_enter_state+0x154/0x3a8 cpuidle_enter+0x3c/0x58 do_idle.llvm.6590768638138871020+0x1f4/0x2ec cpu_startup_entry+0x28/0x2c secondary_start_kernel+0x1b8/0x220 __secondary_switched+0x94/0x98 Instead, call rcu_irq_enter/exit to wake up RCU only when needed and disable interrupts for the entire CFI shadow/module check when we do. Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20220531175910.890307-1-samitolvanen@google.com Fixes: cf68fffb66d6 ("add support for Clang CFI") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
* | | | Merge tag 'tpmdd-next-v5.19-rc3' of ↵Linus Torvalds2022-06-153-13/+13
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm fixes from Jarkko Sakkinen: "Two fixes for this merge window" * tag 'tpmdd-next-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: certs: fix and refactor CONFIG_SYSTEM_BLACKLIST_HASH_LIST build certs/blacklist_hashes.c: fix const confusion in certs blacklist
| * | | | certs: fix and refactor CONFIG_SYSTEM_BLACKLIST_HASH_LIST buildMasahiro Yamada2022-06-153-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit addf466389d9 ("certs: Check that builtin blacklist hashes are valid") was applied 8 months after the submission. In the meantime, the base code had been removed by commit b8c96a6b466c ("certs: simplify $(srctree)/ handling and remove config_filename macro"). Fix the Makefile. Create a local copy of $(CONFIG_SYSTEM_BLACKLIST_HASH_LIST). It is included from certs/blacklist_hashes.c and also works as a timestamp. Send error messages from check-blacklist-hashes.awk to stderr instead of stdout. Fixes: addf466389d9 ("certs: Check that builtin blacklist hashes are valid") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Mickaël Salaün <mic@linux.microsoft.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
| * | | | certs/blacklist_hashes.c: fix const confusion in certs blacklistMasahiro Yamada2022-06-151-1/+1
|/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This file fails to compile as follows: CC certs/blacklist_hashes.o certs/blacklist_hashes.c:4:1: error: ignoring attribute ‘section (".init.data")’ because it conflicts with previous ‘section (".init.rodata")’ [-Werror=attributes] 4 | const char __initdata *const blacklist_hashes[] = { | ^~~~~ In file included from certs/blacklist_hashes.c:2: certs/blacklist.h:5:38: note: previous declaration here 5 | extern const char __initconst *const blacklist_hashes[]; | ^~~~~~~~~~~~~~~~ Apply the same fix as commit 2be04df5668d ("certs/blacklist_nohashes.c: fix const confusion in certs blacklist"). Fixes: 734114f8782f ("KEYS: Add a system blacklist keyring") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Mickaël Salaün <mic@linux.microsoft.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
* | | | Merge tag 'fs.fixes.v5.19-rc3' of ↵Linus Torvalds2022-06-151-6/+20
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull vfs idmapping fix from Christian Brauner: "This fixes an issue where we fail to change the group of a file when the caller owns the file and is a member of the group to change to. This is only relevant on idmapped mounts. There's a detailed description in the commit message and regression tests have been added to xfstests" * tag 'fs.fixes.v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: fs: account for group membership
| * | | | fs: account for group membershipChristian Brauner2022-06-141-6/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When calling setattr_prepare() to determine the validity of the attributes the ia_{g,u}id fields contain the value that will be written to inode->i_{g,u}id. This is exactly the same for idmapped and non-idmapped mounts and allows callers to pass in the values they want to see written to inode->i_{g,u}id. When group ownership is changed a caller whose fsuid owns the inode can change the group of the inode to any group they are a member of. When searching through the caller's groups we need to use the gid mapped according to the idmapped mount otherwise we will fail to change ownership for unprivileged users. Consider a caller running with fsuid and fsgid 1000 using an idmapped mount that maps id 65534 to 1000 and 65535 to 1001. Consequently, a file owned by 65534:65535 in the filesystem will be owned by 1000:1001 in the idmapped mount. The caller now requests the gid of the file to be changed to 1000 going through the idmapped mount. In the vfs we will immediately map the requested gid to the value that will need to be written to inode->i_gid and place it in attr->ia_gid. Since this idmapped mount maps 65534 to 1000 we place 65534 in attr->ia_gid. When we check whether the caller is allowed to change group ownership we first validate that their fsuid matches the inode's uid. The inode->i_uid is 65534 which is mapped to uid 1000 in the idmapped mount. Since the caller's fsuid is 1000 we pass the check. We now check whether the caller is allowed to change inode->i_gid to the requested gid by calling in_group_p(). This will compare the passed in gid to the caller's fsgid and search the caller's additional groups. Since we're dealing with an idmapped mount we need to pass in the gid mapped according to the idmapped mount. This is akin to checking whether a caller is privileged over the future group the inode is owned by. And that needs to take the idmapped mount into account. Note, all helpers are nops without idmapped mounts. New regression test sent to xfstests. Link: https://github.com/lxc/lxd/issues/10537 Link: https://lore.kernel.org/r/20220613111517.2186646-1-brauner@kernel.org Fixes: 2f221d6f7b88 ("attr: handle idmapped mounts") Cc: Seth Forshee <sforshee@digitalocean.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org # 5.15+ CC: linux-fsdevel@vger.kernel.org Reviewed-by: Seth Forshee <sforshee@digitalocean.com> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
* | | | | netfs: fix up netfs_inode_init() docbook commentLinus Torvalds2022-06-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit e81fb4198e27 ("netfs: Further cleanups after struct netfs_inode wrapper introduced") changed the argument types and names, and actually updated the comment too (although that was thanks to David Howells, not me: my original patch only changed the code). But the comment fixup didn't go quite far enough, and didn't change the argument name in the comment, resulting in include/linux/netfs.h:314: warning: Function parameter or member 'ctx' not described in 'netfs_inode_init' include/linux/netfs.h:314: warning: Excess function parameter 'inode' description in 'netfs_inode_init' during htmldoc generation. Fixes: e81fb4198e27 ("netfs: Further cleanups after struct netfs_inode wrapper introduced") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2022-06-1437-312/+640
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull kvm fixes from Paolo Bonzini: "While last week's pull request contained miscellaneous fixes for x86, this one covers other architectures, selftests changes, and a bigger series for APIC virtualization bugs that were discovered during 5.20 development. The idea is to base 5.20 development for KVM on top of this tag. ARM64: - Properly reset the SVE/SME flags on vcpu load - Fix a vgic-v2 regression regarding accessing the pending state of a HW interrupt from userspace (and make the code common with vgic-v3) - Fix access to the idreg range for protected guests - Ignore 'kvm-arm.mode=protected' when using VHE - Return an error from kvm_arch_init_vm() on allocation failure - A bunch of small cleanups (comments, annotations, indentation) RISC-V: - Typo fix in arch/riscv/kvm/vmid.c - Remove broken reference pattern from MAINTAINERS entry x86-64: - Fix error in page tables with MKTME enabled - Dirty page tracking performance test extended to running a nested guest - Disable APICv/AVIC in cases that it cannot implement correctly" [ This merge also fixes a misplaced end parenthesis bug introduced in commit 3743c2f02517 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base") pointed out by Sean Christopherson ] Link: https://lore.kernel.org/all/20220610191813.371682-1-seanjc@google.com/ * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (34 commits) KVM: selftests: Restrict test region to 48-bit physical addresses when using nested KVM: selftests: Add option to run dirty_log_perf_test vCPUs in L2 KVM: selftests: Clean up LIBKVM files in Makefile KVM: selftests: Link selftests directly with lib object files KVM: selftests: Drop unnecessary rule for STATIC_LIBS KVM: selftests: Add a helper to check EPT/VPID capabilities KVM: selftests: Move VMX_EPT_VPID_CAP_AD_BITS to vmx.h KVM: selftests: Refactor nested_map() to specify target level KVM: selftests: Drop stale function parameter comment for nested_map() KVM: selftests: Add option to create 2M and 1G EPT mappings KVM: selftests: Replace x86_page_size with PG_LEVEL_XX KVM: x86: SVM: fix nested PAUSE filtering when L0 intercepts PAUSE KVM: x86: SVM: drop preempt-safe wrappers for avic_vcpu_load/put KVM: x86: disable preemption around the call to kvm_arch_vcpu_{un|}blocking KVM: x86: disable preemption while updating apicv inhibition KVM: x86: SVM: fix avic_kick_target_vcpus_fast KVM: x86: SVM: remove avic's broken code that updated APIC ID KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base KVM: x86: document AVIC/APICv inhibit reasons KVM: x86/mmu: Set memory encryption "value", not "mask", in shadow PDPTRs ...
| * | | | | KVM: selftests: Restrict test region to 48-bit physical addresses when using ↵David Matlack2022-06-091-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | nested The selftests nested code only supports 4-level paging at the moment. This means it cannot map nested guest physical addresses with more than 48 bits. Allow perf_test_util nested mode to work on hosts with more than 48 physical addresses by restricting the guest test region to 48-bits. While here, opportunistically fix an off-by-one error when dealing with vm_get_max_gfn(). perf_test_util.c was treating this as the maximum number of GFNs, rather than the maximum allowed GFN. This didn't result in any correctness issues, but it did end up shifting the test region down slightly when using huge pages. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20220520233249.3776001-12-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>