| Commit message (Collapse) | Author | Files | Lines |
|
It is read by data path and modified from process context on remote cpu
so it is needed to use WRITE_ONCE to clear the pointer.
Fixes: efc2214b6047 ("ice: Add support for XDP")
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
xsk_buff_pool pointers that ice ring structs hold are updated via
ndo_bpf that is executed in process context while it can be read by
remote CPU at the same time within NAPI poll. Use synchronize_net()
after pointer update and {READ,WRITE}_ONCE() when working with mentioned
pointer.
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
This so we prevent Tx timeout issues. One of conditions checked on
running in the background dev_watchdog() is netif_carrier_ok(), so let
us turn it off when we disable the queues that belong to a q_vector
where XSK pool is being configured. Turn carrier on in ice_qp_ena()
only when ice_get_link_status() tells us that physical link is up.
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Don't bail out right when spotting an error within ice_qp_{dis,ena}()
but rather track error and go through whole flow of disabling and
enabling queue pair.
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Given that ice_qp_dis() is called under rtnl_lock, synchronize_net() can
be called instead of synchronize_rcu() so that XDP rings can finish its
job in a faster way. Also let us do this as earlier in XSK queue disable
flow.
Additionally, turn off regular Tx queue before disabling irqs and NAPI.
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
When ice driver is spammed with multiple xdpsock instances and flow
control is enabled, there are cases when Rx queue gets stuck and unable
to reflect the disable state in QRX_CTRL register. Similar issue has
previously been addressed in commit 13a6233b033f ("ice: Add support to
enable/disable all Rx queues before waiting").
To workaround this, let us simply not wait for a disabled state as later
patch will make sure that regardless of the encountered error in the
process of disabling a queue pair, the Rx queue will be enabled.
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Address a scenario in which XSK ZC Tx produces descriptors to XDP Tx
ring when link is either not yet fully initialized or process of
stopping the netdev has already started. To avoid this, add checks
against carrier readiness in ice_xsk_wakeup() and in ice_xmit_zc().
One could argue that bailing out early in ice_xsk_wakeup() would be
sufficient but given the fact that we produce Tx descriptors on behalf
of NAPI that is triggered for Rx traffic, the latter is also needed.
Bringing link up is an asynchronous event executed within
ice_service_task so even though interface has been brought up there is
still a time frame where link is not yet ok.
Without this patch, when AF_XDP ZC Tx is used simultaneously with stack
Tx, Tx timeouts occur after going through link flap (admin brings
interface down then up again). HW seem to be unable to transmit
descriptor to the wire after HW tail register bump which in turn causes
bit __QUEUE_STATE_STACK_XOFF to be set forever as
netdev_tx_completed_queue() sees no cleaned bytes on the input.
Fixes: 126cdfe1007a ("ice: xsk: Improve AF_XDP ZC Tx and use batching API")
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
In main_loop_s function, when the open(cfg_input, O_RDONLY) function is
run, the last fd is not closed if the "--cfg_repeat > 0" branch is not
taken.
Fixes: 05be5e273c84 ("selftests: mptcp: add disconnect tests")
Cc: stable@vger.kernel.org
Signed-off-by: Liu Jing <liujing@cmss.chinamobile.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
pm_nl_check_endpoint() currently calls an not existing helper
to mark the test as failed. Fix the wrong call.
Fixes: 03668c65d153 ("selftests: mptcp: join: rework detailed report")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Delete and re-create a signal endpoint and ensure that the PM
actually deletes and re-create the subflow.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently the per connection announced address counter is never
decreased. As a consequence, after connection establishment, if
the NL PM deletes an endpoint and adds a new/different one, no
additional subflow is created for the new endpoint even if the
current limits allow that.
Address the issue properly updating the signaled address counter
every time the NL PM removes such addresses.
Fixes: 01cacb00b35c ("mptcp: add netlink-based PM")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently the per-connection announced address counter is never
decreased. When the user-space PM is in use, this just affect
the information exposed via diag/sockopt, but it could still foul
the PM to wrong decision.
Add the missing accounting for the user-space PM's sake.
Fixes: 8b1c94da1e48 ("mptcp: only send RM_ADDR in nl_cmd_remove")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
rtnl_dellink().
The cited commit accidentally replaced tgt_net with net in rtnl_dellink().
As a result, IFLA_TARGET_NETNSID is ignored if the interface is specified
with IFLA_IFNAME or IFLA_ALT_IFNAME.
Let's pass tgt_net to rtnl_dev_get().
Fixes: cc6090e985d7 ("net: rtnetlink: introduce helper to get net_device instance by ifname")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
softirq may get lost if an Rx interrupt comes before we call
napi_enable. Move napi_enable in front of axienet_setoptions(), which
turns on the device, to address the issue.
Link: https://lists.gnu.org/archive/html/qemu-devel/2024-07/msg06160.html
Fixes: cc37610caaf8 ("net: axienet: implement NAPI and GRO receive")
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tp->scaling_ratio is not updated based on skb->len/skb->truesize once
SO_RCVBUF is set leading to the maximum window scaling to be 25% of
rcvbuf after
commit dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale")
and 50% of rcvbuf after
commit 697a6c8cec03 ("tcp: increase the default TCP scaling ratio").
50% tries to emulate the behavior of older kernels using
sysctl_tcp_adv_win_scale with default value.
Systems which were using a different values of sysctl_tcp_adv_win_scale
in older kernels ended up seeing reduced download speeds in certain
cases as covered in https://lists.openwall.net/netdev/2024/05/15/13
While the sysctl scheme is no longer acceptable, the value of 50% is
a bit conservative when the skb->len/skb->truesize ratio is later
determined to be ~0.66.
Applications not specifying SO_RCVBUF update the window scaling and
the receiver buffer every time data is copied to userspace. This
computation is now used for applications setting SO_RCVBUF to update
the maximum window scaling while ensuring that the receive buffer
is within the application specified limit.
Fixes: dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale")
Signed-off-by: Sean Tranchetti <quic_stranche@quicinc.com>
Signed-off-by: Subash Abhinov Kasiviswanathan <quic_subashab@quicinc.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We had a handful of bugs relating to key being either all 0
or just reported incorrectly as all 0. Check for this in
the tests.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We expect drivers implementing the new create/modify/destroy
API to populate the defaults in struct ethtool_rxfh_context.
In legacy API ctx isn't even passed, and rxfh.indir / rxfh.key
are NULL so drivers can't give us defaults even if they want to.
Call get_rxfh() to fetch the values. We can reuse rxfh_dev
for the get_rxfh(), rxfh stores the input from the user.
This fixes IOCTL reporting 0s instead of the default key /
indir table for drivers using legacy API.
Add a check to try to catch drivers using the new API
but not populating the key.
Fixes: 7964e7884643 ("net: ethtool: use the tracking array for get_rxfh on custom RSS contexts")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The indirection table and the key follow struct ethtool_rxfh
in user memory.
To reset the indirection table user space calls SET_RXFH with
table of size 0 (OTOH to say "no change" it should use -1 / ~0).
The logic for calculating the offset where they key sits is
incorrect in this case, as kernel would still offset by the full
table length, while for the reset there is no indir table and
key is immediately after the struct.
$ ethtool -X eth0 default hkey 01:02:03...
$ ethtool -x eth0
[...]
RSS hash key:
00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
[...]
Fixes: 3de0b592394d ("ethtool: Support for configurable RSS hash key")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As described in the kdoc for .create_rxfh_context we are responsible
for populating the defaults. The core will not call .get_rxfh
for non-0 context.
The problem can be easily observed since Netlink doesn't currently
use the cache. Using netlink ethtool:
$ ethtool -x eth0 context 1
[...]
RSS hash key:
13:60:cd:60:14:d3:55:36:86:df:90:f2:96:14:e2:21:05:57:a8:8f:a5:12:5e:54:62:7f:fd:3c:15:7e:76:05:71:42:a2:9a:73:80:09:9c
RSS hash function:
toeplitz: on
xor: off
crc32: off
But using IOCTL ethtool shows:
$ ./ethtool-old -x eth0 context 1
[...]
RSS hash key:
00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
RSS hash function:
Operation not supported
Fixes: 7964e7884643 ("net: ethtool: use the tracking array for get_rxfh on custom RSS contexts")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In commit under Fixes I split the bnxt_set_rxfh_context() function,
and attached the appropriate chunks to new ops. I missed that
bnxt_set_rxfh_context() gets called after some initial checks
in bnxt_set_rxfh(), namely that the hash function is Toeplitz.
Fixes: 5c466b4d4e75 ("eth: bnxt: move from .set_rxfh to .create_rxfh_context and friends")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There are cases where do_xdp_generic returns bpf_net_context without
clearing it. This causes various memory corruptions, so the missing
bpf_net_ctx_clear must be added.
Reported-by: syzbot+44623300f057a28baf1e@syzkaller.appspotmail.com
Fixes: fecef4cd42c6 ("tun: Assign missing bpf_net_context.")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reported-by: syzbot+3c2b6d5d4bec3b904933@syzkaller.appspotmail.com
Reported-by: syzbot+707d98c8649695eaf329@syzkaller.appspotmail.com
Reported-by: syzbot+c226757eb784a9da3e8b@syzkaller.appspotmail.com
Reported-by: syzbot+61a1cfc2b6632363d319@syzkaller.appspotmail.com
Reported-by: syzbot+709e4c85c904bcd62735@syzkaller.appspotmail.com
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In testing the recent kernel I found that the fbnic driver couldn't be
enabled on x86_64 builds. A bit of digging showed that the fbnic driver was
the only one to check for S390 to be n, all others had checked for !S390.
Since it is a boolean and not a tristate I am not sure it will be N. So
just update it to use the !S390 flag.
A quick check via "make menuconfig" verified that after making this change
there was an option to select the fbnic driver.
Fixes 0e03c643dc93 ("eth: fbnic: fix s390 build.")
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/172192698293.1903337.4255690118685300353.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
DISCOVERY_FINDING shall only be set for active scanning as passive
scanning is not meant to generate MGMT Device Found events causing
discovering state to go out of sync since userspace would believe it
is discovering when in fact it is just passive scanning.
Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219088
Fixes: 2e2515c1ba38 ("Bluetooth: hci_event: Set DISCOVERY_FINDING on SCAN_ENABLED")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
The caller of these functions in btusb.c is guarded with an
if(IS_ENABLED()) style check, so dead code is left out, but the
declarations are still needed at compile time:
drivers/bluetooth/btusb.c: In function 'btusb_mtk_reset':
drivers/bluetooth/btusb.c:2705:15: error: implicit declaration of function 'btmtk_usb_subsys_reset' [-Wimplicit-function-declaration]
2705 | err = btmtk_usb_subsys_reset(hdev, btmtk_data->dev_id);
| ^~~~~~~~~~~~~~~~~~~~~~
drivers/bluetooth/btusb.c: In function 'btusb_send_frame_mtk':
drivers/bluetooth/btusb.c:2720:23: error: implicit declaration of function 'alloc_mtk_intr_urb' [-Wimplicit-function-declaration]
2720 | urb = alloc_mtk_intr_urb(hdev, skb, btusb_tx_complete);
| ^~~~~~~~~~~~~~~~~~
drivers/bluetooth/btusb.c:2720:21: error: assignment to 'struct urb *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
2720 | urb = alloc_mtk_intr_urb(hdev, skb, btusb_tx_complete);
| ^
Fixes: f0c83a23fcbb ("Bluetooth: btmtk: Fix btmtk.c undefined reference build error")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
The previous fix was incomplete as the link failure still persists
with CONFIG_USB=m when the sdio or serial wrappers for btmtk.c
are build-in:
btmtk.c:(.text+0x468): undefined reference to `usb_alloc_urb'
btmtk.c:(.text+0x488): undefined reference to `usb_free_urb'
btmtk.c:(.text+0x500): undefined reference to `usb_anchor_urb'
btmtk.c:(.text+0x50a): undefined reference to `usb_submit_urb'
btmtk.c:(.text+0x92c): undefined reference to `usb_control_msg'
btmtk.c:(.text+0xa92): undefined reference to `usb_unanchor_urb'
btmtk.c:(.text+0x11e4): undefined reference to `usb_set_interface'
btmtk.c:(.text+0x120a): undefined reference to `usb_kill_anchored_urbs'
Disallow this configuration.
Fixes: f0c83a23fcbb ("Bluetooth: btmtk: Fix btmtk.c undefined reference build error")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
MediaTek moved some usb interface related function to btmtk.c which
may cause build failed if BT USB Kconfig wasn't enabled.
Fix undefined reference by adding config check.
btmtk.c:(.text+0x89c): undefined reference to `usb_alloc_urb'
btmtk.c:(.text+0x8e3): undefined reference to `usb_free_urb'
btmtk.c:(.text+0x956): undefined reference to `usb_free_urb'
btmtk.c:(.text+0xa0e): undefined reference to `usb_anchor_urb'
btmtk.c:(.text+0xb43): undefined reference to `usb_autopm_get_interface'
btmtk.c:(.text+0xb7e): undefined reference to `usb_autopm_put_interface'
btmtk.c:(.text+0xf70): undefined reference to `usb_disable_autosuspend'
btmtk.c:(.text+0x133a): undefined reference to `usb_control_msg'
Fixes: d019930b0049 ("Bluetooth: btmtk: move btusb_mtk_hci_wmt_sync to btmtk.c")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202407091928.AH0aGZnx-lkp@intel.com/
Signed-off-by: Chris Lu <chris.lu@mediatek.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
When suspending the scan filter policy cannot be 0x00 (no acceptlist)
since that means the host has to process every advertisement report
waking up the system, so this attempts to check if hdev is marked as
suspended and if the resulting filter policy would be 0x00 (no
acceptlist) then skip passive scanning if thre no devices in the
acceptlist otherwise reset the filter policy to 0x01 so the acceptlist
is used since the devices programmed there can still wakeup be system.
Fixes: 182ee45da083 ("Bluetooth: hci_sync: Rework hci_suspend_notifier")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
If MediaTek's Bluetooth setup is unsuccessful, a NULL pointer issue
occur when the system is suspended and the anchored kill function
is called. To avoid this, add protection to prevent executing the
anchored kill function if the setup is unsuccessful.
[ 6.922106] Hardware name: Acer Tomato (rev2) board (DT)
[ 6.922114] Workqueue: pm pm_runtime_work
[ 6.922132] pstate: 804000c9
(Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 6.922147] pc : usb_kill_anchored_urbs+0x6c/0x1e0
[ 6.922164] lr : usb_kill_anchored_urbs+0x48/0x1e0
[ 6.922181] sp : ffff800080903b60
[ 6.922187] x29: ffff800080903b60
x28: ffff2c7b85c32b80 x27: ffff2c7bbb370930
[ 6.922211] x26: 00000000000f4240
x25: 00000000ffffffff x24: ffffd49ece2dcb48
[ 6.922255] x20: ffffffffffffffd8
x19: 0000000000000000 x18: 0000000000000006
[ 6.922276] x17: 6531656337386238
x16: 3632373862333863 x15: ffff800080903480
[ 6.922297] x14: 0000000000000000
x13: 303278302f303178 x12: ffffd49ecf090e30
[ 6.922318] x11: 0000000000000001
x10: 0000000000000001 x9 : ffffd49ecd2c5bb4
[ 6.922339] x8 : c0000000ffffdfff
x7 : ffffd49ecefe0db8 x6 : 00000000000affa8
[ 6.922360] x5 : ffff2c7bbb35dd48
x4 : 0000000000000000 x3 : 0000000000000000
[ 6.922379] x2 : 0000000000000000
x1 : 0000000000000003 x0 : ffffffffffffffd8
[ 6.922400] Call trace:
[ 6.922405] usb_kill_anchored_urbs+0x6c/0x1e0
[ 6.922422] btmtk_usb_suspend+0x20/0x38
[btmtk 5f200a97badbdfda4266773fee49acfc8e0224d5]
[ 6.922444] btusb_suspend+0xd0/0x210
[btusb 0bfbf19a87ff406c83b87268b87ce1e80e9a829b]
[ 6.922469] usb_suspend_both+0x90/0x288
[ 6.922487] usb_runtime_suspend+0x3c/0xa8
[ 6.922507] __rpm_callback+0x50/0x1f0
[ 6.922523] rpm_callback+0x70/0x88
[ 6.922538] rpm_suspend+0xe4/0x5a0
[ 6.922553] pm_runtime_work+0xd4/0xe0
[ 6.922569] process_one_work+0x18c/0x440
[ 6.922588] worker_thread+0x314/0x428
[ 6.922606] kthread+0x128/0x138
[ 6.922621] ret_from_fork+0x10/0x20
[ 6.922644] Code: f100a274 54000520 d503201f d100a260 (b8370000)
[ 6.922654] ---[ end trace 0000000000000000 ]---
Fixes: ceac1cb0259d ("Bluetooth: btusb: mediatek: add ISO data transmission functions")
Signed-off-by: Chris Lu <chris.lu@mediatek.com>
Reported-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> #KernelCI
Tested-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Do not attempt to send any hci command to controller if *setup* function
fails.
Fixes: af395330abed ("Bluetooth: btintel: Add Intel devcoredump support")
Signed-off-by: Kiran K <kiran.k@intel.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
The PHY built in to the Realtek RTL8366S switch controller was
previously supported by genphy_driver. This PHY does not implement MMD
operations. Since commit 9b01c885be36 ("net: phy: c22: migrate to
genphy_c45_write_eee_adv()"), MMD register reads have been made during
phy_probe to determine EEE support. For genphy_driver, these reads are
transformed into 802.3 annex 22D clause 45-over-clause 22
mmd_phy_indirect operations that perform MII register writes to
MII_MMD_CTRL and MII_MMD_DATA. This overwrites those two MII registers,
which on this PHY are reserved and have another function, rendering the
PHY unusable while so configured.
Proper support for this PHY is restored by providing a phy_driver that
declares MMD operations as unsupported by using the helper functions
provided for that purpose, while remaining otherwise identical to
genphy_driver.
Fixes: 9b01c885be36 ("net: phy: c22: migrate to genphy_c45_write_eee_adv()")
Reported-by: Russell Senior <russell@personaltelco.net>
Closes: https://github.com/openwrt/openwrt/issues/15981
Link: https://github.com/openwrt/openwrt/issues/15739
Signed-off-by: Mark Mentovai <mark@mentovai.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The ext interrupts are enabled when the firmware has been started, but
this may never happen, for example, if the board configuration file is
missing.
When the system is later suspended, the driver unconditionally tries to
disable interrupts, which results in an irq disable imbalance and causes
the driver to spin indefinitely in napi_synchronize().
Make sure that the interrupts have been enabled before attempting to
disable them.
Fixes: d889913205cf ("wifi: ath12k: driver for Qualcomm Wi-Fi 7 devices")
Cc: stable@vger.kernel.org # 6.3
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://patch.msgid.link/20240709073132.9168-1-johan+linaro@kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Fix null pointer access in mt792x_mac_link_bss_remove.
To prevent null pointer access, we should assign the vif to bss_conf in
mt7921_add_interface. This ensures that subsequent operations on the BSS
can properly reference the correct vif.
[ T843] Call Trace:
[ T843] <TASK>
[ T843] ? __die+0x1e/0x60
[ T843] ? page_fault_oops+0x157/0x450
[ T843] ? srso_alias_return_thunk+0x5/0xfbef5
[ T843] ? srso_alias_return_thunk+0x5/0xfbef5
[ T843] ? search_bpf_extables+0x5a/0x80
[ T843] ? srso_alias_return_thunk+0x5/0xfbef5
[ T843] ? exc_page_fault+0x2bb/0x670
[ T843] ? srso_alias_return_thunk+0x5/0xfbef5
[ T843] ? lock_timer_base+0x71/0x90
[ T843] ? asm_exc_page_fault+0x26/0x30
[ T843] ? mt792x_mac_link_bss_remove+0x24/0x110 [mt792x_lib]
[ T843] ? mt792x_remove_interface+0x6e/0x90 [mt792x_lib]
[ T843] ? ieee80211_do_stop+0x507/0x7e0 [mac80211]
[ T843] ? ieee80211_stop+0x53/0x190 [mac80211]
[ T843] ? __dev_close_many+0xa5/0x120
[ T843] ? __dev_change_flags+0x18c/0x220
[ T843] ? dev_change_flags+0x21/0x60
[ T843] ? do_setlink+0xdf9/0x11d0
[ T843] ? srso_alias_return_thunk+0x5/0xfbef5
[ T843] ? srso_alias_return_thunk+0x5/0xfbef5
[ T843] ? security_sock_rcv_skb+0x33/0x50
[ T843] ? srso_alias_return_thunk+0x5/0xfbef5
[ T843] ? srso_alias_return_thunk+0x5/0xfbef5
[ T843] ? __nla_validate_parse+0x61/0xd10
[ T843] ? srso_alias_return_thunk+0x5/0xfbef5
[ T843] ? genl_done+0x53/0x80
[ T843] ? srso_alias_return_thunk+0x5/0xfbef5
[ T843] ? netlink_dump+0x357/0x410
[ T843] ? __rtnl_newlink+0x5d6/0x980
[ T843] ? srso_alias_return_thunk+0x5/0xfbef5
[ T843] ? genl_family_rcv_msg_dumpit+0xdf/0xf0
[ T843] ? srso_alias_return_thunk+0x5/0xfbef5
[ T843] ? __kmalloc_cache_noprof+0x44/0x210
[ T843] ? rtnl_newlink+0x42/0x60
[ T843] ? rtnetlink_rcv_msg+0x152/0x3f0
[ T843] ? mptcp_pm_nl_dump_addr+0x180/0x180
[ T843] ? rtnl_calcit.isra.0+0x130/0x130
[ T843] ? netlink_rcv_skb+0x56/0x100
[ T843] ? netlink_unicast+0x199/0x290
[ T843] ? netlink_sendmsg+0x21d/0x490
[ T843] ? __sock_sendmsg+0x78/0x80
[ T843] ? ____sys_sendmsg+0x23f/0x2e0
[ T843] ? srso_alias_return_thunk+0x5/0xfbef5
[ T843] ? copy_msghdr_from_user+0x68/0xa0
[ T843] ? ___sys_sendmsg+0x81/0xd0
[ T843] ? srso_alias_return_thunk+0x5/0xfbef5
[ T843] ? crng_fast_key_erasure+0xbc/0xf0
[ T843] ? srso_alias_return_thunk+0x5/0xfbef5
[ T843] ? get_random_bytes_user+0x126/0x140
[ T843] ? srso_alias_return_thunk+0x5/0xfbef5
[ T843] ? __fdget+0xb1/0xe0
[ T843] ? __sys_sendmsg+0x56/0xa0
[ T843] ? srso_alias_return_thunk+0x5/0xfbef5
[ T843] ? do_syscall_64+0x5f/0x170
[ T843] ? entry_SYSCALL_64_after_hwframe+0x55/0x5d
[ T843] </TASK>
Fixes: 1541d63c5fe2 ("wifi: mt76: mt7925: add mt7925_mac_link_bss_remove to remove per-link BSS")
Reported-by: Bert Karwatzki <spasswolf@web.de>
Closes: https://lore.kernel.org/linux-wireless/2fee61f8c903d02a900ca3188c3742c7effd102e.camel@web.de/#b
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Tested-by: Bert Karwatzki <spasswolf@web.de>
Link: https://patch.msgid.link/20240718234633.12737-1-sean.wang@kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Smatch throws below warning:
drivers/net/wireless/ath/ath12k/wow.c:434 ath12k_wow_vif_set_wakeups()
warn: reusing outside iterator: 'i'
drivers/net/wireless/ath/ath12k/wow.c
411 default:
412 break;
413 }
414
415 for (i = 0; i < wowlan->n_patterns; i++) {
^^^^^^^^^^^^^^^^^^^^^^
Here we loop until ->n_patterns
416 const struct cfg80211_pkt_pattern *eth_pattern = &patterns[i];
417 struct ath12k_pkt_pattern new_pattern = {};
418
419 if (WARN_ON(eth_pattern->pattern_len > WOW_MAX_PATTERN_SIZE))
420 return -EINVAL;
421
422 if (ar->ab->wow.wmi_conf_rx_decap_mode ==
423 ATH12K_HW_TXRX_NATIVE_WIFI) {
424 ath12k_wow_convert_8023_to_80211(ar, eth_pattern,
425 &new_pattern);
426
427 if (WARN_ON(new_pattern.pattern_len > WOW_MAX_PATTERN_SIZE))
428 return -EINVAL;
429 } else {
430 memcpy(new_pattern.pattern, eth_pattern->pattern,
431 eth_pattern->pattern_len);
432
433 /* convert bitmask to bytemask */
--> 434 for (i = 0; i < eth_pattern->pattern_len; i++)
435 if (eth_pattern->mask[i / 8] & BIT(i % 8))
436 new_pattern.bytemask[i] = 0xff;
This loop re-uses i and the loop ends with i == eth_pattern->pattern_len.
This looks like a bug.
Change to use a new iterator 'j' for the inner loop to fix it.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0-03427-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.15378.4
Fixes: 4a3c212eee0e ("wifi: ath12k: add basic WoW functionalities")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/d4975b95-9c43-45af-a0ab-80253f18c7f2@stanley.mountain/
Signed-off-by: Baochen Qiang <quic_bqiang@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://patch.msgid.link/20240722033332.6273-1-quic_bqiang@quicinc.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The minimum header length calculation (equivalent to the start
of the elements) for the S1G long beacon erroneously required
only up to the start of u.s1g_beacon rather than the start of
u.s1g_beacon.variable. Fix that, and also shuffle the branches
around a bit to not assign useless values that are overwritten
later.
Reported-by: syzbot+0f3afa93b91202f21939@syzkaller.appspotmail.com
Fixes: 9eaffe5078ca ("cfg80211: convert S1G beacon to scan results")
Link: https://patch.msgid.link/20240724132912.9662972db7c1.I8779675b5bbda4994cc66f876b6b87a2361c3c0b@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Individual MLO links connection status is not copied to
EVENT_CONNECT_RESULT data while processing the connect response
information in cfg80211_connect_done(). Due to this failed links
are wrongly indicated with success status in EVENT_CONNECT_RESULT.
To fix this, copy the individual MLO links status to the
EVENT_CONNECT_RESULT data.
Fixes: 53ad07e9823b ("wifi: cfg80211: support reporting failed links")
Signed-off-by: Veerendranath Jakkam <quic_vjakkam@quicinc.com>
Reviewed-by: Carlos Llamas <cmllamas@google.com>
Link: https://patch.msgid.link/20240724125327.3495874-1-quic_vjakkam@quicinc.com
[commit message editorial changes]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
In commit 0d9c2beed116 ("wifi: mac80211: fix monitor channel
with chanctx emulation") I changed mac80211 to always have an
internal monitor_sdata to have something to have the chanctx
bound to.
However, if the driver didn't also have the WANT_MONITOR flag
this would cause mac80211 to allocate it without telling the
driver (which was intentional) but also use it for later APIs
to the driver without it ever having known about it which was
_not_ intentional.
Check through the code and only use the monitor_sdata in the
relevant places (TX, MU-MIMO follow settings, TX power, and
interface iteration) when the WANT_MONITOR flag is set.
Cc: stable@vger.kernel.org
Fixes: 0d9c2beed116 ("wifi: mac80211: fix monitor channel with chanctx emulation")
Reported-by: ZeroBeat <ZeroBeat@gmx.de>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219086
Tested-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20240725184836.25d334157a8e.I02574086da2c5cf0e18264ce5807db6f14ffd9c0@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Blamed commit increased lookup key size from 2 bytes to 16 bytes,
because zones_ht_key got a struct net pointer.
Make sure rhashtable_lookup() is not using the padding bytes
which are not initialized.
BUG: KMSAN: uninit-value in rht_ptr_rcu include/linux/rhashtable.h:376 [inline]
BUG: KMSAN: uninit-value in __rhashtable_lookup include/linux/rhashtable.h:607 [inline]
BUG: KMSAN: uninit-value in rhashtable_lookup include/linux/rhashtable.h:646 [inline]
BUG: KMSAN: uninit-value in rhashtable_lookup_fast include/linux/rhashtable.h:672 [inline]
BUG: KMSAN: uninit-value in tcf_ct_flow_table_get+0x611/0x2260 net/sched/act_ct.c:329
rht_ptr_rcu include/linux/rhashtable.h:376 [inline]
__rhashtable_lookup include/linux/rhashtable.h:607 [inline]
rhashtable_lookup include/linux/rhashtable.h:646 [inline]
rhashtable_lookup_fast include/linux/rhashtable.h:672 [inline]
tcf_ct_flow_table_get+0x611/0x2260 net/sched/act_ct.c:329
tcf_ct_init+0xa67/0x2890 net/sched/act_ct.c:1408
tcf_action_init_1+0x6cc/0xb30 net/sched/act_api.c:1425
tcf_action_init+0x458/0xf00 net/sched/act_api.c:1488
tcf_action_add net/sched/act_api.c:2061 [inline]
tc_ctl_action+0x4be/0x19d0 net/sched/act_api.c:2118
rtnetlink_rcv_msg+0x12fc/0x1410 net/core/rtnetlink.c:6647
netlink_rcv_skb+0x375/0x650 net/netlink/af_netlink.c:2550
rtnetlink_rcv+0x34/0x40 net/core/rtnetlink.c:6665
netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
netlink_unicast+0xf52/0x1260 net/netlink/af_netlink.c:1357
netlink_sendmsg+0x10da/0x11e0 net/netlink/af_netlink.c:1901
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x30f/0x380 net/socket.c:745
____sys_sendmsg+0x877/0xb60 net/socket.c:2597
___sys_sendmsg+0x28d/0x3c0 net/socket.c:2651
__sys_sendmsg net/socket.c:2680 [inline]
__do_sys_sendmsg net/socket.c:2689 [inline]
__se_sys_sendmsg net/socket.c:2687 [inline]
__x64_sys_sendmsg+0x307/0x4a0 net/socket.c:2687
x64_sys_call+0x2dd6/0x3c10 arch/x86/include/generated/asm/syscalls_64.h:47
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Local variable key created at:
tcf_ct_flow_table_get+0x4a/0x2260 net/sched/act_ct.c:324
tcf_ct_init+0xa67/0x2890 net/sched/act_ct.c:1408
Fixes: 88c67aeb1407 ("sched: act_ct: add netns into the key of tcf_ct_flow_table")
Reported-by: syzbot+1b5e4e187cc586d05ea0@syzkaller.appspotmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It could lead to error happen because the variable res is not updated if
the call to sr_share_read_word returns an error. In this particular case
error code was returned and res stayed uninitialized. Same issue also
applies to sr_read_reg.
This can be avoided by checking the return value of sr_share_read_word
and sr_read_reg, and propagating the error if the read operation failed.
Found by code review.
Cc: stable@vger.kernel.org
Fixes: c9b37458e956 ("USB2NET : SR9700 : One chip USB 1.1 USB2NET SR9700Device Driver Support")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Reviewed-by: Shigeru Yoshida <syoshida@redhat.com>
Reviewed-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The response to a GET request in Netlink should fully identify
the queried object. RSS_GET accepts context id as an input,
so it must echo that attribute back to the response.
After (assuming context 1 has been created):
$ ./cli.py --spec netlink/specs/ethtool.yaml \
--do rss-get \
--json '{"header": {"dev-index": 2}, "context": 1}'
{'context': 1,
'header': {'dev-index': 2, 'dev-name': 'eth0'},
[...]
Fixes: 7112a04664bf ("ethtool: add netlink based get rss support")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20240724234249.2621109-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The spec for Ethtool is a bit inaccurate. We don't currently
support dump. Context is only accepted as input and not echoed
to output (which is a separate bug).
Fixes: a353318ebf24 ("tools: ynl: populate most of the ethtool spec")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20240724234249.2621109-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In __bnxt_reserve_rings(), the existing code unconditionally sets the
default RSS indirection table to default if netif_is_rxfh_configured()
returns false. This used to be correct before we added RSS contexts
support. For example, if the user is changing the number of ethtool
channels, we will enter this path to reserve the new number of rings.
We will then set the RSS indirection table to default to cover the new
number of rings if netif_is_rxfh_configured() is false.
Now, with RSS contexts support, if the user has added or deleted RSS
contexts, we may now enter this path to reserve the new number of VNICs.
However, netif_is_rxfh_configured() will not return the correct state if
we are still in the middle of set_rxfh(). So the existing code may
set the indirection table of the default RSS context to default by
mistake.
Fix it to check if the reservation of the RX rings is changing. Only
check netif_is_rxfh_configured() if it is changing. RX rings will not
change in the middle of set_rxfh() and this will fix the issue.
Fixes: b3d0083caf9a ("bnxt_en: Support RSS contexts in ethtool .{get|set}_rxfh()")
Reported-and-tested-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/20240625010210.2002310-1-kuba@kernel.org
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20240724222106.147744-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The cited commit missed to check against the validity of the frame length
in the tun_xdp_one() path, which could cause a corrupted skb to be sent
downstack. Even before the skb is transmitted, the
tun_xdp_one-->eth_type_trans() may access the Ethernet header although it
can be less than ETH_HLEN. Once transmitted, this could either cause
out-of-bound access beyond the actual length, or confuse the underlayer
with incorrect or inconsistent header length in the skb metadata.
In the alternative path, tun_get_user() already prohibits short frame which
has the length less than Ethernet header size from being transmitted for
IFF_TAP.
This is to drop any frame shorter than the Ethernet header size just like
how tun_get_user() does.
CVE: CVE-2024-41091
Inspired-by: https://lore.kernel.org/netdev/1717026141-25716-1-git-send-email-si-wei.liu@oracle.com/
Fixes: 043d222f93ab ("tuntap: accept an array of XDP buffs through sendmsg()")
Cc: stable@vger.kernel.org
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Link: https://patch.msgid.link/20240724170452.16837-3-dongli.zhang@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The cited commit missed to check against the validity of the frame length
in the tap_get_user_xdp() path, which could cause a corrupted skb to be
sent downstack. Even before the skb is transmitted, the
tap_get_user_xdp()-->skb_set_network_header() may assume the size is more
than ETH_HLEN. Once transmitted, this could either cause out-of-bound
access beyond the actual length, or confuse the underlayer with incorrect
or inconsistent header length in the skb metadata.
In the alternative path, tap_get_user() already prohibits short frame which
has the length less than Ethernet header size from being transmitted.
This is to drop any frame shorter than the Ethernet header size just like
how tap_get_user() does.
CVE: CVE-2024-41090
Link: https://lore.kernel.org/netdev/1717026141-25716-1-git-send-email-si-wei.liu@oracle.com/
Fixes: 0efac27791ee ("tap: accept an array of XDP buffs through sendmsg()")
Cc: stable@vger.kernel.org
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Link: https://patch.msgid.link/20240724170452.16837-2-dongli.zhang@oracle.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Don't dereference *sp after calling dev_kfree_skb(*sp).
Fixes: af69fb3a8ffa ("Add mISDN HFC multiport driver")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/8be65f5a-c2dd-4ba0-8a10-bfe5980b8cfb@stanley.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The NIC requires each TSO segment to not span more than 10
descriptors. NIC further requires each descriptor to not exceed
16KB - 1 (GVE_TX_MAX_BUF_SIZE_DQO).
The descriptors for an skb are generated by
gve_tx_add_skb_no_copy_dqo() for DQO RDA queue format.
gve_tx_add_skb_no_copy_dqo() loops through each skb frag and
generates a descriptor for the entire frag if the frag size is
not greater than GVE_TX_MAX_BUF_SIZE_DQO. If the frag size is
greater than GVE_TX_MAX_BUF_SIZE_DQO, it is split into descriptor(s)
of size GVE_TX_MAX_BUF_SIZE_DQO and a descriptor is generated for
the remainder (frag size % GVE_TX_MAX_BUF_SIZE_DQO).
gve_can_send_tso() checks if the descriptors thus generated for an
skb would meet the requirement that each TSO-segment not span more
than 10 descriptors. However, the current code misses an edge case
when a TSO segment spans multiple descriptors within a large frag.
This change fixes the edge case.
gve_can_send_tso() relies on the assumption that max gso size (9728)
is less than GVE_TX_MAX_BUF_SIZE_DQO and therefore within an skb
fragment a TSO segment can never span more than 2 descriptors.
Fixes: a57e5de476be ("gve: DQO: Add TX path")
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Jeroen de Borst <jeroendb@google.com>
Cc: stable@vger.kernel.org
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240724143431.3343722-1-pkaligineedi@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When the netdev_rx_queue_restart() restarts queues, the bnxt_en driver
updates(creates and deletes) a page_pool.
But it doesn't update xdp_rxq_info, so the xdp_rxq_info is still
connected to an old page_pool.
So, bnxt_rx_ring_info->page_pool indicates a new page_pool, but
bnxt_rx_ring_info->xdp_rxq is still connected to an old page_pool.
An old page_pool is no longer used so it is supposed to be
deleted by page_pool_destroy() but it isn't.
Because the xdp_rxq_info is holding the reference count for it and the
xdp_rxq_info is not updated, an old page_pool will not be deleted in
the queue restart logic.
Before restarting 1 queue:
./tools/net/ynl/samples/page-pool
enp10s0f1np1[6] page pools: 4 (zombies: 0)
refs: 8192 bytes: 33554432 (refs: 0 bytes: 0)
recycling: 0.0% (alloc: 128:8048 recycle: 0:0)
After restarting 1 queue:
./tools/net/ynl/samples/page-pool
enp10s0f1np1[6] page pools: 5 (zombies: 0)
refs: 10240 bytes: 41943040 (refs: 0 bytes: 0)
recycling: 20.0% (alloc: 160:10080 recycle: 1920:128)
Before restarting queues, an interface has 4 page_pools.
After restarting one queue, an interface has 5 page_pools, but it
should be 4, not 5.
The reason is that queue restarting logic creates a new page_pool and
an old page_pool is not deleted due to the absence of an update of
xdp_rxq_info logic.
Fixes: 2d694c27d32e ("bnxt_en: implement netdev_queue_mgmt_ops")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: David Wei <dw@davidwei.uk>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Link: https://patch.msgid.link/20240721053554.1233549-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The 'Fixes' commit recently changed the behaviour of TCP by skipping the
processing of the 3rd ACK when a sk->sk_socket is set. The goal was to
skip tcp_ack_snd_check() in tcp_rcv_state_process() not to send an
unnecessary ACK in case of simultaneous connect(). Unfortunately, that
had an impact on TFO and MPTCP.
I started to look at the impact on MPTCP, because the MPTCP CI found
some issues with the MPTCP Packetdrill tests [1]. Then Paolo Abeni
suggested me to look at the impact on TFO with "plain" TCP.
For MPTCP, when receiving the 3rd ACK of a request adding a new path
(MP_JOIN), sk->sk_socket will be set, and point to the MPTCP sock that
has been created when the MPTCP connection got established before with
the first path. The newly added 'goto' will then skip the processing of
the segment text (step 7) and not go through tcp_data_queue() where the
MPTCP options are validated, and some actions are triggered, e.g.
sending the MPJ 4th ACK [2] as demonstrated by the new errors when
running a packetdrill test [3] establishing a second subflow.
This doesn't fully break MPTCP, mainly the 4th MPJ ACK that will be
delayed. Still, we don't want to have this behaviour as it delays the
switch to the fully established mode, and invalid MPTCP options in this
3rd ACK will not be caught any more. This modification also affects the
MPTCP + TFO feature as well, and being the reason why the selftests
started to be unstable the last few days [4].
For TFO, the existing 'basic-cookie-not-reqd' test [5] was no longer
passing: if the 3rd ACK contains data, and the connection is accept()ed
before receiving them, these data would no longer be processed, and thus
not ACKed.
One last thing about MPTCP, in case of simultaneous connect(), a
fallback to TCP will be done, which seems fine:
`../common/defaults.sh`
0 socket(..., SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_MPTCP) = 3
+0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)
+0 > S 0:0(0) <mss 1460, sackOK, TS val 100 ecr 0, nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
+0 < S 0:0(0) win 1000 <mss 1460, sackOK, TS val 407 ecr 0, nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
+0 > S. 0:0(0) ack 1 <mss 1460, sackOK, TS val 330 ecr 0, nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
+0 < S. 0:0(0) ack 1 win 65535 <mss 1460, sackOK, TS val 700 ecr 100, nop, wscale 8, mpcapable v1 flags[flag_h] key[skey=2]>
+0 > . 1:1(0) ack 1 <nop, nop, TS val 845707014 ecr 700, nop, nop, sack 0:1>
Simultaneous SYN-data crossing is also not supported by TFO, see [6].
Kuniyuki Iwashima suggested to restrict the processing to SYN+ACK only:
that's a more generic solution than the one initially proposed, and
also enough to fix the issues described above.
Later on, Eric Dumazet mentioned that an ACK should still be sent in
reaction to the second SYN+ACK that is received: not sending a DUPACK
here seems wrong and could hurt:
0 socket(..., SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 3
+0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)
+0 > S 0:0(0) <mss 1460, sackOK, TS val 1000 ecr 0,nop,wscale 8>
+0 < S 0:0(0) win 1000 <mss 1000, sackOK, nop, nop>
+0 > S. 0:0(0) ack 1 <mss 1460, sackOK, TS val 3308134035 ecr 0,nop,wscale 8>
+0 < S. 0:0(0) ack 1 win 1000 <mss 1000, sackOK, nop, nop>
+0 > . 1:1(0) ack 1 <nop, nop, sack 0:1> // <== Here
So in this version, the 'goto consume' is dropped, to always send an ACK
when switching from TCP_SYN_RECV to TCP_ESTABLISHED. This ACK will be
seen as a DUPACK -- with DSACK if SACK has been negotiated -- in case of
simultaneous SYN crossing: that's what is expected here.
Link: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/9936227696 [1]
Link: https://datatracker.ietf.org/doc/html/rfc8684#fig_tokens [2]
Link: https://github.com/multipath-tcp/packetdrill/blob/mptcp-net-next/gtests/net/mptcp/syscalls/accept.pkt#L28 [3]
Link: https://netdev.bots.linux.dev/contest.html?executor=vmksft-mptcp-dbg&test=mptcp-connect-sh [4]
Link: https://github.com/google/packetdrill/blob/master/gtests/net/tcp/fastopen/server/basic-cookie-not-reqd.pkt#L21 [5]
Link: https://github.com/google/packetdrill/blob/master/gtests/net/tcp/fastopen/client/simultaneous-fast-open.pkt [6]
Fixes: 23e89e8ee7be ("tcp: Don't drop SYN+ACK for simultaneous connect().")
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Suggested-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240724-upstream-net-next-20240716-tcp-3rd-ack-consume-sk_socket-v3-1-d48339764ce9@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This flag is now required to use tx_metadata_len.
Fixes: 40808a237d9c ("selftests/bpf: Add TX side to xdp_metadata")
Reported-by: Julian Schindel <mail@arctic-alpaca.de>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20240713015253.121248-3-sdf@fomichev.me
|
|
Julian reports that commit 341ac980eab9 ("xsk: Support tx_metadata_len")
can break existing use cases which don't zero-initialize xdp_umem_reg
padding. Introduce new XDP_UMEM_TX_METADATA_LEN to make sure we
interpret the padding as tx_metadata_len only when being explicitly
asked.
Fixes: 341ac980eab9 ("xsk: Support tx_metadata_len")
Reported-by: Julian Schindel <mail@arctic-alpaca.de>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20240713015253.121248-2-sdf@fomichev.me
|
|
Linearize the skb when downgrading gso_size because it may trigger a
BUG_ON() later when the skb is segmented as described in [1,2].
Fixes: 2be7e212d5419 ("bpf: add bpf_skb_adjust_room helper")
Signed-off-by: Fred Li <dracodingfly@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/all/20240626065555.35460-2-dracodingfly@gmail.com [1]
Link: https://lore.kernel.org/all/668d5cf1ec330_1c18c32947@willemb.c.googlers.com.notmuch [2]
Link: https://lore.kernel.org/bpf/20240719024653.77006-1-dracodingfly@gmail.com
|