path: root/io_uring
2024-04-22  RDMA/rxe: Don't schedule rxe_completer from rxe_requester  (Bob Pearson, 2 files, -13/+2)
Now that rxe_completer() is always called serially after rxe_requester(), there is no reason to schedule rxe_completer() from rxe_requester(). Link: https://lore.kernel.org/r/20240329145513.35381-9-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-04-22  RDMA/rxe: Remove save/rollback_state in rxe_requester  (Bob Pearson, 1 file, -38/+2)
Now that req.task and comp.task are merged, it is no longer necessary to call save_state() before calling rxe_xmit_pkt() and rollback_state() if rxe_xmit_pkt() fails. This was done originally to prevent races between rxe_completer() and rxe_requester(), which now cannot happen. Link: https://lore.kernel.org/r/20240329145513.35381-8-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-04-22  RDMA/rxe: Merge request and complete tasks  (Bob Pearson, 10 files, -55/+63)
Currently the rxe driver has three work queue tasks per qp. These are the req.task, comp.task and resp.task, which call rxe_requester(), rxe_completer() and rxe_responder() respectively, either directly or on work queues. Each of these subroutines checks to see if there is work to be performed on the send queue or on the response packet queue or the request packet queue and will run until there is no work remaining or yield the cpu and reschedule itself until there is no work remaining. This commit combines the req.task and comp.task into a single send.task and renames the resp.task to the recv.task. The combined send.task calls rxe_requester() and rxe_completer() serially and continues until all work on both the send queue and the response packet queue is done. In various benchmarks the performance is either improved or left the same. At high scale there is a significant reduction in the load on the cpu. This is the first step in combining these two tasks. Once they are serialized, cross rescheduling of req.task and comp.task can be handled more efficiently by just letting the send.task continue to run. This will be done in the next several patches. Link: https://lore.kernel.org/r/20240329145513.35381-7-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
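A minimal sketch of what the combined work function might look like, based only on the description above (the function name and the task-loop return convention are assumptions, not necessarily the driver's exact code):

    /* Run the requester, then the completer, and keep the task
     * scheduled while either still has work to do.
     */
    static int rxe_send_task(struct rxe_qp *qp)
    {
            int req_ret;
            int comp_ret;

            /* process the send queue */
            req_ret = rxe_requester(qp);

            /* process the response packet queue */
            comp_ret = rxe_completer(qp);

            /* stop the task loop only when both subroutines report
             * that no work remains (nonzero means "done" here)
             */
            return (req_ret && comp_ret) ? -EAGAIN : 0;
    }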
2024-04-22  RDMA/rxe: Remove redundant scheduling of rxe_completer  (Bob Pearson, 1 file, -5/+0)
In rxe_post_send_kernel(), if the qp is in the error state after posting the work requests, the rxe_completer() task is scheduled. But the only way to move the qp into the error state is to call rxe_qp_error(), which also schedules the rxe_completer() task to drain the queues. Calling it a second time has no effect. This commit removes the redundant call. Link: https://lore.kernel.org/r/20240329145513.35381-6-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-04-22  RDMA/rxe: Allow good work requests to be executed  (Bob Pearson, 1 file, -1/+5)
A previous commit incorrectly added an 'if(!err)' before scheduling the requester task in rxe_post_send_kernel(). But if send wrs were successfully added to the send queue before a bad wr, they might never get executed. This commit fixes this by scheduling the requester task if any wqes were successfully posted in rxe_post_send_kernel() in rxe_verbs.c. Link: https://lore.kernel.org/r/20240329145513.35381-5-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Fixes: 5bf944f24129 ("RDMA/rxe: Add error messages") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
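A sketch of the shape of the fix, assuming a counter of successfully posted wqes (the posting helper is illustrative):

    static int rxe_post_send_kernel(struct rxe_qp *qp,
                                    const struct ib_send_wr *wr,
                                    const struct ib_send_wr **bad_wr)
    {
            int err = 0;
            unsigned int good = 0;

            while (wr) {
                    err = post_one_send(qp, wr);    /* assumed helper */
                    if (err) {
                            *bad_wr = wr;
                            break;
                    }
                    good++;
                    wr = wr->next;
            }

            /* kick the requester if anything was posted, even if a
             * later wr failed
             */
            if (good)
                    rxe_sched_task(&qp->req.task);

            return err;
    }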
2024-04-22  RDMA/rxe: Fix seg fault in rxe_comp_queue_pkt  (Bob Pearson, 1 file, -3/+3)
In rxe_comp_queue_pkt() an incoming response packet skb is enqueued to the resp_pkts queue, and then a decision is made whether to run the completer task inline or schedule it. Finally the skb is dereferenced to bump a 'hw' performance counter. This is wrong because if the completer task is already running in a separate thread it may have already processed the skb and freed it, which can cause a seg fault. This has been observed infrequently in testing at high scale. This patch fixes the bug by delaying enqueuing the packet until after the counter is accessed. Link: https://lore.kernel.org/r/20240329145513.35381-4-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Fixes: 0b1e5b99a48b ("IB/rxe: Add port protocol stats") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
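The corrected ordering might look roughly like this (identifier names are assumptions based on the description): touch the skb for the counter first, enqueue it last, because once the skb is on the queue a concurrently running completer may free it at any time.

    static int rxe_comp_queue_pkt(struct rxe_qp *qp, struct sk_buff *skb)
    {
            int must_sched;

            /* decide before enqueuing; equivalent to testing for a
             * non-empty queue after the enqueue
             */
            must_sched = skb_queue_len(&qp->resp_pkts) > 0;
            if (must_sched)
                    rxe_counter_inc(SKB_TO_PKT(skb)->rxe,
                                    RXE_CNT_COMPLETER_SCHED);

            /* last use of the skb on this path */
            skb_queue_tail(&qp->resp_pkts, skb);

            if (must_sched)
                    rxe_sched_task(&qp->comp.task);
            else
                    rxe_run_task(&qp->comp.task);

            return 0;
    }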
2024-04-21  IB/core: Implement a limit on UMAD receive List  (Michael Guralnik, 1 file, -6/+15)
The existing behavior of ib_umad, which maintains received MAD packets in an unbounded list, poses a risk of uncontrolled growth. As user-space applications extract packets from this list, the rate of extraction may not match the rate of incoming packets, leading to potential list overflow. To address this, we introduce a limit to the size of the list. After considering typical scenarios, such as OpenSM processing, which can handle approximately 100k packets per second, and the 1-second retry timeout for most packets, we set the list size limit to 200k. Packets received beyond this limit are dropped, assuming they are likely timed out by the time they are handled by user-space. Notably, packets queued on the receive list due to reasons like timed-out sends are preserved even when the list is full. Signed-off-by: Michael Guralnik <michaelgur@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Link: https://lore.kernel.org/r/7197cb58a7d9e78399008f25036205ceab07fbd5.1713268818.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
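A minimal sketch of the bounded enqueue, assuming a hypothetical recv_list_size counter on struct ib_umad_file (the real patch may track and lock this differently):

    #define MAX_UMAD_RECV_LIST_SIZE 200000  /* ~100k pkts/s, 1s retry timeout */

    /* caller holds file->mutex */
    static int queue_packet(struct ib_umad_file *file,
                            struct ib_umad_packet *packet,
                            bool is_timeout_err)
    {
            /* fresh receives are dropped when the list is full;
             * timed-out sends are always preserved
             */
            if (!is_timeout_err &&
                file->recv_list_size >= MAX_UMAD_RECV_LIST_SIZE)
                    return -ENOMEM;

            list_add_tail(&packet->list, &file->recv_list);
            file->recv_list_size++;
            wake_up_interruptible(&file->recv_wait);
            return 0;
    }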
2024-04-16  RDMA/hns: Modify the print level of CQE error  (Chengchang Tang, 1 file, -2/+3)
Excessive printing may lead to a kernel panic. Change ibdev_err() to ibdev_err_ratelimited(), and lower the print level of the cqe dump to debug level. Fixes: 7c044adca272 ("RDMA/hns: Simplify the cqe code of poll cq") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-11-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16  RDMA/hns: Use complete parentheses in macros  (Chengchang Tang, 1 file, -6/+6)
Use complete parentheses to ensure that macro expansion does not produce unexpected results. Fixes: a25d13cbe816 ("RDMA/hns: Add the interfaces to support multi hop addressing for the contexts in hip08") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-10-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
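A generic illustration of the hazard (not the hns macros themselves): without parentheses around the parameters and the whole expansion, operator precedence at the call site changes the result.

    #define BAD_MUL(a, b)   a * b
    #define GOOD_MUL(a, b)  ((a) * (b))

    /* BAD_MUL(1 + 2, 3)  expands to 1 + 2 * 3       == 7 (wrong)
     * GOOD_MUL(1 + 2, 3) expands to ((1 + 2) * (3)) == 9 (expected)
     */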
2024-04-16  RDMA/hns: Add mutex_destroy()  (wenglianfa, 6 files, -5/+39)
Add mutex_destroy(). Signed-off-by: wenglianfa <wenglianfa@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-9-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16  RDMA/hns: Fix GMV table pagesize  (Chengchang Tang, 1 file, -1/+1)
GMV's BA table only supports 4K pages. Currently, PAGESIZE is used to calculate gmv_bt_num, which results in an incorrect gmv_bt_num on a system with 64K pages. Fixes: d6d91e46210f ("RDMA/hns: Add support for configuring GMV table") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-8-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16  RDMA/hns: Fix mismatch exception rollback  (wenglianfa, 1 file, -1/+1)
When dma_alloc_coherent() fails in hns_roce_alloc_hem(), just call kfree() to release hem instead of hns_roce_free_hem(). Fixes: c00743cbf2b8 ("RDMA/hns: Simplify 'struct hns_roce_hem' allocation") Signed-off-by: wenglianfa <wenglianfa@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-7-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16  RDMA/hns: Fix UAF for cq async event  (Chengchang Tang, 1 file, -11/+13)
The CQ refcount is not protected by any lock. When a CQ asynchronous event races with CQ destruction, the CQ may already have been released, which causes a use-after-free (UAF). Use xa_lock() to protect the CQ refcount. Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-6-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
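The likely shape of such a fix, with assumed field names: hold the xarray lock across the lookup and the refcount increment, so a concurrent destroy cannot release the CQ in between.

    xa_lock(&hr_dev->cq_table.array);
    hr_cq = xa_load(&hr_dev->cq_table.array, cqn);
    if (hr_cq)
            refcount_inc(&hr_cq->refcount);
    xa_unlock(&hr_dev->cq_table.array);

    if (!hr_cq) {
            dev_warn(hr_dev->dev, "async event for bogus CQ %08x\n", cqn);
            return;
    }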
2024-04-16  RDMA/hns: Fix deadlock on SRQ async events.  (Chengchang Tang, 2 files, -3/+4)
xa_lock for SRQ table may be required in AEQ. Use xa_store_irq()/ xa_erase_irq() to avoid deadlock. Fixes: 81fce6291d99 ("RDMA/hns: Add SRQ asynchronous event support") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-5-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
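Since the AEQ handler may take the SRQ table's xa_lock from interrupt context, process-context users must disable interrupts while holding it. The _irq variants of the xarray API do exactly that; roughly (surrounding names assumed):

    /* store with interrupts disabled around the internal xa_lock */
    ret = xa_err(xa_store_irq(&srq_table->xa, srq->srqn, srq, GFP_KERNEL));

    /* ... and the matching teardown ... */
    xa_erase_irq(&srq_table->xa, srq->srqn);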
2024-04-16  RDMA/hns: Add max_ah and cq moderation capacities in query_device()  (Chengchang Tang, 4 files, -2/+12)
Add max_ah and cq moderation capacities to hns_roce_query_device(). Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-4-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16  RDMA/hns: Remove unused parameters and variables  (Chengchang Tang, 7 files, -33/+20)
Remove unused parameters and variables. Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-3-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16  RDMA/hns: Use macro instead of magic number  (Yangyang Li, 3 files, -16/+34)
Use macro instead of magic number. Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240412091616.370789-2-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16  RDMA/hns: Fix return value in hns_roce_map_mr_sg  (Zhengchao Shao, 1 file, -8/+7)
As described in the ib_map_mr_sg function comment, it returns the number of sg elements that were mapped to the memory region. However, hns_roce_map_mr_sg returns the number of pages required for mapping the DMA area. Fix it. Fixes: 9b2cf76c9f05 ("RDMA/hns: Optimize PBL buffer allocation process") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://lore.kernel.org/r/20240411033851.2884771-1-shaozhengchao@huawei.com Reviewed-by: Junxian Huang <huangjunxian6@hisilicon.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16  RDMA/mana_ib: Configure mac address in RNIC  (Konstantin Taranov, 3 files, -0/+46)
Set local mac address in RNIC, which is required by the HW. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712738551-22075-7-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16  RDMA/mana_ib: Adding and deleting GIDs  (Konstantin Taranov, 3 files, -0/+97)
Implement add_gid and del_gid for RNIC. IPv4 and IPv6 addresses are supported. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712738551-22075-6-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16  RDMA/mana_ib: Enable RoCE on port 1  (Konstantin Taranov, 2 files, -4/+26)
Set netdev and RoCEv2 flag to enable GID population on port 1. Use GIDs of the master netdev. As mc->ports[] stores slave devices, use a helper to get the master netdev. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712738551-22075-5-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16  RDMA/mana_ib: Implement port parameters  (Konstantin Taranov, 3 files, -1/+42)
Implement port parameters for RNIC:
1) extend query_port() method
2) implement get_link_layer()
3) implement query_pkey()
Only port 1 can store GIDs. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712738551-22075-4-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16  RDMA/mana_ib: Create and destroy rnic adapter  (Konstantin Taranov, 3 files, -1/+79)
Add functions for RNIC creation and destruction. If creation fails, the ib_probe fails as well. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712738551-22075-3-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16  RDMA/mana_ib: Add EQ creation for rnic adapter  (Konstantin Taranov, 3 files, -3/+41)
Create an error EQ for the RNIC adapter. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712738551-22075-2-git-send-email-kotaranov@linux.microsoft.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-16  RDMA/mana_ib: Use num_comp_vectors of ib_device  (Konstantin Taranov, 3 files, -9/+4)
Use num_comp_vectors of struct ib_device instead of max_num_queues from gdma_context. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712911656-17352-1-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-11  RDMA/rxe: Return the correct errno  (Zhu Yanjun, 1 file, -2/+2)
In the function __rxe_add_to_pool, the function xa_alloc_cyclic is called. The return value of xa_alloc_cyclic is documented as: "Return: 0 if the allocation succeeded without wrapping. 1 if the allocation succeeded after wrapping, -ENOMEM if memory could not be allocated or -EBUSY if there are no free entries in @limit." But currently __rxe_add_to_pool only returns -EINVAL. Every returned error value should be passed on to the caller. Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Link: https://lore.kernel.org/r/20240408142142.792413-1-yanjun.zhu@linux.dev Signed-off-by: Leon Romanovsky <leon@kernel.org>
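The essential change, sketched with assumed surrounding names; note that a return of 1 (success after wrapping) must not be treated as an error, so only negative values are propagated:

    err = xa_alloc_cyclic(&pool->xa, &elem->index, NULL, limit,
                          &pool->next, GFP_KERNEL);
    if (err < 0)
            goto err_cnt;   /* now propagates -ENOMEM or -EBUSY */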
2024-04-11  RDMA/mana_ib: remove useless return values from dbg prints  (Konstantin Taranov, 3 files, -7/+5)
Remove printing ret value on success as it was always 0. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1712672465-29960-1-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-11  net: mana: Avoid open coded arithmetic  (Erick Archer, 1 file, -8/+6)
This is an effort to get rid of all multiplications from allocation functions in order to prevent integer overflows [1][2]. As the "req" variable is a pointer to "struct mana_cfg_rx_steer_req_v2" and this structure ends in a flexible array:

    struct mana_cfg_rx_steer_req_v2 {
            [...]
            mana_handle_t indir_tab[] __counted_by(num_indir_entries);
    };

the preferred way in the kernel is to use the struct_size() helper to do the arithmetic instead of the calculation "size + size * count" in the kzalloc() function. Moreover, use the "offsetof" helper to get the indirect table offset instead of the "sizeof" operator and avoid the open-coded arithmetic in pointers using the new flex member. This new structure member also allows us to remove the "req_indir_tab" variable since it is no longer needed. Now, it is also possible to use the "flex_array_size" helper to compute the size of these trailing elements in the "memcpy" function. This way, the code is more readable and safer. This code was detected with the help of Coccinelle, and audited and modified manually. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [1] Link: https://github.com/KSPP/linux/issues/160 [2] Signed-off-by: Erick Archer <erick.archer@outlook.com> Link: https://lore.kernel.org/r/AS8PR02MB7237A21355C86EC0DCC0D83B8B022@AS8PR02MB7237.eurprd02.prod.outlook.com Reviewed-by: Justin Stitt <justinstitt@google.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
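The resulting pattern, in general form (surrounding variable names abridged and assumed):

    struct mana_cfg_rx_steer_req_v2 *req;

    /* one bounded allocation for the header plus its flexible array */
    req = kzalloc(struct_size(req, indir_tab, num_indir_entries),
                  GFP_KERNEL);
    if (!req)
            return -ENOMEM;
    req->num_indir_entries = num_indir_entries;

    /* flex_array_size() bounds the copy to the declared element count */
    memcpy(req->indir_tab, src_indir_tab,
           flex_array_size(req, indir_tab, num_indir_entries));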
2024-04-11  RDMA/mana_ib: Prefer struct_size over open coded arithmetic  (Erick Archer, 1 file, -7/+5)
This is an effort to get rid of all multiplications from allocation functions in order to prevent integer overflows [1][2]. As the "req" variable is a pointer to "struct mana_cfg_rx_steer_req_v2" and this structure ends in a flexible array:

    struct mana_cfg_rx_steer_req_v2 {
            [...]
            mana_handle_t indir_tab[] __counted_by(num_indir_entries);
    };

the preferred way in the kernel is to use the struct_size() helper to do the arithmetic instead of the calculation "size + size * count" in the kzalloc() function. Moreover, use the "offsetof" helper to get the indirect table offset instead of the "sizeof" operator and avoid the open-coded arithmetic in pointers using the new flex member. This new structure member also allows us to remove the "req_indir_tab" variable since it is no longer needed. This way, the code is more readable and safer. This code was detected with the help of Coccinelle, and audited and modified manually. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [1] Link: https://github.com/KSPP/linux/issues/160 [2] Signed-off-by: Erick Archer <erick.archer@outlook.com> Link: https://lore.kernel.org/r/AS8PR02MB72375EB06EE1A84A67BE722E8B022@AS8PR02MB7237.eurprd02.prod.outlook.com Reviewed-by: Long Li <longli@microsoft.com> Reviewed-by: Justin Stitt <justinstitt@google.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-11  net: mana: Add flex array to struct mana_cfg_rx_steer_req_v2  (Erick Archer, 1 file, -0/+1)
The "struct mana_cfg_rx_steer_req_v2" uses a dynamically sized set of trailing elements. Specifically, it uses a "mana_handle_t" array. So, use the kernel's preferred way of declaring a flexible-array member [1]. At the same time, prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). This is a preparatory step for refactoring the two consumers of this structure:

    drivers/infiniband/hw/mana/qp.c
    drivers/net/ethernet/microsoft/mana/mana_en.c

The ultimate goal is to avoid the open-coded arithmetic in the memory allocator functions [2] using the "struct_size" macro. Link: https://www.kernel.org/doc/html/next/process/deprecated.html#zero-length-and-one-element-arrays [1] Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [2] Signed-off-by: Erick Archer <erick.archer@outlook.com> Link: https://lore.kernel.org/r/AS8PR02MB7237E2900247571C9CB84C678B022@AS8PR02MB7237.eurprd02.prod.outlook.com Reviewed-by: Long Li <longli@microsoft.com> Reviewed-by: Justin Stitt <justinstitt@google.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-09  RDMA/hns: Support DSCP  (Junxian Huang, 5 files, -30/+116)
Add support for DSCP configuration. For DSCP, get dscp-prio mapping via hns3 nic driver api .get_dscp_prio() and fill the SL (in WQE for UD or in QPC for RC) with the priority value. The prio-tc mapping is configured to HW by hns3 nic driver. HW will select a corresponding TC according to SL and the prio-tc mapping. Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://lore.kernel.org/r/20240315093551.1650088-1-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-08  RDMA/mlx5: Adding remote atomic access flag to updatable flags  (Or Har-Toov, 1 file, -1/+2)
Currently IB_ACCESS_REMOTE_ATOMIC is blocked from being updated via UMR although in some cases it should be possible. These cases are checked in the mlx5r_umr_can_reconfig() function. Fixes: ef3642c4f54d ("RDMA/mlx5: Fix error unwinds for rereg_mr") Signed-off-by: Or Har-Toov <ohartoov@nvidia.com> Link: https://lore.kernel.org/r/24dac73e2fa48cb806f33a932d97f3e402a5ea2c.1712140377.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-08  RDMA/mlx5: Change check for cacheable mkeys  (Or Har-Toov, 2 files, -10/+23)
umem can be NULL for user application mkeys in some cases. Therefore umem can't be used to check whether an mkey is cacheable; instead, check a flag that indicates it. Also make sure that all mkeys which are not returned to the cache are destroyed. Fixes: dd1b913fb0d0 ("RDMA/mlx5: Cache all user cacheable mkeys on dereg MR flow") Signed-off-by: Or Har-Toov <ohartoov@nvidia.com> Link: https://lore.kernel.org/r/2690bc5c6896bcb937f89af16a1ff0343a7ab3d0.1712140377.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-08  RDMA/mlx5: Uncacheable mkey has neither rb_key nor cache_ent  (Or Har-Toov, 1 file, -1/+1)
As some mkeys can't be modified with UMR due to some UMR limitations, like the size of translation that can be updated, not all user mkeys can be cached. Fixes: dd1b913fb0d0 ("RDMA/mlx5: Cache all user cacheable mkeys on dereg MR flow") Signed-off-by: Or Har-Toov <ohartoov@nvidia.com> Link: https://lore.kernel.org/r/f2742dd934ed73b2d32c66afb8e91b823063880c.1712140377.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-02  RDMA/mana_ib: Use struct mana_ib_queue for RAW QPs  (Konstantin Taranov, 2 files, -46/+18)
Use struct mana_ib_queue and its helpers for RAW QPs Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1711483688-24358-5-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-02  RDMA/mana_ib: Use struct mana_ib_queue for WQs  (Konstantin Taranov, 3 files, -35/+10)
Use struct mana_ib_queue and its helpers for WQs Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1711483688-24358-4-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-02  RDMA/mana_ib: Use struct mana_ib_queue for CQs  (Konstantin Taranov, 3 files, -58/+24)
Use struct mana_ib_queue and its helpers for CQs Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1711483688-24358-3-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-02  RDMA/mana_ib: Introduce helpers to create and destroy mana queues  (Konstantin Taranov, 2 files, -0/+53)
Introduce helpers to work with mana ib queues (struct mana_ib_queue). A queue always consists of umem, gdma_region, and id. A queue can become a WQ or a CQ. Signed-off-by: Konstantin Taranov <kotaranov@microsoft.com> Link: https://lore.kernel.org/r/1711483688-24358-2-git-send-email-kotaranov@linux.microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
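From that description, the abstraction is essentially this (exact layout assumed from the text, not copied from the driver):

    struct mana_ib_queue {
            struct ib_umem *umem;   /* pinned user memory backing the queue */
            u64 gdma_region;        /* DMA region registered with the HW */
            u64 id;                 /* queue id assigned by the HW */
    };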
2024-04-01  RDMA/restrack: Fix potential invalid address access  (Wenchao Hao, 1 file, -50/+1)
struct rdma_restrack_entry's kern_name was set to KBUILD_MODNAME in ib_create_cq(), but if the module exited without deleting this rdma_restrack_entry, an invalid address would be accessed in rdma_restrack_clean() when printing the owner of this rdma_restrack_entry. This code was used to help find a forgotten PD release in one of the ULPs, but it is not needed anymore, so delete it. Signed-off-by: Wenchao Hao <haowenchao2@huawei.com> Link: https://lore.kernel.org/r/20240318092320.1215235-1-haowenchao2@huawei.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-01  RDMA/erdma: Remove unnecessary __GFP_ZERO flag  (Boshi Yu, 2 files, -8/+4)
The dma_alloc_coherent() interface automatically zeroes the returned memory. Thus, we do not need to specify the __GFP_ZERO flag explicitly when calling dma_alloc_coherent(). Reviewed-by: Cheng Xu <chengyou@linux.alibaba.com> Signed-off-by: Boshi Yu <boshiyu@linux.alibaba.com> Link: https://lore.kernel.org/r/20240311113821.22482-4-boshiyu@alibaba-inc.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
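An illustrative before/after (variable names assumed):

    /* before: the flag is redundant */
    buf = dma_alloc_coherent(dev, size, &dma_handle, GFP_KERNEL | __GFP_ZERO);

    /* after: dma_alloc_coherent() already returns zeroed memory */
    buf = dma_alloc_coherent(dev, size, &dma_handle, GFP_KERNEL);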
2024-04-01  RDMA/erdma: Unify the names related to doorbell records  (Boshi Yu, 8 files, -93/+79)
There exist two different names for the doorbell records: db_info and db_record. Uniformly use dbrec for the CPU address of the doorbell record and dbrec_dma for its DMA address. Reviewed-by: Cheng Xu <chengyou@linux.alibaba.com> Signed-off-by: Boshi Yu <boshiyu@linux.alibaba.com> Link: https://lore.kernel.org/r/20240311113821.22482-3-boshiyu@alibaba-inc.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2024-04-01  RDMA/erdma: Allocate doorbell records from dma pool  (Boshi Yu, 6 files, -101/+167)
Currently, the 8-byte doorbell record is allocated along with the queue buffer, which may waste DMA space when the queue buffer is page aligned. To address this issue, we introduce a dma pool named db_pool and allocate doorbell records from it. Reviewed-by: Cheng Xu <chengyou@linux.alibaba.com> Signed-off-by: Boshi Yu <boshiyu@linux.alibaba.com> Link: https://lore.kernel.org/r/20240311113821.22482-2-boshiyu@alibaba-inc.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
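A sketch of the described scheme, with assumed names: a dedicated pool of small, naturally aligned doorbell records instead of tail space in a page-aligned queue buffer.

    /* once at device init: 8-byte blocks, 8-byte aligned */
    dev->db_pool = dma_pool_create("erdma_db", &dev->pdev->dev, 8, 8, 0);
    if (!dev->db_pool)
            return -ENOMEM;

    /* per queue: allocate a zeroed doorbell record */
    dbrec = dma_pool_zalloc(dev->db_pool, GFP_KERNEL, &dbrec_dma);

    /* ... and return it on teardown */
    dma_pool_free(dev->db_pool, dbrec, dbrec_dma);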
2024-03-31  Linux 6.9-rc2 (tag: v6.9-rc2)  (Linus Torvalds, 1 file, -1/+1)
2024-03-31  kconfig: Fix typo HEIGTH to HEIGHT  (Isak Ellmer, 8 files, -14/+14)
Fixed a typo in some variables where height was misspelled as heigth. Signed-off-by: Isak Ellmer <isak01@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-03-31  Documentation/llvm: Note s390 LLVM=1 support with LLVM 18.1.0 and newer  (Nathan Chancellor, 1 file, -1/+1)
As of the first s390 pull request during the 6.9 merge window, commit 691632f0e869 ("Merge tag 's390-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux"), s390 can be built with LLVM=1 when using LLVM 18.1.0, which is the first version that has SystemZ support implemented in ld.lld and llvm-objcopy. Update the supported architectures table in the Kbuild LLVM documentation to note this explicitly to make it more discoverable by users and other developers. Additionally, this brings s390 in line with the rest of the architectures in the table, which all support LLVM=1. Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-03-31  kbuild: Disable KCSAN for autogenerated *.mod.c intermediaries  (Borislav Petkov (AMD), 1 file, -1/+1)
When KCSAN and CONSTRUCTORS are enabled, one can trigger the "Unpatched return thunk in use. This should not happen!" catch-all warning.

Usually, when objtool runs on the .o objects, it generates a section .return_sites which contains all offsets in the objects to the return thunks of the functions present there. Those return thunks then get patched at runtime by the alternatives.

KCSAN and CONSTRUCTORS add this to the object file's .text.startup section:

-------------------
Disassembly of section .text.startup:
...
0000000000000010 <_sub_I_00099_0>:
  10: f3 0f 1e fa           endbr64
  14: e8 00 00 00 00        call   19 <_sub_I_00099_0+0x9>
                            15: R_X86_64_PLT32 __tsan_init-0x4
  19: e9 00 00 00 00        jmp    1e <__UNIQUE_ID___addressable_cryptd_alloc_aead349+0x6>
                            1a: R_X86_64_PLT32 __x86_return_thunk-0x4
-------------------

which, if it is built as a module, goes through the intermediary stage of creating a <module>.mod.c file which, when translated, receives a second constructor:

-------------------
Disassembly of section .text.startup:

0000000000000010 <_sub_I_00099_0>:
  10: f3 0f 1e fa           endbr64
  14: e8 00 00 00 00        call   19 <_sub_I_00099_0+0x9>
                            15: R_X86_64_PLT32 __tsan_init-0x4
  19: e9 00 00 00 00        jmp    1e <_sub_I_00099_0+0xe>
                            1a: R_X86_64_PLT32 __x86_return_thunk-0x4
...
0000000000000030 <_sub_I_00099_0>:
  30: f3 0f 1e fa           endbr64
  34: e8 00 00 00 00        call   39 <_sub_I_00099_0+0x9>
                            35: R_X86_64_PLT32 __tsan_init-0x4
  39: e9 00 00 00 00        jmp    3e <__ksymtab_cryptd_alloc_ahash+0x2>
                            3a: R_X86_64_PLT32 __x86_return_thunk-0x4
-------------------

in the .ko file. Objtool has already run, so that second constructor's return thunk cannot be added to the .return_sites section and thus the return thunk remains unpatched and the warning rightfully fires.

Drop KCSAN flags from the mod.c generation stage as those constructors do not contain data races one would be interested in. Debugged together with David Kaplan <David.Kaplan@amd.com> and Nikolay Borisov <nik.borisov@suse.com>.

Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Closes: https://lore.kernel.org/r/0851a207-7143-417e-be31-8bf2b3afb57d@molgen.mpg.de
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> # Dell XPS 13
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Reviewed-by: Marco Elver <elver@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-03-31  kbuild: make -Woverride-init warnings more consistent  (Arnd Bergmann, 13 files, -23/+18)
The -Woverride-init option warns about code that may be intentional or not, but the unintentional ones tend to be real bugs, so there is a bit of disagreement on whether this warning option should be enabled by default, and we have multiple settings in scripts/Makefile.extrawarn as well as in individual subsystems. Older versions of clang only supported -Wno-initializer-overrides with the same meaning as gcc's -Woverride-init, though all supported versions now work with both. Because of this difference, an earlier cleanup of mine accidentally turned the clang warning off for W=1 builds and only left it on for W=2, while it is still enabled for gcc with W=1. There is also one driver that only turns the warning off for newer versions of gcc but not other compilers, and some but not all of the Makefiles still use a cc-disable-warning conditional that is no longer needed with supported compilers. Address all of the above by removing the special cases for clang and always turning the warning off unconditionally where it got in the way, using the syntax that is supported by both compilers. Fixes: 2cd3271b7a31 ("kbuild: avoid duplicate warning options") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Andrew Jeffery <andrew@codeconstruct.com.au> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
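An illustration of the kind of code the warning flags (not taken from the patch): a later designated initializer overriding an element already covered by an earlier one.

    static const int prio_map[4] = {
            [0 ... 3] = -1, /* GCC/Clang range-designator extension */
            [2] = 0,        /* -Woverride-init (clang's
                             * -Winitializer-overrides) warns that this
                             * overwrites the earlier value
                             */
    };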
2024-03-30  objtool: Fix compile failure when using the x32 compiler  (Mikulas Patocka, 1 file, -1/+1)
When compiling the v6.9-rc1 kernel with the x32 compiler, the following errors are reported. The reason is that we take an "unsigned long" variable and print it using the "PRIx64" format string.

In file included from check.c:16:
check.c: In function ‘add_dead_ends’:
/usr/src/git/linux-2.6/tools/objtool/include/objtool/warn.h:46:17: error: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 5 has type ‘long unsigned int’ [-Werror=format=]
   46 |                 "%s: warning: objtool: " format "\n", \
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~
check.c:613:33: note: in expansion of macro ‘WARN’
  613 |                 WARN("can't find unreachable insn at %s+0x%" PRIx64,
      |                                 ^~~~
...

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: linux-kernel@vger.kernel.org
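The mismatch in miniature (a sketch, not the exact patch): on x32, unsigned long is 32-bit while PRIx64 expands to a long long conversion, so the format must be kept in agreement with the variable's actual type, for example:

    unsigned long offset = insn->offset;

    /* mismatched on x32: PRIx64 is "llx" there, but offset is 32-bit */
    WARN("can't find unreachable insn at %s+0x%" PRIx64, sec->name, offset);

    /* consistent on every ABI objtool builds for */
    WARN("can't find unreachable insn at %s+0x%lx", sec->name, offset);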
2024-03-30  x86/build: Use obj-y to descend into arch/x86/virt/  (Masahiro Yamada, 3 files, -4/+2)
Commit c33621b4c5ad ("x86/virt/tdx: Wire up basic SEAMCALL functions") introduced a new instance of core-y instead of the standardized obj-y syntax. X86 Makefiles descend into subdirectories of arch/x86/virt inconsistently: into arch/x86/virt/ via core-y defined in arch/x86/Makefile, but into arch/x86/virt/svm/ via obj-y defined in arch/x86/Kbuild. This is problematic when you build a single object in parallel because multiple threads attempt to build the same file.

  $ make -j$(nproc) arch/x86/virt/vmx/tdx/seamcall.o
  [ snip ]
    AS      arch/x86/virt/vmx/tdx/seamcall.o
    AS      arch/x86/virt/vmx/tdx/seamcall.o
  fixdep: error opening file: arch/x86/virt/vmx/tdx/.seamcall.o.d: No such file or directory
  make[4]: *** [scripts/Makefile.build:362: arch/x86/virt/vmx/tdx/seamcall.o] Error 2

Use the obj-y syntax, as it works correctly. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240330060554.18524-1-masahiroy@kernel.org