summaryrefslogtreecommitdiffstats
path: root/drivers/infiniband (follow)
Commit message (Collapse)AuthorAgeFilesLines
* RDMA/uverbs: Fix a NULL vs IS_ERR() bugDan Carpenter2021-05-191-2/+2
| | | | | | | | | | | The uapi_get_object() function returns error pointers, it never returns NULL. Fixes: 149d3845f4a5 ("RDMA/uverbs: Add a method to introspect handles in a context") Link: https://lore.kernel.org/r/YJ6Got+U7lz+3n9a@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/mlx5: Fix query DCT via DEVXMaor Gottlieb2021-05-191-4/+2
| | | | | | | | | | | | | When executing DEVX command to query QP object, we need to take the QP type from the mlx5_ib_qp struct which hold the driver specific QP types as well, such as DC. Fixes: 34613eb1d2ad ("IB/mlx5: Enable modify and query verbs objects via DEVX") Link: https://lore.kernel.org/r/6eee15d63f09bb70787488e0cf96216e2957f5aa.1621413654.git.leonro@nvidia.com Reviewed-by: Yishai Hadas <yishaih@nvidia.com> Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/core: Don't access cm_id after its destructionShay Drory2021-05-181-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | restrack should only be attached to a cm_id while the ID has a valid device pointer. It is set up when the device is first loaded, but not cleared when the device is removed. There is also two copies of the device pointer, one private and one in the public API, and these were left out of sync. Make everything go to NULL together and manipulate restrack right around the device assignments. Found by syzcaller: BUG: KASAN: wild-memory-access in __list_del include/linux/list.h:112 [inline] BUG: KASAN: wild-memory-access in __list_del_entry include/linux/list.h:135 [inline] BUG: KASAN: wild-memory-access in list_del include/linux/list.h:146 [inline] BUG: KASAN: wild-memory-access in cma_cancel_listens drivers/infiniband/core/cma.c:1767 [inline] BUG: KASAN: wild-memory-access in cma_cancel_operation drivers/infiniband/core/cma.c:1795 [inline] BUG: KASAN: wild-memory-access in cma_cancel_operation+0x1f4/0x4b0 drivers/infiniband/core/cma.c:1783 Write of size 8 at addr dead000000000108 by task syz-executor716/334 CPU: 0 PID: 334 Comm: syz-executor716 Not tainted 5.11.0+ #271 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0xbe/0xf9 lib/dump_stack.c:120 __kasan_report mm/kasan/report.c:400 [inline] kasan_report.cold+0x5f/0xd5 mm/kasan/report.c:413 __list_del include/linux/list.h:112 [inline] __list_del_entry include/linux/list.h:135 [inline] list_del include/linux/list.h:146 [inline] cma_cancel_listens drivers/infiniband/core/cma.c:1767 [inline] cma_cancel_operation drivers/infiniband/core/cma.c:1795 [inline] cma_cancel_operation+0x1f4/0x4b0 drivers/infiniband/core/cma.c:1783 _destroy_id+0x29/0x460 drivers/infiniband/core/cma.c:1862 ucma_close_id+0x36/0x50 drivers/infiniband/core/ucma.c:185 ucma_destroy_private_ctx+0x58d/0x5b0 drivers/infiniband/core/ucma.c:576 ucma_close+0x91/0xd0 drivers/infiniband/core/ucma.c:1797 __fput+0x169/0x540 fs/file_table.c:280 task_work_run+0xb7/0x100 kernel/task_work.c:140 exit_task_work include/linux/task_work.h:30 [inline] do_exit+0x7da/0x17f0 kernel/exit.c:825 do_group_exit+0x9e/0x190 kernel/exit.c:922 __do_sys_exit_group kernel/exit.c:933 [inline] __se_sys_exit_group kernel/exit.c:931 [inline] __x64_sys_exit_group+0x2d/0x30 kernel/exit.c:931 do_syscall_64+0x2d/0x40 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 255d0c14b375 ("RDMA/cma: rdma_bind_addr() leaks a cma_dev reference count") Link: https://lore.kernel.org/r/3352ee288fe34f2b44220457a29bfc0548686363.1620711734.git.leonro@nvidia.com Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/rxe: Return CQE error if invalid lkey was suppliedLeon Romanovsky2021-05-171-6/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RXE is missing update of WQE status in LOCAL_WRITE failures. This caused the following kernel panic if someone sent an atomic operation with an explicitly wrong lkey. [leonro@vm ~]$ mkt test test_atomic_invalid_lkey (tests.test_atomic.AtomicTest) ... WARNING: CPU: 5 PID: 263 at drivers/infiniband/sw/rxe/rxe_comp.c:740 rxe_completer+0x1a6d/0x2e30 [rdma_rxe] Modules linked in: crc32_generic rdma_rxe ip6_udp_tunnel udp_tunnel rdma_ucm rdma_cm ib_umad ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core mlx5_core ptp pps_core CPU: 5 PID: 263 Comm: python3 Not tainted 5.13.0-rc1+ #2936 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:rxe_completer+0x1a6d/0x2e30 [rdma_rxe] Code: 03 0f 8e 65 0e 00 00 3b 93 10 06 00 00 0f 84 82 0a 00 00 4c 89 ff 4c 89 44 24 38 e8 2d 74 a9 e1 4c 8b 44 24 38 e9 1c f5 ff ff <0f> 0b e9 0c e8 ff ff b8 05 00 00 00 41 bf 05 00 00 00 e9 ab e7 ff RSP: 0018:ffff8880158af090 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff888016a78000 RCX: ffffffffa0cf1652 RDX: 1ffff9200004b442 RSI: 0000000000000004 RDI: ffffc9000025a210 RBP: dffffc0000000000 R08: 00000000ffffffea R09: ffff88801617740b R10: ffffed1002c2ee81 R11: 0000000000000007 R12: ffff88800f3b63e8 R13: ffff888016a78008 R14: ffffc9000025a180 R15: 000000000000000c FS: 00007f88b622a740(0000) GS:ffff88806d540000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f88b5a1fa10 CR3: 000000000d848004 CR4: 0000000000370ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: rxe_do_task+0x130/0x230 [rdma_rxe] rxe_rcv+0xb11/0x1df0 [rdma_rxe] rxe_loopback+0x157/0x1e0 [rdma_rxe] rxe_responder+0x5532/0x7620 [rdma_rxe] rxe_do_task+0x130/0x230 [rdma_rxe] rxe_rcv+0x9c8/0x1df0 [rdma_rxe] rxe_loopback+0x157/0x1e0 [rdma_rxe] rxe_requester+0x1efd/0x58c0 [rdma_rxe] rxe_do_task+0x130/0x230 [rdma_rxe] rxe_post_send+0x998/0x1860 [rdma_rxe] ib_uverbs_post_send+0xd5f/0x1220 [ib_uverbs] ib_uverbs_write+0x847/0xc80 [ib_uverbs] vfs_write+0x1c5/0x840 ksys_write+0x176/0x1d0 do_syscall_64+0x3f/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/11e7b553f3a6f5371c6bb3f57c494bb52b88af99.1620711734.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/mlx5: Recover from fatal event in dual port modeMaor Gottlieb2021-05-171-0/+1
| | | | | | | | | | | | When there is fatal event on the slave port, the device is marked as not active. We need to mark it as active again when the slave is recovered to regain full functionality. Fixes: d69a24e03659 ("IB/mlx5: Move IB event processing onto a workqueue") Link: https://lore.kernel.org/r/8906754455bb23019ef223c725d2c0d38acfb80b.1620711734.git.leonro@nvidia.com Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/mlx5: Verify that DM operation is reasonableMaor Gottlieb2021-05-171-0/+3
| | | | | | | | | | | | | | | Fix the complaint from smatch by verifing that the user requested DM operation is not greater than 31. divers/infiniband/hw/mlx5/dm.c:220 mlx5_ib_handler_MLX5_IB_METHOD_DM_MAP_OP_ADDR() error: undefined (user controlled) shift '(((1))) << op' Fixes: cea85fa5dbc2 ("RDMA/mlx5: Add support in MEMIC operations") Link: https://lore.kernel.org/r/458b1d7710c3cf01360c8771893f483665569786.1620711734.git.leonro@nvidia.com Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/rxe: Clear all QP fields if creation failedLeon Romanovsky2021-05-111-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rxe_qp_do_cleanup() relies on valid pointer values in QP for the properly created ones, but in case rxe_qp_from_init() failed it was filled with garbage and caused tot the following error. refcount_t: underflow; use-after-free. WARNING: CPU: 1 PID: 12560 at lib/refcount.c:28 refcount_warn_saturate+0x1d1/0x1e0 lib/refcount.c:28 Modules linked in: CPU: 1 PID: 12560 Comm: syz-executor.4 Not tainted 5.12.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:refcount_warn_saturate+0x1d1/0x1e0 lib/refcount.c:28 Code: e9 db fe ff ff 48 89 df e8 2c c2 ea fd e9 8a fe ff ff e8 72 6a a7 fd 48 c7 c7 e0 b2 c1 89 c6 05 dc 3a e6 09 01 e8 ee 74 fb 04 <0f> 0b e9 af fe ff ff 0f 1f 84 00 00 00 00 00 41 56 41 55 41 54 55 RSP: 0018:ffffc900097ceba8 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000040000 RSI: ffffffff815bb075 RDI: fffff520012f9d67 RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff815b4eae R11: 0000000000000000 R12: ffff8880322a4800 R13: ffff8880322a4940 R14: ffff888033044e00 R15: 0000000000000000 FS: 00007f6eb2be3700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdbe5d41000 CR3: 000000001d181000 CR4: 00000000001506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __refcount_sub_and_test include/linux/refcount.h:283 [inline] __refcount_dec_and_test include/linux/refcount.h:315 [inline] refcount_dec_and_test include/linux/refcount.h:333 [inline] kref_put include/linux/kref.h:64 [inline] rxe_qp_do_cleanup+0x96f/0xaf0 drivers/infiniband/sw/rxe/rxe_qp.c:805 execute_in_process_context+0x37/0x150 kernel/workqueue.c:3327 rxe_elem_release+0x9f/0x180 drivers/infiniband/sw/rxe/rxe_pool.c:391 kref_put include/linux/kref.h:65 [inline] rxe_create_qp+0x2cd/0x310 drivers/infiniband/sw/rxe/rxe_verbs.c:425 _ib_create_qp drivers/infiniband/core/core_priv.h:331 [inline] ib_create_named_qp+0x2ad/0x1370 drivers/infiniband/core/verbs.c:1231 ib_create_qp include/rdma/ib_verbs.h:3644 [inline] create_mad_qp+0x177/0x2d0 drivers/infiniband/core/mad.c:2920 ib_mad_port_open drivers/infiniband/core/mad.c:3001 [inline] ib_mad_init_device+0xd6f/0x1400 drivers/infiniband/core/mad.c:3092 add_client_context+0x405/0x5e0 drivers/infiniband/core/device.c:717 enable_device_and_get+0x1cd/0x3b0 drivers/infiniband/core/device.c:1331 ib_register_device drivers/infiniband/core/device.c:1413 [inline] ib_register_device+0x7c7/0xa50 drivers/infiniband/core/device.c:1365 rxe_register_device+0x3d5/0x4a0 drivers/infiniband/sw/rxe/rxe_verbs.c:1147 rxe_add+0x12fe/0x16d0 drivers/infiniband/sw/rxe/rxe.c:247 rxe_net_add+0x8c/0xe0 drivers/infiniband/sw/rxe/rxe_net.c:503 rxe_newlink drivers/infiniband/sw/rxe/rxe.c:269 [inline] rxe_newlink+0xb7/0xe0 drivers/infiniband/sw/rxe/rxe.c:250 nldev_newlink+0x30e/0x550 drivers/infiniband/core/nldev.c:1555 rdma_nl_rcv_msg+0x36d/0x690 drivers/infiniband/core/netlink.c:195 rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline] rdma_nl_rcv+0x2ee/0x430 drivers/infiniband/core/netlink.c:259 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433 do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/7bf8d548764d406dbbbaf4b574960ebfd5af8387.1620717918.git.leonro@nvidia.com Reported-by: syzbot+36a7f280de4e11c6f04e@syzkaller.appspotmail.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/core: Prevent divide-by-zero error triggered by the userLeon Romanovsky2021-05-101-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The user_entry_size is supplied by the user and later used as a denominator to calculate number of entries. The zero supplied by the user will trigger the following divide-by-zero error: divide error: 0000 [#1] SMP KASAN PTI CPU: 4 PID: 497 Comm: c_repro Not tainted 5.13.0-rc1+ #281 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:ib_uverbs_handler_UVERBS_METHOD_QUERY_GID_TABLE+0x1b1/0x510 Code: 87 59 03 00 00 e8 9f ab 1e ff 48 8d bd a8 00 00 00 e8 d3 70 41 ff 44 0f b7 b5 a8 00 00 00 e8 86 ab 1e ff 31 d2 4c 89 f0 31 ff <49> f7 f5 48 89 d6 48 89 54 24 10 48 89 04 24 e8 1b ad 1e ff 48 8b RSP: 0018:ffff88810416f828 EFLAGS: 00010246 RAX: 0000000000000008 RBX: 1ffff1102082df09 RCX: ffffffff82183f3d RDX: 0000000000000000 RSI: ffff888105f2da00 RDI: 0000000000000000 RBP: ffff88810416fa98 R08: 0000000000000001 R09: ffffed102082df5f R10: ffff88810416faf7 R11: ffffed102082df5e R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000008 R15: ffff88810416faf0 FS: 00007f5715efa740(0000) GS:ffff88811a700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000840 CR3: 000000010c2e0001 CR4: 0000000000370ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? ib_uverbs_handler_UVERBS_METHOD_INFO_HANDLES+0x4b0/0x4b0 ib_uverbs_cmd_verbs+0x1546/0x1940 ib_uverbs_ioctl+0x186/0x240 __x64_sys_ioctl+0x38a/0x1220 do_syscall_64+0x3f/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: 9f85cbe50aa0 ("RDMA/uverbs: Expose the new GID query API to user space") Link: https://lore.kernel.org/r/b971cc70a8b240a8b5eda33c99fa0558a0071be2.1620657876.git.leonro@nvidia.com Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/siw: Release xarray entryLeon Romanovsky2021-05-101-1/+1
| | | | | | | | | | | The xarray entry is allocated in siw_qp_add(), but release was missed in case zero-sized SQ was discovered. Fixes: 661f385961f0 ("RDMA/siw: Fix handling of zero-sized Read and Receive Queues.") Link: https://lore.kernel.org/r/f070b59d5a1114d5a4e830346755c2b3f141cde5.1620560472.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* RDMA/siw: Properly check send and receive CQ pointersLeon Romanovsky2021-05-101-6/+3
| | | | | | | | | | | | | The check for the NULL of pointer received from container_of() is incorrect by definition as it points to some offset from NULL. Change such check with proper NULL check of SIW QP attributes. Fixes: 303ae1cdfdf7 ("rdma/siw: application interface") Link: https://lore.kernel.org/r/a7535a82925f6f4c1f062abaa294f3ae6e54bdd2.1620560310.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* Merge tag 'block-5.13-2021-05-07' of git://git.kernel.dk/linux-blockLinus Torvalds2021-05-071-1/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block fixes from Jens Axboe: - dasd spelling fixes (Bhaskar) - Limit bio max size on multi-page bvecs to the hardware limit, to avoid overly large bio's (and hence latencies). Originally queued for the merge window, but needed a fix and was dropped from the initial pull (Changheun) - NVMe pull request (Christoph): - reset the bdev to ns head when failover (Daniel Wagner) - remove unsupported command noise (Keith Busch) - misc passthrough improvements (Kanchan Joshi) - fix controller ioctl through ns_head (Minwoo Im) - fix controller timeouts during reset (Tao Chiu) - rnbd fixes/cleanups (Gioh, Md, Dima) - Fix iov_iter re-expansion (yangerkun) * tag 'block-5.13-2021-05-07' of git://git.kernel.dk/linux-block: block: reexpand iov_iter after read/write nvmet: remove unsupported command noise nvme-multipath: reset bdev to ns head when failover nvme-pci: fix controller reset hang when racing with nvme_timeout nvme: move the fabrics queue ready check routines to core nvme: avoid memset for passthrough requests nvme: add nvme_get_ns helper nvme: fix controller ioctl through ns_head bio: limit bio max size RDMA/rtrs: fix uninitialized symbol 'cnt' s390: dasd: Mundane spelling fixes block/rnbd: Remove all likely and unlikely block/rnbd-clt: Check the return value of the function rtrs_clt_query block/rnbd: Fix style issues block/rnbd-clt: Change queue_depth type in rnbd_clt_session to size_t
| * RDMA/rtrs: fix uninitialized symbol 'cnt'Gioh Kim2021-05-031-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | rtrs_clt_rdma_cq_direct returns an ninitialized value in cnt if there is no session. This patch makes rtrs_clt_rdma_cq_direct returns a negative value for block layer not to try again. Fixes: 2958a995edc94 ("block/rnbd-clt: Support polling mode for IO latency optimization") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Link: https://lore.kernel.org/r/20210429092741.266533-1-gi-oh.kim@ionos.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | Merge branch 'work.recursive_removal' of ↵Linus Torvalds2021-05-031-63/+5
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull another simple_recursive_removal() update from Al Viro: "I missed one case when simple_recursive_removal() was introduced" * 'work.recursive_removal' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: qib_fs: switch to simple_recursive_removal()
| * | qib_fs: switch to simple_recursive_removal()Al Viro2021-03-081-63/+5
| | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds2021-05-01198-3835/+4967
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull rdma updates from Jason Gunthorpe: "This is significantly bug fixes and general cleanups. The noteworthy new features are fairly small: - XRC support for HNS and improves RQ operations - Bug fixes and updates for hns, mlx5, bnxt_re, hfi1, i40iw, rxe, siw and qib - Quite a few general cleanups on spelling, error handling, static checker detections, etc - Increase the number of device ports supported beyond 255. High port count software switches now exist - Several bug fixes for rtrs - mlx5 Device Memory support for host controlled atomics - Report SRQ tables through to rdma-tool" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (145 commits) IB/qib: Remove redundant assignment to ret RDMA/nldev: Add copy-on-fork attribute to get sys command RDMA/bnxt_re: Fix a double free in bnxt_qplib_alloc_res RDMA/siw: Fix a use after free in siw_alloc_mr IB/hfi1: Remove redundant variable rcd RDMA/nldev: Add QP numbers to SRQ information RDMA/nldev: Return SRQ information RDMA/restrack: Add support to get resource tracking for SRQ RDMA/nldev: Return context information RDMA/core: Add CM to restrack after successful attachment to a device RDMA/cma: Skip device which doesn't support CM RDMA/rxe: Fix a bug in rxe_fill_ip_info() RDMA/mlx5: Expose private query port RDMA/mlx4: Remove an unused variable RDMA/mlx5: Fix type assignment for ICM DM IB/mlx5: Set right RoCE l3 type and roce version while deleting GID RDMA/i40iw: Fix error unwinding when i40iw_hmc_sd_one fails RDMA/cxgb4: add missing qpid increment IB/ipoib: Remove unnecessary struct declaration RDMA/bnxt_re: Get rid of custom module reference counting ...
| * | | IB/qib: Remove redundant assignment to retYang Li2021-04-301-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Variable 'ret' is set to zero but this value is never read as it is overwritten with a new value later on, hence it is a redundant assignment and can be removed. Clean up the following clang-analyzer warning: drivers/infiniband/hw/qib/qib_sd7220.c:690:2: warning: Value stored to 'ret' is never read [clang-analyzer-deadcode.DeadStores] Link: https://lore.kernel.org/r/1619692940-104771-1-git-send-email-yang.lee@linux.alibaba.com Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/nldev: Add copy-on-fork attribute to get sys commandGal Pressman2021-04-271-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The new attribute indicates that the kernel copies DMA pages on fork, hence libibverbs' fork support through madvise and MADV_DONTFORK is not needed. The introduced attribute is always reported as supported since the kernel has the patch that added the copy-on-fork behavior. This allows the userspace library to identify older vs newer kernel versions. Extra care should be taken when backporting this patch as it relies on the fact that the copy-on-fork patch is merged, hence no check for support is added. Don't backport this patch unless you also have the following series: commit 70e806e4e645 ("mm: Do early cow for pinned pages during fork() for ptes") and commit 4eae4efa2c29 ("hugetlb: do early cow when page pinned on src mm"). Fixes: 70e806e4e645 ("mm: Do early cow for pinned pages during fork() for ptes") Fixes: 4eae4efa2c29 ("hugetlb: do early cow when page pinned on src mm") Link: https://lore.kernel.org/r/20210418121025.66849-1-galpress@amazon.com Signed-off-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/bnxt_re: Fix a double free in bnxt_qplib_alloc_resLv Yunlong2021-04-271-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In bnxt_qplib_alloc_res, it calls bnxt_qplib_alloc_dpi_tbl(). Inside bnxt_qplib_alloc_dpi_tbl, dpit->dbr_bar_reg_iomem is freed via pci_iounmap() in unmap_io error branch. After the callee returns err code, bnxt_qplib_alloc_res calls bnxt_qplib_free_res()->bnxt_qplib_free_dpi_tbl() in the fail branch. Then dpit->dbr_bar_reg_iomem is freed in the second time by pci_iounmap(). My patch set dpit->dbr_bar_reg_iomem to NULL after it is freed by pci_iounmap() in the first time, to avoid the double free. Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Link: https://lore.kernel.org/r/20210426140614.6722-1-lyl2019@mail.ustc.edu.cn Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/siw: Fix a use after free in siw_alloc_mrLv Yunlong2021-04-271-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our code analyzer reported a UAF. In siw_alloc_mr(), it calls siw_mr_add_mem(mr,..). In the implementation of siw_mr_add_mem(), mem is assigned to mr->mem and then mem is freed via kfree(mem) if xa_alloc_cyclic() failed. Here, mr->mem still point to a freed object. After, the execution continue up to the err_out branch of siw_alloc_mr, and the freed mr->mem is used in siw_mr_drop_mem(mr). My patch moves "mr->mem = mem" behind the if (xa_alloc_cyclic(..)<0) {} section, to avoid the uaf. Fixes: 2251334dcac9 ("rdma/siw: application buffer management") Link: https://lore.kernel.org/r/20210426011647.3561-1-lyl2019@mail.ustc.edu.cn Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn> Reviewed-by: Bernard Metzler <bmt@zurich.ihm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | IB/hfi1: Remove redundant variable rcdJiapeng Chong2021-04-271-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The variable rcd is being assigned a value from a calculation however the variable is never read, so this redundant variable can be removed. Cleans up the following clang-analyzer warning: drivers/infiniband/hw/hfi1/affinity.c:986:3: warning: Value stored to 'rcd' is never read [clang-analyzer-deadcode.DeadStores]. Link: https://lore.kernel.org/r/1619346696-46300-1-git-send-email-jiapeng.chong@linux.alibaba.com Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/nldev: Add QP numbers to SRQ informationNeta Ostrovsky2021-04-221-0/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add QP numbers that are associated with the SRQ to the SRQ information. The QPs are displayed in a range form. Sample output: $ rdma res show srq dev ibp8s0f0 srqn 0 type BASIC pdn 3 comm [ib_ipoib] dev ibp8s0f0 srqn 4 type BASIC lqpn 125-128,130-140 pdn 9 pid 3581 comm ibv_srq_pingpon dev ibp8s0f0 srqn 5 type BASIC lqpn 141-156 pdn 10 pid 3584 comm ibv_srq_pingpon dev ibp8s0f0 srqn 6 type BASIC lqpn 157-172 pdn 11 pid 3590 comm ibv_srq_pingpon dev ibp8s0f1 srqn 0 type BASIC pdn 3 comm [ib_ipoib] dev ibp8s0f1 srqn 1 type BASIC lqpn 329-344 pdn 4 pid 3586 comm ibv_srq_pingpon $ rdma res show srq lqpn 126-141 dev ibp8s0f0 srqn 4 type BASIC lqpn 126-128,130-140 pdn 9 pid 3581 comm ibv_srq_pingpon dev ibp8s0f0 srqn 5 type BASIC lqpn 141 pdn 10 pid 3584 comm ibv_srq_pingpon $ rdma res show srq lqpn 127 dev ibp8s0f0 srqn 4 type BASIC lqpn 127 pdn 9 pid 3581 comm ibv_srq_pingpon Link: https://lore.kernel.org/r/79a4bd4caec2248fd9583cccc26786af8e4414fc.1618753110.git.leonro@nvidia.com Signed-off-by: Neta Ostrovsky <netao@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/nldev: Return SRQ informationNeta Ostrovsky2021-04-221-0/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend the RDMA nldev return a SRQ information, like SRQ number, SRQ type, PD number, CQ number and process ID that created that SRQ. Sample output: $ rdma res show srq dev ibp8s0f0 srqn 0 type BASIC pdn 3 comm [ib_ipoib] dev ibp8s0f0 srqn 4 type BASIC pdn 9 pid 3581 comm ibv_srq_pingpon dev ibp8s0f0 srqn 5 type BASIC pdn 10 pid 3584 comm ibv_srq_pingpon dev ibp8s0f0 srqn 6 type BASIC pdn 11 pid 3590 comm ibv_srq_pingpon dev ibp8s0f1 srqn 0 type BASIC pdn 3 comm [ib_ipoib] dev ibp8s0f1 srqn 1 type BASIC pdn 4 pid 3586 comm ibv_srq_pingpon Link: https://lore.kernel.org/r/322f9210b95812799190dd4a0fb92f3a3bba0333.1618753110.git.leonro@nvidia.com Signed-off-by: Neta Ostrovsky <netao@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/restrack: Add support to get resource tracking for SRQNeta Ostrovsky2021-04-222-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to track SRQ resources, a new restrack object is initialized and added to the resource tracking database. Link: https://lore.kernel.org/r/0db71c409f24f2f6b019bf8797a8fed96fe7079c.1618753110.git.leonro@nvidia.com Signed-off-by: Neta Ostrovsky <netao@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/nldev: Return context informationNeta Ostrovsky2021-04-221-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend the RDMA nldev return a context information, like ctx number and process ID that created that context. This functionality is helpful to find orphan contexts that are not closed for some reason. Sample output: $ rdma res show ctx dev ibp8s0f0 ctxn 0 pid 980 comm ibv_rc_pingpong dev ibp8s0f0 ctxn 1 pid 981 comm ibv_rc_pingpong dev ibp8s0f0 ctxn 2 pid 992 comm ibv_rc_pingpong dev ibp8s0f1 ctxn 0 pid 984 comm ibv_rc_pingpong dev ibp8s0f1 ctxn 1 pid 987 comm ibv_rc_pingpong $ rdma res show ctx dev ibp8s0f1 dev ibp8s0f1 ctxn 0 pid 984 comm ibv_rc_pingpong dev ibp8s0f1 ctxn 1 pid 987 comm ibv_rc_pingpong Link: https://lore.kernel.org/r/5c956acfeac4e9d532988575f3da7d64cb449374.1618753110.git.leonro@nvidia.com Signed-off-by: Neta Ostrovsky <netao@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/core: Add CM to restrack after successful attachment to a deviceShay Drory2021-04-221-2/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The device attach triggers addition of CM_ID to the restrack DB. However, when error occurs, we releasing this device, but defer CM_ID release. This causes to the situation where restrack sees CM_ID that is not valid anymore. As a solution, add the CM_ID to the resource tracking DB only after the attachment is finished. Found by syzcaller: infiniband syz0: added syz_tun rdma_rxe: ignoring netdev event = 10 for syz_tun infiniband syz0: set down infiniband syz0: ib_query_port failed (-19) restrack: ------------[ cut here ]------------ infiniband syz0: BUG: RESTRACK detected leak of resources restrack: User CM_ID object allocated by syz-executor716 is not freed restrack: ------------[ cut here ]------------ Fixes: b09c4d701220 ("RDMA/restrack: Improve readability in task name management") Link: https://lore.kernel.org/r/ab93e56ba831eac65c322b3256796fa1589ec0bb.1618753862.git.leonro@nvidia.com Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/cma: Skip device which doesn't support CMParav Pandit2021-04-221-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A switchdev RDMA device do not support IB CM. When such device is added to the RDMA CM's device list, when application invokes rdma_listen(), cma attempts to listen to such device, however it has IB CM attribute disabled. Due to this, rdma_listen() call fails to listen for other non switchdev devices as well. A below error message can be seen. infiniband mlx5_0: RDMA CMA: cma_listen_on_dev, error -38 A failing call flow is below. cma_listen_on_all() cma_listen_on_dev() _cma_attach_to_dev() rdma_listen() <- fails on a specific switchdev device This is because rdma_listen() is hardwired to only work with iwarp or IB CM compatible devices. Hence, when a IB device doesn't support IB CM or IW CM, avoid adding such device to the cma list so rdma_listen() can't even be called. Link: https://lore.kernel.org/r/f9cac00d52864ea7c61295e43fb64cf4db4fdae6.1618753862.git.leonro@nvidia.com Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/rxe: Fix a bug in rxe_fill_ip_info()Bob Pearson2021-04-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a bug in rxe_fill_ip_info() which was attempting to convert from RDMA_NETWORK_XXX to RXE_NETWORK_XXX. .._IPV6 should have mapped to .._IPV6 not .._IPV4. Fixes: edebc8407b88 ("RDMA/rxe: Fix small problem in network_type patch") Link: https://lore.kernel.org/r/20210421035952.4892-1-rpearson@hpe.com Suggested-by: Frank Zago <frank.zago@hpe.com> Signed-off-by: Bob Pearson <rpearson@hpe.com> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/mlx5: Expose private query portMark Bloch2021-04-201-0/+173
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Expose a non standard query port via IOCTL that will be used to expose port attributes that are specific to mlx5 devices. The new interface receives a port number to query and returns a structure that contains the available attributes for that port. This will be used to fill the gap between pure DEVX use cases and use cases where a kernel needs to inform userspace about various kernel driver configurations that userspace must use in order to work correctly. Flags is used to indicate which fields are valid on return. MLX5_IB_UAPI_QUERY_PORT_VPORT: The vport number of the queered port. MLX5_IB_UAPI_QUERY_PORT_VPORT_VHCA_ID: The VHCA ID of the vport of the queered port. MLX5_IB_UAPI_QUERY_PORT_VPORT_STEERING_ICM_RX: The vport's RX ICM address used for sw steering. MLX5_IB_UAPI_QUERY_PORT_VPORT_STEERING_ICM_TX: The vport's TX ICM address used for sw steering. MLX5_IB_UAPI_QUERY_PORT_VPORT_REG_C0: The metadata used to tag egress packets of the vport. MLX5_IB_UAPI_QUERY_PORT_ESW_OWNER_VHCA_ID: The E-Switch owner vhca id of the vport. Link: https://lore.kernel.org/r/6e2ef13e5a266a6c037eb0105eb1564c7bb52f23.1618743394.git.leonro@nvidia.com Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/mlx4: Remove an unused variableChristophe JAILLET2021-04-201-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'in6' is unused. It is just declared and filled-in. It can be removed. This is a left over from commit 5ea8bbfc4929 ("mlx4: Implement IP based gids support for RoCE/SRIOV") Link: https://lore.kernel.org/r/413dc6e2ea9d85bda1131bbd6a730b7620839eab.1618932283.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/mlx5: Fix type assignment for ICM DMMaor Gottlieb2021-04-201-10/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We should hold the UAPI DM type in the base struct and not the internal mlx5 type. Fixes: 251b9d788750 ("RDMA/mlx5: Re-organize the DM code") Link: https://lore.kernel.org/r/58dedbd5c132660f808e59166d434e2eaa6ecf7a.1618753425.git.leonro@nvidia.com Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | IB/mlx5: Set right RoCE l3 type and roce version while deleting GIDParav Pandit2021-04-201-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently when GID is deleted, it zero out all the fields of the RoCE address in the SET_ROCE_ADDRESS command for a specified index. roce_version = 0 means RoCEv1 in the SET_ROCE_ADDRESS command. This assumes that device has RoCEv1 always enabled which is not always correct. For example Subfunction does not support RoCEv1. Due to this assumption a previously added RoCEv2 GID is always deleted as RoCEv1 GID. This results in a below syndrome: mlx5_core.sf mlx5_core.sf.4: mlx5_cmd_check:777:(pid 4256): SET_ROCE_ADDRESS(0x761) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x12822d) Hence set the right RoCE version during GID deletion provided by the core. Link: https://lore.kernel.org/r/d3f54129c90ca329caf438dbe31875d8ad08d91a.1618753425.git.leonro@nvidia.com Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/i40iw: Fix error unwinding when i40iw_hmc_sd_one failsSindhu Devale2021-04-201-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When i40iw_hmc_sd_one fails, chunk is freed without the deletion of chunk entry in the PBLE info list. Fix it by adding the chunk entry to the PBLE info list only after successful addition of SD in i40iw_hmc_sd_one. This fixes a static checker warning reported here: https://lore.kernel.org/linux-rdma/YHV4CFXzqTm23AOZ@mwanda/ Fixes: 9715830157be ("i40iw: add pble resource files") Link: https://lore.kernel.org/r/20210416002104.323-1-shiraz.saleem@intel.com Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Sindhu Devale <sindhu.devale@intel.com> Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/cxgb4: add missing qpid incrementPotnuri Bharat Teja2021-04-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | missing qpid increment leads to skipping few qpids while allocating QP. This eventually leads to adapter running out of qpids after establishing fewer connections than it actually supports. Current patch increments the qpid correctly. Fixes: cfdda9d76436 ("RDMA/cxgb4: Add driver for Chelsio T4 RNIC") Link: https://lore.kernel.org/r/20210415151422.9139-1-bharat@chelsio.com Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | IB/ipoib: Remove unnecessary struct declarationWan Jiabing2021-04-201-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct ipoib_cm_tx is defined at 245th line. And the definition is independent on the MACRO. The declaration here is unnecessary. Remove it. Link: https://lore.kernel.org/r/20210415092124.27684-1-wanjiabing@vivo.com Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Reviewed-by: Christoph Lameter <cl@linux.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/bnxt_re: Get rid of custom module reference countingLeon Romanovsky2021-04-191-14/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of manually messing with parent driver module reference counting rely on export symbol mechanism to ensure that proper probe/remove chain is performed. Link: https://lore.kernel.org/r/20210401065715.565226-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Acked-By: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/bnxt_re: Create direct symbol link between bnxt modulesLeon Romanovsky2021-04-191-6/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert indirect probe call to its direct equivalent to create a symbol link between RDMA and netdev modules. This will give us an ability to remove custom module reference counting that doesn't belong to the driver. Link: https://lore.kernel.org/r/20210401065715.565226-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Acked-By: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/bnxt_re: Depend on bnxt ethernet driver and not blindly select itLeon Romanovsky2021-04-191-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "select" kconfig keyword provides reverse dependency, however it doesn't check that selected symbol meets its own dependencies. Usually "select" is used for non-visible symbols, so instead of trying to keep dependencies in sync with BNXT ethernet driver, simply "depends on" it, like Kconfig documentation suggest. * CONFIG_PCI is already required by BNXT * CONFIG_NETDEVICES and CONFIG_ETHERNET are needed to chose BNXT Link: https://lore.kernel.org/r/20210401065715.565226-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Acked-By: Devesh Sharma <devesh.sharma@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | IB/ipoib: Improve latency in ipoib/cm connection formationManjunath Patil2021-04-191-5/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently IPoIB connected mode queries the device to get the pkey table entry during connection formation. This will increase the time taken to form the connection, especially when limited pkeys are in use. This gets worse when multiple connection attempts are done in parallel. Since ipoib interfaces are locked to a single pkey, use the pkey index that was determined at link up time instead of searching for anything. This improved the latency from 500ms to 1ms on an internal setup. Link: https://lore.kernel.org/r/1618338965-16717-1-git-send-email-manjunath.b.patil@oracle.com Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Manjunath Patil <manjunath.b.patil@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/core: Unify RoCE check and re-factor codeHåkon Bugge2021-04-191-6/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In cm_req_handler(), unify the check for RoCE and re-factor to avoid one test. Link: https://lore.kernel.org/r/1617705423-15570-1-git-send-email-haakon.bugge@oracle.com Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Fixes: 8f9748602491 ("IB/cm: Reduce dependency on gid attribute ndev check") Fixes: 194f64a3cad3 ("RDMA/core: Fix corrupted SL on passive side") Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/ipoib: Print a message if only child interface is UPJack Wang2021-04-141-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When "enhanced IPoIB" is enabled for CX-5 devices it requires the parent device to be UP, otherwise the child devices won't work. Thus add a debug message to give admin a hint when only the child interface is UP but parent interface is not. Link: https://lore.kernel.org/r/20210408093215.24023-1-jinpu.wang@ionos.com Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/hns: Remove duplicated hem page size config codeXi Wang2021-04-143-110/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove duplicated code for setting hem page size in PF and VF. Link: https://lore.kernel.org/r/1617715514-29039-7-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/hns: Enable RoCE on virtual functionsWei Xu2021-04-143-39/+202
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce the VF support by adding code changes to allow VF PCI device initialization, assgining the reserved resource of the PF to the active VFs, setting the default abilities, applying the interruptions, resetting and reducing the default QP/GID number to aovid exceeding the hardware limitation. Link: https://lore.kernel.org/r/1617715514-29039-6-git-send-email-liweihang@huawei.com Signed-off-by: Wei Xu <xuwei5@hisilicon.com> Signed-off-by: Shengming Shu <shushengming1@huawei.com> Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/hns: Set parameters of all the functions belong to a PFWei Xu2021-04-141-2/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Switch parameters of all functions belong to a PF should be set including VFs. Link: https://lore.kernel.org/r/1617715514-29039-5-git-send-email-liweihang@huawei.com Signed-off-by: Wei Xu <xuwei5@hisilicon.com> Signed-off-by: Shengming Shu <shushengming1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/hns: Reserve the resource for the VFsWei Xu2021-04-143-28/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Query the resource including EQC/SMAC/SGID from the firmware in the PF and distribute fairly among all the functions belong to the PF. Link: https://lore.kernel.org/r/1617715514-29039-4-git-send-email-liweihang@huawei.com Signed-off-by: Wei Xu <xuwei5@hisilicon.com> Signed-off-by: Shengming Shu <shushengming1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/hns: Query the number of functions supported by the PFWei Xu2021-04-143-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Query how many functions are supported by the PF from the FW and store it in the hns_roce_dev structure which will be used to support the configuration of virtual functions. Link: https://lore.kernel.org/r/1617715514-29039-3-git-send-email-liweihang@huawei.com Signed-off-by: Wei Xu <xuwei5@hisilicon.com> Signed-off-by: Shengming Shu <shushengming1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/hns: Simplify function's resource related commandXi Wang2021-04-143-305/+89
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use hr_reg_write/read() to simplify codes about configuring function's resource. And because the design of PF/VF fields is same, they can be defined only once. Link: https://lore.kernel.org/r/1617715514-29039-2-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/rtrs-clt: Simplify error messageGioh Kim2021-04-141-16/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Two error messages are only different message but have common code to generate the path string. Link: https://lore.kernel.org/r/20210406123639.202899-4-gi-oh.kim@ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/rtrs-srv: More debugging info when fail to send replyGioh Kim2021-04-141-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It does not help to debug if it only print error message without any debugging information which session and connection the error happened. Link: https://lore.kernel.org/r/20210406123639.202899-3-gi-oh.kim@ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/rtrs-clt: Print more info when an error happensGioh Kim2021-04-141-4/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Client prints only error value and it is not enough for debugging. 1. When client receives an error from server: the client does not only print the error value but also more information of server connection. 2. When client failes to send IO: the client gets an error from RDMA layer. It also print more information of server connection. Link: https://lore.kernel.org/r/20210406123639.202899-2-gi-oh.kim@ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
| * | | RDMA/rtrs-clt: New sysfs attribute to print the latency of each pathGioh Kim2021-04-141-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It shows the latest latency that the client checked when sending the heart-beat. Link: https://lore.kernel.org/r/20210407113444.150961-3-gi-oh.kim@ionos.com Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>