| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
smatch gives the warning:
drivers/infiniband/ulp/rtrs/rtrs-srv.c:1805 rtrs_rdma_connect() warn: passing zero to 'PTR_ERR'
Which is trying to say smatch has shown that srv is not an error pointer
and thus cannot be passed to PTR_ERR.
The solution is to move the list_add() down after full initilization of
rtrs_srv. To avoid holding the srv_mutex too long, only hold it during the
list operation as suggested by Leon.
Fixes: 03e9b33a0fd6 ("RDMA/rtrs: Only allow addition of path to an already established session")
Link: https://lore.kernel.org/r/20210216143807.65923-1-jinpu.wang@cloud.ionos.com
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current code computes a number of channels per SRP target and spreads
them equally across all online NUMA nodes. Each channel is then assigned
a CPU within this node.
In the case of unbalanced, or even unpopulated nodes, some channels do not
get a CPU associated and thus do not get connected. This causes the SRP
connection to fail.
This patch solves the issue by rewriting channel computation and
allocation:
- Drop channel to node/CPU association as it had no real effect on
locality but added unnecessary complexity.
- Tweak the number of channels allocated to reduce CPU contention when
possible:
- Up to one channel per CPU (instead of up to 4 by node)
- At least 4 channels per node, unless ch_count module parameter is
used.
Link: https://lore.kernel.org/r/9cb4d9d3-30ad-2276-7eff-e85f7ddfb411@suse.com
Signed-off-by: Nicolas Morey-Chaisemartin <nmoreychaisemartin@suse.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
put_device() decreases the ref-count and then the device will be
cleaned-up, while at is also add missing put_device in
rtrs_srv_create_once_sysfs_root_folders
This patch solves a kmemleak error as below:
unreferenced object 0xffff88809a7a0710 (size 8):
comm "kworker/4:1H", pid 113, jiffies 4295833049 (age 6212.380s)
hex dump (first 8 bytes):
62 6c 61 00 6b 6b 6b a5 bla.kkk.
backtrace:
[<0000000054413611>] kstrdup+0x2e/0x60
[<0000000078e3120a>] kobject_set_name_vargs+0x2f/0xb0
[<00000000f1a17a6b>] dev_set_name+0xab/0xe0
[<00000000d5502e32>] rtrs_srv_create_sess_files+0x2fb/0x314 [rtrs_server]
[<00000000ed11a1ef>] rtrs_srv_info_req_done+0x631/0x800 [rtrs_server]
[<000000008fc5aa8f>] __ib_process_cq+0x94/0x100 [ib_core]
[<00000000a9599cb4>] ib_cq_poll_work+0x32/0xc0 [ib_core]
[<00000000cfc376be>] process_one_work+0x4bc/0x980
[<0000000016e5c96a>] worker_thread+0x78/0x5c0
[<00000000c20b8be0>] kthread+0x191/0x1e0
[<000000006c9c0003>] ret_from_fork+0x3a/0x50
Fixes: baa5b28b7a47 ("RDMA/rtrs-srv: Replace device_register with device_initialize and device_add")
Link: https://lore.kernel.org/r/20210212134525.103456-5-jinpu.wang@cloud.ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
kmemleak reported an error as below:
unreferenced object 0xffff8880674b7640 (size 64):
comm "kworker/4:1H", pid 113, jiffies 4296403507 (age 507.840s)
hex dump (first 32 bytes):
69 70 3a 31 39 32 2e 31 36 38 2e 31 32 32 2e 31 ip:192.168.122.1
31 30 40 69 70 3a 31 39 32 2e 31 36 38 2e 31 32 10@ip:192.168.12
backtrace:
[<0000000054413611>] kstrdup+0x2e/0x60
[<0000000078e3120a>] kobject_set_name_vargs+0x2f/0xb0
[<00000000ca2be3ee>] kobject_init_and_add+0xb0/0x120
[<0000000062ba5e78>] rtrs_srv_create_sess_files+0x14c/0x314 [rtrs_server]
[<00000000b45b7217>] rtrs_srv_info_req_done+0x5b1/0x800 [rtrs_server]
[<000000008fc5aa8f>] __ib_process_cq+0x94/0x100 [ib_core]
[<00000000a9599cb4>] ib_cq_poll_work+0x32/0xc0 [ib_core]
[<00000000cfc376be>] process_one_work+0x4bc/0x980
[<0000000016e5c96a>] worker_thread+0x78/0x5c0
[<00000000c20b8be0>] kthread+0x191/0x1e0
[<000000006c9c0003>] ret_from_fork+0x3a/0x50
It is caused by the not-freed kobject of rtrs_srv_sess. The kobject
embedded in rtrs_srv_sess has ref-counter 2 after calling
process_info_req(). Therefore it must call kobject_put twice. Currently
it calls kobject_put only once at rtrs_srv_destroy_sess_files because
kobject_del removes the state_in_sysfs flag and then kobject_put in
free_sess() is not called.
This patch moves kobject_del() into free_sess() so that the kobject of
rtrs_srv_sess can be freed. And also this patch adds the missing call of
sysfs_remove_group() to clean-up the sysfs directory.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20210212134525.103456-4-jinpu.wang@cloud.ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While adding a path from the client side to an already established
session, it was possible to provide the destination IP to a different
server. This is dangerous.
This commit adds an extra member to the rtrs_msg_conn_req structure, named
first_conn; which is supposed to notify if the connection request is the
first for that session or not.
On the server side, if a session does not exist but the first_conn
received inside the rtrs_msg_conn_req structure is 1, the connection
request is failed. This signifies that the connection request is for an
already existing session, and since the server did not find one, it is an
wrong connection request.
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20210212134525.103456-3-jinpu.wang@cloud.ionos.com
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Reviewed-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BUG: KASAN: stack-out-of-bounds in _mlx4_ib_post_send+0x1bd2/0x2770 [mlx4_ib]
Read of size 4 at addr ffff8880d5a7f980 by task kworker/0:1H/565
CPU: 0 PID: 565 Comm: kworker/0:1H Tainted: G O 5.4.84-storage #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10
Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.00 09/04/2012
Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
Call Trace:
dump_stack+0x96/0xe0
print_address_description.constprop.4+0x1f/0x300
? irq_work_claim+0x2e/0x50
__kasan_report.cold.8+0x78/0x92
? _mlx4_ib_post_send+0x1bd2/0x2770 [mlx4_ib]
kasan_report+0x10/0x20
_mlx4_ib_post_send+0x1bd2/0x2770 [mlx4_ib]
? check_chain_key+0x1d7/0x2e0
? _mlx4_ib_post_recv+0x630/0x630 [mlx4_ib]
? lockdep_hardirqs_on+0x1a8/0x290
? stack_depot_save+0x218/0x56e
? do_profile_hits.isra.6.cold.13+0x1d/0x1d
? check_chain_key+0x1d7/0x2e0
? save_stack+0x4d/0x80
? save_stack+0x19/0x80
? __kasan_slab_free+0x125/0x170
? kfree+0xe7/0x3b0
rdma_write_sg+0x5b0/0x950 [rtrs_server]
The problem is when we send imm_wr, the type should be ib_rdma_wr, so hw
driver like mlx4 can do rdma_wr(wr), so fix it by use the ib_rdma_wr as
type for imm_wr.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20210212134525.103456-2-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a system receives a REREG event from the SM, then the SM information
in the kernel is marked as invalid and a request is sent to the SM to
update the information. The SM information is invalid in that time period.
However, receiving a REREG also occurs simultaneously in user space
applications that are now trying to rejoin the multicast groups. Some of
those may be sendonly multicast groups which are then failing.
If the SM information is invalid then ib_sa_sendonly_fullmem_support()
returns false. That is wrong because it just means that we do not know yet
if the potentially new SM supports sendonly joins.
Sendonly join was introduced in 2015 and all the Subnet managers have
supported it ever since. So there is no point in checking if a subnet
manager supports it.
Should an old opensm get a request for a sendonly join then the request
will fail. The code that is removed here accomodated that situation and
fell back to a full join.
Falling back to a full join is problematic in itself. The reason to use
the sendonly join was to reduce the traffic on the Infiniband fabric
otherwise one could have just stayed with the regular join. So this patch
may cause users of very old opensms to discover that lots of traffic
needlessly crosses their IB fabrics.
Link: https://lore.kernel.org/r/alpine.DEB.2.22.394.2101281845160.13303@www.lameter.com
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reduce the number of instructions made for setting protection caps. No
need to do bitwise OR with 0 since we can zero the return value in the
beginning of the function.
Link: https://lore.kernel.org/r/20210111145754.56727-5-mgurtovoy@nvidia.com
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A value of 0 will casue the driver to fail establishing a valid connection
to remote target.
The following can be seen in the log in this case:
iser: iser_connect: connecting to: 1.1.1.88:3260
iser: iser_cma_handler: address resolved (0): status 0 conn 00000000090aa4de id 00000000167d3b5a
iser: iser_cma_handler: route resolved (2): status 0 conn 00000000090aa4de id 00000000167d3b5a
iser: iscsi_iser_ep_poll: iser conn 00000000090aa4de rc = 0
iser: iser_create_ib_conn_res: setting conn 00000000090aa4de cma_id 00000000167d3b5a qp 00000000efa80660 max_send_wr 4619
iser_cma_handler: established (9): status 0 conn 00000000090aa4de id 00000000167d3b5a
iser: iser_connected_handler: remote qpn:1c7 my qpn:1c6
iser: iser_connected_handler: conn 00000000090aa4de: negotiated remote invalidation
iser: iscsi_iser_ep_poll: iser conn 00000000090aa4de rc = 1
scsi host10: iSCSI Initiator over iSER
mlx5_core 0000:07:00.0: mlx5_cmd_check:769:(pid 616473): CREATE_MKEY(0x200) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x3bf6f)
iser: iser_create_fastreg_desc: Failed to allocate ib_fast_reg_mr err=-22
iser: iser_alloc_rx_descriptors: failed allocating rx descriptors / data buffers
iser: iscsi_iser_ep_disconnect: ep 00000000d2040785 iser conn 00000000090aa4de
iser: iser_conn_terminate: iser_conn 00000000090aa4de state 3
iser: iser_free_ib_conn_res: freeing conn 00000000090aa4de cma_id 00000000167d3b5a qp 00000000efa80660
iser: iser_device_try_release: device 00000000dc871b1b refcount 0
Link: https://lore.kernel.org/r/20210111145754.56727-4-mgurtovoy@nvidia.com
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
Remove the check from the module_init function.
Link: https://lore.kernel.org/r/20210111145754.56727-3-mgurtovoy@nvidia.com
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
No need to add semicolon after closing bracket.
Link: https://lore.kernel.org/r/20210111145754.56727-2-mgurtovoy@nvidia.com
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Use if/else clause instead of "condition ? val1 : val2" to make the code
cleaner and simpler.
Link: https://lore.kernel.org/r/20210110111903.486681-3-mgurtovoy@nvidia.com
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
No need to add semicolon after closing bracket.
Link: https://lore.kernel.org/r/20210110111903.486681-2-mgurtovoy@nvidia.com
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
The Linux convention is to have only 1 new line between functions.
Link: https://lore.kernel.org/r/20210110111903.486681-1-mgurtovoy@nvidia.com
Reviewed-by: Israel Rukshin <israelr@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When KASAN is enabled, we notice warning below:
[ 483.436975] ==================================================================
[ 483.437234] BUG: KASAN: stack-out-of-bounds in _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib]
[ 483.437430] Read of size 4 at addr ffff88a195fd7d30 by task kworker/1:3/6954
[ 483.437731] CPU: 1 PID: 6954 Comm: kworker/1:3 Kdump: loaded Tainted: G O 5.4.82-pserver #5.4.82-1+feature+linux+5.4.y+dbg+20201210.1532+987e7a6~deb10
[ 483.437976] Hardware name: Supermicro Super Server/X11DDW-L, BIOS 3.3 02/21/2020
[ 483.438168] Workqueue: rtrs_server_wq hb_work [rtrs_core]
[ 483.438323] Call Trace:
[ 483.438486] dump_stack+0x96/0xe0
[ 483.438646] ? _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib]
[ 483.438802] print_address_description.constprop.6+0x1b/0x220
[ 483.438966] ? _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib]
[ 483.439133] ? _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib]
[ 483.439285] __kasan_report.cold.9+0x1a/0x32
[ 483.439444] ? _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib]
[ 483.439597] kasan_report+0x10/0x20
[ 483.439752] _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib]
[ 483.439910] ? update_sd_lb_stats+0xfb1/0xfc0
[ 483.440073] ? set_reg_wr+0x520/0x520 [mlx5_ib]
[ 483.440222] ? update_group_capacity+0x340/0x340
[ 483.440377] ? find_busiest_group+0x314/0x870
[ 483.440526] ? update_sd_lb_stats+0xfc0/0xfc0
[ 483.440683] ? __bitmap_and+0x6f/0x100
[ 483.440832] ? __lock_acquire+0xa2/0x2150
[ 483.440979] ? __lock_acquire+0xa2/0x2150
[ 483.441128] ? __lock_acquire+0xa2/0x2150
[ 483.441279] ? debug_lockdep_rcu_enabled+0x23/0x60
[ 483.441430] ? lock_downgrade+0x390/0x390
[ 483.441582] ? __lock_acquire+0xa2/0x2150
[ 483.441729] ? __lock_acquire+0xa2/0x2150
[ 483.441876] ? newidle_balance+0x425/0x8f0
[ 483.442024] ? __lock_acquire+0xa2/0x2150
[ 483.442172] ? debug_lockdep_rcu_enabled+0x23/0x60
[ 483.442330] hb_work+0x15d/0x1d0 [rtrs_core]
[ 483.442479] ? schedule_hb+0x50/0x50 [rtrs_core]
[ 483.442627] ? lock_downgrade+0x390/0x390
[ 483.442781] ? process_one_work+0x40d/0xa50
[ 483.442931] process_one_work+0x4ee/0xa50
[ 483.443082] ? pwq_dec_nr_in_flight+0x110/0x110
[ 483.443231] ? do_raw_spin_lock+0x119/0x1d0
[ 483.443383] worker_thread+0x65/0x5c0
[ 483.443532] ? process_one_work+0xa50/0xa50
[ 483.451839] kthread+0x1e2/0x200
[ 483.451983] ? kthread_create_on_node+0xc0/0xc0
[ 483.452139] ret_from_fork+0x3a/0x50
The problem is we use wrong type when send wr, hw driver expect the type
of IB_WR_RDMA_WRITE_WITH_IMM wr should be ib_rdma_wr, and doing
container_of to access member. The fix is simple use ib_rdma_wr instread
of ib_send_wr.
Fixes: c0894b3ea69d ("RDMA/rtrs: core: lib functions shared between client and server modules")
Link: https://lore.kernel.org/r/20201217141915.56989-20-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix up wr_avail accounting. if wr_cnt is 0, then we do SIGNAL for first
wr, in completion we add queue_depth back, which is not right in the
sense of tracking for available wr.
So fix it by init wr_cnt to 1.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-19-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
We do not need to wait for REG_MR completion, so remove the
SIGNAL flag.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-18-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
We may want to add new flags, so it's better to use bitmask to check flags.
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-17-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
For HB, there is no need to generate signal for completion.
Also remove a comment accordingly.
Fixes: c0894b3ea69d ("RDMA/rtrs: core: lib functions shared between client and server modules")
Link: https://lore.kernel.org/r/20201217141915.56989-16-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reported-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make all failure cases go to the common path to avoid duplicate code.
And some issued existed before.
1. clt need to be freed to avoid memory leak.
2. return ERR_PTR(-ENOMEM) if kobject_create_and_add fails, because
rtrs_clt_open checks the return value of by call "IS_ERR(clt)".
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-15-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We had a few places wr_cqe is not set, which could lead to NULL pointer
deref or GPF in error case.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-14-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Let's rename it to rtrs_clt_change_state since the previous one is
killed.
Also update the comment to make it more clear.
Link: https://lore.kernel.org/r/20201217141915.56989-13-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
It is just a wrapper of rtrs_clt_change_state_get_old, and we can reuse
rtrs_clt_change_state_get_old with add the checking of 'old_state' is
valid or not.
Link: https://lore.kernel.org/r/20201217141915.56989-12-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
| |
This is not needed since the label is just after the place.
Link: https://lore.kernel.org/r/20201217141915.56989-11-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
Let's wait the inflight permits before free it.
Link: https://lore.kernel.org/r/20201217141915.56989-10-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
Since the two functions are called together, let's consolidate them in
a new function rtrs_clt_destroy_sysfs_root.
Link: https://lore.kernel.org/r/20201217141915.56989-9-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Per the comment of kobject_init_and_add, we need to free the memory
by call kobject_put.
Fixes: 215378b838df ("RDMA/rtrs: client: sysfs interface functions")
Fixes: 91b11610af8d ("RDMA/rtrs: server: sysfs interface functions")
Link: https://lore.kernel.org/r/20201217141915.56989-8-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The rtrs_iu_free is called in rtrs_iu_alloc if memory is limited, so we
don't need to free the same iu again.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-7-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently rtrs when create_qp use a coarse numbers (bigger in general),
which leads to hardware create more resources which only waste memory
with no benefits.
- SERVICE con,
For max_send_wr/max_recv_wr, it's 2 times SERVICE_CON_QUEUE_DEPTH + 2
- IO con
For max_send_wr/max_recv_wr, it's sess->queue_depth * 3 + 1
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-6-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Remove self first to avoid deadlock, we don't want to
use close_work to remove sess sysfs.
Fixes: 91b11610af8d ("RDMA/rtrs: server: sysfs interface functions")
Link: https://lore.kernel.org/r/20201217141915.56989-5-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Tested-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
In this error case, we don't need hold mutex to call close_sess.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-4-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Tested-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
rtrs does not have same limit for both max_send_wr and max_recv_wr,
To allow client and server set different values, export in a separate
parameter for rtrs_cq_qp_create.
Also fix the type accordingly, u32 should be used instead of u16.
Fixes: c0894b3ea69d ("RDMA/rtrs: core: lib functions shared between client and server modules")
Link: https://lore.kernel.org/r/20201217141915.56989-2-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Replace a comma between expression statements by a semicolon.
Link: https://lore.kernel.org/r/20201214134118.4349-1-zhengyongjun3@huawei.com
Link: https://lore.kernel.org/r/20201214134146.4456-1-zhengyongjun3@huawei.com
Link: https://lore.kernel.org/r/20201214134218.4510-1-zhengyongjun3@huawei.com
Link: https://lore.kernel.org/r/20201214134243.4563-1-zhengyongjun3@huawei.com
Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously the rnbd client requested the rtrs to allocate rnbd_iu
just after the rtrs_iu. So the rnbd client passes the size of
rnbd_iu for rtrs_clt_open() and rtrs creates an array of
rnbd_iu and rtrs_iu.
For IO handling, rnbd_iu exists after the request because we pass
the size of rnbd_iu when setting the tag-set. Therefore we do not
use the rnbd_iu allocated by rtrs for IO handling.
We only use the rnbd_iu allocated by rtrs when doing session
initialization. Almost all rnbd_iu allocated by rtrs are wasted.
By this patch the rnbd client does not request rnbd_iu allocation
to rtrs but allocate it for itself when doing session initialization.
Also remove unused rtrs_permit_to_pdu from rtrs.
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Pull rdma updates from Jason Gunthorpe:
"A smaller set of patches, nothing stands out as being particularly
major this cycle. The biggest item would be the new HIP09 HW support
from HNS, otherwise it was pretty quiet for new work here:
- Driver bug fixes and updates: bnxt_re, cxgb4, rxe, hns, i40iw,
cxgb4, mlx4 and mlx5
- Bug fixes and polishing for the new rts ULP
- Cleanup of uverbs checking for allowed driver operations
- Use sysfs_emit all over the place
- Lots of bug fixes and clarity improvements for hns
- hip09 support for hns
- NDR and 50/100Gb signaling rates
- Remove dma_virt_ops and go back to using the IB DMA wrappers
- mlx5 optimizations for contiguous DMA regions"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (147 commits)
RDMA/cma: Don't overwrite sgid_attr after device is released
RDMA/mlx5: Fix MR cache memory leak
RDMA/rxe: Use acquire/release for memory ordering
RDMA/hns: Simplify AEQE process for different types of queue
RDMA/hns: Fix inaccurate prints
RDMA/hns: Fix incorrect symbol types
RDMA/hns: Clear redundant variable initialization
RDMA/hns: Fix coding style issues
RDMA/hns: Remove unnecessary access right set during INIT2INIT
RDMA/hns: WARN_ON if get a reserved sl from users
RDMA/hns: Avoid filling sl in high 3 bits of vlan_id
RDMA/hns: Do shift on traffic class when using RoCEv2
RDMA/hns: Normalization the judgment of some features
RDMA/hns: Limit the length of data copied between kernel and userspace
RDMA/mlx4: Remove bogus dev_base_lock usage
RDMA/uverbs: Fix incorrect variable type
RDMA/core: Do not indicate device ready when device enablement fails
RDMA/core: Clean up cq pool mechanism
RDMA/core: Update kernel documentation for ib_create_named_qp()
MAINTAINERS: SOFT-ROCE: Change Zhu Yanjun's email address
...
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
iser_initialize_task_headers() uses in_interrupt() to find out if it is
safe to acquire a mutex.
in_interrupt() is deprecated as it is ill defined and does not provide
what it suggests. Aside of that it covers only parts of the contexts in
which a mutex may not be acquired.
The following callchains exist:
iscsi_queuecommand() *locks* iscsi_session::frwd_lock
-> iscsi_prep_scsi_cmd_pdu()
-> session->tt->init_task() (iscsi_iser_task_init())
-> iser_initialize_task_headers()
-> iscsi_iser_task_xmit() (iscsi_transport::xmit_task)
-> iscsi_iser_task_xmit_unsol_data()
-> iser_send_data_out()
-> iser_initialize_task_headers()
iscsi_data_xmit() *locks* iscsi_session::frwd_lock
-> iscsi_prep_mgmt_task()
-> session->tt->init_task() (iscsi_iser_task_init())
-> iser_initialize_task_headers()
-> iscsi_prep_scsi_cmd_pdu()
-> session->tt->init_task() (iscsi_iser_task_init())
-> iser_initialize_task_headers()
__iscsi_conn_send_pdu() caller has iscsi_session::frwd_lock
-> iscsi_prep_mgmt_task()
-> session->tt->init_task() (iscsi_iser_task_init())
-> iser_initialize_task_headers()
-> session->tt->xmit_task() (
The only callchain that is close to be invoked in preemptible context:
iscsi_xmitworker() worker
-> iscsi_data_xmit()
-> iscsi_xmit_task()
-> conn->session->tt->xmit_task() (iscsi_iser_task_xmit()
In iscsi_iser_task_xmit() there is this check:
if (!task->sc)
return iscsi_iser_mtask_xmit(conn, task);
so it does end up in iser_initialize_task_headers() and
iser_initialize_task_headers() relies on iscsi_task::sc == NULL.
Remove conditional locking of iser_conn::state_mutex because there is no
call chain to do so. Remove the goto label and return early now that there
is no clean up needed.
Link: https://lore.kernel.org/r/20201204174256.62xfcvudndt7oufl@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Max Gurtovoy <maxg@nvidia.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Some functions have different names between their prototypes and the
kernel-doc markup.
Others need to be fixed, as kernel-doc markups should use this format:
identifier - description
Link: https://lore.kernel.org/r/78b98c41a5a0f4c0106433d305b143028a4168b0.1606823973.git.mchehab+huawei@kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Currently ipoib choose cq completion vector based on port number, when HCA
only have one port, all the interface recv queue completion are bind to cq
completion vector 0.
To better distribute the load, use same method as __ib_alloc_cq_any to
choose completion vector, with the change, each interface now use
different completion vectors.
Link: https://lore.kernel.org/r/20201013074342.15867-1-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
| |\
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
From https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git
The rc RDMA branch is needed due to dependencies on the next patches.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
It is not the kernel style, warning reported by coccicheck:
./ib_isert.c:1104:12-24: WARNING: Comparison to bool
Link: https://lore.kernel.org/r/1604404674-32998-1-git-send-email-zou_wei@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The IBTA specification has new speeds - HDR and NDR, supporting signaling
rate of 50Gb and 100Gb respectively. ethtool support of ipoib driver
translates IB speed to signaling rate. Added translation of HDR and NDR IB
types to rates of 50Gb and 100Gb ethernet speed.
Link: https://lore.kernel.org/r/20201026132904.1338526-1-leon@kernel.org
Signed-off-by: Meir Lichtinger <meirl@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Manual changes for sysfs_emit as cocci scripts can't easily convert them.
Link: https://lore.kernel.org/r/ecde7791467cddb570c6f6d2c908ffbab9145cac.1602122880.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Make changes to use sysfs_emit in the RDMA code as cocci scripts can not
be written to handle _all_ the possible variants of various sprintf family
uses in sysfs show functions.
While there, make the code more legible and update its style to be more
like the typical kernel styles.
Miscellanea:
o Use intermediate pointers for dereferences
o Add and use string lookup functions
o return early when any intermediate call fails so normal return is
at the bottom of the function
o mlx4/mcg.c:sysfs_show_group: use scnprintf to format intermediate strings
Link: https://lore.kernel.org/r/f5c9e4c9d8dafca1b7b70bd597ee7f8f219c31c8.1602122880.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Remove the argument since it is not used in the function.
Link: https://lore.kernel.org/r/20201023074353.21946-13-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Since the three functions share the similar logic, let's introduce one
common function for it.
Link: https://lore.kernel.org/r/20201023074353.21946-12-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This function isn't needed since no caller checks the old_state of sess.
Link: https://lore.kernel.org/r/20201023074353.21946-11-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
process_info_rsp checks that sg_cnt is zero twice.
Link: https://lore.kernel.org/r/20201023074353.21946-10-jinpu.wang@cloud.ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The events returning the same error value are put together.
Link: https://lore.kernel.org/r/20201023074353.21946-9-jinpu.wang@cloud.ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The direction of DMA operation is already in the rtrs_iu
Link: https://lore.kernel.org/r/20201023074353.21946-8-jinpu.wang@cloud.ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
It should mean region here.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20201023074353.21946-7-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|