author     Maor Gottlieb <maorg@mellanox.com>  2016-06-17 14:01:38 +0200
committer  Doug Ledford <dledford@redhat.com>  2016-06-23 17:02:45 +0200
commit     89ea94a7b6c40eb423c144aef1caceebaff79c8d
tree       2831167a7aecf1b24275d3aca92bc1d3fadcc107 /drivers/infiniband/hw/mlx5/mr.c
parent     IB/mlx5: Implements disassociate_ucontext API
IB/mlx5: Reset flow support for IB kernel ULPs

The driver exposes interfaces that directly relate to HW state. Upon fatal error, consumers of these interfaces (ULPs) that rely on completion of all their posted work-request could hang, thereby introducing dependencies in shutdown order. To prevent this from happening, we manage the relevant resources (CQs, QPs) that are used by the device. Upon a fatal error, we now generate simulated completions for outstanding WQEs that were not completed at the time the HW was reset.

It includes invoking the completion event handler for all involved CQs so that the ULPs will poll those CQs. When polled we return simulated CQEs with IB_WC_WR_FLUSH_ERR return code enabling ULPs to clean up their resources and not wait forever for completions upon receiving remove_one.

The above change requires an extra check in the data path to make sure that when device is in error state, the simulated CQEs will be returned and no further WQEs will be posted.

Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

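
The behaviour described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the driver's implementation: every name below (struct sim_queue, sim_post, sim_poll, the DEV_STATE_* and WC_* constants, the -1 return value) is hypothetical and merely stands in for the real mlx5 and IB core types. It shows the two data-path checks the message mentions: posting is refused once the device is in the error state, and polling then returns a flush-error completion for each work request that was still outstanding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the mlx5 or IB core definitions. */
enum dev_state { DEV_STATE_UP, DEV_STATE_INTERNAL_ERROR };
enum wc_status { WC_SUCCESS, WC_WR_FLUSH_ERR };

struct work_completion {
	unsigned long wr_id;
	enum wc_status status;
};

#define MAX_OUTSTANDING 8

/* One queue that tracks work requests posted but not yet completed. */
struct sim_queue {
	enum dev_state state;
	unsigned long outstanding[MAX_OUTSTANDING];
	size_t head;	/* index of the oldest outstanding request */
	size_t count;	/* number of outstanding requests */
};

/* Post a work request. Refused (-1) when the device is in the error
 * state or the queue is full, so no new work is queued after a reset. */
static int sim_post(struct sim_queue *q, unsigned long wr_id)
{
	if (q->state == DEV_STATE_INTERNAL_ERROR)
		return -1;
	if (q->count == MAX_OUTSTANDING)
		return -1;
	q->outstanding[(q->head + q->count) % MAX_OUTSTANDING] = wr_id;
	q->count++;
	return 0;
}

/* Poll for completions. In the error state, each outstanding request is
 * reported once with a flush-error status, oldest first. In the healthy
 * state real hardware would produce completions; this model has none,
 * so it simply returns 0. Returns the number of entries written to wc. */
static int sim_poll(struct sim_queue *q, struct work_completion *wc,
		    int num_entries)
{
	int n = 0;

	assert(wc != NULL);
	if (q->state != DEV_STATE_INTERNAL_ERROR)
		return 0;

	while (n < num_entries && q->count > 0) {
		wc[n].wr_id = q->outstanding[q->head];
		wc[n].status = WC_WR_FLUSH_ERR;
		q->head = (q->head + 1) % MAX_OUTSTANDING;
		q->count--;
		n++;
	}
	return n;
}
```

In this model a consumer that posted two requests before the error sees exactly two flush-error completions on its next poll and can then release its resources instead of waiting indefinitely.
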
Diffstat (limited to 'drivers/infiniband/hw/mlx5/mr.c')
-rw-r--r--  drivers/infiniband/hw/mlx5/mr.c | 4
1 files changed, 4 insertions, 0 deletions
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 8cf2ce50511f..4b021305c321 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1193,12 +1193,16 @@ error:
 static int unreg_umr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr)
 {
+	struct mlx5_core_dev *mdev = dev->mdev;
 	struct umr_common *umrc = &dev->umrc;
 	struct mlx5_ib_umr_context umr_context;
 	struct mlx5_umr_wr umrwr = {};
 	struct ib_send_wr *bad;
 	int err;
+	if (mdev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
+		return 0;
+
 	mlx5_ib_init_umr_context(&umr_context);
 	umrwr.wr.wr_cqe = &umr_context.cqe;