author | James Smart <jsmart2021@gmail.com> | 2018-04-09 23:24:30 +0200 |
---|---|---|
committer | Martin K. Petersen <martin.petersen@oracle.com> | 2018-04-19 01:34:05 +0200 |
commit | b15bd3e6212e747ebd1b37a542898e88ad05bb17 (patch) | |
tree | 430386525c7068cd5408cdf62ac67d2aa6ade0c4 /drivers/scsi/lpfc/lpfc_nvme.c | |
parent | scsi: lpfc: Fix driver not recovering NVME rports during target link faults | |
scsi: lpfc: Fix nvme remoteport registration race conditions
In tests that repeatedly add and remove a remote port, calls to nvme_info
would eventually show fewer discovered target ports than were present in
the SAN. Additionally, the following error message was seen:
6031 RemotePort Registration failed err: -116, DID x471301
A race exists between the driver and the nvme transport in the window
between a remote port unregister and its confirmed deletion. The driver
may rediscover the remote port and reregister it before the delete
callback for the prior unregister has been made (the new registration is
rebound to the prior remoteport structure). However, the driver was coded
to expect the callback before seeing the remote port again, and thus
before any new registration. As a result, the driver ends up with an
invalid remoteport pointer.
Correct this by tracking when the driver is waiting for the delete
callback. In cases where the ndlp remoteport pointer has been updated, it
is cleared only if the wait has not been superseded by a registration
that arrived before the callback.
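
The ordering described above can be replayed in a small user-space model. This is only a sketch under stated assumptions: the `model_*` names, `WAIT_FOR_UNREG`, and `replay_race()` are hypothetical stand-ins for `ndlp->nrport`, `ndlp->upcall_flags`, and `NLP_WAIT_FOR_UNREG`, the unregister path is assumed to be where the wait flag gets set, and the `hbalock` locking is omitted because the model is single-threaded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the lpfc definitions. */
#define WAIT_FOR_UNREG 0x1u          /* models NLP_WAIT_FOR_UNREG */

struct model_rport { int id; };

struct model_node {
	struct model_rport *nrport;  /* models ndlp->nrport */
	unsigned int upcall_flags;   /* models ndlp->upcall_flags */
};

/* Register (or rebind to) a remote port: any pending wait is superseded. */
static void model_register(struct model_node *n, struct model_rport *rp)
{
	n->upcall_flags &= ~WAIT_FOR_UNREG;
	n->nrport = rp;
}

/* Request an unregister; the delete callback arrives some time later.
 * Assumption: this is where the wait flag is set.
 */
static void model_unregister(struct model_node *n)
{
	n->upcall_flags |= WAIT_FOR_UNREG;
}

/* Delete callback. When guarded is nonzero, the pointer is cleared only
 * while a wait is still pending; when zero, it is cleared unconditionally.
 */
static void model_delete_cb(struct model_node *n, int guarded)
{
	if (!guarded || (n->upcall_flags & WAIT_FOR_UNREG)) {
		n->nrport = NULL;
		n->upcall_flags &= ~WAIT_FOR_UNREG;
	}
}

/* Replay: register, unregister, reregister, then the late delete callback.
 * Returns 1 if the node still points at the remote port afterwards.
 */
static int replay_race(int guarded)
{
	struct model_rport rp = { 1 };
	struct model_node n = { NULL, 0 };

	model_register(&n, &rp);
	assert(n.nrport == &rp);

	model_unregister(&n);
	model_register(&n, &rp);      /* rediscovery rebinds to the same rport */
	model_delete_cb(&n, guarded); /* callback for the prior unregister */

	return n.nrport == &rp;
}
```

Calling `replay_race(1)` from a test harness models the guarded callback, where the late callback leaves the rebound pointer alone; `replay_race(0)` models the unconditional clear, where the pointer is lost.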
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Diffstat (limited to 'drivers/scsi/lpfc/lpfc_nvme.c')
-rw-r--r-- | drivers/scsi/lpfc/lpfc_nvme.c | 16 |
1 files changed, 14 insertions, 2 deletions
diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c
index 22962b08c275..a0257478b63c 100644
--- a/drivers/scsi/lpfc/lpfc_nvme.c
+++ b/drivers/scsi/lpfc/lpfc_nvme.c
@@ -334,8 +334,14 @@ lpfc_nvme_remoteport_delete(struct nvme_fc_remote_port *remoteport)
 			"6146 remoteport delete of remoteport %p\n",
 			remoteport);
 	spin_lock_irq(&vport->phba->hbalock);
-	ndlp->nrport = NULL;
-	ndlp->upcall_flags &= ~NLP_WAIT_FOR_UNREG;
+
+	/* The register rebind might have occurred before the delete
+	 * downcall. Guard against this race.
+	 */
+	if (ndlp->upcall_flags & NLP_WAIT_FOR_UNREG) {
+		ndlp->nrport = NULL;
+		ndlp->upcall_flags &= ~NLP_WAIT_FOR_UNREG;
+	}
 	spin_unlock_irq(&vport->phba->hbalock);
 
 	/* Remove original register reference. The host transport
@@ -2691,6 +2697,12 @@ lpfc_nvme_register_port(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp)
 		 * a resume of the existing rport. Else this is a
 		 * new rport.
 		 */
+		/* Guard against an unregister/reregister
+		 * race that leaves the WAIT flag set.
+		 */
+		spin_lock_irq(&vport->phba->hbalock);
+		ndlp->upcall_flags &= ~NLP_WAIT_FOR_UNREG;
+		spin_unlock_irq(&vport->phba->hbalock);
 		rport = remote_port->private;
 		if (oldrport) {
 			if (oldrport == remote_port->private) {