diff options
author | James Smart <jsmart2021@gmail.com> | 2021-10-20 23:14:16 +0200 |
---|---|---|
committer | Martin K. Petersen <martin.petersen@oracle.com> | 2021-10-21 05:33:46 +0200 |
commit | af984c87293b19dccbd0b16afc57c5c9a4a279c7 (patch) | |
tree | 3c1ab333dbcece2368161c75758097ca57deeb9a /drivers/scsi/lpfc/lpfc_scsi.c | |
parent | scsi: lpfc: Fix link down processing to address NULL pointer dereference (diff) | |
download | linux-af984c87293b19dccbd0b16afc57c5c9a4a279c7.tar.xz linux-af984c87293b19dccbd0b16afc57c5c9a4a279c7.zip |
scsi: lpfc: Allow fabric node recovery if recovery is in progress before devloss
A link bounce to a slow fabric may observe FDISC response delays lasting
longer than devloss tmo. Current logic decrements the final fabric node
kref during a devloss tmo event. This results in a NULL ptr dereference
crash if the FDISC completes for that fabric node after devloss tmo.
Fix by adding the NLP_IN_RECOV_POST_DEV_LOSS flag, which is set when
devloss tmo triggers and we've noticed that fabric node recovery has
already started or finished in between the time lpfc_dev_loss_tmo_callbk
queues lpfc_dev_loss_tmo_handler. If fabric node recovery succeeds, then
the driver reverses the devloss tmo marked kref put with a kref get. If
fabric node recovery fails, then the final kref put relies on the ELS
timing out or the REG_LOGIN cmpl routine.
Link: https://lore.kernel.org/r/20211020211417.88754-8-jsmart2021@gmail.com
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Diffstat (limited to 'drivers/scsi/lpfc/lpfc_scsi.c')
-rw-r--r-- | drivers/scsi/lpfc/lpfc_scsi.c | 10 |
1 files changed, 5 insertions, 5 deletions
diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c index 5578080afef8..6ccf573acdec 100644 --- a/drivers/scsi/lpfc/lpfc_scsi.c +++ b/drivers/scsi/lpfc/lpfc_scsi.c @@ -6475,28 +6475,28 @@ lpfc_target_reset_handler(struct scsi_cmnd *cmnd) /* Issue LOGO, if no LOGO is outstanding */ spin_lock_irqsave(&pnode->lock, flags); - if (!(pnode->upcall_flags & NLP_WAIT_FOR_LOGO) && + if (!(pnode->save_flags & NLP_WAIT_FOR_LOGO) && !pnode->logo_waitq) { pnode->logo_waitq = &waitq; pnode->nlp_fcp_info &= ~NLP_FCP_2_DEVICE; pnode->nlp_flag |= NLP_ISSUE_LOGO; - pnode->upcall_flags |= NLP_WAIT_FOR_LOGO; + pnode->save_flags |= NLP_WAIT_FOR_LOGO; spin_unlock_irqrestore(&pnode->lock, flags); lpfc_unreg_rpi(vport, pnode); wait_event_timeout(waitq, - (!(pnode->upcall_flags & + (!(pnode->save_flags & NLP_WAIT_FOR_LOGO)), msecs_to_jiffies(dev_loss_tmo * 1000)); - if (pnode->upcall_flags & NLP_WAIT_FOR_LOGO) { + if (pnode->save_flags & NLP_WAIT_FOR_LOGO) { lpfc_printf_vlog(vport, KERN_ERR, logit, "0725 SCSI layer TGTRST " "failed & LOGO TMO (%d, %llu) " "return x%x\n", tgt_id, lun_id, status); spin_lock_irqsave(&pnode->lock, flags); - pnode->upcall_flags &= ~NLP_WAIT_FOR_LOGO; + pnode->save_flags &= ~NLP_WAIT_FOR_LOGO; } else { spin_lock_irqsave(&pnode->lock, flags); } |