author John Garry <john.garry@huawei.com> 2021-12-20 12:21:24 +0100
committer Martin K. Petersen <martin.petersen@oracle.com> 2021-12-23 05:38:29 +0100
commit fbefe22811c3140a686e407e114789ebf328a9a2
tree 75086b929e23c76bd9f950582c387c87a5645757 /drivers/scsi/hisi_sas
parent scsi: libsas: Decode SAM status and host byte codes
scsi: libsas: Don't always drain event workqueue for HA resume
For the hisi_sas driver, if a directly attached disk is removed during suspend, a hang will occur in the resume process.

The background is that commit 16fd4a7c5917 ("scsi: hisi_sas: Add device link between SCSI devices and hisi_hba") ensures that the HBA device cannot be runtime suspended while any associated SCSI device is active. Other drivers which use libsas don't worry about this, as none support runtime suspend.

The mentioned hang occurs when a disk is removed during suspend. In the removal process, from PHYE_RESUME_TIMEOUT event processing, we call into scsi_remove_device(), which is processed in the HA event workqueue. Here we wait for all suppliers of the SCSI device to resume, which includes the HBA device (from the above commit). However, the HBA device cannot resume, as it is waiting for the PHYE_RESUME_TIMEOUT to be processed (from calling sas_resume_ha() -> sas_drain_work()). This is the deadlock.

There does not appear to be any need for sas_drain_work() to be called at all in sas_resume_ha(), as it is not syncing against anything, so allow LLDDs to avoid this by providing a variant of sas_resume_ha() which does not "sync", i.e. doesn't drain the event workqueue.

Link: https://lore.kernel.org/r/1639999298-244569-2-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
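
The libsas side of this change is not part of the diff shown here, so the following is only a minimal userspace sketch of the pattern the message describes: one common resume helper with a flag that decides whether the event queue is drained. Every name in it (`mock_ha`, `mock_drain_work`, `mock_resume_common`, `mock_resume_ha`, `mock_resume_ha_no_sync`) is a hypothetical stand-in, not the libsas API, and the "workqueue" is reduced to a counter.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the HA structure; not the real sas_ha_struct. */
struct mock_ha {
	int pending_events;	/* events queued on the mock HA event workqueue */
	int drain_calls;	/* number of times the queue was drained */
	bool resumed;		/* set once the resume path has completed */
};

/*
 * Stand-in for a drain operation: in a real driver this would block until
 * every queued event has been processed. Here it just empties the counter.
 */
static void mock_drain_work(struct mock_ha *ha)
{
	ha->drain_calls++;
	ha->pending_events = 0;
}

/* Common resume path; 'drain' selects the syncing or non-syncing behaviour. */
static void mock_resume_common(struct mock_ha *ha, bool drain)
{
	/* Assume resume queues one event, e.g. a resume-timeout notification. */
	ha->pending_events++;

	/*
	 * Waiting here is what can deadlock when the queued event itself
	 * needs the device to have finished resuming.
	 */
	if (drain)
		mock_drain_work(ha);

	ha->resumed = true;
}

/* Syncing variant: waits for the event queue to empty before returning. */
void mock_resume_ha(struct mock_ha *ha)
{
	mock_resume_common(ha, true);
}

/* Non-syncing variant: returns and leaves queued events to run later. */
void mock_resume_ha_no_sync(struct mock_ha *ha)
{
	mock_resume_common(ha, false);
}
```

With the non-syncing variant the resume path finishes while the event is still pending, so event processing that depends on the resumed device can run afterwards instead of blocking resume.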
Diffstat (limited to 'drivers/scsi/hisi_sas')
-rw-r--r-- drivers/scsi/hisi_sas/hisi_sas_v3_hw.c 10
1 files changed, 9 insertions, 1 deletions
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
index 0239e2b4b84f..63059fb6d9ec 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
@@ -4950,7 +4950,15 @@ static int _resume_v3_hw(struct device *device)
 		return rc;
 	}
 	phys_init_v3_hw(hisi_hba);
-	sas_resume_ha(sha);
+
+	/*
+	 * If a directly-attached disk is removed during suspend, a deadlock
+	 * may occur, as the PHYE_RESUME_TIMEOUT processing will require the
+	 * hisi_hba->device to be active, which can only happen when resume
+	 * completes. So don't wait for the HA event workqueue to drain upon
+	 * resume.
+	 */
+	sas_resume_ha_no_sync(sha);
 	clear_bit(HISI_SAS_RESETTING_BIT, &hisi_hba->flags);
 	return 0;