diff options
author | Martin K. Petersen <martin.petersen@oracle.com> | 2023-05-22 23:09:51 +0200 |
---|---|---|
committer | Martin K. Petersen <martin.petersen@oracle.com> | 2023-05-22 23:09:51 +0200 |
commit | 8b60e2189fcd8b10b592608256eb97aebfcff147 (patch) | |
tree | 664f034a0f2579ccbfd59da2ae80728897b4952f /drivers/scsi/sd.c | |
parent | Merge patch series "Use block pr_ops in LIO" (diff) | |
parent | scsi: ata: libata: Handle completion of CDL commands using policy 0xD (diff) | |
download | linux-8b60e2189fcd8b10b592608256eb97aebfcff147.tar.xz linux-8b60e2189fcd8b10b592608256eb97aebfcff147.zip |
Merge patch series "Add Command Duration Limits support"
Niklas Cassel <nks@flawful.org> says:
This series adds support for Command Duration Limits.
The series is based on linux tag: v6.4-rc1
The series can also be found in git: https://github.com/floatious/linux/commits/cdl-v7
=================
CDL in ATA / SCSI
=================
Command Duration Limits is defined in:
T13 ATA Command Set - 5 (ACS-5) and
T10 SCSI Primary Commands - 6 (SPC-6) respectively
(a simpler version of CDL is defined in T10 SPC-5).
CDL defines Duration Limits Descriptors (DLD).
7 DLDs for read commands and 7 DLDs for write commands.
Simply put, a DLD contains a limit and a policy.
A command can specify that a certain limit should be applied by setting
the DLD index field (3 bits, so 0-7) in the command itself.
The DLD index points to one of the 7 DLDs.
DLD index 0 means no descriptor, so no limit.
DLD index 1-7 means DLD 1-7.
A DLD can have a few different policies, but the two major ones are:
-Policy 0xF (abort), command will be completed with command aborted error
(ATA) or status CHECK CONDITION (SCSI), with sense data indicating that
the command timed out.
-Policy 0xD (complete-unavailable), command will be completed without
error (ATA) or status GOOD (SCSI), with sense data indicating that the
command timed out. Note that the command will not have transferred any
data to/from the device when the command timed out, even though the
command returned success.
Regardless of the CDL policy, in case of a CDL timeout, the I/O will
result in a -ETIME error to user-space.
The DLDs are defined in the CDL log page(s) and are readable and writable.
Reading and writing the CDL DLDs are outside the scope of the kernel.
If a user wants to read or write the descriptors, they can do so using a
user-space application that sends passthrough commands, such as cdl-tools:
https://github.com/westerndigitalcorporation/cdl-tools
================================
The introduction of ioprio hints
================================
What the kernel does provide, is a method to let I/O use one of the CDL DLDs
defined in the device. Note that the kernel will simply forward the DLD index
to the device, so the kernel currently does not know, nor does it need to know,
how the DLDs are defined inside the device.
The way that the CDL DLD index is supplied to the kernel is by introducing a
new 10 bit "ioprio hint" field within the existing 16 bit ioprio definition.
Currently, only 6 out of the 16 ioprio bits are in use, the remaining 10 bits
are unused, and are currently explicitly disallowed to be set by the kernel.
For now, we only add ioprio hints representing CDL DLD index 1-7. Additional
ioprio hints for other QoS features could be defined in the future.
A theoretical future work could be to make an I/O scheduler aware of these
hints. E.g. for CDL, an I/O scheduler could make use of the duration limit
in each descriptor, and take that information into account while scheduling
commands. Right now, the ioprio hints will be ignored by the I/O schedulers.
==============================
How to use CDL from user-space
==============================
Since CDL is mutually exclusive with NCQ priority
(see ncq_prio_enable and sas_ncq_prio_enable in
Documentation/ABI/testing/sysfs-block-device),
CDL has to be explicitly enabled using:
echo 1 > /sys/block/$bdev/device/cdl_enable
Since the ioprio hints are supplied through the existing I/O priority API,
it should be simple for an application to make use of the ioprio hints.
It simply has to reuse one of the new macros defined in
include/uapi/linux/ioprio.h: IOPRIO_PRIO_HINT() or IOPRIO_PRIO_VALUE_HINT(),
and supply one of the new hints defined in include/uapi/linux/ioprio.h:
IOPRIO_HINT_DEV_DURATION_LIMIT_[1-7], which indicates that the I/O should
use the corresponding CDL DLD index 1-7.
By reusing the I/O priority API, the user can both define a DLD to use per
AIO (io_uring sqe->ioprio or libaio iocb->aio_reqprio) or per-thread
(ioprio_set()).
=======
Testing
=======
With the following fio patches:
https://github.com/floatious/fio/commits/cdl
fio adds support for ioprio hints, such that CDL can be tested using e.g.:
fio --ioengine=io_uring --cmdprio_percentage=10 --cmdprio_hint=DLD_index
A simple way to test is to use a DLD with a very short duration limit,
and send large reads. Regardless of the CDL policy, in case of a CDL
timeout, the I/O will result in a -ETIME error to user-space.
We also provide a CDL test suite located in the cdl-tools repo, see:
https://github.com/westerndigitalcorporation/cdl-tools#testing-a-system-command-duration-limits-support
We have tested this patch series using:
-real hardware
-the following QEMU implementation:
https://github.com/floatious/qemu/tree/cdl
(NOTE: the QEMU implementation requires you to define the CDL policy at compile
time, so you currently need to recompile QEMU when switching between policies.)
===================
Further information
===================
For further information about CDL, see Damien's slides:
Presented at SDC 2021:
https://www.snia.org/sites/default/files/SDC/2021/pdfs/SNIA-SDC21-LeMoal-Be-On-Time-command-duration-limits-Feature-Support-in%20Linux.pdf
Presented at Lund Linux Con 2022:
https://drive.google.com/file/d/1I6ChFc0h4JY9qZdO1bY5oCAdYCSZVqWw/view?usp=sharing
================
Changes since V6
================
-Rebased series on v6.4-rc1.
-Picked up Reviewed-by tags from Hannes (Thank you Hannes!)
-Picked up Reviewed-by tag from Christoph (Thank you Christoph!)
-Changed KernelVersion from 6.4 to 6.5 for new sysfs attributes.
For older change logs, see previous patch series versions:
https://lore.kernel.org/linux-scsi/20230406113252.41211-1-nks@flawful.org/
https://lore.kernel.org/linux-scsi/20230404182428.715140-1-nks@flawful.org/
https://lore.kernel.org/linux-scsi/20230309215516.3800571-1-niklas.cassel@wdc.com/
https://lore.kernel.org/linux-scsi/20230124190308.127318-1-niklas.cassel@wdc.com/
https://lore.kernel.org/linux-scsi/20230112140412.667308-1-niklas.cassel@wdc.com/
https://lore.kernel.org/linux-scsi/20221208105947.2399894-1-niklas.cassel@wdc.com/
Link: https://lore.kernel.org/r/20230511011356.227789-1-nks@flawful.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Diffstat (limited to 'drivers/scsi/sd.c')
-rw-r--r-- | drivers/scsi/sd.c | 59 |
1 files changed, 43 insertions, 16 deletions
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 5247abe4f266..84edbc0a5747 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -184,7 +184,7 @@ cache_type_store(struct device *dev, struct device_attribute *attr, return count; } - if (scsi_mode_sense(sdp, 0x08, 8, buffer, sizeof(buffer), SD_TIMEOUT, + if (scsi_mode_sense(sdp, 0x08, 8, 0, buffer, sizeof(buffer), SD_TIMEOUT, sdkp->max_retries, &data, NULL)) return -EINVAL; len = min_t(size_t, sizeof(buffer), data.length - data.header_length - @@ -1042,13 +1042,14 @@ static blk_status_t sd_setup_flush_cmnd(struct scsi_cmnd *cmd) static blk_status_t sd_setup_rw32_cmnd(struct scsi_cmnd *cmd, bool write, sector_t lba, unsigned int nr_blocks, - unsigned char flags) + unsigned char flags, unsigned int dld) { cmd->cmd_len = SD_EXT_CDB_SIZE; cmd->cmnd[0] = VARIABLE_LENGTH_CMD; cmd->cmnd[7] = 0x18; /* Additional CDB len */ cmd->cmnd[9] = write ? WRITE_32 : READ_32; cmd->cmnd[10] = flags; + cmd->cmnd[11] = dld & 0x07; put_unaligned_be64(lba, &cmd->cmnd[12]); put_unaligned_be32(lba, &cmd->cmnd[20]); /* Expected Indirect LBA */ put_unaligned_be32(nr_blocks, &cmd->cmnd[28]); @@ -1058,12 +1059,12 @@ static blk_status_t sd_setup_rw32_cmnd(struct scsi_cmnd *cmd, bool write, static blk_status_t sd_setup_rw16_cmnd(struct scsi_cmnd *cmd, bool write, sector_t lba, unsigned int nr_blocks, - unsigned char flags) + unsigned char flags, unsigned int dld) { cmd->cmd_len = 16; cmd->cmnd[0] = write ? WRITE_16 : READ_16; - cmd->cmnd[1] = flags; - cmd->cmnd[14] = 0; + cmd->cmnd[1] = flags | ((dld >> 2) & 0x01); + cmd->cmnd[14] = (dld & 0x03) << 6; cmd->cmnd[15] = 0; put_unaligned_be64(lba, &cmd->cmnd[2]); put_unaligned_be32(nr_blocks, &cmd->cmnd[10]); @@ -1115,6 +1116,31 @@ static blk_status_t sd_setup_rw6_cmnd(struct scsi_cmnd *cmd, bool write, return BLK_STS_OK; } +/* + * Check if a command has a duration limit set. If it does, and the target + * device supports CDL and the feature is enabled, return the limit + * descriptor index to use. Return 0 (no limit) otherwise. + */ +static int sd_cdl_dld(struct scsi_disk *sdkp, struct scsi_cmnd *scmd) +{ + struct scsi_device *sdp = sdkp->device; + int hint; + + if (!sdp->cdl_supported || !sdp->cdl_enable) + return 0; + + /* + * Use "no limit" if the request ioprio does not specify a duration + * limit hint. + */ + hint = IOPRIO_PRIO_HINT(req_get_ioprio(scsi_cmd_to_rq(scmd))); + if (hint < IOPRIO_HINT_DEV_DURATION_LIMIT_1 || + hint > IOPRIO_HINT_DEV_DURATION_LIMIT_7) + return 0; + + return (hint - IOPRIO_HINT_DEV_DURATION_LIMIT_1) + 1; +} + static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd) { struct request *rq = scsi_cmd_to_rq(cmd); @@ -1126,6 +1152,7 @@ static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd) unsigned int mask = logical_to_sectors(sdp, 1) - 1; bool write = rq_data_dir(rq) == WRITE; unsigned char protect, fua; + unsigned int dld; blk_status_t ret; unsigned int dif; bool dix; @@ -1175,6 +1202,7 @@ static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd) fua = rq->cmd_flags & REQ_FUA ? 0x8 : 0; dix = scsi_prot_sg_count(cmd); dif = scsi_host_dif_capable(cmd->device->host, sdkp->protection_type); + dld = sd_cdl_dld(sdkp, cmd); if (dif || dix) protect = sd_setup_protect_cmnd(cmd, dix, dif); @@ -1183,10 +1211,10 @@ static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd) if (protect && sdkp->protection_type == T10_PI_TYPE2_PROTECTION) { ret = sd_setup_rw32_cmnd(cmd, write, lba, nr_blocks, - protect | fua); + protect | fua, dld); } else if (sdp->use_16_for_rw || (nr_blocks > 0xffff)) { ret = sd_setup_rw16_cmnd(cmd, write, lba, nr_blocks, - protect | fua); + protect | fua, dld); } else if ((nr_blocks > 0xff) || (lba > 0x1fffff) || sdp->use_10_for_rw || protect) { ret = sd_setup_rw10_cmnd(cmd, write, lba, nr_blocks, @@ -2683,9 +2711,8 @@ sd_do_mode_sense(struct scsi_disk *sdkp, int dbd, int modepage, if (sdkp->device->use_10_for_ms && len < 8) len = 8; - return scsi_mode_sense(sdkp->device, dbd, modepage, buffer, len, - SD_TIMEOUT, sdkp->max_retries, data, - sshdr); + return scsi_mode_sense(sdkp->device, dbd, modepage, 0, buffer, len, + SD_TIMEOUT, sdkp->max_retries, data, sshdr); } /* @@ -2942,7 +2969,7 @@ static void sd_read_app_tag_own(struct scsi_disk *sdkp, unsigned char *buffer) if (sdkp->protection_type == 0) return; - res = scsi_mode_sense(sdp, 1, 0x0a, buffer, 36, SD_TIMEOUT, + res = scsi_mode_sense(sdp, 1, 0x0a, 0, buffer, 36, SD_TIMEOUT, sdkp->max_retries, &data, &sshdr); if (res < 0 || !data.header_length || @@ -3131,7 +3158,7 @@ static void sd_read_write_same(struct scsi_disk *sdkp, unsigned char *buffer) return; } - if (scsi_report_opcode(sdev, buffer, SD_BUF_SIZE, INQUIRY) < 0) { + if (scsi_report_opcode(sdev, buffer, SD_BUF_SIZE, INQUIRY, 0) < 0) { struct scsi_vpd *vpd; sdev->no_report_opcodes = 1; @@ -3147,10 +3174,10 @@ static void sd_read_write_same(struct scsi_disk *sdkp, unsigned char *buffer) rcu_read_unlock(); } - if (scsi_report_opcode(sdev, buffer, SD_BUF_SIZE, WRITE_SAME_16) == 1) + if (scsi_report_opcode(sdev, buffer, SD_BUF_SIZE, WRITE_SAME_16, 0) == 1) sdkp->ws16 = 1; - if (scsi_report_opcode(sdev, buffer, SD_BUF_SIZE, WRITE_SAME) == 1) + if (scsi_report_opcode(sdev, buffer, SD_BUF_SIZE, WRITE_SAME, 0) == 1) sdkp->ws10 = 1; } @@ -3162,9 +3189,9 @@ static void sd_read_security(struct scsi_disk *sdkp, unsigned char *buffer) return; if (scsi_report_opcode(sdev, buffer, SD_BUF_SIZE, - SECURITY_PROTOCOL_IN) == 1 && + SECURITY_PROTOCOL_IN, 0) == 1 && scsi_report_opcode(sdev, buffer, SD_BUF_SIZE, - SECURITY_PROTOCOL_OUT) == 1) + SECURITY_PROTOCOL_OUT, 0) == 1) sdkp->security = 1; } |