diff options
author | Sagi Grimberg <sagi@grimberg.me> | 2020-02-26 00:53:09 +0100 |
---|---|---|
committer | Keith Busch <kbusch@kernel.org> | 2020-03-25 20:48:06 +0100 |
commit | 40510a639ec08db81d5ff9c79856baf9dda94748 (patch) | |
tree | 9a2a64ca47a8a8dd4b6be70096ca23a8879f545a /Documentation/asm-annotations.rst | |
parent | nvme-pci: Simplify nvme_poll_irqdisable (diff) | |
download | linux-40510a639ec08db81d5ff9c79856baf9dda94748.tar.xz linux-40510a639ec08db81d5ff9c79856baf9dda94748.zip |
nvme-tcp: optimize queue io_cpu assignment for multiple queue maps
Currently, queue io_cpu assignment is done sequentially for default,
read and poll queues based on queue id. This causes miss-alignment between
context of CPU initiating I/O and the I/O worker thread processing
queued requests or completions.
Change to modify queue io_cpu assignment to take into account queue
maps offset. Each queue io_cpu will start at zero for each queue map.
This essentially aligns read/poll queues to start over the same range as
default queues.
Testing performed by Mark with:
- ram device (nvmet)
- single CPU core (pinned)
- 100% 4k reads
- engine io_uring (not using sq_thread option)
- hipri flag set
Micro-benchmark results show a net gain of:
- increase of 18%-29% in IOPs
- reduction of 16%-22% in average latency
- reduction of 7%-23% in 99.99% latency
Baseline:
========
QDepth/Batch | IOPs [k] | Avg. Lat [us] | 99.99% Lat [us]
-----------------------------------------------------------------
1/1 | 32.4 | 30.11 | 50.94
32/8 | 179 | 168.20 | 371
CPU alignment:
=============
QDepth/Batch | IOPs [k] | Avg. Lat [us] | 99.99% Lat [us]
-----------------------------------------------------------------
1/1 | 38.5 | 25.18 | 39.16
32/8 | 231 | 130.75 | 343
Reported-by: Mark Wunderlich <mark.wunderlich@intel.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Diffstat (limited to 'Documentation/asm-annotations.rst')
0 files changed, 0 insertions, 0 deletions