diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2024-01-10 21:23:43 +0100 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2024-01-10 21:23:43 +0100 |
commit | 0cb552aa97843f24549ce808883494138471c16b (patch) | |
tree | 805d1a4a46b68929c2ca2f878b58840e19dee550 /Documentation | |
parent | Merge tag 'tpmdd-v6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkk... (diff) | |
parent | crypto: iaa - Account for cpu-less numa nodes (diff) | |
download | linux-0cb552aa97843f24549ce808883494138471c16b.tar.xz linux-0cb552aa97843f24549ce808883494138471c16b.zip |
Merge tag 'v6.8-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Add incremental lskcipher/skcipher processing
Algorithms:
- Remove SHA1 from drbg
- Remove CFB and OFB
Drivers:
- Add comp high perf mode configuration in hisilicon/zip
- Add support for 420xx devices in qat
- Add IAA Compression Accelerator driver"
* tag 'v6.8-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (172 commits)
crypto: iaa - Account for cpu-less numa nodes
crypto: scomp - fix req->dst buffer overflow
crypto: sahara - add support for crypto_engine
crypto: sahara - remove error message for bad aes request size
crypto: sahara - remove unnecessary NULL assignments
crypto: sahara - remove 'active' flag from sahara_aes_reqctx struct
crypto: sahara - use dev_err_probe()
crypto: sahara - use devm_clk_get_enabled()
crypto: sahara - use BIT() macro
crypto: sahara - clean up macro indentation
crypto: sahara - do not resize req->src when doing hash operations
crypto: sahara - fix processing hash requests with req->nbytes < sg->length
crypto: sahara - improve error handling in sahara_sha_process()
crypto: sahara - fix wait_for_completion_timeout() error handling
crypto: sahara - fix ahash reqsize
crypto: sahara - handle zero-length aes requests
crypto: skcipher - remove excess kerneldoc members
crypto: shash - remove excess kerneldoc members
crypto: qat - generate dynamically arbiter mappings
crypto: qat - add support for ring pair level telemetry
...
Diffstat (limited to 'Documentation')
17 files changed, 1238 insertions, 44 deletions
diff --git a/Documentation/ABI/testing/debugfs-driver-qat_telemetry b/Documentation/ABI/testing/debugfs-driver-qat_telemetry new file mode 100644 index 000000000000..eacee2072088 --- /dev/null +++ b/Documentation/ABI/testing/debugfs-driver-qat_telemetry @@ -0,0 +1,228 @@ +What: /sys/kernel/debug/qat_<device>_<BDF>/telemetry/control +Date: March 2024 +KernelVersion: 6.8 +Contact: qat-linux@intel.com +Description: (RW) Enables/disables the reporting of telemetry metrics. + + Allowed values to write: + ======================== + * 0: disable telemetry + * 1: enable telemetry + * 2, 3, 4: enable telemetry and calculate minimum, maximum + and average for each counter over 2, 3 or 4 samples + + Returned values: + ================ + * 1-4: telemetry is enabled and running + * 0: telemetry is disabled + + Example. + + Writing '3' to this file starts the collection of + telemetry metrics. Samples are collected every second and + stored in a circular buffer of size 3. These values are then + used to calculate the minimum, maximum and average for each + counter. After enabling, counters can be retrieved through + the ``device_data`` file:: + + echo 3 > /sys/kernel/debug/qat_4xxx_0000:6b:00.0/telemetry/control + + Writing '0' to this file stops the collection of telemetry + metrics:: + + echo 0 > /sys/kernel/debug/qat_4xxx_0000:6b:00.0/telemetry/control + + This attribute is only available for qat_4xxx devices. + +What: /sys/kernel/debug/qat_<device>_<BDF>/telemetry/device_data +Date: March 2024 +KernelVersion: 6.8 +Contact: qat-linux@intel.com +Description: (RO) Reports device telemetry counters. + Reads report metrics about performance and utilization of + a QAT device: + + ======================= ======================================== + Field Description + ======================= ======================================== + sample_cnt number of acquisitions of telemetry data + from the device. Reads are performed + every 1000 ms. + pci_trans_cnt number of PCIe partial transactions + max_rd_lat maximum logged read latency [ns] (could + be any read operation) + rd_lat_acc_avg average read latency [ns] + max_gp_lat max get to put latency [ns] (only takes + samples for AE0) + gp_lat_acc_avg average get to put latency [ns] + bw_in PCIe, write bandwidth [Mbps] + bw_out PCIe, read bandwidth [Mbps] + at_page_req_lat_avg Address Translator(AT), average page + request latency [ns] + at_trans_lat_avg AT, average page translation latency [ns] + at_max_tlb_used AT, maximum uTLB used + util_cpr<N> utilization of Compression slice N [%] + exec_cpr<N> execution count of Compression slice N + util_xlt<N> utilization of Translator slice N [%] + exec_xlt<N> execution count of Translator slice N + util_dcpr<N> utilization of Decompression slice N [%] + exec_dcpr<N> execution count of Decompression slice N + util_pke<N> utilization of PKE N [%] + exec_pke<N> execution count of PKE N + util_ucs<N> utilization of UCS slice N [%] + exec_ucs<N> execution count of UCS slice N + util_wat<N> utilization of Wireless Authentication + slice N [%] + exec_wat<N> execution count of Wireless Authentication + slice N + util_wcp<N> utilization of Wireless Cipher slice N [%] + exec_wcp<N> execution count of Wireless Cipher slice N + util_cph<N> utilization of Cipher slice N [%] + exec_cph<N> execution count of Cipher slice N + util_ath<N> utilization of Authentication slice N [%] + exec_ath<N> execution count of Authentication slice N + ======================= ======================================== + + The telemetry report file can be read with the following command:: + + cat /sys/kernel/debug/qat_4xxx_0000:6b:00.0/telemetry/device_data + + If ``control`` is set to 1, only the current values of the + counters are displayed:: + + <counter_name> <current> + + If ``control`` is 2, 3 or 4, counters are displayed in the + following format:: + + <counter_name> <current> <min> <max> <avg> + + If a device lacks of a specific accelerator, the corresponding + attribute is not reported. + + This attribute is only available for qat_4xxx devices. + +What: /sys/kernel/debug/qat_<device>_<BDF>/telemetry/rp_<A/B/C/D>_data +Date: March 2024 +KernelVersion: 6.8 +Contact: qat-linux@intel.com +Description: (RW) Selects up to 4 Ring Pairs (RP) to monitor, one per file, + and report telemetry counters related to each. + + Allowed values to write: + ======================== + * 0 to ``<num_rps - 1>``: + Ring pair to be monitored. The value of ``num_rps`` can be + retrieved through ``/sys/bus/pci/devices/<BDF>/qat/num_rps``. + See Documentation/ABI/testing/sysfs-driver-qat. + + Reads report metrics about performance and utilization of + the selected RP: + + ======================= ======================================== + Field Description + ======================= ======================================== + sample_cnt number of acquisitions of telemetry data + from the device. Reads are performed + every 1000 ms + rp_num RP number associated with slot <A/B/C/D> + service_type service associated to the RP + pci_trans_cnt number of PCIe partial transactions + gp_lat_acc_avg average get to put latency [ns] + bw_in PCIe, write bandwidth [Mbps] + bw_out PCIe, read bandwidth [Mbps] + at_glob_devtlb_hit Message descriptor DevTLB hit rate + at_glob_devtlb_miss Message descriptor DevTLB miss rate + tl_at_payld_devtlb_hit Payload DevTLB hit rate + tl_at_payld_devtlb_miss Payload DevTLB miss rate + ======================= ======================================== + + Example. + + Writing the value '32' to the file ``rp_C_data`` starts the + collection of telemetry metrics for ring pair 32:: + + echo 32 > /sys/kernel/debug/qat_4xxx_0000:6b:00.0/telemetry/rp_C_data + + Once a ring pair is selected, statistics can be read accessing + the file:: + + cat /sys/kernel/debug/qat_4xxx_0000:6b:00.0/telemetry/rp_C_data + + If ``control`` is set to 1, only the current values of the + counters are displayed:: + + <counter_name> <current> + + If ``control`` is 2, 3 or 4, counters are displayed in the + following format:: + + <counter_name> <current> <min> <max> <avg> + + + On QAT GEN4 devices there are 64 RPs on a PF, so the allowed + values are 0..63. This number is absolute to the device. + If Virtual Functions (VF) are used, the ring pair number can + be derived from the Bus, Device, Function of the VF: + + ============ ====== ====== ====== ====== + PCI BDF/VF RP0 RP1 RP2 RP3 + ============ ====== ====== ====== ====== + 0000:6b:0.1 RP 0 RP 1 RP 2 RP 3 + 0000:6b:0.2 RP 4 RP 5 RP 6 RP 7 + 0000:6b:0.3 RP 8 RP 9 RP 10 RP 11 + 0000:6b:0.4 RP 12 RP 13 RP 14 RP 15 + 0000:6b:0.5 RP 16 RP 17 RP 18 RP 19 + 0000:6b:0.6 RP 20 RP 21 RP 22 RP 23 + 0000:6b:0.7 RP 24 RP 25 RP 26 RP 27 + 0000:6b:1.0 RP 28 RP 29 RP 30 RP 31 + 0000:6b:1.1 RP 32 RP 33 RP 34 RP 35 + 0000:6b:1.2 RP 36 RP 37 RP 38 RP 39 + 0000:6b:1.3 RP 40 RP 41 RP 42 RP 43 + 0000:6b:1.4 RP 44 RP 45 RP 46 RP 47 + 0000:6b:1.5 RP 48 RP 49 RP 50 RP 51 + 0000:6b:1.6 RP 52 RP 53 RP 54 RP 55 + 0000:6b:1.7 RP 56 RP 57 RP 58 RP 59 + 0000:6b:2.0 RP 60 RP 61 RP 62 RP 63 + ============ ====== ====== ====== ====== + + The mapping is only valid for the BDFs of VFs on the host. + + + The service provided on a ring-pair varies depending on the + configuration. The configuration for a given device can be + queried and set using ``cfg_services``. + See Documentation/ABI/testing/sysfs-driver-qat for details. + + The following table reports how ring pairs are mapped to VFs + on the PF 0000:6b:0.0 configured for `sym;asym` or `asym;sym`: + + =========== ============ =========== ============ =========== + PCI BDF/VF RP0/service RP1/service RP2/service RP3/service + =========== ============ =========== ============ =========== + 0000:6b:0.1 RP 0 asym RP 1 sym RP 2 asym RP 3 sym + 0000:6b:0.2 RP 4 asym RP 5 sym RP 6 asym RP 7 sym + 0000:6b:0.3 RP 8 asym RP 9 sym RP10 asym RP11 sym + ... ... ... ... ... + =========== ============ =========== ============ =========== + + All VFs follow the same pattern. + + + The following table reports how ring pairs are mapped to VFs on + the PF 0000:6b:0.0 configured for `dc`: + + =========== ============ =========== ============ =========== + PCI BDF/VF RP0/service RP1/service RP2/service RP3/service + =========== ============ =========== ============ =========== + 0000:6b:0.1 RP 0 dc RP 1 dc RP 2 dc RP 3 dc + 0000:6b:0.2 RP 4 dc RP 5 dc RP 6 dc RP 7 dc + 0000:6b:0.3 RP 8 dc RP 9 dc RP10 dc RP11 dc + ... ... ... ... ... + =========== ============ =========== ============ =========== + + The mapping of a RP to a service can be retrieved using + ``rp2srv`` from sysfs. + See Documentation/ABI/testing/sysfs-driver-qat for details. + + This attribute is only available for qat_4xxx devices. diff --git a/Documentation/ABI/testing/debugfs-hisi-hpre b/Documentation/ABI/testing/debugfs-hisi-hpre index 82abf92df429..8e8de49c5cc6 100644 --- a/Documentation/ABI/testing/debugfs-hisi-hpre +++ b/Documentation/ABI/testing/debugfs-hisi-hpre @@ -101,7 +101,7 @@ What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/status Date: Apr 2020 Contact: linux-crypto@vger.kernel.org Description: Dump the status of the QM. - Four states: initiated, started, stopped and closed. + Two states: work, stop. Available for both PF and VF, and take no other effect on HPRE. What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/diff_regs diff --git a/Documentation/ABI/testing/debugfs-hisi-sec b/Documentation/ABI/testing/debugfs-hisi-sec index 93c530d1bf0f..deeefe2c735e 100644 --- a/Documentation/ABI/testing/debugfs-hisi-sec +++ b/Documentation/ABI/testing/debugfs-hisi-sec @@ -81,7 +81,7 @@ What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/status Date: Apr 2020 Contact: linux-crypto@vger.kernel.org Description: Dump the status of the QM. - Four states: initiated, started, stopped and closed. + Two states: work, stop. Available for both PF and VF, and take no other effect on SEC. What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/diff_regs diff --git a/Documentation/ABI/testing/debugfs-hisi-zip b/Documentation/ABI/testing/debugfs-hisi-zip index fd3f314cf8d1..593714afaed2 100644 --- a/Documentation/ABI/testing/debugfs-hisi-zip +++ b/Documentation/ABI/testing/debugfs-hisi-zip @@ -94,7 +94,7 @@ What: /sys/kernel/debug/hisi_zip/<bdf>/qm/status Date: Apr 2020 Contact: linux-crypto@vger.kernel.org Description: Dump the status of the QM. - Four states: initiated, started, stopped and closed. + Two states: work, stop. Available for both PF and VF, and take no other effect on ZIP. What: /sys/kernel/debug/hisi_zip/<bdf>/qm/diff_regs diff --git a/Documentation/crypto/device_drivers/index.rst b/Documentation/crypto/device_drivers/index.rst new file mode 100644 index 000000000000..c81d311ac61b --- /dev/null +++ b/Documentation/crypto/device_drivers/index.rst @@ -0,0 +1,9 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Hardware Device Driver Specific Documentation +--------------------------------------------- + +.. toctree:: + :maxdepth: 1 + + octeontx2 diff --git a/Documentation/crypto/device_drivers/octeontx2.rst b/Documentation/crypto/device_drivers/octeontx2.rst new file mode 100644 index 000000000000..7e469b173ac8 --- /dev/null +++ b/Documentation/crypto/device_drivers/octeontx2.rst @@ -0,0 +1,25 @@ +.. SPDX-License-Identifier: GPL-2.0 + +========================= +octeontx2 devlink support +========================= + +This document describes the devlink features implemented by the ``octeontx2 CPT`` +device drivers. + +Parameters +========== + +The ``octeontx2`` driver implements the following driver-specific parameters. + +.. list-table:: Driver-specific parameters implemented + :widths: 5 5 5 85 + + * - Name + - Type + - Mode + - Description + * - ``t106_mode`` + - u8 + - runtime + - Used to configure CN10KA B0/CN10KB CPT to work as CN10KA A0/A1. diff --git a/Documentation/crypto/index.rst b/Documentation/crypto/index.rst index da5d5ad2bdf3..945ca1505ad9 100644 --- a/Documentation/crypto/index.rst +++ b/Documentation/crypto/index.rst @@ -28,3 +28,4 @@ for cryptographic use cases, as well as programming examples. api api-samples descore-readme + device_drivers/index diff --git a/Documentation/devicetree/bindings/crypto/inside-secure,safexcel.yaml b/Documentation/devicetree/bindings/crypto/inside-secure,safexcel.yaml new file mode 100644 index 000000000000..ef07258d16c1 --- /dev/null +++ b/Documentation/devicetree/bindings/crypto/inside-secure,safexcel.yaml @@ -0,0 +1,86 @@ +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/crypto/inside-secure,safexcel.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Inside Secure SafeXcel cryptographic engine + +maintainers: + - Antoine Tenart <atenart@kernel.org> + +properties: + compatible: + oneOf: + - const: inside-secure,safexcel-eip197b + - const: inside-secure,safexcel-eip197d + - const: inside-secure,safexcel-eip97ies + - const: inside-secure,safexcel-eip197 + description: Equivalent of inside-secure,safexcel-eip197b + deprecated: true + - const: inside-secure,safexcel-eip97 + description: Equivalent of inside-secure,safexcel-eip97ies + deprecated: true + + reg: + maxItems: 1 + + interrupts: + maxItems: 6 + + interrupt-names: + items: + - const: ring0 + - const: ring1 + - const: ring2 + - const: ring3 + - const: eip + - const: mem + + clocks: + minItems: 1 + maxItems: 2 + + clock-names: + minItems: 1 + items: + - const: core + - const: reg + +required: + - reg + - interrupts + - interrupt-names + +allOf: + - if: + properties: + clocks: + minItems: 2 + then: + properties: + clock-names: + minItems: 2 + required: + - clock-names + +additionalProperties: false + +examples: + - | + #include <dt-bindings/interrupt-controller/arm-gic.h> + #include <dt-bindings/interrupt-controller/irq.h> + + crypto@800000 { + compatible = "inside-secure,safexcel-eip197b"; + reg = <0x800000 0x200000>; + interrupts = <GIC_SPI 54 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 55 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 56 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 57 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 58 IRQ_TYPE_LEVEL_HIGH>, + <GIC_SPI 34 IRQ_TYPE_LEVEL_HIGH>; + interrupt-names = "ring0", "ring1", "ring2", "ring3", "eip", "mem"; + clocks = <&cpm_syscon0 1 26>; + clock-names = "core"; + }; diff --git a/Documentation/devicetree/bindings/crypto/inside-secure-safexcel.txt b/Documentation/devicetree/bindings/crypto/inside-secure-safexcel.txt deleted file mode 100644 index 3bbf144c9988..000000000000 --- a/Documentation/devicetree/bindings/crypto/inside-secure-safexcel.txt +++ /dev/null @@ -1,40 +0,0 @@ -Inside Secure SafeXcel cryptographic engine - -Required properties: -- compatible: Should be "inside-secure,safexcel-eip197b", - "inside-secure,safexcel-eip197d" or - "inside-secure,safexcel-eip97ies". -- reg: Base physical address of the engine and length of memory mapped region. -- interrupts: Interrupt numbers for the rings and engine. -- interrupt-names: Should be "ring0", "ring1", "ring2", "ring3", "eip", "mem". - -Optional properties: -- clocks: Reference to the crypto engine clocks, the second clock is - needed for the Armada 7K/8K SoCs. -- clock-names: mandatory if there is a second clock, in this case the - name must be "core" for the first clock and "reg" for - the second one. - -Backward compatibility: -Two compatibles are kept for backward compatibility, but shouldn't be used for -new submissions: -- "inside-secure,safexcel-eip197" is equivalent to - "inside-secure,safexcel-eip197b". -- "inside-secure,safexcel-eip97" is equivalent to - "inside-secure,safexcel-eip97ies". - -Example: - - crypto: crypto@800000 { - compatible = "inside-secure,safexcel-eip197b"; - reg = <0x800000 0x200000>; - interrupts = <GIC_SPI 34 IRQ_TYPE_LEVEL_HIGH>, - <GIC_SPI 54 IRQ_TYPE_LEVEL_HIGH>, - <GIC_SPI 55 IRQ_TYPE_LEVEL_HIGH>, - <GIC_SPI 56 IRQ_TYPE_LEVEL_HIGH>, - <GIC_SPI 57 IRQ_TYPE_LEVEL_HIGH>, - <GIC_SPI 58 IRQ_TYPE_LEVEL_HIGH>; - interrupt-names = "mem", "ring0", "ring1", "ring2", "ring3", - "eip"; - clocks = <&cpm_syscon0 1 26>; - }; diff --git a/Documentation/devicetree/bindings/crypto/qcom,inline-crypto-engine.yaml b/Documentation/devicetree/bindings/crypto/qcom,inline-crypto-engine.yaml index ca4f7d1cefaa..09e43157cc71 100644 --- a/Documentation/devicetree/bindings/crypto/qcom,inline-crypto-engine.yaml +++ b/Documentation/devicetree/bindings/crypto/qcom,inline-crypto-engine.yaml @@ -16,6 +16,7 @@ properties: - qcom,sa8775p-inline-crypto-engine - qcom,sm8450-inline-crypto-engine - qcom,sm8550-inline-crypto-engine + - qcom,sm8650-inline-crypto-engine - const: qcom,inline-crypto-engine reg: diff --git a/Documentation/devicetree/bindings/crypto/qcom,prng.yaml b/Documentation/devicetree/bindings/crypto/qcom,prng.yaml index 13070db0f70c..89c88004b41b 100644 --- a/Documentation/devicetree/bindings/crypto/qcom,prng.yaml +++ b/Documentation/devicetree/bindings/crypto/qcom,prng.yaml @@ -21,6 +21,7 @@ properties: - qcom,sc7280-trng - qcom,sm8450-trng - qcom,sm8550-trng + - qcom,sm8650-trng - const: qcom,trng reg: diff --git a/Documentation/devicetree/bindings/crypto/qcom-qce.yaml b/Documentation/devicetree/bindings/crypto/qcom-qce.yaml index 8e665d910e6e..a48bd381063a 100644 --- a/Documentation/devicetree/bindings/crypto/qcom-qce.yaml +++ b/Documentation/devicetree/bindings/crypto/qcom-qce.yaml @@ -44,10 +44,12 @@ properties: - items: - enum: + - qcom,sc7280-qce - qcom,sm8250-qce - qcom,sm8350-qce - qcom,sm8450-qce - qcom,sm8550-qce + - qcom,sm8650-qce - const: qcom,sm8150-qce - const: qcom,qce @@ -96,6 +98,7 @@ allOf: - qcom,crypto-v5.4 - qcom,ipq6018-qce - qcom,ipq8074-qce + - qcom,ipq9574-qce - qcom,msm8996-qce - qcom,sdm845-qce then: @@ -129,6 +132,17 @@ allOf: - clocks - clock-names + - if: + properties: + compatible: + contains: + enum: + - qcom,sm8150-qce + then: + properties: + clocks: false + clock-names: false + required: - compatible - reg diff --git a/Documentation/devicetree/bindings/rng/starfive,jh7110-trng.yaml b/Documentation/devicetree/bindings/rng/starfive,jh7110-trng.yaml index 2b76ce25acc4..4639247e9e51 100644 --- a/Documentation/devicetree/bindings/rng/starfive,jh7110-trng.yaml +++ b/Documentation/devicetree/bindings/rng/starfive,jh7110-trng.yaml @@ -11,7 +11,11 @@ maintainers: properties: compatible: - const: starfive,jh7110-trng + oneOf: + - items: + - const: starfive,jh8100-trng + - const: starfive,jh7110-trng + - const: starfive,jh7110-trng reg: maxItems: 1 diff --git a/Documentation/driver-api/crypto/iaa/iaa-crypto.rst b/Documentation/driver-api/crypto/iaa/iaa-crypto.rst new file mode 100644 index 000000000000..de587cf9cbed --- /dev/null +++ b/Documentation/driver-api/crypto/iaa/iaa-crypto.rst @@ -0,0 +1,824 @@ +.. SPDX-License-Identifier: GPL-2.0 + +========================================= +IAA Compression Accelerator Crypto Driver +========================================= + +Tom Zanussi <tom.zanussi@linux.intel.com> + +The IAA crypto driver supports compression/decompression compatible +with the DEFLATE compression standard described in RFC 1951, which is +the compression/decompression algorithm exported by this module. + +The IAA hardware spec can be found here: + + https://cdrdv2.intel.com/v1/dl/getContent/721858 + +The iaa_crypto driver is designed to work as a layer underneath +higher-level compression devices such as zswap. + +Users can select IAA compress/decompress acceleration by specifying +one of the supported IAA compression algorithms in whatever facility +allows compression algorithms to be selected. + +For example, a zswap device can select the IAA 'fixed' mode +represented by selecting the 'deflate-iaa' crypto compression +algorithm:: + + # echo deflate-iaa > /sys/module/zswap/parameters/compressor + +This will tell zswap to use the IAA 'fixed' compression mode for all +compresses and decompresses. + +Currently, there is only one compression modes available, 'fixed' +mode. + +The 'fixed' compression mode implements the compression scheme +specified by RFC 1951 and is given the crypto algorithm name +'deflate-iaa'. (Because the IAA hardware has a 4k history-window +limitation, only buffers <= 4k, or that have been compressed using a +<= 4k history window, are technically compliant with the deflate spec, +which allows for a window of up to 32k. Because of this limitation, +the IAA fixed mode deflate algorithm is given its own algorithm name +rather than simply 'deflate'). + + +Config options and other setup +============================== + +The IAA crypto driver is available via menuconfig using the following +path:: + + Cryptographic API -> Hardware crypto devices -> Support for Intel(R) IAA Compression Accelerator + +In the configuration file the option called CONFIG_CRYPTO_DEV_IAA_CRYPTO. + +The IAA crypto driver also supports statistics, which are available +via menuconfig using the following path:: + + Cryptographic API -> Hardware crypto devices -> Support for Intel(R) IAA Compression -> Enable Intel(R) IAA Compression Accelerator Statistics + +In the configuration file the option called CONFIG_CRYPTO_DEV_IAA_CRYPTO_STATS. + +The following config options should also be enabled:: + + CONFIG_IRQ_REMAP=y + CONFIG_INTEL_IOMMU=y + CONFIG_INTEL_IOMMU_SVM=y + CONFIG_PCI_ATS=y + CONFIG_PCI_PRI=y + CONFIG_PCI_PASID=y + CONFIG_INTEL_IDXD=m + CONFIG_INTEL_IDXD_SVM=y + +IAA is one of the first Intel accelerator IPs that can work in +conjunction with the Intel IOMMU. There are multiple modes that exist +for testing. Based on IOMMU configuration, there are 3 modes:: + + - Scalable + - Legacy + - No IOMMU + + +Scalable mode +------------- + +Scalable mode supports Shared Virtual Memory (SVM or SVA). It is +entered when using the kernel boot commandline:: + + intel_iommu=on,sm_on + +with VT-d turned on in BIOS. + +With scalable mode, both shared and dedicated workqueues are available +for use. + +For scalable mode, the following BIOS settings should be enabled:: + + Socket Configuration > IIO Configuration > Intel VT for Directed I/O (VT-d) > Intel VT for Directed I/O + + Socket Configuration > IIO Configuration > PCIe ENQCMD > ENQCMDS + + +Legacy mode +----------- + +Legacy mode is entered when using the kernel boot commandline:: + + intel_iommu=off + +or VT-d is not turned on in BIOS. + +If you have booted into Linux and not sure if VT-d is on, do a "dmesg +| grep -i dmar". If you don't see a number of DMAR devices enumerated, +most likely VT-d is not on. + +With legacy mode, only dedicated workqueues are available for use. + + +No IOMMU mode +------------- + +No IOMMU mode is entered when using the kernel boot commandline:: + + iommu=off. + +With no IOMMU mode, only dedicated workqueues are available for use. + + +Usage +===== + +accel-config +------------ + +When loaded, the iaa_crypto driver automatically creates a default +configuration and enables it, and assigns default driver attributes. +If a different configuration or set of driver attributes is required, +the user must first disable the IAA devices and workqueues, reset the +configuration, and then re-register the deflate-iaa algorithm with the +crypto subsystem by removing and reinserting the iaa_crypto module. + +The :ref:`iaa_disable_script` in the 'Use Cases' +section below can be used to disable the default configuration. + +See :ref:`iaa_default_config` below for details of the default +configuration. + +More likely than not, however, and because of the complexity and +configurability of the accelerator devices, the user will want to +configure the device and manually enable the desired devices and +workqueues. + +The userspace tool to help doing that is called accel-config. Using +accel-config to configure device or loading a previously saved config +is highly recommended. The device can be controlled via sysfs +directly but comes with the warning that you should do this ONLY if +you know exactly what you are doing. The following sections will not +cover the sysfs interface but assumes you will be using accel-config. + +The :ref:`iaa_sysfs_config` section in the appendix below can be +consulted for the sysfs interface details if interested. + +The accel-config tool along with instructions for building it can be +found here: + + https://github.com/intel/idxd-config/#readme + +Typical usage +------------- + +In order for the iaa_crypto module to actually do any +compression/decompression work on behalf of a facility, one or more +IAA workqueues need to be bound to the iaa_crypto driver. + +For instance, here's an example of configuring an IAA workqueue and +binding it to the iaa_crypto driver (note that device names are +specified as 'iax' rather than 'iaa' - this is because upstream still +has the old 'iax' device naming in place) :: + + # configure wq1.0 + + accel-config config-wq --group-id=0 --mode=dedicated --type=kernel --name="iaa_crypto" --device_name="crypto" iax1/wq1.0 + + # enable IAA device iax1 + + accel-config enable-device iax1 + + # enable wq1.0 on IAX device iax1 + + accel-config enable-wq iax1/wq1.0 + +Whenever a new workqueue is bound to or unbound from the iaa_crypto +driver, the available workqueues are 'rebalanced' such that work +submitted from a particular CPU is given to the most appropriate +workqueue available. Current best practice is to configure and bind +at least one workqueue for each IAA device, but as long as there is at +least one workqueue configured and bound to any IAA device in the +system, the iaa_crypto driver will work, albeit most likely not as +efficiently. + +The IAA crypto algorigthms is operational and compression and +decompression operations are fully enabled following the successful +binding of the first IAA workqueue to the iaa_crypto driver. + +Similarly, the IAA crypto algorithm is not operational and compression +and decompression operations are disabled following the unbinding of +the last IAA worqueue to the iaa_crypto driver. + +As a result, the IAA crypto algorithms and thus the IAA hardware are +only available when one or more workques are bound to the iaa_crypto +driver. + +When there are no IAA workqueues bound to the driver, the IAA crypto +algorithms can be unregistered by removing the module. + + +Driver attributes +----------------- + +There are a couple user-configurable driver attributes that can be +used to configure various modes of operation. They're listed below, +along with their default values. To set any of these attributes, echo +the appropriate values to the attribute file located under +/sys/bus/dsa/drivers/crypto/ + +The attribute settings at the time the IAA algorithms are registered +are captured in each algorithm's crypto_ctx and used for all compresses +and decompresses when using that algorithm. + +The available attributes are: + + - verify_compress + + Toggle compression verification. If set, each compress will be + internally decompressed and the contents verified, returning error + codes if unsuccessful. This can be toggled with 0/1:: + + echo 0 > /sys/bus/dsa/drivers/crypto/verify_compress + + The default setting is '1' - verify all compresses. + + - sync_mode + + Select mode to be used to wait for completion of each compresses + and decompress operation. + + The crypto async interface support implemented by iaa_crypto + provides an implementation that satisfies the interface but does + so in a synchronous manner - it fills and submits the IDXD + descriptor and then loops around waiting for it to complete before + returning. This isn't a problem at the moment, since all existing + callers (e.g. zswap) wrap any asynchronous callees in a + synchronous wrapper anyway. + + The iaa_crypto driver does however provide true asynchronous + support for callers that can make use of it. In this mode, it + fills and submits the IDXD descriptor, then returns immediately + with -EINPROGRESS. The caller can then either poll for completion + itself, which requires specific code in the caller which currently + nothing in the upstream kernel implements, or go to sleep and wait + for an interrupt signaling completion. This latter mode is + supported by current users in the kernel such as zswap via + synchronous wrappers. Although it is supported this mode is + significantly slower than the synchronous mode that does the + polling in the iaa_crypto driver previously mentioned. + + This mode can be enabled by writing 'async_irq' to the sync_mode + iaa_crypto driver attribute:: + + echo async_irq > /sys/bus/dsa/drivers/crypto/sync_mode + + Async mode without interrupts (caller must poll) can be enabled by + writing 'async' to it:: + + echo async > /sys/bus/dsa/drivers/crypto/sync_mode + + The mode that does the polling in the iaa_crypto driver can be + enabled by writing 'sync' to it:: + + echo sync > /sys/bus/dsa/drivers/crypto/sync_mode + + The default mode is 'sync'. + +.. _iaa_default_config: + +IAA Default Configuration +------------------------- + +When the iaa_crypto driver is loaded, each IAA device has a single +work queue configured for it, with the following attributes:: + + mode "dedicated" + threshold 0 + size Total WQ Size from WQCAP + priority 10 + type IDXD_WQT_KERNEL + group 0 + name "iaa_crypto" + driver_name "crypto" + +The devices and workqueues are also enabled and therefore the driver +is ready to be used without any additional configuration. + +The default driver attributes in effect when the driver is loaded are:: + + sync_mode "sync" + verify_compress 1 + +In order to change either the device/work queue or driver attributes, +the enabled devices and workqueues must first be disabled. In order +to have the new configuration applied to the deflate-iaa crypto +algorithm, it needs to be re-registered by removing and reinserting +the iaa_crypto module. The :ref:`iaa_disable_script` in the 'Use +Cases' section below can be used to disable the default configuration. + +Statistics +========== + +If the optional debugfs statistics support is enabled, the IAA crypto +driver will generate statistics which can be accessed in debugfs at:: + + # ls -al /sys/kernel/debug/iaa-crypto/ + total 0 + drwxr-xr-x 2 root root 0 Mar 3 09:35 . + drwx------ 47 root root 0 Mar 3 09:35 .. + -rw-r--r-- 1 root root 0 Mar 3 09:35 max_acomp_delay_ns + -rw-r--r-- 1 root root 0 Mar 3 09:35 max_adecomp_delay_ns + -rw-r--r-- 1 root root 0 Mar 3 09:35 max_comp_delay_ns + -rw-r--r-- 1 root root 0 Mar 3 09:35 max_decomp_delay_ns + -rw-r--r-- 1 root root 0 Mar 3 09:35 stats_reset + -rw-r--r-- 1 root root 0 Mar 3 09:35 total_comp_bytes_out + -rw-r--r-- 1 root root 0 Mar 3 09:35 total_comp_calls + -rw-r--r-- 1 root root 0 Mar 3 09:35 total_decomp_bytes_in + -rw-r--r-- 1 root root 0 Mar 3 09:35 total_decomp_calls + -rw-r--r-- 1 root root 0 Mar 3 09:35 wq_stats + +Most of the above statisticss are self-explanatory. The wq_stats file +shows per-wq stats, a set for each iaa device and wq in addition to +some global stats:: + + # cat wq_stats + global stats: + total_comp_calls: 100 + total_decomp_calls: 100 + total_comp_bytes_out: 22800 + total_decomp_bytes_in: 22800 + total_completion_einval_errors: 0 + total_completion_timeout_errors: 0 + total_completion_comp_buf_overflow_errors: 0 + + iaa device: + id: 1 + n_wqs: 1 + comp_calls: 0 + comp_bytes: 0 + decomp_calls: 0 + decomp_bytes: 0 + wqs: + name: iaa_crypto + comp_calls: 0 + comp_bytes: 0 + decomp_calls: 0 + decomp_bytes: 0 + + iaa device: + id: 3 + n_wqs: 1 + comp_calls: 0 + comp_bytes: 0 + decomp_calls: 0 + decomp_bytes: 0 + wqs: + name: iaa_crypto + comp_calls: 0 + comp_bytes: 0 + decomp_calls: 0 + decomp_bytes: 0 + + iaa device: + id: 5 + n_wqs: 1 + comp_calls: 100 + comp_bytes: 22800 + decomp_calls: 100 + decomp_bytes: 22800 + wqs: + name: iaa_crypto + comp_calls: 100 + comp_bytes: 22800 + decomp_calls: 100 + decomp_bytes: 22800 + +Writing 0 to 'stats_reset' resets all the stats, including the +per-device and per-wq stats:: + + # echo 0 > stats_reset + # cat wq_stats + global stats: + total_comp_calls: 0 + total_decomp_calls: 0 + total_comp_bytes_out: 0 + total_decomp_bytes_in: 0 + total_completion_einval_errors: 0 + total_completion_timeout_errors: 0 + total_completion_comp_buf_overflow_errors: 0 + ... + + +Use cases +========= + +Simple zswap test +----------------- + +For this example, the kernel should be configured according to the +dedicated mode options described above, and zswap should be enabled as +well:: + + CONFIG_ZSWAP=y + +This is a simple test that uses iaa_compress as the compressor for a +swap (zswap) device. It sets up the zswap device and then uses the +memory_memadvise program listed below to forcibly swap out and in a +specified number of pages, demonstrating both compress and decompress. + +The zswap test expects the work queues for each IAA device on the +system to be configured properly as a kernel workqueue with a +workqueue driver_name of "crypto". + +The first step is to make sure the iaa_crypto module is loaded:: + + modprobe iaa_crypto + +If the IAA devices and workqueues haven't previously been disabled and +reconfigured, then the default configuration should be in place and no +further IAA configuration is necessary. See :ref:`iaa_default_config` +below for details of the default configuration. + +If the default configuration is in place, you should see the iaa +devices and wq0s enabled:: + + # cat /sys/bus/dsa/devices/iax1/state + enabled + # cat /sys/bus/dsa/devices/iax1/wq1.0/state + enabled + +To demonstrate that the following steps work as expected, these +commands can be used to enable debug output:: + + # echo -n 'module iaa_crypto +p' > /sys/kernel/debug/dynamic_debug/control + # echo -n 'module idxd +p' > /sys/kernel/debug/dynamic_debug/control + +Use the following commands to enable zswap:: + + # echo 0 > /sys/module/zswap/parameters/enabled + # echo 50 > /sys/module/zswap/parameters/max_pool_percent + # echo deflate-iaa > /sys/module/zswap/parameters/compressor + # echo zsmalloc > /sys/module/zswap/parameters/zpool + # echo 1 > /sys/module/zswap/parameters/enabled + # echo 0 > /sys/module/zswap/parameters/same_filled_pages_enabled + # echo 100 > /proc/sys/vm/swappiness + # echo never > /sys/kernel/mm/transparent_hugepage/enabled + # echo 1 > /proc/sys/vm/overcommit_memory + +Now you can now run the zswap workload you want to measure. For +example, using the memory_memadvise code below, the following command +will swap in and out 100 pages:: + + ./memory_madvise 100 + + Allocating 100 pages to swap in/out + Swapping out 100 pages + Swapping in 100 pages + Swapped out and in 100 pages + +You should see something like the following in the dmesg output:: + + [ 404.202972] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, src_addr 223925c000, nr_sgs 1, req->src 00000000ee7cb5e6, req->slen 4096, sg_dma_len(sg) 4096 + [ 404.202973] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, dst_addr 21dadf8000, nr_sgs 1, req->dst 000000008d6acea8, req->dlen 4096, sg_dma_len(sg) 8192 + [ 404.202975] idxd 0000:e7:02.0: iaa_compress: desc->src1_addr 223925c000, desc->src1_size 4096, desc->dst_addr 21dadf8000, desc->max_dst_size 4096, desc->src2_addr 2203543000, desc->src2_size 1568 + [ 404.202981] idxd 0000:e7:02.0: iaa_compress_verify: (verify) desc->src1_addr 21dadf8000, desc->src1_size 228, desc->dst_addr 223925c000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0 + ... + +Now that basic functionality has been demonstrated, the defaults can +be erased and replaced with a different configuration. To do that, +first disable zswap:: + + # echo lzo > /sys/module/zswap/parameters/compressor + # swapoff -a + # echo 0 > /sys/module/zswap/parameters/accept_threshold_percent + # echo 0 > /sys/module/zswap/parameters/max_pool_percent + # echo 0 > /sys/module/zswap/parameters/enabled + # echo 0 > /sys/module/zswap/parameters/enabled + +Then run the :ref:`iaa_disable_script` in the 'Use Cases' section +below to disable the default configuration. + +Finally turn swap back on:: + + # swapon -a + +Following all that the IAA device(s) can now be re-configured and +enabled as desired for further testing. Below is one example. + +The zswap test expects the work queues for each IAA device on the +system to be configured properly as a kernel workqueue with a +workqueue driver_name of "crypto". + +The below script automatically does that:: + + #!/bin/bash + + echo "IAA devices:" + lspci -d:0cfe + echo "# IAA devices:" + lspci -d:0cfe | wc -l + + # + # count iaa instances + # + iaa_dev_id="0cfe" + num_iaa=$(lspci -d:${iaa_dev_id} | wc -l) + echo "Found ${num_iaa} IAA instances" + + # + # disable iaa wqs and devices + # + echo "Disable IAA" + + for ((i = 1; i < ${num_iaa} * 2; i += 2)); do + echo disable wq iax${i}/wq${i}.0 + accel-config disable-wq iax${i}/wq${i}.0 + echo disable iaa iax${i} + accel-config disable-device iax${i} + done + + echo "End Disable IAA" + + # + # configure iaa wqs and devices + # + echo "Configure IAA" + for ((i = 1; i < ${num_iaa} * 2; i += 2)); do + accel-config config-wq --group-id=0 --mode=dedicated --size=128 --priority=10 --type=kernel --name="iaa_crypto" --driver_name="crypto" iax${i}/wq${i} + done + + echo "End Configure IAA" + + # + # enable iaa wqs and devices + # + echo "Enable IAA" + + for ((i = 1; i < ${num_iaa} * 2; i += 2)); do + echo enable iaa iaa${i} + accel-config enable-device iaa${i} + echo enable wq iaa${i}/wq${i}.0 + accel-config enable-wq iaa${i}/wq${i}.0 + done + + echo "End Enable IAA" + +When the workqueues are bound to the iaa_crypto driver, you should +see something similar to the following in dmesg output if you've +enabled debug output (echo -n 'module iaa_crypto +p' > +/sys/kernel/debug/dynamic_debug/control):: + + [ 60.752344] idxd 0000:f6:02.0: add_iaa_wq: added wq 000000004068d14d to iaa 00000000c9585ba2, n_wq 1 + [ 60.752346] iaa_crypto: rebalance_wq_table: nr_nodes=2, nr_cpus 160, nr_iaa 8, cpus_per_iaa 20 + [ 60.752347] iaa_crypto: rebalance_wq_table: iaa=0 + [ 60.752349] idxd 0000:6a:02.0: request_iaa_wq: getting wq from iaa_device 0000000042d7bc52 (0) + [ 60.752350] idxd 0000:6a:02.0: request_iaa_wq: returning unused wq 00000000c8bb4452 (0) from iaa device 0000000042d7bc52 (0) + [ 60.752352] iaa_crypto: rebalance_wq_table: assigned wq for cpu=0, node=0 = wq 00000000c8bb4452 + [ 60.752354] iaa_crypto: rebalance_wq_table: iaa=0 + [ 60.752355] idxd 0000:6a:02.0: request_iaa_wq: getting wq from iaa_device 0000000042d7bc52 (0) + [ 60.752356] idxd 0000:6a:02.0: request_iaa_wq: returning unused wq 00000000c8bb4452 (0) from iaa device 0000000042d7bc52 (0) + [ 60.752358] iaa_crypto: rebalance_wq_table: assigned wq for cpu=1, node=0 = wq 00000000c8bb4452 + [ 60.752359] iaa_crypto: rebalance_wq_table: iaa=0 + [ 60.752360] idxd 0000:6a:02.0: request_iaa_wq: getting wq from iaa_device 0000000042d7bc52 (0) + [ 60.752361] idxd 0000:6a:02.0: request_iaa_wq: returning unused wq 00000000c8bb4452 (0) from iaa device 0000000042d7bc52 (0) + [ 60.752362] iaa_crypto: rebalance_wq_table: assigned wq for cpu=2, node=0 = wq 00000000c8bb4452 + [ 60.752364] iaa_crypto: rebalance_wq_table: iaa=0 + . + . + . + +Once the workqueues and devices have been enabled, the IAA crypto +algorithms are enabled and available. When the IAA crypto algorithms +have been successfully enabled, you should see the following dmesg +output:: + + [ 64.893759] iaa_crypto: iaa_crypto_enable: iaa_crypto now ENABLED + +Now run the following zswap-specific setup commands to have zswap use +the 'fixed' compression mode:: + + echo 0 > /sys/module/zswap/parameters/enabled + echo 50 > /sys/module/zswap/parameters/max_pool_percent + echo deflate-iaa > /sys/module/zswap/parameters/compressor + echo zsmalloc > /sys/module/zswap/parameters/zpool + echo 1 > /sys/module/zswap/parameters/enabled + echo 0 > /sys/module/zswap/parameters/same_filled_pages_enabled + + echo 100 > /proc/sys/vm/swappiness + echo never > /sys/kernel/mm/transparent_hugepage/enabled + echo 1 > /proc/sys/vm/overcommit_memory + +Finally, you can now run the zswap workload you want to measure. For +example, using the code below, the following command will swap in and +out 100 pages:: + + ./memory_madvise 100 + + Allocating 100 pages to swap in/out + Swapping out 100 pages + Swapping in 100 pages + Swapped out and in 100 pages + +You should see something like the following in the dmesg output if +you've enabled debug output (echo -n 'module iaa_crypto +p' > +/sys/kernel/debug/dynamic_debug/control):: + + [ 404.202972] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, src_addr 223925c000, nr_sgs 1, req->src 00000000ee7cb5e6, req->slen 4096, sg_dma_len(sg) 4096 + [ 404.202973] idxd 0000:e7:02.0: iaa_comp_acompress: dma_map_sg, dst_addr 21dadf8000, nr_sgs 1, req->dst 000000008d6acea8, req->dlen 4096, sg_dma_len(sg) 8192 + [ 404.202975] idxd 0000:e7:02.0: iaa_compress: desc->src1_addr 223925c000, desc->src1_size 4096, desc->dst_addr 21dadf8000, desc->max_dst_size 4096, desc->src2_addr 2203543000, desc->src2_size 1568 + [ 404.202981] idxd 0000:e7:02.0: iaa_compress_verify: (verify) desc->src1_addr 21dadf8000, desc->src1_size 228, desc->dst_addr 223925c000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0 + [ 409.203227] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, src_addr 21ddd8b100, nr_sgs 1, req->src 0000000084adab64, req->slen 228, sg_dma_len(sg) 228 + [ 409.203235] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, dst_addr 21ee3dc000, nr_sgs 1, req->dst 000000004e2990d0, req->dlen 4096, sg_dma_len(sg) 4096 + [ 409.203239] idxd 0000:e7:02.0: iaa_decompress: desc->src1_addr 21ddd8b100, desc->src1_size 228, desc->dst_addr 21ee3dc000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0 + [ 409.203254] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, src_addr 21ddd8b100, nr_sgs 1, req->src 0000000084adab64, req->slen 228, sg_dma_len(sg) 228 + [ 409.203256] idxd 0000:e7:02.0: iaa_comp_adecompress: dma_map_sg, dst_addr 21f1551000, nr_sgs 1, req->dst 000000004e2990d0, req->dlen 4096, sg_dma_len(sg) 4096 + [ 409.203257] idxd 0000:e7:02.0: iaa_decompress: desc->src1_addr 21ddd8b100, desc->src1_size 228, desc->dst_addr 21f1551000, desc->max_dst_size 4096, desc->src2_addr 0, desc->src2_size 0 + +In order to unregister the IAA crypto algorithms, and register new +ones using different parameters, any users of the current algorithm +should be stopped and the IAA workqueues and devices disabled. + +In the case of zswap, remove the IAA crypto algorithm as the +compressor and turn off swap (to remove all references to +iaa_crypto):: + + echo lzo > /sys/module/zswap/parameters/compressor + swapoff -a + + echo 0 > /sys/module/zswap/parameters/accept_threshold_percent + echo 0 > /sys/module/zswap/parameters/max_pool_percent + echo 0 > /sys/module/zswap/parameters/enabled + +Once zswap is disabled and no longer using iaa_crypto, the IAA wqs and +devices can be disabled. + +.. _iaa_disable_script: + +IAA disable script +------------------ + +The below script automatically does that:: + + #!/bin/bash + + echo "IAA devices:" + lspci -d:0cfe + echo "# IAA devices:" + lspci -d:0cfe | wc -l + + # + # count iaa instances + # + iaa_dev_id="0cfe" + num_iaa=$(lspci -d:${iaa_dev_id} | wc -l) + echo "Found ${num_iaa} IAA instances" + + # + # disable iaa wqs and devices + # + echo "Disable IAA" + + for ((i = 1; i < ${num_iaa} * 2; i += 2)); do + echo disable wq iax${i}/wq${i}.0 + accel-config disable-wq iax${i}/wq${i}.0 + echo disable iaa iax${i} + accel-config disable-device iax${i} + done + + echo "End Disable IAA" + +Finally, at this point the iaa_crypto module can be removed, which +will unregister the current IAA crypto algorithms:: + + rmmod iaa_crypto + + +memory_madvise.c (gcc -o memory_memadvise memory_madvise.c):: + + #include <stdio.h> + #include <stdlib.h> + #include <string.h> + #include <unistd.h> + #include <sys/mman.h> + #include <linux/mman.h> + + #ifndef MADV_PAGEOUT + #define MADV_PAGEOUT 21 /* force pages out immediately */ + #endif + + #define PG_SZ 4096 + + int main(int argc, char **argv) + { + int i, nr_pages = 1; + int64_t *dump_ptr; + char *addr, *a; + int loop = 1; + + if (argc > 1) + nr_pages = atoi(argv[1]); + + printf("Allocating %d pages to swap in/out\n", nr_pages); + + /* allocate pages */ + addr = mmap(NULL, nr_pages * PG_SZ, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0); + *addr = 1; + + /* initialize data in page to all '*' chars */ + memset(addr, '*', nr_pages * PG_SZ); + + printf("Swapping out %d pages\n", nr_pages); + + /* Tell kernel to swap it out */ + madvise(addr, nr_pages * PG_SZ, MADV_PAGEOUT); + + while (loop > 0) { + /* Wait for swap out to finish */ + sleep(5); + + a = addr; + + printf("Swapping in %d pages\n", nr_pages); + + /* Access the page ... this will swap it back in again */ + for (i = 0; i < nr_pages; i++) { + if (a[0] != '*') { + printf("Bad data from decompress!!!!!\n"); + + dump_ptr = (int64_t *)a; + for (int j = 0; j < 100; j++) { + printf(" page %d data: %#llx\n", i, *dump_ptr); + dump_ptr++; + } + } + + a += PG_SZ; + } + + loop --; + } + + printf("Swapped out and in %d pages\n", nr_pages); + +Appendix +======== + +.. _iaa_sysfs_config: + +IAA sysfs config interface +-------------------------- + +Below is a description of the IAA sysfs interface, which as mentioned +in the main document, should only be used if you know exactly what you +are doing. Even then, there's no compelling reason to use it directly +since accel-config can do everything the sysfs interface can and in +fact accel-config is based on it under the covers. + +The 'IAA config path' is /sys/bus/dsa/devices and contains +subdirectories representing each IAA device, workqueue, engine, and +group. Note that in the sysfs interface, the IAA devices are actually +named using iax e.g. iax1, iax3, etc. (Note that IAA devices are the +odd-numbered devices; the even-numbered devices are DSA devices and +can be ignored for IAA). + +The 'IAA device bind path' is /sys/bus/dsa/drivers/idxd/bind and is +the file that is written to enable an IAA device. + +The 'IAA workqueue bind path' is /sys/bus/dsa/drivers/crypto/bind and +is the file that is written to enable an IAA workqueue. + +Similarly /sys/bus/dsa/drivers/idxd/unbind and +/sys/bus/dsa/drivers/crypto/unbind are used to disable IAA devices and +workqueues. + +The basic sequence of commands needed to set up the IAA devices and +workqueues is: + +For each device:: + 1) Disable any workqueues enabled on the device. For example to + disable workques 0 and 1 on IAA device 3:: + + # echo wq3.0 > /sys/bus/dsa/drivers/crypto/unbind + # echo wq3.1 > /sys/bus/dsa/drivers/crypto/unbind + + 2) Disable the device. For example to disable IAA device 3:: + + # echo iax3 > /sys/bus/dsa/drivers/idxd/unbind + + 3) configure the desired workqueues. For example, to configure + workqueue 3 on IAA device 3:: + + # echo dedicated > /sys/bus/dsa/devices/iax3/wq3.3/mode + # echo 128 > /sys/bus/dsa/devices/iax3/wq3.3/size + # echo 0 > /sys/bus/dsa/devices/iax3/wq3.3/group_id + # echo 10 > /sys/bus/dsa/devices/iax3/wq3.3/priority + # echo "kernel" > /sys/bus/dsa/devices/iax3/wq3.3/type + # echo "iaa_crypto" > /sys/bus/dsa/devices/iax3/wq3.3/name + # echo "crypto" > /sys/bus/dsa/devices/iax3/wq3.3/driver_name + + 4) Enable the device. For example to enable IAA device 3:: + + # echo iax3 > /sys/bus/dsa/drivers/idxd/bind + + 5) Enable the desired workqueues on the device. For example to + enable workques 0 and 1 on IAA device 3:: + + # echo wq3.0 > /sys/bus/dsa/drivers/crypto/bind + # echo wq3.1 > /sys/bus/dsa/drivers/crypto/bind diff --git a/Documentation/driver-api/crypto/iaa/index.rst b/Documentation/driver-api/crypto/iaa/index.rst new file mode 100644 index 000000000000..aa6837e27264 --- /dev/null +++ b/Documentation/driver-api/crypto/iaa/index.rst @@ -0,0 +1,20 @@ +.. SPDX-License-Identifier: GPL-2.0 + +================================= +IAA (Intel Analytics Accelerator) +================================= + +IAA provides hardware compression and decompression via the crypto +API. + +.. toctree:: + :maxdepth: 1 + + iaa-crypto + +.. only:: subproject and html + + Indices + ======= + + * :ref:`genindex` diff --git a/Documentation/driver-api/crypto/index.rst b/Documentation/driver-api/crypto/index.rst new file mode 100644 index 000000000000..fb9709b98bea --- /dev/null +++ b/Documentation/driver-api/crypto/index.rst @@ -0,0 +1,20 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============== +Crypto Drivers +============== + +Documentation for crypto drivers that may need more involved setup and +configuration. + +.. toctree:: + :maxdepth: 1 + + iaa/index + +.. only:: subproject and html + + Indices + ======= + + * :ref:`genindex` diff --git a/Documentation/driver-api/index.rst b/Documentation/driver-api/index.rst index 8bc4ebe7a36f..3cc0e1a517c0 100644 --- a/Documentation/driver-api/index.rst +++ b/Documentation/driver-api/index.rst @@ -116,6 +116,7 @@ available subsections can be seen below. wmi dpll wbrf + crypto/index .. only:: subproject and html |