linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	virtio-crypto: enable retry for virtio-crypto-dev	lei he	2022-05-31	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Enable retry for virtio-crypto-dev, so that crypto-engine can process cipher-requests parallelly. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: lei he <helei.sig11@bytedance.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20220506131627.180784-6-pizhenwei@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
*	virtio-crypto: adjust dst_len at ops callback	lei he	2022-05-31	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For some akcipher operations(eg, decryption of pkcs1pad(rsa)), the length of returned result maybe less than akcipher_req->dst_len, we need to recalculate the actual dst_len through the virt-queue protocol. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: lei he <helei.sig11@bytedance.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20220506131627.180784-5-pizhenwei@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
*	virtio-crypto: wait ctrl queue instead of busy polling	zhenwei pi	2022-05-31	4	-55/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Originally, after submitting request into virtio crypto control queue, the guest side polls the result from the virt queue. This works like following: CPU0 CPU1 ... CPUx CPUy \| \| \| \| \ \ / / \--------spin_lock(&vcrypto->ctrl_lock)-------/ \| virtqueue add & kick \| busy poll virtqueue \| spin_unlock(&vcrypto->ctrl_lock) ... There are two problems: 1, The queue depth is always 1, the performance of a virtio crypto device gets limited. Multi user processes share a single control queue, and hit spin lock race from control queue. Test on Intel Platinum 8260, a single worker gets ~35K/s create/close session operations, and 8 workers get ~40K/s operations with 800% CPU utilization. 2, The control request is supposed to get handled immediately, but in the current implementation of QEMU(v6.2), the vCPU thread kicks another thread to do this work, the latency also gets unstable. Tracking latency of virtio_crypto_alg_akcipher_close_session in 5s: usecs : count distribution 0 -> 1 : 0 \| \| 2 -> 3 : 7 \| \| 4 -> 7 : 72 \| \| 8 -> 15 : 186485 \|************************\| 16 -> 31 : 687 \| \| 32 -> 63 : 5 \| \| 64 -> 127 : 3 \| \| 128 -> 255 : 1 \| \| 256 -> 511 : 0 \| \| 512 -> 1023 : 0 \| \| 1024 -> 2047 : 0 \| \| 2048 -> 4095 : 0 \| \| 4096 -> 8191 : 0 \| \| 8192 -> 16383 : 2 \| \| This means that a CPU may hold vcrypto->ctrl_lock as long as 8192~16383us. To improve the performance of control queue, a request on control queue waits completion instead of busy polling to reduce lock racing, and gets completed by control queue callback. CPU0 CPU1 ... CPUx CPUy \| \| \| \| \ \ / / \--------spin_lock(&vcrypto->ctrl_lock)-------/ \| virtqueue add & kick \| ---------spin_unlock(&vcrypto->ctrl_lock)------ / / \ \ \| \| \| \| wait wait wait wait Test this patch, the guest side get ~200K/s operations with 300% CPU utilization. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20220506131627.180784-4-pizhenwei@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
*	virtio-crypto: use private buffer for control request	zhenwei pi	2022-05-31	3	-45/+79
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Originally, all of the control requests share a single buffer( ctrl & input & ctrl_status fields in struct virtio_crypto), this allows queue depth 1 only, the performance of control queue gets limited by this design. In this patch, each request allocates request buffer dynamically, and free buffer after request, so the scope protected by ctrl_lock also get optimized here. It's possible to optimize control queue depth in the next step. A necessary comment is already in code, still describe it again: /* * Note: there are padding fields in request, clear them to zero before * sending to host to avoid to divulge any information. * Ex, virtio_crypto_ctrl_request::ctrl::u::destroy_session::padding[48] */ So use kzalloc to allocate buffer of struct virtio_crypto_ctrl_request. Potentially dereferencing uninitialized variables: Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20220506131627.180784-3-pizhenwei@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
*	virtio-crypto: change code style	zhenwei pi	2022-05-31	2	-53/+59
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use temporary variable to make code easy to read and maintain. /* Pad cipher's parameters */ vcrypto->ctrl.u.sym_create_session.op_type = cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER); vcrypto->ctrl.u.sym_create_session.u.cipher.para.algo = vcrypto->ctrl.header.algo; vcrypto->ctrl.u.sym_create_session.u.cipher.para.keylen = cpu_to_le32(keylen); vcrypto->ctrl.u.sym_create_session.u.cipher.para.op = cpu_to_le32(op); --> sym_create_session = &ctrl->u.sym_create_session; sym_create_session->op_type = cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER); sym_create_session->u.cipher.para.algo = ctrl->header.algo; sym_create_session->u.cipher.para.keylen = cpu_to_le32(keylen); sym_create_session->u.cipher.para.op = cpu_to_le32(op); The new style shows more obviously: - the variable we want to operate. - an assignment statement in a single line. Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20220506131627.180784-2-pizhenwei@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
*	Merge tag 'powerpc-5.19-1' of ↵	Linus Torvalds	2022-05-28	1	-1/+1
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Convert to the generic mmap support (ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT) - Add support for outline-only KASAN with 64-bit Radix MMU (P9 or later) - Increase SIGSTKSZ and MINSIGSTKSZ and add support for AT_MINSIGSTKSZ - Enable the DAWR (Data Address Watchpoint) on POWER9 DD2.3 or later - Drop support for system call instruction emulation - Many other small features and fixes Thanks to Alexey Kardashevskiy, Alistair Popple, Andy Shevchenko, Bagas Sanjaya, Bjorn Helgaas, Bo Liu, Chen Huang, Christophe Leroy, Colin Ian King, Daniel Axtens, Dwaipayan Ray, Fabiano Rosas, Finn Thain, Frank Rowand, Fuqian Huang, Guilherme G. Piccoli, Hangyu Hua, Haowen Bai, Haren Myneni, Hari Bathini, He Ying, Jason Wang, Jiapeng Chong, Jing Yangyang, Joel Stanley, Julia Lawall, Kajol Jain, Kevin Hao, Krzysztof Kozlowski, Laurent Dufour, Lv Ruyi, Madhavan Srinivasan, Magali Lemes, Miaoqian Lin, Minghao Chi, Nathan Chancellor, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Oscar Salvador, Pali Rohár, Paul Mackerras, Peng Wu, Qing Wang, Randy Dunlap, Reza Arbab, Russell Currey, Sohaib Mohamed, Vaibhav Jain, Vasant Hegde, Wang Qing, Wang Wensheng, Xiang wangx, Xiaomeng Tong, Xu Wang, Yang Guang, Yang Li, Ye Bin, YueHaibing, Yu Kuai, Zheng Bin, Zou Wei, and Zucheng Zheng. * tag 'powerpc-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (200 commits) powerpc/64: Include cache.h directly in paca.h powerpc/64s: Only set HAVE_ARCH_UNMAPPED_AREA when CONFIG_PPC_64S_HASH_MMU is set powerpc/xics: Include missing header powerpc/powernv/pci: Drop VF MPS fixup powerpc/fsl_book3e: Don't set rodata RO too early powerpc/microwatt: Add mmu bits to device tree powerpc/powernv/flash: Check OPAL flash calls exist before using powerpc/powermac: constify device_node in of_irq_parse_oldworld() powerpc/powermac: add missing g5_phy_disable_cpu1() declaration selftests/powerpc/pmu: fix spelling mistake "mis-match" -> "mismatch" powerpc: Enable the DAWR on POWER9 DD2.3 and above powerpc/64s: Add CPU_FTRS_POWER10 to ALWAYS mask powerpc/64s: Add CPU_FTRS_POWER9_DD2_2 to CPU_FTRS_ALWAYS mask powerpc: Fix all occurences of "the the" selftests/powerpc/pmu/ebb: remove fixed_instruction.S powerpc/platforms/83xx: Use of_device_get_match_data() powerpc/eeh: Drop redundant spinlock initialization powerpc/iommu: Add missing of_node_put in iommu_init_early_dart powerpc/pseries/vas: Call misc_deregister if sysfs init fails powerpc/papr_scm: Fix leaking nvdimm_events_map elements ...
\| *	powerpc/powernv/vas: Assign real address to rx_fifo in vas_rx_win_attr	Haren Myneni	2022-05-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In init_winctx_regs(), __pa() is called on winctx->rx_fifo and this function is called to initialize registers for receive and fault windows. But the real address is passed in winctx->rx_fifo for receive windows and the virtual address for fault windows which causes errors with DEBUG_VIRTUAL enabled. Fixes this issue by assigning only real address to rx_fifo in vas_rx_win_attr struct for both receive and fault windows. Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Haren Myneni <haren@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/338e958c7ab8f3b266fa794a1f80f99b9671829e.camel@linux.ibm.com
* \|	Merge tag 'v5.19-p1' of ↵	Linus Torvalds	2022-05-28	77	-881/+2605
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Test in-place en/decryption with two sglists in testmgr - Fix process vs softirq race in cryptd Algorithms: - Add arm64 acceleration for sm4 - Add s390 acceleration for chacha20 Drivers: - Add polarfire soc hwrng support in mpsf - Add support for TI SoC AM62x in sa2ul - Add support for ATSHA204 cryptochip in atmel-sha204a - Add support for PRNG in caam - Restore support for storage encryption in qat - Restore support for storage encryption in hisilicon/sec" * tag 'v5.19-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (116 commits) hwrng: omap3-rom - fix using wrong clk_disable() in omap_rom_rng_runtime_resume() crypto: hisilicon/sec - delete the flag CRYPTO_ALG_ALLOCATES_MEMORY crypto: qat - add support for 401xx devices crypto: qat - re-enable registration of algorithms crypto: qat - honor CRYPTO_TFM_REQ_MAY_SLEEP flag crypto: qat - add param check for DH crypto: qat - add param check for RSA crypto: qat - remove dma_free_coherent() for DH crypto: qat - remove dma_free_coherent() for RSA crypto: qat - fix memory leak in RSA crypto: qat - add backlog mechanism crypto: qat - refactor submission logic crypto: qat - use pre-allocated buffers in datapath crypto: qat - set to zero DH parameters before free crypto: s390 - add crypto library interface for ChaCha20 crypto: talitos - Uniform coding style with defined variable crypto: octeontx2 - simplify the return expression of otx2_cpt_aead_cbc_aes_sha_setkey() crypto: cryptd - Protect per-CPU resource by disabling BH. crypto: sun8i-ce - do not fallback if cryptlen is less than sg length crypto: sun8i-ce - rework debugging ...
\| * \|	crypto: hisilicon/sec - delete the flag CRYPTO_ALG_ALLOCATES_MEMORY	Kai Ye	2022-05-20	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Should not to uses the CRYPTO_ALG_ALLOCATES_MEMORY in SEC2. The SEC2 driver uses the pre-allocated buffers, including the src sgl pool, dst sgl pool and other qp ctx resources. (e.g. IV buffer, mac buffer, key buffer). The SEC2 driver doesn't allocate memory during request processing. The driver only maps software sgl to allocated hardware sgl during I/O. So here is fix it. Signed-off-by: Kai Ye <yekai13@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - add support for 401xx devices	Giovanni Cabiddu	2022-05-20	4	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	QAT_401xx is a derivative of 4xxx. Add support for that device in the qat_4xxx driver by including the DIDs (both PF and VF), extending the probe and the firmware loader. Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Srinivas Kerekare <srinivas.kerekare@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - re-enable registration of algorithms	Giovanni Cabiddu	2022-05-20	2	-14/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Re-enable the registration of algorithms after fixes to (1) use pre-allocated buffers in the datapath and (2) support the CRYPTO_TFM_REQ_MAY_BACKLOG flag. This reverts commit 8893d27ffcaf6ec6267038a177cb87bcde4dd3de. Cc: stable@vger.kernel.org Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Marco Chiappero <marco.chiappero@intel.com> Reviewed-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - honor CRYPTO_TFM_REQ_MAY_SLEEP flag	Giovanni Cabiddu	2022-05-20	3	-14/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a request has the flag CRYPTO_TFM_REQ_MAY_SLEEP set, allocate memory using the flag GFP_KERNEL otherwise use GFP_ATOMIC. Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - add param check for DH	Giovanni Cabiddu	2022-05-20	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reject requests with a source buffer that is bigger than the size of the key. This is to prevent a possible integer underflow that might happen when copying the source scatterlist into a linear buffer. Cc: stable@vger.kernel.org Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - add param check for RSA	Giovanni Cabiddu	2022-05-20	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Reject requests with a source buffer that is bigger than the size of the key. This is to prevent a possible integer underflow that might happen when copying the source scatterlist into a linear buffer. Cc: stable@vger.kernel.org Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - remove dma_free_coherent() for DH	Giovanni Cabiddu	2022-05-20	1	-49/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The functions qat_dh_compute_value() allocates memory with dma_alloc_coherent() if the source or the destination buffers are made of multiple flat buffers or of a size that is not compatible with the hardware. This memory is then freed with dma_free_coherent() in the context of a tasklet invoked to handle the response for the corresponding request. According to Documentation/core-api/dma-api-howto.rst, the function dma_free_coherent() cannot be called in an interrupt context. Replace allocations with dma_alloc_coherent() in the function qat_dh_compute_value() with kmalloc() + dma_map_single(). Cc: stable@vger.kernel.org Fixes: c9839143ebbf ("crypto: qat - Add DH support") Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - remove dma_free_coherent() for RSA	Giovanni Cabiddu	2022-05-20	1	-77/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After commit f5ff79fddf0e ("dma-mapping: remove CONFIG_DMA_REMAP"), if the algorithms are enabled, the driver crashes with a BUG_ON while executing vunmap() in the context of a tasklet. This is due to the fact that the function dma_free_coherent() cannot be called in an interrupt context (see Documentation/core-api/dma-api-howto.rst). The functions qat_rsa_enc() and qat_rsa_dec() allocate memory with dma_alloc_coherent() if the source or the destination buffers are made of multiple flat buffers or of a size that is not compatible with the hardware. This memory is then freed with dma_free_coherent() in the context of a tasklet invoked to handle the response for the corresponding request. Replace allocations with dma_alloc_coherent() in the functions qat_rsa_enc() and qat_rsa_dec() with kmalloc() + dma_map_single(). Cc: stable@vger.kernel.org Fixes: a990532023b9 ("crypto: qat - Add support for RSA algorithm") Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - fix memory leak in RSA	Giovanni Cabiddu	2022-05-20	1	-11/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When an RSA key represented in form 2 (as defined in PKCS #1 V2.1) is used, some components of the private key persist even after the TFM is released. Replace the explicit calls to free the buffers in qat_rsa_exit_tfm() with a call to qat_rsa_clear_ctx() which frees all buffers referenced in the TFM context. Cc: stable@vger.kernel.org Fixes: 879f77e9071f ("crypto: qat - Add RSA CRT mode") Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - add backlog mechanism	Giovanni Cabiddu	2022-05-20	9	-18/+123
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The implementations of the crypto algorithms (aead, skcipher, etc) in the QAT driver do not properly support requests with the CRYPTO_TFM_REQ_MAY_BACKLOG flag set. If the HW queue is full, the driver returns -EBUSY but does not enqueue the request. This can result in applications like dm-crypt waiting indefinitely for the completion of a request that was never submitted to the hardware. Fix this by adding a software backlog queue: if the ring buffer is more than eighty percent full, then the request is enqueued to a backlog list and the error code -EBUSY is returned back to the caller. Requests in the backlog queue are resubmitted at a later time, in the context of the callback of a previously submitted request. The request for which -EBUSY is returned is then marked as -EINPROGRESS once submitted to the HW queues. The submission loop inside the function qat_alg_send_message() has been modified to decide which submission policy to use based on the request flags. If the request does not have the CRYPTO_TFM_REQ_MAY_BACKLOG set, the previous behaviour has been preserved. Based on a patch by Vishnu Das Ramachandran <vishnu.dasx.ramachandran@intel.com> Cc: stable@vger.kernel.org Fixes: d370cec32194 ("crypto: qat - Intel(R) QAT crypto interface") Reported-by: Mikulas Patocka <mpatocka@redhat.com> Reported-by: Kyle Sanderson <kyle.leet@gmail.com> Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Marco Chiappero <marco.chiappero@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - refactor submission logic	Giovanni Cabiddu	2022-05-20	6	-54/+101
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	All the algorithms in qat_algs.c and qat_asym_algs.c use the same pattern to submit messages to the HW queues. Move the submission loop to a new function, qat_alg_send_message(), and share it between the symmetric and the asymmetric algorithms. As part of this rework, since the number of retries before returning an error is inconsistent between the symmetric and asymmetric implementations, set it to a value that works for both (i.e. 20, was 10 in qat_algs.c and 100 in qat_asym_algs.c) In addition fix the return code reported when the HW queues are full. In that case return -ENOSPC instead of -EBUSY. Including stable in CC since (1) the error code returned if the HW queues are full is incorrect and (2) to facilitate the backport of the next fix "crypto: qat - add backlog mechanism". Cc: stable@vger.kernel.org Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Marco Chiappero <marco.chiappero@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - use pre-allocated buffers in datapath	Giovanni Cabiddu	2022-05-20	2	-27/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order to do DMAs, the QAT device requires that the scatterlist structures are mapped and translated into a format that the firmware can understand. This is defined as the composition of a scatter gather list (SGL) descriptor header, the struct qat_alg_buf_list, plus a variable number of flat buffer descriptors, the struct qat_alg_buf. The allocation and mapping of these data structures is done each time a request is received from the skcipher and aead APIs. In an OOM situation, this behaviour might lead to a dead-lock if an allocation fails. Based on the conversation in [1], increase the size of the aead and skcipher request contexts to include an SGL descriptor that can handle a maximum of 4 flat buffers. If requests exceed 4 entries buffers, memory is allocated dynamically. [1] https://lore.kernel.org/linux-crypto/20200722072932.GA27544@gondor.apana.org.au/ Cc: stable@vger.kernel.org Fixes: d370cec32194 ("crypto: qat - Intel(R) QAT crypto interface") Reported-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Marco Chiappero <marco.chiappero@intel.com> Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: qat - set to zero DH parameters before free	Giovanni Cabiddu	2022-05-20	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Set to zero the context buffers containing the DH key before they are freed. This is a defense in depth measure that avoids keys to be recovered from memory in case the system is compromised between the free of the buffer and when that area of memory (containing keys) gets overwritten. Cc: stable@vger.kernel.org Fixes: c9839143ebbf ("crypto: qat - Add DH support") Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Wojciech Ziemba <wojciech.ziemba@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: s390 - add crypto library interface for ChaCha20	Vladis Dronov	2022-05-13	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Implement a crypto library interface for the s390-native ChaCha20 cipher algorithm. This allows us to stop to select CRYPTO_CHACHA20 and instead select CRYPTO_ARCH_HAVE_LIB_CHACHA. This allows BIG_KEYS=y not to build a whole ChaCha20 crypto infrastructure as a built-in, but build a smaller CRYPTO_LIB_CHACHA instead. Make CRYPTO_CHACHA_S390 config entry to look like similar ones on other architectures. Remove CRYPTO_ALGAPI select as anyway it is selected by CRYPTO_SKCIPHER. Add a new test module and a test script for ChaCha20 cipher and its interfaces. Here are test results on an idle z15 machine: Data \| Generic crypto TFM \| s390 crypto TFM \| s390 lib size \| enc dec \| enc dec \| enc dec -----+--------------------+------------------+---------------- 512b \| 1545ns 1295ns \| 604ns 446ns \| 430ns 407ns 4k \| 9536ns 9463ns \| 2329ns 2174ns \| 2170ns 2154ns 64k \| 149.6us 149.3us \| 34.4us 34.5us \| 33.9us 33.1us 6M \| 23.61ms 23.11ms \| 4223us 4160us \| 3951us 4008us 60M \| 143.9ms 143.9ms \| 33.5ms 33.2ms \| 32.2ms 32.1ms Signed-off-by: Vladis Dronov <vdronov@redhat.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: talitos - Uniform coding style with defined variable	jianchunfu	2022-05-13	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the defined variable "desc" to uniform coding style. Signed-off-by: jianchunfu <jianchunfu@cmss.chinamobile.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: octeontx2 - simplify the return expression of ↵	Minghao Chi	2022-05-13	1	-6/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	otx2_cpt_aead_cbc_aes_sha_setkey() Simplify the return expression. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ce - do not fallback if cryptlen is less than sg length	Corentin Labbe	2022-05-13	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The sg length could be more than remaining data on it. So check the length requirement against the minimum between those two values. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ce - rework debugging	Corentin Labbe	2022-05-13	4	-19/+96
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The "Fallback for xxx" message is annoying, remove it and store the information in the debugfs. Let's add more precise fallback stats and display it better. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ce - use sg_nents_for_len	Corentin Labbe	2022-05-13	2	-18/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When testing with some large SG list, the sun8i-ce drivers always fallback even if it can handle it. So use sg_nents_for_len() which permits to see less SGs than needed. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ce - Add function for handling hash padding	Corentin Labbe	2022-05-13	1	-30/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move all padding work to a dedicated function. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - do not fallback if cryptlen is less than sg length	Corentin Labbe	2022-05-13	1	-10/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The sg length could be more than remaining data on it. So check the length requirement against the minimum between those two values. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - add hmac(sha1)	Corentin Labbe	2022-05-13	3	-6/+231
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Even if sun8i-ss does not handle hmac(sha1) directly, we can provide one which use the already supported acceleration of sha1. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - Add function for handling hash padding	Corentin Labbe	2022-05-13	1	-22/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move all padding work to a dedicated function. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - rework debugging	Corentin Labbe	2022-05-13	4	-22/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The "Fallback for xxx" message is annoying, remove it and store the information in the debugfs. In the same time, reports more fallback statistics. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - handle requests if last block is not modulo 64	Corentin Labbe	2022-05-13	3	-10/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current sun8i-ss handle only requests with all SG length being modulo 64. But the last SG could be always handled by copying it on the pad buffer. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - do not zeroize all pad	Corentin Labbe	2022-05-13	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of memset all pad buffer, it is faster to only put 0 where needed. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - do not allocate memory when handling hash requests	Corentin Labbe	2022-05-13	3	-12/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of allocate memory on each requests, it is easier to pre-allocate buffers. This made error path easier. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - use sg_nents_for_len	Corentin Labbe	2022-05-13	1	-13/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When testing with some large SG list, the sun8i-ss drivers always fallback even if it can handle it. So use sg_nents_for_len() which permits to see less SGs than needed. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - test error before assigning	Corentin Labbe	2022-05-13	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The first thing we should do after dma_map_single() is to test the result. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - remove redundant test	Corentin Labbe	2022-05-13	1	-11/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some fallback tests were redundant with what sun8i_ss_hash_need_fallback() already do. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - handle zero sized sg	Corentin Labbe	2022-05-13	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	sun8i-ss does not handle well the possible zero sized sg. Fixes: d9b45418a917 ("crypto: sun8i-ss - support hash algorithms") Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ss - rework handling of IV	Corentin Labbe	2022-05-13	3	-52/+107
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	sun8i-ss fail handling IVs when doing decryption of multiple SGs in-place. It should backup the last block of each SG source for using it later as IVs. In the same time remove allocation on requests path for storing all IVs. Fixes: f08fcced6d00 ("crypto: allwinner - Add sun8i-ss cryptographic offloader") Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun4i-ss - do not allocate backup IV on requests	Corentin Labbe	2022-05-13	2	-14/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of allocate memory on each requests, it is easier to pre-allocate buffer for backup IV. This made error path easier. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ce - do not allocate memory when handling requests	Corentin Labbe	2022-05-13	3	-28/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of allocate memory on each requests, it is easier to pre-allocate buffer for IV. This made error path easier. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: sun8i-ce - Fix minor style issue	Corentin Labbe	2022-05-13	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch remove a double blank line. Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: vmx - Fix build error	Masahiro Yamada	2022-05-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When I refactored this Makefile, I accidentally changed the CONFIG option. Fixes: b52455a73db9 ("crypto: vmx - Align the short log with Makefile cleanups") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: vmx - Align the short log with Makefile cleanups	Masahiro Yamada	2022-05-06	1	-14/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I notieced the log is not properly aligned: PERL drivers/crypto/vmx/aesp8-ppc.S CC [M] fs/xfs/xfs_reflink.o PERL drivers/crypto/vmx/ghashp8-ppc.S CC [M] drivers/crypto/vmx/aes.o Add some spaces after 'PERL'. While I was here, I cleaned up the Makefile: - Merge the two similar rules - Remove redundant 'clean-files' (Having 'targets' is enough) - Move the flavour into the build command This still avoids the build failures fixed by commit 4ee812f6143d ("crypto: vmx - Avoid weird build failures"). Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: atmel - Avoid flush_scheduled_work() usage	Tetsuo Handa	2022-05-06	5	-3/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Flushing system-wide workqueues is dangerous and will be forbidden. Replace system_wq with local atmel_wq. If CONFIG_CRYPTO_DEV_ATMEL_{I2C,ECC,SHA204A}=y, the ordering in Makefile guarantees that module_init() for atmel-i2c runs before module_init() for atmel-ecc and atmel-sha204a runs. Link: https://lkml.kernel.org/r/49925af7-78a8-a3dd-bce6-cfc02e1a9236@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: atmel-i2c - Simplify return code in probe function	Uwe Kleine-König	2022-05-06	1	-5/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is no semantical change introduced by this change. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: caam/rng - Add support for PRNG	Meenakshi Aggarwal	2022-05-06	5	-1/+261
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for random number generation using PRNG mode of CAAM and expose the interface through crypto API. According to the RM, the HW implementation of the DRBG follows NIST SP 800-90A specification for DRBG_Hash SHA-256 function Signed-off-by: Meenakshi Aggarwal <meenakshi.aggarwal@nxp.com> Reviewed-by: Horia Geant <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: atmel-sha204a - Suppress duplicate error message	Uwe Kleine-König	2022-05-06	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Returning an error value in an i2c remove callback results in an error message being emitted by the i2c core, but otherwise it doesn't make a difference. The device goes away anyhow and the devm cleanups are called. As atmel_sha204a_remove already emits an error message ant the additional error message by the i2c core doesn't add any useful information, change the return value to zero to suppress this error message. Note that after atmel_sha204a_remove() returns *i2c_priv is freed, so there is trouble ahead because atmel_sha204a_rng_done() might be called after that freeing. So make the error message a bit more frightening. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \|	crypto: atmel-sha204a - Remove useless check	Uwe Kleine-König	2022-05-06	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	kfree(NULL) is a noop, so there is no win in checking a pointer before kfreeing it. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>