summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* crypto: dh - add public key verification testStephan Mueller2018-07-083-6/+79
| | | | | | | | | | | | | | | | | | | | | | According to SP800-56A section 5.6.2.1, the public key to be processed for the DH operation shall be checked for appropriateness. The check shall covers the full verification test in case the domain parameter Q is provided as defined in SP800-56A section 5.6.2.3.1. If Q is not provided, the partial check according to SP800-56A section 5.6.2.3.2 is performed. The full verification test requires the presence of the domain parameter Q. Thus, the patch adds the support to handle Q. It is permissible to not provide the Q value as part of the domain parameters. This implies that the interface is still backwards-compatible where so far only P and G are to be provided. However, if Q is provided, it is imported. Without the test, the NIST ACVP testing fails. After adding this check, the NIST ACVP testing passes. Testing without providing the Q domain parameter has been performed to verify the interface has not changed. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: stm32/crc - Add power management supportlionel.debieve@st.com2018-07-081-0/+62
| | | | | | | Adding pm and pm_runtime support to STM32 CRC. Signed-off-by: Lionel Debieve <lionel.debieve@st.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: stm32/hash - Add power management supportlionel.debieve@st.com2018-07-081-0/+71
| | | | | | | Adding pm and pm_runtime support to STM32 HASH. Signed-off-by: Lionel Debieve <lionel.debieve@st.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: stm32/cryp - Add power management supportlionel.debieve@st.com2018-07-081-0/+62
| | | | | | | Adding pm and pm_runtime support to STM32 CRYP. Signed-off-by: Lionel Debieve <lionel.debieve@st.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: skcipher - Fix -Wstringop-truncation warningsStafford Horne2018-07-082-0/+3
| | | | | | | | | | | | | | | | | | | | | | As of GCC 9.0.0 the build is reporting warnings like: crypto/ablkcipher.c: In function ‘crypto_ablkcipher_report’: crypto/ablkcipher.c:374:2: warning: ‘strncpy’ specified bound 64 equals destination size [-Wstringop-truncation] strncpy(rblkcipher.geniv, alg->cra_ablkcipher.geniv ?: "<default>", ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ sizeof(rblkcipher.geniv)); ~~~~~~~~~~~~~~~~~~~~~~~~~ This means the strnycpy might create a non null terminated string. Fix this by explicitly performing '\0' termination. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: Nick Desaulniers <nick.desaulniers@gmail.com> Signed-off-by: Stafford Horne <shorne@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: ecdh - add public key verification testStephan Mueller2018-07-082-8/+56
| | | | | | | | | | | | | | | | | | According to SP800-56A section 5.6.2.1, the public key to be processed for the ECDH operation shall be checked for appropriateness. When the public key is considered to be an ephemeral key, the partial validation test as defined in SP800-56A section 5.6.2.3.4 can be applied. The partial verification test requires the presence of the field elements of a and b. For the implemented NIST curves, b is defined in FIPS 186-4 appendix D.1.2. The element a is implicitly given with the Weierstrass equation given in D.1.2 where a = p - 3. Without the test, the NIST ACVP testing fails. After adding this check, the NIST ACVP testing passes. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: skcipher - remove the exporting of skcipher_walk_nextDenis Efremov2018-07-011-1/+0
| | | | | | | | | | | | | | | The function skcipher_walk_next declared as static and marked as EXPORT_SYMBOL_GPL. It's a bit confusing for internal function to be exported. The area of visibility for such function is its .c file and all other modules. Other *.c files of the same module can't use it, despite all other modules can. Relying on the fact that this is the internal function and it's not a crucial part of the API, the patch just removes the EXPORT_SYMBOL_GPL marking of skcipher_walk_next. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: virtio - Register an algo only if it's supportedFarhan Ali2018-07-013-45/+159
| | | | | | | | | | | | | | | Register a crypto algo with the Linux crypto layer only if the algorithm is supported by the backend virtio-crypto device. Also route crypto requests to a virtio-crypto device, only if it can support the requested service and algorithm. Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Acked-by: Gonglei <arei.gonglei@huawei.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: virtio - Read crypto services and algorithm masksFarhan Ali2018-07-012-0/+43
| | | | | | | | | | | Read the crypto services and algorithm masks which provides information about the services and algorithms supported by virtio-crypto backend. Signed-off-by: Farhan Ali <alifm@linux.ibm.com> Acked-by: Gonglei <arei.gonglei@huawei.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: vmac - remove insecure version with hardcoded nonceEric Biggers2018-07-014-186/+8
| | | | | | | | | | | | | | | | Remove the original version of the VMAC template that had the nonce hardcoded to 0 and produced a digest with the wrong endianness. I'm unsure whether this had users or not (there are no explicit in-kernel references to it), but given that the hardcoded nonce made it wildly insecure unless a unique key was used for each message, let's try removing it and see if anyone complains. Leave the new "vmac64" template that requires the nonce to be explicitly specified as the first 16 bytes of data and uses the correct endianness for the digest. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: vmac - add nonced version with big endian digestEric Biggers2018-07-013-18/+273
| | | | | | | | | | | | | | | | | | Currently the VMAC template uses a "nonce" hardcoded to 0, which makes it insecure unless a unique key is set for every message. Also, the endianness of the final digest is wrong: the implementation uses little endian, but the VMAC specification has it as big endian, as do other VMAC implementations such as the one in Crypto++. Add a new VMAC template where the nonce is passed as the first 16 bytes of data (similar to what is done for Poly1305's nonce), and the digest is big endian. Call it "vmac64", since the old name of simply "vmac" didn't clarify whether the implementation is of VMAC-64 or of VMAC-128 (which produce 64-bit and 128-bit digests respectively); so we fix the naming ambiguity too. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: vmac - separate tfm and request contextEric Biggers2018-07-012-290/+181
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | syzbot reported a crash in vmac_final() when multiple threads concurrently use the same "vmac(aes)" transform through AF_ALG. The bug is pretty fundamental: the VMAC template doesn't separate per-request state from per-tfm (per-key) state like the other hash algorithms do, but rather stores it all in the tfm context. That's wrong. Also, vmac_final() incorrectly zeroes most of the state including the derived keys and cached pseudorandom pad. Therefore, only the first VMAC invocation with a given key calculates the correct digest. Fix these bugs by splitting the per-tfm state from the per-request state and using the proper init/update/final sequencing for requests. Reproducer for the crash: #include <linux/if_alg.h> #include <sys/socket.h> #include <unistd.h> int main() { int fd; struct sockaddr_alg addr = { .salg_type = "hash", .salg_name = "vmac(aes)", }; char buf[256] = { 0 }; fd = socket(AF_ALG, SOCK_SEQPACKET, 0); bind(fd, (void *)&addr, sizeof(addr)); setsockopt(fd, SOL_ALG, ALG_SET_KEY, buf, 16); fork(); fd = accept(fd, NULL, NULL); for (;;) write(fd, buf, 256); } The immediate cause of the crash is that vmac_ctx_t.partial_size exceeds VMAC_NHBYTES, causing vmac_final() to memset() a negative length. Reported-by: syzbot+264bca3a6e8d645550d3@syzkaller.appspotmail.com Fixes: f1939f7c5645 ("crypto: vmac - New hash algorithm for intel_txt support") Cc: <stable@vger.kernel.org> # v2.6.32+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: vmac - require a block cipher with 128-bit block sizeEric Biggers2018-07-011-0/+4
| | | | | | | | | | | | | | The VMAC template assumes the block cipher has a 128-bit block size, but it failed to check for that. Thus it was possible to instantiate it using a 64-bit block size cipher, e.g. "vmac(cast5)", causing uninitialized memory to be used. Add the needed check when instantiating the template. Fixes: f1939f7c5645 ("crypto: vmac - New hash algorithm for intel_txt support") Cc: <stable@vger.kernel.org> # v2.6.32+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: atmel-ecc - remove overly verbose dev_infoTudor-Dan Ambarus2018-06-221-4/+0
| | | | | | | | | | | Remove it because when using a slow console, it can affect the speed of crypto operations. Similar to 'commit 730f23b66095 ("crypto: vmx - Remove overly verbose printk from AES XTS init")'. Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: atmel-ecc - fix to allow multi segment scatterlistsTudor-Dan Ambarus2018-06-221-9/+22
| | | | | | | | | | | Remove the limitation of single element scatterlists. ECDH with multi-element scatterlists is needed by TPM. Similar to 'commit 95ec01ba1ef0 ("crypto: ecdh - fix to allow multi segment scatterlists")'. Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: cavium - make structure algs staticColin Ian King2018-06-221-1/+1
| | | | | | | | | | | | The structure algs is local to the source and does not need to be in global scope, so make it static. Cleans up sparse warning: drivers/crypto/cavium/cpt/cptvf_algs.c:354:19: warning: symbol 'algs' was not declared. Should it be static? Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: aegis - fix indentation of a statementColin Ian King2018-06-221-1/+1
| | | | | | | Trival fix to correct the indentation of a single statement Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: inside-secure - authenc(hmac(sha384), cbc(aes)) supportAntoine Tenart2018-06-223-0/+41
| | | | | | | | This patch adds the authenc(hmac(sha384),cbc(aes)) algorithm support to the Inside Secure SafeXcel driver. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: inside-secure - hmac(sha384) supportAntoine Tenart2018-06-223-0/+58
| | | | | | | | This patch adds the hmac(sha384) algorithm support to the Inside Secure SafeXcel driver. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: inside-secure - sha384 supportAntoine Tenart2018-06-223-1/+77
| | | | | | | | This patch adds the sha384 algorithm support to the Inside Secure SafeXcel driver. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: sha512_generic - add a sha384 0-length pre-computed hashAntoine Tenart2018-06-222-0/+12
| | | | | | | | | This patch adds the sha384 pre-computed 0-length hash so that device drivers can use it when an hardware engine does not support computing a hash from a 0 length input. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: inside-secure - authenc(hmac(sha512), cbc(aes)) supportAntoine Tenart2018-06-223-2/+43
| | | | | | | | This patch adds the authenc(hmac(sha512),cbc(aes)) algorithm support to the Inside Secure SafeXcel driver. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: inside-secure - hmac(sha512) supportAntoine Tenart2018-06-223-3/+61
| | | | | | | | This patch adds the hmac(sha512) algorithm support to the Inside Secure SafeXcel driver. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: inside-secure - sha512 supportAntoine Tenart2018-06-223-37/+153
| | | | | | | | This patch adds the sha512 algorithm support to the Inside Secure SafeXcel driver. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: sha512_generic - add a sha512 0-length pre-computed hashAntoine Tenart2018-06-222-0/+14
| | | | | | | | | This patch adds the sha512 pre-computed 0-length hash so that device drivers can use it when an hardware engine does not support computing a hash from a 0 length input. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: inside-secure - improve the counter computationAntoine Tenart2018-06-222-7/+10
| | | | | | | | | | | A counter is given to the engine when finishing hash computation. It currently uses the blocksize while it counts the number of 64 bytes blocks given to the engine. This works well for all algorithms so far, as SHA1, SHA224 and SHA256 all have a blocksize of 64 bytes, but others algorithms such as SHA512 wouldn't work. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: inside-secure - use the error handler for invalidation requestsAntoine Tenart2018-06-222-10/+4
| | | | | | | | | | This patch reworks the way invalidation request handlers handle the result descriptor errors, to use the common error handling function. This improves the drivers in terms of readability and maintainability. Suggested-by: Ofer Heifetz <oferh@marvell.com> Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* crypto: inside-secure - increase minimum transfer sizeOfer Heifetz2018-06-221-4/+4
| | | | | | | | | | The token size was increased for AEAD support. Occasional authentication fails arise since the result descriptor overflows. This is because the token size and the engine minimal thresholds must be in sync. Signed-off-by: Ofer Heifetz <oferh@marvell.com> Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* Linux 4.18-rc1v4.18-rc1Linus Torvalds2018-06-171-2/+2
|
* Merge tag 'for-linus-20180616' of git://git.kernel.dk/linux-blockLinus Torvalds2018-06-1616-275/+174
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull block fixes from Jens Axboe: "A collection of fixes that should go into -rc1. This contains: - bsg_open vs bsg_unregister race fix (Anatoliy) - NVMe pull request from Christoph, with fixes for regressions in this window, FC connect/reconnect path code unification, and a trace point addition. - timeout fix (Christoph) - remove a few unused functions (Christoph) - blk-mq tag_set reinit fix (Roman)" * tag 'for-linus-20180616' of git://git.kernel.dk/linux-block: bsg: fix race of bsg_open and bsg_unregister block: remov blk_queue_invalidate_tags nvme-fabrics: fix and refine state checks in __nvmf_check_ready nvme-fabrics: handle the admin-only case properly in nvmf_check_ready nvme-fabrics: refactor queue ready check blk-mq: remove blk_mq_tagset_iter nvme: remove nvme_reinit_tagset nvme-fc: fix nulling of queue data on reconnect nvme-fc: remove reinit_request routine blk-mq: don't time out requests again that are in the timeout handler nvme-fc: change controllers first connect to use reconnect path nvme: don't rely on the changed namespace list log nvmet: free smart-log buffer after use nvme-rdma: fix error flow during mapping request data nvme: add bio remapping tracepoint nvme: fix NULL pointer dereference in nvme_init_subsystem blk-mq: reinit q->tag_set_list entry only after grace period
| * bsg: fix race of bsg_open and bsg_unregisterAnatoliy Glagolev2018-06-151-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing implementation allows races between bsg_unregister and bsg_open paths. bsg_unregister and request_queue cleanup and deletion may start and complete right after bsg_get_device (in bsg_open path) retrieves bsg_class_device and releases the mutex. Then bsg_open path touches freed memory of bsg_class_device and request_queue. One possible fix is to hold the mutex all the way through bsg_get_device instead of releasing it after bsg_class_device retrieval. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-Off-By: Anatoliy Glagolev <glagolig@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * block: remov blk_queue_invalidate_tagsChristoph Hellwig2018-06-153-38/+1
| | | | | | | | | | | | | | | | This function is entirely unused, so remove it and the tag_queue_busy member of struct request_queue. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * Merge branch 'nvme-4.18' of git://git.infradead.org/nvme into for-linusJens Axboe2018-06-1511-224/+154
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NVMe fixes from Christoph: "Fix various little regressions introduced in this merge window, plus a rework of the fibre channel connect and reconnect path to share the code instead of having separate sets of bugs. Last but not least a trivial trace point addition from Hannes." * 'nvme-4.18' of git://git.infradead.org/nvme: nvme-fabrics: fix and refine state checks in __nvmf_check_ready nvme-fabrics: handle the admin-only case properly in nvmf_check_ready nvme-fabrics: refactor queue ready check blk-mq: remove blk_mq_tagset_iter nvme: remove nvme_reinit_tagset nvme-fc: fix nulling of queue data on reconnect nvme-fc: remove reinit_request routine nvme-fc: change controllers first connect to use reconnect path nvme: don't rely on the changed namespace list log nvmet: free smart-log buffer after use nvme-rdma: fix error flow during mapping request data nvme: add bio remapping tracepoint nvme: fix NULL pointer dereference in nvme_init_subsystem
| | * nvme-fabrics: fix and refine state checks in __nvmf_check_readyChristoph Hellwig2018-06-151-20/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - make sure we only allow internally generates commands in any non-live state - only allow connect commands on non-live queues when actually in the new or connecting states - treat all other non-live, non-dead states the same as a default cach-all This fixes a regression where we could not shutdown a controller orderly as we didn't allow the internal generated Property Set command, and also ensures we don't accidentally let a Connect command through in the wrong state. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Smart <james.smart@broadcom.com>
| | * nvme-fabrics: handle the admin-only case properly in nvmf_check_readyChristoph Hellwig2018-06-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | In the ADMIN_ONLY state we don't have any I/O queues, but we should accept all admin commands without further checks. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: James Smart <james.smart@broadcom.com>
| | * nvme-fabrics: refactor queue ready checkChristoph Hellwig2018-06-155-50/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the is_connected check to the fibre channel transport, as it has no meaning for other transports. To facilitate this split out a new nvmf_fail_nonready_command helper that is called by the transport when it is asked to handle a command on a queue that is not ready. Also avoid a function call for the queue live fast path by inlining the check. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: James Smart <james.smart@broadcom.com>
| | * blk-mq: remove blk_mq_tagset_iterChristoph Hellwig2018-06-142-31/+0
| | | | | | | | | | | | | | | | | | | | | Unused now that nvme stopped using it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk>
| | * nvme: remove nvme_reinit_tagsetChristoph Hellwig2018-06-142-12/+0
| | | | | | | | | | | | | | | | | | | | | Unused now that all transports stopped using it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jens Axboe <axboe@kernel.dk>
| | * nvme-fc: fix nulling of queue data on reconnectJames Smart2018-06-141-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The reconnect path is calling the init routines to clear a queue structure. But the queue structure has state that perhaps needs to persist as long as the controller is live. Remove the nvme_fc_init_queue() calls on reconnect. The nvme_fc_free_queue() calls will clear state bits and reset any relevant queue state for a new connection. Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * nvme-fc: remove reinit_request routineJames Smart2018-06-141-20/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The reinit_request routine is not necessary. Remove support for the op callback. As all that nvme_reinit_tagset() does is itterate and call the reinit routine, it too has no purpose. Remove the call. Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * nvme-fc: change controllers first connect to use reconnect pathJames Smart2018-06-141-57/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current code follows the framework that has been in the transports from the beginning where initial link-side controller connect occurs as part of "creating the controller". Thus that first connect fully talks to the controller and obtains values that can then be used in for blk-mq setup, etc. It also means that everything about the controller is fully know before the "create controller" call returns. This has several weaknesses: - The initial create_ctrl call made by the cli will block for a long time as wire transactions are performed synchronously. This delay becomes longer if errors occur or connectivity is lost and retries need to be performed. - Code wise, it means there is a separate connect path for initial controller connect vs the (same) steps used in the reconnect path. - And as there's separate paths, it means there's separate error handling and retry logic. It also plays havoc with the NEW state (should transition out of it after successful initial connect) vs the RESETTING and CONNECTING (reconnect) states that want to be transitioned to on error. - As there's separate paths, to recover from errors and disruptions, it requires separate recovery/retry paths as well and can severely convolute the controller state. This patch reworks the fc transport to use the same connect paths for the initial connection as it uses for reconnect. This makes a single path for error recovery and handling. This patch: - Removes the driving of the initial connect and replaces it with a state transition to CONNECTING and initiating the reconnect thread. A dummy state transition of RESETTING had to be traversed as a direct transtion of NEW->CONNECTING is not allowed. Given that the controller is "new", the RESETTING transition is a simple no-op. Once in the reconnecting thread, the normal behaviors of ctrl_loss_tmo (max_retries * connect_delay) and dev_loss_tmo will apply before the controller is torn down. - Only if the state transitions couldn't be traversed and the reconnect thread not scheduled, will the controller be torn down while in create_ctrl. - The prior code used the controller state of NEW to indicate whether request queues had been initialized or not. For the admin queue, the request queue is always created, so there's no need to check a state. For IO queues, change to tracking whether a successful io request queue create has occurred (e.g. 1st successful connect). - The initial controller id is initialized to the dynamic controller id used in the initial connect message. It will be overwritten by the real controller id once the controller is connected on the wire. Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * nvme: don't rely on the changed namespace list logChristoph Hellwig2018-06-131-25/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Don't optimize our namespace rescan based on the changed namespace list log page as userspace might have changed the content through reading it. Suggested-by: Keith Busch <keith.busch@linux.intel.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@linux.intel.com> Reviewed-by: Hannes Reinecke <hare@suse.com>
| | * nvmet: free smart-log buffer after useChaitanya Kulkarni2018-06-111-1/+3
| | | | | | | | | | | | | | | | | | | | | Free smart-log buffer allocated in the function after use. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * nvme-rdma: fix error flow during mapping request dataMax Gurtovoy2018-06-111-7/+24
| | | | | | | | | | | | | | | | | | | | | | | | After dma mapping the sgl, we map the sgl to nvme sgl descriptor. In case of failure during the last mapping we never dma unmap the sgl. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * nvme: add bio remapping tracepointHannes Reinecke2018-06-111-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | Adding a tracepoint to trace bio remapping for native nvme multipath. Signed-off-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
| | * nvme: fix NULL pointer dereference in nvme_init_subsystemIsrael Rukshin2018-06-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using nvme-pci driver the nvmf_ctrl_options is NULL. There is no need to check for discovery_nqn flag at non-fabrics controller. Fixes: 181303d0 ("nvme-fabrics: allow duplicate connections to the discovery controller") Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
| * | blk-mq: don't time out requests again that are in the timeout handlerChristoph Hellwig2018-06-142-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can currently call the timeout handler again on a request that has already been handed over to the timeout handler. Prevent that with a new flag. Fixes: 12f5b931 ("blk-mq: Remove generation seqeunce") Reported-by: Andrew Randrianasulu <randrianasulu@gmail.com> Tested-by: Andrew Randrianasulu <randrianasulu@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | blk-mq: reinit q->tag_set_list entry only after grace periodRoman Pen2018-06-111-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is not allowed to reinit q->tag_set_list list entry while RCU grace period has not completed yet, otherwise the following soft lockup in blk_mq_sched_restart() happens: [ 1064.252652] watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [fio:9270] [ 1064.254445] task: ffff99b912e8b900 task.stack: ffffa6d54c758000 [ 1064.254613] RIP: 0010:blk_mq_sched_restart+0x96/0x150 [ 1064.256510] Call Trace: [ 1064.256664] <IRQ> [ 1064.256824] blk_mq_free_request+0xea/0x100 [ 1064.256987] msg_io_conf+0x59/0xd0 [ibnbd_client] [ 1064.257175] complete_rdma_req+0xf2/0x230 [ibtrs_client] [ 1064.257340] ? ibtrs_post_recv_empty+0x4d/0x70 [ibtrs_core] [ 1064.257502] ibtrs_clt_rdma_done+0xd1/0x1e0 [ibtrs_client] [ 1064.257669] ib_create_qp+0x321/0x380 [ib_core] [ 1064.257841] ib_process_cq_direct+0xbd/0x120 [ib_core] [ 1064.258007] irq_poll_softirq+0xb7/0xe0 [ 1064.258165] __do_softirq+0x106/0x2a2 [ 1064.258328] irq_exit+0x92/0xa0 [ 1064.258509] do_IRQ+0x4a/0xd0 [ 1064.258660] common_interrupt+0x7a/0x7a [ 1064.258818] </IRQ> Meanwhile another context frees other queue but with the same set of shared tags: [ 1288.201183] INFO: task bash:5910 blocked for more than 180 seconds. [ 1288.201833] bash D 0 5910 5820 0x00000000 [ 1288.202016] Call Trace: [ 1288.202315] schedule+0x32/0x80 [ 1288.202462] schedule_timeout+0x1e5/0x380 [ 1288.203838] wait_for_completion+0xb0/0x120 [ 1288.204137] __wait_rcu_gp+0x125/0x160 [ 1288.204287] synchronize_sched+0x6e/0x80 [ 1288.204770] blk_mq_free_queue+0x74/0xe0 [ 1288.204922] blk_cleanup_queue+0xc7/0x110 [ 1288.205073] ibnbd_clt_unmap_device+0x1bc/0x280 [ibnbd_client] [ 1288.205389] ibnbd_clt_unmap_dev_store+0x169/0x1f0 [ibnbd_client] [ 1288.205548] kernfs_fop_write+0x109/0x180 [ 1288.206328] vfs_write+0xb3/0x1a0 [ 1288.206476] SyS_write+0x52/0xc0 [ 1288.206624] do_syscall_64+0x68/0x1d0 [ 1288.206774] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 What happened is the following: 1. There are several MQ queues with shared tags. 2. One queue is about to be freed and now task is in blk_mq_del_queue_tag_set(). 3. Other CPU is in blk_mq_sched_restart() and loops over all queues in tag list in order to find hctx to restart. Because linked list entry was modified in blk_mq_del_queue_tag_set() without proper waiting for a grace period, blk_mq_sched_restart() never ends, spining in list_for_each_entry_rcu_rr(), thus soft lockup. Fix is simple: reinit list entry after an RCU grace period elapsed. Fixes: Fixes: 705cda97ee3a ("blk-mq: Make it safe to use RCU to iterate over blk_mq_tag_set.tag_list") Cc: stable@vger.kernel.org Cc: Sagi Grimberg <sagi@grimberg.me> Cc: linux-block@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | | Merge tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimentalLinus Torvalds2018-06-16206-339/+372
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull documentation fixes from Mauro Carvalho Chehab: "This solves a series of broken links for files under Documentation, and improves a script meant to detect such broken links (see scripts/documentation-file-ref-check). The changes on this series are: - can.rst: fix a footnote reference; - crypto_engine.rst: Fix two parsing warnings; - Fix a lot of broken references to Documentation/*; - improve the scripts/documentation-file-ref-check script, in order to help detecting/fixing broken references, preventing false-positives. After this patch series, only 33 broken references to doc files are detected by scripts/documentation-file-ref-check" * tag 'docs-broken-links' of git://linuxtv.org/mchehab/experimental: (26 commits) fix a series of Documentation/ broken file name references Documentation: rstFlatTable.py: fix a broken reference ABI: sysfs-devices-system-cpu: remove a broken reference devicetree: fix a series of wrong file references devicetree: fix name of pinctrl-bindings.txt devicetree: fix some bindings file names MAINTAINERS: fix location of DT npcm files MAINTAINERS: fix location of some display DT bindings kernel-parameters.txt: fix pointers to sound parameters bindings: nvmem/zii: Fix location of nvmem.txt docs: Fix more broken references scripts/documentation-file-ref-check: check tools/*/Documentation scripts/documentation-file-ref-check: get rid of false-positives scripts/documentation-file-ref-check: hint: dash or underline scripts/documentation-file-ref-check: add a fix logic for DT scripts/documentation-file-ref-check: accept more wildcards at filenames scripts/documentation-file-ref-check: fix help message media: max2175: fix location of driver's companion documentation media: v4l: fix broken video4linux docs locations media: dvb: point to the location of the old README.dvb-usb file ...
| * | | fix a series of Documentation/ broken file name referencesMauro Carvalho Chehab2018-06-158-9/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As files move around, their previous links break. Fix the references for them. Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> Acked-by: Jonathan Corbet <corbet@lwn.net>