linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	Merge branch 'nvme-4.12' of git://git.infradead.org/nvme into ↵	Jens Axboe	2017-04-27	22	-349/+828
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	for-4.12/post-merge Christoph writes: "A couple more updates for 4.12. The biggest pile is fc and lpfc updates from James, but there are various small fixes and cleanups as well." Fixes up a few merge issues, and also a warning in lpfc_nvmet_rcv_unsol_abort() if CONFIG_NVME_TARGET_FC isn't enabled. Signed-off-by: Jens Axboe <axboe@fb.com>
\| *	lpfc: Fix memory corruption of the lpfc_ncmd->list pointers	James Smart	2017-04-25	2	-6/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	lpfc was changing the private pointer that is set/maintained by the nvme_fc transport. This caused two issues: a) the transport, on teardown may erroneous attempt to free whatever address was set; and b) lfpc uses any value set in lpfc_nvme_fcp_abort() and assumes its a valid io request. Correct issue by properly defining a context structure for lpfc. Lpfc also updated to clear the private context structure on io completion. Since this bug caused scrutiny of the way lpfc moves local request structures between lists, also cleaned up list_del()'s to list_del_inits()'s. This is a nvme-specific bug. The patch was cut against the linux-block tree, for-4.12/block tree. It should be pulled in through that tree. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
\| *	lpfc revison 11.2.0.12	James Smart	2017-04-24	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Update lpfc version to reflect this set of changes. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Fix Express lane queue creation.	James Smart	2017-04-24	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The older sli4 adapters only supported the 64 byte WQE entry size. The new adapter (fw) support both 64 and 128 byte WQE entry sizies. The Express lane WQ was not being created with the 128 byte WQE sizes when it was supported. Not having the right WQE size created for the express lane work queue caused the the firmware to overwrite the lun indentifier in the FCP header. This patch correctly creates the express lane work queue with the supported size. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Update ABORT processing for NVMET.	James Smart	2017-04-24	10	-125/+409
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The driver with nvme had this routine stubbed. Right now XRI_ABORTED_CQE is not handled and the FC NVMET Transport has a new API for the driver. Missing code path, new NVME abort API Update ABORT processing for NVMET There are 3 new FC NVMET Transport API/ template routines for NVMET: lpfc_nvmet_xmt_fcp_release This NVMET template callback routine called to release context associated with an IO This routine is ALWAYS called last, even if the IO was aborted or completed in error. lpfc_nvmet_xmt_fcp_abort This NVMET template callback routine called to abort an exchange that has an IO in progress nvmet_fc_rcv_fcp_req When the lpfc driver receives an ABTS, this NVME FC transport layer callback routine is called. For this case there are 2 paths thru the driver: the driver either has an outstanding exchange / context for the XRI to be aborted or not. If not, a BA_RJT is issued otherwise a BA_ACC NVMET Driver abort paths: There are 2 paths for aborting an IO. The first one is we receive an IO and decide not to process it because of lack of resources. An unsolicated ABTS is immediately sent back to the initiator as a response. lpfc_nvmet_unsol_fcp_buffer lpfc_nvmet_unsol_issue_abort (XMIT_SEQUENCE_WQE) The second one is we sent the IO up to the NVMET transport layer to process, and for some reason the NVME Transport layer decided to abort the IO before it completes all its phases. For this case there are 2 paths thru the driver: the driver either has an outstanding TSEND/TRECEIVE/TRSP WQE or no outstanding WQEs are present for the exchange / context. lpfc_nvmet_xmt_fcp_abort if (LPFC_NVMET_IO_INP) lpfc_nvmet_sol_fcp_issue_abort (ABORT_WQE) lpfc_nvmet_sol_fcp_abort_cmp else lpfc_nvmet_unsol_fcp_issue_abort lpfc_nvmet_unsol_issue_abort (XMIT_SEQUENCE_WQE) lpfc_nvmet_unsol_fcp_abort_cmp Context flags: LPFC_NVMET_IOP - his flag signifies an IO is in progress on the exchange. LPFC_NVMET_XBUSY - this flag indicates the IO completed but the firmware is still busy with the corresponding exchange. The exchange should not be reused until after a XRI_ABORTED_CQE is received for that exchange. LPFC_NVMET_ABORT_OP - this flag signifies an ABORT_WQE was issued on the exchange. LPFC_NVMET_CTX_RLS - this flag signifies a context free was requested, but we are deferring it due to an XBUSY or ABORT in progress. A ctxlock is added to the context structure that is used whenever these flags are set/read within the context of an IO. The LPFC_NVMET_CTX_RLS flag is only set in the defer_relase routine when the transport has resolved all IO associated with the buffer. The flag is cleared when the CTX is associated with a new IO. An exchange can has both an LPFC_NVMET_XBUSY and a LPFC_NVMET_ABORT_OP condition active simultaneously. Both conditions must complete before the exchange is freed. When the abort callback (lpfc_nvmet_xmt_fcp_abort) is envoked: If there is an outstanding IO, the driver will issue an ABORT_WQE. This should result in 3 completions for the exchange: 1) IO cmpl with XB bit set 2) Abort WQE cmpl 3) XRI_ABORTED_CQE cmpl For this scenerio, after completion #1, the NVMET Transport IO rsp callback is called. After completion #2, no action is taken with respect to the exchange / context. After completion #3, the exchange context is free for re-use on another IO. If there is no outstanding activity on the exchange, the driver will send a ABTS to the Initiator. Upon completion of this WQE, the exchange / context is freed for re-use on another IO. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Fix implicit logo and RSCN handling for NVMET	James Smart	2017-04-24	5	-52/+110
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	NVMET didn't have any RSCN handling at all and would not execute implicit LOGO when receiving a PLOGI from an rport that NVMET had in state UNMAPPED. Clean up the logic in lpfc_nlp_state_cleanup for initiators (FCP and NVME). NVMET should not respond to RSCN including allocating new ndlps so this code was conditionalized when nvmet_support is true. The check for NLP_RCV_PLOGI in lpfc_setup_disc_node was moved below the check for nvmet_support to allow the NVMET to recover initiator nodes correctly. The implicit logo was introduced with lpfc_rcv_plogi when NVMET gets a PLOGI on an ndlp in UNMAPPED state. The RSCN handling was modified to not respond to an RSCN in NVMET. Instead NVMET sends a GID_FT and determines if an NVMEP_INITIATOR it has is UNMAPPED but no longer in the zone membership. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Add Fabric assigned WWN support.	James Smart	2017-04-24	7	-12/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Adding support for Fabric assigned WWPN and WWNN. Firmware sends first FLOGI to fabric with vendor version changes. On link up driver gets updated service parameter with FAWWN assigned port name. Driver sends 2nd FLOGI with updated fawwpn and modifies the vport->fc_portname in driver. Note: Soft wwpn will not be allowed when fawwpn is enabled. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Fix max_sgl_segments settings for NVME / NVMET	James Smart	2017-04-24	3	-9/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Cannot set NVME segment counts to a large number The existing module parameter lpfc_sg_seg_cnt is used for both SCSI and NVME. Limit the module parameter lpfc_sg_seg_cnt to 128 with the default being 64 for both NVME and NVMET, assuming NVME is enabled in the driver for that port. The driver will set max_sgl_segments in the NVME/NVMET template to lpfc_sg_seg_cnt + 1. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Fix crash after issuing lip reset	James Smart	2017-04-24	7	-64/+79
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When RPI is not available, driver sends WQE with invalid RPI value and rejected by HBA. lpfc 0000:82:00.3: 1:3154 BLS ABORT RSP failed, data: x3/xa0320008 and lpfc :2753 PLOGI failure DID:FFFFFA Status:x3/xa0240008 In this case, driver accesses rpi_ids array out of bounds. Fix: Check return value of lpfc_sli4_alloc_rpi(). Do not allocate lpfc_nodelist entry if RPI is not available. When RPI is not available, we will get discovery timeouts and command drops for some of the vports as seen below. lpfc :0273 Unexpected discovery timeout, vport State x0 lpfc :0230 Unexpected timeout, hba link state x5 lpfc :0111 Dropping received ELS cmd Data: x0 xc90c55 x0 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Fix driver load issues when MRQ=8	James Smart	2017-04-24	3	-18/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The symptom is that the driver will fail to login to the fabric. The reason is because it is out of iocb resources. There is a one to one relationship between MRQs (receive buffers for NVMET-FC) and iocbs and the default number of IOCBs was not accounting for the number of MRQs that were being created. This fix aligns the number of MRQ resources with the total resources so that it can handle fabric events when needed. Also the initialization of ctxlock to be on FCP commands, NOT LS commands. And modified log messages so that the log output can be correlated with the analyzer trace. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Remove hba lock from NVMET issue WQE.	James Smart	2017-04-24	1	-4/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Unnecessary lock is taken. ring lock should be sufficient to protect the work queue submission. This was noticed when doing performance testing. The hbalock is not needed to issue io to the nvme work queue. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Fix nvme initiator handling when not enabled.	James Smart	2017-04-24	2	-3/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix nvme initiator handline when CONFIG_LPFC_NVME_INITIATOR is not enabled. With update nvme upstream driver sources, loading the driver with nvme enabled resulting in this Oops. BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 IP: lpfc_nvme_update_localport+0x23/0xd0 [lpfc] PGD 0 Oops: 0000 [#1] SMP CPU: 0 PID: 10256 Comm: lpfc_worker_0 Tainted Hardware name: ... task: ffff881028191c40 task.stack: ffff880ffdf00000 RIP: 0010:lpfc_nvme_update_localport+0x23/0xd0 [lpfc] RSP: 0018:ffff880ffdf03c20 EFLAGS: 00010202 Cause: As the initiator driver completes discovery at different stages, it call lpfc_nvme_update_localport to hint that the DID and role may have changed. In the implementation of lpfc_nvme_update_localport, the driver was not validating the localport or the lport during the execution of the update_localport routine. With the recent upstream additions to the driver, the create_localport routine didn't run and so the localport was NULL causing the page-fault Oops. Fix: Add the CONFIG_LPFC_NVME_INITIATOR preprocessor inclusions to lpfc_nvme_update_localport to turn off all routine processing when the running kernel does not have NVME configured. Add NULL pointer checks on the localport and lport in lpfc_nvme_update_localport and dump messages if they are NULL and just exit. Also one alingment issue fixed. Repalces the ifdef with the IS_ENABLED macro. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Fix driver usage of 128B WQEs when WQ_CREATE is V1.	James Smart	2017-04-24	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are two versions of a structure for queue creation and setup that the driver shares with FW. The driver was only treating as version 0. Verify WQ_CREATE with 128B WQEs in V0 and V1. Code review of another bug showed the driver passing 128B WQEs and 8 pages in WQ CREATE and V0. Code inspection/instrumentation showed that the driver uses V0 in WQ_CREATE and if the caller passes queue->entry_size 128B, the driver sets the hdr_version to V1 so all is good. When I tested the V1 WQ_CREATE, the mailbox failed causing the driver to unload. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Fix driver unload/reload operation.	James Smart	2017-04-24	2	-22/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are couple of different load/unload issues fixed with this patch. One of the issues was reported by Junichi Nomura, a patch was submitted by Johannes Thumsrhirn which did fix one of the problems but the fix in this patch separates the pring free from the queue free and does not set the parameter passed in to NULL. issues: (1) driver could not be unloaded and reloaded without some Oops or Panic occurring. (2) The driver was panicking because of a corruption in the Memory Manager when the iocb list was getting allocated. Root cause for the memory corruption was a double free of the Work Queue ring pointer memory - Freed once in the lpfc_sli4_queue_free when the CQ was destroyed and again in lpfc_sli4_queue_free when the WQ was destroyed. The pring free and the queue free were separated, the pring free was moved to the wq destroy routine because it a better fit logically to delete the ring with the wq. The checkpatch flagged several alignmenet issues that were also corrected with this patch. The mboxq was never initialed correctly before it was used by the driver this patch corrects that issue. Reported-by: Junichi Nomura <j-nomura@ce.jp.nec.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Tested-by: Junichi Nomura <j-nomura@ce.jp.nec.com>
\| *	Fix PRLI ACC rsp for NVME	James Smart	2017-04-24	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	PRLI ACC from target is FCP oriented. Word 0 was wrong. This was noticed by another nvmet-fc vendor that was testing the lpfc nvme-fc initiator with their target. Verified results with analyzer. PRLI BC B5 56 56 22 61 04 00 00 61 00 00 01 29 00 00 20 00 00 00 00 10 FF FF 00 00 00 00 20 14 00 18 28 00 00 00 00 00 00 00 00 00 00 00 00 00 00 20 00 00 00 00 9C D8 DA C9 BC 95 75 75 ACC BC B5 56 56 23 61 00 00 00 61 04 00 01 98 00 00 30 00 00 00 00 10 00 18 00 00 00 00 02 14 00 18 28 00 01 00 00 00 00 00 00 00 00 00 00 00 00 18 00 00 00 00 B0 6B 07 57 BC B5 75 75 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Fix extra line print in rqpair debug print.	James Smart	2017-04-24	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	An extra blank line was being added the the rqpair printing. Remove the extra line feed. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Remove NULL ptr check before kfree.	James Smart	2017-04-24	1	-8/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The check for NULL ptr is not necessary, kfree will check it. Removing NULL ptr check. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Remove unused defines for NVME PostBuf.	James Smart	2017-04-24	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These defines for the posting of buffers for nvmet target were not used. Removing the unused defines. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Fix spelling in comments.	James Smart	2017-04-24	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Comment should have said Repost. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Add debug messages for nvme/fcp resource allocation.	James Smart	2017-04-24	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The xri resources are split into pools for NVME and FCP IO when NVME is enabled. There was not message in the log that identified this allocation. Added debug message to log XRI split. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Fix log message in completion path.	James Smart	2017-04-24	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the lpfc_nvme_io_cmd_wqe_cmpl routine the driver was printing two pointers and the DID for the rport whenever an IO completed on a now that had transitioned to a non active state. There is no need to print the node pointer address for a node that is not active the DID should be enough to debug. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Fix rejected nvme LS Req.	James Smart	2017-04-24	1	-5/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In this case, the NVME initiator is sending an LS REQ command on an NDLP that is not MAPPED. The FW rejects it. The lpfc_nvme_ls_req routine checks for a NULL ndlp pointer but does not check the NDLP state. This allows the routine to send an LS IO when the ndlp is disconnected. Check the ndlp for NULL, actual node, Target and MAPPED or Initiator and UNMAPPED. This avoids Fabric nodes getting the Create Association or Create Connection commands. Initiators are free to Reject either Create. Also some of the messages numbers in lpfc_nvme_ls_req were changed because they were already used in other log messages. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Fix nvme unregister port timeout.	James Smart	2017-04-24	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During some link event testing it was observed that the wait_for_completion_timeout in the lpfc_nvme_unregister_port was timing out all the time. The initiator is claiming the nvme_fc_unregister_remoteport upcall is not completing the unregister in the time allotted. [ 2186.151317] lpfc 0000:07:00.0: 0:(0):6169 Unreg nvme wait failed 0 The wait_for_completion_timeout returns 0 when the wait has been outstanding for the jiffies passed by the caller. In this error message, the nvme initiator passed value 5 - meaning 5 jiffies - and this is just wrong. Calculate 5 seconds in Jiffies and pass that value from the current jiffies. Also the log message for the unregister timeout was reduced because timeout failure is the same as timeout. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
\| *	Standardize nvme SGL segment count	James Smart	2017-04-24	3	-7/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Standardize default SGL segment count for nvme target and initiator The driver needs to make them the same for clarity. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
* \|	Merge branch 'master' into for-4.12/post-merge	Jens Axboe	2017-04-25	19	-53/+108
\|\ \ \| \|/ \|/\|
\| *	Merge tag 'scsi-fixes' of ↵	Linus Torvalds	2017-04-24	1	-2/+2
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fix from James Bottomley: "Our final fix before the 4.12 release (hopefully). It's an error leg again: the fix to not bug on empty DMA transfers is returning the wrong code and confusing the block layer" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: return correct blkprep status code in case scsi_init_io() fails.
\| \| *	Merge remote-tracking branch 'mkp-scsi/4.11/scsi-fixes' into fixes	James Bottomley	2017-04-15	1	-2/+2
\| \| \|\
\| \| \| *	scsi: return correct blkprep status code in case scsi_init_io() fails.	Johannes Thumshirn	2017-04-14	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When instrumenting the SCSI layer to run into the !blk_rq_nr_phys_segments(rq) case the following warning emitted from the block layer: blk_peek_request: bad return=-22 This happens because since commit fd3fc0b4d730 ("scsi: don't BUG_ON() empty DMA transfers") we return the wrong error value from scsi_prep_fn() back to the block layer. [mkp: silenced checkpatch] Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Fixes: fd3fc0b4d730 scsi: don't BUG_ON() empty DMA transfers Cc: <stable@vger.kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
\| * \| \|	Merge tag 'scsi-fixes' of ↵	Linus Torvalds	2017-04-15	8	-12/+49
\| \|\\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is seven small fixes which are all for user visible issues that fortunately only occur in rare circumstances. The most serious is the sr one in which QEMU can cause us to read beyond the end of a buffer (I don't think it's exploitable, but just in case). The next is the sd capacity fix which means all non 512 byte sector drives greater than 2TB fail to be correctly sized. The rest are either in new drivers (qedf) or on error legs" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ipr: do not set DID_PASSTHROUGH on CHECK CONDITION scsi: aacraid: fix PCI error recovery path scsi: sd: Fix capacity calculation with 32-bit sector_t scsi: qla2xxx: Add fix to read correct register value for ISP82xx. scsi: qedf: Fix crash due to unsolicited FIP VLAN response. scsi: sr: Sanity check returned mode data scsi: sd: Consider max_xfer_blocks if opt_xfer_blocks is unusable
\| \| * \|	Merge remote-tracking branch 'mkp-scsi/4.11/scsi-fixes' into fixes	James Bottomley	2017-04-12	8	-12/+49
\| \| \|\\|
\| \| \| *	scsi: ipr: do not set DID_PASSTHROUGH on CHECK CONDITION	Mauricio Faria de Oliveira	2017-04-12	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On a dual controller setup with multipath enabled, some MEDIUM ERRORs caused both paths to be failed, thus I/O got queued/blocked since the 'queue_if_no_path' feature is enabled by default on IPR controllers. This example disabled 'queue_if_no_path' so the I/O failure is seen at the sg_dd program. Notice that after the sg_dd test-case, both paths are in 'failed' state, and both path/priority groups are in 'enabled' state (not 'active') -- which would block I/O with 'queue_if_no_path'. # sg_dd if=/dev/dm-2 bs=4096 count=1 dio=1 verbose=4 blk_sgio=0 <...> read(unix): count=4096, res=-1 sg_dd: reading, skip=0 : Input/output error <...> # dmesg [...] sd 2:2:16:0: [sds] FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [...] sd 2:2:16:0: [sds] Sense Key : Medium Error [current] [...] sd 2:2:16:0: [sds] Add. Sense: Unrecovered read error - recommend rewrite the data [...] sd 2:2:16:0: [sds] CDB: Read(10) 28 00 00 00 00 00 00 00 20 00 [...] blk_update_request: I/O error, dev sds, sector 0 [...] device-mapper: multipath: Failing path 65:32. <...> [...] device-mapper: multipath: Failing path 65:224. # multipath -l 1IBM_IPR-0_59C2AE0000001F80 dm-2 IBM ,IPR-0 59C2AE00 size=5.2T features='0' hwhandler='1 alua' wp=rw \|-+- policy='service-time 0' prio=0 status=enabled \| `- 2:2:16:0 sds 65:32 failed undef running `-+- policy='service-time 0' prio=0 status=enabled `- 1:2:7:0 sdae 65:224 failed undef running This is not the desired behavior. The dm-multipath explicitly checks for the MEDIUM ERROR case (and a few others) so not to fail the path (e.g., I/O to other sectors could potentially happen without problems). See dm-mpath.c :: do_end_io_bio() -> noretry_error() !->! fail_path(). The problem trace is: 1) ipr_scsi_done() // SENSE KEY/CHECK CONDITION detected, go to.. 2) ipr_erp_start() // ipr_is_gscsi() and masked_ioasc OK, go to.. 3) ipr_gen_sense() // masked_ioasc is IPR_IOASC_MED_DO_NOT_REALLOC, // so set DID_PASSTHROUGH. 4) scsi_decide_disposition() // check for DID_PASSTHROUGH and return // early on, faking a DID_OK.. instead // of reaching scsi_check_sense(). // Had it reached the latter, that would // set host_byte to DID_MEDIUM_ERROR. 5) scsi_finish_command() 6) scsi_io_completion() 7) __scsi_error_from_host_byte() // That would be converted to -ENODATA <...> 8) dm_softirq_done() 9) multipath_end_io() 10) do_end_io() 11) noretry_error() // And that is checked in dm-mpath :: noretry_error() // which would cause fail_path() not to be called. With this patch applied, the I/O is failed but the paths are not. This multipath device continues accepting more I/O requests without blocking. (and notice the different host byte/driver byte handling per SCSI layer). # dmesg [...] sd 2:2:7:0: [sdaf] Done: SUCCESS Result: hostbyte=0x13 driverbyte=DRIVER_OK [...] sd 2:2:7:0: [sdaf] CDB: Read(10) 28 00 00 00 00 00 00 00 40 00 [...] sd 2:2:7:0: [sdaf] Sense Key : Medium Error [current] [...] sd 2:2:7:0: [sdaf] Add. Sense: Unrecovered read error - recommend rewrite the data [...] blk_update_request: critical medium error, dev sdaf, sector 0 [...] blk_update_request: critical medium error, dev dm-6, sector 0 [...] sd 2:2:7:0: [sdaf] Done: SUCCESS Result: hostbyte=0x13 driverbyte=DRIVER_OK [...] sd 2:2:7:0: [sdaf] CDB: Read(10) 28 00 00 00 00 00 00 00 10 00 [...] sd 2:2:7:0: [sdaf] Sense Key : Medium Error [current] [...] sd 2:2:7:0: [sdaf] Add. Sense: Unrecovered read error - recommend rewrite the data [...] blk_update_request: critical medium error, dev sdaf, sector 0 [...] blk_update_request: critical medium error, dev dm-6, sector 0 [...] Buffer I/O error on dev dm-6, logical block 0, async page read # multipath -l 1IBM_IPR-0_59C2AE0000001F80 1IBM_IPR-0_59C2AE0000001F80 dm-6 IBM ,IPR-0 59C2AE00 size=5.2T features='1 queue_if_no_path' hwhandler='1 alua' wp=rw \|-+- policy='service-time 0' prio=0 status=active \| `- 2:2:7:0 sdaf 65:240 active undef running `-+- policy='service-time 0' prio=0 status=enabled `- 1:2:7:0 sdh 8:112 active undef running Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Acked-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
\| \| \| *	scsi: aacraid: fix PCI error recovery path	Guilherme G. Piccoli	2017-04-12	2	-4/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During a PCI error recovery, if aac_check_health() is not aware that a PCI error happened and we have an offline PCI channel, it might trigger some errors (like NULL pointer dereference) and inhibit the error recovery process to complete. This patch makes the health check procedure aware of PCI channel issues, and in case of error recovery process, the function aac_adapter_check_health() returns -1 and let the recovery process to complete successfully. This patch was tested on upstream kernel v4.11-rc5 in PowerPC ppc64le architecture with adapter 9005:028d (VID:DID) - the error recovery procedure was able to recover fine. Fixes: 5c63f7f710bd ("aacraid: Added EEH support") Cc: stable@vger.kernel.org # v4.6+ Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Reviewed-by: Dave Carroll <david.carroll@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
\| \| \| *	scsi: sd: Fix capacity calculation with 32-bit sector_t	Martin K. Petersen	2017-04-07	1	-2/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We previously made sure that the reported disk capacity was less than 0xffffffff blocks when the kernel was not compiled with large sector_t support (CONFIG_LBDAF). However, this check assumed that the capacity was reported in units of 512 bytes. Add a sanity check function to ensure that we only enable disks if the entire reported capacity can be expressed in terms of sector_t. Cc: <stable@vger.kernel.org> Reported-by: Steve Magnani <steve.magnani@digidescorp.com> Cc: Bart Van Assche <Bart.VanAssche@sandisk.com> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
\| \| \| *	scsi: qla2xxx: Add fix to read correct register value for ISP82xx.	Sawan Chandak	2017-04-07	1	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add fix to read correct register value for ISP82xx, during check for register disconnect.ISP82xx has different base register. Fixes: a465537ad1a4 ("qla2xxx: Disable the adapter and skip error recovery in case of register disconnect") Signed-off-by: Sawan Chandak <sawan.chandak@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Cc: <stable@vger.kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
\| \| \| *	scsi: qedf: Fix crash due to unsolicited FIP VLAN response.	Chad Dupuis	2017-04-07	2	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need to initialize qedf->fipvlan_compl in __qedf_probe so that if we receive an unsolicited FIP VLAN response, the system doesn't crash due to trying to complete an uninitialized completion. Also add a check to see if there are any waiters on the completion so we don't inadvertantly kick start the discovery process due to the unsolicited frame. Fixed the crash: <1>BUG: unable to handle kernel NULL pointer dereference at (null) <1>IP: [<ffffffff8105ed71>] __wake_up_common+0x31/0x90 <4>PGD 0 <4>Oops: 0000 [#1] SMP <4>last sysfs file: /sys/devices/system/cpu/online <4>CPU 7 <4>Modules linked in: autofs4 nfs lockd fscache auth_rpcgss nfs_acl sunrpc target_core_iblock target_core_file target_core_pscsi target_core_mod configfs bnx2fc cnic fcoe 8021q garp stp llc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 vfat fat uinput ipmi_devintf microcode power_meter acpi_ipmi ipmi_si ipmi_msghandler iTCO_wdt iTCO_vendor_support dcdbas sg joydev sb_edac edac_core lpc_ich mfd_core shpchp tg3 ptp pps_core ext4 jbd2 mbcache sr_mod cdrom sd_mod crc_t10dif qedi(U) iscsi_boot_sysfs libiscsi scsi_transport_iscsi uio qedf(U) libfcoe libfc scsi_transport_fc scsi_tgt qede(U) qed(U) ahci megaraid_sas wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib] <4> <4>Pid: 1485, comm: qedf_11_ll2 Not tainted 2.6.32-642.el6.x86_64 #1 Dell Inc. PowerEdge R730/0599V5 <4>RIP: 0010:[<ffffffff8105ed71>] [<ffffffff8105ed71>] __wake_up_common+0x31/0x90 <4>RSP: 0018:ffff881068a83d50 EFLAGS: 00010086 <4>RAX: ffffffffffffffe8 RBX: ffff88106bf42de0 RCX: 0000000000000000 <4>RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff88106bf42de0 <4>RBP: ffff881068a83d90 R08: 0000000000000000 R09: 00000000fffffffe <4>R10: 0000000000000000 R11: 000000000000000b R12: 0000000000000286 <4>R13: ffff88106bf42de8 R14: 0000000000000000 R15: 0000000000000000 <4>FS: 0000000000000000(0000) GS:ffff88089c460000(0000) knlGS:0000000000000000 <4>CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b <4>CR2: 0000000000000000 CR3: 0000000001a8d000 CR4: 00000000001407e0 <4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 <4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 <4>Process qedf_11_ll2 (pid: 1485, threadinfo ffff881068a80000, task ffff881068a70040) <4>Stack: <4> ffff88106ef00090 0000000300000001 ffff881068a83d90 ffff88106bf42de0 <4><d> 0000000000000286 ffff88106bf42dd8 ffff88106bf40a50 0000000000000002 <4><d> ffff881068a83dc0 ffffffff810634c7 ffff881000000003 000000000000000b <4>Call Trace: <4> [<ffffffff810634c7>] complete+0x47/0x60 <4> [<ffffffffa01d37e7>] qedf_fip_recv+0x1c7/0x450 [qedf] <4> [<ffffffffa01cb3cb>] qedf_ll2_recv_thread+0x33b/0x510 [qedf] <4> [<ffffffffa01cb090>] ? qedf_ll2_recv_thread+0x0/0x510 [qedf] <4> [<ffffffff810a662e>] kthread+0x9e/0xc0 <4> [<ffffffff8100c28a>] child_rip+0xa/0x20 <4> [<ffffffff810a6590>] ? kthread+0x0/0xc0 <4> [<ffffffff8100c280>] ? child_rip+0x0/0x20 <4>Code: 41 56 41 55 41 54 53 48 83 ec 18 0f 1f 44 00 00 89 75 cc 89 55 c8 4c 8d 6f 08 48 8b 57 08 41 89 cf 4d 89 c6 48 8d 42 e8 49 39 d5 <48> 8b 58 18 74 3f 48 83 eb 18 eb 0a 0f 1f 00 48 89 d8 48 8d 5a <1>RIP [<ffffffff8105ed71>] __wake_up_common+0x31/0x90 <4> RSP <ffff881068a83d50> <4>CR2: 0000000000000000 Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
\| \| \| *	scsi: sr: Sanity check returned mode data	Martin K. Petersen	2017-04-07	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Kefeng Wang discovered that old versions of the QEMU CD driver would return mangled mode data causing us to walk off the end of the buffer in an attempt to parse it. Sanity check the returned mode sense data. Cc: <stable@vger.kernel.org> Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com> Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
\| \| \| *	scsi: sd: Consider max_xfer_blocks if opt_xfer_blocks is unusable	Fam Zheng	2017-04-07	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If device reports a small max_xfer_blocks and a zero opt_xfer_blocks, we end up using BLK_DEF_MAX_SECTORS, which is wrong and r/w of that size may get error. [mkp: tweaked to avoid setting rw_max twice and added typecast] Cc: <stable@vger.kernel.org> # v4.4+ Fixes: ca369d51b3e ("block/sd: Fix device-imposed transfer length limits") Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
\| * \| \|	Merge branch 'for-linus' of git://git.kernel.dk/linux-block	Linus Torvalds	2017-04-08	1	-3/+3
\| \|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull block fixes from Jens Axboe: "Here's a pull request for 4.11-rc, fixing a set of issues mostly centered around the new scheduling framework. These have been brewing for a while, but split up into what we absolutely need in 4.11, and what we can defer until 4.12. These are well tested, on both single queue and multiqueue setups, and with and without shared tags. They fix several hangs that have happened in testing. This is obviously larger than I would have preferred at this point in time, but I don't think we can shave much off this and still get the desired results. In detail, this pull request contains: - a set of five fixes for NVMe, mostly from Christoph and one from Roland. - a series from Bart, fixing issues with dm-mq and SCSI shared tags and scheduling. Note that one of those patches commit messages may read like an optimization, but it is in fact an important fix for queue restarts in particular. - a series from Omar, most importantly fixing a hang with multiple hardware queues when we fail to get a driver tag. Another important fix in there is for resizing hardware queues, which nbd does when handling multiple sockets for one connection. - fixing an imbalance in putting the ctx for hctx request allocations from Minchan" * 'for-linus' of git://git.kernel.dk/linux-block: blk-mq: Restart a single queue if tag sets are shared dm rq: Avoid that request processing stalls sporadically scsi: Avoid that SCSI queues get stuck blk-mq: Introduce blk_mq_delay_run_hw_queue() blk-mq: remap queues when adding/removing hardware queues blk-mq-sched: fix crash in switch error path blk-mq-sched: set up scheduler tags when bringing up new queues blk-mq-sched: refactor scheduler initialization blk-mq: use the right hctx when getting a driver tag fails nvmet: fix byte swap in nvmet_parse_io_cmd nvmet: fix byte swap in nvmet_execute_write_zeroes nvmet: add missing byte swap in nvmet_get_smart_log nvme: add missing byte swap in nvme_setup_discard nvme: Correct NVMF enum values to match NVMe-oF rev 1.0 block: do not put mq context in blk_mq_alloc_request_hctx
\| * \ \ \	Merge tag 'scsi-fixes' of ↵	Linus Torvalds	2017-04-02	12	-39/+57
\| \|\ \ \ \ \| \| \| \|/ / \| \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Thirteen small fixes: The hopefully final effort to get the lpfc nvme kconfig problems sorted, there's one important sg fix (user can induce read after end of buffer) and one minor enhancement (adding an extra PCI ID to qedi). The rest are a set of minor fixes, which mostly occur as user visible in error legs or on specific devices" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: remove the duplicated checking for supporting clkscaling scsi: lpfc: fix building without debugfs support scsi: lpfc: Fix PT2PT PRLI reject scsi: hpsa: fix volume offline state scsi: libsas: fix ata xfer length scsi: scsi_dh_alua: Warn if the first argument of alua_rtpg_queue() is NULL scsi: scsi_dh_alua: Ensure that alua_activate() calls the completion function scsi: scsi_dh_alua: Check scsi_device_get() return value scsi: sg: check length passed to SG_NEXT_CMD_LEN scsi: ufshcd-platform: remove the useless cast in ERR_PTR/IS_ERR scsi: qedi: Add PCI device-ID for QL41xxx adapters. scsi: aacraid: Fix potential null access scsi: qla2xxx: Fix crash in qla2xxx_eh_abort on bad ptr
\| \| * \| \|	Merge remote-tracking branch 'mkp-scsi/4.11/scsi-fixes' into fixes	James Bottomley	2017-03-29	12	-39/+57
\| \| \|\ \ \ \| \| \| \| \|/ \| \| \| \|/\|
\| \| \| * \|	scsi: ufs: remove the duplicated checking for supporting clkscaling	Jaehoon Chung	2017-03-28	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are same conditions for checking whether supporting clkscaling or not. When ufshcd is supporting clkscaling, active_reqs should be decreased by one. [mkp: addressed comment from Bartlomiej] Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com> Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
\| \| \| * \|	scsi: lpfc: fix building without debugfs support	Arnd Bergmann	2017-03-23	2	-10/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On a randconfig build without CONFIG_SCSI_LPFC_DEBUG_FS, I ran into multiple compile failures: drivers/scsi/lpfc/lpfc_debugfs.h: In function 'lpfc_debug_dump_wq': drivers/scsi/lpfc/lpfc_debugfs.h:405:15: error: 'DUMP_FCP' undeclared (first use in this function); did you mean 'DUMP_VAR'? drivers/scsi/lpfc/lpfc_debugfs.h:405:15: note: each undeclared identifier is reported only once for each function it appears in drivers/scsi/lpfc/lpfc_debugfs.h:408:22: error: 'DUMP_NVME' undeclared (first use in this function); did you mean 'DUMP_NONE'? drivers/scsi/lpfc/lpfc_nvmet.c: In function 'lpfc_nvmet_xmt_ls_rsp_cmp': drivers/scsi/lpfc/lpfc_nvmet.c:109:2: error: implicit declaration of function 'lpfc_nvmeio_data'; did you mean 'lpfc_mem_free'? [-Werror=implicit-function-declaration] drivers/scsi/lpfc/lpfc_nvmet.c: In function 'lpfc_nvmet_xmt_fcp_op': drivers/scsi/lpfc/lpfc_nvmet.c:523:10: error: unused variable 'id' [-Werror=unused-variable] They are all trivial to fix, so I'm doing it in a combined patch here. Fixes: 1d9d5a9879ad ("scsi: lpfc: refactor debugfs queue dump routines") Fixes: bd2cdd5e400f ("scsi: lpfc: NVME Initiator: Add debugfs support") Fixes: 2b65e18202fd ("scsi: lpfc: NVME Target: Add debugfs support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
\| \| \| * \|	scsi: lpfc: Fix PT2PT PRLI reject	Dick Kennedy	2017-03-23	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	lpfc cannot establish connection with targets that send PRLI in P2P configurations. If lpfc rejects a PRLI that is sent from a target the target will not resend and will reject the PRLI send from the initiator. [mkp: applied by hand] Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
\| \| \| * \|	scsi: hpsa: fix volume offline state	Tomas Henzl	2017-03-23	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In a previous patch a hpsa_scsi_dev_t.volume_offline update line has been removed, so let us put it back.. Fixes: 85b29008d8 (hpsa: update check for logical volume status) Signed-off-by: Tomas Henzl <thenzl@redhat.com> Acked-by: Don Brace <don.brace@microsemi.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
\| \| \| * \|	scsi: libsas: fix ata xfer length	John Garry	2017-03-20	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The total ata xfer length may not be calculated properly, in that we do not use the proper method to get an sg element dma length. According to the code comment, sg_dma_len() should be used after dma_map_sg() is called. This issue was found by turning on the SMMUv3 in front of the hisi_sas controller in hip07. Multiple sg elements were being combined into a single element, but the original first element length was being use as the total xfer length. Cc: <stable@vger.kernel.org> Fixes: ff2aeb1eb64c8a4770a6 ("libata: convert to chained sg") Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
\| \| \| * \|	scsi: scsi_dh_alua: Warn if the first argument of alua_rtpg_queue() is NULL	Bart Van Assche	2017-03-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Callers must provide a valid port group to alua_rtpg_queue(). Issue a kernel warning if that is not the case. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Tang Junhui <tang.junhui@zte.com.cn> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
\| \| \| * \|	scsi: scsi_dh_alua: Ensure that alua_activate() calls the completion function	Bart Van Assche	2017-03-19	1	-5/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Callers of scsi_dh_activate(), e.g. dm-mpath, assume that this function either returns an error code or calls the completion function. Make alua_activate() call the completion function even if scsi_device_get() fails. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Tang Junhui <tang.junhui@zte.com.cn> Cc: <stable@vger.kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
\| \| \| * \|	scsi: scsi_dh_alua: Check scsi_device_get() return value	Bart Van Assche	2017-03-19	1	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Do not queue ALUA work nor call scsi_device_put() if the scsi_device_get() call fails. This patch fixes the following crash: general protection fault: 0000 [#1] SMP RIP: 0010:scsi_device_put+0xb/0x30 Call Trace: scsi_disk_put+0x2d/0x40 sd_release+0x3d/0xb0 __blkdev_put+0x29e/0x360 blkdev_put+0x49/0x170 dm_put_table_device+0x58/0xc0 [dm_mod] dm_put_device+0x70/0xc0 [dm_mod] free_priority_group+0x92/0xc0 [dm_multipath] free_multipath+0x70/0xc0 [dm_multipath] multipath_dtr+0x19/0x20 [dm_multipath] dm_table_destroy+0x67/0x120 [dm_mod] dev_suspend+0xde/0x240 [dm_mod] ctl_ioctl+0x1f5/0x520 [dm_mod] dm_ctl_ioctl+0xe/0x20 [dm_mod] do_vfs_ioctl+0x8f/0x700 SyS_ioctl+0x3c/0x70 entry_SYSCALL_64_fastpath+0x18/0xad Fixes: commit 03197b61c5ec ("scsi_dh_alua: Use workqueue for RTPG") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Tang Junhui <tang.junhui@zte.com.cn> Cc: <stable@vger.kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
\| \| \| * \|	scsi: sg: check length passed to SG_NEXT_CMD_LEN	peter chang	2017-03-17	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The user can control the size of the next command passed along, but the value passed to the ioctl isn't checked against the usable max command size. Cc: <stable@vger.kernel.org> Signed-off-by: Peter Chang <dpf@google.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
\| \| \| * \|	scsi: ufshcd-platform: remove the useless cast in ERR_PTR/IS_ERR	Tomas Winkler	2017-03-16	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	IS_ERR and ERR_PTR already forcefully cast their argument, hence there is no need for additional (complex) casting. Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>