summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* target: Fix ALUA transition state race between multiple initiatorsMike Christie2017-03-313-95/+51
| | | | | | | | | | | | | | | | | | | | | | | | | Multiple threads could be writing to alua_access_state at the same time, or there could be multiple STPGs in flight (different initiators sending them or one initiator sending them to different ports), or a combo of both and the core_alua_do_transition_tg_pt calls will race with each other. Because from the last patches we no longer delay running core_alua_do_transition_tg_pt_work, there does not seem to be any point in running that in a workqueue. And, we always wait for it to complete one way or another, so we can sleep in this code path. So, this patch made over target-pending just adds a mutex and does the work core_alua_do_transition_tg_pt_work was doing in core_alua_do_transition_tg_pt. There is also no need to use an atomic for the tg_pt_gp_alua_access_state. In core_alua_do_transition_tg_pt we will test and set it under the transition mutex. And, it is a int/32 bits so in the other places where it is read, we will never see it partially updated. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* iser-target: avoid posting a recv buffer twiceSagi Grimberg2017-03-312-1/+14
| | | | | | | | | | | | | | | We pre-allocate our send-queues and might overflow them in case we have multi work-request operations which tend to occur for large RDMA transfers over devices with limited allowed sg elements. When we get to a queue-full condition we might retry again later, so track our receive buffers so we don't repost them for a retry case. Reported-by: Potnuri Bharat Teja <bharat@chelsio.com> Tested-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* iser-target: Fix queue-full response handlingNicholas Bellinger2017-03-311-18/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch addresses two queue-full handling bugs in iser-target. The first is propagating isert_rdma_rw_ctx_post() return back to target-core via isert_put_datain() + isert_get_dataout() callbacks, in order to trigger queue-full logic in target-core. Note target-core expects -EAGAIN or -ENOMEM error to signal RDMA WRITE/READ data-transfer callbacks should be retried, after queue-full logic been invoked. Other types of errors propagated up from RDMA RW API will result in target-core generating internal CHECK_CONDITION status, avoiding subsequent isert_put_datain() and isert_get_dataout() iscsit_transport callback retry attempts. The second is to use transport_generic_request_failure() during T10-PI hw-offload errors in isert_rdma_write_done() and isert_rdma_read_done(), so CHECK_CONDITION queue-full is handled internally by target-core. Also add isert_put_response() T10-PI failure case fixme in isert_rdma_write_done(), which is currently not internally retried or released until session reinstatement. Reported-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com> Tested-by: Potnuri Bharat Teja <bharat@chelsio.com> Cc: Potnuri Bharat Teja <bharat@chelsio.com> Reported-by: Steve Wise <swise@opengridcomputing.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* iscsi-target: Propigate queue_data_in + queue_status errorsNicholas Bellinger2017-03-314-13/+10
| | | | | | | | | | | | | | | | | | This patch changes iscsi-target to propagate iscsit_transport ->iscsit_queue_data_in() and ->iscsit_queue_status() callback errors, back up into target-core. This allows target-core to retry failed iscsit_transport callbacks using internal queue-full logic. Reported-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com> Tested-by: Potnuri Bharat Teja <bharat@chelsio.com> Cc: Potnuri Bharat Teja <bharat@chelsio.com> Reported-by: Steve Wise <swise@opengridcomputing.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Fix unknown fabric callback queue-full errorsNicholas Bellinger2017-03-312-34/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a set of queue-full response handling bugs, where outgoing responses are leaked when a fabric driver is propagating non -EAGAIN or -ENOMEM errors to target-core. It introduces TRANSPORT_COMPLETE_QF_ERR state used to signal when CHECK_CONDITION status should be generated, when fabric driver ->write_pending(), ->queue_data_in(), or ->queue_status() callbacks fail with non -EAGAIN or -ENOMEM errors, and data-transfer should not be retried. Note all fabric driver -EAGAIN and -ENOMEM errors are still retried indefinately with associated data-transfer callbacks, following existing queue-full logic. Also fix two missing ->queue_status() queue-full cases related to CMD_T_ABORTED w/ TAS status handling. Reported-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com> Tested-by: Potnuri Bharat Teja <bharat@chelsio.com> Cc: Potnuri Bharat Teja <bharat@chelsio.com> Reported-by: Steve Wise <swise@opengridcomputing.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* tcmu: Fix wrongly calculating of the base_command_sizeXiubo Li2017-03-301-2/+8
| | | | | | | | | | | | | | | | | The t_data_nents and t_bidi_data_nents are the numbers of the segments, but it couldn't be sure the block size equals to size of the segment. For the worst case, all the blocks are discontiguous and there will need the same number of iovecs, that's to say: blocks == iovs. So here just set the number of iovs to block count needed by tcmu cmd. Tested-by: Ilias Tsitsimpis <iliastsi@arrikto.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com> Cc: stable@vger.kernel.org # 3.18+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* tcmu: Fix possible overwrite of t_data_sg's last iov[]Xiubo Li2017-03-301-11/+23
| | | | | | | | | | | | | | | | | | | | | If there has BIDI data, its first iov[] will overwrite the last iov[] for se_cmd->t_data_sg. To fix this, we can just increase the iov pointer, but this may introuduce a new memory leakage bug: If the se_cmd->data_length and se_cmd->t_bidi_data_sg->length are all not aligned up to the DATA_BLOCK_SIZE, the actual length needed maybe larger than just sum of them. So, this could be avoided by rounding all the data lengthes up to DATA_BLOCK_SIZE. Reviewed-by: Mike Christie <mchristi@redhat.com> Tested-by: Ilias Tsitsimpis <iliastsi@arrikto.com> Reviewed-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com> Cc: stable@vger.kernel.org # 3.18+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Avoid mappedlun symlink creation during lun shutdownNicholas Bellinger2017-03-303-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch closes a race between se_lun deletion during configfs unlink in target_fabric_port_unlink() -> core_dev_del_lun() -> core_tpg_remove_lun(), when transport_clear_lun_ref() blocks waiting for percpu_ref RCU grace period to finish, but a new NodeACL mappedlun is added before the RCU grace period has completed. This can happen in target_fabric_mappedlun_link() because it only checks for se_lun->lun_se_dev, which is not cleared until after transport_clear_lun_ref() percpu_ref RCU grace period finishes. This bug originally manifested as NULL pointer dereference OOPsen in target_stat_scsi_att_intr_port_show_attr_dev() on v4.1.y code, because it dereferences lun->lun_se_dev without a explicit NULL pointer check. In post v4.1 code with target-core RCU conversion, the code in target_stat_scsi_att_intr_port_show_attr_dev() no longer uses se_lun->lun_se_dev, but the same race still exists. To address the bug, go ahead and set se_lun>lun_shutdown as early as possible in core_tpg_remove_lun(), and ensure new NodeACL mappedlun creation in target_fabric_mappedlun_link() fails during se_lun shutdown. Reported-by: James Shen <jcs@datera.io> Cc: James Shen <jcs@datera.io> Tested-by: James Shen <jcs@datera.io> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* iscsi-target: Fix TMR reference leak during session shutdownNicholas Bellinger2017-03-301-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a iscsi-target specific TMR reference leak during session shutdown, that could occur when a TMR was quiesced before the hand-off back to iscsi-target code via transport_cmd_check_stop_to_fabric(). The reference leak happens because iscsit_free_cmd() was incorrectly skipping the final target_put_sess_cmd() for TMRs when transport_generic_free_cmd() returned zero because the se_cmd->cmd_kref did not reach zero, due to the missing se_cmd assignment in original code. The result was iscsi_cmd and it's associated se_cmd memory would be freed once se_sess->sess_cmd_map where released, but the associated se_tmr_req was leaked and remained part of se_device->dev_tmr_list. This bug would manfiest itself as kernel paging request OOPsen in core_tmr_lun_reset(), when a left-over se_tmr_req attempted to dereference it's se_cmd pointer that had already been released during normal session shutdown. To address this bug, go ahead and treat ISCSI_OP_SCSI_CMD and ISCSI_OP_SCSI_TMFUNC the same when there is an extra se_cmd->cmd_kref to drop in iscsit_free_cmd(), and use op_scsi to signal __iscsit_free_cmd() when the former needs to clear any further iscsi related I/O state. Reported-by: Rob Millner <rlm@daterainc.com> Cc: Rob Millner <rlm@daterainc.com> Reported-by: Chu Yuan Lin <cyl@datera.io> Cc: Chu Yuan Lin <cyl@datera.io> Tested-by: Chu Yuan Lin <cyl@datera.io> Cc: stable@vger.kernel.org # 3.10+ Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* usb: gadget: Correct usb EP argument for BOT status requestManish Narani2017-03-301-1/+1
| | | | | | | | | This patch corrects the argument in usb_ep_free_request as it is mistakenly set to ep_out. It should be ep_in for status request. Signed-off-by: Manish Narani <mnarani@xilinx.com> Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* tcmu: Allow cmd_time_out to be set to zero (disabled)Nicholas Bellinger2017-03-301-5/+0
| | | | | | | | | The new cmd_time_out configfs attribute for TCMU is allowed to be disabled, so go ahead and drop the tcmu_cmd_time_out_store() check. Reported-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* qla2xxx: Update driver version to 9.00.00.00-kHimanshu Madhani2017-03-191-3/+3
| | | | | | Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* qla2xxx: Fix delayed response to command for loop mode/direct connect.Quinn Tran2017-03-196-20/+81
| | | | | | | | | | | | | | Current driver wait for FW to be in the ready state before processing in-coming commands. For Arbitrated Loop or Point-to- Point (not switch), FW Ready state can take a while. FW will transition to ready state after all Nports have been logged in. In the mean time, certain initiators have completed the login and starts IO. Driver needs to start processing all queues if FW is already started. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* qla2xxx: Change scsi host lookup method.Quinn Tran2017-03-197-40/+100
| | | | | | | | | | | | For target mode, when new scsi command arrive, driver first performs a look up of the SCSI Host. The current look up method is based on the ALPA portion of the NPort ID. For Cisco switch, the ALPA can not be used as the index. Instead, the new search method is based on the full value of the Nport_ID via btree lib. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* qla2xxx: Add DebugFS node to display Port DatabaseHimanshu Madhani2017-03-192-4/+90
| | | | | | Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* qla2xxx: Use IOCB interface to submit non-critical MBX.Quinn Tran2017-03-196-65/+279
| | | | | | | | | | | | | | | | | | The Mailbox interface is currently over subscribed. We like to reserve the Mailbox interface for the chip managment and link initialization. Any non essential Mailbox command will be routed through the IOCB interface. The IOCB interface is able to absorb more commands. Following commands are being routed through IOCB interface - Get ID List (007Ch) - Get Port DB (0064h) - Get Link Priv Stats (006Dh) Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* qla2xxx: Add async new target notificationQuinn Tran2017-03-192-3/+4
| | | | | | Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* qla2xxx: Export DIF stats via debugfsAnil Gurumurthy2017-03-192-0/+27
| | | | | | Signed-off-by: Anil Gurumurthy <anil.gurumurthy@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* qla2xxx: Improve T10-DIF/PI handling in driver.Quinn Tran2017-03-197-251/+406
| | | | | | | | | Add routines to support T10 DIF tag. Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Anil Gurumurthy <anil.gurumurthy@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* qla2xxx: Allow relogin to proceed if remote login did not finishQuinn Tran2017-03-194-8/+32
| | | | | | | | | | | | | | If the remote port have started the login process, then the PLOGI and PRLI should be back to back. Driver will allow the remote port to complete the process. For the case where the remote port decide to back off from sending PRLI, this local port sets an expiration timer for the PRLI. Once the expiration time passes, the relogin retry logic is allowed to go through and perform login with the remote port. Signed-off-by: Quinn Tran <quinn.tran@qlogic.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* qla2xxx: Fix sess_lock & hardware_lock lock order problem.Quinn Tran2017-03-191-23/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The main lock that needs to be held for CMD or TMR submission to upper layer is the sess_lock. The sess_lock is used to serialize cmd submission and session deletion. The addition of hardware_lock being held is not necessary. This patch removes hardware_lock dependency from CMD/TMR submission. Use hardware_lock only for error response in this case. Path1 CPU0 CPU1 ---- ---- lock(&(&ha->tgt.sess_lock)->rlock); lock(&(&ha->hardware_lock)->rlock); lock(&(&ha->tgt.sess_lock)->rlock); lock(&(&ha->hardware_lock)->rlock); Path2/deadlock *** DEADLOCK *** Call Trace: dump_stack+0x85/0xc2 print_circular_bug+0x1e3/0x250 __lock_acquire+0x1425/0x1620 lock_acquire+0xbf/0x210 _raw_spin_lock_irqsave+0x53/0x70 qlt_sess_work_fn+0x21d/0x480 [qla2xxx] process_one_work+0x1f4/0x6e0 Cc: <stable@vger.kernel.org> Cc: Bart Van Assche <Bart.VanAssche@sandisk.com> Reported-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* qla2xxx: Fix inadequate lock protection for ABTS.Quinn Tran2017-03-191-2/+10
| | | | | | | | | | | Normally, ABTS is sent to Target Core as Task MGMT command. In the case of error, qla2xxx needs to send response, hardware_lock is required to prevent request queue corruption. Cc: <stable@vger.kernel.org> Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* qla2xxx: Fix request queue corruption.Quinn Tran2017-03-191-3/+9
| | | | | | | | | | | When FW notify driver or driver detects low FW resource, driver tries to send out Busy SCSI Status to tell Initiator side to back off. During the send process, the lock was not held. Cc: <stable@vger.kernel.org> Signed-off-by: Quinn Tran <quinn.tran@qlogic.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* qla2xxx: Fix memory leak for abts processingQuinn Tran2017-03-191-0/+2
| | | | | | | Cc: <stable@vger.kernel.org> Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* qla2xxx: Allow vref count to timeout on vport delete.Joe Carnuccio2017-03-195-10/+16
| | | | | | | Cc: <stable@vger.kernel.org> Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* tcmu: Convert cmd_time_out into backend device attributeNicholas Bellinger2017-03-191-26/+68
| | | | | | | | | | | | | | | | | | | | | | Instead of putting cmd_time_out under ../target/core/user_0/foo/control, which has historically been used by parameters needed for initial backend device configuration, go ahead and move cmd_time_out into a backend device attribute. In order to do this, tcmu_module_init() has been updated to create a local struct configfs_attribute **tcmu_attrs, that is based upon the existing passthrough_attrib_attrs along with the new cmd_time_out attribute. Once **tcm_attrs has been setup, go ahead and point it at tcmu_ops->tb_dev_attrib_attrs so it's picked up by target-core. Also following MNC's previous change, ->cmd_time_out is stored in milliseconds but exposed via configfs in seconds. Also, note this patch restricts the modification of ->cmd_time_out to before + after the TCMU device has been configured, but not while it has active fabric exports. Cc: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* tcmu: make cmd timeout configurableMike Christie2017-03-191-6/+35
| | | | | | | | | | | A single daemon could implement multiple types of devices using multuple types of real devices that may not support restarting from crashes and/or handling tcmu timeouts. This makes the cmd timeout configurable, so handlers that do not support it can turn if off for now. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* tcmu: add helper to check if dev was configuredMike Christie2017-03-191-2/+6
| | | | | | | | | This adds a helper to check if the dev was configured. It will be used in the next patch to prevent updates to some config settings after the device has been setup. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: fix race during implicit transition work flushesMike Christie2017-03-181-9/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes the following races: 1. core_alua_do_transition_tg_pt could have read tg_pt_gp_alua_access_state and gone into this if chunk: if (!explicit && atomic_read(&tg_pt_gp->tg_pt_gp_alua_access_state) == ALUA_ACCESS_STATE_TRANSITION) { and then core_alua_do_transition_tg_pt_work could update the state. core_alua_do_transition_tg_pt would then only set tg_pt_gp_alua_pending_state and the tg_pt_gp_alua_access_state would not get updated with the second calls state. 2. core_alua_do_transition_tg_pt could be setting tg_pt_gp_transition_complete while the tg_pt_gp_transition_work is already completing. core_alua_do_transition_tg_pt then waits on the completion that will never be called. To handle these issues, we just call flush_work which will return when core_alua_do_transition_tg_pt_work has completed so there is no need to do the complete/wait. And, if core_alua_do_transition_tg_pt_work was running, instead of trying to sneak in the state change, we just schedule up another core_alua_do_transition_tg_pt_work call. Note that this does not handle a possible race where there are multiple threads call core_alua_do_transition_tg_pt at the same time. I think we need a mutex in target_tg_pt_gp_alua_access_state_store. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: allow userspace to set state to transitioningMike Christie2017-03-181-15/+22
| | | | | | | | | Userspace target_core_user handlers like tcmu-runner may want to set the ALUA state to transitioning while it does implicit transitions. This patch allows that state when set from configfs. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: fix ALUA transition timeout handlingMike Christie2017-03-182-16/+9
| | | | | | | | | | | | | The implicit transition time tells initiators the min time to wait before timing out a transition. We currently schedule the transition to occur in tg_pt_gp_implicit_trans_secs seconds so there is no room for delays. If core_alua_do_transition_tg_pt_work->core_alua_update_tpg_primary_metadata needs to write out info to a remote file, then the initiator can easily time out the operation. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Use system workqueue for ALUA transitionsMike Christie2017-03-181-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | If tcmu-runner is processing a STPG and needs to change the kernel's ALUA state then we cannot use the same work queue for task management requests and ALUA transitions, because we could deadlock. The problem occurs when a STPG times out before tcmu-runner is able to call into target_tg_pt_gp_alua_access_state_store-> core_alua_do_port_transition -> core_alua_do_transition_tg_pt -> queue_work. In this case, the tmr is on the work queue waiting for the STPG to complete, but the STPG transition is now queued behind the waiting tmr. Note: This bug will also be fixed by this patch: http://www.spinics.net/lists/target-devel/msg14560.html which switches the tmr code to use the system workqueues. For both, I am not sure if we need a dedicated workqueue since it is not a performance path and I do not think we need WQ_MEM_RECLAIM to make forward progress to free up memory like the block layer does. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: fail ALUA transitions for pscsiMike Christie2017-03-181-0/+3
| | | | | | | | | | | | | We do not setup the LU group for pscsi devices, so if you write a state to alua_access_state that will cause a transition you will get a NULL pointer dereference. This patch will fail attempts to try and transition the path for backend devices that set the TRANSPORT_FLAG_PASSTHROUGH_ALUA flag. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: allow ALUA setup for some passthrough backendsMike Christie2017-03-184-7/+15
| | | | | | | | | | | | | | | | | | | | This patch allows passthrough backends to use the core/base LIO ALUA setup and state checks, but still handle the execution of commands. This will allow the target_core_user module to execute STPG and RTPG in userspace, and not have to duplicate the ALUA state checks, path information (needed so we can check if command is executable on specific paths) and setup (rtslib sets/updates the configfs ALUA interface like it does for iblock or file). For STPG, the target_core_user userspace daemon, tcmu-runner will still execute the STPG, and to update the core/base LIO state it will use the existing configfs interface. For RTPG, tcmu-runner will loop over configfs and/or cache the state. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* tcmu: return on first Opt parse failureMike Christie2017-03-181-0/+3
| | | | | | | | We only were returing failure if the last opt to be parsed failed. This has a return failure when we first detect a failure. Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* tcmu: allow hw_max_sectors greater than 128Mike Christie2017-03-181-19/+35
| | | | | | | | | | | | tcmu hard codes the hw_max_sectors to 128 which is a litle small. Userspace uses the max_sectors to report the optimal IO size and some initiators perform better with larger IOs (open-iscsi seems to do better with 256 to 512 depending on the test). (Fix do not display hw max sectors twice - MNC) Signed-off-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Drop pointless tfo->check_stop_free checkNicholas Bellinger2017-03-182-2/+5
| | | | | | | | | | | All in-tree fabric drivers provide a tfo->check_stop_free(), so there is no need to do the extra check within existing transport_cmd_check_stop_to_fabric() code. Just to be sure, add a check in target_fabric_tf_ops_check() to notify any out-of-tree drivers that might be missing it. Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Fix VERIFY_16 handling in sbc_parse_cdbMax Lohrmann2017-03-081-2/+8
| | | | | | | | | | | | | | | As reported by Max, the Windows 2008 R2 chkdsk utility expects VERIFY_16 to be supported, and does not handle the returned CHECK_CONDITION properly, resulting in an infinite loop. The kernel will log huge amounts of this error: kernel: TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x8f, sending CHECK_CONDITION. Signed-off-by: Max Lohrmann <post@wickenrode.com> Cc: <stable@vger.kernel.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target/pscsi: Fix TYPE_TAPE + TYPE_MEDIMUM_CHANGER exportNicholas Bellinger2017-03-081-35/+12
| | | | | | | | | | | | | | | | | | | | The following fixes a divide by zero OOPs with TYPE_TAPE due to pscsi_tape_read_blocksize() failing causing a zero sd->sector_size being propigated up via dev_attrib.hw_block_size. It also fixes another long-standing bug where TYPE_TAPE and TYPE_MEDIMUM_CHANGER where using pscsi_create_type_other(), which does not call scsi_device_get() to take the device reference. Instead, rename pscsi_create_type_rom() to pscsi_create_type_nondisk() and use it for all cases. Finally, also drop a dump_stack() in pscsi_get_blocks() for non TYPE_DISK, which in modern target-core can get invoked via target_sense_desc_format() during CHECK_CONDITION. Reported-by: Malcolm Haak <insanemal@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* Linux 4.11-rc1v4.11-rc1Linus Torvalds2017-03-051-2/+2
|
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2017-03-0586-368/+895
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking fixes from David Miller: 1) Fix double-free in batman-adv, from Sven Eckelmann. 2) Fix packet stats for fast-RX path, from Joannes Berg. 3) Netfilter's ip_route_me_harder() doesn't handle request sockets properly, fix from Florian Westphal. 4) Fix sendmsg deadlock in rxrpc, from David Howells. 5) Add missing RCU locking to transport hashtable scan, from Xin Long. 6) Fix potential packet loss in mlxsw driver, from Ido Schimmel. 7) Fix race in NAPI handling between poll handlers and busy polling, from Eric Dumazet. 8) TX path in vxlan and geneve need proper RCU locking, from Jakub Kicinski. 9) SYN processing in DCCP and TCP need to disable BH, from Eric Dumazet. 10) Properly handle net_enable_timestamp() being invoked from IRQ context, also from Eric Dumazet. 11) Fix crash on device-tree systems in xgene driver, from Alban Bedel. 12) Do not call sk_free() on a locked socket, from Arnaldo Carvalho de Melo. 13) Fix use-after-free in netvsc driver, from Dexuan Cui. 14) Fix max MTU setting in bonding driver, from WANG Cong. 15) xen-netback hash table can be allocated from softirq context, so use GFP_ATOMIC. From Anoob Soman. 16) Fix MAC address change bug in bgmac driver, from Hari Vyas. 17) strparser needs to destroy strp_wq on module exit, from WANG Cong. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (69 commits) strparser: destroy workqueue on module exit sfc: fix IPID endianness in TSOv2 sfc: avoid max() in array size rds: remove unnecessary returned value check rxrpc: Fix potential NULL-pointer exception nfp: correct DMA direction in XDP DMA sync nfp: don't tell FW about the reserved buffer space net: ethernet: bgmac: mac address change bug net: ethernet: bgmac: init sequence bug xen-netback: don't vfree() queues under spinlock xen-netback: keep a local pointer for vif in backend_disconnect() netfilter: nf_tables: don't call nfnetlink_set_err() if nfnetlink_send() fails netfilter: nft_set_rbtree: incorrect assumption on lower interval lookups netfilter: nf_conntrack_sip: fix wrong memory initialisation can: flexcan: fix typo in comment can: usb_8dev: Fix memory leak of priv->cmd_msg_buffer can: gs_usb: fix coding style can: gs_usb: Don't use stack memory for USB transfers ixgbe: Limit use of 2K buffers on architectures with 256B or larger cache lines ixgbe: update the rss key on h/w, when ethtool ask for it ...
| * strparser: destroy workqueue on module exitWANG Cong2017-03-041-0/+1
| | | | | | | | | | | | | | Fixes: 43a0c6751a32 ("strparser: Stream parser for messages") Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller2017-03-045-91/+66
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree, they are: 1) Missing check for full sock in ip_route_me_harder(), from Florian Westphal. 2) Incorrect sip helper structure initilization that breaks it when several ports are used, from Christophe Leroy. 3) Fix incorrect assumption when looking up for matching with adjacent intervals in the nft_set_rbtree. 4) Fix broken netlink event error reporting in nf_tables that results in misleading ESRCH errors propagated to userspace listeners. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * netfilter: nf_tables: don't call nfnetlink_set_err() if nfnetlink_send() failsPablo Neira Ayuso2017-03-032-81/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The underlying nlmsg_multicast() already sets sk->sk_err for us to notify socket overruns, so we should not do anything with this return value. So we just call nfnetlink_set_err() if: 1) We fail to allocate the netlink message. or 2) We don't have enough space in the netlink message to place attributes, which means that we likely need to allocate a larger message. Before this patch, the internal ESRCH netlink error code was propagated to userspace, which is quite misleading. Netlink semantics mandate that listeners just hit ENOBUFS if the socket buffer overruns. Reported-by: Alexander Alemayhu <alexander@alemayhu.com> Tested-by: Alexander Alemayhu <alexander@alemayhu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * netfilter: nft_set_rbtree: incorrect assumption on lower interval lookupsPablo Neira Ayuso2017-03-031-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case of adjacent ranges, we may indeed see either the high part of the range in first place or the low part of it. Remove this incorrect assumption, let's make sure we annotate the low part of the interval in case of we have adjacent interva intervals so we hit a matching in lookups. Reported-by: Simon Hanisch <hanisch@wh2.tu-dresden.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * netfilter: nf_conntrack_sip: fix wrong memory initialisationChristophe Leroy2017-03-031-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit 82de0be6862cd ("netfilter: Add helper array register/unregister functions"), struct nf_conntrack_helper sip[MAX_PORTS][4] was changed to sip[MAX_PORTS * 4], so the memory init should have been changed to memset(&sip[4 * i], 0, 4 * sizeof(sip[i])); But as the sip[] table is allocated in the BSS, it is already set to 0 Fixes: 82de0be6862cd ("netfilter: Add helper array register/unregister functions") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * netfilter: use skb_to_full_sk in ip_route_me_harderFlorian Westphal2017-02-281-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | inet_sk(skb->sk) is illegal in case skb is attached to request socket. Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener") Reported by: Daniel J Blueman <daniel@quora.org> Signed-off-by: Florian Westphal <fw@strlen.de> Tested-by: Daniel J Blueman <daniel@quora.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | Merge branch 'sfx-fixes'David S. Miller2017-03-031-6/+6
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Edward Cree says: ==================== sfc: couple of fixes First patch addresses a construct that causes sparse to error out. With that fixed, sparse makes some warnings on ef10.c, second patch fixes one of them. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | sfc: fix IPID endianness in TSOv2Edward Cree2017-03-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The value we read from the header is in network byte order, whereas EFX_POPULATE_QWORD_* takes values in host byte order (which it then converts to little-endian, as MCDI is little-endian). Fixes: e9117e5099ea ("sfc: Firmware-Assisted TSO version 2") Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | sfc: avoid max() in array sizeEdward Cree2017-03-031-5/+5
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | It confuses sparse, which thinks the size isn't constant. Let's achieve the same thing with a BUILD_BUG_ON, since we know which one should be bigger and don't expect them ever to change. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>