summaryrefslogtreecommitdiffstats
path: root/include (follow)
Commit message (Collapse)AuthorAgeFilesLines
* target: use 64-bit LUNsHannes Reinecke2015-06-162-14/+15
| | | | | | | | As we're now using a list to hold the LUNs the target core can now converted to use 64-bit LUNs internally. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Drop unnecessary core_tpg_register TFO parameterNicholas Bellinger2015-06-161-2/+1
| | | | | | | | | | | | | | | | | This patch drops unnecessary target_core_fabric_ops parameter usage for core_tpg_register() during fabric driver TFO->fabric_make_tpg() se_portal_group creation callback execution. Instead, use the existing se_wwn->wwn_tf->tf_ops pointer to ensure fabric driver is really using the same TFO provided at module_init time. Also go ahead and drop the forward TFO declarations tree-wide, and handling the special case for iscsi-target discovery TPG. Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Drop se_lun->lun_active for existing percpu lun_refNicholas Bellinger2015-06-011-1/+0
| | | | | | | | | | | | | | | With se_port_t and t10_alua_tg_pt_gp_member being absored into se_lun, there is no need for an extra atomic_t based reference count for PR ALL_TG_PT=1 and ALUA access state transition. Go ahead and use the existing percpu se_lun->lun_ref instead, and convert the two special cases to percpu_ref_tryget_live() to avoid se_lun if transport_clear_lun_ref() has already been invoked to shutdown the se_lun. Cc: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Drop lun_sep_lock for se_lun->lun_se_dev RCU usageNicholas Bellinger2015-06-011-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | With se_port and t10_alua_tg_pt_gp_member being absored into se_lun, there is no need for an extra lock to protect se_lun->lun_se_dev assignment. This patch also converts backend drivers to use call_rcu() release to allow any se_device readers to complete. The call_rcu() instead of kfree_rcu() is required here because se_device is embedded into the backend driver specific structure. Also, convert se_lun->lun_stats to use atomic_long_t within the target_complete_ok_work() completion callback, and add FIXME for transport_lookup_tmr_lun() with se_lun->lun_ref. Finally, update sbp_update_unit_directory() special case usage with proper rcu_dereference_raw() and configfs symlink comment. Reported-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Chris Boot <bootc@bootc.net> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Subsume se_port + t10_alua_tg_pt_gp_member into se_lunChristoph Hellwig2015-06-011-42/+32
| | | | | | | | | | | | | | | | | | This patch eliminates all se_port + t10_alua_tg_pt_gp_member usage, and converts current users to direct se_lun pointer dereference. This includes the removal of core_export_port(), core_release_port() core_dev_export() and core_dev_unexport(). Along with conversion of special case se_lun pointer dereference within PR ALL_TG_PT=1 and ALUA access state transition UNIT_ATTENTION handling. Also, update core_enable_device_list_for_node() to reference the new per se_lun->lun_deve_list when creating a new entry, or replacing an existing one via RCU. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Simplify LUN shutdown codeBart Van Assche2015-06-011-1/+0
| | | | | | | | | | | Instead of starting a thread from transport_clear_lun_ref() that waits for LUN shutdown, wait in that function for LUN shutdown to finish. Additionally, change the return type of transport_clear_lun_ref() from int to void. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: simplify backend attribute implementationChristoph Hellwig2015-06-011-29/+0
| | | | | | | | | | | | | | | Consolidate the implementation of the backend attributes in a single file and single function per attribute show/store function instead of splitting it into multiple functions in multiple files. Also use the proper strto* helpers for exposed data types, add macros to implement the store methods for the most common data types and share the show methods between the two different attribute implementations. (Fix bogus store_pi_prot_format flag=0 return value - nab) Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: consolidate backend attribute implementationsChristoph Hellwig2015-06-012-118/+3
| | | | | | | | | | Provide a common sets of dev_attrib attributes for all devices using the generic SPC/SBC parsers, and a second one with the minimal required read-only attributes for passthrough devices. The later is only used by pscsi for now, but will be wired up for the full-passthrough TCMU use case as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: simplify backend driver registrationChristoph Hellwig2015-06-012-20/+6
| | | | | | | | | | | | | Rewrite the backend driver registration based on what we did to the fabric drivers: introduce a read-only struct target_bakckend_ops that the driver registers, which is then instanciate as a struct target_backend by the core. This allows the ops vector to be smaller and allows us to mark it const. At the same time the registration function can set up the configfs attributes, avoiding the need to add additional boilerplate code for that to the drivers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Drop left-over se_lun->lun_statusNicholas Bellinger2015-06-011-8/+0
| | | | | | | | | Now that se_portal_group->tpg_lun_hlist is a RCU protected hlist, go ahead and drop the left-over lun->lun_status usage. Reported-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Drop unused se_lun->lun_acl_listNicholas Bellinger2015-06-011-3/+0
| | | | | | | Reviewed-by: Hannes Reinecke <hare@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Convert se_tpg->acl_node_lock to ->acl_node_mutexNicholas Bellinger2015-06-011-1/+1
| | | | | | | | | | | | | | | This patch converts se_tpg->acl_node_lock to struct mutex, so that ->acl_node_acl walkers in core_clear_lun_from_tpg() can block when calling core_disable_device_list_for_node(). It also updates core_dev_add_lun() to hold ->acl_node_mutex when calling core_tpg_add_node_to_devs() to build ->lun_entry_hlist for dynamically generated se_node_acl. Reviewed-by: Hannes Reinecke <hare@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Convert se_portal_group->tpg_lun_list[] to RCU hlistNicholas Bellinger2015-06-012-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch converts the fixed size se_portal_group->tpg_lun_list[] to use modern RCU with hlist_head in order to support an arbitary number of se_lun ports per target endpoint. It includes dropping core_tpg_alloc_lun() from core_dev_add_lun(), and calling it directly from target_fabric_make_lun() to allocate a new se_lun. And add a new target_fabric_port_release() configfs item callback to invoke kfree_rcu() to release memory during se_lun->lun_group shutdown. Also now that se_node_acl->lun_entry_hlist is using RCU, convert existing tpg_lun_lock to struct mutex so core_tpg_add_node_to_devs() can perform RCU updater logic without releasing ->tpg_lun_mutex. Also, drop core_tpg_clear_object_luns() and it's single consumer in iscsi-target, which is duplicating TPG LUN shutdown logic and is current code results in a NOP. Finally, sbp-target and xen-scsiback fabric driver conversions are included, which are required due to the non-standard way they use ->tpg_lun_hlist. Reviewed-by: Hannes Reinecke <hare@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Chris Boot <bootc@bootc.net> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target/pr: Change alloc_registration to avoid pr_reg_tg_pt_lunNicholas Bellinger2015-06-011-2/+2
| | | | | | | | | | | | | | | | This patch changes __core_scsi3_do_alloc_registration() code to drop pr_reg->pr_reg_tg_pt_lun pointer usage in favor of a new pr_reg RPTI + existing pr_reg->pr_aptpl_target_lun used by APTPL metadata logic. It also includes changes to REGISTER, REGISTER_AND_MOVE and APTPL feature bit codepaths to use rcu_dereference_check() with the expected non-zero se_dev_entry->pr_kref reference held. Reviewed-by: Hannes Reinecke <hare@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target/pr: Use atomic bitop for se_dev_entry->deve_flags reservation checkNicholas Bellinger2015-06-011-1/+2
| | | | | | | | | | | | | | | | | | | | This patch converts the core_scsi3_pr_seq_non_holder() check for non reservation holding registrations to use an atomic bitop in ->deve_flags to determine if a registration is currently active. It also includes associated a set_bit() in __core_scsi3_add_registration() and clear_bit() in __core_scsi3_free_registration(), if se_dev_entry still exists, and has not already been released via se_dev_entry shutdown path in core_disable_device_list_for_node(). Also, clear_bit in core_disable_device_list_for_node as well to ensure the read-critical path in core_scsi3_pr_seq_non_holder() picks up the new state, preceeding the final kfree_rcu() call. Reported-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Cc: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Convert se_node_acl->device_list[] to RCU hlistNicholas Bellinger2015-06-012-13/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch converts se_node_acl->device_list[] table for mappedluns to modern RCU hlist_head usage in order to support an arbitrary number of node_acl lun mappings. It converts transport_lookup_*_lun() fast-path code to use RCU read path primitives when looking up se_dev_entry. It adds a new hlist_head at se_node_acl->lun_entry_hlist for this purpose. For transport_lookup_cmd_lun() code, it works with existing per-cpu se_lun->lun_ref when associating se_cmd with se_lun + se_device. Also, go ahead and update core_create_device_list_for_node() + core_free_device_list_for_node() to use ->lun_entry_hlist. It also converts se_dev_entry->pr_ref_count access to use modern struct kref counting, and updates core_disable_device_list_for_node() to kref_put() and block on se_deve->pr_comp waiting for outstanding PR special-case PR references to drop, then invoke kfree_rcu() to wait for the RCU grace period to complete before releasing memory. So now that se_node_acl->lun_entry_hlist fast path access uses RCU protected pointers, go ahead and convert remaining non-fast path RCU updater code using ->lun_entry_lock to struct mutex to allow callers to block while walking se_node_acl->lun_entry_hlist. Finally drop the left-over core_clear_initiator_node_from_tpg() that originally cleared lun_access during se_node_acl shutdown, as post RCU conversion it now becomes duplicated logic. Reviewed-by: Hannes Reinecke <hare@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: remove ->put_session methodChristoph Hellwig2015-05-311-1/+0
| | | | | | | | | | | | | | | | | | | The only instance of ->put_session is in qla2xxx, and was added by commit aaf68b ("tcm_qla2xxx: Convert to TFO->put_session() usage") with the following description: This patch converts tcm_qla2xxx code to use an internal kref_put() for se_session->sess_kref in order to ensure that qla_hw_data->hardware_lock can be held while calling qlt_unreg_sess() for the final put. But these day we're already holding the hardware lock over qlt_unreg_sess in the ->close_session callback, so we're fine without this method. (Re-add missing tcm_qla2xxx_release_session + drop put_session usage - nab) Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: remove struct target_fabric_configfs_templateChristoph Hellwig2015-05-311-26/+22
| | | | | | | | It's only embedded into struct target_fabric_configfs these days, so we might as well remove this layer of abstraction. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: put struct target_fabric_configfs on a dietChristoph Hellwig2015-05-311-7/+0
| | | | | | | | Remove all fields that are either unused or can be replaced by trivially following pointers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: don't copy fabric opsChristoph Hellwig2015-05-311-1/+1
| | | | | | | | Now that we don't need to set up ->tf_subsys we don't need to copy around the ops vector anymore. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Move task tag into struct se_cmd + support 64-bit tagsBart Van Assche2015-05-312-2/+2
| | | | | | | | | | | | | | | | | | | | | Simplify target core and target drivers by storing the task tag a.k.a. command identifier inside struct se_cmd. For several transports (e.g. SRP) tags are 64 bits wide. Hence add support for 64-bit tags. (Fix core_tmr_abort_task conversion spec warnings - nab) (Fix up usb-gadget to use 16-bit tags - HCH + bart) Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: <qla2xxx-upstream@qlogic.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Juergen Gross <jgross@suse.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: move transport ID handling to the coreChristoph Hellwig2015-05-311-33/+0
| | | | | | | | | Now that struct se_portal_group contains a protocol identifier field we can take all the code to format an parse protocol identifiers in CDBs into common code instead of leaving this to low-level drivers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: remove the get_fabric_proto_ident methodChristoph Hellwig2015-05-311-4/+0
| | | | | | | | Now that we store the protocol identifier in the tpg structure we don't need this method. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: change core_tpg_register prototypeChristoph Hellwig2015-05-312-11/+7
| | | | | | | | Remove the unneeded fabric_ptr argument, and change the type argument to pass in a SPC protocol identifier. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* lib: introduce crc_t10dif_update()Akinobu Mita2015-05-311-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | This introduces crc_t10dif_update() which enables to calculate CRC for a block which straddles multiple SG elements by calling multiple times. This also converts crc_t10dif() to use crc_t10dif_update() as they are almost same. (remove extra function call in crc_t10dif() and crc_t10dif_update - Tim + Herbert) Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-crypto@vger.kernel.org Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: target-devel@vger.kernel.org Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: ensure se_cmd->t_prot_sg is allocated when requiredAkinobu Mita2015-05-311-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Even if the device backend is initialized with protection info is enabled, some requests don't have the protection info attached for WRITE SAME command issued by block device helpers, WRITE command with WRPROTECT=0 by SG_IO ioctl, etc. So when TCM loopback fabric module is used, se_cmd->t_prot_sg is NULL for these requests and performing WRITE_INSERT of PI using software emulation by sbc_dif_generate() causes kernel crash. To fix this, introduce SCF_PASSTHROUGH_PROT_SG_TO_MEM_NOALLOC for se_cmd_flags, which is used to determine that se_cmd->t_prot_sg needs to be allocated or use pre-allocated protection information by scsi mid-layer. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Nicholas Bellinger <nab@linux-iscsi.org> Cc: Sagi Grimberg <sagig@mellanox.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: target-devel@vger.kernel.org Cc: linux-scsi@vger.kernel.org Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: move node ACL allocation to core codeChristoph Hellwig2015-05-312-5/+2
| | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: refactor init/drop_nodeacl methodsChristoph Hellwig2015-05-311-7/+2
| | | | | | | | | | | By always allocating and adding, respectively removing and freeing the se_node_acl structure in core code we can remove tons of repeated code in the init_nodeacl and drop_nodeacl routines. Additionally this now respects the get_default_queue_depth method in this code path as well. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Remove first argument of target_{get,put}_sess_cmd()Bart Van Assche2015-05-311-2/+2
| | | | | | | | | | | | | | The first argument of these two functions is always identical to se_cmd->se_sess. Hence remove the first argument. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Cc: <qla2xxx-upstream@qlogic.com> Cc: Felipe Balbi <balbi@ti.com> Cc: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: fix DPO and FUA bit checksChristoph Hellwig2015-05-311-6/+0
| | | | | | | | | | | | | | | | | | | | | Drivers may override the WCE flag, in which case the DPOFUA flag in MODE SENSE might differ from the check used to reject invalid FUA bits in sbc_check_dpofua. Also now that we reject invalid FUA bits early there is no need to duplicate the same buggy check down in the fileio code. As the DPOFUA flag controls th support for FUA bits on read and write commands as well as DPO key off all the checks off a single helper, and deprecate the emulate_dpo and emulate_fua_read attributs. This fixes various failures in the libiscsi testsuite. Personally I'd prefer to also remove the emulate_fua_write attribute as there is no good reason to disable it, but I'll leave that for a separate discussion. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* iscsi-target: remove support for obsolete markersChristophe Vu-Brugier2015-05-311-10/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Support for markers is currently broken because of a bug in iscsi_enforce_integrity_rules(): the "IFMarkInt_Reject" and "OFMarkInt_Reject" variables are always equal to 1 in iscsi_enforce_integrity_rules(). Moreover, fixed interval markers keys (IFMarker, OFMarker, IFMarkInt and OFMarkInt) are obsolete according to iSCSI RFC 7143: >From http://tools.ietf.org/html/rfc7143#section-13.25: 13.25. Obsoleted Keys This document obsoletes the following keys defined in [RFC3720]: IFMarker, OFMarker, OFMarkInt, and IFMarkInt. However, iSCSI implementations compliant to this document may still receive these obsoleted keys -- i.e., in a responder role -- in a text negotiation. When an IFMarker or OFMarker key is received, a compliant iSCSI implementation SHOULD respond with the constant "Reject" value. The implementation MAY alternatively respond with a "No" value. However, the implementation MUST NOT respond with a "NotUnderstood" value for either of these keys. When an IFMarkInt or OFMarkInt key is received, a compliant iSCSI implementation MUST respond with the constant "Reject" value. The implementation MUST NOT respond with a "NotUnderstood" value for either of these keys. This patch disables markers by turning the corresponding parameters to read-only. The default value of IFMarker and OFMarker remains "No" but the user cannot change it to "Yes" anymore. The new value of IFMarkInt and OFMarkInt is "Reject". (Drop left-over iscsi_get_value_from_number_range + make configfs parameters attrs R/W nops - nab) Signed-off-by: Christophe Vu-Brugier <cvubrugier@fastmail.fm> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Merge sbc_verify_dif_read|writeSagi Grimberg2015-05-311-5/+3
| | | | | | | | | | | | | | | | | | | | | | Instead of providing DIF verify routines for read/write that are almost identical and conditionally copy protection information, just let the caller do the right thing. Have a single sbc_dif_verify that handles an sgl (that does NOT copy any data) and a protection information copy routine used by rd_mcp and fileio backend. In the WRITE case, call sbc_dif_verify with cmd->t_prot_sg and then do the copy from it to local sgl (assuming the verify succeeded of course). In the READ case, call sbc_dif_verify with the local sgl and if it succeeds, copy it to t_prot_sg (or not if we are stripping it). (Fix apply breakage from commit c836777 - nab) Tested-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Use a PASSTHROUGH flag instead of transport_typesAndy Grover2015-05-311-4/+2
| | | | | | | | | | It seems like we only care if a transport is passthrough or not. Convert transport_type to a flags field and replace TRANSPORT_PLUGIN_* with a flag, TRANSPORT_FLAG_PASSTHROUGH. Signed-off-by: Andy Grover <agrover@redhat.com> Reviewed-by: Ilias Tsitsimpis <iliastsi@arrikto.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Move passthrough CDB parsing into a common functionAndy Grover2015-05-311-0/+2
| | | | | | | | | | | Aside from whether they handle BIDI ops or not, parsing of the CDB by kernel and user SCSI passthrough modules should be identical. Move this into a new passthrough_parse_cdb() and call it from tcm-pscsi and tcm-user. Reported-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ilias Tsitsimpis <iliastsi@arrikto.com> Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* target: Fix se_tpg_tfo->tf_subsys regression + remove tf_subsystemChristoph Hellwig2015-05-312-3/+3
| | | | | | | | | | | | | | | | | There is just one configfs subsystem in the target code, so we might as well add two helpers to reference / unreference it from the core code instead of passing pointers to it around. This fixes a regression introduced for v4.1-rc1 with commit 9ac8928e6, where configfs_depend_item() callers using se_tpg_tfo->tf_subsys would fail, because the assignment from the original target_core_subsystem[] is no longer happening at target_register_template() time. (Fix target_core_exit_configfs pointer dereference - Sagi) Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
* Merge git://git.infradead.org/intel-iommuLinus Torvalds2015-04-271-3/+15
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull intel iommu updates from David Woodhouse: "This lays a little of the groundwork for upcoming Shared Virtual Memory support — fixing some bogus #defines for capability bits and adding the new ones, and starting to use the new wider page tables where we can, in anticipation of actually filling in the new fields therein. It also allows graphics devices to be assigned to VM guests again. This got broken in 3.17 by disallowing assignment of RMRR-afflicted devices. Like USB, we do understand why there's an RMRR for graphics devices — and unlike USB, it's actually sane. So we can make an exception for graphics devices, just as we do USB controllers. Finally, tone down the warning about the X2APIC_OPT_OUT bit, due to persistent requests. X2APIC_OPT_OUT was added to the spec as a nasty hack to allow broken BIOSes to forbid us from using X2APIC when they do stupid and invasive things and would break if we did. Someone noticed that since Windows doesn't have full IOMMU support for DMA protection, setting the X2APIC_OPT_OUT bit made Windows avoid initialising the IOMMU on the graphics unit altogether. This means that it would be available for use in "driver mode", where the IOMMU registers are made available through a BAR of the graphics device and the graphics driver can do SVM all for itself. So they started setting the X2APIC_OPT_OUT bit on *all* platforms with SVM capabilities. And even the platforms which *might*, if the planets had been aligned correctly, possibly have had SVM capability but which in practice actually don't" * git://git.infradead.org/intel-iommu: iommu/vt-d: support extended root and context entries iommu/vt-d: Add new extended capabilities from v2.3 VT-d specification iommu/vt-d: Allow RMRR on graphics devices too iommu/vt-d: Print x2apic opt out info instead of printing a warning iommu/vt-d: kill bogus ecap_niotlb_iunits()
| * iommu/vt-d: Add new extended capabilities from v2.3 VT-d specificationDavid Woodhouse2015-03-251-0/+14
| | | | | | | | Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
| * iommu/vt-d: kill bogus ecap_niotlb_iunits()David Woodhouse2015-03-251-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | As far back as I can see (which right now is a draft of the v1.2 spec dating from September 2008), bits 24-31 of the Extended Capability Register have already been reserved. I have no idea why anyone ever thought there would be multiple sets of IOTLB registers, but we've never supported them and all we do is make sure we map enough MMIO space for them. Kill it dead. Those bits do actually have a different meaning now. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* | Merge tag 'nfs-for-4.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds2015-04-276-86/+15
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull NFS client updates from Trond Myklebust: "Another set of mainly bugfixes and a couple of cleanups. No new functionality in this round. Highlights include: Stable patches: - Fix a regression in /proc/self/mountstats - Fix the pNFS flexfiles O_DIRECT support - Fix high load average due to callback thread sleeping Bugfixes: - Various patches to fix the pNFS layoutcommit support - Do not cache pNFS deviceids unless server notifications are enabled - Fix a SUNRPC transport reconnection regression - make debugfs file creation failure non-fatal in SUNRPC - Another fix for circular directory warnings on NFSv4 "junctioned" mountpoints - Fix locking around NFSv4.2 fallocate() support - Truncating NFSv4 file opens should also sync O_DIRECT writes - Prevent infinite loop in rpcrdma_ep_create() Features: - Various improvements to the RDMA transport code's handling of memory registration - Various code cleanups" * tag 'nfs-for-4.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (55 commits) fs/nfs: fix new compiler warning about boolean in switch nfs: Remove unneeded casts in nfs NFS: Don't attempt to decode missing directory entries Revert "nfs: replace nfs_add_stats with nfs_inc_stats when add one" NFS: Rename idmap.c to nfs4idmap.c NFS: Move nfs_idmap.h into fs/nfs/ NFS: Remove CONFIG_NFS_V4 checks from nfs_idmap.h NFS: Add a stub for GETDEVICELIST nfs: remove WARN_ON_ONCE from nfs_direct_good_bytes nfs: fix DIO good bytes calculation nfs: Fetch MOUNTED_ON_FILEID when updating an inode sunrpc: make debugfs file creation failure non-fatal nfs: fix high load average due to callback thread sleeping NFS: Reduce time spent holding the i_mutex during fallocate() NFS: Don't zap caches on fallocate() xprtrdma: Make rpcrdma_{un}map_one() into inline functions xprtrdma: Handle non-SEND completions via a callout xprtrdma: Add "open" memreg op xprtrdma: Add "destroy MRs" memreg op xprtrdma: Add "reset MRs" memreg op ...
| * \ Merge tag 'nfs-rdma-for-4.1-1' of git://git.linux-nfs.org/projects/anna/nfs-rdmaTrond Myklebust2015-04-2334-98/+195
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | NFS: NFSoRDMA Client Changes This patch series creates an operation vector for each of the different memory registration modes. This should make it easier to one day increase credit limit, rsize, and wsize. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| | * | SUNRPC: Introduce missing well-known netidsChuck Lever2015-03-312-6/+7
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
| * | | NFS: Move nfs_idmap.h into fs/nfs/Anna Schumaker2015-04-232-69/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This file is only used internally to the NFS v4 module, so it doesn't need to be in the global include path. I also renamed it from nfs_idmap.h to nfs4idmap.h to emphasize that it's an NFSv4-only include file. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | NFS: Remove CONFIG_NFS_V4 checks from nfs_idmap.hAnna Schumaker2015-04-231-11/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The idmapper is completely internal to the NFS v4 module, so this macro will always evaluate to true. This patch also removes unnecessary includes of this file from the generic NFS client. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | sunrpc: make debugfs file creation failure non-fatalJeff Layton2015-04-231-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | v2: gracefully handle the case where some dentry pointers end up NULL and be more dilligent about zeroing out dentry pointers We currently have a problem that SELinux policy is being enforced when creating debugfs files. If a debugfs file is created as a side effect of doing some syscall, then that creation can fail if the SELinux policy for that process prevents it. This seems wrong. We don't do that for files under /proc, for instance, so Bruce has proposed a patch to fix that. While discussing that patch however, Greg K.H. stated: "No kernel code should care / fail if a debugfs function fails, so please fix up the sunrpc code first." This patch converts all of the sunrpc debugfs setup code to be void return functins, and the callers to not look for errors from those functions. This should allow rpc_clnt and rpc_xprt creation to work, even if the kernel fails to create debugfs files for some reason. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | NFS: Don't zap caches on fallocate()Anna Schumaker2015-04-231-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a GETATTR to the end of ALLOCATE and DEALLOCATE operations so we can set the updated inode size and change attribute directly. DEALLOCATE will still need to release pagecache pages, so nfs42_proc_deallocate() now calls truncate_pagecache_range() before contacting the server. Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | NFSv4: Truncating file opens should also sync O_DIRECT writesTrond Myklebust2015-03-271-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | We don't just want to sync out buffered writes, but also O_DIRECT ones. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
| * | | NFSv4.1: Allow getdeviceinfo to return notification info back to callerTrond Myklebust2015-03-271-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We are only allowed to cache deviceinfo if the server supports notifications and actually promises to call us back when changes occur. Right now, we request those notifications, but then we don't check the server's reply. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
* | | | Merge branch 'for-linus' of ↵Linus Torvalds2015-04-274-20/+47
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull fourth vfs update from Al Viro: "d_inode() annotations from David Howells (sat in for-next since before the beginning of merge window) + four assorted fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: RCU pathwalk breakage when running into a symlink overmounting something fix I_DIO_WAKEUP definition direct-io: only inc/dec inode->i_dio_count for file systems fs/9p: fix readdir() VFS: assorted d_backing_inode() annotations VFS: fs/inode.c helpers: d_inode() annotations VFS: fs/cachefiles: d_backing_inode() annotations VFS: fs library helpers: d_inode() annotations VFS: assorted weird filesystems: d_inode() annotations VFS: normal filesystems (and lustre): d_inode() annotations VFS: security/: d_inode() annotations VFS: security/: d_backing_inode() annotations VFS: net/: d_inode() annotations VFS: net/unix: d_backing_inode() annotations VFS: kernel/: d_inode() annotations VFS: audit: d_backing_inode() annotations VFS: Fix up some ->d_inode accesses in the chelsio driver VFS: Cachefiles should perform fs modifications on the top layer only VFS: AF_UNIX sockets should call mknod on the top layer only
| * | | | fix I_DIO_WAKEUP definitionEric Sandeen2015-04-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I_DIO_WAKEUP is never directly used, but fix it up anyway. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | | direct-io: only inc/dec inode->i_dio_count for file systemsJens Axboe2015-04-241-1/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | do_blockdev_direct_IO() increments and decrements the inode ->i_dio_count for each IO operation. It does this to protect against truncate of a file. Block devices don't need this sort of protection. For a capable multiqueue setup, this atomic int is the only shared state between applications accessing the device for O_DIRECT, and it presents a scaling wall for that. In my testing, as much as 30% of system time is spent incrementing and decrementing this value. A mixed read/write workload improved from ~2.5M IOPS to ~9.6M IOPS, with better latencies too. Before: clat percentiles (usec): | 1.00th=[ 33], 5.00th=[ 34], 10.00th=[ 34], 20.00th=[ 34], | 30.00th=[ 34], 40.00th=[ 34], 50.00th=[ 35], 60.00th=[ 35], | 70.00th=[ 35], 80.00th=[ 35], 90.00th=[ 37], 95.00th=[ 80], | 99.00th=[ 98], 99.50th=[ 151], 99.90th=[ 155], 99.95th=[ 155], | 99.99th=[ 165] After: clat percentiles (usec): | 1.00th=[ 95], 5.00th=[ 108], 10.00th=[ 129], 20.00th=[ 149], | 30.00th=[ 155], 40.00th=[ 161], 50.00th=[ 167], 60.00th=[ 171], | 70.00th=[ 177], 80.00th=[ 185], 90.00th=[ 201], 95.00th=[ 270], | 99.00th=[ 390], 99.50th=[ 398], 99.90th=[ 418], 99.95th=[ 422], | 99.99th=[ 438] In other setups, Robert Elliott reported seeing good performance improvements: https://lkml.org/lkml/2015/4/3/557 The more applications accessing the device, the worse it gets. Add a new direct-io flags, DIO_SKIP_DIO_COUNT, which tells do_blockdev_direct_IO() that it need not worry about incrementing or decrementing the inode i_dio_count for this caller. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Elliott, Robert (Server Storage) <elliott@hp.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>