linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	[SCSI] libiscsi: regression: fix header digest errors	Mike Christie	2010-05-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes a regression introduced with this commit: commit d3305f3407fa3e9452079ec6cc8379067456e4aa Author: Mike Christie <michaelc@cs.wisc.edu> Date: Thu Aug 20 15:10:58 2009 -0500 [SCSI] libiscsi: don't increment cmdsn if cmd is not sent in 2.6.32. When I moved the hdr->cmdsn after init_task, I added a bug when header digests are used. The problem is that the LLD may calculate the header digest in init_task, so if we then set the cmdsn after the init_task call we change what the digest will be calculated by the target. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Cc: Stable Tree <stable@kernel.org> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
*	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6	Linus Torvalds	2010-04-06	1	-2/+3
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: [SCSI] qla1280: retain firmware for error recovery [SCSI] attirbute_container: Initialize sysfs attributes with sysfs_attr_init [SCSI] advansys: fix regression with request_firmware change [SCSI] qla2xxx: Updated version number to 8.03.02-k2. [SCSI] qla2xxx: Prevent sending mbx commands from sysfs during isp reset. [SCSI] qla2xxx: Disable MSI on qla24xx chips other than QLA2432. [SCSI] qla2xxx: Check to make sure multique and CPU affinity support is not enabled at the same time. [SCSI] qla2xxx: Correct vp_idx checking during PORT_UPDATE processing. [SCSI] qla2xxx: Honour "Extended BB credits" bit for CNAs. [SCSI] scsi_transport_fc: Make sure commands are completed when rport is offline [SCSI] libiscsi: Fix recovery slowdown regression
\| *	[SCSI] libiscsi: Fix recovery slowdown regression	Mike Christie	2010-03-27	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We could be failing/stopping a connection due to libiscsi starting recovery/cleanup, but the xmit path or scsi eh thread path could be dropping the connection at the same time. As a result the session->state gets set to failed instead of in recovery. We end up not blocking the session and so the replacement timeout never gets started and we only end up failing the IO when scsi_softirq_done sees that the cmd has been running for (cmd->allowed + 1) * rq->timeout secs. We used to fail the IO right away so users are seeing a long delay when using dm-multipath. This problem was added in 2.6.28. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Cc: stable@kernel.org Signed-off-by: James Bottomley <James.Bottomley@suse.de>
* \|	include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵	Tejun Heo	2010-03-30	1	-0/+1
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
*	[SCSI] libiscsi: Make iscsi_eh_target_reset start with session reset	Jayamohan Kallickal	2010-03-03	1	-4/+19
\| \| \| \| \| \| \| \| \| \|	The iscsi_eh_target_reset has been modified to attempt target reset only. If it fails, then iscsi_eh_session_reset will be called. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
*	[SCSI] libiscsi: reset cmd timer if cmds are making progress	Mike Christie	2010-02-17	1	-4/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch resets the cmd timer if cmds started before the timedout command are making progress. The idea is that the cmd probably timed out because we are trying to exeucte too many commands. If it turns out that the device the IO timedout on was bad or the cmd just got screwed up but other IO/devs were ok then we will will figure this out when the cmds ahead of the timed out one complete ok. This also fixes a bug where we were sort of detecting this by setting the last_timeout and last_xfer to the same value when the task was allocated. That caught the case where we never got to send any IO for it. However, if the problem had started right before we started the new task, then we were forced to wait an extra cmd timeout seconds to start the scsi eh. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
*	kfifo: rename kfifo_put... into kfifo_in... and kfifo_get... into kfifo_out...	Stefani Seibold	2009-12-22	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rename kfifo_put... into kfifo_in... to prevent miss use of old non in kernel-tree drivers ditto for kfifo_get... -> kfifo_out... Improve the prototypes of kfifo_in and kfifo_out to make the kerneldoc annotations more readable. Add mini "howto porting to the new API" in kfifo.h Signed-off-by: Stefani Seibold <stefani@seibold.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	kfifo: cleanup namespace	Stefani Seibold	2009-12-22	1	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \|	change name of __kfifo_* functions to kfifo_*, because the prefix __kfifo should be reserved for internal functions only. Signed-off-by: Stefani Seibold <stefani@seibold.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	kfifo: move out spinlock	Stefani Seibold	2009-12-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the pointer to the spinlock out of struct kfifo. Most users in tree do not actually use a spinlock, so the few exceptions now have to call kfifo_{get,put}_locked, which takes an extra argument to a spinlock. Signed-off-by: Stefani Seibold <stefani@seibold.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	kfifo: move struct kfifo in place	Stefani Seibold	2009-12-22	1	-14/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a new generic kernel FIFO implementation. The current kernel fifo API is not very widely used, because it has to many constrains. Only 17 files in the current 2.6.31-rc5 used it. FIFO's are like list's a very basic thing and a kfifo API which handles the most use case would save a lot of development time and memory resources. I think this are the reasons why kfifo is not in use: - The API is to simple, important functions are missing - A fifo can be only allocated dynamically - There is a requirement of a spinlock whether you need it or not - There is no support for data records inside a fifo So I decided to extend the kfifo in a more generic way without blowing up the API to much. The new API has the following benefits: - Generic usage: For kernel internal use and/or device driver. - Provide an API for the most use case. - Slim API: The whole API provides 25 functions. - Linux style habit. - DECLARE_KFIFO, DEFINE_KFIFO and INIT_KFIFO Macros - Direct copy_to_user from the fifo and copy_from_user into the fifo. - The kfifo itself is an in place member of the using data structure, this save an indirection access and does not waste the kernel allocator. - Lockless access: if only one reader and one writer is active on the fifo, which is the common use case, no additional locking is necessary. - Remove spinlock - give the user the freedom of choice what kind of locking to use if one is required. - Ability to handle records. Three type of records are supported: - Variable length records between 0-255 bytes, with a record size field of 1 bytes. - Variable length records between 0-65535 bytes, with a record size field of 2 bytes. - Fixed size records, which no record size field. - Preserve memory resource. - Performance! - Easy to use! This patch: Since most users want to have the kfifo as part of another object, reorganize the code to allow including struct kfifo in another data structure. This requires changing the kfifo_alloc and kfifo_init prototypes so that we pass an existing kfifo pointer into them. This patch changes the implementation and all existing users. [akpm@linux-foundation.org: fix warning] Signed-off-by: Stefani Seibold <stefani@seibold.net> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[SCSI] libiscsi: hook into ramp up/down handling	Mike Christie	2009-12-04	1	-3/+12
\| \| \| \| \| \| \| \| \| \| \|	It is rare to get a queue full with iscsi, because targets seem to just reduce the iscsi cmd window. However, there is at least one iscsi target that will throw a queue full when overloaded. This hooks the iscsi code in to the ramp up/down code, so we can handle it. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
*	[SCSI] libiscsi: add warm target reset tmf support	Mike Christie	2009-12-04	1	-72/+179
\| \| \| \| \| \| \| \| \| \|	This implements warm target reset tmf support for the scsi-ml target reset callback. Previously we would just drop the session in that callback. This patch will now try a target reset and if that fails drop the session. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
*	[SCSI] libiscsi: Check TMF state before sending PDU	Mike Christie	2009-12-04	1	-12/+101
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Patch and mail from both MikeC and HannesR: Before we're trying to send a PDU we have to check whether a TMF is active. If so and if the PDU will be affected by the TMF we should allow only Data-out PDUs to be sent. If fast_abort is set, no Data-out PDUs will be sent while a LUN reset is being processed for a affected LUN. fast_abort is now ingored during a ABORT TASK tmf. We will not send any Data-outs for a task if the task is being aborted. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
*	[SCSI] libiscsi: fix login/text checks in pdu injection code	Mike Christie	2009-12-04	1	-6/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	For some reason we used to check for the the immediate bit set and the opcocde in many places instead of just masking the opcode. In the passthrough code this is a problem because userspace may or may not have set the immediate bit and it does not have to. This fixes up the opcode checks in the passthrough code, so we mask off the opcode then check against the iscsi proto definition like is done in other places. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
*	[SCSI] modify change_queue_depth to take in reason why it is being called	Mike Christie	2009-12-04	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch modifies scsi_host_template->change_queue_depth so that it takes an argument indicating why it is being called. This will be used so that if a LLD needs to do some extra processing when handling queue fulls or later ramp ups, it can do so. This is a simple port of the drivers setting a change_queue_depth callback. In the patch I just have these LLDs adjust the queue depth if the user was requesting it. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> [Vasu.Dev: v2 Also converted pmcraid_change_queue_depth and then verified all modules compile using "make allmodconfig" for any new build warnings on X86_64. Updated original description after combing two original patches from Mike to make this patch git bisectable.] Signed-off-by: Vasu Dev <vasu.dev@intel.com> [jejb: fixed up 53c700] Signed-off-by: James Bottomley <James.Bottomley@suse.de>
*	[SCSI] libiscsi: iscsi_session_setup to allow for private space	Jayamohan Kallickal	2009-10-02	1	-2/+4
\| \| \| \| \| \| \| \| \|	This patch contains changes that allow iscsi_session_setup to allocate private space for LLD's Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com> Acked-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
*	[SCSI] libiscsi, bnx2i: make bound ep check common	Mike Christie	2009-09-12	1	-0/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bnx2i currently has a check for if a ep is properly bound, so if iscsi_queuecommand/xmit_task is called while there is no ep we will not queue IO. be2iscsi sends IO from queuecommand/xmit_task like how bnx2i does and needs a similar test. This patch has us just use the suspend_bit test for this. When ep_poll has succeeed iscsid will call conn_bind, the LLD will then call iscsi_conn_bind which will clear the suspend bit. When ep_disconnect is called (or if there is a conn error) we set the suspend bit. For the ep_disconnect case I am adding a helper in this patch that will take the session lock to make sure iscsi_queuecommand/xmit_task is not running and it will set the suspend bit. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
*	[SCSI] libiscsi: add completion function for drivers that do not need pdu ↵	Mike Christie	2009-09-12	1	-5/+33
\| \| \| \| \| \| \| \| \| \| \| \|	processing beiscsi does not need the iscsi scsi cmd processing. It does not even get this info on the completion path. This adds a function to just update the sequencing numbers and complete a task. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
*	[SCSI] libiscsi, iscsi_tcp: check suspend bit before each call to xmit_task	Mike Christie	2009-09-05	1	-9/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we had multiple tasks on the cmd or requeue lists, and iscsi_tcp returns a error, the write_space function can still run and queue iscsi_data_xmit. If it was a legetimate problem and iscsi_conn_failure was run but we raced and iscsi_data_xmit was run first it could miss the suspend bit checks, and start trying to send data again and hit another timeout. A similar problem is present when using cxgb3i. This has libiscsi check the suspend bit before calling the xmit task callout, so we at least do not try sending multiple tasks (one could be sent). Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
*	[SCSI] libiscsi: handle immediate command rejections	Mike Christie	2009-09-05	1	-19/+87
\| \| \| \| \| \| \| \| \| \| \| \| \|	If we sent multiple pdus as immediate the target could be rejecting some and we have just been dropping the rejection notification. This adds code to handle nop-out rejections, so if a nop-out was sent as a ping and rejected we do not mark the connection bad. Instead we just clean up the timers since we have pdu making a rount trip we know the connection is good. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
*	[SCSI] libiscsi: don't increment cmdsn if cmd is not sent	Mike Christie	2009-09-05	1	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We increment session->cmdsn at the top of iscsi_prep_scsi_cmd_pdu, but if the prep ecb or prep bidi or init_task calls fails then we leave the session->cmdsn incremented. This moves the cmdsn manipulation to the end of the function when we know it has succeeded. It also adds a session->cmdsn--; in queuecommand for if a driver like bnx2i tries to send a a task from that context but it fails. We do not have to do this in the xmit thread context because that code will retry the same task if the initial call fails. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@suse.de>
*	[SCSI] libiscsi: disable bh in and abort handler.	Mike Christie	2009-07-30	1	-2/+2
\| \| \| \| \| \| \| \| \|	The session lock can be held in the scsi eh thread or the completion paths run from the net softirq. This disables bhs in iscsi_eh_abort when taking the session lock. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	libiscsi: add conn and scsi eh log debug flags	Erez Zilber	2009-06-21	1	-44/+65
\| \| \| \| \| \| \| \| \| \| \| \|	Allow the user to control the debug logs in libiscsi. We will now have a module param for connection, session & error handling. [Mike Christie - Fixed up to compile on current code and added missing ISCSI_DBG_EH conversions] Signed-off-by: Erez Zilber <erezzi.list@gmail.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	libiscsi: don't run scsi eh if iscsi task is making progress	Mike Christie	2009-06-21	1	-13/+49
\| \| \| \| \| \| \| \|	If we are sending or receiving data for the task successfully do not run the scsi eh, because we know the task is making progress. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: add debug printks for iscsi command completion path	Mike Christie	2009-05-23	1	-1/+10
\| \| \| \| \| \| \| \|	This patch just adds some debug statements for the abort and completion paths. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: add task aborted state	Mike Christie	2009-05-23	1	-25/+35
\| \| \| \| \| \| \| \| \| \| \| \|	If a task did not complete normally due to a TMF, libiscsi will now complete the task with the state ISCSI_TASK_ABRT_TMF. Drivers like bnx2i that need to free resources if a command did not complete normally can then check the task state. If a driver does not need to send a special command if we have dropped the session then they can check for ISCSI_TASK_ABRT_SESS_RECOV. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: check if iscsi host has work queue before queueing work	Mike Christie	2009-05-23	1	-10/+11
\| \| \| \| \| \| \| \| \|	Instead of having libiscsi check if the offload bit is set, have it check if the lld created a work queue. I think this is more clear. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: don't let io sit in queue when session has failed	Mike Christie	2009-05-23	1	-9/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	If the session is failed, but we have not yet fully transitioned to the recovery stage we were still queueuing IO. The idea is that for some failures we can recvover at the command level and still continue to execute other IO. Well, we never have added the recovery within a command code, so queueing up IO here just creates the possibility that it might time time out so this just has us requeue the IO the scsi layer for now. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: handle cleanup task races	Mike Christie	2009-05-23	1	-109/+116
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bnx2i needs to send a hardware specific cleanup command if a command has not completed normally (iscsi/scsi response from target), and the session is still ok (this is the case when we send a TMF to stop the command). At this time it will need to drop the session lock. The problem with the current code is that fail_all_commands assumes we will hold the lock the entire time, so it uses list_for_each_entry_safe. If while bnx2i drops the session lock multiple cmds complete then list_for_each_entry_safe will not handle this correctly. This patch removes the running lists and just has us loop over the cmds array (in later patches we will then replace that array with a block tag map at the session level). It also fixes up the completion path so that if the TMF code and the normal recv path were completing the same command then they both do not try to do release the refcount taken when the task is queued. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: fix iscsi transport checks to account for slower links	Mike Christie	2009-05-23	1	-9/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we have not got any pdus for recv_timeout seconds, then we will send a iscsi ping/nop to make sure the target is still around. The problem is if this is a slow link, and the ping got queued after the data for a data_out (read), then the transport code could think the ping has failed when it is just slowly making its way through the network. This patch has us check if we are making progress while the nop is outstanding. If we are still reading in data, then we do not fail the session at that time. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: fix nop response/reply and session cleanup race	Mike Christie	2009-05-23	1	-3/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we are responding to a nop from the target by sending our nop, and the session is getting torn down, then iscsi_start_session_recovery could set the conn stop bits while the recv path is sending the nop response and we will hit the bug ons in __iscsi_conn_send_pdu. This has us check the state in __iscsi_conn_send_pdu and fail all incoming mgmt IO if we are not logged in and if the pdu is not login related. It also changes the ordering of the setting of conn stop state bits so they are set after the session state is set (both are set under the session lock). Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: have iscsi_data_in_rsp call iscsi_update_cmdsn	Mike Christie	2009-05-23	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	This has iscsi_data_in_rsp call iscsi_update_cmdsn when a pdu is completed like is done for other pdu's that are don. For libiscsi_tcp, this means that it calls iscsi_update_cmdsn when it is handling the pdu internally to only transfer data, but if there is status then it does not need to call it since the completion handling will do it. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: export iscsi_itt_to_task for bnx2i	Mike Christie	2009-05-23	1	-1/+2
\| \| \| \| \| \| \| \| \|	bnx2i needs to be able to look up mgmt task like login and nop, because it does some processing of them on the completion path. This exports iscsi_itt_to_task so it can look up the task. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: handle param allocation failures	Mike Christie	2009-05-23	1	-75/+33
\| \| \| \| \| \| \| \| \| \| \| \| \|	If we could not allocate the initiator name or some other id like the hwaddress or netdev, then userspace could deal with the failure by just running in a dregraded mode. Now we want to be able to switch values for the params and we want some feedback, so this patch will check if a string like the initiatorname could not be allocated and return an error. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: check of LLD has a alloc pdu callout.	Mike Christie	2009-05-23	1	-7/+12
\| \| \| \| \| \| \| \| \| \| \| \|	bnx2i does not have one. It currently preallocates the bdt when the session is setup. We probably want to change that to a dma pool, then allocate from the pool in the alloc pdu. Until then check if there is a alloc pdu callout. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] cxgb3i, iser, iscsi_tcp: set target can queue	Mike Christie	2009-04-27	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \|	Set target can queue limit to the number of preallocated session tasks we have. This along with the cxgb3i can_queue patch will fix a throughput problem where it could only queue one LU worth of data at a time. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: fix iscsi pool error path	Jean Delvare	2009-04-03	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Le lundi 30 mars 2009, Chris Wright a écrit : > q->queue could be ERR_PTR(-ENOMEM) which will break unwinding > on error. Make iscsi_pool_free more defensive. > Making the freeing of q->queue dependent on q->pool being set looks really weird (although it is correct at the moment. But this seems to be fixable in a much simpler way. With the benefit that only the error case is slowed down. In both cases we have a problem if q->queue contains an error value but it's not -ENOMEM. Apparently this can't happen today, but it doesn't feel right to assume this will always be true. Maybe it's the right time to fix this as well. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: fix possbile null ptr session command cleanup	Mike Christie	2009-03-13	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \|	If the iscsi eh fires when the current task is a nop, then the task->sc pointer is null. fail_all_commands could then try to do task->sc->device and oops. We actually do not need to access the curr task in this path, because if it is a cmd task the fail_command call will handle this and if it is mgmt task then the flush of the mgmt queues will handle that. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: pass session failure a session struct	Mike Christie	2009-03-13	1	-3/+2
\| \| \| \| \| \| \| \| \| \|	The api for conn and session failures is akward because one takes a conn from the lib and one takes a session from the class. This syncs up the interfaces to use structs from the lib. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] iscsi lib: remove qdepth param from iscsi host allocation	Mike Christie	2009-03-13	1	-7/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	The qdepth setting was useful when we needed libiscsi to verify the setting. Now we just need to make sure if older tools passed in zero then we need to set some default. So this patch just has us use the sht->cmd_per_lun or if for LLD does a host per session then we can set it on per host basis. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] iscsi lib: have lib create work queue for transmitting IO	Mike Christie	2009-03-13	1	-9/+36
\| \| \| \| \| \| \| \| \| \| \|	We were using the shost work queue which ended up being a little akward since all iscsi hosts need a thread for scanning, but only drivers hooked into libiscsi need a workqueue for transmitting. So this patch moves the xmit workqueue to the lib. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: don't cap queue depth in iscsi modules	Mike Christie	2009-03-13	1	-8/+1
\| \| \| \| \| \| \| \| \| \|	There is no need to cap the queue depth in the modules. We set this in userspace and can do that there. For performance testing with ram based targets, this is helpful since we can have very high queue depths. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: replace scsi_debug logging with session/conn logging	Mike Christie	2009-03-13	1	-61/+103
\| \| \| \| \| \| \| \| \|	This makes the logging a compile time option and replaces the scsi_debug macro with session and connection ones that print out a driver model id prefix. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: fix iscsi pool error path	Jean Delvare	2009-03-13	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Memory freeing in iscsi_pool_free() looks wrong to me. Either q->pool can be NULL and this should be tested before dereferencing it, or it can't be NULL and it shouldn't be tested at all. As far as I can see, the only case where q->pool is NULL is on early error in iscsi_pool_init(). One possible way to fix the bug is thus to not call iscsi_pool_free() in this case (nothing needs to be freed anyway) and then we can get rid of the q->pool check. Signed-off-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: Fix scsi command timeout oops in iscsi_eh_timed_out	Mike Christie	2009-02-10	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Yanling Qi from LSI found the root cause of the panic, below is his analysis: Problem description: the open iscsi driver installs eh_timed_out handler to the blank_transport_template of the scsi middle level that causes panic of timed out command of other host Here are the details Iscsi Session creation During iscsi session creation time, the iscsi_tcp_session_create() of iscsi_tpc.c will create a scsi-host for the session. See the statement marked with the label A. The statement B replaces the shost->transportt point with a local struct variable. static struct iscsi_cls_session * iscsi_tcp_session_create(struct iscsi_endpoint ep, uint16_t cmds_max, uint16_t qdepth, uint32_t initial_cmdsn, uint32_t hostno) { struct iscsi_cls_session cls_session; struct iscsi_session session; struct Scsi_Host shost; int cmd_i; if (ep) { printk(KERN_ERR "iscsi_tcp: invalid ep %p.\n", ep); return NULL; } A shost = iscsi_host_alloc(&iscsi_sht, 0, qdepth); if (!shost) return NULL; B shost->transportt = iscsi_tcp_scsi_transport; shost->max_lun = iscsi_max_lun; Please note the scsi host is allocated by invoking isccsi_host_alloc() in libiscsi.c Polluting the middle level blank_transport_template in iscsi_host_alloc() of libiscsi.c The iscsi_host_alloc() invokes the middle level function scsi_host_alloc() in hosts.c for allocating a scsi_host. Then the statement marked with C assigns the iscsi_eh_cmd_timed_out handler to the eh_timed_out callback function. struct Scsi_Host iscsi_host_alloc(struct scsi_host_template sht, int dd_data_size, uint16_t qdepth) { struct Scsi_Host shost; struct iscsi_host ihost; shost = scsi_host_alloc(sht, sizeof(struct iscsi_host) + dd_data_size); if (!shost) return NULL; C shost->transportt->eh_timed_out = iscsi_eh_cmd_timed_out; Please note the shost->transport is the middle level blank_transport_template as shown in the code segment below. We see two problems here. 1. iscsi_eh_cmd_timed_out is installed to the blank_transport_template that will cause some body else problem. 2. iscsi_eh_cmd_timed_out will never be invoked when iscsi command gets timeout because the statement B resets the pointer. Middle level blank_transport_template In the middle level function scsi_host_alloc() of hosts.c, the middle level assigns a blank_transport_template for those hosts not implementing its transport layer. All HBAs without supporting a specific scsi_transport will share the middle level blank_transport_template. Please see the statement D struct Scsi_Host scsi_host_alloc(struct scsi_host_template sht, int privsize) { struct Scsi_Host shost; gfp_t gfp_mask = GFP_KERNEL; int rval; if (sht->unchecked_isa_dma && privsize) gfp_mask \|= __GFP_DMA; shost = kzalloc(sizeof(struct Scsi_Host) + privsize, gfp_mask); if (!shost) return NULL; shost->host_lock = &shost->default_lock; spin_lock_init(shost->host_lock); shost->shost_state = SHOST_CREATED; INIT_LIST_HEAD(&shost->__devices); INIT_LIST_HEAD(&shost->__targets); INIT_LIST_HEAD(&shost->eh_cmd_q); INIT_LIST_HEAD(&shost->starved_list); init_waitqueue_head(&shost->host_wait); mutex_init(&shost->scan_mutex); shost->host_no = scsi_host_next_hn++; /* XXX(hch): still racy / shost->dma_channel = 0xff; / These three are default values which can be overridden / shost->max_channel = 0; shost->max_id = 8; shost->max_lun = 8; / Give each shost a default transportt / D shost->transportt = &blank_transport_template; Why we see panic at iscsi_eh_cmd_timed_out() The mpp virtual HBA doesn’t have a specific scsi_transport. Therefore, the blank_transport_template will be assigned to the virtual host of the MPP virtual HBA by SCSI middle level. Please note that the statement C has assigned iscsi-transport eh_timedout handler to the blank_transport_template. When a mpp virtual command gets timedout, the iscsi_eh_cmd_timed_out() will be invoked to handle mpp virtual command timeout from the middle level scsi_times_out() function of the scsi_error.c. enum blk_eh_timer_return scsi_times_out(struct request req) { struct scsi_cmnd scmd = req->special; enum blk_eh_timer_return (eh_timed_out)(struct scsi_cmnd ); enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED; scsi_log_completion(scmd, TIMEOUT_ERROR); if (scmd->device->host->transportt->eh_timed_out) E eh_timed_out = scmd->device->host->transportt->eh_timed_out; else if (scmd->device->host->hostt->eh_timed_out) eh_timed_out = scmd->device->host->hostt->eh_timed_out; else eh_timed_out = NULL; if (eh_timed_out) { rtn = eh_timed_out(scmd); It is very easy to understand why we get panic in the iscsi_eh_cmd_timed_out(). A scsi_cmnd from a no-iscsi device definitely can not resolve out a session and session->lock. The panic can be happed anywhere during the differencing. static enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd scmd) { struct iscsi_cls_session cls_session; struct iscsi_session session; struct iscsi_conn *conn; enum blk_eh_timer_return rc = BLK_EH_NOT_HANDLED; cls_session = starget_to_session(scsi_target(scmd->device)); session = cls_session->dd_data; debug_scsi("scsi cmd %p timedout\n", scmd); spin_lock(&session->lock); This patch fixes the problem by moving the setting of the iscsi_eh_cmd_timed_out to iscsi_add_host, which is after the LLDs have set their transport template to shost->transportt. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: fix iscsi pool leak	Mike Christie	2009-01-25	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	I am not sure what happened. It looks like we have always leaked the q->queue that is allocated from the kfifo_init call. nab finally noticed that we were leaking and this patch fixes it by adding a kfree call to iscsi_pool_free. kfifo_free is not used per kfifo_init's instructions to use kfree. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: handle init task failures.	Mike Christie	2008-12-29	1	-2/+2
\| \| \| \| \| \| \| \| \|	Mgmt setup used to not fail so we did not have to check the return value. Now with cxgb3i it can so this has us pass up a error. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: pass opcode into alloc_pdu callout	Mike Christie	2008-12-29	1	-7/+7
\| \| \| \| \| \| \| \|	We do not need to allocate a itt for data_out, so this passes the opcode to the alloc_pdu callout. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: allow drivers to modify the itt sent to the target	Mike Christie	2008-12-29	1	-13/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	bnx2i and cxgb3i need to encode LLD info in the itt so that the firmware/hardware can process the pdu. This patch allows the LLDs to encode info in the task->hdr->itt that they setup in the alloc_pdu callout (any resources that are allocated can be freed with the pdu in the cleanup_task callout). If the LLD encodes info in the itt they should implement a parse_pdu_itt callout. If parse_pdu_itt is not implemented libiscsi will do the right thing for the LLD. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
*	[SCSI] libiscsi: change login data buffer allocation	Mike Christie	2008-12-29	1	-2/+4
\| \| \| \| \| \| \| \| \|	This modifies the login buffer allocation to use __get_free_pages. It will allow drivers that want to send this data with zero copy operations to easily line things up on page boundaries. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>