summaryrefslogtreecommitdiffstats
path: root/include/scsi (follow)
Commit message (Collapse)AuthorAgeFilesLines
* [SCSI] fcoe: fix fcoe in a DCB environment by adding DCB notifiers to set ↵john fastabend2011-12-151-0/+3
| | | | | | | | | | | | | | | | skb priority Use DCB notifiers to set the skb priority to allow packets to be steered and tagged correctly over DCB enabled drivers that setup traffic classes. This allows queue_mapping() routines to be removed in these drivers that were previously inspecting the ethertype of every skb to mark FCoE/FIP frames. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6Linus Torvalds2011-10-299-117/+356
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (204 commits) [SCSI] qla4xxx: export address/port of connection (fix udev disk names) [SCSI] ipr: Fix BUG on adapter dump timeout [SCSI] megaraid_sas: Fix instance access in megasas_reset_timer [SCSI] hpsa: change confusing message to be more clear [SCSI] iscsi class: fix vlan configuration [SCSI] qla4xxx: fix data alignment and use nl helpers [SCSI] iscsi class: fix link local mispelling [SCSI] iscsi class: Replace iscsi_get_next_target_id with IDA [SCSI] aacraid: use lower snprintf() limit [SCSI] lpfc 8.3.27: Change driver version to 8.3.27 [SCSI] lpfc 8.3.27: T10 additions for SLI4 [SCSI] lpfc 8.3.27: Fix queue allocation failure recovery [SCSI] lpfc 8.3.27: Change algorithm for getting physical port name [SCSI] lpfc 8.3.27: Changed worst case mailbox timeout [SCSI] lpfc 8.3.27: Miscellanous logic and interface fixes [SCSI] megaraid_sas: Changelog and version update [SCSI] megaraid_sas: Add driver workaround for PERC5/1068 kdump kernel panic [SCSI] megaraid_sas: Add multiple MSI-X vector/multiple reply queue support [SCSI] megaraid_sas: Add support for MegaRAID 9360/9380 12GB/s controllers [SCSI] megaraid_sas: Clear FUSION_IN_RESET before enabling interrupts ...
| * [SCSI] iscsi class: fix vlan configurationMike Christie2011-10-201-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | Userspace was sending the priority/id part of the vlan tag and sysfs was displaying the id in the vlan file. This renames the vlan sysfs file to vlan_id to reflect that it was showing the id and to match the vlan_priority file. This also adds a ISCSI_NET_PARAM_VLAN_TAG iscsi nl command to relfect that we are sending down the vlan/priority part of the tag. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] qla4xxx: fix data alignment and use nl helpersMike Christie2011-10-201-1/+2
| | | | | | | | | | | | | | | | | | | | This has the driver use helpers for a common operation and fixes a issue where if multiple iscsi params are sent they could be sent at offsets that cause unaligned accesses. The nla helpers account for the padding needed to align properly for the driver. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] iscsi class: Replace iscsi_get_next_target_id with IDAMike Christie2011-10-201-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | Replaced the iscsi_get_next_target_id with IDA to make target-id allocation efficient for iscsi offload drivers This patch should be applied after Jonathen Cameron Patch "ida : simplified functions for id allocation" Signed-off-by: John Soni Jose <jose0here@gmail.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] libsas: fix port->dev_list lockingDan Williams2011-10-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | port->dev_list maintains a list of devices attached to a given sas root port. It needs to be mutated under a lock as contexts outside of the single-threaded-libsas-workqueue access the list via sas_find_dev_by_rphy(). Fixup locations where the list was being mutated without a lock. This is a follow-up to commit 5911e963 "[SCSI] libsas: remove expander from dev list on error", where Luben noted [1]: > 2/ We have unlocked list manipulations in sas_ex_discover_end_dev(), > sas_unregister_common_dev(), and sas_ex_discover_end_dev() Yes, I can see that and that is very unfortunate. [1]: http://marc.info/?l=linux-scsi&m=131480962006471&w=2 Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] fcoe,libfcoe: Move common code for fcoe_get_lesb to fcoe_transportBhanu Prakash Gollapudi2011-10-161-0/+2
| | | | | | | | | | | | | | | | | | Except for obtaining the netdev from lport, fcoe_get_lesb is the common code for the LLDs. Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Acked-by: Yi Zou <yi.zou@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] isci: export phy events via ->lldd_control_phy()Dan Williams2011-10-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow the sas-transport-class to update events for local phys via a new PHY_FUNC_GET_EVENTS command to ->lldd_control_phy(). Fixup drivers that are not prepared for new enum phy_func values, and unify ->lldd_control_phy() error codes. These are the SAS defined phy events that are reported in a smp-report-phy-error-log command: * /sys/class/sas_phy/<phyX>/invalid_dword_count * /sys/class/sas_phy/<phyX>/running_disparity_error_count * /sys/class/sas_phy/<phyX>/loss_of_dword_sync_count * /sys/class/sas_phy/<phyX>/phy_reset_problem_count Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] isci: atapi supportDan Williams2011-10-021-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on original implementation from Jiangbi Liu and Maciej Trela. ATAPI transfers happen in two-to-three stages. The two stage atapi commands are those that include a dma data transfer. The data transfer portion of these operations is handled by the hardware packet-dma acceleration. The three-stage commands do not have a data transfer and are handled without hardware assistance in raw frame mode. stage1: transmit host-to-device fis to notify the device of an incoming atapi cdb. Upon reception of the pio-setup-fis repost the task_context to perform the dma transfer of the cdb+data (go to stage3), or repost the task_context to transmit the cdb as a raw frame (go to stage 2). stage2: wait for hardware notification of the cdb transmission and then go to stage 3. stage3: wait for the arrival of the terminating device-to-host fis and terminate the command. To keep the implementation simple we only support ATAPI packet-dma protocol (for commands with data) to avoid needing to handle the data transfer manually (like we do for SATA-PIO). This may affect compatibility for a small number of devices (see ATA_HORKAGE_ATAPI_MOD16_DMA). If the data-transfer underruns, or encounters an error the device-to-host fis is expected to arrive in the unsolicited frame queue to pass to libata for disposition. However, in the DONE_UNEXP_FIS (data underrun) case it appears we need to craft a response. In the DONE_REG_ERR case we do receive the UF and propagate it to libsas. Signed-off-by: Maciej Trela <maciej.trela@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] libfc: cache align struct fc_exch fieldsVasu Dev2011-10-021-15/+12
| | | | | | | | | | | | | | | | | | | | cache aligned xid and ex_lock beside removing holes. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Yi Zou <yi.zou@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] libfc: cache align struct fc_fcp_pkt fieldsVasu Dev2011-10-021-28/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Re-arrange its fields to avoid padding and have better cacheline alignments. Removed not used start_time, end_time and last_pkt_time fields. This all reduced this struct size to 448 from 480 and that also reduced one cacheline on x86_64 beside eliminating 8 pads. However kept logical fields together. Signed-off-by: Vasu Dev <vasu.dev@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Yi Zou <yi.zou@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] libsas: fix warnings when checking sata/stp protocolDan Williams2011-10-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Several sas drivers legitimately check the protocol against the union of SAS_PROTOCOL_SATA and SAS_PROTOCOL_STP. Provide a SAS_PROTOCOL_STP_ALL to silence warnings like: drivers/scsi/pm8001/pm8001_sas.c:438:3: warning: case value ‘5’ not in enumerated type ‘enum sas_protocol’ [-Wswitch] drivers/scsi/mvsas/mv_sas.c:798:2: warning: case value ‘5’ not in enumerated type ‘enum sas_protocol’ [-Wswitch] drivers/scsi/mvsas/mv_sas.c:1783:2: warning: case value ‘5’ not in enumerated type ‘enum sas_protocol’ [-Wswitch] drivers/scsi/mvsas/mv_sas.c:1886:2: warning: case value ‘5’ not in enumerated type ‘enum sas_protocol’ [-Wswitch] drivers/scsi/isci/request.c:3565:2: warning: case value ‘5’ not in enumerated type ‘enum sas_protocol’ [-Wswitch] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] libsas: fix try_test_sas_gpio_gp_bit() build errorDan Williams2011-10-021-0/+7
| | | | | | | | | | | | | | | | | | | | If the user has disabled CONFIG_SCSI_SAS_HOST_SMP then libsas drivers will not be receiving smp-gpio frames and do not need this lookup code. Reported-by: Randy Dunlap <rdunlap@xenotime.net> Tested-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] libsas: Allow expander T-T attachmentsLuben Tuikov2011-10-022-2/+15
| | | | | | | | | | | | | | | | Allow expander table-to-table attachments for expanders that support it. Signed-off-by: Luben Tuikov <ltuikov@yahoo.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] libsas: sgpio write supportDan Williams2011-09-222-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add SFF-8485 v0.7 / SAS-1 smp-write-gpio register support to libsas. Defer SAS-2 support unless/until it defines an sgpio interface. Minimum implementation needed to get the lights blinking. try_test_sas_gpio_gp_bit() provides a common method to parse the incoming write data (raw bitstream), and the to_sas_gpio_gp_bit() helper routine can be used as a basis for the set/clear operations for the 'read' implementation. Host implementations parse as many bits (ODx.[012]) as are locally supported and report the number of registers successfully written. If the submitted data overruns the internal number of registers available report the write as a success with the number of bytes remaining reported in ->resid_len. Example (assuming an active backplane) set the "identify" pattern for the first 21 devices: smp_write_gpio --count=2 --data=92,49,24,92,24,92,49,24 -t 4 --index=1 /dev/bsg/sas_hostX Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] scsi_dh: Implement match callback functionHannes Reinecke2011-08-301-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | Some device handler types are not tied to the vendor/model but rather to a specific capability. Eg ALUA is supported if the 'TPGS' setting in the standard inquiry is set. This patch implements a 'match' callback for device handler which supersedes the original vendor/model lookup and implements the callback for the ALUA handler. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] scsi_dh_alua: Evaluate TPGS setting from inquiry dataHannes Reinecke2011-08-301-0/+5
| | | | | | | | | | | | | | | | | | Instead of issuing a standard inquiry from within the alua device handler we can evaluate the TPGS setting from the existing inquiry data of the sdev and save us the I/O. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] scsi scan: don't fail scans when host is in recoveryMike Christie2011-08-291-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The problem is that if we are doing a scsi scan then the device goes into recovery then we will wait for the recovery to complete. It waits because scsi-ml will send inquiries or report luns and the queueing code will have been blocked due to the host not being ready. However, if we are in recovery and then a scan is started the scan will silently fail and some devices will not be added. It is easy to hit the problem where devices do not show up with FC where we are doing tests that disrupt the target controllers. When the controller is disruprted (reboot, or setting firmware, etc), and we cause the dev loss tmo to fire then devices will be removed Then when the problem has been fixed, the rport will be scanned and devices should be added back. But if we cause another disruption before scanning has started then devices will not get added back. If the problem is not started until the scan is started then the devices will be added back. This patch fixes that problem by not failing scans when the host is in recovery. We will let scsi-ml send the IO and let the queueing and scsi error handling deal with it like is done if we went into recovery while scanning. For recovery cases where the host is being torn down then with the patch we will still fail the scan since there is not point in scanning. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] scsi: Added support for adapter and firmware resetVikas Chaudhary2011-08-271-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Added new sysfs attr 'host_reset' in scsi_sysfs.c to perform adapter or firmware reset as suggested by Mike Christie here: http://marc.info/?l=linux-scsi&m=127359347111167&w=2 user/application can write "adapter" or "firmware" on this attr and it will call newly added function hook in scsi_host_template to call LDD adapter or firmware reset implementation. Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] scsi_transport_iscsi: Added support to update initiator iscsi portVikas Chaudhary2011-08-271-0/+1
| | | | | | | | | | | | Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] scsi_transport_iscsi: Added support to update mtuVikas Chaudhary2011-08-271-0/+1
| | | | | | | | | | | | Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] scsi_transport_iscsi: Add conn login, kernel to user, event to ↵Manish Rangankar2011-08-272-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | support offload session login. Offload drivers like qla4xxx will offload the sending of the login/logout pdus still, so this patch adds iscsi_conn_login_event which is used by these types of drivers to notify userspace that the connection has changed state. It also adds a iscsi_is_session_online helper so the lld can query the sessions state field. Signed-off-by: Manish Rangankar <manish.rangankar@qlogic.com> Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] iscsi class: add bsg support to iscsi classMike Christie2011-08-272-1/+116
| | | | | | | | | | | | | | | | | | | | | | | | This patch adds bsg support to the iscsi class. There is only 1 request, the host vendor one, supported. It is expected that this would be used for things like flash updates. This patch is made over this one http://marc.info/?l=linux-scsi&m=131149780020992&w=2 Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] iscsi class: expand vlan supportMike Christie2011-08-271-2/+11
| | | | | | | | | | | | | | | | | | Add support to set vlan priority and enable/disble a vlan. Patch based on code from Vikas Chaudhary. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] iscsi class: sysfs group is_visible callout for iscsi host attrsMike Christie2011-08-272-7/+0
| | | | | | | | | | | | | | | | | | | | | | The iscsi class currently does not support writable sysfs attrs for LLD sysfs settings. This patch converts the iscsi class and driver's host attrs to use the attribute container sysfs group and the sysfs group's is_visible callout to be able to support readable or writable sysfs attrs. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] iscsi class: remove iface param maskMike Christie2011-08-272-18/+0
| | | | | | | | | | | | | | | | We can replace the iface param mask with the attr_is_visible callback. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] iscsi class: sysfs group is_visible callout for session attrsMike Christie2011-08-272-39/+3
| | | | | | | | | | | | | | | | | | | | | | The iscsi class currently does not support writable sysfs attrs for LLD sysfs settings. This patch converts the iscsi class and driver's session attrs to use the attribute container sysfs group and the sysfs group's is_visible callout to be able to support readable or writable sysfs attrs. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] iscsi cls: sysfs group is_visible callout for conn attrsMike Christie2011-08-271-0/+5
| | | | | | | | | | | | | | | | | | | | | | The iscsi class currently does not support writable sysfs attrs for LLD sysfs settings. This patch converts the iscsi class and drivers to use the attribute container sysfs group and the sysfs group's is_visible callout to be able to support readable or writable sysfs attrs. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] iscsi class: add iface representationMike Christie2011-08-272-2/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | A iscsi host can have multiple interfaces. This patch adds a new iface iscsi class for this. It exports the network settings now, and will be extended to also export iscsi initiator port settings like the isid and initiator name for drivers that can support multiple initiator ports. Based on patch from Lalit Chandivade. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] iscsi_transport: add support for net settingsMike Christie2011-08-272-0/+64
| | | | | | | | | | | | | | | | | | | | | | Allows user space (iscsiadm) to send down network configuration parameters for LLD to set private network configuration on the iSCSI adapters. Based on patch from Lalit Chandivade. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] fcoe: Move common functions to fcoe_transport libraryBhanu Prakash Gollapudi2011-08-271-0/+3
| | | | | | | | | | | | | | | | Export fcoe_get_wwn, fcoe_validate_vport_create and fcoe_wwn_to_str so that all LLDs can use these common function. Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * [SCSI] libsas: export sas_alloc_task()Dan Williams2011-08-271-24/+2
| | | | | | | | | | | | | | | | | | Now that isci has added a 3rd open coded user of this functionality just share the libsas version. Acked-by: Jack Wang <jack_wang@usish.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* | ore: RAID5 WriteBoaz Harrosh2011-10-251-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is finally the RAID5 Write support. The bigger part of this patch is not the XOR engine itself, But the read4write logic, which is a complete mini prepare_for_striping reading engine that can read scattered pages of a stripe into cache so it can be used for XOR calculation. That is, if the write was not stripe aligned. The main algorithm behind the XOR engine is the 2 dimensional array: struct __stripe_pages_2d. A drawing might save 1000 words --- __stripe_pages_2d | n = pages_in_stripe_unit; w = group_width - parity; | pages array presented to the XOR lib | | V | __1_page_stripe[0].pages --> [c0][c1]..[cw][c_par] <---| | | __1_page_stripe[1].pages --> [c0][c1]..[cw][c_par] <--- | ... | ... | __1_page_stripe[n].pages --> [c0][c1]..[cw][c_par] ^ | data added columns first then row --- The pages are put on this array columns first. .i.e: p0-of-c0, p1-of-c0, ... pn-of-c0, p0-of-c1, ... So we are doing a corner turn of the pages. Note that pages will zigzag down and left. but are put sequentially in growing order. So when the time comes to XOR the stripe, only the beginning and end of the array need be checked. We scan the array and any NULL spot will be field by pages-to-be-read. The FS that wants to support RAID5 needs to supply an operations-vector that searches a given page in cache, and specifies if the page is uptodate or need reading. All these pages to be read are put on a slave ore_io_state and synchronously read. All the pages of a stripe are read in one IO, using the scatter gather mechanism. In write we constrain our IO to only be incomplete on a single stripe. Meaning either the complete IO is within a single stripe so we might have pages to read from both beginning or end of the strip. Or we have some reading to do at beginning but end at strip boundary. The left over pages are pushed to the next IO by the API already established by previous work, where an IO offset/length combination presented to the ORE might get the length truncated and the user must re-submit the leftover pages. (Both exofs and NFS support this) But any ORE user should make it's best effort to align it's IO before hand and avoid complications. A cached ore_layout->stripe_size member can be used for that calculation. (NOTE: that ORE demands that stripe_size may not be bigger then 32bit) What else? Well read it and tell me. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
* | ore: RAID5 readBoaz Harrosh2011-10-251-3/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces the first stage of RAID5 support mainly the skip-over-raid-units when reading. For writes it inserts BLANK units, into where XOR blocks should be calculated and written to. It introduces the new "general raid maths", and the main additional parameters and components needed for raid5. Since at this stage it could corrupt future version that actually do support raid5. The enablement of raid5 mounting and setting of parity-count > 0 is disabled. So the raid5 code will never be used. Mounting of raid5 is only enabled later once the basic XOR write is also in. But if the patch "enable RAID5" is applied this code has been tested to be able to properly read raid5 volumes and is according to standard. Also it has been tested that the new maths still properly supports RAID0 and grouping code just as before. (BTW: I have found more bugs in the pnfs-obj RAID math fixed here) The ore.c file is getting too big, so new ore_raid.[hc] files are added that will include the special raid stuff that are not used in striping and mirrors. In future write support these will get bigger. When adding the ore_raid.c to Kbuild file I was forced to rename ore.ko to libore.ko. Is it possible to keep source file, say ore.c and module file ore.ko the same even if there are multiple files inside ore.ko? Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
* | ore: Make ore_calc_stripe_info EXPORT_SYMBOLBoaz Harrosh2011-10-251-0/+3
| | | | | | | | | | | | | | ore_calc_stripe_info is needed by exofs::export.c for the layout calculations. Make it exportable Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
* | ore/exofs: Change ore_check_io APIBoaz Harrosh2011-10-141-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current ore_check_io API receives a residual pointer, to report partial IO. But it is actually not used, because in a multiple devices IO there is never a linearity in the IO failure. On the other hand if every failing device is reported through a received callback measures can be taken to handle only failed devices. One at a time. This will also be needed by the objects-layout-driver for it's error reporting facility. Exofs is not currently using the new information and keeps the old behaviour of failing the complete IO in case of an error. (No partial completion) TODO: Use an ore_check_io callback to set_page_error only the failing pages. And re-dirty write pages. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
* | ore/exofs: Define new ore_verify_layoutBoaz Harrosh2011-10-141-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All users of the ore will need to check if current code supports the given layout. For example RAID5/6 is not currently supported. So move all the checks from exofs/super.c to a new ore_verify_layout() to be used by ore users. Note that any new layout should be passed through the ore_verify_layout() because the ore engine will prepare and verify some internal members of ore_layout, and assumes it's called. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
* | ore: Support for partial component tableBoaz Harrosh2011-10-141-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Users like the objlayout-driver would like to only pass a partial device table that covers the IO in question. For example exofs divides the file into raid-group-sized chunks and only serves group_width number of devices at a time. The partiality is communicated by setting ore_componets->first_dev and the array covers all logical devices from oc->first_dev upto (oc->first_dev + oc->numdevs) The ore_comp_dev() API receives a logical device index and returns the actual present device in the table. An out-of-range dev_index will BUG. Logical device index is the theoretical device index as if all the devices of a file are present. .i.e: total_devs = group_width * mirror_p1 * group_count 0 <= dev_index < total_devs Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
* | ore: cleanup: Embed an ore_striping_info inside ore_io_stateBoaz Harrosh2011-10-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | Now that each ore_io_state covers only a single raid group. A single striping_info math is needed. Embed one inside ore_io_state to cache the calculation results and eliminate an extra call. Also the outer _prepare_for_striping is removed since it does nothing. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
* | ore/exofs: Change the type of the devices array (API change)Boaz Harrosh2011-10-041-1/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the pNFS obj-LD the device table at the layout level needs to point to a device_cache node, where it is possible and likely that many layouts will point to the same device-nodes. In Exofs we have a more orderly structure where we have a single array of devices that repeats twice for a round-robin view of the device table This patch moves to a model that can be used by the pNFS obj-LD where struct ore_components holds an array of ore_dev-pointers. (ore_dev is newly defined and contains a struct osd_dev *od member) Each pointer in the array of pointers will point to a bigger user-defined dev_struct. That can be accessed by use of the container_of macro. In Exofs an __alloc_dev_table() function allocates the ore_dev-pointers array as well as an exofs_dev array, in one allocation and does the addresses dance to set everything pointing correctly. It still keeps the double allocation trick for the inodes round-robin view of the table. The device table is always allocated dynamically, also for the single device case. So it is unconditionally freed at umount. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
* | ore: Make ore_striping_info and ore_calc_stripe_info publicBoaz Harrosh2011-10-031-0/+8
| | | | | | | | | | | | | | | | | | The struct ore_striping_info will be used later in other structures. And ore_calc_stripe_info as well. Rename them make struct ore_striping_info public. ore_calc_stripe_info is still static, will be made public on first use. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
* | exofs: Remove unused data_map member from exofs_sb_infoBoaz Harrosh2011-10-031-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The struct pnfs_osd_data_map data_map member of exofs_sb_info was never used after mount. In fact all it's members were duplicated by the ore_layout structure. So just remove the duplicated information. Also removed some stupid, but perfectly supported, restrictions on layout parameters. The case where num_devices is not divisible by mirror_count+1 is perfectly fine since the rotating device view will eventually use all the devices it can get. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@tonian.com>
* | exofs: Rename struct ore_components comps => ocBoaz Harrosh2011-10-031-1/+1
|/ | | | | | | | | ore_components already has a comps member so this leads to things like comps->comps which is annoying. the name oc was already used in new code. So rename all old usage of ore_components comps => ore_components oc. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
* Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osdLinus Torvalds2011-08-071-0/+125
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.open-osd.org/linux-open-osd: ore: Make ore its own module exofs: Rename raid engine from exofs/ios.c => ore exofs: ios: Move to a per inode components & device-table exofs: Move exofs specific osd operations out of ios.c exofs: Add offset/length to exofs_get_io_state exofs: Fix truncate for the raid-groups case exofs: Small cleanup of exofs_fill_super exofs: BUG: Avoid sbi realloc exofs: Remove pnfs-osd private definitions nfs_xdr: Move nfs4_string definition out of #ifdef CONFIG_NFS_V4
| * exofs: Rename raid engine from exofs/ios.c => oreBoaz Harrosh2011-08-071-0/+125
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ORE stands for "Objects Raid Engine" This patch is a mechanical rename of everything that was in ios.c and its API declaration to an ore.c and an osd_ore.h header. The ore engine will later be used by the pnfs objects layout driver. * File ios.c => ore.c * Declaration of types and API are moved from exofs.h to a new osd_ore.h * All used types are prefixed by ore_ from their exofs_ name. * Shift includes from exofs.h to osd_ore.h so osd_ore.h is independent, include it from exofs.h. Other than a pure rename there are no other changes. Next patch will move the ore into it's own module and will export the API to be used by exofs and later the layout driver Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6Linus Torvalds2011-07-301-1/+0
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (71 commits) [SCSI] fcoe: cleanup cpu selection for incoming requests [SCSI] fcoe: add fip retry to avoid missing critical keep alive [SCSI] libfc: fix warn on in lport retry [SCSI] libfc: Remove the reference to FCP packet from scsi_cmnd in case of error [SCSI] libfc: cleanup sending SRR request [SCSI] libfc: two minor changes in comments [SCSI] libfc, fcoe: ignore rx frame with wrong xid info [SCSI] libfc: release exchg cache [SCSI] libfc: use FC_MAX_ERROR_CNT [SCSI] fcoe: remove unused ptype field in fcoe_rcv_info [SCSI] bnx2fc: Update copyright and bump version to 1.0.4 [SCSI] bnx2fc: Tx BDs cache in write tasks [SCSI] bnx2fc: Do not arm CQ when there are no CQEs [SCSI] bnx2fc: hold tgt lock when calling cmd_release [SCSI] bnx2fc: Enable support for sequence level error recovery [SCSI] bnx2fc: HSI changes for tape [SCSI] bnx2fc: Handle REC_TOV error code from firmware [SCSI] bnx2fc: REC/SRR link service request and response handling [SCSI] bnx2fc: Support 'sequence cleanup' task [SCSI] dh_rdac: Associate HBA and storage in rdac_controller to support partitions in storage ...
| * | [SCSI] fcoe: remove unused ptype field in fcoe_rcv_infoYi Zou2011-07-281-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no need to cache the ptype in fcoe_rcv_info struct as it is never used anywhere. Signed-off-by: Yi Zou <yi.zou@intel.com> Tested-by: Ross Brattain <ross.b.brattain@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
* | | Merge branch 'for-next' of ↵Linus Torvalds2011-07-271-8/+52
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: target: Convert to DIV_ROUND_UP_SECTOR_T usage for sectors / dev_max_sectors kernel.h: Add DIV_ROUND_UP_ULL and DIV_ROUND_UP_SECTOR_T macro usage iscsi-target: Add iSCSI fabric support for target v4.1 iscsi: Add Serial Number Arithmetic LT and GT into iscsi_proto.h iscsi: Use struct scsi_lun in iscsi structs instead of u8[8] iscsi: Resolve iscsi_proto.h naming conflicts with drivers/target/iscsi
| * | | iscsi: Add Serial Number Arithmetic LT and GT into iscsi_proto.hNicholas Bellinger2011-07-251-0/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch moves the iscsi_sna_lt() and iscsi_sna_lte(), along with iscsi_sna_gt() and iscsi_sna_gte() from iscsi_target_mod into static inlines inside of include/scsi/iscsi_proto.h This patch also includes the ISCSI_HDR_LEN and ISCSI_CRC_LEN definitions. (Added JesperJ simpliciation for iscsi_sna_* usage) Signed-off-by: Mark Rustad <mark.d.rustad@intel.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
| * | | iscsi: Use struct scsi_lun in iscsi structs instead of u8[8]Andy Grover2011-07-252-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct scsi_lun is also just a struct with an array of 8 octets (64 bits) but using it instead in iscsi structs lets us call scsilun_to_int without a cast, and also lets us copy it using assignment, instead of memcpy(). Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>