summaryrefslogtreecommitdiffstats
path: root/drivers/scsi/lpfc/lpfc_crtn.h (follow)
Commit message (Collapse)AuthorAgeFilesLines
* scsi: lpfc: Replace blk_irq_poll intr handler with threaded IRQJustin Tee2023-05-081-0/+1
| | | | | | | | | | | | | It has been determined that the threaded IRQ API accomplishes effectively the same performance metrics as blk_irq_poll. As blk_irq_poll is mostly scheduled by the softirqd and handled in softirq context, this is not entirely desired from a Fibre Channel driver context. A threaded IRQ model fits cleaner. This patch replaces the blk_irq_poll logic with threaded IRQ. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230417191558.83100-7-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Update congestion warning notification periodJustin Tee2023-05-081-1/+1
| | | | | | | | | | The CMF_SYNC_WQE command is updated to use an 8-bit field sync period. All related variables used to calculate congestion warning notifications are updated to 8-bit fields accordingly. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230417191558.83100-5-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Fix double free in lpfc_cmpl_els_logo_acc() caused by ↵Justin Tee2023-05-081-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | lpfc_nlp_not_used() Smatch detected a double free path because lpfc_nlp_not_used() releases an ndlp object before reaching lpfc_nlp_put() at the end of lpfc_cmpl_els_logo_acc(). Remove the outdated lpfc_nlp_not_used() routine. In lpfc_mbx_cmpl_ns_reg_login(), replace the call with lpfc_nlp_put(). In lpfc_cmpl_els_logo_acc(), replace the call with lpfc_unreg_rpi() and keep the lpfc_nlp_put() at the end of the routine. If ndlp's rpi was registered, then lpfc_unreg_rpi()'s completion routine performs the final ndlp clean up after lpfc_nlp_put() is called from lpfc_cmpl_els_logo_acc(). Otherwise if ndlp has no rpi registered, the lpfc_nlp_put() at the end of lpfc_cmpl_els_logo_acc() is the final ndlp clean up. Fixes: 4430f7fd09ec ("scsi: lpfc: Rework locations of ndlp reference taking") Cc: <stable@vger.kernel.org> # v5.11+ Reported-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/all/Y3OefhyyJNKH%2Fiaf@kili/ Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230417191558.83100-3-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Revise lpfc_error_lost_link() reason code evaluation logicJustin Tee2023-03-101-0/+2
| | | | | | | | | | | Extended status reason code errors should mask off the IOERR_PARAM_MASK before checking strict equalities for IOERR values. Update the lpfc_error_lost_link() routine as such. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20230301231626.9621-9-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Copyright updates for 14.2.0.10 patchesJustin Tee2023-01-121-1/+1
| | | | | | | Update copyrights to 2023 for files modified in the 14.2.0.10 patch set. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Reinitialize internal VMID data structures after FLOGI completionJustin Tee2023-01-121-0/+1
| | | | | | | | | | | After enabling VMID, an issue LIP test was erasing fabric switch VMID information. Introduce a lpfc_reinit_vmid() routine, which reinitializes all VMID data structures upon FLOGI completion in fabric topology. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Resolve miscellaneous variable set but not used compiler warningsJustin Tee2023-01-121-1/+0
| | | | | | | | | | | | | The local variables called curr_data are incremented, but not actually used for anything so they are removed. The return value of lpfc_sli4_poll_eq is not used anywhere and is not called outside of lpfc_sli.c. Thus, its declaration is removed from lpfc_crtn.h Also, lpfc_sli4_poll_eq's path argument is not used in the routine so it is removed along with corresponding macros. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Create a sysfs entry called lpfc_xcvr_data for transceiver infoJustin Tee2022-10-221-0/+3
| | | | | | | | | | | | The DUMP_MEMORY mailbox command is implemented for page A0 and A2 to retrieve transceiver information from firmware. The mailbox command output is then formatted to print raw data values for userspace to parse via sysfs. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20221017164323.14536-4-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Add reporting capability for Link Degrade SignalingJames Smart2022-09-161-0/+1
| | | | | | | | | | | | | | Firmware reports link degrade signaling via ACQES. Handlers and new additions to the SET_FEATURES mbox command are implemented so that link degrade parameters for 64GB capable links are reported through EDC ELS frames. Link: https://lore.kernel.org/r/20220911221505.117655-12-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Move scsi_host_template outside dynamically allocated/freed phbaJames Smart2022-09-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | On a PCI hotplug capable system, it is possible for scsi_device_put() to happen after lpfc_pci_remove_one() is called. As a result, the sdev->host->hostt->module dereference is for a previously freed memory location because the phba structure containing the hostt template was already freed when lpfc_pci_remove_one() returned. Since the lpfc module is still loaded during power slot disable, all scsi_host_templates should be declared as part of the global data segment instead of inside the heap allocated phba structure. This way the sdev->host->hostt memory area is always valid as long as the module is loaded regardless if PCI hotplug dynamically allocates or frees phba structures. Move all scsi_host_templates in the phba structure to global variables. Create a small helper routine to determine appropriate sg_tablesize during shost allocation. Link: https://lore.kernel.org/r/20220911221505.117655-7-jsmart2021@gmail.com Co-developed-by: Dwip N. Banerjee <dnbanerg@us.ibm.com> Signed-off-by: Dwip N. Banerjee <dnbanerg@us.ibm.com> Co-developed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Daniel Wagner <dwagner@suse.de> Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Rework MIB Rx Monitor debug info logicJames Smart2022-09-011-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | The kernel test robot reported the following sparse warning: arch/arm64/include/asm/cmpxchg.h:88:1: sparse: sparse: cast truncates bits from constant value (369 becomes 69) On arm64, atomic_xchg only works on 8-bit byte fields. Thus, the macro usage of LPFC_RXMONITOR_TABLE_IN_USE can be unintentionally truncated leading to all logic involving the LPFC_RXMONITOR_TABLE_IN_USE macro to not work properly. Replace the Rx Table atomic_t indexing logic with a new lpfc_rx_info_monitor structure that holds a circular ring buffer. For locking semantics, a spinlock_t is used. Link: https://lore.kernel.org/r/20220819011736.14141-4-jsmart2021@gmail.com Fixes: 17b27ac59224 ("scsi: lpfc: Add rx monitoring statistics") Cc: <stable@vger.kernel.org> # v5.15+ Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Refactor lpfc_nvmet_prep_abort_wqe() into lpfc_sli_prep_abort_xri()James Smart2022-07-071-1/+1
| | | | | | | | | | | | | | | lpfc_nvmet_prep_abort_wqe() has a lot of common code with lpfc_sli_prep_abort_xri(). Delete lpfc_nvmet_prep_abort_wqe() as the wqe can be filled out using the generic lpfc_sli_prep_abort_xri routine(). Add the wqec option to lpfc_sli_prep_abort_xri() for lpfc_nvmet_prep_abort_wqe(). Link: https://lore.kernel.org/r/20220701211425.2708-10-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Resolve some cleanup issues following abort path refactoringJames Smart2022-06-081-3/+1
| | | | | | | | | | | | | | | | | | | | | | Refactoring and consolidation of abort paths: - lpfc_sli4_abort_fcp_cmpl() and lpfc_sli_abort_fcp_cmpl() are combined into a single generic lpfc_sli_abort_fcp_cmpl() routine. Thus, remove extraneous lpfc_sli4_abort_fcp_cmpl() prototype declaration. - lpfc_nvme_abort_fcreq_cmpl() abort completion routine is called with a mismatched argument type. This may result in misleading log message content. Update to the correct argument type of lpfc_iocbq instead of lpfc_wcqe_complete. The lpfc_wcqe_complete should be derived from the lpfc_iocbq structure. Link: https://lore.kernel.org/r/20220603174329.63777-3-jsmart2021@gmail.com Fixes: 31a59f75702f ("scsi: lpfc: SLI path split: Refactor Abort paths") Cc: <stable@vger.kernel.org> # v5.18 Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Rework lpfc_vmid_get_appid() to be protocol independentJames Smart2022-05-201-2/+3
| | | | | | | | | | | | | | Rework lpfc_vmid_get_appid() arguments to remove scsi_cmnd dependency. The function is now callable by the NVMe I/O path. Fix up SCSI call path to accommodate the arg change. Link: https://lore.kernel.org/r/20220519123110.17361-4-jsmart2021@gmail.com Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Co-developed-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com> Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Commonize VMID code locationJames Smart2022-05-201-0/+2
| | | | | | | | | | | | | Remove VMID code from its SCSI-specific location and move to a new file solely for VMID code. Link: https://lore.kernel.org/r/20220519123110.17361-3-jsmart2021@gmail.com Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Co-developed-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com> Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Rework FDMI initialization after link upJames Smart2022-05-111-0/+1
| | | | | | | | | | | | | | | | | | | After a link up, it's possible for the switch to change FDMI support (e.g. FDMI1 vs FDMI2 vs SmartSAN). If the switch reverts to FDMI1, then the revert is currently not detected. Additionally, when NPIV is configured, it's possible the physical port's RHBA is unprocessed by the switch before reciept of an NPIV port issued RPRT. This causes some switches vendors to reject the NPIV's RPRT. Fix by reinitializing base FDMI mode on link up, and defer FDMI vport RPRT submission until after confirming physical port's RHBA is completed. Link: https://lore.kernel.org/r/20220506035519.50908-10-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Refactor cleanup of mailbox commandsJames Smart2022-04-191-1/+3
| | | | | | | | | | | | | | | | | | | | | The intention of this patch is to refactor mailbox memory allocation and cleanup steps in one routine respectively to prevent memory leaks or memory errors related to mailbox commands. There are trivial localized fixes as well. Provide lpfc_mbox_rsrc_prep() - this routine allocates the dmabuf and the mbuf associated with it. It also catches allocation errors and returns status. Provide lpfc_mbox_rsrc_cleanup() - this routine verifies a dmabuf exists and if so releases the associated mbuf and the dmabuf memory. It then sets the ctx_buf to NULL and releases the mailbox memory to the mailbox pool. Link: https://lore.kernel.org/r/20220412222008.126521-22-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Improve PCI EEH Error and Recovery HandlingJames Smart2022-03-301-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Following EEH errors, the driver can crash or hang when deleting the localport or when attempting to unload. The EEH handlers in the driver did not notify the NVMe-FC transport before tearing the driver down. This was delayed until the resume steps. This worked for SCSI because lpfc_block_scsi() would notify the scsi_fc_transport that the target was not available but it would not clean up all the references to the ndlp. The SLI3 prep for dev reset handler did the lpfc_offline_prep() and lpfc_offline() calls to get the port stopped before restarting. The SLI4 version of the prep for dev reset just destroyed the queues and did not stop NVMe from continuing. Also because the port was not really stopped the localport destroy would hang because the transport was still waiting for I/O. Additionally, a devloss tmo can fire and post events to a stopped worker thread creating another hang condition. lpfc_sli4_prep_dev_for_reset() is modified to call lpfc_offline_prep() and lpfc_offline() rather than just lpfc_scsi_dev_block() to ensure both SCSI and NVMe transports are notified to block I/O to the driver. Logic is added to devloss handler and worker thread to clean up ndlp references and quiesce appropriately. Link: https://lore.kernel.org/r/20220317032737.45308-2-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Copyright updates for 14.2.0.0 patchesJames Smart2022-03-151-1/+1
| | | | | | | | | | Update copyrights to 2022 for files modified in the 14.2.0.0 patch set. Link: https://lore.kernel.org/r/20220225022308.16486-18-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: SLI path split: Refactor Abort pathsJames Smart2022-03-151-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch refactors the Abort paths to use SLI-4 as the primary interface. - Introduce generic lpfc_sli_prep_abort_xri jump table routine - Consolidate lpfc_sli4_issue_abort_iotag and lpfc_sli_issue_abort_iotag into a single generic lpfc_sli_issue_abort_iotag routine - Consolidate lpfc_sli4_abort_fcp_cmpl and lpfc_sli_abort_fcp_cmpl into a single generic lpfc_sli_abort_fcp_cmpl routine - Remove unused routine lpfc_get_iocb_from_iocbq - Conversion away from using SLI-3 iocb structures to set/access fields in common routines. Use the new generic get/set routines that were added. This move changes code from indirect structure references to using local variables with the generic routines. - Refactor routines when setting non-generic fields, to have both SLI3 and SLI4 specific sections. This replaces the set-as-SLI3 then translate to SLI4 behavior of the past. Link: https://lore.kernel.org/r/20220225022308.16486-15-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: SLI path split: Refactor CT pathsJames Smart2022-03-151-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch refactors the CT paths to use SLI-4 as the primary interface. - Introduce generic lpfc_sli_prep_gen_req jump table routine - Introduce generic lpfc_sli_prep_xmit_seq64 jump table routine - Rename lpfcdiag_loop_post_rxbufs to lpfcdiag_sli3_loop_post_rxbufs to indicate that it is an SLI3 only path - Create new prep_wqe routine for unsolicited ELS rsp WQEs. - Conversion away from using SLI-3 iocb structures to set/access fields in common routines. Use the new generic get/set routines that were added. This move changes code from indirect structure references to using local variables with the generic routines. - Refactor routines when setting non-generic fields, to have both SLI3 and SLI4 specific sections. This replaces the set-as-SLI3 then translate to SLI4 behavior of the past. Link: https://lore.kernel.org/r/20220225022308.16486-13-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: SLI path split: Refactor base ELS paths and the FLOGI pathJames Smart2022-03-151-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch refactors the general ELS handling paths to migrate to SLI-4 structures or common element abstractions. The fabric login paths are revised as part of this patch: - New generic lpfc_sli_prep_els_req_rsp jump table routine - Introduce ls_rjt_error_be and ulp_bde64_le unions to correct legacy endianness assignments - Conversion away from using SLI-3 iocb structures to set/access fields in common routines. Use the new generic get/set routines that were added. This move changes code from indirect structure references to using local variables with the generic routines. - Refactor routines when setting non-generic fields, to have both SLI3 and SLI4 specific sections. This replaces the set-as-SLI3 then translate to SLI4 behavior of the past. - Clean up poor indentation on some of the ELS paths Link: https://lore.kernel.org/r/20220225022308.16486-5-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: SLI path split: Introduce lpfc_prep_wqeJames Smart2022-03-151-0/+1
| | | | | | | | | | | | | | | Introduce lpfc_prep_wqe routine. The lpfc_prep_wqe() routine is used with lpfc_sli_issue_iocb() and lpfc_sli_issue_iocb_wait(). The routine performs additional SLI-4 wqe field setting that the generic routines did not perform as they kept their actions compatible with both SLI3 and SLI4. Link: https://lore.kernel.org/r/20220225022308.16486-4-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: SLI path split: Refactor fast and slow paths to native SLI4James Smart2022-03-151-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert the SLI4 fast and slow paths to use native SLI4 wqe constructs instead of iocb SLI3-isms. Includes the following: - Create simple get_xxx and set_xxx routines to wrapper access to common elements in both SLI3 and SLI4 commands - allowing calling routines to avoid sli-rev-specific structures to access the elements. - using the wqe in the job structure as the primary element - use defines from SLI-4, not SLI-3 - Removal of iocb to wqe conversion from fast and slow path - Add below routines to handle fast path lpfc_prep_embed_io - prepares the wqe for fast path lpfc_wqe_bpl2sgl - manages bpl to sgl conversion lpfc_sli_wqe2iocb - converts a WQE to IOCB for SLI-3 path - Add lpfc_sli3_iocb2wcqecmpl in completion path to convert an SLI-3 iocb completion to wcqe completion - Refactor some of the code that works on both revs for clarity Link: https://lore.kernel.org/r/20220225022308.16486-3-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: SLI path split: Refactor lpfc_iocbqJames Smart2022-03-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, SLI3 and SLI4 data paths use the same lpfc_iocbq structure. This is a "common" structure but many of the components refer to sli-rev specific entities which can lead the developer astray as to what they actually mean, should be set to, or when they should be used. This first patch prepares the lpfc_iocbq structure so that elements common to both SLI3 and SLI4 data paths are more appropriately named, making it clear they apply generically. Fieldnames based on 'iocb' (sli3) or 'wqe' (sli4) which are actually generic to the paths are renamed to 'cmd': - iocb_flag is renamed to cmd_flag - lpfc_vmid_iocb_tag is renamed to lpfc_vmid_tag - fabric_iocb_cmpl is renamed to fabric_cmd_cmpl - wait_iocb_cmpl is renamed to wait_cmd_cmpl - iocb_cmpl and wqe_cmpl are combined and renamed to cmd_cmpl - rsvd2 member is renamed to num_bdes due to pre-existing usage The structure name itself will retain the iocb reference as changing to a more relevant "job" or "cmd" title induces many hundreds of line changes for only a name change. lpfc_post_buffer is also renamed to lpfc_sli3_post_buffer to indicate use in the SLI3 path only. Link: https://lore.kernel.org/r/20220225022308.16486-2-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Allow fabric node recovery if recovery is in progress before devlossJames Smart2021-10-211-0/+2
| | | | | | | | | | | | | | | | | | | | | A link bounce to a slow fabric may observe FDISC response delays lasting longer than devloss tmo. Current logic decrements the final fabric node kref during a devloss tmo event. This results in a NULL ptr dereference crash if the FDISC completes for that fabric node after devloss tmo. Fix by adding the NLP_IN_RECOV_POST_DEV_LOSS flag, which is set when devloss tmo triggers and we've noticed that fabric node recovery has already started or finished in between the time lpfc_dev_loss_tmo_callbk queues lpfc_dev_loss_tmo_handler. If fabric node recovery succeeds, then the driver reverses the devloss tmo marked kref put with a kref get. If fabric node recovery fails, then the final kref put relies on the ELS timing out or the REG_LOGIN cmpl routine. Link: https://lore.kernel.org/r/20211020211417.88754-8-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Correct sysfs reporting of loop support after SFP status changeJames Smart2021-10-211-0/+1
| | | | | | | | | | | | | | | | | | | | | Applications determine loop support in part by querying the 'pls' sysfs node. Reporting of 'pls' (Private Loop Support) is derived from the descriptor returned by the COMMON_GET_SLI4_PARAMETERS mailbox command, which is issued during initialization or after a reset. The value of this field may change if there is a dynamic SFP change. The driver currently will not pick up the change as there was no reset scenario. Rework to commonize the sending of the COMMON_GET_SLI4_PARAMETERS command. Add the calling of the routine after receipt of an async event indicating an SFP change. Link: https://lore.kernel.org/r/20211020211417.88754-4-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Switch to attribute groupsBart Van Assche2021-10-171-2/+2
| | | | | | | | | struct device supports attribute groups directly but does not support struct device_attribute directly. Hence switch to attribute groups. Link: https://lore.kernel.org/r/20211012233558.4066756-28-bvanassche@acm.org Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Add cmf_info sysfs entryJames Smart2021-08-251-0/+2
| | | | | | | | | | Allow abbreviated cm framework status information to be obtained via sysfs. Link: https://lore.kernel.org/r/20210816162901.121235-14-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Add support for maintaining the cm statistics bufferJames Smart2021-08-251-0/+2
| | | | | | | | | | | | Add the logic to move the congestion management and event information into the cmd statistics buffer maintained for the adapter. The update includes rolling up values for the last minute, hour, and day information. Link: https://lore.kernel.org/r/20210816162901.121235-12-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Add support for the CM frameworkJames Smart2021-08-251-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Complete the enablement of the cm framework feature in the adapter. Perform the following: - Detect the presence of the congestion management framework feature. When the cm framework is present: - Issue the SET_FEATURE command to enable the feature. - Register the cm statistics buffer with the adapter. - Read the cm enablement buffer to determine the cm framework state for cm management. When cm management is enabled: - Monitor all FPIN and congestion signalling events, incrementing counters. - Regularly sync with the adapter to communicate congestion events and to receive an rx request limit. - Monitor requests for rx data and ensure that no more than the adapter prescribed limit is issued on the link. If the limit is exceeded, SCSI and/or NVMe traffic is temporarily suspended. - Maintain the minute, hourly, daily statistics buffer. - Monitor for congestion enablement change events, causing a reread of the enablement buffer and acting on any change in enablement. And: - Add teardown logic, including buffer deregistration, on adapter detachment or reset. Link: https://lore.kernel.org/r/20210816162901.121235-10-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Add cmfsync WQE supportJames Smart2021-08-251-0/+1
| | | | | | | | | | | | | | | | | | When congestion mgmt is enabled, cmf has the driver regularly issue a command to synchronize reporting of congestion mgmt events such as fpin and signal delivery. This patch adds the definition of the CMF_SYNC WQE and its CQE fields as well as support for issuing the command. The patch also adds the few remaining cmf-related SLI additions, such as feature definition for enablement of CMF and notifications to the driver if the cm enablement mode changes. Link: https://lore.kernel.org/r/20210816162901.121235-9-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Add support for cm enablement bufferJames Smart2021-08-251-0/+4
| | | | | | | | | | | | | | | | As part of the cmf framework, the firmware maintains a table with congestion related state information, specifically whether enabled and if enabled, whether monitoring or actively managing congestion. Add definition of the table and add support to read the table from the adapter and determine if it is enabled. In support of this, the READ_OBJECT mailbox command definition is added to the driver. Link: https://lore.kernel.org/r/20210816162901.121235-8-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Add cm statistics buffer supportJames Smart2021-08-251-0/+3
| | | | | | | | | | | | | | | | | | The cmf framework requires the driver to maintain a cm statistics table, accessible inband, of congestion related statistics that are reported per minute, rolled up to per hour, and rolled up again per day. Several days worth may be maintained. The table is registered with the adapter when the MIB feature is enabled. Add definition of the table and add support to register the table with the adapter. Includes definition and initialization of event counters that are later added to the statistics table. Link: https://lore.kernel.org/r/20210816162901.121235-7-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Add EDC ELS supportJames Smart2021-08-251-0/+6
| | | | | | | | | | | | | | | | | | When congestion management is enabled, issue EDC ELS to register congestion signaling capabilities with the fabric. The response handling will process the fabric parameters and set the reporting parameters. Similarly, add support for receiving an EDC request from the fabric generating a corresponding response. Implement handlers for congestion signals from the fabric and maintain statistics for them. Link: https://lore.kernel.org/r/20210816162901.121235-6-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Delay unregistering from transport until GIDFT or ADISC completesJames Smart2021-07-191-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On an RSCN event, the nodes specified in RSCN payload and in MAPPED state are moved to NPR state in order to revalidate the login. This triggers an immediate unregister from SCSI/NVMe backend. The assumption is that the node may be missing. The re-registration with the backend happens after either relogin (PLOGI/PRLI; if ADISC is disabled or login truly lost) or when ADISC completes successfully (rediscover with ADISC enabled). However, the NVMe-FC standard provides for an RSCN to be triggered when the remote port supports a discovery controller and there was a change of discovery log content. As the remote port typically also supports storage subsystems, this unregister causes all storage controller connections to fail and require reconnect. Correct by reworking the code to ensure that the unregistration only occurs when a login state is truly terminated, thereby leaving the NVMe storage controllers in place. The changes made are: - Retain node state in ADISC_ISSUE when scheduling ADISC ELS retry. - Do not clear wwpn/wwnn values upon ADISC failure. - Move MAPPED nodes to NPR during RSCN processing, but do not unregister with transport. On GIDFT completion, identify missing nodes (not marked NLP_NPR_2B_DISC) and unregister them. - Perform unregistration for nodes that will go through ADISC processing if ADISC completion fails. - Successful ADISC completion will move node back to MAPPED state. Link: https://lore.kernel.org/r/20210707184351.67872-16-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: vmid: Add datastructure for supporting VMID in lpfcGaurav Srivastava2021-06-101-0/+11
| | | | | | | | | | | | | | Add the primary datastructures needed to implement VMID in the lpfc driver. Maintain the capability, current state, and hash table for the vmid/appid along with other information. This implementation supports the two versions of vmid implementation (app header and priority tagging). Link: https://lore.kernel.org/r/20210608043556.274139-5-muneendra.kumar@broadcom.com Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Fix node handling for Fabric Controller and Domain ControllerJames Smart2021-05-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | During link bounce testing, RPI counts were seen to differ from the number of nodes. For fabric and domain controllers, a temporary RPI is assigned, but the code isn't registering it. If the nodes do go away, such as on link down, the temporary RPI isn't being released. Change the way these two fabric services are managed, make them behave like any other remote port. Register the RPI and register with the transport. Never leave the nodes in a NPR or UNUSED state where their RPI is in limbo. This allows them to follow normal dev_loss_tmo handling, RPI refcounting, and normal removal rules. It also allows fabric I/Os to use the RPI for traffic requests. Note: There is some logic that still has a couple of exceptions when the Domain controller (0xfffcXX). There are cases where the fabric won't have a valid login but will send RDP. Other times, it will it send a LOGO then an RDP. It makes for ad-hoc behavior to manage the node. Exceptions are documented in the code. Link: https://lore.kernel.org/r/20210514195559.119853-7-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Remove unsupported mbox PORT_CAPABILITIES logicJames Smart2021-04-131-3/+0
| | | | | | | | | | | | | | | SLI-4 does not contain a PORT_CAPABILITIES mailbox command (only SLI-3 does, and SLI-3 doesn't use it), yet there are SLI-4 code paths that have code to issue the command. The command will always fail. Remove the code for the mailbox command and leave only the resulting "failure path" logic. Link: https://lore.kernel.org/r/20210412013127.2387-12-jsmart2021@gmail.com Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Fix rmmod crash due to bad ring pointers to abort_iotagJames Smart2021-04-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Rmmod on SLI-4 adapters is sometimes hitting a bad ptr dereference in lpfc_els_free_iocb(). A prior patch refactored the lpfc_sli_abort_iocb() routine. One of the changes was to convert from building/sending an abort within the routine to using a common routine. The reworked routine passes, without modification, the pring ptr to the new common routine. The older routine had logic to check SLI-3 vs SLI-4 and adapt the pring ptr if necessary as callers were passing SLI-3 pointers even when not on an SLI-4 adapter. The new routine is missing this check and adapt, so the SLI-3 ring pointers are being used in SLI-4 paths. Fix by cleaning up the calling routines. In review, there is no need to pass the ring ptr argument to abort_iocb at all. The routine can look at the adapter type itself and reference the proper ring. Link: https://lore.kernel.org/r/20210412013127.2387-2-jsmart2021@gmail.com Fixes: db7531d2b377 ("scsi: lpfc: Convert abort handling to SLI-3 and SLI-4 handlers") Cc: <stable@vger.kernel.org> # v5.11+ Co-developed-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: Justin Tee <justin.tee@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Update copyrights for 12.8.0.7 and 12.8.0.8 changesJames Smart2021-03-041-1/+1
| | | | | | | | | | | For the files modified in 2021 via the 12.8.0.7 and 12.8.0.8 patch sets, update the copyright for 2021. Link: https://lore.kernel.org/r/20210301171821.3427-23-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Fix dropped FLOGI during pt2pt discovery recoveryJames Smart2021-03-041-0/+2
| | | | | | | | | | | | | | | | | When connected in pt2pt mode, there is a scenario where the remote port significantly delays sending a response to our FLOGI, but acts on the FLOGI it sent us and proceeds to PLOGI/PRLI. The FLOGI ends up timing out and kicks off recovery logic. End result is a lot of unnecessary state changes and lots of discovery messages being logged. Fix by terminating the FLOGI and noop'ing its completion if we have already accepted the remote ports FLOGI and are now processing PLOGI. Link: https://lore.kernel.org/r/20210301171821.3427-13-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Implement health checking when aborting I/OJames Smart2021-01-081-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | Several errors have occurred where the adapter stops or fails but does not raise the register values for the driver to detect failure. Thus driver is unaware of the failure. The failure typically results in I/O timeouts, the I/O timeout handler failing (after several seconds), and the error handler escalating recovery policy and resulting in more errors. Eventually, the driver is in a position where things have spiraled and it can't do recovery because other recovery ops are still outstanding and it becomes unusable. Resolve the situation by having the I/O timeout handler (actually a els, SCSI I/O, NVMe ls, or NVMe I/O timeout), in addition to aborting the I/O, perform a mailbox command and look for a response from the hardware. If the mailbox command fails, it will mark the adapter offline and then invoke the adapter reset handler to clean up. The new I/O timeout test will be limited to a test every 5s. If there are multiple I/O timeouts concurrently, only the 1st I/O timeout will generate the mailbox command. Further testing will only occur once a timeout occurs after a 5s delay from the last mailbox command has expired. Link: https://lore.kernel.org/r/20210104180240.46824-14-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Fix NVMe recovery after mailbox timeoutJames Smart2021-01-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a mailbox command times out, the SLI port is deemed in error and the port is reset. The HBA cleanup is not returning I/Os to the NVMe layer before the port is unregistered. This is due to the HBA being marked offline (!SLI_ACTIVE) and cleanup being done by the mailbox timeout handler rather than an general adapter reset routine. The mailbox timeout handler mailbox handler only cleaned up SCSI I/Os. Fix by reworking the mailbox handler to: - After handling the mailbox error, detect the board is already in failure (may be due to another error), and leave cleanup to the other handler. - If the mailbox command timeout is initial detector of the port error, continue with the board cleanup and marking the adapter offline (!SLI_ACTIVE). Remove the SCSI-only I/O cleanup routine. The generic reset adapter routine that is subsequently invoked, will clean up the I/Os. - Have the reset adapter routine flush all NVMe and SCSI I/Os if the adapter has been marked failed (!SLI_ACTIVE). - Rework the NVMe I/O terminate routine to take a status code to fail the I/O with and update so that cleaned up I/O calls the wqe completion routine. Currently it is bypassing the wqe cleanup and calling the NVMe I/O completion directly. The wqe completion routine will take care of data structure and node cleanup then call the NVMe I/O completion handler. Link: https://lore.kernel.org/r/20210104180240.46824-11-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Update changed file copyrights for 2020James Smart2020-11-171-1/+1
| | | | | | | | | | Update Copyright in files changed by the 12.8.0.6 patch set to 2020 Link: https://lore.kernel.org/r/20201115192646.12977-18-james.smart@broadcom.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Convert abort handling to SLI-3 and SLI-4 handlersJames Smart2020-11-171-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch reworks the abort interfaces such that SLI-3 retains the iocb-based formatting and completions and SLI-4 now uses native WQEs and completion routines. The following changes are made: - The code is refactored from a confusing 2 routine sequence of xx_abort_iotag_issue(), which creates/formats and abort cmd, and xx_issue_abort_tag(), which then issues and handles the completion of the abort cmd - into a single interface of xx_issue_abort_iotag(). The new interface will determine whether SLI-3 or SLI-4 and then call the appropriate handler. A completion handler can now be specified to address the differences in completion handling. Note: original code is all iocb based, with SLI-4 converting to SLI-3 for the SCSI/ELS path, and NVMe natively using wqes. - The SLI-3 side is refactored: The older iocb-base lpfc_sli_issue_abort_iotag() routine is combined with the logic of lpfc_sli_abort_iotag_issue() as well as the iocb-specific code in lpfc_abort_handler() and lpfc_sli_abort_iocb() to create the new single SLI-3 abort routine that formats and issues the iocb. - The SLI-4 side is refactored and added to: The native WQE abort code in NVMe is moved to the new SLI-4 issue_abort_iotag() routine. Items in SCSI that set fields not set by NVMe is migrated into the new routine. Thus the routine supports NVMe and SCSI initiators. The nvmet block (target) formats the abort slightly different (like the old NVMe initiator) thus it has its own prep routine stolen from NVMe initiator and it retains the current code it has for issuing the WQE (does not use the commonized routine the initiators do). SLI-4 completion handlers were also added. - lpfc_abort_handler now becomes a wrapper that determines whether SLI-3 or SLI-4 and calls the proper abort handler. Link: https://lore.kernel.org/r/20201115192646.12977-16-james.smart@broadcom.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Enable common send_io interface for SCSI and NVMeJames Smart2020-11-171-0/+2
| | | | | | | | | | | | | | | To set up common use by the SCSI and NVMe I/O paths, create a new routine that issues FCP I/O commands which can be used by either protocol. The new routine addresses SLI-3 vs SLI-4 differences within its implementation. Replace the (SLI-3 centric) iocb routine in the SCSI path with this new WQE-centric common routine. Link: https://lore.kernel.org/r/20201115192646.12977-13-james.smart@broadcom.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Enable common wqe_template support for both SCSI and NVMeJames Smart2020-11-171-1/+4
| | | | | | | | | | | | The driver is currently using SLI-4 WQE templates only for NVMe. Refactor the template and the placement of the service routine so that it can be used by both SCSI and NVMe. Link: https://lore.kernel.org/r/20201115192646.12977-12-james.smart@broadcom.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Rework remote port ref counting and node freeingJames Smart2020-11-171-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a remote port is disconnected and disappears, its node structure (ndlp) stays allocated and on a vport node list. While on the list it can be matched, thus requires validation checks on state to be added in numerous code paths. If the node comes back, its possible for there to be multiple node structures for the same device on the vport node list. There is no reason to keep the node structure around after it is no longer in existence, and the current implementation creates problems for itself (multiple nodes) and lots of unnecessary code for state validation. Additionally, the reference taking on the node structure didn't follow the normal model used by the kernel kref api. It included lots of odd logic to match state with reference count. The combination of this odd logic plus the way it was implicitly used in the discovery engine made its reference taking implementation suspect and extremely hard to follow. Change the driver such that the reference taking routines are now normal ref increments/decrements and callout on refcount=0. With this in place, the rework can be done such that the node structure is fully removed and deallocated when the remote port no longer exists and all references are removed. This removal logic, and the basic ref counting are intrically tied, thus in a single patch. Link: https://lore.kernel.org/r/20201115192646.12977-2-james.smart@broadcom.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
* scsi: lpfc: Add an internal trace log bufferDick Kennedy2020-07-031-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current logging methods typically end up requesting a reproduction with a different logging level set to figure out what happened. This was mainly by design to not clutter the kernel log messages with things that were typically not interesting and the messages themselves could cause other issues. When looking to make a better system, it was seen that in many cases when more data was wanted was when another message, usually at KERN_ERR level, was logged. And in most cases, what the additional logging that was then enabled was typically. Most of these areas fell into the discovery machine. Based on this summary, the following design has been put in place: The driver will maintain an internal log (256 elements of 256 bytes). The "additional logging" messages that are usually enabled in a reproduction will be changed to now log all the time to the internal log. A new logging level is defined - LOG_TRACE_EVENT. When this level is set (it is not by default) and a message marked as KERN_ERR is logged, all the messages in the internal log will be dumped to the kernel log before the KERN_ERR message is logged. There is a timestamp on each message added to the internal log. However, this timestamp is not converted to wall time when logged. The value of the timestamp is solely to give a crude time reference for the messages. Link: https://lore.kernel.org/r/20200630215001.70793-14-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>