summaryrefslogtreecommitdiffstats
path: root/drivers/cpufreq/speedstep-lib.c (unfollow)
Commit message (Collapse)AuthorFilesLines
2023-12-21drm/xe/huc: HuC is not supported on GTs that don't have video enginesDaniele Ceraolo Spurio1-1/+10
On MTL-style multi-gt platforms, the HuC is only available on the media GT, so we need to consider it as not supported on the render GT. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/huc: Extract version and binary offset from new HuC headersDaniele Ceraolo Spurio5-3/+244
The GSC-enabled HuC binary starts with a GSC header, which is followed by the legacy-style CSS header and the binary itself. We can parse the GSC headers to find the HuC version and the location of the binary to be used for the DMA transfer. The parsing function has been designed to be re-used for the GSC binary, so the entry names are external parameters (because the GSC uses different ones) and the CSS entry is optional (because the GSC doesn't have it). v2: move new code to uc_fw.c, better comments and error checking, split old code move to separate patch (Lucas), move headers and documentation to uc_fw_abi.h. v3: use 2 separate loops, rework marker check (Lucas) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/uc: Prepare for parsing of different header typesDaniele Ceraolo Spurio3-54/+71
GSC binaries and newer HuC ones use GSC-style headers instead of the CSS. In preparation for adding support for such parsing, split out the current parsing code to its own function, to make it cleaner to add the new paths. The existing doc section has also been renamed to narrow it to CSS-based binaries. v2: new patch in series, split out from next patch for easier reviewing v3: drop unneeded include (Lucas) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Extend rpX values extraction for future platformsBadal Nilawar1-3/+3
In existing code flow for future platforms i.e. >1270, the rpX (rp0,rpn and rpe) fused values are read from gen 6 registers. Which is not correct. Unless specified gen 1270 regs should be valid for gen 1270+ platforms as well. Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/mocs: MOCS registers are multicast on Xe_HP and beyondMatt Roper3-11/+20
The MOCS registers should be written in an MCR-specific manner on Xe_HP and beyond to prevent any other driver threads or external firmware from putting the hardware into unicast mode while we initialize the MOCS table. Bspec: 66534, 67609, 71185 Cc: Ruthuvikas Ravikumar <ruthuvikas.ravikumar@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20231023204112.2856331-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/xe2: Update SVG state handlingMatt Roper2-2/+14
As with DG2/MTL, Xe2 also fails to emit instruction headers for SVG state instructions if no explicit state has been set. The SVG part of the LRC is nearly identical to DG2/MTL; the only change is that 3DSTATE_DRAWING_RECTANGLE has been replaced by 3DSTATE_DRAWING_RECTANGLE_FAST, so we can just re-use the same state table and handle that single instruction when we encounter it. Bspec: 65182 Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://lore.kernel.org/r/20231025151732.3461842-8-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Emit SVG state on RCS during driver load on DG2 and MTLMatt Roper3-1/+162
When recording the default LRC, the expectation is that the hardware's original state settings (both register and instruction) will be written out to the LRC upon first context switch. For many 3DSTATE_* state instructions that don't truly have "default" values, this translates to a simple instruction header (opcodes + dword length) being written to the LRC, followed by an appropriate number of blank dwords as a place holder. When userspace creates a context (which starts as a copy of the default LRC), they'll generally emit real 3DSTATE_* as part of their initialization to select the settings they desire. If they don't emit one of the 3DSTATE instructions, then the zeroed dwords that remain in their LRC image generally translate to various state remaining disabled. This will either be what userspace wants or will lead to very reproducible and easily-debugged problems (rendering glitches, engine hangs). It turns out that a subset of the 3DSTATE instructions, specifically those belonging to the SVG (State Variable - Global) unit, are not only emitting 0's for the instruction's "body" dwords, but also for the instruction header dword if no specific state has been explicitly set before context switch. This means that when the hardware switches to a context that hasn't explicitly provided an appropriate state setting, the hardware will just see a sequence of NOOPs in the spot reserved for that 3DSTATE instruction while executing the LRC, and the actual hardware state setting will unintentionally inherit the configuration used by the previously running context. Now when userspace makes a mistake and forgets to emit an important state instruction they no longer get consistent, easily-reproducible corruption/hangs, but rather erratic behavior where the presence/absence of a problem depends on what other workloads are running on the system and what order the contexts are scheduled on the engine. A specific example of this that came up recently related to mesh shading The OpenGL driver was not specifically emitting a 3DSTATE_MESH_CONTROL to disable mesh shading at context init, so on context switch, mesh shading would either be on or off depending on what the previous context had been doing. Vulkan apps _were_ enabling mesh shading, so running a Vulkan app and then context switching to an OpenGL app resulted in mesh shading still unexpectedly being enabled during OpenGL operation, and since other Mesh-related state was not properly initialized for that context a GPU hang was seen. Due to the specific ordering requirements (Vulkan app runs first, followed by OpenGL app), it took additional debug effort to track down the cause of the problem. There are various workarounds related to this behavior, with current implementations handled in the userspace drivers. E.g., Wa_14019789679 and Wa_22018402687. However it's been suggested that the kernel driver can help simplify things here by emitting zeroed SVG state with proper instruction headers as part of our default context creation (i.e., at the same point we apply LRC workarounds). This will help ensure that any future cases where a userspace driver does not emit an important state setting will result in consistent behavior. Bspec: 46261 Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://lore.kernel.org/r/20231025151732.3461842-7-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Prepare to emit non-register state while recording default LRCMatt Roper3-1/+57
On some platforms we need to emit some non-register state while recording an engine class' default LRC. Add the infrastructure to support this; actual per-platform tables will be added in future patches. v2: - Checkpatch whitespace fix - Add extra assertion to ensure num_dw != 0. (Bala) Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://lore.kernel.org/r/20231025151732.3461842-6-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/trace: Optimize trace definitionBalasubramani Vivekanandan1-47/+36
Make use of EVENT_CLASS to group similar trace events Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by: Haridhar Kalvala <haridhar.kalvala@intel.com> Link: https://lore.kernel.org/intel-xe/20231019093140.1901665-3-balasubramani.vivekanandan@intel.com/ Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Add event tracing for CTBBalasubramani Vivekanandan2-2/+52
Event tracing enabled for CTB submissions. Additional minor refactor - Removed a unnecessary ct_to_xe() call. v2: Remove a unwanted comment (Hari) Add missing change to commit message Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by: Haridhar Kalvala <haridhar.kalvala@intel.com> Link: https://lore.kernel.org/intel-xe/20231019093140.1901665-2-balasubramani.vivekanandan@intel.com/ Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Add performance tuning settings for MTL and Xe2Shekhar Chauhan2-0/+25
Add L3SQCREG5 as part of HW recommended settings. The recommended value in Bspec is 00e0007f. For Xe2-LPG, bits 23:21 don't exist anymore, but it's confirmed with HW engineers that setting them doesn't do anything. They still exist on the media GT, Xe2-LPM, but they are already they are already set as per HW default value. So for Xe2 platform, the only bits that need to be set are 9:0 since HW's default is 0x1ff and the recommended value is 0x7f. Unlike most registers, which have the same relative offset on both the primary and media GT, this register has a different base offset on the media GT. On MTL the register only exists for the primary (graphics) GT, so there's no need to program it on the media gt. Also, it's part of the RCS engine's context, so it needs to be added as a LRC workaround. Bspec: 72161 Signed-off-by: Shekhar Chauhan <shekhar.chauhan@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20231024220739.224251-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/xe2: Add initial workaroundsDnyaneshwar Bhadane2-0/+98
Add the initial collection of gt/engine/lrc workarounds. While at it, add some newlines around the platform/IP comments to make them consistent across all workarounds. v2: - FF_MODE is an MCR register (Matt Roper) - Group 18032247524 with other Xe2 workarounds (Matt Roper) - Move WA changing PSS_CHICKEN to lrc_was[] as for Xe2 that register is part of the render context image (Matt Roper) - Apply WA 16020518922 only on render engine (Matt Roper) Signed-off-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Signed-off-by: Shekhar Chauhan <shekhar.chauhan@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20231024220739.224251-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Simplify xe_res_get_buddy()Brian Welty1-6/+1
We can remove the unnecessary indirection thru xe->tiles[] to get the TTM VRAM manager. This code can be common for VRAM and STOLEN. Signed-off-by: Brian Welty <brian.welty@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: fix pat[2] programming with 2M/1G pagesMatthew Auld2-6/+12
Bit 7 in the leaf node is normally programmed with pat[2], however with 2M/1G pages that same bit in the PDE/PDPE also toggles 2M/1G pages. For 2M/1G entries the pat[2] is rather moved to bit 12, which is now free given that the address must be aligned to 2M or 1G, leaving bit 7 for toggling 2M/1G pages. Bspec: 59510, 45038 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/guc: Bump PVC GuC version to 70.9.1Daniele Ceraolo Spurio1-1/+1
The PVC GuC version that we're currently using (70.6.4) has a known issue that leads to dropping the disabling of contexts that have pending page faults. This is fixed in newer blobs, so we need to update to a more recent release. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/696 Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/uapi: Fix naming of XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITYFrancois Dugast2-3/+3
This is used for the priority of an exec queue (not an engine) and should be named accordingly. Signed-off-by: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2023-12-21drm/xe/uapi: Rename gts to gt_listRodrigo Vivi2-29/+29
During the uapi review it was identified a possible confusion with the plural of acronym with a new acronym. So the recommendation is to go with gt_list instead. Suggested-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com>
2023-12-21drm/xe/uapi: Remove unused field of drm_xe_query_gtRodrigo Vivi1-2/+0
We already have many bits reserved at the end already. Let's kill the unused ones. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2023-12-21drm/xe/uapi: Replace useless 'instance' per unique gt_idRodrigo Vivi4-8/+4
Let's have a single GT ID per GT within the PCI Device Card. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2023-12-21drm/xe/uapi: Document drm_xe_query_gtRodrigo Vivi1-22/+43
Split drm_xe_query_gt out of the gt list one in order to better document it. No functional change at this point. Any actual change to the uapi should come in follow-up additions. v2: s/maks/mask Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2023-12-21drm/xe: Fix VM bind out-sync signaling orderingMatthew Brost4-8/+125
A case existed where an out-sync of a later VM bind operation could signal before a previous one if the later operation results in a NOP (e.g. a unbind or prefetch to a VA range without any mappings). This breaks the ordering rules, fix this. This patch also lays the groundwork for users to pass in num_binds == 0 and out-syncs. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Remove async worker and rework sync bindsMatthew Brost9-518/+127
Async worker is gone. All jobs and memory allocations done in IOCTL to align with dma fencing rules. Async vs. sync now means when do bind operations complete relative to the IOCTL. Async completes when out-syncs signal while sync completes when the IOCTL returns. In-syncs and out-syncs are only allowed in async mode. If memory allocations fail in the job creation step the VM is killed. This is temporary, eventually a proper unwind will be done and VM will be usable. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/uapi: Kill DRM_XE_UFENCE_WAIT_VM_ERRORMatthew Brost4-65/+9
This is not used nor does it align VM async document, kill this. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Kill XE_VM_PROPERTY_BIND_OP_ERROR_CAPTURE_ADDRESS extensionRodrigo Vivi2-148/+4
This extension is currently not used and it is not aligned with the error handling on async VM_BIND. Let's remove it and along with that, since it was the only extension for the vm_create, remove VM extension entirely. v2: rebase on top of the removal of drm_xe_ext_exec_queue_set_property Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2023-12-21drm/xe/uapi: Use common drm_xe_ext_set_property extensionAshutosh Dixit3-20/+5
There really is no difference between 'struct drm_xe_ext_vm_set_property' and 'struct drm_xe_ext_exec_queue_set_property', they are extensions which specify a <property, value> pair. Replace the two extensions with a single common 'struct drm_xe_ext_set_property' extension. The rationale is that rather than have each XE module (including future modules) invent their own property/value extensions, all XE modules use a common set_property extension when possible. Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2023-12-21drm/xe: Remove XE_EXEC_QUEUE_SET_PROPERTY_COMPUTE_MODE from uAPIMatthew Brost2-20/+6
Functionality of XE_EXEC_QUEUE_SET_PROPERTY_COMPUTE_MODE deprecated in a previous patch, drop from uAPI. The property is just simply inherented from the VM. v2: - Update commit message (Niranjana) Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Rename exec_queue_kill_compute to xe_vm_remove_compute_exec_queueMatthew Brost3-18/+24
Much better name and aligns with xe_vm_add_compute_exec_queue. As part of the rename, move the implementation from xe_exec_queue.c to xe_vm.c. Suggested-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Deprecate XE_EXEC_QUEUE_SET_PROPERTY_COMPUTE_MODE implementationMatthew Brost2-47/+16
We are going to remove XE_EXEC_QUEUE_SET_PROPERTY_COMPUTE_MODE from the uAPI, deprecate the implementation first by making XE_EXEC_QUEUE_SET_PROPERTY_COMPUTE_MODE a NOP. After removal of XE_EXEC_QUEUE_SET_PROPERTY_COMPUTE_MODE the proper is simply inherented from the VM. v2: - Update commit message with explaination of removal (Niranjana) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Fix xe_exec_queue_is_idle for parallel exec queuesMatthew Brost1-2/+11
Last little piece to support parallel exec queue is compute mode. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/uapi: Remove MMIO ioctlFrancois Dugast4-133/+4
This was previously used in UMD for timestamp correlation, which can now be done with DRM_XE_QUERY_CS_CYCLES. Link: https://lore.kernel.org/all/20230706042044.GR6953@mdroper-desk1.amr.corp.intel.com/ Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/636 Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/vm: Remove VM_BIND_OP macroFrancois Dugast1-23/+19
This macro was necessary when bind operations were shifted but this is no longer the case, so removing to simplify code. Signed-off-by: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2023-12-21drm/xe/uapi: Separate VM_BIND's operation and flagFrancois Dugast2-19/+24
Use different members in the drm_xe_vm_bind_op for op and for flags as it is done in other structures. Type is left to u32 to leave enough room for future operations and flags. v2: Remove the XE_VM_BIND_* flags shift (Rodrigo Vivi) Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/303 Signed-off-by: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2023-12-21drm/xe: Correlate engine and cpu timestamps with better accuracyUmesh Nerlige Ramappa2-24/+218
Perf measurements rely on CPU and engine timestamps to correlate events of interest across these time domains. Current mechanisms get these timestamps separately and the calculated delta between these timestamps lack enough accuracy. To improve the accuracy of these time measurements to within a few us, add a query that returns the engine and cpu timestamps captured as close to each other as possible. Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24591 v2: - Fix kernel-doc warnings (CI) - Document input params and group them together (Jose) - s/cs/engine/ (Jose) - Remove padding in the query (Ashutosh) Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Rodrigo finished the s/cs/engine renaming]
2023-12-21drm/xe: Set the correct type for xe_to_user_engine_classUmesh Nerlige Ramappa1-1/+1
User engine class is of type u16. Set the same type for the array used to map xe engines to user engines. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Fix array bounds check for queriesUmesh Nerlige Ramappa1-1/+1
Queries are 0-indexed, so a query with value N is invalid if the ARRAY_SIZE is N. Modify the check to account for that. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/debugfs: Include GFXPIPE commands in LRC dumpMatt Roper2-0/+216
RCS and CCS engines include several non-register gfxpipe commands in their LRC images. Include these in the dump output so that we can see exactly what's inside the context snapshot. v2: - Include raw instruction header in output - Add 3DSTATE_AMFS_TEXTURE_POINTERS and 3DSTATE_MONOFILTER_SIZE. The first was supposed to be removed in Xe_HPG, and the second by gen12, but both still show up in the RCS LRC. v3: - Sanity check that we don't have numdw > remaining_dw. (Lucas) Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20231016163449.1300701-14-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/debugfs: Add dump of default LRCs' MI instructionsMatt Roper5-0/+152
For non-RCS engines, nearly all of the LRC state is composed of MI instructions (specifically MI_LOAD_REGISTER_IMM). Providing a dump interface allows us to verify that the context image layout matches what's documented in the bspec, and also allows us to check whether LRC workarounds are being properly captured by the default state we record at startup. For now, the non-MI instructions found in the RCS and CCS engines will dump as "unknown;" parsing of those will be added in a follow-up patch. v2: - Add raw instruction header as well as decoded meaning. (Lucas) - Check that num_dw isn't greater than remaining_dw for instructions that have a "# dwords" field. (Lucas) - Clarify comment about skipping over ppHWSP. (Lucas) Bspec: 64993 Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20231016163449.1300701-13-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Extract MI_* instructions to their own headerMatt Roper9-42/+96
Extracting the common MI_* instructions that can be used with any engine to their own header will make it easier as we add additional engine instructions in upcoming patches. Also, since the majority of GPU instructions (both MI and non-MI) have a "length" field in bits 7:0 of the instruction header, a common define is added for that. Instruction-specific length fields are still defined for special case instructions that have larger/smaller length fields. v2: - Use "instr" instead of "inst" as the short form of "instruction" everywhere. (Lucas) - Include xe_reg_defs.h instead of the i915 compat header. (Lucas) Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20231016163449.1300701-12-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Clarify number of dwords/qwords stored by MI_STORE_DATA_IMMMatt Roper3-10/+8
MI_STORE_DATA_IMM can store either dword values or qword values, and can store more than one value if the instruction's length field is large enough. Create explicit defines to specify the number of dwords/qwords to be stored, which will set the instruction length correctly and, if necessary, turn on the 'store qword' bit. While we're here, also replace an open-coded version of MI_STORE_DATA_IMM with the common macros. Bspec: 60246 Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20231016163449.1300701-11-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Separate number of registers from MI_LRI opcodeMatt Roper4-4/+6
Keeping the number of registers to be loaded as a separate macro from the instruction opcode will simplify some upcoming LRC parsing code. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20231016163449.1300701-10-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Make MI_FLUSH_DW immediate size more explicitMatt Roper2-6/+9
Despite its name, MI_FLUSH_DW instruction can write an immediate value of either dword size or qword size, depending on the 'length' field of the instruction. Since "length" excludes the first two dwords of the instruction, a value of 2 in the length field implies a dword write and a value of 3 implies a qword write. Even in cases where the flush instruction's post-sync operation is set to "no write" we're still expected to size the overall instruction as if we were doing a dword or qword write (i.e., a length of 1 shouldn't be used on modern platforms). Rather than baking a size of "1" into the #define and then adding another unexplained "+ 1" at all the spots where the definition gets used, lets just create MI_FLUSH_IMM_DW and MI_FLUSH_IMM_QW definitions that should be OR'd into the instruction header to make it more explicit what behavior we're requesting. Bspec: 60229 Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20231016163449.1300701-9-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/gsc: add gsc device supportVitaly Lubart7-6/+283
Create mei-gscfi auxiliary device and configure interrupts to be consumed by mei-gsc device driver. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Vitaly Lubart <vitaly.lubart@intel.com> Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/gsc: add has_heci_gscfi indication to deviceVitaly Lubart2-2/+11
Mark support of MEI-GSC interaction per device. Add has_heci_gscfi indication to xe_device and xe_pci structures. Mark DG1 and DG2 devices as supported. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Vitaly Lubart <vitaly.lubart@intel.com> Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/gsc: add HECI2 register offsetsVitaly Lubart1-0/+4
Add HECI2 register offsets for DG1 and DG2 to regs/xe_regs.h Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Vitaly Lubart <vitaly.lubart@intel.com> Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/dg2: Remove one PCI IDShekhar Chauhan1-1/+0
The bspec was recently updated to remove PCI ID 0x5698; this ID is actually reserved for future use and should not be treated as DG2-G11. BSpec: 44477 Signed-off-by: Shekhar Chauhan <shekhar.chauhan@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20231011154526.2819754-1-shekhar.chauhan@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Add new DG2 PCI IDsShekhar Chauhan1-1/+5
Add recently added PCI IDs for DG2 BSpec: 44477 Signed-off-by: Shekhar Chauhan <shekhar.chauhan@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20231011051418.2767145-1-shekhar.chauhan@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Set PTE_AE for smem allocations in integrated devicesJosé Roberto de Souza1-3/+6
Without this if a atomic operation is executed in Xe2 integrated GPUs it causes engine memory catastrophic error. This fixes at least 3 failures in piglit sanity and 2 failures in crucible for LNL. v3: - only add PTE_AE to smem in integrated Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: refactor xe_mmio_probe_tiles to support MMIO extensionKoby Elbaz1-32/+38
In future ASICs, there will be an additional MMIO extension space for all tiles altogether, residing on top of MMIO address space. Signed-off-by: Koby Elbaz <kelbaz@habana.ai> Reviewed-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Moti Haimovski <mhaimovski@habana.ai> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: map MMIO BAR according to the num of tiles in device descKoby Elbaz1-7/+6
When MMIO BAR is initially mapped, the driver assumes a single tile device. However, former memory allocations take all tiles into account. First, a common standard for resource usage is needed here. Second, with the next (6th) patch in this series, the MMIO BAR remapping will be done only if a reduced-tile device is attached. Signed-off-by: Koby Elbaz <kelbaz@habana.ai> Reviewed-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Moti Haimovski <mhaimovski@habana.ai> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: add MMIO extension support flagsKoby Elbaz3-0/+9
Besides the regular MMIO space that exists by default, MMIO extension support & MMIO extension tile size should both be defined per device, and updated from the device's descriptor. Signed-off-by: Koby Elbaz <kelbaz@habana.ai> Reviewed-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Moti Haimovski <mhaimovski@habana.ai> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>