summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'amd-drm-next-6.5-2023-06-16' of ↵Dave Airlie2023-06-1993-431/+1222
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.5-2023-06-16: amdgpu: - Misc display fixes - W=1 fixes - Improve scheduler naming - DCN 3.1.4 fixes - kdoc fixes - Enable W=1 - VCN 4.0 fix - xgmi fixes - TOPDOWN fix for large BAR systems - eDP fix - PSR fixes - SubVP fixes - Freesync fix - DPIA fix - SMU 13.0.5 fixes - vblflash fix - RAS fixes - SDMA 4 fix - BO locking fix - BO backing store fix - NBIO 7.9 fixes - GC 9.4.3 fixes - GPU reset recovery fixes - HMM fix amdkfd: - Fix NULL check - Trap fixes - Queue count fix - Add event age tracking radeon: - fbdev client fix scheduler: - Avoid an infinite loop UAPI: - Add KFD event age tracking: Proposed ROCT-Thunk-Interface: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/commit/efdbf6cfbc026bd68ac3c35d00dacf84370eb81e https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/commit/1820ae0a2db85b6f584611dc0cde1a00e7c22915 Proposed ROCR-Runtime: https://github.com/RadeonOpenCompute/ROCR-Runtime/compare/master...zhums:ROCR-Runtime:new_event_wait_review https://github.com/RadeonOpenCompute/ROCR-Runtime/commit/e1f5bdb88eb882ac798aeca2c00ea3fbb2dba459 https://github.com/RadeonOpenCompute/ROCR-Runtime/commit/7d26afd14107b5c2a754c1a3f415d89f3aabb503 drm: - DP MST fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230616163548.7706-1-alexander.deucher@amd.com
| * drm/dp_mst: Clear MSG_RDY flag before sending new messageWayne Lin2023-06-155-31/+81
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [Why] The sequence for collecting down_reply from source perspective should be: Request_n->repeat (get partial reply of Request_n->clear message ready flag to ack DPRX that the message is received) till all partial replies for Request_n are received->new Request_n+1. Now there is chance that drm_dp_mst_hpd_irq() will fire new down request in the tx queue when the down reply is incomplete. Source is restricted to generate interveleaved message transactions so we should avoid it. Also, while assembling partial reply packets, reading out DPCD DOWN_REP Sideband MSG buffer + clearing DOWN_REP_MSG_RDY flag should be wrapped up as a complete operation for reading out a reply packet. Kicking off a new request before clearing DOWN_REP_MSG_RDY flag might be risky. e.g. If the reply of the new request has overwritten the DPRX DOWN_REP Sideband MSG buffer before source writing one to clear DOWN_REP_MSG_RDY flag, source then unintentionally flushes the reply for the new request. Should handle the up request in the same way. [How] Separete drm_dp_mst_hpd_irq() into 2 steps. After acking the MST IRQ event, driver calls drm_dp_mst_hpd_irq_send_new_request() and might trigger drm_dp_mst_kick_tx() only when there is no on going message transaction. Changes since v1: * Reworked on review comments received -> Adjust the fix to let driver explicitly kick off new down request when mst irq event is handled and acked -> Adjust the commit message Changes since v2: * Adjust the commit message * Adjust the naming of the divided 2 functions and add a new input parameter "ack". * Adjust code flow as per review comments. Changes since v3: * Update the function description of drm_dp_mst_hpd_irq_handle_event Changes since v4: * Change ack of drm_dp_mst_hpd_irq_handle_event() to be an array align the size of esi[] Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Increase hmm range get pages timeoutPhilip Yang2023-06-151-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | If hmm_range_fault returns -EBUSY, we should call hmm_range_fault again to validate the remaining pages. On one system with NUMA auto balancing enabled, hmm_range_fault takes 6 seconds for 1GB range because CPU migrate the range one page at a time. To be safe, increase timeout value to 1 second for 128MB range. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Enable translate further for GC v9.4.3Philip Yang2023-06-151-0/+1
| | | | | | | | | | | | | | | | To extend UTCL2 reach. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Remove unused NBIO interfaceLijo Lazar2023-06-152-16/+0
| | | | | | | | | | | | | | | | | | | | | | Set compute partition mode interface in NBIO is no longer used. Remove the only implementation from NBIO v7.9 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdkfd: bump kfd ioctl minor version for event age availabilityJames Zhu2023-06-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Bump the minor version to declare event age tracking feature is now available. In kernel amdgpu driver, kfd_wait_on_events is used to support user space signal event wait function. For multiple threads waiting on same event scenery, race condition could occur since some threads after checking signal condition, before calling kfd_wait_on_events, the event interrupt could be fired and wake up other thread which are sleeping on this event. Then those threads could fall into sleep without waking up again. Adding event age tracking in both kernel and user mode, will help avoiding this race condition. Proposed ROCT-Thunk-Interface: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/commit/efdbf6cfbc026bd68ac3c35d00dacf84370eb81e https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/commit/1820ae0a2db85b6f584611dc0cde1a00e7c22915 Proposed ROCR-Runtime: https://github.com/RadeonOpenCompute/ROCR-Runtime/compare/master...zhums:ROCR-Runtime:new_event_wait_review https://github.com/RadeonOpenCompute/ROCR-Runtime/commit/e1f5bdb88eb882ac798aeca2c00ea3fbb2dba459 https://github.com/RadeonOpenCompute/ROCR-Runtime/commit/7d26afd14107b5c2a754c1a3f415d89f3aabb503 Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdkfd: update user space last_event_ageJames Zhu2023-06-151-8/+15
| | | | | | | | | | | | | | | | | | Update user space last_event_age when event age is enabled. It is only for KFD_EVENT_TYPE_SIGNAL which is checked by user space. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdkfd: set activated flag true when event age unmatchsJames Zhu2023-06-151-4/+13
| | | | | | | | | | | | | | | | Set waiter's activated flag true when event age unmatchs with last_event_age. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdkfd: add event_age tracking when receiving interruptJames Zhu2023-06-152-0/+7
| | | | | | | | | | | | | | | | Add event_age tracking when receiving interrupt. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdkfd: add event age trackingJames Zhu2023-06-151-1/+9
| | | | | | | | | | | | | | | | Add event age tracking Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/scheduler: avoid infinite loop if entity's dependency is a scheduled ↵ZhenGuo Yin2023-06-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | error fence [Why] drm_sched_entity_add_dependency_cb ignores the scheduled fence and return false. If entity's dependency is a scheduler error fence and drm_sched_stop is called due to TDR, drm_sched_entity_pop_job will wait for the dependency infinitely. [How] Do not wait or ignore the scheduled error fence, add drm_sched_entity_wakeup callback for the dependency with scheduled error fence. Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: add entity error check in amdgpu_ctx_get_entityZhenGuo Yin2023-06-151-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | [Why] UMD is not aware of entity error, and will keep submitting jobs into the error entity. [How] Add entity error check when getting entity from ctx. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: add VM generation tokenChristian König2023-06-157-7/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using the VRAM lost counter add a 64bit token which indicates if a context or job is still valid to use. Should the VRAM be lost or the page tables need re-creation the token will change indicating that userspace needs to act and re-create the contexts and re-submit the work. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: reset VM when an error is detectedChristian König2023-06-151-16/+65
| | | | | | | | | | | | | | | | | | When some problem with the updates of page tables is detected reset the state machine of the VM and re-create all page tables from scratch. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: abort submissions during prepare on errorChristian König2023-06-151-1/+12
| | | | | | | | | | | | | | | | Forward errors from previous submissions to this one. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: mark soft recovered fences with -ENODATAChristian König2023-06-151-0/+7
| | | | | | | | | | | | | | | | | | | | Set the fence error code before trying to soft-recover it. It gets overwritten when a hard recovery is required. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: mark force completed fences with -ECANCELEDChristian König2023-06-151-0/+1
| | | | | | | | | | | | | | | | When we force complete fences we should mark them as canceled. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: add amdgpu_error_* debugfs fileChristian König2023-06-153-0/+41
| | | | | | | | | | | | | | | | | | This allows us to insert some error codes into the bottom of the pipeline on an engine. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: mark GC 9.4.3 experimental for nowAlex Deucher2023-06-151-0/+2
| | | | | | | | | | | | | | | | | | Mark as experimental for now until we get closer to production to avoid possible undesireable behavior when mixing newer boards with older kernels. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Use PSP FW API for partition switchLijo Lazar2023-06-152-15/+6
| | | | | | | | | | | | | | | | | | | | Use PSP firmware interface for switching compute partitions. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Change nbio v7.9 xcp status definitionLijo Lazar2023-06-151-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PARTITION_MODE field in PARTITION_COMPUTE_STATUS register is defined as below by firmware. SPX = 0, DPX = 1, TPX = 2, QPX = 3, CPX = 4 Change driver definition accordingly. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: make sure that BOs have a backing storeChristian König2023-06-151-1/+5
| | | | | | | | | | | | | | | | | | | | | | It's perfectly possible that the BO is about to be destroyed and doesn't have a backing store associated with it. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memoryChristian König2023-06-151-30/+39
| | | | | | | | | | | | | | | | | | | | | | We need to grab the lock of the BO or otherwise can run into a crash when we try to inspect the current location. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Add checking mc_vram_sizeStanley.Yang2023-06-151-1/+2
| | | | | | | | | | | | | | | | | | Do not compare injection address with mc_vram_size if mc_vram_size is zero. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Optimize checking ras supportedStanley.Yang2023-06-153-21/+23
| | | | | | | | | | | | | | | | | | Using "is_app_apu" to identify device in the native APU mode or carveout mode. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Add channel_dis_num to ras init flagsCandice Li2023-06-152-0/+2
| | | | | | | | | | | | | | | | Add disabled channel number to ras init flags. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Update total channel number for umc v8_10Candice Li2023-06-153-1/+5
| | | | | | | | | | | | | | | | Update total channel number for umc v8_10. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amd/pm: Align eccinfo table structure with smu v13_0_0 interfaceCandice Li2023-06-151-2/+1
| | | | | | | | | | | | | | | | | | | | Update eccinfo table structure according to smu v13_0_0 interface. v2: Calculate array size instead of using macro definition. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amd/display: Convert to kdoc formats in dc/core/dc.cSrinivasan Shanmugam2023-06-151-19/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | Fixes the following gcc with W=1: drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:3483: warning: Cannot understand * ******************************************************************************* drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc.c:4204: warning: Cannot understand * ******************************************************************************* Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdkfd: decrement queue count on mes queue destroyJonathan Kim2023-06-151-1/+1
| | | | | | | | | | | | | | | | | | Queue count should decrement on queue destruction regardless of HWS support type. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amd/pm: enable more Pstates profile levels for SMU v13.0.5Tim Huang2023-06-152-3/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables following UMD stable Pstates profile levels for power_dpm_force_performance_level interface. - profile_peak - profile_min_sclk - profile_standard Signed-off-by: Tim Huang <Tim.Huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/radeon: Fix missing prototypes in radeon_atpx_handler.cSrinivasan Shanmugam2023-06-152-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes the following gcc with W=1: drivers/gpu/drm/radeon/radeon_atpx_handler.c:64:6: warning: no previous prototype for ‘radeon_has_atpx’ [-Wmissing-prototypes] 64 | bool 4(void) { | ^~~~~~~~~~~~~~~ drivers/gpu/drm/radeon/radeon_atpx_handler.c:68:6: warning: no previous prototype for ‘radeon_has_atpx_dgpu_power_cntl’ [-Wmissing-prototypes] 68 | bool radeon_has_atpx_dgpu_power_cntl(void) { | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/radeon/radeon_atpx_handler.c:72:6: warning: no previous prototype for ‘radeon_is_atpx_hybrid’ [-Wmissing-prototypes] 72 | bool radeon_is_atpx_hybrid(void) { | ^~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/radeon/radeon_atpx_handler.c:77:6: warning: no previous prototype for ‘radeon_atpx_dgpu_req_power_for_displays’ [-Wmissing-prototypes] 77 | bool radeon_atpx_dgpu_req_power_for_displays(void) { | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/radeon/radeon_atpx_handler.c:596:6: warning: no previous prototype for ‘radeon_register_atpx_handler’ [-Wmissing-prototypes] 596 | void radeon_register_atpx_handler(void) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/radeon/radeon_atpx_handler.c:614:6: warning: no previous prototype for ‘radeon_unregister_atpx_handler’ [-Wmissing-prototypes] 614 | void radeon_unregister_atpx_handler(void) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/radeon/radeon_atpx_handler.c:159: warning: expecting prototype for radeon_atpx_validate_functions(). Prototype was for radeon_atpx_validate() instead Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Fix usage of UMC fill record in RASLuben Tuikov2023-06-151-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The fixed commit listed in the Fixes tag below, introduced a bug in amdgpu_ras.c::amdgpu_reserve_page_direct(), in that when introducing the new amdgpu_umc_fill_error_record() and internally in that new function the physical address (argument "uint64_t retired_page"--wrong name) is right-shifted by AMDGPU_GPU_PAGE_SHIFT. Thus, in amdgpu_reserve_page_direct() when we pass "address" to that new function, we should NOT right-shift it, since this results, erroneously, in the page address to be 0 for first 2^(2*AMDGPU_GPU_PAGE_SHIFT) memory addresses. This commit fixes this bug. Cc: Tao Zhou <tao.zhou1@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Fixes: 400013b268cb ("drm/amdgpu: add umc_fill_error_record to make code more simple") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://lore.kernel.org/r/20230610113536.10621-1-luben.tuikov@amd.com Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu/sdma4: set align mask to 255Alex Deucher2023-06-152-4/+4
| | | | | | | | | | | | | | | | | | | | | | The wptr needs to be incremented at at least 64 dword intervals, use 256 to align with windows. This should fix potential hangs with unaligned updates. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Report ras_num_recs in debugfsLuben Tuikov2023-06-151-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Report the number of records stored in the RAS EEPROM table in debugfs. This can be used by user-space to calculate the capacity of the RAS EEPROM table since "bad_page_cnt_threshold" is also reported in the same place in debugfs. See commit 7fb640714547 ("drm/amdgpu: Add bad_page_cnt_threshold to debugfs"). ras_num_recs can already be inferred by dumping the RAS EEPROM table, also in the same debugfs location, see commit reference c65b0805e77919 (drm/amdgpu: RAS EEPROM table is now in debugfs, 2021-04-08). This commit makes it an integer value easily shown in a single file. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Tao Zhou <tao.zhou1@amd.com> Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: John Clements <john.clements@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://lore.kernel.org/r/20230603051043.211548-1-luben.tuikov@amd.com Acked-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdkfd: Remove DUMMY_VRAM_SIZEMukul Joshi2023-06-151-5/+0
| | | | | | | | | | | | | | | | | | Remove DUMMY_VRAM_SIZE as it is not needed and can result in reporting incorrect memory size. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Release SDMAv4.4.2 ecc irq properlyLijo Lazar2023-06-151-6/+10
| | | | | | | | | | | | | | | | | | Release ECC irq only if irq is enabled - only when RAS feature is enabled ECC irq gets enabled. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: add wait_for helper for spirom updateLikun Gao2023-06-154-4/+29
| | | | | | | | | | | | | | | | | | | | Spirom update typically requires extremely long duration for command execution, and special helper function to wait for it completion. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amd/display: Clean up dcn10_optc.c kdocSrinivasan Shanmugam2023-06-151-21/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes the following W=1 kernel build warning: display/dc/dcn10/dcn10_optc.c:45: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * apply_front_porch_workaround TODO FPGA still need? display/dc/dcn10/dcn10_optc.c:136: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * program_timing_generator used by mode timing set display/dc/dcn10/dcn10_optc.c:391: warning: Function parameter or member 'optc' not described in 'optc1_set_timing_double_buffer' display/dc/dcn10/dcn10_optc.c:391: warning: Function parameter or member 'enable' not described in 'optc1_set_timing_double_buffer' display/dc/dcn10/dcn10_optc.c:404: warning: Function parameter or member 'optc' not described in 'optc1_unblank_crtc' display/dc/dcn10/dcn10_optc.c:404: warning: expecting prototype for unblank_crtc(). Prototype was for optc1_unblank_crtc() instead display/dc/dcn10/dcn10_optc.c:427: warning: Function parameter or member 'optc' not described in 'optc1_blank_crtc' display/dc/dcn10/dcn10_optc.c:427: warning: expecting prototype for blank_crtc(). Prototype was for optc1_blank_crtc() instead display/dc/dcn10/dcn10_optc.c:496: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Enable CRTC display/dc/dcn10/dcn10_optc.c:895: warning: Cannot understand ***************************************************************************** on line 895 - I thought it was a doc line Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amd/display: Correct kdoc formats in dcn32_resource_helpers.cSrinivasan Shanmugam2023-06-151-17/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes the following gcc with W=1: drivers/gpu/drm/amd/amdgpu/../display/dc/dcn32/dcn32_resource_helpers.c:285: warning: Function parameter or member 'dc' not described in 'dcn32_determine_det_override' drivers/gpu/drm/amd/amdgpu/../display/dc/dcn32/dcn32_resource_helpers.c:285: warning: Function parameter or member 'context' not described in 'dcn32_determine_det_override' drivers/gpu/drm/amd/amdgpu/../display/dc/dcn32/dcn32_resource_helpers.c:285: warning: Function parameter or member 'pipes' not described in 'dcn32_determine_det_override' drivers/gpu/drm/amd/amdgpu/../display/dc/dcn32/dcn32_resource_helpers.c:624: warning: Cannot understand * ***************************************************************** drivers/gpu/drm/amd/amdgpu/../display/dc/dcn32/dcn32_resource_helpers.c:676: warning: Cannot understand * ***************************************************************** Cc: Alvin Lee <alvin.lee2@amd.com> Cc: Stylon Wang <stylon.wang@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amd/display: Provide function name for 'optc32_enable_crtc()'Srinivasan Shanmugam2023-06-151-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/../display/dc/dcn32/dcn32_optc.c:109: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Enable CRTC Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amd/display: Correct and remove excess function parameter names in kdocSrinivasan Shanmugam2023-06-151-9/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes the following gcc with W=1: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/dcn32_fpu.c:872: warning: Excess function parameter 'drr_pipe' description in 'subvp_drr_schedulable' drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/dcn32_fpu.c:1030: warning: Cannot understand * **************************************************** Cc: Stylon Wang <stylon.wang@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amd/display: Correct kdoc formats in dcn10_opp.cSrinivasan Shanmugam2023-06-151-8/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes the following W=1 kernel build warning: display/dc/dcn10/dcn10_opp.c:52: warning: Function parameter or member 'oppn10' not described in 'opp1_set_truncation' display/dc/dcn10/dcn10_opp.c:52: warning: Function parameter or member 'params' not described in 'opp1_set_truncation' display/dc/dcn10/dcn10_opp.c:52: warning: expecting prototype for set_truncation(). Prototype was for opp1_set_truncation() instead display/dc/dcn10/dcn10_opp.c:161: warning: Function parameter or member 'oppn10' not described in 'opp1_set_pixel_encoding' display/dc/dcn10/dcn10_opp.c:161: warning: Function parameter or member 'params' not described in 'opp1_set_pixel_encoding' display/dc/dcn10/dcn10_opp.c:161: warning: expecting prototype for set_pixel_encoding(). Prototype was for opp1_set_pixel_encoding() instead display/dc/dcn10/dcn10_opp.c:183: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Set Clamping Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Add missing function parameter 'optc' & 'enable' to kdoc in ↵Srinivasan Shanmugam2023-06-151-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | optc3_set_timing_double_buffer() Fixes the following gcc with W=1: drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_optc.c:285: warning: Function parameter or member 'optc' not described in 'optc3_set_timing_double_buffer' drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_optc.c:285: warning: Function parameter or member 'enable' not described in 'optc3_set_timing_double_buffer' Cc: Harry Wentland <harry.wentland@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdgpu: Print client id for the unregistered interrupt resourceMa Jun2023-06-151-1/+2
| | | | | | | | | | | | | | | | | | Modify the debug information and print the clien id for these interrupts as well. Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdkfd: To enable traps for GC_11_0_4 and upRuili Ji2023-06-151-2/+2
| | | | | | | | | | | | | | | | | | Flag trap_en should be enabled for trap handler. Signed-off-by: Ruili Ji <ruiliji2@amd.com> Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amd/display: don't free stolen console memory during suspendAlex Deucher2023-06-151-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | Don't free the memory if we are hitting this as part of suspend. This way we don't free any memory during suspend; see amdgpu_bo_free_kernel(). The memory will be freed in the first non-suspend modeset or when the driver is torn down. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2568 Acked-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * Revert "drm/amd/display: fix dpms_off issue when disabling bios mode"Alex Deucher2023-06-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 58e67bb3c131da5ee14e4842b08e53f4888dce0a. This patch was reverted, but came back again as commit 58e67bb3c131 ("drm/amd/display: fix dpms_off issue when disabling bios mode") Revert it again as it breaks Asus G513QY / 6800M laptops. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2259 Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Zhongwei <Zhongwei.Zhang@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Mike Lothian <mike@fireburn.co.uk> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amdkfd: fix null queue check on debug setting exceptionsJonathan Kim2023-06-151-1/+1
| | | | | | | | | | | | | | | | | | Null check should be done on queue struct itself and not on the process queue list node. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| * drm/amd/pm: enable vclk and dclk Pstates for SMU v13.0.5Tim Huang2023-06-151-0/+29
| | | | | | | | | | | | | | | | | | | | Add the ability to control the vclk and dclk frequency by power_dpm_force_performance_level interface. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>