diff options
author | Eric Anholt <eric@anholt.net> | 2019-04-17 00:58:54 +0200 |
---|---|---|
committer | Eric Anholt <eric@anholt.net> | 2019-04-18 18:54:10 +0200 |
commit | d223f98f02099b002903b9b22b56febae16ef80d (patch) | |
tree | b7f92f09256c19e9b39dc9d5b3ab3490bfe743c3 /drivers/gpu/drm/v3d/v3d_trace.h | |
parent | drm/v3d: Refactor job management. (diff) | |
download | linux-d223f98f02099b002903b9b22b56febae16ef80d.tar.xz linux-d223f98f02099b002903b9b22b56febae16ef80d.zip |
drm/v3d: Add support for compute shader dispatch.
The compute shader dispatch interface is pretty simple -- just pass in
the regs that userspace has passed us, with no CLs to run. However,
with no CL to run it means that we need to do manual cache flushing of
the L2 after the HW execution completes (for SSBO, atomic, and
image_load_store writes that are the output of compute shaders).
This doesn't yet expose the L2 cache's ability to have a region of the
address space not write back to memory (which could be used for
shared_var storage).
So far, the Mesa side has been tested on V3D v4.2 simpenrose (passing
the ES31 tests), and on the kernel side on 7278 (failing atomic
compswap tests in a way that doesn't reproduce on simpenrose).
v2: Fix excessive allocation for the clean_job (reported by Dan
Carpenter). Keep refs on jobs until clean_job is finished, to
avoid spurious MMU errors if the output BOs are freed by userspace
before L2 cleaning is finished.
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-4-eric@anholt.net
Acked-by: Rob Clark <robdclark@gmail.com>
Diffstat (limited to 'drivers/gpu/drm/v3d/v3d_trace.h')
-rw-r--r-- | drivers/gpu/drm/v3d/v3d_trace.h | 94 |
1 files changed, 94 insertions, 0 deletions
diff --git a/drivers/gpu/drm/v3d/v3d_trace.h b/drivers/gpu/drm/v3d/v3d_trace.h index edd984afa33f..7aa8dc356e54 100644 --- a/drivers/gpu/drm/v3d/v3d_trace.h +++ b/drivers/gpu/drm/v3d/v3d_trace.h @@ -124,6 +124,26 @@ TRACE_EVENT(v3d_tfu_irq, __entry->seqno) ); +TRACE_EVENT(v3d_csd_irq, + TP_PROTO(struct drm_device *dev, + uint64_t seqno), + TP_ARGS(dev, seqno), + + TP_STRUCT__entry( + __field(u32, dev) + __field(u64, seqno) + ), + + TP_fast_assign( + __entry->dev = dev->primary->index; + __entry->seqno = seqno; + ), + + TP_printk("dev=%u, seqno=%llu", + __entry->dev, + __entry->seqno) +); + TRACE_EVENT(v3d_submit_tfu_ioctl, TP_PROTO(struct drm_device *dev, u32 iia), TP_ARGS(dev, iia), @@ -163,6 +183,80 @@ TRACE_EVENT(v3d_submit_tfu, __entry->seqno) ); +TRACE_EVENT(v3d_submit_csd_ioctl, + TP_PROTO(struct drm_device *dev, u32 cfg5, u32 cfg6), + TP_ARGS(dev, cfg5, cfg6), + + TP_STRUCT__entry( + __field(u32, dev) + __field(u32, cfg5) + __field(u32, cfg6) + ), + + TP_fast_assign( + __entry->dev = dev->primary->index; + __entry->cfg5 = cfg5; + __entry->cfg6 = cfg6; + ), + + TP_printk("dev=%u, CFG5 0x%08x, CFG6 0x%08x", + __entry->dev, + __entry->cfg5, + __entry->cfg6) +); + +TRACE_EVENT(v3d_submit_csd, + TP_PROTO(struct drm_device *dev, + uint64_t seqno), + TP_ARGS(dev, seqno), + + TP_STRUCT__entry( + __field(u32, dev) + __field(u64, seqno) + ), + + TP_fast_assign( + __entry->dev = dev->primary->index; + __entry->seqno = seqno; + ), + + TP_printk("dev=%u, seqno=%llu", + __entry->dev, + __entry->seqno) +); + +TRACE_EVENT(v3d_cache_clean_begin, + TP_PROTO(struct drm_device *dev), + TP_ARGS(dev), + + TP_STRUCT__entry( + __field(u32, dev) + ), + + TP_fast_assign( + __entry->dev = dev->primary->index; + ), + + TP_printk("dev=%u", + __entry->dev) +); + +TRACE_EVENT(v3d_cache_clean_end, + TP_PROTO(struct drm_device *dev), + TP_ARGS(dev), + + TP_STRUCT__entry( + __field(u32, dev) + ), + + TP_fast_assign( + __entry->dev = dev->primary->index; + ), + + TP_printk("dev=%u", + __entry->dev) +); + TRACE_EVENT(v3d_reset_begin, TP_PROTO(struct drm_device *dev), TP_ARGS(dev), |