summaryrefslogtreecommitdiffstats
path: root/Documentation/trace/histogram.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/trace/histogram.rst')
-rw-r--r--Documentation/trace/histogram.rst416
1 files changed, 285 insertions, 131 deletions
diff --git a/Documentation/trace/histogram.rst b/Documentation/trace/histogram.rst
index f95459aa984f..479c9eac6335 100644
--- a/Documentation/trace/histogram.rst
+++ b/Documentation/trace/histogram.rst
@@ -81,6 +81,7 @@ Documentation written by Tom Zanussi
.usecs display a common_timestamp in microseconds
.percent display a number of percentage value
.graph display a bar-graph of a value
+ .stacktrace display as a stacktrace (must by a long[] type)
============= =================================================
Note that in general the semantics of a given field aren't
@@ -102,12 +103,12 @@ Documentation written by Tom Zanussi
trigger, read its current contents, and then turn it off::
# echo 'hist:keys=skbaddr.hex:vals=len' > \
- /sys/kernel/debug/tracing/events/net/netif_rx/trigger
+ /sys/kernel/tracing/events/net/netif_rx/trigger
- # cat /sys/kernel/debug/tracing/events/net/netif_rx/hist
+ # cat /sys/kernel/tracing/events/net/netif_rx/hist
# echo '!hist:keys=skbaddr.hex:vals=len' > \
- /sys/kernel/debug/tracing/events/net/netif_rx/trigger
+ /sys/kernel/tracing/events/net/netif_rx/trigger
The trigger file itself can be read to show the details of the
currently attached hist trigger. This information is also displayed
@@ -169,13 +170,13 @@ Documentation written by Tom Zanussi
aggregation on and off when conditions of interest are hit::
# echo 'hist:keys=skbaddr.hex:vals=len:pause' > \
- /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
+ /sys/kernel/tracing/events/net/netif_receive_skb/trigger
# echo 'enable_hist:net:netif_receive_skb if filename==/usr/bin/wget' > \
- /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger
+ /sys/kernel/tracing/events/sched/sched_process_exec/trigger
# echo 'disable_hist:net:netif_receive_skb if comm==wget' > \
- /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger
+ /sys/kernel/tracing/events/sched/sched_process_exit/trigger
The above sets up an initially paused hist trigger which is unpaused
and starts aggregating events when a given program is executed, and
@@ -218,7 +219,7 @@ Extended error information
event. The fields that can be used for the hist trigger are listed
in the kmalloc event's format file::
- # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/format
+ # cat /sys/kernel/tracing/events/kmem/kmalloc/format
name: kmalloc
ID: 374
format:
@@ -238,7 +239,7 @@ Extended error information
the kernel that made one or more calls to kmalloc::
# echo 'hist:key=call_site:val=bytes_req.buckets=32' > \
- /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
+ /sys/kernel/tracing/events/kmem/kmalloc/trigger
This tells the tracing system to create a 'hist' trigger using the
call_site field of the kmalloc event as the key for the table, which
@@ -252,7 +253,7 @@ Extended error information
file in the kmalloc event's subdirectory (for readability, a number
of entries have been omitted)::
- # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist
+ # cat /sys/kernel/tracing/events/kmem/kmalloc/hist
# trigger info: hist:keys=call_site:vals=bytes_req:sort=hitcount:size=2048 [active]
{ call_site: 18446744072106379007 } hitcount: 1 bytes_req: 176
@@ -292,7 +293,7 @@ Extended error information
the trigger info, which can also be displayed by reading the
'trigger' file::
- # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
+ # cat /sys/kernel/tracing/events/kmem/kmalloc/trigger
hist:keys=call_site:vals=bytes_req:sort=hitcount:size=2048 [active]
At the end of the output are a few lines that display the overall
@@ -323,7 +324,7 @@ Extended error information
command history and re-execute it with a '!' prepended::
# echo '!hist:key=call_site:val=bytes_req' > \
- /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
+ /sys/kernel/tracing/events/kmem/kmalloc/trigger
Finally, notice that the call_site as displayed in the output above
isn't really very useful. It's an address, but normally addresses
@@ -331,9 +332,9 @@ Extended error information
value, simply append '.hex' to the field name in the trigger::
# echo 'hist:key=call_site.hex:val=bytes_req' > \
- /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
+ /sys/kernel/tracing/events/kmem/kmalloc/trigger
- # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist
+ # cat /sys/kernel/tracing/events/kmem/kmalloc/hist
# trigger info: hist:keys=call_site.hex:vals=bytes_req:sort=hitcount:size=2048 [active]
{ call_site: ffffffffa026b291 } hitcount: 1 bytes_req: 433
@@ -376,9 +377,9 @@ Extended error information
trigger::
# echo 'hist:key=call_site.sym:val=bytes_req' > \
- /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
+ /sys/kernel/tracing/events/kmem/kmalloc/trigger
- # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist
+ # cat /sys/kernel/tracing/events/kmem/kmalloc/hist
# trigger info: hist:keys=call_site.sym:vals=bytes_req:sort=hitcount:size=2048 [active]
{ call_site: [ffffffff810adcb9] syslog_print_all } hitcount: 1 bytes_req: 1024
@@ -426,9 +427,9 @@ Extended error information
the 'sort' parameter, along with the 'descending' modifier::
# echo 'hist:key=call_site.sym:val=bytes_req:sort=bytes_req.descending' > \
- /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
+ /sys/kernel/tracing/events/kmem/kmalloc/trigger
- # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist
+ # cat /sys/kernel/tracing/events/kmem/kmalloc/hist
# trigger info: hist:keys=call_site.sym:vals=bytes_req:sort=bytes_req.descending:size=2048 [active]
{ call_site: [ffffffffa046041c] i915_gem_execbuffer2 [i915] } hitcount: 2186 bytes_req: 3397464
@@ -467,9 +468,9 @@ Extended error information
name, just use 'sym-offset' instead::
# echo 'hist:key=call_site.sym-offset:val=bytes_req:sort=bytes_req.descending' > \
- /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
+ /sys/kernel/tracing/events/kmem/kmalloc/trigger
- # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist
+ # cat /sys/kernel/tracing/events/kmem/kmalloc/hist
# trigger info: hist:keys=call_site.sym-offset:vals=bytes_req:sort=bytes_req.descending:size=2048 [active]
{ call_site: [ffffffffa046041c] i915_gem_execbuffer2+0x6c/0x2c0 [i915] } hitcount: 4569 bytes_req: 3163720
@@ -506,9 +507,9 @@ Extended error information
allocated in a descending order::
# echo 'hist:keys=call_site.sym:values=bytes_req,bytes_alloc:sort=bytes_alloc.descending' > \
- /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
+ /sys/kernel/tracing/events/kmem/kmalloc/trigger
- # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist
+ # cat /sys/kernel/tracing/events/kmem/kmalloc/hist
# trigger info: hist:keys=call_site.sym:vals=bytes_req,bytes_alloc:sort=bytes_alloc.descending:size=2048 [active]
{ call_site: [ffffffffa046041c] i915_gem_execbuffer2 [i915] } hitcount: 7403 bytes_req: 4084360 bytes_alloc: 5958016
@@ -549,7 +550,7 @@ Extended error information
value 'stacktrace' for the key parameter::
# echo 'hist:keys=stacktrace:values=bytes_req,bytes_alloc:sort=bytes_alloc' > \
- /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
+ /sys/kernel/tracing/events/kmem/kmalloc/trigger
The above trigger will use the kernel stack trace in effect when an
event is triggered as the key for the hash table. This allows the
@@ -559,7 +560,7 @@ Extended error information
every callpath in the system that led up to a kmalloc (in this case
every callpath to a kmalloc for a kernel compile)::
- # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist
+ # cat /sys/kernel/tracing/events/kmem/kmalloc/hist
# trigger info: hist:keys=stacktrace:vals=bytes_req,bytes_alloc:sort=bytes_alloc:size=2048 [active]
{ stacktrace:
@@ -658,9 +659,9 @@ Extended error information
keeps a per-process sum of total bytes read::
# echo 'hist:key=common_pid.execname:val=count:sort=count.descending' > \
- /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger
+ /sys/kernel/tracing/events/syscalls/sys_enter_read/trigger
- # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/hist
+ # cat /sys/kernel/tracing/events/syscalls/sys_enter_read/hist
# trigger info: hist:keys=common_pid.execname:vals=count:sort=count.descending:size=2048 [active]
{ common_pid: gnome-terminal [ 3196] } hitcount: 280 count: 1093512
@@ -699,9 +700,9 @@ Extended error information
counts for the system during the run::
# echo 'hist:key=id.syscall:val=hitcount' > \
- /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger
+ /sys/kernel/tracing/events/raw_syscalls/sys_enter/trigger
- # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist
+ # cat /sys/kernel/tracing/events/raw_syscalls/sys_enter/hist
# trigger info: hist:keys=id.syscall:vals=hitcount:sort=hitcount:size=2048 [active]
{ id: sys_fsync [ 74] } hitcount: 1
@@ -753,9 +754,9 @@ Extended error information
hitcount sum as the secondary key::
# echo 'hist:key=id.syscall,common_pid.execname:val=hitcount:sort=id,hitcount' > \
- /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger
+ /sys/kernel/tracing/events/raw_syscalls/sys_enter/trigger
- # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist
+ # cat /sys/kernel/tracing/events/raw_syscalls/sys_enter/hist
# trigger info: hist:keys=id.syscall,common_pid.execname:vals=hitcount:sort=id.syscall,hitcount:size=2048 [active]
{ id: sys_read [ 0], common_pid: rtkit-daemon [ 1877] } hitcount: 1
@@ -803,9 +804,9 @@ Extended error information
can use that to filter out all the other syscalls::
# echo 'hist:key=id.syscall,common_pid.execname:val=hitcount:sort=id,hitcount if id == 16' > \
- /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger
+ /sys/kernel/tracing/events/raw_syscalls/sys_enter/trigger
- # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist
+ # cat /sys/kernel/tracing/events/raw_syscalls/sys_enter/hist
# trigger info: hist:keys=id.syscall,common_pid.execname:vals=hitcount:sort=id.syscall,hitcount:size=2048 if id == 16 [active]
{ id: sys_ioctl [ 16], common_pid: gmain [ 2769] } hitcount: 1
@@ -846,9 +847,9 @@ Extended error information
each process::
# echo 'hist:key=common_pid.execname,size:val=hitcount:sort=common_pid,size' > \
- /sys/kernel/debug/tracing/events/syscalls/sys_enter_recvfrom/trigger
+ /sys/kernel/tracing/events/syscalls/sys_enter_recvfrom/trigger
- # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_recvfrom/hist
+ # cat /sys/kernel/tracing/events/syscalls/sys_enter_recvfrom/hist
# trigger info: hist:keys=common_pid.execname,size:vals=hitcount:sort=common_pid.execname,size:size=2048 [active]
{ common_pid: smbd [ 784], size: 4 } hitcount: 1
@@ -899,9 +900,9 @@ Extended error information
much smaller number, say 256::
# echo 'hist:key=child_comm:val=hitcount:size=256' > \
- /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger
+ /sys/kernel/tracing/events/sched/sched_process_fork/trigger
- # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist
+ # cat /sys/kernel/tracing/events/sched/sched_process_fork/hist
# trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [active]
{ child_comm: dconf worker } hitcount: 1
@@ -935,9 +936,9 @@ Extended error information
displays as [paused]::
# echo 'hist:key=child_comm:val=hitcount:size=256:pause' >> \
- /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger
+ /sys/kernel/tracing/events/sched/sched_process_fork/trigger
- # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist
+ # cat /sys/kernel/tracing/events/sched/sched_process_fork/hist
# trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [paused]
{ child_comm: dconf worker } hitcount: 1
@@ -972,9 +973,9 @@ Extended error information
again, and the data has changed::
# echo 'hist:key=child_comm:val=hitcount:size=256:cont' >> \
- /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger
+ /sys/kernel/tracing/events/sched/sched_process_fork/trigger
- # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist
+ # cat /sys/kernel/tracing/events/sched/sched_process_fork/hist
# trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [active]
{ child_comm: dconf worker } hitcount: 1
@@ -1026,7 +1027,7 @@ Extended error information
netif_receive_skb event::
# echo 'hist:key=stacktrace:vals=len:pause' > \
- /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
+ /sys/kernel/tracing/events/net/netif_receive_skb/trigger
Next, we set up an 'enable_hist' trigger on the sched_process_exec
event, with an 'if filename==/usr/bin/wget' filter. The effect of
@@ -1037,7 +1038,7 @@ Extended error information
hash table keyed on stacktrace::
# echo 'enable_hist:net:netif_receive_skb if filename==/usr/bin/wget' > \
- /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger
+ /sys/kernel/tracing/events/sched/sched_process_exec/trigger
The aggregation continues until the netif_receive_skb is paused
again, which is what the following disable_hist event does by
@@ -1045,7 +1046,7 @@ Extended error information
filter 'comm==wget'::
# echo 'disable_hist:net:netif_receive_skb if comm==wget' > \
- /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger
+ /sys/kernel/tracing/events/sched/sched_process_exit/trigger
Whenever a process exits and the comm field of the disable_hist
trigger filter matches 'comm==wget', the netif_receive_skb hist
@@ -1058,7 +1059,7 @@ Extended error information
$ wget https://www.kernel.org/pub/linux/kernel/v3.x/patch-3.19.xz
- # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist
+ # cat /sys/kernel/tracing/events/net/netif_receive_skb/hist
# trigger info: hist:keys=stacktrace:vals=len:sort=hitcount:size=2048 [paused]
{ stacktrace:
@@ -1142,12 +1143,12 @@ Extended error information
again, we can just clear the histogram first::
# echo 'hist:key=stacktrace:vals=len:clear' >> \
- /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
+ /sys/kernel/tracing/events/net/netif_receive_skb/trigger
Just to verify that it is in fact cleared, here's what we now see in
the hist file::
- # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist
+ # cat /sys/kernel/tracing/events/net/netif_receive_skb/hist
# trigger info: hist:keys=stacktrace:vals=len:sort=hitcount:size=2048 [paused]
Totals:
@@ -1162,21 +1163,21 @@ Extended error information
sched_process_exit events as such::
# echo 'enable_event:net:netif_receive_skb if filename==/usr/bin/wget' > \
- /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger
+ /sys/kernel/tracing/events/sched/sched_process_exec/trigger
# echo 'disable_event:net:netif_receive_skb if comm==wget' > \
- /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger
+ /sys/kernel/tracing/events/sched/sched_process_exit/trigger
If you read the trigger files for the sched_process_exec and
sched_process_exit triggers, you should see two triggers for each:
one enabling/disabling the hist aggregation and the other
enabling/disabling the logging of events::
- # cat /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger
+ # cat /sys/kernel/tracing/events/sched/sched_process_exec/trigger
enable_event:net:netif_receive_skb:unlimited if filename==/usr/bin/wget
enable_hist:net:netif_receive_skb:unlimited if filename==/usr/bin/wget
- # cat /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger
+ # cat /sys/kernel/tracing/events/sched/sched_process_exit/trigger
enable_event:net:netif_receive_skb:unlimited if comm==wget
disable_hist:net:netif_receive_skb:unlimited if comm==wget
@@ -1192,7 +1193,7 @@ Extended error information
saw in the last run, but this time you should also see the
individual events in the trace file::
- # cat /sys/kernel/debug/tracing/trace
+ # cat /sys/kernel/tracing/trace
# tracer: nop
#
@@ -1226,15 +1227,15 @@ Extended error information
other things::
# echo 'hist:keys=skbaddr.hex:vals=len if len < 0' >> \
- /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
+ /sys/kernel/tracing/events/net/netif_receive_skb/trigger
# echo 'hist:keys=skbaddr.hex:vals=len if len > 4096' >> \
- /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
+ /sys/kernel/tracing/events/net/netif_receive_skb/trigger
# echo 'hist:keys=skbaddr.hex:vals=len if len == 256' >> \
- /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
+ /sys/kernel/tracing/events/net/netif_receive_skb/trigger
# echo 'hist:keys=skbaddr.hex:vals=len' >> \
- /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
+ /sys/kernel/tracing/events/net/netif_receive_skb/trigger
# echo 'hist:keys=len:vals=common_preempt_count' >> \
- /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
+ /sys/kernel/tracing/events/net/netif_receive_skb/trigger
The above set of commands create four triggers differing only in
their filters, along with a completely different though fairly
@@ -1246,7 +1247,7 @@ Extended error information
Displaying the contents of the 'hist' file for the event shows the
contents of all five histograms::
- # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist
+ # cat /sys/kernel/tracing/events/net/netif_receive_skb/hist
# event histogram
#
@@ -1367,15 +1368,15 @@ Extended error information
field in the shared 'foo' histogram data::
# echo 'hist:name=foo:keys=skbaddr.hex:vals=len' > \
- /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
+ /sys/kernel/tracing/events/net/netif_receive_skb/trigger
# echo 'hist:name=foo:keys=skbaddr.hex:vals=len' > \
- /sys/kernel/debug/tracing/events/net/netif_rx/trigger
+ /sys/kernel/tracing/events/net/netif_rx/trigger
You can see that they're updating common histogram data by reading
each event's hist files at the same time::
- # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist;
- cat /sys/kernel/debug/tracing/events/net/netif_rx/hist
+ # cat /sys/kernel/tracing/events/net/netif_receive_skb/hist;
+ cat /sys/kernel/tracing/events/net/netif_rx/hist
# event histogram
#
@@ -1488,15 +1489,15 @@ Extended error information
couple of triggers named 'bar' using those fields::
# echo 'hist:name=bar:key=stacktrace:val=hitcount' > \
- /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger
+ /sys/kernel/tracing/events/sched/sched_process_fork/trigger
# echo 'hist:name=bar:key=stacktrace:val=hitcount' > \
- /sys/kernel/debug/tracing/events/net/netif_rx/trigger
+ /sys/kernel/tracing/events/net/netif_rx/trigger
And displaying the output of either shows some interesting if
somewhat confusing output::
- # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist
- # cat /sys/kernel/debug/tracing/events/net/netif_rx/hist
+ # cat /sys/kernel/tracing/events/sched/sched_process_fork/hist
+ # cat /sys/kernel/tracing/events/net/netif_rx/hist
# event histogram
#
@@ -1786,6 +1787,8 @@ or assigned to a variable and referenced in a subsequent expression::
# echo 'hist:keys=next_pid:us_per_sec=1000000 ...' >> event/trigger
# echo 'hist:keys=next_pid:timestamp_secs=common_timestamp/$us_per_sec ...' >> event/trigger
+Variables can even hold stacktraces, which are useful with synthetic events.
+
2.2.2 Synthetic Events
----------------------
@@ -1826,19 +1829,19 @@ variable reference to a variable on another event::
u64 lat; \
pid_t pid; \
int prio' >> \
- /sys/kernel/debug/tracing/synthetic_events
+ /sys/kernel/tracing/synthetic_events
Reading the tracing/synthetic_events file lists all the currently
defined synthetic events, in this case the event defined above::
- # cat /sys/kernel/debug/tracing/synthetic_events
+ # cat /sys/kernel/tracing/synthetic_events
wakeup_latency u64 lat; pid_t pid; int prio
An existing synthetic event definition can be removed by prepending
the command that defined it with a '!'::
# echo '!wakeup_latency u64 lat pid_t pid int prio' >> \
- /sys/kernel/debug/tracing/synthetic_events
+ /sys/kernel/tracing/synthetic_events
At this point, there isn't yet an actual 'wakeup_latency' event
instantiated in the event subsystem - for this to happen, a 'hist
@@ -1850,20 +1853,20 @@ done, the 'wakeup_latency' synthetic event instance is created.
The new event is created under the tracing/events/synthetic/ directory
and looks and behaves just like any other event::
- # ls /sys/kernel/debug/tracing/events/synthetic/wakeup_latency
+ # ls /sys/kernel/tracing/events/synthetic/wakeup_latency
enable filter format hist id trigger
A histogram can now be defined for the new synthetic event::
# echo 'hist:keys=pid,prio,lat.log2:sort=lat' >> \
- /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/trigger
+ /sys/kernel/tracing/events/synthetic/wakeup_latency/trigger
The above shows the latency "lat" in a power of 2 grouping.
Like any other event, once a histogram is enabled for the event, the
-output can be displayed by reading the event's 'hist' file.
+output can be displayed by reading the event's 'hist' file::
- # cat /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/hist
+ # cat /sys/kernel/tracing/events/synthetic/wakeup_latency/hist
# event histogram
#
@@ -1908,10 +1911,10 @@ output can be displayed by reading the event's 'hist' file.
The latency values can also be grouped linearly by a given size with
-the ".buckets" modifier and specify a size (in this case groups of 10).
+the ".buckets" modifier and specify a size (in this case groups of 10)::
# echo 'hist:keys=pid,prio,lat.buckets=10:sort=lat' >> \
- /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/trigger
+ /sys/kernel/tracing/events/synthetic/wakeup_latency/trigger
# event histogram
#
@@ -1940,6 +1943,157 @@ the ".buckets" modifier and specify a size (in this case groups of 10).
Entries: 16
Dropped: 0
+To save stacktraces, create a synthetic event with a field of type "unsigned long[]"
+or even just "long[]". For example, to see how long a task is blocked in an
+uninterruptible state::
+
+ # cd /sys/kernel/tracing
+ # echo 's:block_lat pid_t pid; u64 delta; unsigned long[] stack;' > dynamic_events
+ # echo 'hist:keys=next_pid:ts=common_timestamp.usecs,st=stacktrace if prev_state == 2' >> events/sched/sched_switch/trigger
+ # echo 'hist:keys=prev_pid:delta=common_timestamp.usecs-$ts,s=$st:onmax($delta).trace(block_lat,prev_pid,$delta,$s)' >> events/sched/sched_switch/trigger
+ # echo 1 > events/synthetic/block_lat/enable
+ # cat trace
+
+ # tracer: nop
+ #
+ # entries-in-buffer/entries-written: 2/2 #P:8
+ #
+ # _-----=> irqs-off/BH-disabled
+ # / _----=> need-resched
+ # | / _---=> hardirq/softirq
+ # || / _--=> preempt-depth
+ # ||| / _-=> migrate-disable
+ # |||| / delay
+ # TASK-PID CPU# ||||| TIMESTAMP FUNCTION
+ # | | | ||||| | |
+ <idle>-0 [005] d..4. 521.164922: block_lat: pid=0 delta=8322 stack=STACK:
+ => __schedule+0x448/0x7b0
+ => schedule+0x5a/0xb0
+ => io_schedule+0x42/0x70
+ => bit_wait_io+0xd/0x60
+ => __wait_on_bit+0x4b/0x140
+ => out_of_line_wait_on_bit+0x91/0xb0
+ => jbd2_journal_commit_transaction+0x1679/0x1a70
+ => kjournald2+0xa9/0x280
+ => kthread+0xe9/0x110
+ => ret_from_fork+0x2c/0x50
+
+ <...>-2 [004] d..4. 525.184257: block_lat: pid=2 delta=76 stack=STACK:
+ => __schedule+0x448/0x7b0
+ => schedule+0x5a/0xb0
+ => schedule_timeout+0x11a/0x150
+ => wait_for_completion_killable+0x144/0x1f0
+ => __kthread_create_on_node+0xe7/0x1e0
+ => kthread_create_on_node+0x51/0x70
+ => create_worker+0xcc/0x1a0
+ => worker_thread+0x2ad/0x380
+ => kthread+0xe9/0x110
+ => ret_from_fork+0x2c/0x50
+
+A synthetic event that has a stacktrace field may use it as a key in
+histogram::
+
+ # echo 'hist:keys=delta.buckets=100,stack.stacktrace:sort=delta' > events/synthetic/block_lat/trigger
+ # cat events/synthetic/block_lat/hist
+
+ # event histogram
+ #
+ # trigger info: hist:keys=delta.buckets=100,stack.stacktrace:vals=hitcount:sort=delta.buckets=100:size=2048 [active]
+ #
+ { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520
+ schedule+0x6b/0x110
+ io_schedule+0x46/0x80
+ bit_wait_io+0x11/0x80
+ __wait_on_bit+0x4e/0x120
+ out_of_line_wait_on_bit+0x8d/0xb0
+ __wait_on_buffer+0x33/0x40
+ jbd2_journal_commit_transaction+0x155a/0x19b0
+ kjournald2+0xab/0x270
+ kthread+0xfa/0x130
+ ret_from_fork+0x29/0x50
+ } hitcount: 1
+ { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520
+ schedule+0x6b/0x110
+ io_schedule+0x46/0x80
+ rq_qos_wait+0xd0/0x170
+ wbt_wait+0x9e/0xf0
+ __rq_qos_throttle+0x25/0x40
+ blk_mq_submit_bio+0x2c3/0x5b0
+ __submit_bio+0xff/0x190
+ submit_bio_noacct_nocheck+0x25b/0x2b0
+ submit_bio_noacct+0x20b/0x600
+ submit_bio+0x28/0x90
+ ext4_bio_write_page+0x1e0/0x8c0
+ mpage_submit_page+0x60/0x80
+ mpage_process_page_bufs+0x16c/0x180
+ mpage_prepare_extent_to_map+0x23f/0x530
+ } hitcount: 1
+ { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520
+ schedule+0x6b/0x110
+ schedule_hrtimeout_range_clock+0x97/0x110
+ schedule_hrtimeout_range+0x13/0x20
+ usleep_range_state+0x65/0x90
+ __intel_wait_for_register+0x1c1/0x230 [i915]
+ intel_psr_wait_for_idle_locked+0x171/0x2a0 [i915]
+ intel_pipe_update_start+0x169/0x360 [i915]
+ intel_update_crtc+0x112/0x490 [i915]
+ skl_commit_modeset_enables+0x199/0x600 [i915]
+ intel_atomic_commit_tail+0x7c4/0x1080 [i915]
+ intel_atomic_commit_work+0x12/0x20 [i915]
+ process_one_work+0x21c/0x3f0
+ worker_thread+0x50/0x3e0
+ kthread+0xfa/0x130
+ } hitcount: 3
+ { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520
+ schedule+0x6b/0x110
+ schedule_timeout+0x11e/0x160
+ __wait_for_common+0x8f/0x190
+ wait_for_completion+0x24/0x30
+ __flush_work.isra.0+0x1cc/0x360
+ flush_work+0xe/0x20
+ drm_mode_rmfb+0x18b/0x1d0 [drm]
+ drm_mode_rmfb_ioctl+0x10/0x20 [drm]
+ drm_ioctl_kernel+0xb8/0x150 [drm]
+ drm_ioctl+0x243/0x560 [drm]
+ __x64_sys_ioctl+0x92/0xd0
+ do_syscall_64+0x59/0x90
+ entry_SYSCALL_64_after_hwframe+0x72/0xdc
+ } hitcount: 1
+ { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520
+ schedule+0x6b/0x110
+ schedule_timeout+0x87/0x160
+ __wait_for_common+0x8f/0x190
+ wait_for_completion_timeout+0x1d/0x30
+ drm_atomic_helper_wait_for_flip_done+0x57/0x90 [drm_kms_helper]
+ intel_atomic_commit_tail+0x8ce/0x1080 [i915]
+ intel_atomic_commit_work+0x12/0x20 [i915]
+ process_one_work+0x21c/0x3f0
+ worker_thread+0x50/0x3e0
+ kthread+0xfa/0x130
+ ret_from_fork+0x29/0x50
+ } hitcount: 1
+ { delta: ~ 100-199, stack.stacktrace __schedule+0xa19/0x1520
+ schedule+0x6b/0x110
+ schedule_hrtimeout_range_clock+0x97/0x110
+ schedule_hrtimeout_range+0x13/0x20
+ usleep_range_state+0x65/0x90
+ pci_set_low_power_state+0x17f/0x1f0
+ pci_set_power_state+0x49/0x250
+ pci_finish_runtime_suspend+0x4a/0x90
+ pci_pm_runtime_suspend+0xcb/0x1b0
+ __rpm_callback+0x48/0x120
+ rpm_callback+0x67/0x70
+ rpm_suspend+0x167/0x780
+ rpm_idle+0x25a/0x380
+ pm_runtime_work+0x93/0xc0
+ process_one_work+0x21c/0x3f0
+ } hitcount: 1
+
+ Totals:
+ Hits: 10
+ Entries: 7
+ Dropped: 0
+
2.2.3 Hist trigger 'handlers' and 'actions'
-------------------------------------------
@@ -2039,9 +2193,9 @@ The following commonly-used handler.action pairs are available:
event::
# echo 'wakeup_new_test pid_t pid' >> \
- /sys/kernel/debug/tracing/synthetic_events
+ /sys/kernel/tracing/synthetic_events
- # cat /sys/kernel/debug/tracing/synthetic_events
+ # cat /sys/kernel/tracing/synthetic_events
wakeup_new_test pid_t pid
The following hist trigger both defines the missing testpid
@@ -2052,26 +2206,26 @@ The following commonly-used handler.action pairs are available:
# echo 'hist:keys=$testpid:testpid=pid:onmatch(sched.sched_wakeup_new).\
wakeup_new_test($testpid) if comm=="cyclictest"' >> \
- /sys/kernel/debug/tracing/events/sched/sched_wakeup_new/trigger
+ /sys/kernel/tracing/events/sched/sched_wakeup_new/trigger
- Or, equivalently, using the 'trace' keyword syntax:
+ Or, equivalently, using the 'trace' keyword syntax::
- # echo 'hist:keys=$testpid:testpid=pid:onmatch(sched.sched_wakeup_new).\
- trace(wakeup_new_test,$testpid) if comm=="cyclictest"' >> \
- /sys/kernel/debug/tracing/events/sched/sched_wakeup_new/trigger
+ # echo 'hist:keys=$testpid:testpid=pid:onmatch(sched.sched_wakeup_new).\
+ trace(wakeup_new_test,$testpid) if comm=="cyclictest"' >> \
+ /sys/kernel/tracing/events/sched/sched_wakeup_new/trigger
Creating and displaying a histogram based on those events is now
just a matter of using the fields and new synthetic event in the
tracing/events/synthetic directory, as usual::
# echo 'hist:keys=pid:sort=pid' >> \
- /sys/kernel/debug/tracing/events/synthetic/wakeup_new_test/trigger
+ /sys/kernel/tracing/events/synthetic/wakeup_new_test/trigger
Running 'cyclictest' should cause wakeup_new events to generate
wakeup_new_test synthetic events which should result in histogram
output in the wakeup_new_test event's hist file::
- # cat /sys/kernel/debug/tracing/events/synthetic/wakeup_new_test/hist
+ # cat /sys/kernel/tracing/events/synthetic/wakeup_new_test/hist
A more typical usage would be to use two events to calculate a
latency. The following example uses a set of hist triggers to
@@ -2080,14 +2234,14 @@ The following commonly-used handler.action pairs are available:
First, we define a 'wakeup_latency' synthetic event::
# echo 'wakeup_latency u64 lat; pid_t pid; int prio' >> \
- /sys/kernel/debug/tracing/synthetic_events
+ /sys/kernel/tracing/synthetic_events
Next, we specify that whenever we see a sched_waking event for a
cyclictest thread, save the timestamp in a 'ts0' variable::
# echo 'hist:keys=$saved_pid:saved_pid=pid:ts0=common_timestamp.usecs \
if comm=="cyclictest"' >> \
- /sys/kernel/debug/tracing/events/sched/sched_waking/trigger
+ /sys/kernel/tracing/events/sched/sched_waking/trigger
Then, when the corresponding thread is actually scheduled onto the
CPU by a sched_switch event (saved_pid matches next_pid), calculate
@@ -2097,19 +2251,19 @@ The following commonly-used handler.action pairs are available:
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:\
onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,\
$saved_pid,next_prio) if next_comm=="cyclictest"' >> \
- /sys/kernel/debug/tracing/events/sched/sched_switch/trigger
+ /sys/kernel/tracing/events/sched/sched_switch/trigger
We also need to create a histogram on the wakeup_latency synthetic
event in order to aggregate the generated synthetic event data::
# echo 'hist:keys=pid,prio,lat:sort=pid,lat' >> \
- /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/trigger
+ /sys/kernel/tracing/events/synthetic/wakeup_latency/trigger
Finally, once we've run cyclictest to actually generate some
events, we can see the output by looking at the wakeup_latency
synthetic event's hist file::
- # cat /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/hist
+ # cat /sys/kernel/tracing/events/synthetic/wakeup_latency/hist
- onmax(var).save(field,.. .)
@@ -2135,19 +2289,19 @@ The following commonly-used handler.action pairs are available:
# echo 'hist:keys=pid:ts0=common_timestamp.usecs \
if comm=="cyclictest"' >> \
- /sys/kernel/debug/tracing/events/sched/sched_waking/trigger
+ /sys/kernel/tracing/events/sched/sched_waking/trigger
# echo 'hist:keys=next_pid:\
wakeup_lat=common_timestamp.usecs-$ts0:\
onmax($wakeup_lat).save(next_comm,prev_pid,prev_prio,prev_comm) \
if next_comm=="cyclictest"' >> \
- /sys/kernel/debug/tracing/events/sched/sched_switch/trigger
+ /sys/kernel/tracing/events/sched/sched_switch/trigger
When the histogram is displayed, the max value and the saved
values corresponding to the max are displayed following the rest
of the fields::
- # cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist
+ # cat /sys/kernel/tracing/events/sched/sched_switch/hist
{ next_pid: 2255 } hitcount: 239
common_timestamp-ts0: 0
max: 27
@@ -2191,48 +2345,48 @@ The following commonly-used handler.action pairs are available:
resulting latency, stored in wakeup_lat, exceeds the current
maximum latency, a snapshot is taken. As part of the setup, all
the scheduler events are also enabled, which are the events that
- will show up in the snapshot when it is taken at some point:
+ will show up in the snapshot when it is taken at some point::
- # echo 1 > /sys/kernel/debug/tracing/events/sched/enable
+ # echo 1 > /sys/kernel/tracing/events/sched/enable
- # echo 'hist:keys=pid:ts0=common_timestamp.usecs \
- if comm=="cyclictest"' >> \
- /sys/kernel/debug/tracing/events/sched/sched_waking/trigger
+ # echo 'hist:keys=pid:ts0=common_timestamp.usecs \
+ if comm=="cyclictest"' >> \
+ /sys/kernel/tracing/events/sched/sched_waking/trigger
- # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \
- onmax($wakeup_lat).save(next_prio,next_comm,prev_pid,prev_prio, \
- prev_comm):onmax($wakeup_lat).snapshot() \
- if next_comm=="cyclictest"' >> \
- /sys/kernel/debug/tracing/events/sched/sched_switch/trigger
+ # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \
+ onmax($wakeup_lat).save(next_prio,next_comm,prev_pid,prev_prio, \
+ prev_comm):onmax($wakeup_lat).snapshot() \
+ if next_comm=="cyclictest"' >> \
+ /sys/kernel/tracing/events/sched/sched_switch/trigger
When the histogram is displayed, for each bucket the max value
and the saved values corresponding to the max are displayed
following the rest of the fields.
If a snapshot was taken, there is also a message indicating that,
- along with the value and event that triggered the global maximum:
+ along with the value and event that triggered the global maximum::
- # cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist
- { next_pid: 2101 } hitcount: 200
- max: 52 next_prio: 120 next_comm: cyclictest \
- prev_pid: 0 prev_prio: 120 prev_comm: swapper/6
+ # cat /sys/kernel/tracing/events/sched/sched_switch/hist
+ { next_pid: 2101 } hitcount: 200
+ max: 52 next_prio: 120 next_comm: cyclictest \
+ prev_pid: 0 prev_prio: 120 prev_comm: swapper/6
- { next_pid: 2103 } hitcount: 1326
- max: 572 next_prio: 19 next_comm: cyclictest \
- prev_pid: 0 prev_prio: 120 prev_comm: swapper/1
+ { next_pid: 2103 } hitcount: 1326
+ max: 572 next_prio: 19 next_comm: cyclictest \
+ prev_pid: 0 prev_prio: 120 prev_comm: swapper/1
- { next_pid: 2102 } hitcount: 1982 \
- max: 74 next_prio: 19 next_comm: cyclictest \
- prev_pid: 0 prev_prio: 120 prev_comm: swapper/5
+ { next_pid: 2102 } hitcount: 1982 \
+ max: 74 next_prio: 19 next_comm: cyclictest \
+ prev_pid: 0 prev_prio: 120 prev_comm: swapper/5
- Snapshot taken (see tracing/snapshot). Details:
- triggering value { onmax($wakeup_lat) }: 572 \
- triggered by event with key: { next_pid: 2103 }
+ Snapshot taken (see tracing/snapshot). Details:
+ triggering value { onmax($wakeup_lat) }: 572 \
+ triggered by event with key: { next_pid: 2103 }
- Totals:
- Hits: 3508
- Entries: 3
- Dropped: 0
+ Totals:
+ Hits: 3508
+ Entries: 3
+ Dropped: 0
In the above case, the event that triggered the global maximum has
the key with next_pid == 2103. If you look at the bucket that has
@@ -2247,7 +2401,7 @@ The following commonly-used handler.action pairs are available:
sched_switch events, which should match the time displayed in the
global maximum)::
- # cat /sys/kernel/debug/tracing/snapshot
+ # cat /sys/kernel/tracing/snapshot
<...>-2103 [005] d..3 309.873125: sched_switch: prev_comm=cyclictest prev_pid=2103 prev_prio=19 prev_state=D ==> next_comm=swapper/5 next_pid=0 next_prio=120
<idle>-0 [005] d.h3 309.873611: sched_waking: comm=cyclictest pid=2102 prio=19 target_cpu=005
@@ -2310,15 +2464,15 @@ The following commonly-used handler.action pairs are available:
$cwnd variable. If the value has changed, a snapshot is taken.
As part of the setup, all the scheduler and tcp events are also
enabled, which are the events that will show up in the snapshot
- when it is taken at some point:
+ when it is taken at some point::
- # echo 1 > /sys/kernel/debug/tracing/events/sched/enable
- # echo 1 > /sys/kernel/debug/tracing/events/tcp/enable
+ # echo 1 > /sys/kernel/tracing/events/sched/enable
+ # echo 1 > /sys/kernel/tracing/events/tcp/enable
- # echo 'hist:keys=dport:cwnd=snd_cwnd: \
- onchange($cwnd).save(snd_wnd,srtt,rcv_wnd): \
- onchange($cwnd).snapshot()' >> \
- /sys/kernel/debug/tracing/events/tcp/tcp_probe/trigger
+ # echo 'hist:keys=dport:cwnd=snd_cwnd: \
+ onchange($cwnd).save(snd_wnd,srtt,rcv_wnd): \
+ onchange($cwnd).snapshot()' >> \
+ /sys/kernel/tracing/events/tcp/tcp_probe/trigger
When the histogram is displayed, for each bucket the tracked value
and the saved values corresponding to that value are displayed
@@ -2327,7 +2481,7 @@ The following commonly-used handler.action pairs are available:
If a snapshot was taken, there is also a message indicating that,
along with the value and event that triggered the snapshot::
- # cat /sys/kernel/debug/tracing/events/tcp/tcp_probe/hist
+ # cat /sys/kernel/tracing/events/tcp/tcp_probe/hist
{ dport: 1521 } hitcount: 8
changed: 10 snd_wnd: 35456 srtt: 154262 rcv_wnd: 42112
@@ -2341,10 +2495,10 @@ The following commonly-used handler.action pairs are available:
{ dport: 443 } hitcount: 211
changed: 10 snd_wnd: 26960 srtt: 17379 rcv_wnd: 28800
- Snapshot taken (see tracing/snapshot). Details::
+ Snapshot taken (see tracing/snapshot). Details:
- triggering value { onchange($cwnd) }: 10
- triggered by event with key: { dport: 80 }
+ triggering value { onchange($cwnd) }: 10
+ triggered by event with key: { dport: 80 }
Totals:
Hits: 414
@@ -2361,7 +2515,7 @@ The following commonly-used handler.action pairs are available:
And finally, looking at the snapshot data should show at or near
the end the event that triggered the snapshot::
- # cat /sys/kernel/debug/tracing/snapshot
+ # cat /sys/kernel/tracing/snapshot
gnome-shell-1261 [006] dN.3 49.823113: sched_stat_runtime: comm=gnome-shell pid=1261 runtime=49347 [ns] vruntime=1835730389 [ns]
kworker/u16:4-773 [003] d..3 49.823114: sched_switch: prev_comm=kworker/u16:4 prev_pid=773 prev_prio=120 prev_state=R+ ==> next_comm=kworker/3:2 next_pid=135 next_prio=120