diff options
Diffstat (limited to 'Documentation/trace/histogram.rst')
-rw-r--r-- | Documentation/trace/histogram.rst | 416 |
1 files changed, 285 insertions, 131 deletions
diff --git a/Documentation/trace/histogram.rst b/Documentation/trace/histogram.rst index f95459aa984f..479c9eac6335 100644 --- a/Documentation/trace/histogram.rst +++ b/Documentation/trace/histogram.rst @@ -81,6 +81,7 @@ Documentation written by Tom Zanussi .usecs display a common_timestamp in microseconds .percent display a number of percentage value .graph display a bar-graph of a value + .stacktrace display as a stacktrace (must by a long[] type) ============= ================================================= Note that in general the semantics of a given field aren't @@ -102,12 +103,12 @@ Documentation written by Tom Zanussi trigger, read its current contents, and then turn it off:: # echo 'hist:keys=skbaddr.hex:vals=len' > \ - /sys/kernel/debug/tracing/events/net/netif_rx/trigger + /sys/kernel/tracing/events/net/netif_rx/trigger - # cat /sys/kernel/debug/tracing/events/net/netif_rx/hist + # cat /sys/kernel/tracing/events/net/netif_rx/hist # echo '!hist:keys=skbaddr.hex:vals=len' > \ - /sys/kernel/debug/tracing/events/net/netif_rx/trigger + /sys/kernel/tracing/events/net/netif_rx/trigger The trigger file itself can be read to show the details of the currently attached hist trigger. This information is also displayed @@ -169,13 +170,13 @@ Documentation written by Tom Zanussi aggregation on and off when conditions of interest are hit:: # echo 'hist:keys=skbaddr.hex:vals=len:pause' > \ - /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger + /sys/kernel/tracing/events/net/netif_receive_skb/trigger # echo 'enable_hist:net:netif_receive_skb if filename==/usr/bin/wget' > \ - /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger + /sys/kernel/tracing/events/sched/sched_process_exec/trigger # echo 'disable_hist:net:netif_receive_skb if comm==wget' > \ - /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger + /sys/kernel/tracing/events/sched/sched_process_exit/trigger The above sets up an initially paused hist trigger which is unpaused and starts aggregating events when a given program is executed, and @@ -218,7 +219,7 @@ Extended error information event. The fields that can be used for the hist trigger are listed in the kmalloc event's format file:: - # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/format + # cat /sys/kernel/tracing/events/kmem/kmalloc/format name: kmalloc ID: 374 format: @@ -238,7 +239,7 @@ Extended error information the kernel that made one or more calls to kmalloc:: # echo 'hist:key=call_site:val=bytes_req.buckets=32' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger This tells the tracing system to create a 'hist' trigger using the call_site field of the kmalloc event as the key for the table, which @@ -252,7 +253,7 @@ Extended error information file in the kmalloc event's subdirectory (for readability, a number of entries have been omitted):: - # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist + # cat /sys/kernel/tracing/events/kmem/kmalloc/hist # trigger info: hist:keys=call_site:vals=bytes_req:sort=hitcount:size=2048 [active] { call_site: 18446744072106379007 } hitcount: 1 bytes_req: 176 @@ -292,7 +293,7 @@ Extended error information the trigger info, which can also be displayed by reading the 'trigger' file:: - # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + # cat /sys/kernel/tracing/events/kmem/kmalloc/trigger hist:keys=call_site:vals=bytes_req:sort=hitcount:size=2048 [active] At the end of the output are a few lines that display the overall @@ -323,7 +324,7 @@ Extended error information command history and re-execute it with a '!' prepended:: # echo '!hist:key=call_site:val=bytes_req' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger Finally, notice that the call_site as displayed in the output above isn't really very useful. It's an address, but normally addresses @@ -331,9 +332,9 @@ Extended error information value, simply append '.hex' to the field name in the trigger:: # echo 'hist:key=call_site.hex:val=bytes_req' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger - # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist + # cat /sys/kernel/tracing/events/kmem/kmalloc/hist # trigger info: hist:keys=call_site.hex:vals=bytes_req:sort=hitcount:size=2048 [active] { call_site: ffffffffa026b291 } hitcount: 1 bytes_req: 433 @@ -376,9 +377,9 @@ Extended error information trigger:: # echo 'hist:key=call_site.sym:val=bytes_req' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger - # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist + # cat /sys/kernel/tracing/events/kmem/kmalloc/hist # trigger info: hist:keys=call_site.sym:vals=bytes_req:sort=hitcount:size=2048 [active] { call_site: [ffffffff810adcb9] syslog_print_all } hitcount: 1 bytes_req: 1024 @@ -426,9 +427,9 @@ Extended error information the 'sort' parameter, along with the 'descending' modifier:: # echo 'hist:key=call_site.sym:val=bytes_req:sort=bytes_req.descending' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger - # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist + # cat /sys/kernel/tracing/events/kmem/kmalloc/hist # trigger info: hist:keys=call_site.sym:vals=bytes_req:sort=bytes_req.descending:size=2048 [active] { call_site: [ffffffffa046041c] i915_gem_execbuffer2 [i915] } hitcount: 2186 bytes_req: 3397464 @@ -467,9 +468,9 @@ Extended error information name, just use 'sym-offset' instead:: # echo 'hist:key=call_site.sym-offset:val=bytes_req:sort=bytes_req.descending' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger - # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist + # cat /sys/kernel/tracing/events/kmem/kmalloc/hist # trigger info: hist:keys=call_site.sym-offset:vals=bytes_req:sort=bytes_req.descending:size=2048 [active] { call_site: [ffffffffa046041c] i915_gem_execbuffer2+0x6c/0x2c0 [i915] } hitcount: 4569 bytes_req: 3163720 @@ -506,9 +507,9 @@ Extended error information allocated in a descending order:: # echo 'hist:keys=call_site.sym:values=bytes_req,bytes_alloc:sort=bytes_alloc.descending' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger - # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist + # cat /sys/kernel/tracing/events/kmem/kmalloc/hist # trigger info: hist:keys=call_site.sym:vals=bytes_req,bytes_alloc:sort=bytes_alloc.descending:size=2048 [active] { call_site: [ffffffffa046041c] i915_gem_execbuffer2 [i915] } hitcount: 7403 bytes_req: 4084360 bytes_alloc: 5958016 @@ -549,7 +550,7 @@ Extended error information value 'stacktrace' for the key parameter:: # echo 'hist:keys=stacktrace:values=bytes_req,bytes_alloc:sort=bytes_alloc' > \ - /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger + /sys/kernel/tracing/events/kmem/kmalloc/trigger The above trigger will use the kernel stack trace in effect when an event is triggered as the key for the hash table. This allows the @@ -559,7 +560,7 @@ Extended error information every callpath in the system that led up to a kmalloc (in this case every callpath to a kmalloc for a kernel compile):: - # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist + # cat /sys/kernel/tracing/events/kmem/kmalloc/hist # trigger info: hist:keys=stacktrace:vals=bytes_req,bytes_alloc:sort=bytes_alloc:size=2048 [active] { stacktrace: @@ -658,9 +659,9 @@ Extended error information keeps a per-process sum of total bytes read:: # echo 'hist:key=common_pid.execname:val=count:sort=count.descending' > \ - /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger + /sys/kernel/tracing/events/syscalls/sys_enter_read/trigger - # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/hist + # cat /sys/kernel/tracing/events/syscalls/sys_enter_read/hist # trigger info: hist:keys=common_pid.execname:vals=count:sort=count.descending:size=2048 [active] { common_pid: gnome-terminal [ 3196] } hitcount: 280 count: 1093512 @@ -699,9 +700,9 @@ Extended error information counts for the system during the run:: # echo 'hist:key=id.syscall:val=hitcount' > \ - /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger + /sys/kernel/tracing/events/raw_syscalls/sys_enter/trigger - # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist + # cat /sys/kernel/tracing/events/raw_syscalls/sys_enter/hist # trigger info: hist:keys=id.syscall:vals=hitcount:sort=hitcount:size=2048 [active] { id: sys_fsync [ 74] } hitcount: 1 @@ -753,9 +754,9 @@ Extended error information hitcount sum as the secondary key:: # echo 'hist:key=id.syscall,common_pid.execname:val=hitcount:sort=id,hitcount' > \ - /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger + /sys/kernel/tracing/events/raw_syscalls/sys_enter/trigger - # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist + # cat /sys/kernel/tracing/events/raw_syscalls/sys_enter/hist # trigger info: hist:keys=id.syscall,common_pid.execname:vals=hitcount:sort=id.syscall,hitcount:size=2048 [active] { id: sys_read [ 0], common_pid: rtkit-daemon [ 1877] } hitcount: 1 @@ -803,9 +804,9 @@ Extended error information can use that to filter out all the other syscalls:: # echo 'hist:key=id.syscall,common_pid.execname:val=hitcount:sort=id,hitcount if id == 16' > \ - /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger + /sys/kernel/tracing/events/raw_syscalls/sys_enter/trigger - # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist + # cat /sys/kernel/tracing/events/raw_syscalls/sys_enter/hist # trigger info: hist:keys=id.syscall,common_pid.execname:vals=hitcount:sort=id.syscall,hitcount:size=2048 if id == 16 [active] { id: sys_ioctl [ 16], common_pid: gmain [ 2769] } hitcount: 1 @@ -846,9 +847,9 @@ Extended error information each process:: # echo 'hist:key=common_pid.execname,size:val=hitcount:sort=common_pid,size' > \ - /sys/kernel/debug/tracing/events/syscalls/sys_enter_recvfrom/trigger + /sys/kernel/tracing/events/syscalls/sys_enter_recvfrom/trigger - # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_recvfrom/hist + # cat /sys/kernel/tracing/events/syscalls/sys_enter_recvfrom/hist # trigger info: hist:keys=common_pid.execname,size:vals=hitcount:sort=common_pid.execname,size:size=2048 [active] { common_pid: smbd [ 784], size: 4 } hitcount: 1 @@ -899,9 +900,9 @@ Extended error information much smaller number, say 256:: # echo 'hist:key=child_comm:val=hitcount:size=256' > \ - /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger + /sys/kernel/tracing/events/sched/sched_process_fork/trigger - # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist + # cat /sys/kernel/tracing/events/sched/sched_process_fork/hist # trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [active] { child_comm: dconf worker } hitcount: 1 @@ -935,9 +936,9 @@ Extended error information displays as [paused]:: # echo 'hist:key=child_comm:val=hitcount:size=256:pause' >> \ - /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger + /sys/kernel/tracing/events/sched/sched_process_fork/trigger - # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist + # cat /sys/kernel/tracing/events/sched/sched_process_fork/hist # trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [paused] { child_comm: dconf worker } hitcount: 1 @@ -972,9 +973,9 @@ Extended error information again, and the data has changed:: # echo 'hist:key=child_comm:val=hitcount:size=256:cont' >> \ - /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger + /sys/kernel/tracing/events/sched/sched_process_fork/trigger - # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist + # cat /sys/kernel/tracing/events/sched/sched_process_fork/hist # trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [active] { child_comm: dconf worker } hitcount: 1 @@ -1026,7 +1027,7 @@ Extended error information netif_receive_skb event:: # echo 'hist:key=stacktrace:vals=len:pause' > \ - /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger + /sys/kernel/tracing/events/net/netif_receive_skb/trigger Next, we set up an 'enable_hist' trigger on the sched_process_exec event, with an 'if filename==/usr/bin/wget' filter. The effect of @@ -1037,7 +1038,7 @@ Extended error information hash table keyed on stacktrace:: # echo 'enable_hist:net:netif_receive_skb if filename==/usr/bin/wget' > \ - /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger + /sys/kernel/tracing/events/sched/sched_process_exec/trigger The aggregation continues until the netif_receive_skb is paused again, which is what the following disable_hist event does by @@ -1045,7 +1046,7 @@ Extended error information filter 'comm==wget':: # echo 'disable_hist:net:netif_receive_skb if comm==wget' > \ - /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger + /sys/kernel/tracing/events/sched/sched_process_exit/trigger Whenever a process exits and the comm field of the disable_hist trigger filter matches 'comm==wget', the netif_receive_skb hist @@ -1058,7 +1059,7 @@ Extended error information $ wget https://www.kernel.org/pub/linux/kernel/v3.x/patch-3.19.xz - # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist + # cat /sys/kernel/tracing/events/net/netif_receive_skb/hist # trigger info: hist:keys=stacktrace:vals=len:sort=hitcount:size=2048 [paused] { stacktrace: @@ -1142,12 +1143,12 @@ Extended error information again, we can just clear the histogram first:: # echo 'hist:key=stacktrace:vals=len:clear' >> \ - /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger + /sys/kernel/tracing/events/net/netif_receive_skb/trigger Just to verify that it is in fact cleared, here's what we now see in the hist file:: - # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist + # cat /sys/kernel/tracing/events/net/netif_receive_skb/hist # trigger info: hist:keys=stacktrace:vals=len:sort=hitcount:size=2048 [paused] Totals: @@ -1162,21 +1163,21 @@ Extended error information sched_process_exit events as such:: # echo 'enable_event:net:netif_receive_skb if filename==/usr/bin/wget' > \ - /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger + /sys/kernel/tracing/events/sched/sched_process_exec/trigger # echo 'disable_event:net:netif_receive_skb if comm==wget' > \ - /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger + /sys/kernel/tracing/events/sched/sched_process_exit/trigger If you read the trigger files for the sched_process_exec and sched_process_exit triggers, you should see two triggers for each: one enabling/disabling the hist aggregation and the other enabling/disabling the logging of events:: - # cat /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger + # cat /sys/kernel/tracing/events/sched/sched_process_exec/trigger enable_event:net:netif_receive_skb:unlimited if filename==/usr/bin/wget enable_hist:net:netif_receive_skb:unlimited if filename==/usr/bin/wget - # cat /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger + # cat /sys/kernel/tracing/events/sched/sched_process_exit/trigger enable_event:net:netif_receive_skb:unlimited if comm==wget disable_hist:net:netif_receive_skb:unlimited if comm==wget @@ -1192,7 +1193,7 @@ Extended error information saw in the last run, but this time you should also see the individual events in the trace file:: - # cat /sys/kernel/debug/tracing/trace + # cat /sys/kernel/tracing/trace # tracer: nop # @@ -1226,15 +1227,15 @@ Extended error information other things:: # echo 'hist:keys=skbaddr.hex:vals=len if len < 0' >> \ - /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger + /sys/kernel/tracing/events/net/netif_receive_skb/trigger # echo 'hist:keys=skbaddr.hex:vals=len if len > 4096' >> \ - /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger + /sys/kernel/tracing/events/net/netif_receive_skb/trigger # echo 'hist:keys=skbaddr.hex:vals=len if len == 256' >> \ - /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger + /sys/kernel/tracing/events/net/netif_receive_skb/trigger # echo 'hist:keys=skbaddr.hex:vals=len' >> \ - /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger + /sys/kernel/tracing/events/net/netif_receive_skb/trigger # echo 'hist:keys=len:vals=common_preempt_count' >> \ - /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger + /sys/kernel/tracing/events/net/netif_receive_skb/trigger The above set of commands create four triggers differing only in their filters, along with a completely different though fairly @@ -1246,7 +1247,7 @@ Extended error information Displaying the contents of the 'hist' file for the event shows the contents of all five histograms:: - # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist + # cat /sys/kernel/tracing/events/net/netif_receive_skb/hist # event histogram # @@ -1367,15 +1368,15 @@ Extended error information field in the shared 'foo' histogram data:: # echo 'hist:name=foo:keys=skbaddr.hex:vals=len' > \ - /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger + /sys/kernel/tracing/events/net/netif_receive_skb/trigger # echo 'hist:name=foo:keys=skbaddr.hex:vals=len' > \ - /sys/kernel/debug/tracing/events/net/netif_rx/trigger + /sys/kernel/tracing/events/net/netif_rx/trigger You can see that they're updating common histogram data by reading each event's hist files at the same time:: - # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist; - cat /sys/kernel/debug/tracing/events/net/netif_rx/hist + # cat /sys/kernel/tracing/events/net/netif_receive_skb/hist; + cat /sys/kernel/tracing/events/net/netif_rx/hist # event histogram # @@ -1488,15 +1489,15 @@ Extended error information couple of triggers named 'bar' using those fields:: # echo 'hist:name=bar:key=stacktrace:val=hitcount' > \ - /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger + /sys/kernel/tracing/events/sched/sched_process_fork/trigger # echo 'hist:name=bar:key=stacktrace:val=hitcount' > \ - /sys/kernel/debug/tracing/events/net/netif_rx/trigger + /sys/kernel/tracing/events/net/netif_rx/trigger And displaying the output of either shows some interesting if somewhat confusing output:: - # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist - # cat /sys/kernel/debug/tracing/events/net/netif_rx/hist + # cat /sys/kernel/tracing/events/sched/sched_process_fork/hist + # cat /sys/kernel/tracing/events/net/netif_rx/hist # event histogram # @@ -1786,6 +1787,8 @@ or assigned to a variable and referenced in a subsequent expression:: # echo 'hist:keys=next_pid:us_per_sec=1000000 ...' >> event/trigger # echo 'hist:keys=next_pid:timestamp_secs=common_timestamp/$us_per_sec ...' >> event/trigger +Variables can even hold stacktraces, which are useful with synthetic events. + 2.2.2 Synthetic Events ---------------------- @@ -1826,19 +1829,19 @@ variable reference to a variable on another event:: u64 lat; \ pid_t pid; \ int prio' >> \ - /sys/kernel/debug/tracing/synthetic_events + /sys/kernel/tracing/synthetic_events Reading the tracing/synthetic_events file lists all the currently defined synthetic events, in this case the event defined above:: - # cat /sys/kernel/debug/tracing/synthetic_events + # cat /sys/kernel/tracing/synthetic_events wakeup_latency u64 lat; pid_t pid; int prio An existing synthetic event definition can be removed by prepending the command that defined it with a '!':: # echo '!wakeup_latency u64 lat pid_t pid int prio' >> \ - /sys/kernel/debug/tracing/synthetic_events + /sys/kernel/tracing/synthetic_events At this point, there isn't yet an actual 'wakeup_latency' event instantiated in the event subsystem - for this to happen, a 'hist @@ -1850,20 +1853,20 @@ done, the 'wakeup_latency' synthetic event instance is created. The new event is created under the tracing/events/synthetic/ directory and looks and behaves just like any other event:: - # ls /sys/kernel/debug/tracing/events/synthetic/wakeup_latency + # ls /sys/kernel/tracing/events/synthetic/wakeup_latency enable filter format hist id trigger A histogram can now be defined for the new synthetic event:: # echo 'hist:keys=pid,prio,lat.log2:sort=lat' >> \ - /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/trigger + /sys/kernel/tracing/events/synthetic/wakeup_latency/trigger The above shows the latency "lat" in a power of 2 grouping. Like any other event, once a histogram is enabled for the event, the -output can be displayed by reading the event's 'hist' file. +output can be displayed by reading the event's 'hist' file:: - # cat /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/hist + # cat /sys/kernel/tracing/events/synthetic/wakeup_latency/hist # event histogram # @@ -1908,10 +1911,10 @@ output can be displayed by reading the event's 'hist' file. The latency values can also be grouped linearly by a given size with -the ".buckets" modifier and specify a size (in this case groups of 10). +the ".buckets" modifier and specify a size (in this case groups of 10):: # echo 'hist:keys=pid,prio,lat.buckets=10:sort=lat' >> \ - /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/trigger + /sys/kernel/tracing/events/synthetic/wakeup_latency/trigger # event histogram # @@ -1940,6 +1943,157 @@ the ".buckets" modifier and specify a size (in this case groups of 10). Entries: 16 Dropped: 0 +To save stacktraces, create a synthetic event with a field of type "unsigned long[]" +or even just "long[]". For example, to see how long a task is blocked in an +uninterruptible state:: + + # cd /sys/kernel/tracing + # echo 's:block_lat pid_t pid; u64 delta; unsigned long[] stack;' > dynamic_events + # echo 'hist:keys=next_pid:ts=common_timestamp.usecs,st=stacktrace if prev_state == 2' >> events/sched/sched_switch/trigger + # echo 'hist:keys=prev_pid:delta=common_timestamp.usecs-$ts,s=$st:onmax($delta).trace(block_lat,prev_pid,$delta,$s)' >> events/sched/sched_switch/trigger + # echo 1 > events/synthetic/block_lat/enable + # cat trace + + # tracer: nop + # + # entries-in-buffer/entries-written: 2/2 #P:8 + # + # _-----=> irqs-off/BH-disabled + # / _----=> need-resched + # | / _---=> hardirq/softirq + # || / _--=> preempt-depth + # ||| / _-=> migrate-disable + # |||| / delay + # TASK-PID CPU# ||||| TIMESTAMP FUNCTION + # | | | ||||| | | + <idle>-0 [005] d..4. 521.164922: block_lat: pid=0 delta=8322 stack=STACK: + => __schedule+0x448/0x7b0 + => schedule+0x5a/0xb0 + => io_schedule+0x42/0x70 + => bit_wait_io+0xd/0x60 + => __wait_on_bit+0x4b/0x140 + => out_of_line_wait_on_bit+0x91/0xb0 + => jbd2_journal_commit_transaction+0x1679/0x1a70 + => kjournald2+0xa9/0x280 + => kthread+0xe9/0x110 + => ret_from_fork+0x2c/0x50 + + <...>-2 [004] d..4. 525.184257: block_lat: pid=2 delta=76 stack=STACK: + => __schedule+0x448/0x7b0 + => schedule+0x5a/0xb0 + => schedule_timeout+0x11a/0x150 + => wait_for_completion_killable+0x144/0x1f0 + => __kthread_create_on_node+0xe7/0x1e0 + => kthread_create_on_node+0x51/0x70 + => create_worker+0xcc/0x1a0 + => worker_thread+0x2ad/0x380 + => kthread+0xe9/0x110 + => ret_from_fork+0x2c/0x50 + +A synthetic event that has a stacktrace field may use it as a key in +histogram:: + + # echo 'hist:keys=delta.buckets=100,stack.stacktrace:sort=delta' > events/synthetic/block_lat/trigger + # cat events/synthetic/block_lat/hist + + # event histogram + # + # trigger info: hist:keys=delta.buckets=100,stack.stacktrace:vals=hitcount:sort=delta.buckets=100:size=2048 [active] + # + { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520 + schedule+0x6b/0x110 + io_schedule+0x46/0x80 + bit_wait_io+0x11/0x80 + __wait_on_bit+0x4e/0x120 + out_of_line_wait_on_bit+0x8d/0xb0 + __wait_on_buffer+0x33/0x40 + jbd2_journal_commit_transaction+0x155a/0x19b0 + kjournald2+0xab/0x270 + kthread+0xfa/0x130 + ret_from_fork+0x29/0x50 + } hitcount: 1 + { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520 + schedule+0x6b/0x110 + io_schedule+0x46/0x80 + rq_qos_wait+0xd0/0x170 + wbt_wait+0x9e/0xf0 + __rq_qos_throttle+0x25/0x40 + blk_mq_submit_bio+0x2c3/0x5b0 + __submit_bio+0xff/0x190 + submit_bio_noacct_nocheck+0x25b/0x2b0 + submit_bio_noacct+0x20b/0x600 + submit_bio+0x28/0x90 + ext4_bio_write_page+0x1e0/0x8c0 + mpage_submit_page+0x60/0x80 + mpage_process_page_bufs+0x16c/0x180 + mpage_prepare_extent_to_map+0x23f/0x530 + } hitcount: 1 + { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520 + schedule+0x6b/0x110 + schedule_hrtimeout_range_clock+0x97/0x110 + schedule_hrtimeout_range+0x13/0x20 + usleep_range_state+0x65/0x90 + __intel_wait_for_register+0x1c1/0x230 [i915] + intel_psr_wait_for_idle_locked+0x171/0x2a0 [i915] + intel_pipe_update_start+0x169/0x360 [i915] + intel_update_crtc+0x112/0x490 [i915] + skl_commit_modeset_enables+0x199/0x600 [i915] + intel_atomic_commit_tail+0x7c4/0x1080 [i915] + intel_atomic_commit_work+0x12/0x20 [i915] + process_one_work+0x21c/0x3f0 + worker_thread+0x50/0x3e0 + kthread+0xfa/0x130 + } hitcount: 3 + { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520 + schedule+0x6b/0x110 + schedule_timeout+0x11e/0x160 + __wait_for_common+0x8f/0x190 + wait_for_completion+0x24/0x30 + __flush_work.isra.0+0x1cc/0x360 + flush_work+0xe/0x20 + drm_mode_rmfb+0x18b/0x1d0 [drm] + drm_mode_rmfb_ioctl+0x10/0x20 [drm] + drm_ioctl_kernel+0xb8/0x150 [drm] + drm_ioctl+0x243/0x560 [drm] + __x64_sys_ioctl+0x92/0xd0 + do_syscall_64+0x59/0x90 + entry_SYSCALL_64_after_hwframe+0x72/0xdc + } hitcount: 1 + { delta: ~ 0-99, stack.stacktrace __schedule+0xa19/0x1520 + schedule+0x6b/0x110 + schedule_timeout+0x87/0x160 + __wait_for_common+0x8f/0x190 + wait_for_completion_timeout+0x1d/0x30 + drm_atomic_helper_wait_for_flip_done+0x57/0x90 [drm_kms_helper] + intel_atomic_commit_tail+0x8ce/0x1080 [i915] + intel_atomic_commit_work+0x12/0x20 [i915] + process_one_work+0x21c/0x3f0 + worker_thread+0x50/0x3e0 + kthread+0xfa/0x130 + ret_from_fork+0x29/0x50 + } hitcount: 1 + { delta: ~ 100-199, stack.stacktrace __schedule+0xa19/0x1520 + schedule+0x6b/0x110 + schedule_hrtimeout_range_clock+0x97/0x110 + schedule_hrtimeout_range+0x13/0x20 + usleep_range_state+0x65/0x90 + pci_set_low_power_state+0x17f/0x1f0 + pci_set_power_state+0x49/0x250 + pci_finish_runtime_suspend+0x4a/0x90 + pci_pm_runtime_suspend+0xcb/0x1b0 + __rpm_callback+0x48/0x120 + rpm_callback+0x67/0x70 + rpm_suspend+0x167/0x780 + rpm_idle+0x25a/0x380 + pm_runtime_work+0x93/0xc0 + process_one_work+0x21c/0x3f0 + } hitcount: 1 + + Totals: + Hits: 10 + Entries: 7 + Dropped: 0 + 2.2.3 Hist trigger 'handlers' and 'actions' ------------------------------------------- @@ -2039,9 +2193,9 @@ The following commonly-used handler.action pairs are available: event:: # echo 'wakeup_new_test pid_t pid' >> \ - /sys/kernel/debug/tracing/synthetic_events + /sys/kernel/tracing/synthetic_events - # cat /sys/kernel/debug/tracing/synthetic_events + # cat /sys/kernel/tracing/synthetic_events wakeup_new_test pid_t pid The following hist trigger both defines the missing testpid @@ -2052,26 +2206,26 @@ The following commonly-used handler.action pairs are available: # echo 'hist:keys=$testpid:testpid=pid:onmatch(sched.sched_wakeup_new).\ wakeup_new_test($testpid) if comm=="cyclictest"' >> \ - /sys/kernel/debug/tracing/events/sched/sched_wakeup_new/trigger + /sys/kernel/tracing/events/sched/sched_wakeup_new/trigger - Or, equivalently, using the 'trace' keyword syntax: + Or, equivalently, using the 'trace' keyword syntax:: - # echo 'hist:keys=$testpid:testpid=pid:onmatch(sched.sched_wakeup_new).\ - trace(wakeup_new_test,$testpid) if comm=="cyclictest"' >> \ - /sys/kernel/debug/tracing/events/sched/sched_wakeup_new/trigger + # echo 'hist:keys=$testpid:testpid=pid:onmatch(sched.sched_wakeup_new).\ + trace(wakeup_new_test,$testpid) if comm=="cyclictest"' >> \ + /sys/kernel/tracing/events/sched/sched_wakeup_new/trigger Creating and displaying a histogram based on those events is now just a matter of using the fields and new synthetic event in the tracing/events/synthetic directory, as usual:: # echo 'hist:keys=pid:sort=pid' >> \ - /sys/kernel/debug/tracing/events/synthetic/wakeup_new_test/trigger + /sys/kernel/tracing/events/synthetic/wakeup_new_test/trigger Running 'cyclictest' should cause wakeup_new events to generate wakeup_new_test synthetic events which should result in histogram output in the wakeup_new_test event's hist file:: - # cat /sys/kernel/debug/tracing/events/synthetic/wakeup_new_test/hist + # cat /sys/kernel/tracing/events/synthetic/wakeup_new_test/hist A more typical usage would be to use two events to calculate a latency. The following example uses a set of hist triggers to @@ -2080,14 +2234,14 @@ The following commonly-used handler.action pairs are available: First, we define a 'wakeup_latency' synthetic event:: # echo 'wakeup_latency u64 lat; pid_t pid; int prio' >> \ - /sys/kernel/debug/tracing/synthetic_events + /sys/kernel/tracing/synthetic_events Next, we specify that whenever we see a sched_waking event for a cyclictest thread, save the timestamp in a 'ts0' variable:: # echo 'hist:keys=$saved_pid:saved_pid=pid:ts0=common_timestamp.usecs \ if comm=="cyclictest"' >> \ - /sys/kernel/debug/tracing/events/sched/sched_waking/trigger + /sys/kernel/tracing/events/sched/sched_waking/trigger Then, when the corresponding thread is actually scheduled onto the CPU by a sched_switch event (saved_pid matches next_pid), calculate @@ -2097,19 +2251,19 @@ The following commonly-used handler.action pairs are available: # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:\ onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,\ $saved_pid,next_prio) if next_comm=="cyclictest"' >> \ - /sys/kernel/debug/tracing/events/sched/sched_switch/trigger + /sys/kernel/tracing/events/sched/sched_switch/trigger We also need to create a histogram on the wakeup_latency synthetic event in order to aggregate the generated synthetic event data:: # echo 'hist:keys=pid,prio,lat:sort=pid,lat' >> \ - /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/trigger + /sys/kernel/tracing/events/synthetic/wakeup_latency/trigger Finally, once we've run cyclictest to actually generate some events, we can see the output by looking at the wakeup_latency synthetic event's hist file:: - # cat /sys/kernel/debug/tracing/events/synthetic/wakeup_latency/hist + # cat /sys/kernel/tracing/events/synthetic/wakeup_latency/hist - onmax(var).save(field,.. .) @@ -2135,19 +2289,19 @@ The following commonly-used handler.action pairs are available: # echo 'hist:keys=pid:ts0=common_timestamp.usecs \ if comm=="cyclictest"' >> \ - /sys/kernel/debug/tracing/events/sched/sched_waking/trigger + /sys/kernel/tracing/events/sched/sched_waking/trigger # echo 'hist:keys=next_pid:\ wakeup_lat=common_timestamp.usecs-$ts0:\ onmax($wakeup_lat).save(next_comm,prev_pid,prev_prio,prev_comm) \ if next_comm=="cyclictest"' >> \ - /sys/kernel/debug/tracing/events/sched/sched_switch/trigger + /sys/kernel/tracing/events/sched/sched_switch/trigger When the histogram is displayed, the max value and the saved values corresponding to the max are displayed following the rest of the fields:: - # cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist + # cat /sys/kernel/tracing/events/sched/sched_switch/hist { next_pid: 2255 } hitcount: 239 common_timestamp-ts0: 0 max: 27 @@ -2191,48 +2345,48 @@ The following commonly-used handler.action pairs are available: resulting latency, stored in wakeup_lat, exceeds the current maximum latency, a snapshot is taken. As part of the setup, all the scheduler events are also enabled, which are the events that - will show up in the snapshot when it is taken at some point: + will show up in the snapshot when it is taken at some point:: - # echo 1 > /sys/kernel/debug/tracing/events/sched/enable + # echo 1 > /sys/kernel/tracing/events/sched/enable - # echo 'hist:keys=pid:ts0=common_timestamp.usecs \ - if comm=="cyclictest"' >> \ - /sys/kernel/debug/tracing/events/sched/sched_waking/trigger + # echo 'hist:keys=pid:ts0=common_timestamp.usecs \ + if comm=="cyclictest"' >> \ + /sys/kernel/tracing/events/sched/sched_waking/trigger - # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \ - onmax($wakeup_lat).save(next_prio,next_comm,prev_pid,prev_prio, \ - prev_comm):onmax($wakeup_lat).snapshot() \ - if next_comm=="cyclictest"' >> \ - /sys/kernel/debug/tracing/events/sched/sched_switch/trigger + # echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0: \ + onmax($wakeup_lat).save(next_prio,next_comm,prev_pid,prev_prio, \ + prev_comm):onmax($wakeup_lat).snapshot() \ + if next_comm=="cyclictest"' >> \ + /sys/kernel/tracing/events/sched/sched_switch/trigger When the histogram is displayed, for each bucket the max value and the saved values corresponding to the max are displayed following the rest of the fields. If a snapshot was taken, there is also a message indicating that, - along with the value and event that triggered the global maximum: + along with the value and event that triggered the global maximum:: - # cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist - { next_pid: 2101 } hitcount: 200 - max: 52 next_prio: 120 next_comm: cyclictest \ - prev_pid: 0 prev_prio: 120 prev_comm: swapper/6 + # cat /sys/kernel/tracing/events/sched/sched_switch/hist + { next_pid: 2101 } hitcount: 200 + max: 52 next_prio: 120 next_comm: cyclictest \ + prev_pid: 0 prev_prio: 120 prev_comm: swapper/6 - { next_pid: 2103 } hitcount: 1326 - max: 572 next_prio: 19 next_comm: cyclictest \ - prev_pid: 0 prev_prio: 120 prev_comm: swapper/1 + { next_pid: 2103 } hitcount: 1326 + max: 572 next_prio: 19 next_comm: cyclictest \ + prev_pid: 0 prev_prio: 120 prev_comm: swapper/1 - { next_pid: 2102 } hitcount: 1982 \ - max: 74 next_prio: 19 next_comm: cyclictest \ - prev_pid: 0 prev_prio: 120 prev_comm: swapper/5 + { next_pid: 2102 } hitcount: 1982 \ + max: 74 next_prio: 19 next_comm: cyclictest \ + prev_pid: 0 prev_prio: 120 prev_comm: swapper/5 - Snapshot taken (see tracing/snapshot). Details: - triggering value { onmax($wakeup_lat) }: 572 \ - triggered by event with key: { next_pid: 2103 } + Snapshot taken (see tracing/snapshot). Details: + triggering value { onmax($wakeup_lat) }: 572 \ + triggered by event with key: { next_pid: 2103 } - Totals: - Hits: 3508 - Entries: 3 - Dropped: 0 + Totals: + Hits: 3508 + Entries: 3 + Dropped: 0 In the above case, the event that triggered the global maximum has the key with next_pid == 2103. If you look at the bucket that has @@ -2247,7 +2401,7 @@ The following commonly-used handler.action pairs are available: sched_switch events, which should match the time displayed in the global maximum):: - # cat /sys/kernel/debug/tracing/snapshot + # cat /sys/kernel/tracing/snapshot <...>-2103 [005] d..3 309.873125: sched_switch: prev_comm=cyclictest prev_pid=2103 prev_prio=19 prev_state=D ==> next_comm=swapper/5 next_pid=0 next_prio=120 <idle>-0 [005] d.h3 309.873611: sched_waking: comm=cyclictest pid=2102 prio=19 target_cpu=005 @@ -2310,15 +2464,15 @@ The following commonly-used handler.action pairs are available: $cwnd variable. If the value has changed, a snapshot is taken. As part of the setup, all the scheduler and tcp events are also enabled, which are the events that will show up in the snapshot - when it is taken at some point: + when it is taken at some point:: - # echo 1 > /sys/kernel/debug/tracing/events/sched/enable - # echo 1 > /sys/kernel/debug/tracing/events/tcp/enable + # echo 1 > /sys/kernel/tracing/events/sched/enable + # echo 1 > /sys/kernel/tracing/events/tcp/enable - # echo 'hist:keys=dport:cwnd=snd_cwnd: \ - onchange($cwnd).save(snd_wnd,srtt,rcv_wnd): \ - onchange($cwnd).snapshot()' >> \ - /sys/kernel/debug/tracing/events/tcp/tcp_probe/trigger + # echo 'hist:keys=dport:cwnd=snd_cwnd: \ + onchange($cwnd).save(snd_wnd,srtt,rcv_wnd): \ + onchange($cwnd).snapshot()' >> \ + /sys/kernel/tracing/events/tcp/tcp_probe/trigger When the histogram is displayed, for each bucket the tracked value and the saved values corresponding to that value are displayed @@ -2327,7 +2481,7 @@ The following commonly-used handler.action pairs are available: If a snapshot was taken, there is also a message indicating that, along with the value and event that triggered the snapshot:: - # cat /sys/kernel/debug/tracing/events/tcp/tcp_probe/hist + # cat /sys/kernel/tracing/events/tcp/tcp_probe/hist { dport: 1521 } hitcount: 8 changed: 10 snd_wnd: 35456 srtt: 154262 rcv_wnd: 42112 @@ -2341,10 +2495,10 @@ The following commonly-used handler.action pairs are available: { dport: 443 } hitcount: 211 changed: 10 snd_wnd: 26960 srtt: 17379 rcv_wnd: 28800 - Snapshot taken (see tracing/snapshot). Details:: + Snapshot taken (see tracing/snapshot). Details: - triggering value { onchange($cwnd) }: 10 - triggered by event with key: { dport: 80 } + triggering value { onchange($cwnd) }: 10 + triggered by event with key: { dport: 80 } Totals: Hits: 414 @@ -2361,7 +2515,7 @@ The following commonly-used handler.action pairs are available: And finally, looking at the snapshot data should show at or near the end the event that triggered the snapshot:: - # cat /sys/kernel/debug/tracing/snapshot + # cat /sys/kernel/tracing/snapshot gnome-shell-1261 [006] dN.3 49.823113: sched_stat_runtime: comm=gnome-shell pid=1261 runtime=49347 [ns] vruntime=1835730389 [ns] kworker/u16:4-773 [003] d..3 49.823114: sched_switch: prev_comm=kworker/u16:4 prev_pid=773 prev_prio=120 prev_state=R+ ==> next_comm=kworker/3:2 next_pid=135 next_prio=120 |