tools/perf/Documentation/perf-sched.txt


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233

perf-sched(1)
=============

NAME
----
perf-sched - Tool to trace/measure scheduler properties (latencies)

SYNOPSIS
--------
[verse]
'perf sched' {record|latency|map|replay|script|timehist}

DESCRIPTION
-----------
There are several variants of 'perf sched':

  'perf sched record <command>' to record the scheduling events
  of an arbitrary workload.

  'perf sched latency' to report the per task scheduling latencies
  and other scheduling properties of the workload.

   Example usage:
       perf sched record -- sleep 1
       perf sched latency

  -------------------------------------------------------------------------------------------------------------------------------------------
  Task                  |   Runtime ms  |  Count   | Avg delay ms    | Max delay ms    | Max delay start           | Max delay end          |
  -------------------------------------------------------------------------------------------------------------------------------------------
  perf:(2)              |      2.804 ms |       66 | avg:   0.524 ms | max:   1.069 ms | max start: 254752.314960 s | max end: 254752.316029 s
  NetworkManager:1343   |      0.372 ms |       13 | avg:   0.008 ms | max:   0.013 ms | max start: 254751.551153 s | max end: 254751.551166 s
  kworker/1:2-xfs:4649  |      0.012 ms |        1 | avg:   0.008 ms | max:   0.008 ms | max start: 254751.519807 s | max end: 254751.519815 s
  kworker/3:1-xfs:388   |      0.011 ms |        1 | avg:   0.006 ms | max:   0.006 ms | max start: 254751.519809 s | max end: 254751.519815 s
  sleep:147736          |      0.938 ms |        3 | avg:   0.006 ms | max:   0.007 ms | max start: 254751.313817 s | max end: 254751.313824 s

  It shows Runtime(time that a task spent actually running on the CPU),
  Count(number of times a delay was calculated) and delay(time that a
  task was ready to run but was kept waiting).

  Tasks with the same command name are merged and the merge count is
  given within (), However if -p option is used, pid is mentioned.

  'perf sched script' to see a detailed trace of the workload that
   was recorded (aliased to 'perf script' for now).

  'perf sched replay' to simulate the workload that was recorded
  via perf sched record. (this is done by starting up mockup threads
  that mimic the workload based on the events in the trace. These
  threads can then replay the timings (CPU runtime and sleep patterns)
  of the workload as it occurred when it was recorded - and can repeat
  it a number of times, measuring its performance.)

  'perf sched map' to print a textual context-switching outline of
  workload captured via perf sched record.  Columns stand for
  individual CPUs, and the two-letter shortcuts stand for tasks that
  are running on a CPU. A '*' denotes the CPU that had the event, and
  a dot signals an idle CPU.

  'perf sched timehist' provides an analysis of scheduling events.
    
    Example usage:
        perf sched record -- sleep 1
        perf sched timehist
    
   By default it shows the individual schedule events, including the wait
   time (time between sched-out and next sched-in events for the task), the
   task scheduling delay (time between runnable and actually running) and
   run time for the task:
    
                time    cpu  task name             wait time  sch delay   run time
                             [tid/pid]                (msec)     (msec)     (msec)
      -------------- ------  --------------------  ---------  ---------  ---------
        79371.874569 [0011]  gcc[31949]                0.014      0.000      1.148
        79371.874591 [0010]  gcc[31951]                0.000      0.000      0.024
        79371.874603 [0010]  migration/10[59]          3.350      0.004      0.011
        79371.874604 [0011]  <idle>                    1.148      0.000      0.035
        79371.874723 [0005]  <idle>                    0.016      0.000      1.383
        79371.874746 [0005]  gcc[31949]                0.153      0.078      0.022
    ...
    
   Times are in msec.usec.

OPTIONS
-------
-i::
--input=<file>::
        Input file name. (default: perf.data unless stdin is a fifo)

-v::
--verbose::
        Be more verbose. (show symbol address, etc)

-D::
--dump-raw-trace=::
        Display verbose dump of the sched data.

-f::
--force::
	Don't complain, do it.

OPTIONS for 'perf sched latency'
-------------------------------

-C::
--CPU <n>::
        CPU to profile on.

-p::
--pids::
        latency stats per pid instead of per command name.

-s::
--sort <key[,key2...]>::
        sort by key(s): runtime, switch, avg, max
        by default it's sorted by "avg ,max ,switch ,runtime".

OPTIONS for 'perf sched map'
----------------------------

--compact::
	Show only CPUs with activity. Helps visualizing on high core
	count systems.

--cpus::
	Show just entries with activities for the given CPUs.

--color-cpus::
	Highlight the given cpus.

--color-pids::
	Highlight the given pids.

--task-name <task>::
	Map output only for the given task name(s). Separate the
	task names with a comma (without whitespace). The sched-out
	time is printed and is represented by '*-' for the given
	task name(s).
	('-' indicates other tasks while '.' is idle).

--fuzzy-name::
	Given task name(s) can be partially matched (fuzzy matching).

OPTIONS for 'perf sched timehist'
---------------------------------
-k::
--vmlinux=<file>::
    vmlinux pathname

--kallsyms=<file>::
    kallsyms pathname

-g::
--call-graph::
	Display call chains if present (default on).

--max-stack::
	Maximum number of functions to display in backtrace, default 5.

-C=::
--cpu=::
	Only show events for the given CPU(s) (comma separated list).

-p=::
--pid=::
	Only show events for given process ID (comma separated list).

-t=::
--tid=::
	Only show events for given thread ID (comma separated list).

-s::
--summary::
    Show only a summary of scheduling by thread with min, max, and average
    run times (in sec) and relative stddev.

-S::
--with-summary::
    Show all scheduling events followed by a summary by thread with min,
    max, and average run times (in sec) and relative stddev.

--symfs=<directory>::
    Look for files with symbols relative to this directory.

-V::
--cpu-visual::
	Show visual aid for sched switches by CPU: 'i' marks idle time,
	's' are scheduler events.

-w::
--wakeups::
	Show wakeup events.

-M::
--migrations::
	Show migration events.

-n::
--next::
	Show next task.

-I::
--idle-hist::
	Show idle-related events only.

--time::
	Only analyze samples within given time window: <start>,<stop>. Times
	have the format seconds.microseconds. If start is not given (i.e., time
	string is ',x.y') then analysis starts at the beginning of the file. If
	stop time is not given (i.e, time string is 'x.y,') then analysis goes
	to end of file.

--state::
	Show task state when it switched out.

--show-prio::
	Show task priority.

--prio::
	Only show events for given task priority(ies). Multiple priorities can be
	provided as a comma-separated list with no spaces: 0,120. Ranges of
	priorities are specified with -: 120-129. A combination of both can also be
	provided: 0,120-129.

OPTIONS for 'perf sched replay'
------------------------------

-r::
--repeat <n>::
	repeat the workload n times (0: infinite). Default is 10.

SEE ALSO
--------
linkperf:perf-record[1]