diff options
Diffstat (limited to 'tools/perf/Documentation')
-rw-r--r-- | tools/perf/Documentation/itrace.txt | 1 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-arm-coresight.txt | 5 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-c2c.txt | 14 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-config.txt | 7 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-inject.txt | 13 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-intel-pt.txt | 13 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-lock.txt | 20 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-mem.txt | 3 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-record.txt | 8 | ||||
-rw-r--r-- | tools/perf/Documentation/perf-report.txt | 3 |
10 files changed, 75 insertions, 12 deletions
diff --git a/tools/perf/Documentation/itrace.txt b/tools/perf/Documentation/itrace.txt index 6b189669c450..0916bbfe64cb 100644 --- a/tools/perf/Documentation/itrace.txt +++ b/tools/perf/Documentation/itrace.txt @@ -64,6 +64,7 @@ debug messages will or will not be logged. Each flag must be preceded by either '+' or '-'. The flags are: a all perf events + e output only on errors (size configurable - see linkperf:perf-config[1]) o output to stdout If supported, the 'q' option may be repeated to increase the effect. diff --git a/tools/perf/Documentation/perf-arm-coresight.txt b/tools/perf/Documentation/perf-arm-coresight.txt new file mode 100644 index 000000000000..c117fc50a2a9 --- /dev/null +++ b/tools/perf/Documentation/perf-arm-coresight.txt @@ -0,0 +1,5 @@ +Arm CoreSight Support +===================== + +For full documentation, see Documentation/trace/coresight/coresight-perf.rst +in the kernel tree. diff --git a/tools/perf/Documentation/perf-c2c.txt b/tools/perf/Documentation/perf-c2c.txt index f1f7ae6b08d1..5c5eb2def83e 100644 --- a/tools/perf/Documentation/perf-c2c.txt +++ b/tools/perf/Documentation/perf-c2c.txt @@ -19,9 +19,10 @@ C2C stands for Cache To Cache. The perf c2c tool provides means for Shared Data C2C/HITM analysis. It allows you to track down the cacheline contentions. -On x86, the tool is based on load latency and precise store facility events +On Intel, the tool is based on load latency and precise store facility events provided by Intel CPUs. On PowerPC, the tool uses random instruction sampling -with thresholding feature. +with thresholding feature. On AMD, the tool uses IBS op pmu (due to hardware +limitations, perf c2c is not supported on Zen3 cpus). These events provide: - memory address of the access @@ -49,7 +50,8 @@ RECORD OPTIONS -l:: --ldlat:: - Configure mem-loads latency. (x86 only) + Configure mem-loads latency. Supported on Intel and Arm64 processors + only. Ignored on other archs. -k:: --all-kernel:: @@ -135,11 +137,15 @@ Following perf record options are configured by default: -W,-d,--phys-data,--sample-cpu Unless specified otherwise with '-e' option, following events are monitored by -default on x86: +default on Intel: cpu/mem-loads,ldlat=30/P cpu/mem-stores/P +following on AMD: + + ibs_op// + and following on PowerPC: cpu/mem-loads/ diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt index 0420e71698ee..39c890ead2dc 100644 --- a/tools/perf/Documentation/perf-config.txt +++ b/tools/perf/Documentation/perf-config.txt @@ -729,6 +729,13 @@ auxtrace.*:: If the directory does not exist or has the wrong file type, the current directory is used. +itrace.*:: + + debug-log-buffer-size:: + Log size in bytes to output when using the option --itrace=d+e + Refer 'itrace' option of linkperf:perf-script[1] or + linkperf:perf-report[1]. The default is 16384. + daemon.*:: daemon.base:: diff --git a/tools/perf/Documentation/perf-inject.txt b/tools/perf/Documentation/perf-inject.txt index ffc293fdf61d..c972032f4ca0 100644 --- a/tools/perf/Documentation/perf-inject.txt +++ b/tools/perf/Documentation/perf-inject.txt @@ -25,10 +25,17 @@ OPTIONS ------- -b:: --build-ids:: - Inject build-ids into the output stream + Inject build-ids of DSOs hit by samples into the output stream. + This means it needs to process all SAMPLE records to find the DSOs. ---buildid-all: - Inject build-ids of all DSOs into the output stream +--buildid-all:: + Inject build-ids of all DSOs into the output stream regardless of hits + and skip SAMPLE processing. + +--known-build-ids=:: + Override build-ids to inject using these comma-separated pairs of + build-id and path. Understands file://filename to read these pairs + from a file, which can be generated with perf buildid-list. -v:: --verbose:: diff --git a/tools/perf/Documentation/perf-intel-pt.txt b/tools/perf/Documentation/perf-intel-pt.txt index 3dc3f0ccbd51..92464a5d7eaf 100644 --- a/tools/perf/Documentation/perf-intel-pt.txt +++ b/tools/perf/Documentation/perf-intel-pt.txt @@ -943,12 +943,15 @@ event packets are recorded only if the "pwr_evt" config term was used. Refer to the config terms section above. The power events record information about C-state changes, whereas CBR is indicative of CPU frequency. perf script "event,synth" fields display information like this: + cbr: cbr: 22 freq: 2189 MHz (200%) mwait: hints: 0x60 extensions: 0x1 pwre: hw: 0 cstate: 2 sub-cstate: 0 exstop: ip: 1 pwrx: deepest cstate: 2 last cstate: 2 wake reason: 0x4 + Where: + "cbr" includes the frequency and the percentage of maximum non-turbo "mwait" shows mwait hints and extensions "pwre" shows C-state transitions (to a C-state deeper than C0) and @@ -956,6 +959,7 @@ Where: "exstop" indicates execution stopped and whether the IP was recorded exactly, "pwrx" indicates return to C0 + For more details refer to the Intel 64 and IA-32 Architectures Software Developer Manuals. @@ -969,8 +973,10 @@ are quite important. Users must know if what they are seeing is a complete picture or not. The "e" option may be followed by flags which affect what errors will or will not be reported. Each flag must be preceded by either '+' or '-'. The flags supported by Intel PT are: + -o Suppress overflow errors -l Suppress trace data lost errors + For example, for errors but not overflow or data lost errors: --itrace=e-o-l @@ -980,11 +986,16 @@ decoded packets and instructions. Note that this option slows down the decoder and that the resulting file may be very large. The "d" option may be followed by flags which affect what debug messages will or will not be logged. Each flag must be preceded by either '+' or '-'. The flags support by Intel PT are: + -a Suppress logging of perf events +a Log all perf events + +e Output only on decoding errors (size configurable) +o Output to stdout instead of "intel_pt.log" + By default, logged perf events are filtered by any specified time ranges, but -flag +a overrides that. +flag +a overrides that. The +e flag can be useful for analyzing errors. By +default, the log size in that case is 16384 bytes, but can be altered by +linkperf:perf-config[1] e.g. perf config itrace.debug-log-buffer-size=30000 In addition, the period of the "instructions" event can be specified. e.g. diff --git a/tools/perf/Documentation/perf-lock.txt b/tools/perf/Documentation/perf-lock.txt index 193c5d8b8db9..3b1e16563b79 100644 --- a/tools/perf/Documentation/perf-lock.txt +++ b/tools/perf/Documentation/perf-lock.txt @@ -40,6 +40,10 @@ COMMON OPTIONS --verbose:: Be more verbose (show symbol address, etc). +-q:: +--quiet:: + Do not show any message. (Suppress -v) + -D:: --dump-raw-trace:: Dump raw trace in ASCII. @@ -94,6 +98,11 @@ REPORT OPTIONS EventManager_De 1845 1 636 futex-default-S 1609 0 0 +-E:: +--entries=<value>:: + Display this many entries. + + INFO OPTIONS ------------ @@ -105,6 +114,7 @@ INFO OPTIONS --map:: dump map of lock instances (address:name table) + CONTENTION OPTIONS -------------- @@ -148,6 +158,16 @@ CONTENTION OPTIONS --map-nr-entries:: Maximum number of BPF map entries (default: 10240). +--max-stack:: + Maximum stack depth when collecting lock contention (default: 8). + +--stack-skip + Number of stack depth to skip when finding a lock caller (default: 3). + +-E:: +--entries=<value>:: + Display this many entries. + SEE ALSO -------- diff --git a/tools/perf/Documentation/perf-mem.txt b/tools/perf/Documentation/perf-mem.txt index 66177511c5c4..005c95580b1e 100644 --- a/tools/perf/Documentation/perf-mem.txt +++ b/tools/perf/Documentation/perf-mem.txt @@ -85,7 +85,8 @@ RECORD OPTIONS Be more verbose (show counter open errors, etc) --ldlat <n>:: - Specify desired latency for loads event. (x86 only) + Specify desired latency for loads event. Supported on Intel and Arm64 + processors only. Ignored on other archs. In addition, for report all perf report options are valid, and for record all perf record options. diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt index 0228efc96686..e41ae950fdc3 100644 --- a/tools/perf/Documentation/perf-record.txt +++ b/tools/perf/Documentation/perf-record.txt @@ -400,6 +400,7 @@ following filters are defined: For the platforms with Intel Arch LBR support (12th-Gen+ client or 4th-Gen Xeon+ server), the save branch type is unconditionally enabled when the taken branch stack sampling is enabled. + - priv: save privilege state during sampling in case binary is not available later + The option requires at least one branch type among any, any_call, any_ret, ind_call, cond. @@ -410,6 +411,7 @@ is enabled for all the sampling events. The sampled branch type is the same for The various filters must be specified as a comma separated list: --branch-filter any_ret,u,k Note that this feature may not be available on all processors. +-W:: --weight:: Enable weightened sampling. An additional weight is recorded per sample and can be displayed with the weight and local_weight sort keys. This currently works for TSX @@ -433,8 +435,10 @@ if combined with -a or -C options. -D:: --delay=:: After starting the program, wait msecs before measuring (-1: start with events -disabled). This is useful to filter out the startup phase of the program, which -is often very different. +disabled), or enable events only for specified ranges of msecs (e.g. +-D 10-20,30-40 means wait 10 msecs, enable for 10 msecs, wait 10 msecs, enable +for 10 msecs, then stop). Note, delaying enabling of events is useful to filter +out the startup phase of the program, which is often very different. -I:: --intr-regs:: diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt index 24efc0583c93..4533db2ee56b 100644 --- a/tools/perf/Documentation/perf-report.txt +++ b/tools/perf/Documentation/perf-report.txt @@ -73,7 +73,7 @@ OPTIONS Sort histogram entries by given key(s) - multiple keys can be specified in CSV format. Following sort keys are available: pid, comm, dso, symbol, parent, cpu, socket, srcline, weight, - local_weight, cgroup_id. + local_weight, cgroup_id, addr. Each key has following meaning: @@ -114,6 +114,7 @@ OPTIONS - local_ins_lat: Local instruction latency version - p_stage_cyc: On powerpc, this presents the number of cycles spent in a pipeline stage. And currently supported only on powerpc. + - addr: (Full) virtual address of the sampled instruction By default, comm, dso and symbol keys are used. (i.e. --sort comm,dso,symbol) |