summaryrefslogtreecommitdiffstats
path: root/tools/perf/util/auxtrace.h (follow)
Commit message (Collapse)AuthorAgeFilesLines
* perf auxtrace: Change to use SMP memory barriersLeo Yan2021-06-081-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The kernel and the userspace tool can access the AUX ring buffer head and tail from different CPUs, thus SMP class of barriers are required on SMP system. This patch changes to use SMP barriers to replace mb() and rmb() barriers. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: x86@kernel.org Link: http://lore.kernel.org/lkml/20210602103007.184993-6-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Factor out itrace_do_parse_synth_opts()Adrian Hunter2021-06-011-0/+10
| | | | | | | | | | Factor out itrace_do_parse_synth_opts() so that it can be reused. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20210530192308.7382-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Make perf_event__process_auxtrace*() callableArnaldo Carvalho de Melo2021-05-251-3/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we'll use it in the upcoming python interfaces and when built with: make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1 +NO_LIBZSTD=1 NO_LIBCAP=1 NO_SYSCALL_TABLE=1 make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1 NO_LIBZSTD=1 NO_LIBCAP=1 +NO_SYSCALL_TABLE=1 BUILD: Doing 'make -j24' parallel build <SNIP> CC /tmp/tmp.rGrdpQlTCr/builtin-daemon.o In file included from util/events_stats.h:8, from util/evlist.h:12, from builtin-script.c:18: builtin-script.c: In function ‘process_auxtrace_error’: util/auxtrace.h:708:57: error: called object is not a function or function pointer 708 | #define perf_event__process_auxtrace_error 0 | ^ builtin-script.c:2443:16: note: in expansion of macro ‘perf_event__process_auxtrace_error’ 2443 | return perf_event__process_auxtrace_error(session, event); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ MKDIR /tmp/tmp.rGrdpQlTCr/tests/ MKDIR /tmp/tmp.rGrdpQlTCr/bench/ CC /tmp/tmp.rGrdpQlTCr/tests/builtin-test.o CC /tmp/tmp.rGrdpQlTCr/bench/sched-messaging.o builtin-script.c:2444:1: error: control reaches end of non-void function [-Werror=return-type] 2444 | } | ^ To: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Allow buffers to be mapped read / writeAdrian Hunter2021-05-121-1/+5
| | | | | | | | | | To support in-place update, allow buffers to be mapped read / write. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20210430070309.17624-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf inject: Add --vm-time-correlation optionAdrian Hunter2021-05-121-0/+6
| | | | | | | | | | | | | | | | | | Intel PT timestamps are affected by virtualization. Add a new option that will allow the Intel PT decoder to correlate the timestamps and translate the virtual machine timestamps to host timestamps. The advantages of making this a separate step, rather than a part of normal decoding are that it is simpler to implement, and it needs to be done only once. This patch adds only the option. Later patches add Intel PT support. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20210430070309.17624-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Add Z itrace option for timeless decodingAdrian Hunter2021-05-121-0/+2
| | | | | | | | | | | | Issues correlating timestamps can be avoided with timeless decoding. Add an option for that, so that timeless decoding can be used even when timestamps are present. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20210430070309.17624-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Automatically group aux-output eventsAdrian Hunter2021-02-181-0/+6
| | | | | | | | | | | | | | | | | | | aux-output events need to have an AUX area event as the group leader. However, grouping events does not allow the AUX area event to be given an address filter because the --filter option must come after the event, which conflicts with the grouping syntax. To allow filtering in that case, automatically create a group since that is the requirement anyway. Example: (requires Intel Tremont) perf record -c 500 -e 'intel_pt//u' --filter 'filter main @ /bin/ls' -e 'cycles/aux-output/pp' ls Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20210121140418.14705-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Add itrace option '-M' for memory eventsLeo Yan2020-11-111-0/+2
| | | | | | | | | This patch is to add itrace option '-M' to synthesize memory event. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20201106094853.21082-7-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Add itrace 'q' option for quicker, less detailed decodingAdrian Hunter2020-08-061-0/+3
| | | | | | | | | | | | | The 'q' option is for modes of decoding that are quicker because they skip or omit decoding some aspects of trace data. If supported, the 'q' option may be repeated to increase the effect. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20200710151104.15137-11-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Add optional log flags to the itrace 'd' optionAdrian Hunter2020-08-061-1/+9
| | | | | | | | | | | | | Allow the 'd' option to be followed by flags which will affect what debug messages will or will not be reported. Each flag must be preceded by either '+' or '-'. The flags are: a all perf events Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20200710151104.15137-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Add optional error flags to the itrace 'e' optionAdrian Hunter2020-08-061-1/+11
| | | | | | | | | | | | | | Allow the 'e' option to be followed by flags which will affect what errors will or will not be reported. Each flag must be preceded by either '+' or '-'. The flags are: o overflow l trace data lost Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20200710151104.15137-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Add missing itrace options to help textAdrian Hunter2020-08-061-1/+5
| | | | | | | | | | Add missing itrace options o, G and L. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20200710151104.15137-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Add four itrace optionsTan Xiaojun2020-06-011-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is to add four options to synthesize events which are described as below: 'f': synthesize first level cache events 'm': synthesize last level cache events 't': synthesize TLB events 'a': synthesize remote access events This four options will be used by ARM SPE as their first consumer. Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com> Tested-by: James Clark <james.clark@arm.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Grant <al.grant@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Link: http://lore.kernel.org/lkml/20200530122442.490-3-leo.yan@linaro.org Signed-off-by: James Clark <james.clark@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Add option to synthesize branch stack for regular eventsAdrian Hunter2020-05-051-0/+2
| | | | | | | | | | | There is an existing option to synthesize branch stacks for synthesized events. Add a new option to synthesize branch stacks for regular events. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20200429150751.12570-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Add an option to synthesize callchains for regular eventsAdrian Hunter2020-04-161-0/+2
| | | | | | | | | | | Currently, callchains can be synthesized only for synthesized events. Add an itrace option to synthesize callchains for regular events. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20200401101613.6201-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Add ->evsel_is_auxtrace() callbackAdrian Hunter2020-04-161-0/+12
| | | | | | | | | | | | | | Add ->evsel_is_auxtrace() callback to identify if a selected event is an AUX area event. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: http://lore.kernel.org/lkml/20200401101613.6201-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Add auxtrace_record__read_finish()Adrian Hunter2020-02-181-0/+6
| | | | | | | | | | | | | | | All ->read_finish() implementations are doing the same thing. Add a helper function so that they can share the same implementation. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Wei Li <liwei391@huawei.com> Link: http://lore.kernel.org/lkml/20200217082300.6301-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Add support for queuing AUX area samplesAdrian Hunter2019-11-221-0/+15
| | | | | | | | | | | | | | | | Add functions to queue AUX area samples in advance (auxtrace_queue_data()) or individually (auxtrace_queues__add_sample()) or find out what queue a sample belongs on (auxtrace_queues__sample_queue()). auxtrace_queue_data() can also queue snapshot data which keeps snapshots and samples ordered with respect to each other in case support for that is desired. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-12-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Add support for dumping AUX area samplesAdrian Hunter2019-11-221-0/+11
| | | | | | | | | | | | | | | Add support for dumping AUX area samples i.e. via the perf script/report -D (--dump-raw-trace) option. Committer notes: Add __maybe_unused to the two args for auxtrace__dump_auxtrace_sample() for when we don't HAVE_AUXTRACE_SUPPORT. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-10-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Add support for AUX area sample recordingAdrian Hunter2019-11-221-0/+17
| | | | | | | | | | | | | | | | Add support for parsing and validating AUX area sample options. At present, the only option is the sample size, but it is also necessary to ensure that events are in a group with an AUX area event as the leader. Committer note: Add missing 'static inline' in front of auxtrace_parse_sample_options() for when we don't HAVE_AUXTRACE_SUPPORT. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Add auxtrace_cache__remove()Adrian Hunter2019-11-061-0/+1
| | | | | | | | | | | | | | | | | | | | Add auxtrace_cache__remove(). Intel PT uses an auxtrace_cache to store the results of code-walking, so that the same block of instructions does not have to be decoded repeatedly. However, when there are text poke events, the associated cache entries need to be removed. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: x86@kernel.org Link: http://lore.kernel.org/lkml/20191025130000.13032-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Rename 'struct perf_mmap' to 'struct mmap'Jiri Olsa2019-09-251-4/+4
| | | | | | | | | | | | | Rename 'struct perf_evlist' to 'struct evlist', so we don't have a name clash when we add 'struct perf_mmap' to libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lore.kernel.org/lkml/20190913132355.21634-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Move event synthesizing routines to separate headerArnaldo Carvalho de Melo2019-09-201-15/+2
| | | | | | | | | | | | | Those are the only routines using the perf_event__handler_t typedef and are all related, so move to a separate header to reduce the header dependency tree, lots of places were getting event.h and even stdio.h, limits.h indirectly, so fix those as well. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-yvx9u1mf7baq6cu1abfhbqgs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Add missing 'struct perf_sample' forward declarationArnaldo Carvalho de Melo2019-09-201-0/+1
| | | | | | | | | | Its needed, was being obtained indirectly, fix it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-c3k1il7sm28old4e22nwlm7l@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Uninline functions that touch perf_sessionArnaldo Carvalho de Melo2019-09-011-36/+7
| | | | | | | | | | | | So that we don't carry the session.h include directive in auxtrace.h, which in turn opens a can of worms of files that were getting all sorts of things via that include, fix them all. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-d2d83aovpgri2z75wlitquni@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Move 'struct events_stats' and prototypes to separate headerArnaldo Carvalho de Melo2019-09-011-0/+5
| | | | | | | | | | | This will allow us to untangle the header dependency a bit more, as some places will not need event.h anymore. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-enqncj29ovzaat3cd9203rwl@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Remove debug.h from header files not needing itArnaldo Carvalho de Melo2019-08-291-1/+1
| | | | | | | | | | | And fix the fallout, adding it to places that must have it since they use its definitions. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-1s3jel4i26chq2g0lydoz7i3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Remove needless perf.h include directive from headersArnaldo Carvalho de Melo2019-08-291-1/+0
| | | | | | | | | | | Its not needed there, add it to the places that need it and were getting it via those headers. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-5yulx1u16vyd0zmrbg1tjhju@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Move everything related to sys_perf_event_open() to perf-sys.hArnaldo Carvalho de Melo2019-08-291-0/+1
| | | | | | | | | | | | | | | And remove unneeded include directives from perf-sys.h to prune the header dependency tree. Fixup the fallout in places where definitions were being used without the needed include directives that were being satisfied because they were in perf-sys.h. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-7b1zvugiwak4ibfa3j6ott7f@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* libperf: Rename the PERF_RECORD_ structs to have a "perf" prefixJiri Olsa2019-08-291-4/+4
| | | | | | | | | | | | | Even more, to have a "perf_record_" prefix, so that they match the PERF_RECORD_ enum they map to. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190828135717.7245-23-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add itrace option 'o' to synthesize aux-output eventsAdrian Hunter2019-08-141-0/+3
| | | | | | | | | | | | | | Add itrace option 'o' to synthesize events recorded in the AUX area due to the use of perf record's aux-output config term. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190806084606.4021-5-alexander.shishkin@linux.intel.com Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf record: Add an option to take an AUX snapshot on exitAlexander Shishkin2019-08-141-1/+1
| | | | | | | | | | | | | | | | | | | | | It is sometimes useful to generate a snapshot when perf record exits; I've been using a wrapper script around the workload that would do a killall -USR2 perf when the workload exits. This patch makes it easier and also works when perf record is attached to a pre-existing task. A new snapshot option 'e' can be specified in -S to enable this behavior: root@elsewhere:~# perf record -e intel_pt// -Se sleep 1 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.085 MB perf.data ] Co-developed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190806144101.62892-1-alexander.shishkin@linux.intel.com [ Fixed up !HAVE_AUXTRACE_SUPPORT build in builtin-record.c, adding 2 missing __maybe_unused ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf evlist: Rename struct perf_evlist to struct evlistJiri Olsa2019-07-291-12/+12
| | | | | | | | | | | | | | | | | | | | Rename struct perf_evlist to struct evlist, so we don't have a name clash when we add struct perf_evlist in libperf. Committer notes: Added fixes to build on arm64, from Jiri and from me (tools/perf/util/cs-etm.c) Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* Merge tag 'perf-core-for-mingo-5.3-20190611' of ↵Ingo Molnar2019-06-171-0/+34
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: perf record: Alexey Budankov: - Allow mixing --user-regs with --call-graph=dwarf, making sure that the minimal set of registers for DWARF unwinding is present in the set of user registers requested to be present in each sample, while warning the user that this may make callchains unreliable if more that the minimal set of registers is needed to unwind. yuzhoujian: - Add support to collect callchains from kernel or user space only, IOW allow setting the perf_event_attr.exclude_callchain_{kernel,user} bits from the command line. perf trace: Arnaldo Carvalho de Melo: - Remove x86_64 specific syscall numbers from the augmented_raw_syscalls BPF in-kernel collector of augmented raw_syscalls:sys_{enter,exit} payloads, use instead the syscall numbers obtainer either by the arch specific syscalltbl generators or from audit-libs. - Allow 'perf trace' to ask for the number of bytes to collect for string arguments, for now ask for PATH_MAX, i.e. the whole pathnames, which ends up being just a way to speficy which syscall args are pathnames and thus should be read using bpf_probe_read_str(). - Skip unknown syscalls when expanding strace like syscall groups. This helps using the 'string' group of syscalls to work in arm64, where some of the syscalls present in x86_64 that deal with strings, for instance 'access', are deprecated and this should not be asked for tracing. Leo Yan: - Exit when failing to build eBPF program. perf config: Arnaldo Carvalho de Melo: - Bail out when a handler returns failure for a key-value pair. This helps with cases where processing a key-value pair is not just a matter of setting some tool specific knob, involving, for instance building a BPF program to then attach to the list of events 'perf trace' will use, e.g. augmented_raw_syscalls.c. perf.data: Kan Liang: - Read and store die ID information available in new Intel processors in CPUID.1F in the CPU topology written in the perf.data header. perf stat: Kan Liang: - Support per-die aggregation. Documentation: Arnaldo Carvalho de Melo: - Update perf.data documentation about the CPU_TOPOLOGY, MEM_TOPOLOGY, CLOCKID and DIR_FORMAT headers. Song Liu: - Add description of headers HEADER_BPF_PROG_INFO and HEADER_BPF_BTF. Leo Yan: - Update default value for llvm.clang-bpf-cmd-template in 'man perf-config'. JVMTI: Jiri Olsa: - Address gcc string overflow warning for strncpy() core: - Remove superfluous nthreads system_wide setup in perf_evsel__alloc_fd(). Intel PT: Adrian Hunter: - Add support for samples to contain IPC ratio, collecting cycles information from CYC packets, showing the IPC info periodically, because Intel PT does not update the cycle count on every branch or instruction, the incremental values will often be zero. When there are values, they will be the number of instructions and number of cycles since the last update, and thus represent the average IPC since the last IPC value. E.g.: # perf record --cpu 1 -m200000 -a -e intel_pt/cyc/u sleep 0.0001 rounding mmap pages size to 1024M (262144 pages) [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 2.208 MB perf.data ] # perf script --insn-trace --xed -F+ipc,-dso,-cpu,-tid # <SNIP + add line numbering to make sense of IPC counts e.g.: (18/3)> 1 cc1 63501.650479626: 7f5219ac27bf _int_free+0x3f jnz 0x7f5219ac2af0 IPC: 0.81 (36/44) 2 cc1 63501.650479626: 7f5219ac27c5 _int_free+0x45 cmp $0x1f, %rbp 3 cc1 63501.650479626: 7f5219ac27c9 _int_free+0x49 jbe 0x7f5219ac2b00 4 cc1 63501.650479626: 7f5219ac27cf _int_free+0x4f test $0x8, %al 5 cc1 63501.650479626: 7f5219ac27d1 _int_free+0x51 jnz 0x7f5219ac2b00 6 cc1 63501.650479626: 7f5219ac27d7 _int_free+0x57 movq 0x13c58a(%rip), %rcx 7 cc1 63501.650479626: 7f5219ac27de _int_free+0x5e mov %rdi, %r12 8 cc1 63501.650479626: 7f5219ac27e1 _int_free+0x61 movq %fs:(%rcx), %rax 9 cc1 63501.650479626: 7f5219ac27e5 _int_free+0x65 test %rax, %rax 10 cc1 63501.650479626: 7f5219ac27e8 _int_free+0x68 jz 0x7f5219ac2821 11 cc1 63501.650479626: 7f5219ac27ea _int_free+0x6a leaq -0x11(%rbp), %rdi 12 cc1 63501.650479626: 7f5219ac27ee _int_free+0x6e mov %rdi, %rsi 13 cc1 63501.650479626: 7f5219ac27f1 _int_free+0x71 shr $0x4, %rsi 14 cc1 63501.650479626: 7f5219ac27f5 _int_free+0x75 cmpq %rsi, 0x13caf4(%rip) 15 cc1 63501.650479626: 7f5219ac27fc _int_free+0x7c jbe 0x7f5219ac2821 16 cc1 63501.650479626: 7f5219ac2821 _int_free+0xa1 cmpq 0x13f138(%rip), %rbp 17 cc1 63501.650479626: 7f5219ac2828 _int_free+0xa8 jnbe 0x7f5219ac28d8 18 cc1 63501.650479626: 7f5219ac28d8 _int_free+0x158 testb $0x2, 0x8(%rbx) 19 cc1 63501.650479628: 7f5219ac28dc _int_free+0x15c jnz 0x7f5219ac2ab0 IPC: 6.00 (18/3) <SNIP> - Allow using time ranges with Intel PT, i.e. these features, already present but not optimially usable with Intel PT, should be now: Select the second 10% time slice: $ perf script --time 10%/2 Select from 0% to 10% time slice: $ perf script --time 0%-10% Select the first and second 10% time slices: $ perf script --time 10%/1,10%/2 Select from 0% to 10% and 30% to 40% slices: $ perf script --time 0%-10%,30%-40% cs-etm (ARM): Mathieu Poirier: - Add support for CPU-wide trace scenarios. s390: Thomas Richter: - Fix missing kvm module load for s390. - Fix OOM error in TUI mode on s390 - Support s390 diag event display when doing analysis on !s390 architectures. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * perf auxtrace: Add perf time interval to itrace_synth_opsAdrian Hunter2019-06-101-0/+34
| | | | | | | | | | | | | | | | | | | | | | | | Instruction trace decoders can optimize output based on what time intervals will be filtered, so pass that information in itrace_synth_ops. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190604130017.31207-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* | treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288Thomas Gleixner2019-06-051-10/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms and conditions of the gnu general public license version 2 as published by the free software foundation this program is distributed in the hope it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 263 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190529141901.208660670@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* perf auxtrace: Add timestamp to auxtrace errorsAdrian Hunter2019-02-061-1/+1
| | | | | | | | | | | The timestamp can use useful to find part of a trace that has an error without outputting all of the trace e.g. using the itrace 's' option to skip initial number of events. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20190206103947.15750-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Define auxtrace record alignmentAdrian Hunter2019-02-061-0/+3
| | | | | | | | | | | | | Define auxtrace record alignment so that it can be referenced elsewhere. Note this is preparation for patch "perf intel-pt: Fix overlap calculation for padding" Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20190206103947.15750-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf script: Make itrace script default to all callsAndi Kleen2018-10-241-1/+4
| | | | | | | | | | | | | | | | | | | | | | | By default 'perf script' for itrace outputs sampled instructions or branches. In my experience this is confusing to users because it's hard to correlate with real program behavior. The sampling makes sense for tools like 'perf report' that actually sample to reduce the run time, but run time is normally not a problem for 'perf script'. It's better to give an accurate representation of the program flow. Default 'perf script' to output all calls for itrace. That's a much saner default. The old behavior can be still requested with 'perf script' --itrace=ibxwpe100000 v2: Fix ETM build failure v3: Really fix ETM build failure (Kim Phillips) Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Leo Yan <leo.yan@linaro.org> Link: http://lkml.kernel.org/r/20180920180540.14039-3-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Include missing asm/bitsperlong.h to get BITS_PER_LONGArnaldo Carvalho de Melo2018-10-081-0/+1
| | | | | | | | | | | | | | | The auxtrace.h header references BITS_PER_LONG without including the header where it is defined, getting it by luck from some other header, fix it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Sverdlin <alexander.sverdlin@nokia.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-v04ydmbh7tvpcctf3zld9j9s@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Report itrace options in helpAndi Kleen2018-09-191-0/+19
| | | | | | | | | | | | | | I often forget all the options that --itrace accepts. Instead of burying them in the man page only report them in the normal command line help too to make them easier accessible. v2: Align Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kim Phillips <kim.phillips@arm.com> Link: http://lkml.kernel.org/r/20180914031038.4160-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add 'struct perf_mmap' arg to record__write()Jiri Olsa2018-09-191-0/+1
| | | | | | | | | | | | | | The struct perf_mmap map argument will hold the file pointer to write the data to. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180913125450.21342-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Pass struct perf_mmap into mmap__read* functionsJiri Olsa2018-09-191-2/+3
| | | | | | | | | | | | | | | The perf_mmap struct will hold a file pointer to write the mmap's contents, so we need to propagate it down the stack to record__write callers instead of its member the auxtrace_mmap struct. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180913125450.21342-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Remove perf_tool from event_op3Jiri Olsa2018-09-191-3/+2
| | | | | | | | | | | | | | | Now that we keep a perf_tool pointer inside perf_session, there's no need to have a perf_tool argument in the event_op3 callback. Remove it. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180913125450.21342-3-jolsa@kernel.org [ Fix the builtin-inject.c build for !HAVE_AUXTRACE_SUPPORT ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Remove perf_tool from event_op2Jiri Olsa2018-09-191-6/+4
| | | | | | | | | | | | | | Now that we keep a perf_tool pointer inside perf_session, there's no need to have a perf_tool argument in the event_op2 callback. Remove it. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180913125450.21342-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Support for perf report -D for s390Thomas Richter2018-08-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add initial support for s390 auxiliary traces using the CPU-Measurement Sampling Facility. Support and ignore PERF_REPORT_AUXTRACE_INFO records in the perf data file. Later patches will show the contents of the auxiliary traces. Setup the auxtrace queues and data structures for s390. A raw dump of the perf.data file now does not show an error when an auxtrace event is encountered. Output before: [root@s35lp76 perf]# ./perf report -D -i perf.data.auxtrace 0x128 [0x10]: failed to process type: 70 Error: failed to process sample 0x128 [0x10]: event: 70 . . ... raw event: size 16 bytes . 0000: 00 00 00 46 00 00 00 10 00 00 00 00 00 00 00 00 ...F............ 0x128 [0x10]: PERF_RECORD_AUXTRACE_INFO type: 0 [root@s35lp76 perf]# Output after: # ./perf report -D -i perf.data.auxtrace |fgrep PERF_RECORD_AUXTRACE 0 0 0x128 [0x10]: PERF_RECORD_AUXTRACE_INFO type: 5 0 0 0x25a66 [0x30]: PERF_RECORD_AUXTRACE size: 0x40000 offset: 0 ref: 0 idx: 4 tid: -1 cpu: 4 .... Additional notes about the underlying hardware and software implementation, provided by Hendrik Brueckner (see Link: below). ============================================================================= The CPU-Measurement Facility (CPU-MF) provides a set of functions to obtain performance information on the mainframe. Basically, it was introduced with System z10 years ago for the z/Architecture, that means, 64-bit. For Linux, there are two facilities of interest, counter facility and sampling facility. The counter facility provides hardware counters for instructions, cycles, crypto-activities, and many more. The sampling facility is a hardware sampler that when started will write samples at a particular interval into a sampling buffer. At some point, for example, if a sample block is full, it generates an interrupt to collect samples (while the sampler continues to run). Few years ago, I started to provide the a perf PMU to use the counter and sampling facilities. Recently, the device driver was updated to also "export" the sampling buffer into the AUX area. Thomas now completed the related perf work to interpret and process these AUX data. If people are more interested in the sampling facility, they can have a look into: - The Load-Program-Parameter and the CPU-Measurement Facilities, SA23-2260-05 http://www-01.ibm.com/support/docview.wss?uid=isg26fcd1cc32246f4c8852574ce0044734a and to learn how-to use it for Linux on Z, have look at chapter 54, "Using the CPU-measurement facilities" in the: - Device Drivers, Features, and Commands, SC33-8411-34 http://public.dhe.ibm.com/software/dw/linux390/docu/l416dd34.pdf ============================================================================= Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Link: http://lkml.kernel.org/r/20180803100758.GA28475@linux.ibm.com Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180802074622.13641-2-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf auxtrace: Add missing parameters from kernel-doc commentsAdrian Hunter2018-03-071-0/+2
| | | | | | | | | Add missing parameters from kernel-doc comments. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1520327598-1317-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* perf tools: Add ARM Statistical Profiling Extensions (SPE) supportKim Phillips2018-01-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'perf record' and 'perf report --dump-raw-trace' supported in this release. Example usage: # perf record -e arm_spe/ts_enable=1,pa_enable=1/ dd if=/dev/zero of=/dev/null count=10000 # perf report --dump-raw-trace Note that the perf.data file is portable, so the report can be run on another architecture host if necessary. Output will contain raw SPE data and its textual representation, such as: 0x5c8 [0x30]: PERF_RECORD_AUXTRACE size: 0x200000 offset: 0 ref: 0x1891ad0e idx: 1 tid: 2227 cpu: 1 . . ... ARM SPE data: size 2097152 bytes . 00000000: 49 00 LD . 00000002: b2 c0 3b 29 0f 00 00 ff ff VA 0xffff00000f293bc0 . 0000000b: b3 c0 eb 24 fb 00 00 00 80 PA 0xfb24ebc0 ns=1 . 00000014: 9a 00 00 LAT 0 XLAT . 00000017: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS . 00000019: b0 00 c4 15 08 00 00 ff ff PC 0xff00000815c400 el3 ns=1 . 00000022: 98 00 00 LAT 0 TOT . 00000025: 71 36 6c 21 2c 09 00 00 00 TS 39395093558 . 0000002e: 49 00 LD . 00000030: b2 80 3c 29 0f 00 00 ff ff VA 0xffff00000f293c80 . 00000039: b3 80 ec 24 fb 00 00 00 80 PA 0xfb24ec80 ns=1 . 00000042: 9a 00 00 LAT 0 XLAT . 00000045: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS . 00000047: b0 f4 11 16 08 00 00 ff ff PC 0xff0000081611f4 el3 ns=1 . 00000050: 98 00 00 LAT 0 TOT . 00000053: 71 36 6c 21 2c 09 00 00 00 TS 39395093558 . 0000005c: 48 00 INSN-OTHER . 0000005e: 42 02 EV RETIRED . 00000060: b0 2c ef 7f 08 00 00 ff ff PC 0xff0000087fef2c el3 ns=1 . 00000069: 98 00 00 LAT 0 TOT . 0000006c: 71 d1 6f 21 2c 09 00 00 00 TS 39395094481 ... Other release notes: - applies to acme's perf/{core,urgent} branches, likely elsewhere - Report is self-contained within the tool. Record requires enabling the kernel SPE driver by setting CONFIG_ARM_SPE_PMU. - The intel-bts implementation was used as a starting point; its min/default/max buffer sizes and power of 2 pages granularity need to be revisited for ARM SPE - Recording across multiple SPE clusters/domains not supported - Snapshot support (record -S), and conversion to native perf events (e.g., via 'perf inject --itrace'), are also not supported - Technically both cs-etm and spe can be used simultaneously, however disabled for simplicity in this release Signed-off-by: Kim Phillips <kim.phillips@arm.com> Reviewed-by: Dongjiu Geng <gengdongjiu@huawei.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wang Nan <wangnan0@huawei.com> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20180114132850.0b127434b704a26bad13268f@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns ↵Mark Rutland2017-10-251-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | to READ_ONCE()/WRITE_ONCE() Please do not apply this to mainline directly, instead please re-run the coccinelle script shown below and apply its output. For several reasons, it is desirable to use {READ,WRITE}_ONCE() in preference to ACCESS_ONCE(), and new code is expected to use one of the former. So far, there's been no reason to change most existing uses of ACCESS_ONCE(), as these aren't harmful, and changing them results in churn. However, for some features, the read/write distinction is critical to correct operation. To distinguish these cases, separate read/write accessors must be used. This patch migrates (most) remaining ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following coccinelle script: ---- // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and // WRITE_ONCE() // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch virtual patch @ depends on patch @ expression E1, E2; @@ - ACCESS_ONCE(E1) = E2 + WRITE_ONCE(E1, E2) @ depends on patch @ expression E; @@ - ACCESS_ONCE(E) + READ_ONCE(E) ---- Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Cc: linux-arch@vger.kernel.org Cc: mpe@ellerman.id.au Cc: shuah@kernel.org Cc: snitzer@redhat.com Cc: thor.thayer@linux.intel.com Cc: tj@kernel.org Cc: viro@zeniv.linux.org.uk Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* perf auxtrace: Add CPU filter supportAdrian Hunter2017-06-301-0/+2
| | | | | | | | | | | Decoding auxtrace data can take a long time. To avoid decoding unnecessarily, filter auxtrace data that is collected per-cpu before it is decoded. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-38-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>