diff options
author | Andi Kleen <ak@linux.intel.com> | 2019-11-21 01:15:19 +0100 |
---|---|---|
committer | Arnaldo Carvalho de Melo <acme@redhat.com> | 2019-11-29 16:20:45 +0100 |
commit | 4804e0111662d7d89edf4e767a64c6f7e4778bb1 (patch) | |
tree | 91ec256ab8e416dd8a74ba5652cc1b583cdf0945 /tools/perf/util/evsel.c | |
parent | perf stat: Factor out open error handling (diff) | |
download | linux-4804e0111662d7d89edf4e767a64c6f7e4778bb1.tar.xz linux-4804e0111662d7d89edf4e767a64c6f7e4778bb1.zip |
perf stat: Use affinity for opening events
Restructure the event opening in perf stat to cycle through the events
by CPU after setting affinity to that CPU.
This eliminates IPI overhead in the perf API.
We have to loop through the CPU in the outter builtin-stat code instead
of leaving that to low level functions.
It has to change the weak group fallback strategy slightly. Since we
cannot easily undo the opens for other CPUs move the weak group retry to
a separate loop.
Before with a large test case with 94 CPUs:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
42.75 4.050910 67 60046 110 perf_event_open
After:
26.86 0.944396 16 58069 110 perf_event_open
(the number changes slightly because the weak group retries
work differently and the test case relies on weak groups)
Committer notes:
Added one of the hunks in a patch provided by Andi after I noticed that
the "event times" 'perf test' entry was segfaulting.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20191121001522.180827-10-andi@firstfloor.org
Link: http://lore.kernel.org/lkml/20191127232657.GL84886@tassilo.jf.intel.com # Fix
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Diffstat (limited to 'tools/perf/util/evsel.c')
-rw-r--r-- | tools/perf/util/evsel.c | 22 |
1 files changed, 17 insertions, 5 deletions
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index f4dea055b080..aa180d1df50f 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -1587,8 +1587,9 @@ static int perf_event_open(struct evsel *evsel, return fd; } -int evsel__open(struct evsel *evsel, struct perf_cpu_map *cpus, - struct perf_thread_map *threads) +static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, + struct perf_thread_map *threads, + int start_cpu, int end_cpu) { int cpu, thread, nthreads; unsigned long flags = PERF_FLAG_FD_CLOEXEC; @@ -1665,7 +1666,7 @@ retry_sample_id: display_attr(&evsel->core.attr); - for (cpu = 0; cpu < cpus->nr; cpu++) { + for (cpu = start_cpu; cpu < end_cpu; cpu++) { for (thread = 0; thread < nthreads; thread++) { int fd, group_fd; @@ -1843,6 +1844,12 @@ out_close: return err; } +int evsel__open(struct evsel *evsel, struct perf_cpu_map *cpus, + struct perf_thread_map *threads) +{ + return evsel__open_cpu(evsel, cpus, threads, 0, cpus ? cpus->nr : 1); +} + void evsel__close(struct evsel *evsel) { perf_evsel__close(&evsel->core); @@ -1850,9 +1857,14 @@ void evsel__close(struct evsel *evsel) } int perf_evsel__open_per_cpu(struct evsel *evsel, - struct perf_cpu_map *cpus) + struct perf_cpu_map *cpus, + int cpu) { - return evsel__open(evsel, cpus, NULL); + if (cpu == -1) + return evsel__open_cpu(evsel, cpus, NULL, 0, + cpus ? cpus->nr : 1); + + return evsel__open_cpu(evsel, cpus, NULL, cpu, cpu + 1); } int perf_evsel__open_per_thread(struct evsel *evsel, |