author     Linus Torvalds <torvalds@linux-foundation.org>   2022-03-27 22:42:32 +0200
committer  Linus Torvalds <torvalds@linux-foundation.org>   2022-03-27 22:42:32 +0200
commit     7b58b82b86c8b65a2b57a4c6cb96a460654f9e09
tree       a13e19f216389f16f1cb6641d54751f167482515   /tools/perf/util/bpf-loader.c
parent     Merge tag 'memblock-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/g...
parent     perf evsel: Improve AMD IBS (Instruction-Based Sampling) error handling messages
download   linux-7b58b82b86c8b65a2b57a4c6cb96a460654f9e09.tar.xz
           linux-7b58b82b86c8b65a2b57a4c6cb96a460654f9e09.zip
Merge tag 'perf-tools-for-v5.18-2022-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo:
"New features:
perf ftrace:
- Add -n/--use-nsec option to the 'latency' subcommand.
Default: usecs:
$ sudo perf ftrace latency -T dput -a sleep 1
  #   DURATION     |      COUNT | GRAPH                          |
       0 - 1    us |    2098375 | #############################  |
       1 - 2    us |         61 |                                |
       2 - 4    us |         33 |                                |
       4 - 8    us |         13 |                                |
       8 - 16   us |        124 |                                |
      16 - 32   us |        123 |                                |
      32 - 64   us |          1 |                                |
      64 - 128  us |          0 |                                |
     128 - 256  us |          1 |                                |
     256 - 512  us |          0 |                                |
Better granularity with nsec:
$ sudo perf ftrace latency -T dput -a -n sleep 1
  #   DURATION     |      COUNT | GRAPH                          |
       0 - 1    us |          0 |                                |
       1 - 2    ns |          0 |                                |
       2 - 4    ns |          0 |                                |
       4 - 8    ns |          0 |                                |
       8 - 16   ns |          0 |                                |
      16 - 32   ns |          0 |                                |
      32 - 64   ns |          0 |                                |
      64 - 128  ns |    1163434 | ##############                 |
     128 - 256  ns |     914102 | #############                  |
     256 - 512  ns |        884 |                                |
     512 - 1024 ns |        613 |                                |
       1 - 2    us |         31 |                                |
       2 - 4    us |         17 |                                |
       4 - 8    us |          7 |                                |
       8 - 16   us |        123 |                                |
      16 - 32   us |         83 |                                |
perf lock:
- Add -c/--combine-locks option to merge lock instances in the same
class into a single entry.
# perf lock report -c
                     Name   acquired   contended   avg wait(ns)   total wait(ns)   max wait(ns)   min wait(ns)
            rcu_read_lock     251225           0              0                0              0              0
       hrtimer_bases.lock      39450           0              0                0              0              0
      &sb->s_type->i_l...      10301           1            662              662            662            662
         ptlock_ptr(page)      10173           2            701             1402            760            642
      &(ei->i_block_re...       8732           0              0                0              0              0
             &xa->xa_lock       8088           0              0                0              0              0
              &base->lock       6705           0              0                0              0              0
              &p->pi_lock       5549           0              0                0              0              0
      &dentry->d_lockr...       5010           4           1274             5097           1844            789
                &ep->lock       3958           0              0                0              0              0
- Add -F/--field option to customize the list of fields to output:
$ perf lock report -F contended,wait_max -k avg_wait
                     Name   contended   max wait(ns)   avg wait(ns)
           slock-AF_INET6           1          23543          23543
        &lruvec->lru_lock           5          18317          11254
           slock-AF_INET6           1          10379          10379
               rcu_node_1           1           2104           2104
      &dentry->d_lockr...           1           1844           1844
      &dentry->d_lockr...           1           1672           1672
         &newf->file_lock          15           2279           1025
      &dentry->d_lockr...           1            792            792
- Add --synth=no option for record, as there is no need to symbolize;
lock names come from the tracepoints.
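  A rough usage sketch (hypothetical invocation; it assumes 'perf lock
  record' simply forwards the new flag to the underlying 'perf record'
  step, as the shortlog suggests):
    # perf lock record --synth=no -- sleep 1
    # perf lock report -c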
perf record:
- Threaded recording, opt-in, via the new --threads command line
option (see the example after this list).
- Improve AMD IBS (Instruction-Based Sampling) error handling
messages.
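  A hedged example of opting in to threaded recording (plain '--threads'
  picks the default layout; perf-record(1) documents the optional
  per-cpu/per-core specs, which are not shown here):
    # perf record --threads -a -- sleep 5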
perf script:
- Add 'brstackinsnlen' field (use it with -F) for branch stacks (see the example after this list).
- Output branch sample type in 'perf script'.
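  A possible session using the new field (the record step needs branch
  stacks, e.g. LBR on x86; './workload' and the field list are only an
  illustration):
    $ perf record -b -- ./workload
    $ perf script -F comm,ip,brstackinsn,brstackinsnlen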
perf report:
- Add "addr_from" and "addr_to" sort dimensions.
- Print branch stack entry type in 'perf report --dump-raw-trace'
- Fix symbolization for chrooted workloads.
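  For example (sketch only; the new sort keys are meaningful on perf.data
  files that contain branch stacks, and './workload' is a placeholder):
    $ perf record -b -- ./workload
    $ perf report --sort addr_from,addr_to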
Hardware tracing:
Intel PT:
- Add CFE (Control Flow Event) and EVD (Event Data) packets support.
- Add MODE.Exec IFLAG bit support.
Explanation about these features from the "Intel® 64 and IA-32
architectures software developer’s manual combined volumes: 1, 2A,
2B, 2C, 2D, 3A, 3B, 3C, 3D, and 4" PDF at:
https://cdrdv2.intel.com/v1/dl/getContent/671200
At page 3951:
"32.2.4
Event Trace is a capability that exposes details about the
asynchronous events, when they are generated, and when their
corresponding software event handler completes execution. These
include:
o Interrupts, including NMI and SMI, including the interrupt
vector when defined.
o Faults, exceptions including the fault vector.
- Page faults additionally include the page fault address,
when in context.
o Event handler returns, including IRET and RSM.
o VM exits and VM entries.¹
- VM exits include the values written to the “exit reason”
and “exit qualification” VMCS fields. INIT and SIPI events.
o TSX aborts, including the abort status returned for the RTM
instructions.
o Shutdown.
Additionally, it provides indication of the status of the
Interrupt Flag (IF), to indicate when interrupts are masked"
ARM CoreSight:
- Use the advertised caps/min_interval as the default sample_period on
ARM SPE.
- Update deduction of TRCCONFIGR register for branch broadcast on
ARM's CoreSight ETM.
Vendor Events (JSON):
Intel:
- Update events and metrics for: Alderlake, Broadwell, Broadwell DE,
BroadwellX, CascadelakeX, Elkhartlake, Bonnell, Goldmont,
GoldmontPlus, Westmere EP-DP, Haswell, HaswellX, Icelake, IcelakeX,
Ivybridge, Ivytown, Jaketown, Knights Landing, Nehalem EP,
Sandybridge, Silvermont, Skylake, Skylake Server, SkylakeX,
Tigerlake, TremontX, Westmere EP-SP, and Westmere EX.
ARM:
- Add support for HiSilicon CPA PMU aliasing.
perf stat:
- Fix forked applications enablement of counters.
- The 'slots' event should only be reordered relative to the position
specified on the command line when 'topdown' events are present;
fix it.
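  For reference, the reordering only matters for topdown groups such as
  the following (Icelake and later; the event names are the usual topdown
  pseudo events and are shown purely as an illustration):
    $ perf stat -e '{slots,topdown-retiring,topdown-bad-spec}' -- sleep 1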
Miscellaneous:
- Sync msr-index, cpufeatures header files with the kernel sources.
- Stop using some deprecated libbpf APIs in 'perf trace'.
- Fix some spelling mistakes.
- Refactor the maps pointers usage to pave the way for using refcount
debugging.
- Only offer the --tui option on perf top, report and annotate when
perf was built with libslang.
- Don't mention --to-ctf in 'perf data --help' when not linking with
the required library, libbabeltrace.
- Use ARRAY_SIZE() instead of ad hoc equivalents, spotted by
array_size.cocci (see the sketch after this list).
- Enhance the matching of sub-commands abbreviations:
'perf c2c rec' -> 'perf c2c record'
'perf c2c recport' -> error
- Set build-id using build-id header on new mmap records.
- Fix generation of 'perf --version' string.
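  A minimal sketch of the ARRAY_SIZE() change mentioned above (the array
  and variable names are made up; ARRAY_SIZE() comes from the kernel-style
  headers under tools/include):
    #include <linux/kernel.h>      /* ARRAY_SIZE() */
    static const char *names[] = { "foo", "bar", "baz" };
    /* open-coded form that array_size.cocci flags */
    size_t n_old = sizeof(names) / sizeof(names[0]);
    /* preferred form */
    size_t n_new = ARRAY_SIZE(names);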
perf test:
- Add test for the arm_spe event.
- Add test to check unwinding using frame-pointer (fp) mode on arm64.
- Make metric testing more robust in 'perf test'.
- Add error message for unsupported branch stack cases.
libperf:
- Add an API for allocating a new thread map array (usage sketch after this list).
- Fix typo in perf_evlist__open() failure error messages in libperf
tests.
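  A hedged usage sketch of the new allocation API (the constructor name
  perf_thread_map__new_array() is taken from the libperf changes in this
  series; treat the exact signature as an assumption):
    #include <sys/types.h>
    #include <perf/threadmap.h>
    static void threadmap_example(void)
    {
            pid_t pids[] = { 1, 2, 3 };              /* placeholder PIDs */
            struct perf_thread_map *threads;
            threads = perf_thread_map__new_array(3, pids);
            if (!threads)
                    return;
            /* ... use the map with an evsel/evlist ... */
            perf_thread_map__put(threads);           /* drop the reference */
    }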
perf c2c:
- Replace bitmap_weight() with bitmap_empty() where appropriate"
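A small illustration of the bitmap_weight() -> bitmap_empty() substitution
mentioned above (bitmap helpers as in tools/include/linux/bitmap.h; the
wrapper functions are made up for the example):
    #include <linux/bitmap.h>
    /* before: a full population count computed just to test for zero */
    static bool any_bit_set_old(const unsigned long *bits, unsigned int nbits)
    {
            return bitmap_weight(bits, nbits) != 0;
    }
    /* after: bitmap_empty() can bail out at the first set bit */
    static bool any_bit_set_new(const unsigned long *bits, unsigned int nbits)
    {
            return !bitmap_empty(bits, nbits);
    }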
* tag 'perf-tools-for-v5.18-2022-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (143 commits)
perf evsel: Improve AMD IBS (Instruction-Based Sampling) error handling messages
perf python: Add perf_env stubs that will be needed in evsel__open_strerror()
perf tools: Enhance the matching of sub-commands abbreviations
libperf tests: Fix typo in perf_evlist__open() failure error messages
tools arm64: Import cputype.h
perf lock: Add -F/--field option to control output
perf lock: Extend struct lock_key to have print function
perf lock: Add --synth=no option for record
tools headers cpufeatures: Sync with the kernel sources
tools headers cpufeatures: Sync with the kernel sources
perf stat: Fix forked applications enablement of counters
tools arch x86: Sync the msr-index.h copy with the kernel sources
perf evsel: Make evsel__env() always return a valid env
perf build-id: Fix spelling mistake "Cant" -> "Can't"
perf header: Fix spelling mistake "could't" -> "couldn't"
perf script: Add 'brstackinsnlen' for branch stacks
perf parse-events: Move slots only with topdown
perf ftrace latency: Update documentation
perf ftrace latency: Add -n/--use-nsec option
perf tools: Fix version kernel tag
...
Diffstat (limited to 'tools/perf/util/bpf-loader.c')
-rw-r--r--   tools/perf/util/bpf-loader.c   254
1 file changed, 216 insertions(+), 38 deletions(-)
diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index ec6d9e7b446d..b72cef1ae959 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -26,6 +26,8 @@
 #include "util.h"
 #include "llvm-utils.h"
 #include "c++/clang-c.h"
+#include "hashmap.h"
+#include "asm/bug.h"
 #include <internal/xyarray.h>
@@ -49,8 +51,54 @@ struct bpf_prog_priv {
 	int *type_mapping;
 };
 
+struct bpf_perf_object {
+	struct list_head list;
+	struct bpf_object *obj;
+};
+
+static LIST_HEAD(bpf_objects_list);
+static struct hashmap *bpf_program_hash;
+static struct hashmap *bpf_map_hash;
+
+static struct bpf_perf_object *
+bpf_perf_object__next(struct bpf_perf_object *prev)
+{
+	struct bpf_perf_object *next;
+
+	if (!prev)
+		next = list_first_entry(&bpf_objects_list,
+					struct bpf_perf_object,
+					list);
+	else
+		next = list_next_entry(prev, list);
+
+	/* Empty list is noticed here so don't need checking on entry. */
+	if (&next->list == &bpf_objects_list)
+		return NULL;
+
+	return next;
+}
+
+#define bpf_perf_object__for_each(perf_obj, tmp)			\
+	for ((perf_obj) = bpf_perf_object__next(NULL),			\
+	     (tmp) = bpf_perf_object__next(perf_obj);			\
+	     (perf_obj) != NULL;					\
+	     (perf_obj) = (tmp), (tmp) = bpf_perf_object__next(tmp))
+
 static bool libbpf_initialized;
 
+static int bpf_perf_object__add(struct bpf_object *obj)
+{
+	struct bpf_perf_object *perf_obj = zalloc(sizeof(*perf_obj));
+
+	if (perf_obj) {
+		INIT_LIST_HEAD(&perf_obj->list);
+		perf_obj->obj = obj;
+		list_add_tail(&perf_obj->list, &bpf_objects_list);
+	}
+	return perf_obj ? 0 : -ENOMEM;
+}
+
 struct bpf_object *
 bpf__prepare_load_buffer(void *obj_buf, size_t obj_buf_sz, const char *name)
 {
@@ -68,9 +116,21 @@ bpf__prepare_load_buffer(void *obj_buf, size_t obj_buf_sz, const char *name)
 		return ERR_PTR(-EINVAL);
 	}
 
+	if (bpf_perf_object__add(obj)) {
+		bpf_object__close(obj);
+		return ERR_PTR(-ENOMEM);
+	}
+
 	return obj;
 }
 
+static void bpf_perf_object__close(struct bpf_perf_object *perf_obj)
+{
+	list_del(&perf_obj->list);
+	bpf_object__close(perf_obj->obj);
+	free(perf_obj);
+}
+
 struct bpf_object *bpf__prepare_load(const char *filename, bool source)
 {
 	LIBBPF_OPTS(bpf_object_open_opts, opts, .object_name = filename);
@@ -102,29 +162,25 @@ struct bpf_object *bpf__prepare_load(const char *filename, bool source)
 			llvm__dump_obj(filename, obj_buf, obj_buf_sz);
 
 		free(obj_buf);
-	} else
+	} else {
 		obj = bpf_object__open(filename);
+	}
 
 	if (IS_ERR_OR_NULL(obj)) {
 		pr_debug("bpf: failed to load %s\n", filename);
 		return obj;
 	}
 
-	return obj;
-}
-
-void bpf__clear(void)
-{
-	struct bpf_object *obj, *tmp;
-
-	bpf_object__for_each_safe(obj, tmp) {
-		bpf__unprobe(obj);
+	if (bpf_perf_object__add(obj)) {
 		bpf_object__close(obj);
+		return ERR_PTR(-BPF_LOADER_ERRNO__COMPILE);
 	}
+
+	return obj;
 }
 
 static void
-clear_prog_priv(struct bpf_program *prog __maybe_unused,
+clear_prog_priv(const struct bpf_program *prog __maybe_unused,
 		void *_priv)
 {
 	struct bpf_prog_priv *priv = _priv;
@@ -137,6 +193,83 @@ clear_prog_priv(struct bpf_program *prog __maybe_unused,
 	free(priv);
 }
 
+static void bpf_program_hash_free(void)
+{
+	struct hashmap_entry *cur;
+	size_t bkt;
+
+	if (IS_ERR_OR_NULL(bpf_program_hash))
+		return;
+
+	hashmap__for_each_entry(bpf_program_hash, cur, bkt)
+		clear_prog_priv(cur->key, cur->value);
+
+	hashmap__free(bpf_program_hash);
+	bpf_program_hash = NULL;
+}
+
+static void bpf_map_hash_free(void);
+
+void bpf__clear(void)
+{
+	struct bpf_perf_object *perf_obj, *tmp;
+
+	bpf_perf_object__for_each(perf_obj, tmp) {
+		bpf__unprobe(perf_obj->obj);
+		bpf_perf_object__close(perf_obj);
+	}
+
+	bpf_program_hash_free();
+	bpf_map_hash_free();
+}
+
+static size_t ptr_hash(const void *__key, void *ctx __maybe_unused)
+{
+	return (size_t) __key;
+}
+
+static bool ptr_equal(const void *key1, const void *key2,
+		      void *ctx __maybe_unused)
+{
+	return key1 == key2;
+}
+
+static void *program_priv(const struct bpf_program *prog)
+{
+	void *priv;
+
+	if (IS_ERR_OR_NULL(bpf_program_hash))
+		return NULL;
+	if (!hashmap__find(bpf_program_hash, prog, &priv))
+		return NULL;
+	return priv;
+}
+
+static int program_set_priv(struct bpf_program *prog, void *priv)
+{
+	void *old_priv;
+
+	/*
+	 * Should not happen, we warn about it in the
+	 * caller function - config_bpf_program
+	 */
+	if (IS_ERR(bpf_program_hash))
+		return PTR_ERR(bpf_program_hash);
+
+	if (!bpf_program_hash) {
+		bpf_program_hash = hashmap__new(ptr_hash, ptr_equal, NULL);
+		if (IS_ERR(bpf_program_hash))
+			return PTR_ERR(bpf_program_hash);
+	}
+
+	old_priv = program_priv(prog);
+	if (old_priv) {
+		clear_prog_priv(prog, old_priv);
+		return hashmap__set(bpf_program_hash, prog, priv, NULL, NULL);
+	}
+	return hashmap__add(bpf_program_hash, prog, priv);
+}
+
 static int
 prog_config__exec(const char *value, struct perf_probe_event *pev)
 {
@@ -378,7 +511,7 @@ config_bpf_program(struct bpf_program *prog)
 	pr_debug("bpf: config '%s' is ok\n", config_str);
 
 set_priv:
-	err = bpf_program__set_priv(prog, priv, clear_prog_priv);
+	err = program_set_priv(prog, priv);
 	if (err) {
 		pr_debug("Failed to set priv for program '%s'\n", config_str);
 		goto errout;
@@ -419,7 +552,7 @@ preproc_gen_prologue(struct bpf_program *prog, int n,
 		     struct bpf_insn *orig_insns, int orig_insns_cnt,
 		     struct bpf_prog_prep_result *res)
 {
-	struct bpf_prog_priv *priv = bpf_program__priv(prog);
+	struct bpf_prog_priv *priv = program_priv(prog);
 	struct probe_trace_event *tev;
 	struct perf_probe_event *pev;
 	struct bpf_insn *buf;
@@ -570,7 +703,7 @@ static int map_prologue(struct perf_probe_event *pev, int *mapping,
 
 static int hook_load_preprocessor(struct bpf_program *prog)
 {
-	struct bpf_prog_priv *priv = bpf_program__priv(prog);
+	struct bpf_prog_priv *priv = program_priv(prog);
 	struct perf_probe_event *pev;
 	bool need_prologue = false;
 	int err, i;
@@ -646,7 +779,7 @@ int bpf__probe(struct bpf_object *obj)
 		if (err)
 			goto out;
 
-		priv = bpf_program__priv(prog);
+		priv = program_priv(prog);
 		if (IS_ERR_OR_NULL(priv)) {
 			if (!priv)
 				err = -BPF_LOADER_ERRNO__INTERNAL;
@@ -698,7 +831,7 @@ int bpf__unprobe(struct bpf_object *obj)
 	struct bpf_program *prog;
 
 	bpf_object__for_each_program(prog, obj) {
-		struct bpf_prog_priv *priv = bpf_program__priv(prog);
+		struct bpf_prog_priv *priv = program_priv(prog);
 		int i;
 
 		if (IS_ERR_OR_NULL(priv) || priv->is_tp)
@@ -754,7 +887,7 @@ int bpf__foreach_event(struct bpf_object *obj,
 	int err;
 
 	bpf_object__for_each_program(prog, obj) {
-		struct bpf_prog_priv *priv = bpf_program__priv(prog);
+		struct bpf_prog_priv *priv = program_priv(prog);
 		struct probe_trace_event *tev;
 		struct perf_probe_event *pev;
 		int i, fd;
@@ -850,7 +983,7 @@ bpf_map_priv__purge(struct bpf_map_priv *priv)
 }
 
 static void
-bpf_map_priv__clear(struct bpf_map *map __maybe_unused,
+bpf_map_priv__clear(const struct bpf_map *map __maybe_unused,
 		    void *_priv)
 {
 	struct bpf_map_priv *priv = _priv;
@@ -859,6 +992,53 @@ bpf_map_priv__clear(struct bpf_map *map __maybe_unused,
 	free(priv);
 }
 
+static void *map_priv(const struct bpf_map *map)
+{
+	void *priv;
+
+	if (IS_ERR_OR_NULL(bpf_map_hash))
+		return NULL;
+	if (!hashmap__find(bpf_map_hash, map, &priv))
+		return NULL;
+	return priv;
+}
+
+static void bpf_map_hash_free(void)
+{
+	struct hashmap_entry *cur;
+	size_t bkt;
+
+	if (IS_ERR_OR_NULL(bpf_map_hash))
+		return;
+
+	hashmap__for_each_entry(bpf_map_hash, cur, bkt)
+		bpf_map_priv__clear(cur->key, cur->value);
+
+	hashmap__free(bpf_map_hash);
+	bpf_map_hash = NULL;
+}
+
+static int map_set_priv(struct bpf_map *map, void *priv)
+{
+	void *old_priv;
+
+	if (WARN_ON_ONCE(IS_ERR(bpf_map_hash)))
+		return PTR_ERR(bpf_program_hash);
+
+	if (!bpf_map_hash) {
+		bpf_map_hash = hashmap__new(ptr_hash, ptr_equal, NULL);
+		if (IS_ERR(bpf_map_hash))
+			return PTR_ERR(bpf_map_hash);
+	}
+
+	old_priv = map_priv(map);
+	if (old_priv) {
+		bpf_map_priv__clear(map, old_priv);
+		return hashmap__set(bpf_map_hash, map, priv, NULL, NULL);
+	}
+	return hashmap__add(bpf_map_hash, map, priv);
+}
+
 static int
 bpf_map_op_setkey(struct bpf_map_op *op, struct parse_events_term *term)
 {
@@ -958,7 +1138,7 @@ static int
 bpf_map__add_op(struct bpf_map *map, struct bpf_map_op *op)
 {
 	const char *map_name = bpf_map__name(map);
-	struct bpf_map_priv *priv = bpf_map__priv(map);
+	struct bpf_map_priv *priv = map_priv(map);
 
 	if (IS_ERR(priv)) {
 		pr_debug("Failed to get private from map %s\n", map_name);
@@ -973,7 +1153,7 @@ bpf_map__add_op(struct bpf_map *map, struct bpf_map_op *op)
 		}
 		INIT_LIST_HEAD(&priv->ops_list);
 
-		if (bpf_map__set_priv(map, priv, bpf_map_priv__clear)) {
+		if (map_set_priv(map, priv)) {
 			free(priv);
 			return -BPF_LOADER_ERRNO__INTERNAL;
 		}
@@ -1305,7 +1485,7 @@ bpf_map_config_foreach_key(struct bpf_map *map,
 	int err, map_fd, type;
 	struct bpf_map_op *op;
 	const char *name = bpf_map__name(map);
-	struct bpf_map_priv *priv = bpf_map__priv(map);
+	struct bpf_map_priv *priv = map_priv(map);
 
 	if (IS_ERR(priv)) {
 		pr_debug("ERROR: failed to get private from map %s\n", name);
@@ -1494,11 +1674,11 @@ apply_obj_config_object(struct bpf_object *obj)
 
 int bpf__apply_obj_config(void)
 {
-	struct bpf_object *obj, *tmp;
+	struct bpf_perf_object *perf_obj, *tmp;
 	int err;
 
-	bpf_object__for_each_safe(obj, tmp) {
-		err = apply_obj_config_object(obj);
+	bpf_perf_object__for_each(perf_obj, tmp) {
+		err = apply_obj_config_object(perf_obj->obj);
 		if (err)
 			return err;
 	}
@@ -1506,27 +1686,25 @@ int bpf__apply_obj_config(void)
 	return 0;
 }
 
-#define bpf__for_each_map(pos, obj, objtmp)	\
-	bpf_object__for_each_safe(obj, objtmp)	\
-		bpf_object__for_each_map(pos, obj)
+#define bpf__perf_for_each_map(map, pobj, tmp)	\
+	bpf_perf_object__for_each(pobj, tmp)	\
+		bpf_object__for_each_map(map, pobj->obj)
 
-#define bpf__for_each_map_named(pos, obj, objtmp, name)	\
-	bpf__for_each_map(pos, obj, objtmp)		\
-		if (bpf_map__name(pos) &&		\
-			(strcmp(name,			\
-				bpf_map__name(pos)) == 0))
+#define bpf__perf_for_each_map_named(map, pobj, pobjtmp, name)	\
+	bpf__perf_for_each_map(map, pobj, pobjtmp)		\
+		if (bpf_map__name(map) && (strcmp(name, bpf_map__name(map)) == 0))
 
 struct evsel *bpf__setup_output_event(struct evlist *evlist, const char *name)
 {
 	struct bpf_map_priv *tmpl_priv = NULL;
-	struct bpf_object *obj, *tmp;
+	struct bpf_perf_object *perf_obj, *tmp;
 	struct evsel *evsel = NULL;
 	struct bpf_map *map;
 	int err;
 	bool need_init = false;
 
-	bpf__for_each_map_named(map, obj, tmp, name) {
-		struct bpf_map_priv *priv = bpf_map__priv(map);
+	bpf__perf_for_each_map_named(map, perf_obj, tmp, name) {
+		struct bpf_map_priv *priv = map_priv(map);
 
 		if (IS_ERR(priv))
 			return ERR_PTR(-BPF_LOADER_ERRNO__INTERNAL);
@@ -1561,8 +1739,8 @@ struct evsel *bpf__setup_output_event(struct evlist *evlist, const char *name)
 		evsel = evlist__last(evlist);
 	}
 
-	bpf__for_each_map_named(map, obj, tmp, name) {
-		struct bpf_map_priv *priv = bpf_map__priv(map);
+	bpf__perf_for_each_map_named(map, perf_obj, tmp, name) {
+		struct bpf_map_priv *priv = map_priv(map);
 
 		if (IS_ERR(priv))
 			return ERR_PTR(-BPF_LOADER_ERRNO__INTERNAL);
@@ -1574,7 +1752,7 @@ struct evsel *bpf__setup_output_event(struct evlist *evlist, const char *name)
 		if (!priv)
 			return ERR_PTR(-ENOMEM);
 
-		err = bpf_map__set_priv(map, priv, bpf_map_priv__clear);
+		err = map_set_priv(map, priv);
 		if (err) {
 			bpf_map_priv__clear(map, priv);
 			return ERR_PTR(err);