summaryrefslogtreecommitdiffstats
path: root/tools/perf/pmu-events/arch/arm64
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2023-09-10 05:06:17 +0200
committerLinus Torvalds <torvalds@linux-foundation.org>2023-09-10 05:06:17 +0200
commit535a265d7f0dd50d8c3a4f8b4f3a452d56bd160f (patch)
treea42e088342dac365cfa13f30d987118f2a5d259e /tools/perf/pmu-events/arch/arm64
parentMerge tag '6.6-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cif... (diff)
parentperf parse-events: Fix driver config term (diff)
downloadlinux-535a265d7f0dd50d8c3a4f8b4f3a452d56bd160f.tar.xz
linux-535a265d7f0dd50d8c3a4f8b4f3a452d56bd160f.zip
Merge tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo: "perf tools maintainership: - Add git information for perf-tools and perf-tools-next trees and branches to the MAINTAINERS file. That is where development now takes place and myself and Namhyung Kim have write access, more people to come as we emulate other maintainer groups. perf record: - Record kernel data maps when 'perf record --data' is used, so that global variables can be resolved and used in tools that do data profiling. perf trace: - Remove the old, experimental support for BPF events in which a .c file was passed as an event: "perf trace -e hello.c" to then get compiled and loaded. The only known usage for that, that shipped with the kernel as an example for such events, augmented the raw_syscalls tracepoints and was converted to a libbpf skeleton, reusing all the user space components and the BPF code connected to the syscalls. In the end just the way to glue the BPF part and the user space type beautifiers changed, now being performed by libbpf skeletons. The next step is to use BTF to do pretty printing of all syscall types, as discussed with Alan Maguire and others. Now, on a perf built with BUILD_BPF_SKEL=1 we get most if not all path/filenames/strings, some of the networking data structures, perf_event_attr, etc, i.e. systemwide tracing of nanosleep calls and perf_event_open syscalls while 'perf stat' runs 'sleep' for 5 seconds: # perf trace -a -e *nanosleep,perf* perf stat -e cycles,instructions sleep 5 0.000 ( 9.034 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3 9.039 ( 0.006 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf-exec), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4 ? ( ): gpm/991 ... [continued]: clock_nanosleep()) = 0 10.133 ( ): sleep/327642 clock_nanosleep(rqtp: { .tv_sec: 5, .tv_nsec: 0 }, rmtp: 0x7ffd36f83ed0) ... ? ( ): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0 30.276 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ... 223.215 (1000.430 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0 30.276 (2000.394 ms): gpm/991 ... [continued]: clock_nanosleep()) = 0 1230.814 ( ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ... 1230.814 (1000.404 ms): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0 2030.886 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ... 2237.709 (1000.153 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0 ? ( ): crond/1172 ... [continued]: clock_nanosleep()) = 0 3242.699 ( ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ... 2030.886 (2000.385 ms): gpm/991 ... [continued]: clock_nanosleep()) = 0 3728.078 ( ): crond/1172 clock_nanosleep(rqtp: { .tv_sec: 60, .tv_nsec: 0 }, rmtp: 0x7ffe0971dcf0) ... 3242.699 (1000.158 ms): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0 4031.409 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ... 10.133 (5000.375 ms): sleep/327642 ... [continued]: clock_nanosleep()) = 0 Performance counter stats for 'sleep 5': 2,617,347 cycles 1,855,997 instructions # 0.71 insn per cycle 5.002282128 seconds time elapsed 0.000855000 seconds user 0.000852000 seconds sys perf annotate: - Building with binutils' libopcode now is opt-in (BUILD_NONDISTRO=1) for licensing reasons, and we missed a build test on tools/perf/tests makefile. Since we now default to NDEBUG=1, we ended up segfaulting when building with BUILD_NONDISTRO=1 because a needed initialization routine was being "error checked" via an assert. Fix it by explicitly checking the result and aborting instead if it fails. We better back propagate the error, but at least 'perf annotate' on samples collected for a BPF program is back working when perf is built with BUILD_NONDISTRO=1. perf report/top: - Add back TUI hierarchy mode header, that is seen when using 'perf report/top --hierarchy'. - Fix the number of entries for 'e' key in the TUI that was preventing navigation of lines when expanding an entry. perf report/script: - Support cross platform register handling, allowing a perf.data file collected on one architecture to have registers sampled correctly displayed when analysis tools such as 'perf report' and 'perf script' are used on a different architecture. - Fix handling of event attributes in pipe mode, i.e. when one uses: perf record -o - | perf report -i - When no perf.data files are used. - Handle files generated via pipe mode with a version of perf and then read also via pipe mode with a different version of perf, where the event attr record may have changed, use the record size field to properly support this version mismatch. perf probe: - Accessing global variables from uprobes isn't supported, make the error message state that instead of stating that some minimal kernel version is needed to have that feature. This seems just a tool limitation, the kernel probably has all that is needed. perf tests: - Fix a reference count related leak in the dlfilter v0 API where the result of a thread__find_symbol_fb() is not matched with an addr_location__exit() to drop the reference counts of the resolved components (machine, thread, map, symbol, etc). Add a dlfilter test to make sure that doesn't regresses. - Lots of fixes for the 'perf test' written in shell script related to problems found with the shellcheck utility. - Fixes for 'perf test' shell scripts testing features enabled when perf is built with BUILD_BPF_SKEL=1, such as 'perf stat' bpf counters. - Add perf record sample filtering test, things like the following example, that gets implemented as a BPF filter attached to the event: # perf record -e task-clock -c 10000 --filter 'ip < 0xffffffff00000000' - Improve the way the task_analyzer test checks if libtraceevent is linked, using 'perf version --build-options' instead of the more expensinve 'perf record -e "sched:sched_switch"'. - Add support for riscv in the mmap-basic test. (This went as well via the RiscV tree, same contents). libperf: - Implement riscv mmap support (This went as well via the RiscV tree, same contents). perf script: - New tool that converts perf.data files to the firefox profiler format so that one can use the visualizer at https://profiler.firefox.com/. Done by Anup Sharma as part of this year's Google Summer of Code. One can generate the output and upload it to the web interface but Anup also automated everything: perf script gecko -F 99 -a sleep 60 - Support syscall name parsing on arm64. - Print "cgroup" field on the same line as "comm". perf bench: - Add new 'uprobe' benchmark to measure the overhead of uprobes with/without BPF programs attached to it. - breakpoints are not available on power9, skip that test. perf stat: - Add #num_cpus_online literal to be used in 'perf stat' metrics, and add this extra 'perf test' check that exemplifies its purpose: TEST_ASSERT_VAL("#num_cpus_online", expr__parse(&num_cpus_online, ctx, "#num_cpus_online") == 0); TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0); TEST_ASSERT_VAL("#num_cpus >= #num_cpus_online", num_cpus >= num_cpus_online); Miscellaneous: - Improve tool startup time by lazily reading PMU, JSON, sysfs data. - Improve error reporting in the parsing of events, passing YYLTYPE to error routines, so that the output can show were the parsing error was found. - Add 'perf test' entries to check the parsing of events improvements. - Fix various leak for things detected by -fsanitize=address, mostly things that would be freed at tool exit, including: - Free evsel->filter on the destructor. - Allow tools to register a thread->priv destructor and use it in 'perf trace'. - Free evsel->priv in 'perf trace'. - Free string returned by synthesize_perf_probe_point() when the caller fails to do all it needs. - Adjust various compiler options to not consider errors some warnings when building with broken headers found in things like python, flex, bison, as we otherwise build with -Werror. Some for gcc, some for clang, some for some specific version of those, some for some specific version of flex or bison, or some specific combination of these components, bah. - Allow customization of clang options for BPF target, this helps building on gentoo where there are other oddities where BPF targets gets passed some compiler options intended for the native build, so building with WERROR=0 helps while these oddities are fixed. - Dont pass ERR_PTR() values to perf_session__delete() in 'perf top' and 'perf lock', fixing some segfaults when handling some odd failures. - Add LTO build option. - Fix format of unordered lists in the perf docs (tools/perf/Documentation) - Overhaul the bison files, using constructs such as YYNOMEM. - Remove unused tokens from the bison .y files. - Add more comments to various structs. - A few LoongArch enablement patches. Vendor events (JSON): - Add JSON metrics for Yitian 710 DDR (aarch64). Things like: EventName, BriefDescription visible_window_limit_reached_rd, "At least one entry in read queue reaches the visible window limit.", visible_window_limit_reached_wr, "At least one entry in write queue reaches the visible window limit.", op_is_dqsosc_mpc , "A DQS Oscillator MPC command to DRAM.", op_is_dqsosc_mrr , "A DQS Oscillator MRR command to DRAM.", op_is_tcr_mrr , "A Temperature Compensated Refresh(TCR) MRR command to DRAM.", - Add AmpereOne metrics (aarch64). - Update N2 and V2 metrics (aarch64) and events using Arm telemetry repo. - Update scale units and descriptions of common topdown metrics on aarch64. Things like: - "MetricExpr": "stall_slot_frontend / (#slots * cpu_cycles)", - "BriefDescription": "Frontend bound L1 topdown metric", + "MetricExpr": "100 * (stall_slot_frontend / (#slots * cpu_cycles))", + "BriefDescription": "This metric is the percentage of total slots that were stalled due to resource constraints in the frontend of the processor.", - Update events for intel: meteorlake to 1.04, sapphirerapids to 1.15, Icelake+ metric constraints. - Update files for the power10 platform" * tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (217 commits) perf parse-events: Fix driver config term perf parse-events: Fixes relating to no_value terms perf parse-events: Fix propagation of term's no_value when cloning perf parse-events: Name the two term enums perf list: Don't print Unit for "default_core" perf vendor events intel: Fix modifier in tma_info_system_mem_parallel_reads for skylake perf dlfilter: Avoid leak in v0 API test use of resolve_address() perf metric: Add #num_cpus_online literal perf pmu: Remove str from perf_pmu_alias perf parse-events: Make common term list to strbuf helper perf parse-events: Minor help message improvements perf pmu: Avoid uninitialized use of alias->str perf jevents: Use "default_core" for events with no Unit perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test perf test shell stat_bpf_counters: Fix test on Intel perf test shell record_bpf_filter: Skip 6.2 kernel libperf: Get rid of attr.id field perf tools: Convert to perf_record_header_attr_id() libperf: Add perf_record_header_attr_id() perf tools: Handle old data in PERF_RECORD_ATTR ...
Diffstat (limited to 'tools/perf/pmu-events/arch/arm64')
-rw-r--r--tools/perf/pmu-events/arch/arm64/ampere/ampereone/cache.json3
-rw-r--r--tools/perf/pmu-events/arch/arm64/ampere/ampereone/core-imp-def.json120
-rw-r--r--tools/perf/pmu-events/arch/arm64/ampere/ampereone/metrics.json362
-rw-r--r--tools/perf/pmu-events/arch/arm64/ampere/ampereone/pipeline.json12
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/branch.json8
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/bus.json18
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/cache.json155
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/exception.json45
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/fp_operation.json22
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/general.json10
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/instruction.json143
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l1d_cache.json54
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l1i_cache.json14
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l2_cache.json50
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l3_cache.json22
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/ll_cache.json10
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/memory.json39
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/metrics.json365
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/pipeline.json23
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/retired.json30
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/spe.json12
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/spec_operation.json110
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/stall.json30
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/sve.json50
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/tlb.json66
-rw-r--r--tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/trace.json27
-rw-r--r--tools/perf/pmu-events/arch/arm64/freescale/yitian710/sys/ali_drw.json373
-rw-r--r--tools/perf/pmu-events/arch/arm64/freescale/yitian710/sys/metrics.json20
-rw-r--r--tools/perf/pmu-events/arch/arm64/sbsa.json24
29 files changed, 1528 insertions, 689 deletions
diff --git a/tools/perf/pmu-events/arch/arm64/ampere/ampereone/cache.json b/tools/perf/pmu-events/arch/arm64/ampere/ampereone/cache.json
index fc0633054211..7a2b7b200f14 100644
--- a/tools/perf/pmu-events/arch/arm64/ampere/ampereone/cache.json
+++ b/tools/perf/pmu-events/arch/arm64/ampere/ampereone/cache.json
@@ -93,9 +93,6 @@
"ArchStdEvent": "L1D_CACHE_LMISS_RD"
},
{
- "ArchStdEvent": "L1D_CACHE_LMISS"
- },
- {
"ArchStdEvent": "L1I_CACHE_LMISS"
},
{
diff --git a/tools/perf/pmu-events/arch/arm64/ampere/ampereone/core-imp-def.json b/tools/perf/pmu-events/arch/arm64/ampere/ampereone/core-imp-def.json
index 95c30243f2b2..88b23b85e33c 100644
--- a/tools/perf/pmu-events/arch/arm64/ampere/ampereone/core-imp-def.json
+++ b/tools/perf/pmu-events/arch/arm64/ampere/ampereone/core-imp-def.json
@@ -534,66 +534,6 @@
"BriefDescription": "L2D OTB allocate"
},
{
- "PublicDescription": "DTLB Translation cache hit on S1L2 walk cache entry",
- "EventCode": "0xD801",
- "EventName": "MMU_D_TRANS_CACHE_HIT_S1L2_WALK",
- "BriefDescription": "DTLB Translation cache hit on S1L2 walk cache entry"
- },
- {
- "PublicDescription": "DTLB Translation cache hit on S1L1 walk cache entry",
- "EventCode": "0xD802",
- "EventName": "MMU_D_TRANS_CACHE_HIT_S1L1_WALK",
- "BriefDescription": "DTLB Translation cache hit on S1L1 walk cache entry"
- },
- {
- "PublicDescription": "DTLB Translation cache hit on S1L0 walk cache entry",
- "EventCode": "0xD803",
- "EventName": "MMU_D_TRANS_CACHE_HIT_S1L0_WALK",
- "BriefDescription": "DTLB Translation cache hit on S1L0 walk cache entry"
- },
- {
- "PublicDescription": "DTLB Translation cache hit on S2L2 walk cache entry",
- "EventCode": "0xD804",
- "EventName": "MMU_D_TRANS_CACHE_HIT_S2L2_WALK",
- "BriefDescription": "DTLB Translation cache hit on S2L2 walk cache entry"
- },
- {
- "PublicDescription": "DTLB Translation cache hit on S2L1 walk cache entry",
- "EventCode": "0xD805",
- "EventName": "MMU_D_TRANS_CACHE_HIT_S2L1_WALK",
- "BriefDescription": "DTLB Translation cache hit on S2L1 walk cache entry"
- },
- {
- "PublicDescription": "DTLB Translation cache hit on S2L0 walk cache entry",
- "EventCode": "0xD806",
- "EventName": "MMU_D_TRANS_CACHE_HIT_S2L0_WALK",
- "BriefDescription": "DTLB Translation cache hit on S2L0 walk cache entry"
- },
- {
- "PublicDescription": "D-side S1 Page walk cache lookup",
- "EventCode": "0xD807",
- "EventName": "MMU_D_S1_WALK_CACHE_LOOKUP",
- "BriefDescription": "D-side S1 Page walk cache lookup"
- },
- {
- "PublicDescription": "D-side S1 Page walk cache refill",
- "EventCode": "0xD808",
- "EventName": "MMU_D_S1_WALK_CACHE_REFILL",
- "BriefDescription": "D-side S1 Page walk cache refill"
- },
- {
- "PublicDescription": "D-side S2 Page walk cache lookup",
- "EventCode": "0xD809",
- "EventName": "MMU_D_S2_WALK_CACHE_LOOKUP",
- "BriefDescription": "D-side S2 Page walk cache lookup"
- },
- {
- "PublicDescription": "D-side S2 Page walk cache refill",
- "EventCode": "0xD80A",
- "EventName": "MMU_D_S2_WALK_CACHE_REFILL",
- "BriefDescription": "D-side S2 Page walk cache refill"
- },
- {
"PublicDescription": "D-side Stage1 tablewalk fault",
"EventCode": "0xD80B",
"EventName": "MMU_D_S1_WALK_FAULT",
@@ -618,66 +558,6 @@
"BriefDescription": "L2I OTB allocate"
},
{
- "PublicDescription": "ITLB Translation cache hit on S1L2 walk cache entry",
- "EventCode": "0xD901",
- "EventName": "MMU_I_TRANS_CACHE_HIT_S1L2_WALK",
- "BriefDescription": "ITLB Translation cache hit on S1L2 walk cache entry"
- },
- {
- "PublicDescription": "ITLB Translation cache hit on S1L1 walk cache entry",
- "EventCode": "0xD902",
- "EventName": "MMU_I_TRANS_CACHE_HIT_S1L1_WALK",
- "BriefDescription": "ITLB Translation cache hit on S1L1 walk cache entry"
- },
- {
- "PublicDescription": "ITLB Translation cache hit on S1L0 walk cache entry",
- "EventCode": "0xD903",
- "EventName": "MMU_I_TRANS_CACHE_HIT_S1L0_WALK",
- "BriefDescription": "ITLB Translation cache hit on S1L0 walk cache entry"
- },
- {
- "PublicDescription": "ITLB Translation cache hit on S2L2 walk cache entry",
- "EventCode": "0xD904",
- "EventName": "MMU_I_TRANS_CACHE_HIT_S2L2_WALK",
- "BriefDescription": "ITLB Translation cache hit on S2L2 walk cache entry"
- },
- {
- "PublicDescription": "ITLB Translation cache hit on S2L1 walk cache entry",
- "EventCode": "0xD905",
- "EventName": "MMU_I_TRANS_CACHE_HIT_S2L1_WALK",
- "BriefDescription": "ITLB Translation cache hit on S2L1 walk cache entry"
- },
- {
- "PublicDescription": "ITLB Translation cache hit on S2L0 walk cache entry",
- "EventCode": "0xD906",
- "EventName": "MMU_I_TRANS_CACHE_HIT_S2L0_WALK",
- "BriefDescription": "ITLB Translation cache hit on S2L0 walk cache entry"
- },
- {
- "PublicDescription": "I-side S1 Page walk cache lookup",
- "EventCode": "0xD907",
- "EventName": "MMU_I_S1_WALK_CACHE_LOOKUP",
- "BriefDescription": "I-side S1 Page walk cache lookup"
- },
- {
- "PublicDescription": "I-side S1 Page walk cache refill",
- "EventCode": "0xD908",
- "EventName": "MMU_I_S1_WALK_CACHE_REFILL",
- "BriefDescription": "I-side S1 Page walk cache refill"
- },
- {
- "PublicDescription": "I-side S2 Page walk cache lookup",
- "EventCode": "0xD909",
- "EventName": "MMU_I_S2_WALK_CACHE_LOOKUP",
- "BriefDescription": "I-side S2 Page walk cache lookup"
- },
- {
- "PublicDescription": "I-side S2 Page walk cache refill",
- "EventCode": "0xD90A",
- "EventName": "MMU_I_S2_WALK_CACHE_REFILL",
- "BriefDescription": "I-side S2 Page walk cache refill"
- },
- {
"PublicDescription": "I-side Stage1 tablewalk fault",
"EventCode": "0xD90B",
"EventName": "MMU_I_S1_WALK_FAULT",
diff --git a/tools/perf/pmu-events/arch/arm64/ampere/ampereone/metrics.json b/tools/perf/pmu-events/arch/arm64/ampere/ampereone/metrics.json
new file mode 100644
index 000000000000..1e7e8901a445
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/ampere/ampereone/metrics.json
@@ -0,0 +1,362 @@
+[
+ {
+ "MetricExpr": "BR_MIS_PRED / BR_PRED",
+ "BriefDescription": "Branch predictor misprediction rate. May not count branches that are never resolved because they are in the misprediction shadow of an earlier branch",
+ "MetricGroup": "Branch Prediction",
+ "MetricName": "Misprediction"
+ },
+ {
+ "MetricExpr": "BR_MIS_PRED_RETIRED / BR_RETIRED",
+ "BriefDescription": "Branch predictor misprediction rate",
+ "MetricGroup": "Branch Prediction",
+ "MetricName": "Misprediction (retired)"
+ },
+ {
+ "MetricExpr": "BUS_ACCESS / ( BUS_CYCLES * 1)",
+ "BriefDescription": "Core-to-uncore bus utilization",
+ "MetricGroup": "Bus",
+ "MetricName": "Bus utilization"
+ },
+ {
+ "MetricExpr": "L1D_CACHE_REFILL / L1D_CACHE",
+ "BriefDescription": "L1D cache miss rate",
+ "MetricGroup": "Cache",
+ "MetricName": "L1D cache miss"
+ },
+ {
+ "MetricExpr": "L1D_CACHE_LMISS_RD / L1D_CACHE_RD",
+ "BriefDescription": "L1D cache read miss rate",
+ "MetricGroup": "Cache",
+ "MetricName": "L1D cache read miss"
+ },
+ {
+ "MetricExpr": "L1I_CACHE_REFILL / L1I_CACHE",
+ "BriefDescription": "L1I cache miss rate",
+ "MetricGroup": "Cache",
+ "MetricName": "L1I cache miss"
+ },
+ {
+ "MetricExpr": "L2D_CACHE_REFILL / L2D_CACHE",
+ "BriefDescription": "L2 cache miss rate",
+ "MetricGroup": "Cache",
+ "MetricName": "L2 cache miss"
+ },
+ {
+ "MetricExpr": "L1I_CACHE_LMISS / L1I_CACHE",
+ "BriefDescription": "L1I cache read miss rate",
+ "MetricGroup": "Cache",
+ "MetricName": "L1I cache read miss"
+ },
+ {
+ "MetricExpr": "L2D_CACHE_LMISS_RD / L2D_CACHE_RD",
+ "BriefDescription": "L2 cache read miss rate",
+ "MetricGroup": "Cache",
+ "MetricName": "L2 cache read miss"
+ },
+ {
+ "MetricExpr": "(L1D_CACHE_LMISS_RD * 1000) / INST_RETIRED",
+ "BriefDescription": "Misses per thousand instructions (data)",
+ "MetricGroup": "Cache",
+ "MetricName": "MPKI data"
+ },
+ {
+ "MetricExpr": "(L1I_CACHE_LMISS * 1000) / INST_RETIRED",
+ "BriefDescription": "Misses per thousand instructions (instruction)",
+ "MetricGroup": "Cache",
+ "MetricName": "MPKI instruction"
+ },
+ {
+ "MetricExpr": "ASE_SPEC / OP_SPEC",
+ "BriefDescription": "Proportion of advanced SIMD data processing operations (excluding DP_SPEC/LD_SPEC) operations",
+ "MetricGroup": "Instruction",
+ "MetricName": "ASE mix"
+ },
+ {
+ "MetricExpr": "CRYPTO_SPEC / OP_SPEC",
+ "BriefDescription": "Proportion of crypto data processing operations",
+ "MetricGroup": "Instruction",
+ "MetricName": "Crypto mix"
+ },
+ {
+ "MetricExpr": "VFP_SPEC / (duration_time *1000000000)",
+ "BriefDescription": "Giga-floating point operations per second",
+ "MetricGroup": "Instruction",
+ "MetricName": "GFLOPS_ISSUED"
+ },
+ {
+ "MetricExpr": "DP_SPEC / OP_SPEC",
+ "BriefDescription": "Proportion of integer data processing operations",
+ "MetricGroup": "Instruction",
+ "MetricName": "Integer mix"
+ },
+ {
+ "MetricExpr": "INST_RETIRED / CPU_CYCLES",
+ "BriefDescription": "Instructions per cycle",
+ "MetricGroup": "Instruction",
+ "MetricName": "IPC"
+ },
+ {
+ "MetricExpr": "LD_SPEC / OP_SPEC",
+ "BriefDescription": "Proportion of load operations",
+ "MetricGroup": "Instruction",
+ "MetricName": "Load mix"
+ },
+ {
+ "MetricExpr": "LDST_SPEC/ OP_SPEC",
+ "BriefDescription": "Proportion of load & store operations",
+ "MetricGroup": "Instruction",
+ "MetricName": "Load-store mix"
+ },
+ {
+ "MetricExpr": "INST_RETIRED / (duration_time * 1000000)",
+ "BriefDescription": "Millions of instructions per second",
+ "MetricGroup": "Instruction",
+ "MetricName": "MIPS_RETIRED"
+ },
+ {
+ "MetricExpr": "INST_SPEC / (duration_time * 1000000)",
+ "BriefDescription": "Millions of instructions per second",
+ "MetricGroup": "Instruction",
+ "MetricName": "MIPS_UTILIZATION"
+ },
+ {
+ "MetricExpr": "PC_WRITE_SPEC / OP_SPEC",
+ "BriefDescription": "Proportion of software change of PC operations",
+ "MetricGroup": "Instruction",
+ "MetricName": "PC write mix"
+ },
+ {
+ "MetricExpr": "ST_SPEC / OP_SPEC",
+ "BriefDescription": "Proportion of store operations",
+ "MetricGroup": "Instruction",
+ "MetricName": "Store mix"
+ },
+ {
+ "MetricExpr": "VFP_SPEC / OP_SPEC",
+ "BriefDescription": "Proportion of FP operations",
+ "MetricGroup": "Instruction",
+ "MetricName": "VFP mix"
+ },
+ {
+ "MetricExpr": "1 - (OP_RETIRED/ (CPU_CYCLES * 4))",
+ "BriefDescription": "Proportion of slots lost",
+ "MetricGroup": "Speculation / TDA",
+ "MetricName": "CPU lost"
+ },
+ {
+ "MetricExpr": "OP_RETIRED/ (CPU_CYCLES * 4)",
+ "BriefDescription": "Proportion of slots retiring",
+ "MetricGroup": "Speculation / TDA",
+ "MetricName": "CPU utilization"
+ },
+ {
+ "MetricExpr": "OP_RETIRED - OP_SPEC",
+ "BriefDescription": "Operations lost due to misspeculation",
+ "MetricGroup": "Speculation / TDA",
+ "MetricName": "Operations lost"
+ },
+ {
+ "MetricExpr": "1 - (OP_RETIRED / OP_SPEC)",
+ "BriefDescription": "Proportion of operations lost",
+ "MetricGroup": "Speculation / TDA",
+ "MetricName": "Operations lost (ratio)"
+ },
+ {
+ "MetricExpr": "OP_RETIRED / OP_SPEC",
+ "BriefDescription": "Proportion of operations retired",
+ "MetricGroup": "Speculation / TDA",
+ "MetricName": "Operations retired"
+ },
+ {
+ "MetricExpr": "STALL_BACKEND_CACHE / CPU_CYCLES",
+ "BriefDescription": "Proportion of cycles stalled and no operations issued to backend and cache miss",
+ "MetricGroup": "Stall",
+ "MetricName": "Stall backend cache cycles"
+ },
+ {
+ "MetricExpr": "STALL_BACKEND_RESOURCE / CPU_CYCLES",
+ "BriefDescription": "Proportion of cycles stalled and no operations issued to backend and resource full",
+ "MetricGroup": "Stall",
+ "MetricName": "Stall backend resource cycles"
+ },
+ {
+ "MetricExpr": "STALL_BACKEND_TLB / CPU_CYCLES",
+ "BriefDescription": "Proportion of cycles stalled and no operations issued to backend and TLB miss",
+ "MetricGroup": "Stall",
+ "MetricName": "Stall backend tlb cycles"
+ },
+ {
+ "MetricExpr": "STALL_FRONTEND_CACHE / CPU_CYCLES",
+ "BriefDescription": "Proportion of cycles stalled and no ops delivered from frontend and cache miss",
+ "MetricGroup": "Stall",
+ "MetricName": "Stall frontend cache cycles"
+ },
+ {
+ "MetricExpr": "STALL_FRONTEND_TLB / CPU_CYCLES",
+ "BriefDescription": "Proportion of cycles stalled and no ops delivered from frontend and TLB miss",
+ "MetricGroup": "Stall",
+ "MetricName": "Stall frontend tlb cycles"
+ },
+ {
+ "MetricExpr": "DTLB_WALK / L1D_TLB",
+ "BriefDescription": "D-side walk per d-side translation request",
+ "MetricGroup": "TLB",
+ "MetricName": "DTLB walks"
+ },
+ {
+ "MetricExpr": "ITLB_WALK / L1I_TLB",
+ "BriefDescription": "I-side walk per i-side translation request",
+ "MetricGroup": "TLB",
+ "MetricName": "ITLB walks"
+ },
+ {
+ "MetricExpr": "STALL_SLOT_BACKEND / (CPU_CYCLES * 4)",
+ "BriefDescription": "Fraction of slots backend bound",
+ "MetricGroup": "TopDownL1",
+ "MetricName": "backend"
+ },
+ {
+ "MetricExpr": "1 - (retiring + lost + backend)",
+ "BriefDescription": "Fraction of slots frontend bound",
+ "MetricGroup": "TopDownL1",
+ "MetricName": "frontend"
+ },
+ {
+ "MetricExpr": "((OP_SPEC - OP_RETIRED) / (CPU_CYCLES * 4))",
+ "BriefDescription": "Fraction of slots lost due to misspeculation",
+ "MetricGroup": "TopDownL1",
+ "MetricName": "lost"
+ },
+ {
+ "MetricExpr": "(OP_RETIRED / (CPU_CYCLES * 4))",
+ "BriefDescription": "Fraction of slots retiring, useful work",
+ "MetricGroup": "TopDownL1",
+ "MetricName": "retiring"
+ },
+ {
+ "MetricExpr": "backend - backend_memory",
+ "BriefDescription": "Fraction of slots the CPU was stalled due to backend non-memory subsystem issues",
+ "MetricGroup": "TopDownL2",
+ "MetricName": "backend_core"
+ },
+ {
+ "MetricExpr": "(STALL_BACKEND_TLB + STALL_BACKEND_CACHE + STALL_BACKEND_MEM) / CPU_CYCLES ",
+ "BriefDescription": "Fraction of slots the CPU was stalled due to backend memory subsystem issues (cache/tlb miss)",
+ "MetricGroup": "TopDownL2",
+ "MetricName": "backend_memory"
+ },
+ {
+ "MetricExpr": " (BR_MIS_PRED_RETIRED / GPC_FLUSH) * lost",
+ "BriefDescription": "Fraction of slots lost due to branch misprediciton",
+ "MetricGroup": "TopDownL2",
+ "MetricName": "branch_mispredict"
+ },
+ {
+ "MetricExpr": "frontend - frontend_latency",
+ "BriefDescription": "Fraction of slots the CPU did not dispatch at full bandwidth - able to dispatch partial slots only (1, 2, or 3 uops)",
+ "MetricGroup": "TopDownL2",
+ "MetricName": "frontend_bandwidth"
+ },
+ {
+ "MetricExpr": "(STALL_FRONTEND - ((STALL_SLOT_FRONTEND - (frontend * CPU_CYCLES * 4)) / 4)) / CPU_CYCLES",
+ "BriefDescription": "Fraction of slots the CPU was stalled due to frontend latency issues (cache/tlb miss); nothing to dispatch",
+ "MetricGroup": "TopDownL2",
+ "MetricName": "frontend_latency"
+ },
+ {
+ "MetricExpr": "lost - branch_mispredict",
+ "BriefDescription": "Fraction of slots lost due to other/non-branch misprediction misspeculation",
+ "MetricGroup": "TopDownL2",
+ "MetricName": "other_clears"
+ },
+ {
+ "MetricExpr": "(IXU_NUM_UOPS_ISSUED + FSU_ISSUED) / (CPU_CYCLES * 6)",
+ "BriefDescription": "Fraction of execute slots utilized",
+ "MetricGroup": "TopDownL2",
+ "MetricName": "pipe_utilization"
+ },
+ {
+ "MetricExpr": "STALL_BACKEND_MEM / CPU_CYCLES",
+ "BriefDescription": "Fraction of cycles the CPU was stalled due to data L2 cache miss",
+ "MetricGroup": "TopDownL3",
+ "MetricName": "d_cache_l2_miss"
+ },
+ {
+ "MetricExpr": "STALL_BACKEND_CACHE / CPU_CYCLES",
+ "BriefDescription": "Fraction of cycles the CPU was stalled due to data cache miss",
+ "MetricGroup": "TopDownL3",
+ "MetricName": "d_cache_miss"
+ },
+ {
+ "MetricExpr": "STALL_BACKEND_TLB / CPU_CYCLES",
+ "BriefDescription": "Fraction of cycles the CPU was stalled due to data TLB miss",
+ "MetricGroup": "TopDownL3",
+ "MetricName": "d_tlb_miss"
+ },
+ {
+ "MetricExpr": "FSU_ISSUED / (CPU_CYCLES * 2)",
+ "BriefDescription": "Fraction of FSU execute slots utilized",
+ "MetricGroup": "TopDownL3",
+ "MetricName": "fsu_pipe_utilization"
+ },
+ {
+ "MetricExpr": "STALL_FRONTEND_CACHE / CPU_CYCLES",
+ "BriefDescription": "Fraction of cycles the CPU was stalled due to instruction cache miss",
+ "MetricGroup": "TopDownL3",
+ "MetricName": "i_cache_miss"
+ },
+ {
+ "MetricExpr": " STALL_FRONTEND_TLB / CPU_CYCLES ",
+ "BriefDescription": "Fraction of cycles the CPU was stalled due to instruction TLB miss",
+ "MetricGroup": "TopDownL3",
+ "MetricName": "i_tlb_miss"
+ },
+ {
+ "MetricExpr": "IXU_NUM_UOPS_ISSUED / (CPU_CYCLES / 4)",
+ "BriefDescription": "Fraction of IXU execute slots utilized",
+ "MetricGroup": "TopDownL3",
+ "MetricName": "ixu_pipe_utilization"
+ },
+ {
+ "MetricExpr": "IDR_STALL_FLUSH / CPU_CYCLES",
+ "BriefDescription": "Fraction of cycles the CPU was stalled due to flush recovery",
+ "MetricGroup": "TopDownL3",
+ "MetricName": "recovery"
+ },
+ {
+ "MetricExpr": "STALL_BACKEND_RESOURCE / CPU_CYCLES",
+ "BriefDescription": "Fraction of cycles the CPU was stalled due to core resource shortage",
+ "MetricGroup": "TopDownL3",
+ "MetricName": "resource"
+ },
+ {
+ "MetricExpr": "IDR_STALL_FSU_SCHED / CPU_CYCLES ",
+ "BriefDescription": "Fraction of cycles the CPU was stalled and FSU was full",
+ "MetricGroup": "TopDownL4",
+ "MetricName": "stall_fsu_sched"
+ },
+ {
+ "MetricExpr": "IDR_STALL_IXU_SCHED / CPU_CYCLES ",
+ "BriefDescription": "Fraction of cycles the CPU was stalled and IXU was full",
+ "MetricGroup": "TopDownL4",
+ "MetricName": "stall_ixu_sched"
+ },
+ {
+ "MetricExpr": "IDR_STALL_LOB_ID / CPU_CYCLES ",
+ "BriefDescription": "Fraction of cycles the CPU was stalled and LOB was full",
+ "MetricGroup": "TopDownL4",
+ "MetricName": "stall_lob_id"
+ },
+ {
+ "MetricExpr": "IDR_STALL_ROB_ID / CPU_CYCLES",
+ "BriefDescription": "Fraction of cycles the CPU was stalled and ROB was full",
+ "MetricGroup": "TopDownL4",
+ "MetricName": "stall_rob_id"
+ },
+ {
+ "MetricExpr": "IDR_STALL_SOB_ID / CPU_CYCLES ",
+ "BriefDescription": "Fraction of cycles the CPU was stalled and SOB was full",
+ "MetricGroup": "TopDownL4",
+ "MetricName": "stall_sob_id"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/ampere/ampereone/pipeline.json b/tools/perf/pmu-events/arch/arm64/ampere/ampereone/pipeline.json
index f9fae15f7555..711028377f3e 100644
--- a/tools/perf/pmu-events/arch/arm64/ampere/ampereone/pipeline.json
+++ b/tools/perf/pmu-events/arch/arm64/ampere/ampereone/pipeline.json
@@ -1,18 +1,24 @@
[
{
- "ArchStdEvent": "STALL_FRONTEND"
+ "ArchStdEvent": "STALL_FRONTEND",
+ "Errata": "Errata AC03_CPU_29",
+ "BriefDescription": "Impacted by errata, use metrics instead -"
},
{
"ArchStdEvent": "STALL_BACKEND"
},
{
- "ArchStdEvent": "STALL"
+ "ArchStdEvent": "STALL",
+ "Errata": "Errata AC03_CPU_29",
+ "BriefDescription": "Impacted by errata, use metrics instead -"
},
{
"ArchStdEvent": "STALL_SLOT_BACKEND"
},
{
- "ArchStdEvent": "STALL_SLOT_FRONTEND"
+ "ArchStdEvent": "STALL_SLOT_FRONTEND",
+ "Errata": "Errata AC03_CPU_29",
+ "BriefDescription": "Impacted by errata, use metrics instead -"
},
{
"ArchStdEvent": "STALL_SLOT"
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/branch.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/branch.json
deleted file mode 100644
index 79f2016c53b0..000000000000
--- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/branch.json
+++ /dev/null
@@ -1,8 +0,0 @@
-[
- {
- "ArchStdEvent": "BR_MIS_PRED"
- },
- {
- "ArchStdEvent": "BR_PRED"
- }
-]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/bus.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/bus.json
index 579c1c993d17..2e11a8c4a484 100644
--- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/bus.json
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/bus.json
@@ -1,20 +1,18 @@
[
{
- "ArchStdEvent": "CPU_CYCLES"
+ "ArchStdEvent": "BUS_ACCESS",
+ "PublicDescription": "Counts memory transactions issued by the CPU to the external bus, including snoop requests and snoop responses. Each beat of data is counted individually."
},
{
- "ArchStdEvent": "BUS_ACCESS"
+ "ArchStdEvent": "BUS_CYCLES",
+ "PublicDescription": "Counts bus cycles in the CPU. Bus cycles represent a clock cycle in which a transaction could be sent or received on the interface from the CPU to the external bus. Since that interface is driven at the same clock speed as the CPU, this event is a duplicate of CPU_CYCLES."
},
{
- "ArchStdEvent": "BUS_CYCLES"
+ "ArchStdEvent": "BUS_ACCESS_RD",
+ "PublicDescription": "Counts memory read transactions seen on the external bus. Each beat of data is counted individually."
},
{
- "ArchStdEvent": "BUS_ACCESS_RD"
- },
- {
- "ArchStdEvent": "BUS_ACCESS_WR"
- },
- {
- "ArchStdEvent": "CNT_CYCLES"
+ "ArchStdEvent": "BUS_ACCESS_WR",
+ "PublicDescription": "Counts memory write transactions seen on the external bus. Each beat of data is counted individually."
}
]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/cache.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/cache.json
deleted file mode 100644
index 0141f749bff3..000000000000
--- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/cache.json
+++ /dev/null
@@ -1,155 +0,0 @@
-[
- {
- "ArchStdEvent": "L1I_CACHE_REFILL"
- },
- {
- "ArchStdEvent": "L1I_TLB_REFILL"
- },
- {
- "ArchStdEvent": "L1D_CACHE_REFILL"
- },
- {
- "ArchStdEvent": "L1D_CACHE"
- },
- {
- "ArchStdEvent": "L1D_TLB_REFILL"
- },
- {
- "ArchStdEvent": "L1I_CACHE"
- },
- {
- "ArchStdEvent": "L1D_CACHE_WB"
- },
- {
- "ArchStdEvent": "L2D_CACHE"
- },
- {
- "ArchStdEvent": "L2D_CACHE_REFILL"
- },
- {
- "ArchStdEvent": "L2D_CACHE_WB"
- },
- {
- "ArchStdEvent": "L2D_CACHE_ALLOCATE"
- },
- {
- "ArchStdEvent": "L1D_TLB"
- },
- {
- "ArchStdEvent": "L1I_TLB"
- },
- {
- "ArchStdEvent": "L3D_CACHE_ALLOCATE"
- },
- {
- "ArchStdEvent": "L3D_CACHE_REFILL"
- },
- {
- "ArchStdEvent": "L3D_CACHE"
- },
- {
- "ArchStdEvent": "L2D_TLB_REFILL"
- },
- {
- "ArchStdEvent": "L2D_TLB"
- },
- {
- "ArchStdEvent": "DTLB_WALK"
- },
- {
- "ArchStdEvent": "ITLB_WALK"
- },
- {
- "ArchStdEvent": "LL_CACHE_RD"
- },
- {
- "ArchStdEvent": "LL_CACHE_MISS_RD"
- },
- {
- "ArchStdEvent": "L1D_CACHE_LMISS_RD"
- },
- {
- "ArchStdEvent": "L1D_CACHE_RD"
- },
- {
- "ArchStdEvent": "L1D_CACHE_WR"
- },
- {
- "ArchStdEvent": "L1D_CACHE_REFILL_RD"
- },
- {
- "ArchStdEvent": "L1D_CACHE_REFILL_WR"
- },
- {
- "ArchStdEvent": "L1D_CACHE_REFILL_INNER"
- },
- {
- "ArchStdEvent": "L1D_CACHE_REFILL_OUTER"
- },
- {
- "ArchStdEvent": "L1D_CACHE_WB_VICTIM"
- },
- {
- "ArchStdEvent": "L1D_CACHE_WB_CLEAN"
- },
- {
- "ArchStdEvent": "L1D_CACHE_INVAL"
- },
- {
- "ArchStdEvent": "L1D_TLB_REFILL_RD"
- },
- {
- "ArchStdEvent": "L1D_TLB_REFILL_WR"
- },
- {
- "ArchStdEvent": "L1D_TLB_RD"
- },
- {
- "ArchStdEvent": "L1D_TLB_WR"
- },
- {
- "ArchStdEvent": "L2D_CACHE_RD"
- },
- {
- "ArchStdEvent": "L2D_CACHE_WR"
- },
- {
- "ArchStdEvent": "L2D_CACHE_REFILL_RD"
- },
- {
- "ArchStdEvent": "L2D_CACHE_REFILL_WR"
- },
- {
- "ArchStdEvent": "L2D_CACHE_WB_VICTIM"
- },
- {
- "ArchStdEvent": "L2D_CACHE_WB_CLEAN"
- },
- {
- "ArchStdEvent": "L2D_CACHE_INVAL"
- },
- {
- "ArchStdEvent": "L2D_TLB_REFILL_RD"
- },
- {
- "ArchStdEvent": "L2D_TLB_REFILL_WR"
- },
- {
- "ArchStdEvent": "L2D_TLB_RD"
- },
- {
- "ArchStdEvent": "L2D_TLB_WR"
- },
- {
- "ArchStdEvent": "L3D_CACHE_RD"
- },
- {
- "ArchStdEvent": "L1I_CACHE_LMISS"
- },
- {
- "ArchStdEvent": "L2D_CACHE_LMISS_RD"
- },
- {
- "ArchStdEvent": "L3D_CACHE_LMISS_RD"
- }
-]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/exception.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/exception.json
index 344a2d552ad5..4404b8e91690 100644
--- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/exception.json
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/exception.json
@@ -1,47 +1,62 @@
[
{
- "ArchStdEvent": "EXC_TAKEN"
+ "ArchStdEvent": "EXC_TAKEN",
+ "PublicDescription": "Counts any taken architecturally visible exceptions such as IRQ, FIQ, SError, and other synchronous exceptions. Exceptions are counted whether or not they are taken locally."
},
{
- "ArchStdEvent": "MEMORY_ERROR"
+ "ArchStdEvent": "EXC_RETURN",
+ "PublicDescription": "Counts any architecturally executed exception return instructions. Eg: AArch64: ERET"
},
{
- "ArchStdEvent": "EXC_UNDEF"
+ "ArchStdEvent": "EXC_UNDEF",
+ "PublicDescription": "Counts the number of synchronous exceptions which are taken locally that are due to attempting to execute an instruction that is UNDEFINED. Attempting to execute instruction bit patterns that have not been allocated. Attempting to execute instructions when they are disabled. Attempting to execute instructions at an inappropriate Exception level. Attempting to execute an instruction when the value of PSTATE.IL is 1."
},
{
- "ArchStdEvent": "EXC_SVC"
+ "ArchStdEvent": "EXC_SVC",
+ "PublicDescription": "Counts SVC exceptions taken locally."
},
{
- "ArchStdEvent": "EXC_PABORT"
+ "ArchStdEvent": "EXC_PABORT",
+ "PublicDescription": "Counts synchronous exceptions that are taken locally and caused by Instruction Aborts."
},
{
- "ArchStdEvent": "EXC_DABORT"
+ "ArchStdEvent": "EXC_DABORT",
+ "PublicDescription": "Counts exceptions that are taken locally and are caused by data aborts or SErrors. Conditions that could cause those exceptions are attempting to read or write memory where the MMU generates a fault, attempting to read or write memory with a misaligned address, interrupts from the nSEI inputs and internally generated SErrors."
},
{
- "ArchStdEvent": "EXC_IRQ"
+ "ArchStdEvent": "EXC_IRQ",
+ "PublicDescription": "Counts IRQ exceptions including the virtual IRQs that are taken locally."
},
{
- "ArchStdEvent": "EXC_FIQ"
+ "ArchStdEvent": "EXC_FIQ",
+ "PublicDescription": "Counts FIQ exceptions including the virtual FIQs that are taken locally."
},
{
- "ArchStdEvent": "EXC_SMC"
+ "ArchStdEvent": "EXC_SMC",
+ "PublicDescription": "Counts SMC exceptions take to EL3."
},
{
- "ArchStdEvent": "EXC_HVC"
+ "ArchStdEvent": "EXC_HVC",
+ "PublicDescription": "Counts HVC exceptions taken to EL2."
},
{
- "ArchStdEvent": "EXC_TRAP_PABORT"
+ "ArchStdEvent": "EXC_TRAP_PABORT",
+ "PublicDescription": "Counts exceptions which are traps not taken locally and are caused by Instruction Aborts. For example, attempting to execute an instruction with a misaligned PC."
},
{
- "ArchStdEvent": "EXC_TRAP_DABORT"
+ "ArchStdEvent": "EXC_TRAP_DABORT",
+ "PublicDescription": "Counts exceptions which are traps not taken locally and are caused by Data Aborts or SError interrupts. Conditions that could cause those exceptions are:\n\n1. Attempting to read or write memory where the MMU generates a fault,\n2. Attempting to read or write memory with a misaligned address,\n3. Interrupts from the SEI input.\n4. internally generated SErrors."
},
{
- "ArchStdEvent": "EXC_TRAP_OTHER"
+ "ArchStdEvent": "EXC_TRAP_OTHER",
+ "PublicDescription": "Counts the number of synchronous trap exceptions which are not taken locally and are not SVC, SMC, HVC, data aborts, Instruction Aborts, or interrupts."
},
{
- "ArchStdEvent": "EXC_TRAP_IRQ"
+ "ArchStdEvent": "EXC_TRAP_IRQ",
+ "PublicDescription": "Counts IRQ exceptions including the virtual IRQs that are not taken locally."
},
{
- "ArchStdEvent": "EXC_TRAP_FIQ"
+ "ArchStdEvent": "EXC_TRAP_FIQ",
+ "PublicDescription": "Counts FIQs which are not taken locally but taken from EL0, EL1,\n or EL2 to EL3 (which would be the normal behavior for FIQs when not executing\n in EL3)."
}
]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/fp_operation.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/fp_operation.json
new file mode 100644
index 000000000000..cec3435ac766
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/fp_operation.json
@@ -0,0 +1,22 @@
+[
+ {
+ "ArchStdEvent": "FP_HP_SPEC",
+ "PublicDescription": "Counts speculatively executed half precision floating point operations."
+ },
+ {
+ "ArchStdEvent": "FP_SP_SPEC",
+ "PublicDescription": "Counts speculatively executed single precision floating point operations."
+ },
+ {
+ "ArchStdEvent": "FP_DP_SPEC",
+ "PublicDescription": "Counts speculatively executed double precision floating point operations."
+ },
+ {
+ "ArchStdEvent": "FP_SCALE_OPS_SPEC",
+ "PublicDescription": "Counts speculatively executed scalable single precision floating point operations."
+ },
+ {
+ "ArchStdEvent": "FP_FIXED_OPS_SPEC",
+ "PublicDescription": "Counts speculatively executed non-scalable single precision floating point operations."
+ }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/general.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/general.json
new file mode 100644
index 000000000000..428810f855b8
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/general.json
@@ -0,0 +1,10 @@
+[
+ {
+ "ArchStdEvent": "CPU_CYCLES",
+ "PublicDescription": "Counts CPU clock cycles (not timer cycles). The clock measured by this event is defined as the physical clock driving the CPU logic."
+ },
+ {
+ "ArchStdEvent": "CNT_CYCLES",
+ "PublicDescription": "Counts constant frequency cycles"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/instruction.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/instruction.json
deleted file mode 100644
index e57cd55937c6..000000000000
--- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/instruction.json
+++ /dev/null
@@ -1,143 +0,0 @@
-[
- {
- "ArchStdEvent": "SW_INCR"
- },
- {
- "ArchStdEvent": "INST_RETIRED"
- },
- {
- "ArchStdEvent": "EXC_RETURN"
- },
- {
- "ArchStdEvent": "CID_WRITE_RETIRED"
- },
- {
- "ArchStdEvent": "INST_SPEC"
- },
- {
- "ArchStdEvent": "TTBR_WRITE_RETIRED"
- },
- {
- "ArchStdEvent": "BR_RETIRED"
- },
- {
- "ArchStdEvent": "BR_MIS_PRED_RETIRED"
- },
- {
- "ArchStdEvent": "OP_RETIRED"
- },
- {
- "ArchStdEvent": "OP_SPEC"
- },
- {
- "ArchStdEvent": "LDREX_SPEC"
- },
- {
- "ArchStdEvent": "STREX_PASS_SPEC"
- },
- {
- "ArchStdEvent": "STREX_FAIL_SPEC"
- },
- {
- "ArchStdEvent": "STREX_SPEC"
- },
- {
- "ArchStdEvent": "LD_SPEC"
- },
- {
- "ArchStdEvent": "ST_SPEC"
- },
- {
- "ArchStdEvent": "DP_SPEC"
- },
- {
- "ArchStdEvent": "ASE_SPEC"
- },
- {
- "ArchStdEvent": "VFP_SPEC"
- },
- {
- "ArchStdEvent": "PC_WRITE_SPEC"
- },
- {
- "ArchStdEvent": "CRYPTO_SPEC"
- },
- {
- "ArchStdEvent": "BR_IMMED_SPEC"
- },
- {
- "ArchStdEvent": "BR_RETURN_SPEC"
- },
- {
- "ArchStdEvent": "BR_INDIRECT_SPEC"
- },
- {
- "ArchStdEvent": "ISB_SPEC"
- },
- {
- "ArchStdEvent": "DSB_SPEC"
- },
- {
- "ArchStdEvent": "DMB_SPEC"
- },
- {
- "ArchStdEvent": "RC_LD_SPEC"
- },
- {
- "ArchStdEvent": "RC_ST_SPEC"
- },
- {
- "ArchStdEvent": "ASE_INST_SPEC"
- },
- {
- "ArchStdEvent": "SVE_INST_SPEC"
- },
- {
- "ArchStdEvent": "FP_HP_SPEC"
- },
- {
- "ArchStdEvent": "FP_SP_SPEC"
- },
- {
- "ArchStdEvent": "FP_DP_SPEC"
- },
- {
- "ArchStdEvent": "SVE_PRED_SPEC"
- },
- {
- "ArchStdEvent": "SVE_PRED_EMPTY_SPEC"
- },
- {
- "ArchStdEvent": "SVE_PRED_FULL_SPEC"
- },
- {
- "ArchStdEvent": "SVE_PRED_PARTIAL_SPEC"
- },
- {
- "ArchStdEvent": "SVE_PRED_NOT_FULL_SPEC"
- },
- {
- "ArchStdEvent": "SVE_LDFF_SPEC"
- },
- {
- "ArchStdEvent": "SVE_LDFF_FAULT_SPEC"
- },
- {
- "ArchStdEvent": "FP_SCALE_OPS_SPEC"
- },
- {
- "ArchStdEvent": "FP_FIXED_OPS_SPEC"
- },
- {
- "ArchStdEvent": "ASE_SVE_INT8_SPEC"
- },
- {
- "ArchStdEvent": "ASE_SVE_INT16_SPEC"
- },
- {
- "ArchStdEvent": "ASE_SVE_INT32_SPEC"
- },
- {
- "ArchStdEvent": "ASE_SVE_INT64_SPEC"
- }
-]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l1d_cache.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l1d_cache.json
new file mode 100644
index 000000000000..da7c129f2569
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l1d_cache.json
@@ -0,0 +1,54 @@
+[
+ {
+ "ArchStdEvent": "L1D_CACHE_REFILL",
+ "PublicDescription": "Counts level 1 data cache refills caused by speculatively executed load or store operations that missed in the level 1 data cache. This event only counts one event per cache line. This event does not count cache line allocations from preload instructions or from hardware cache prefetching."
+ },
+ {
+ "ArchStdEvent": "L1D_CACHE",
+ "PublicDescription": "Counts level 1 data cache accesses from any load/store operations. Atomic operations that resolve in the CPUs caches (near atomic operations) count as both a write access and read access. Each access to a cache line is counted including the multiple accesses caused by single instructions such as LDM or STM. Each access to other level 1 data or unified memory structures, for example refill buffers, write buffers, and write-back buffers, are also counted."
+ },
+ {
+ "ArchStdEvent": "L1D_CACHE_WB",
+ "PublicDescription": "Counts write-backs of dirty data from the L1 data cache to the L2 cache. This occurs when either a dirty cache line is evicted from L1 data cache and allocated in the L2 cache or dirty data is written to the L2 and possibly to the next level of cache. This event counts both victim cache line evictions and cache write-backs from snoops or cache maintenance operations. The following cache operations are not counted:\n\n1. Invalidations which do not result in data being transferred out of the L1 (such as evictions of clean data),\n2. Full line writes which write to L2 without writing L1, such as write streaming mode."
+ },
+ {
+ "ArchStdEvent": "L1D_CACHE_LMISS_RD",
+ "PublicDescription": "Counts cache line refills into the level 1 data cache from any memory read operations, that incurred additional latency."
+ },
+ {
+ "ArchStdEvent": "L1D_CACHE_RD",
+ "PublicDescription": "Counts level 1 data cache accesses from any load operation. Atomic load operations that resolve in the CPUs caches count as both a write access and read access."
+ },
+ {
+ "ArchStdEvent": "L1D_CACHE_WR",
+ "PublicDescription": "Counts level 1 data cache accesses generated by store operations. This event also counts accesses caused by a DC ZVA (data cache zero, specified by virtual address) instruction. Near atomic operations that resolve in the CPUs caches count as a write access and read access."
+ },
+ {
+ "ArchStdEvent": "L1D_CACHE_REFILL_RD",
+ "PublicDescription": "Counts level 1 data cache refills caused by speculatively executed load instructions where the memory read operation misses in the level 1 data cache. This event only counts one event per cache line."
+ },
+ {
+ "ArchStdEvent": "L1D_CACHE_REFILL_WR",
+ "PublicDescription": "Counts level 1 data cache refills caused by speculatively executed store instructions where the memory write operation misses in the level 1 data cache. This event only counts one event per cache line."
+ },
+ {
+ "ArchStdEvent": "L1D_CACHE_REFILL_INNER",
+ "PublicDescription": "Counts level 1 data cache refills where the cache line data came from caches inside the immediate cluster of the core."
+ },
+ {
+ "ArchStdEvent": "L1D_CACHE_REFILL_OUTER",
+ "PublicDescription": "Counts level 1 data cache refills for which the cache line data came from outside the immediate cluster of the core, like an SLC in the system interconnect or DRAM."
+ },
+ {
+ "ArchStdEvent": "L1D_CACHE_WB_VICTIM",
+ "PublicDescription": "Counts dirty cache line evictions from the level 1 data cache caused by a new cache line allocation. This event does not count evictions caused by cache maintenance operations."
+ },
+ {
+ "ArchStdEvent": "L1D_CACHE_WB_CLEAN",
+ "PublicDescription": "Counts write-backs from the level 1 data cache that are a result of a coherency operation made by another CPU. Event count includes cache maintenance operations."
+ },
+ {
+ "ArchStdEvent": "L1D_CACHE_INVAL",
+ "PublicDescription": "Counts each explicit invalidation of a cache line in the level 1 data cache caused by:\n\n- Cache Maintenance Operations (CMO) that operate by a virtual address.\n- Broadcast cache coherency operations from another CPU in the system.\n\nThis event does not count for the following conditions:\n\n1. A cache refill invalidates a cache line.\n2. A CMO which is executed on that CPU and invalidates a cache line specified by set/way.\n\nNote that CMOs that operate by set/way cannot be broadcast from one CPU to another."
+ }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l1i_cache.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l1i_cache.json
new file mode 100644
index 000000000000..633f1030359d
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l1i_cache.json
@@ -0,0 +1,14 @@
+[
+ {
+ "ArchStdEvent": "L1I_CACHE_REFILL",
+ "PublicDescription": "Counts cache line refills in the level 1 instruction cache caused by a missed instruction fetch. Instruction fetches may include accessing multiple instructions, but the single cache line allocation is counted once."
+ },
+ {
+ "ArchStdEvent": "L1I_CACHE",
+ "PublicDescription": "Counts instruction fetches which access the level 1 instruction cache. Instruction cache accesses caused by cache maintenance operations are not counted."
+ },
+ {
+ "ArchStdEvent": "L1I_CACHE_LMISS",
+ "PublicDescription": "Counts cache line refills into the level 1 instruction cache, that incurred additional latency."
+ }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l2_cache.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l2_cache.json
new file mode 100644
index 000000000000..0e31d0daf88b
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l2_cache.json
@@ -0,0 +1,50 @@
+[
+ {
+ "ArchStdEvent": "L2D_CACHE",
+ "PublicDescription": "Counts level 2 cache accesses. level 2 cache is a unified cache for data and instruction accesses. Accesses are for misses in the first level caches or translation resolutions due to accesses. This event also counts write back of dirty data from level 1 data cache to the L2 cache."
+ },
+ {
+ "ArchStdEvent": "L2D_CACHE_REFILL",
+ "PublicDescription": "Counts cache line refills into the level 2 cache. level 2 cache is a unified cache for data and instruction accesses. Accesses are for misses in the level 1 caches or translation resolutions due to accesses."
+ },
+ {
+ "ArchStdEvent": "L2D_CACHE_WB",
+ "PublicDescription": "Counts write-backs of data from the L2 cache to outside the CPU. This includes snoops to the L2 (from other CPUs) which return data even if the snoops cause an invalidation. L2 cache line invalidations which do not write data outside the CPU and snoops which return data from an L1 cache are not counted. Data would not be written outside the cache when invalidating a clean cache line."
+ },
+ {
+ "ArchStdEvent": "L2D_CACHE_ALLOCATE",
+ "PublicDescription": "TBD"
+ },
+ {
+ "ArchStdEvent": "L2D_CACHE_RD",
+ "PublicDescription": "Counts level 2 cache accesses due to memory read operations. level 2 cache is a unified cache for data and instruction accesses, accesses are for misses in the level 1 caches or translation resolutions due to accesses."
+ },
+ {
+ "ArchStdEvent": "L2D_CACHE_WR",
+ "PublicDescription": "Counts level 2 cache accesses due to memory write operations. level 2 cache is a unified cache for data and instruction accesses, accesses are for misses in the level 1 caches or translation resolutions due to accesses."
+ },
+ {
+ "ArchStdEvent": "L2D_CACHE_REFILL_RD",
+ "PublicDescription": "Counts refills for memory accesses due to memory read operation counted by L2D_CACHE_RD. level 2 cache is a unified cache for data and instruction accesses, accesses are for misses in the level 1 caches or translation resolutions due to accesses."
+ },
+ {
+ "ArchStdEvent": "L2D_CACHE_REFILL_WR",
+ "PublicDescription": "Counts refills for memory accesses due to memory write operation counted by L2D_CACHE_WR. level 2 cache is a unified cache for data and instruction accesses, accesses are for misses in the level 1 caches or translation resolutions due to accesses."
+ },
+ {
+ "ArchStdEvent": "L2D_CACHE_WB_VICTIM",
+ "PublicDescription": "Counts evictions from the level 2 cache because of a line being allocated into the L2 cache."
+ },
+ {
+ "ArchStdEvent": "L2D_CACHE_WB_CLEAN",
+ "PublicDescription": "Counts write-backs from the level 2 cache that are a result of either:\n\n1. Cache maintenance operations,\n\n2. Snoop responses or,\n\n3. Direct cache transfers to another CPU due to a forwarding snoop request."
+ },
+ {
+ "ArchStdEvent": "L2D_CACHE_INVAL",
+ "PublicDescription": "Counts each explicit invalidation of a cache line in the level 2 cache by cache maintenance operations that operate by a virtual address, or by external coherency operations. This event does not count if either:\n\n1. A cache refill invalidates a cache line or,\n2. A Cache Maintenance Operation (CMO), which invalidates a cache line specified by set/way, is executed on that CPU.\n\nCMOs that operate by set/way cannot be broadcast from one CPU to another."
+ },
+ {
+ "ArchStdEvent": "L2D_CACHE_LMISS_RD",
+ "PublicDescription": "Counts cache line refills into the level 2 unified cache from any memory read operations that incurred additional latency."
+ }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l3_cache.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l3_cache.json
new file mode 100644
index 000000000000..45bfba532df7
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/l3_cache.json
@@ -0,0 +1,22 @@
+[
+ {
+ "ArchStdEvent": "L3D_CACHE_ALLOCATE",
+ "PublicDescription": "Counts level 3 cache line allocates that do not fetch data from outside the level 3 data or unified cache. For example, allocates due to streaming stores."
+ },
+ {
+ "ArchStdEvent": "L3D_CACHE_REFILL",
+ "PublicDescription": "Counts level 3 accesses that receive data from outside the L3 cache."
+ },
+ {
+ "ArchStdEvent": "L3D_CACHE",
+ "PublicDescription": "Counts level 3 cache accesses. level 3 cache is a unified cache for data and instruction accesses. Accesses are for misses in the lower level caches or translation resolutions due to accesses."
+ },
+ {
+ "ArchStdEvent": "L3D_CACHE_RD",
+ "PublicDescription": "TBD"
+ },
+ {
+ "ArchStdEvent": "L3D_CACHE_LMISS_RD",
+ "PublicDescription": "Counts any cache line refill into the level 3 cache from memory read operations that incurred additional latency."
+ }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/ll_cache.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/ll_cache.json
new file mode 100644
index 000000000000..bb712d57d58a
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/ll_cache.json
@@ -0,0 +1,10 @@
+[
+ {
+ "ArchStdEvent": "LL_CACHE_RD",
+ "PublicDescription": "Counts read transactions that were returned from outside the core cluster. This event counts when the system register CPUECTLR.EXTLLC bit is set. This event counts read transactions returned from outside the core if those transactions are either hit in the system level cache or missed in the SLC and are returned from any other external sources."
+ },
+ {
+ "ArchStdEvent": "LL_CACHE_MISS_RD",
+ "PublicDescription": "Counts read transactions that were returned from outside the core cluster but missed in the system level cache. This event counts when the system register CPUECTLR.EXTLLC bit is set. This event counts read transactions returned from outside the core if those transactions are missed in the System level Cache. The data source of the transaction is indicated by a field in the CHI transaction returning to the CPU. This event does not count reads caused by cache maintenance operations."
+ }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/memory.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/memory.json
index 7b2b21ac150f..106a97f8b2e7 100644
--- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/memory.json
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/memory.json
@@ -1,41 +1,46 @@
[
{
- "ArchStdEvent": "MEM_ACCESS"
+ "ArchStdEvent": "MEM_ACCESS",
+ "PublicDescription": "Counts memory accesses issued by the CPU load store unit, where those accesses are issued due to load or store operations. This event counts memory accesses no matter whether the data is received from any level of cache hierarchy or external memory. If memory accesses are broken up into smaller transactions than what were specified in the load or store instructions, then the event counts those smaller memory transactions."
},
{
- "ArchStdEvent": "REMOTE_ACCESS"
+ "ArchStdEvent": "MEMORY_ERROR",
+ "PublicDescription": "Counts any detected correctable or uncorrectable physical memory errors (ECC or parity) in protected CPUs RAMs. On the core, this event counts errors in the caches (including data and tag rams). Any detected memory error (from either a speculative and abandoned access, or an architecturally executed access) is counted. Note that errors are only detected when the actual protected memory is accessed by an operation."
},
{
- "ArchStdEvent": "MEM_ACCESS_RD"
+ "ArchStdEvent": "REMOTE_ACCESS",
+ "PublicDescription": "Counts accesses to another chip, which is implemented as a different CMN mesh in the system. If the CHI bus response back to the core indicates that the data source is from another chip (mesh), then the counter is updated. If no data is returned, even if the system snoops another chip/mesh, then the counter is not updated."
},
{
- "ArchStdEvent": "MEM_ACCESS_WR"
+ "ArchStdEvent": "MEM_ACCESS_RD",
+ "PublicDescription": "Counts memory accesses issued by the CPU due to load operations. The event counts any memory load access, no matter whether the data is received from any level of cache hierarchy or external memory. The event also counts atomic load operations. If memory accesses are broken up by the load/store unit into smaller transactions that are issued by the bus interface, then the event counts those smaller transactions."
},
{
- "ArchStdEvent": "UNALIGNED_LD_SPEC"
+ "ArchStdEvent": "MEM_ACCESS_WR",
+ "PublicDescription": "Counts memory accesses issued by the CPU due to store operations. The event counts any memory store access, no matter whether the data is located in any level of cache or external memory. The event also counts atomic load and store operations. If memory accesses are broken up by the load/store unit into smaller transactions that are issued by the bus interface, then the event counts those smaller transactions."
},
{
- "ArchStdEvent": "UNALIGNED_ST_SPEC"
+ "ArchStdEvent": "LDST_ALIGN_LAT",
+ "PublicDescription": "Counts the number of memory read and write accesses in a cycle that incurred additional latency, due to the alignment of the address and the size of data being accessed, which results in store crossing a single cache line."
},
{
- "ArchStdEvent": "UNALIGNED_LDST_SPEC"
+ "ArchStdEvent": "LD_ALIGN_LAT",
+ "PublicDescription": "Counts the number of memory read accesses in a cycle that incurred additional latency, due to the alignment of the address and size of data being accessed, which results in load crossing a single cache line."
},
{
- "ArchStdEvent": "LDST_ALIGN_LAT"
+ "ArchStdEvent": "ST_ALIGN_LAT",
+ "PublicDescription": "Counts the number of memory write access in a cycle that incurred additional latency, due to the alignment of the address and size of data being accessed incurred additional latency."
},
{
- "ArchStdEvent": "LD_ALIGN_LAT"
+ "ArchStdEvent": "MEM_ACCESS_CHECKED",
+ "PublicDescription": "Counts the number of memory read and write accesses in a cycle that are tag checked by the Memory Tagging Extension (MTE)."
},
{
- "ArchStdEvent": "ST_ALIGN_LAT"
+ "ArchStdEvent": "MEM_ACCESS_CHECKED_RD",
+ "PublicDescription": "Counts the number of memory read accesses in a cycle that are tag checked by the Memory Tagging Extension (MTE)."
},
{
- "ArchStdEvent": "MEM_ACCESS_CHECKED"
- },
- {
- "ArchStdEvent": "MEM_ACCESS_CHECKED_RD"
- },
- {
- "ArchStdEvent": "MEM_ACCESS_CHECKED_WR"
+ "ArchStdEvent": "MEM_ACCESS_CHECKED_WR",
+ "PublicDescription": "Counts the number of memory write accesses in a cycle that is tag checked by the Memory Tagging Extension (MTE)."
}
]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/metrics.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/metrics.json
index 8ad15b726dca..5f449270b448 100644
--- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/metrics.json
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/metrics.json
@@ -1,272 +1,303 @@
[
{
- "ArchStdEvent": "FRONTEND_BOUND",
- "MetricExpr": "((stall_slot_frontend) if (#slots - 5) else (stall_slot_frontend - cpu_cycles)) / (#slots * cpu_cycles)"
+ "ArchStdEvent": "backend_bound",
+ "MetricExpr": "(100 * ((STALL_SLOT_BACKEND / (CPU_CYCLES * #slots)) - ((BR_MIS_PRED * 3) / CPU_CYCLES)))"
},
{
- "ArchStdEvent": "BAD_SPECULATION",
- "MetricExpr": "(1 - op_retired / op_spec) * (1 - (stall_slot if (#slots - 5) else (stall_slot - cpu_cycles)) / (#slots * cpu_cycles))"
+ "MetricName": "backend_stalled_cycles",
+ "MetricExpr": "((STALL_BACKEND / CPU_CYCLES) * 100)",
+ "BriefDescription": "This metric is the percentage of cycles that were stalled due to resource constraints in the backend unit of the processor.",
+ "MetricGroup": "Cycle_Accounting",
+ "ScaleUnit": "1percent of cycles"
},
{
- "ArchStdEvent": "RETIRING",
- "MetricExpr": "(op_retired / op_spec) * (1 - (stall_slot if (#slots - 5) else (stall_slot - cpu_cycles)) / (#slots * cpu_cycles))"
+ "ArchStdEvent": "bad_speculation",
+ "MetricExpr": "(100 * (((1 - (OP_RETIRED / OP_SPEC)) * (1 - (((STALL_SLOT) if (strcmp_cpuid_str(0x410fd493) | strcmp_cpuid_str(0x410fd490) ^ 1) else (STALL_SLOT - CPU_CYCLES)) / (CPU_CYCLES * #slots)))) + ((BR_MIS_PRED * 4) / CPU_CYCLES)))"
},
{
- "ArchStdEvent": "BACKEND_BOUND"
+ "MetricName": "branch_misprediction_ratio",
+ "MetricExpr": "(BR_MIS_PRED_RETIRED / BR_RETIRED)",
+ "BriefDescription": "This metric measures the ratio of branches mispredicted to the total number of branches architecturally executed. This gives an indication of the effectiveness of the branch prediction unit.",
+ "MetricGroup": "Miss_Ratio;Branch_Effectiveness",
+ "ScaleUnit": "1per branch"
},
{
- "MetricExpr": "L1D_TLB_REFILL / L1D_TLB",
- "BriefDescription": "The rate of L1D TLB refill to the overall L1D TLB lookups",
- "MetricGroup": "TLB",
- "MetricName": "l1d_tlb_miss_rate",
- "ScaleUnit": "100%"
+ "MetricName": "branch_mpki",
+ "MetricExpr": "((BR_MIS_PRED_RETIRED / INST_RETIRED) * 1000)",
+ "BriefDescription": "This metric measures the number of branch mispredictions per thousand instructions executed.",
+ "MetricGroup": "MPKI;Branch_Effectiveness",
+ "ScaleUnit": "1MPKI"
},
{
- "MetricExpr": "L1I_TLB_REFILL / L1I_TLB",
- "BriefDescription": "The rate of L1I TLB refill to the overall L1I TLB lookups",
- "MetricGroup": "TLB",
- "MetricName": "l1i_tlb_miss_rate",
- "ScaleUnit": "100%"
+ "MetricName": "branch_percentage",
+ "MetricExpr": "(((BR_IMMED_SPEC + BR_INDIRECT_SPEC) / INST_SPEC) * 100)",
+ "BriefDescription": "This metric measures branch operations as a percentage of operations speculatively executed.",
+ "MetricGroup": "Operation_Mix",
+ "ScaleUnit": "1percent of operations"
},
{
- "MetricExpr": "L2D_TLB_REFILL / L2D_TLB",
- "BriefDescription": "The rate of L2D TLB refill to the overall L2D TLB lookups",
- "MetricGroup": "TLB",
- "MetricName": "l2_tlb_miss_rate",
- "ScaleUnit": "100%"
+ "MetricName": "crypto_percentage",
+ "MetricExpr": "((CRYPTO_SPEC / INST_SPEC) * 100)",
+ "BriefDescription": "This metric measures crypto operations as a percentage of operations speculatively executed.",
+ "MetricGroup": "Operation_Mix",
+ "ScaleUnit": "1percent of operations"
},
{
- "MetricExpr": "DTLB_WALK / INST_RETIRED * 1000",
- "BriefDescription": "The rate of TLB Walks per kilo instructions for data accesses",
- "MetricGroup": "TLB",
"MetricName": "dtlb_mpki",
+ "MetricExpr": "((DTLB_WALK / INST_RETIRED) * 1000)",
+ "BriefDescription": "This metric measures the number of data TLB Walks per thousand instructions executed.",
+ "MetricGroup": "MPKI;DTLB_Effectiveness",
"ScaleUnit": "1MPKI"
},
{
- "MetricExpr": "DTLB_WALK / L1D_TLB",
- "BriefDescription": "The rate of DTLB Walks to the overall L1D TLB lookups",
- "MetricGroup": "TLB",
- "MetricName": "dtlb_walk_rate",
- "ScaleUnit": "100%"
+ "MetricName": "dtlb_walk_ratio",
+ "MetricExpr": "(DTLB_WALK / L1D_TLB)",
+ "BriefDescription": "This metric measures the ratio of data TLB Walks to the total number of data TLB accesses. This gives an indication of the effectiveness of the data TLB accesses.",
+ "MetricGroup": "Miss_Ratio;DTLB_Effectiveness",
+ "ScaleUnit": "1per TLB access"
},
{
- "MetricExpr": "ITLB_WALK / INST_RETIRED * 1000",
- "BriefDescription": "The rate of TLB Walks per kilo instructions for instruction accesses",
- "MetricGroup": "TLB",
- "MetricName": "itlb_mpki",
- "ScaleUnit": "1MPKI"
+ "ArchStdEvent": "frontend_bound",
+ "MetricExpr": "(100 * ((((STALL_SLOT_FRONTEND) if (strcmp_cpuid_str(0x410fd493) | strcmp_cpuid_str(0x410fd490) ^ 1) else (STALL_SLOT_FRONTEND - CPU_CYCLES)) / (CPU_CYCLES * #slots)) - (BR_MIS_PRED / CPU_CYCLES)))"
},
{
- "MetricExpr": "ITLB_WALK / L1I_TLB",
- "BriefDescription": "The rate of ITLB Walks to the overall L1I TLB lookups",
- "MetricGroup": "TLB",
- "MetricName": "itlb_walk_rate",
- "ScaleUnit": "100%"
+ "MetricName": "frontend_stalled_cycles",
+ "MetricExpr": "((STALL_FRONTEND / CPU_CYCLES) * 100)",
+ "BriefDescription": "This metric is the percentage of cycles that were stalled due to resource constraints in the frontend unit of the processor.",
+ "MetricGroup": "Cycle_Accounting",
+ "ScaleUnit": "1percent of cycles"
},
{
- "MetricExpr": "L1I_CACHE_REFILL / INST_RETIRED * 1000",
- "BriefDescription": "The rate of L1 I-Cache misses per kilo instructions",
- "MetricGroup": "Cache",
- "MetricName": "l1i_cache_mpki",
+ "MetricName": "integer_dp_percentage",
+ "MetricExpr": "((DP_SPEC / INST_SPEC) * 100)",
+ "BriefDescription": "This metric measures scalar integer operations as a percentage of operations speculatively executed.",
+ "MetricGroup": "Operation_Mix",
+ "ScaleUnit": "1percent of operations"
+ },
+ {
+ "MetricName": "ipc",
+ "MetricExpr": "(INST_RETIRED / CPU_CYCLES)",
+ "BriefDescription": "This metric measures the number of instructions retired per cycle.",
+ "MetricGroup": "General",
+ "ScaleUnit": "1per cycle"
+ },
+ {
+ "MetricName": "itlb_mpki",
+ "MetricExpr": "((ITLB_WALK / INST_RETIRED) * 1000)",
+ "BriefDescription": "This metric measures the number of instruction TLB Walks per thousand instructions executed.",
+ "MetricGroup": "MPKI;ITLB_Effectiveness",
"ScaleUnit": "1MPKI"
},
{
- "MetricExpr": "L1I_CACHE_REFILL / L1I_CACHE",
- "BriefDescription": "The rate of L1 I-Cache misses to the overall L1 I-Cache",
- "MetricGroup": "Cache",
- "MetricName": "l1i_cache_miss_rate",
- "ScaleUnit": "100%"
+ "MetricName": "itlb_walk_ratio",
+ "MetricExpr": "(ITLB_WALK / L1I_TLB)",
+ "BriefDescription": "This metric measures the ratio of instruction TLB Walks to the total number of instruction TLB accesses. This gives an indication of the effectiveness of the instruction TLB accesses.",
+ "MetricGroup": "Miss_Ratio;ITLB_Effectiveness",
+ "ScaleUnit": "1per TLB access"
+ },
+ {
+ "MetricName": "l1d_cache_miss_ratio",
+ "MetricExpr": "(L1D_CACHE_REFILL / L1D_CACHE)",
+ "BriefDescription": "This metric measures the ratio of level 1 data cache accesses missed to the total number of level 1 data cache accesses. This gives an indication of the effectiveness of the level 1 data cache.",
+ "MetricGroup": "Miss_Ratio;L1D_Cache_Effectiveness",
+ "ScaleUnit": "1per cache access"
},
{
- "MetricExpr": "L1D_CACHE_REFILL / INST_RETIRED * 1000",
- "BriefDescription": "The rate of L1 D-Cache misses per kilo instructions",
- "MetricGroup": "Cache",
"MetricName": "l1d_cache_mpki",
+ "MetricExpr": "((L1D_CACHE_REFILL / INST_RETIRED) * 1000)",
+ "BriefDescription": "This metric measures the number of level 1 data cache accesses missed per thousand instructions executed.",
+ "MetricGroup": "MPKI;L1D_Cache_Effectiveness",
"ScaleUnit": "1MPKI"
},
{
- "MetricExpr": "L1D_CACHE_REFILL / L1D_CACHE",
- "BriefDescription": "The rate of L1 D-Cache misses to the overall L1 D-Cache",
- "MetricGroup": "Cache",
- "MetricName": "l1d_cache_miss_rate",
- "ScaleUnit": "100%"
+ "MetricName": "l1d_tlb_miss_ratio",
+ "MetricExpr": "(L1D_TLB_REFILL / L1D_TLB)",
+ "BriefDescription": "This metric measures the ratio of level 1 data TLB accesses missed to the total number of level 1 data TLB accesses. This gives an indication of the effectiveness of the level 1 data TLB.",
+ "MetricGroup": "Miss_Ratio;DTLB_Effectiveness",
+ "ScaleUnit": "1per TLB access"
},
{
- "MetricExpr": "L2D_CACHE_REFILL / INST_RETIRED * 1000",
- "BriefDescription": "The rate of L2 D-Cache misses per kilo instructions",
- "MetricGroup": "Cache",
- "MetricName": "l2d_cache_mpki",
+ "MetricName": "l1d_tlb_mpki",
+ "MetricExpr": "((L1D_TLB_REFILL / INST_RETIRED) * 1000)",
+ "BriefDescription": "This metric measures the number of level 1 instruction TLB accesses missed per thousand instructions executed.",
+ "MetricGroup": "MPKI;DTLB_Effectiveness",
"ScaleUnit": "1MPKI"
},
{
- "MetricExpr": "L2D_CACHE_REFILL / L2D_CACHE",
- "BriefDescription": "The rate of L2 D-Cache misses to the overall L2 D-Cache",
- "MetricGroup": "Cache",
- "MetricName": "l2d_cache_miss_rate",
- "ScaleUnit": "100%"
+ "MetricName": "l1i_cache_miss_ratio",
+ "MetricExpr": "(L1I_CACHE_REFILL / L1I_CACHE)",
+ "BriefDescription": "This metric measures the ratio of level 1 instruction cache accesses missed to the total number of level 1 instruction cache accesses. This gives an indication of the effectiveness of the level 1 instruction cache.",
+ "MetricGroup": "Miss_Ratio;L1I_Cache_Effectiveness",
+ "ScaleUnit": "1per cache access"
},
{
- "MetricExpr": "L3D_CACHE_REFILL / INST_RETIRED * 1000",
- "BriefDescription": "The rate of L3 D-Cache misses per kilo instructions",
- "MetricGroup": "Cache",
- "MetricName": "l3d_cache_mpki",
+ "MetricName": "l1i_cache_mpki",
+ "MetricExpr": "((L1I_CACHE_REFILL / INST_RETIRED) * 1000)",
+ "BriefDescription": "This metric measures the number of level 1 instruction cache accesses missed per thousand instructions executed.",
+ "MetricGroup": "MPKI;L1I_Cache_Effectiveness",
"ScaleUnit": "1MPKI"
},
{
- "MetricExpr": "L3D_CACHE_REFILL / L3D_CACHE",
- "BriefDescription": "The rate of L3 D-Cache misses to the overall L3 D-Cache",
- "MetricGroup": "Cache",
- "MetricName": "l3d_cache_miss_rate",
- "ScaleUnit": "100%"
+ "MetricName": "l1i_tlb_miss_ratio",
+ "MetricExpr": "(L1I_TLB_REFILL / L1I_TLB)",
+ "BriefDescription": "This metric measures the ratio of level 1 instruction TLB accesses missed to the total number of level 1 instruction TLB accesses. This gives an indication of the effectiveness of the level 1 instruction TLB.",
+ "MetricGroup": "Miss_Ratio;ITLB_Effectiveness",
+ "ScaleUnit": "1per TLB access"
},
{
- "MetricExpr": "LL_CACHE_MISS_RD / INST_RETIRED * 1000",
- "BriefDescription": "The rate of LL Cache read misses per kilo instructions",
- "MetricGroup": "Cache",
- "MetricName": "ll_cache_read_mpki",
+ "MetricName": "l1i_tlb_mpki",
+ "MetricExpr": "((L1I_TLB_REFILL / INST_RETIRED) * 1000)",
+ "BriefDescription": "This metric measures the number of level 1 instruction TLB accesses missed per thousand instructions executed.",
+ "MetricGroup": "MPKI;ITLB_Effectiveness",
"ScaleUnit": "1MPKI"
},
{
- "MetricExpr": "LL_CACHE_MISS_RD / LL_CACHE_RD",
- "BriefDescription": "The rate of LL Cache read misses to the overall LL Cache read",
- "MetricGroup": "Cache",
- "MetricName": "ll_cache_read_miss_rate",
- "ScaleUnit": "100%"
+ "MetricName": "l2_cache_miss_ratio",
+ "MetricExpr": "(L2D_CACHE_REFILL / L2D_CACHE)",
+ "BriefDescription": "This metric measures the ratio of level 2 cache accesses missed to the total number of level 2 cache accesses. This gives an indication of the effectiveness of the level 2 cache, which is a unified cache that stores both data and instruction. Note that cache accesses in this cache are either data memory access or instruction fetch as this is a unified cache.",
+ "MetricGroup": "Miss_Ratio;L2_Cache_Effectiveness",
+ "ScaleUnit": "1per cache access"
},
{
- "MetricExpr": "(LL_CACHE_RD - LL_CACHE_MISS_RD) / LL_CACHE_RD",
- "BriefDescription": "The rate of LL Cache read hit to the overall LL Cache read",
- "MetricGroup": "Cache",
- "MetricName": "ll_cache_read_hit_rate",
- "ScaleUnit": "100%"
+ "MetricName": "l2_cache_mpki",
+ "MetricExpr": "((L2D_CACHE_REFILL / INST_RETIRED) * 1000)",
+ "BriefDescription": "This metric measures the number of level 2 unified cache accesses missed per thousand instructions executed. Note that cache accesses in this cache are either data memory access or instruction fetch as this is a unified cache.",
+ "MetricGroup": "MPKI;L2_Cache_Effectiveness",
+ "ScaleUnit": "1MPKI"
},
{
- "MetricExpr": "BR_MIS_PRED_RETIRED / INST_RETIRED * 1000",
- "BriefDescription": "The rate of branches mis-predicted per kilo instructions",
- "MetricGroup": "Branch",
- "MetricName": "branch_mpki",
+ "MetricName": "l2_tlb_miss_ratio",
+ "MetricExpr": "(L2D_TLB_REFILL / L2D_TLB)",
+ "BriefDescription": "This metric measures the ratio of level 2 unified TLB accesses missed to the total number of level 2 unified TLB accesses. This gives an indication of the effectiveness of the level 2 TLB.",
+ "MetricGroup": "Miss_Ratio;ITLB_Effectiveness;DTLB_Effectiveness",
+ "ScaleUnit": "1per TLB access"
+ },
+ {
+ "MetricName": "l2_tlb_mpki",
+ "MetricExpr": "((L2D_TLB_REFILL / INST_RETIRED) * 1000)",
+ "BriefDescription": "This metric measures the number of level 2 unified TLB accesses missed per thousand instructions executed.",
+ "MetricGroup": "MPKI;ITLB_Effectiveness;DTLB_Effectiveness",
"ScaleUnit": "1MPKI"
},
{
- "MetricExpr": "BR_RETIRED / INST_RETIRED * 1000",
- "BriefDescription": "The rate of branches retired per kilo instructions",
- "MetricGroup": "Branch",
- "MetricName": "branch_pki",
- "ScaleUnit": "1PKI"
+ "MetricName": "ll_cache_read_hit_ratio",
+ "MetricExpr": "((LL_CACHE_RD - LL_CACHE_MISS_RD) / LL_CACHE_RD)",
+ "BriefDescription": "This metric measures the ratio of last level cache read accesses hit in the cache to the total number of last level cache accesses. This gives an indication of the effectiveness of the last level cache for read traffic. Note that cache accesses in this cache are either data memory access or instruction fetch as this is a system level cache.",
+ "MetricGroup": "LL_Cache_Effectiveness",
+ "ScaleUnit": "1per cache access"
},
{
- "MetricExpr": "BR_MIS_PRED_RETIRED / BR_RETIRED",
- "BriefDescription": "The rate of branches mis-predited to the overall branches",
- "MetricGroup": "Branch",
- "MetricName": "branch_miss_pred_rate",
- "ScaleUnit": "100%"
+ "MetricName": "ll_cache_read_miss_ratio",
+ "MetricExpr": "(LL_CACHE_MISS_RD / LL_CACHE_RD)",
+ "BriefDescription": "This metric measures the ratio of last level cache read accesses missed to the total number of last level cache accesses. This gives an indication of the effectiveness of the last level cache for read traffic. Note that cache accesses in this cache are either data memory access or instruction fetch as this is a system level cache.",
+ "MetricGroup": "Miss_Ratio;LL_Cache_Effectiveness",
+ "ScaleUnit": "1per cache access"
},
{
- "MetricExpr": "instructions / CPU_CYCLES",
- "BriefDescription": "The average number of instructions executed for each cycle.",
- "MetricGroup": "PEutilization",
- "MetricName": "ipc"
+ "MetricName": "ll_cache_read_mpki",
+ "MetricExpr": "((LL_CACHE_MISS_RD / INST_RETIRED) * 1000)",
+ "BriefDescription": "This metric measures the number of last level cache read accesses missed per thousand instructions executed.",
+ "MetricGroup": "MPKI;LL_Cache_Effectiveness",
+ "ScaleUnit": "1MPKI"
},
{
- "MetricExpr": "ipc / 5",
- "BriefDescription": "IPC percentage of peak. The peak of IPC is 5.",
- "MetricGroup": "PEutilization",
- "MetricName": "ipc_rate",
- "ScaleUnit": "100%"
+ "MetricName": "load_percentage",
+ "MetricExpr": "((LD_SPEC / INST_SPEC) * 100)",
+ "BriefDescription": "This metric measures load operations as a percentage of operations speculatively executed.",
+ "MetricGroup": "Operation_Mix",
+ "ScaleUnit": "1percent of operations"
},
{
- "MetricExpr": "INST_RETIRED / CPU_CYCLES",
- "BriefDescription": "Architecturally executed Instructions Per Cycle (IPC)",
- "MetricGroup": "PEutilization",
- "MetricName": "retired_ipc"
+ "ArchStdEvent": "retiring",
+ "MetricExpr": "(100 * ((OP_RETIRED / OP_SPEC) * (1 - (((STALL_SLOT) if (strcmp_cpuid_str(0x410fd493) | strcmp_cpuid_str(0x410fd490) ^ 1) else (STALL_SLOT - CPU_CYCLES)) / (CPU_CYCLES * #slots)))))"
},
{
- "MetricExpr": "INST_SPEC / CPU_CYCLES",
- "BriefDescription": "Speculatively executed Instructions Per Cycle (IPC)",
- "MetricGroup": "PEutilization",
- "MetricName": "spec_ipc"
+ "MetricName": "scalar_fp_percentage",
+ "MetricExpr": "((VFP_SPEC / INST_SPEC) * 100)",
+ "BriefDescription": "This metric measures scalar floating point operations as a percentage of operations speculatively executed.",
+ "MetricGroup": "Operation_Mix",
+ "ScaleUnit": "1percent of operations"
},
{
- "MetricExpr": "OP_RETIRED / OP_SPEC",
- "BriefDescription": "Of all the micro-operations issued, what percentage are retired(committed)",
- "MetricGroup": "PEutilization",
- "MetricName": "retired_rate",
- "ScaleUnit": "100%"
+ "MetricName": "simd_percentage",
+ "MetricExpr": "((ASE_SPEC / INST_SPEC) * 100)",
+ "BriefDescription": "This metric measures advanced SIMD operations as a percentage of total operations speculatively executed.",
+ "MetricGroup": "Operation_Mix",
+ "ScaleUnit": "1percent of operations"
},
{
- "MetricExpr": "1 - OP_RETIRED / OP_SPEC",
- "BriefDescription": "Of all the micro-operations issued, what percentage are not retired(committed)",
- "MetricGroup": "PEutilization",
- "MetricName": "wasted_rate",
- "ScaleUnit": "100%"
+ "MetricName": "store_percentage",
+ "MetricExpr": "((ST_SPEC / INST_SPEC) * 100)",
+ "BriefDescription": "This metric measures store operations as a percentage of operations speculatively executed.",
+ "MetricGroup": "Operation_Mix",
+ "ScaleUnit": "1percent of operations"
},
{
- "MetricExpr": "OP_RETIRED / OP_SPEC * (1 - (STALL_SLOT if (#slots - 5) else (STALL_SLOT - CPU_CYCLES)) / (#slots * CPU_CYCLES))",
- "BriefDescription": "The truly effective ratio of micro-operations executed by the CPU, which means that misprediction and stall are not included",
- "MetricGroup": "PEutilization",
- "MetricName": "cpu_utilization",
- "ScaleUnit": "100%"
+ "MetricExpr": "L3D_CACHE_REFILL / INST_RETIRED * 1000",
+ "BriefDescription": "The rate of L3 D-Cache misses per kilo instructions",
+ "MetricGroup": "MPKI;L3_Cache_Effectiveness",
+ "MetricName": "l3d_cache_mpki",
+ "ScaleUnit": "1MPKI"
},
{
- "MetricExpr": "LD_SPEC / INST_SPEC",
- "BriefDescription": "The rate of load instructions speculatively executed to overall instructions speclatively executed",
- "MetricGroup": "InstructionMix",
- "MetricName": "load_spec_rate",
+ "MetricExpr": "L3D_CACHE_REFILL / L3D_CACHE",
+ "BriefDescription": "The rate of L3 D-Cache misses to the overall L3 D-Cache",
+ "MetricGroup": "Miss_Ratio;L3_Cache_Effectiveness",
+ "MetricName": "l3d_cache_miss_rate",
"ScaleUnit": "100%"
},
{
- "MetricExpr": "ST_SPEC / INST_SPEC",
- "BriefDescription": "The rate of store instructions speculatively executed to overall instructions speclatively executed",
- "MetricGroup": "InstructionMix",
- "MetricName": "store_spec_rate",
- "ScaleUnit": "100%"
+ "MetricExpr": "BR_RETIRED / INST_RETIRED * 1000",
+ "BriefDescription": "The rate of branches retired per kilo instructions",
+ "MetricGroup": "MPKI;Branch_Effectiveness",
+ "MetricName": "branch_pki",
+ "ScaleUnit": "1PKI"
},
{
- "MetricExpr": "DP_SPEC / INST_SPEC",
- "BriefDescription": "The rate of integer data-processing instructions speculatively executed to overall instructions speclatively executed",
- "MetricGroup": "InstructionMix",
- "MetricName": "data_process_spec_rate",
+ "MetricExpr": "ipc / #slots",
+ "BriefDescription": "IPC percentage of peak. The peak of IPC is the number of slots.",
+ "MetricGroup": "General",
+ "MetricName": "ipc_rate",
"ScaleUnit": "100%"
},
{
- "MetricExpr": "ASE_SPEC / INST_SPEC",
- "BriefDescription": "The rate of advanced SIMD instructions speculatively executed to overall instructions speclatively executed",
- "MetricGroup": "InstructionMix",
- "MetricName": "advanced_simd_spec_rate",
- "ScaleUnit": "100%"
+ "MetricExpr": "INST_SPEC / CPU_CYCLES",
+ "BriefDescription": "Speculatively executed Instructions Per Cycle (IPC)",
+ "MetricGroup": "General",
+ "MetricName": "spec_ipc"
},
{
- "MetricExpr": "VFP_SPEC / INST_SPEC",
- "BriefDescription": "The rate of floating point instructions speculatively executed to overall instructions speclatively executed",
- "MetricGroup": "InstructionMix",
- "MetricName": "float_point_spec_rate",
+ "MetricExpr": "OP_RETIRED / OP_SPEC",
+ "BriefDescription": "Of all the micro-operations issued, what percentage are retired(committed)",
+ "MetricGroup": "General",
+ "MetricName": "retired_rate",
"ScaleUnit": "100%"
},
{
- "MetricExpr": "CRYPTO_SPEC / INST_SPEC",
- "BriefDescription": "The rate of crypto instructions speculatively executed to overall instructions speclatively executed",
- "MetricGroup": "InstructionMix",
- "MetricName": "crypto_spec_rate",
+ "MetricExpr": "1 - OP_RETIRED / OP_SPEC",
+ "BriefDescription": "Of all the micro-operations issued, what percentage are not retired(committed)",
+ "MetricGroup": "General",
+ "MetricName": "wasted_rate",
"ScaleUnit": "100%"
},
{
"MetricExpr": "BR_IMMED_SPEC / INST_SPEC",
- "BriefDescription": "The rate of branch immediate instructions speculatively executed to overall instructions speclatively executed",
- "MetricGroup": "InstructionMix",
+ "BriefDescription": "The rate of branch immediate instructions speculatively executed to overall instructions speculatively executed",
+ "MetricGroup": "Operation_Mix",
"MetricName": "branch_immed_spec_rate",
"ScaleUnit": "100%"
},
{
"MetricExpr": "BR_RETURN_SPEC / INST_SPEC",
- "BriefDescription": "The rate of procedure return instructions speculatively executed to overall instructions speclatively executed",
- "MetricGroup": "InstructionMix",
+ "BriefDescription": "The rate of procedure return instructions speculatively executed to overall instructions speculatively executed",
+ "MetricGroup": "Operation_Mix",
"MetricName": "branch_return_spec_rate",
"ScaleUnit": "100%"
},
{
"MetricExpr": "BR_INDIRECT_SPEC / INST_SPEC",
- "BriefDescription": "The rate of indirect branch instructions speculatively executed to overall instructions speclatively executed",
- "MetricGroup": "InstructionMix",
+ "BriefDescription": "The rate of indirect branch instructions speculatively executed to overall instructions speculatively executed",
+ "MetricGroup": "Operation_Mix",
"MetricName": "branch_indirect_spec_rate",
"ScaleUnit": "100%"
}
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/pipeline.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/pipeline.json
deleted file mode 100644
index f9fae15f7555..000000000000
--- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/pipeline.json
+++ /dev/null
@@ -1,23 +0,0 @@
-[
- {
- "ArchStdEvent": "STALL_FRONTEND"
- },
- {
- "ArchStdEvent": "STALL_BACKEND"
- },
- {
- "ArchStdEvent": "STALL"
- },
- {
- "ArchStdEvent": "STALL_SLOT_BACKEND"
- },
- {
- "ArchStdEvent": "STALL_SLOT_FRONTEND"
- },
- {
- "ArchStdEvent": "STALL_SLOT"
- },
- {
- "ArchStdEvent": "STALL_BACKEND_MEM"
- }
-]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/retired.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/retired.json
new file mode 100644
index 000000000000..f297b049b62f
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/retired.json
@@ -0,0 +1,30 @@
+[
+ {
+ "ArchStdEvent": "SW_INCR",
+ "PublicDescription": "Counts software writes to the PMSWINC_EL0 (software PMU increment) register. The PMSWINC_EL0 register is a manually updated counter for use by application software.\n\nThis event could be used to measure any user program event, such as accesses to a particular data structure (by writing to the PMSWINC_EL0 register each time the data structure is accessed).\n\nTo use the PMSWINC_EL0 register and event, developers must insert instructions that write to the PMSWINC_EL0 register into the source code.\n\nSince the SW_INCR event records writes to the PMSWINC_EL0 register, there is no need to do a read/increment/write sequence to the PMSWINC_EL0 register."
+ },
+ {
+ "ArchStdEvent": "INST_RETIRED",
+ "PublicDescription": "Counts instructions that have been architecturally executed."
+ },
+ {
+ "ArchStdEvent": "CID_WRITE_RETIRED",
+ "PublicDescription": "Counts architecturally executed writes to the CONTEXTIDR register, which usually contain the kernel PID and can be output with hardware trace."
+ },
+ {
+ "ArchStdEvent": "TTBR_WRITE_RETIRED",
+ "PublicDescription": "Counts architectural writes to TTBR0/1_EL1. If virtualization host extensions are enabled (by setting the HCR_EL2.E2H bit to 1), then accesses to TTBR0/1_EL1 that are redirected to TTBR0/1_EL2, or accesses to TTBR0/1_EL12, are counted. TTBRn registers are typically updated when the kernel is swapping user-space threads or applications."
+ },
+ {
+ "ArchStdEvent": "BR_RETIRED",
+ "PublicDescription": "Counts architecturally executed branches, whether the branch is taken or not. Instructions that explicitly write to the PC are also counted."
+ },
+ {
+ "ArchStdEvent": "BR_MIS_PRED_RETIRED",
+ "PublicDescription": "Counts branches counted by BR_RETIRED which were mispredicted and caused a pipeline flush."
+ },
+ {
+ "ArchStdEvent": "OP_RETIRED",
+ "PublicDescription": "Counts micro-operations that are architecturally executed. This is a count of number of micro-operations retired from the commit queue in a single cycle."
+ }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/spe.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/spe.json
index 20f2165c85fe..5de8b0f3a440 100644
--- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/spe.json
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/spe.json
@@ -1,14 +1,18 @@
[
{
- "ArchStdEvent": "SAMPLE_POP"
+ "ArchStdEvent": "SAMPLE_POP",
+ "PublicDescription": "Counts statistical profiling sample population, the count of all operations that could be sampled but may or may not be chosen for sampling."
},
{
- "ArchStdEvent": "SAMPLE_FEED"
+ "ArchStdEvent": "SAMPLE_FEED",
+ "PublicDescription": "Counts statistical profiling samples taken for sampling."
},
{
- "ArchStdEvent": "SAMPLE_FILTRATE"
+ "ArchStdEvent": "SAMPLE_FILTRATE",
+ "PublicDescription": "Counts statistical profiling samples taken which are not removed by filtering."
},
{
- "ArchStdEvent": "SAMPLE_COLLISION"
+ "ArchStdEvent": "SAMPLE_COLLISION",
+ "PublicDescription": "Counts statistical profiling samples that have collided with a previous sample and so therefore not taken."
}
]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/spec_operation.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/spec_operation.json
new file mode 100644
index 000000000000..1af961f8a6c8
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/spec_operation.json
@@ -0,0 +1,110 @@
+[
+ {
+ "ArchStdEvent": "BR_MIS_PRED",
+ "PublicDescription": "Counts branches which are speculatively executed and mispredicted."
+ },
+ {
+ "ArchStdEvent": "BR_PRED",
+ "PublicDescription": "Counts branches speculatively executed and were predicted right."
+ },
+ {
+ "ArchStdEvent": "INST_SPEC",
+ "PublicDescription": "Counts operations that have been speculatively executed."
+ },
+ {
+ "ArchStdEvent": "OP_SPEC",
+ "PublicDescription": "Counts micro-operations speculatively executed. This is the count of the number of micro-operations dispatched in a cycle."
+ },
+ {
+ "ArchStdEvent": "UNALIGNED_LD_SPEC",
+ "PublicDescription": "Counts unaligned memory read operations issued by the CPU. This event counts unaligned accesses (as defined by the actual instruction), even if they are subsequently issued as multiple aligned accesses. The event does not count preload operations (PLD, PLI)."
+ },
+ {
+ "ArchStdEvent": "UNALIGNED_ST_SPEC",
+ "PublicDescription": "Counts unaligned memory write operations issued by the CPU. This event counts unaligned accesses (as defined by the actual instruction), even if they are subsequently issued as multiple aligned accesses."
+ },
+ {
+ "ArchStdEvent": "UNALIGNED_LDST_SPEC",
+ "PublicDescription": "Counts unaligned memory operations issued by the CPU. This event counts unaligned accesses (as defined by the actual instruction), even if they are subsequently issued as multiple aligned accesses."
+ },
+ {
+ "ArchStdEvent": "LDREX_SPEC",
+ "PublicDescription": "Counts Load-Exclusive operations that have been speculatively executed. Eg: LDREX, LDX"
+ },
+ {
+ "ArchStdEvent": "STREX_PASS_SPEC",
+ "PublicDescription": "Counts store-exclusive operations that have been speculatively executed and have successfully completed the store operation."
+ },
+ {
+ "ArchStdEvent": "STREX_FAIL_SPEC",
+ "PublicDescription": "Counts store-exclusive operations that have been speculatively executed and have not successfully completed the store operation."
+ },
+ {
+ "ArchStdEvent": "STREX_SPEC",
+ "PublicDescription": "Counts store-exclusive operations that have been speculatively executed."
+ },
+ {
+ "ArchStdEvent": "LD_SPEC",
+ "PublicDescription": "Counts speculatively executed load operations including Single Instruction Multiple Data (SIMD) load operations."
+ },
+ {
+ "ArchStdEvent": "ST_SPEC",
+ "PublicDescription": "Counts speculatively executed store operations including Single Instruction Multiple Data (SIMD) store operations."
+ },
+ {
+ "ArchStdEvent": "DP_SPEC",
+ "PublicDescription": "Counts speculatively executed logical or arithmetic instructions such as MOV/MVN operations."
+ },
+ {
+ "ArchStdEvent": "ASE_SPEC",
+ "PublicDescription": "Counts speculatively executed Advanced SIMD operations excluding load, store and move micro-operations that move data to or from SIMD (vector) registers."
+ },
+ {
+ "ArchStdEvent": "VFP_SPEC",
+ "PublicDescription": "Counts speculatively executed floating point operations. This event does not count operations that move data to or from floating point (vector) registers."
+ },
+ {
+ "ArchStdEvent": "PC_WRITE_SPEC",
+ "PublicDescription": "Counts speculatively executed operations which cause software changes of the PC. Those operations include all taken branch operations."
+ },
+ {
+ "ArchStdEvent": "CRYPTO_SPEC",
+ "PublicDescription": "Counts speculatively executed cryptographic operations except for PMULL and VMULL operations."
+ },
+ {
+ "ArchStdEvent": "BR_IMMED_SPEC",
+ "PublicDescription": "Counts immediate branch operations which are speculatively executed."
+ },
+ {
+ "ArchStdEvent": "BR_RETURN_SPEC",
+ "PublicDescription": "Counts procedure return operations (RET) which are speculatively executed."
+ },
+ {
+ "ArchStdEvent": "BR_INDIRECT_SPEC",
+ "PublicDescription": "Counts indirect branch operations including procedure returns, which are speculatively executed. This includes operations that force a software change of the PC, other than exception-generating operations. Eg: BR Xn, RET"
+ },
+ {
+ "ArchStdEvent": "ISB_SPEC",
+ "PublicDescription": "Counts ISB operations that are executed."
+ },
+ {
+ "ArchStdEvent": "DSB_SPEC",
+ "PublicDescription": "Counts DSB operations that are speculatively issued to Load/Store unit in the CPU."
+ },
+ {
+ "ArchStdEvent": "DMB_SPEC",
+ "PublicDescription": "Counts DMB operations that are speculatively issued to the Load/Store unit in the CPU. This event does not count implied barriers from load acquire/store release operations."
+ },
+ {
+ "ArchStdEvent": "RC_LD_SPEC",
+ "PublicDescription": "Counts any load acquire operations that are speculatively executed. Eg: LDAR, LDARH, LDARB"
+ },
+ {
+ "ArchStdEvent": "RC_ST_SPEC",
+ "PublicDescription": "Counts any store release operations that are speculatively executed. Eg: STLR, STLRH, STLRB'"
+ },
+ {
+ "ArchStdEvent": "ASE_INST_SPEC",
+ "PublicDescription": "Counts speculatively executed Advanced SIMD operations."
+ }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/stall.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/stall.json
new file mode 100644
index 000000000000..bbbebc805034
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/stall.json
@@ -0,0 +1,30 @@
+[
+ {
+ "ArchStdEvent": "STALL_FRONTEND",
+ "PublicDescription": "Counts cycles when frontend could not send any micro-operations to the rename stage because of frontend resource stalls caused by fetch memory latency or branch prediction flow stalls. All the frontend slots were empty during the cycle when this event counts."
+ },
+ {
+ "ArchStdEvent": "STALL_BACKEND",
+ "PublicDescription": "Counts cycles whenever the rename unit is unable to send any micro-operations to the backend of the pipeline because of backend resource constraints. Backend resource constraints can include issue stage fullness, execution stage fullness, or other internal pipeline resource fullness. All the backend slots were empty during the cycle when this event counts."
+ },
+ {
+ "ArchStdEvent": "STALL",
+ "PublicDescription": "Counts cycles when no operations are sent to the rename unit from the frontend or from the rename unit to the backend for any reason (either frontend or backend stall)."
+ },
+ {
+ "ArchStdEvent": "STALL_SLOT_BACKEND",
+ "PublicDescription": "Counts slots per cycle in which no operations are sent from the rename unit to the backend due to backend resource constraints."
+ },
+ {
+ "ArchStdEvent": "STALL_SLOT_FRONTEND",
+ "PublicDescription": "Counts slots per cycle in which no operations are sent to the rename unit from the frontend due to frontend resource constraints."
+ },
+ {
+ "ArchStdEvent": "STALL_SLOT",
+ "PublicDescription": "Counts slots per cycle in which no operations are sent to the rename unit from the frontend or from the rename unit to the backend for any reason (either frontend or backend stall)."
+ },
+ {
+ "ArchStdEvent": "STALL_BACKEND_MEM",
+ "PublicDescription": "Counts cycles when the backend is stalled because there is a pending demand load request in progress in the last level core cache."
+ }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/sve.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/sve.json
new file mode 100644
index 000000000000..51dab48cb2ba
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/sve.json
@@ -0,0 +1,50 @@
+[
+ {
+ "ArchStdEvent": "SVE_INST_SPEC",
+ "PublicDescription": "Counts speculatively executed operations that are SVE operations."
+ },
+ {
+ "ArchStdEvent": "SVE_PRED_SPEC",
+ "PublicDescription": "Counts speculatively executed predicated SVE operations."
+ },
+ {
+ "ArchStdEvent": "SVE_PRED_EMPTY_SPEC",
+ "PublicDescription": "Counts speculatively executed predicated SVE operations with no active predicate elements."
+ },
+ {
+ "ArchStdEvent": "SVE_PRED_FULL_SPEC",
+ "PublicDescription": "Counts speculatively executed predicated SVE operations with all predicate elements active."
+ },
+ {
+ "ArchStdEvent": "SVE_PRED_PARTIAL_SPEC",
+ "PublicDescription": "Counts speculatively executed predicated SVE operations with at least one but not all active predicate elements."
+ },
+ {
+ "ArchStdEvent": "SVE_PRED_NOT_FULL_SPEC",
+ "PublicDescription": "Counts speculatively executed predicated SVE operations with at least one non active predicate elements."
+ },
+ {
+ "ArchStdEvent": "SVE_LDFF_SPEC",
+ "PublicDescription": "Counts speculatively executed SVE first fault or non-fault load operations."
+ },
+ {
+ "ArchStdEvent": "SVE_LDFF_FAULT_SPEC",
+ "PublicDescription": "Counts speculatively executed SVE first fault or non-fault load operations that clear at least one bit in the FFR."
+ },
+ {
+ "ArchStdEvent": "ASE_SVE_INT8_SPEC",
+ "PublicDescription": "Counts speculatively executed Advanced SIMD or SVE integer operations with the largest data type an 8-bit integer."
+ },
+ {
+ "ArchStdEvent": "ASE_SVE_INT16_SPEC",
+ "PublicDescription": "Counts speculatively executed Advanced SIMD or SVE integer operations with the largest data type a 16-bit integer."
+ },
+ {
+ "ArchStdEvent": "ASE_SVE_INT32_SPEC",
+ "PublicDescription": "Counts speculatively executed Advanced SIMD or SVE integer operations with the largest data type a 32-bit integer."
+ },
+ {
+ "ArchStdEvent": "ASE_SVE_INT64_SPEC",
+ "PublicDescription": "Counts speculatively executed Advanced SIMD or SVE integer operations with the largest data type a 64-bit integer."
+ }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/tlb.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/tlb.json
new file mode 100644
index 000000000000..b550af1831f5
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/tlb.json
@@ -0,0 +1,66 @@
+[
+ {
+ "ArchStdEvent": "L1I_TLB_REFILL",
+ "PublicDescription": "Counts level 1 instruction TLB refills from any Instruction fetch. If there are multiple misses in the TLB that are resolved by the refill, then this event only counts once. This event will not count if the translation table walk results in a fault (such as a translation or access fault), since there is no new translation created for the TLB."
+ },
+ {
+ "ArchStdEvent": "L1D_TLB_REFILL",
+ "PublicDescription": "Counts level 1 data TLB accesses that resulted in TLB refills. If there are multiple misses in the TLB that are resolved by the refill, then this event only counts once. This event counts for refills caused by preload instructions or hardware prefetch accesses. This event counts regardless of whether the miss hits in L2 or results in a translation table walk. This event will not count if the translation table walk results in a fault (such as a translation or access fault), since there is no new translation created for the TLB. This event will not count on an access from an AT(address translation) instruction."
+ },
+ {
+ "ArchStdEvent": "L1D_TLB",
+ "PublicDescription": "Counts level 1 data TLB accesses caused by any memory load or store operation. Note that load or store instructions can be broken up into multiple memory operations. This event does not count TLB maintenance operations."
+ },
+ {
+ "ArchStdEvent": "L1I_TLB",
+ "PublicDescription": "Counts level 1 instruction TLB accesses, whether the access hits or misses in the TLB. This event counts both demand accesses and prefetch or preload generated accesses."
+ },
+ {
+ "ArchStdEvent": "L2D_TLB_REFILL",
+ "PublicDescription": "Counts level 2 TLB refills caused by memory operations from both data and instruction fetch, except for those caused by TLB maintenance operations and hardware prefetches."
+ },
+ {
+ "ArchStdEvent": "L2D_TLB",
+ "PublicDescription": "Counts level 2 TLB accesses except those caused by TLB maintenance operations."
+ },
+ {
+ "ArchStdEvent": "DTLB_WALK",
+ "PublicDescription": "Counts data memory translation table walks caused by a miss in the L2 TLB driven by a memory access. Note that partial translations that also cause a table walk are counted. This event does not count table walks caused by TLB maintenance operations."
+ },
+ {
+ "ArchStdEvent": "ITLB_WALK",
+ "PublicDescription": "Counts instruction memory translation table walks caused by a miss in the L2 TLB driven by a memory access. Partial translations that also cause a table walk are counted. This event does not count table walks caused by TLB maintenance operations."
+ },
+ {
+ "ArchStdEvent": "L1D_TLB_REFILL_RD",
+ "PublicDescription": "Counts level 1 data TLB refills caused by memory read operations. If there are multiple misses in the TLB that are resolved by the refill, then this event only counts once. This event counts for refills caused by preload instructions or hardware prefetch accesses. This event counts regardless of whether the miss hits in L2 or results in a translation table walk. This event will not count if the translation table walk results in a fault (such as a translation or access fault), since there is no new translation created for the TLB. This event will not count on an access from an Address Translation (AT) instruction."
+ },
+ {
+ "ArchStdEvent": "L1D_TLB_REFILL_WR",
+ "PublicDescription": "Counts level 1 data TLB refills caused by data side memory write operations. If there are multiple misses in the TLB that are resolved by the refill, then this event only counts once. This event counts for refills caused by preload instructions or hardware prefetch accesses. This event counts regardless of whether the miss hits in L2 or results in a translation table walk. This event will not count if the table walk results in a fault (such as a translation or access fault), since there is no new translation created for the TLB. This event will not count with an access from an Address Translation (AT) instruction."
+ },
+ {
+ "ArchStdEvent": "L1D_TLB_RD",
+ "PublicDescription": "Counts level 1 data TLB accesses caused by memory read operations. This event counts whether the access hits or misses in the TLB. This event does not count TLB maintenance operations."
+ },
+ {
+ "ArchStdEvent": "L1D_TLB_WR",
+ "PublicDescription": "Counts any L1 data side TLB accesses caused by memory write operations. This event counts whether the access hits or misses in the TLB. This event does not count TLB maintenance operations."
+ },
+ {
+ "ArchStdEvent": "L2D_TLB_REFILL_RD",
+ "PublicDescription": "Counts level 2 TLB refills caused by memory read operations from both data and instruction fetch except for those caused by TLB maintenance operations or hardware prefetches."
+ },
+ {
+ "ArchStdEvent": "L2D_TLB_REFILL_WR",
+ "PublicDescription": "Counts level 2 TLB refills caused by memory write operations from both data and instruction fetch except for those caused by TLB maintenance operations."
+ },
+ {
+ "ArchStdEvent": "L2D_TLB_RD",
+ "PublicDescription": "Counts level 2 TLB accesses caused by memory read operations from both data and instruction fetch except for those caused by TLB maintenance operations."
+ },
+ {
+ "ArchStdEvent": "L2D_TLB_WR",
+ "PublicDescription": "Counts level 2 TLB accesses caused by memory write operations from both data and instruction fetch except for those caused by TLB maintenance operations."
+ }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/trace.json b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/trace.json
index 3116135c59e2..98f6fabfebc7 100644
--- a/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/trace.json
+++ b/tools/perf/pmu-events/arch/arm64/arm/neoverse-n2-v2/trace.json
@@ -1,29 +1,38 @@
[
{
- "ArchStdEvent": "TRB_WRAP"
+ "ArchStdEvent": "TRB_WRAP",
+ "PublicDescription": "This event is generated each time the current write pointer is wrapped to the base pointer."
},
{
- "ArchStdEvent": "TRCEXTOUT0"
+ "ArchStdEvent": "TRCEXTOUT0",
+ "PublicDescription": "This event is generated each time an event is signaled by ETE external event 0."
},
{
- "ArchStdEvent": "TRCEXTOUT1"
+ "ArchStdEvent": "TRCEXTOUT1",
+ "PublicDescription": "This event is generated each time an event is signaled by ETE external event 1."
},
{
- "ArchStdEvent": "TRCEXTOUT2"
+ "ArchStdEvent": "TRCEXTOUT2",
+ "PublicDescription": "This event is generated each time an event is signaled by ETE external event 2."
},
{
- "ArchStdEvent": "TRCEXTOUT3"
+ "ArchStdEvent": "TRCEXTOUT3",
+ "PublicDescription": "This event is generated each time an event is signaled by ETE external event 3."
},
{
- "ArchStdEvent": "CTI_TRIGOUT4"
+ "ArchStdEvent": "CTI_TRIGOUT4",
+ "PublicDescription": "This event is generated each time an event is signaled on CTI output trigger 4."
},
{
- "ArchStdEvent": "CTI_TRIGOUT5"
+ "ArchStdEvent": "CTI_TRIGOUT5",
+ "PublicDescription": "This event is generated each time an event is signaled on CTI output trigger 5."
},
{
- "ArchStdEvent": "CTI_TRIGOUT6"
+ "ArchStdEvent": "CTI_TRIGOUT6",
+ "PublicDescription": "This event is generated each time an event is signaled on CTI output trigger 6."
},
{
- "ArchStdEvent": "CTI_TRIGOUT7"
+ "ArchStdEvent": "CTI_TRIGOUT7",
+ "PublicDescription": "This event is generated each time an event is signaled on CTI output trigger 7."
}
]
diff --git a/tools/perf/pmu-events/arch/arm64/freescale/yitian710/sys/ali_drw.json b/tools/perf/pmu-events/arch/arm64/freescale/yitian710/sys/ali_drw.json
new file mode 100644
index 000000000000..e21c469a8ef0
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/freescale/yitian710/sys/ali_drw.json
@@ -0,0 +1,373 @@
+[
+ {
+ "BriefDescription": "A Write or Read Op at HIF interface. The unit is 64B.",
+ "ConfigCode": "0x0",
+ "EventName": "hif_rd_or_wr",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A Write Op at HIF interface. The unit is 64B.",
+ "ConfigCode": "0x1",
+ "EventName": "hif_wr",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A Read Op at HIF interface. The unit is 64B.",
+ "ConfigCode": "0x2",
+ "EventName": "hif_rd",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A Read-Modify-Write Op at HIF interface. The unit is 64B.",
+ "ConfigCode": "0x3",
+ "EventName": "hif_rmw",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A high priority Read at HIF interface. The unit is 64B.",
+ "ConfigCode": "0x4",
+ "EventName": "hif_hi_pri_rd",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A write data cycle at DFI interface (to DRAM).",
+ "ConfigCode": "0x7",
+ "EventName": "dfi_wr_data_cycles",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A read data cycle at DFI interface (to DRAM).",
+ "ConfigCode": "0x8",
+ "EventName": "dfi_rd_data_cycles",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A high priority read becomes critical.",
+ "ConfigCode": "0x9",
+ "EventName": "hpr_xact_when_critical",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A low priority read becomes critical.",
+ "ConfigCode": "0xA",
+ "EventName": "lpr_xact_when_critical",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A write becomes critical.",
+ "ConfigCode": "0xB",
+ "EventName": "wr_xact_when_critical",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "An Activate(ACT) command to DRAM.",
+ "ConfigCode": "0xC",
+ "EventName": "op_is_activate",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A Read or Write CAS command to DRAM.",
+ "ConfigCode": "0xD",
+ "EventName": "op_is_rd_or_wr",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "An Activate(ACT) command for read to DRAM.",
+ "ConfigCode": "0xE",
+ "EventName": "op_is_rd_activate",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A Read CAS command to DRAM.",
+ "ConfigCode": "0xF",
+ "EventName": "op_is_rd",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A Write CAS command to DRAM.",
+ "ConfigCode": "0x10",
+ "EventName": "op_is_wr",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A Masked Write command to DRAM.",
+ "ConfigCode": "0x11",
+ "EventName": "op_is_mwr",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A Precharge(PRE) command to DRAM.",
+ "ConfigCode": "0x12",
+ "EventName": "op_is_precharge",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A Precharge(PRE) required by read or write.",
+ "ConfigCode": "0x13",
+ "EventName": "precharge_for_rdwr",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A Precharge(PRE) required by other conditions.",
+ "ConfigCode": "0x14",
+ "EventName": "precharge_for_other",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A read-write turnaround.",
+ "ConfigCode": "0x15",
+ "EventName": "rdwr_transitions",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A write combine(merge) in write data buffer.",
+ "ConfigCode": "0x16",
+ "EventName": "write_combine",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A Write-After-Read hazard.",
+ "ConfigCode": "0x17",
+ "EventName": "war_hazard",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A Read-After-Write hazard.",
+ "ConfigCode": "0x18",
+ "EventName": "raw_hazard",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A Write-After-Write hazard.",
+ "ConfigCode": "0x19",
+ "EventName": "waw_hazard",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "Rank0 enters self-refresh(SRE).",
+ "ConfigCode": "0x1A",
+ "EventName": "op_is_enter_selfref_rk0",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "Rank1 enters self-refresh(SRE).",
+ "ConfigCode": "0x1B",
+ "EventName": "op_is_enter_selfref_rk1",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "Rank2 enters self-refresh(SRE).",
+ "ConfigCode": "0x1C",
+ "EventName": "op_is_enter_selfref_rk2",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "Rank3 enters self-refresh(SRE).",
+ "ConfigCode": "0x1D",
+ "EventName": "op_is_enter_selfref_rk3",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "Rank0 enters power-down(PDE).",
+ "ConfigCode": "0x1E",
+ "EventName": "op_is_enter_powerdown_rk0",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "Rank1 enters power-down(PDE).",
+ "ConfigCode": "0x1F",
+ "EventName": "op_is_enter_powerdown_rk1",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "Rank2 enters power-down(PDE).",
+ "ConfigCode": "0x20",
+ "EventName": "op_is_enter_powerdown_rk2",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "Rank3 enters power-down(PDE).",
+ "ConfigCode": "0x21",
+ "EventName": "op_is_enter_powerdown_rk3",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A cycle that Rank0 stays in self-refresh mode.",
+ "ConfigCode": "0x26",
+ "EventName": "selfref_mode_rk0",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A cycle that Rank1 stays in self-refresh mode.",
+ "ConfigCode": "0x27",
+ "EventName": "selfref_mode_rk1",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A cycle that Rank2 stays in self-refresh mode.",
+ "ConfigCode": "0x28",
+ "EventName": "selfref_mode_rk2",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A cycle that Rank3 stays in self-refresh mode.",
+ "ConfigCode": "0x29",
+ "EventName": "selfref_mode_rk3",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "An auto-refresh(REF) command to DRAM.",
+ "ConfigCode": "0x2A",
+ "EventName": "op_is_refresh",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A critical auto-refresh(REF) command to DRAM.",
+ "ConfigCode": "0x2B",
+ "EventName": "op_is_crit_ref",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "An MRR or MRW command to DRAM.",
+ "ConfigCode": "0x2D",
+ "EventName": "op_is_load_mode",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A ZQCal command to DRAM.",
+ "ConfigCode": "0x2E",
+ "EventName": "op_is_zqcl",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "At least one entry in read queue reaches the visible window limit.",
+ "ConfigCode": "0x30",
+ "EventName": "visible_window_limit_reached_rd",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "At least one entry in write queue reaches the visible window limit.",
+ "ConfigCode": "0x31",
+ "EventName": "visible_window_limit_reached_wr",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A DQS Oscillator MPC command to DRAM.",
+ "ConfigCode": "0x34",
+ "EventName": "op_is_dqsosc_mpc",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A DQS Oscillator MRR command to DRAM.",
+ "ConfigCode": "0x35",
+ "EventName": "op_is_dqsosc_mrr",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A Temperature Compensated Refresh(TCR) MRR command to DRAM.",
+ "ConfigCode": "0x36",
+ "EventName": "op_is_tcr_mrr",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A ZQCal Start command to DRAM.",
+ "ConfigCode": "0x37",
+ "EventName": "op_is_zqstart",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A ZQCal Latch command to DRAM.",
+ "ConfigCode": "0x38",
+ "EventName": "op_is_zqlatch",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A packet at CHI TXREQ interface (request).",
+ "ConfigCode": "0x39",
+ "EventName": "chi_txreq",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A packet at CHI TXDAT interface (read data).",
+ "ConfigCode": "0x3A",
+ "EventName": "chi_txdat",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A packet at CHI RXDAT interface (write data).",
+ "ConfigCode": "0x3B",
+ "EventName": "chi_rxdat",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A packet at CHI RXRSP interface.",
+ "ConfigCode": "0x3C",
+ "EventName": "chi_rxrsp",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "A violation detected in TZC.",
+ "ConfigCode": "0x3D",
+ "EventName": "tsz_vio",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "BriefDescription": "The ddr cycles.",
+ "ConfigCode": "0x80",
+ "EventName": "ddr_cycles",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/freescale/yitian710/sys/metrics.json b/tools/perf/pmu-events/arch/arm64/freescale/yitian710/sys/metrics.json
new file mode 100644
index 000000000000..bc865b374b6a
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/freescale/yitian710/sys/metrics.json
@@ -0,0 +1,20 @@
+[
+ {
+ "MetricName": "ddr_read_bandwidth.all",
+ "BriefDescription": "The ddr read bandwidth(MB/s).",
+ "MetricGroup": "ali_drw",
+ "MetricExpr": "hif_rd * 64 / 1e6 / duration_time",
+ "ScaleUnit": "1MB/s",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ },
+ {
+ "MetricName": "ddr_write_bandwidth.all",
+ "BriefDescription": "The ddr write bandwidth(MB/s).",
+ "MetricGroup": "ali_drw",
+ "MetricExpr": "(hif_wr + hif_rmw) * 64 / 1e6 / duration_time",
+ "ScaleUnit": "1MB/s",
+ "Unit": "ali_drw",
+ "Compat": "ali_drw_pmu"
+ }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/sbsa.json b/tools/perf/pmu-events/arch/arm64/sbsa.json
index f90b338261ac..4eed79a28f6e 100644
--- a/tools/perf/pmu-events/arch/arm64/sbsa.json
+++ b/tools/perf/pmu-events/arch/arm64/sbsa.json
@@ -1,34 +1,34 @@
[
{
- "MetricExpr": "stall_slot_frontend / (#slots * cpu_cycles)",
- "BriefDescription": "Frontend bound L1 topdown metric",
+ "MetricExpr": "100 * (stall_slot_frontend / (#slots * cpu_cycles))",
+ "BriefDescription": "This metric is the percentage of total slots that were stalled due to resource constraints in the frontend of the processor.",
"DefaultMetricgroupName": "TopdownL1",
"MetricGroup": "Default;TopdownL1",
"MetricName": "frontend_bound",
- "ScaleUnit": "100%"
+ "ScaleUnit": "1percent of slots"
},
{
- "MetricExpr": "(1 - op_retired / op_spec) * (1 - stall_slot / (#slots * cpu_cycles))",
- "BriefDescription": "Bad speculation L1 topdown metric",
+ "MetricExpr": "100 * ((1 - op_retired / op_spec) * (1 - stall_slot / (#slots * cpu_cycles)))",
+ "BriefDescription": "This metric is the percentage of total slots that executed operations and didn't retire due to a pipeline flush.\nThis indicates cycles that were utilized but inefficiently.",
"DefaultMetricgroupName": "TopdownL1",
"MetricGroup": "Default;TopdownL1",
"MetricName": "bad_speculation",
- "ScaleUnit": "100%"
+ "ScaleUnit": "1percent of slots"
},
{
- "MetricExpr": "(op_retired / op_spec) * (1 - stall_slot / (#slots * cpu_cycles))",
- "BriefDescription": "Retiring L1 topdown metric",
+ "MetricExpr": "100 * ((op_retired / op_spec) * (1 - stall_slot / (#slots * cpu_cycles)))",
+ "BriefDescription": "This metric is the percentage of total slots that retired operations, which indicates cycles that were utilized efficiently.",
"DefaultMetricgroupName": "TopdownL1",
"MetricGroup": "Default;TopdownL1",
"MetricName": "retiring",
- "ScaleUnit": "100%"
+ "ScaleUnit": "1percent of slots"
},
{
- "MetricExpr": "stall_slot_backend / (#slots * cpu_cycles)",
- "BriefDescription": "Backend Bound L1 topdown metric",
+ "MetricExpr": "100 * (stall_slot_backend / (#slots * cpu_cycles))",
+ "BriefDescription": "This metric is the percentage of total slots that were stalled due to resource constraints in the backend of the processor.",
"DefaultMetricgroupName": "TopdownL1",
"MetricGroup": "Default;TopdownL1",
"MetricName": "backend_bound",
- "ScaleUnit": "100%"
+ "ScaleUnit": "1percent of slots"
}
]