| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When synthesizing data from SPE, augment the type with source information
for Arm Neoverse cores. The field is IMPLDEF but the Neoverse cores all use
the same encoding. I can't find encoding information for any other SPE
implementations to unify their choices with Arm's thus that is left for
future work.
This change populates the mem_lvl_num for Neoverse cores as well as the
deprecated mem_lvl namespace.
Reviewed-by: German Gomez <german.gomez@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Ali Saidi <alisaidi@amazon.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Like Xu <likexu@tencent.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Timothy Hayes <timothy.hayes@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220811062451.435810-4-leo.yan@linaro.org
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use total latency info in the SPE counter packet as sample weight so
that we can see it in local_weight and (global) weight sort keys.
Maybe we can use PERF_SAMPLE_WEIGHT_STRUCT to support ins_lat as well
but I'm not sure which latency it matches. So just adding total latency
first.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211201220855.1260688-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch is to save context ID in record, this will be used to set TID
for samples.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211111133625.193568-4-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch is to store operation type in packet structure.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: James Clark <james.clark@arm.com>
Link: https://lore.kernel.org/r/20210211133856.2137-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch is to store virtual and physical memory addresses in packet,
which will be used for memory samples.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wei Li <liwei391@huawei.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210211133856.2137-2-james.clark@arm.com
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the Armv8 ARM (ARM DDI 0487F.c), chapter "D10.2.6 Events packet", it
describes the event bit is valid with specific payload requirement. For
example, the Last Level cache access event, the bit is defined as:
E[8], byte 1 bit [0], when SZ == 0b01 , when SZ == 0b10 ,
or when SZ == 0b11
It requires the payload size is at least 2 bytes, when byte 1 (start
counting from 0) is valid, E[8] (bit 0 in byte 1) can be used for LLC
access event type. For safety, the code checks the condition for
payload size firstly, if meet the requirement for payload size, then
continue to parse event type.
If review function arm_spe_get_payload(), it has used cast, so any bytes
beyond the valid size have been set to zeros.
For this reason, we don't need to check payload size anymore afterwards
when parse events, thus this patch removes payload size conditions.
Suggested-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wei Li <liwei391@huawei.com>
Link: https://lore.kernel.org/r/20201119152441.6972-12-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To establish a valid address from the address packet payload and finally
the address value can be used for parsing data symbol in DSO, current
code uses 0xff to replace the tag in the top byte of data virtual
address.
So far the code only fixups top byte for the memory layouts with 4KB
pages, it misses to support memory layouts with 64KB pages.
This patch adds the conditions for checking bits [55:52] are 0xf, if
detects the pattern it will fill 0xff into the top byte of the address,
also adds comment to explain the fixing up.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wei Li <liwei391@huawei.com>
Link: https://lore.kernel.org/r/20201119152441.6972-6-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch is to refactor address packet handling, it defines macros for
address packet's header and payload, these macros are used by decoder
and the dump flow.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wei Li <liwei391@huawei.com>
Link: https://lore.kernel.org/r/20201119152441.6972-5-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Include header linux/bitops.h, directly use its BIT() macro and remove
the self defined macros.
Committer notes:
Use BIT_ULL() instead of BIT to build on 32-bit arches as mentioned in
review by Andre Przywara <andre.przywara@arm.com>. I noticed the build
failure when crossbuilding to arm32 from x86_64.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Link: https://lore.kernel.org/r/20201111071149.815-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In arm_spe_read_record(), when we are processing an events packet,
'decoder->packet.index' is the length of payload, which has been
transformed in payloadlen(). So correct the check of 'idx'.
Signed-off-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200724072628.35904-1-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
After the commit ffd3d18c20b8 ("perf tools: Add ARM Statistical
Profiling Extensions (SPE) support") has been merged, it supports to
output raw data with option "--dump-raw-trace". However, it misses for
support synthetic events so cannot output any statistical info.
This patch is to improve the "perf report" support for ARM SPE for four
types synthetic events:
First level cache synthetic events, including L1 data cache accessing
and missing events;
Last level cache synthetic events, including last level cache
accessing and missing events;
TLB synthetic events, including TLB accessing and missing events;
Remote access events, which is used to account load/store operations
caused to another socket.
Example usage:
$ perf record -c 1024 -e arm_spe_0/branch_filter=1,ts_enable=1,pct_enable=1,pa_enable=1,load_filter=1,jitter=1,store_filter=1,min_latency=0/ dd if=/dev/zero of=/dev/null count=10000
$ perf report --stdio
# Samples: 59 of event 'l1d-miss'
# Event count (approx.): 59
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. ..................................
#
23.73% 23.73% dd [kernel.kallsyms] [k] perf_iterate_ctx.constprop.135
20.34% 20.34% dd [kernel.kallsyms] [k] filemap_map_pages
5.08% 5.08% dd [kernel.kallsyms] [k] perf_event_mmap
5.08% 5.08% dd [kernel.kallsyms] [k] unlock_page_memcg
5.08% 5.08% dd [kernel.kallsyms] [k] unmap_page_range
3.39% 3.39% dd [kernel.kallsyms] [k] PageHuge
3.39% 3.39% dd [kernel.kallsyms] [k] release_pages
3.39% 3.39% dd ld-2.28.so [.] 0x0000000000008b5c
1.69% 1.69% dd [kernel.kallsyms] [k] __alloc_fd
[...]
# Samples: 3K of event 'l1d-access'
# Event count (approx.): 3980
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. ......................................
#
26.98% 26.98% dd [kernel.kallsyms] [k] ret_to_user
10.53% 10.53% dd [kernel.kallsyms] [k] fsnotify
7.51% 7.51% dd [kernel.kallsyms] [k] new_sync_read
4.57% 4.57% dd [kernel.kallsyms] [k] vfs_read
4.35% 4.35% dd [kernel.kallsyms] [k] vfs_write
3.69% 3.69% dd [kernel.kallsyms] [k] __fget_light
3.69% 3.69% dd [kernel.kallsyms] [k] rw_verify_area
3.44% 3.44% dd [kernel.kallsyms] [k] security_file_permission
2.76% 2.76% dd [kernel.kallsyms] [k] __fsnotify_parent
2.44% 2.44% dd [kernel.kallsyms] [k] ksys_write
2.24% 2.24% dd [kernel.kallsyms] [k] iov_iter_zero
2.19% 2.19% dd [kernel.kallsyms] [k] read_iter_zero
1.81% 1.81% dd dd [.] 0x0000000000002960
1.78% 1.78% dd dd [.] 0x0000000000002980
[...]
# Samples: 35 of event 'llc-miss'
# Event count (approx.): 35
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. ...........................
#
34.29% 34.29% dd [kernel.kallsyms] [k] filemap_map_pages
8.57% 8.57% dd [kernel.kallsyms] [k] unlock_page_memcg
8.57% 8.57% dd [kernel.kallsyms] [k] unmap_page_range
5.71% 5.71% dd [kernel.kallsyms] [k] PageHuge
5.71% 5.71% dd [kernel.kallsyms] [k] release_pages
5.71% 5.71% dd ld-2.28.so [.] 0x0000000000008b5c
2.86% 2.86% dd [kernel.kallsyms] [k] __queue_work
2.86% 2.86% dd [kernel.kallsyms] [k] __radix_tree_lookup
2.86% 2.86% dd [kernel.kallsyms] [k] copy_page
[...]
# Samples: 2 of event 'llc-access'
# Event count (approx.): 2
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. .............
#
50.00% 50.00% dd [kernel.kallsyms] [k] copy_page
50.00% 50.00% dd libc-2.28.so [.] _dl_addr
# Samples: 48 of event 'tlb-miss'
# Event count (approx.): 48
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. ..................................
#
20.83% 20.83% dd [kernel.kallsyms] [k] perf_iterate_ctx.constprop.135
12.50% 12.50% dd [kernel.kallsyms] [k] __arch_clear_user
10.42% 10.42% dd [kernel.kallsyms] [k] clear_page
4.17% 4.17% dd [kernel.kallsyms] [k] copy_page
4.17% 4.17% dd [kernel.kallsyms] [k] filemap_map_pages
2.08% 2.08% dd [kernel.kallsyms] [k] __alloc_fd
2.08% 2.08% dd [kernel.kallsyms] [k] __mod_memcg_state.part.70
2.08% 2.08% dd [kernel.kallsyms] [k] __queue_work
2.08% 2.08% dd [kernel.kallsyms] [k] __rcu_read_unlock
2.08% 2.08% dd [kernel.kallsyms] [k] d_path
2.08% 2.08% dd [kernel.kallsyms] [k] destroy_inode
2.08% 2.08% dd [kernel.kallsyms] [k] do_dentry_open
[...]
# Samples: 9K of event 'tlb-access'
# Event count (approx.): 9573
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. ......................................
#
25.79% 25.79% dd [kernel.kallsyms] [k] __arch_clear_user
11.22% 11.22% dd [kernel.kallsyms] [k] ret_to_user
8.56% 8.56% dd [kernel.kallsyms] [k] fsnotify
4.06% 4.06% dd [kernel.kallsyms] [k] new_sync_read
3.67% 3.67% dd [kernel.kallsyms] [k] el0_svc_common.constprop.2
3.04% 3.04% dd [kernel.kallsyms] [k] __fsnotify_parent
2.90% 2.90% dd [kernel.kallsyms] [k] vfs_write
2.82% 2.82% dd [kernel.kallsyms] [k] vfs_read
2.52% 2.52% dd libc-2.28.so [.] write
2.26% 2.26% dd [kernel.kallsyms] [k] security_file_permission
2.08% 2.08% dd [kernel.kallsyms] [k] ksys_write
1.96% 1.96% dd [kernel.kallsyms] [k] rw_verify_area
1.95% 1.95% dd [kernel.kallsyms] [k] read_iter_zero
[...]
# Samples: 9 of event 'branch-miss'
# Event count (approx.): 9
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. .........................
#
22.22% 22.22% dd libc-2.28.so [.] _dl_addr
11.11% 11.11% dd [kernel.kallsyms] [k] __arch_clear_user
11.11% 11.11% dd [kernel.kallsyms] [k] __arch_copy_from_user
11.11% 11.11% dd [kernel.kallsyms] [k] __dentry_kill
11.11% 11.11% dd [kernel.kallsyms] [k] __efistub_memcpy
11.11% 11.11% dd ld-2.28.so [.] 0x0000000000012b7c
11.11% 11.11% dd libc-2.28.so [.] 0x000000000002a980
11.11% 11.11% dd libc-2.28.so [.] 0x0000000000083340
# Samples: 29 of event 'remote-access'
# Event count (approx.): 29
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................. ...........................
#
41.38% 41.38% dd [kernel.kallsyms] [k] filemap_map_pages
10.34% 10.34% dd [kernel.kallsyms] [k] unlock_page_memcg
10.34% 10.34% dd [kernel.kallsyms] [k] unmap_page_range
6.90% 6.90% dd [kernel.kallsyms] [k] release_pages
3.45% 3.45% dd [kernel.kallsyms] [k] PageHuge
3.45% 3.45% dd [kernel.kallsyms] [k] __queue_work
3.45% 3.45% dd [kernel.kallsyms] [k] page_add_file_rmap
3.45% 3.45% dd [kernel.kallsyms] [k] page_counter_try_charge
3.45% 3.45% dd [kernel.kallsyms] [k] page_remove_rmap
3.45% 3.45% dd [kernel.kallsyms] [k] xas_start
3.45% 3.45% dd ld-2.28.so [.] 0x0000000000002a1c
3.45% 3.45% dd ld-2.28.so [.] 0x0000000000008b5c
3.45% 3.45% dd ld-2.28.so [.] 0x00000000000093cc
Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200530122442.490-4-leo.yan@linaro.org
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|