summaryrefslogtreecommitdiffstats
path: root/arch/powerpc/perf (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'powerpc-5.14-1' of ↵Linus Torvalds2021-07-024-43/+139
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - A big series refactoring parts of our KVM code, and converting some to C. - Support for ARCH_HAS_SET_MEMORY, and ARCH_HAS_STRICT_MODULE_RWX on some CPUs. - Support for the Microwatt soft-core. - Optimisations to our interrupt return path on 64-bit. - Support for userspace access to the NX GZIP accelerator on PowerVM on Power10. - Enable KUAP and KUEP by default on 32-bit Book3S CPUs. - Other smaller features, fixes & cleanups. Thanks to: Andy Shevchenko, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev, Baokun Li, Benjamin Herrenschmidt, Bharata B Rao, Christophe Leroy, Daniel Axtens, Daniel Henrique Barboza, Finn Thain, Geoff Levand, Haren Myneni, Jason Wang, Jiapeng Chong, Joel Stanley, Jordan Niethe, Kajol Jain, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nick Desaulniers, Paul Mackerras, Russell Currey, Sathvika Vasireddy, Shaokun Zhang, Stephen Rothwell, Sudeep Holla, Suraj Jitindar Singh, Tom Rix, Vaibhav Jain, YueHaibing, Zhang Jianhua, and Zhen Lei. * tag 'powerpc-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (218 commits) powerpc: Only build restart_table.c for 64s powerpc/64s: move ret_from_fork etc above __end_soft_masked powerpc/64s/interrupt: clean up interrupt return labels powerpc/64/interrupt: add missing kprobe annotations on interrupt exit symbols powerpc/64: enable MSR[EE] in irq replay pt_regs powerpc/64s/interrupt: preserve regs->softe for NMI interrupts powerpc/64s: add a table of implicit soft-masked addresses powerpc/64e: remove implicit soft-masking and interrupt exit restart logic powerpc/64e: fix CONFIG_RELOCATABLE build warnings powerpc/64s: fix hash page fault interrupt handler powerpc/4xx: Fix setup_kuep() on SMP powerpc/32s: Fix setup_{kuap/kuep}() on SMP powerpc/interrupt: Use names in check_return_regs_valid() powerpc/interrupt: Also use exit_must_hard_disable() on PPC32 powerpc/sysfs: Replace sizeof(arr)/sizeof(arr[0]) with ARRAY_SIZE powerpc/ptrace: Refactor regs_set_return_{msr/ip} powerpc/ptrace: Move set_return_regs_changed() before regs_set_return_{msr/ip} powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi() powerpc/pseries/vas: Include irqdomain.h powerpc: mark local variables around longjmp as volatile ...
| * powerpc/pmu: Make the generic compat PMU use the architected eventsPaul Mackerras2021-06-251-36/+134
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This changes generic-compat-pmu.c so that it only uses architected events defined in Power ISA v3.0B, rather than event encodings which, while common to all the IBM Power Systems implementations, are nevertheless implementation-specific rather than architected. The intention is that any CPU implementation designed to conform to Power ISA v3.0B or later can use generic-compat-pmu.c. In addition to the existing events for cycles and instructions, this adds several other architected events, including alternative encodings for some events. In order to make it possible to measure cycles and instructions at the same time as each other, we set the CC5-6RUN bit in MMCR0, which makes PMC5 and PMC6 count instructions and cycles regardless of the run bit, so their events are now PM_CYC and PM_INST_CMPL rather than PM_RUN_CYC and PM_RUN_INST_CMPL (the latter are still available via other event codes). Note that POWER9 has an erratum where one architected event (PM_FLOP_CMPL, floating-point operations completed, code 0x100f4) does not work correctly. Given that there is a specific PMU driver for P9 which will be used in preference to generic-compat-pmu.c, that is not a real problem. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/YJD7L9yeoxvxqeYi@thinks.paulus.ozlabs.org
| * powerpc: make stack walking KASAN-safeDaniel Axtens2021-06-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make our stack-walking code KASAN-safe by using __no_sanitize_address. Generic code, arm64, s390 and x86 all make accesses unchecked for similar sorts of reasons: when unwinding a stack, we might touch memory that KASAN has marked as being out-of-bounds. In ppc64 KASAN development, I hit this sometimes when checking for an exception frame - because we're checking an arbitrary offset into the stack frame. See commit 20955746320e ("s390/kasan: avoid false positives during stack unwind"), commit bcaf669b4bdb ("arm64: disable kasan when accessing frame->fp in unwind_frame"), commit 91e08ab0c851 ("x86/dumpstack: Prevent KASAN false positive warnings") and commit 6e22c8366416 ("tracing, kasan: Silence Kasan warning in check_stack of stack_tracer"). Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210614120907.1952321-1-dja@axtens.net
| * powerpc: Don't use 'struct ppc_inst' to reference instruction locationChristophe Leroy2021-06-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'struct ppc_inst' is an internal representation of an instruction, but in-memory instructions are and will remain a table of 'u32' forever. Replace all 'struct ppc_inst *' used for locating an instruction in memory by 'u32 *'. This removes a lot of undue casts to 'struct ppc_inst *'. It also helps locating ab-use of 'struct ppc_inst' dereference. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> [mpe: Fix ppc_inst_next(), use u32 instead of unsigned int] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/7062722b087228e42cbd896e39bfdf526d6a340a.1621516826.git.christophe.leroy@csgroup.eu
| * powerpc/perf: Simplify MakefileChristophe Leroy2021-06-151-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | arch/powerpc/Kbuild decend into arch/powerpc/perf/ only when CONFIG_PERF_EVENTS is selected, so there is not need to take CONFIG_PERF_EVENTS into account in arch/powerpc/perf/Makefile. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Michal Suchánek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d37f61afca55b5b33787b643890e061ae1c18f5f.1620396045.git.christophe.leroy@csgroup.eu
* | powerpc/perf: Fix crash in perf_instruction_pointer() when ppmu is not setAthira Rajeev2021-06-181-1/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On systems without any specific PMU driver support registered, running perf record causes Oops. The relevant portion from call trace: BUG: Kernel NULL pointer dereference on read at 0x00000040 Faulting instruction address: 0xc0021f0c Oops: Kernel access of bad area, sig: 11 [#1] BE PAGE_SIZE=4K PREEMPT CMPCPRO SAF3000 DIE NOTIFICATION CPU: 0 PID: 442 Comm: null_syscall Not tainted 5.13.0-rc6-s3k-dev-01645-g7649ee3d2957 #5164 NIP: c0021f0c LR: c00e8ad8 CTR: c00d8a5c NIP perf_instruction_pointer+0x10/0x60 LR perf_prepare_sample+0x344/0x674 Call Trace: perf_prepare_sample+0x7c/0x674 (unreliable) perf_event_output_forward+0x3c/0x94 __perf_event_overflow+0x74/0x14c perf_swevent_hrtimer+0xf8/0x170 __hrtimer_run_queues.constprop.0+0x160/0x318 hrtimer_interrupt+0x148/0x3b0 timer_interrupt+0xc4/0x22c Decrementer_virt+0xb8/0xbc During perf record session, perf_instruction_pointer() is called to capture the sample IP. This function in core-book3s accesses ppmu->flags. If a platform specific PMU driver is not registered, ppmu is set to NULL and accessing its members results in a crash. Fix this crash by checking if ppmu is set. Fixes: 2ca13a4cc56c ("powerpc/perf: Use regs->nip when SIAR is zero") Cc: stable@vger.kernel.org # v5.11+ Reported-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1623952506-1431-1-git-send-email-atrajeev@linux.vnet.ibm.com
* powerpc/perf: Fix the threshold event selection for memory events in power10Athira Rajeev2021-04-221-2/+2
| | | | | | | | | | | | | | | Memory events (mem-loads and mem-stores) currently use the threshold event selection as issue to finish. Power10 supports issue to complete as part of thresholding which is more appropriate for mem-loads and mem-stores. Hence fix the event code for memory events to use issue to complete. Fixes: a64e697cef23 ("powerpc/perf: power10 Performance Monitoring support") Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1614840015-1535-1-git-send-email-atrajeev@linux.vnet.ibm.com
* powerpc/perf: Fix sampled instruction type for larx/stcxAthira Rajeev2021-04-222-5/+34
| | | | | | | | | | | | | | | | | Sampled Instruction Event Register (SIER) field [46:48] identifies the sampled instruction type. ISA v3.1 says value of 0b111 for this field as reserved, but in POWER10 it denotes LARX/STCX type which will hopefully be fixed in ISA v3.1 update. Patch fixes the functions to handle type value 7 for CPU_FTR_ARCH_31. Fixes: a64e697cef23 ("powerpc/perf: power10 Performance Monitoring support") Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com> [mpe: Avoid reading mmcra until necessary, use early return to deindent if block] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1614858937-1485-1-git-send-email-atrajeev@linux.vnet.ibm.com
* powerpc/perf: Expose processor pipeline stage cycles using ↵Athira Rajeev2021-04-203-6/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PERF_SAMPLE_WEIGHT_STRUCT Performance Monitoring Unit (PMU) registers in powerpc provides information on cycles elapsed between different stages in the pipeline. This can be used for application tuning. On ISA v3.1 platform, this information is exposed by sampling registers. Patch adds kernel support to capture two of the cycle counters as part of perf sample using the sample type: PERF_SAMPLE_WEIGHT_STRUCT. The power PMU function 'get_mem_weight' currently uses 64 bit weight field of perf_sample_data to capture memory latency. But following the introduction of PERF_SAMPLE_WEIGHT_TYPE, weight field could contain 64-bit or 32-bit value depending on the architexture support for PERF_SAMPLE_WEIGHT_STRUCT. Patches uses WEIGHT_STRUCT to expose the pipeline stage cycles info. Hence update the ppmu functions to work for 64-bit and 32-bit weight values. If the sample type is PERF_SAMPLE_WEIGHT, use the 64-bit weight field. if the sample type is PERF_SAMPLE_WEIGHT_STRUCT, memory subsystem latency is stored in the low 32bits of perf_sample_weight structure. Also for CPU_FTR_ARCH_31, capture the two cycle counter information in two 16 bit fields of perf_sample_weight structure. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1616425047-1666-2-git-send-email-atrajeev@linux.vnet.ibm.com
* powerpc/perf: Add platform specific check_attr_configMadhavan Srinivasan2021-04-194-0/+70
| | | | | | | | | Add platform specific attr.config value checks. Patch includes checks for both power9 and power10. Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210408074504.248211-2-maddy@linux.ibm.com
* powerpc/traps: Enhance readability for trap typesXiongwei Song2021-04-171-2/+3
| | | | | | | | | | | | | | Define macros to list ppc interrupt types in interttupt.h, replace the reference of the trap hex values with these macros. Referred the hex numbers in arch/powerpc/kernel/exceptions-64e.S, arch/powerpc/kernel/exceptions-64s.S, arch/powerpc/kernel/head_*.S, arch/powerpc/kernel/head_booke.h and arch/powerpc/include/asm/kvm_asm.h. Signed-off-by: Xiongwei Song <sxwjean@gmail.com> [mpe: Resolve conflicts in nmi_disables_ftrace(), fix 40x build] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1618398033-13025-1-git-send-email-sxwjean@me.com
* powerpc/perf: Infrastructure to support checking of attr.config*Madhavan Srinivasan2021-04-141-0/+11
| | | | | | | | | | | | | | | | | | | | | Introduce code to support the checking of attr.config* for values which are reserved for a given platform. Performance Monitoring Unit (PMU) configuration registers have fields that are reserved and some specific values for bit fields are reserved. For ex., MMCRA[61:62] is Random Sampling Mode (SM) and value of 0b11 for this field is reserved. Writing non-zero or invalid values in these fields will have unknown behaviours. Patch adds a generic call-back function "check_attr_config" in "struct power_pmu", to be called in event_init to check for attr.config* values for a given platform. Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210408074504.248211-1-maddy@linux.ibm.com
* powerpc/perf/hv-24x7: Make some symbols staticBixuan Cui2021-04-141-5/+5
| | | | | | | | | | | | | | | | | | | | | | | The sparse tool complains as follows: arch/powerpc/perf/hv-24x7.c:229:1: warning: symbol '__pcpu_scope_hv_24x7_txn_flags' was not declared. Should it be static? arch/powerpc/perf/hv-24x7.c:230:1: warning: symbol '__pcpu_scope_hv_24x7_txn_err' was not declared. Should it be static? arch/powerpc/perf/hv-24x7.c:236:1: warning: symbol '__pcpu_scope_hv_24x7_hw' was not declared. Should it be static? arch/powerpc/perf/hv-24x7.c:244:1: warning: symbol '__pcpu_scope_hv_24x7_reqb' was not declared. Should it be static? arch/powerpc/perf/hv-24x7.c:245:1: warning: symbol '__pcpu_scope_hv_24x7_resb' was not declared. Should it be static? This symbol is not used outside of hv-24x7.c, so this commit marks it static. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Bixuan Cui <cuibixuan@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210409090124.59492-1-cuibixuan@huawei.com
* powerpc/perf: Make symbol 'isa207_pmu_format_attr' staticBixuan Cui2021-04-141-1/+1
| | | | | | | | | | | | | | | The sparse tool complains as follows: arch/powerpc/perf/isa207-common.c:24:18: warning: symbol 'isa207_pmu_format_attr' was not declared. Should it be static? This symbol is not used outside of isa207-common.c, so this commit marks it static. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Bixuan Cui <cuibixuan@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210409090119.59444-1-cuibixuan@huawei.com
* powerpc/perf: Fix PMU constraint check for EBB eventsAthira Rajeev2021-04-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The power PMU group constraints includes check for EBB events to make sure all events in a group must agree on EBB. This will prevent scheduling EBB and non-EBB events together. But in the existing check, settings for constraint mask and value is interchanged. Patch fixes the same. Before the patch, PMU selftest "cpu_event_pinned_vs_ebb_test" fails with below in dmesg logs. This happens because EBB event gets enabled along with a non-EBB cpu event. [35600.453346] cpu_event_pinne[41326]: illegal instruction (4) at 10004a18 nip 10004a18 lr 100049f8 code 1 in cpu_event_pinned_vs_ebb_test[10000000+10000] Test results after the patch: $ ./pmu/ebb/cpu_event_pinned_vs_ebb_test test: cpu_event_pinned_vs_ebb tags: git_version:v5.12-rc5-93-gf28c3125acd3-dirty Binding to cpu 8 EBB Handler is at 0x100050c8 read error on event 0x7fffe6bd4040! PM_RUN_INST_CMPL: result 9872 running/enabled 37930432 success: cpu_event_pinned_vs_ebb This bug was hidden by other logic until commit 1908dc911792 (perf: Tweak perf_event_attr::exclusive semantics). Fixes: 4df489991182 ("powerpc/perf: Add power8 EBB support") Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> [mpe: Mention commit 1908dc911792] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1617725761-1464-1-git-send-email-atrajeev@linux.vnet.ibm.com
* powerpc/perf: Fix handling of privilege level checks in perf interrupt contextAthira Rajeev2021-03-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Running "perf mem record" in powerpc platforms with selinux enabled resulted in soft lockup's. Below call-trace was seen in the logs: CPU: 58 PID: 3751 Comm: sssd_nss Not tainted 5.11.0-rc7+ #2 NIP: c000000000dff3d4 LR: c000000000dff3d0 CTR: 0000000000000000 REGS: c000007fffab7d60 TRAP: 0100 Not tainted (5.11.0-rc7+) ... NIP _raw_spin_lock_irqsave+0x94/0x120 LR _raw_spin_lock_irqsave+0x90/0x120 Call Trace: 0xc00000000fd47260 (unreliable) skb_queue_tail+0x3c/0x90 audit_log_end+0x6c/0x180 common_lsm_audit+0xb0/0xe0 slow_avc_audit+0xa4/0x110 avc_has_perm+0x1c4/0x260 selinux_perf_event_open+0x74/0xd0 security_perf_event_open+0x68/0xc0 record_and_restart+0x6e8/0x7f0 perf_event_interrupt+0x22c/0x560 performance_monitor_exception0x4c/0x60 performance_monitor_common_virt+0x1c8/0x1d0 interrupt: f00 at _raw_spin_lock_irqsave+0x38/0x120 NIP: c000000000dff378 LR: c000000000b5fbbc CTR: c0000000007d47f0 REGS: c00000000fd47860 TRAP: 0f00 Not tainted (5.11.0-rc7+) ... NIP _raw_spin_lock_irqsave+0x38/0x120 LR skb_queue_tail+0x3c/0x90 interrupt: f00 0x38 (unreliable) 0xc00000000aae6200 audit_log_end+0x6c/0x180 audit_log_exit+0x344/0xf80 __audit_syscall_exit+0x2c0/0x320 do_syscall_trace_leave+0x148/0x200 syscall_exit_prepare+0x324/0x390 system_call_common+0xfc/0x27c The above trace shows that while the CPU was handling a performance monitor exception, there was a call to security_perf_event_open() function. In powerpc core-book3s, this function is called from perf_allow_kernel() check during recording of data address in the sample via perf_get_data_addr(). Commit da97e18458fb ("perf_event: Add support for LSM and SELinux checks") introduced security enhancements to perf. As part of this commit, the new security hook for perf_event_open() was added in all places where perf paranoid check was previously used. In powerpc core-book3s code, originally had paranoid checks in perf_get_data_addr() and power_pmu_bhrb_read(). So perf_paranoid_kernel() checks were replaced with perf_allow_kernel() in these PMU helper functions as well. The intention of paranoid checks in core-book3s was to verify privilege access before capturing some of the sample data. Along with paranoid checks, perf_allow_kernel() also does a security_perf_event_open(). Since these functions are accessed while recording a sample, we end up calling selinux_perf_event_open() in PMI context. Some of the security functions use spinlock like sidtab_sid2str_put(). If a perf interrupt hits under a spin lock and if we end up in calling selinux hook functions in PMI handler, this could cause a dead lock. Since the purpose of this security hook is to control access to perf_event_open(), it is not right to call this in interrupt context. The paranoid checks in powerpc core-book3s were done at interrupt time which is also not correct. Reference commits: Commit cd1231d7035f ("powerpc/perf: Prevent kernel address leak via perf_get_data_addr()") Commit bb19af816025 ("powerpc/perf: Prevent kernel address leak to userspace via BHRB buffer") We only allow creation of events that have already passed the privilege checks in perf_event_open(). So these paranoid checks are not needed at event time. As a fix, patch uses 'event->attr.exclude_kernel' check to prevent exposing kernel address for userspace only sampling. Fixes: cd1231d7035f ("powerpc/perf: Prevent kernel address leak via perf_get_data_addr()") Cc: stable@vger.kernel.org # v4.17+ Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1614247839-1428-1-git-send-email-atrajeev@linux.vnet.ibm.com
* Merge tag 'powerpc-5.12-1' of ↵Linus Torvalds2021-02-2213-108/+157
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - A large series adding wrappers for our interrupt handlers, so that irq/nmi/user tracking can be isolated in the wrappers rather than spread in each handler. - Conversion of the 32-bit syscall handling into C. - A series from Nick to streamline our TLB flushing when using the Radix MMU. - Switch to using queued spinlocks by default for 64-bit server CPUs. - A rework of our PCI probing so that it happens later in boot, when more generic infrastructure is available. - Two small fixes to allow 32-bit little-endian processes to run on 64-bit kernels. - Other smaller features, fixes & cleanups. Thanks to: Alexey Kardashevskiy, Ananth N Mavinakayanahalli, Aneesh Kumar K.V, Athira Rajeev, Bhaskar Chowdhury, Cédric Le Goater, Chengyang Fan, Christophe Leroy, Christopher M. Riedl, Fabiano Rosas, Florian Fainelli, Frederic Barrat, Ganesh Goudar, Hari Bathini, Jiapeng Chong, Joseph J Allen, Kajol Jain, Markus Elfring, Michal Suchanek, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Pingfan Liu, Po-Hsu Lin, Qian Cai, Ram Pai, Randy Dunlap, Sandipan Das, Stephen Rothwell, Tyrel Datwyler, Will Springer, Yury Norov, and Zheng Yongjun. * tag 'powerpc-5.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (188 commits) powerpc/perf: Adds support for programming of Thresholding in P10 powerpc/pci: Remove unimplemented prototypes powerpc/uaccess: Merge raw_copy_to_user_allowed() into raw_copy_to_user() powerpc/uaccess: Merge __put_user_size_allowed() into __put_user_size() powerpc/uaccess: get rid of small constant size cases in raw_copy_{to,from}_user() powerpc/64: Fix stack trace not displaying final frame powerpc/time: Remove get_tbl() powerpc/time: Avoid using get_tbl() spi: mpc52xx: Avoid using get_tbl() powerpc/syscall: Avoid storing 'current' in another pointer powerpc/32: Handle bookE debugging in C in syscall entry/exit powerpc/syscall: Do not check unsupported scv vector on PPC32 powerpc/32: Remove the counter in global_dbcr0 powerpc/32: Remove verification of MSR_PR on syscall in the ASM entry powerpc/syscall: implement system call entry/exit logic in C for PPC32 powerpc/32: Always save non volatile GPRs at syscall entry powerpc/syscall: Change condition to check MSR_RI powerpc/syscall: Save r3 in regs->orig_r3 powerpc/syscall: Use is_compat_task() powerpc/syscall: Make interrupt.c buildable on PPC32 ...
| * powerpc/perf: Adds support for programming of Thresholding in P10Kajol Jain2021-02-1110-32/+99
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Thresholding, a performance monitoring unit feature, can be used to identify marked instructions which take more than expected cycles between start event and end event. Threshold compare (thresh_cmp) bits are programmed in MMCRA register. In Power9, thresh_cmp bits were part of the event code. But in case of P10, thresh_cmp are not part of event code due to inclusion of MMCR3 bits. Patch here adds an option to use attr.config1 variable to be used to pass thresh_cmp value to be programmed in MMCRA register. A new ppmu flag called PPMU_HAS_ATTR_CONFIG1 has been added and this flag is used to notify the use of attr.config1 variable. Patch has extended the parameter list of 'compute_mmcr', to include power_pmu's 'flags' element and parameter list of get_constraint to include attr.config1 value. It also extend parameter list of power_check_constraints inorder to pass perf_event list. As stated by commit ef0e3b650f8d ("powerpc/perf: Fix Threshold Event Counter Multiplier width for P10"), constraint bits for thresh_cmp is also needed to be increased to 11 bits, which is handled as part of this patch. We added bit number 53 as part of constraint bits of thresh_cmp for power10 to make it an 11 bit field. Updated layout for p10: /* * Layout of constraint bits: * * 60 56 52 48 44 40 36 32 * | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | * [ fab_match ] [ thresh_cmp ] [ thresh_ctl ] [ ] * | | * [ thresh_cmp bits for p10] thresh_sel -* * * 28 24 20 16 12 8 4 0 * | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | * [ ] | [ ] | [ sample ] [ ] [6] [5] [4] [3] [2] [1] * | | | | | * BHRB IFM -* | | |*radix_scope | Count of events for each PMC. * EBB -* | | p1, p2, p3, p4, p5, p6. * L1 I/D qualifier -* | * nc - number of counters -* * * The PMC fields P1..P6, and NC, are adder fields. As we accumulate constraints * we want the low bit of each field to be added to any existing value. * * Everything else is a value field. */ Result: command#: cat /sys/devices/cpu/format/thresh_cmp config1:0-17 ex. usage: command#: perf record -I --weight -d -e cpu/event=0x67340101EC,thresh_cmp=500/ ./ebizzy -S 2 -t 1 -s 4096 1826636 records/s real 2.00 s user 2.00 s sys 0.00 s [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.038 MB perf.data (61 samples) ] Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210209095234.837356-1-kjain@linux.ibm.com
| * powerpc/perf: Record counter overflow always if SAMPLE_IP is unsetAthira Rajeev2021-02-081-4/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While sampling for marked events, currently we record the sample only if the SIAR valid bit of Sampled Instruction Event Register (SIER) is set. SIAR_VALID bit is used for fetching the instruction address from Sampled Instruction Address Register(SIAR). But there are some usecases, where the user is interested only in the PMU stats at each counter overflow and the exact IP of the overflow event is not required. Dropping SIAR invalid samples will fail to record some of the counter overflows in such cases. Example of such usecase is dumping the PMU stats (event counts) after some regular amount of instructions/events from the userspace (ex: via ptrace). Here counter overflow is indicated to userspace via signal handler, and captured by monitoring and enabling I/O signaling on the event file descriptor. In these cases, we expect to get sample/overflow indication after each specified sample_period. Perf event attribute will not have PERF_SAMPLE_IP set in the sample_type if exact IP of the overflow event is not requested. So while profiling if SAMPLE_IP is not set, just record the counter overflow irrespective of SIAR_VALID check. Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> [mpe: Reflow comment and if formatting] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1612516492-1428-1-git-send-email-atrajeev@linux.vnet.ibm.com
| * powerpc/perf: Expose Performance Monitor Counter SPR's as part of extended regsAthira Rajeev2021-02-082-9/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently Monitor Mode Control Registers and Sampling registers are part of extended regs. Patch adds support to include Performance Monitor Counter Registers (PMC1 to PMC6 ) as part of extended registers. PMCs are saved in the perf interrupt handler as part of per-cpu array 'pmcs' in struct cpu_hw_events. While capturing the register values for extended regs, fetch these saved PMC values. Simplified the PERF_REG_PMU_MASK_300/31 definition to include PMU SPRs MMCR0 to PMC6. Exclude the unsupported SPRs (MMCR3, SIER2, SIER3) from extended mask value for CPU_FTR_ARCH_300 in the new definition. PERF_REG_EXTENDED_MAX is used to check if any index beyond the extended registers is requested in the sample. Have one PERF_REG_EXTENDED_MAX for CPU_FTR_ARCH_300/CPU_FTR_ARCH_31 since perf_reg_validate function already checks the extended mask for the presence of any unsupported register. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1612335337-1888-3-git-send-email-atrajeev@linux.vnet.ibm.com
| * powerpc/perf: Include PMCs as part of per-cpu cpuhw_events structAthira Rajeev2021-02-081-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To support capturing of PMC's as part of extended registers, the value of SPR's PMC1 to PMC6 has to be saved in the starting of PMI interrupt handler. This is needed since we are resetting the overflown PMC before creating sample and hence directly reading SPRN_PMCx in 'perf_reg_value' will be capturing the modified value. To solve this, add a per-cpu array as part of structure cpu_hw_events and use this array to capture PMC values in the perf interrupt handler. Patch also re-factor's the interrupt handler code to use this per-cpu array instead of current local array. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1612335337-1888-2-git-send-email-atrajeev@linux.vnet.ibm.com
| * powerpc/perf: move perf irq/nmi handling details into traps.cNicholas Piggin2021-02-082-58/+2
| | | | | | | | | | | | | | | | | | This is required in order to allow more significant differences between NMI type interrupt handlers and regular asynchronous handlers. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-20-npiggin@gmail.com
| * powerpc/perf/hv-24x7: Dont create sysfs event files for dummy eventsKajol Jain2021-01-301-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The hv_24x7 performance monitoring unit creates a list of supported events from the event catalog obtained via HCALL. The hv_24x7 catalog could also contain invalid or dummy events with names like RESERVED*. These events do not have any hardware counters backing them. Add a check to string compare the event names to filter such events out. Result on power9 machine: Before this patch: ..... hv_24x7/PM_XLINK2_OUT_ODD_CYC,chip=?/ [Kernel PMU event] hv_24x7/PM_XLINK2_OUT_ODD_DATA_COUNT,chip=?/ [Kernel PMU event] hv_24x7/PM_XLINK2_OUT_ODD_TOTAL_UTIL,chip=?/ [Kernel PMU event] hv_24x7/PM_XTS_ATR_DEMAND_CHECKOUT,chip=?/ [Kernel PMU event] hv_24x7/PM_XTS_ATR_DEMAND_CHECKOUT_MISS,chip=?/ [Kernel PMU event] hv_24x7/PM_XTS_ATSD_SENT,chip=?/ [Kernel PMU event] hv_24x7/PM_XTS_ATSD_TLBI_RCV,chip=?/ [Kernel PMU event] hv_24x7/RESERVED_NEST1,chip=?/ [Kernel PMU event] hv_24x7/RESERVED_NEST10,chip=?/ [Kernel PMU event] hv_24x7/RESERVED_NEST11,chip=?/ [Kernel PMU event] hv_24x7/RESERVED_NEST12,chip=?/ [Kernel PMU event] hv_24x7/RESERVED_NEST13,chip=?/ [Kernel PMU event] ...... dmesg: [ 0.000362] printk: console [hvc0] enabled [ 0.815452] hv-24x7: read 1530 catalog entries, created 537 event attrs (0 failures), 275 descs After this patch: ...... hv_24x7/PM_XLINK2_OUT_ODD_AVLBL_CYC,chip=?/ [Kernel PMU event] hv_24x7/PM_XLINK2_OUT_ODD_CYC,chip=?/ [Kernel PMU event] hv_24x7/PM_XLINK2_OUT_ODD_DATA_COUNT,chip=?/ [Kernel PMU event] hv_24x7/PM_XLINK2_OUT_ODD_TOTAL_UTIL,chip=?/ [Kernel PMU event] hv_24x7/PM_XTS_ATR_DEMAND_CHECKOUT,chip=?/ [Kernel PMU event] hv_24x7/PM_XTS_ATR_DEMAND_CHECKOUT_MISS,chip=?/ [Kernel PMU event] hv_24x7/PM_XTS_ATSD_SENT,chip=?/ [Kernel PMU event] hv_24x7/PM_XTS_ATSD_TLBI_RCV,chip=?/ [Kernel PMU event] hv_24x7/TOD,chip=?/ [Kernel PMU event] ...... dmesg: [ 0.000357] printk: console [hvc0] enabled [ 0.808592] hv-24x7: read 1530 catalog entries, created 509 event attrs (0 failures), 275 descs Signed-off-by: Kajol Jain <kjain@linux.ibm.com> [mpe: Simplify ignore_event(), minor change log formatting] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201228085204.18026-1-kjain@linux.ibm.com
* | perf/core: Add PERF_SAMPLE_WEIGHT_STRUCTKan Liang2021-02-011-1/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current PERF_SAMPLE_WEIGHT sample type is very useful to expresses the cost of an action represented by the sample. This allows the profiler to scale the samples to be more informative to the programmer. It could also help to locate a hotspot, e.g., when profiling by memory latencies, the expensive load appear higher up in the histograms. But current PERF_SAMPLE_WEIGHT sample type is solely determined by one factor. This could be a problem, if users want two or more factors to contribute to the weight. For example, Golden Cove core PMU can provide both the instruction latency and the cache Latency information as factors for the memory profiling. For current X86 platforms, although meminfo::latency is defined as a u64, only the lower 32 bits include the valid data in practice (No memory access could last than 4G cycles). The higher 32 bits can be used to store new factors. Add a new sample type, PERF_SAMPLE_WEIGHT_STRUCT, to indicate the new sample weight structure. It shares the same space as the PERF_SAMPLE_WEIGHT sample type. Users can apply either the PERF_SAMPLE_WEIGHT sample type or the PERF_SAMPLE_WEIGHT_STRUCT sample type to retrieve the sample weight, but they cannot apply both sample types simultaneously. Currently, only X86 and PowerPC use the PERF_SAMPLE_WEIGHT sample type. - For PowerPC, there is nothing changed for the PERF_SAMPLE_WEIGHT sample type. There is no effect for the new PERF_SAMPLE_WEIGHT_STRUCT sample type. PowerPC can re-struct the weight field similarly later. - For X86, the same value will be dumped for the PERF_SAMPLE_WEIGHT sample type or the PERF_SAMPLE_WEIGHT_STRUCT sample type for now. The following patches will apply the new factors for the PERF_SAMPLE_WEIGHT_STRUCT sample type. The field in the union perf_sample_weight should be shared among different architectures. A generic name is required, but it's hard to abstract a name that applies to all architectures. For example, on X86, the fields are to store all kinds of latency. While on PowerPC, it stores MMCRA[TECX/TECM], which should not be latency. So a general name prefix 'var$NUM' is used here. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/1611873611-156687-2-git-send-email-kan.liang@linux.intel.com
* Merge tag 'powerpc-5.11-1' of ↵Linus Torvalds2020-12-179-40/+309
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Switch to the generic C VDSO, as well as some cleanups of our VDSO setup/handling code. - Support for KUAP (Kernel User Access Prevention) on systems using the hashed page table MMU, using memory protection keys. - Better handling of PowerVM SMT8 systems where all threads of a core do not share an L2, allowing the scheduler to make better scheduling decisions. - Further improvements to our machine check handling. - Show registers when unwinding interrupt frames during stack traces. - Improvements to our pseries (PowerVM) partition migration code. - Several series from Christophe refactoring and cleaning up various parts of the 32-bit code. - Other smaller features, fixes & cleanups. Thanks to: Alan Modra, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Ard Biesheuvel, Athira Rajeev, Balamuruhan S, Bill Wendling, Cédric Le Goater, Christophe Leroy, Christophe Lombard, Colin Ian King, Daniel Axtens, David Hildenbrand, Frederic Barrat, Ganesh Goudar, Gautham R. Shenoy, Geert Uytterhoeven, Giuseppe Sacco, Greg Kurz, Harish, Jan Kratochvil, Jordan Niethe, Kaixu Xia, Laurent Dufour, Leonardo Bras, Madhavan Srinivasan, Mahesh Salgaonkar, Mathieu Desnoyers, Nathan Lynch, Nicholas Piggin, Oleg Nesterov, Oliver O'Halloran, Oscar Salvador, Po-Hsu Lin, Qian Cai, Qinglang Miao, Randy Dunlap, Ravi Bangoria, Sachin Sant, Sandipan Das, Sebastian Andrzej Siewior , Segher Boessenkool, Srikar Dronamraju, Tyrel Datwyler, Uwe Kleine-König, Vincent Stehlé, Youling Tang, and Zhang Xiaoxu. * tag 'powerpc-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (304 commits) powerpc/32s: Fix cleanup_cpu_mmu_context() compile bug powerpc: Add config fragment for disabling -Werror powerpc/configs: Add ppc64le_allnoconfig target powerpc/powernv: Rate limit opal-elog read failure message powerpc/pseries/memhotplug: Quieten some DLPAR operations powerpc/ps3: use dma_mapping_error() powerpc: force inlining of csum_partial() to avoid multiple csum_partial() with GCC10 powerpc/perf: Fix Threshold Event Counter Multiplier width for P10 powerpc/mm: Fix hugetlb_free_pmd_range() and hugetlb_free_pud_range() KVM: PPC: Book3S HV: Fix mask size for emulated msgsndp KVM: PPC: fix comparison to bool warning KVM: PPC: Book3S: Assign boolean values to a bool variable powerpc: Inline setup_kup() powerpc/64s: Mark the kuap/kuep functions non __init KVM: PPC: Book3S HV: XIVE: Add a comment regarding VP numbering powerpc/xive: Improve error reporting of OPAL calls powerpc/xive: Simplify xive_do_source_eoi() powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_EOI_FW powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_MASK_FW powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_SHIFT_BUG ...
| * powerpc/perf: Fix Threshold Event Counter Multiplier width for P10Madhavan Srinivasan2020-12-152-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Threshold Event Counter Multiplier (TECM) is part of Monitor Mode Control Register A (MMCRA). This field along with Threshold Event Counter Exponent (TECE) is used to get threshould counter value. In Power10, this is a 8bit field, so patch fixes the current code to modify the MMCRA[TECM] extraction macro to handle this change. ISA v3.1 says this is a 7 bit field but POWER10 it's actually 8 bits which will hopefully be fixed in ISA v3.1 update. Fixes: 170a315f41c6 ("powerpc/perf: Support to export MMCRA[TEC*] field to userspace") Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1608022578-1532-1-git-send-email-atrajeev@linux.vnet.ibm.com
| * powerpc/perf: Exclude kernel samples while counting events in user space.Athira Rajeev2020-12-101-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Perf event attritube supports exclude_kernel flag to avoid sampling/profiling in supervisor state (kernel). Based on this event attr flag, Monitor Mode Control Register bit is set to freeze on supervisor state. But sometimes (due to hardware limitation), Sampled Instruction Address Register (SIAR) locks on to kernel address even when freeze on supervisor is set. Patch here adds a check to drop those samples. Cc: stable@vger.kernel.org Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1606289215-1433-1-git-send-email-atrajeev@linux.vnet.ibm.com
| * powerpc/ppc-opcode: Add PPC_RAW_MFSPR()Christophe Leroy2020-12-091-4/+1
| | | | | | | | | | | | | | | | Add PPC_RAW_MFSPR() to replace open coding done in 8xx-pmu.c Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e281e3a611eead8817c49cf06a60072a021af823.1606231483.git.christophe.leroy@csgroup.eu
| * powerpc/8xx: Use SPRN_SPRG_SCRATCH2 in DTLB miss exceptionChristophe Leroy2020-12-091-12/+7
| | | | | | | | | | | | | | | | | | | | | | Use SPRN_SPRG_SCRATCH2 in DTLB miss exception instead of DAR in order to be similar to ITLB miss exception. This also simplifies mpc8xx_pmu_del() Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e3cc8f023ef40e1e8ae144e4dd1330a5ff022528.1606231483.git.christophe.leroy@csgroup.eu
| * powerpc/8xx: Use SPRN_SPRG_SCRATCH2 in ITLB miss exceptionChristophe Leroy2020-12-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | In order to re-enable MMU earlier, ensure ITLB miss exception cannot clobber SPRN_SPRG_SCRATCH0 and SPRN_SPRG_SCRATCH1. Do so by using SPRN_SPRG_SCRATCH2 and SPRN_M_TW instead, like the DTLB miss exception. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/abc78e8e9577d473691ebb9996c6413b37bfd9ca.1606231483.git.christophe.leroy@csgroup.eu
| * powerpc/perf: MMCR0 control for PMU registers under PMCC=00Athira Rajeev2020-12-032-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PowerISA v3.1 introduces new control bit (PMCCEXT) for restricting access to group B PMU registers in problem state when MMCR0 PMCC=0b00. In problem state and when MMCR0 PMCC=0b00, setting the Monitor Mode Control Register bit 54 (MMCR0 PMCCEXT), will restrict read permission on Group B Performance Monitor Registers (SIER, SIAR, SDAR and MMCR1). When this bit is set to zero, group B registers will be readable. In other platforms (like power9), the older behaviour is retained where group B PMU SPRs are readable. Patch adds support for MMCR0 PMCCEXT bit in power10 by enabling this bit during boot and during the PMU event enable/disable callback functions. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1606409684-1589-8-git-send-email-atrajeev@linux.vnet.ibm.com
| * powerpc/perf: Fix to update cache events with l2l3 events in power10Athira Rajeev2020-12-032-3/+15
| | | | | | | | | | | | | | | | | | | | Export l2l3 events (PM_L2_ST_MISS and PM_L2_ST) and LLC-prefetches (PM_L3_PF_MISS_L3) via sysfs, and also add these to list of cache_events. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1606409684-1589-7-git-send-email-atrajeev@linux.vnet.ibm.com
| * powerpc/perf: Fix to update generic event codes for power10Athira Rajeev2020-12-032-6/+12
| | | | | | | | | | | | | | | | | | | | | | Fix the event code for events: branch-instructions (to PM_BR_FIN), branch-misses (to PM_MPRED_BR_FIN) and cache-misses (to PM_LD_DEMAND_MISS_L1_FIN) for power10 PMU. Update the list of generic events with this modified event code. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1606409684-1589-6-git-send-email-atrajeev@linux.vnet.ibm.com
| * powerpc/perf: Add generic and cache event list for power10 DD1Athira Rajeev2020-12-031-0/+152
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are event code updates for some of the generic events and cache events for power10. Inorder to maintain the current event codes work with DD1 also, create a new array of generic_events, cache_events and pmu_attr_groups with suffix _dd1, example, power10_events_attr_dd1. So that further updates to event codes can be made in the original list, ie, power10_events_attr. Update the power10 pmu init code to pick the dd1 list while registering the power PMU, based on the pvr (Processor Version Register) value. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1606409684-1589-5-git-send-email-atrajeev@linux.vnet.ibm.com
| * powerpc/perf: Fix the PMU group constraints for threshold events in power10Athira Rajeev2020-12-032-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PMU group constraints mask for threshold events covers all thresholding bits which includes threshold control value (start/stop), select value as well as thresh_cmp value (MMCRA[9:18]. In power9, thresh_cmp bits were part of the event code. But in case of power10, thresh_cmp bits are not part of event code due to inclusion of MMCR3 bits. Hence thresh_cmp is not valid for group constraints for power10. Fix the PMU group constraints checking for threshold events in power10 by using constraint mask and value for only threshold control and select bits. Fixes: a64e697cef23 ("powerpc/perf: power10 Performance Monitoring support") Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1606409684-1589-4-git-send-email-atrajeev@linux.vnet.ibm.com
| * powerpc/perf: Update the PMU group constraints for l2l3 events in power10Athira Rajeev2020-12-031-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In Power9, L2/L3 bus events are always available as a "bank" of 4 events. To obtain the counts for any of the l2/l3 bus events in a given bank, the user will have to program PMC4 with corresponding l2/l3 bus event for that bank. Commit 59029136d750 ("powerpc/perf: Add constraints for power9 l2/l3 bus events") enforced this rule in Power9. But this is not valid for Power10, since in Power10 Monitor Mode Control Register2 (MMCR2) has bits to configure l2/l3 event bits. Hence remove this PMC4 constraint check from power10. Since the l2/l3 bits in MMCR2 are not per-pmc, patch handles group constrints checks for l2/l3 bits in MMCR2. Fixes: a64e697cef23 ("powerpc/perf: power10 Performance Monitoring support") Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1606409684-1589-3-git-send-email-atrajeev@linux.vnet.ibm.com
| * powerpc/perf: Fix to update radix_scope_qual in power10Athira Rajeev2020-12-033-7/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | power10 uses bit 9 of the raw event code as RADIX_SCOPE_QUAL. This bit is used for enabling the radix process events. Patch fixes the PMU counter support functions to program bit 18 of MMCR1 ( Monitor Mode Control Register1 ) with the RADIX_SCOPE_QUAL bit value. Since this field is not per-pmc, add this to PMU group constraints to make sure events in a group will have same bit value for this field. Use bit 21 as constraint bit field for radix_scope_qual. Patch also updates the power10 raw event encoding layout information, format field and constraints bit layout to include the radix_scope_qual bit. Fixes: a64e697cef23 ("powerpc/perf: power10 Performance Monitoring support") Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1606409684-1589-2-git-send-email-atrajeev@linux.vnet.ibm.com
| * powerpc/perf: Invoke per-CPU variable access with disabled interruptsAthira Rajeev2020-12-031-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The power_pmu_event_init() callback access per-cpu variable (cpu_hw_events) to check for event constraints and Branch Stack (BHRB). Current usage is to disable preemption when accessing the per-cpu variable, but this does not prevent timer callback from interrupting event_init. Fix this by using 'local_irq_save/restore' to make sure the code path is invoked with disabled interrupts. This change is tested in mambo simulator to ensure that, if a timer interrupt comes in during the per-cpu access in event_init, it will be soft masked and replayed later. For testing purpose, introduced a udelay() in power_pmu_event_init() to make sure a timer interrupt arrives while in per-cpu variable access code between local_irq_save/resore. As expected the timer interrupt was replayed later during local_irq_restore called from power_pmu_event_init. This was confirmed by adding breakpoint in mambo and checking the backtrace when timer_interrupt was hit. Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1606814880-1720-1-git-send-email-atrajeev@linux.vnet.ibm.com
| * powerpc/vdso: Retrieve sigtramp offsets at buildtimeChristophe Leroy2020-12-032-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This is copied from arm64. Instead of using runtime generated signal trampoline offsets, get offsets at buildtime. If the said trampoline doesn't exist, build will fail. So no need to check whether the trampoline exists or not in the VDSO. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/f8bfd6812c3e3678b1cdb4d55a52f9eb022b40d3.1601197618.git.christophe.leroy@csgroup.eu
| * powerpc/vdso: Replace vdso_base by vdsoChristophe Leroy2020-12-032-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All other architectures but s390 use a void pointer named 'vdso' to reference the VDSO mapping. In a following patch, the VDSO data page will be put in front of text, vdso_base will then not anymore point to VDSO text. To avoid confusion between vdso_base and VDSO text, rename vdso_base into vdso and make it a void __user *. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/8e6cefe474aa4ceba028abb729485cd46c140990.1601197618.git.christophe.leroy@csgroup.eu
| * powerpc/perf: Fix crash with is_sier_available when pmu is not setAthira Rajeev2020-12-031-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On systems without any specific PMU driver support registered, running 'perf record' with —intr-regs will crash ( perf record -I <workload> ). The relevant portion from crash logs and Call Trace: Unable to handle kernel paging request for data at address 0x00000068 Faulting instruction address: 0xc00000000013eb18 Oops: Kernel access of bad area, sig: 11 [#1] CPU: 2 PID: 13435 Comm: kill Kdump: loaded Not tainted 4.18.0-193.el8.ppc64le #1 NIP: c00000000013eb18 LR: c000000000139f2c CTR: c000000000393d80 REGS: c0000004a07ab4f0 TRAP: 0300 Not tainted (4.18.0-193.el8.ppc64le) NIP [c00000000013eb18] is_sier_available+0x18/0x30 LR [c000000000139f2c] perf_reg_value+0x6c/0xb0 Call Trace: [c0000004a07ab770] [c0000004a07ab7c8] 0xc0000004a07ab7c8 (unreliable) [c0000004a07ab7a0] [c0000000003aa77c] perf_output_sample+0x60c/0xac0 [c0000004a07ab840] [c0000000003ab3f0] perf_event_output_forward+0x70/0xb0 [c0000004a07ab8c0] [c00000000039e208] __perf_event_overflow+0x88/0x1a0 [c0000004a07ab910] [c00000000039e42c] perf_swevent_hrtimer+0x10c/0x1d0 [c0000004a07abc50] [c000000000228b9c] __hrtimer_run_queues+0x17c/0x480 [c0000004a07abcf0] [c00000000022aaf4] hrtimer_interrupt+0x144/0x520 [c0000004a07abdd0] [c00000000002a864] timer_interrupt+0x104/0x2f0 [c0000004a07abe30] [c0000000000091c4] decrementer_common+0x114/0x120 When perf record session is started with "-I" option, capturing registers on each sample calls is_sier_available() to check for the SIER (Sample Instruction Event Register) availability in the platform. This function in core-book3s accesses 'ppmu->flags'. If a platform specific PMU driver is not registered, ppmu is set to NULL and accessing its members results in a crash. Fix the crash by returning false in is_sier_available() if ppmu is not set. Fixes: 333804dc3b7a ("powerpc/perf: Update perf_regs structure to include SIER") Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1606185640-1720-1-git-send-email-atrajeev@linux.vnet.ibm.com
| * powerpc/perf: Use regs->nip when SIAR is zeroMadhavan Srinivasan2020-11-191-4/+17
| | | | | | | | | | | | | | | | | | | | | | In power10 DD1, there is an issue where the SIAR (Sampled Instruction Address Register) is not latching to the sampled address during random sampling. This results in value of 0s in the SIAR. Add a check to use regs->nip when SIAR is zero. Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201021085329.384535-5-maddy@linux.ibm.com
| * powerpc/perf: Use the address from SIAR register to set cpumode flagsAthira Rajeev2020-11-191-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | While setting the processor mode for any sample, perf_get_misc_flags() expects the privilege level to differentiate the userspace and kernel address. On power10 DD1, there is an issue that causes MSR_HV MSR_PR bits of Sampled Instruction Event Register (SIER) not to be set for marked events. Hence add a check to use the address in SIAR (Sampled Instruction Address Register) to identify the privilege level. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201021085329.384535-3-maddy@linux.ibm.com
| * powerpc/perf: Drop the check for SIAR_VALIDAthira Rajeev2020-11-191-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In power10 DD1, there is an issue that causes the SIAR_VALID bit of the SIER (Sampled Instruction Event Register) to not be set. But the SIAR_VALID bit is used for fetching the instruction address from the SIAR (Sampled Instruction Address Register), and marked events are sampled only if the SIAR_VALID bit is set. So drop the check for SIAR_VALID and return true always incase of power10 DD1. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201021085329.384535-2-maddy@linux.ibm.com
| * powerpc/perf: Add new power PMU flag "PPMU_P10_DD1" for power10 DD1Athira Rajeev2020-11-191-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | Add a new power PMU flag "PPMU_P10_DD1" which can be used to conditionally add any code path for power10 DD1 processor version. Also modify power10 PMU driver code to set this flag only for DD1, based on the Processor Version Register (PVR) value. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201021085329.384535-1-maddy@linux.ibm.com
| * powerpc/64s/perf: perf interrupt does not have to get_user_pages to access ↵Nicholas Piggin2020-11-192-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | user memory read_user_stack_slow that walks user address translation by hand is only required on hash, because a hash fault can not be serviced from "NMI" context (to avoid re-entering the hash code) so the user stack can be mapped into Linux page tables but not accessible by the CPU. Radix MMU mode does not have this restriction. A page fault failure would indicate the page is not accessible via get_user_pages either, so avoid this on radix. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201111120151.3150658-1-npiggin@gmail.com
* | Merge tag 'fallthrough-fixes-clang-5.11-rc1' of ↵Linus Torvalds2020-12-161-0/+1
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux Pull fallthrough fixes from Gustavo A. R. Silva: "Fix many fall-through warnings when building with Clang 12.0.0 using -Wimplicit-fallthrough. - powerpc: boot: include compiler_attributes.h (Nick Desaulniers) - Revert "lib: Revert use of fallthrough pseudo-keyword in lib/" (Nick Desaulniers) - powerpc: fix -Wimplicit-fallthrough (Nick Desaulniers) - lib: Fix fall-through warnings for Clang (Gustavo A. R. Silva)" * tag 'fallthrough-fixes-clang-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: lib: Fix fall-through warnings for Clang powerpc: fix -Wimplicit-fallthrough Revert "lib: Revert use of fallthrough pseudo-keyword in lib/" powerpc: boot: include compiler_attributes.h
| * | powerpc: fix -Wimplicit-fallthroughNick Desaulniers2020-11-181-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "fallthrough" pseudo-keyword was added as a portable way to denote intentional fallthrough. Clang will still warn on cases where there is a fallthrough to an immediate break. Add explicit breaks for those cases. Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://github.com/ClangBuiltLinux/linux/issues/236 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
* | | Merge remote-tracking branch 'origin/master' into perf/corePeter Zijlstra2020-11-2616-57/+175
|\| | | | | | | | | | | | | | | | | | | | | | | Further perf/core patches will depend on: d3f7b1bb2040 ("mm/gup: fix gup_fast with dynamic page table folding") which is already in Linus' tree.
| * | perf/arch: Remove perf_sample_data::regs_user_copyPeter Zijlstra2020-11-091-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct perf_sample_data lives on-stack, we should be careful about it's size. Furthermore, the pt_regs copy in there is only because x86_64 is a trainwreck, solve it differently. Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Steven Rostedt <rostedt@goodmis.org> Link: https://lkml.kernel.org/r/20201030151955.258178461@infradead.org