path: root/arch/arm64/kernel
Commit message | Author | Age | Files | Lines
* Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
  Linus Torvalds, 2024-11-08 (2 files, -32/+4)

  Pull arm64 fixes from Will Deacon:
  "Here is a (hopefully) final round of arm64 fixes for 6.12 that address some user-visible floating point register corruption. Both of the Marks have been working on this for a couple of weeks and we've ended up in a position where SVE is solid but SME still has enough pending issues that the most pragmatic solution for the release and stable backports is to disable the feature. Yes, it's a shame, but the hardware is rare as hen's teeth at the moment and we're better off getting back to a known good state before fixing it all properly. We're also improving the selftests for 6.13 to help avoid merging broken code in the future.

  Anyway, the good news is that we're removing a lot more code than we're adding.

  Summary:

  - Fix handling of SVE traps from userspace on preemptible kernels when converting the saved floating point state into SVE state.

  - Remove broken support for the SMCCCv1.3 "SVE discard hint" optimisation.

  - Disable SME support, as the current support code suffers from numerous issues around signal delivery, ptrace access and context-switch which can lead to user-visible corruption of the register state"

  * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
    arm64: Kconfig: Make SME depend on BROKEN for now
    arm64: smccc: Remove broken support for SMCCCv1.3 SVE discard hint
    arm64/sve: Discard stale CPU state when handling SVE traps
| * arm64: smccc: Remove broken support for SMCCCv1.3 SVE discard hint
    Mark Rutland, 2024-11-07 (1 file, -32/+3)

  SMCCCv1.3 added a hint bit which callers can set in an SMCCC function ID (AKA "FID") to indicate that it is acceptable for the SMCCC implementation to discard SVE and/or SME state over a specific SMCCC call. The kernel support for using this hint is broken and SMCCC calls may clobber the SVE and/or SME state of arbitrary tasks, though FPSIMD state is unaffected.

  The kernel support is intended to use the hint when there is no SVE or SME state to save, and to do this it checks whether TIF_FOREIGN_FPSTATE is set or TIF_SVE is clear in assembly code:

  | ldr	<flags>, [<current_task>, #TSK_TI_FLAGS]
  | tbnz	<flags>, #TIF_FOREIGN_FPSTATE, 1f	// Any live FP state?
  | tbnz	<flags>, #TIF_SVE, 2f			// Does that state include SVE?
  |
  | 1:	orr	<fid>, <fid>, ARM_SMCCC_1_3_SVE_HINT
  | 2:
  | << SMCCC call using FID >>

  This is not safe as-is:

  (1) SMCCC calls can be made in a preemptible context and preemption can result in TIF_FOREIGN_FPSTATE being set or cleared at arbitrary points in time. Thus checking for TIF_FOREIGN_FPSTATE provides no guarantee.

  (2) TIF_FOREIGN_FPSTATE only indicates that the live FP/SVE/SME state in the CPU does not belong to the current task, and does not indicate that clobbering this state is acceptable.

      When the live CPU state is clobbered it is necessary to update fpsimd_last_state.st to ensure that a subsequent context switch will reload FP/SVE/SME state from memory rather than consuming the clobbered state. This and the SMCCC call itself must happen in a critical section with preemption disabled to avoid races.

  (3) Live SVE/SME state can exist with TIF_SVE clear (e.g. with only TIF_SME set), and checking TIF_SVE alone is insufficient.

  Remove the broken support for the SMCCCv1.3 SVE saving hint. This is effectively a revert of commits:

  * cfa7ff959a78 ("arm64: smccc: Support SMCCC v1.3 SVE register saving hint")
  * a7c3acca5380 ("arm64: smccc: Save lr before calling __arm_smccc_sve_check()")

  ... leaving behind the ARM_SMCCC_VERSION_1_3 and ARM_SMCCC_1_3_SVE_HINT definitions, since these are simply definitions from the SMCCC specification, and the latter is used in KVM via ARM_SMCCC_CALL_HINTS.

  If we want to bring this back in future, we'll probably want to handle this logic in C where we can use all the usual FPSIMD/SVE/SME helper functions, and that'll likely require some rework of the SMCCC code and/or its callers.

  Fixes: cfa7ff959a78 ("arm64: smccc: Support SMCCC v1.3 SVE register saving hint")
  Signed-off-by: Mark Rutland <mark.rutland@arm.com>
  Cc: Ard Biesheuvel <ardb@kernel.org>
  Cc: Catalin Marinas <catalin.marinas@arm.com>
  Cc: Marc Zyngier <maz@kernel.org>
  Cc: Mark Brown <broonie@kernel.org>
  Cc: Will Deacon <will@kernel.org>
  Cc: stable@vger.kernel.org
  Reviewed-by: Mark Brown <broonie@kernel.org>
  Link: https://lore.kernel.org/r/20241106160448.2712997-1-mark.rutland@arm.com
  Signed-off-by: Will Deacon <will@kernel.org>
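  As a rough sketch of the C-based approach suggested above (not current kernel code; the predicate name is hypothetical, and the key point is that the check, the hint, and the call share one preemption-disabled critical section):

  | /* Hypothetical sketch only; helper names are illustrative. */
  | static void smccc_1_3_call(unsigned long fid, struct arm_smccc_res *res)
  | {
  | 	get_cpu_fpsimd_context();		/* disables preemption */
  | 	if (!cpu_has_live_sve_sme_state())	/* hypothetical predicate */
  | 		fid |= ARM_SMCCC_1_3_SVE_HINT;	/* state may be discarded */
  | 	arm_smccc_1_1_invoke(fid, 0, 0, 0, 0, 0, 0, 0, res);
  | 	/*
  | 	 * If the hint was set and state was discarded, fpsimd_last_state
  | 	 * would need invalidating before preemption is re-enabled.
  | 	 */
  | 	put_cpu_fpsimd_context();
  | }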
| * arm64/sve: Discard stale CPU state when handling SVE traps
    Mark Brown, 2024-11-06 (1 file, -0/+1)

  The logic for handling SVE traps manipulates saved FPSIMD/SVE state incorrectly, and a race with preemption can result in a task having TIF_SVE set and TIF_FOREIGN_FPSTATE clear even though the live CPU state is stale (e.g. with SVE traps enabled). This has been observed to result in warnings from do_sve_acc() where SVE traps are not expected while TIF_SVE is set:

  | if (test_and_set_thread_flag(TIF_SVE))
  |         WARN_ON(1); /* SVE access shouldn't have trapped */

  Warnings of this form have been reported intermittently, e.g.

    https://lore.kernel.org/linux-arm-kernel/CA+G9fYtEGe_DhY2Ms7+L7NKsLYUomGsgqpdBj+QwDLeSg=JhGg@mail.gmail.com/
    https://lore.kernel.org/linux-arm-kernel/000000000000511e9a060ce5a45c@google.com/

  The race can occur when the SVE trap handler is preempted before and after manipulating the saved FPSIMD/SVE state, starting and ending on the same CPU, e.g.

  | void do_sve_acc(unsigned long esr, struct pt_regs *regs)
  | {
  |         // Trap on CPU 0 with TIF_SVE clear, SVE traps enabled
  |         // task->fpsimd_cpu is 0.
  |         // per_cpu_ptr(&fpsimd_last_state, 0) is task.
  |
  |         ...
  |
  |         // Preempted; migrated from CPU 0 to CPU 1.
  |         // TIF_FOREIGN_FPSTATE is set.
  |
  |         get_cpu_fpsimd_context();
  |
  |         if (test_and_set_thread_flag(TIF_SVE))
  |                 WARN_ON(1); /* SVE access shouldn't have trapped */
  |
  |         sve_init_regs() {
  |                 if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
  |                         ...
  |                 } else {
  |                         fpsimd_to_sve(current);
  |                         current->thread.fp_type = FP_STATE_SVE;
  |                 }
  |         }
  |
  |         put_cpu_fpsimd_context();
  |
  |         // Preempted; migrated from CPU 1 to CPU 0.
  |         // task->fpsimd_cpu is still 0
  |         // If per_cpu_ptr(&fpsimd_last_state, 0) is still task then:
  |         // - Stale HW state is reused (with SVE traps enabled)
  |         // - TIF_FOREIGN_FPSTATE is cleared
  |         // - A return to userspace skips HW state restore
  | }

  Fix the case where the state is not live and TIF_FOREIGN_FPSTATE is set by calling fpsimd_flush_task_state() to detach from the saved CPU state. This ensures that a subsequent context switch will not reuse the stale CPU state, and will instead set TIF_FOREIGN_FPSTATE, forcing the new state to be reloaded from memory prior to a return to userspace.

  Fixes: cccb78ce89c4 ("arm64/sve: Rework SVE access trap to convert state in registers")
  Reported-by: Mark Rutland <mark.rutland@arm.com>
  Signed-off-by: Mark Brown <broonie@kernel.org>
  Cc: stable@vger.kernel.org
  Reviewed-by: Mark Rutland <mark.rutland@arm.com>
  Link: https://lore.kernel.org/r/20241030-arm64-fpsimd-foreign-flush-v1-1-bd7bd66905a2@kernel.org
  Signed-off-by: Will Deacon <will@kernel.org>
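  The fix itself is a single added call; a sketch of how the else branch described above ends up looking (exact context in arch/arm64/kernel/fpsimd.c may differ slightly):

  | } else {
  |         /*
  |          * Detach from any stale per-CPU register state first, so a
  |          * later context switch sets TIF_FOREIGN_FPSTATE and reloads
  |          * the task's state from memory before returning to userspace.
  |          */
  |         fpsimd_flush_task_state(current);
  |         fpsimd_to_sve(current);
  |         current->thread.fp_type = FP_STATE_SVE;
  | }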
* | Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
    Linus Torvalds, 2024-11-01 (1 file, -14/+78)

  Pull arm64 fixes from Will Deacon:
  "The important one is a change to the way in which we handle protection keys around signal delivery so that we're more closely aligned with the x86 behaviour, however there is also a revert of the previous fix to disable software tag-based KASAN with GCC, since a workaround materialised shortly afterwards.

  I'd love to say we're done with 6.12, but we're aware of some longstanding fpsimd register corruption issues that we're almost at the bottom of resolving.

  Summary:

  - Fix handling of POR_EL0 during signal delivery so that pushing the signal context doesn't fail based on the pkey configuration of the interrupted context and align our user-visible behaviour with that of x86.

  - Fix a bogus pointer being passed to the CPU hotplug code from the Arm SDEI driver.

  - Re-enable software tag-based KASAN with GCC by using an alternative implementation of '__no_sanitize_address'"

  * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
    arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
    firmware: arm_sdei: Fix the input parameter of cpuhp_remove_state()
    Revert "kasan: Disable Software Tag-Based KASAN with GCC"
    kasan: Fix Software Tag-Based KASAN with GCC
| * arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
    Kevin Brodsky, 2024-10-29 (1 file, -14/+78)

  Reset POR_EL0 to "allow all" before writing the signal frame, preventing spurious uaccess failures.

  When POE is supported, the POR_EL0 register constrains memory accesses based on the target page's POIndex (pkey). This raises the question: what constraints should apply to a signal handler? The current answer is that POR_EL0 is reset to POR_EL0_INIT when invoking the handler, giving it full access to POIndex 0. This is in line with x86's MPK support and remains unchanged.

  This is only part of the story, though. POR_EL0 constrains all unprivileged memory accesses, meaning that uaccess routines such as put_user() are also impacted. As a result POR_EL0 may prevent the signal frame from being written to the signal stack (ultimately causing a SIGSEGV). This is especially concerning when an alternate signal stack is used, because userspace may want to prevent access to it outside of signal handlers. There is currently no provision for that: POR_EL0 is reset after writing to the stack, and POR_EL0_INIT only enables access to POIndex 0.

  This patch ensures that POR_EL0 is reset to its most permissive state before the signal stack is accessed. Once the signal frame has been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up to the signal handler to enable access to additional pkeys if needed. As to sigreturn(), it expects having access to the stack like any other syscall; we only need to ensure that POR_EL0 is restored from the signal frame after all uaccess calls. This approach is in line with the recent x86/pkeys series [1].

  Resetting POR_EL0 early introduces some complications, in that we can no longer read the register directly in preserve_poe_context(). This is addressed by introducing a struct (user_access_state) and helpers to manage any such register impacting user accesses (uaccess and accesses in userspace). Things look like this on signal delivery:

  1. Save original POR_EL0 into struct [save_reset_user_access_state()]
  2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
  3. Create signal frame
  4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
  5. Finalise signal frame
  6. If all operations succeeded:
     a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
     b. Else reset POR_EL0 to its original value [restore_user_access_state()]

  If any step fails when setting up the signal frame, the process will be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures that the original POR_EL0 is saved in the signal frame when delivering that SIGSEGV (so that the original value is restored by sigreturn).

  The return path (sys_rt_sigreturn) doesn't strictly require any change since restore_poe_context() is already called last. However, to avoid uaccess calls being accidentally added after that point, we use the same approach as in the delivery path, i.e. separating uaccess from writing to the register:

  1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
  2. Set POR_EL0 to the saved value [restore_user_access_state()]

  [1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/

  Fixes: 9160f7e909e1 ("arm64: add POE signal support")
  Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
  Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
  Link: https://lore.kernel.org/r/20241029144539.111155-2-kevin.brodsky@arm.com
  Signed-off-by: Will Deacon <will@kernel.org>
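  Condensed into pseudo-C, the delivery path described above looks roughly like this (helper names are the ones introduced by the patch; the surrounding code in arch/arm64/kernel/signal.c is simplified):

  | struct user_access_state ua_state;
  |
  | save_reset_user_access_state(&ua_state);  /* 1+2: save POR_EL0, set "allow all" */
  | err = setup_sigframe(...);                /* 3-5: uaccess is now unrestricted   */
  | if (!err)
  |         set_handler_user_access_state();  /* 6a: POR_EL0 = POR_EL0_INIT         */
  | else
  |         restore_user_access_state(&ua_state); /* 6b: back to the original value */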
* | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
    Linus Torvalds, 2024-10-21 (1 file, -0/+1)

  Pull kvm fixes from Paolo Bonzini:

  "ARM64:

  - Fix the guest view of the ID registers, making the relevant fields writable from userspace (affecting ID_AA64DFR0_EL1 and ID_AA64PFR1_EL1)

  - Correctly expose S1PIE to guests, fixing a regression introduced in 6.12-rc1 with the S1POE support

  - Fix the recycling of stage-2 shadow MMUs by tracking the context (are we allowed to block or not) as well as the recycling state

  - Address a couple of issues with the vgic when userspace misconfigures the emulation, resulting in various splats. Headaches courtesy of our Syzkaller friends

  - Stop wasting space in the HYP idmap, as we are dangerously close to the 4kB limit, and this has already exploded in -next

  - Fix another race in vgic_init()

  - Fix a UBSAN error when faking the cache topology with MTE enabled

  RISCV:

  - RISCV: KVM: use raw_spinlock for critical section in imsic

  x86:

  - A bandaid for lack of XCR0 setup in selftests, which causes trouble if the compiler is configured to have x86-64-v3 (with AVX) as the default ISA. Proper XCR0 setup will come in the next merge window.

  - Fix an issue where KVM would not ignore low bits of the nested CR3 and potentially leak up to 31 bytes out of the guest memory's bounds

  - Fix case in which an out-of-date cached value for the segments could be returned by KVM_GET_SREGS.

  - More cleanups for KVM_X86_QUIRK_SLOT_ZAP_ALL

  - Override MTRR state for KVM confidential guests, making it WB by default as is already the case for Hyper-V guests.

  Generic:

  - Remove a couple of unused functions"

  * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (27 commits)
    RISCV: KVM: use raw_spinlock for critical section in imsic
    KVM: selftests: Fix out-of-bounds reads in CPUID test's array lookups
    KVM: selftests: x86: Avoid using SSE/AVX instructions
    KVM: nSVM: Ignore nCR3[4:0] when loading PDPTEs from memory
    KVM: VMX: reset the segment cache after segment init in vmx_vcpu_reset()
    KVM: x86: Clean up documentation for KVM_X86_QUIRK_SLOT_ZAP_ALL
    KVM: x86/mmu: Add lockdep assert to enforce safe usage of kvm_unmap_gfn_range()
    KVM: x86/mmu: Zap only SPs that shadow gPTEs when deleting memslot
    x86/kvm: Override default caching mode for SEV-SNP and TDX
    KVM: Remove unused kvm_vcpu_gfn_to_pfn_atomic
    KVM: Remove unused kvm_vcpu_gfn_to_pfn
    KVM: arm64: Ensure vgic_ready() is ordered against MMIO registration
    KVM: arm64: vgic: Don't check for vgic_ready() when setting NR_IRQS
    KVM: arm64: Fix shift-out-of-bounds bug
    KVM: arm64: Shave a few bytes from the EL2 idmap code
    KVM: arm64: Don't eagerly teardown the vgic on init error
    KVM: arm64: Expose S1PIE to guests
    KVM: arm64: nv: Clarify safety of allowing TLBI unmaps to reschedule
    KVM: arm64: nv: Punt stage-2 recycling to a vCPU request
    KVM: arm64: nv: Do not block when unmapping stage-2 if disallowed
    ...
| * KVM: arm64: Shave a few bytes from the EL2 idmap code
    Marc Zyngier, 2024-10-17 (1 file, -0/+1)

  Our idmap is becoming too big, to the point where it doesn't fit in a 4kB page anymore. There are some low-hanging fruits though, such as the el2_init_state horror that is expanded 3 times in the kernel. Let's at least limit ourselves to two copies, which makes the kernel link again. At some point, we'll have to have a better way of doing this.

  Reported-by: Nathan Chancellor <nathan@kernel.org>
  Signed-off-by: Marc Zyngier <maz@kernel.org>
  Link: https://lore.kernel.org/r/20241009204903.GA3353168@thelio-3990X
* | arm64: set POR_EL0 for kernel threads
    Joey Gouly, 2024-10-14 (1 file, -0/+3)

  Restrict kernel threads to only have RWX overlays for pkey 0. This matches what arch/x86 does, by defaulting to a restrictive PKRU.

  Signed-off-by: Joey Gouly <joey.gouly@arm.com>
  Cc: Will Deacon <will@kernel.org>
  Cc: Catalin Marinas <catalin.marinas@arm.com>
  Reviewed-by: Kevin Brodsky <Kevin.Brodsky@arm.com>
  Link: https://lore.kernel.org/r/20241001133618.1547996-2-joey.gouly@arm.com
  Signed-off-by: Will Deacon <will@kernel.org>
* | arm64: probes: Fix uprobes for big-endian kernels
    Mark Rutland, 2024-10-09 (1 file, -2/+2)

  The arm64 uprobes code is broken for big-endian kernels as it doesn't convert the in-memory instruction encoding (which is always little-endian) into the kernel's native endianness before analyzing and simulating instructions. This may result in a few distinct problems:

  * The kernel may erroneously reject probing an instruction which can safely be probed.

  * The kernel may erroneously permit stepping an instruction out-of-line when that instruction cannot be stepped out-of-line safely.

  * The kernel may simulate instructions incorrectly due to interpreting the byte-swapped encoding.

  The endianness mismatch isn't caught by the compiler or sparse because:

  * The arch_uprobe::{insn,ixol} fields are encoded as arrays of u8, so the compiler and sparse have no idea these contain a little-endian 32-bit value. The core uprobes code populates these with a memcpy() which similarly does not handle endianness.

  * While the uprobe_opcode_t type is an alias for __le32, both arch_uprobe_analyze_insn() and arch_uprobe_skip_sstep() cast from u8[] to the similarly-named probe_opcode_t, which is an alias for u32. Hence there is no endianness conversion warning.

  Fix this by changing the arch_uprobe::{insn,ixol} fields to __le32 and adding the appropriate __le32_to_cpu() conversions prior to consuming the instruction encoding. The core uprobes code copies these fields as opaque ranges of bytes, and so is unaffected by this change.

  At the same time, remove MAX_UINSN_BYTES and consistently use AARCH64_INSN_SIZE for clarity.

  Tested with the following:

  | #include <stdio.h>
  | #include <stdbool.h>
  |
  | #define noinline __attribute__((noinline))
  |
  | static noinline void *adrp_self(void)
  | {
  |         void *addr;
  |
  |         asm volatile(
  |         "       adrp    %x0, adrp_self\n"
  |         "       add     %x0, %x0, :lo12:adrp_self\n"
  |         : "=r" (addr));
  | }
  |
  |
  | int main(int argc, char *argv)
  | {
  |         void *ptr = adrp_self();
  |         bool equal = (ptr == adrp_self);
  |
  |         printf("adrp_self   => %p\n"
  |                "adrp_self() => %p\n"
  |                "%s\n",
  |                adrp_self, ptr, equal ? "EQUAL" : "NOT EQUAL");
  |
  |         return 0;
  | }

  ... where the adrp_self() function was compiled to:

  | 00000000004007e0 <adrp_self>:
  |   4007e0:       90000000        adrp    x0, 400000 <__ehdr_start>
  |   4007e4:       911f8000        add     x0, x0, #0x7e0
  |   4007e8:       d65f03c0        ret

  Before this patch, the ADRP is not recognized, and is assumed to be steppable, resulting in corruption of the result:

  | # ./adrp-self
  | adrp_self   => 0x4007e0
  | adrp_self() => 0x4007e0
  | EQUAL
  | # echo 'p /root/adrp-self:0x007e0' > /sys/kernel/tracing/uprobe_events
  | # echo 1 > /sys/kernel/tracing/events/uprobes/enable
  | # ./adrp-self
  | adrp_self   => 0x4007e0
  | adrp_self() => 0xffffffffff7e0
  | NOT EQUAL

  After this patch, the ADRP is correctly recognized and simulated:

  | # ./adrp-self
  | adrp_self   => 0x4007e0
  | adrp_self() => 0x4007e0
  | EQUAL
  | #
  | # echo 'p /root/adrp-self:0x007e0' > /sys/kernel/tracing/uprobe_events
  | # echo 1 > /sys/kernel/tracing/events/uprobes/enable
  | # ./adrp-self
  | adrp_self   => 0x4007e0
  | adrp_self() => 0x4007e0
  | EQUAL

  Fixes: 9842ceae9fa8 ("arm64: Add uprobe support")
  Cc: stable@vger.kernel.org
  Signed-off-by: Mark Rutland <mark.rutland@arm.com>
  Cc: Catalin Marinas <catalin.marinas@arm.com>
  Cc: Will Deacon <will@kernel.org>
  Link: https://lore.kernel.org/r/20241008155851.801546-4-mark.rutland@arm.com
  Signed-off-by: Will Deacon <will@kernel.org>
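  The shape of the fix is to make the stored encoding's endianness explicit and convert it before use; a minimal sketch (the real change also touches arch_uprobe_skip_sstep() and the struct definition):

  | /* arch_uprobe::insn is now __le32 rather than an opaque u8[] */
  | int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe,
  |                              struct mm_struct *mm, unsigned long addr)
  | {
  |         u32 insn = le32_to_cpu(auprobe->insn);  /* convert to native endianness */
  |
  |         /* ...decode and classify 'insn' as before... */
  | }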
* | arm64: probes: Fix simulate_ldr*_literal()
    Mark Rutland, 2024-10-09 (1 file, -11/+7)

  The simulate_ldr_literal() code always loads a 64-bit quantity, and when simulating a 32-bit load into a 'W' register, it discards the most significant 32 bits. For big-endian kernels this means that the relevant bits are discarded, and the value returned is the subsequent 32 bits in memory (i.e. the value at addr + 4).

  Additionally, simulate_ldr_literal() and simulate_ldrsw_literal() use a plain C load, which the compiler may tear or elide (e.g. if the target is the zero register).

  Today this doesn't happen to matter, but it may matter in future if trampoline code uses a LDR (literal) or LDRSW (literal).

  Update simulate_ldr_literal() and simulate_ldrsw_literal() to use an appropriately-sized READ_ONCE() to perform the access, which avoids these problems.

  Fixes: 39a67d49ba35 ("arm64: kprobes instruction simulation support")
  Cc: stable@vger.kernel.org
  Signed-off-by: Mark Rutland <mark.rutland@arm.com>
  Cc: Catalin Marinas <catalin.marinas@arm.com>
  Cc: Will Deacon <will@kernel.org>
  Link: https://lore.kernel.org/r/20241008155851.801546-3-mark.rutland@arm.com
  Signed-off-by: Will Deacon <will@kernel.org>
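  The resulting loads are sized to match the instruction being simulated; roughly (register accessor names follow the existing probes simulation code, details may differ):

  | unsigned long load_addr = addr + disp;
  |
  | if (opcode & (1 << 30))         /* 64-bit LDR into Xn */
  |         set_x_reg(regs, xn, READ_ONCE(*(u64 *)load_addr));
  | else                            /* 32-bit LDR into Wn */
  |         set_w_reg(regs, xn, READ_ONCE(*(u32 *)load_addr));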
* | arm64: probes: Remove broken LDR (literal) uprobe support
    Mark Rutland, 2024-10-09 (1 file, -5/+11)

  The simulate_ldr_literal() and simulate_ldrsw_literal() functions are unsafe to use for uprobes. Both functions were originally written for use with kprobes, and access memory with plain C accesses. When uprobes was added, these were reused unmodified even though they cannot safely access user memory.

  There are three key problems:

  1) The plain C accesses do not have corresponding extable entries, and thus if they encounter a fault the kernel will treat these as unintentional accesses to user memory, resulting in a BUG() which will kill the kernel thread, and likely lead to further issues (e.g. lockup or panic()).

  2) The plain C accesses are subject to HW PAN and SW PAN, and so when either is in use, any attempt to simulate an access to user memory will fault. Thus neither simulate_ldr_literal() nor simulate_ldrsw_literal() can do anything useful when simulating a user instruction on any system with HW PAN or SW PAN.

  3) The plain C accesses are privileged, as they run in kernel context, and in practice can access a small range of kernel virtual addresses. The instructions they simulate have a range of +/-1MiB, and since the simulated instruction must itself be a user instruction in the TTBR0 address range, these can address the final 1MiB of the TTBR1 address range by wrapping downwards from an address in the first 1MiB of the TTBR0 address range.

     In contemporary kernels the last 8MiB of TTBR1 address range is reserved, and accesses to this will always fault, meaning this is no worse than (1).

     Historically, it was theoretically possible for the linear map or vmemmap to spill into the final 8MiB of the TTBR1 address range, but in practice this is extremely unlikely to occur as this would require either:

     * Having enough physical memory to fill the entire linear map all the way to the final 1MiB of the TTBR1 address range.

     * Getting unlucky with KASLR randomization of the linear map such that the populated region happens to overlap with the last 1MiB of the TTBR address range.

     ... and in either case if we were to spill into the final page there would be larger problems as the final page would alias with error pointers.

  Practically speaking, (1) and (2) are the big issues. Given there have been no reports of problems since the broken code was introduced, it appears that no-one is relying on probing these instructions with uprobes.

  Avoid these issues by not allowing uprobes on LDR (literal) and LDRSW (literal), limiting the use of simulate_ldr_literal() and simulate_ldrsw_literal() to kprobes. Attempts to place uprobes on LDR (literal) and LDRSW (literal) will be rejected as arm_probe_decode_insn() will return INSN_REJECTED. In future we can consider introducing working uprobes support for these instructions, but this will require more significant work.

  Fixes: 9842ceae9fa8 ("arm64: Add uprobe support")
  Cc: stable@vger.kernel.org
  Signed-off-by: Mark Rutland <mark.rutland@arm.com>
  Cc: Catalin Marinas <catalin.marinas@arm.com>
  Cc: Will Deacon <will@kernel.org>
  Link: https://lore.kernel.org/r/20241008155851.801546-2-mark.rutland@arm.com
  Signed-off-by: Will Deacon <will@kernel.org>
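  A sketch of what the uprobe-side rejection amounts to in the decode path (the exact structure of arch/arm64/kernel/probes/decode-insn.c differs; the instruction-class helpers shown are the ones used elsewhere in the probes code):

  | /* uprobes: do not simulate literal loads; they would need user memory access */
  | if (aarch64_insn_is_ldr_lit(insn) || aarch64_insn_is_ldrsw_lit(insn))
  |         return INSN_REJECTED;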
* | arm64: Subscribe Microsoft Azure Cobalt 100 to erratum 3194386
    Easwar Hariharan, 2024-10-04 (1 file, -0/+1)

  Add the Microsoft Azure Cobalt 100 CPU to the list of CPUs suffering from erratum 3194386 added in commit 75b3c43eab59 ("arm64: errata: Expand speculative SSBS workaround").

  CC: Mark Rutland <mark.rutland@arm.com>
  CC: James Morse <james.morse@arm.com>
  CC: Will Deacon <will@kernel.org>
  CC: stable@vger.kernel.org # 6.6+
  Signed-off-by: Easwar Hariharan <eahariha@linux.microsoft.com>
  Link: https://lore.kernel.org/r/20241003225239.321774-1-eahariha@linux.microsoft.com
  Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
* | arm64: errata: Expand speculative SSBS workaround once more
    Mark Rutland, 2024-10-01 (1 file, -0/+2)

  A number of Arm Ltd CPUs suffer from errata whereby an MSR to the SSBS special-purpose register does not affect subsequent speculative instructions, permitting speculative store bypassing for a window of time.

  We worked around this for a number of CPUs in commits:

  * 7187bb7d0b5c7dfa ("arm64: errata: Add workaround for Arm errata 3194386 and 3312417")
  * 75b3c43eab594bfb ("arm64: errata: Expand speculative SSBS workaround")
  * 145502cac7ea70b5 ("arm64: errata: Expand speculative SSBS workaround (again)")

  Since then, a (hopefully final) batch of updates have been published, with two more affected CPUs. For the affected CPUs the existing mitigation is sufficient, as described in their respective Software Developer Errata Notice (SDEN) documents:

  * Cortex-A715 (MP148) SDEN v15.0, erratum 3456084
    https://developer.arm.com/documentation/SDEN-2148827/1500/

  * Neoverse-N3 (MP195) SDEN v5.0, erratum 3456111
    https://developer.arm.com/documentation/SDEN-3050973/0500/

  Enable the existing mitigation by adding the relevant MIDRs to erratum_spec_ssbs_list, and update silicon-errata.rst and the Kconfig text accordingly.

  Signed-off-by: Mark Rutland <mark.rutland@arm.com>
  Cc: James Morse <james.morse@arm.com>
  Cc: Will Deacon <will@kernel.org>
  Link: https://lore.kernel.org/r/20240930111705.3352047-3-mark.rutland@arm.com
  Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
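  In arch/arm64/kernel/cpu_errata.c the two added list entries are expected to look like the existing ones, roughly:

  | static const struct midr_range erratum_spec_ssbs_list[] = {
  |         /* ...existing entries... */
  |         MIDR_ALL_VERSIONS(MIDR_CORTEX_A715),
  |         MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N3),
  |         {},
  | };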
* Merge tag 'asm-generic-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
  Linus Torvalds, 2024-09-26 (1 file, -2/+1)

  Pull asm-generic updates from Arnd Bergmann:
  "These are only two small patches, one cleanup for arch/alpha and a preparation patch cleaning up the handling of runtime constants in the linker scripts"

  * tag 'asm-generic-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
    runtime constants: move list of constants to vmlinux.lds.h
    alpha: no need to include asm/xchg.h twice
| * runtime constants: move list of constants to vmlinux.lds.h
    Jann Horn, 2024-08-19 (1 file, -2/+1)

  Refactor the list of constant variables into a macro. This should make it easier to add more constants in the future.

  Signed-off-by: Jann Horn <jannh@google.com>
  Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
  Acked-by: Arnd Bergmann <arnd@arndb.de>
  Acked-by: Will Deacon <will@kernel.org>
  Signed-off-by: Arnd Bergmann <arnd@arndb.de>
* | Merge tag 'random-6.12-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
    Linus Torvalds, 2024-09-18 (5 files, -14/+208)

  Pull random number generator updates from Jason Donenfeld:
  "Originally I'd planned on sending each of the vDSO getrandom() architecture ports to their respective arch trees. But as we started to work on this, we found lots of interesting issues in the shared code and infrastructure, the fixes for which the various archs needed to base their work. So in the end, this turned into a nice collaborative effort fixing up issues and porting to 5 new architectures -- arm64, powerpc64, powerpc32, s390x, and loongarch64 -- with everybody pitching in and commenting on each other's code. It was a fun development cycle.

  This contains:

  - Numerous fixups to the vDSO selftest infrastructure, getting it running successfully on more platforms, and fixing bugs in it.

  - Additions to the vDSO getrandom & chacha selftests. Basically every time manual review unearthed a bug in a revision of an arch patch, or an ambiguity, the tests were augmented. By the time the last arch was submitted for review, s390x, v1 of the series was essentially fine right out of the gate.

  - Fixes to the generic C implementation of vDSO getrandom, to build and run successfully on all archs, decoupling it from assumptions we had (unintentionally) made on x86_64 that didn't carry through to the other architectures.

  - Port of vDSO getrandom to LoongArch64, from Xi Ruoyao and acked by Huacai Chen.

  - Port of vDSO getrandom to ARM64, from Adhemerval Zanella and acked by Will Deacon.

  - Port of vDSO getrandom to PowerPC, in both 32-bit and 64-bit varieties, from Christophe Leroy and acked by Michael Ellerman.

  - Port of vDSO getrandom to S390X from Heiko Carstens, the arch maintainer.

  While it'd be natural for there to be things to fix up over the course of the development cycle, these patches got a decent amount of review from a fairly diverse crew of folks on the mailing lists, and, for the most part, they've been cooking in linux-next, which has been helpful for ironing out build issues.

  In terms of architectures, I think that mostly takes care of the important 64-bit archs with hardware still being produced and running production loads in settings where vDSO getrandom is likely to help. Arguably there's still RISC-V left, and we'll see for 6.13 whether they find it useful and submit a port"

  * tag 'random-6.12-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (47 commits)
    selftests: vDSO: check cpu caps before running chacha test
    s390/vdso: Wire up getrandom() vdso implementation
    s390/vdso: Move vdso symbol handling to separate header file
    s390/vdso: Allow alternatives in vdso code
    s390/module: Provide find_section() helper
    s390/facility: Let test_facility() generate static branch if possible
    s390/alternatives: Remove ALT_FACILITY_EARLY
    s390/facility: Disable compile time optimization for decompressor code
    selftests: vDSO: fix vdso_config for s390
    selftests: vDSO: fix ELF hash table entry size for s390x
    powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO64
    powerpc/vdso: Wire up getrandom() vDSO implementation on VDSO32
    powerpc/vdso: Refactor CFLAGS for CVDSO build
    powerpc/vdso32: Add crtsavres
    mm: Define VM_DROPPABLE for powerpc/32
    powerpc/vdso: Fix VDSO data access when running in a non-root time namespace
    selftests: vDSO: don't include generated headers for chacha test
    arm64: vDSO: Wire up getrandom() vDSO implementation
    arm64: alternative: make alternative_has_cap_likely() VDSO compatible
    selftests: vDSO: also test counter in vdso_test_chacha
    ...
| * | arm64: vDSO: Wire up getrandom() vDSO implementation
      Adhemerval Zanella, 2024-09-13 (5 files, -14/+208)

  Hook up the generic vDSO implementation to the aarch64 vDSO data page. The _vdso_rng_data required data is placed within the _vdso_data vvar page, by using an offset larger than the vdso_data.

  The vDSO function requires a ChaCha20 implementation that does not write to the stack, and that can do an entire ChaCha20 permutation. The one provided uses NEON on the permute operation, with a fallback to the syscall for chips that do not support AdvSIMD.

  This also passes the vdso_test_chacha test along with vdso_test_getrandom. The vdso_test_getrandom bench-single result on Neoverse-N1 shows:

  |    vdso: 25000000 times in 0.783884250 seconds
  |    libc: 25000000 times in 8.780275399 seconds
  | syscall: 25000000 times in 8.786581518 seconds

  A small fixup to arch/arm64/include/asm/mman.h was required to avoid pulling kernel code into the vDSO, similar to what's already done in arch/arm64/include/asm/rwonce.h.

  Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
  Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
  Acked-by: Will Deacon <will@kernel.org>
  Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
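  From userspace the fast path is reached through the ordinary getrandom() interface; a minimal consumer looks like the sketch below (whether the vDSO implementation is used transparently depends on the libc version, which is an assumption here):

  | #include <stdio.h>
  | #include <sys/random.h>
  |
  | int main(void)
  | {
  |         unsigned char buf[32];
  |
  |         if (getrandom(buf, sizeof(buf), 0) != (ssize_t)sizeof(buf))
  |                 return 1;
  |
  |         for (unsigned int i = 0; i < sizeof(buf); i++)
  |                 printf("%02x", buf[i]);
  |         printf("\n");
  |         return 0;
  | }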
* | Merge tag 'rcu.release.v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux
    Linus Torvalds, 2024-09-18 (1 file, -1/+1)

  Pull RCU updates from Neeraj Upadhyay:

  "Context tracking:
  - rename context tracking state related symbols and remove references to "dynticks" in various context tracking state variables and related helpers
  - force context_tracking_enabled_this_cpu() to be inlined to avoid leaving a noinstr section

  CSD lock:
  - enhance CSD-lock diagnostic reports
  - add an API to provide an indication of ongoing CSD-lock stall

  nocb:
  - update and simplify RCU nocb code to handle (de-)offloading of callbacks only for offline CPUs
  - fix RT throttling hrtimer being armed from offline CPU

  rcutorture:
  - remove redundant rcu_torture_ops get_gp_completed fields
  - add SRCU ->same_gp_state and ->get_comp_state functions
  - add generic test for NUM_ACTIVE_*RCU_POLL* for testing RCU and SRCU polled grace periods
  - add CFcommon.arch for arch-specific Kconfig options
  - print number of update types in rcu_torture_write_types()
  - add rcutree.nohz_full_patience_delay testing to the TREE07 scenario
  - add a stall_cpu_repeat module parameter to test repeated CPU stalls
  - add argument to limit number of CPUs a guest OS can use in torture.sh

  rcustall:
  - abbreviate RCU CPU stall warnings during CSD-lock stalls
  - Allow dump_cpu_task() to be called without disabling preemption
  - defer printing stall-warning backtrace when holding rcu_node lock

  srcu:
  - make SRCU gp seq wrap-around faster
  - add KCSAN checks for concurrent updates to ->srcu_n_exp_nodelay and ->reschedule_count which are used in heuristics governing auto-expediting of normal SRCU grace periods and grace-period-state-machine delays
  - mark idle SRCU-barrier callbacks to help identify stuck SRCU-barrier callback

  rcu tasks:
  - remove RCU Tasks Rude asynchronous APIs as they are no longer used
  - stop testing RCU Tasks Rude asynchronous APIs
  - fix access to non-existent percpu regions
  - check processor-ID assumptions during chosen CPU calculation for callback enqueuing
  - update description of rtp->tasks_gp_seq grace-period sequence number
  - add rcu_barrier_cb_is_done() to identify whether a given rcu_barrier callback is stuck
  - mark idle Tasks-RCU-barrier callbacks
  - add *torture_stats_print() functions to print detailed diagnostics for Tasks-RCU variants
  - capture start time of rcu_barrier_tasks*() operation to help distinguish a hung barrier operation from a long series of barrier operations

  refscale:
  - add a TINY scenario to support tests of Tiny RCU and Tiny SRCU
  - optimize process_durations() operation

  rcuscale:
  - dump stacks of stalled rcu_scale_writer() instances and grace-period statistics when rcu_scale_writer() stalls
  - mark idle RCU-barrier callbacks to identify stuck RCU-barrier callbacks
  - print detailed grace-period and barrier diagnostics on rcu_scale_writer() hangs for Tasks-RCU variants
  - warn if async module parameter is specified for RCU implementations that do not have async primitives such as RCU Tasks Rude
  - make all writer tasks report upon hang
  - tolerate repeated GFP_KERNEL failure in rcu_scale_writer()
  - use special allocator for rcu_scale_writer()
  - NULL out top-level pointers to heap memory to avoid double-free bugs on modprobe failures
  - maintain per-task instead of per-CPU callbacks count to avoid any issues with migration of either tasks or callbacks
  - constify struct ref_scale_ops

  Fixes:
  - use system_unbound_wq for kfree_rcu work to avoid disturbing isolated CPUs

  Misc:
  - warn on unexpected rcu_state.srs_done_tail state
  - better define "atomic" for list_replace_rcu() and hlist_replace_rcu() routines
  - annotate struct kvfree_rcu_bulk_data with __counted_by()"

  * tag 'rcu.release.v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (90 commits)
    rcu: Defer printing stall-warning backtrace when holding rcu_node lock
    rcu/nocb: Remove superfluous memory barrier after bypass enqueue
    rcu/nocb: Conditionally wake up rcuo if not already waiting on GP
    rcu/nocb: Fix RT throttling hrtimer armed from offline CPU
    rcu/nocb: Simplify (de-)offloading state machine
    context_tracking: Tag context_tracking_enabled_this_cpu() __always_inline
    context_tracking, rcu: Rename rcu_dyntick trace event into rcu_watching
    rcu: Update stray documentation references to rcu_dynticks_eqs_{enter, exit}()
    rcu: Rename rcu_momentary_dyntick_idle() into rcu_momentary_eqs()
    rcu: Rename rcu_implicit_dynticks_qs() into rcu_watching_snap_recheck()
    rcu: Rename dyntick_save_progress_counter() into rcu_watching_snap_save()
    rcu: Rename struct rcu_data .exp_dynticks_snap into .exp_watching_snap
    rcu: Rename struct rcu_data .dynticks_snap into .watching_snap
    rcu: Rename rcu_dynticks_zero_in_eqs() into rcu_watching_zero_in_eqs()
    rcu: Rename rcu_dynticks_in_eqs_since() into rcu_watching_snap_stopped_since()
    rcu: Rename rcu_dynticks_in_eqs() into rcu_watching_snap_in_eqs()
    rcu: Rename rcu_dynticks_eqs_online() into rcu_watching_online()
    context_tracking, rcu: Rename rcu_dynticks_curr_cpu_in_eqs() into rcu_is_watching_curr_cpu()
    context_tracking, rcu: Rename rcu_dynticks_task*() into rcu_task*()
    refscale: Constify struct ref_scale_ops
    ...
| * | treewide: context_tracking: Rename CONTEXT_* into CT_STATE_*
      Valentin Schneider, 2024-07-29 (1 file, -1/+1)

  Context tracking state related symbols currently use a mix of the CONTEXT_ (e.g. CONTEXT_KERNEL) and CT_STATE_ (e.g. CT_STATE_MASK) prefixes. Clean up the naming and make the ctx_state enum use the CT_STATE_ prefix.

  Suggested-by: Frederic Weisbecker <frederic@kernel.org>
  Signed-off-by: Valentin Schneider <vschneid@redhat.com>
  Acked-by: Frederic Weisbecker <frederic@kernel.org>
  Acked-by: Thomas Gleixner <tglx@linutronix.de>
  Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
* | | Merge tag 'acpi-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
      Linus Torvalds, 2024-09-16 (1 file, -191/+0)

  Pull ACPI updates from Rafael Wysocki:
  "These update the ACPICA code in the kernel to upstream version 20240827, add support for ACPI-based enumeration of interrupt controllers on RISC-V along with some related irqchip updates, clean up the ACPI device object sysfs interface, add some quirks for backlight handling and IRQ overrides, fix assorted issues and clean up code.

  Specifics:

  - Check return value in acpi_db_convert_to_package() (Pei Xiao)

  - Detect FACS and allow setting the waking vector on reduced-hardware ACPI platforms (Jiaqing Zhao)

  - Allow ACPICA to represent semaphores as integers (Adrien Destugues)

  - Complete CXL 3.0 CXIMS structures support in ACPICA (Zhang Rui)

  - Make ACPICA support SPCR version 4 and add RISC-V SBI Subtype to DBG2 (Sia Jee Heng)

  - Implement the Dword_PCC Resource Descriptor Macro in ACPICA (Jose Marinho)

  - Correct the typo in struct acpi_mpam_msc_node member (Punit Agrawal)

  - Implement ACPI_WARNING_ONCE() and ACPI_ERROR_ONCE() and use them to prevent a Stall() violation warning from being printed every time this takes place (Vasily Khoruzhick)

  - Allow PCC Data Type in MCTP resource (Adam Young)

  - Fix memory leaks on acpi_ps_get_next_namepath() and acpi_ps_get_next_field() failures (Armin Wolf)

  - Add support for suppressing leading zeros in hex strings when converting them to integers and update integer-to-hex-string conversions in ACPICA (Armin Wolf)

  - Add support for Windows 11 22H2 _OSI string (Armin Wolf)

  - Avoid warning for Dump Functions in ACPICA (Adam Lackorzynski)

  - Add extended linear address mode to HMAT MSCIS in ACPICA (Dave Jiang)

  - Handle empty connection_node in iasl (Aleksandrs Vinarskis)

  - Allow for more flexibility in _DSM args (Saket Dumbre)

  - Setup for ACPICA release 20240827 (Saket Dumbre)

  - Add ACPI device enumeration support for interrupt controller probing including taking dependencies into account (Sunil V L)

  - Implement ACPI-based interrupt controller probing on RISC-V (Sunil V L)

  - Add ACPI support for AIA in riscv-intc and add ACPI support to riscv-imsic, riscv-aplic, and sifive-plic (Sunil V L)

  - Do not release locks during operation region accesses in the ACPI EC driver (Rafael Wysocki)

  - Fix up the _STR handling in the ACPI device object sysfs interface, make it represent the device object attributes as an attribute group and make it rely on driver core functionality for sysfs attribute management (Thomas Weißschuh)

  - Extend error messages printed to the kernel log when acpi_evaluate_dsm() fails to include revision and function number (David Wang)

  - Add a new AMDI0015 platform device ID to the ACPI APD driver for AMD SoCs (Shyam Sundar S K)

  - Use the driver core for the async probing management in the ACPI battery driver (Thomas Weißschuh)

  - Remove redundant initializations of a local variable to NULL from the ACPI battery driver (Ilpo Järvinen)

  - Remove unneeded check in tps68470_pmic_opregion_probe() (Aleksandr Mishin)

  - Add support for setting the EPP register through the ACPI CPPC sysfs interface if it is in FFH (Mario Limonciello)

  - Fix MASK_VAL() usage in the ACPI CPPC library (Clément Léger)

  - Reduce the log level of a per-CPU message about idle states in the ACPI processor driver (Li RongQing)

  - Fix crash in exit_round_robin() in the ACPI processor aggregator device (PAD) driver (Seiji Nishikawa)

  - Add force_vendor quirk for Panasonic Toughbook CF-18 in the ACPI backlight driver (Hans de Goede)

  - Make the DMI checks related to backlight handling on Lenovo Yoga Tab 3 X90F less strict (Hans de Goede)

  - Enforce native backlight handling on Apple MacbookPro9,2 (Esther Shimanovich)

  - Add IRQ override quirks for Asus Vivobook Go E1404GAB and MECHREV GM7XG0M, and refine the TongFang GMxXGxx quirk (Li Chen, Tamim Khan, Werner Sembach)

  - Quirk ASUS ROG M16 to default to S3 sleep (Luke D. Jones)

  - Define and use symbols for device and class name lengths in the ACPI bus type code and make the code use strscpy() instead of strcpy() in several places (Muhammad Qasim Abdul Majeed)"

  * tag 'acpi-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (70 commits)
    ACPI: resource: Add another DMI match for the TongFang GMxXGxx
    ACPI: CPPC: Add support for setting EPP register in FFH
    ACPI: PM: Quirk ASUS ROG M16 to default to S3 sleep
    ACPI: video: Add force_vendor quirk for Panasonic Toughbook CF-18
    ACPI: battery: use driver core managed async probing
    ACPI: button: Use strscpy() instead of strcpy()
    ACPI: resource: Skip IRQ override on Asus Vivobook Go E1404GAB
    ACPI: CPPC: Fix MASK_VAL() usage
    irqchip/sifive-plic: Add ACPI support
    ACPICA: Setup for ACPICA release 20240827
    ACPICA: Allow for more flexibility in _DSM args
    ACPICA: iasl: handle empty connection_node
    ACPICA: HMAT: Add extended linear address mode to MSCIS
    ACPICA: Avoid warning for Dump Functions
    ACPICA: Add support for Windows 11 22H2 _OSI string
    ACPICA: Update integer-to-hex-string conversions
    ACPICA: Add support for supressing leading zeros in hex strings
    ACPICA: Allow for supressing leading zeros when using acpi_ex_convert_to_ascii()
    ACPICA: Fix memory leak if acpi_ps_get_next_field() fails
    ACPICA: Fix memory leak if acpi_ps_get_next_namepath() fails
    ...
| * \ \ Merge branch 'acpi-riscv'
        Rafael J. Wysocki, 2024-09-11 (1 file, -191/+0)

  Merge ACPI and irqchip updates related to external interrupt controller support on RISC-V:

  - Add ACPI device enumeration support for interrupt controller probing including taking dependencies into account (Sunil V L).

  - Implement ACPI-based interrupt controller probing on RISC-V (Sunil V L).

  - Add ACPI support for AIA in riscv-intc and add ACPI support to riscv-imsic, riscv-aplic, and sifive-plic (Sunil V L).

  * acpi-riscv:
    irqchip/sifive-plic: Add ACPI support
    irqchip/riscv-aplic: Add ACPI support
    irqchip/riscv-imsic: Add ACPI support
    irqchip/riscv-imsic-state: Create separate function for DT
    irqchip/riscv-intc: Add ACPI support for AIA
    ACPI: RISC-V: Implement function to add implicit dependencies
    ACPI: RISC-V: Initialize GSI mapping structures
    ACPI: RISC-V: Implement function to reorder irqchip probe entries
    ACPI: RISC-V: Implement PCI related functionality
    ACPI: pci_link: Clear the dependencies after probe
    ACPI: bus: Add RINTC IRQ model for RISC-V
    ACPI: scan: Define weak function to populate dependencies
    ACPI: scan: Add RISC-V interrupt controllers to honor list
    ACPI: scan: Refactor dependency creation
    ACPI: bus: Add acpi_riscv_init() function
    ACPI: scan: Add a weak arch_sort_irqchip_probe() to order the IRQCHIP probe
    arm64: PCI: Migrate ACPI related functions to pci-acpi.c
| | * | | arm64: PCI: Migrate ACPI related functions to pci-acpi.c
          Sunil V L, 2024-08-27 (1 file, -191/+0)

  The functions defined in arm64 for ACPI support are required for RISC-V also. To avoid duplication, move these functions to a common location.

  Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
  Acked-by: Bjorn Helgaas <bhelgaas@google.com>
  Acked-by: Will Deacon <will@kernel.org>
  Tested-by: Björn Töpel <bjorn@rivosinc.com>
  Link: https://patch.msgid.link/20240812005929.113499-2-sunilvl@ventanamicro.com
  Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* | | | Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
        Linus Torvalds, 2024-09-16 (9 files, -84/+345)

  Pull arm64 updates from Will Deacon:
  "The highlights are support for Arm's "Permission Overlay Extension" using memory protection keys, support for running as a protected guest on Android as well as perf support for a bunch of new interconnect PMUs.

  Summary:

  ACPI:
  - Enable PMCG erratum workaround for HiSilicon HIP10 and 11 platforms.
  - Ensure arm64-specific IORT header is covered by MAINTAINERS.

  CPU Errata:
  - Enable workaround for hardware access/dirty issue on Ampere-1A cores.

  Memory management:
  - Define PHYSMEM_END to fix a crash in the amdgpu driver.
  - Avoid tripping over invalid kernel mappings on the kexec() path.
  - Userspace support for the Permission Overlay Extension (POE) using protection keys.

  Perf and PMUs:
  - Add support for the "fixed instruction counter" extension in the CPU PMU architecture.
  - Extend and fix the event encodings for Apple's M1 CPU PMU.
  - Allow LSM hooks to decide on SPE permissions for physical profiling.
  - Add support for the CMN S3 and NI-700 PMUs.

  Confidential Computing:
  - Add support for booting an arm64 kernel as a protected guest under Android's "Protected KVM" (pKVM) hypervisor.

  Selftests:
  - Fix vector length issues in the SVE/SME sigreturn tests
  - Fix build warning in the ptrace tests.

  Timers:
  - Add support for PR_{G,S}ET_TSC so that 'rr' can deal with non-determinism arising from the architected counter.

  Miscellaneous:
  - Rework our IPI-based CPU stopping code to try NMIs if regular IPIs don't succeed.
  - Minor fixes and cleanups"

  * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (94 commits)
    perf: arm-ni: Fix an NULL vs IS_ERR() bug
    arm64: hibernate: Fix warning for cast from restricted gfp_t
    arm64: esr: Define ESR_ELx_EC_* constants as UL
    arm64: pkeys: remove redundant WARN
    perf: arm_pmuv3: Use BR_RETIRED for HW branch event if enabled
    MAINTAINERS: List Arm interconnect PMUs as supported
    perf: Add driver for Arm NI-700 interconnect PMU
    dt-bindings/perf: Add Arm NI-700 PMU
    perf/arm-cmn: Improve format attr printing
    perf/arm-cmn: Clean up unnecessary NUMA_NO_NODE check
    arm64/mm: use lm_alias() with addresses passed to memblock_free()
    mm: arm64: document why pte is not advanced in contpte_ptep_set_access_flags()
    arm64: Expose the end of the linear map in PHYSMEM_END
    arm64: trans_pgd: mark PTEs entries as valid to avoid dead kexec()
    arm64/mm: Delete __init region from memblock.reserved
    perf/arm-cmn: Support CMN S3
    dt-bindings: perf: arm-cmn: Add CMN S3
    perf/arm-cmn: Refactor DTC PMU register access
    perf/arm-cmn: Make cycle counts less surprising
    perf/arm-cmn: Improve build-time assertion
    ...
| * | | Merge branch 'for-next/timers' into for-next/core
        Will Deacon, 2024-09-12 (2 files, -18/+75)

  * for-next/timers:
    arm64: Implement prctl(PR_{G,S}ET_TSC)
| | * | | arm64: Implement prctl(PR_{G,S}ET_TSC)
          Peter Collingbourne, 2024-08-27 (2 files, -18/+75)

  On arm64, this prctl controls access to CNTVCT_EL0, CNTVCTSS_EL0 and CNTFRQ_EL0 via CNTKCTL_EL1.EL0VCTEN. Since this bit is also used to implement various erratum workarounds, check whether the CPU needs a workaround whenever we potentially need to change it.

  This is needed for a correct implementation of non-instrumenting record-replay debugging on arm64 (i.e. rr; https://rr-project.org/). rr must trap and record any sources of non-determinism from the userspace program's perspective so it can be replayed later. This includes the results of syscalls as well as the results of access to architected timers exposed directly to the program. This prctl was originally added for x86 by commit 8fb402bccf20 ("generic, x86: add prctl commands PR_GET_TSC and PR_SET_TSC"), and rr uses it to trap RDTSC on x86 for the same reason.

  We also considered exposing this as a PTRACE_EVENT. However, prctl seems like a better choice for these reasons:

  1) In general an in-process control seems more useful than an out-of-process control, since anything that you would be able to do with ptrace could also be done with prctl (tracer can inject a call to the prctl and handle signal-delivery-stops), and it avoids needing an additional process (which will complicate debugging of the ptraced process since it cannot have more than one tracer, and will be incompatible with ptrace_scope=3) in cases where that is not otherwise necessary.

  2) Consistency with x86_64. Note that on x86_64, RDTSC has been there since the start, so it's the same situation as on arm64.

  Signed-off-by: Peter Collingbourne <pcc@google.com>
  Link: https://linux-review.googlesource.com/id/I233a1867d1ccebe2933a347552e7eae862344421
  Link: https://lore.kernel.org/r/20240824015415.488474-1-pcc@google.com
  Signed-off-by: Will Deacon <will@kernel.org>
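  A minimal userspace sketch of the new control, assuming a kernel with this patch (the prctl constants are the existing generic ones exposed via <sys/prctl.h>):

  | #include <stdio.h>
  | #include <sys/prctl.h>
  |
  | static unsigned long long read_cntvct(void)
  | {
  |         unsigned long long val;
  |
  |         asm volatile("mrs %0, cntvct_el0" : "=r" (val));
  |         return val;
  | }
  |
  | int main(void)
  | {
  |         int mode = 0;
  |
  |         prctl(PR_GET_TSC, &mode, 0, 0, 0);
  |         printf("counter mode %d, CNTVCT_EL0 %llu\n", mode, read_cntvct());
  |
  |         /* After this, a direct counter read like the one above raises SIGSEGV. */
  |         prctl(PR_SET_TSC, PR_TSC_SIGSEGV, 0, 0, 0);
  |         return 0;
  | }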
| * | | | Merge branch 'for-next/poe' into for-next/core
          Will Deacon, 2024-09-12 (6 files, -0/+162)

  * for-next/poe: (31 commits)
    arm64: pkeys: remove redundant WARN
    kselftest/arm64: Add test case for POR_EL0 signal frame records
    kselftest/arm64: parse POE_MAGIC in a signal frame
    kselftest/arm64: add HWCAP test for FEAT_S1POE
    selftests: mm: make protection_keys test work on arm64
    selftests: mm: move fpregs printing
    kselftest/arm64: move get_header()
    arm64: add Permission Overlay Extension Kconfig
    arm64: enable PKEY support for CPUs with S1POE
    arm64: enable POE and PIE to coexist
    arm64/ptrace: add support for FEAT_POE
    arm64: add POE signal support
    arm64: implement PKEYS support
    arm64: add pte_access_permitted_no_overlay()
    arm64: handle PKEY/POE faults
    arm64: mask out POIndex when modifying a PTE
    arm64: convert protection key into vm_flags and pgprot values
    arm64: add POIndex defines
    arm64: re-order MTE VM_ flags
    arm64: enable the Permission Overlay Extension for EL0
    ...
| | * | | | arm64/ptrace: add support for FEAT_POE (Joey Gouly, 2024-09-04; 1 file, -0/+46)

    Add a regset for POE containing POR_EL0.

    Signed-off-by: Joey Gouly <joey.gouly@arm.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Reviewed-by: Mark Brown <broonie@kernel.org>
    Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
    Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Link: https://lore.kernel.org/r/20240822151113.1479789-21-joey.gouly@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
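A debugger consumes such a regset through PTRACE_GETREGSET. The sketch below assumes the note type is exported to userspace as NT_ARM_POE and that the payload is a single 64-bit POR_EL0 value; the fallback constant is an assumption for headers that predate this series.

    /* Sketch: read a stopped tracee's POR_EL0 via the POE regset. */
    #include <stdint.h>
    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/uio.h>
    #include <linux/elf.h>

    #ifndef NT_ARM_POE
    #define NT_ARM_POE 0x40f      /* assumption: value taken from newer UAPI headers */
    #endif

    static long read_por_el0(pid_t pid, uint64_t *por)
    {
            struct iovec iov = {
                    .iov_base = por,
                    .iov_len  = sizeof(*por),
            };

            /* The tracee must already be attached and in a ptrace stop. */
            return ptrace(PTRACE_GETREGSET, pid, NT_ARM_POE, &iov);
    }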
| | * | | | arm64: add POE signal support (Joey Gouly, 2024-09-04; 1 file, -0/+62)

    Add PKEY support to signals, by saving and restoring POR_EL0 from the
    stackframe.

    Signed-off-by: Joey Gouly <joey.gouly@arm.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Reviewed-by: Mark Brown <broonie@kernel.org>
    Acked-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
    Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
    Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Link: https://lore.kernel.org/r/20240822151113.1479789-20-joey.gouly@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
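Userspace, for example the kselftests listed in the merge above, finds the new record by walking the context records in the delivered signal frame. A hedged sketch of that walk; struct poe_context and POE_MAGIC come from the arm64 UAPI <asm/sigcontext.h> that ships with this series, so their presence in your toolchain headers is an assumption.

    /* Sketch: locate the POR_EL0 record inside a delivered signal frame. */
    #include <signal.h>
    #include <stdint.h>
    #include <asm/sigcontext.h>   /* struct _aarch64_ctx, struct poe_context, POE_MAGIC */

    static uint64_t find_por_el0(struct sigcontext *sc)
    {
            struct _aarch64_ctx *head = (struct _aarch64_ctx *)sc->__reserved;

            /* Records sit back to back in __reserved[] and end at magic == 0. */
            while (head->magic) {
                    if (head->magic == POE_MAGIC)
                            return ((struct poe_context *)head)->por_el0;
                    head = (struct _aarch64_ctx *)((uint8_t *)head + head->size);
            }
            return 0;     /* no POE record: CPU or kernel without FEAT_S1POE */
    }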
| | * | | | arm64: handle PKEY/POE faults (Joey Gouly, 2024-09-04; 1 file, -0/+6)

    If a memory fault occurs that is due to an overlay/pkey fault, report that to
    userspace with a SEGV_PKUERR.

    Signed-off-by: Joey Gouly <joey.gouly@arm.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
    Link: https://lore.kernel.org/r/20240822151113.1479789-17-joey.gouly@arm.com
    [will: Add ESR.FSC check to data abort handler]
    Signed-off-by: Will Deacon <will@kernel.org>
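From a program's point of view these faults arrive as SIGSEGV with si_code set to SEGV_PKUERR and the offending key in si_pkey, matching the existing x86 pkeys behaviour. A minimal handler sketch; fprintf() is not async-signal-safe and is used purely for illustration, and si_pkey needs a reasonably recent glibc.

    /* Sketch: observe an overlay/pkey fault reported as SEGV_PKUERR. */
    #define _GNU_SOURCE
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    static void segv_handler(int sig, siginfo_t *info, void *ucontext)
    {
            if (info->si_code == SEGV_PKUERR)
                    fprintf(stderr, "pkey fault at %p, key %d\n",
                            info->si_addr, (int)info->si_pkey);
            _exit(1);     /* do not return into the faulting access */
    }

    int main(void)
    {
            struct sigaction sa = {
                    .sa_sigaction = segv_handler,
                    .sa_flags     = SA_SIGINFO,
            };

            sigaction(SIGSEGV, &sa, NULL);
            /* ... access memory protected by an overlay key here ... */
            return 0;
    }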
| | * | | | arm64: enable the Permission Overlay Extension for EL0 (Joey Gouly, 2024-09-04; 2 files, -0/+15)

    Expose a HWCAP and ID_AA64MMFR3_EL1_S1POE to userspace, so they can be used to
    check if the CPU supports the feature.

    Signed-off-by: Joey Gouly <joey.gouly@arm.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
    Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Link: https://lore.kernel.org/r/20240822151113.1479789-12-joey.gouly@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
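Userspace detection then follows the usual arm64 hwcap pattern; the HWCAP2_POE name below is the one added to the UAPI hwcap header by this series, so it assumes headers at least that new.

    /* Sketch: probe for FEAT_S1POE support exposed via hwcaps. */
    #include <stdio.h>
    #include <sys/auxv.h>
    #include <asm/hwcap.h>        /* HWCAP2_POE, assuming headers from this series */

    int main(void)
    {
            unsigned long hwcap2 = getauxval(AT_HWCAP2);

            printf("FEAT_S1POE: %s\n",
                   (hwcap2 & HWCAP2_POE) ? "supported" : "not supported");
            return 0;
    }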
| | * | | | arm64: context switch POR_EL0 register (Joey Gouly, 2024-09-04; 1 file, -0/+24)

    POR_EL0 is a register that can be modified by userspace directly, so it must
    be context switched.

    Signed-off-by: Joey Gouly <joey.gouly@arm.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
    Link: https://lore.kernel.org/r/20240822151113.1479789-7-joey.gouly@arm.com
    [will: Dropped unnecessary isb()s]
    Signed-off-by: Will Deacon <will@kernel.org>
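Conceptually the switch path only has to spill the outgoing task's value and install the incoming one, gated on the cpucap added later in this branch. The helper and field names below are illustrative rather than the exact ones in the patch.

    /* Illustrative kernel-side sketch only: save/restore an EL0-writable
     * register across a task switch. Names are hypothetical. */
    static void poe_thread_switch(struct task_struct *next)
    {
            if (!system_supports_poe())
                    return;

            current->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
            write_sysreg_s(next->thread.por_el0, SYS_POR_EL0);
    }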
| | * | | | arm64: cpufeature: add Permission Overlay Extension cpucap (Joey Gouly, 2024-09-04; 1 file, -0/+9)
| | |/ / /

    This indicates if the system supports POE. This is a CPUCAP_BOOT_CPU_FEATURE,
    as the boot CPU will enable POE if it has it, so secondary CPUs must also
    have this feature.

    Signed-off-by: Joey Gouly <joey.gouly@arm.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Acked-by: Catalin Marinas <catalin.marinas@arm.com>
    Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Link: https://lore.kernel.org/r/20240822151113.1479789-6-joey.gouly@arm.com
    Signed-off-by: Will Deacon <will@kernel.org>
| * | | | Merge branch 'for-next/misc' into for-next/core (Will Deacon, 2024-09-12; 3 files, -65/+99)
| |\ \ \ \

    * for-next/misc:
        arm64: hibernate: Fix warning for cast from restricted gfp_t
        arm64: esr: Define ESR_ELx_EC_* constants as UL
        arm64: Constify struct kobj_type
        arm64: smp: smp_send_stop() and crash_smp_send_stop() should try non-NMI first
        arm64/sve: Remove unused declaration read_smcr_features()
        arm64: mm: Remove unused declaration early_io_map()
        arm64: el2_setup.h: Rename some labels to be more diff-friendly
        arm64: signal: Fix some under-bracketed UAPI macros
        arm64/mm: Drop TCR_SMP_FLAGS
        arm64/mm: Drop PMD_SECT_VALID
| | * | | | arm64: hibernate: Fix warning for cast from restricted gfp_t (Min-Hua Chen, 2024-09-12; 1 file, -1/+1)

    This patch fixes the following warning by adding __force to the cast:

        arch/arm64/kernel/hibernate.c:410:44: sparse: warning: cast from restricted gfp_t

    No functional change intended.

    Signed-off-by: Min-Hua Chen <minhuadotchen@gmail.com>
    Link: https://lore.kernel.org/r/20240910232507.313555-1-minhuadotchen@gmail.com
    Signed-off-by: Will Deacon <will@kernel.org>
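The fix is the standard sparse idiom: gfp_t is a __bitwise type, so converting it to a plain pointer needs a __force annotation to tell sparse the cast is intentional. A generic before/after sketch, not the exact hibernate.c line:

    #include <linux/gfp.h>        /* kernel context: gfp_t, GFP_ATOMIC, __force */

    /* Before: sparse warns "cast from restricted gfp_t". */
    void *arg_before = (void *)GFP_ATOMIC;

    /* After: __force marks the conversion as intentional; no codegen change. */
    void *arg_after = (__force void *)GFP_ATOMIC;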
| | * | | | arm64: Constify struct kobj_type (Huang Xiaojia, 2024-08-27; 1 file, -1/+1)

    'struct kobj_type' is not modified. It is only used in kobject_init(), which
    takes a 'const struct kobj_type *ktype' parameter. Constifying this structure
    moves some data to a read-only section, increasing overall security.

    On an x86_64 host, compiled with the arm defconfig:

    Before:
    =======
       text    data     bss     dec     hex  filename
       5602     548     352    6502    1966  arch/arm64/kernel/cpuinfo.o

    After:
    ======
       text    data     bss     dec     hex  filename
       5650     500     352    6502    1966  arch/arm64/kernel/cpuinfo.o

    Signed-off-by: Huang Xiaojia <huangxiaojia2@huawei.com>
    Link: https://lore.kernel.org/r/20240826151250.3500302-1-huangxiaojia2@huawei.com
    Signed-off-by: Will Deacon <will@kernel.org>
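The pattern being applied treewide looks like this; the structure can be const because kobject_init() already takes a 'const struct kobj_type *'. The names below are illustrative, not the ones in cpuinfo.c.

    #include <linux/kobject.h>

    /* Read-only after build: the ktype now lands in .rodata. */
    static const struct kobj_type example_ktype = {
            .sysfs_ops = &kobj_sysfs_ops,
    };

    static int example_add(struct kobject *kobj, struct kobject *parent)
    {
            kobject_init(kobj, &example_ktype);
            return kobject_add(kobj, parent, "example");
    }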
| | * | | | arm64: smp: smp_send_stop() and crash_smp_send_stop() should try non-NMI first (Douglas Anderson, 2024-08-23; 1 file, -63/+97)
| | |/ / /

    When testing hard lockup handling on my sc7180-trogdor-lazor device with
    pseudo-NMI enabled, with serial console enabled and with kgdb disabled, I
    found that the stack crawls printed to the serial console ended up as a
    jumbled mess. After rebooting, the pstore-based console looked fine though.
    Also, enabling kgdb to trap the panic made the console look fine and avoided
    the mess.

    After a bit of tracking down, I came to the conclusion that this was what was
    happening:

    1. The panic path was stopping all other CPUs with
       panic_other_cpus_shutdown().

    2. At least one of those other CPUs was in the middle of printing to the
       serial console and holding the console port's lock, which is grabbed with
       "irqsave". ...but since we were stopping with an NMI we didn't care about
       the "irqsave" and interrupted anyway.

    3. Since we stopped the CPU while it was holding the lock, it would never
       release it.

    4. All future calls to output to the console would end up failing to get the
       lock in qcom_geni_serial_console_write().

    This isn't _totally_ unexpected at panic time but it's a code path that's not
    well tested, hard to get right, and apparently doesn't work terribly well on
    the Qualcomm geni serial driver. The Qualcomm geni serial driver was fixed to
    be a bit better in commit 9e957a155005 ("serial: qcom-geni: Don't cancel/abort
    if we can't get the port lock"), but it's nice not to get into this situation
    in the first place.

    Taking a page from what x86 appears to do in native_stop_other_cpus(), do
    this:

    1. First, try to stop other CPUs with a normal IPI and wait a second. This
       gives them a chance to leave critical sections.

    2. If CPUs fail to stop, then retry with an NMI, but give a much lower
       timeout since there's no good reason for a CPU not to react quickly to an
       NMI.

    This works well and avoids the corrupted console and (presumably) could help
    avoid other similar issues.

    In order to do this, we need to do a little re-organization of our IPIs since
    we don't have any more free IDs. Do what was suggested in previous
    conversations and combine "stop" and "crash stop". That frees up an IPI, so
    now we can have a "stop" and a "stop NMI".

    In order to do this we also need a slight change in the way we keep track of
    which CPUs still need to be stopped. We need to know specifically which CPUs
    haven't stopped yet when we fall back to NMI, but in the "crash stop" case
    the "cpu_online_mask" isn't updated as CPUs go down. This is why that code
    path had an atomic count of the number of CPUs left. Solve this by also
    updating the "cpu_online_mask" for crash stops.

    All of the above lets us combine the logic for the "stop" and "crash stop"
    code, which appeared to have a bunch of arbitrary implementation differences.

    Aside from the above change where we try a normal IPI and then an NMI, the
    combined function has a few subtle differences:

    * In the normal smp_send_stop(), if we fail to stop one or more CPUs then we
      won't include the current CPU (the one running smp_send_stop()) in the
      error message.

    * In crash_smp_send_stop(), if we fail to stop some CPUs we'll print the CPUs
      that we failed to stop instead of printing all _but_ the current running
      CPU.

    * In crash_smp_send_stop(), we will now only print "SMP: stopping secondary
      CPUs" if (system_state <= SYSTEM_RUNNING).

    Fixes: d7402513c935 ("arm64: smp: IPI_CPU_STOP and IPI_CPU_CRASH_STOP should try for NMI")
    Signed-off-by: Douglas Anderson <dianders@chromium.org>
    Link: https://lore.kernel.org/r/20240821145353.v3.1.Id4817adef610302554b8aa42b090d57270dc119c@changeid
    Signed-off-by: Will Deacon <will@kernel.org>
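The resulting flow is easier to see as a sketch than in the diffstat. Below is a hedged outline of the two-phase stop described above; function names, helpers and timeouts are illustrative, not the exact ones in arch/arm64/kernel/smp.c.

    /* Illustrative sketch: regular IPI first, NMI only for stragglers, so CPUs
     * get a chance to drop locks such as the serial console port lock. */
    static void stop_other_cpus_two_phase(void)
    {
            unsigned long timeout_us;

            send_stop_ipi_to_others(false);              /* phase 1: normal IPI */
            timeout_us = USEC_PER_SEC;                   /* generous: ~1 second */
            while (num_other_online_cpus() && timeout_us--)
                    udelay(1);

            if (num_other_online_cpus()) {
                    send_stop_ipi_to_others(true);       /* phase 2: NMI/pseudo-NMI */
                    timeout_us = USEC_PER_SEC / 100;     /* short: NMIs should land fast */
                    while (num_other_online_cpus() && timeout_us--)
                            udelay(1);
            }

            if (num_other_online_cpus())
                    pr_warn("SMP: failed to stop %u CPUs\n",
                            num_other_online_cpus());
    }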
| * / / / arm64: errata: Enable the AC03_CPU_38 workaround for ampere1a (D Scott Phillips, 2024-08-30; 1 file, -1/+9)
| |/ / /

    The ampere1a cpu is affected by erratum AC04_CPU_10, which is the same bug as
    AC03_CPU_38. Add ampere1a to the AC03_CPU_38 workaround midr list.

    Cc: <stable@vger.kernel.org>
    Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
    Acked-by: Oliver Upton <oliver.upton@linux.dev>
    Link: https://lore.kernel.org/r/20240827211701.2216719-1-scott@os.amperecomputing.com
    Signed-off-by: Will Deacon <will@kernel.org>
* | | | Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux (Linus Torvalds, 2024-09-06; 1 file, -1/+3)
|\ \ \ \

    Pull arm64 fix from Catalin Marinas:

     "Fix the arm64 usage of ftrace_graph_ret_addr() to pass the
      &state->graph_idx pointer instead of NULL, otherwise this function just
      returns early"

    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: stacktrace: fix the usage of ftrace_graph_ret_addr()
| * | | arm64: stacktrace: fix the usage of ftrace_graph_ret_addr() (Puranjay Mohan, 2024-09-05; 1 file, -1/+3)

    ftrace_graph_ret_addr() takes an 'idx' integer pointer that is used to
    optimize the stack unwinding process. arm64 currently passes NULL for this
    parameter, which stops it from utilizing these optimizations. Further, the
    current code for ftrace_graph_ret_addr() will just return the passed-in
    return address when given a NULL idx pointer, which breaks this usage.

    Pass a valid integer pointer to ftrace_graph_ret_addr(), similar to x86_64's
    stack unwinder.

    Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
    Fixes: 29c1c24a2707 ("function_graph: Fix up ftrace_graph_ret_addr()")
    Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Acked-by: Will Deacon <will@kernel.org>
    Reviewed-by: Mark Rutland <mark.rutland@arm.com>
    Tested-by: Mark Rutland <mark.rutland@arm.com>
    Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    Link: https://lore.kernel.org/r/20240618162342.28275-1-puranjay@kernel.org
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
* | | Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux (Linus Torvalds, 2024-08-17; 3 files, -4/+3)
|\| | |

    Pull arm64 fixes from Catalin Marinas:

     - Fix the arm64 __get_mem_asm() to use the _ASM_EXTABLE_##type##ACCESS()
       macro instead of the *_ERR() one in order to avoid writing -EFAULT to the
       value register in case of a fault

     - Initialise all elements of the acpi_early_node_map[] to NUMA_NO_NODE.
       Prior to this fix, only the first element was initialised

     - Move the KASAN random tag seed initialisation after the per-CPU areas
       have been initialised (prng_state is __percpu)

    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: Fix KASAN random tag seed initialization
        arm64: ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE
        arm64: uaccess: correct thinko in __get_mem_asm()
| * | arm64: Fix KASAN random tag seed initialization (Samuel Holland, 2024-08-15; 2 files, -3/+2)

    Currently, kasan_init_sw_tags() is called before setup_per_cpu_areas(), so
    per_cpu(prng_state, cpu) accesses the same address regardless of the value of
    "cpu", and the same seed value gets copied to the percpu area for every CPU.
    Fix this by moving the call to smp_prepare_boot_cpu(), which is the first
    architecture hook after setup_per_cpu_areas().

    Fixes: 3c9e3aa11094 ("kasan: add tag related helper functions")
    Fixes: 3f41b6093823 ("kasan: fix random seed generation for tag-based mode")
    Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
    Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
    Link: https://lore.kernel.org/r/20240814091005.969756-1-samuel.holland@sifive.com
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * | arm64: ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE (Haibo Xu, 2024-08-14; 1 file, -1/+1)

    Currently, only acpi_early_node_map[0] is initialized to NUMA_NO_NODE. To
    ensure all the values are properly initialized, switch to initializing all of
    them to NUMA_NO_NODE.

    Fixes: e18962491696 ("arm64: numa: rework ACPI NUMA initialization")
    Cc: <stable@vger.kernel.org> # 4.19.x
    Reported-by: Andrew Jones <ajones@ventanamicro.com>
    Suggested-by: Andrew Jones <ajones@ventanamicro.com>
    Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
    Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
    Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
    Acked-by: Catalin Marinas <catalin.marinas@arm.com>
    Acked-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
    Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
    Link: https://lore.kernel.org/r/853d7f74aa243f6f5999e203246f0d1ae92d2b61.1722828421.git.haibo1.xu@intel.com
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
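The underlying C pitfall is that a partial initializer zero-fills the rest of the array, and node 0 is a valid node ID. A sketch of the before/after using the usual designated range initialiser; treat it as illustrative of the fix rather than the literal diff.

    /* Before: only element 0 is NUMA_NO_NODE; the rest default to 0,
     * i.e. they silently alias node 0. */
    static int acpi_early_node_map[NR_CPUS] __initdata = { NUMA_NO_NODE };

    /* After (sketch of the fix): cover every element explicitly. */
    static int acpi_early_node_map[NR_CPUS] __initdata = {
            [0 ... NR_CPUS - 1] = NUMA_NO_NODE
    };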
* | | Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux (Linus Torvalds, 2024-08-02; 2 files, -3/+19)
|\| |

    Pull arm64 fixes from Catalin Marinas:

     - Expand the speculative SSBS errata workaround to more CPUs

     - Ensure jump label changes are visible to all CPUs with a
       kick_all_cpus_sync() (and also enable jump label batching as part of the
       fix)

     - The shadow call stack sanitiser is currently incompatible with Rust, so
       make CONFIG_RUST conditional on !CONFIG_SHADOW_CALL_STACK

    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: jump_label: Ensure patched jump_labels are visible to all CPUs
        rust: SHADOW_CALL_STACK is incompatible with Rust
        arm64: errata: Expand speculative SSBS workaround (again)
        arm64: cputype: Add Cortex-A725 definitions
        arm64: cputype: Add Cortex-X1C definitions
| * | arm64: jump_label: Ensure patched jump_labels are visible to all CPUs (Will Deacon, 2024-08-02; 1 file, -2/+9)

    Although the Arm architecture permits concurrent modification and execution
    of NOP and branch instructions, it still requires some synchronisation to
    ensure that other CPUs consistently execute the newly written instruction:

     > When the modified instructions are observable, each PE that is executing
     > the modified instructions must execute an ISB or perform a context
     > synchronizing event to ensure execution of the modified instructions

    Prior to commit f6cc0c501649 ("arm64: Avoid calling stop_machine() when
    patching jump labels"), the arm64 jump_label patching machinery performed
    synchronisation using stop_machine() after each modification, however this
    was problematic when flipping static keys from atomic contexts (namely, the
    arm_arch_timer CPU hotplug startup notifier) and so we switched to the
    _nosync() patching routines to avoid "scheduling while atomic" BUG()s during
    boot.

    In hindsight, the analysis of the issue in f6cc0c501649 isn't quite right: it
    cites the use of IPIs in the default patching routines as the cause of the
    lockup, whereas stop_machine() does not rely on IPIs and the I-cache
    invalidation is performed using __flush_icache_range(), which elides the call
    to kick_all_cpus_sync(). In fact, the blocking wait for other CPUs is what
    triggers the BUG() and the problem remains even after f6cc0c501649, for
    example because we could block on the jump_label_mutex. Eventually, the
    arm_arch_timer driver was fixed to avoid the static key entirely in commit
    a862fc2254bd ("clocksource/arm_arch_timer: Remove use of workaround static
    key").

    This all leaves the jump_label patching code in a funny situation on arm64 as
    we do not synchronise with other CPUs to reduce the likelihood of a bug which
    no longer exists. Consequently, toggling a static key on one CPU cannot be
    assumed to take effect on other CPUs, leading to potential issues, for
    example with missing preempt notifiers.

    Rather than revert f6cc0c501649 and go back to stop_machine() for each patch
    site, implement arch_jump_label_transform_apply() and kick all the other CPUs
    with an IPI at the end of patching.

    Cc: Alexander Potapenko <glider@google.com>
    Cc: Mark Rutland <mark.rutland@arm.com>
    Cc: Marc Zyngier <maz@kernel.org>
    Fixes: f6cc0c501649 ("arm64: Avoid calling stop_machine() when patching jump labels")
    Signed-off-by: Will Deacon <will@kernel.org>
    Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
    Reviewed-by: Marc Zyngier <maz@kernel.org>
    Link: https://lore.kernel.org/r/20240731133601.3073-1-will@kernel.org
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
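With batching, each site is still patched with the _nosync() routine, and a single IPI broadcast at the end makes the new instructions visible everywhere. A sketch of the resulting arch hooks, close to but not guaranteed to match the patch:

    /* Sketch: queue patches without per-site synchronisation, then issue one
     * kick_all_cpus_sync() so every CPU context-synchronises on the new code. */
    bool arch_jump_label_transform_queue(struct jump_entry *entry,
                                         enum jump_label_type type)
    {
            void *addr = (void *)jump_entry_code(entry);
            u32 insn;

            if (type == JUMP_LABEL_JMP)
                    insn = aarch64_insn_gen_branch_imm(jump_entry_code(entry),
                                                       jump_entry_target(entry),
                                                       AARCH64_INSN_BRANCH_NOLINK);
            else
                    insn = aarch64_insn_gen_nop();

            aarch64_insn_patch_text_nosync(addr, insn);
            return true;            /* always queued; batching handled by _apply() */
    }

    void arch_jump_label_transform_apply(void)
    {
            kick_all_cpus_sync();   /* IPI forces a context synchronization event */
    }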
| * | arm64: errata: Expand speculative SSBS workaround (again) (Mark Rutland, 2024-08-01; 1 file, -1/+10)
| |/

    A number of Arm Ltd CPUs suffer from errata whereby an MSR to the SSBS
    special-purpose register does not affect subsequent speculative instructions,
    permitting speculative store bypassing for a window of time.

    We worked around this for a number of CPUs in commits:

    * 7187bb7d0b5c7dfa ("arm64: errata: Add workaround for Arm errata 3194386 and 3312417")
    * 75b3c43eab594bfb ("arm64: errata: Expand speculative SSBS workaround")

    Since then, similar errata have been published for a number of other Arm Ltd
    CPUs, for which the same mitigation is sufficient. This is described in their
    respective Software Developer Errata Notice (SDEN) documents:

    * Cortex-A76  (MP052) SDEN v31.0, erratum 3324349
      https://developer.arm.com/documentation/SDEN-885749/3100/
    * Cortex-A77  (MP074) SDEN v19.0, erratum 3324348
      https://developer.arm.com/documentation/SDEN-1152370/1900/
    * Cortex-A78  (MP102) SDEN v21.0, erratum 3324344
      https://developer.arm.com/documentation/SDEN-1401784/2100/
    * Cortex-A78C (MP138) SDEN v16.0, erratum 3324346
      https://developer.arm.com/documentation/SDEN-1707916/1600/
    * Cortex-A78C (MP154) SDEN v10.0, erratum 3324347
      https://developer.arm.com/documentation/SDEN-2004089/1000/
    * Cortex-A725 (MP190) SDEN v5.0, erratum 3456106
      https://developer.arm.com/documentation/SDEN-2832921/0500/
    * Cortex-X1   (MP077) SDEN v21.0, erratum 3324344
      https://developer.arm.com/documentation/SDEN-1401782/2100/
    * Cortex-X1C  (MP136) SDEN v16.0, erratum 3324346
      https://developer.arm.com/documentation/SDEN-1707914/1600/
    * Neoverse-N1 (MP050) SDEN v32.0, erratum 3324349
      https://developer.arm.com/documentation/SDEN-885747/3200/
    * Neoverse-V1 (MP076) SDEN v19.0, erratum 3324341
      https://developer.arm.com/documentation/SDEN-1401781/1900/

    Note that due to the manner in which Arm develops IP and tracks errata, some
    CPUs share a common erratum number and some CPUs have multiple erratum
    numbers for the same HW issue.

    On parts without SB, it is necessary to use ISB for the workaround. The
    spec_bar() macro used in the mitigation will expand to a "DSB SY; ISB"
    sequence in this case, which is sufficient on all affected parts.

    Enable the existing mitigation by adding the relevant MIDRs to
    erratum_spec_ssbs_list. The list is sorted alphanumerically (involving moving
    Neoverse-V3 after Neoverse-V2) so that this is easy to audit and potentially
    extend again in future. The Kconfig text is also updated to clarify the set
    of affected parts and the mitigation.

    Signed-off-by: Mark Rutland <mark.rutland@arm.com>
    Cc: James Morse <james.morse@arm.com>
    Cc: Will Deacon <will@kernel.org>
    Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
    Acked-by: Will Deacon <will@kernel.org>
    Link: https://lore.kernel.org/r/20240801101803.1982459-4-mark.rutland@arm.com
    Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
* / syscalls: fix syscall macros for newfstat/newfstatat (Arnd Bergmann, 2024-08-02; 1 file, -1/+1)
|/

    The __NR_newfstat and __NR_newfstatat macros accidentally got renamed in the
    conversion to the syscall.tbl format, dropping the 'new' portion of the name.

    In an unrelated change, the two syscalls are no longer architecture specific
    but are once more defined on all 64-bit architectures, so the 'newstat' ABI
    keyword can be dropped from the table as a simplification.

    Fixes: 4fe53bf2ba0a ("syscalls: add generic scripts/syscall.tbl")
    Closes: https://lore.kernel.org/lkml/838053e0-b186-4e9f-9668-9a3384a71f23@app.fastmail.com/T/#t
    Reported-by: Florian Weimer <fweimer@redhat.com>
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
* Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux (Linus Torvalds, 2024-07-26; 2 files, -2/+2)
|\

    Pull arm64 fixes from Will Deacon:

     "The usual summary below, but the main fix is for the fast GUP lockless
      page-table walk when we have a combination of compile-time and run-time
      folding of the p4d and the pud respectively.

       - Remove some redundant Kconfig conditionals

       - Fix string output in ptrace selftest

       - Fix fast GUP crashes in some page-table configurations

       - Remove obsolete linker option when building the vDSO

       - Fix some sysreg field definitions for the GIC"

    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: mm: Fix lockless walks with static and dynamic page-table folding
        arm64/sysreg: Correct the values for GICv4.1
        arm64/vdso: Remove --hash-style=sysv
        kselftest: missing arg in ptrace.c
        arm64/Kconfig: Remove redundant 'if HAVE_FUNCTION_GRAPH_TRACER'
        arm64: remove redundant 'if HAVE_ARCH_KASAN' in Kconfig
| * arm64/vdso: Remove --hash-style=sysv (Fangrui Song, 2024-07-23; 2 files, -2/+2)

    glibc added support for .gnu.hash in 2006, and .hash has been obsolete for
    more than a decade in many Linux distributions. Using --hash-style=sysv might
    imply unaddressed issues and confuse readers. Just drop the option and rely
    on the linker default, which is likely "both", or "gnu" when the distribution
    really wants to eliminate sysv hash overhead.

    Similar to commit 6b7e26547fad ("x86/vdso: Emit a GNU hash").

    Signed-off-by: Fangrui Song <maskray@google.com>
    Link: https://lore.kernel.org/r/20240718173423.1574395-1-maskray@google.com
    Signed-off-by: Will Deacon <will@kernel.org>
* | sysctl: treewide: constify the ctl_table argument of proc_handlers (Joel Granados, 2024-07-24; 2 files, -2/+2)

    const qualify the struct ctl_table argument in the proc_handler function
    signatures. This is a prerequisite to moving the static ctl_table structs
    into .rodata, which will ensure that proc_handler function pointers cannot be
    modified.

    This patch has been generated by the following coccinelle script:

    ```
    virtual patch

    @r1@
    identifier ctl, write, buffer, lenp, ppos;
    identifier func !~ "appldata_(timer|interval)_handler|sched_(rt|rr)_handler|rds_tcp_skbuf_handler|proc_sctp_do_(hmac_alg|rto_min|rto_max|udp_port|alpha_beta|auth|probe_interval)";
    @@

    int func(
    - struct ctl_table *ctl
    + const struct ctl_table *ctl
      ,int write, void *buffer, size_t *lenp, loff_t *ppos);

    @r2@
    identifier func, ctl, write, buffer, lenp, ppos;
    @@

    int func(
    - struct ctl_table *ctl
    + const struct ctl_table *ctl
      ,int write, void *buffer, size_t *lenp, loff_t *ppos)
    { ... }

    @r3@
    identifier func;
    @@

    int func(
    - struct ctl_table *
    + const struct ctl_table *
      ,int , void *, size_t *, loff_t *);

    @r4@
    identifier func, ctl;
    @@

    int func(
    - struct ctl_table *ctl
    + const struct ctl_table *ctl
      ,int , void *, size_t *, loff_t *);

    @r5@
    identifier func, write, buffer, lenp, ppos;
    @@

    int func(
    - struct ctl_table *
    + const struct ctl_table *
      ,int write, void *buffer, size_t *lenp, loff_t *ppos);
    ```

    * Code formatting was adjusted in xfs_sysctl.c to comply with code
      conventions. The xfs_stats_clear_proc_handler, xfs_panic_mask_proc_handler
      and xfs_deprecated_dointvec_minmax were adjusted.

    * The ctl_table argument in proc_watchdog_common was const qualified. This is
      called from a proc_handler itself and is calling back into another
      proc_handler, making it necessary to change it as part of the proc_handler
      migration.

    Co-developed-by: Thomas Weißschuh <linux@weissschuh.net>
    Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
    Co-developed-by: Joel Granados <j.granados@samsung.com>
    Signed-off-by: Joel Granados <j.granados@samsung.com>
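After the conversion, a handler's signature (and any core helper it forwards to) takes a const table pointer: the data the table points at may still be written, but the ctl_table itself may not. A small sketch with a hypothetical handler name:

    #include <linux/sysctl.h>

    /* Sketch: a proc_handler after the treewide constification. */
    static int example_handler(const struct ctl_table *table, int write,
                               void *buffer, size_t *lenp, loff_t *ppos)
    {
            /* table->data, table->maxlen etc. may be read, but the ctl_table
             * itself is const and can eventually live in .rodata. */
            return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
    }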
* | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm (Linus Torvalds, 2024-07-20; 3 files, -8/+5)
|\ \

    Pull kvm updates from Paolo Bonzini:

     "ARM:

       - Initial infrastructure for shadow stage-2 MMUs, as part of nested
         virtualization enablement

       - Support for userspace changes to the guest CTR_EL0 value, enabling (in
         part) migration of VMs between heterogenous hardware

       - Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of
         the protocol

       - FPSIMD/SVE support for nested, including merged trap configuration and
         exception routing

       - New command-line parameter to control the WFx trap behavior under KVM

       - Introduce kCFI hardening in the EL2 hypervisor

       - Fixes + cleanups for handling presence/absence of FEAT_TCRX

       - Miscellaneous fixes + documentation updates

      LoongArch:

       - Add paravirt steal time support

       - Add support for KVM_DIRTY_LOG_INITIALLY_SET

       - Add perf kvm-stat support for loongarch

      RISC-V:

       - Redirect AMO load/store access fault traps to guest

       - perf kvm stat support

       - Use guest files for IMSIC virtualization, when available

      s390:

       - Assortment of tiny fixes which are not time critical

      x86:

       - Fixes for Xen emulation

       - Add a global struct to consolidate tracking of host values, e.g. EFER

       - Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective
         APIC bus frequency, because TDX

       - Print the name of the APICv/AVIC inhibits in the relevant tracepoint

       - Clean up KVM's handling of vendor specific emulation to consistently
         act on "compatible with Intel/AMD", versus checking for a specific
         vendor

       - Drop MTRR virtualization, and instead always honor guest PAT on CPUs
         that support self-snoop

       - Update to the newfangled Intel CPU FMS infrastructure

       - Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as it
         reads '0' and writes from userspace are ignored

       - Misc cleanups

      x86 - MMU:

       - Small cleanups, renames and refactoring extracted from the upcoming
         Intel TDX support

       - Don't allocate kvm_mmu_page.shadowed_translation for shadow pages that
         can't hold leaf SPTEs

       - Unconditionally drop mmu_lock when allocating TDP MMU page tables for
         eager page splitting, to avoid stalling vCPUs when splitting huge pages

       - Bug the VM instead of simply warning if KVM tries to split a SPTE that
         is non-present or not-huge. KVM is guaranteed to end up in a broken
         state because the callers fully expect a valid SPTE, so it's dangerous
         to let more MMU changes happen afterwards

      x86 - AMD:

       - Make per-CPU save_area allocations NUMA-aware

       - Force sev_es_host_save_area() to be inlined to avoid calling into an
         instrumentable function from noinstr code

       - Base support for running SEV-SNP guests. API-wise, this includes a new
         KVM_X86_SNP_VM type, encrypting/measuring the initial image into guest
         memory, and finalizing it before launching it. Internally, there are
         some gmem/mmu hooks needed to prepare gmem-allocated pages before
         mapping them into guest private memory ranges

         This includes basic support for attestation guest requests, enough to
         say that KVM supports the GHCB 2.0 specification

         There is no support yet for loading into the firmware those signing
         keys to be used for attestation requests, and therefore no need yet for
         the host to provide certificate data for those keys.

         To support fetching certificate data from userspace, a new KVM exit
         type will be needed to handle fetching the certificate from userspace.
         An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS exit
         type to handle this was introduced in v1 of this patchset, but is still
         being discussed by community, so for now this patchset only implements
         a stub version of SNP Extended Guest Requests that does not provide
         certificate data

      x86 - Intel:

       - Remove an unnecessary EPT TLB flush when enabling hardware

       - Fix a series of bugs that cause KVM to fail to detect nested pending
         posted interrupts as valid wake events for a vCPU executing HLT in L2
         (with HLT-exiting disabled by L1)

       - KVM: x86: Suppress MMIO that is triggered during task switch emulation

         Explicitly suppress userspace emulated MMIO exits that are triggered
         when emulating a task switch as KVM doesn't support userspace MMIO
         during complex (multi-step) emulation

         Silently ignoring the exit request can result in the
         WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace for
         some other reason prior to purging mmio_needed

         See commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write exits
         if emulator detects exception") for more details on KVM's limitations
         with respect to emulated MMIO during complex emulator flows

      Generic:

       - Rename the AS_UNMOVABLE flag that was introduced for KVM to
         AS_INACCESSIBLE, because the special casing needed by these pages is
         not due to just unmovability (and in fact they are only unmovable
         because the CPU cannot access them)

       - New ioctl to populate the KVM page tables in advance, which is useful
         to mitigate KVM page faults during guest boot or after live migration.
         The code will also be used by TDX, but (probably) not through the ioctl

       - Enable halt poll shrinking by default, as Intel found it to be a clear
         win

       - Setup empty IRQ routing when creating a VM to avoid having to
         synchronize SRCU when creating a split IRQCHIP on x86

       - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a
         flag that arch code can use for hooking both sched_in() and sched_out()

       - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
         truncating a bogus value from userspace, e.g. to help userspace detect
         bugs

       - Mark a vCPU as preempted if and only if it's scheduled out while in the
         KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest
         memory when retrieving guest state during live migration blackout

      Selftests:

       - Remove dead code in the memslot modification stress test

       - Treat "branch instructions retired" as supported on all AMD Family 17h+
         CPUs

       - Print the guest pseudo-RNG seed only when it changes, to avoid spamming
         the log for tests that create lots of VMs

       - Make the PMU counters test less flaky when counting LLC cache misses by
         doing CLFLUSH{OPT} in every loop iteration"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits)
        crypto: ccp: Add the SNP_VLEK_LOAD command
        KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops
        KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops
        KVM: x86: Replace static_call_cond() with static_call()
        KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event
        x86/sev: Move sev_guest.h into common SEV header
        KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event
        KVM: x86: Suppress MMIO that is triggered during task switch emulation
        KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro
        KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE
        KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY
        KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()
        KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level
        KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault"
        KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler
        KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory
        KVM: Document KVM_PRE_FAULT_MEMORY ioctl
        mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE
        perf kvm: Add kvm-stat for loongarch64
        LoongArch: KVM: Add PV steal time support in guest side
        ...