summaryrefslogtreecommitdiffstats
path: root/arch/x86/kvm/svm (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'objtool-core-2021-10-31' of ↵Linus Torvalds2021-11-012-6/+6
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Thomas Gleixner: - Improve retpoline code patching by separating it from alternatives which reduces memory footprint and allows to do better optimizations in the actual runtime patching. - Add proper retpoline support for x86/BPF - Address noinstr warnings in x86/kvm, lockdep and paravirtualization code - Add support to handle pv_opsindirect calls in the noinstr analysis - Classify symbols upfront and cache the result to avoid redundant str*cmp() invocations. - Add a CFI hash to reduce memory consumption which also reduces runtime on a allyesconfig by ~50% - Adjust XEN code to make objtool handling more robust and as a side effect to prevent text fragmentation due to placement of the hypercall page. * tag 'objtool-core-2021-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) bpf,x86: Respect X86_FEATURE_RETPOLINE* bpf,x86: Simplify computing label offsets x86,bugs: Unconditionally allow spectre_v2=retpoline,amd x86/alternative: Add debug prints to apply_retpolines() x86/alternative: Try inline spectre_v2=retpoline,amd x86/alternative: Handle Jcc __x86_indirect_thunk_\reg x86/alternative: Implement .retpoline_sites support x86/retpoline: Create a retpoline thunk array x86/retpoline: Move the retpoline thunk declarations to nospec-branch.h x86/asm: Fixup odd GEN-for-each-reg.h usage x86/asm: Fix register order x86/retpoline: Remove unused replacement symbols objtool,x86: Replace alternatives with .retpoline_sites objtool: Shrink struct instruction objtool: Explicitly avoid self modifying code in .altinstr_replacement objtool: Classify symbols objtool: Support pv_opsindirect calls for noinstr x86/xen: Rework the xen_{cpu,irq,mmu}_opsarrays x86/xen: Mark xen_force_evtchn_callback() noinstr x86/xen: Make irq_disable() noinstr ...
| * Merge branch 'objtool/urgent'Peter Zijlstra2021-10-074-98/+144
| |\ | | | | | | | | | | | | | | | | | | Fixup conflicts. # Conflicts: # tools/objtool/check.c
| * | x86/kvm: Always inline to_svm()Peter Zijlstra2021-09-151-1/+1
| | | | | | | | | | | | | | | | | | | | | vmlinux.o: warning: objtool: svm_vcpu_enter_exit()+0x13: call to to_svm() leaves .noinstr.text section Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210624095148.066347165@infradead.org
| * | x86/kvm: Always inline vmload() / vmsave()Peter Zijlstra2021-09-151-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | vmlinux.o: warning: objtool: svm_vcpu_enter_exit()+0xea: call to vmload() leaves .noinstr.text section vmlinux.o: warning: objtool: svm_vcpu_enter_exit()+0x133: call to vmsave() leaves .noinstr.text section Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210624095147.942250748@infradead.org
| * | x86/kvm: Always inline sev_*guest()Peter Zijlstra2021-09-151-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | vmlinux.o: warning: objtool: svm_vcpu_enter_exit()+0x4d: call to sev_es_guest() leaves .noinstr.text section vmlinux.o: warning: objtool: svm_vcpu_enter_exit()+0x50: call to sev_guest() leaves .noinstr.text section Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210624095147.880513802@infradead.org
* | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2021-10-311-3/+12
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull kvm fixes from Paolo Bonzini: - Fixes for s390 interrupt delivery - Fixes for Xen emulator bugs showing up as debug kernel WARNs - Fix another issue with SEV/ES string I/O VMGEXITs * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Take srcu lock in post_kvm_run_save() KVM: SEV-ES: fix another issue with string I/O VMGEXITs KVM: x86/xen: Fix kvm_xen_has_interrupt() sleeping in kvm_vcpu_block() KVM: x86: switch pvclock_gtod_sync_lock to a raw spinlock KVM: s390: preserve deliverable_mask in __airqs_kick_single_vcpu KVM: s390: clear kicked_mask before sleeping again
| * | | KVM: SEV-ES: fix another issue with string I/O VMGEXITsPaolo Bonzini2021-10-271-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the guest requests string I/O from the hypervisor via VMGEXIT, SW_EXITINFO2 will contain the REP count. However, sev_es_string_io was incorrectly treating it as the size of the GHCB buffer in bytes. This fixes the "outsw" test in the experimental SEV tests of kvm-unit-tests. Cc: stable@vger.kernel.org Fixes: 7ed9abfe8e9f ("KVM: SVM: Support string IO operations for an SEV-ES guest") Reported-by: Marc Orr <marcorr@google.com> Tested-by: Marc Orr <marcorr@google.com> Reviewed-by: Marc Orr <marcorr@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | | KVM: SEV: Flush cache on non-coherent systems before RECEIVE_UPDATE_DATAMasahiro Kozuka2021-10-211-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Flush the destination page before invoking RECEIVE_UPDATE_DATA, as the PSP encrypts the data with the guest's key when writing to guest memory. If the target memory was not previously encrypted, the cache may contain dirty, unecrypted data that will persist on non-coherent systems. Fixes: 15fb7de1a7f5 ("KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command") Cc: stable@vger.kernel.org Cc: Peter Gonda <pgonda@google.com> Cc: Marc Orr <marcorr@google.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Masahiro Kozuka <masa.koz@kozuka.jp> [sean: converted bug report to changelog] Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210914210951.2994260-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | | KVM: SEV-ES: reduce ghcb_sa_len to 32 bitsPaolo Bonzini2021-10-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The size of the GHCB scratch area is limited to 16 KiB (GHCB_SCRATCH_AREA_LIMIT), so there is no need for it to be a u64. This fixes a build error on 32-bit systems: i686-linux-gnu-ld: arch/x86/kvm/svm/sev.o: in function `sev_es_string_io: sev.c:(.text+0x110f): undefined reference to `__udivdi3' Cc: stable@vger.kernel.org Fixes: 019057bd73d1 ("KVM: SEV-ES: fix length of string I/O") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | | | KVM: SEV-ES: Set guest_state_protected after VMSA updatePeter Gonda2021-10-181-1/+6
|/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The refactoring in commit bb18a6777465 ("KVM: SEV: Acquire vcpu mutex when updating VMSA") left behind the assignment to svm->vcpu.arch.guest_state_protected; add it back. Signed-off-by: Peter Gonda <pgonda@google.com> [Delta between v2 and v3 of Peter's patch, which had already been committed; the commit message is my own. - Paolo] Fixes: bb18a6777465 ("KVM: SEV: Acquire vcpu mutex when updating VMSA") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | / KVM: SEV-ES: fix length of string I/OPaolo Bonzini2021-10-151-1/+1
| |/ |/| | | | | | | | | | | | | | | | | | | The size of the data in the scratch buffer is not divided by the size of each port I/O operation, so vcpu->arch.pio.count ends up being larger than it should be by a factor of size. Cc: stable@vger.kernel.org Fixes: 7ed9abfe8e9f ("KVM: SVM: Support string IO operations for an SEV-ES guest") Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | KVM: x86: nSVM: don't copy virt_ext from vmcb12Maxim Levitsky2021-09-231-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | These field correspond to features that we don't expose yet to L2 While currently there are no CVE worthy features in this field, if AMD adds more features to this field, that could allow guest escapes similar to CVE-2021-3653 and CVE-2021-3656. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210914154825.104886-6-mlevitsk@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | KVM: x86: nSVM: test eax for 4K alignment for GP errata workaroundMaxim Levitsky2021-09-231-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GP SVM errata workaround made the #GP handler always emulate the SVM instructions. However these instructions #GP in case the operand is not 4K aligned, but the workaround code didn't check this and we ended up emulating these instructions anyway. This is only an emulation accuracy check bug as there is no harm for KVM to read/write unaligned vmcb images. Fixes: 82a11e9c6fa2 ("KVM: SVM: Add emulation support for #GP triggered by SVM instructions") Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210914154825.104886-4-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | KVM: x86: nSVM: restore int_vector in svm_clear_vintrMaxim Levitsky2021-09-231-0/+2
| | | | | | | | | | | | | | | | | | | | In svm_clear_vintr we try to restore the virtual interrupt injection that might be pending, but we fail to restore the interrupt vector. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210914154825.104886-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | KVM: x86: nSVM: refactor svm_leave_smm and smm_enter_smmMaxim Levitsky2021-09-221-66/+69
| | | | | | | | | | | | | | | | | | | | Use return statements instead of nested if, and fix error path to free all the maps that were allocated. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210913140954.165665-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | KVM: x86: SVM: call KVM_REQ_GET_NESTED_STATE_PAGES on exit from SMM modeMaxim Levitsky2021-09-223-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently the KVM_REQ_GET_NESTED_STATE_PAGES on SVM only reloads PDPTRs, and MSR bitmap, with former not really needed for SMM as SMM exit code reloads them again from SMRAM'S CR3, and later happens to work since MSR bitmap isn't modified while in SMM. Still it is better to be consistient with VMX. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210913140954.165665-5-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | KVM: x86: nSVM: restore the L1 host state prior to resuming nested guest on ↵Maxim Levitsky2021-09-221-5/+7
| | | | | | | | | | | | | | | | | | | | | | | | SMM exit Otherwise guest entry code might see incorrect L1 state (e.g paging state). Fixes: 37be407b2ce8 ("KVM: nSVM: Fix L1 state corruption upon return from SMM") Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210913140954.165665-3-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | KVM: SEV: Allow some commands for mirror VMPeter Gonda2021-09-221-2/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A mirrored SEV-ES VM will need to call KVM_SEV_LAUNCH_UPDATE_VMSA to setup its vCPUs and have them measured, and their VMSAs encrypted. Without this change, it is impossible to have mirror VMs as part of SEV-ES VMs. Also allow the guest status check and debugging commands since they do not change any guest state. Signed-off-by: Peter Gonda <pgonda@google.com> Cc: Marc Orr <marcorr@google.com> Cc: Nathan Tempelman <natet@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steve Rutherford <srutherford@google.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Fixes: 54526d1fd593 ("KVM: x86: Support KVM VMs sharing SEV context", 2021-04-21) Message-Id: <20210921150345.2221634-3-pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | KVM: SEV: Update svm_vm_copy_asid_from for SEV-ESPeter Gonda2021-09-221-4/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For mirroring SEV-ES the mirror VM will need more then just the ASID. The FD and the handle are required to all the mirror to call psp commands. The mirror VM will need to call KVM_SEV_LAUNCH_UPDATE_VMSA to setup its vCPUs' VMSAs for SEV-ES. Signed-off-by: Peter Gonda <pgonda@google.com> Cc: Marc Orr <marcorr@google.com> Cc: Nathan Tempelman <natet@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steve Rutherford <srutherford@google.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Fixes: 54526d1fd593 ("KVM: x86: Support KVM VMs sharing SEV context", 2021-04-21) Message-Id: <20210921150345.2221634-2-pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | KVM: SEV: Pin guest memory for write for RECEIVE_UPDATE_DATASean Christopherson2021-09-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Require the target guest page to be writable when pinning memory for RECEIVE_UPDATE_DATA. Per the SEV API, the PSP writes to guest memory: The result is then encrypted with GCTX.VEK and written to the memory pointed to by GUEST_PADDR field. Fixes: 15fb7de1a7f5 ("KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command") Cc: stable@vger.kernel.org Cc: Peter Gonda <pgonda@google.com> Cc: Marc Orr <marcorr@google.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210914210951.2994260-2-seanjc@google.com> Reviewed-by: Brijesh Singh <brijesh.singh@amd.com> Reviewed-by: Peter Gonda <pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | KVM: SVM: fix missing sev_decommission in sev_receive_startMingwei Zhang2021-09-221-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DECOMMISSION the current SEV context if binding an ASID fails after RECEIVE_START. Per AMD's SEV API, RECEIVE_START generates a new guest context and thus needs to be paired with DECOMMISSION: The RECEIVE_START command is the only command other than the LAUNCH_START command that generates a new guest context and guest handle. The missing DECOMMISSION can result in subsequent SEV launch failures, as the firmware leaks memory and might not able to allocate more SEV guest contexts in the future. Note, LAUNCH_START suffered the same bug, but was previously fixed by commit 934002cd660b ("KVM: SVM: Call SEV Guest Decommission if ASID binding fails"). Cc: Alper Gun <alpergun@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: David Rienjes <rientjes@google.com> Cc: Marc Orr <marcorr@google.com> Cc: John Allen <john.allen@amd.com> Cc: Peter Gonda <pgonda@google.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Vipin Sharma <vipinsh@google.com> Cc: stable@vger.kernel.org Reviewed-by: Marc Orr <marcorr@google.com> Acked-by: Brijesh Singh <brijesh.singh@amd.com> Fixes: af43cbbf954b ("KVM: SVM: Add support for KVM_SEV_RECEIVE_START command") Signed-off-by: Mingwei Zhang <mizhang@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210912181815.3899316-1-mizhang@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | KVM: SEV: Acquire vcpu mutex when updating VMSAPeter Gonda2021-09-221-22/+29
|/ | | | | | | | | | | | | | | | | | | | | | The update-VMSA ioctl touches data stored in struct kvm_vcpu, and therefore should not be performed concurrently with any VCPU ioctl that might cause KVM or the processor to use the same data. Adds vcpu mutex guard to the VMSA updating code. Refactors out __sev_launch_update_vmsa() function to deal with per vCPU parts of sev_launch_update_vmsa(). Fixes: ad73109ae7ec ("KVM: SVM: Provide support to launch and run an SEV-ES guest") Signed-off-by: Peter Gonda <pgonda@google.com> Cc: Marc Orr <marcorr@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Brijesh Singh <brijesh.singh@amd.com> Cc: kvm@vger.kernel.org Cc: stable@vger.kernel.org Cc: linux-kernel@vger.kernel.org Message-Id: <20210915171755.3773766-1-pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2021-09-076-113/+51
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM updates from Paolo Bonzini: "ARM: - Page ownership tracking between host EL1 and EL2 - Rely on userspace page tables to create large stage-2 mappings - Fix incompatibility between pKVM and kmemleak - Fix the PMU reset state, and improve the performance of the virtual PMU - Move over to the generic KVM entry code - Address PSCI reset issues w.r.t. save/restore - Preliminary rework for the upcoming pKVM fixed feature - A bunch of MM cleanups - a vGIC fix for timer spurious interrupts - Various cleanups s390: - enable interpretation of specification exceptions - fix a vcpu_idx vs vcpu_id mixup x86: - fast (lockless) page fault support for the new MMU - new MMU now the default - increased maximum allowed VCPU count - allow inhibit IRQs on KVM_RUN while debugging guests - let Hyper-V-enabled guests run with virtualized LAPIC as long as they do not enable the Hyper-V "AutoEOI" feature - fixes and optimizations for the toggling of AMD AVIC (virtualized LAPIC) - tuning for the case when two-dimensional paging (EPT/NPT) is disabled - bugfixes and cleanups, especially with respect to vCPU reset and choosing a paging mode based on CR0/CR4/EFER - support for 5-level page table on AMD processors Generic: - MMU notifier invalidation callbacks do not take mmu_lock unless necessary - improved caching of LRU kvm_memory_slot - support for histogram statistics - add statistics for halt polling and remote TLB flush requests" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (210 commits) KVM: Drop unused kvm_dirty_gfn_invalid() KVM: x86: Update vCPU's hv_clock before back to guest when tsc_offset is adjusted KVM: MMU: mark role_regs and role accessors as maybe unused KVM: MIPS: Remove a "set but not used" variable x86/kvm: Don't enable IRQ when IRQ enabled in kvm_wait KVM: stats: Add VM stat for remote tlb flush requests KVM: Remove unnecessary export of kvm_{inc,dec}_notifier_count() KVM: x86/mmu: Move lpage_disallowed_link further "down" in kvm_mmu_page KVM: x86/mmu: Relocate kvm_mmu_page.tdp_mmu_page for better cache locality Revert "KVM: x86: mmu: Add guest physical address check in translate_gpa()" KVM: x86/mmu: Remove unused field mmio_cached in struct kvm_mmu_page kvm: x86: Increase KVM_SOFT_MAX_VCPUS to 710 kvm: x86: Increase MAX_VCPUS to 1024 kvm: x86: Set KVM_MAX_VCPU_ID to 4*KVM_MAX_VCPUS KVM: VMX: avoid running vmx_handle_exit_irqoff in case of emulation KVM: x86/mmu: Don't freak out if pml5_root is NULL on 4-level host KVM: s390: index kvm->arch.idle_mask by vcpu_idx KVM: s390: Enable specification exception interpretation KVM: arm64: Trim guest debug exception handling KVM: SVM: Add 5-level page table support for SVM ...
| * KVM: SVM: Add 5-level page table support for SVMWei Huang2021-08-201-6/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | When the 5-level page table is enabled on host OS, the nested page table for guest VMs must use 5-level as well. Update get_npt_level() function to reflect this requirement. In the meanwhile, remove the code that prevents kvm-amd driver from being loaded when 5-level page table is detected. Signed-off-by: Wei Huang <wei.huang2@amd.com> Message-Id: <20210818165549.3771014-4-wei.huang2@amd.com> [Tweak condition as suggested by Sean. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * KVM: x86: Allow CPU to force vendor-specific TDP levelWei Huang2021-08-201-1/+3
| | | | | | | | | | | | | | | | | | | | | | AMD future CPUs will require a 5-level NPT if host CR4.LA57 is set. To prevent kvm_mmu_get_tdp_level() from incorrectly changing NPT level on behalf of CPUs, add a new parameter in kvm_configure_mmu() to force a fixed TDP level. Signed-off-by: Wei Huang <wei.huang2@amd.com> Message-Id: <20210818165549.3771014-2-wei.huang2@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * KVM: SVM: split svm_handle_invalid_exitMaxim Levitsky2021-08-201-8/+9
| | | | | | | | | | | | | | | | | | | | Split the check for having a vmexit handler to svm_check_exit_valid, and make svm_handle_invalid_exit only handle a vmexit that is already not valid. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210811122927.900604-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * KVM: SVM: AVIC: drop unsupported AVIC base relocation codeMaxim Levitsky2021-08-203-13/+2
| | | | | | | | | | | | | | | | | | | | APIC base relocation is not supported anyway and won't work correctly so just drop the code that handles it and keep AVIC MMIO bar at the default APIC base. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210810205251.424103-17-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * KVM: SVM: call avic_vcpu_load/avic_vcpu_put when enabling/disabling AVICMaxim Levitsky2021-08-201-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently it is possible to have the following scenario: 1. AVIC is disabled by svm_refresh_apicv_exec_ctrl 2. svm_vcpu_blocking calls avic_vcpu_put which does nothing 3. svm_vcpu_unblocking enables the AVIC (due to KVM_REQ_APICV_UPDATE) and then calls avic_vcpu_load 4. warning is triggered in avic_vcpu_load since AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK was never cleared While it is possible to just remove the warning, it seems to be more robust to fully disable/enable AVIC in svm_refresh_apicv_exec_ctrl by calling the avic_vcpu_load/avic_vcpu_put Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210810205251.424103-16-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * KVM: SVM: move check for kvm_vcpu_apicv_active outside of avic_vcpu_{put|load}Maxim Levitsky2021-08-202-8/+9
| | | | | | | | | | | | | | | | No functional change intended. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210810205251.424103-15-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * KVM: SVM: remove svm_toggle_avic_for_irq_windowMaxim Levitsky2021-08-203-14/+2
| | | | | | | | | | | | | | | | | | | | Now that kvm_request_apicv_update doesn't need to drop the kvm->srcu lock, we can call kvm_request_apicv_update directly. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210810205251.424103-13-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * KVM: SVM: add warning for mistmatch between AVIC vcpu state and AVIC inhibitionMaxim Levitsky2021-08-201-0/+2
| | | | | | | | | | | | | | | | | | | | It is never a good idea to enter a guest on a vCPU when the AVIC inhibition state doesn't match the enablement of the AVIC on the vCPU. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210810205251.424103-11-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * KVM: x86: don't disable APICv memslot when inhibitedMaxim Levitsky2021-08-203-17/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Thanks to the former patches, it is now possible to keep the APICv memslot always enabled, and it will be invisible to the guest when it is inhibited This code is based on a suggestion from Sean Christopherson: https://lkml.org/lkml/2021/7/19/2970 Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20210810205251.424103-9-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * KVM: x86: Move declaration of kvm_spurious_fault() to x86.hUros Bizjak2021-08-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Move the declaration of kvm_spurious_fault() to KVM's "private" x86.h, it should never be called by anything other than low level KVM code. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> [sean: rebased to a series without __ex()/__kvm_handle_fault_on_reboot()] Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210809173955.1710866-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * KVM: x86: Kill off __ex() and __kvm_handle_fault_on_reboot()Sean Christopherson2021-08-132-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the __kvm_handle_fault_on_reboot() and __ex() macros now that all VMX and SVM instructions use asm goto to handle the fault (or in the case of VMREAD, completely custom logic). Drop kvm_spurious_fault()'s asmlinkage annotation as __kvm_handle_fault_on_reboot() was the only flow that invoked it from assembly code. Cc: Uros Bizjak <ubizjak@gmail.com> Cc: Like Xu <like.xu.linux@gmail.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210809173955.1710866-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * Merge branch 'kvm-vmx-secctl' into HEADPaolo Bonzini2021-08-101-20/+25
| |\ | | | | | | | | | Merge common topic branch for 5.14-rc6 and 5.15 merge window.
| * | KVM: nSVM: remove useless kvm_clear_*_queuePaolo Bonzini2021-08-021-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For an event to be in injected state when nested_svm_vmrun executes, it must have come from exitintinfo when svm_complete_interrupts ran: vcpu_enter_guest static_call(kvm_x86_run) -> svm_vcpu_run svm_complete_interrupts // now the event went from "exitintinfo" to "injected" static_call(kvm_x86_handle_exit) -> handle_exit svm_invoke_exit_handler vmrun_interception nested_svm_vmrun However, no event could have been in exitintinfo before a VMRUN vmexit. The code in svm.c is a bit more permissive than the one in vmx.c: if (is_external_interrupt(svm->vmcb->control.exit_int_info) && exit_code != SVM_EXIT_EXCP_BASE + PF_VECTOR && exit_code != SVM_EXIT_NPF && exit_code != SVM_EXIT_TASK_SWITCH && exit_code != SVM_EXIT_INTR && exit_code != SVM_EXIT_NMI) but in any case, a VMRUN instruction would not even start to execute during an attempted event delivery. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | KVM: SVM: Drop redundant clearing of vcpu->arch.hflags at INIT/RESETSean Christopherson2021-08-021-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drop redundant clears of vcpu->arch.hflags in init_vmcb() since kvm_vcpu_reset() always clears hflags, and it is also always zero at vCPU creation time. And of course, the second clearing in init_vmcb() was always redundant. Suggested-by: Reiji Watanabe <reijiw@google.com> Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-46-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | KVM: SVM: Emulate #INIT in response to triple fault shutdownSean Christopherson2021-08-021-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Emulate a full #INIT instead of simply initializing the VMCB if the guest hits a shutdown. Initializing the VMCB but not other vCPU state, much of which is mirrored by the VMCB, results in incoherent and broken vCPU state. Ideally, KVM would not automatically init anything on shutdown, and instead put the vCPU into e.g. KVM_MP_STATE_UNINITIALIZED and force userspace to explicitly INIT or RESET the vCPU. Even better would be to add KVM_MP_STATE_SHUTDOWN, since technically NMI can break shutdown (and SMI on Intel CPUs). But, that ship has sailed, and emulating #INIT is the next best thing as that has at least some connection with reality since there exist bare metal platforms that automatically INIT the CPU if it hits shutdown. Fixes: 46fe4ddd9dbb ("[PATCH] KVM: SVM: Propagate cpu shutdown events to userspace") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-45-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | KVM: x86: Move setting of sregs during vCPU RESET/INIT to common x86Sean Christopherson2021-08-021-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the setting of CR0, CR4, EFER, RFLAGS, and RIP from vendor code to common x86. VMX and SVM now have near-identical sequences, the only difference being that VMX updates the exception bitmap. Updating the bitmap on SVM is unnecessary, but benign. Unfortunately it can't be left behind in VMX due to the need to update exception intercepts after the control registers are set. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-37-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | KVM: SVM: Stuff save->dr6 at during VMSA sync, not at RESET/INITSean Christopherson2021-08-022-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move code to stuff vmcb->save.dr6 to its architectural init value from svm_vcpu_reset() into sev_es_sync_vmsa(). Except for protected guests, a.k.a. SEV-ES guests, vmcb->save.dr6 is set during VM-Enter, i.e. the extra write is unnecessary. For SEV-ES, stuffing save->dr6 handles a theoretical case where the VMSA could be encrypted before the first KVM_RUN. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-33-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | KVM: SVM: Drop redundant writes to vmcb->save.cr4 at RESET/INITSean Christopherson2021-08-021-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drop direct writes to vmcb->save.cr4 during vCPU RESET/INIT, as the values being written are fully redundant with respect to svm_set_cr4(vcpu, 0) a few lines earlier. Note, svm_set_cr4() also correctly forces X86_CR4_PAE when NPT is disabled. No functional change intended. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-32-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | KVM: SVM: Tweak order of cr0/cr4/efer writes at RESET/INITSean Christopherson2021-08-021-6/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hoist svm_set_cr0() up in the sequence of register initialization during vCPU RESET/INIT, purely to match VMX so that a future patch can move the sequences to common x86. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-31-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | KVM: SVM: Don't bother writing vmcb->save.rip at vCPU RESET/INITSean Christopherson2021-08-021-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drop unnecessary initialization of vmcb->save.rip during vCPU RESET/INIT, as svm_vcpu_run() unconditionally propagates VCPU_REGS_RIP to save.rip. No true functional change intended. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-21-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | KVM: x86: Move EDX initialization at vCPU RESET to common codeSean Christopherson2021-08-021-13/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the EDX initialization at vCPU RESET, which is now identical between VMX and SVM, into common code. No functional change intended. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-20-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | KVM: x86: Consolidate APIC base RESET initialization codeSean Christopherson2021-08-021-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Consolidate the APIC base RESET logic, which is currently spread out across both x86 and vendor code. For an in-kernel APIC, the vendor code is redundant. But for a userspace APIC, KVM relies on the vendor code to initialize vcpu->arch.apic_base. Hoist the vcpu->arch.apic_base initialization above the !apic check so that it applies to both flavors of APIC emulation, and delete the vendor code. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-19-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | KVM: SVM: Drop explicit MMU reset at RESET/INITSean Christopherson2021-08-021-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drop an explicit MMU reset in SVM's vCPU RESET/INIT flow now that the common x86 path correctly handles conditional MMU resets, e.g. if INIT arrives while the vCPU is in 64-bit mode. This reverts commit ebae871a509d ("kvm: svm: reset mmu on VCPU reset"). Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-9-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | KVM: SVM: Fall back to KVM's hardcoded value for EDX at RESET/INITSean Christopherson2021-08-021-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At vCPU RESET/INIT (mostly RESET), stuff EDX with KVM's hardcoded, default Family-Model-Stepping ID of 0x600 if CPUID.0x1 isn't defined. At RESET, the CPUID lookup is guaranteed to "miss" because KVM emulates RESET before exposing the vCPU to userspace, i.e. userspace can't possibly have done set the vCPU's CPUID model, and thus KVM will always write '0'. At INIT, using 0x600 is less bad than using '0'. While initializing EDX to '0' is _extremely_ unlikely to be noticed by the guest, let alone break the guest, and can be overridden by userspace for the RESET case, using 0x600 is preferable as it will allow consolidating the relevant VMX and SVM RESET/INIT logic in the future. And, digging through old specs suggests that neither Intel nor AMD have ever shipped a CPU that initialized EDX to '0' at RESET. Regarding 0x600 as KVM's default Family, it is a sane default and in many ways the most appropriate. Prior to the 386 implementations, DX was undefined at RESET. With the 386, 486, 586/P5, and 686/P6/Athlon, both Intel and AMD set EDX to 3, 4, 5, and 6 respectively. AMD switched to using '15' as its primary Family with the introduction of AMD64, but Intel has continued using '6' for the last few decades. So, '6' is a valid Family for both Intel and AMD CPUs, is compatible with both 32-bit and 64-bit CPUs (albeit not a perfect fit for 64-bit AMD), and of the common Families (3 - 6), is the best fit with respect to KVM's virtual CPU model. E.g. prior to the P6, Intel CPUs did not have a STI window. Modern operating systems, Linux included, rely on the STI window, e.g. for "safe halt", and KVM unconditionally assumes the virtual CPU has an STI window. Thus enumerating a Family ID of 3, 4, or 5 would be provably wrong. Opportunistically remove a stale comment. Fixes: 66f7b72e1171 ("KVM: x86: Make register state after reset conform to specification") Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-7-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | KVM: SVM: Require exact CPUID.0x1 match when stuffing EDX at INITSean Christopherson2021-08-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Do not allow an inexact CPUID "match" when querying the guest's CPUID.0x1 to stuff EDX during INIT. In the common case, where the guest CPU model is an AMD variant, allowing an inexact match is a nop since KVM doesn't emulate Intel's goofy "out-of-range" logic for AMD and Hygon. If the vCPU model happens to be an Intel variant, an inexact match is possible if and only if the max CPUID leaf is precisely '0'. Aside from the fact that there's probably no CPU in existence with a single CPUID leaf, if the max CPUID leaf is '0', that means that CPUID.0.EAX is '0', and thus an inexact match for CPUID.0x1.EAX will also yield '0'. So, with lots of twisty logic, no functional change intended. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | KVM: SVM: Zero out GDTR.base and IDTR.base on INITSean Christopherson2021-08-021-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Explicitly set GDTR.base and IDTR.base to zero when intializing the VMCB. Functionally this only affects INIT, as the bases are implicitly set to zero on RESET by virtue of the VMCB being zero allocated. Per AMD's APM, GDTR.base and IDTR.base are zeroed after RESET and INIT. Fixes: 04d2cc7780d4 ("KVM: Move main vcpu loop into subarch independent code") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210713163324.627647-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the VMSean Christopherson2021-08-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <0e8760a26151f47dc47052b25ca8b84fffe0641e.1625186503.git.isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>