author		Sean Christopherson <sean.j.christopherson@intel.com>	2020-09-16 01:27:02 +0200
committer	Paolo Bonzini <pbonzini@redhat.com>	2020-09-28 13:57:19 +0200
commit		09e3e2a1cc8d8069085785f1236a64c72707e7f2 (patch)
tree		271a2a19c494d849a7b729516436bff645da4dc1 /arch/x86/kvm/x86.c
parent		KVM: SVM: use __GFP_ZERO instead of clear_page() (diff)
KVM: x86: Add kvm_x86_ops hook to short circuit emulation
Replace the existing kvm_x86_ops.need_emulation_on_page_fault() with a
more generic can_emulate_instruction(), and unconditionally call the new
function in x86_emulate_instruction().
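For reference, the hook's shape as implied by the call sites in the diff
below (the kvm_host.h hunk itself falls outside this x86.c-only diffstat,
so take this as a sketch rather than the exact hunk):

	/* New member of struct kvm_x86_ops, replacing need_emulation_on_page_fault(). */
	bool (*can_emulate_instruction)(struct kvm_vcpu *vcpu, void *insn,
					int insn_len);

A false return tells common code to abort emulation; insn/insn_len may be
NULL/0 when the instruction bytes aren't known, as in handle_ud().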
KVM will use the generic hook to support multiple security-related
technologies that prevent emulation in one way or another. Similar to
the existing AMD #NPF case where emulation of the current instruction is
not possible due to lack of information, AMD's SEV-ES and Intel's SGX
and TDX will introduce scenarios where emulation is impossible due to
the guest's register state being inaccessible. And again similar to the
existing #NPF case, emulation can be initiated by kvm_mmu_page_fault(),
i.e. outside of the control of vendor-specific code.
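As an illustration of the vendor side, a heavily simplified implementation
might look like the sketch below. Note that sev_es_guest() is used purely
as a stand-in condition (SEV-ES support had not yet landed at this point),
and the real SVM hook also carries the errata handling inherited from
need_emulation_on_page_fault():

	static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, void *insn,
						int insn_len)
	{
		/*
		 * Illustrative condition: with SEV-ES-style encrypted register
		 * state, KVM cannot decode or emulate anything, regardless of
		 * what 'insn' holds.
		 */
		if (sev_es_guest(vcpu->kvm))
			return false;

		return true;
	}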
While the causes and architecturally visible behaviors of the various
cases differ, e.g. SGX will inject a #UD, the AMD #NPF case ends in a
clean resume or a complete shutdown, and SEV-ES and TDX "return" an
error, the impact on the common emulation code is identical: KVM must
stop emulation immediately and resume the guest.
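To make that division of labor concrete: the vendor hook owns the
architecturally visible behavior, and common code only consumes the
boolean. A hypothetical SGX-style hook (vcpu_in_enclave() is invented
here for illustration) could queue the #UD itself:

	static bool sgx_can_emulate_instruction(struct kvm_vcpu *vcpu, void *insn,
						int insn_len)
	{
		if (vcpu_in_enclave(vcpu)) {	/* hypothetical predicate */
			/* Emulating inside an enclave is architecturally a #UD. */
			kvm_queue_exception(vcpu, UD_VECTOR);
			return false;
		}
		return true;
	}

Because x86_emulate_instruction() bails with '1' (resume guest) on a
false return, the queued #UD is then delivered on the next VM-entry.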
Query can_emulate_instruction() in handle_ud() as well so that the
force_emulation_prefix code doesn't incorrectly modify RIP before
calling x86_emulate_instruction() in the absurdly unlikely scenario that
KVM encounters forced emulation in conjunction with "do not emulate".
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200915232702.15945-1-sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Diffstat (limited to 'arch/x86/kvm/x86.c')
-rw-r--r--	arch/x86/kvm/x86.c	14
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index cc6992fd4637..5f3e9ab34c80 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3222,8 +3222,9 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		 * even when not intercepted. AMD manual doesn't explicitly
 		 * state this but appears to behave the same.
 		 *
-		 * However when userspace wants to read this MSR, we should
-		 * return it's real L1 value so that its restore will be correct.
+		 * Unconditionally return L1's TSC offset on userspace reads
+		 * so that userspace reads and writes always operate on L1's
+		 * offset, e.g. to ensure deterministic behavior for migration.
 		 */
 		u64 tsc_offset = msr_info->host_initiated ? vcpu->arch.l1_tsc_offset :
 							    vcpu->arch.tsc_offset;
@@ -5714,6 +5715,9 @@ int handle_ud(struct kvm_vcpu *vcpu)
 	char sig[5]; /* ud2; .ascii "kvm" */
 	struct x86_exception e;
 
+	if (unlikely(!kvm_x86_ops.can_emulate_instruction(vcpu, NULL, 0)))
+		return 1;
+
 	if (force_emulation_prefix &&
 	    kvm_read_guest_virt(vcpu, kvm_get_linear_rip(vcpu),
 				sig, sizeof(sig), &e) == 0 &&
@@ -6919,7 +6923,10 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 	int r;
 	struct x86_emulate_ctxt *ctxt = vcpu->arch.emulate_ctxt;
 	bool writeback = true;
-	bool write_fault_to_spt = vcpu->arch.write_fault_to_shadow_pgtable;
+	bool write_fault_to_spt;
+
+	if (unlikely(!kvm_x86_ops.can_emulate_instruction(vcpu, insn, insn_len)))
+		return 1;
 
 	vcpu->arch.l1tf_flush_l1d = true;
 
@@ -6927,6 +6934,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 	 * Clear write_fault_to_shadow_pgtable here to ensure it is
 	 * never reused.
 	 */
+	write_fault_to_spt = vcpu->arch.write_fault_to_shadow_pgtable;
 	vcpu->arch.write_fault_to_shadow_pgtable = false;
 
 	kvm_clear_exception_queue(vcpu);