| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
System reset through synthetic MSR is not recommended neither by genuine
Hyper-V nor my QEMU.
Fixes: 2bc39970e932 ("x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This changes the allocation of cached_vmcs12 to use kzalloc instead of
kmalloc. This removes the information leak found by Syzkaller (see
Reported-by) in this case and prevents similar leaks from happening
based on cached_vmcs12.
It also changes vmx_get_nested_state to copy out the full 4k VMCS12_SIZE
in copy_to_user rather than only the size of the struct.
Tested: rebuilt against head, booted, and ran the syszkaller repro
https://syzkaller.appspot.com/text?tag=ReproC&x=174efca3400000 without
observing any problems.
Reported-by: syzbot+ded1696f6b50b615b630@syzkaller.appspotmail.com
Fixes: 8fcc4b5923af5de58b80b53a069453b135693304
Cc: stable@vger.kernel.org
Signed-off-by: Tom Roeder <tmroeder@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL
Fix a recently introduced bug that results in the wrong VMCS control
field being updated when applying a IA32_PERF_GLOBAL_CTRL errata.
Fixes: c73da3fcab43 ("KVM: VMX: Properly handle dynamic VM Entry/Exit controls")
Reported-by: Harald Arnesen <harald@skogtun.org>
Tested-by: Harald Arnesen <harald@skogtun.org>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The single-step debugging of KVM guests on x86 is broken: if we run
gdb 'stepi' command at the breakpoint when the guest interrupts are
enabled, RIP always jumps to native_apic_mem_write(). Then other
nasty effects follow.
Long investigation showed that on Jun 7, 2017 the
commit c8401dda2f0a00cd25c0 ("KVM: x86: fix singlestepping over syscall")
introduced the kvm_run.debug corruption: kvm_vcpu_do_singlestep() can
be called without X86_EFLAGS_TF set.
Let's fix it. Please consider that for -stable.
Signed-off-by: Alexander Popov <alex.popov@linux.com>
Cc: stable@vger.kernel.org
Fixes: c8401dda2f0a00cd25c0 ("KVM: x86: fix singlestepping over syscall")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
HV_X64_MSR_GUEST_IDLE_AVAILABLE appeared in kvm_vcpu_ioctl_get_hv_cpuid()
by mistake: it announces support for HV_X64_MSR_GUEST_IDLE (0x400000F0)
which we don't support in KVM (yet).
Fixes: 2bc39970e932 ("x86/kvm/hyper-v: Introduce KVM_GET_SUPPORTED_HV_CPUID")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
not backed
Since commit 09abb5e3e5e50 ("KVM: nVMX: call kvm_skip_emulated_instruction
in nested_vmx_{fail,succeed}") nested_vmx_failValid() results in
kvm_skip_emulated_instruction() so doing it again in handle_vmptrld() when
vmptr address is not backed is wrong, we end up advancing RIP twice.
Fixes: fca91f6d60b6e ("kvm: nVMX: Set VM instruction error for VMPTRLD of unbacked page")
Reported-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
hv_remote_flush_tlb_with_range()
The "ret" is initialized to be ENOTSUPP. The return value of
__hv_remote_flush_tlb_with_range() will be Or with "ret" when ept
table potiners are mismatched. This will cause return ENOTSUPP even if
flush tlb successfully. This patch is to fix the issue and set
"ret" to 0.
Fixes: a5c214dad198 ("KVM/VMX: Change hv flush logic when ept tables are mismatched.")
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By code inspection, it was found that multiple calls to KVM_SEV_INIT
could deplete asid bits and overwrite kvm_sev_info's regions_list.
Multiple calls to KVM_SVM_INIT is not likely to occur with QEMU, but this
should likely be fixed anyway.
This code is serialized by kvm->lock.
Fixes: 1654efcbc431 ("KVM: SVM: Add KVM_SEV_INIT command")
Reported-by: Cfir Cohen <cfir@google.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ctl_bitmask in pt_desc is of type u64. When an integer like 0xf is
being left shifted more than 32 bits, the behavior is undefined.
Fix this by adding suffix ULL to integer 0xf.
Addresses-Coverity-ID: 1476095 ("Bad bit shift operation")
Fixes: 6c0f0bba85a0 ("KVM: x86: Introduce a function to initialize the PT configuration")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, CONFIG_JUMP_LABEL just means "I _want_ to use jump label".
The jump label is controlled by HAVE_JUMP_LABEL, which is defined
like this:
#if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL)
# define HAVE_JUMP_LABEL
#endif
We can improve this by testing 'asm goto' support in Kconfig, then
make JUMP_LABEL depend on CC_HAS_ASM_GOTO.
Ugly #ifdef HAVE_JUMP_LABEL will go away, and CONFIG_JUMP_LABEL will
match to the real kernel capability.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kconfig updates from Masahiro Yamada:
- support -y option for merge_config.sh to avoid downgrading =y to =m
- remove S_OTHER symbol type, and touch include/config/*.h files correctly
- fix file name and line number in lexer warnings
- fix memory leak when EOF is encountered in quotation
- resolve all shift/reduce conflicts of the parser
- warn no new line at end of file
- make 'source' statement more strict to take only string literal
- rewrite the lexer and remove the keyword lookup table
- convert to SPDX License Identifier
- compile C files independently instead of including them from zconf.y
- fix various warnings of gconfig
- misc cleanups
* tag 'kconfig-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (39 commits)
kconfig: surround dbg_sym_flags with #ifdef DEBUG to fix gconf warning
kconfig: split images.c out of qconf.cc/gconf.c to fix gconf warnings
kconfig: add static qualifiers to fix gconf warnings
kconfig: split the lexer out of zconf.y
kconfig: split some C files out of zconf.y
kconfig: convert to SPDX License Identifier
kconfig: remove keyword lookup table entirely
kconfig: update current_pos in the second lexer
kconfig: switch to ASSIGN_VAL state in the second lexer
kconfig: stop associating kconf_id with yylval
kconfig: refactor end token rules
kconfig: stop supporting '.' and '/' in unquoted words
treewide: surround Kconfig file paths with double quotes
microblaze: surround string default in Kconfig with double quotes
kconfig: use T_WORD instead of T_VARIABLE for variables
kconfig: use specific tokens instead of T_ASSIGN for assignments
kconfig: refactor scanning and parsing "option" properties
kconfig: use distinct tokens for type and default properties
kconfig: remove redundant token defines
kconfig: rename depends_list to comment_option_list
...
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The Kconfig lexer supports special characters such as '.' and '/' in
the parameter context. In my understanding, the reason is just to
support bare file paths in the source statement.
I do not see a good reason to complicate Kconfig for the room of
ambiguity.
The majority of code already surrounds file paths with double quotes,
and it makes sense since file paths are constant string literals.
Make it treewide consistent now.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Ingo Molnar:
"Misc cleanups"
* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/kprobes: Remove trampoline_handler() prototype
x86/kernel: Fix more -Wmissing-prototypes warnings
x86: Fix various typos in comments
x86/headers: Fix -Wmissing-prototypes warning
x86/process: Avoid unnecessary NULL check in get_wchan()
x86/traps: Complete prototype declarations
x86/mce: Fix -Wmissing-prototypes warnings
x86/gart: Rewrite early_gart_iommu_check() comment
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Go over arch/x86/ and fix common typos in comments,
and a typo in an actual function argument name.
No change in functionality intended.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Pull KVM updates from Paolo Bonzini:
"ARM:
- selftests improvements
- large PUD support for HugeTLB
- single-stepping fixes
- improved tracing
- various timer and vGIC fixes
x86:
- Processor Tracing virtualization
- STIBP support
- some correctness fixes
- refactorings and splitting of vmx.c
- use the Hyper-V range TLB flush hypercall
- reduce order of vcpu struct
- WBNOINVD support
- do not use -ftrace for __noclone functions
- nested guest support for PAUSE filtering on AMD
- more Hyper-V enlightenments (direct mode for synthetic timers)
PPC:
- nested VFIO
s390:
- bugfixes only this time"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (171 commits)
KVM: x86: Add CPUID support for new instruction WBNOINVD
kvm: selftests: ucall: fix exit mmio address guessing
Revert "compiler-gcc: disable -ftracer for __noclone functions"
KVM: VMX: Move VM-Enter + VM-Exit handling to non-inline sub-routines
KVM: VMX: Explicitly reference RCX as the vmx_vcpu pointer in asm blobs
KVM: x86: Use jmp to invoke kvm_spurious_fault() from .fixup
MAINTAINERS: Add arch/x86/kvm sub-directories to existing KVM/x86 entry
KVM/x86: Use SVM assembly instruction mnemonics instead of .byte streams
KVM/MMU: Flush tlb directly in the kvm_zap_gfn_range()
KVM/MMU: Flush tlb directly in kvm_set_pte_rmapp()
KVM/MMU: Move tlb flush in kvm_set_pte_rmapp() to kvm_mmu_notifier_change_pte()
KVM: Make kvm_set_spte_hva() return int
KVM: Replace old tlb flush function with new one to flush a specified range.
KVM/MMU: Add tlb flush with range helper function
KVM/VMX: Add hv tlb range flush support
x86/hyper-v: Add HvFlushGuestAddressList hypercall support
KVM: Add tlb_remote_flush_with_range callback in kvm_x86_ops
KVM: x86: Disable Intel PT when VMXON in L1 guest
KVM: x86: Set intercept for Intel PT MSRs read/write
KVM: x86: Implement Intel PT MSRs read/write emulation
...
|
| | | |
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Transitioning to/from a VMX guest requires KVM to manually save/load
the bulk of CPU state that the guest is allowed to direclty access,
e.g. XSAVE state, CR2, GPRs, etc... For obvious reasons, loading the
guest's GPR snapshot prior to VM-Enter and saving the snapshot after
VM-Exit is done via handcoded assembly. The assembly blob is written
as inline asm so that it can easily access KVM-defined structs that
are used to hold guest state, e.g. moving the blob to a standalone
assembly file would require generating defines for struct offsets.
The other relevant aspect of VMX transitions in KVM is the handling of
VM-Exits. KVM doesn't employ a separate VM-Exit handler per se, but
rather treats the VMX transition as a mega instruction (with many side
effects), i.e. sets the VMCS.HOST_RIP to a label immediately following
VMLAUNCH/VMRESUME. The label is then exposed to C code via a global
variable definition in the inline assembly.
Because of the global variable, KVM takes steps to (attempt to) ensure
only a single instance of the owning C function, e.g. vmx_vcpu_run, is
generated by the compiler. The earliest approach placed the inline
assembly in a separate noinline function[1]. Later, the assembly was
folded back into vmx_vcpu_run() and tagged with __noclone[2][3], which
is still used today.
After moving to __noclone, an edge case was encountered where GCC's
-ftracer optimization resulted in the inline assembly blob being
duplicated. This was "fixed" by explicitly disabling -ftracer in the
__noclone definition[4].
Recently, it was found that disabling -ftracer causes build warnings
for unsuspecting users of __noclone[5], and more importantly for KVM,
prevents the compiler for properly optimizing vmx_vcpu_run()[6]. And
perhaps most importantly of all, it was pointed out that there is no
way to prevent duplication of a function with 100% reliability[7],
i.e. more edge cases may be encountered in the future.
So to summarize, the only way to prevent the compiler from duplicating
the global variable definition is to move the variable out of inline
assembly, which has been suggested several times over[1][7][8].
Resolve the aforementioned issues by moving the VMLAUNCH+VRESUME and
VM-Exit "handler" to standalone assembly sub-routines. Moving only
the core VMX transition codes allows the struct indexing to remain as
inline assembly and also allows the sub-routines to be used by
nested_vmx_check_vmentry_hw(). Reusing the sub-routines has a happy
side-effect of eliminating two VMWRITEs in the nested_early_check path
as there is no longer a need to dynamically change VMCS.HOST_RIP.
Note that callers to vmx_vmenter() must account for the CALL modifying
RSP, e.g. must subtract op-size from RSP when synchronizing RSP with
VMCS.HOST_RSP and "restore" RSP prior to the CALL. There are no great
alternatives to fudging RSP. Saving RSP in vmx_enter() is difficult
because doing so requires a second register (VMWRITE does not provide
an immediate encoding for the VMCS field and KVM supports Hyper-V's
memory-based eVMCS ABI). The other more drastic alternative would be
to use eschew VMCS.HOST_RSP and manually save/load RSP using a per-cpu
variable (which can be encoded as e.g. gs:[imm]). But because a valid
stack is needed at the time of VM-Exit (NMIs aren't blocked and a user
could theoretically insert INT3/INT1ICEBRK at the VM-Exit handler), a
dedicated per-cpu VM-Exit stack would be required. A dedicated stack
isn't difficult to implement, but it would require at least one page
per CPU and knowledge of the stack in the dumpstack routines. And in
most cases there is essentially zero overhead in dynamically updating
VMCS.HOST_RSP, e.g. the VMWRITE can be avoided for all but the first
VMLAUNCH unless nested_early_check=1, which is not a fast path. In
other words, avoiding the VMCS.HOST_RSP by using a dedicated stack
would only make the code marginally less ugly while requiring at least
one page per CPU and forcing the kernel to be aware (and approve) of
the VM-Exit stack shenanigans.
[1] cea15c24ca39 ("KVM: Move KVM context switch into own function")
[2] a3b5ba49a8c5 ("KVM: VMX: add the __noclone attribute to vmx_vcpu_run")
[3] 104f226bfd0a ("KVM: VMX: Fold __vmx_vcpu_run() into vmx_vcpu_run()")
[4] 95272c29378e ("compiler-gcc: disable -ftracer for __noclone functions")
[5] https://lkml.kernel.org/r/20181218140105.ajuiglkpvstt3qxs@treble
[6] https://patchwork.kernel.org/patch/8707981/#21817015
[7] https://lkml.kernel.org/r/ri6y38lo23g.fsf@suse.cz
[8] https://lkml.kernel.org/r/20181218212042.GE25620@tassilo.jf.intel.com
Suggested-by: Andi Kleen <ak@linux.intel.com>
Suggested-by: Martin Jambor <mjambor@suse.cz>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Martin Jambor <mjambor@suse.cz>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Use '%% " _ASM_CX"' instead of '%0' to dereference RCX, i.e. the
'struct vcpu_vmx' pointer, in the VM-Enter asm blobs of vmx_vcpu_run()
and nested_vmx_check_vmentry_hw(). Using the symbolic name means that
adding/removing an output parameter(s) requires "rewriting" almost all
of the asm blob, which makes it nearly impossible to understand what's
being changed in even the most minor patches.
Opportunistically improve the code comments.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Recently the minimum required version of binutils was changed to 2.20,
which supports all SVM instruction mnemonics. The patch removes
all .byte #defines and uses real instruction mnemonics instead.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Originally, flush tlb is done by slot_handle_level_range(). This patch
moves the flush directly to kvm_zap_gfn_range() when range flush is
available, so that only the requested range can be flushed.
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This patch is to flush tlb directly in kvm_set_pte_rmapp()
function when Hyper-V remote TLB flush is available, returning 0
so that kvm_mmu_notifier_change_pte() does not flush again.
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This patch is to move tlb flush in kvm_set_pte_rmapp() to
kvm_mmu_notifier_change_pte() in order to avoid redundant tlb flush.
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The patch is to make kvm_set_spte_hva() return int and caller can
check return value to determine flush tlb or not.
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This patch is to replace kvm_flush_remote_tlbs() with kvm_flush_
remote_tlbs_with_address() in some functions without logic change.
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This patch is to add wrapper functions for tlb_remote_flush_with_range
callback and flush tlb directly in kvm_mmu_zap_collapsible_spte().
kvm_mmu_zap_collapsible_spte() returns flush request to the
slot_handle_leaf() and the latter does flush on demand. When
range flush is available, make kvm_mmu_zap_collapsible_spte()
to flush tlb with range directly to avoid returning range back
to slot_handle_leaf().
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This patch is to register tlb_remote_flush_with_range callback with
hv tlb range flush interface.
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Currently, Intel Processor Trace do not support tracing in L1 guest
VMX operation(IA32_VMX_MISC[bit 14] is 0). As mentioned in SDM,
on these type of processors, execution of the VMXON instruction will
clears IA32_RTIT_CTL.TraceEn and any attempt to write IA32_RTIT_CTL
causes a general-protection exception (#GP).
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
To save performance overhead, disable intercept Intel PT MSRs
read/write when Intel PT is enabled in guest.
MSR_IA32_RTIT_CTL is an exception that will always be intercepted.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This patch implement Intel Processor Trace MSRs read/write
emulation.
Intel PT MSRs read/write need to be emulated when Intel PT
MSRs is intercepted in guest and during live migration.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Initialize the Intel PT configuration when cpuid update.
Include cpuid inforamtion, rtit_ctl bit mask and the number of
address ranges.
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Load/Store Intel Processor Trace register in context switch.
MSR IA32_RTIT_CTL is loaded/stored automatically from VMCS.
In Host-Guest mode, we need load/resore PT MSRs only when PT
is enabled in guest.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Expose Intel Processor Trace to guest only when
the PT works in Host-Guest mode.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Intel Processor Trace virtualization can be work in one
of 2 possible modes:
a. System-Wide mode (default):
When the host configures Intel PT to collect trace packets
of the entire system, it can leave the relevant VMX controls
clear to allow VMX-specific packets to provide information
across VMX transitions.
KVM guest will not aware this feature in this mode and both
host and KVM guest trace will output to host buffer.
b. Host-Guest mode:
Host can configure trace-packet generation while in
VMX non-root operation for guests and root operation
for native executing normally.
Intel PT will be exposed to KVM guest in this mode, and
the trace output to respective buffer of host and guest.
In this mode, tht status of PT will be saved and disabled
before VM-entry and restored after VM-exit if trace
a virtual machine.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
[Preserved the iff and a probably intentional weird bracket notation.
Also dropped the style change to make a single-purpose patch. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Although well-intentioned, keeping the KF() definition as a hint for
handling scattered CPUID features may be counter-productive. Simply
redefining the bit position only works for directly manipulating the
guest's CPUID leafs, e.g. it doesn't make guest_cpuid_has() magically
work. Taking an alternative approach, e.g. ensuring the bit position
is identical between the Linux-defined and hardware-defined features,
may be a simpler and/or more effective method of exposing scattered
features to the guest.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Let the guest read the IA32_TSC MSR with the generic RDMSR instruction
as well as the specific RDTSC(P) instructions. Note that the hardware
applies the TSC multiplier and offset (when applicable) to the result of
RDMSR(IA32_TSC), just as it does to the result of RDTSC(P).
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
According to the SDM, "NMI-window exiting" VM-exits wake a logical
processor from the same inactive states as would an NMI and
"interrupt-window exiting" VM-exits wake a logical processor from the
same inactive states as would an external interrupt. Specifically, they
wake a logical processor from the shutdown state and from the states
entered using the HLT and MWAIT instructions.
Fixes: 6dfacadd5858 ("KVM: nVMX: Add support for activity state HLT")
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
[Squashed comments of two Jim's patches and used the simplified code
hunk provided by Sean. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Currently, the nested guest's PAUSE intercept intentions are not being
honored. Instead, since the L0 hypervisor's pause_filter_count and
pause_filter_thresh values are still in place, these values are used
instead of those programmed in the VMCB by the L1 hypervisor.
To honor the desired PAUSE intercept support of the L1 hypervisor, the L0
hypervisor must use the PAUSE filtering fields of the L1 hypervisor. This
requires saving and restoring of both the L0 and L1 hypervisor's PAUSE
filtering fields.
Signed-off-by: William Tambe <william.tambe@amd.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Remove duplicated include.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
AMD doesn't seem to implement MSR_IA32_MCG_EXT_CTL and svm code in kvm
knows nothing about it, however, this MSR is among emulated_msrs and
thus returned with KVM_GET_MSR_INDEX_LIST. The consequent KVM_GET_MSRS,
of course, fails.
Report the MSR as unsupported to not confuse userspace.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The memory allocation in b666a4b69739 ("kvm: x86: Dynamically allocate
guest_fpu", 2018-11-06) is wrong, there are other members in struct fpu
before the fpregs_state union and the patch should be doing something
similar to the code in fpu__init_task_struct_size. It's enough to run
a guest and then rmmod kvm to see slub errors which are actually caused
by memory corruption.
For now let's revert it to sizeof(struct fpu), which is conservative.
I have plans to move fsave/fxsave/xsave directly in KVM, without using
the kernel FPU helpers, and once it's done, the size of the object in
the cache will be something like kvm_xstate_size.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Previously, the guest_fpu field was embedded in the kvm_vcpu_arch
struct. Unfortunately, the field is quite large, (e.g., 4352 bytes on my
current setup). This bloats the kvm_vcpu_arch struct for x86 into an
order 3 memory allocation, which can become a problem on overcommitted
machines. Thus, this patch moves the fpu state outside of the
kvm_vcpu_arch struct.
With this patch applied, the kvm_vcpu_arch struct is reduced to 15168
bytes for vmx on my setup when building the kernel with kvmconfig.
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Marc Orr <marcorr@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Previously, x86's instantiation of 'struct kvm_vcpu_arch' added an fpu
field to save/restore fpu-related architectural state, which will differ
from kvm's fpu state. However, this is redundant to the 'struct fpu'
field, called fpu, embedded in the task struct, via the thread field.
Thus, this patch removes the user_fpu field from the kvm_vcpu_arch
struct and replaces it with the task struct's fpu field.
This change is significant because the fpu struct is actually quite
large. For example, on the system used to develop this patch, this
change reduces the size of the vcpu_vmx struct from 23680 bytes down to
19520 bytes, when building the kernel with kvmconfig. This reduction in
the size of the vcpu_vmx struct moves us closer to being able to
allocate the struct at order 2, rather than order 3.
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Marc Orr <marcorr@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
helper function
.. to improve readability and maintainability, and to align the code as per
the layout of the checks in chapter "VM Entries" in Intel SDM vol 3C.
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
helper function
.. to improve readability and maintainability, and to align the code as per
the layout of the checks in chapter "VM Entries" in Intel SDM vol 3C.
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
function
.. to improve readability and maintainability, and to align the code as per
the layout of the checks in chapter "VM Entries" in Intel SDM vol 3C.
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
function
.. to improve readability and maintainability, and to align the code as per
the layout of the checks in chapter "VM Entries" in Intel SDM vol 3C.
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Passing the enum and doing an indirect lookup is silly when we can
simply pass the field directly. Remove the "fast path" code in
nested_vmx_check_msr_switch_controls() as it's now nothing more than a
redundant check.
Remove the debug message rather than continue passing the enum for the
address field. Having debug messages for the MSRs themselves is useful
as MSR legality is a huge space, whereas messing up a physical address
means the VMM is fundamentally broken.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
helper function
.. to improve readability and maintainability, and to align the code as per
the layout of the checks in chapter "VM Entries" in Intel SDM vol 3C.
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
.. as they are used only in nested vmx context.
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Mihai Carabas <mihai.carabas@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|