diff options
author | Andy Lutomirski <luto@kernel.org> | 2017-02-20 17:56:14 +0100 |
---|---|---|
committer | Paolo Bonzini <pbonzini@redhat.com> | 2017-02-21 12:45:08 +0100 |
commit | b7ffc44d5b2ea163899d09289ca7743d5c32e926 (patch) | |
tree | c41439937c5810b701a895d1f9e060b57573d540 /arch/x86/kernel/process.c | |
parent | x86/asm/64: Drop __cacheline_aligned from struct x86_hw_tss (diff) | |
download | linux-b7ffc44d5b2ea163899d09289ca7743d5c32e926.tar.xz linux-b7ffc44d5b2ea163899d09289ca7743d5c32e926.zip |
x86/kvm/vmx: Defer TR reload after VM exit
Intel's VMX is daft and resets the hidden TSS limit register to 0x67
on VMX reload, and the 0x67 is not configurable. KVM currently
reloads TR using the LTR instruction on every exit, but this is quite
slow because LTR is serializing.
The 0x67 limit is entirely harmless unless ioperm() is in use, so
defer the reload until a task using ioperm() is actually running.
Here's some poorly done benchmarking using kvm-unit-tests:
Before:
cpuid 1313
vmcall 1195
mov_from_cr8 11
mov_to_cr8 17
inl_from_pmtimer 6770
inl_from_qemu 6856
inl_from_kernel 2435
outl_to_kernel 1402
After:
cpuid 1291
vmcall 1181
mov_from_cr8 11
mov_to_cr8 16
inl_from_pmtimer 6457
inl_from_qemu 6209
inl_from_kernel 2339
outl_to_kernel 1391
Signed-off-by: Andy Lutomirski <luto@kernel.org>
[Force-reload TR in invalidate_tss_limit. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Diffstat (limited to 'arch/x86/kernel/process.c')
-rw-r--r-- | arch/x86/kernel/process.c | 10 |
1 files changed, 10 insertions, 0 deletions
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index b615a1113f58..7780efa635b9 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -32,6 +32,7 @@ #include <asm/mce.h> #include <asm/vm86.h> #include <asm/switch_to.h> +#include <asm/desc.h> /* * per-CPU TSS segments. Threads are completely 'soft' on Linux, @@ -64,6 +65,9 @@ __visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss) = { }; EXPORT_PER_CPU_SYMBOL(cpu_tss); +DEFINE_PER_CPU(bool, need_tr_refresh); +EXPORT_PER_CPU_SYMBOL_GPL(need_tr_refresh); + /* * this gets called so that we can store lazy state into memory and copy the * current task into the new thread. @@ -209,6 +213,12 @@ void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p, */ memcpy(tss->io_bitmap, next->io_bitmap_ptr, max(prev->io_bitmap_max, next->io_bitmap_max)); + + /* + * Make sure that the TSS limit is correct for the CPU + * to notice the IO bitmap. + */ + refresh_TR(); } else if (test_tsk_thread_flag(prev_p, TIF_IO_BITMAP)) { /* * Clear any possible leftover bits: |