linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	KVM: Add support for in-kernel pio handlers	Eddie Dong	2007-07-16	1	-1/+4
\| \| \| \| \| \| \|	Useful for the PIC and PIT. Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Adds support for in-kernel mmio handlers	Gregory Haskins	2007-07-16	1	-0/+60
\| \| \| \| \|	Signed-off-by: Gregory Haskins <ghaskins@novell.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Flush remote tlbs when reducing shadow pte permissions	Avi Kivity	2007-07-16	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \|	When a vcpu causes a shadow tlb entry to have reduced permissions, it must also clear the tlb on remote vcpus. We do that by: - setting a bit on the vcpu that requests a tlb flush before the next entry - if the vcpu is currently executing, we send an ipi to make sure it exits before we continue Signed-off-by: Avi Kivity <avi@qumranet.com>
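The two-step scheme described above (set a request bit, then kick any vcpu that is currently running) can be sketched as a toy Python model. This is not the kernel code: the `Vcpu` class, its fields, and `flush_remote_tlbs()` are hypothetical names chosen for illustration, and the IPI is modeled as simply forcing the vcpu out of guest mode.

```python
from dataclasses import dataclass


@dataclass
class Vcpu:
    vcpu_id: int
    in_guest: bool = False             # True while executing guest code
    tlb_flush_requested: bool = False  # the "request" bit from the commit message
    flushes: int = 0                   # number of flushes actually performed

    def enter_guest(self):
        # A pending request is honored before the next guest entry.
        if self.tlb_flush_requested:
            self.tlb_flush_requested = False
            self.flushes += 1
        self.in_guest = True


def flush_remote_tlbs(vcpus, requester):
    """Request a flush on every other vcpu; return the ids that were kicked."""
    kicked = []
    for v in vcpus:
        if v is requester:
            continue
        v.tlb_flush_requested = True
        if v.in_guest:
            # Stand-in for the IPI: force an exit so the request is noticed.
            v.in_guest = False
            kicked.append(v.vcpu_id)
    return kicked
```

In this model a vcpu that is not running needs no kick, because it will see the request bit on its next entry anyway.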
*	KVM: Keep an upper bound of initialized vcpus	Avi Kivity	2007-07-16	1	-0/+1
\| \| \| \| \| \| \|	That way, we don't need to loop for KVM_MAX_VCPUS for a single vcpu vm. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Emulate hlt on real mode for Intel	Avi Kivity	2007-07-16	1	-0/+1
\| \| \| \| \| \| \|	This has two use cases: the bios can't boot from disk, and guest smp bootstrap. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Move duplicate halt handling code into kvm_main.c	Avi Kivity	2007-07-16	1	-0/+1
\| \| \| \| \| \|	Will soon have a third user. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Enable guest smp	Avi Kivity	2007-07-16	1	-1/+1
\| \| \| \| \| \| \|	As we don't support guest tlb shootdown yet, this is only reliable for real-mode guests. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Lazy guest cr3 switching	Avi Kivity	2007-07-16	1	-0/+10
\| \| \| \| \| \| \| \| \|	Switching the guest paging context may require us to allocate memory, which might fail. Instead of wiring up error paths everywhere, make context switching lazy and actually do the switch before the next guest entry, where we can return an error if allocation fails. Signed-off-by: Avi Kivity <avi@qumranet.com>
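A minimal sketch of the lazy-switch idea, as a Python model rather than the actual kernel change. All names here (`Vcpu`, `set_cr3`, `enter_guest`, the `allocate` callback, `AllocationError`) are hypothetical; the -12 return value stands in for -ENOMEM.

```python
class AllocationError(Exception):
    """Raised by the hypothetical allocator when memory is unavailable."""


class Vcpu:
    def __init__(self, allocate):
        self.allocate = allocate     # callable(cr3) -> context; may raise
        self.pending_cr3 = None      # switch requested but not yet performed
        self.active_context = None

    def set_cr3(self, cr3):
        # Cannot fail: it only records that a switch is needed.
        self.pending_cr3 = cr3

    def enter_guest(self):
        # The one place where allocation failure has to be handled.
        if self.pending_cr3 is not None:
            try:
                self.active_context = self.allocate(self.pending_cr3)
            except AllocationError:
                return -12           # illustrative stand-in for -ENOMEM
            self.pending_cr3 = None
        return 0
```

Because `set_cr3()` never allocates, its callers need no error path; a failed allocation leaves the request pending, so a later entry attempt can retry it.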
*	KVM: MMU: Use slab caches for shadow pages and their headers	Avi Kivity	2007-07-16	1	-2/+2
\| \| \| \| \| \|	Use slab caches instead of a simple custom list. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Fix includes	Markus Rechberger	2007-07-16	1	-0/+2
\| \| \| \| \| \| \|	KVM compilation fails for some .configs. This fixes it. Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: VMX: Avoid saving and restoring msr_efer on lightweight vmexit	Eddie Dong	2007-07-16	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The MSR_EFER.LME/LMA bits are automatically saved and restored by VMX hardware, so KVM only needs to save the NX/SCE bits at the time of a heavyweight VM exit. But clearing the NX bit in the host environment may cause a system hang if the host page table is using EXB bits, so we leave the NX bit as it is. If host NX=1 and guest NX=0, we can check the guest page table EXB bits before inserting a shadow pte (though no guest is expecting to see this kind of gp fault). If host NX=0, we do not present the Execute-Disable feature to the guest, so there is no host NX=0, guest NX=1 combination. This patch reduces raw vmexit time by ~27%. Me: fix compile warnings on i386. Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: VMX: Avoid saving and restoring msrs on lightweight vmexit	Eddie Dong	2007-07-16	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \|	In a lightweight exit (where we exit and reenter the guest without scheduling or exiting to userspace in between), we don't need various msrs on the host, and avoiding shuffling them around reduces raw exit time by 8%. i386 compile fix by Daniel Hecken <dh@bahntechnik.de>. Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: MMU: Store shadow page tables as kernel virtual addresses, not physical	Avi Kivity	2007-07-16	1	-1/+1
\| \| \| \| \| \|	Simplifies things a bit. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Set cr0.mp for guests	Avi Kivity	2007-07-16	1	-1/+3
\| \| \| \| \| \| \|	This allows fwait instructions to be trapped when the guest fpu is not loaded. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Consolidate guest fpu activation and deactivation	Avi Kivity	2007-07-16	1	-1/+1
\| \| \| \| \| \|	Easier to keep track of where the fpu is this way. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Fix potential guest state leak into host	Avi Kivity	2007-07-16	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \|	The lightweight vmexit path avoids saving and reloading certain host state. However, in certain cases lightweight vmexit handling can schedule(), which requires reloading the host state. So we store the host state in the vcpu structure, and reload it if we relinquish the vcpu. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Increase mmu shadow cache to 1024 pages	Avi Kivity	2007-07-16	1	-1/+1
\| \| \| \| \| \| \|	This improves kbuild times by about 10%, bringing it within a respectable 25% of native. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Unify kvm_mmu_pre_write() and kvm_mmu_post_write()	Avi Kivity	2007-07-16	1	-2/+2
\| \| \| \| \| \| \|	Instead of calling two functions and repeating expensive checks, call one function and provide it with before/after information. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Avoid saving and restoring some host CPU state on lightweight vmexit	Avi Kivity	2007-07-16	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Many msrs and the like will only be used by the host if we schedule() or return to userspace. Therefore, we avoid saving them if we handle the exit within the kernel, and if a reschedule is not requested. Based on a patch from Eddie Dong <eddie.dong@intel.com> with a couple of fixes by me. Signed-off-by: Yaozu(Eddie) Dong <eddie.dong@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Prevent guest fpu state from leaking into the host	Avi Kivity	2007-06-15	1	-0/+3
\| \| \| \| \| \| \| \| \|	The lazy fpu changes did not take into account that some vmexit handlers can sleep. Move loading the guest state into the inner loop so that it can be reloaded if necessary, and move loading the host state into vmx_vcpu_put() so it can be performed whenever we relinquish the vcpu. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	Detach sched.h from mm.h	Alexey Dobriyan	2007-05-21	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	First thing mm.h does is including sched.h solely for can_do_mlock() inline function which has "current" dereference inside. By dealing with can_do_mlock() mm.h can be detached from sched.h which is good. See below, why. This patch a) removes unconditional inclusion of sched.h from mm.h b) makes can_do_mlock() normal function in mm/mlock.c c) exports can_do_mlock() to not break compilation d) adds sched.h inclusions back to files that were getting it indirectly. e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were getting them indirectly Net result is: a) mm.h users would get less code to open, read, preprocess, parse, ... if they don't need sched.h b) sched.h stops being dependency for significant number of files: on x86_64 allmodconfig touching sched.h results in recompile of 4083 files, after patch it's only 3744 (-8.3%). Cross-compile tested on all arm defconfigs, all mips defconfigs, all powerpc defconfigs, alpha alpha-up arm i386 i386-up i386-defconfig i386-allnoconfig ia64 ia64-up m68k mips parisc parisc-up powerpc powerpc-up s390 s390-up sparc sparc-up sparc64 sparc64-up um-x86_64 x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig as well as my two usual configs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	KVM: Remove extraneous guest entry on mmio read	Avi Kivity	2007-05-03	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	When emulating an mmio read, we actually emulate twice: once to determine the physical address of the mmio, and, after we've exited to userspace to get the mmio value, we emulate again to place the value in the result register and update any flags. But we don't really need to enter the guest again for that, only to take an immediate vmexit. So, if we detect that we're doing an mmio read, emulate a single instruction before entering the guest again. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: VMX: Properly shadow the CR0 register in the vcpu struct	Anthony Liguori	2007-05-03	1	-1/+1
\| \| \| \| \| \| \| \|	Set all of the host mask bits for CR0 so that we can maintain a proper shadow of CR0. This exposes CR0.TS, paving the way for lazy fpu handling. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Lazy FPU support for SVM	Anthony Liguori	2007-05-03	1	-0/+2
\| \| \| \| \| \| \| \|	Avoid saving and restoring the guest fpu state on every exit. This shaves ~100 cycles off the guest/host switch. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Per-vcpu statistics	Avi Kivity	2007-05-03	1	-17/+18
\| \| \| \| \| \| \| \|	Make the exit statistics per-vcpu instead of global. This gives a 3.5% boost when running one virtual machine per core on my two socket dual core (4 cores total) machine. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Use slab caches to allocate mmu data structures	Avi Kivity	2007-05-03	1	-0/+3
\| \| \| \| \| \| \|	Better leak detection, statistics, memory use, speed -- goodness all around. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Add physical memory aliasing feature	Avi Kivity	2007-05-03	1	-0/+9
\| \| \| \| \| \| \| \|	With this, we can specify that accesses to one physical memory range will be remapped to another. This is useful for the vga window at 0xa0000 which is used as a movable window into the (much larger) framebuffer. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Simply gfn_to_page()	Avi Kivity	2007-05-03	1	-11/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Mapping a guest page to a host page is a common operation. Currently, one has first to find the memory slot where the page belongs (gfn_to_memslot), then locate the page itself (gfn_to_page()). This is clumsy, and also won't work well with memory aliases. So simplify gfn_to_page() not to require memory slot translation first, and instead do it internally. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Add mmu cache clear function	Dor Laor	2007-05-03	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	Functions that play around with the physical memory map need a way to clear mappings to possibly nonexistent or invalid memory. Both the mmu cache and the processor tlb are cleared. Signed-off-by: Dor Laor <dor.laor@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: SVM: Ensure timestamp counter monotonicity	Avi Kivity	2007-05-03	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	When a vcpu is migrated from one cpu to another, its timestamp counter may lose its monotonic property if the host has unsynced timestamp counters. This can confuse the guest, sometimes to the point of refusing to boot. As the rdtsc instruction is rather fast on AMD processors (7-10 cycles), we can simply record the last host tsc when we drop the cpu, and adjust the vcpu tsc offset when we detect that we've migrated to a different cpu. Signed-off-by: Avi Kivity <avi@qumranet.com>
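The offset adjustment described above can be illustrated with a small Python model. It assumes the guest-visible counter is the host counter plus a per-vcpu offset; the class and method names (`Vcpu`, `put`, `load`, `guest_tsc`) are hypothetical and do not mirror the SVM code.

```python
class Vcpu:
    def __init__(self):
        self.tsc_offset = 0       # guest tsc = host tsc + tsc_offset
        self.last_cpu = None      # cpu the vcpu last ran on
        self.last_host_tsc = 0    # host tsc recorded when that cpu was dropped

    def guest_tsc(self, host_tsc):
        return host_tsc + self.tsc_offset

    def put(self, cpu, host_tsc):
        # Record the host counter when we drop the cpu.
        self.last_cpu = cpu
        self.last_host_tsc = host_tsc

    def load(self, cpu, host_tsc):
        # On migration, shift the offset by the difference between the
        # old cpu's last reading and the new cpu's current reading.
        if self.last_cpu is not None and cpu != self.last_cpu:
            self.tsc_offset += self.last_host_tsc - host_tsc
        self.last_cpu = cpu
```

For example, if the vcpu leaves cpu 0 at host tsc 1000 and resumes on cpu 1 whose counter reads 400, the offset grows by 600, so the guest still reads 1000 instead of jumping back to 400.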
*	KVM: MMU: Fix hugepage pdes mapping same physical address with different access	Avi Kivity	2007-05-03	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The kvm mmu keeps a shadow page for hugepage pdes; if several such pdes map the same physical address, they share the same shadow page. This is a fairly common case (kernel mappings on i386 nonpae Linux, for example). However, if the two pdes map the same memory but with different permissions, kvm will happily use the cached shadow page. If the access through the more permissive pde occurs after the access through the strict pde, an endless pagefault loop will be generated and the guest will make no progress. Fix by making the access permissions part of the cache lookup key. The fix allows Xen pae to boot on kvm and run guest domains. Thanks to Jeremy Fitzhardinge for reporting the bug and testing the fix. Signed-off-by: Avi Kivity <avi@qumranet.com>
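The effect of adding the access permissions to the lookup key can be shown with a toy cache in Python. This is only an illustration: `ShadowCache`, `lookup_or_create`, and the string permission values are hypothetical, and a real shadow page carries far more state.

```python
class ShadowCache:
    def __init__(self, key_includes_access):
        self.key_includes_access = key_includes_access
        self.pages = {}

    def lookup_or_create(self, gfn, access):
        # Before the fix the key is the frame number alone; after the fix
        # the access permissions are part of the key.
        key = (gfn, access) if self.key_includes_access else gfn
        if key not in self.pages:
            self.pages[key] = {"gfn": gfn, "access": access}
        return self.pages[key]


buggy = ShadowCache(key_includes_access=False)
buggy.lookup_or_create(0x200, "r")            # strict pde seen first
stale = buggy.lookup_or_create(0x200, "rw")   # permissive pde gets the "r" page

fixed = ShadowCache(key_includes_access=True)
fixed.lookup_or_create(0x200, "r")
fresh = fixed.lookup_or_create(0x200, "rw")   # gets its own "rw" page
```

In the buggy variant the permissive mapping is handed the read-only page, which in this model corresponds to the write fault that can never be resolved.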
*	KVM: Remove set_cr0_no_modeswitch() arch op	Avi Kivity	2007-05-03	1	-2/+0
\| \| \| \| \| \| \| \| \|	set_cr0_no_modeswitch() was a hack to avoid corrupting segment registers. As we now cache the protected mode values on entry to real mode, this isn't an issue anymore, and it interferes with reboot (which usually _is_ a modeswitch). Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: MMU: Remove global pte tracking	Avi Kivity	2007-05-03	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \|	The initial, noncaching, version of the kvm mmu flushed all nonglobal shadow page table translations (much like a native tlb flush). The new implementation flushes translations only when they change, rendering global pte tracking superfluous. This removes the unused tracking mechanism and storage space. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Avoid guest virtual addresses in string pio userspace interface	Avi Kivity	2007-05-03	1	-1/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current string pio interface communicates using guest virtual addresses, relying on userspace to translate addresses and to check permissions. This interface cannot fully support guest smp, as the check needs to take into account two pages at once in case an unaligned string transfer straddles a page boundary. Change the interface not to communicate guest addresses at all; instead use a buffer page (mmaped by userspace) and do transfers there. The kernel manages the virtual to physical translation and can perform the checks atomically by taking the appropriate locks. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Add guest mode signal mask	Avi Kivity	2007-05-03	1	-0/+3
\| \| \| \| \| \| \| \| \|	Allow a special signal mask to be used while executing in guest mode. This allows signals to be used to interrupt a vcpu without requiring signal delivery to a userspace handler, which is quite expensive. Userspace still receives -EINTR and can get the signal via sigwait(). Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Handle cpuid in the kernel instead of punting to userspace	Avi Kivity	2007-05-03	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	KVM used to handle cpuid by letting userspace decide what values to return to the guest. We now handle cpuid completely in the kernel. We still let userspace decide which values the guest will see by having userspace set up the value table beforehand (this is necessary to allow management software to set the cpu features to the least common denominator, so that live migration can work). The motivation for the change is that kvm kernel code can be impacted by cpuid features, for example the x86 emulator. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Do not communicate to userspace through cpu registers during PIO	Avi Kivity	2007-05-03	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Currently when passing a PIO emulation request to userspace, we rely on userspace updating %rax (on 'in' instructions) and %rsi/%rdi/%rcx (on string instructions). This (a) requires two extra ioctls for getting and setting the registers and (b) is unfriendly to non-x86 archs, when they get kvm ports. So fix by doing the register fixups in the kernel and passing to userspace only an abstract description of the PIO to be done. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Use a shared page for kernel/user communication when runing a vcpu	Avi Kivity	2007-05-03	1	-0/+1
\| \| \| \| \| \| \| \| \|	Instead of passing a 'struct kvm_run' back and forth between the kernel and userspace, allocate a page and allow the user to mmap() it. This reduces needless copying and makes the interface expandable by providing lots of free space. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Per-vcpu inodes	Avi Kivity	2007-03-04	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allocate a distinct inode for every vcpu in a VM. This has the following benefits: - the filp cachelines are no longer bounced when f_count is incremented on every ioctl() - the API and internal code are distinctly clearer; for example, on the KVM_GET_REGS ioctl, there is no need to copy the vcpu number from userspace and then copy the registers back; the vcpu identity is derived from the fd used to make the call. Right now the performance benefits are completely theoretical since (a) we don't support more than one vcpu per VM and (b) virtualization hardware inefficiencies completely overwhelm any cacheline bouncing effects. But both of these will change, and we need to prepare the API today. Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Wire up hypercall handlers to a central arch-independent location	Avi Kivity	2007-03-04	1	-0/+2
\| \| \| \|	Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: add MSR based hypercall API	Ingo Molnar	2007-03-04	1	-0/+6
\| \| \| \| \| \| \| \|	This adds a special MSR based hypercall API to KVM. This is to be used by paravirtual kernels and virtual drivers. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	KVM: Use page_private()/set_page_private() apis	Markus Rechberger	2007-03-04	1	-1/+1
\| \| \| \| \| \| \|	Besides using an established api, this allows using kvm in older kernels. Signed-off-by: Markus Rechberger <markus.rechberger@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
*	[PATCH] KVM: cpu hotplug support	Avi Kivity	2007-02-12	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	On hotplug, we execute the hardware extension enable sequence. On unplug, we decache any vcpus that last ran on the exiting cpu, and execute the hardware extension disable sequence. Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] KVM: Add a global list of all virtual machines	Avi Kivity	2007-02-12	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	This will allow us to iterate over all vcpus and see which cpus they are running on. [akpm@osdl.org: use standard (ugly) initialisers] Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] kvm: Fix asm constraint for lldt instruction	S.Caglar Onur	2007-02-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	lldt does not accept immediate operands, which "g" allows. Signed-off-by: S.Caglar Onur <caglar@pardus.org.tr> Signed-off-by: Avi Kivity <avi@qumranet.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] KVM: Emulate IA32_MISC_ENABLE msr	Avi Kivity	2007-01-26	1	-0/+1
\| \| \| \| \| \| \| \|	This allows netbsd 3.1 i386 to get further along installing. Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
*	[PATCH] KVM: MMU: Replace atomic allocations by preallocated objects	Avi Kivity	2007-01-06	1	-1/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The mmu sometimes needs memory for reverse mapping and parent pte chains. However, we can't allocate from within the mmu because of the atomic context. So, move the allocations to a central place that can be executed before the main mmu machinery, where we can bail out on failure before any damage is done. (Error handling is deferred for now, but the basic structure is there.) Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] KVM: MMU: Never free a shadow page actively serving as a root	Avi Kivity	2007-01-06	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	We always need cr3 to point to something valid, so if we detect that we're freeing a root page, simply push it back to the top of the active list. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] KVM: MMU: Page table write flood protection	Avi Kivity	2007-01-06	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In fork() (or when we protect a page that is no longer a page table), we can experience floods of writes to a page, which have to be emulated. This is expensive. So, if we detect such a flood, zap the page so subsequent writes can proceed natively. Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
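The flood heuristic can be sketched roughly as follows. This Python model is a deliberate simplification: the threshold value, the `ShadowedPage` class, and `guest_write()` are assumptions made for illustration, and the real detection logic in the kernel may differ.

```python
FLOOD_THRESHOLD = 3  # arbitrary value chosen for this sketch


class ShadowedPage:
    def __init__(self):
        self.write_protected = True  # shadowed guest page tables are write protected
        self.emulated_writes = 0     # writes that had to be emulated so far


def guest_write(page):
    """Return how the write was handled: 'emulated' or 'native'."""
    if not page.write_protected:
        return "native"
    page.emulated_writes += 1
    if page.emulated_writes >= FLOOD_THRESHOLD:
        # Flood detected: zap the page so later writes are not trapped.
        page.write_protected = False
    return "emulated"


page = ShadowedPage()
results = [guest_write(page) for _ in range(5)]
```

With the threshold set to 3, the first three writes are emulated and the remaining two proceed natively.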
*	[PATCH] KVM: MMU: Remove invlpg interception	Avi Kivity	2007-01-06	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \|	Since we write protect shadowed guest page tables, there is no need to trap page invalidations (the guest will always change the mapping before issuing the invlpg instruction). Signed-off-by: Avi Kivity <avi@qumranet.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>