summaryrefslogtreecommitdiffstats
path: root/arch (follow)
Commit message (Collapse)AuthorAgeFilesLines
* KVM: MMU: check for present pdptr shadow page in walk_shadowMarcelo Tosatti2008-12-311-0/+2
| | | | | | | | | | walk_shadow assumes the caller verified validity of the pdptr pointer in question, which is not the case for the invlpg handler. Fixes oops during Solaris 10 install. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Consolidate userspace memory capability reporting into common codeAvi Kivity2008-12-314-7/+0
| | | | Signed-off-by: Avi Kivity <avi@redhat.com>
* x86: KVM guest: kvm_get_tsc_khz: return khz, not lpjEduardo Habkost2008-12-311-4/+4
| | | | | | | | | kvm_get_tsc_khz() currently returns the previously-calculated preset_lpj value, but it is in loops-per-jiffy, not kHz. The current code works correctly only when HZ=1000. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: prepopulate the shadow on invlpgMarcelo Tosatti2008-12-314-15/+40
| | | | | | | | | | | | If the guest executes invlpg, peek into the pagetable and attempt to prepopulate the shadow entry. Also stop dirty fault updates from interfering with the fork detector. 2% improvement on RHEL3/AIM7. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: skip global pgtables on sync due to cr3 switchMarcelo Tosatti2008-12-314-10/+63
| | | | | | | | | Skip syncing global pages on cr3 switch (but not on cr4/cr0). This is important for Linux 32-bit guests with PAE, where the kmap page is marked as global. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: collapse remote TLB flushes on root syncMarcelo Tosatti2008-12-311-5/+14
| | | | | | | | | | Collapse remote TLB flushes on root sync. kernbench is 2.7% faster on 4-way guest. Improvements have been seen with other loads such as AIM7. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: use page array in unsync walkMarcelo Tosatti2008-12-312-56/+141
| | | | | | | | | | Instead of invoking the handler directly collect pages into an array so the caller can work with it. Simplifies TLB flush collapsing. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: Fix handling of VMMCALL instructionAmit Shah2008-12-311-1/+1
| | | | | | | | | | | The VMMCALL instruction doesn't get recognised and isn't processed by the emulator. This is seen on an Intel host that tries to execute the VMMCALL instruction after a guest live migrates from an AMD host. Signed-off-by: Amit Shah <amit.shah@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: add the emulation of shld and shrd instructionsGuillaume Thouvenin2008-12-311-2/+15
| | | | | | | Add emulation of shld and shrd instructions Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: add the assembler code for three operandsGuillaume Thouvenin2008-12-311-0/+39
| | | | | | | | Add the assembler code for instruction with three operands and one operand is stored in ECX register Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: add a new "implied 1" Src decode typeGuillaume Thouvenin2008-12-311-0/+5
| | | | | | | | Add SrcOne operand type when we need to decode an implied '1' like with regular shift instruction Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: add Src2 decode setGuillaume Thouvenin2008-12-312-0/+30
| | | | | | | | | Instruction like shld has three operands, so we need to add a Src2 decode set. We start with Src2None, Src2CL, and Src2ImmByte, Src2One to support shld/shrd and we will expand it later. Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: Extend the opcode descriptorGuillaume Thouvenin2008-12-311-4/+4
| | | | | | | | Extend the opcode descriptor to 32 bits. This is needed by the introduction of a new Src2 operand type. Signed-off-by: Guillaume Thouvenin <guillaume.thouvenin@ext.bull.net> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: mostly cosmetic updates to the exit timing accounting codeHollis Blanchard2008-12-315-91/+66
| | | | | | | | The only significant changes were to kvmppc_exit_timing_write() and kvmppc_exit_timing_show(), both of which were dramatically simplified. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: Implement in-kernel exit timing statisticsHollis Blanchard2008-12-3113-18/+516
| | | | | | | | | | | | | | | | | | | | | | Existing KVM statistics are either just counters (kvm_stat) reported for KVM generally or trace based aproaches like kvm_trace. For KVM on powerpc we had the need to track the timings of the different exit types. While this could be achieved parsing data created with a kvm_trace extension this adds too much overhead (at least on embedded PowerPC) slowing down the workloads we wanted to measure. Therefore this patch adds a in-kernel exit timing statistic to the powerpc kvm code. These statistic is available per vm&vcpu under the kvm debugfs directory. As this statistic is low, but still some overhead it can be enabled via a .config entry and should be off by default. Since this patch touched all powerpc kvm_stat code anyway this code is now merged and simplified together with the exit timing statistic code (still working with exit timing disabled in .config). Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: save and restore guest mappings on context switchHollis Blanchard2008-12-313-5/+66
| | | | | | | | | Store shadow TLB entries in memory, but only use it on host context switch (instead of every guest entry). This improves performance for most workloads on 440 by reducing the guest TLB miss rate. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: directly insert shadow mappings into the hardware TLBHollis Blanchard2008-12-318-221/+168
| | | | | | | | | | | | | | | | | | | | | | | | | Formerly, we used to maintain a per-vcpu shadow TLB and on every entry to the guest would load this array into the hardware TLB. This consumed 1280 bytes of memory (64 entries of 16 bytes plus a struct page pointer each), and also required some assembly to loop over the array on every entry. Instead of saving a copy in memory, we can just store shadow mappings directly into the hardware TLB, accepting that the host kernel will clobber these as part of the normal 440 TLB round robin. When we do that we need less than half the memory, and we have decreased the exit handling time for all guest exits, at the cost of increased number of TLB misses because the host overwrites some guest entries. These savings will be increased on processors with larger TLBs or which implement intelligent flush instructions like tlbivax (which will avoid the need to walk arrays in software). In addition to that and to the code simplification, we have a greater chance of leaving other host userspace mappings in the TLB, instead of forcing all subsequent tasks to re-fault all their mappings. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* powerpc/44x: declare tlb_44x_index for use in C codeHollis Blanchard2008-12-311-0/+1
| | | | | | | | | | | | | | | KVM currently ignores the host's round robin TLB eviction selection, instead maintaining its own TLB state and its own round robin index. However, by participating in the normal 44x TLB selection, we can drop the alternate TLB processing in KVM. This results in a significant performance improvement, since that processing currently must be done on *every* guest exit. Accordingly, KVM needs to be able to access and increment tlb_44x_index. (KVM on 440 cannot be a module, so there is no need to export this symbol.) Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: support large host pagesHollis Blanchard2008-12-313-23/+64
| | | | | | | | | | KVM on 440 has always been able to handle large guest mappings with 4K host pages -- we must, since the guest kernel uses 256MB mappings. This patch makes KVM work when the host has large pages too (tested with 64K). Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: fix sparse warningHannes Eder2008-12-311-1/+1
| | | | | | | | | Impact: make global function static arch/x86/kvm/vmx.c:134:3: warning: symbol 'vmx_capability' was not declared. Should it be static? Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: Remove extraneous semicolon after do/whileAvi Kivity2008-12-311-1/+1
| | | | | | Notices by Guillaume Thouvenin. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: fix popf emulationAvi Kivity2008-12-311-0/+2
| | | | | | Set operand type and size to get correct writeback behavior. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: fix ret emulationAvi Kivity2008-12-311-0/+2
| | | | | | | 'ret' did not set the operand type or size for the destination, so writeback ignored it. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: switch 'pop reg' instruction to emulate_pop()Avi Kivity2008-12-311-7/+4
| | | | Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: allow pop from mmioAvi Kivity2008-12-311-3/+3
| | | | Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: Extract 'pop' sequence into a functionAvi Kivity2008-12-311-4/+17
| | | | | | Switch 'pop r/m' instruction to use the new function. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: s390: Fix memory leak of vcpu->runChristian Borntraeger2008-12-311-2/+2
| | | | | | | | | | | The s390 backend of kvm never calls kvm_vcpu_uninit. This causes a memory leak of vcpu->run pages. Lets call kvm_vcpu_uninit in kvm_arch_vcpu_destroy to free the vcpu->run. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: s390: Fix refcounting and allow module unloadChristian Borntraeger2008-12-311-14/+21
| | | | | | | | | | | | Currently it is impossible to unload the kvm module on s390. This patch fixes kvm_arch_destroy_vm to release all cpus. This make it possible to unload the module. In addition we stop messing with the module refcount in arch code. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: consolidate emulation of two operand instructionsAvi Kivity2008-12-311-51/+28
| | | | | | No need to repeat the same assembly block over and over. Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86 emulator: reduce duplication in one operand emulation thunksAvi Kivity2008-12-311-43/+23
| | | | Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: MMU: optimize set_spte for page syncMarcelo Tosatti2008-12-311-0/+9
| | | | | | | | | | | | | | | | | | The write protect verification in set_spte is unnecessary for page sync. Its guaranteed that, if the unsync spte was writable, the target page does not have a write protected shadow (if it had, the spte would have been write protected under mmu_lock by rmap_write_protect before). Same reasoning applies to mark_page_dirty: the gfn has been marked as dirty via the pagefault path. The cost of hash table and memslot lookups are quite significant if the workload is pagetable write intensive resulting in increased mmu_lock contention. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* x86: KVM guest: sign kvmclock as paravirtGlauber Costa2008-12-311-0/+2
| | | | | | | | | | Currently, we only set the KVM paravirt signature in case of CONFIG_KVM_GUEST. However, it is possible to have it turned off, while CONFIG_KVM_CLOCK is turned on. This is also a paravirt case, and should be shown accordingly. Signed-off-by: Glauber Costa <glommer@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: Conditionally request interrupt window after injecting irqAvi Kivity2008-12-311-0/+2
| | | | | | | | | | | If we're injecting an interrupt, and another one is pending, request an interrupt window notification so we don't have excess latency on the second interrupt. This shouldn't happen in practice since an EOI will be issued, giving a second chance to request an interrupt window, but... Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ia64: Clean up vmm_ivt.S using tab to indent every lineXiantao Zhang2008-12-311-741/+729
| | | | | | | Using tab for indentation for vmm_ivt.S. Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ia64: Add handler for crashed vmmXiantao Zhang2008-12-314-12/+44
| | | | | | | | | Since vmm runs in an isolated address space and it is just a copy of host's kvm-intel module, so once vmm crashes, we just crash all guests running on it instead of crashing whole kernel. Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ia64: Add some debug points to provide crash infomationXiantao Zhang2008-12-315-33/+88
| | | | | | | Use printk infrastructure to print out some debug info once VM crashes. Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ia64: Define printk function for kvm-intel moduleXiantao Zhang2008-12-315-1/+54
| | | | | | | | | | kvm-intel module is relocated to an isolated address space with kernel, so it can't call host kernel's printk for debug purpose. In the module, we implement the printk to output debug info of vmm. Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* x86: disable VMX on all CPUs on rebootEduardo Habkost2008-12-311-2/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On emergency_restart, we may need to use an NMI to disable virtualization on all CPUs. We do that using nmi_shootdown_cpus() if VMX is enabled. Note: With this patch, we will run the NMI stuff only when the CPU where emergency_restart() was called has VMX enabled. This should work on most cases because KVM enables VMX on all CPUs, but we may miss the small window where KVM is doing that. Also, I don't know if all code using VMX out there always enable VMX on all CPUs like KVM does. We have two other alternatives for that: a) Have an API that all code that enables VMX on any CPU should use to tell the kernel core that it is going to enable VMX on the CPUs. b) Always call nmi_shootdown_cpus() if the CPU supports VMX. This is a bit intrusive and more risky, as it would run nmi_shootdown_cpus() on emergency_reboot() even on systems where virtualization is never enabled. Finding a proper point to hook the nmi_shootdown_cpus() call isn't trivial, as the non-emergency machine_restart() (that doesn't need the NMI tricks) uses machine_emergency_restart() directly. The solution to make this work without adding a new function or argument to machine_ops was setting a 'reboot_emergency' flag that tells if native_machine_emergency_restart() needs to do the virt cleanup or not. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* kdump: forcibly disable VMX and SVM on machine_crash_shutdown()Eduardo Habkost2008-12-311-0/+18
| | | | | | | | | | | | | We need to disable virtualization extensions on all CPUs before booting the kdump kernel, otherwise the kdump kernel booting will fail, and rebooting after the kdump kernel did its task may also fail. We do it using cpu_emergency_vmxoff() and cpu_emergency_svm_disable(), that should always work, because those functions check if the CPUs support SVM or VMX before doing their tasks. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* x86: cpu_emergency_svm_disable() functionEduardo Habkost2008-12-311-0/+8
| | | | | | | | This function can be used by the reboot or kdump code to forcibly disable SVM on the CPU. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: move svm_hardware_disable() code to asm/virtext.hEduardo Habkost2008-12-312-5/+15
| | | | | | | Create cpu_svm_disable() function. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: move has_svm() code to asm/virtext.hEduardo Habkost2008-12-312-14/+46
| | | | | | | | | Use a trick to keep the printk()s on has_svm() working as before. gcc will take care of not generating code for the 'msg' stuff when the function is called with a NULL msg argument. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* x86: cpu_emergency_vmxoff() functionEduardo Habkost2008-12-311-0/+23
| | | | | | | | Add cpu_emergency_vmxoff() and its friends: cpu_vmx_enabled() and __cpu_emergency_vmxoff(). Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: extract kvm_cpu_vmxoff() from hardware_disable()Eduardo Habkost2008-12-311-2/+11
| | | | | | | | Along with some comments on why it is different from the core cpu_vmxoff() function. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* x86: asm/virtext.h: add cpu_vmxoff() inline functionEduardo Habkost2008-12-311-0/+15
| | | | | | | | | Unfortunately we can't use exactly the same code from vmx hardware_disable(), because the KVM function uses the __kvm_handle_fault_on_reboot() tricks. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: move cpu_has_kvm_support() to an inline on asm/virtext.hEduardo Habkost2008-12-312-2/+33
| | | | | | | | It will be used by core code on kdump and reboot, to disable vmx if needed. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: move ASM_VMX_* definitions from asm/kvm_host.h to asm/vmx.hEduardo Habkost2008-12-312-12/+15
| | | | | | | | | | | Those definitions will be used by code outside KVM, so move it outside of a KVM-specific source file. Those definitions are used only on kvm/vmx.c, that already includes asm/vmx.h, so they can be moved safely. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: SVM: move svm.h to include/asmEduardo Habkost2008-12-312-1/+1
| | | | | | | | svm.h will be used by core code that is independent of KVM, so I am moving it outside the arch/x86/kvm directory. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: VMX: move vmx.h to include/asmEduardo Habkost2008-12-313-2/+2
| | | | | | | | vmx.h will be used by core code that is independent of KVM, so I am moving it outside the arch/x86/kvm directory. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: ppc: fix userspace mapping invalidation on context switchHollis Blanchard2008-12-313-22/+20
| | | | | | | | | | | | | | | | We used to defer invalidating userspace TLB entries until jumping out of the kernel. This was causing MMU weirdness most easily triggered by using a pipe in the guest, e.g. "dmesg | tail". I believe the problem was that after the guest kernel changed the PID (part of context switch), the old process's mappings were still present, and so copy_to_user() on the "return to new process" path ended up using stale mappings. Testing with large pages (64K) exposed the problem, probably because with 4K pages, pressure on the TLB faulted all process A's mappings out before the guest kernel could insert any for process B. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>