linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	KVM: x86: always exit on EOIs for interrupts listed in the IOAPIC redir table	Paolo Bonzini	2014-07-30	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, the EOI exit bitmap (used for APICv) does not include interrupts that are masked. However, this can cause a bug that manifests as an interrupt storm inside the guest. Alex Williamson reported the bug and is the one who really debugged this; I only wrote the patch. :) The scenario involves a multi-function PCI device with OHCI and EHCI USB functions and an audio function, all assigned to the guest, where both USB functions use legacy INTx interrupts. As soon as the guest boots, interrupts for these devices turn into an interrupt storm in the guest; the host does not see the interrupt storm. Basically the EOI path does not work, and the guest continues to see the interrupt over and over, even after it attempts to mask it at the APIC. The bug is only visible with older kernels (RHEL6.5, based on 2.6.32 with not many changes in the area of APIC/IOAPIC handling). Alex then tried forcing bit 59 (corresponding to the USB functions' IRQ) on in the eoi_exit_bitmap and TMR, and things then work. What happens is that VFIO asserts IRQ11, then KVM recomputes the EOI exit bitmap. It does not have set bit 59 because the RTE was masked, so the IOAPIC never sees the EOI and the interrupt continues to fire in the guest. My guess was that the guest is masking the interrupt in the redirection table in the interrupt routine, i.e. while the interrupt is set in a LAPIC's ISR, The simplest fix is to ignore the masking state, we would rather have an unnecessary exit rather than a missed IRQ ACK and anyway IOAPIC interrupts are not as performance-sensitive as for example MSIs. Alex tested this patch and it fixed his bug. [Thanks to Alex for his precise description of the problem and initial debugging effort. A lot of the text above is based on emails exchanged with him.] Reported-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
*	KVM: ioapic: try to recover if pending_eoi goes out of range	Paolo Bonzini	2014-04-04	1	-5/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The RTC tracking code tracks the cardinality of rtc_status.dest_map into rtc_status.pending_eoi. It has some WARN_ONs that trigger if pending_eoi ever becomes negative; however, these do not do anything to recover, and it bad things will happen soon after they trigger. When the next RTC interrupt is triggered, rtc_check_coalesced() will return false, but ioapic_service will find pending_eoi != 0 and do a BUG_ON. To avoid this, should pending_eoi ever be nonzero, call kvm_rtc_eoi_tracking_restore_all to recompute a correct dest_map and pending_eoi. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
*	KVM: ioapic: fix assignment of ioapic->rtc_status.pending_eoi (CVE-2014-0155)	Paolo Bonzini	2014-04-04	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	QE reported that they got the BUG_ON in ioapic_service to trigger. I cannot reproduce it, but there are two reasons why this could happen. The less likely but also easiest one, is when kvm_irq_delivery_to_apic does not deliver to any APIC and returns -1. Because irqe.shorthand == 0, the kvm_for_each_vcpu loop in that function is never reached. However, you can target the similar loop in kvm_irq_delivery_to_apic_fast; just program a zero logical destination address into the IOAPIC, or an out-of-range physical destination address. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
*	KVM: ioapic: reinject pending interrupts on KVM_SET_IRQCHIP	Paolo Bonzini	2014-03-21	1	-1/+14
\| \| \| \| \| \| \| \| \| \| \| \|	After the previous patches, an interrupt whose bit is set in the IRR register will never be in the LAPIC's IRR and has never been injected on the migration source. So inject it on the destination. This fixes migration of Windows guests without HPET (they use the RTC to trigger the scheduler tick, and lose it after migration). Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
*	KVM: ioapic: extract body of kvm_ioapic_set_irq	Paolo Bonzini	2014-03-21	1	-24/+50
\| \| \| \| \| \| \|	We will reuse it to process a nonzero IRR that is passed to KVM_SET_IRQCHIP. Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
*	KVM: ioapic: clear IRR for edge-triggered interrupts at delivery	Paolo Bonzini	2014-03-21	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	This ensures that IRR bits are set in the KVM_GET_IRQCHIP result only if the interrupt is still sitting in the IOAPIC. After the next patches, it avoids spurious reinjection of the interrupt when KVM_SET_IRQCHIP is called. Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
*	KVM: ioapic: merge ioapic_deliver into ioapic_service	Paolo Bonzini	2014-03-21	1	-20/+9
\| \| \| \| \| \| \| \| \|	Commonize the handling of masking, which was absent for kvm_ioapic_set_irq. Setting remote_irr does not need a separate function either, and merging the two functions avoids confusion. Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
*	kvm: x86: ignore ioapic polarity	Gabriel L. Somlo	2014-03-13	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Both QEMU and KVM have already accumulated a significant number of optimizations based on the hard-coded assumption that ioapic polarity will always use the ActiveHigh convention, where the logical and physical states of level-triggered irq lines always match (i.e., active(asserted) == high == 1, inactive == low == 0). QEMU guests are expected to follow directions given via ACPI and configure the ioapic with polarity 0 (ActiveHigh). However, even when misbehaving guests (e.g. OS X <= 10.9) set the ioapic polarity to 1 (ActiveLow), QEMU will still use the ActiveHigh signaling convention when interfacing with KVM. This patch modifies KVM to completely ignore ioapic polarity as set by the guest OS, enabling misbehaving guests to work alongside those which comply with the ActiveHigh polarity specified by QEMU's ACPI tables. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Gabriel L. Somlo <somlo@cmu.edu> [Move documentation to KVM_IRQ_LINE, add ia64. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
*	kvm: make local functions static	Stephen Hemminger	2014-01-08	1	-1/+1
\| \| \| \| \| \| \| \|	Running 'make namespacecheck' found lots of functions that should be declared static, since only used in one file. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	KVM: Set TMR when programming ioapic entry	Yang Zhang	2013-04-16	1	-3/+9
\| \| \| \| \| \| \| \| \| \|	We already know the trigger mode of a given interrupt when programming the ioapice entry. So it's not necessary to set it in each interrupt delivery. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	KVM: Call common update function when ioapic entry changed.	Yang Zhang	2013-04-16	1	-9/+13
\| \| \| \| \| \| \| \| \| \|	Both TMR and EOI exit bitmap need to be updated when ioapic changed or vcpu's id/ldr/dfr changed. So use common function instead eoi exit bitmap specific function. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	KVM: Use eoi to track RTC interrupt delivery status	Yang Zhang	2013-04-16	1	-1/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current interrupt coalescing logci which only used by RTC has conflict with Posted Interrupt. This patch introduces a new mechinism to use eoi to track interrupt: When delivering an interrupt to vcpu, the pending_eoi set to number of vcpu that received the interrupt. And decrease it when each vcpu writing eoi. No subsequent RTC interrupt can deliver to vcpu until all vcpus write eoi. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	KVM: Let ioapic know the irq line status	Yang Zhang	2013-04-16	1	-8/+10
\| \| \| \| \| \| \| \| \|	Userspace may deliver RTC interrupt without query the status. So we want to track RTC EOI for this case. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	KVM: Force vmexit with virtual interrupt delivery	Yang Zhang	2013-04-16	1	-1/+1
\| \| \| \| \| \| \| \| \|	Need the EOI to track interrupt deliver status, so force vmexit on EOI for rtc interrupt when enabling virtual interrupt delivery. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	KVM: Add reset/restore rtc_status support	Yang Zhang	2013-04-16	1	-0/+58
\| \| \| \| \| \| \| \|	restore rtc_status from migration or save/restore Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	KVM: Return destination vcpu on interrupt injection	Yang Zhang	2013-04-16	1	-1/+1
\| \| \| \| \| \| \| \|	Add a new parameter to know vcpus who received the interrupt. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	KVM: Add vcpu info to ioapic_update_eoi()	Yang Zhang	2013-04-16	1	-6/+6
\| \| \| \| \| \| \| \| \|	Add vcpu info to ioapic_update_eoi, so we can know which vcpu issued this EOI. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	KVM: Call kvm_apic_match_dest() to check destination vcpu	Yang Zhang	2013-04-07	1	-6/+3
\| \| \| \| \| \| \| \| \|	For a given vcpu, kvm_apic_match_dest() will tell you whether the vcpu in the destination list quickly. Drop kvm_calculate_eoi_exitmap() and use kvm_apic_match_dest() instead. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
*	KVM: Fix bounds checking in ioapic indirect register reads (CVE-2013-1798)	Andy Honig	2013-03-19	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the guest specifies a IOAPIC_REG_SELECT with an invalid value and follows that with a read of the IOAPIC_REG_WINDOW KVM does not properly validate that request. ioapic_read_indirect contains an ASSERT(redir_index < IOAPIC_NUM_PINS), but the ASSERT has no effect in non-debug builds. In recent kernels this allows a guest to cause a kernel oops by reading invalid memory. In older kernels (pre-3.3) this allows a guest to read from large ranges of host memory. Tested: tested against apic unit tests. Signed-off-by: Andrew Honig <ahonig@google.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	x86, apicv: add virtual interrupt delivery support	Yang Zhang	2013-01-29	1	-0/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Virtual interrupt delivery avoids KVM to inject vAPIC interrupts manually, which is fully taken care of by the hardware. This needs some special awareness into existing interrupr injection path: - for pending interrupt, instead of direct injection, we may need update architecture specific indicators before resuming to guest. - A pending interrupt, which is masked by ISR, should be also considered in above update action, since hardware will decide when to inject it at right time. Current has_interrupt and get_interrupt only returns a valid vector from injection p.o.v. Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
*	KVM: remove a wrong hack of delivery PIT intr to vcpu0	Yang Zhang	2012-12-23	1	-9/+0
\| \| \| \| \| \| \| \| \|	This hack is wrong. The pin number of PIT is connected to 2 not 0. This means this hack never takes effect. So it is ok to remove it. Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
*	KVM: x86: drop parameter validation in ioapic/pic	Michael S. Tsirkin	2012-08-15	1	-18/+19
\| \| \| \| \| \| \| \| \| \|	We validate irq pin number when routing is setup, so code handling illegal irq # in pic and ioapic on each injection is never called. Drop it, replace with BUG_ON to catch out of bounds access bugs. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	KVM: fix race with level interrupts	Michael S. Tsirkin	2012-07-20	1	-3/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When more than 1 source id is in use for the same GSI, we have the following race related to handling irq_states race: CPU 0 clears bit 0. CPU 0 read irq_state as 0. CPU 1 sets level to 1. CPU 1 calls kvm_ioapic_set_irq(1). CPU 0 calls kvm_ioapic_set_irq(0). Now ioapic thinks the level is 0 but irq_state is not 0. Fix by performing all irq_states bitmap handling under pic/ioapic lock. This also removes the need for atomics with irq_states handling. Reported-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	KVM: dont clear TMR on EOI	Michael S. Tsirkin	2012-04-17	1	-3/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Intel spec says that TMR needs to be set/cleared when IRR is set, but kvm also clears it on EOI. I did some tests on a real (AMD based) system, and I see same TMR values both before and after EOI, so I think it's a minor bug in kvm. This patch fixes TMR to be set/cleared on IRR set only as per spec. And now that we don't clear TMR, we can save an atomic read of TMR on EOI that's not propagated to ioapic, by checking whether ioapic needs a specific vector first and calculating the mode afterwards. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	KVM: drop bsp_vcpu pointer from kvm struct	Gleb Natapov	2011-12-27	1	-1/+1
\| \| \| \| \| \| \| \|	Drop bsp_vcpu pointer from kvm struct since its only use is incorrect anyway. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	KVM: Allow aligned byte and word writes to IOAPIC registers.	Julian Stecklina	2011-12-27	1	-3/+12
\| \| \| \| \| \| \| \| \|	This fixes byte accesses to IOAPIC_REG_SELECT as mandated by at least the ICH10 and Intel Series 5 chipset specs. It also makes ioapic_mmio_write consistent with ioapic_mmio_read, which also allows byte and word accesses. Signed-off-by: Julian Stecklina <js@alien8.de> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: Intelligent device lookup on I/O bus	Sasha Levin	2011-09-25	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the method of dealing with an IO operation on a bus (PIO/MMIO) is to call the read or write callback for each device registered on the bus until we find a device which handles it. Since the number of devices on a bus can be significant due to ioeventfds and coalesced MMIO zones, this leads to a lot of overhead on each IO operation. Instead of registering devices, we now register ranges which points to a device. Lookup is done using an efficient bsearch instead of a linear search. Performance test was conducted by comparing exit count per second with 200 ioeventfds created on one byte and the guest is trying to access a different byte continuously (triggering usermode exits). Before the patch the guest has achieved 259k exits per second, after the patch the guest does 274k exits per second. Cc: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: ioapic: Fix an error field reference	Liu Yuan	2011-05-22	1	-1/+1
\| \| \| \| \| \| \| \|	Function ioapic_debug() in the ioapic_deliver() misnames one filed by reference. This patch correct it. Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: Convert mask notifiers to use irqchip/pin instead of gsi	Gleb Natapov	2010-08-02	1	-1/+1
\| \| \| \| \| \| \| \| \|	Devices register mask notifier using gsi, but irqchip knows about irqchip/pin, so conversion from irqchip/pin to gsi should be done before looking for mask notifier to call. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	KVM: Update Red Hat copyrights	Avi Kivity	2010-08-01	1	-0/+1
\| \| \| \|	Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: read apic->irr with ioapic lock held	Marcelo Tosatti	2010-06-10	1	-1/+2
\| \| \| \| \| \| \|	Read ioapic->irr inside ioapic->lock protected section. KVM-Stable-Tag Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	KVM: convert ioapic lock to spinlock	Marcelo Tosatti	2010-05-13	1	-15/+15
\| \| \| \| \| \| \| \| \|	kvm_set_irq is used from non sleepable contexes, so convert ioapic from mutex to spinlock. KVM-Stable-Tag. Tested-by: Ralf Bonenkamp <ralf.bonenkamp@swyx.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	include cleanup: Update gfp.h and slab.h includes to prepare for breaking ↵	Tejun Heo	2010-03-30	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
*	KVM: cleanup the failure path of KVM_CREATE_IRQCHIP ioctrl	Wei Yongjun	2010-03-01	1	-0/+11
\| \| \| \| \| \| \| \| \|	If we fail to init ioapic device or the fail to setup the default irq routing, the device register by kvm_create_pic() and kvm_ioapic_init() remain unregister. This patch fixed to do this. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: kvm->arch.vioapic should be NULL if kvm_ioapic_init() failure	Wei Yongjun	2010-03-01	1	-1/+3
\| \| \| \| \| \| \| \|	kvm->arch.vioapic should be NULL in case of kvm_ioapic_init() failure due to cannot register io dev. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: avoid taking ioapic mutex for non-ioapic EOIs	Avi Kivity	2010-03-01	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When the guest acknowledges an interrupt, it sends an EOI message to the local apic, which broadcasts it to the ioapic. To handle the EOI, we need to take the ioapic mutex. On large guests, this causes a lot of contention on this mutex. Since large guests usually don't route interrupts via the ioapic (they use msi instead), this is completely unnecessary. Avoid taking the mutex by introducing a handled_vectors bitmap. Before taking the mutex, check if the ioapic was actually responsible for the acked vector. If not, we can return early. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	KVM: convert slots_lock to a mutex	Marcelo Tosatti	2010-03-01	1	-2/+2
\| \| \| \|	Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	KVM: convert io_bus to SRCU	Marcelo Tosatti	2010-03-01	1	-1/+3
\| \| \| \|	Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
*	KVM: Move IO APIC to its own lock	Gleb Natapov	2009-12-03	1	-19/+61
\| \| \| \| \| \| \|	The allows removal of irq_lock from the injection path. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: Fix coalesced interrupt reporting in IOAPIC	Gleb Natapov	2009-09-10	1	-0/+2
\| \| \| \| \| \| \| \| \|	This bug was introduced by b4a2f5e723e4f7df467. Cc: stable@kernel.org Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: make io_bus interface more robust	Gregory Haskins	2009-09-10	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Today kvm_io_bus_regsiter_dev() returns void and will internally BUG_ON if it fails. We want to create dynamic MMIO/PIO entries driven from userspace later in the series, so we need to enhance the code to be more robust with the following changes: 1) Add a return value to the registration function 2) Fix up all the callsites to check the return code, handle any failures, and percolate the error up to the caller. 3) Add an unregister function that collapses holes in the array Signed-off-by: Gregory Haskins <ghaskins@novell.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: Add trace points in irqchip code	Gleb Natapov	2009-09-10	1	-0/+2
\| \| \| \| \| \| \| \| \|	Add tracepoint in msi/ioapic/pic set_irq() functions, in IPI sending and in the point where IRQ is placed into apic's IRR. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: Use temporary variable to shorten lines.	Gleb Natapov	2009-09-10	1	-8/+10
\| \| \| \| \| \| \|	Cosmetic only. No logic is changed by this patch. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: fix lock imbalance	Jiri Slaby	2009-09-10	1	-1/+2
\| \| \| \| \| \| \| \|	There is a missing unlock on one fail path in ioapic_mmio_write, fix that. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: remove in_range from io devices	Michael S. Tsirkin	2009-09-10	1	-10/+12
\| \| \| \| \| \| \| \| \| \| \| \|	This changes bus accesses to use high-level kvm_io_bus_read/kvm_io_bus_write functions. in_range now becomes unused so it is removed from device ops in favor of read/write callbacks performing range checks internally. This allows aliasing (mostly for in-kernel virtio), as well as better error handling by making it possible to pass errors up to userspace. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: convert bus to slots_lock	Michael S. Tsirkin	2009-09-10	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Use slots_lock to protect device list on the bus. slots_lock is already taken for read everywhere, so we only need to take it for write when registering devices. This is in preparation to removing in_range and kvm->lock around it. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: Introduce kvm_vcpu_is_bsp() function.	Gleb Natapov	2009-09-10	1	-1/+3
\| \| \| \| \| \| \|	Use it instead of open code "vcpu_id zero is BSP" assumption. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: introduce irq_lock, use it to protect ioapic	Marcelo Tosatti	2009-09-10	1	-0/+5
\| \| \| \| \| \| \|	Introduce irq_lock, and use to protect ioapic data structures. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: cleanup io_device code	Gregory Haskins	2009-09-10	1	-7/+15
\| \| \| \| \| \| \| \| \| \| \|	We modernize the io_device code so that we use container_of() instead of dev->private, and move the vtable to a separate ops structure (theoretically allows better caching for multiple instances of the same ops structure) Signed-off-by: Gregory Haskins <ghaskins@novell.com> Acked-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Avi Kivity <avi@redhat.com>
*	KVM: Avoid redelivery of edge interrupt before next edge	Gleb Natapov	2009-08-09	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The check for an edge is broken in current ioapic code. ioapic->irr is cleared on each edge interrupt by ioapic_service() and this makes old_irr != ioapic->irr condition in kvm_ioapic_set_irq() to be always true. The patch fixes the code to properly recognise edge. Some HW emulation calls set_irq() without level change. If each such call is propagated to an OS it may confuse a device driver. This is the case with keyboard device emulation and Windows XP x64 installer on SMP VM. Each keystroke produce two interrupts (down/up) one interrupt is submitted to CPU0 and another to CPU1. This confuses Windows somehow and it ignores keystrokes. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>