Commit message [Author, Age, Files, Lines]
* Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm [Linus Torvalds, 2015-04-26, 33 files, -392/+1630]
|\

    Pull second batch of KVM changes from Paolo Bonzini:

    "This mostly includes the PPC changes for 4.1, which this time cover Book3S HV only (debugging aids, minor performance improvements and some cleanups). But there are also bug fixes and small cleanups for ARM, x86 and s390.

    The task_migration_notifier revert and real fix is still pending review, but I'll send it as soon as possible after -rc1"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (29 commits)
      KVM: arm/arm64: check IRQ number on userland injection
      KVM: arm: irqfd: fix value returned by kvm_irq_map_gsi
      KVM: VMX: Preserve host CR4.MCE value while in guest mode.
      KVM: PPC: Book3S HV: Use msgsnd for signalling threads on POWER8
      KVM: PPC: Book3S HV: Translate kvmhv_commence_exit to C
      KVM: PPC: Book3S HV: Streamline guest entry and exit
      KVM: PPC: Book3S HV: Use bitmap of active threads rather than count
      KVM: PPC: Book3S HV: Use decrementer to wake napping threads
      KVM: PPC: Book3S HV: Don't wake thread with no vcpu on guest IPI
      KVM: PPC: Book3S HV: Get rid of vcore nap_count and n_woken
      KVM: PPC: Book3S HV: Move vcore preemption point up into kvmppc_run_vcpu
      KVM: PPC: Book3S HV: Minor cleanups
      KVM: PPC: Book3S HV: Simplify handling of VCPUs that need a VPA update
      KVM: PPC: Book3S HV: Accumulate timing information for real-mode code
      KVM: PPC: Book3S HV: Create debugfs file for each guest's HPT
      KVM: PPC: Book3S HV: Add ICP real mode counters
      KVM: PPC: Book3S HV: Move virtual mode ICP functions to real-mode
      KVM: PPC: Book3S HV: Convert ICS mutex lock to spin lock
      KVM: PPC: Book3S HV: Add guest->host real mode completion counters
      KVM: PPC: Book3S HV: Add helpers for lock/unlock hpte
      ...
| * Merge tag 'kvm-arm-for-4.1-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master [Paolo Bonzini, 2015-04-22, 4 files, -5/+19]
| |\

    KVM/ARM changes for v4.1, take #2:

    Rather small this time:
    - a fix for a nasty bug with virtual IRQ injection
    - a fix for irqfd
| | * KVM: arm/arm64: check IRQ number on userland injection [Andre Przywara, 2015-04-22, 4 files, -4/+18]

    When userland injects a SPI via the KVM_IRQ_LINE ioctl we currently only check it against a fixed limit, which historically is set to 127. With the new dynamic IRQ allocation the effective limit may actually be smaller (64).

    So when a malicious or buggy userland now injects a SPI in that range, we spill over on our VGIC bitmaps and bytemaps memory. I could trigger a host kernel NULL pointer dereference with current mainline by injecting some bogus IRQ number from a hacked kvmtool:

    -----------------
    ....
    DEBUG: kvm_vgic_inject_irq(kvm, cpu=0, irq=114, level=1)
    DEBUG: vgic_update_irq_pending(kvm, cpu=0, irq=114, level=1)
    DEBUG: IRQ #114 still in the game, writing to bytemap now...
    Unable to handle kernel NULL pointer dereference at virtual address 00000000
    pgd = ffffffc07652e000
    [00000000] *pgd=00000000f658b003, *pud=00000000f658b003, *pmd=0000000000000000
    Internal error: Oops: 96000006 [#1] PREEMPT SMP
    Modules linked in:
    CPU: 1 PID: 1053 Comm: lkvm-msi-irqinj Not tainted 4.0.0-rc7+ #3027
    Hardware name: FVP Base (DT)
    task: ffffffc0774e9680 ti: ffffffc0765a8000 task.ti: ffffffc0765a8000
    PC is at kvm_vgic_inject_irq+0x234/0x310
    LR is at kvm_vgic_inject_irq+0x30c/0x310
    pc : [<ffffffc0000ae0a8>] lr : [<ffffffc0000ae180>] pstate: 80000145
    .....

    So this patch fixes this by checking the SPI number against the actual limit. Also we remove the former legacy hard limit of 127 in the ioctl code.

    Signed-off-by: Andre Przywara <andre.przywara@arm.com>
    Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
    CC: <stable@vger.kernel.org> # 4.0, 3.19, 3.18
    [maz: wrap KVM_ARM_IRQ_GIC_MAX with #ifndef __KERNEL__, as suggested by Christopher Covington]
    Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
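The gap between the fixed and the dynamic limit can be illustrated with a small model. This is a minimal Python sketch and not the kernel code: the function names are hypothetical, the values 127, 64 and 114 come from the commit message, and treating 64 as an exclusive upper bound is an assumption.

```python
# Illustrative model only; the real check is C code in the kernel's VGIC.
LEGACY_FIXED_LIMIT = 127  # historical hard limit named in the commit message


def legacy_check(irq_num):
    """Old behaviour: accept any IRQ number up to the fixed limit."""
    return 0 <= irq_num <= LEGACY_FIXED_LIMIT


def dynamic_check(irq_num, nr_irqs):
    """Fixed behaviour: accept only IRQ numbers this VM actually allocated.

    nr_irqs is a hypothetical name for the per-VM limit; whether the bound
    is exclusive is an assumption made for this sketch.
    """
    return 0 <= irq_num < nr_irqs


# IRQ 114 from the oops log passes the old check but not the new one
# when the VM was created with only 64 interrupts.
print(legacy_check(114), dynamic_check(114, 64))
```

An IRQ number that passes only the first check is exactly the case that indexed past the end of the per-VM bitmaps.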
| | * KVM: arm: irqfd: fix value returned by kvm_irq_map_gsi [Eric Auger, 2015-04-22, 1 file, -1/+1]

    irqfd/arm currently does not support routing. kvm_irq_map_gsi is supposed to return all the routing entries associated with the provided gsi and return the number of those entries. We should return 0 at this point.

    Signed-off-by: Eric Auger <eric.auger@linaro.org>
    Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
    Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| * | Merge tag 'signed-kvm-ppc-queue' of git://github.com/agraf/linux-2.6 into kvm-master [Paolo Bonzini, 2015-04-21, 25 files, -364/+1580]
| |\ \

    Patch queue for ppc - 2015-04-21

    This is the latest queue for KVM on PowerPC changes. Highlights this time around:
    - Book3S HV: Debugging aids
    - Book3S HV: Minor performance improvements
    - Book3S HV: Cleanups
| | * | KVM: PPC: Book3S HV: Use msgsnd for signalling threads on POWER8 [Paul Mackerras, 2015-04-21, 4 files, -22/+70]

    This uses msgsnd where possible for signalling other threads within the same core on POWER8 systems, rather than IPIs through the XICS interrupt controller. This includes waking secondary threads to run the guest, the interrupts generated by the virtual XICS, and the interrupts to bring the other threads out of the guest when exiting.

    Aggregated statistics from debugfs across vcpus for a guest with 32 vcpus, 8 threads/vcore, running on a POWER8, show this before the change:

      rm_entry: 3387.6ns (228 - 86600, 1008969 samples)
      rm_exit: 4561.5ns (12 - 3477452, 1009402 samples)
      rm_intr: 1660.0ns (12 - 553050, 3600051 samples)

    and this after the change:

      rm_entry: 3060.1ns (212 - 65138, 953873 samples)
      rm_exit: 4244.1ns (12 - 9693408, 954331 samples)
      rm_intr: 1342.3ns (12 - 1104718, 3405326 samples)

    for a test of booting Fedora 20 big-endian to the login prompt.

    The time taken for a H_PROD hcall (which is handled in the host kernel) went down from about 35 microseconds to about 16 microseconds with this change.

    The noinline added to kvmppc_run_core turned out to be necessary for good performance, at least with gcc 4.9.2 as packaged with Fedora 21 and a little-endian POWER8 host.

    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
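To put the before/after figures into proportion, the relative reductions can be computed directly. This is a small Python sketch; the mean values are copied from the commit message above, while the helper name percent_reduction is made up for this illustration.

```python
# Mean times in nanoseconds, copied from the commit message above.
before = {"rm_entry": 3387.6, "rm_exit": 4561.5, "rm_intr": 1660.0}
after = {"rm_entry": 3060.1, "rm_exit": 4244.1, "rm_intr": 1342.3}


def percent_reduction(old, new):
    """Relative reduction of `new` compared with `old`, in percent."""
    return (old - new) / old * 100.0


reductions = {name: percent_reduction(before[name], after[name]) for name in before}

for name, pct in reductions.items():
    print(f"{name}: {pct:.1f}% lower mean time")
```

With these inputs the mean times drop by roughly 10% for rm_entry, 7% for rm_exit and 19% for rm_intr. The sample counts differ between the two runs, so the comparison is indicative rather than exact.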
| | * | KVM: PPC: Book3S HV: Translate kvmhv_commence_exit to C [Paul Mackerras, 2015-04-21, 4 files, -68/+75]

    This replaces the assembler code for kvmhv_commence_exit() with C code in book3s_hv_builtin.c. It also moves the IPI sending code that was in book3s_hv_rm_xics.c into a new kvmhv_rm_send_ipi() function so it can be used by kvmhv_commence_exit() as well as icp_rm_set_vcpu_irq().

    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | KVM: PPC: Book3S HV: Streamline guest entry and exit [Paul Mackerras, 2015-04-21, 1 file, -86/+126]

    On entry to the guest, secondary threads now wait for the primary to switch the MMU after loading up most of their state, rather than before. This means that the secondary threads get into the guest sooner, in the common case where the secondary threads get to kvmppc_hv_entry before the primary thread.

    On exit, the first thread out increments the exit count and interrupts the other threads (to get them out of the guest) before saving most of its state, rather than after. That means that the other threads exit sooner and means that the first thread doesn't spend so much time waiting for the other threads at the point where the MMU gets switched back to the host.

    This pulls out the code that increments the exit count and interrupts other threads into a separate function, kvmhv_commence_exit(). This also makes sure that r12 and vcpu->arch.trap are set correctly in some corner cases.

    Statistics from /sys/kernel/debug/kvm/vm*/vcpu*/timings show the improvement. Aggregating across vcpus for a guest with 32 vcpus, 8 threads/vcore, running on a POWER8, gives this before the change:

      rm_entry: avg 4537.3ns (222 - 48444, 1068878 samples)
      rm_exit: avg 4787.6ns (152 - 165490, 1010717 samples)
      rm_intr: avg 1673.6ns (12 - 341304, 3818691 samples)

    and this after the change:

      rm_entry: avg 3427.7ns (232 - 68150, 1118921 samples)
      rm_exit: avg 4716.0ns (12 - 150720, 1119477 samples)
      rm_intr: avg 1614.8ns (12 - 522436, 3850432 samples)

    showing a substantial reduction in the time spent per guest entry in the real-mode guest entry code, and smaller reductions in the real mode guest exit and interrupt handling times. (The test was to start the guest and boot Fedora 20 big-endian to the login prompt.)

    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
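The summary lines quoted in this message follow a regular shape, so they can be picked apart mechanically. The following is a minimal Python sketch under that assumption: the regular expression is inferred from the quoted lines only, and parse_summary and total_seconds are hypothetical helper names, not part of any kernel tooling.

```python
import re

# Shape inferred from the quoted lines, for example:
#   "rm_entry: avg 4537.3ns (222 - 48444, 1068878 samples)"
SUMMARY_RE = re.compile(
    r"(?P<name>\w+):\s+(?:avg\s+)?(?P<avg>\d+(?:\.\d+)?)ns\s+"
    r"\((?P<min>\d+)\s+-\s+(?P<max>\d+),\s+(?P<samples>\d+)\s+samples\)"
)


def parse_summary(line):
    """Split one summary line into name, mean, min, max and sample count."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognised summary line: {line!r}")
    return {
        "name": match.group("name"),
        "avg_ns": float(match.group("avg")),
        "min_ns": int(match.group("min")),
        "max_ns": int(match.group("max")),
        "samples": int(match.group("samples")),
    }


def total_seconds(entry):
    """Approximate total time spent in this section: mean times samples."""
    return entry["avg_ns"] * entry["samples"] * 1e-9


before = parse_summary("rm_entry: avg 4537.3ns (222 - 48444, 1068878 samples)")
after = parse_summary("rm_entry: avg 3427.7ns (232 - 68150, 1118921 samples)")
print(f"{total_seconds(before):.2f}s before, {total_seconds(after):.2f}s after")
```

Multiplying the mean by the sample count gives a rough total for the boot test, which makes it visible that guest entry got cheaper even though the second run performed more entries.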
| | * | KVM: PPC: Book3S HV: Use bitmap of active threads rather than count [Paul Mackerras, 2015-04-21, 5 files, -49/+44]

    Currently, the entry_exit_count field in the kvmppc_vcore struct contains two 8-bit counts, one of the threads that have started entering the guest, and one of the threads that have started exiting the guest. This changes it to an entry_exit_map field which contains two bitmaps of 8 bits each. The advantage of doing this is that it gives us a bitmap of which threads need to be signalled when exiting the guest. That means that we no longer need to use the trick of setting the HDEC to 0 to pull the other threads out of the guest, which led in some cases to a spurious HDEC interrupt on the next guest entry.

    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
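The benefit of two bitmaps over two counts can be shown with a toy model. This is a Python sketch under stated assumptions: the commit message does not give the bit layout, so placing the entry map in bits 0-7 and the exit map in bits 8-15 is an assumption, and all function names are hypothetical.

```python
# Assumed layout for illustration: bits 0-7 mark threads that have started
# entering the guest, bits 8-15 mark threads that have started exiting.
ENTRY_SHIFT = 0
EXIT_SHIFT = 8
THREAD_MASK = 0xFF
THREADS_PER_CORE = 8


def mark_entered(entry_exit_map, thread):
    """Record that `thread` has started entering the guest."""
    return entry_exit_map | (1 << (ENTRY_SHIFT + thread))


def mark_exiting(entry_exit_map, thread):
    """Record that `thread` has started exiting the guest."""
    return entry_exit_map | (1 << (EXIT_SHIFT + thread))


def threads_to_signal(entry_exit_map, me):
    """Threads that entered, have not begun exiting, and are not `me`."""
    entered = (entry_exit_map >> ENTRY_SHIFT) & THREAD_MASK
    exiting = (entry_exit_map >> EXIT_SHIFT) & THREAD_MASK
    pending = entered & ~exiting & ~(1 << me)
    return [t for t in range(THREADS_PER_CORE) if pending & (1 << t)]


state = 0
for t in (0, 1, 3):
    state = mark_entered(state, t)
state = mark_exiting(state, 0)      # thread 0 is the first one out
print(threads_to_signal(state, 0))  # threads that still need an interrupt
```

A pair of counters could only say that two threads are still inside the guest. The bitmaps identify which ones, so they can be interrupted individually instead of forcing everyone out via the HDEC.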
| | * | KVM: PPC: Book3S HV: Use decrementer to wake napping threads [Paul Mackerras, 2015-04-21, 1 file, -2/+41]

    This arranges for threads that are napping due to their vcpu having ceded or due to not having a vcpu to wake up at the end of the guest's timeslice without having to be poked with an IPI. We do that by arranging for the decrementer to contain a value no greater than the number of timebase ticks remaining until the end of the timeslice. In the case of a thread with no vcpu, this number is in the hypervisor decrementer already. In the case of a ceded vcpu, we use the smaller of the HDEC value and the DEC value.

    Using the DEC like this when ceded means we need to save and restore the guest decrementer value around the nap.

    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
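The rule for choosing the wake-up bound is simple enough to state as a function. This is a minimal Python sketch of the rule as described in the message, not the real-mode assembly: ticks_until_wakeup is a hypothetical name, and passing None for a thread without a vcpu is a convention invented for this example.

```python
def ticks_until_wakeup(hdec_ticks, guest_dec_ticks=None):
    """Upper bound, in timebase ticks, on how long a napping thread sleeps.

    hdec_ticks:      ticks left in the hypervisor decrementer, which already
                     reflects the end of the guest's timeslice.
    guest_dec_ticks: ticks left in the guest decrementer for a ceded vcpu,
                     or None when the thread has no vcpu at all.
    """
    if guest_dec_ticks is None:
        # No vcpu: the hypervisor decrementer alone bounds the nap.
        return hdec_ticks
    # Ceded vcpu: wake at whichever of the two deadlines comes first.
    return min(hdec_ticks, guest_dec_ticks)


print(ticks_until_wakeup(5000))        # thread without a vcpu
print(ticks_until_wakeup(5000, 1200))  # guest timer fires first
print(ticks_until_wakeup(5000, 9000))  # timeslice ends first
```

Because the guest's decrementer is overwritten in the last case, its original value has to be saved before the nap and restored afterwards, as the message notes.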
| | * | KVM: PPC: Book3S HV: Don't wake thread with no vcpu on guest IPI [Paul Mackerras, 2015-04-21, 1 file, -3/+7]

    When running a multi-threaded guest and vcpu 0 in a virtual core is not running in the guest (i.e. it is busy elsewhere in the host), thread 0 of the physical core will switch the MMU to the guest and then go to nap mode in the code at kvm_do_nap. If the guest sends an IPI to thread 0 using the msgsndp instruction, that will wake up thread 0 and cause all the threads in the guest to exit to the host unnecessarily. To avoid the unnecessary exit, this arranges for the PECEDP bit to be cleared in this situation. When napping due to a H_CEDE from the guest, we still set PECEDP so that the thread will wake up on an IPI sent using msgsndp.

    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | KVM: PPC: Book3S HV: Get rid of vcore nap_count and n_woken [Paul Mackerras, 2015-04-21, 4 files, -35/+34]

    We can tell when a secondary thread has finished running a guest by the fact that it clears its kvm_hstate.kvm_vcpu pointer, so there is no real need for the nap_count field in the kvmppc_vcore struct. This changes kvmppc_wait_for_nap to poll the kvm_hstate.kvm_vcpu pointers of the secondary threads rather than polling vc->nap_count. Besides reducing the size of the kvmppc_vcore struct by 8 bytes, this also means that we can tell which secondary threads have got stuck and thus print a more informative error message.

    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | KVM: PPC: Book3S HV: Move vcore preemption point up into kvmppc_run_vcpu [Paul Mackerras, 2015-04-21, 2 files, -42/+55]

    Rather than calling cond_resched() in kvmppc_run_core() before doing the post-processing for the vcpus that we have just run (that is, calling kvmppc_handle_exit_hv(), kvmppc_set_timer(), etc.), we now do that post-processing before calling cond_resched(), and that post-processing is moved out into its own function, post_guest_process().

    The reschedule point is now in kvmppc_run_vcpu() and we define a new vcore state, VCORE_PREEMPT, to indicate that the vcore's runner task is runnable but not running. (Doing the reschedule with the vcore in VCORE_INACTIVE state would be bad because there are potentially other vcpus waiting for the runner in kvmppc_wait_for_exec() which then wouldn't get woken up.)

    Also, we make use of the handy cond_resched_lock() function, which unlocks and relocks vc->lock for us around the reschedule.

    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | KVM: PPC: Book3S HV: Minor cleanups [Paul Mackerras, 2015-04-21, 3 files, -28/+19]

    * Remove unused kvmppc_vcore::n_busy field.
    * Remove setting of RMOR, since it was only used on PPC970 and the PPC970 KVM support has been removed.
    * Don't use r1 or r2 in setting the runlatch since they are conventionally reserved for other things; use r0 instead.
    * Streamline the code a little and remove the ext_interrupt_to_host label.
    * Add some comments about register usage.
    * hcall_try_real_mode doesn't need to be global, and can't be called from C code anyway.

    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | KVM: PPC: Book3S HV: Simplify handling of VCPUs that need a VPA update [Paul Mackerras, 2015-04-21, 2 files, -29/+32]

    Previously, if kvmppc_run_core() was running a VCPU that needed a VPA update (i.e. one of its 3 virtual processor areas needed to be pinned in memory so the host real mode code can update it on guest entry and exit), we would drop the vcore lock and do the update there and then. Future changes will make it inconvenient to drop the lock, so instead we now remove it from the list of runnable VCPUs and wake up its VCPU task. This will have the effect that the VCPU task will exit kvmppc_run_vcpu(), go around the do loop in kvmppc_vcpu_run_hv(), and re-enter kvmppc_run_vcpu(), whereupon it will do the necessary call to kvmppc_update_vpas() and then rejoin the vcore.

    The one complication is that the runner VCPU (whose VCPU task is the current task) might be one of the ones that gets removed from the runnable list. In that case we just return from kvmppc_run_core() and let the code in kvmppc_run_vcpu() wake up another VCPU task to be the runner if necessary.

    This all means that the VCORE_STARTING state is no longer used, so we remove it.

    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | KVM: PPC: Book3S HV: Accumulate timing information for real-mode code [Paul Mackerras, 2015-04-21, 7 files, -2/+346]

    This reads the timebase at various points in the real-mode guest entry/exit code and uses that to accumulate total, minimum and maximum time spent in those parts of the code. Currently these times are accumulated per vcpu in 5 parts of the code:

    * rm_entry - time taken from the start of kvmppc_hv_entry() until just before entering the guest.
    * rm_intr - time from when we take a hypervisor interrupt in the guest until we either re-enter the guest or decide to exit to the host. This includes time spent handling hcalls in real mode.
    * rm_exit - time from when we decide to exit the guest until the return from kvmppc_hv_entry().
    * guest - time spent in the guest
    * cede - time spent napping in real mode due to an H_CEDE hcall while other threads in the same vcore are active.

    These times are exposed in debugfs in a directory per vcpu that contains a file called "timings". This file contains one line for each of the 5 timings above, with the name followed by a colon and 4 numbers, which are the count (number of times the code has been executed), the total time, the minimum time, and the maximum time, all in nanoseconds.

    The overhead of the extra code amounts to about 30ns for an hcall that is handled in real mode (e.g. H_SET_DABR), which is about 25%. Since production environments may not wish to incur this overhead, the new code is conditional on a new config symbol, CONFIG_KVM_BOOK3S_HV_EXIT_TIMING.

    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
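The "timings" file format described above (a name, a colon, then count, total, minimum and maximum in nanoseconds) is easy to consume from userspace. The following Python sketch assumes the four numbers are separated by whitespace; the sample line uses made-up values, and parse_timing_line and average_ns are hypothetical helper names.

```python
def parse_timing_line(line):
    """Parse 'name: count total min max' (times in ns) into a dict."""
    name, _, rest = line.partition(":")
    count, total, minimum, maximum = (int(field) for field in rest.split())
    return {
        "name": name.strip(),
        "count": count,
        "total_ns": total,
        "min_ns": minimum,
        "max_ns": maximum,
    }


def average_ns(entry):
    """Mean time per execution; 0.0 if the code path never ran."""
    if entry["count"] == 0:
        return 0.0
    return entry["total_ns"] / entry["count"]


# Made-up sample line in the documented shape, not real measurement data.
sample = "rm_entry: 1000 3387600 228 86600"
entry = parse_timing_line(sample)
print(entry["name"], average_ns(entry))
```

The file only stores totals and counts, so averages such as those quoted in the other commit messages have to be derived this way, typically after summing the per-vcpu files.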
| | * | KVM: PPC: Book3S HV: Create debugfs file for each guest's HPT [Paul Mackerras, 2015-04-21, 5 files, -0/+153]

    This creates a debugfs directory for each HV guest (assuming debugfs is enabled in the kernel config), and within that directory, a file by which the contents of the guest's HPT (hashed page table) can be read. The directory is named vmnnnn, where nnnn is the PID of the process that created the guest. The file is named "htab". This is intended to help in debugging problems in the host's management of guest memory.

    The contents of the file consist of a series of lines like this:

      3f48 4000d032bf003505 0000000bd7ff1196 00000003b5c71196

    The first field is the index of the entry in the HPT, the second and third are the HPT entry, so the third field contains the real page number that is mapped by the entry if the entry's valid bit is set. The fourth field is the guest's view of the second doubleword of the entry, so it contains the guest physical address. (The format of the second through fourth fields is described in the Power ISA and also in arch/powerpc/include/asm/mmu-hash64.h.)

    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
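A line of the "htab" file can be split into its four hexadecimal fields as follows. This is a minimal Python sketch: the field names are chosen for this illustration, and taking the valid bit to be the lowest bit of the first doubleword is an assumption based on the HPTE_V_VALID definition in mmu-hash64.h, which should be checked against that header.

```python
# Assumed from mmu-hash64.h: the valid bit is the lowest bit of the
# first doubleword of an HPT entry.
HPTE_V_VALID = 0x1


def parse_htab_line(line):
    """Split one htab line into index, both HPTE doublewords and guest view."""
    index, hpte_v, hpte_r, guest_r = (int(field, 16) for field in line.split())
    return {
        "index": index,        # position of the entry in the HPT
        "hpte_v": hpte_v,      # first doubleword of the entry
        "hpte_r": hpte_r,      # second doubleword (real page number)
        "guest_r": guest_r,    # guest's view of the second doubleword
        "valid": bool(hpte_v & HPTE_V_VALID),
    }


# Example line taken from the commit message above.
entry = parse_htab_line("3f48 4000d032bf003505 0000000bd7ff1196 00000003b5c71196")
print(hex(entry["index"]), entry["valid"])
```

Extracting the real page number or the guest physical address from the doublewords needs the masks and shifts from the Power ISA, which are deliberately left out of this sketch.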
| | * | KVM: PPC: Book3S HV: Add ICP real mode counters [Suresh Warrier, 2015-04-21, 3 files, -2/+20]

    Add two counters to count how often we generate real-mode ICS resend and reject events. The counters provide some performance statistics that could be used in the future to consider if the real mode functions need further optimizing. The counters are displayed as part of IPC and ICP state provided by /sys/kernel/debug/powerpc/kvm* for each VM.

    Also added two counters that count (approximately) how many times we don't find an ICP or ICS we're looking for. These are not currently exposed through sysfs, but can be useful when debugging crashes.

    Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | KVM: PPC: Book3S HV: Move virtual mode ICP functions to real-mode [Suresh Warrier, 2015-04-21, 1 file, -14/+211]

    Interrupt-based hypercalls return H_TOO_HARD to inform KVM that it needs to switch to the host to complete the rest of hypercall function in virtual mode. This patch ports the virtual mode ICS/ICP reject and resend functions to be runnable in hypervisor real mode, thus avoiding the need to switch to the host to execute these functions in virtual mode. However, the hypercalls continue to return H_TOO_HARD for vcpu_wakeup and notify events - these events cannot be done in real mode and they will still need a switch to host virtual mode.

    There are sufficient differences between the real mode code and the virtual mode code for the ICS/ICP resend and reject functions that for now the code has been duplicated instead of sharing common code. In the future, we can look at creating common functions.

    Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | KVM: PPC: Book3S HV: Convert ICS mutex lock to spin lock [Suresh Warrier, 2015-04-21, 2 files, -22/+48]

    Replaces the ICS mutex lock with a spin lock since we will be porting these routines to real mode. Note that we need to disable interrupts before we take the lock in anticipation of the fact that on the guest side, we are running in the context of a hard irq and interrupts are disabled (EE bit off) when the lock is acquired. Again, because we will be acquiring the lock in hypervisor real mode, we need to use an arch_spinlock_t instead of a normal spinlock here as we want to avoid running any lockdep code (which may not be safe to execute in real mode).

    Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | KVM: PPC: Book3S HV: Add guest->host real mode completion counters [Suresh E. Warrier, 2015-04-21, 2 files, -4/+33]

    Add counters to track number of times we switch from guest real mode to host virtual mode during an interrupt-related hypercall because the hypercall requires actions that cannot be completed in real mode. This will help when making optimizations that reduce guest-host transitions.

    It is safe to use an ordinary increment rather than an atomic operation because there is one ICP per virtual CPU and kvmppc_xics_rm_complete() only works on the ICP for the current VCPU.

    The counters are displayed as part of IPC and ICP state provided by /sys/kernel/debug/powerpc/kvm* for each VM.

    Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | KVM: PPC: Book3S HV: Add helpers for lock/unlock hpte [Aneesh Kumar K.V, 2015-04-21, 3 files, -31/+33]

    This adds helper routines for locking and unlocking HPTEs, and uses them in the rest of the code. We don't change any locking rules in this patch.

    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | KVM: PPC: Book3S HV: Remove RMA-related variables from code [Aneesh Kumar K.V, 2015-04-21, 3 files, -21/+20]

    We don't support real-mode areas now that 970 support is removed. Remove the remaining details of rma from the code. Also rename rma_setup_done to hpte_setup_done to better reflect the changes.

    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | KVM: PPC: Book3S HV: Add fast real-mode H_RANDOM implementation. [Michael Ellerman, 2015-04-21, 8 files, -2/+191]

    Some PowerNV systems include a hardware random-number generator. This HWRNG is present on POWER7+ and POWER8 chips and is capable of generating one 64-bit random number every microsecond. The random numbers are produced by sampling a set of 64 unstable high-frequency oscillators and are almost completely entropic.

    PAPR defines an H_RANDOM hypercall which guests can use to obtain one 64-bit random sample from the HWRNG. This adds a real-mode implementation of the H_RANDOM hypercall. This hypercall was implemented in real mode because the latency of reading the HWRNG is generally small compared to the latency of a guest exit and entry for all the threads in the same virtual core.

    Userspace can detect the presence of the HWRNG and the H_RANDOM implementation by querying the KVM_CAP_PPC_HWRNG capability. The H_RANDOM hypercall implementation will only be invoked when the guest does an H_RANDOM hypercall if userspace first enables the in-kernel H_RANDOM implementation using the KVM_CAP_PPC_ENABLE_HCALL capability.

    Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
    Signed-off-by: Paul Mackerras <paulus@samba.org>
    Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | kvmppc: Implement H_LOGICAL_CI_{LOAD,STORE} in KVM [David Gibson, 2015-04-21, 4 files, -0/+119]

    On POWER, storage caching is usually configured via the MMU - attributes such as cache-inhibited are stored in the TLB and the hashed page table.

    This makes correctly performing cache inhibited IO accesses awkward when the MMU is turned off (real mode). Some CPU models provide special registers to control the cache attributes of real mode load and stores but this is not at all consistent. This is a problem in particular for SLOF, the firmware used on KVM guests, which runs entirely in real mode, but which needs to do IO to load the kernel.

    To simplify this qemu implements two special hypercalls, H_LOGICAL_CI_LOAD and H_LOGICAL_CI_STORE which simulate a cache-inhibited load or store to a logical address (aka guest physical address). SLOF uses these for IO.

    However, because these are implemented within qemu, not the host kernel, these bypass any IO devices emulated within KVM itself. The simplest way to see this problem is to attempt to boot a KVM guest from a virtio-blk device with iothread / dataplane enabled. The iothread code relies on an in kernel implementation of the virtio queue notification, which is not triggered by the IO hcalls, and so the guest will stall in SLOF unable to load the guest OS.

    This patch addresses this by providing in-kernel implementations of the 2 hypercalls, which correctly scan the KVM IO bus. Any access to an address not handled by the KVM IO bus will cause a VM exit, hitting the qemu implementation as before.

    Note that a userspace change is also required, in order to enable these new hcall implementations with KVM_CAP_PPC_ENABLE_HCALL.

    Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
    [agraf: fix compilation]
    Signed-off-by: Alexander Graf <agraf@suse.de>
| | * | powerpc: Export __spin_yield [Suresh E. Warrier, 2015-04-21, 1 file, -0/+1]
| | |/

    Export __spin_yield so that the arch_spin_unlock() function can be invoked from a module. This will be required for modules where we want to take a lock that is also acquired in hypervisor real mode. Because we want to avoid running any lockdep code (which may not be safe in real mode), this lock needs to be an arch_spinlock_t instead of a normal spinlock.

    Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
    Acked-by: Paul Mackerras <paulus@samba.org>
    Acked-by: Michael Ellerman <mpe@ellerman.id.au>
    Signed-off-by: Alexander Graf <agraf@suse.de>
| * | KVM: VMX: Preserve host CR4.MCE value while in guest mode. [Ben Serebrin, 2015-04-21, 1 file, -2/+10]

    The host's decision to enable machine check exceptions should remain in force during non-root mode. KVM was writing 0 to cr4 on VCPU reset and passed a slightly-modified 0 to the vmcs.guest_cr4 value.

    Tested: Built. On earlier version, tested by injecting machine check while a guest is spinning.

    Before the change, if guest CR4.MCE==0, then the machine check is escalated to Catastrophic Error (CATERR) and the machine dies. If guest CR4.MCE==1, then the machine check causes VMEXIT and is handled normally by host Linux. After the change, injecting a machine check causes normal Linux machine check handling.

    Signed-off-by: Ben Serebrin <serebrin@google.com>
    Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
    Cc: stable@vger.kernel.org
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | KVM: MMU: fix comment in kvm_mmu_zap_collapsible_spte [Xiao Guangrong, 2015-04-15, 1 file, -3/+5]

    Soft mmu uses direct shadow page to fill guest large mapping with small pages if huge mapping is disallowed on host. So zapping direct shadow page works well both for soft mmu and hard mmu, it's just less widely applicable.

    Fix the comment to reflect this.

    Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
    Message-Id: <552C91BA.1010703@linux.intel.com>
    [Fix comment wording further. - Paolo]
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | kvm: mmu: don't do memslot overflow check [Wanpeng Li, 2015-04-15, 1 file, -10/+2]

    As Andres pointed out:

    | I don't understand the value of this check here. Are we looking for a
    | broken memslot? Shouldn't this be a BUG_ON? Is this the place to care
    | about these things? npages is capped to KVM_MEM_MAX_NR_PAGES, i.e.
    | 2^31. A 64 bit overflow would be caused by a gigantic gfn_start which
    | would be trouble in many other ways.

    This patch drops the memslot overflow check to make the code simpler.

    Reviewed-by: Andres Lagar-Cavilla <andreslc@google.com>
    Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
    Message-Id: <1429064694-3072-1-git-send-email-wanpeng.li@linux.intel.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | KVM: s390: disable RRBM againChristian Borntraeger2015-04-151-1/+1
| | | commit b273921356df ("KVM: s390: enable more features that need no
| | | hypervisor changes") also enabled RRBM. It turns out that this instruction
| | | does need some KVM code, so let's disable that bit again.
| | |
| | | Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | | Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
| | | Fixes: b273921356df ("KVM: s390: enable more features that need no hypervisor changes")
| | | Message-Id: <1429093624-49611-2-git-send-email-borntraeger@de.ibm.com>
| | | Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | KVM: x86: cleanup kvm_irq_delivery_to_apic_fastPaolo Bonzini2015-04-141-5/+6
| | | Sparse is reporting a "we previously assumed 'src' could be null" error.
| | | This is true as far as the static analyzer can see, but in practice only
| | | IPIs can set shorthand to self and they also set 'src', so it's ok. Still,
| | | move the initialization of x2apic_ipi (and thus the NULL check for src)
| | | right before the first use.
| | |
| | | While at it, initializing ret to "false" is somewhat confusing because of
| | | the almost immediate assignment of "true" to the same variable. Thus,
| | | initialize it to "true" and modify it in the only path that used to use
| | | the value from "bool ret = false".
| | |
| | | There is no change in generated code from this change.
| | |
| | | Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
| | | Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | KVM: x86: Fix MSR_IA32_BNDCFGS in msrs_to_saveNadav Amit2015-04-141-2/+8
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | kvm_init_msr_list is currently called before hardware_setup. As a result, vmx_mpx_supported always returns false when kvm_init_msr_list checks whether to save MSR_IA32_BNDCFGS. Move kvm_init_msr_list after vmx_hardware_setup is called to fix this issue. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Message-Id: <1428864435-4732-1-git-send-email-namit@cs.technion.ac.il> Cc: stable@vger.kernel.org # 3.15+ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* | v4l: xilinx: fix for include file movementStephen Rothwell2015-04-261-1/+1
| | | | | | | | | | Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge tag 'sound-fix-4.1-rc1' of ↵Linus Torvalds2015-04-2411-68/+51
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Here are a few fixes that have been pending since the previous pull request: a regression fix for HD-audio multiple SPDIF / HDMI devices, several ALC256 codec fixes, a couple of i915 HDMI audio fixes, and various small fixes. Nothing exciting, just boring, but things good to have" * tag 'sound-fix-4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda - fix headset mic detection problem for one more machine ALSA: hda/realtek - Fix Headphone Mic doesn't recording for ALC256 ALSA: hda - fix "num_steps = 0" error on ALC256 ALSA: usb-audio: Fix audio output on Roland SC-D70 sound module ALSA: hda - add AZX_DCAPS_I915_POWERWELL to Baytrail ALSA: hda - only sync BCLK to the display clock for Haswell & Broadwell ALSA: hda - Mute headphone pin on suspend on XPS13 9333 sound/oss: fix deadlock in sequencer_ioctl(SNDCTL_SEQ_OUTOFBAND) ALSA: asound.h - use SNDRV_CTL_ELEM_ID_NAME_MAXLEN ALSA: hda - potential (but unlikely) uninitialized variable ALSA: hda - Fix regression for slave SPDIF setups ALSA: intel8x0: Check pci_iomap() success for DEVICE_ALI ALSA: hda - simplify azx_has_pm_runtime
| * | ALSA: hda - fix headset mic detection problem for one more machineHui Wang2015-04-241-9/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have two machines with alc256 codec in the pin quirk table, so moving the common pins to ALC256_STANDARD_PINS. Cc: stable@vger.kernel.org BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1447909 Signed-off-by: Hui Wang <hui.wang@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
| * | ALSA: hda/realtek - Fix Headphone Mic doesn't recording for ALC256Kailang Yang2015-04-231-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Switch default pcbeep path to Line in path. Signed-off-by: Kailang Yang <kailang@realtek.com> Tested-by: Hui Wang <hui.wang@canonical.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
| * | ALSA: hda - fix "num_steps = 0" error on ALC256David Henningsson2015-04-211-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ALC256 does not have a mixer nid at 0x0b, and there's no loopback path (the output pins are directly connected to the DACs). This commit fixes an "num_steps = 0 for NID=0xb (ctl = Beep Playback Volume)" error (and as a result, problems with amixer/alsamixer). If there's pcbeep functionality, it certainly isn't controlled by setting an amp on 0x0b, so disable beep functionality (at least for now). Cc: stable@vger.kernel.org BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1446517 Signed-off-by: David Henningsson <david.henningsson@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
| * | ALSA: usb-audio: Fix audio output on Roland SC-D70 sound moduleTakamichi Horikawa2015-04-212-29/+6
| | | Roland SC-D70 reports its device class as vendor specific class and the
| | | quirk QUIRK_AUDIO_FIXED_ENDPOINT was used for audio output. In the quirks
| | | table the sampling rate was hard-coded to 44100 Hz and therefore did not
| | | work when the sound module was in 48000 Hz mode.
| | |
| | | In this change the quirk is changed to QUIRK_AUDIO_STANDARD_INTERFACE, but
| | | as the sound module reports an incorrect bSubframeSize in its descriptors,
| | | an additional change is made in format.c to detect it and to override it
| | | (which uses the existing code for Edirol SD-90).
| | |
| | | Tested both when the sound module was in 44100 Hz mode and 48000 Hz mode,
| | | and both audio input and output. The MIDI related part of the driver is
| | | not touched.
| | |
| | | Signed-off-by: Takamichi Horikawa <takamichiho@gmail.com>
| | | Signed-off-by: Takashi Iwai <tiwai@suse.de>
| * | ALSA: hda - add AZX_DCAPS_I915_POWERWELL to BaytrailMengdong Lin2015-04-211-1/+4
| | | This patch adds AZX_DCAPS_I915_POWERWELL to BYT (Baytrail).
| | |
| | | Like Braswell and Skylake, the HDMI codec on Baytrail is also in the
| | | shared power well with the GPU. This power well must be turned on before
| | | we reset the link to probe the codec, to avoid communication failure with
| | | the codec.
| | |
| | | The side effect is that this power is always ON in S0 because
| | | - the BYT HDMI codec does not support EPSS or D3ClkStop, and so the
| | |   controller doesn't enter D3 at runtime, and
| | | - the HDMI codec and analog codec share a single physical HD-A link, and
| | |   so we cannot reset the HD-A link freely when we re-enable the power to
| | |   use the HDMI codec.
| | |
| | | Next step is to test if an AFG reset or double AFG reset on the BYT HDMI
| | | codec is okay to bring the HDMI codec back to a functional state after
| | | restoring the power. If okay, we can bind the power on/off with the HDMI
| | | codec PM without interrupting the analog audio.
| | |
| | | Signed-off-by: Mengdong Lin <mengdong.lin@intel.com>
| | | Signed-off-by: Takashi Iwai <tiwai@suse.de>
| * | ALSA: hda - only sync BCLK to the display clock for Haswell & BroadwellMengdong Lin2015-04-201-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only Intel Haswell and Broadwell have a separate HD-A controller (PCI device 3) for display audio, which needs to get 24MHz HD-A link BCLK from the variable display core clock through vendor specific registers EM4 & EM5. Other platforms (Baytrail, Braswell and Skylake) don't have this feature. So this patch checks the PCI device ID of the controller in haswell_set_bclk() and only sync BCLK for HSW and BDW. Signed-off-by: Mengdong Lin <mengdong.lin@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
| * | ALSA: hda - Mute headphone pin on suspend on XPS13 9333Gabriele Mazzotta2015-04-201-10/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Muting the headphone output pin right before the codec suspension prevents pop noises when headphones are plugged in (except for a barely audible click noise). This solution allows to truly save some power when headphones are plugged in unlike the previous solution (033b0a7ca9c: "ALSA: hda - Pop noises fix for XPS13 9333") Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
| * | sound/oss: fix deadlock in sequencer_ioctl(SNDCTL_SEQ_OUTOFBAND)Alexey Khoroshilov2015-04-181-10/+2
| | | A deadlock can be initiated by userspace via ioctl(SNDCTL_SEQ_OUTOFBAND)
| | | on /dev/sequencer with TMR_ECHO midi event.
| | |
| | | In this case the control flow is:
| | |   sound_ioctl()
| | |   -> case SND_DEV_SEQ:
| | |      case SND_DEV_SEQ2:
| | |        sequencer_ioctl()
| | |        -> case SNDCTL_SEQ_OUTOFBAND:
| | |             spin_lock_irqsave(&lock,flags);
| | |             play_event();
| | |             -> case EV_TIMING:
| | |                  seq_timing_event()
| | |                  -> case TMR_ECHO:
| | |                       seq_copy_to_input()
| | |                       -> spin_lock_irqsave(&lock,flags);
| | |
| | | It seems that spin_lock_irqsave() around play_event() is not necessary,
| | | because the only other call location in seq_startplay() makes the call
| | | without acquiring spinlock.
| | |
| | | So, the patch just removes spinlocks around play_event().
| | | By the way, it removes unreachable code in seq_timing_event(),
| | | since (seq_mode == SEQ_2) case is handled in the beginning.
| | |
| | | Compile tested only.
| | |
| | | Found by Linux Driver Verification project (linuxtesting.org).
| | |
| | | Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
| | | Signed-off-by: Takashi Iwai <tiwai@suse.de>
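The self-deadlock described above can be modelled in user space with a toy non-recursive lock. This is a sketch under stated assumptions, not the OSS sequencer code: all names are hypothetical, and the toy lock reports failure at the point where a real spinlock would spin forever.

```c
#include <stdbool.h>

/* Toy non-recursive lock; "held" stands in for the spinlock state. */
struct toy_lock { bool held; };

/* Returns false where a real spinlock would spin forever. */
static bool toy_lock_acquire(struct toy_lock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}

static void toy_lock_release(struct toy_lock *l)
{
    l->held = false;
}

/* Stand-in for the innermost callee, which takes the lock itself. */
static bool copy_to_input(struct toy_lock *l)
{
    if (!toy_lock_acquire(l))
        return false;          /* would deadlock */
    toy_lock_release(l);
    return true;
}

/* Before: the caller holds the lock across the call chain. */
static bool outofband_before(struct toy_lock *l)
{
    toy_lock_acquire(l);
    bool ok = copy_to_input(l);
    toy_lock_release(l);
    return ok;
}

/* After: the caller no longer takes the lock around the call. */
static bool outofband_after(struct toy_lock *l)
{
    return copy_to_input(l);
}
```

The "before" variant fails because the same lock is requested twice on one call path. Dropping the outer acquisition, as the patch does around play_event(), removes the second request's conflict.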
| * | ALSA: asound.h - use SNDRV_CTL_ELEM_ID_NAME_MAXLENVinod Koul2015-04-181-1/+1
| | | We have defined SNDRV_CTL_ELEM_ID_NAME_MAXLEN as the size of the name
| | | array, so use this define instead of the numeric value.
| | |
| | | Signed-off-by: Vinod Koul <vinod.koul@intel.com>
| | | Signed-off-by: Takashi Iwai <tiwai@suse.de>
| * | ALSA: hda - potential (but unlikely) uninitialized variableDan Carpenter2015-04-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function is a bit unusual because it accepts negative values as "conn_len". It's theoretically possible for both "cache_len" and "conn_len" to be -ENOSPC and in that case we would oops trying to run memcmp() on the uninitialized "list" pointer. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
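The hazard can be illustrated with a minimal C sketch. It is not the patched function: the helper name and element type are hypothetical, and treating two equal error codes as "no difference" is a choice made for this example only. The point is that a length which may be a negative error code has to be checked before it reaches memcmp().

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical comparison of a cached list with a freshly read one.
 * Both lengths may be negative error codes (for example -ENOSPC), in
 * which case neither pointer refers to valid data.
 */
static bool lists_differ(const unsigned short *list, int cache_len,
                         const unsigned short *conn, int conn_len)
{
    if (cache_len != conn_len)
        return true;
    if (conn_len <= 0)
        return false;   /* equal error codes or empty lists: nothing to compare */
    return memcmp(list, conn, (size_t)conn_len * sizeof(*list)) != 0;
}
```

Without the middle check, two equal negative lengths would pass the first test and memcmp() would run on a pointer that was never set.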
| * | ALSA: hda - Fix regression for slave SPDIF setupsTakashi Iwai2015-04-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The commit [a551d91473e5: ALSA: hda - Use regmap for command verb caches, too] introduced a regression due to a typo in the conversion; the IEC958 status bits of slave digital devices aren't updated correctly. This patch corrects it. Fixes: a551d91473e5 ('ALSA: hda - Use regmap for command verb caches, too') Reported-and-tested-by: Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by: Takashi Iwai <tiwai@suse.de>
| * | ALSA: intel8x0: Check pci_iomap() success for DEVICE_ALIScott Wood2015-04-161-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | DEVICE_ALI previously would jump to port_inited after calling pci_iomap(), bypassing the check for bmaddr being NULL. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
| * | ALSA: hda - simplify azx_has_pm_runtimeDavid Henningsson2015-04-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | Because AZX_DCAPS_PM_RUNTIME is always defined as non-zero, the initial part of the expression can be skipped. Signed-off-by: David Henningsson <david.henningsson@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
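Why the initial part of the expression is redundant can be shown with a short C sketch. The flag value and function names below are hypothetical and only stand in for AZX_DCAPS_PM_RUNTIME and the macro being simplified. The assumed "before" shape is a test of the constant itself followed by the mask test.

```c
/* Hypothetical non-zero capability bit standing in for the real flag. */
#define CAPS_PM_RUNTIME (1u << 26)

/* Assumed shape before: the constant is tested first, then the mask. */
static int has_pm_runtime_before(unsigned int caps)
{
    return CAPS_PM_RUNTIME && (caps & CAPS_PM_RUNTIME);
}

/* After: a non-zero constant is always true, so only the mask remains. */
static int has_pm_runtime_after(unsigned int caps)
{
    return (caps & CAPS_PM_RUNTIME) != 0;
}
```

Both functions return the same result for every input as long as the flag is defined as a non-zero value.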
* | | Merge branch 'for-next' of ↵Linus Torvalds2015-04-2450-2273/+1687
|\ \ \
| | | | git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending
| | | |
| | | | Pull SCSI target updates from Nicholas Bellinger:
| | | |  "Lots of activity in target land the last months.
| | | |
| | | |   The highlights include:
| | | |
| | | |    - Convert fabric drivers tree-wide to target_register_template()
| | | |      (hch + bart)
| | | |    - iser-target hardening fixes + v1.0 improvements (sagi)
| | | |    - Convert iscsi_thread_set usage to kthread.h + kill
| | | |      iscsi_target_tq.c (sagi + nab)
| | | |    - Add support for T10-PI WRITE_STRIP + READ_INSERT operation
| | | |      (mkp + sagi + nab)
| | | |    - DIF fixes for CONFIG_DEBUG_SG=y + UNMAP file emulation
| | | |      (akinobu + sagi + mkp)
| | | |    - Extended TCMU ABI v2 for future BIDI + DIF support (andy + ilias)
| | | |    - Fix COMPARE_AND_WRITE handling for NO_ALLOC drivers (hch + nab)
| | | |
| | | |   Thanks to everyone who contributed this round with new features,
| | | |   bug-reports, fixes, cleanups and improvements.
Looking forward, it's currently shaping up to be a busy v4.2 as well" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (69 commits) target: Put TCMU under a new config option target: Version 2 of TCMU ABI target: fix tcm_mod_builder.py target/file: Fix UNMAP with DIF protection support target/file: Fix SG table for prot_buf initialization target/file: Fix BUG() when CONFIG_DEBUG_SG=y and DIF protection enabled target: Make core_tmr_abort_task() skip TMFs target/sbc: Update sbc_dif_generate pr_debug output target/sbc: Make internal DIF emulation honor ->prot_checks target/sbc: Return INVALID_CDB_FIELD if DIF + sess_prot_type disabled target: Ensure sess_prot_type is saved across session restart target/rd: Don't pass incomplete scatterlist entries to sbc_dif_verify_* target: Remove the unused flag SCF_ACK_KREF target: Fix two sparse warnings target: Fix COMPARE_AND_WRITE with SG_TO_MEM_NOALLOC handling target: simplify the target template registration API target: simplify target_xcopy_init_pt_lun target: remove the unused SCF_CMD_XCOPY_PASSTHROUGH flag target/rd: reduce code duplication in rd_execute_rw() tcm_loop: fixup tpgt string to integer conversion ...
| * | | target: Put TCMU under a new config optionAndy Grover2015-04-202-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conceptually version 2 should be viewed as an entirely new, incompatible version of TCMU, so emphasize this by changing the config option and Kconfig text. Signed-off-by: Andy Grover <agrover@redhat.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | target: Version 2 of TCMU ABIAndy Grover2015-04-203-44/+89
| | | | The initial version of TCMU (in 3.18) does not properly handle
| | | | bidirectional SCSI commands -- those with both an in and out buffer. In
| | | | looking to fix this it also became clear that TCMU's support for adding
| | | | new types of entries (opcodes) to the command ring was broken. We need
| | | | to fix this now, so that future issues can be handled properly by adding
| | | | new opcodes.
| | | |
| | | | We make the most of this ABI break by enabling bidi cmd handling within
| | | | the TCMU_OP_CMD opcode. Add an iov_bidi_cnt field to tcmu_cmd_entry.req.
| | | | This enables TCMU to describe bidi commands, but further kernel work is
| | | | needed for full bidi support.
| | | |
| | | | Enlarge tcmu_cmd_entry_hdr by 32 bits by pulling in cmd_id and __pad1.
| | | | Turn __pad1 into two 8 bit flags fields, for kernel-set and
| | | | userspace-set flags, "kflags" and "uflags" respectively.
| | | |
| | | | Update version fields so userspace can tell the interface is changed.
| | | |
| | | | Update tcmu-design.txt with details of how new stuff works:
| | | | - Specify an additional requirement for userspace to set UNKNOWN_OP
| | | |   (bit 0) in hdr.uflags for unknown/unhandled opcodes.
| | | | - Define how Data-In and Data-Out fields are described in req.iov[]
| | | |
| | | | Changed in v2:
| | | | - Change name of SKIPPED bit to UNKNOWN bit
| | | | - PAD op does not set the bit any more
| | | | - Change len_op helper functions to take just len_op, not the whole
| | | |   struct
| | | | - Change version to 2 in missed spots, and use defines
| | | | - Add 16 unused bytes to cmd_entry.req, in case additional SAM cmd
| | | |   parameters need to be included
| | | | - Add iov_dif_cnt field to specify buffers used for DIF info in iov[]
| | | | - Rearrange fields to naturally align cdb_off
| | | | - Handle if userspace sets UNKNOWN_OP by indicating failure of the cmd
| | | | - Wrap some overly long UPDATE_HEAD lines
| | | |
| | | | (Add missing req.iov_bidi_cnt + req.iov_dif_cnt zeroing - Ilias)
| | | |
| | | | Signed-off-by: Andy Grover <agrover@redhat.com>
| | | | Reviewed-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
| | | | Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
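The enlarged entry header described in this message can be sketched as follows. This is a rough illustration, not the uapi header: the struct and helper names are hypothetical, the field names cmd_id, kflags, uflags and len_op come from the message, and the assumption that the opcode occupies the low three bits of len_op is made for this example only.

```c
#include <stdint.h>

#define OP_MASK          0x7u  /* assumption: opcode in the low 3 bits of len_op */
#define UFLAG_UNKNOWN_OP 0x1u  /* bit 0 of uflags, as stated in the message */

/* Sketch of the enlarged header: 32 bits of len_op plus 32 new bits. */
struct cmd_entry_hdr {
    uint32_t len_op;  /* entry length combined with the opcode */
    uint16_t cmd_id;  /* pulled into the header */
    uint8_t  kflags;  /* flags set by the kernel */
    uint8_t  uflags;  /* flags set by userspace */
};

/* Helpers take just len_op, not the whole struct (see "Changed in v2"). */
static uint32_t hdr_get_op(uint32_t len_op)
{
    return len_op & OP_MASK;
}

static uint32_t hdr_get_len(uint32_t len_op)
{
    return len_op & ~OP_MASK;
}

/* What userspace would do on an opcode it does not understand. */
static void mark_unknown_op(struct cmd_entry_hdr *hdr)
{
    hdr->uflags = (uint8_t)(hdr->uflags | UFLAG_UNKNOWN_OP);
}
```

A userspace handler built this way would read the opcode with hdr_get_op(), use hdr_get_len() to step to the next ring entry, and call mark_unknown_op() for anything it cannot process, which the kernel then treats as a failed command.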