summaryrefslogtreecommitdiffstats
path: root/arch (follow)
Commit message (Collapse)AuthorAgeFilesLines
* m68k/mac: Export Peripheral System Controller (PSC) base address to modulesGeert Uytterhoeven2015-09-301-0/+1
| | | | | | | | | | | If CONFIG_MACMACE=m: ERROR: psc [drivers/net/ethernet/apple/macmace.ko] undefined! Add the missing export to fix this. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2015-09-27264-1009/+955
|\ | | | | | | | | | | | | | | | | | | Conflicts: net/ipv4/arp.c The net/ipv4/arp.c conflict was one commit adding a new local variable while another commit was deleting one. Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge tag 'pci-v4.3-fixes-1' of ↵Linus Torvalds2015-09-259-5/+38
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: "These are fixes for things we merged for v4.3 (VPD, MSI, and bridge window management), and a new Renesas R8A7794 SoC device ID. Details: Resource management: - Revert pci_read_bridge_bases() unification (Bjorn Helgaas) - Clear IORESOURCE_UNSET when clipping a bridge window (Bjorn Helgaas) MSI: - Fix MSI IRQ domains for VFs on virtual buses (Alex Williamson) Renesas R-Car host bridge driver: - Add R8A7794 support (Sergei Shtylyov) Miscellaneous: - Fix devfn for VPD access through function 0 (Alex Williamson) - Use function 0 VPD only for identical functions (Alex Williamson)" * tag 'pci-v4.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: rcar: Add R8A7794 support PCI: Use function 0 VPD for identical functions, regular VPD for others PCI: Fix devfn for VPD access through function 0 PCI/MSI: Fix MSI IRQ domains for VFs on virtual buses PCI: Clear IORESOURCE_UNSET when clipping a bridge window PCI: Revert "PCI: Call pci_read_bridge_bases() from core instead of arch code"
| | * PCI: Revert "PCI: Call pci_read_bridge_bases() from core instead of arch code"Bjorn Helgaas2015-09-159-5/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Revert dff22d2054b5 ("PCI: Call pci_read_bridge_bases() from core instead of arch code"). Reading PCI bridge windows is not arch-specific in itself, but there is PCI core code that doesn't work correctly if we read them too early. For example, Hannes found this case on an ARM Freescale i.mx6 board: pci_bus 0000:00: root bus resource [mem 0x01000000-0x01efffff] pci 0000:00:00.0: PCI bridge to [bus 01-ff] pci 0000:00:00.0: BAR 8: no space for [mem size 0x01000000] (mem window) pci 0000:01:00.0: BAR 2: failed to assign [mem size 0x00200000] pci 0000:01:00.0: BAR 1: failed to assign [mem size 0x00004000] pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x00000100] The 00:00.0 mem window needs to be at least 3MB: the 01:00.0 device needs 0x204100 of space, and mem windows are megabyte-aligned. Bus sizing can increase a bridge window size, but never *decrease* it (see d65245c3297a ("PCI: don't shrink bridge resources")). Prior to dff22d2054b5, ARM didn't read bridge windows at all, so the "original size" was zero, and we assigned a 3MB window. After dff22d2054b5, we read the bridge windows before sizing the bus. The firmware programmed a 16MB window (size 0x01000000) in 00:00.0, and since we never decrease the size, we kept 16MB even though we only needed 3MB. But 16MB doesn't fit in the host bridge aperture, so we failed to assign space for the window and the downstream devices. I think this is a defect in the PCI core: we shouldn't rely on the firmware to assign sensible windows. Ray reported a similar problem, also on ARM, with Broadcom iProc. Issues like this are too hard to fix right now, so revert dff22d2054b5. Reported-by: Hannes <oe5hpm@gmail.com> Reported-by: Ray Jui <rjui@broadcom.com> Link: http://lkml.kernel.org/r/CAAa04yFQEUJm7Jj1qMT57-LG7ZGtnhNDBe=PpSRa70Mj+XhW-A@mail.gmail.com Link: http://lkml.kernel.org/r/55F75BB8.4070405@broadcom.com Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
| * | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2015-09-2513-9/+42
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM fixes from Paolo Bonzini: "AMD fixes for bugs introduced in the 4.2 merge window, and a few PPC bug fixes too" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: disable halt_poll_ns as default for s390x KVM: x86: fix off-by-one in reserved bits check KVM: x86: use correct page table format to check nested page table reserved bits KVM: svm: do not call kvm_set_cr0 from init_vmcb KVM: x86: trap AMD MSRs for the TSeg base and mask KVM: PPC: Book3S: Take the kvm->srcu lock in kvmppc_h_logical_ci_load/store() KVM: PPC: Book3S HV: Pass the correct trap argument to kvmhv_commence_exit KVM: PPC: Book3S HV: Fix handling of interrupted VCPUs kvm: svm: reset mmu on VCPU reset
| | * | KVM: disable halt_poll_ns as default for s390xDavid Hildenbrand2015-09-256-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We observed some performance degradation on s390x with dynamic halt polling. Until we can provide a proper fix, let's enable halt_poll_ns as default only for supported architectures. Architectures are now free to set their own halt_poll_ns default value. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| | * | KVM: x86: fix off-by-one in reserved bits checkPaolo Bonzini2015-09-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 29ecd6601904 ("KVM: x86: avoid uninitialized variable warning", 2015-09-06) introduced a not-so-subtle problem, which probably escaped review because it was not part of the patch context. Before the patch, leaf was always equal to iterator.level. After, it is equal to iterator.level - 1 in the call to is_shadow_zero_bits_set, and when is_shadow_zero_bits_set does another "-1" the check on reserved bits becomes incorrect. Using "iterator.level" in the call fixes this call trace: WARNING: CPU: 2 PID: 17000 at arch/x86/kvm/mmu.c:3385 handle_mmio_page_fault.part.93+0x1a/0x20 [kvm]() Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd fam15h_power amd64_edac_mod k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [...] Call Trace: dump_stack+0x4e/0x84 warn_slowpath_common+0x95/0xe0 warn_slowpath_null+0x1a/0x20 handle_mmio_page_fault.part.93+0x1a/0x20 [kvm] tdp_page_fault+0x231/0x290 [kvm] ? emulator_pio_in_out+0x6e/0xf0 [kvm] kvm_mmu_page_fault+0x36/0x240 [kvm] ? svm_set_cr0+0x95/0xc0 [kvm_amd] pf_interception+0xde/0x1d0 [kvm_amd] handle_exit+0x181/0xa70 [kvm_amd] ? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm] kvm_arch_vcpu_ioctl_run+0x6f6/0x1730 [kvm] ? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm] ? preempt_count_sub+0x9b/0xf0 ? mutex_lock_killable_nested+0x26f/0x490 ? preempt_count_sub+0x9b/0xf0 kvm_vcpu_ioctl+0x358/0x710 [kvm] ? __fget+0x5/0x210 ? __fget+0x101/0x210 do_vfs_ioctl+0x2f4/0x560 ? __fget_light+0x29/0x90 SyS_ioctl+0x4c/0x90 entry_SYSCALL_64_fastpath+0x16/0x73 ---[ end trace 37901c8686d84de6 ]--- Reported-by: Borislav Petkov <bp@alien8.de> Tested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| | * | KVM: x86: use correct page table format to check nested page table reserved bitsPaolo Bonzini2015-09-251-6/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Intel CPUID on AMD host or vice versa is a weird case, but it can happen. Handle it by checking the host CPU vendor instead of the guest's in reset_tdp_shadow_zero_bits_mask. For speed, the check uses the fact that Intel EPT has an X (executable) bit while AMD NPT has NX. Reported-by: Borislav Petkov <bp@alien8.de> Tested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| | * | KVM: svm: do not call kvm_set_cr0 from init_vmcbPaolo Bonzini2015-09-251-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kvm_set_cr0 may want to call kvm_zap_gfn_range and thus access the memslots array (SRCU protected). Using a mini SRCU critical section is ugly, and adding it to kvm_arch_vcpu_create doesn't work because the VMX vcpu_create callback calls synchronize_srcu. Fixes this lockdep splat: =============================== [ INFO: suspicious RCU usage. ] 4.3.0-rc1+ #1 Not tainted ------------------------------- include/linux/kvm_host.h:488 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 1 lock held by qemu-system-i38/17000: #0: (&(&kvm->mmu_lock)->rlock){+.+...}, at: kvm_zap_gfn_range+0x24/0x1a0 [kvm] [...] Call Trace: dump_stack+0x4e/0x84 lockdep_rcu_suspicious+0xfd/0x130 kvm_zap_gfn_range+0x188/0x1a0 [kvm] kvm_set_cr0+0xde/0x1e0 [kvm] init_vmcb+0x760/0xad0 [kvm_amd] svm_create_vcpu+0x197/0x250 [kvm_amd] kvm_arch_vcpu_create+0x47/0x70 [kvm] kvm_vm_ioctl+0x302/0x7e0 [kvm] ? __lock_is_held+0x51/0x70 ? __fget+0x101/0x210 do_vfs_ioctl+0x2f4/0x560 ? __fget_light+0x29/0x90 SyS_ioctl+0x4c/0x90 entry_SYSCALL_64_fastpath+0x16/0x73 Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| | * | Merge branch 'kvm-ppc-fixes' of ↵Paolo Bonzini2015-09-223-1/+12
| | |\ \ | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master
| | | * | KVM: PPC: Book3S: Take the kvm->srcu lock in kvmppc_h_logical_ci_load/store()Thomas Huth2015-09-211-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Access to the kvm->buses (like with the kvm_io_bus_read() and -write() functions) has to be protected via the kvm->srcu lock. The kvmppc_h_logical_ci_load() and -store() functions are missing this lock so far, so let's add it there, too. This fixes the problem that the kernel reports "suspicious RCU usage" when lock debugging is enabled. Cc: stable@vger.kernel.org # v4.1+ Fixes: 99342cf8044420eebdf9297ca03a14cb6a7085a1 Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| | | * | KVM: PPC: Book3S HV: Pass the correct trap argument to kvmhv_commence_exitGautham R. Shenoy2015-09-211-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In guest_exit_cont we call kvmhv_commence_exit which expects the trap number as the argument. However r3 doesn't contain the trap number at this point and as a result we would be calling the function with a spurious trap number. Fix this by copying r12 into r3 before calling kvmhv_commence_exit as r12 contains the trap number. Cc: stable@vger.kernel.org # v4.1+ Fixes: eddb60fb1443 Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| | | * | KVM: PPC: Book3S HV: Fix handling of interrupted VCPUsPaul Mackerras2015-09-211-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a bug which results in stale vcore pointers being left in the per-cpu preempted vcore lists when a VM is destroyed. The result of the stale vcore pointers is usually either a crash or a lockup inside collect_piggybacks() when another VM is run. A typical lockup message looks like: [ 472.161074] NMI watchdog: BUG: soft lockup - CPU#24 stuck for 22s! [qemu-system-ppc:7039] [ 472.161204] Modules linked in: kvm_hv kvm_pr kvm xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ses enclosure shpchp rtc_opal i2c_opal powernv_rng binfmt_misc dm_service_time scsi_dh_alua radeon i2c_algo_bit drm_kms_helper ttm drm tg3 ptp pps_core cxgb3 ipr i2c_core mdio dm_multipath [last unloaded: kvm_hv] [ 472.162111] CPU: 24 PID: 7039 Comm: qemu-system-ppc Not tainted 4.2.0-kvm+ #49 [ 472.162187] task: c000001e38512750 ti: c000001e41bfc000 task.ti: c000001e41bfc000 [ 472.162262] NIP: c00000000096b094 LR: c00000000096b08c CTR: c000000000111130 [ 472.162337] REGS: c000001e41bff520 TRAP: 0901 Not tainted (4.2.0-kvm+) [ 472.162399] MSR: 9000000100009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24848844 XER: 00000000 [ 472.162588] CFAR: c00000000096b0ac SOFTE: 1 GPR00: c000000000111170 c000001e41bff7a0 c00000000127df00 0000000000000001 GPR04: 0000000000000003 0000000000000001 0000000000000000 0000000000874821 GPR08: c000001e41bff8e0 0000000000000001 0000000000000000 d00000000efde740 GPR12: c000000000111130 c00000000fdae400 [ 472.163053] NIP [c00000000096b094] _raw_spin_lock_irqsave+0xa4/0x130 [ 472.163117] LR [c00000000096b08c] _raw_spin_lock_irqsave+0x9c/0x130 [ 472.163179] Call Trace: [ 472.163206] [c000001e41bff7a0] [c000001e41bff7f0] 0xc000001e41bff7f0 (unreliable) [ 472.163295] [c000001e41bff7e0] [c000000000111170] __wake_up+0x40/0x90 [ 472.163375] [c000001e41bff830] [d00000000efd6fc0] kvmppc_run_core+0x1240/0x1950 [kvm_hv] [ 472.163465] [c000001e41bffa30] [d00000000efd8510] kvmppc_vcpu_run_hv+0x5a0/0xd90 [kvm_hv] [ 472.163559] [c000001e41bffb70] [d00000000e9318a4] kvmppc_vcpu_run+0x44/0x60 [kvm] [ 472.163653] [c000001e41bffba0] [d00000000e92e674] kvm_arch_vcpu_ioctl_run+0x64/0x170 [kvm] [ 472.163745] [c000001e41bffbe0] [d00000000e9263a8] kvm_vcpu_ioctl+0x538/0x7b0 [kvm] [ 472.163834] [c000001e41bffd40] [c0000000002d0f50] do_vfs_ioctl+0x480/0x7c0 [ 472.163910] [c000001e41bffde0] [c0000000002d1364] SyS_ioctl+0xd4/0xf0 [ 472.163986] [c000001e41bffe30] [c000000000009260] system_call+0x38/0xd0 [ 472.164060] Instruction dump: [ 472.164098] ebc1fff0 ebe1fff8 7c0803a6 4e800020 60000000 60000000 60420000 8bad02e2 [ 472.164224] 7fc3f378 4b6a57c1 60000000 7c210b78 <e92d0000> 89290009 792affe3 40820070 The bug is that kvmppc_run_vcpu does not correctly handle the case where a vcpu task receives a signal while its guest vcpu is executing in the guest as a result of being piggy-backed onto the execution of another vcore. In that case we need to wait for the vcpu to finish executing inside the guest, and then remove this vcore from the preempted vcores list. That way, we avoid leaving this vcpu's vcore on the preempted vcores list when the vcpu gets interrupted. Fixes: ec2571650826 Reported-by: Thomas Huth <thuth@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
| | * | | KVM: x86: trap AMD MSRs for the TSeg base and maskPaolo Bonzini2015-09-212-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These have roughly the same purpose as the SMRR, which we do not need to implement in KVM. However, Linux accesses MSR_K8_TSEG_ADDR at boot, which causes problems when running a Xen dom0 under KVM. Just return 0, meaning that processor protection of SMRAM is not in effect. Reported-by: M A Young <m.a.young@durham.ac.uk> Cc: stable@vger.kernel.org Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| | * | | kvm: svm: reset mmu on VCPU resetIgor Mammedov2015-09-181-0/+1
| | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When INIT/SIPI sequence is sent to VCPU which before that was in use by OS, VMRUN might fail with: KVM: entry failed, hardware error 0xffffffff EAX=00000000 EBX=00000000 ECX=00000000 EDX=000006d3 ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000 EIP=00000000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0 ES =0000 00000000 0000ffff 00009300 CS =9a00 0009a000 0000ffff 00009a00 [...] CR0=60000010 CR2=b6f3e000 CR3=01942000 CR4=000007e0 [...] EFER=0000000000000000 with corresponding SVM error: KVM: FAILED VMRUN WITH VMCB: [...] cpl: 0 efer: 0000000000001000 cr0: 0000000080010010 cr2: 00007fd7fe85bf90 cr3: 0000000187d0c000 cr4: 0000000000000020 [...] What happens is that VCPU state right after offlinig: CR0: 0x80050033 EFER: 0xd01 CR4: 0x7e0 -> long mode with CR3 pointing to longmode page tables and when VCPU gets INIT/SIPI following transition happens CR0: 0 -> 0x60000010 EFER: 0x0 CR4: 0x7e0 -> paging disabled with stale CR3 However SVM under the hood puts VCPU in Paged Real Mode* which effectively translates CR0 0x60000010 -> 80010010 after svm_vcpu_reset() -> init_vmcb() -> kvm_set_cr0() -> svm_set_cr0() but from kvm_set_cr0() perspective CR0: 0 -> 0x60000010 only caching bits are changed and commit d81135a57aa6 ("KVM: x86: do not reset mmu if CR0.CD and CR0.NW are changed")' regressed svm_vcpu_reset() which relied on MMU being reset. As result VMRUN after svm_vcpu_reset() tries to run VCPU in Paged Real Mode with stale MMU context (longmode page tables), which causes some AMD CPUs** to bail out with VMEXIT_INVALID. Fix issue by unconditionally resetting MMU context at init_vmcb() time. * AMD64 Architecture Programmer’s Manual, Volume 2: System Programming, rev: 3.25 15.19 Paged Real Mode ** Opteron 1216 Signed-off-by: Igor Mammedov <imammedo@redhat.com> Fixes: d81135a57aa6 Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| * | | Merge tag 'powerpc-4.3-3' of ↵Linus Torvalds2015-09-253-1/+3
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Wire up sys_membarrier() - cxl: Fix lockdep warning while creating afu_err_buff from Vaibhav * tag 'powerpc-4.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: cxl: Fix lockdep warning while creating afu_err_buff attribute powerpc: Wire up sys_membarrier()
| | * | | powerpc: Wire up sys_membarrier()Michael Ellerman2015-09-213-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The selftest passes on 64-bit LE & BE, and 32-bit. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | | | Merge tag 'devicetree-fixes-for-4.3' of ↵Linus Torvalds2015-09-252-2/+2
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull DeviceTree fixes from Rob Herring: - Silence bogus warning for of_irq_parse_pci - Fix typo in ARM idle-states binding doc and dts files - Various minor binding documentation updates * tag 'devicetree-fixes-for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: Documentation: arm: Fix typo in the idle-states bindings examples gpio: mention in DT binding doc that <name>-gpio is deprecated of_pci_irq: Silence bogus "of_irq_parse_pci() failed ..." messages. devicetree: bindings: Extend the bma180 bindings with bma250 info of: thermal: Mark cooling-*-level properties optional of: thermal: Fix inconsitency between cooling-*-state and cooling-*-level Docs: dt: add #msi-cells to GICv3 ITS binding of: add vendor prefix for Socionext Inc.
| | * | | | Documentation: arm: Fix typo in the idle-states bindings examplesLorenzo Pieralisi2015-09-252-2/+2
| | | |_|/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The idle-states bindings mandate that the entry-method string in the idle-states node must be "psci" for ARM v8 64-bit systems, but the examples in the bindings report a wrong entry-method string. Owing to this typo, some dts in the kernel wrongly defined the entry-method property, since they likely cut and pasted the example definition without paying attention to the bindings definitions. This patch fixes the typo in the DT idle states bindings examples and respective dts in the kernel so that the bindings and related dts files are made compliant. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Howard Chen <howard.chen@linaro.org> Cc: Rob Herring <robh+dt@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Rob Herring <robh@kernel.org>
| * | | | x86, efi, kasan: #undef memset/memcpy/memmove per archAndrey Ryabinin2015-09-231-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In not-instrumented code KASAN replaces instrumented memset/memcpy/memmove with not-instrumented analogues __memset/__memcpy/__memove. However, on x86 the EFI stub is not linked with the kernel. It uses not-instrumented mem*() functions from arch/x86/boot/compressed/string.c So we don't replace them with __mem*() variants in EFI stub. On ARM64 the EFI stub is linked with the kernel, so we should replace mem*() functions with __mem*(), because the EFI stub runs before KASAN sets up early shadow. So let's move these #undef mem* into arch's asm/efi.h which is also included by the EFI stub. Also, this will fix the warning in 32-bit build reported by kbuild test robot: efi-stub-helper.c:599:2: warning: implicit declaration of function 'memcpy' [akpm@linux-foundation.org: use 80 cols in comment] Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com> Reported-by: Fengguang Wu <fengguang.wu@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Matt Fleming <matt.fleming@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | Merge branch 'for-linus' of ↵Linus Torvalds2015-09-2110-181/+147
| |\ \ \ \ | | |_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "A couple of system call updates. The two new system calls userfaultfd and membarrier have been added, as well as the 17 direct calls for the multiplexed socket system calls. In addition the system call compat wrappers have been flagged as notrace functions and a few wrappers could be removed. And bug fixes for the vector register handling, cpu_mf, suspend/resume, compat signals, SMT cputime accounting and the zfcp dumper" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: wire up separate socketcalls system calls s390/compat: remove superfluous compat wrappers s390/compat: do not trace compat wrapper functions s390/s390x: allocate sys_membarrier system call number s390/configs//zfcpdump_defconfig: Remove CONFIG_MEMSTICK s390: wire up userfaultfd system call s390/vtime: correct scaled cputime for SMT s390/cpum_cf: Corrected return code for unauthorized counter sets s390/compat: correct uc_sigmask of the compat signal frame s390: fix floating point register corruption s390/hibernate: fix save and restore of vector registers
| | * | | s390: wire up separate socketcalls system callsHeiko Carstens2015-09-184-21/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As discussed on linux-arch all architectures should wire up the separate system calls that are hidden behind the socketcall multiplexer system call. It's just a couple more system calls and gives us a very small performance improvement. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| | * | | s390/compat: remove superfluous compat wrappersHeiko Carstens2015-09-182-102/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A couple of compat wrapper functions are simply trampolines to the real system call. This happened because the compat wrapper defines will only sign and zero extend system call parameters which are of different size on s390/s390x (longs and pointers). All other parameters will be correctly sign and zero extended by the normal system call wrappers. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| | * | | s390/compat: do not trace compat wrapper functionsHeiko Carstens2015-09-181-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add notrace to the compat wrapper define to disable tracing of compat wrapper functions. These are supposed to be very small and more or less just a trampoline to the real system call. Also fix indentation. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| | * | | s390/s390x: allocate sys_membarrier system call numberMathieu Desnoyers2015-09-172-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: linux-api@vger.kernel.org CC: Heiko Carstens <heiko.carstens@de.ibm.com> CC: linux-s390@vger.kernel.org Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| | * | | s390/configs//zfcpdump_defconfig: Remove CONFIG_MEMSTICKMichael Holzheu2015-09-171-5/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This config option is completely irrelevant for zfcpdump and unfortunately causes a kernel panic on recent kernels in "mspro_block_init()/driver_register()". Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| | * | | s390: wire up userfaultfd system callHeiko Carstens2015-09-172-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| | * | | s390/vtime: correct scaled cputime for SMTMartin Schwidefsky2015-09-171-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The scaled cputime is supposed to be derived from the normal per-thread cputime by dividing it with the average thread density in the last interval. The calculation of the scaling values for the average thread density is incorrect. The current, incorrect calculation: Ci = cycle count with i active threads T = unscaled cputime, sT = scaled cputime sT = T * (C1 + C2 + ... + Cn) / (1*C1 + 2*C2 + ... + n*Cn) The calculation happens to yield the correct numbers for the simple cases with only one Ci value not zero. But for cases with multiple Ci values not zero it fails. E.g. on a SMT-2 system with one thread active half the time and two threads active for the other half of the time it fails, the scaling factor should be 3/4 but the formula gives 2/3. The correct formula is sT = T * (C1/1 + C2/2 + ... + Cn/n) / (C1 + C2 + ... + Cn) Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| | * | | s390/cpum_cf: Corrected return code for unauthorized counter setsHendrik Brueckner2015-09-171-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, the cpum_cf PMU returned -EPERM if a counter is requested and the counter set to which the counter belongs is not authorized. According to the perf_event_open() system call manual, an error code of EPERM indicates an unsupported exclude setting or CAP_SYS_ADMIN is missing. Use ENOENT to indicate that particular counters are not available when the counter set which contains the counter is not authorized. For generic events, this might trigger a fall back, for example, to a software event. Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| | * | | s390/compat: correct uc_sigmask of the compat signal frameMartin Schwidefsky2015-09-171-4/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The uc_sigmask in the ucontext structure is an array of words to keep the 64 signal bits (or 1024 if you ask glibc but the kernel sigset_t only has 64 bits). For 64 bit the sigset_t contains a single 8 byte word, but for 31 bit there are two 4 byte words. The compat signal handler code uses a simple copy of the 64 bit sigset_t to the 31 bit compat_sigset_t. As s390 is a big-endian architecture this is incorrect, the two words in the 31 bit sigset_t array need to be swapped. Cc: <stable@vger.kernel.org> Reported-by: Stefan Liebler <stli@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| | * | | s390: fix floating point register corruptionHeiko Carstens2015-09-171-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The critical section cleanup code misses to add the offset of the thread_struct to the task address. Therefore, if the critical section code gets executed, it may corrupt the task struct or restore the contents of the floating point registers from the wrong memory location. Fixes d0164ee20d "s390/kernel: remove save_fpu_regs() parameter and use __LC_CURRENT instead". Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| | * | | s390/hibernate: fix save and restore of vector registersMartin Schwidefsky2015-09-171-35/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The swsusp_arch_suspend()/swsusp_arch_resume() functions currently only save and restore the floating point registers. If the task that started the hibernation process is using vector registers they can get lost. To fix this just call save_fpu_regs in swsusp_arch_suspend(), the restore will happen automatically on return to user space. Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
| * | | | Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-armLinus Torvalds2015-09-203-10/+17
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull ARM fixes from Russell King: "Three fixes and a resulting cleanup for -rc2: - Andre Przywara reported that he was seeing a warning with the new cast inside DMA_ERROR_CODE's definition, and fixed the incorrect use. - Doug Anderson noticed that kgdb causes a "scheduling while atomic" bug. - OMAP5 folk noticed that their Thumb-2 compiled X servers crashed when enabling support to cover ARMv6 CPUs due to a kernel bug leaking some conditional context into the signal handler" * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: ARM: 8425/1: kgdb: Don't try to stop the machine when setting breakpoints ARM: 8437/1: dma-mapping: fix build warning with new DMA_ERROR_CODE definition ARM: get rid of needless #if in signal handling code ARM: fix Thumb2 signal handling when ARMv6 is enabled
| | * | | | ARM: 8425/1: kgdb: Don't try to stop the machine when setting breakpointsDoug Anderson2015-09-171-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In (23a4e40 arm: kgdb: Handle read-only text / modules) we moved to using patch_text() to set breakpoints so that we could handle the case when we had CONFIG_DEBUG_RODATA. That patch used patch_text(). Unfortunately, patch_text() assumes that we're not in atomic context when it runs since it needs to grab a mutex and also wait for other CPUs to stop (which it does with a completion). This would result in a stack crawl if you had CONFIG_DEBUG_ATOMIC_SLEEP and tried to set a breakpoint in kgdb. The crawl looked something like: BUG: scheduling while atomic: swapper/0/0/0x00010007 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-rc7-00133-geb63b34 #1073 Hardware name: Rockchip (Device Tree) (unwind_backtrace) from [<c00133d4>] (show_stack+0x20/0x24) (show_stack) from [<c05400e8>] (dump_stack+0x84/0xb8) (dump_stack) from [<c004913c>] (__schedule_bug+0x54/0x6c) (__schedule_bug) from [<c054065c>] (__schedule+0x80/0x668) (__schedule) from [<c0540cfc>] (schedule+0xb8/0xd4) (schedule) from [<c0543a3c>] (schedule_timeout+0x2c/0x234) (schedule_timeout) from [<c05417c0>] (wait_for_common+0xf4/0x188) (wait_for_common) from [<c0541874>] (wait_for_completion+0x20/0x24) (wait_for_completion) from [<c00a0104>] (__stop_cpus+0x58/0x70) (__stop_cpus) from [<c00a0580>] (stop_cpus+0x3c/0x54) (stop_cpus) from [<c00a06c4>] (__stop_machine+0xcc/0xe8) (__stop_machine) from [<c00a0714>] (stop_machine+0x34/0x44) (stop_machine) from [<c00173e8>] (patch_text+0x28/0x34) (patch_text) from [<c001733c>] (kgdb_arch_set_breakpoint+0x40/0x4c) (kgdb_arch_set_breakpoint) from [<c00a0d68>] (kgdb_validate_break_address+0x2c/0x60) (kgdb_validate_break_address) from [<c00a0e90>] (dbg_set_sw_break+0x1c/0xdc) (dbg_set_sw_break) from [<c00a2e88>] (gdb_serial_stub+0x9c4/0xba4) (gdb_serial_stub) from [<c00a11cc>] (kgdb_cpu_enter+0x1f8/0x60c) (kgdb_cpu_enter) from [<c00a18cc>] (kgdb_handle_exception+0x19c/0x1d0) (kgdb_handle_exception) from [<c0016f7c>] (kgdb_compiled_brk_fn+0x30/0x3c) (kgdb_compiled_brk_fn) from [<c00091a4>] (do_undefinstr+0x1a4/0x20c) (do_undefinstr) from [<c001400c>] (__und_svc_finish+0x0/0x34) It turns out that when we're in kgdb all the CPUs are stopped anyway so there's no reason we should be calling patch_text(). We can instead directly call __patch_text() which assumes that CPUs have already been stopped. Fixes: 23a4e4050ba9 ("arm: kgdb: Handle read-only text / modules") Reported-by: Aapo Vienamo <avienamo@nvidia.com> Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Stephen Boyd <sboyd@codeaurora.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
| | * | | | ARM: 8437/1: dma-mapping: fix build warning with new DMA_ERROR_CODE definitionAndre Przywara2015-09-171-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 96231b2686b5: ("ARM: 8419/1: dma-mapping: harmonize definition of DMA_ERROR_CODE") changed the definition of DMA_ERROR_CODE to use dma_addr_t, which makes the compiler barf on assigning this to an "int" variable on ARM with LPAE enabled: ************* In file included from /src/linux/include/linux/dma-mapping.h:86:0, from /src/linux/arch/arm/mm/dma-mapping.c:21: /src/linux/arch/arm/mm/dma-mapping.c: In function '__iommu_create_mapping': /src/linux/arch/arm/include/asm/dma-mapping.h:16:24: warning: overflow in implicit constant conversion [-Woverflow] #define DMA_ERROR_CODE (~(dma_addr_t)0x0) ^ /src/linux/arch/arm/mm/dma-mapping.c:1252:15: note: in expansion of macro DMA_ERROR_CODE' int i, ret = DMA_ERROR_CODE; ^ ************* Remove the actually unneeded initialization of "ret" in __iommu_create_mapping() and move the variable declaration inside the for-loop to make the scope of this variable more clear. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
| | * | | | ARM: get rid of needless #if in signal handling codeRussell King2015-09-171-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the #if statement which caused trouble for kernels that support both ARMv6 and ARMv7. Older architectures do not implement these bits, so it should be safe to always clear them. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
| | * | | | ARM: fix Thumb2 signal handling when ARMv6 is enabledRussell King2015-09-161-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a kernel is built covering ARMv6 to ARMv7, we omit to clear the IT state when entering a signal handler. This can cause the first few instructions to be conditionally executed depending on the parent context. In any case, the original test for >= ARMv7 is broken - ARMv6 can have Thumb-2 support as well, and an ARMv6T2 specific build would omit this code too. Relax the test back to ARMv6 or greater. This results in us always clearing the IT state bits in the PSR, even on CPUs where these bits are reserved. However, they're reserved for the IT state, so this should cause no harm. Cc: <stable@vger.kernel.org> Fixes: d71e1352e240 ("Clear the IT state when invoking a Thumb-2 signal handler") Acked-by: Tony Lindgren <tony@atomide.com> Tested-by: H. Nikolaus Schaller <hns@goldelico.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
| * | | | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2015-09-1822-83/+57
| |\ \ \ \ \ | | | |_|_|/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM fixes from Paolo Bonzini: "Mostly stable material, a lot of ARM fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits) sched: access local runqueue directly in single_task_running arm/arm64: KVM: Remove 'config KVM_ARM_MAX_VCPUS' arm64: KVM: Remove all traces of the ThumbEE registers arm: KVM: Disable virtual timer even if the guest is not using it arm64: KVM: Disable virtual timer even if the guest is not using it arm/arm64: KVM: vgic: Check for !irqchip_in_kernel() when mapping resources KVM: s390: Replace incorrect atomic_or with atomic_andnot arm: KVM: Fix incorrect device to IPA mapping arm64: KVM: Fix user access for debug registers KVM: vmx: fix VPID is 0000H in non-root operation KVM: add halt_attempted_poll to VCPU stats kvm: fix zero length mmio searching kvm: fix double free for fast mmio eventfd kvm: factor out core eventfd assign/deassign logic kvm: don't try to register to KVM_FAST_MMIO_BUS for non mmio eventfd KVM: make the declaration of functions within 80 characters KVM: arm64: add workaround for Cortex-A57 erratum #852523 KVM: fix polling for guest halt continued even if disable it arm/arm64: KVM: Fix PSCI affinity info return value for non valid cores arm64: KVM: set {v,}TCR_EL2 RES1 bits ...
| | * | | | Merge tag 'kvm-arm-for-4.3-rc2-2' of ↵Paolo Bonzini2015-09-1711-75/+28
| | |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master Second set of KVM/ARM changes for 4.3-rc2 - Workaround for a Cortex-A57 erratum - Bug fix for the debugging infrastructure - Fix for 32bit guests with more than 4GB of address space on a 32bit host - A number of fixes for the (unusual) case when we don't use the in-kernel GIC emulation - Removal of ThumbEE handling on arm64, since these have been dropped from the architecture before anyone actually ever built a CPU - Remove the KVM_ARM_MAX_VCPUS limitation which has become fairly pointless
| | | * | | | arm/arm64: KVM: Remove 'config KVM_ARM_MAX_VCPUS'Ming Lei2015-09-174-34/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes config option of KVM_ARM_MAX_VCPUS, and like other ARCHs, just choose the maximum allowed value from hardware, and follows the reasons: 1) from distribution view, the option has to be defined as the max allowed value because it need to meet all kinds of virtulization applications and need to support most of SoCs; 2) using a bigger value doesn't introduce extra memory consumption, and the help text in Kconfig isn't accurate because kvm_vpu structure isn't allocated until request of creating VCPU is sent from QEMU; 3) the main effect is that the field of vcpus[] in 'struct kvm' becomes a bit bigger(sizeof(void *) per vcpu) and need more cache lines to hold the structure, but 'struct kvm' is one generic struct, and it has worked well on other ARCHs already in this way. Also, the world switch frequecy is often low, for example, it is ~2000 when running kernel building load in VM from APM xgene KVM host, so the effect is very small, and the difference can't be observed in my test at all. Cc: Dann Frazier <dann.frazier@canonical.com> Signed-off-by: Ming Lei <ming.lei@canonical.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | | * | | | arm64: KVM: Remove all traces of the ThumbEE registersWill Deacon2015-09-174-29/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Although the ThumbEE registers and traps were present in earlier versions of the v8 architecture, it was retrospectively removed and so we can do the same. Whilst this breaks migrating a guest started on a previous version of the kernel, it is much better to kill these (non existent) registers as soon as possible. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> [maz: added commend about migration] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | | * | | | arm: KVM: Disable virtual timer even if the guest is not using itMarc Zyngier2015-09-171-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When running a guest with the architected timer disabled (with QEMU and the kernel_irqchip=off option, for example), it is important to make sure the timer gets turned off. Otherwise, the guest may try to enable it anyway, leading to a screaming HW interrupt. The fix is to unconditionally turn off the virtual timer on guest exit. Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | | * | | | arm64: KVM: Disable virtual timer even if the guest is not using itMarc Zyngier2015-09-171-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When running a guest with the architected timer disabled (with QEMU and the kernel_irqchip=off option, for example), it is important to make sure the timer gets turned off. Otherwise, the guest may try to enable it anyway, leading to a screaming HW interrupt. The fix is to unconditionally turn off the virtual timer on guest exit. Cc: stable@vger.kernel.org Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | | * | | | arm/arm64: KVM: vgic: Check for !irqchip_in_kernel() when mapping resourcesPavel Fedin2015-09-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Until b26e5fdac43c ("arm/arm64: KVM: introduce per-VM ops"), kvm_vgic_map_resources() used to include a check on irqchip_in_kernel(), and vgic_v2_map_resources() still has it. But now vm_ops are not initialized until we call kvm_vgic_create(). Therefore kvm_vgic_map_resources() can being called without a VGIC, and we die because vm_ops.map_resources is NULL. Fixing this restores QEMU's kernel-irqchip=off option to a working state, allowing to use GIC emulation in userspace. Fixes: b26e5fdac43c ("arm/arm64: KVM: introduce per-VM ops") Cc: stable@vger.kernel.org Signed-off-by: Pavel Fedin <p.fedin@samsung.com> [maz: reworked commit message] Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | | * | | | arm: KVM: Fix incorrect device to IPA mappingMarek Majtyka2015-09-161-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A critical bug has been found in device memory stage1 translation for VMs with more then 4GB of address space. Once vm_pgoff size is smaller then pa (which is true for LPAE case, u32 and u64 respectively) some more significant bits of pa may be lost as a shift operation is performed on u32 and later cast onto u64. Example: vm_pgoff(u32)=0x00210030, PAGE_SHIFT=12 expected pa(u64): 0x0000002010030000 produced pa(u64): 0x0000000010030000 The fix is to change the order of operations (casting first onto phys_addr_t and then shifting). Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> [maz: fixed changelog and patch formatting] Cc: stable@vger.kernel.org Signed-off-by: Marek Majtyka <marek.majtyka@tieto.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | | * | | | arm64: KVM: Fix user access for debug registersMarc Zyngier2015-09-161-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When setting the debug register from userspace, make sure that copy_from_user() is called with its parameters in the expected order. It otherwise doesn't do what you think. Fixes: 84e690bfbed1 ("KVM: arm64: introduce vcpu->arch.debug_ptr") Reported-by: Peter Maydell <peter.maydell@linaro.org> Cc: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | | * | | | KVM: arm64: add workaround for Cortex-A57 erratum #852523Will Deacon2015-09-141-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When restoring the system register state for an AArch32 guest at EL2, writes to DACR32_EL2 may not be correctly synchronised by Cortex-A57, which can lead to the guest effectively running with junk in the DACR and running into unexpected domain faults. This patch works around the issue by re-ordering our restoration of the AArch32 register aliases so that they happen before the AArch64 system registers. Ensuring that the registers are restored in this order guarantees that they will be correctly synchronised by the core. Cc: <stable@vger.kernel.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
| | * | | | | KVM: s390: Replace incorrect atomic_or with atomic_andnotJason J. Herne2015-09-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The offending commit accidentally replaces an atomic_clear with an atomic_or instead of an atomic_andnot in kvm_s390_vcpu_request_handled. The symptom is that kvm guests on s390 hang on startup. This patch simply replaces the incorrect atomic_or with atomic_andnot Fixes: 805de8f43c20 (atomic: Replace atomic_{set,clear}_mask() usage) Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| | * | | | | KVM: vmx: fix VPID is 0000H in non-root operationWanpeng Li2015-09-161-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reference SDM 28.1: The current VPID is 0000H in the following situations: - Outside VMX operation. (This includes operation in system-management mode under the default treatment of SMIs and SMM with VMX operation; see Section 34.14.) - In VMX root operation. - In VMX non-root operation when the “enable VPID” VM-execution control is 0. The VPID should never be 0000H in non-root operation when "enable VPID" VM-execution control is 1. However, commit 34a1cd60 ("kvm: x86: vmx: move some vmx setting from vmx_init() to hardware_setup()") remove the codes which reserve 0000H for VMX root operation. This patch fix it by again reserving 0000H for VMX root operation. Cc: stable@vger.kernel.org # 3.19+ Fixes: 34a1cd60d17f62c1f077c1478a6c2ca8c3d17af4 Reported-by: Wincy Van <fanwenyi0529@gmail.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
| | * | | | | KVM: add halt_attempted_poll to VCPU statsPaolo Bonzini2015-09-1611-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This new statistic can help diagnosing VCPUs that, for any reason, trigger bad behavior of halt_poll_ns autotuning. For example, say halt_poll_ns = 480000, and wakeups are spaced exactly like 479us, 481us, 479us, 481us. Then KVM always fails polling and wastes 10+20+40+80+160+320+480 = 1110 microseconds out of every 479+481+479+481+479+481+479 = 3359 microseconds. The VCPU then is consuming about 30% more CPU than it would use without polling. This would show as an abnormally high number of attempted polling compared to the successful polls. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com< Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>