diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2024-11-24 01:00:50 +0100 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2024-11-24 01:00:50 +0100 |
commit | 9f16d5e6f220661f73b36a4be1b21575651d8833 (patch) | |
tree | 8d26e5eeb7d74c83667ad91332c961c631ac6907 /arch/loongarch | |
parent | Merge tag 'powerpc-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/p... (diff) | |
parent | KVM: x86: Break CONFIG_KVM_X86's direct dependency on KVM_INTEL || KVM_AMD (diff) | |
download | linux-9f16d5e6f220661f73b36a4be1b21575651d8833.tar.xz linux-9f16d5e6f220661f73b36a4be1b21575651d8833.zip |
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"The biggest change here is eliminating the awful idea that KVM had of
essentially guessing which pfns are refcounted pages.
The reason to do so was that KVM needs to map both non-refcounted
pages (for example BARs of VFIO devices) and VM_PFNMAP/VM_MIXMEDMAP
VMAs that contain refcounted pages.
However, the result was security issues in the past, and more recently
the inability to map VM_IO and VM_PFNMAP memory that _is_ backed by
struct page but is not refcounted. In particular this broke virtio-gpu
blob resources (which directly map host graphics buffers into the
guest as "vram" for the virtio-gpu device) with the amdgpu driver,
because amdgpu allocates non-compound higher order pages and the tail
pages could not be mapped into KVM.
This requires adjusting all uses of struct page in the
per-architecture code, to always work on the pfn whenever possible.
The large series that did this, from David Stevens and Sean
Christopherson, also cleaned up substantially the set of functions
that provided arch code with the pfn for a host virtual addresses.
The previous maze of twisty little passages, all different, is
replaced by five functions (__gfn_to_page, __kvm_faultin_pfn, the
non-__ versions of these two, and kvm_prefetch_pages) saving almost
200 lines of code.
ARM:
- Support for stage-1 permission indirection (FEAT_S1PIE) and
permission overlays (FEAT_S1POE), including nested virt + the
emulated page table walker
- Introduce PSCI SYSTEM_OFF2 support to KVM + client driver. This
call was introduced in PSCIv1.3 as a mechanism to request
hibernation, similar to the S4 state in ACPI
- Explicitly trap + hide FEAT_MPAM (QoS controls) from KVM guests. As
part of it, introduce trivial initialization of the host's MPAM
context so KVM can use the corresponding traps
- PMU support under nested virtualization, honoring the guest
hypervisor's trap configuration and event filtering when running a
nested guest
- Fixes to vgic ITS serialization where stale device/interrupt table
entries are not zeroed when the mapping is invalidated by the VM
- Avoid emulated MMIO completion if userspace has requested
synchronous external abort injection
- Various fixes and cleanups affecting pKVM, vCPU initialization, and
selftests
LoongArch:
- Add iocsr and mmio bus simulation in kernel.
- Add in-kernel interrupt controller emulation.
- Add support for virtualization extensions to the eiointc irqchip.
PPC:
- Drop lingering and utterly obsolete references to PPC970 KVM, which
was removed 10 years ago.
- Fix incorrect documentation references to non-existing ioctls
RISC-V:
- Accelerate KVM RISC-V when running as a guest
- Perf support to collect KVM guest statistics from host side
s390:
- New selftests: more ucontrol selftests and CPU model sanity checks
- Support for the gen17 CPU model
- List registers supported by KVM_GET/SET_ONE_REG in the
documentation
x86:
- Cleanup KVM's handling of Accessed and Dirty bits to dedup code,
improve documentation, harden against unexpected changes.
Even if the hardware A/D tracking is disabled, it is possible to
use the hardware-defined A/D bits to track if a PFN is Accessed
and/or Dirty, and that removes a lot of special cases.
- Elide TLB flushes when aging secondary PTEs, as has been done in
x86's primary MMU for over 10 years.
- Recover huge pages in-place in the TDP MMU when dirty page logging
is toggled off, instead of zapping them and waiting until the page
is re-accessed to create a huge mapping. This reduces vCPU jitter.
- Batch TLB flushes when dirty page logging is toggled off. This
reduces the time it takes to disable dirty logging by ~3x.
- Remove the shrinker that was (poorly) attempting to reclaim shadow
page tables in low-memory situations.
- Clean up and optimize KVM's handling of writes to
MSR_IA32_APICBASE.
- Advertise CPUIDs for new instructions in Clearwater Forest
- Quirk KVM's misguided behavior of initialized certain feature MSRs
to their maximum supported feature set, which can result in KVM
creating invalid vCPU state. E.g. initializing PERF_CAPABILITIES to
a non-zero value results in the vCPU having invalid state if
userspace hides PDCM from the guest, which in turn can lead to
save/restore failures.
- Fix KVM's handling of non-canonical checks for vCPUs that support
LA57 to better follow the "architecture", in quotes because the
actual behavior is poorly documented. E.g. most MSR writes and
descriptor table loads ignore CR4.LA57 and operate purely on
whether the CPU supports LA57.
- Bypass the register cache when querying CPL from kvm_sched_out(),
as filling the cache from IRQ context is generally unsafe; harden
the cache accessors to try to prevent similar issues from occuring
in the future. The issue that triggered this change was already
fixed in 6.12, but was still kinda latent.
- Advertise AMD_IBPB_RET to userspace, and fix a related bug where
KVM over-advertises SPEC_CTRL when trying to support cross-vendor
VMs.
- Minor cleanups
- Switch hugepage recovery thread to use vhost_task.
These kthreads can consume significant amounts of CPU time on
behalf of a VM or in response to how the VM behaves (for example
how it accesses its memory); therefore KVM tried to place the
thread in the VM's cgroups and charge the CPU time consumed by that
work to the VM's container.
However the kthreads did not process SIGSTOP/SIGCONT, and therefore
cgroups which had KVM instances inside could not complete freezing.
Fix this by replacing the kthread with a PF_USER_WORKER thread, via
the vhost_task abstraction. Another 100+ lines removed, with
generally better behavior too like having these threads properly
parented in the process tree.
- Revert a workaround for an old CPU erratum (Nehalem/Westmere) that
didn't really work; there was really nothing to work around anyway:
the broken patch was meant to fix nested virtualization, but the
PERF_GLOBAL_CTRL MSR is virtualized and therefore unaffected by the
erratum.
- Fix 6.12 regression where CONFIG_KVM will be built as a module even
if asked to be builtin, as long as neither KVM_INTEL nor KVM_AMD is
'y'.
x86 selftests:
- x86 selftests can now use AVX.
Documentation:
- Use rST internal links
- Reorganize the introduction to the API document
Generic:
- Protect vcpu->pid accesses outside of vcpu->mutex with a rwlock
instead of RCU, so that running a vCPU on a different task doesn't
encounter long due to having to wait for all CPUs become quiescent.
In general both reads and writes are rare, but userspace that
supports confidential computing is introducing the use of "helper"
vCPUs that may jump from one host processor to another. Those will
be very happy to trigger a synchronize_rcu(), and the effect on
performance is quite the disaster"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (298 commits)
KVM: x86: Break CONFIG_KVM_X86's direct dependency on KVM_INTEL || KVM_AMD
KVM: x86: add back X86_LOCAL_APIC dependency
Revert "KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config()"
KVM: x86: switch hugepage recovery thread to vhost_task
KVM: x86: expose MSR_PLATFORM_INFO as a feature MSR
x86: KVM: Advertise CPUIDs for new instructions in Clearwater Forest
Documentation: KVM: fix malformed table
irqchip/loongson-eiointc: Add virt extension support
LoongArch: KVM: Add irqfd support
LoongArch: KVM: Add PCHPIC user mode read and write functions
LoongArch: KVM: Add PCHPIC read and write functions
LoongArch: KVM: Add PCHPIC device support
LoongArch: KVM: Add EIOINTC user mode read and write functions
LoongArch: KVM: Add EIOINTC read and write functions
LoongArch: KVM: Add EIOINTC device support
LoongArch: KVM: Add IPI user mode read and write function
LoongArch: KVM: Add IPI read and write function
LoongArch: KVM: Add IPI device support
LoongArch: KVM: Add iocsr and mmio bus simulation in kernel
KVM: arm64: Pass on SVE mapping failures
...
Diffstat (limited to 'arch/loongarch')
-rw-r--r-- | arch/loongarch/include/asm/irq.h | 1 | ||||
-rw-r--r-- | arch/loongarch/include/asm/kvm_eiointc.h | 123 | ||||
-rw-r--r-- | arch/loongarch/include/asm/kvm_host.h | 18 | ||||
-rw-r--r-- | arch/loongarch/include/asm/kvm_ipi.h | 45 | ||||
-rw-r--r-- | arch/loongarch/include/asm/kvm_pch_pic.h | 62 | ||||
-rw-r--r-- | arch/loongarch/include/uapi/asm/kvm.h | 20 | ||||
-rw-r--r-- | arch/loongarch/kvm/Kconfig | 5 | ||||
-rw-r--r-- | arch/loongarch/kvm/Makefile | 4 | ||||
-rw-r--r-- | arch/loongarch/kvm/exit.c | 82 | ||||
-rw-r--r-- | arch/loongarch/kvm/intc/eiointc.c | 1027 | ||||
-rw-r--r-- | arch/loongarch/kvm/intc/ipi.c | 475 | ||||
-rw-r--r-- | arch/loongarch/kvm/intc/pch_pic.c | 519 | ||||
-rw-r--r-- | arch/loongarch/kvm/irqfd.c | 89 | ||||
-rw-r--r-- | arch/loongarch/kvm/main.c | 19 | ||||
-rw-r--r-- | arch/loongarch/kvm/mmu.c | 40 | ||||
-rw-r--r-- | arch/loongarch/kvm/vcpu.c | 3 | ||||
-rw-r--r-- | arch/loongarch/kvm/vm.c | 21 |
17 files changed, 2497 insertions, 56 deletions
diff --git a/arch/loongarch/include/asm/irq.h b/arch/loongarch/include/asm/irq.h index 9c2ca785faa9..a0ca84da8541 100644 --- a/arch/loongarch/include/asm/irq.h +++ b/arch/loongarch/include/asm/irq.h @@ -65,6 +65,7 @@ extern struct acpi_vector_group pch_group[MAX_IO_PICS]; extern struct acpi_vector_group msi_group[MAX_IO_PICS]; #define CORES_PER_EIO_NODE 4 +#define CORES_PER_VEIO_NODE 256 #define LOONGSON_CPU_UART0_VEC 10 /* CPU UART0 */ #define LOONGSON_CPU_THSENS_VEC 14 /* CPU Thsens */ diff --git a/arch/loongarch/include/asm/kvm_eiointc.h b/arch/loongarch/include/asm/kvm_eiointc.h new file mode 100644 index 000000000000..a3a40aba8acf --- /dev/null +++ b/arch/loongarch/include/asm/kvm_eiointc.h @@ -0,0 +1,123 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2024 Loongson Technology Corporation Limited + */ + +#ifndef __ASM_KVM_EIOINTC_H +#define __ASM_KVM_EIOINTC_H + +#include <kvm/iodev.h> + +#define EIOINTC_IRQS 256 +#define EIOINTC_ROUTE_MAX_VCPUS 256 +#define EIOINTC_IRQS_U8_NUMS (EIOINTC_IRQS / 8) +#define EIOINTC_IRQS_U16_NUMS (EIOINTC_IRQS_U8_NUMS / 2) +#define EIOINTC_IRQS_U32_NUMS (EIOINTC_IRQS_U8_NUMS / 4) +#define EIOINTC_IRQS_U64_NUMS (EIOINTC_IRQS_U8_NUMS / 8) +/* map to ipnum per 32 irqs */ +#define EIOINTC_IRQS_NODETYPE_COUNT 16 + +#define EIOINTC_BASE 0x1400 +#define EIOINTC_SIZE 0x900 + +#define EIOINTC_NODETYPE_START 0xa0 +#define EIOINTC_NODETYPE_END 0xbf +#define EIOINTC_IPMAP_START 0xc0 +#define EIOINTC_IPMAP_END 0xc7 +#define EIOINTC_ENABLE_START 0x200 +#define EIOINTC_ENABLE_END 0x21f +#define EIOINTC_BOUNCE_START 0x280 +#define EIOINTC_BOUNCE_END 0x29f +#define EIOINTC_ISR_START 0x300 +#define EIOINTC_ISR_END 0x31f +#define EIOINTC_COREISR_START 0x400 +#define EIOINTC_COREISR_END 0x41f +#define EIOINTC_COREMAP_START 0x800 +#define EIOINTC_COREMAP_END 0x8ff + +#define EIOINTC_VIRT_BASE (0x40000000) +#define EIOINTC_VIRT_SIZE (0x1000) + +#define EIOINTC_VIRT_FEATURES (0x0) +#define EIOINTC_HAS_VIRT_EXTENSION (0) +#define EIOINTC_HAS_ENABLE_OPTION (1) +#define EIOINTC_HAS_INT_ENCODE (2) +#define EIOINTC_HAS_CPU_ENCODE (3) +#define EIOINTC_VIRT_HAS_FEATURES ((1U << EIOINTC_HAS_VIRT_EXTENSION) \ + | (1U << EIOINTC_HAS_ENABLE_OPTION) \ + | (1U << EIOINTC_HAS_INT_ENCODE) \ + | (1U << EIOINTC_HAS_CPU_ENCODE)) +#define EIOINTC_VIRT_CONFIG (0x4) +#define EIOINTC_ENABLE (1) +#define EIOINTC_ENABLE_INT_ENCODE (2) +#define EIOINTC_ENABLE_CPU_ENCODE (3) + +#define LOONGSON_IP_NUM 8 + +struct loongarch_eiointc { + spinlock_t lock; + struct kvm *kvm; + struct kvm_io_device device; + struct kvm_io_device device_vext; + uint32_t num_cpu; + uint32_t features; + uint32_t status; + + /* hardware state */ + union nodetype { + u64 reg_u64[EIOINTC_IRQS_NODETYPE_COUNT / 4]; + u32 reg_u32[EIOINTC_IRQS_NODETYPE_COUNT / 2]; + u16 reg_u16[EIOINTC_IRQS_NODETYPE_COUNT]; + u8 reg_u8[EIOINTC_IRQS_NODETYPE_COUNT * 2]; + } nodetype; + + /* one bit shows the state of one irq */ + union bounce { + u64 reg_u64[EIOINTC_IRQS_U64_NUMS]; + u32 reg_u32[EIOINTC_IRQS_U32_NUMS]; + u16 reg_u16[EIOINTC_IRQS_U16_NUMS]; + u8 reg_u8[EIOINTC_IRQS_U8_NUMS]; + } bounce; + + union isr { + u64 reg_u64[EIOINTC_IRQS_U64_NUMS]; + u32 reg_u32[EIOINTC_IRQS_U32_NUMS]; + u16 reg_u16[EIOINTC_IRQS_U16_NUMS]; + u8 reg_u8[EIOINTC_IRQS_U8_NUMS]; + } isr; + union coreisr { + u64 reg_u64[EIOINTC_ROUTE_MAX_VCPUS][EIOINTC_IRQS_U64_NUMS]; + u32 reg_u32[EIOINTC_ROUTE_MAX_VCPUS][EIOINTC_IRQS_U32_NUMS]; + u16 reg_u16[EIOINTC_ROUTE_MAX_VCPUS][EIOINTC_IRQS_U16_NUMS]; + u8 reg_u8[EIOINTC_ROUTE_MAX_VCPUS][EIOINTC_IRQS_U8_NUMS]; + } coreisr; + union enable { + u64 reg_u64[EIOINTC_IRQS_U64_NUMS]; + u32 reg_u32[EIOINTC_IRQS_U32_NUMS]; + u16 reg_u16[EIOINTC_IRQS_U16_NUMS]; + u8 reg_u8[EIOINTC_IRQS_U8_NUMS]; + } enable; + + /* use one byte to config ipmap for 32 irqs at once */ + union ipmap { + u64 reg_u64; + u32 reg_u32[EIOINTC_IRQS_U32_NUMS / 4]; + u16 reg_u16[EIOINTC_IRQS_U16_NUMS / 4]; + u8 reg_u8[EIOINTC_IRQS_U8_NUMS / 4]; + } ipmap; + /* use one byte to config coremap for one irq */ + union coremap { + u64 reg_u64[EIOINTC_IRQS / 8]; + u32 reg_u32[EIOINTC_IRQS / 4]; + u16 reg_u16[EIOINTC_IRQS / 2]; + u8 reg_u8[EIOINTC_IRQS]; + } coremap; + + DECLARE_BITMAP(sw_coreisr[EIOINTC_ROUTE_MAX_VCPUS][LOONGSON_IP_NUM], EIOINTC_IRQS); + uint8_t sw_coremap[EIOINTC_IRQS]; +}; + +int kvm_loongarch_register_eiointc_device(void); +void eiointc_set_irq(struct loongarch_eiointc *s, int irq, int level); + +#endif /* __ASM_KVM_EIOINTC_H */ diff --git a/arch/loongarch/include/asm/kvm_host.h b/arch/loongarch/include/asm/kvm_host.h index d6bb72424027..7b8367c39da8 100644 --- a/arch/loongarch/include/asm/kvm_host.h +++ b/arch/loongarch/include/asm/kvm_host.h @@ -18,8 +18,13 @@ #include <asm/inst.h> #include <asm/kvm_mmu.h> +#include <asm/kvm_ipi.h> +#include <asm/kvm_eiointc.h> +#include <asm/kvm_pch_pic.h> #include <asm/loongarch.h> +#define __KVM_HAVE_ARCH_INTC_INITIALIZED + /* Loongarch KVM register ids */ #define KVM_GET_IOC_CSR_IDX(id) ((id & KVM_CSR_IDX_MASK) >> LOONGARCH_REG_SHIFT) #define KVM_GET_IOC_CPUCFG_IDX(id) ((id & KVM_CPUCFG_IDX_MASK) >> LOONGARCH_REG_SHIFT) @@ -44,6 +49,12 @@ struct kvm_vm_stat { struct kvm_vm_stat_generic generic; u64 pages; u64 hugepages; + u64 ipi_read_exits; + u64 ipi_write_exits; + u64 eiointc_read_exits; + u64 eiointc_write_exits; + u64 pch_pic_read_exits; + u64 pch_pic_write_exits; }; struct kvm_vcpu_stat { @@ -84,7 +95,7 @@ struct kvm_world_switch { * * For LOONGARCH_CSR_CPUID register, max CPUID size if 512 * For IPI hardware, max destination CPUID size 1024 - * For extioi interrupt controller, max destination CPUID size is 256 + * For eiointc interrupt controller, max destination CPUID size is 256 * For msgint interrupt controller, max supported CPUID size is 65536 * * Currently max CPUID is defined as 256 for KVM hypervisor, in future @@ -117,6 +128,9 @@ struct kvm_arch { s64 time_offset; struct kvm_context __percpu *vmcs; + struct loongarch_ipi *ipi; + struct loongarch_eiointc *eiointc; + struct loongarch_pch_pic *pch_pic; }; #define CSR_MAX_NUMS 0x800 @@ -221,6 +235,8 @@ struct kvm_vcpu_arch { int last_sched_cpu; /* mp state */ struct kvm_mp_state mp_state; + /* ipi state */ + struct ipi_state ipi_state; /* cpucfg */ u32 cpucfg[KVM_MAX_CPUCFG_REGS]; diff --git a/arch/loongarch/include/asm/kvm_ipi.h b/arch/loongarch/include/asm/kvm_ipi.h new file mode 100644 index 000000000000..060163dfb4a3 --- /dev/null +++ b/arch/loongarch/include/asm/kvm_ipi.h @@ -0,0 +1,45 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2024 Loongson Technology Corporation Limited + */ + +#ifndef __ASM_KVM_IPI_H +#define __ASM_KVM_IPI_H + +#include <kvm/iodev.h> + +#define LARCH_INT_IPI 12 + +struct loongarch_ipi { + spinlock_t lock; + struct kvm *kvm; + struct kvm_io_device device; +}; + +struct ipi_state { + spinlock_t lock; + uint32_t status; + uint32_t en; + uint32_t set; + uint32_t clear; + uint64_t buf[4]; +}; + +#define IOCSR_IPI_BASE 0x1000 +#define IOCSR_IPI_SIZE 0x160 + +#define IOCSR_IPI_STATUS 0x000 +#define IOCSR_IPI_EN 0x004 +#define IOCSR_IPI_SET 0x008 +#define IOCSR_IPI_CLEAR 0x00c +#define IOCSR_IPI_BUF_20 0x020 +#define IOCSR_IPI_BUF_28 0x028 +#define IOCSR_IPI_BUF_30 0x030 +#define IOCSR_IPI_BUF_38 0x038 +#define IOCSR_IPI_SEND 0x040 +#define IOCSR_MAIL_SEND 0x048 +#define IOCSR_ANY_SEND 0x158 + +int kvm_loongarch_register_ipi_device(void); + +#endif diff --git a/arch/loongarch/include/asm/kvm_pch_pic.h b/arch/loongarch/include/asm/kvm_pch_pic.h new file mode 100644 index 000000000000..e6df6a4c1c70 --- /dev/null +++ b/arch/loongarch/include/asm/kvm_pch_pic.h @@ -0,0 +1,62 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2024 Loongson Technology Corporation Limited + */ + +#ifndef __ASM_KVM_PCH_PIC_H +#define __ASM_KVM_PCH_PIC_H + +#include <kvm/iodev.h> + +#define PCH_PIC_SIZE 0x3e8 + +#define PCH_PIC_INT_ID_START 0x0 +#define PCH_PIC_INT_ID_END 0x7 +#define PCH_PIC_MASK_START 0x20 +#define PCH_PIC_MASK_END 0x27 +#define PCH_PIC_HTMSI_EN_START 0x40 +#define PCH_PIC_HTMSI_EN_END 0x47 +#define PCH_PIC_EDGE_START 0x60 +#define PCH_PIC_EDGE_END 0x67 +#define PCH_PIC_CLEAR_START 0x80 +#define PCH_PIC_CLEAR_END 0x87 +#define PCH_PIC_AUTO_CTRL0_START 0xc0 +#define PCH_PIC_AUTO_CTRL0_END 0xc7 +#define PCH_PIC_AUTO_CTRL1_START 0xe0 +#define PCH_PIC_AUTO_CTRL1_END 0xe7 +#define PCH_PIC_ROUTE_ENTRY_START 0x100 +#define PCH_PIC_ROUTE_ENTRY_END 0x13f +#define PCH_PIC_HTMSI_VEC_START 0x200 +#define PCH_PIC_HTMSI_VEC_END 0x23f +#define PCH_PIC_INT_IRR_START 0x380 +#define PCH_PIC_INT_IRR_END 0x38f +#define PCH_PIC_INT_ISR_START 0x3a0 +#define PCH_PIC_INT_ISR_END 0x3af +#define PCH_PIC_POLARITY_START 0x3e0 +#define PCH_PIC_POLARITY_END 0x3e7 +#define PCH_PIC_INT_ID_VAL 0x7000000UL +#define PCH_PIC_INT_ID_VER 0x1UL + +struct loongarch_pch_pic { + spinlock_t lock; + struct kvm *kvm; + struct kvm_io_device device; + uint64_t mask; /* 1:disable irq, 0:enable irq */ + uint64_t htmsi_en; /* 1:msi */ + uint64_t edge; /* 1:edge triggered, 0:level triggered */ + uint64_t auto_ctrl0; /* only use default value 00b */ + uint64_t auto_ctrl1; /* only use default value 00b */ + uint64_t last_intirr; /* edge detection */ + uint64_t irr; /* interrupt request register */ + uint64_t isr; /* interrupt service register */ + uint64_t polarity; /* 0: high level trigger, 1: low level trigger */ + uint8_t route_entry[64]; /* default value 0, route to int0: eiointc */ + uint8_t htmsi_vector[64]; /* irq route table for routing to eiointc */ + uint64_t pch_pic_base; +}; + +int kvm_loongarch_register_pch_pic_device(void); +void pch_pic_set_irq(struct loongarch_pch_pic *s, int irq, int level); +void pch_msi_set_irq(struct kvm *kvm, int irq, int level); + +#endif /* __ASM_KVM_PCH_PIC_H */ diff --git a/arch/loongarch/include/uapi/asm/kvm.h b/arch/loongarch/include/uapi/asm/kvm.h index 70d89070bfeb..5f354f5c6847 100644 --- a/arch/loongarch/include/uapi/asm/kvm.h +++ b/arch/loongarch/include/uapi/asm/kvm.h @@ -8,6 +8,8 @@ #include <linux/types.h> +#define __KVM_HAVE_IRQ_LINE + /* * KVM LoongArch specific structures and definitions. * @@ -132,4 +134,22 @@ struct kvm_iocsr_entry { #define KVM_IRQCHIP_NUM_PINS 64 #define KVM_MAX_CORES 256 +#define KVM_DEV_LOONGARCH_IPI_GRP_REGS 0x40000001 + +#define KVM_DEV_LOONGARCH_EXTIOI_GRP_REGS 0x40000002 + +#define KVM_DEV_LOONGARCH_EXTIOI_GRP_SW_STATUS 0x40000003 +#define KVM_DEV_LOONGARCH_EXTIOI_SW_STATUS_NUM_CPU 0x0 +#define KVM_DEV_LOONGARCH_EXTIOI_SW_STATUS_FEATURE 0x1 +#define KVM_DEV_LOONGARCH_EXTIOI_SW_STATUS_STATE 0x2 + +#define KVM_DEV_LOONGARCH_EXTIOI_GRP_CTRL 0x40000004 +#define KVM_DEV_LOONGARCH_EXTIOI_CTRL_INIT_NUM_CPU 0x0 +#define KVM_DEV_LOONGARCH_EXTIOI_CTRL_INIT_FEATURE 0x1 +#define KVM_DEV_LOONGARCH_EXTIOI_CTRL_LOAD_FINISHED 0x3 + +#define KVM_DEV_LOONGARCH_PCH_PIC_GRP_REGS 0x40000005 +#define KVM_DEV_LOONGARCH_PCH_PIC_GRP_CTRL 0x40000006 +#define KVM_DEV_LOONGARCH_PCH_PIC_CTRL_INIT 0 + #endif /* __UAPI_ASM_LOONGARCH_KVM_H */ diff --git a/arch/loongarch/kvm/Kconfig b/arch/loongarch/kvm/Kconfig index 248744b4d086..97a811077ac3 100644 --- a/arch/loongarch/kvm/Kconfig +++ b/arch/loongarch/kvm/Kconfig @@ -21,13 +21,16 @@ config KVM tristate "Kernel-based Virtual Machine (KVM) support" depends on AS_HAS_LVZ_EXTENSION select HAVE_KVM_DIRTY_RING_ACQ_REL + select HAVE_KVM_IRQ_ROUTING + select HAVE_KVM_IRQCHIP + select HAVE_KVM_MSI + select HAVE_KVM_READONLY_MEM select HAVE_KVM_VCPU_ASYNC_IOCTL select KVM_COMMON select KVM_GENERIC_DIRTYLOG_READ_PROTECT select KVM_GENERIC_HARDWARE_ENABLING select KVM_GENERIC_MMU_NOTIFIER select KVM_MMIO - select HAVE_KVM_READONLY_MEM select KVM_XFER_TO_GUEST_WORK select SCHED_INFO help diff --git a/arch/loongarch/kvm/Makefile b/arch/loongarch/kvm/Makefile index b2f4cbe01ae8..3a01292f71cc 100644 --- a/arch/loongarch/kvm/Makefile +++ b/arch/loongarch/kvm/Makefile @@ -18,5 +18,9 @@ kvm-y += timer.o kvm-y += tlb.o kvm-y += vcpu.o kvm-y += vm.o +kvm-y += intc/ipi.o +kvm-y += intc/eiointc.o +kvm-y += intc/pch_pic.o +kvm-y += irqfd.o CFLAGS_exit.o += $(call cc-option,-Wno-override-init,) diff --git a/arch/loongarch/kvm/exit.c b/arch/loongarch/kvm/exit.c index 90894f70ff4a..69f3e3782cc9 100644 --- a/arch/loongarch/kvm/exit.c +++ b/arch/loongarch/kvm/exit.c @@ -157,7 +157,7 @@ static int kvm_handle_csr(struct kvm_vcpu *vcpu, larch_inst inst) int kvm_emu_iocsr(larch_inst inst, struct kvm_run *run, struct kvm_vcpu *vcpu) { int ret; - unsigned long val; + unsigned long *val; u32 addr, rd, rj, opcode; /* @@ -170,6 +170,7 @@ int kvm_emu_iocsr(larch_inst inst, struct kvm_run *run, struct kvm_vcpu *vcpu) ret = EMULATE_DO_IOCSR; run->iocsr_io.phys_addr = addr; run->iocsr_io.is_write = 0; + val = &vcpu->arch.gprs[rd]; /* LoongArch is Little endian */ switch (opcode) { @@ -202,16 +203,25 @@ int kvm_emu_iocsr(larch_inst inst, struct kvm_run *run, struct kvm_vcpu *vcpu) run->iocsr_io.is_write = 1; break; default: - ret = EMULATE_FAIL; - break; + return EMULATE_FAIL; } - if (ret == EMULATE_DO_IOCSR) { - if (run->iocsr_io.is_write) { - val = vcpu->arch.gprs[rd]; - memcpy(run->iocsr_io.data, &val, run->iocsr_io.len); - } - vcpu->arch.io_gpr = rd; + if (run->iocsr_io.is_write) { + if (!kvm_io_bus_write(vcpu, KVM_IOCSR_BUS, addr, run->iocsr_io.len, val)) + ret = EMULATE_DONE; + else + /* Save data and let user space to write it */ + memcpy(run->iocsr_io.data, val, run->iocsr_io.len); + + trace_kvm_iocsr(KVM_TRACE_IOCSR_WRITE, run->iocsr_io.len, addr, val); + } else { + if (!kvm_io_bus_read(vcpu, KVM_IOCSR_BUS, addr, run->iocsr_io.len, val)) + ret = EMULATE_DONE; + else + /* Save register id for iocsr read completion */ + vcpu->arch.io_gpr = rd; + + trace_kvm_iocsr(KVM_TRACE_IOCSR_READ, run->iocsr_io.len, addr, NULL); } return ret; @@ -447,19 +457,33 @@ int kvm_emu_mmio_read(struct kvm_vcpu *vcpu, larch_inst inst) } if (ret == EMULATE_DO_MMIO) { + trace_kvm_mmio(KVM_TRACE_MMIO_READ, run->mmio.len, run->mmio.phys_addr, NULL); + + /* + * If mmio device such as PCH-PIC is emulated in KVM, + * it need not return to user space to handle the mmio + * exception. + */ + ret = kvm_io_bus_read(vcpu, KVM_MMIO_BUS, vcpu->arch.badv, + run->mmio.len, &vcpu->arch.gprs[rd]); + if (!ret) { + update_pc(&vcpu->arch); + vcpu->mmio_needed = 0; + return EMULATE_DONE; + } + /* Set for kvm_complete_mmio_read() use */ vcpu->arch.io_gpr = rd; run->mmio.is_write = 0; vcpu->mmio_is_write = 0; - trace_kvm_mmio(KVM_TRACE_MMIO_READ_UNSATISFIED, run->mmio.len, - run->mmio.phys_addr, NULL); - } else { - kvm_err("Read not supported Inst=0x%08x @%lx BadVaddr:%#lx\n", - inst.word, vcpu->arch.pc, vcpu->arch.badv); - kvm_arch_vcpu_dump_regs(vcpu); - vcpu->mmio_needed = 0; + return EMULATE_DO_MMIO; } + kvm_err("Read not supported Inst=0x%08x @%lx BadVaddr:%#lx\n", + inst.word, vcpu->arch.pc, vcpu->arch.badv); + kvm_arch_vcpu_dump_regs(vcpu); + vcpu->mmio_needed = 0; + return ret; } @@ -600,19 +624,29 @@ int kvm_emu_mmio_write(struct kvm_vcpu *vcpu, larch_inst inst) } if (ret == EMULATE_DO_MMIO) { + trace_kvm_mmio(KVM_TRACE_MMIO_WRITE, run->mmio.len, run->mmio.phys_addr, data); + + /* + * If mmio device such as PCH-PIC is emulated in KVM, + * it need not return to user space to handle the mmio + * exception. + */ + ret = kvm_io_bus_write(vcpu, KVM_MMIO_BUS, vcpu->arch.badv, run->mmio.len, data); + if (!ret) + return EMULATE_DONE; + run->mmio.is_write = 1; vcpu->mmio_needed = 1; vcpu->mmio_is_write = 1; - trace_kvm_mmio(KVM_TRACE_MMIO_WRITE, run->mmio.len, - run->mmio.phys_addr, data); - } else { - vcpu->arch.pc = curr_pc; - kvm_err("Write not supported Inst=0x%08x @%lx BadVaddr:%#lx\n", - inst.word, vcpu->arch.pc, vcpu->arch.badv); - kvm_arch_vcpu_dump_regs(vcpu); - /* Rollback PC if emulation was unsuccessful */ + return EMULATE_DO_MMIO; } + vcpu->arch.pc = curr_pc; + kvm_err("Write not supported Inst=0x%08x @%lx BadVaddr:%#lx\n", + inst.word, vcpu->arch.pc, vcpu->arch.badv); + kvm_arch_vcpu_dump_regs(vcpu); + /* Rollback PC if emulation was unsuccessful */ + return ret; } diff --git a/arch/loongarch/kvm/intc/eiointc.c b/arch/loongarch/kvm/intc/eiointc.c new file mode 100644 index 000000000000..f39929d7bf8a --- /dev/null +++ b/arch/loongarch/kvm/intc/eiointc.c @@ -0,0 +1,1027 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2024 Loongson Technology Corporation Limited + */ + +#include <asm/kvm_eiointc.h> +#include <asm/kvm_vcpu.h> +#include <linux/count_zeros.h> + +static void eiointc_set_sw_coreisr(struct loongarch_eiointc *s) +{ + int ipnum, cpu, irq_index, irq_mask, irq; + + for (irq = 0; irq < EIOINTC_IRQS; irq++) { + ipnum = s->ipmap.reg_u8[irq / 32]; + if (!(s->status & BIT(EIOINTC_ENABLE_INT_ENCODE))) { + ipnum = count_trailing_zeros(ipnum); + ipnum = (ipnum >= 0 && ipnum < 4) ? ipnum : 0; + } + irq_index = irq / 32; + irq_mask = BIT(irq & 0x1f); + + cpu = s->coremap.reg_u8[irq]; + if (!!(s->coreisr.reg_u32[cpu][irq_index] & irq_mask)) + set_bit(irq, s->sw_coreisr[cpu][ipnum]); + else + clear_bit(irq, s->sw_coreisr[cpu][ipnum]); + } +} + +static void eiointc_update_irq(struct loongarch_eiointc *s, int irq, int level) +{ + int ipnum, cpu, found, irq_index, irq_mask; + struct kvm_vcpu *vcpu; + struct kvm_interrupt vcpu_irq; + + ipnum = s->ipmap.reg_u8[irq / 32]; + if (!(s->status & BIT(EIOINTC_ENABLE_INT_ENCODE))) { + ipnum = count_trailing_zeros(ipnum); + ipnum = (ipnum >= 0 && ipnum < 4) ? ipnum : 0; + } + + cpu = s->sw_coremap[irq]; + vcpu = kvm_get_vcpu(s->kvm, cpu); + irq_index = irq / 32; + irq_mask = BIT(irq & 0x1f); + + if (level) { + /* if not enable return false */ + if (((s->enable.reg_u32[irq_index]) & irq_mask) == 0) + return; + s->coreisr.reg_u32[cpu][irq_index] |= irq_mask; + found = find_first_bit(s->sw_coreisr[cpu][ipnum], EIOINTC_IRQS); + set_bit(irq, s->sw_coreisr[cpu][ipnum]); + } else { + s->coreisr.reg_u32[cpu][irq_index] &= ~irq_mask; + clear_bit(irq, s->sw_coreisr[cpu][ipnum]); + found = find_first_bit(s->sw_coreisr[cpu][ipnum], EIOINTC_IRQS); + } + + if (found < EIOINTC_IRQS) + return; /* other irq is handling, needn't update parent irq */ + + vcpu_irq.irq = level ? (INT_HWI0 + ipnum) : -(INT_HWI0 + ipnum); + kvm_vcpu_ioctl_interrupt(vcpu, &vcpu_irq); +} + +static inline void eiointc_update_sw_coremap(struct loongarch_eiointc *s, + int irq, void *pvalue, u32 len, bool notify) +{ + int i, cpu; + u64 val = *(u64 *)pvalue; + + for (i = 0; i < len; i++) { + cpu = val & 0xff; + val = val >> 8; + + if (!(s->status & BIT(EIOINTC_ENABLE_CPU_ENCODE))) { + cpu = ffs(cpu) - 1; + cpu = (cpu >= 4) ? 0 : cpu; + } + + if (s->sw_coremap[irq + i] == cpu) + continue; + + if (notify && test_bit(irq + i, (unsigned long *)s->isr.reg_u8)) { + /* lower irq at old cpu and raise irq at new cpu */ + eiointc_update_irq(s, irq + i, 0); + s->sw_coremap[irq + i] = cpu; + eiointc_update_irq(s, irq + i, 1); + } else { + s->sw_coremap[irq + i] = cpu; + } + } +} + +void eiointc_set_irq(struct loongarch_eiointc *s, int irq, int level) +{ + unsigned long flags; + unsigned long *isr = (unsigned long *)s->isr.reg_u8; + + level ? set_bit(irq, isr) : clear_bit(irq, isr); + spin_lock_irqsave(&s->lock, flags); + eiointc_update_irq(s, irq, level); + spin_unlock_irqrestore(&s->lock, flags); +} + +static inline void eiointc_enable_irq(struct kvm_vcpu *vcpu, + struct loongarch_eiointc *s, int index, u8 mask, int level) +{ + u8 val; + int irq; + + val = mask & s->isr.reg_u8[index]; + irq = ffs(val); + while (irq != 0) { + /* + * enable bit change from 0 to 1, + * need to update irq by pending bits + */ + eiointc_update_irq(s, irq - 1 + index * 8, level); + val &= ~BIT(irq - 1); + irq = ffs(val); + } +} + +static int loongarch_eiointc_readb(struct kvm_vcpu *vcpu, struct loongarch_eiointc *s, + gpa_t addr, int len, void *val) +{ + int index, ret = 0; + u8 data = 0; + gpa_t offset; + + offset = addr - EIOINTC_BASE; + switch (offset) { + case EIOINTC_NODETYPE_START ... EIOINTC_NODETYPE_END: + index = offset - EIOINTC_NODETYPE_START; + data = s->nodetype.reg_u8[index]; + break; + case EIOINTC_IPMAP_START ... EIOINTC_IPMAP_END: + index = offset - EIOINTC_IPMAP_START; + data = s->ipmap.reg_u8[index]; + break; + case EIOINTC_ENABLE_START ... EIOINTC_ENABLE_END: + index = offset - EIOINTC_ENABLE_START; + data = s->enable.reg_u8[index]; + break; + case EIOINTC_BOUNCE_START ... EIOINTC_BOUNCE_END: + index = offset - EIOINTC_BOUNCE_START; + data = s->bounce.reg_u8[index]; + break; + case EIOINTC_COREISR_START ... EIOINTC_COREISR_END: + index = offset - EIOINTC_COREISR_START; + data = s->coreisr.reg_u8[vcpu->vcpu_id][index]; + break; + case EIOINTC_COREMAP_START ... EIOINTC_COREMAP_END: + index = offset - EIOINTC_COREMAP_START; + data = s->coremap.reg_u8[index]; + break; + default: + ret = -EINVAL; + break; + } + *(u8 *)val = data; + + return ret; +} + +static int loongarch_eiointc_readw(struct kvm_vcpu *vcpu, struct loongarch_eiointc *s, + gpa_t addr, int len, void *val) +{ + int index, ret = 0; + u16 data = 0; + gpa_t offset; + + offset = addr - EIOINTC_BASE; + switch (offset) { + case EIOINTC_NODETYPE_START ... EIOINTC_NODETYPE_END: + index = (offset - EIOINTC_NODETYPE_START) >> 1; + data = s->nodetype.reg_u16[index]; + break; + case EIOINTC_IPMAP_START ... EIOINTC_IPMAP_END: + index = (offset - EIOINTC_IPMAP_START) >> 1; + data = s->ipmap.reg_u16[index]; + break; + case EIOINTC_ENABLE_START ... EIOINTC_ENABLE_END: + index = (offset - EIOINTC_ENABLE_START) >> 1; + data = s->enable.reg_u16[index]; + break; + case EIOINTC_BOUNCE_START ... EIOINTC_BOUNCE_END: + index = (offset - EIOINTC_BOUNCE_START) >> 1; + data = s->bounce.reg_u16[index]; + break; + case EIOINTC_COREISR_START ... EIOINTC_COREISR_END: + index = (offset - EIOINTC_COREISR_START) >> 1; + data = s->coreisr.reg_u16[vcpu->vcpu_id][index]; + break; + case EIOINTC_COREMAP_START ... EIOINTC_COREMAP_END: + index = (offset - EIOINTC_COREMAP_START) >> 1; + data = s->coremap.reg_u16[index]; + break; + default: + ret = -EINVAL; + break; + } + *(u16 *)val = data; + + return ret; +} + +static int loongarch_eiointc_readl(struct kvm_vcpu *vcpu, struct loongarch_eiointc *s, + gpa_t addr, int len, void *val) +{ + int index, ret = 0; + u32 data = 0; + gpa_t offset; + + offset = addr - EIOINTC_BASE; + switch (offset) { + case EIOINTC_NODETYPE_START ... EIOINTC_NODETYPE_END: + index = (offset - EIOINTC_NODETYPE_START) >> 2; + data = s->nodetype.reg_u32[index]; + break; + case EIOINTC_IPMAP_START ... EIOINTC_IPMAP_END: + index = (offset - EIOINTC_IPMAP_START) >> 2; + data = s->ipmap.reg_u32[index]; + break; + case EIOINTC_ENABLE_START ... EIOINTC_ENABLE_END: + index = (offset - EIOINTC_ENABLE_START) >> 2; + data = s->enable.reg_u32[index]; + break; + case EIOINTC_BOUNCE_START ... EIOINTC_BOUNCE_END: + index = (offset - EIOINTC_BOUNCE_START) >> 2; + data = s->bounce.reg_u32[index]; + break; + case EIOINTC_COREISR_START ... EIOINTC_COREISR_END: + index = (offset - EIOINTC_COREISR_START) >> 2; + data = s->coreisr.reg_u32[vcpu->vcpu_id][index]; + break; + case EIOINTC_COREMAP_START ... EIOINTC_COREMAP_END: + index = (offset - EIOINTC_COREMAP_START) >> 2; + data = s->coremap.reg_u32[index]; + break; + default: + ret = -EINVAL; + break; + } + *(u32 *)val = data; + + return ret; +} + +static int loongarch_eiointc_readq(struct kvm_vcpu *vcpu, struct loongarch_eiointc *s, + gpa_t addr, int len, void *val) +{ + int index, ret = 0; + u64 data = 0; + gpa_t offset; + + offset = addr - EIOINTC_BASE; + switch (offset) { + case EIOINTC_NODETYPE_START ... EIOINTC_NODETYPE_END: + index = (offset - EIOINTC_NODETYPE_START) >> 3; + data = s->nodetype.reg_u64[index]; + break; + case EIOINTC_IPMAP_START ... EIOINTC_IPMAP_END: + index = (offset - EIOINTC_IPMAP_START) >> 3; + data = s->ipmap.reg_u64; + break; + case EIOINTC_ENABLE_START ... EIOINTC_ENABLE_END: + index = (offset - EIOINTC_ENABLE_START) >> 3; + data = s->enable.reg_u64[index]; + break; + case EIOINTC_BOUNCE_START ... EIOINTC_BOUNCE_END: + index = (offset - EIOINTC_BOUNCE_START) >> 3; + data = s->bounce.reg_u64[index]; + break; + case EIOINTC_COREISR_START ... EIOINTC_COREISR_END: + index = (offset - EIOINTC_COREISR_START) >> 3; + data = s->coreisr.reg_u64[vcpu->vcpu_id][index]; + break; + case EIOINTC_COREMAP_START ... EIOINTC_COREMAP_END: + index = (offset - EIOINTC_COREMAP_START) >> 3; + data = s->coremap.reg_u64[index]; + break; + default: + ret = -EINVAL; + break; + } + *(u64 *)val = data; + + return ret; +} + +static int kvm_eiointc_read(struct kvm_vcpu *vcpu, + struct kvm_io_device *dev, + gpa_t addr, int len, void *val) +{ + int ret = -EINVAL; + unsigned long flags; + struct loongarch_eiointc *eiointc = vcpu->kvm->arch.eiointc; + + if (!eiointc) { + kvm_err("%s: eiointc irqchip not valid!\n", __func__); + return -EINVAL; + } + + vcpu->kvm->stat.eiointc_read_exits++; + spin_lock_irqsave(&eiointc->lock, flags); + switch (len) { + case 1: + ret = loongarch_eiointc_readb(vcpu, eiointc, addr, len, val); + break; + case 2: + ret = loongarch_eiointc_readw(vcpu, eiointc, addr, len, val); + break; + case 4: + ret = loongarch_eiointc_readl(vcpu, eiointc, addr, len, val); + break; + case 8: + ret = loongarch_eiointc_readq(vcpu, eiointc, addr, len, val); + break; + default: + WARN_ONCE(1, "%s: Abnormal address access: addr 0x%llx, size %d\n", + __func__, addr, len); + } + spin_unlock_irqrestore(&eiointc->lock, flags); + + return ret; +} + +static int loongarch_eiointc_writeb(struct kvm_vcpu *vcpu, + struct loongarch_eiointc *s, + gpa_t addr, int len, const void *val) +{ + int index, irq, bits, ret = 0; + u8 cpu; + u8 data, old_data; + u8 coreisr, old_coreisr; + gpa_t offset; + + data = *(u8 *)val; + offset = addr - EIOINTC_BASE; + + switch (offset) { + case EIOINTC_NODETYPE_START ... EIOINTC_NODETYPE_END: + index = (offset - EIOINTC_NODETYPE_START); + s->nodetype.reg_u8[index] = data; + break; + case EIOINTC_IPMAP_START ... EIOINTC_IPMAP_END: + /* + * ipmap cannot be set at runtime, can be set only at the beginning + * of irqchip driver, need not update upper irq level + */ + index = (offset - EIOINTC_IPMAP_START); + s->ipmap.reg_u8[index] = data; + break; + case EIOINTC_ENABLE_START ... EIOINTC_ENABLE_END: + index = (offset - EIOINTC_ENABLE_START); + old_data = s->enable.reg_u8[index]; + s->enable.reg_u8[index] = data; + /* + * 1: enable irq. + * update irq when isr is set. + */ + data = s->enable.reg_u8[index] & ~old_data & s->isr.reg_u8[index]; + eiointc_enable_irq(vcpu, s, index, data, 1); + /* + * 0: disable irq. + * update irq when isr is set. + */ + data = ~s->enable.reg_u8[index] & old_data & s->isr.reg_u8[index]; + eiointc_enable_irq(vcpu, s, index, data, 0); + break; + case EIOINTC_BOUNCE_START ... EIOINTC_BOUNCE_END: + /* do not emulate hw bounced irq routing */ + index = offset - EIOINTC_BOUNCE_START; + s->bounce.reg_u8[index] = data; + break; + case EIOINTC_COREISR_START ... EIOINTC_COREISR_END: + index = (offset - EIOINTC_COREISR_START); + /* use attrs to get current cpu index */ + cpu = vcpu->vcpu_id; + coreisr = data; + old_coreisr = s->coreisr.reg_u8[cpu][index]; + /* write 1 to clear interrupt */ + s->coreisr.reg_u8[cpu][index] = old_coreisr & ~coreisr; + coreisr &= old_coreisr; + bits = sizeof(data) * 8; + irq = find_first_bit((void *)&coreisr, bits); + while (irq < bits) { + eiointc_update_irq(s, irq + index * bits, 0); + bitmap_clear((void *)&coreisr, irq, 1); + irq = find_first_bit((void *)&coreisr, bits); + } + break; + case EIOINTC_COREMAP_START ... EIOINTC_COREMAP_END: + irq = offset - EIOINTC_COREMAP_START; + index = irq; + s->coremap.reg_u8[index] = data; + eiointc_update_sw_coremap(s, irq, (void *)&data, sizeof(data), true); + break; + default: + ret = -EINVAL; + break; + } + + return ret; +} + +static int loongarch_eiointc_writew(struct kvm_vcpu *vcpu, + struct loongarch_eiointc *s, + gpa_t addr, int len, const void *val) +{ + int i, index, irq, bits, ret = 0; + u8 cpu; + u16 data, old_data; + u16 coreisr, old_coreisr; + gpa_t offset; + + data = *(u16 *)val; + offset = addr - EIOINTC_BASE; + + switch (offset) { + case EIOINTC_NODETYPE_START ... EIOINTC_NODETYPE_END: + index = (offset - EIOINTC_NODETYPE_START) >> 1; + s->nodetype.reg_u16[index] = data; + break; + case EIOINTC_IPMAP_START ... EIOINTC_IPMAP_END: + /* + * ipmap cannot be set at runtime, can be set only at the beginning + * of irqchip driver, need not update upper irq level + */ + index = (offset - EIOINTC_IPMAP_START) >> 1; + s->ipmap.reg_u16[index] = data; + break; + case EIOINTC_ENABLE_START ... EIOINTC_ENABLE_END: + index = (offset - EIOINTC_ENABLE_START) >> 1; + old_data = s->enable.reg_u32[index]; + s->enable.reg_u16[index] = data; + /* + * 1: enable irq. + * update irq when isr is set. + */ + data = s->enable.reg_u16[index] & ~old_data & s->isr.reg_u16[index]; + index = index << 1; + for (i = 0; i < sizeof(data); i++) { + u8 mask = (data >> (i * 8)) & 0xff; + eiointc_enable_irq(vcpu, s, index + i, mask, 1); + } + /* + * 0: disable irq. + * update irq when isr is set. + */ + data = ~s->enable.reg_u16[index] & old_data & s->isr.reg_u16[index]; + for (i = 0; i < sizeof(data); i++) { + u8 mask = (data >> (i * 8)) & 0xff; + eiointc_enable_irq(vcpu, s, index, mask, 0); + } + break; + case EIOINTC_BOUNCE_START ... EIOINTC_BOUNCE_END: + /* do not emulate hw bounced irq routing */ + index = (offset - EIOINTC_BOUNCE_START) >> 1; + s->bounce.reg_u16[index] = data; + break; + case EIOINTC_COREISR_START ... EIOINTC_COREISR_END: + index = (offset - EIOINTC_COREISR_START) >> 1; + /* use attrs to get current cpu index */ + cpu = vcpu->vcpu_id; + coreisr = data; + old_coreisr = s->coreisr.reg_u16[cpu][index]; + /* write 1 to clear interrupt */ + s->coreisr.reg_u16[cpu][index] = old_coreisr & ~coreisr; + coreisr &= old_coreisr; + bits = sizeof(data) * 8; + irq = find_first_bit((void *)&coreisr, bits); + while (irq < bits) { + eiointc_update_irq(s, irq + index * bits, 0); + bitmap_clear((void *)&coreisr, irq, 1); + irq = find_first_bit((void *)&coreisr, bits); + } + break; + case EIOINTC_COREMAP_START ... EIOINTC_COREMAP_END: + irq = offset - EIOINTC_COREMAP_START; + index = irq >> 1; + s->coremap.reg_u16[index] = data; + eiointc_update_sw_coremap(s, irq, (void *)&data, sizeof(data), true); + break; + default: + ret = -EINVAL; + break; + } + + return ret; +} + +static int loongarch_eiointc_writel(struct kvm_vcpu *vcpu, + struct loongarch_eiointc *s, + gpa_t addr, int len, const void *val) +{ + int i, index, irq, bits, ret = 0; + u8 cpu; + u32 data, old_data; + u32 coreisr, old_coreisr; + gpa_t offset; + + data = *(u32 *)val; + offset = addr - EIOINTC_BASE; + + switch (offset) { + case EIOINTC_NODETYPE_START ... EIOINTC_NODETYPE_END: + index = (offset - EIOINTC_NODETYPE_START) >> 2; + s->nodetype.reg_u32[index] = data; + break; + case EIOINTC_IPMAP_START ... EIOINTC_IPMAP_END: + /* + * ipmap cannot be set at runtime, can be set only at the beginning + * of irqchip driver, need not update upper irq level + */ + index = (offset - EIOINTC_IPMAP_START) >> 2; + s->ipmap.reg_u32[index] = data; + break; + case EIOINTC_ENABLE_START ... EIOINTC_ENABLE_END: + index = (offset - EIOINTC_ENABLE_START) >> 2; + old_data = s->enable.reg_u32[index]; + s->enable.reg_u32[index] = data; + /* + * 1: enable irq. + * update irq when isr is set. + */ + data = s->enable.reg_u32[index] & ~old_data & s->isr.reg_u32[index]; + index = index << 2; + for (i = 0; i < sizeof(data); i++) { + u8 mask = (data >> (i * 8)) & 0xff; + eiointc_enable_irq(vcpu, s, index + i, mask, 1); + } + /* + * 0: disable irq. + * update irq when isr is set. + */ + data = ~s->enable.reg_u32[index] & old_data & s->isr.reg_u32[index]; + for (i = 0; i < sizeof(data); i++) { + u8 mask = (data >> (i * 8)) & 0xff; + eiointc_enable_irq(vcpu, s, index, mask, 0); + } + break; + case EIOINTC_BOUNCE_START ... EIOINTC_BOUNCE_END: + /* do not emulate hw bounced irq routing */ + index = (offset - EIOINTC_BOUNCE_START) >> 2; + s->bounce.reg_u32[index] = data; + break; + case EIOINTC_COREISR_START ... EIOINTC_COREISR_END: + index = (offset - EIOINTC_COREISR_START) >> 2; + /* use attrs to get current cpu index */ + cpu = vcpu->vcpu_id; + coreisr = data; + old_coreisr = s->coreisr.reg_u32[cpu][index]; + /* write 1 to clear interrupt */ + s->coreisr.reg_u32[cpu][index] = old_coreisr & ~coreisr; + coreisr &= old_coreisr; + bits = sizeof(data) * 8; + irq = find_first_bit((void *)&coreisr, bits); + while (irq < bits) { + eiointc_update_irq(s, irq + index * bits, 0); + bitmap_clear((void *)&coreisr, irq, 1); + irq = find_first_bit((void *)&coreisr, bits); + } + break; + case EIOINTC_COREMAP_START ... EIOINTC_COREMAP_END: + irq = offset - EIOINTC_COREMAP_START; + index = irq >> 2; + s->coremap.reg_u32[index] = data; + eiointc_update_sw_coremap(s, irq, (void *)&data, sizeof(data), true); + break; + default: + ret = -EINVAL; + break; + } + + return ret; +} + +static int loongarch_eiointc_writeq(struct kvm_vcpu *vcpu, + struct loongarch_eiointc *s, + gpa_t addr, int len, const void *val) +{ + int i, index, irq, bits, ret = 0; + u8 cpu; + u64 data, old_data; + u64 coreisr, old_coreisr; + gpa_t offset; + + data = *(u64 *)val; + offset = addr - EIOINTC_BASE; + + switch (offset) { + case EIOINTC_NODETYPE_START ... EIOINTC_NODETYPE_END: + index = (offset - EIOINTC_NODETYPE_START) >> 3; + s->nodetype.reg_u64[index] = data; + break; + case EIOINTC_IPMAP_START ... EIOINTC_IPMAP_END: + /* + * ipmap cannot be set at runtime, can be set only at the beginning + * of irqchip driver, need not update upper irq level + */ + index = (offset - EIOINTC_IPMAP_START) >> 3; + s->ipmap.reg_u64 = data; + break; + case EIOINTC_ENABLE_START ... EIOINTC_ENABLE_END: + index = (offset - EIOINTC_ENABLE_START) >> 3; + old_data = s->enable.reg_u64[index]; + s->enable.reg_u64[index] = data; + /* + * 1: enable irq. + * update irq when isr is set. + */ + data = s->enable.reg_u64[index] & ~old_data & s->isr.reg_u64[index]; + index = index << 3; + for (i = 0; i < sizeof(data); i++) { + u8 mask = (data >> (i * 8)) & 0xff; + eiointc_enable_irq(vcpu, s, index + i, mask, 1); + } + /* + * 0: disable irq. + * update irq when isr is set. + */ + data = ~s->enable.reg_u64[index] & old_data & s->isr.reg_u64[index]; + for (i = 0; i < sizeof(data); i++) { + u8 mask = (data >> (i * 8)) & 0xff; + eiointc_enable_irq(vcpu, s, index, mask, 0); + } + break; + case EIOINTC_BOUNCE_START ... EIOINTC_BOUNCE_END: + /* do not emulate hw bounced irq routing */ + index = (offset - EIOINTC_BOUNCE_START) >> 3; + s->bounce.reg_u64[index] = data; + break; + case EIOINTC_COREISR_START ... EIOINTC_COREISR_END: + index = (offset - EIOINTC_COREISR_START) >> 3; + /* use attrs to get current cpu index */ + cpu = vcpu->vcpu_id; + coreisr = data; + old_coreisr = s->coreisr.reg_u64[cpu][index]; + /* write 1 to clear interrupt */ + s->coreisr.reg_u64[cpu][index] = old_coreisr & ~coreisr; + coreisr &= old_coreisr; + bits = sizeof(data) * 8; + irq = find_first_bit((void *)&coreisr, bits); + while (irq < bits) { + eiointc_update_irq(s, irq + index * bits, 0); + bitmap_clear((void *)&coreisr, irq, 1); + irq = find_first_bit((void *)&coreisr, bits); + } + break; + case EIOINTC_COREMAP_START ... EIOINTC_COREMAP_END: + irq = offset - EIOINTC_COREMAP_START; + index = irq >> 3; + s->coremap.reg_u64[index] = data; + eiointc_update_sw_coremap(s, irq, (void *)&data, sizeof(data), true); + break; + default: + ret = -EINVAL; + break; + } + + return ret; +} + +static int kvm_eiointc_write(struct kvm_vcpu *vcpu, + struct kvm_io_device *dev, + gpa_t addr, int len, const void *val) +{ + int ret = -EINVAL; + unsigned long flags; + struct loongarch_eiointc *eiointc = vcpu->kvm->arch.eiointc; + + if (!eiointc) { + kvm_err("%s: eiointc irqchip not valid!\n", __func__); + return -EINVAL; + } + + vcpu->kvm->stat.eiointc_write_exits++; + spin_lock_irqsave(&eiointc->lock, flags); + switch (len) { + case 1: + ret = loongarch_eiointc_writeb(vcpu, eiointc, addr, len, val); + break; + case 2: + ret = loongarch_eiointc_writew(vcpu, eiointc, addr, len, val); + break; + case 4: + ret = loongarch_eiointc_writel(vcpu, eiointc, addr, len, val); + break; + case 8: + ret = loongarch_eiointc_writeq(vcpu, eiointc, addr, len, val); + break; + default: + WARN_ONCE(1, "%s: Abnormal address access: addr 0x%llx, size %d\n", + __func__, addr, len); + } + spin_unlock_irqrestore(&eiointc->lock, flags); + + return ret; +} + +static const struct kvm_io_device_ops kvm_eiointc_ops = { + .read = kvm_eiointc_read, + .write = kvm_eiointc_write, +}; + +static int kvm_eiointc_virt_read(struct kvm_vcpu *vcpu, + struct kvm_io_device *dev, + gpa_t addr, int len, void *val) +{ + unsigned long flags; + u32 *data = val; + struct loongarch_eiointc *eiointc = vcpu->kvm->arch.eiointc; + + if (!eiointc) { + kvm_err("%s: eiointc irqchip not valid!\n", __func__); + return -EINVAL; + } + + addr -= EIOINTC_VIRT_BASE; + spin_lock_irqsave(&eiointc->lock, flags); + switch (addr) { + case EIOINTC_VIRT_FEATURES: + *data = eiointc->features; + break; + case EIOINTC_VIRT_CONFIG: + *data = eiointc->status; + break; + default: + break; + } + spin_unlock_irqrestore(&eiointc->lock, flags); + + return 0; +} + +static int kvm_eiointc_virt_write(struct kvm_vcpu *vcpu, + struct kvm_io_device *dev, + gpa_t addr, int len, const void *val) +{ + int ret = 0; + unsigned long flags; + u32 value = *(u32 *)val; + struct loongarch_eiointc *eiointc = vcpu->kvm->arch.eiointc; + + if (!eiointc) { + kvm_err("%s: eiointc irqchip not valid!\n", __func__); + return -EINVAL; + } + + addr -= EIOINTC_VIRT_BASE; + spin_lock_irqsave(&eiointc->lock, flags); + switch (addr) { + case EIOINTC_VIRT_FEATURES: + ret = -EPERM; + break; + case EIOINTC_VIRT_CONFIG: + /* + * eiointc features can only be set at disabled status + */ + if ((eiointc->status & BIT(EIOINTC_ENABLE)) && value) { + ret = -EPERM; + break; + } + eiointc->status = value & eiointc->features; + break; + default: + break; + } + spin_unlock_irqrestore(&eiointc->lock, flags); + + return ret; +} + +static const struct kvm_io_device_ops kvm_eiointc_virt_ops = { + .read = kvm_eiointc_virt_read, + .write = kvm_eiointc_virt_write, +}; + +static int kvm_eiointc_ctrl_access(struct kvm_device *dev, + struct kvm_device_attr *attr) +{ + int ret = 0; + unsigned long flags; + unsigned long type = (unsigned long)attr->attr; + u32 i, start_irq; + void __user *data; + struct loongarch_eiointc *s = dev->kvm->arch.eiointc; + + data = (void __user *)attr->addr; + spin_lock_irqsave(&s->lock, flags); + switch (type) { + case KVM_DEV_LOONGARCH_EXTIOI_CTRL_INIT_NUM_CPU: + if (copy_from_user(&s->num_cpu, data, 4)) + ret = -EFAULT; + break; + case KVM_DEV_LOONGARCH_EXTIOI_CTRL_INIT_FEATURE: + if (copy_from_user(&s->features, data, 4)) + ret = -EFAULT; + if (!(s->features & BIT(EIOINTC_HAS_VIRT_EXTENSION))) + s->status |= BIT(EIOINTC_ENABLE); + break; + case KVM_DEV_LOONGARCH_EXTIOI_CTRL_LOAD_FINISHED: + eiointc_set_sw_coreisr(s); + for (i = 0; i < (EIOINTC_IRQS / 4); i++) { + start_irq = i * 4; + eiointc_update_sw_coremap(s, start_irq, + (void *)&s->coremap.reg_u32[i], sizeof(u32), false); + } + break; + default: + break; + } + spin_unlock_irqrestore(&s->lock, flags); + + return ret; +} + +static int kvm_eiointc_regs_access(struct kvm_device *dev, + struct kvm_device_attr *attr, + bool is_write) +{ + int addr, cpuid, offset, ret = 0; + unsigned long flags; + void *p = NULL; + void __user *data; + struct loongarch_eiointc *s; + + s = dev->kvm->arch.eiointc; + addr = attr->attr; + cpuid = addr >> 16; + addr &= 0xffff; + data = (void __user *)attr->addr; + switch (addr) { + case EIOINTC_NODETYPE_START ... EIOINTC_NODETYPE_END: + offset = (addr - EIOINTC_NODETYPE_START) / 4; + p = &s->nodetype.reg_u32[offset]; + break; + case EIOINTC_IPMAP_START ... EIOINTC_IPMAP_END: + offset = (addr - EIOINTC_IPMAP_START) / 4; + p = &s->ipmap.reg_u32[offset]; + break; + case EIOINTC_ENABLE_START ... EIOINTC_ENABLE_END: + offset = (addr - EIOINTC_ENABLE_START) / 4; + p = &s->enable.reg_u32[offset]; + break; + case EIOINTC_BOUNCE_START ... EIOINTC_BOUNCE_END: + offset = (addr - EIOINTC_BOUNCE_START) / 4; + p = &s->bounce.reg_u32[offset]; + break; + case EIOINTC_ISR_START ... EIOINTC_ISR_END: + offset = (addr - EIOINTC_ISR_START) / 4; + p = &s->isr.reg_u32[offset]; + break; + case EIOINTC_COREISR_START ... EIOINTC_COREISR_END: + offset = (addr - EIOINTC_COREISR_START) / 4; + p = &s->coreisr.reg_u32[cpuid][offset]; + break; + case EIOINTC_COREMAP_START ... EIOINTC_COREMAP_END: + offset = (addr - EIOINTC_COREMAP_START) / 4; + p = &s->coremap.reg_u32[offset]; + break; + default: + kvm_err("%s: unknown eiointc register, addr = %d\n", __func__, addr); + return -EINVAL; + } + + spin_lock_irqsave(&s->lock, flags); + if (is_write) { + if (copy_from_user(p, data, 4)) + ret = -EFAULT; + } else { + if (copy_to_user(data, p, 4)) + ret = -EFAULT; + } + spin_unlock_irqrestore(&s->lock, flags); + + return ret; +} + +static int kvm_eiointc_sw_status_access(struct kvm_device *dev, + struct kvm_device_attr *attr, + bool is_write) +{ + int addr, ret = 0; + unsigned long flags; + void *p = NULL; + void __user *data; + struct loongarch_eiointc *s; + + s = dev->kvm->arch.eiointc; + addr = attr->attr; + addr &= 0xffff; + + data = (void __user *)attr->addr; + switch (addr) { + case KVM_DEV_LOONGARCH_EXTIOI_SW_STATUS_NUM_CPU: + p = &s->num_cpu; + break; + case KVM_DEV_LOONGARCH_EXTIOI_SW_STATUS_FEATURE: + p = &s->features; + break; + case KVM_DEV_LOONGARCH_EXTIOI_SW_STATUS_STATE: + p = &s->status; + break; + default: + kvm_err("%s: unknown eiointc register, addr = %d\n", __func__, addr); + return -EINVAL; + } + spin_lock_irqsave(&s->lock, flags); + if (is_write) { + if (copy_from_user(p, data, 4)) + ret = -EFAULT; + } else { + if (copy_to_user(data, p, 4)) + ret = -EFAULT; + } + spin_unlock_irqrestore(&s->lock, flags); + + return ret; +} + +static int kvm_eiointc_get_attr(struct kvm_device *dev, + struct kvm_device_attr *attr) +{ + switch (attr->group) { + case KVM_DEV_LOONGARCH_EXTIOI_GRP_REGS: + return kvm_eiointc_regs_access(dev, attr, false); + case KVM_DEV_LOONGARCH_EXTIOI_GRP_SW_STATUS: + return kvm_eiointc_sw_status_access(dev, attr, false); + default: + return -EINVAL; + } +} + +static int kvm_eiointc_set_attr(struct kvm_device *dev, + struct kvm_device_attr *attr) +{ + switch (attr->group) { + case KVM_DEV_LOONGARCH_EXTIOI_GRP_CTRL: + return kvm_eiointc_ctrl_access(dev, attr); + case KVM_DEV_LOONGARCH_EXTIOI_GRP_REGS: + return kvm_eiointc_regs_access(dev, attr, true); + case KVM_DEV_LOONGARCH_EXTIOI_GRP_SW_STATUS: + return kvm_eiointc_sw_status_access(dev, attr, true); + default: + return -EINVAL; + } +} + +static int kvm_eiointc_create(struct kvm_device *dev, u32 type) +{ + int ret; + struct loongarch_eiointc *s; + struct kvm_io_device *device, *device1; + struct kvm *kvm = dev->kvm; + + /* eiointc has been created */ + if (kvm->arch.eiointc) + return -EINVAL; + + s = kzalloc(sizeof(struct loongarch_eiointc), GFP_KERNEL); + if (!s) + return -ENOMEM; + + spin_lock_init(&s->lock); + s->kvm = kvm; + + /* + * Initialize IOCSR device + */ + device = &s->device; + kvm_iodevice_init(device, &kvm_eiointc_ops); + mutex_lock(&kvm->slots_lock); + ret = kvm_io_bus_register_dev(kvm, KVM_IOCSR_BUS, + EIOINTC_BASE, EIOINTC_SIZE, device); + mutex_unlock(&kvm->slots_lock); + if (ret < 0) { + kfree(s); + return ret; + } + + device1 = &s->device_vext; + kvm_iodevice_init(device1, &kvm_eiointc_virt_ops); + ret = kvm_io_bus_register_dev(kvm, KVM_IOCSR_BUS, + EIOINTC_VIRT_BASE, EIOINTC_VIRT_SIZE, device1); + if (ret < 0) { + kvm_io_bus_unregister_dev(kvm, KVM_IOCSR_BUS, &s->device); + kfree(s); + return ret; + } + kvm->arch.eiointc = s; + + return 0; +} + +static void kvm_eiointc_destroy(struct kvm_device *dev) +{ + struct kvm *kvm; + struct loongarch_eiointc *eiointc; + + if (!dev || !dev->kvm || !dev->kvm->arch.eiointc) + return; + + kvm = dev->kvm; + eiointc = kvm->arch.eiointc; + kvm_io_bus_unregister_dev(kvm, KVM_IOCSR_BUS, &eiointc->device); + kvm_io_bus_unregister_dev(kvm, KVM_IOCSR_BUS, &eiointc->device_vext); + kfree(eiointc); +} + +static struct kvm_device_ops kvm_eiointc_dev_ops = { + .name = "kvm-loongarch-eiointc", + .create = kvm_eiointc_create, + .destroy = kvm_eiointc_destroy, + .set_attr = kvm_eiointc_set_attr, + .get_attr = kvm_eiointc_get_attr, +}; + +int kvm_loongarch_register_eiointc_device(void) +{ + return kvm_register_device_ops(&kvm_eiointc_dev_ops, KVM_DEV_TYPE_LOONGARCH_EIOINTC); +} diff --git a/arch/loongarch/kvm/intc/ipi.c b/arch/loongarch/kvm/intc/ipi.c new file mode 100644 index 000000000000..a233a323e295 --- /dev/null +++ b/arch/loongarch/kvm/intc/ipi.c @@ -0,0 +1,475 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2024 Loongson Technology Corporation Limited + */ + +#include <linux/kvm_host.h> +#include <asm/kvm_ipi.h> +#include <asm/kvm_vcpu.h> + +static void ipi_send(struct kvm *kvm, uint64_t data) +{ + int cpu, action; + uint32_t status; + struct kvm_vcpu *vcpu; + struct kvm_interrupt irq; + + cpu = ((data & 0xffffffff) >> 16) & 0x3ff; + vcpu = kvm_get_vcpu_by_cpuid(kvm, cpu); + if (unlikely(vcpu == NULL)) { + kvm_err("%s: invalid target cpu: %d\n", __func__, cpu); + return; + } + + action = BIT(data & 0x1f); + spin_lock(&vcpu->arch.ipi_state.lock); + status = vcpu->arch.ipi_state.status; + vcpu->arch.ipi_state.status |= action; + spin_unlock(&vcpu->arch.ipi_state.lock); + if (status == 0) { + irq.irq = LARCH_INT_IPI; + kvm_vcpu_ioctl_interrupt(vcpu, &irq); + } +} + +static void ipi_clear(struct kvm_vcpu *vcpu, uint64_t data) +{ + uint32_t status; + struct kvm_interrupt irq; + + spin_lock(&vcpu->arch.ipi_state.lock); + vcpu->arch.ipi_state.status &= ~data; + status = vcpu->arch.ipi_state.status; + spin_unlock(&vcpu->arch.ipi_state.lock); + if (status == 0) { + irq.irq = -LARCH_INT_IPI; + kvm_vcpu_ioctl_interrupt(vcpu, &irq); + } +} + +static uint64_t read_mailbox(struct kvm_vcpu *vcpu, int offset, int len) +{ + uint64_t data = 0; + + spin_lock(&vcpu->arch.ipi_state.lock); + data = *(ulong *)((void *)vcpu->arch.ipi_state.buf + (offset - 0x20)); + spin_unlock(&vcpu->arch.ipi_state.lock); + + switch (len) { + case 1: + return data & 0xff; + case 2: + return data & 0xffff; + case 4: + return data & 0xffffffff; + case 8: + return data; + default: + kvm_err("%s: unknown data len: %d\n", __func__, len); + return 0; + } +} + +static void write_mailbox(struct kvm_vcpu *vcpu, int offset, uint64_t data, int len) +{ + void *pbuf; + + spin_lock(&vcpu->arch.ipi_state.lock); + pbuf = (void *)vcpu->arch.ipi_state.buf + (offset - 0x20); + + switch (len) { + case 1: + *(unsigned char *)pbuf = (unsigned char)data; + break; + case 2: + *(unsigned short *)pbuf = (unsigned short)data; + break; + case 4: + *(unsigned int *)pbuf = (unsigned int)data; + break; + case 8: + *(unsigned long *)pbuf = (unsigned long)data; + break; + default: + kvm_err("%s: unknown data len: %d\n", __func__, len); + } + spin_unlock(&vcpu->arch.ipi_state.lock); +} + +static int send_ipi_data(struct kvm_vcpu *vcpu, gpa_t addr, uint64_t data) +{ + int i, ret; + uint32_t val = 0, mask = 0; + + /* + * Bit 27-30 is mask for byte writing. + * If the mask is 0, we need not to do anything. + */ + if ((data >> 27) & 0xf) { + /* Read the old val */ + ret = kvm_io_bus_read(vcpu, KVM_IOCSR_BUS, addr, sizeof(val), &val); + if (unlikely(ret)) { + kvm_err("%s: : read date from addr %llx failed\n", __func__, addr); + return ret; + } + /* Construct the mask by scanning the bit 27-30 */ + for (i = 0; i < 4; i++) { + if (data & (BIT(27 + i))) + mask |= (0xff << (i * 8)); + } + /* Save the old part of val */ + val &= mask; + } + val |= ((uint32_t)(data >> 32) & ~mask); + ret = kvm_io_bus_write(vcpu, KVM_IOCSR_BUS, addr, sizeof(val), &val); + if (unlikely(ret)) + kvm_err("%s: : write date to addr %llx failed\n", __func__, addr); + + return ret; +} + +static int mail_send(struct kvm *kvm, uint64_t data) +{ + int cpu, mailbox, offset; + struct kvm_vcpu *vcpu; + + cpu = ((data & 0xffffffff) >> 16) & 0x3ff; + vcpu = kvm_get_vcpu_by_cpuid(kvm, cpu); + if (unlikely(vcpu == NULL)) { + kvm_err("%s: invalid target cpu: %d\n", __func__, cpu); + return -EINVAL; + } + mailbox = ((data & 0xffffffff) >> 2) & 0x7; + offset = IOCSR_IPI_BASE + IOCSR_IPI_BUF_20 + mailbox * 4; + + return send_ipi_data(vcpu, offset, data); +} + +static int any_send(struct kvm *kvm, uint64_t data) +{ + int cpu, offset; + struct kvm_vcpu *vcpu; + + cpu = ((data & 0xffffffff) >> 16) & 0x3ff; + vcpu = kvm_get_vcpu_by_cpuid(kvm, cpu); + if (unlikely(vcpu == NULL)) { + kvm_err("%s: invalid target cpu: %d\n", __func__, cpu); + return -EINVAL; + } + offset = data & 0xffff; + + return send_ipi_data(vcpu, offset, data); +} + +static int loongarch_ipi_readl(struct kvm_vcpu *vcpu, gpa_t addr, int len, void *val) +{ + int ret = 0; + uint32_t offset; + uint64_t res = 0; + + offset = (uint32_t)(addr & 0x1ff); + WARN_ON_ONCE(offset & (len - 1)); + + switch (offset) { + case IOCSR_IPI_STATUS: + spin_lock(&vcpu->arch.ipi_state.lock); + res = vcpu->arch.ipi_state.status; + spin_unlock(&vcpu->arch.ipi_state.lock); + break; + case IOCSR_IPI_EN: + spin_lock(&vcpu->arch.ipi_state.lock); + res = vcpu->arch.ipi_state.en; + spin_unlock(&vcpu->arch.ipi_state.lock); + break; + case IOCSR_IPI_SET: + res = 0; + break; + case IOCSR_IPI_CLEAR: + res = 0; + break; + case IOCSR_IPI_BUF_20 ... IOCSR_IPI_BUF_38 + 7: + if (offset + len > IOCSR_IPI_BUF_38 + 8) { + kvm_err("%s: invalid offset or len: offset = %d, len = %d\n", + __func__, offset, len); + ret = -EINVAL; + break; + } + res = read_mailbox(vcpu, offset, len); + break; + default: + kvm_err("%s: unknown addr: %llx\n", __func__, addr); + ret = -EINVAL; + break; + } + *(uint64_t *)val = res; + + return ret; +} + +static int loongarch_ipi_writel(struct kvm_vcpu *vcpu, gpa_t addr, int len, const void *val) +{ + int ret = 0; + uint64_t data; + uint32_t offset; + + data = *(uint64_t *)val; + + offset = (uint32_t)(addr & 0x1ff); + WARN_ON_ONCE(offset & (len - 1)); + + switch (offset) { + case IOCSR_IPI_STATUS: + ret = -EINVAL; + break; + case IOCSR_IPI_EN: + spin_lock(&vcpu->arch.ipi_state.lock); + vcpu->arch.ipi_state.en = data; + spin_unlock(&vcpu->arch.ipi_state.lock); + break; + case IOCSR_IPI_SET: + ret = -EINVAL; + break; + case IOCSR_IPI_CLEAR: + /* Just clear the status of the current vcpu */ + ipi_clear(vcpu, data); + break; + case IOCSR_IPI_BUF_20 ... IOCSR_IPI_BUF_38 + 7: + if (offset + len > IOCSR_IPI_BUF_38 + 8) { + kvm_err("%s: invalid offset or len: offset = %d, len = %d\n", + __func__, offset, len); + ret = -EINVAL; + break; + } + write_mailbox(vcpu, offset, data, len); + break; + case IOCSR_IPI_SEND: + ipi_send(vcpu->kvm, data); + break; + case IOCSR_MAIL_SEND: + ret = mail_send(vcpu->kvm, *(uint64_t *)val); + break; + case IOCSR_ANY_SEND: + ret = any_send(vcpu->kvm, *(uint64_t *)val); + break; + default: + kvm_err("%s: unknown addr: %llx\n", __func__, addr); + ret = -EINVAL; + break; + } + + return ret; +} + +static int kvm_ipi_read(struct kvm_vcpu *vcpu, + struct kvm_io_device *dev, + gpa_t addr, int len, void *val) +{ + int ret; + struct loongarch_ipi *ipi; + + ipi = vcpu->kvm->arch.ipi; + if (!ipi) { + kvm_err("%s: ipi irqchip not valid!\n", __func__); + return -EINVAL; + } + ipi->kvm->stat.ipi_read_exits++; + ret = loongarch_ipi_readl(vcpu, addr, len, val); + + return ret; +} + +static int kvm_ipi_write(struct kvm_vcpu *vcpu, + struct kvm_io_device *dev, + gpa_t addr, int len, const void *val) +{ + int ret; + struct loongarch_ipi *ipi; + + ipi = vcpu->kvm->arch.ipi; + if (!ipi) { + kvm_err("%s: ipi irqchip not valid!\n", __func__); + return -EINVAL; + } + ipi->kvm->stat.ipi_write_exits++; + ret = loongarch_ipi_writel(vcpu, addr, len, val); + + return ret; +} + +static const struct kvm_io_device_ops kvm_ipi_ops = { + .read = kvm_ipi_read, + .write = kvm_ipi_write, +}; + +static int kvm_ipi_regs_access(struct kvm_device *dev, + struct kvm_device_attr *attr, + bool is_write) +{ + int len = 4; + int cpu, addr; + uint64_t val; + void *p = NULL; + struct kvm_vcpu *vcpu; + + cpu = (attr->attr >> 16) & 0x3ff; + addr = attr->attr & 0xff; + + vcpu = kvm_get_vcpu(dev->kvm, cpu); + if (unlikely(vcpu == NULL)) { + kvm_err("%s: invalid target cpu: %d\n", __func__, cpu); + return -EINVAL; + } + + switch (addr) { + case IOCSR_IPI_STATUS: + p = &vcpu->arch.ipi_state.status; + break; + case IOCSR_IPI_EN: + p = &vcpu->arch.ipi_state.en; + break; + case IOCSR_IPI_SET: + p = &vcpu->arch.ipi_state.set; + break; + case IOCSR_IPI_CLEAR: + p = &vcpu->arch.ipi_state.clear; + break; + case IOCSR_IPI_BUF_20: + p = &vcpu->arch.ipi_state.buf[0]; + len = 8; + break; + case IOCSR_IPI_BUF_28: + p = &vcpu->arch.ipi_state.buf[1]; + len = 8; + break; + case IOCSR_IPI_BUF_30: + p = &vcpu->arch.ipi_state.buf[2]; + len = 8; + break; + case IOCSR_IPI_BUF_38: + p = &vcpu->arch.ipi_state.buf[3]; + len = 8; + break; + default: + kvm_err("%s: unknown ipi register, addr = %d\n", __func__, addr); + return -EINVAL; + } + + if (is_write) { + if (len == 4) { + if (get_user(val, (uint32_t __user *)attr->addr)) + return -EFAULT; + *(uint32_t *)p = (uint32_t)val; + } else if (len == 8) { + if (get_user(val, (uint64_t __user *)attr->addr)) + return -EFAULT; + *(uint64_t *)p = val; + } + } else { + if (len == 4) { + val = *(uint32_t *)p; + return put_user(val, (uint32_t __user *)attr->addr); + } else if (len == 8) { + val = *(uint64_t *)p; + return put_user(val, (uint64_t __user *)attr->addr); + } + } + + return 0; +} + +static int kvm_ipi_get_attr(struct kvm_device *dev, + struct kvm_device_attr *attr) +{ + switch (attr->group) { + case KVM_DEV_LOONGARCH_IPI_GRP_REGS: + return kvm_ipi_regs_access(dev, attr, false); + default: + kvm_err("%s: unknown group (%d)\n", __func__, attr->group); + return -EINVAL; + } +} + +static int kvm_ipi_set_attr(struct kvm_device *dev, + struct kvm_device_attr *attr) +{ + switch (attr->group) { + case KVM_DEV_LOONGARCH_IPI_GRP_REGS: + return kvm_ipi_regs_access(dev, attr, true); + default: + kvm_err("%s: unknown group (%d)\n", __func__, attr->group); + return -EINVAL; + } +} + +static int kvm_ipi_create(struct kvm_device *dev, u32 type) +{ + int ret; + struct kvm *kvm; + struct kvm_io_device *device; + struct loongarch_ipi *s; + + if (!dev) { + kvm_err("%s: kvm_device ptr is invalid!\n", __func__); + return -EINVAL; + } + + kvm = dev->kvm; + if (kvm->arch.ipi) { + kvm_err("%s: LoongArch IPI has already been created!\n", __func__); + return -EINVAL; + } + + s = kzalloc(sizeof(struct loongarch_ipi), GFP_KERNEL); + if (!s) + return -ENOMEM; + + spin_lock_init(&s->lock); + s->kvm = kvm; + + /* + * Initialize IOCSR device + */ + device = &s->device; + kvm_iodevice_init(device, &kvm_ipi_ops); + mutex_lock(&kvm->slots_lock); + ret = kvm_io_bus_register_dev(kvm, KVM_IOCSR_BUS, IOCSR_IPI_BASE, IOCSR_IPI_SIZE, device); + mutex_unlock(&kvm->slots_lock); + if (ret < 0) { + kvm_err("%s: Initialize IOCSR dev failed, ret = %d\n", __func__, ret); + goto err; + } + + kvm->arch.ipi = s; + return 0; + +err: + kfree(s); + return -EFAULT; +} + +static void kvm_ipi_destroy(struct kvm_device *dev) +{ + struct kvm *kvm; + struct loongarch_ipi *ipi; + + if (!dev || !dev->kvm || !dev->kvm->arch.ipi) + return; + + kvm = dev->kvm; + ipi = kvm->arch.ipi; + kvm_io_bus_unregister_dev(kvm, KVM_IOCSR_BUS, &ipi->device); + kfree(ipi); +} + +static struct kvm_device_ops kvm_ipi_dev_ops = { + .name = "kvm-loongarch-ipi", + .create = kvm_ipi_create, + .destroy = kvm_ipi_destroy, + .set_attr = kvm_ipi_set_attr, + .get_attr = kvm_ipi_get_attr, +}; + +int kvm_loongarch_register_ipi_device(void) +{ + return kvm_register_device_ops(&kvm_ipi_dev_ops, KVM_DEV_TYPE_LOONGARCH_IPI); +} diff --git a/arch/loongarch/kvm/intc/pch_pic.c b/arch/loongarch/kvm/intc/pch_pic.c new file mode 100644 index 000000000000..08fce845f668 --- /dev/null +++ b/arch/loongarch/kvm/intc/pch_pic.c @@ -0,0 +1,519 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2024 Loongson Technology Corporation Limited + */ + +#include <asm/kvm_eiointc.h> +#include <asm/kvm_pch_pic.h> +#include <asm/kvm_vcpu.h> +#include <linux/count_zeros.h> + +/* update the isr according to irq level and route irq to eiointc */ +static void pch_pic_update_irq(struct loongarch_pch_pic *s, int irq, int level) +{ + u64 mask = BIT(irq); + + /* + * set isr and route irq to eiointc and + * the route table is in htmsi_vector[] + */ + if (level) { + if (mask & s->irr & ~s->mask) { + s->isr |= mask; + irq = s->htmsi_vector[irq]; + eiointc_set_irq(s->kvm->arch.eiointc, irq, level); + } + } else { + if (mask & s->isr & ~s->irr) { + s->isr &= ~mask; + irq = s->htmsi_vector[irq]; + eiointc_set_irq(s->kvm->arch.eiointc, irq, level); + } + } +} + +/* update batch irqs, the irq_mask is a bitmap of irqs */ +static void pch_pic_update_batch_irqs(struct loongarch_pch_pic *s, u64 irq_mask, int level) +{ + int irq, bits; + + /* find each irq by irqs bitmap and update each irq */ + bits = sizeof(irq_mask) * 8; + irq = find_first_bit((void *)&irq_mask, bits); + while (irq < bits) { + pch_pic_update_irq(s, irq, level); + bitmap_clear((void *)&irq_mask, irq, 1); + irq = find_first_bit((void *)&irq_mask, bits); + } +} + +/* called when a irq is triggered in pch pic */ +void pch_pic_set_irq(struct loongarch_pch_pic *s, int irq, int level) +{ + u64 mask = BIT(irq); + + spin_lock(&s->lock); + if (level) + s->irr |= mask; /* set irr */ + else { + /* + * In edge triggered mode, 0 does not mean to clear irq + * The irr register variable is cleared when cpu writes to the + * PCH_PIC_CLEAR_START address area + */ + if (s->edge & mask) { + spin_unlock(&s->lock); + return; + } + s->irr &= ~mask; + } + pch_pic_update_irq(s, irq, level); + spin_unlock(&s->lock); +} + +/* msi irq handler */ +void pch_msi_set_irq(struct kvm *kvm, int irq, int level) +{ + eiointc_set_irq(kvm->arch.eiointc, irq, level); +} + +/* + * pch pic register is 64-bit, but it is accessed by 32-bit, + * so we use high to get whether low or high 32 bits we want + * to read. + */ +static u32 pch_pic_read_reg(u64 *s, int high) +{ + u64 val = *s; + + /* read the high 32 bits when high is 1 */ + return high ? (u32)(val >> 32) : (u32)val; +} + +/* + * pch pic register is 64-bit, but it is accessed by 32-bit, + * so we use high to get whether low or high 32 bits we want + * to write. + */ +static u32 pch_pic_write_reg(u64 *s, int high, u32 v) +{ + u64 val = *s, data = v; + + if (high) { + /* + * Clear val high 32 bits + * Write the high 32 bits when the high is 1 + */ + *s = (val << 32 >> 32) | (data << 32); + val >>= 32; + } else + /* + * Clear val low 32 bits + * Write the low 32 bits when the high is 0 + */ + *s = (val >> 32 << 32) | v; + + return (u32)val; +} + +static int loongarch_pch_pic_read(struct loongarch_pch_pic *s, gpa_t addr, int len, void *val) +{ + int offset, index, ret = 0; + u32 data = 0; + u64 int_id = 0; + + offset = addr - s->pch_pic_base; + + spin_lock(&s->lock); + switch (offset) { + case PCH_PIC_INT_ID_START ... PCH_PIC_INT_ID_END: + /* int id version */ + int_id |= (u64)PCH_PIC_INT_ID_VER << 32; + /* irq number */ + int_id |= (u64)31 << (32 + 16); + /* int id value */ + int_id |= PCH_PIC_INT_ID_VAL; + *(u64 *)val = int_id; + break; + case PCH_PIC_MASK_START ... PCH_PIC_MASK_END: + offset -= PCH_PIC_MASK_START; + index = offset >> 2; + /* read mask reg */ + data = pch_pic_read_reg(&s->mask, index); + *(u32 *)val = data; + break; + case PCH_PIC_HTMSI_EN_START ... PCH_PIC_HTMSI_EN_END: + offset -= PCH_PIC_HTMSI_EN_START; + index = offset >> 2; + /* read htmsi enable reg */ + data = pch_pic_read_reg(&s->htmsi_en, index); + *(u32 *)val = data; + break; + case PCH_PIC_EDGE_START ... PCH_PIC_EDGE_END: + offset -= PCH_PIC_EDGE_START; + index = offset >> 2; + /* read edge enable reg */ + data = pch_pic_read_reg(&s->edge, index); + *(u32 *)val = data; + break; + case PCH_PIC_AUTO_CTRL0_START ... PCH_PIC_AUTO_CTRL0_END: + case PCH_PIC_AUTO_CTRL1_START ... PCH_PIC_AUTO_CTRL1_END: + /* we only use default mode: fixed interrupt distribution mode */ + *(u32 *)val = 0; + break; + case PCH_PIC_ROUTE_ENTRY_START ... PCH_PIC_ROUTE_ENTRY_END: + /* only route to int0: eiointc */ + *(u8 *)val = 1; + break; + case PCH_PIC_HTMSI_VEC_START ... PCH_PIC_HTMSI_VEC_END: + offset -= PCH_PIC_HTMSI_VEC_START; + /* read htmsi vector */ + data = s->htmsi_vector[offset]; + *(u8 *)val = data; + break; + case PCH_PIC_POLARITY_START ... PCH_PIC_POLARITY_END: + /* we only use defalut value 0: high level triggered */ + *(u32 *)val = 0; + break; + default: + ret = -EINVAL; + } + spin_unlock(&s->lock); + + return ret; +} + +static int kvm_pch_pic_read(struct kvm_vcpu *vcpu, + struct kvm_io_device *dev, + gpa_t addr, int len, void *val) +{ + int ret; + struct loongarch_pch_pic *s = vcpu->kvm->arch.pch_pic; + + if (!s) { + kvm_err("%s: pch pic irqchip not valid!\n", __func__); + return -EINVAL; + } + + /* statistics of pch pic reading */ + vcpu->kvm->stat.pch_pic_read_exits++; + ret = loongarch_pch_pic_read(s, addr, len, val); + + return ret; +} + +static int loongarch_pch_pic_write(struct loongarch_pch_pic *s, gpa_t addr, + int len, const void *val) +{ + int ret; + u32 old, data, offset, index; + u64 irq; + + ret = 0; + data = *(u32 *)val; + offset = addr - s->pch_pic_base; + + spin_lock(&s->lock); + switch (offset) { + case PCH_PIC_MASK_START ... PCH_PIC_MASK_END: + offset -= PCH_PIC_MASK_START; + /* get whether high or low 32 bits we want to write */ + index = offset >> 2; + old = pch_pic_write_reg(&s->mask, index, data); + /* enable irq when mask value change to 0 */ + irq = (old & ~data) << (32 * index); + pch_pic_update_batch_irqs(s, irq, 1); + /* disable irq when mask value change to 1 */ + irq = (~old & data) << (32 * index); + pch_pic_update_batch_irqs(s, irq, 0); + break; + case PCH_PIC_HTMSI_EN_START ... PCH_PIC_HTMSI_EN_END: + offset -= PCH_PIC_HTMSI_EN_START; + index = offset >> 2; + pch_pic_write_reg(&s->htmsi_en, index, data); + break; + case PCH_PIC_EDGE_START ... PCH_PIC_EDGE_END: + offset -= PCH_PIC_EDGE_START; + index = offset >> 2; + /* 1: edge triggered, 0: level triggered */ + pch_pic_write_reg(&s->edge, index, data); + break; + case PCH_PIC_CLEAR_START ... PCH_PIC_CLEAR_END: + offset -= PCH_PIC_CLEAR_START; + index = offset >> 2; + /* write 1 to clear edge irq */ + old = pch_pic_read_reg(&s->irr, index); + /* + * get the irq bitmap which is edge triggered and + * already set and to be cleared + */ + irq = old & pch_pic_read_reg(&s->edge, index) & data; + /* write irr to the new state where irqs have been cleared */ + pch_pic_write_reg(&s->irr, index, old & ~irq); + /* update cleared irqs */ + pch_pic_update_batch_irqs(s, irq, 0); + break; + case PCH_PIC_AUTO_CTRL0_START ... PCH_PIC_AUTO_CTRL0_END: + offset -= PCH_PIC_AUTO_CTRL0_START; + index = offset >> 2; + /* we only use default mode: fixed interrupt distribution mode */ + pch_pic_write_reg(&s->auto_ctrl0, index, 0); + break; + case PCH_PIC_AUTO_CTRL1_START ... PCH_PIC_AUTO_CTRL1_END: + offset -= PCH_PIC_AUTO_CTRL1_START; + index = offset >> 2; + /* we only use default mode: fixed interrupt distribution mode */ + pch_pic_write_reg(&s->auto_ctrl1, index, 0); + break; + case PCH_PIC_ROUTE_ENTRY_START ... PCH_PIC_ROUTE_ENTRY_END: + offset -= PCH_PIC_ROUTE_ENTRY_START; + /* only route to int0: eiointc */ + s->route_entry[offset] = 1; + break; + case PCH_PIC_HTMSI_VEC_START ... PCH_PIC_HTMSI_VEC_END: + /* route table to eiointc */ + offset -= PCH_PIC_HTMSI_VEC_START; + s->htmsi_vector[offset] = (u8)data; + break; + case PCH_PIC_POLARITY_START ... PCH_PIC_POLARITY_END: + offset -= PCH_PIC_POLARITY_START; + index = offset >> 2; + /* we only use defalut value 0: high level triggered */ + pch_pic_write_reg(&s->polarity, index, 0); + break; + default: + ret = -EINVAL; + break; + } + spin_unlock(&s->lock); + + return ret; +} + +static int kvm_pch_pic_write(struct kvm_vcpu *vcpu, + struct kvm_io_device *dev, + gpa_t addr, int len, const void *val) +{ + int ret; + struct loongarch_pch_pic *s = vcpu->kvm->arch.pch_pic; + + if (!s) { + kvm_err("%s: pch pic irqchip not valid!\n", __func__); + return -EINVAL; + } + + /* statistics of pch pic writing */ + vcpu->kvm->stat.pch_pic_write_exits++; + ret = loongarch_pch_pic_write(s, addr, len, val); + + return ret; +} + +static const struct kvm_io_device_ops kvm_pch_pic_ops = { + .read = kvm_pch_pic_read, + .write = kvm_pch_pic_write, +}; + +static int kvm_pch_pic_init(struct kvm_device *dev, u64 addr) +{ + int ret; + struct kvm *kvm = dev->kvm; + struct kvm_io_device *device; + struct loongarch_pch_pic *s = dev->kvm->arch.pch_pic; + + s->pch_pic_base = addr; + device = &s->device; + /* init device by pch pic writing and reading ops */ + kvm_iodevice_init(device, &kvm_pch_pic_ops); + mutex_lock(&kvm->slots_lock); + /* register pch pic device */ + ret = kvm_io_bus_register_dev(kvm, KVM_MMIO_BUS, addr, PCH_PIC_SIZE, device); + mutex_unlock(&kvm->slots_lock); + + return (ret < 0) ? -EFAULT : 0; +} + +/* used by user space to get or set pch pic registers */ +static int kvm_pch_pic_regs_access(struct kvm_device *dev, + struct kvm_device_attr *attr, + bool is_write) +{ + int addr, offset, len = 8, ret = 0; + void __user *data; + void *p = NULL; + struct loongarch_pch_pic *s; + + s = dev->kvm->arch.pch_pic; + addr = attr->attr; + data = (void __user *)attr->addr; + + /* get pointer to pch pic register by addr */ + switch (addr) { + case PCH_PIC_MASK_START: + p = &s->mask; + break; + case PCH_PIC_HTMSI_EN_START: + p = &s->htmsi_en; + break; + case PCH_PIC_EDGE_START: + p = &s->edge; + break; + case PCH_PIC_AUTO_CTRL0_START: + p = &s->auto_ctrl0; + break; + case PCH_PIC_AUTO_CTRL1_START: + p = &s->auto_ctrl1; + break; + case PCH_PIC_ROUTE_ENTRY_START ... PCH_PIC_ROUTE_ENTRY_END: + offset = addr - PCH_PIC_ROUTE_ENTRY_START; + p = &s->route_entry[offset]; + len = 1; + break; + case PCH_PIC_HTMSI_VEC_START ... PCH_PIC_HTMSI_VEC_END: + offset = addr - PCH_PIC_HTMSI_VEC_START; + p = &s->htmsi_vector[offset]; + len = 1; + break; + case PCH_PIC_INT_IRR_START: + p = &s->irr; + break; + case PCH_PIC_INT_ISR_START: + p = &s->isr; + break; + case PCH_PIC_POLARITY_START: + p = &s->polarity; + break; + default: + return -EINVAL; + } + + spin_lock(&s->lock); + /* write or read value according to is_write */ + if (is_write) { + if (copy_from_user(p, data, len)) + ret = -EFAULT; + } else { + if (copy_to_user(data, p, len)) + ret = -EFAULT; + } + spin_unlock(&s->lock); + + return ret; +} + +static int kvm_pch_pic_get_attr(struct kvm_device *dev, + struct kvm_device_attr *attr) +{ + switch (attr->group) { + case KVM_DEV_LOONGARCH_PCH_PIC_GRP_REGS: + return kvm_pch_pic_regs_access(dev, attr, false); + default: + return -EINVAL; + } +} + +static int kvm_pch_pic_set_attr(struct kvm_device *dev, + struct kvm_device_attr *attr) +{ + u64 addr; + void __user *uaddr = (void __user *)(long)attr->addr; + + switch (attr->group) { + case KVM_DEV_LOONGARCH_PCH_PIC_GRP_CTRL: + switch (attr->attr) { + case KVM_DEV_LOONGARCH_PCH_PIC_CTRL_INIT: + if (copy_from_user(&addr, uaddr, sizeof(addr))) + return -EFAULT; + + if (!dev->kvm->arch.pch_pic) { + kvm_err("%s: please create pch_pic irqchip first!\n", __func__); + return -ENODEV; + } + + return kvm_pch_pic_init(dev, addr); + default: + kvm_err("%s: unknown group (%d) attr (%lld)\n", __func__, attr->group, + attr->attr); + return -EINVAL; + } + case KVM_DEV_LOONGARCH_PCH_PIC_GRP_REGS: + return kvm_pch_pic_regs_access(dev, attr, true); + default: + return -EINVAL; + } +} + +static int kvm_setup_default_irq_routing(struct kvm *kvm) +{ + int i, ret; + u32 nr = KVM_IRQCHIP_NUM_PINS; + struct kvm_irq_routing_entry *entries; + + entries = kcalloc(nr, sizeof(*entries), GFP_KERNEL); + if (!entries) + return -ENOMEM; + + for (i = 0; i < nr; i++) { + entries[i].gsi = i; + entries[i].type = KVM_IRQ_ROUTING_IRQCHIP; + entries[i].u.irqchip.irqchip = 0; + entries[i].u.irqchip.pin = i; + } + ret = kvm_set_irq_routing(kvm, entries, nr, 0); + kfree(entries); + + return ret; +} + +static int kvm_pch_pic_create(struct kvm_device *dev, u32 type) +{ + int ret; + struct kvm *kvm = dev->kvm; + struct loongarch_pch_pic *s; + + /* pch pic should not has been created */ + if (kvm->arch.pch_pic) + return -EINVAL; + + ret = kvm_setup_default_irq_routing(kvm); + if (ret) + return -ENOMEM; + + s = kzalloc(sizeof(struct loongarch_pch_pic), GFP_KERNEL); + if (!s) + return -ENOMEM; + + spin_lock_init(&s->lock); + s->kvm = kvm; + kvm->arch.pch_pic = s; + + return 0; +} + +static void kvm_pch_pic_destroy(struct kvm_device *dev) +{ + struct kvm *kvm; + struct loongarch_pch_pic *s; + + if (!dev || !dev->kvm || !dev->kvm->arch.pch_pic) + return; + + kvm = dev->kvm; + s = kvm->arch.pch_pic; + /* unregister pch pic device and free it's memory */ + kvm_io_bus_unregister_dev(kvm, KVM_MMIO_BUS, &s->device); + kfree(s); +} + +static struct kvm_device_ops kvm_pch_pic_dev_ops = { + .name = "kvm-loongarch-pch-pic", + .create = kvm_pch_pic_create, + .destroy = kvm_pch_pic_destroy, + .set_attr = kvm_pch_pic_set_attr, + .get_attr = kvm_pch_pic_get_attr, +}; + +int kvm_loongarch_register_pch_pic_device(void) +{ + return kvm_register_device_ops(&kvm_pch_pic_dev_ops, KVM_DEV_TYPE_LOONGARCH_PCHPIC); +} diff --git a/arch/loongarch/kvm/irqfd.c b/arch/loongarch/kvm/irqfd.c new file mode 100644 index 000000000000..9a39627aecf0 --- /dev/null +++ b/arch/loongarch/kvm/irqfd.c @@ -0,0 +1,89 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2024 Loongson Technology Corporation Limited + */ + +#include <linux/kvm_host.h> +#include <trace/events/kvm.h> +#include <asm/kvm_pch_pic.h> + +static int kvm_set_pic_irq(struct kvm_kernel_irq_routing_entry *e, + struct kvm *kvm, int irq_source_id, int level, bool line_status) +{ + /* PCH-PIC pin (0 ~ 64) <---> GSI (0 ~ 64) */ + pch_pic_set_irq(kvm->arch.pch_pic, e->irqchip.pin, level); + + return 0; +} + +/* + * kvm_set_msi: inject the MSI corresponding to the + * MSI routing entry + * + * This is the entry point for irqfd MSI injection + * and userspace MSI injection. + */ +int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e, + struct kvm *kvm, int irq_source_id, int level, bool line_status) +{ + if (!level) + return -1; + + pch_msi_set_irq(kvm, e->msi.data, level); + + return 0; +} + +/* + * kvm_set_routing_entry: populate a kvm routing entry + * from a user routing entry + * + * @kvm: the VM this entry is applied to + * @e: kvm kernel routing entry handle + * @ue: user api routing entry handle + * return 0 on success, -EINVAL on errors. + */ +int kvm_set_routing_entry(struct kvm *kvm, + struct kvm_kernel_irq_routing_entry *e, + const struct kvm_irq_routing_entry *ue) +{ + switch (ue->type) { + case KVM_IRQ_ROUTING_IRQCHIP: + e->set = kvm_set_pic_irq; + e->irqchip.irqchip = ue->u.irqchip.irqchip; + e->irqchip.pin = ue->u.irqchip.pin; + + if (e->irqchip.pin >= KVM_IRQCHIP_NUM_PINS) + return -EINVAL; + + return 0; + case KVM_IRQ_ROUTING_MSI: + e->set = kvm_set_msi; + e->msi.address_lo = ue->u.msi.address_lo; + e->msi.address_hi = ue->u.msi.address_hi; + e->msi.data = ue->u.msi.data; + return 0; + default: + return -EINVAL; + } +} + +int kvm_arch_set_irq_inatomic(struct kvm_kernel_irq_routing_entry *e, + struct kvm *kvm, int irq_source_id, int level, bool line_status) +{ + switch (e->type) { + case KVM_IRQ_ROUTING_IRQCHIP: + pch_pic_set_irq(kvm->arch.pch_pic, e->irqchip.pin, level); + return 0; + case KVM_IRQ_ROUTING_MSI: + pch_msi_set_irq(kvm, e->msi.data, level); + return 0; + default: + return -EWOULDBLOCK; + } +} + +bool kvm_arch_intc_initialized(struct kvm *kvm) +{ + return kvm_arch_irqchip_in_kernel(kvm); +} diff --git a/arch/loongarch/kvm/main.c b/arch/loongarch/kvm/main.c index 27e9b94c0a0b..396fed2665a5 100644 --- a/arch/loongarch/kvm/main.c +++ b/arch/loongarch/kvm/main.c @@ -9,6 +9,8 @@ #include <asm/cacheflush.h> #include <asm/cpufeature.h> #include <asm/kvm_csr.h> +#include <asm/kvm_eiointc.h> +#include <asm/kvm_pch_pic.h> #include "trace.h" unsigned long vpid_mask; @@ -313,7 +315,7 @@ void kvm_arch_disable_virtualization_cpu(void) static int kvm_loongarch_env_init(void) { - int cpu, order; + int cpu, order, ret; void *addr; struct kvm_context *context; @@ -368,7 +370,20 @@ static int kvm_loongarch_env_init(void) kvm_init_gcsr_flag(); - return 0; + /* Register LoongArch IPI interrupt controller interface. */ + ret = kvm_loongarch_register_ipi_device(); + if (ret) + return ret; + + /* Register LoongArch EIOINTC interrupt controller interface. */ + ret = kvm_loongarch_register_eiointc_device(); + if (ret) + return ret; + + /* Register LoongArch PCH-PIC interrupt controller interface. */ + ret = kvm_loongarch_register_pch_pic_device(); + + return ret; } static void kvm_loongarch_env_exit(void) diff --git a/arch/loongarch/kvm/mmu.c b/arch/loongarch/kvm/mmu.c index 28681dfb4b85..4d203294767c 100644 --- a/arch/loongarch/kvm/mmu.c +++ b/arch/loongarch/kvm/mmu.c @@ -552,12 +552,10 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) static int kvm_map_page_fast(struct kvm_vcpu *vcpu, unsigned long gpa, bool write) { int ret = 0; - kvm_pfn_t pfn = 0; kvm_pte_t *ptep, changed, new; gfn_t gfn = gpa >> PAGE_SHIFT; struct kvm *kvm = vcpu->kvm; struct kvm_memory_slot *slot; - struct page *page; spin_lock(&kvm->mmu_lock); @@ -570,8 +568,6 @@ static int kvm_map_page_fast(struct kvm_vcpu *vcpu, unsigned long gpa, bool writ /* Track access to pages marked old */ new = kvm_pte_mkyoung(*ptep); - /* call kvm_set_pfn_accessed() after unlock */ - if (write && !kvm_pte_dirty(new)) { if (!kvm_pte_write(new)) { ret = -EFAULT; @@ -595,26 +591,14 @@ static int kvm_map_page_fast(struct kvm_vcpu *vcpu, unsigned long gpa, bool writ } changed = new ^ (*ptep); - if (changed) { + if (changed) kvm_set_pte(ptep, new); - pfn = kvm_pte_pfn(new); - page = kvm_pfn_to_refcounted_page(pfn); - if (page) - get_page(page); - } + spin_unlock(&kvm->mmu_lock); - if (changed) { - if (kvm_pte_young(changed)) - kvm_set_pfn_accessed(pfn); + if (kvm_pte_dirty(changed)) + mark_page_dirty(kvm, gfn); - if (kvm_pte_dirty(changed)) { - mark_page_dirty(kvm, gfn); - kvm_set_pfn_dirty(pfn); - } - if (page) - put_page(page); - } return ret; out: spin_unlock(&kvm->mmu_lock); @@ -796,6 +780,7 @@ static int kvm_map_page(struct kvm_vcpu *vcpu, unsigned long gpa, bool write) struct kvm *kvm = vcpu->kvm; struct kvm_memory_slot *memslot; struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache; + struct page *page; /* Try the fast path to handle old / clean pages */ srcu_idx = srcu_read_lock(&kvm->srcu); @@ -823,7 +808,7 @@ retry: mmu_seq = kvm->mmu_invalidate_seq; /* * Ensure the read of mmu_invalidate_seq isn't reordered with PTE reads in - * gfn_to_pfn_prot() (which calls get_user_pages()), so that we don't + * kvm_faultin_pfn() (which calls get_user_pages()), so that we don't * risk the page we get a reference to getting unmapped before we have a * chance to grab the mmu_lock without mmu_invalidate_retry() noticing. * @@ -835,7 +820,7 @@ retry: smp_rmb(); /* Slow path - ask KVM core whether we can access this GPA */ - pfn = gfn_to_pfn_prot(kvm, gfn, write, &writeable); + pfn = kvm_faultin_pfn(vcpu, gfn, write, &writeable, &page); if (is_error_noslot_pfn(pfn)) { err = -EFAULT; goto out; @@ -847,10 +832,10 @@ retry: /* * This can happen when mappings are changed asynchronously, but * also synchronously if a COW is triggered by - * gfn_to_pfn_prot(). + * kvm_faultin_pfn(). */ spin_unlock(&kvm->mmu_lock); - kvm_release_pfn_clean(pfn); + kvm_release_page_unused(page); if (retry_no > 100) { retry_no = 0; schedule(); @@ -915,14 +900,13 @@ retry: else ++kvm->stat.pages; kvm_set_pte(ptep, new_pte); + + kvm_release_faultin_page(kvm, page, false, writeable); spin_unlock(&kvm->mmu_lock); - if (prot_bits & _PAGE_DIRTY) { + if (prot_bits & _PAGE_DIRTY) mark_page_dirty_in_slot(kvm, memslot, gfn); - kvm_set_pfn_dirty(pfn); - } - kvm_release_pfn_clean(pfn); out: srcu_read_unlock(&kvm->srcu, srcu_idx); return err; diff --git a/arch/loongarch/kvm/vcpu.c b/arch/loongarch/kvm/vcpu.c index 174734a23d0a..cab1818be68d 100644 --- a/arch/loongarch/kvm/vcpu.c +++ b/arch/loongarch/kvm/vcpu.c @@ -1475,6 +1475,9 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) /* Init */ vcpu->arch.last_sched_cpu = -1; + /* Init ipi_state lock */ + spin_lock_init(&vcpu->arch.ipi_state.lock); + /* * Initialize guest register state to valid architectural reset state. */ diff --git a/arch/loongarch/kvm/vm.c b/arch/loongarch/kvm/vm.c index 4ba734aaef87..b8b3e1972d6e 100644 --- a/arch/loongarch/kvm/vm.c +++ b/arch/loongarch/kvm/vm.c @@ -6,6 +6,8 @@ #include <linux/kvm_host.h> #include <asm/kvm_mmu.h> #include <asm/kvm_vcpu.h> +#include <asm/kvm_eiointc.h> +#include <asm/kvm_pch_pic.h> const struct _kvm_stats_desc kvm_vm_stats_desc[] = { KVM_GENERIC_VM_STATS(), @@ -76,6 +78,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) int r; switch (ext) { + case KVM_CAP_IRQCHIP: case KVM_CAP_ONE_REG: case KVM_CAP_ENABLE_CAP: case KVM_CAP_READONLY_MEM: @@ -161,6 +164,8 @@ int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg) struct kvm_device_attr attr; switch (ioctl) { + case KVM_CREATE_IRQCHIP: + return 0; case KVM_HAS_DEVICE_ATTR: if (copy_from_user(&attr, argp, sizeof(attr))) return -EFAULT; @@ -170,3 +175,19 @@ int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg) return -ENOIOCTLCMD; } } + +int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event, bool line_status) +{ + if (!kvm_arch_irqchip_in_kernel(kvm)) + return -ENXIO; + + irq_event->status = kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, + irq_event->irq, irq_event->level, line_status); + + return 0; +} + +bool kvm_arch_irqchip_in_kernel(struct kvm *kvm) +{ + return (kvm->arch.ipi && kvm->arch.eiointc && kvm->arch.pch_pic); +} |