summaryrefslogtreecommitdiffstats
path: root/arch (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'gfs2-v5.15-rc5-mmap-fault' of ↵Linus Torvalds2021-11-024-5/+6
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 mmap + page fault deadlocks fixes from Andreas Gruenbacher: "Functions gfs2_file_read_iter and gfs2_file_write_iter are both accessing the user buffer to write to or read from while holding the inode glock. In the most basic deadlock scenario, that buffer will not be resident and it will be mapped to the same file. Accessing the buffer will trigger a page fault, and gfs2 will deadlock trying to take the same inode glock again while trying to handle that fault. Fix that and similar, more complex scenarios by disabling page faults while accessing user buffers. To make this work, introduce a small amount of new infrastructure and fix some bugs that didn't trigger so far, with page faults enabled" * tag 'gfs2-v5.15-rc5-mmap-fault' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Fix mmap + page fault deadlocks for direct I/O iov_iter: Introduce nofault flag to disable page faults gup: Introduce FOLL_NOFAULT flag to disable page faults iomap: Add done_before argument to iomap_dio_rw iomap: Support partial direct I/O on user copy failures iomap: Fix iomap_dio_rw return value for user copies gfs2: Fix mmap + page fault deadlocks for buffered I/O gfs2: Eliminate ip->i_gh gfs2: Move the inode glock locking to gfs2_file_buffered_write gfs2: Introduce flag for glock holder auto-demotion gfs2: Clean up function may_grant gfs2: Add wrapper for iomap_file_buffered_write iov_iter: Introduce fault_in_iov_iter_writeable iov_iter: Turn iov_iter_fault_in_readable into fault_in_iov_iter_readable gup: Turn fault_in_pages_{readable,writeable} into fault_in_{readable,writeable} powerpc/kvm: Fix kvm_use_magic_page iov_iter: Fix iov_iter_get_pages{,_alloc} page fault return value
| * gup: Turn fault_in_pages_{readable,writeable} into fault_in_{readable,writeable}Andreas Gruenbacher2021-10-184-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Turn fault_in_pages_{readable,writeable} into versions that return the number of bytes not faulted in, similar to copy_to_user, instead of returning a non-zero value when any of the requested pages couldn't be faulted in. This supports the existing users that require all pages to be faulted in as well as new users that are happy if any pages can be faulted in. Rename the functions to fault_in_{readable,writeable} to make sure this change doesn't silently break things. Neither of these functions is entirely trivial and it doesn't seem useful to inline them, so move them to mm/gup.c. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
| * powerpc/kvm: Fix kvm_use_magic_pageAndreas Gruenbacher2021-10-181-1/+1
| | | | | | | | | | | | | | | | | | | | When switching from __get_user to fault_in_pages_readable, commit 9f9eae5ce717 broke kvm_use_magic_page: like __get_user, fault_in_pages_readable returns 0 on success. Fixes: 9f9eae5ce717 ("powerpc/kvm: Prefer fault_in_pages_readable function") Cc: stable@vger.kernel.org # v4.18+ Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
* | Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds2021-11-0235-117/+339
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull ARM updates from Russell King: - Rejig task/thread info to place thread info in task struct - Amba bus cleanups (removing unused functions) - Handle Amba device probe without IRQ domains - Parse linux,usable-memory-range in decompressor - Mark OCRAM as read-only after initialisation - Refactor page fault handling - Fix PXN handling with LPAE kernels - Warning and build fixes from Arnd * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: (32 commits) ARM: 9151/1: Thumb2: avoid __builtin_thread_pointer() on Clang ARM: 9150/1: Fix PID_IN_CONTEXTIDR regression when THREAD_INFO_IN_TASK=y ARM: 9147/1: add printf format attribute to early_print() ARM: 9146/1: RiscPC needs older gcc version ARM: 9145/1: patch: fix BE32 compilation ARM: 9144/1: forbid ftrace with clang and thumb2_kernel ARM: 9143/1: add CONFIG_PHYS_OFFSET default values ARM: 9142/1: kasan: work around LPAE build warning ARM: 9140/1: allow compile-testing without machine record ARM: 9137/1: disallow CONFIG_THUMB with ARMv4 ARM: 9136/1: ARMv7-M uses BE-8, not BE-32 ARM: 9135/1: kprobes: address gcc -Wempty-body warning ARM: 9101/1: sa1100/assabet: convert LEDs to gpiod APIs ARM: 9131/1: mm: Fix PXN process with LPAE feature ARM: 9130/1: mm: Provide die_kernel_fault() helper ARM: 9126/1: mm: Kill page table base print in show_pte() ARM: 9127/1: mm: Cleanup access_error() ARM: 9129/1: mm: Kill task_struct argument for __do_page_fault() ARM: 9128/1: mm: Refactor the __do_page_fault() ARM: imx6: mark OCRAM mapping read-only ...
| | \
| | \
| *-. \ Merge branches 'devel-stable' and 'misc' into for-linusRussell King (Oracle)2021-11-0235-117/+339
| |\ \ \
| | | * | ARM: 9147/1: add printf format attribute to early_print()Nicolas Iooss2021-10-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adding such an attribute is helpful to detect errors related to printf formats at compile-time. Link: https://lore.kernel.org/r/20160828165815.25647-1-nicolas.iooss_linux@m4x.org Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: 9146/1: RiscPC needs older gcc versionArnd Bergmann2021-10-251-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Attempting to build mach-rpc with gcc-9 or higher, or with any version of clang results in a build failure, like: arm-linux-gnueabi-gcc-11.1.0: error: unrecognized -march target: armv3m arm-linux-gnueabi-gcc-11.1.0: note: valid arguments are: armv4 armv4t armv5t armv5te armv5tej armv6 armv6j armv6k armv6z armv6kz armv6zk armv6t2 armv6-m armv6s-m armv7 armv7-a armv7ve armv7-r armv7-m armv7e-m armv8-a armv8.1-a armv8.2-a armv8.3-a armv8.4-a armv8.5-a armv8.6-a armv8-m.base armv8-m.main armv8-r armv8.1-m.main iwmmxt iwmmxt2; did you mean 'armv4'? Building with gcc-5 also fails in at least one of these ways: /tmp/cczZoCcv.s:68: Error: selected processor does not support `bx lr' in ARM mode drivers/tty/vt/vt_ioctl.c:958:1: internal compiler error: Segmentation fault Handle this in Kconfig so we don't run into this with randconfig builds, allowing only gcc-6 through gcc-8. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: 9145/1: patch: fix BE32 compilationArnd Bergmann2021-10-251-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On BE32 kernels, the __opcode_to_mem_thumb32() interface is intentionally not defined, but it is referenced whenever runtime patching is enabled for the kernel, which may be for ftrace, jump label, kprobes or kgdb: arch/arm/kernel/patch.c: In function '__patch_text_real': arch/arm/kernel/patch.c:94:32: error: implicit declaration of function '__opcode_to_mem_thumb32' [-Werror=implicit-function-declaration] 94 | insn = __opcode_to_mem_thumb32(insn); | ^~~~~~~~~~~~~~~~~~~~~~~ Since BE32 kernels never run Thumb2 code, we never end up using the result of this call, so providing an extern declaration without a definition makes it build correctly. Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: 9144/1: forbid ftrace with clang and thumb2_kernelArnd Bergmann2021-10-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | clang fails to build kernels with THUMB2 and FUNCTION_TRACER enabled when there is any inline asm statement containing the frame pointer register r7: arch/arm/mach-exynos/mcpm-exynos.c:154:2: error: inline asm clobber list contains reserved registers: R7 [-Werror,-Winline-asm] arch/arm/probes/kprobes/actions-thumb.c:449:3: error: inline asm clobber list contains reserved registers: R7 [-Werror,-Winline-asm] Apparently gcc should also have warned about this, and the configuration is actually invalid, though there is some disagreement on the bug trackers about this. Link: https://bugs.llvm.org/show_bug.cgi?id=45826 Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94986 Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: 9143/1: add CONFIG_PHYS_OFFSET default valuesArnd Bergmann2021-10-251-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For platforms that are not yet converted to ARCH_MULTIPLATFORM, we can disable CONFIG_ARM_PATCH_PHYS_VIRT, which in turn requires setting a correct address here. As we actualy know what all the values are supposed to be based on the old mach/memory.h header file contents (from git history), we can just add them here. This also solves a problem in Kconfig where 'make randconfig' fails to continue if no number is selected for a 'hex' option. Users can still override the number at configuration time, e.g. when the memory visible to the kernel starts at a nonstandard address on some machine, but it should no longer be required now. I originally posted this back in 2016, but the problem still persists. The patch has gotten much simpler though, as almost all platforms rely on ARM_PATCH_PHYS_VIRT now. Link: https://lore.kernel.org/linux-arm-kernel/1455804123-2526139-5-git-send-email-arnd@arndb.de/ Acked-by: Nicolas Pitre <nico@fluxnic.net> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: 9142/1: kasan: work around LPAE build warningArnd Bergmann2021-10-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pgd_page_vaddr() returns an 'unsigned long' address, causing a warning with the memcpy() call in kasan_init(): arch/arm/mm/kasan_init.c: In function 'kasan_init': include/asm-generic/pgtable-nop4d.h:44:50: error: passing argument 2 of '__memcpy' makes pointer from integer without a cast [-Werror=int-conversion] 44 | #define pgd_page_vaddr(pgd) ((unsigned long)(p4d_pgtable((p4d_t){ pgd }))) | ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | long unsigned int arch/arm/include/asm/string.h:58:45: note: in definition of macro 'memcpy' 58 | #define memcpy(dst, src, len) __memcpy(dst, src, len) | ^~~ arch/arm/mm/kasan_init.c:229:16: note: in expansion of macro 'pgd_page_vaddr' 229 | pgd_page_vaddr(*pgd_offset_k(KASAN_SHADOW_START)), | ^~~~~~~~~~~~~~ arch/arm/include/asm/string.h:21:47: note: expected 'const void *' but argument is of type 'long unsigned int' 21 | extern void *__memcpy(void *dest, const void *src, __kernel_size_t n); | ~~~~~~~~~~~~^~~ Avoid this by adding an explicit typecast. Link: https://lore.kernel.org/all/CACRpkdb3DMvof3-xdtss0Pc6KM36pJA-iy=WhvtNVnsDpeJ24Q@mail.gmail.com/ Fixes: 5615f69bc209 ("ARM: 9016/2: Initialize the mapping of KASan shadow memory") Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: 9140/1: allow compile-testing without machine recordArnd Bergmann2021-10-252-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A lot of randconfig builds end up not selecting any machine type at all. This is generally fine for the purpose of compile testing, but of course it means that the kernel is not usable on actual hardware, and it causes a warning about this fact. As most of the build bots now force-enable CONFIG_COMPILE_TEST for randconfig builds, use that as a guard to control whether we warn on this type of broken configuration. We could do the same for the missing-cpu-type warning, but those configurations fail to build much earlier. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: 9137/1: disallow CONFIG_THUMB with ARMv4Arnd Bergmann2021-10-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can currently build a multi-cpu enabled kernel that allows both ARMv4 and ARMv5 CPUs, and also supports THUMB mode in user space. However, returning to user space in this configuration with the usr_ret macro requires the use of the 'bx' instruction, which is refused by the assembler: arch/arm/kernel/entry-armv.S: Assembler messages: arch/arm/kernel/entry-armv.S:937: Error: selected processor does not support `bx lr' in ARM mode arch/arm/kernel/entry-armv.S:960: Error: selected processor does not support `bx lr' in ARM mode arch/arm/kernel/entry-armv.S:1003: Error: selected processor does not support `bx lr' in ARM mode <instantiation>:2:2: note: instruction requires: armv4t bx lr While it would be possible to handle this correctly in principle, doing so seems to not be worth it, if we can simply avoid the problem by enforcing that a kernel supporting both ARMv4 and a later CPU architecture cannot run THUMB binaries. This turned up while build-testing with clang; for some reason, gcc never triggered the problem. Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: 9136/1: ARMv7-M uses BE-8, not BE-32Arnd Bergmann2021-10-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When configuring the kernel for big-endian, we set either BE-8 or BE-32 based on the CPU architecture level. Until linux-4.4, we did not have any ARMv7-M platform allowing big-endian builds, but now i.MX/Vybrid is in that category, adn we get a build error because of this: arch/arm/kernel/module-plts.c: In function 'get_module_plt': arch/arm/kernel/module-plts.c:60:46: error: implicit declaration of function '__opcode_to_mem_thumb32' [-Werror=implicit-function-declaration] This comes down to picking the wrong default, ARMv7-M uses BE8 like ARMv7-A does. Changing the default gets the kernel to compile and presumably works. https://lore.kernel.org/all/1455804123-2526139-2-git-send-email-arnd@arndb.de/ Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: 9135/1: kprobes: address gcc -Wempty-body warningArnd Bergmann2021-10-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Building with 'make W=1' shows a warning in some configurations when 'verbose()' is defined to be empty. arch/arm/probes/kprobes/test-core.c: In function 'kprobes_test_case_start': arch/arm/probes/kprobes/test-core.c:1367:26: error: suggest braces around empty body in an 'else' statement [-Werror=empty-body] 1367 | current_instruction); | ^ Change the definition of verbose() to use no_printk(), allowing format string checking and avoiding the warning. Link: https://lore.kernel.org/all/20210322114600.3528031-1-arnd@kernel.org/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: 9101/1: sa1100/assabet: convert LEDs to gpiod APIsLinus Walleij2021-10-251-11/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert the Assabet LEDs to use the gpiod APIs. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: 9131/1: mm: Fix PXN process with LPAE featureWang Kefeng2021-10-192-1/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When user code execution with privilege mode, it will lead to infinite loop in the page fault handler if ARM_LPAE enabled, The issue could be reproduced with "echo EXEC_USERSPACE > /sys/kernel/debug/provoke-crash/DIRECT" As Permission fault shows in ARM spec, IFSR format when using the Short-descriptor translation table format Permission fault: 01101 First level 01111 Second level IFSR format when using the Long-descriptor translation table format Permission fault: 0011LL LL bits indicate levelb. Add is_permission_fault() function to check permission fault and die if permission fault occurred under instruction fault in do_page_fault(). Fixes: 1d4d37159d01 ("ARM: 8235/1: Support for the PXN CPU feature on ARMv7") Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: 9130/1: mm: Provide die_kernel_fault() helperWang Kefeng2021-10-191-9/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide die_kernel_fault() helper to do the kernel fault reporting, which with msg argument, it could report different message in different scenes, and the later patch "ARM: mm: Fix PXN process with LPAE feature" will use it. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: 9126/1: mm: Kill page table base print in show_pte()Wang Kefeng2021-10-191-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now the show_pts() will dump the virtual (hashed) address of page table base, it is useless, kill it. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: 9127/1: mm: Cleanup access_error()Wang Kefeng2021-10-191-24/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now the write fault check in do_page_fault() and access_error() twice, we can cleanup access_error(), and make the fault check and vma flags set into do_page_fault() directly, then pass the vma flags to __do_page_fault. No functional change. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: 9129/1: mm: Kill task_struct argument for __do_page_fault()Wang Kefeng2021-10-191-7/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The __do_page_fault() won't use task_struct argument, kill it and also use current->mm directly in do_page_fault(). No functional change. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: 9128/1: mm: Refactor the __do_page_fault()Wang Kefeng2021-10-191-21/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clean up the multiple goto statements and drops local variable vm_fault_t fault, which will make the __do_page_fault() much more readability. No functional change. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: imx6: mark OCRAM mapping read-onlyRussell King (Oracle)2021-10-191-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iMX6 needs to write some code to OCRAM which resumes the DDR controller after suspend. However, merely using __arm_ioremap_exec() causes the kernel to complain of a W+X mapping. Solve this by using the newly introduced __arm_iomem_set_ro() function to prevent inadvertent or malicious writes to code we may later execute. Tested-by: Fabio Estevam <festevam@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: add __arm_iomem_set_ro() to write-protect ioremapped areaRussell King (Oracle)2021-10-192-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __arm_iomem_set_ro() marks an ioremapped area read-only. This is intended for use with __arm_ioremap_exec() to allow the kernel to write some code into e.g. SRAM and then write-protect it so the kernel doesn't complain about W+X mappings. Tested-by: Fabio Estevam <festevam@gmail.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: 9124/1: uncompress: Parse "linux,usable-memory-range" DT propertyGeert Uytterhoeven2021-10-191-6/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for parsing the "linux,usable-memory-range" DT property. This property is used to describe the usable memory reserved for the crash dump kernel, and thus makes the memory reservation explicit. If present, Linux no longer needs to mask the program counter, and rely on the "mem=" kernel parameter to obtain the start and size of usable memory. For backwards compatibility, the traditional method to derive the start of memory is still used if "linux,usable-memory-range" is absent. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: 9123/1: scoop: Drop if with an always false conditionUwe Kleine-König2021-10-191-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The remove callback is only called after probe completed successfully. In this case platform_set_drvdata() was called with a non-NULL argument and so !sdev is never true. The motivation for this change is to get rid of non-zero return values for remove callbacks as their only effect is to trigger a runtime warning. See commit e5e1c2097881 ("driver core: platform: Emit a warning if a remove callback returned non-zero") for further details. Link: https://lore.kernel.org/linux-arm-kernel/20210721205450.2173923-1-u.kleine-koenig@pengutronix.de Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | | * | ARM: 9125/1: fix incorrect use of get_kernel_nofault()Ard Biesheuvel2021-10-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 344179fc7ef4 ("ARM: 9106/1: traps: use get_kernel_nofault instead of set_fs()") replaced an occurrence of __get_user() with get_kernel_nofault(), but inverted the sense of the conditional in the process, resulting in no values to be printed at all. I.e., every exception stack now looks like this: Exception stack(0xc18d1fb0 to 0xc18d1ff8) 1fa0: ???????? ???????? ???????? ???????? 1fc0: ???????? ???????? ???????? ???????? ???????? ???????? ???????? ???????? 1fe0: ???????? ???????? ???????? ???????? ???????? ???????? which is rather unhelpful. Fixes: 344179fc7ef4 ("ARM: 9106/1: traps: use get_kernel_nofault instead of set_fs()") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | * | | ARM: 9151/1: Thumb2: avoid __builtin_thread_pointer() on ClangArd Biesheuvel2021-10-301-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If available, we use the __builtin_thread_pointer() helper to get the value of the TLS register, to help the compiler understand that it doesn't need to reload it every time we access 'current'. Unfortunately, Clang fails to emit the MRC system register read directly when building for Thumb2, and instead, it issues a call to the __aeabi_read_tp helper, which the kernel does not provide, and so this result in link failures at build time. So create a special case for this, and emit the MRC directly using an asm() block, just like we do when the helper is not available to begin with. Link: https://github.com/ClangBuiltLinux/linux/issues/1485 Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | * | | ARM: 9150/1: Fix PID_IN_CONTEXTIDR regression when THREAD_INFO_IN_TASK=yArd Biesheuvel2021-10-302-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code that implements the rarely used PID_IN_CONTEXTIDR feature dereferences the 'task' field of struct thread_info directly, and this is no longer possible when THREAD_INFO_IN_TASK=y, as the 'task' field is omitted from the struct definition in that case. Instead, we should just cast the thread_info pointer to a task_struct pointer, given that the former is now the first member of the latter. So use a helper that abstracts this, and provide implementations for both cases. Reported by: Arnd Bergmann <arnd@arndb.de> Fixes: 18ed1c01a7dd ("ARM: smp: Enable THREAD_INFO_IN_TASK") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
| | * | | ARM: smp: Enable THREAD_INFO_IN_TASKArd Biesheuvel2021-09-277-2/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we no longer rely on thread_info living at the base of the task stack to be able to access the 'current' pointer, we can wire up the generic support for moving thread_info into the task struct itself. Note that this requires us to update the cpu field in thread_info explicitly, now that the core code no longer does so. Ideally, we would switch the percpu code to access the cpu field in task_struct instead, but this unleashes #include circular dependency hell. Co-developed-by: Keith Packard <keithpac@amazon.com> Signed-off-by: Keith Packard <keithpac@amazon.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
| | * | | ARM: smp: Store current pointer in TPIDRURO register if availableArd Biesheuvel2021-09-2712-2/+105
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that the user space TLS register is assigned on every return to user space, we can use it to keep the 'current' pointer while running in the kernel. This removes the need to access it via thread_info, which is located at the base of the stack, but will be moved out of there in a subsequent patch. Use the __builtin_thread_pointer() helper when available - this will help GCC understand that reloading the value within the same function is not necessary, even when using the per-task stack protector (which also generates accesses via the TLS register). For example, the generated code below loads TPIDRURO only once, and uses it to access both the stack canary and the preempt_count fields. <do_one_initcall>: e92d 41f0 stmdb sp!, {r4, r5, r6, r7, r8, lr} ee1d 4f70 mrc 15, 0, r4, cr13, cr0, {3} 4606 mov r6, r0 b094 sub sp, #80 ; 0x50 f8d4 34e8 ldr.w r3, [r4, #1256] ; 0x4e8 <- stack canary 9313 str r3, [sp, #76] ; 0x4c f8d4 8004 ldr.w r8, [r4, #4] <- preempt count Co-developed-by: Keith Packard <keithpac@amazon.com> Signed-off-by: Keith Packard <keithpac@amazon.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
| | * | | ARM: smp: Free up the TLS register while running in the kernelArd Biesheuvel2021-09-272-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To prepare for a subsequent patch that stores the current task pointer in the user space TLS register while running in the kernel, modify the set_tls and switch_tls routines not to touch the register directly, and update the return to user space code to load the correct value. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
| | * | | ARM: smp: Pass task to secondary_start_kernelKeith Packard2021-09-274-5/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This avoids needing to compute the task pointer in this function, which will no longer be possible once we move thread_info off the stack. Signed-off-by: Keith Packard <keithpac@amazon.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
| | * | | gcc-plugins: arm-ssp: Prepare for THREAD_INFO_IN_TASK supportArd Biesheuvel2021-09-276-18/+2
| | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We will be enabling THREAD_INFO_IN_TASK support for ARM, which means that we can no longer load the stack canary value by masking the stack pointer and taking the copy that lives in thread_info. Instead, we will be able to load it from the task_struct directly, by using the TPIDRURO register which will hold the current task pointer when THREAD_INFO_IN_TASK is in effect. This is much more straight-forward, and allows us to declutter this code a bit while at it. Note that this means that ARMv6 (non-v6K) SMP systems can no longer use this feature, but those are quite rare to begin with, so this is a reasonable trade off. Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Tested-by: Amit Daniel Kachhap <amit.kachhap@arm.com>
* | | | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2021-11-02106-1489/+7680
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull KVM updates from Paolo Bonzini: "ARM: - More progress on the protected VM front, now with the full fixed feature set as well as the limitation of some hypercalls after initialisation. - Cleanup of the RAZ/WI sysreg handling, which was pointlessly complicated - Fixes for the vgic placement in the IPA space, together with a bunch of selftests - More memcg accounting of the memory allocated on behalf of a guest - Timer and vgic selftests - Workarounds for the Apple M1 broken vgic implementation - KConfig cleanups - New kvmarm.mode=none option, for those who really dislike us RISC-V: - New KVM port. x86: - New API to control TSC offset from userspace - TSC scaling for nested hypervisors on SVM - Switch masterclock protection from raw_spin_lock to seqcount - Clean up function prototypes in the page fault code and avoid repeated memslot lookups - Convey the exit reason to userspace on emulation failure - Configure time between NX page recovery iterations - Expose Predictive Store Forwarding Disable CPUID leaf - Allocate page tracking data structures lazily (if the i915 KVM-GT functionality is not compiled in) - Cleanups, fixes and optimizations for the shadow MMU code s390: - SIGP Fixes - initial preparations for lazy destroy of secure VMs - storage key improvements/fixes - Log the guest CPNC Starting from this release, KVM-PPC patches will come from Michael Ellerman's PPC tree" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits) RISC-V: KVM: fix boolreturn.cocci warnings RISC-V: KVM: remove unneeded semicolon RISC-V: KVM: Fix GPA passed to __kvm_riscv_hfence_gvma_xyz() functions RISC-V: KVM: Factor-out FP virtualization into separate sources KVM: s390: add debug statement for diag 318 CPNC data KVM: s390: pv: properly handle page flags for protected guests KVM: s390: Fix handle_sske page fault handling KVM: x86: SGX must obey the KVM_INTERNAL_ERROR_EMULATION protocol KVM: x86: On emulation failure, convey the exit reason, etc. to userspace KVM: x86: Get exit_reason as part of kvm_x86_ops.get_exit_info KVM: x86: Clarify the kvm_run.emulation_failure structure layout KVM: s390: Add a routine for setting userspace CPU state KVM: s390: Simplify SIGP Set Arch handling KVM: s390: pv: avoid stalls when making pages secure KVM: s390: pv: avoid stalls for kvm_s390_pv_init_vm KVM: s390: pv: avoid double free of sida page KVM: s390: pv: add macros for UVC CC values s390/mm: optimize reset_guest_reference_bit() s390/mm: optimize set_guest_storage_key() s390/mm: no need for pte_alloc_map_lock() if we know the pmd is present ...
| * | | | RISC-V: KVM: fix boolreturn.cocci warningsBixuan Cui2021-11-011-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix boolreturn.cocci warnings: ./arch/riscv/kvm/mmu.c:603:9-10: WARNING: return of 0/1 in function 'kvm_age_gfn' with return type bool ./arch/riscv/kvm/mmu.c:582:9-10: WARNING: return of 0/1 in function 'kvm_set_spte_gfn' with return type bool ./arch/riscv/kvm/mmu.c:621:9-10: WARNING: return of 0/1 in function 'kvm_test_age_gfn' with return type bool ./arch/riscv/kvm/mmu.c:568:9-10: WARNING: return of 0/1 in function 'kvm_unmap_gfn_range' with return type bool Signed-off-by: Bixuan Cui <cuibixuan@linux.alibaba.com> Signed-off-by: Anup Patel <anup.patel@wdc.com>
| * | | | RISC-V: KVM: remove unneeded semicolonran jianping2021-11-014-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Elimate the following coccinelle check warning: ./arch/riscv/kvm/vcpu_sbi.c:169:2-3: Unneeded semicolon ./arch/riscv/kvm/vcpu_exit.c:397:2-3: Unneeded semicolon ./arch/riscv/kvm/vcpu_exit.c:687:2-3: Unneeded semicolon ./arch/riscv/kvm/vcpu_exit.c:645:2-3: Unneeded semicolon ./arch/riscv/kvm/vcpu.c:247:2-3: Unneeded semicolon ./arch/riscv/kvm/vcpu.c:284:2-3: Unneeded semicolon ./arch/riscv/kvm/vcpu_timer.c:123:2-3: Unneeded semicolon ./arch/riscv/kvm/vcpu_timer.c:170:2-3: Unneeded semicolon Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: ran jianping <ran.jianping@zte.com.cn> Signed-off-by: Anup Patel <anup.patel@wdc.com>
| * | | | Merge tag 'kvm-s390-next-5.16-1' of ↵Paolo Bonzini2021-10-3112-77/+200
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fixes and Features for 5.16 - SIGP Fixes - initial preparations for lazy destroy of secure VMs - storage key improvements/fixes - Log the guest CPNC
| | * | | | KVM: s390: add debug statement for diag 318 CPNC dataCollin Walling2021-10-271-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The diag 318 data contains values that denote information regarding the guest's environment. Currently, it is unecessarily difficult to observe this value (either manually-inserted debug statements, gdb stepping, mem dumping etc). It's useful to observe this information to obtain an at-a-glance view of the guest's environment, so lets add a simple VCPU event that prints the CPNC to the s390dbf logs. Signed-off-by: Collin Walling <walling@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Link: https://lore.kernel.org/r/20211027025451.290124-1-walling@linux.ibm.com [borntraeger@de.ibm.com]: change debug level to 3 Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | | | KVM: s390: pv: properly handle page flags for protected guestsClaudio Imbrenda2021-10-274-7/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce variants of the convert and destroy page functions that also clear the PG_arch_1 bit used to mark them as secure pages. The PG_arch_1 flag is always allowed to overindicate; using the new functions introduced here allows to reduce the extent of overindication and thus improve performance. These new functions can only be called on pages for which a reference is already being held. Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Link: https://lore.kernel.org/r/20210920132502.36111-7-imbrenda@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | | | KVM: s390: Fix handle_sske page fault handlingJanis Schoetterl-Glausch2021-10-271-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If handle_sske cannot set the storage key, because there is no page table entry or no present large page entry, it calls fixup_user_fault. However, currently, if the call succeeds, handle_sske returns -EAGAIN, without having set the storage key. Instead, retry by continue'ing the loop without incrementing the address. The same issue in handle_pfmf was fixed by a11bdb1a6b78 ("KVM: s390: Fix pfmf and conditional skey emulation"). Fixes: bd096f644319 ("KVM: s390: Add skey emulation fault handling") Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Link: https://lore.kernel.org/r/20211022152648.26536-1-scgl@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | | | KVM: s390: Add a routine for setting userspace CPU stateEric Farman2021-10-252-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This capability exists, but we don't record anything when userspace enables it. Let's refactor that code so that a note can be made in the debug logs that it was enabled. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20211008203112.1979843-7-farman@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | | | KVM: s390: Simplify SIGP Set Arch handlingEric Farman2021-10-251-13/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Principles of Operations describe the various reasons that each individual SIGP orders might be rejected, and the status bit that are set for each condition. For example, for the Set Architecture order, it states: "If it is not true that all other CPUs in the configu- ration are in the stopped or check-stop state, ... bit 54 (incorrect state) ... is set to one." However, it also states: "... if the CZAM facility is installed, ... bit 55 (invalid parameter) ... is set to one." Since the Configuration-z/Architecture-Architectural Mode (CZAM) facility is unconditionally presented, there is no need to examine each VCPU to determine if it is started/stopped. It can simply be rejected outright with the Invalid Parameter bit. Fixes: b697e435aeee ("KVM: s390: Support Configuration z/Architecture Mode") Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Link: https://lore.kernel.org/r/20211008203112.1979843-2-farman@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | | | KVM: s390: pv: avoid stalls when making pages secureClaudio Imbrenda2021-10-252-6/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Improve make_secure_pte to avoid stalls when the system is heavily overcommitted. This was especially problematic in kvm_s390_pv_unpack, because of the loop over all pages that needed unpacking. Due to the locks being held, it was not possible to simply replace uv_call with uv_call_sched. A more complex approach was needed, in which uv_call is replaced with __uv_call, which does not loop. When the UVC needs to be executed again, -EAGAIN is returned, and the caller (or its caller) will try again. When -EAGAIN is returned, the path is the same as when the page is in writeback (and the writeback check is also performed, which is harmless). Fixes: 214d9bbcd3a672 ("s390/mm: provide memory management functions for protected KVM guests") Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Link: https://lore.kernel.org/r/20210920132502.36111-5-imbrenda@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | | | KVM: s390: pv: avoid stalls for kvm_s390_pv_init_vmClaudio Imbrenda2021-10-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the system is heavily overcommitted, kvm_s390_pv_init_vm might generate stall notifications. Fix this by using uv_call_sched instead of just uv_call. This is ok because we are not holding spinlocks. Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Fixes: 214d9bbcd3a672 ("s390/mm: provide memory management functions for protected KVM guests") Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Message-Id: <20210920132502.36111-4-imbrenda@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | | | KVM: s390: pv: avoid double free of sida pageClaudio Imbrenda2021-10-251-10/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If kvm_s390_pv_destroy_cpu is called more than once, we risk calling free_page on a random page, since the sidad field is aliased with the gbea, which is not guaranteed to be zero. This can happen, for example, if userspace calls the KVM_PV_DISABLE IOCTL, and it fails, and then userspace calls the same IOCTL again. This scenario is only possible if KVM has some serious bug or if the hardware is broken. The solution is to simply return successfully immediately if the vCPU was already non secure. Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Fixes: 19e1227768863a1469797c13ef8fea1af7beac2c ("KVM: S390: protvirt: Introduce instruction data area bounce buffer") Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Message-Id: <20210920132502.36111-3-imbrenda@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | | | KVM: s390: pv: add macros for UVC CC valuesClaudio Imbrenda2021-10-251-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add macros to describe the 4 possible CC values returned by the UVC instruction. Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Message-Id: <20210920132502.36111-2-imbrenda@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | | | s390/mm: optimize reset_guest_reference_bit()David Hildenbrand2021-10-251-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We already optimize get_guest_storage_key() to assume that if we don't have a PTE table and don't have a huge page mapped that the storage key is 0. Similarly, optimize reset_guest_reference_bit() to simply do nothing if there is no PTE table and no huge page mapped. Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20210909162248.14969-10-david@redhat.com Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | | | s390/mm: optimize set_guest_storage_key()David Hildenbrand2021-10-251-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We already optimize get_guest_storage_key() to assume that if we don't have a PTE table and don't have a huge page mapped that the storage key is 0. Similarly, optimize set_guest_storage_key() to simply do nothing in case the key to set is 0. Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20210909162248.14969-9-david@redhat.com Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
| | * | | | s390/mm: no need for pte_alloc_map_lock() if we know the pmd is presentDavid Hildenbrand2021-10-251-12/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pte_map_lock() is sufficient. Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/20210909162248.14969-8-david@redhat.com Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>