summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* vio: make remove callback return voidUwe Kleine-König2021-03-0213-37/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The driver core ignores the return value of struct bus_type::remove() because there is only little that can be done. To simplify the quest to make this function return void, let struct vio_driver::remove() return void, too. All users already unconditionally return 0, this commit makes it obvious that returning an error code is a bad idea. Note there are two nominally different implementations for a vio bus: one in arch/sparc/kernel/vio.c and the other in arch/powerpc/platforms/pseries/vio.c. This patch only adapts the powerpc one. Before this patch for a device that was bound to a driver without a remove callback vio_cmo_bus_remove(viodev) wasn't called. As the device core still considers the device unbound after vio_bus_remove() returns calling this unconditionally is the consistent behaviour which is implemented here. Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Acked-by: Lijun Pan <ljp@linux.ibm.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [mpe: Drop unneeded hvcs_remove() forward declaration, squash in change from sfr to drop ibmvnic_remove() forward declaration] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210225221834.160083-1-uwe@kleine-koenig.org
* powerpc/syscall: Force inlining of __prep_irq_for_enabled_exit()Christophe Leroy2021-03-011-1/+1
| | | | | | | | | | | | | | | | | | | As reported by kernel test robot, a randconfig with high amount of debuging options can lead to build failure for undefined reference to replay_soft_interrupts() on ppc32. This is due to gcc not seeing that __prep_irq_for_enabled_exit() always returns true on ppc32 because it doesn't inline it for some reason. Force inlining of __prep_irq_for_enabled_exit() to fix the build. Fixes: 344bb20b159d ("powerpc/syscall: Make interrupt.c buildable on PPC32") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/53f3a1f719441761000c41154602bf097d4350b5.1614148356.git.christophe.leroy@csgroup.eu
* powerpc/603: Fix protection of user pages mapped with PROT_NONEChristophe Leroy2021-03-011-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On book3s/32, page protection is defined by the PP bits in the PTE which provide the following protection depending on the access keys defined in the matching segment register: - PP 00 means RW with key 0 and N/A with key 1. - PP 01 means RW with key 0 and RO with key 1. - PP 10 means RW with both key 0 and key 1. - PP 11 means RO with both key 0 and key 1. Since the implementation of kernel userspace access protection, PP bits have been set as follows: - PP00 for pages without _PAGE_USER - PP01 for pages with _PAGE_USER and _PAGE_RW - PP11 for pages with _PAGE_USER and without _PAGE_RW For kernelspace segments, kernel accesses are performed with key 0 and user accesses are performed with key 1. As PP00 is used for non _PAGE_USER pages, user can't access kernel pages not flagged _PAGE_USER while kernel can. For userspace segments, both kernel and user accesses are performed with key 0, therefore pages not flagged _PAGE_USER are still accessible to the user. This shouldn't be an issue, because userspace is expected to be accessible to the user. But unlike most other architectures, powerpc implements PROT_NONE protection by removing _PAGE_USER flag instead of flagging the page as not valid. This means that pages in userspace that are not flagged _PAGE_USER shall remain inaccessible. To get the expected behaviour, just mimic other architectures in the TLB miss handler by checking _PAGE_USER permission on userspace accesses as if it was the _PAGE_PRESENT bit. Note that this problem only is only for 603 cores. The 604+ have an hash table, and hash_page() function already implement the verification of _PAGE_USER permission on userspace pages. Fixes: f342adca3afc ("powerpc/32s: Prepare Kernel Userspace Access Protection") Cc: stable@vger.kernel.org # v5.2+ Reported-by: Christoph Plattner <christoph.plattner@thalesgroup.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4a0c6e3bb8f0c162457bf54d9bc6fd8d7b55129f.1612160907.git.christophe.leroy@csgroup.eu
* powerpc/pseries: Don't enforce MSI affinity with kdumpGreg Kurz2021-03-011-2/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Depending on the number of online CPUs in the original kernel, it is likely for CPU #0 to be offline in a kdump kernel. The associated IRQs in the affinity mappings provided by irq_create_affinity_masks() are thus not started by irq_startup(), as per-design with managed IRQs. This can be a problem with multi-queue block devices driven by blk-mq : such a non-started IRQ is very likely paired with the single queue enforced by blk-mq during kdump (see blk_mq_alloc_tag_set()). This causes the device to remain silent and likely hangs the guest at some point. This is a regression caused by commit 9ea69a55b3b9 ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()"). Note that this only happens with the XIVE interrupt controller because XICS has a workaround to bypass affinity, which is activated during kdump with the "noirqdistrib" kernel parameter. The issue comes from a combination of factors: - discrepancy between the number of queues detected by the multi-queue block driver, that was used to create the MSI vectors, and the single queue mode enforced later on by blk-mq because of kdump (i.e. keeping all queues fixes the issue) - CPU#0 offline (i.e. kdump always succeed with CPU#0) Given that I couldn't reproduce on x86, which seems to always have CPU#0 online even during kdump, I'm not sure where this should be fixed. Hence going for another approach : fine-grained affinity is for performance and we don't really care about that during kdump. Simply revert to the previous working behavior of ignoring affinity masks in this case only. Fixes: 9ea69a55b3b9 ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()") Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210215094506.1196119-1-groug@kaod.org
* powerpc/4xx: Fix build errors from mfdcr()Michael Ellerman2021-03-011-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | lkp reported a build error in fsp2.o: CC arch/powerpc/platforms/44x/fsp2.o {standard input}:577: Error: unsupported relocation against base Which comes from: pr_err("GESR0: 0x%08x\n", mfdcr(base + PLB4OPB_GESR0)); Where our mfdcr() macro is stringifying "base + PLB4OPB_GESR0", and passing that to the assembler, which obviously doesn't work. The mfdcr() macro already checks that the argument is constant using __builtin_constant_p(), and if not calls the out-of-line version of mfdcr(). But in this case GCC is smart enough to notice that "base + PLB4OPB_GESR0" will be constant, even though it's not something we can immediately stringify into a register number. Segher pointed out that passing the register number to the inline asm as a constant would be better, and in fact it fixes the build error, presumably because it gives GCC a chance to resolve the value. While we're at it, change mtdcr() similarly. Reported-by: kernel test robot <lkp@intel.com> Suggested-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Feng Tang <feng.tang@intel.com> Link: https://lore.kernel.org/r/20210218123058.748882-1-mpe@ellerman.id.au
* Linux 5.12-rc1v5.12-rc1-dontusev5.12-rc1Linus Torvalds2021-03-011-3/+3
|
* Merge tag 'ide-5.11-2021-02-28' of git://git.kernel.dk/linux-blockLinus Torvalds2021-03-011-1/+2
|\ | | | | | | | | | | | | | | Pull ide fix from Jens Axboe: "This is a leftover fix from 5.11, where I forgot to ship it your way" * tag 'ide-5.11-2021-02-28' of git://git.kernel.dk/linux-block: ide/falconide: Fix module unload
| * ide/falconide: Fix module unloadFinn Thain2021-01-031-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unloading the falconide module results in a crash: Unable to handle kernel NULL pointer dereference at virtual address 00000000 Oops: 00000000 Modules linked in: falconide(-) PC: [<002930b2>] ide_host_remove+0x2e/0x1d2 SR: 2000 SP: 00b49e28 a2: 009b0f90 d0: 00000000 d1: 009b0f90 d2: 00000000 d3: 00b48000 d4: 003cef32 d5: 00299188 a0: 0086d000 a1: 0086d000 Process rmmod (pid: 322, task=009b0f90) Frame format=7 eff addr=00000000 ssw=0505 faddr=00000000 wb 1 stat/addr/data: 0000 00000000 00000000 wb 2 stat/addr/data: 0000 00000000 00000000 wb 3 stat/addr/data: 0000 00000000 00018da9 push data: 00000000 00000000 00000000 00000000 Stack from 00b49e90: 004c456a 0027f176 0027cb0a 0027cb9e 00000000 0086d00a 2187d3f0 0027f0e0 00b49ebc 2187d1f6 00000000 00b49ec8 002811e8 0086d000 00b49ef0 0028024c 0086d00a 002800d6 00279a1a 00000001 00000001 0086d00a 2187d3f0 00279a58 00b49f1c 002802e0 0086d00a 2187d3f0 004c456a 0086d00a ef96af74 00000000 2187d3f0 002805d2 800de064 00b49f44 0027f088 2187d3f0 00ac1cf4 2187d3f0 004c43be 2187d3f0 00000000 2187d3f0 800b66a8 00b49f5c 00280776 2187d3f0 Call Trace: [<0027f176>] __device_driver_unlock+0x0/0x48 [<0027cb0a>] device_links_busy+0x0/0x94 [<0027cb9e>] device_links_unbind_consumers+0x0/0x130 [<0027f0e0>] __device_driver_lock+0x0/0x5a [<2187d1f6>] falconide_remove+0x12/0x18 [falconide] [<002811e8>] platform_drv_remove+0x1c/0x28 [<0028024c>] device_release_driver_internal+0x176/0x17c [<002800d6>] device_release_driver_internal+0x0/0x17c [<00279a1a>] get_device+0x0/0x22 [<00279a58>] put_device+0x0/0x18 [<002802e0>] driver_detach+0x56/0x82 [<002805d2>] driver_remove_file+0x0/0x24 [<0027f088>] bus_remove_driver+0x4c/0xa4 [<00280776>] driver_unregister+0x28/0x5a [<00281a00>] platform_driver_unregister+0x12/0x18 [<2187d2a0>] ide_falcon_driver_exit+0x10/0x16 [falconide] [<000764f0>] sys_delete_module+0x110/0x1f2 [<000e83ea>] sys_rename+0x1a/0x1e [<00002e0c>] syscall+0x8/0xc [<00188004>] ext4_multi_mount_protect+0x35a/0x3ce Code: 0029 9188 4bf9 0027 aa1c 283c 003c ef32 <265c> 4a8b 6700 00b8 2043 2028 000c 0280 00ff ff00 6600 0176 40c0 7202 b2b9 004c Disabling lock debugging due to kernel taint This happens because the driver_data pointer is uninitialized. Add the missing platform_set_drvdata() call. For clarity, use the matching platform_get_drvdata() as well. Cc: Michael Schmitz <schmitzmic@gmail.com> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Fixes: 5ed0794cde593 ("m68k/atari: Convert Falcon IDE drivers to platform drivers") Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | Merge tag 'kbuild-fixes-v5.12' of ↵Linus Torvalds2021-02-286-21/+31
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix UNUSED_KSYMS_WHITELIST for Clang LTO - Make -s builds really silent irrespective of V= option - Fix build error when SUBLEVEL or PATCHLEVEL is empty * tag 'kbuild-fixes-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: Fix <linux/version.h> for empty SUBLEVEL or PATCHLEVEL again kbuild: make -s option take precedence over V=1 ia64: remove redundant READELF from arch/ia64/Makefile kbuild: do not include include/config/auto.conf from adjust_autoksyms.sh kbuild: fix UNUSED_KSYMS_WHITELIST for Clang LTO kbuild: lto: add _mcount to list of used symbols
| * | kbuild: Fix <linux/version.h> for empty SUBLEVEL or PATCHLEVEL againMasahiro Yamada2021-02-281-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 78d3bb4483ba ("kbuild: Fix <linux/version.h> for empty SUBLEVEL or PATCHLEVEL") fixed the build error for empty SUBLEVEL or PATCHLEVEL by prepending a zero. Commit 9b82f13e7ef3 ("kbuild: clamp SUBLEVEL to 255") re-introduced this issue. This time, we cannot take the same approach because we have C code: #define LINUX_VERSION_PATCHLEVEL $(PATCHLEVEL) #define LINUX_VERSION_SUBLEVEL $(SUBLEVEL) Replace empty SUBLEVEL/PATCHLEVEL with a zero. Fixes: 9b82f13e7ef3 ("kbuild: clamp SUBLEVEL to 255") Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-and-tested-by: Sasha Levin <sashal@kernel.org>
| * | kbuild: make -s option take precedence over V=1Masahiro Yamada2021-02-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'make -s' should be really silent. However, 'make -s V=1' prints noisy log messages from some shell scripts. Of course, such a combination is odd, but the build system needs to do the right thing even if a user gives strange input. If -s is given, KBUILD_VERBOSE should be forced to 0. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
| * | ia64: remove redundant READELF from arch/ia64/MakefileMasahiro Yamada2021-02-281-1/+0
| | | | | | | | | | | | | | | | | | READELF is defined by the top Makefile. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
| * | kbuild: do not include include/config/auto.conf from adjust_autoksyms.shMasahiro Yamada2021-02-281-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit cd195bc4775a ("kbuild: split adjust_autoksyms.sh in two parts") split out the code that needs include/config/auto.conf. This script no longer needs to include include/config/auto.conf. Fixes: cd195bc4775a ("kbuild: split adjust_autoksyms.sh in two parts") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
| * | kbuild: fix UNUSED_KSYMS_WHITELIST for Clang LTOMasahiro Yamada2021-02-283-16/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit fbe078d397b4 ("kbuild: lto: add a default list of used symbols") does not work as expected if the .config file has already specified CONFIG_UNUSED_KSYMS_WHITELIST="my/own/white/list" before enabling CONFIG_LTO_CLANG. So, the user-supplied whitelist and LTO-specific white list must be independent of each other. I refactored the shell script so CONFIG_MODVERSIONS and CONFIG_CLANG_LTO handle whitelists in the same way. Fixes: fbe078d397b4 ("kbuild: lto: add a default list of used symbols") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
| * | kbuild: lto: add _mcount to list of used symbolsArnd Bergmann2021-02-271-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some randconfig builds fail with undefined references to _mcount when CONFIG_TRIM_UNUSED_KSYMS is set: ERROR: modpost: "_mcount" [drivers/tee/optee/optee.ko] undefined! ERROR: modpost: "_mcount" [drivers/fsi/fsi-occ.ko] undefined! ERROR: modpost: "_mcount" [drivers/fpga/dfl-pci.ko] undefined! Since there is already a list of symbols that get generated at link time, add this one as well. Fixes: fbe078d397b4 ("kbuild: lto: add a default list of used symbols") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
* | | Merge tag 'csky-for-linus-5.12-rc1' of git://github.com/c-sky/csky-linuxLinus Torvalds2021-02-2894-992/+1347
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull arch/csky updates from Guo Ren: "Features: - add new memory layout 2.5G(user):1.5G(kernel) - add kmemleak support - reconstruct VDSO framework: add VDSO with GENERIC_GETTIMEOFDAY, GENERIC_TIME_VSYSCALL, HAVE_GENERIC_VDSO - add faulthandler_disabled() check - support (fix) swapon - add (fix) _PAGE_ACCESSED for default pgprot - abort uaccess retries upon fatal signal (from arm) Fixes and optimizations: - fix perf probe failure - fix show_regs doesn't contain regs->usp - remove custom asm/atomic.h implementation - fix barrier design - fix futex SMP implementation - fix asm/cmpxchg.h with correct ordering barrier - cleanup asm/spinlock.h - fix PTE global for 2.5:1.5 virtual memory - remove prologue of page fault handler in entry.S - fix TLB maintenance synchronization problem - add show_tlb for CPU_CK860 debug - fix FAULT_FLAG_XXX param for handle_mm_fault - fix update_mmu_cache called with user io mapping - fix do_page_fault parent irq status - fix a size determination in gpr_get() - pgtable.h: Coding convention - kprobe: Fix code in simulate without 'long' - fix pfn_valid error with wrong max_mapnr - use free_initmem_default() in free_initmem() - fix compile error" * tag 'csky-for-linus-5.12-rc1' of git://github.com/c-sky/csky-linux: (30 commits) csky: Fixup compile error csky: use free_initmem_default() in free_initmem() csky: Fixup pfn_valid error with wrong max_mapnr csky: Add VDSO with GENERIC_GETTIMEOFDAY, GENERIC_TIME_VSYSCALL, HAVE_GENERIC_VDSO csky: kprobe: Fixup code in simulate without 'long' csky: Fixup swapon csky: pgtable.h: Coding convention csky: Fixup _PAGE_ACCESSED for default pgprot csky: remove unused including <linux/version.h> csky: Fix a size determination in gpr_get() csky: Reconstruct VDSO framework csky: mm: abort uaccess retries upon fatal signal csky: Sync riscv mm/fault.c for easy maintenance csky: Fixup do_page_fault parent irq status csky: Add faulthandler_disabled() check csky: Fixup update_mmu_cache called with user io mapping csky: Fixup FAULT_FLAG_XXX param for handle_mm_fault csky: Add show_tlb for CPU_CK860 debug csky: Fix TLB maintenance synchronization problem csky: Add kmemleak support ...
| * | | csky: Fixup compile errorGuo Ren2021-02-2752-52/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | : error: C++ style comments are not allowed in ISO C90 // Copyright (C) 2018 Hangzhou C-SKY Microsystems co.,ltd. ^ error: (this will be reported only once per input file) Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
| * | | csky: use free_initmem_default() in free_initmem()David Hildenbrand2021-02-271-16/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing code is essentially free_initmem_default()->free_reserved_area() without poisoning. Note that existing code missed to update the managed page count of the zone. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@kernel.org> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Tested-by: Guo Ren <guoren@kernel.org> Signed-off-by: Guo Ren <guoren@kernel.org> Signed-off-by: David Hildenbrand <david@redhat.com>
| * | | csky: Fixup pfn_valid error with wrong max_mapnrGuo Ren2021-02-271-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The max_mapnr is the number of PFNs, not absolute PFN offset. Using set_max_mapnr API instead of setting the value directly. Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
| * | | csky: Add VDSO with GENERIC_GETTIMEOFDAY, GENERIC_TIME_VSYSCALL, ↵Guo Ren2021-02-2714-3/+225
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | HAVE_GENERIC_VDSO It could help to reduce the latency of the time-related functions in user space. We have referenced arm's and riscv's implementation for the patch. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Cc: Vincent Chen <vincent.chen@sifive.com> Cc: Arnd Bergmann <arnd@arndb.de>
| * | | csky: kprobe: Fixup code in simulate without 'long'Guo Ren2021-02-271-15/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The type of 'val' is 'unsigned long' in simulate_blz32, so 'val < 0' can't be true. Cast 'val' to 'long' here to determine branch token or not, Fixup instructions: bnezad32, bhsz32, bhz32, blsz32, blz32 Link: https://lore.kernel.org/linux-csky/CAJF2gTQjKXR9gpo06WAWG1aquiT87mATiMGorXs6ChxOxoe90Q@mail.gmail.com/T/#t Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Co-developed-by: Menglong Dong <dong.menglong@zte.com.cn> Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
| * | | csky: Fixup swaponGuo Ren2021-02-273-9/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current csky's swappon is broken by wrong swap PTE entry format. Now redesign the new format for abiv1 & abiv2 and make swappon + zram work properly on csky machines. C-SKY PTE has VALID, DIRTY to emulate PRESENT, READ, WRITE, EXEC attributes. GLOBAL bit is shared by two pages in the same tlb entry. So we need to keep GLOBAL, VALID, PRESENT zero in swp_pte. To distinguish PAGE_NONE and swp_pte, we need to use an additional bit (abiv1 is _PAGE_READ, abiv2 is _PAGE_WRITE). Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Cc: Arnd Bergmann <arnd@arndb.de>
| * | | csky: pgtable.h: Coding conventionGuo Ren2021-02-273-55/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | C-SKY page table attributes only have 'Dirty' and 'Valid' to emulate 'PRESENT, READ, WRITE, EXEC, DIRTY, ACCESSED'. This patch cleanup unnecessary definition. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Cc: Arnd Bergmann <arnd@arndb.de>
| * | | csky: Fixup _PAGE_ACCESSED for default pgprotGuo Ren2021-01-121-7/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the system memory is exhausted, linux will trigger kswapd to shrink memory page cache. We found the csky's .text file mapping pages would be reclaimed earlier than arm's elf. Because csky doesn't give _PAGE_ACCESSED for default pgprot and in zap_pte_range if (pte_young(ptent) && likely(!(vma->vm_flags & VM_SEQ_READ))) mark_page_accessed(page); mark_page_accessed will put the pages into active lru list. [ 3.652722] delete busybox page from inactive file list Call Trace: [<9012a376>] dump_stack+0xe/0x24 [<9012a370>] dump_stack+0x8/0x24 [<9005b780>] activate_page+0x2b4/0x2d4 [<90132502>] vsnprintf+0x2c6/0x374 [<9005b880>] mark_page_accessed+0xe0/0x150 [<9006903e>] unmap_page_range+0x166/0x33c [<90021844>] get_signal+0x98/0x3b4 [<90069232>] unmap_single_vma+0x1e/0x24 [<90069462>] unmap_vmas+0x26/0x40 [<9006d3d8>] exit_mmap+0x60/0xbc [<9006a140>] handle_mm_fault+0x700/0xcec [<900426b2>] ktime_get_with_offset+0x86/0x130 [<90017566>] mmput+0x2e/0x90 [<9001a30a>] do_exit+0x13e/0x6f0 [<90015448>] page_fault_end+0x14/0x74 [<9001b4bc>] SyS_exit_group+0x0/0xc [<9001b47c>] do_group_exit+0x2c/0x6c [<9001b4c8>] __wake_up_parent+0x0/0x20 [<9001399e>] csky_systemcall+0x6e/0x72 csky will throw the pages at first and keep them in active lru list later after real accessed, but arm would keep them in active lru list at the beginning. The following are statistics of different architecture styles: Default _PAGE_ACCESSED: alpha, arm, arm64, ia64, m68k, microblaze, openrisc, powerpc, riscv, sh, um, x86, xtensa Not def _PAGE_ACCESSED: arc, c6x, h8300, hexgon, mips, s390, nds32, nios2, parisc, sparc Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Co-developed-by: Xu Kai <xukai@nationalchip.com> Signed-off-by: Xu Kai <xukai@nationalchip.com>
| * | | csky: remove unused including <linux/version.h>Tian Tao2021-01-121-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove including <linux/version.h> that don't need it. Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
| * | | csky: Fix a size determination in gpr_get()Zhenzhong Duan2021-01-121-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "*" is missed in size determination as we are passing register set rather than a pointer. Fixes: dcad7854fcce ("sky: switch to ->regset_get()") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@gmail.com> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
| * | | csky: Reconstruct VDSO frameworkGuo Ren2021-01-1213-89/+269
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reconstruct vdso framework to support future vsyscall, vgettimeofday features. These are very important features to reduce system calls into the kernel for performance improvement. The patch is reference RISC-V's Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Cc: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | csky: mm: abort uaccess retries upon fatal signalGuo Ren2021-01-121-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pick up the patch from the 'Link' made by Mark Rutland. Keep the same with x86, arm, arm64, arc, sh, power. Link: https://lore.kernel.org/linux-arm-kernel/1499782763-31418-1-git-send-email-mark.rutland@arm.com/ Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Cc: Mark Rutland <mark.rutland@arm.com>
| * | | csky: Sync riscv mm/fault.c for easy maintenanceGuo Ren2021-01-122-155/+189
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Sync arch/riscv/mm/fault.c into arch/csky for easy maintenance. Here are the patches related to the modification: cac4d1d "riscv/mm/fault: Move no context handling to no_context()" ac416a7 "riscv/mm/fault: Move vmalloc fault handling to vmalloc_fault()" 6c11ffb "riscv/mm/fault: Move fault error handling to mm_fault_error()" afb8c6f "riscv/mm/fault: Move access error check to function" bda281d "riscv/mm/fault: Simplify fault error handling" a51271d "riscv/mm/fault: Move bad area handling to bad_area()" Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Palmer Dabbelt <palmerdabbelt@google.com> Cc: Arnd Bergmann <arnd@arndb.de>
| * | | csky: Fixup do_page_fault parent irq statusGuo Ren2021-01-122-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | We must succeed parent's context irq status in page fault handler. Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
| * | | csky: Add faulthandler_disabled() checkGuo Ren2021-01-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to other architectures: In addition to in_atomic, we also need pagefault_disabled() to check. Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
| * | | csky: Fixup update_mmu_cache called with user io mappingGuo Ren2021-01-121-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function update_mmu_cache could be called by user-io mapping. There is no space of struct page in mem_map for the pte. Just ignore the user-io mmaping in update_mmu_cache. Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
| * | | csky: Fixup FAULT_FLAG_XXX param for handle_mm_faultGuo Ren2021-01-121-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The past code only passes the FAULT_FLAG_WRITE into handle_mm_fault and missing USER & DEFAULT & RETRY. The patch references to arch/riscv/mm/fault.c, but there is no FAULT_FLAG_INSTRUCTION in csky hw. Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
| * | | csky: Add show_tlb for CPU_CK860 debugGuo Ren2021-01-121-0/+121
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Print all 1024 jtlb entries and 16 iutlb entries and 16 dutlb entries in show_regs. Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
| * | | csky: Fix TLB maintenance synchronization problemGuo Ren2021-01-125-16/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TLB invalidate didn't contain a barrier operation in csky cpu and we need to prevent previous PTW response after TLB invalidation instruction. Of cause, the ASID changing also needs to take care of the issue. CPU0 CPU1 =============== =============== set_pte sync_is() -> See the previous set_pte for all harts tlbi.vas -> Invalidate all harts TLB entry & flush pipeline Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
| * | | csky: Add kmemleak supportGuo Ren2021-01-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here is the log after enabled: [ 1.798972] kmemleak: Kernel memory leak detector initialized (mem pool available: 15851) [ 1.798983] kmemleak: Automatic memory scanning thread started Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
| * | | csky: Remove prologue of page fault handler in entry.SGuo Ren2021-01-124-131/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a prologue on page fault handler which marking pages dirty and/or accessed in page attributes, but all of these have been handled in handle_pte_fault. - Add flush_tlb_one in vmalloc page fault instead of prologue. - Using cmxchg_fixup C codes in do_page_fault instead of ASM one. Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
| * | | csky: Fixup PTE global for 2.5:1.5 virtual memoryGuo Ren2021-01-122-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixup commit c2d1adfa9a24 "csky: Add memory layout 2.5G(user):1.5G (kernel)". That patch broke the global bit in PTE. C-SKY TLB's entry contain two pages: vpn, vpn + 1 -> ppn0, ppn1 All PPN's attributes contain global bit and final global is PPN0.G & PPN1.G. So we must keep PPN0.G and PPN1.G same in one TLB's entry. Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
| * | | csky: Cleanup asm/spinlock.hGuo Ren2021-01-123-178/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two implementation of spinlock in arch/csky: - simple one (NR_CPU = 1,2) - tick's one (NR_CPU = 3,4) Remove the simple one. There is already smp_mb in spinlock, so remove the definition of smp_mb__after_spinlock. Link: https://lore.kernel.org/linux-csky/20200807081253.GD2674@hirez.programming.kicks-ass.net/#t Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Cc: Peter Zijlstra <peterz@infradead.org>k Cc: Arnd Bergmann <arnd@arndb.de>
| * | | csky: Fixup asm/cmpxchg.h with correct ordering barrierGuo Ren2021-01-121-10/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Optimize the performance of cmpxchg by using more fine-grained acquire/release barriers. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Paul E. McKenney <paulmck@kernel.org>
| * | | csky: Fixup futex SMP implementationGuo Ren2021-01-122-0/+122
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Arnd said: I would guess that for csky, this is a mistake, as the architecture is fairly new and should be able to implement it. Guo reply: The c610, c807, c810 don't support SMP, so futex_cmpxchg_enabled = 1 with asm-generic's implementation. For c860, there is no HAVE_FUTEX_CMPXCHG and cmpxchg_inatomic/inuser implementation, so futex_cmpxchg_enabled = 0. Thx for point it out, we'll implement cmpxchg_inatomic/inuser for C860 and still use asm-generic for non-smp CPUs. LTP test: futex_wait01 1 TPASS : futex_wait(): errno=ETIMEDOUT(110): Connection timed out futex_wait01 2 TPASS : futex_wait(): errno=EAGAIN/EWOULDBLOCK(11): Resource temporarily unavailable futex_wait01 3 TPASS : futex_wait(): errno=ETIMEDOUT(110): Connection timed out futex_wait01 4 TPASS : futex_wait(): errno=EAGAIN/EWOULDBLOCK(11): Resource temporarily unavailable futex_wait02 1 TPASS : futex_wait() woken up futex_wait03 1 TPASS : futex_wait() woken up futex_wait04 1 TPASS : futex_wait() returned -1: errno=EAGAIN/EWOULDBLOCK(11): Resource temporarily unavailable Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Cc: Arnd Bergmann <arnd@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/lkml/CAK8P3a3+WaQNyJ6Za2qfu6=0mBgU1hApnRXrdp1b1=P7wwyRUg@mail.gmail.com/
| * | | csky: Fixup barrier designGuo Ren2021-01-121-22/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove shareable bit for ordering barrier, just keep ordering in current hart is enough for SMP. Using three continuous sync.is as PTW barrier to prevent speculative PTW in 860 microarchitecture. Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
| * | | csky: Remove custom asm/atomic.h implementationGuo Ren2021-01-121-212/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use generic atomic implementation based on cmpxchg. So remove csky asm/atomic.h. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: Paul E. McKenney <paulmck@kernel.org>
| * | | csky: Fixup show_regs doesn't contain regs->uspGuo Ren2021-01-121-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Current show_regs didn't display regs->usp and it confused debug. So fixup wrong SP display and add PT_REGS. Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
| * | | csky: Fixup perf probe failedGuo Ren2021-01-122-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current perf init will failed with: [ 1.452433] csky-pmu: probe of soc:pmu failed with error -16 This patch fix it up with adding CPUHP_AP_PERF_CSKY_ONLINE in cpuhotplug.h. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
| * | | csky: Add memory layout 2.5G(user):1.5G(kernel)Guo Ren2021-01-1219-51/+113
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two ways for translating va to pa for csky: - Use TLB(Translate Lookup Buffer) and PTW (Page Table Walk) - Use SSEG0/1 (Simple Segment Mapping) We use tlb mapping 0-2G and 3G-4G virtual address area and SSEG0/1 are for 2G-2.5G and 2.5G-3G translation. We could disable SSEG0 to use 2G-2.5G as TLB user mapping. Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
* | | | Merge tag 'riscv-for-linus-5.12-mw1' of ↵Linus Torvalds2021-02-284-19/+5
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull more RISC-V updates from Palmer Dabbelt: "A pair of patches that slipped through the cracks: - enable CPU hotplug in the defconfigs - some cleanups to setup_bootmem There's also a single fix for some randconfig build failures: - make NUMA depend on SMP" * tag 'riscv-for-linus-5.12-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Cleanup setup_bootmem() RISC-V: Enable CPU Hotplug in defconfigs RISC-V: Make NUMA depend on SMP
| * | | | riscv: Cleanup setup_bootmem()Kefeng Wang2021-02-271-19/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After the following patches, commit de043da0b9e7 ("RISC-V: Fix usage of memblock_enforce_memory_limit") commit 1bd14a66ee52 ("RISC-V: Remove any memblock representing unusable memory area") commit b10d6bca8720 ("arch, drivers: replace for_each_membock() with for_each_mem_range()") some logic is useless, kill the mem_start/start/end and unneeded code. Reviewed-by: Atish Patra <atish.patra@wdc.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | RISC-V: Enable CPU Hotplug in defconfigsAnup Patel2021-02-272-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The CPU hotplug support has been tested on QEMU, Spike, and SiFive Unleashed so let's enable it by default in RV32 and RV64 defconfigs. Signed-off-by: Anup Patel <anup.patel@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | RISC-V: Make NUMA depend on SMPPalmer Dabbelt2021-02-271-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In theory these are orthogonal, but in practice all NUMA systems are SMP. NUMA && !SMP doesn't build, everyone else is coupling them, and I don't really see any value in supporting that configuration. Fixes: 4f0e8eef772e ("riscv: Add numa support for riscv64 platform") Suggested-by: Andrew Morton <akpm@linux-foundation.org> Suggested-by: Atish Patra <atishp@atishpatra.org> Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com> Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>