summaryrefslogtreecommitdiffstats
path: root/arch (follow)
Commit message (Collapse)AuthorAgeFilesLines
* mm/vma: define a default value for VM_DATA_DEFAULT_FLAGSAnshuman Khandual2020-04-1127-89/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are many platforms with exact same value for VM_DATA_DEFAULT_FLAGS This creates a default value for VM_DATA_DEFAULT_FLAGS in line with the existing VM_STACK_DEFAULT_FLAGS. While here, also define some more macros with standard VMA access flag combinations that are used frequently across many platforms. Apart from simplification, this reduces code duplication as well. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Salter <msalter@redhat.com> Cc: Guo Ren <guoren@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Brian Cain <bcain@codeaurora.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Burton <paulburton@kernel.org> Cc: Nick Hu <nickhu@andestech.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Rich Felker <dalias@libc.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jeff Dike <jdike@addtoit.com> Cc: Chris Zankel <chris@zankel.net> Link: http://lkml.kernel.org/r/1583391014-8170-2-git-send-email-anshuman.khandual@arm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: define pte_index as macro for x86Arjun Roy2020-04-111-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pte_index() is either defined as a macro (e.g. sparc64) or as an inlined function (e.g. x86). vm_insert_pages() depends on pte_index but it is not defined on all platforms (e.g. m68k). To fix compilation of vm_insert_pages() on architectures not providing pte_index(), we perform the following fix: 0. For platforms where it is meaningful, and defined as a macro, no change is needed. 1. For platforms where it is meaningful and defined as an inlined function, and we want to use it with vm_insert_pages(), we define a degenerate macro of the form: #define pte_index pte_index 2. vm_insert_pages() checks for the existence of a pte_index macro definition. If found, it implements a batched insert. If not found, it devolves to calling vm_insert_page() in a loop. This patch implements step 1 for x86. v3 of this patch fixes a compilation warning for an unused method. v2 of this patch moved a macro definition to a more readable location. Signed-off-by: Arjun Roy <arjunroy@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: David Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Matthew Wilcox <willy@infradead.org> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Link: http://lkml.kernel.org/r/20200228054714.204424-1-arjunroy.kdev@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: bring sparc pte_index() semantics inline with other platformsArjun Roy2020-04-111-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | pte_index() on platforms other than sparc return a numerical index. On sparc, it returns a pte_t*. This presents an issue for vm_insert_pages(), which relies on pte_index() to find the offset for a pte within a pmd, for batched inserts. This patch: 1. Modifies pte_index() for sparc to return a numerical index, like other platforms, 2. Defines pte_entry() for sparc which returns a pte_t* (as pte_index() used to), 3. Converts existing sparc callers for pte_index() to use pte_entry(). [sfr@canb.auug.org.au: remove pte_entry and just directly modified pte_offset_kernel instead] Signed-off-by: Arjun Roy <arjunroy@google.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: David Miller <davem@davemloft.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Arjun Roy <arjunroy.kdev@gmail.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Link: http://lkml.kernel.org/r/20200227105045.6b421d9f@canb.auug.org.au Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: hugetlb: optionally allocate gigantic hugepages using cmaRoman Gushchin2020-04-112-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 944d9fec8d7a ("hugetlb: add support for gigantic page allocation at runtime") has added the run-time allocation of gigantic pages. However it actually works only at early stages of the system loading, when the majority of memory is free. After some time the memory gets fragmented by non-movable pages, so the chances to find a contiguous 1GB block are getting close to zero. Even dropping caches manually doesn't help a lot. At large scale rebooting servers in order to allocate gigantic hugepages is quite expensive and complex. At the same time keeping some constant percentage of memory in reserved hugepages even if the workload isn't using it is a big waste: not all workloads can benefit from using 1 GB pages. The following solution can solve the problem: 1) On boot time a dedicated cma area* is reserved. The size is passed as a kernel argument. 2) Run-time allocations of gigantic hugepages are performed using the cma allocator and the dedicated cma area In this case gigantic hugepages can be allocated successfully with a high probability, however the memory isn't completely wasted if nobody is using 1GB hugepages: it can be used for pagecache, anon memory, THPs, etc. * On a multi-node machine a per-node cma area is allocated on each node. Following gigantic hugetlb allocation are using the first available numa node if the mask isn't specified by a user. Usage: 1) configure the kernel to allocate a cma area for hugetlb allocations: pass hugetlb_cma=10G as a kernel argument 2) allocate hugetlb pages as usual, e.g. echo 10 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages If the option isn't enabled or the allocation of the cma area failed, the current behavior of the system is preserved. x86 and arm-64 are covered by this patch, other architectures can be trivially added later. The patch contains clean-ups and fixes proposed and implemented by Aslan Bakirov and Randy Dunlap. It also contains ideas and suggestions proposed by Rik van Riel, Michal Hocko and Mike Kravetz. Thanks! Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Andreas Schaufler <andreas.schaufler@gmx.de> Acked-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Michal Hocko <mhocko@kernel.org> Cc: Aslan Bakirov <aslan@fb.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Joonsoo Kim <js1304@gmail.com> Link: http://lkml.kernel.org/r/20200407163840.92263-3-guro@fb.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge tag 'arm64-fixes' of ↵Linus Torvalds2020-04-095-27/+12
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Ensure that the compiler and linker versions are aligned so that ld doesn't complain about not understanding a .note.gnu.property section (emitted when pointer authentication is enabled). - Force -mbranch-protection=none when the feature is not enabled, in case a compiler may choose a different default value. - Remove CONFIG_DEBUG_ALIGN_RODATA. It was never in defconfig and rarely enabled. - Fix checking 16-bit Thumb-2 instructions checking mask in the emulation of the SETEND instruction (it could match the bottom half of a 32-bit Thumb-2 instruction). * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: armv8_deprecated: Fix undef_hook mask for thumb setend arm64: remove CONFIG_DEBUG_ALIGN_RODATA feature arm64: Always force a branch protection mode when the compiler has one arm64: Kconfig: ptrauth: Add binutils version check to fix mismatch init/kconfig: Add LD_VERSION Kconfig
| * arm64: armv8_deprecated: Fix undef_hook mask for thumb setendFredrik Strupe2020-04-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For thumb instructions, call_undef_hook() in traps.c first reads a u16, and if the u16 indicates a T32 instruction (u16 >= 0xe800), a second u16 is read, which then makes up the the lower half-word of a T32 instruction. For T16 instructions, the second u16 is not read, which makes the resulting u32 opcode always have the upper half set to 0. However, having the upper half of instr_mask in the undef_hook set to 0 masks out the upper half of all thumb instructions - both T16 and T32. This results in trapped T32 instructions with the lower half-word equal to the T16 encoding of setend (b650) being matched, even though the upper half-word is not 0000 and thus indicates a T32 opcode. An example of such a T32 instruction is eaa0b650, which should raise a SIGILL since T32 instructions with an eaa prefix are unallocated as per Arm ARM, but instead works as a SETEND because the second half-word is set to b650. This patch fixes the issue by extending instr_mask to include the upper u32 half, which will still match T16 instructions where the upper half is 0, but not T32 instructions. Fixes: 2d888f48e056 ("arm64: Emulate SETEND for AArch32 tasks") Cc: <stable@vger.kernel.org> # 4.0.x- Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Fredrik Strupe <fredrik@strupe.net> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * arm64: remove CONFIG_DEBUG_ALIGN_RODATA featureArd Biesheuvel2020-04-012-24/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CONFIG_DEBUG_ALIGN_RODATA is enabled, kernel segments mapped with different permissions (r-x for .text, r-- for .rodata, rw- for .data, etc) are rounded up to 2 MiB so they can be mapped more efficiently. In particular, it permits the segments to be mapped using level 2 block entries when using 4k pages, which is expected to result in less TLB pressure. However, the mappings for the bulk of the kernel will use level 2 entries anyway, and the misaligned fringes are organized such that they can take advantage of the contiguous bit, and use far fewer level 3 entries than would be needed otherwise. This makes the value of this feature dubious at best, and since it is not enabled in defconfig or in the distro configs, it does not appear to be in wide use either. So let's just remove it. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will@kernel.org> Acked-by: Laura Abbott <labbott@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * arm64: Always force a branch protection mode when the compiler has oneMark Brown2020-04-011-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Compilers with branch protection support can be configured to enable it by default, it is likely that distributions will do this as part of deploying branch protection system wide. As well as the slight overhead from having some extra NOPs for unused branch protection features this can cause more serious problems when the kernel is providing pointer authentication to userspace but not built for pointer authentication itself. In that case our switching of keys for userspace can affect the kernel unexpectedly, causing pointer authentication instructions in the kernel to corrupt addresses. To ensure that we get consistent and reliable behaviour always explicitly initialise the branch protection mode, ensuring that the kernel is built the same way regardless of the compiler defaults. Fixes: 7503197562567 (arm64: add basic pointer authentication support) Reported-by: Szabolcs Nagy <szabolcs.nagy@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org [catalin.marinas@arm.com: remove Kconfig option in favour of Makefile check] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
| * arm64: Kconfig: ptrauth: Add binutils version check to fix mismatchAmit Daniel Kachhap2020-04-011-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recent addition of ARM64_PTR_AUTH exposed a mismatch issue with binutils. 9.1+ versions of gcc inserts a section note .note.gnu.property but this can be used properly by binutils version greater than 2.33.1. If older binutils are used then the following warnings are generated, aarch64-linux-ld: warning: arch/arm64/kernel/vdso/vgettimeofday.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0000000 aarch64-linux-objdump: warning: arch/arm64/lib/csum.o: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0000000 aarch64-linux-nm: warning: .tmp_vmlinux1: unsupported GNU_PROPERTY_TYPE (5) type: 0xc0000000 This patch enables ARM64_PTR_AUTH when gcc and binutils versions are compatible with each other. Older gcc which do not insert such section continue to work as before. This scenario may not occur with clang as a recent commit 3b446c7d27ddd06 ("arm64: Kconfig: verify binutils support for ARM64_PTR_AUTH") masks binutils version lesser then 2.34. Reported-by: kbuild test robot <lkp@intel.com> Suggested-by: Vincenzo Frascino <Vincenzo.Frascino@arm.com> Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> [catalin.marinas@arm.com: slight adjustment to the comment] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
* | Merge tag 'powerpc-5.7-2' of ↵Linus Torvalds2020-04-0925-599/+757
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull more powerpc updates from Michael Ellerman: "The bulk of this is the series to make CONFIG_COMPAT user-selectable, it's been around for a long time but was blocked behind the syscall-in-C series. Plus there's also a few fixes and other minor things. Summary: - A fix for a crash in machine check handling on pseries (ie. guests) - A small series to make it possible to disable CONFIG_COMPAT, and turn it off by default for ppc64le where it's not used. - A few other miscellaneous fixes and small improvements. Thanks to: Alexey Kardashevskiy, Anju T Sudhakar, Arnd Bergmann, Christophe Leroy, Dan Carpenter, Ganesh Goudar, Geert Uytterhoeven, Geoff Levand, Mahesh Salgaonkar, Markus Elfring, Michal Suchanek, Nicholas Piggin, Stephen Boyd, Wen Xiong" * tag 'powerpc-5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: selftests/powerpc: Always build the tm-poison test 64-bit powerpc: Improve ppc_save_regs() Revert "powerpc/64: irq_work avoid interrupt when called with hardware irqs enabled" powerpc/time: Replace <linux/clk-provider.h> by <linux/of_clk.h> powerpc/pseries/ddw: Extend upper limit for huge DMA window for persistent memory powerpc/perf: split callchain.c by bitness powerpc/64: Make COMPAT user-selectable disabled on littleendian by default. powerpc/64: make buildable without CONFIG_COMPAT powerpc/perf: consolidate valid_user_sp -> invalid_user_sp powerpc/perf: consolidate read_user_stack_32 powerpc: move common register copy functions from signal_32.c to signal.c powerpc: Add back __ARCH_WANT_SYS_LLSEEK macro powerpc/ps3: Set CONFIG_UEVENT_HELPER=y in ps3_defconfig powerpc/ps3: Remove an unneeded NULL check powerpc/ps3: Remove duplicate error message powerpc/powernv: Re-enable imc trace-mode in kernel powerpc/perf: Implement a global lock to avoid races between trace, core and thread imc events. powerpc/pseries: Fix MCE handling on pseries selftests/eeh: Skip ahci adapters powerpc/64s: Fix doorbell wakeup msgclr optimisation
| * | powerpc: Improve ppc_save_regs()Nicholas Piggin2020-04-041-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make ppc_save_regs() a bit more useful: - Set NIP to our caller rather rather than the caller's caller (which is what we save to LR in the stack frame). - Set SOFTE to the current irq soft-mask state rather than uninitialised. - Zero CFAR rather than leave it uninitialised. In qemu, injecting a nmi to an idle CPU gives a nicer stack trace (note NIP, IRQMASK, CFAR). Oops: System Reset, sig: 6 [#1] LE PAGE_SIZE=64K MMU=Hash PREEMPT SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.0-rc2-00429-ga76e38fd80bf #1277 NIP: c0000000000b6e5c LR: c0000000000b6e5c CTR: c000000000b06270 REGS: c00000000173fb08 TRAP: 0100 Not tainted MSR: 9000000000001033 <SF,HV,ME,IR,DR,RI,LE> CR: 28000224 XER: 00000000 CFAR: c0000000016a2128 IRQMASK: c00000000173fc80 GPR00: c0000000000b6e5c c00000000173fc80 c000000001743400 c00000000173fb08 GPR04: 0000000000000000 0000000000000000 0000000000000008 0000000000000001 GPR08: 00000001fea80000 0000000000000000 0000000000000000 ffffffffffffffff GPR12: c000000000b06270 c000000001930000 00000000300026c0 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000003 c0000000016a2128 GPR20: c0000001ffc97148 0000000000000001 c000000000f289a8 0000000000080000 GPR24: c0000000016e1480 000000011dc870ba 0000000000000000 0000000000000003 GPR28: c0000000016a2128 c0000001ffc97148 c0000000016a2260 0000000000000003 NIP [c0000000000b6e5c] power9_idle_type+0x5c/0x70 LR [c0000000000b6e5c] power9_idle_type+0x5c/0x70 Call Trace: [c00000000173fc80] [c0000000000b6e5c] power9_idle_type+0x5c/0x70 (unreliable) [c00000000173fcb0] [c000000000b062b0] stop_loop+0x40/0x60 [c00000000173fce0] [c000000000b022d8] cpuidle_enter_state+0xa8/0x660 [c00000000173fd60] [c000000000b0292c] cpuidle_enter+0x4c/0x70 [c00000000173fda0] [c00000000017624c] call_cpuidle+0x4c/0x90 [c00000000173fdc0] [c000000000176768] do_idle+0x338/0x460 [c00000000173fe60] [c000000000176b3c] cpu_startup_entry+0x3c/0x40 [c00000000173fe90] [c0000000000126b4] rest_init+0x124/0x140 [c00000000173fed0] [c0000000010948d4] start_kernel+0x938/0x988 [c00000000173ff90] [c00000000000cdcc] start_here_common+0x1c/0x20 Oops: System Reset, sig: 6 [#1] LE PAGE_SIZE=64K MMU=Hash PREEMPT SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.6.0-rc2-00430-gddce91b8712f #1278 NIP: c00000000001d150 LR: c0000000000b6e5c CTR: c000000000b06270 REGS: c00000000173fb08 TRAP: 0100 Not tainted MSR: 9000000000001033 <SF,HV,ME,IR,DR,RI,LE> CR: 28000224 XER: 00000000 CFAR: 0000000000000000 IRQMASK: 1 GPR00: c0000000000b6e5c c00000000173fc80 c000000001743400 c00000000173fb08 GPR04: 0000000000000000 0000000000000000 0000000000000008 0000000000000001 GPR08: 00000001fea80000 0000000000000000 0000000000000000 ffffffffffffffff GPR12: c000000000b06270 c000000001930000 00000000300026c0 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000003 c0000000016a2128 GPR20: c0000001ffc97148 0000000000000001 c000000000f289a8 0000000000080000 GPR24: c0000000016e1480 00000000b68db8ce 0000000000000000 0000000000000003 GPR28: c0000000016a2128 c0000001ffc97148 c0000000016a2260 0000000000000003 NIP [c00000000001d150] replay_system_reset+0x30/0xa0 LR [c0000000000b6e5c] power9_idle_type+0x5c/0x70 Call Trace: [c00000000173fc80] [c0000000000b6e5c] power9_idle_type+0x5c/0x70 (unreliable) [c00000000173fcb0] [c000000000b062b0] stop_loop+0x40/0x60 [c00000000173fce0] [c000000000b022d8] cpuidle_enter_state+0xa8/0x660 [c00000000173fd60] [c000000000b0292c] cpuidle_enter+0x4c/0x70 [c00000000173fda0] [c00000000017624c] call_cpuidle+0x4c/0x90 [c00000000173fdc0] [c000000000176768] do_idle+0x338/0x460 [c00000000173fe60] [c000000000176b38] cpu_startup_entry+0x38/0x40 [c00000000173fe90] [c0000000000126b4] rest_init+0x124/0x140 [c00000000173fed0] [c0000000010948d4] start_kernel+0x938/0x988 [c00000000173ff90] [c00000000000cdcc] start_here_common+0x1c/0x20 Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200403131006.123243-1-npiggin@gmail.com
| * | Revert "powerpc/64: irq_work avoid interrupt when called with hardware irqs ↵Nicholas Piggin2020-04-031-31/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | enabled" This reverts commit ebb37cf3ffd39fdb6ec5b07111f8bb2f11d92c5f. That commit does not play well with soft-masked irq state manipulations in idle, interrupt replay, and possibly others due to tracing code sometimes using irq_work_queue (e.g., in trace_hardirqs_on()). That can cause PACA_IRQ_DEC to become set when it is not expected, and be ignored or cleared or cause warnings. The net result seems to be missing an irq_work until the next timer interrupt in the worst case which is usually not going to be noticed, however it could be a long time if the tick is disabled, which is against the spirit of irq_work and might cause real problems. The idea is still solid, but it would need more work. It's not really clear if it would be worth added complexity, so revert this for now (not a straight revert, but replace with a comment explaining why we might see interrupts happening, and gives git blame something to find). Fixes: ebb37cf3ffd3 ("powerpc/64: irq_work avoid interrupt when called with hardware irqs enabled") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200402120401.1115883-1-npiggin@gmail.com
| * | powerpc/time: Replace <linux/clk-provider.h> by <linux/of_clk.h>Geert Uytterhoeven2020-04-031-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PowerPC time code is not a clock provider, and just needs to call of_clk_init(). Hence it can include <linux/of_clk.h> instead of <linux/clk-provider.h>. Remove the #ifdef protecting the of_clk_init() call, as a stub is available for the !CONFIG_COMMON_CLK case. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200213083804.24315-1-geert+renesas@glider.be
| * | powerpc/pseries/ddw: Extend upper limit for huge DMA window for persistent ↵Alexey Kardashevskiy2020-04-031-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | memory Unlike normal memory ("memory" compatible type in the FDT), the persistent memory ("ibm,pmemory" in the FDT) can be mapped anywhere in the guest physical space and it can be used for DMA. In order to maintain 1:1 mapping via the huge DMA window, we need to know the maximum physical address at the time of the window setup. So far we've been looking at "memory" nodes but "ibm,pmemory" does not have fixed addresses and the persistent memory may be mapped afterwards. Since the persistent memory is still backed with page structs, use MAX_PHYSMEM_BITS as the upper limit. This effectively disables huge DMA window in LPAR under pHyp if persistent memory is present but this is the best we can do for the moment. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Tested-by: Wen Xiong<wenxiong@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200331012338.23773-1-aik@ozlabs.ru
| * | powerpc/perf: split callchain.c by bitnessMichal Suchanek2020-04-025-356/+394
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Building callchain.c with !COMPAT proved quite ugly with all the defines. Splitting out the 32bit and 64bit parts looks better. No code change intended. Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a20027bf1074935a7934ee2a6757c99ea047e70d.1584699455.git.msuchanek@suse.de
| * | powerpc/64: Make COMPAT user-selectable disabled on littleendian by default.Michal Suchanek2020-04-021-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On bigendian ppc64 it is common to have 32bit legacy binaries but much less so on littleendian. Signed-off-by: Michal Suchanek <msuchanek@suse.de> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/41393d6e895b0d3a47ee62f8f51e1cf888ad6226.1584699455.git.msuchanek@suse.de
| * | powerpc/64: make buildable without CONFIG_COMPATMichal Suchanek2020-04-028-13/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | There are numerous references to 32bit functions in generic and 64bit code so ifdef them out. Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/e5619617020ef3a1f54f0c076e7d74cb9ec9f3bf.1584699455.git.msuchanek@suse.de
| * | powerpc/perf: consolidate valid_user_sp -> invalid_user_spMichal Suchanek2020-04-021-16/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge the 32bit and 64bit version. Halve the check constants on 32bit. Use STACK_TOP since it is defined. Passing is_64 is now redundant since is_32bit_task() is used to determine which callchain variant should be used. Use STACK_TOP and is_32bit_task() directly. This removes a page from the valid 32bit area on 64bit: #define TASK_SIZE_USER32 (0x0000000100000000UL - (1 * PAGE_SIZE)) #define STACK_TOP_USER32 TASK_SIZE_USER32 Change return value to bool. It is inverted by users anyway. Change to invalid_user_sp to avoid inverting the return value twice. Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/be8e40fc0737fb28ad08b198552dee7cac1c5ce2.1584699455.git.msuchanek@suse.de
| * | powerpc/perf: consolidate read_user_stack_32Michal Suchanek2020-04-021-24/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two almost identical copies for 32bit and 64bit. The function is used only in 32bit code which will be split out in next patch so consolidate to one function. Signed-off-by: Michal Suchanek <msuchanek@suse.de> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0c21c919ed1296420199c78f7c3cfd29d3c7e909.1584699455.git.msuchanek@suse.de
| * | powerpc: move common register copy functions from signal_32.c to signal.cMichal Suchanek2020-04-022-140/+141
| | | | | | | | | | | | | | | | | | | | | | | | | | | These functions are required for 64bit as well. Signed-off-by: Michal Suchanek <msuchanek@suse.de> Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9fd6d9b7c5e91fab21159fe23534a2f16b4962d3.1584699455.git.msuchanek@suse.de
| * | powerpc: Add back __ARCH_WANT_SYS_LLSEEK macroMichal Suchanek2020-04-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This partially reverts commit caf6f9c8a326 ("asm-generic: Remove unneeded __ARCH_WANT_SYS_LLSEEK macro") When CONFIG_COMPAT is disabled on ppc64 the kernel does not build. There is resistance to both removing the llseek syscall from the 64bit syscall tables and building the llseek interface unconditionally. Signed-off-by: Michal Suchanek <msuchanek@suse.de> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/lkml/20190828151552.GA16855@infradead.org/ Link: https://lore.kernel.org/lkml/20190829214319.498c7de2@naga/ Link: https://lore.kernel.org/r/dd4575c51e31766e87f7e7fa121d099ab78d3290.1584699455.git.msuchanek@suse.de
| * | powerpc/ps3: Set CONFIG_UEVENT_HELPER=y in ps3_defconfigGeoff Levand2020-04-021-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set CONFIG_UEVENT_HELPER=y in ps3_defconfig. commit 1be01d4a57142ded23bdb9e0c8d9369e693b26cc (driver: base: Disable CONFIG_UEVENT_HELPER by default) disabled the CONFIG_UEVENT_HELPER option that is needed for hotplug and module loading by most older 32bit powerpc distributions that users typically install on the PS3. Signed-off-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/410cda9aa1a6e04434dfe1f9aa2103d0694f706c.1585340156.git.geoff@infradead.org
| * | powerpc/ps3: Remove duplicate error messageMarkus Elfring2020-04-021-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Remove a duplicate memory allocation failure error message. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Signed-off-by: Geoff Levand <geoff@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1bc5a16a22c487c478a204ebb7b80a22d2ad9cd0.1585340156.git.geoff@infradead.org
| * | powerpc/powernv: Re-enable imc trace-mode in kernelAnju T Sudhakar2020-04-021-8/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit <249fad734a25> ""powerpc/perf: Disable trace_imc pmu" disables IMC(In-Memory Collection) trace-mode in kernel, since frequent mode switching between accumulation mode and trace mode via the spr LDBAR in the hardware can trigger a checkstop(system crash). Patch to re-enable imc-trace mode in kernel. The previous patch(1/2) in this series will address the mode switching issue by implementing a global lock, and will restrict the usage of accumulation and trace-mode at a time. Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200313055238.8656-2-anju@linux.vnet.ibm.com
| * | powerpc/perf: Implement a global lock to avoid races between trace, core and ↵Anju T Sudhakar2020-04-021-24/+149
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | thread imc events. IMC(In-memory Collection Counters) does performance monitoring in two different modes, i.e accumulation mode(core-imc and thread-imc events), and trace mode(trace-imc events). A cpu thread can either be in accumulation-mode or trace-mode at a time and this is done via the LDBAR register in POWER architecture. The current design does not address the races between thread-imc and trace-imc events. Patch implements a global id and lock to avoid the races between core, trace and thread imc events. With this global id-lock implementation, the system can either run core, thread or trace imc events at a time. i.e. to run any core-imc events, thread/trace imc events should not be enabled/monitored. Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200313055238.8656-1-anju@linux.vnet.ibm.com
| * | powerpc/pseries: Fix MCE handling on pseriesGanesh Goudar2020-04-021-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | MCE handling on pSeries platform fails as recent rework to use common code for pSeries and PowerNV in machine check error handling tries to access per-cpu variables in realmode. The per-cpu variables may be outside the RMO region on pSeries platform and needs translation to be enabled for access. Just moving these per-cpu variable into RMO region did'nt help because we queue some work to workqueues in real mode, which again tries to touch per-cpu variables. Also fwnmi_release_errinfo() cannot be called when translation is not enabled. This patch fixes this by enabling translation in the exception handler when all required real mode handling is done. This change only affects the pSeries platform. Without this fix below kernel crash is seen on injecting SLB multihit: BUG: Unable to handle kernel data access on read at 0xc00000027b205950 Faulting instruction address: 0xc00000000003b7e0 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: mcetest_slb(OE+) af_packet(E) xt_tcpudp(E) ip6t_rpfilter(E) ip6t_REJECT(E) ipt_REJECT(E) xt_conntrack(E) ip_set(E) nfnetlink(E) ebtable_nat(E) ebtable_broute(E) ip6table_nat(E) ip6table_mangle(E) ip6table_raw(E) ip6table_security(E) iptable_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) iptable_mangle(E) iptable_raw(E) iptable_security(E) ebtable_filter(E) ebtables(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) ip_tables(E) x_tables(E) xfs(E) ibmveth(E) vmx_crypto(E) gf128mul(E) uio_pdrv_genirq(E) uio(E) crct10dif_vpmsum(E) rtc_generic(E) btrfs(E) libcrc32c(E) xor(E) zstd_decompress(E) zstd_compress(E) raid6_pq(E) sr_mod(E) sd_mod(E) cdrom(E) ibmvscsi(E) scsi_transport_srp(E) crc32c_vpmsum(E) dm_mod(E) sg(E) scsi_mod(E) CPU: 34 PID: 8154 Comm: insmod Kdump: loaded Tainted: G OE 5.5.0-mahesh #1 NIP: c00000000003b7e0 LR: c0000000000f2218 CTR: 0000000000000000 REGS: c000000007dcb960 TRAP: 0300 Tainted: G OE (5.5.0-mahesh) MSR: 8000000000001003 <SF,ME,RI,LE> CR: 28002428 XER: 20040000 CFAR: c0000000000f2214 DAR: c00000027b205950 DSISR: 40000000 IRQMASK: 0 GPR00: c0000000000f2218 c000000007dcbbf0 c000000001544800 c000000007dcbd70 GPR04: 0000000000000001 c000000007dcbc98 c008000000d00258 c0080000011c0000 GPR08: 0000000000000000 0000000300000003 c000000001035950 0000000003000048 GPR12: 000000027a1d0000 c000000007f9c000 0000000000000558 0000000000000000 GPR16: 0000000000000540 c008000001110000 c008000001110540 0000000000000000 GPR20: c00000000022af10 c00000025480fd70 c008000001280000 c00000004bfbb300 GPR24: c000000001442330 c00800000800000d c008000008000000 4009287a77000510 GPR28: 0000000000000000 0000000000000002 c000000001033d30 0000000000000001 NIP [c00000000003b7e0] save_mce_event+0x30/0x240 LR [c0000000000f2218] pseries_machine_check_realmode+0x2c8/0x4f0 Call Trace: Instruction dump: 3c4c0151 38429050 7c0802a6 60000000 fbc1fff0 fbe1fff8 f821ffd1 3d42ffaf 3fc2ffaf e98d0030 394a1150 3bdef530 <7d6a62aa> 1d2b0048 2f8b0063 380b0001 ---[ end trace 46fd63f36bbdd940 ]--- Fixes: 9ca766f9891d ("powerpc/64s/pseries: machine check convert to use common event code") Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200320110119.10207-1-ganeshgr@linux.ibm.com
| * | powerpc/64s: Fix doorbell wakeup msgclr optimisationNicholas Piggin2020-04-022-19/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 3282a3da25bd ("powerpc/64: Implement soft interrupt replay in C") broke the doorbell wakeup optimisation introduced by commit a9af97aa0a12 ("powerpc/64s: msgclr when handling doorbell exceptions from system reset"). This patch restores the msgclr, in C code. It's now done in the system reset wakeup path rather than doorbell interrupt replay where it used to be, because it is always the right thing to do in the wakeup case, but it may be rarely of use in other interrupt replay situations in which case it's wasted work - we would have to run measurements to see if that was a worthwhile optimisation, and I suspect it would not be. The results are similar to those in the original commit, test on POWER8 of context_switch selftests benchmark with polling idle disabled (e.g., always nap, giving cross-CPU IPIs) gives the following results: broken patched Different threads, same core: 317k/s 375k/s +18.7% Different cores: 280k/s 282k/s +1.0% Fixes: 3282a3da25bd ("powerpc/64: Implement soft interrupt replay in C") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200402121212.1118218-1-npiggin@gmail.com
* | | Merge branch 'for-next' of ↵Linus Torvalds2020-04-094-48/+44
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu Pull m68knommu update from Greg Ungerer: "Only a single commit, to remove all use of the obsolete setup_irq() calls within the m68knommu architecture code" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: m68k: Replace setup_irq() by request_irq()
| * | | m68k: Replace setup_irq() by request_irq()afzal mohammed2020-03-234-48/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | request_irq() is preferred over setup_irq(). Invocations of setup_irq() occur after memory allocators are ready. Per tglx[1], setup_irq() existed in olden days when allocators were not ready by the time early interrupts were initialized. Hence replace setup_irq() by request_irq(). [1] https://lkml.kernel.org/r/alpine.DEB.2.20.1710191609480.1971@nanos Signed-off-by: afzal mohammed <afzal.mohd.ma@gmail.com> Tested-by: Greg Ungerer <gerg@linux-m68k.org> Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
* | | | Merge tag 'riscv-for-linus-5.7' of ↵Linus Torvalds2020-04-0948-294/+2806
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: "This contains a handful of new features: - Partial support for the Kendryte K210. There are still a few outstanding issues that I have patches for, but I don't actually have a board to test them so they're not included yet. - SBI v0.2 support. - Fixes to support for building with LLVM-based toolchains. The resulting images are known not to boot yet. I don't anticipate a part two, but I'll probably have something early in the RCs to finish up the K210 support" * tag 'riscv-for-linus-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (38 commits) riscv: create a loader.bin boot image for Kendryte SoC riscv: Kendryte K210 default config riscv: Add Kendryte K210 device tree riscv: Select required drivers for Kendryte SOC riscv: Add Kendryte K210 SoC support riscv: Add SOC early init support riscv: Unaligned load/store handling for M_MODE RISC-V: Support cpu hotplug RISC-V: Add supported for ordered booting method using HSM RISC-V: Add SBI HSM extension definitions RISC-V: Export SBI error to linux error mapping function RISC-V: Add cpu_ops and modify default booting method RISC-V: Move relocate and few other functions out of __init RISC-V: Implement new SBI v0.2 extensions RISC-V: Introduce a new config for SBI v0.1 RISC-V: Add SBI v0.2 extension definitions RISC-V: Add basic support for SBI v0.2 RISC-V: Mark existing SBI as 0.1 SBI. riscv: Use macro definition instead of magic number riscv: Add support to dump the kernel page tables ...
| * | | | riscv: create a loader.bin boot image for Kendryte SoCChristoph Hellwig2020-04-032-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create the loader.bin bootable image file that can be loaded into Kendryte K210 based boards using the kflash.py tool with the command: kflash.py/kflash.py -t arch/riscv/boot/loader.bin Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | riscv: Kendryte K210 default configDamien Le Moal2020-04-031-0/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a defconfig file to build No-MMU kernels meant for boards based on the Kendryte K210 SoC. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | riscv: Add Kendryte K210 device treeDamien Le Moal2020-04-034-0/+149
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a generic device tree for Kendryte K210 SoC based boards. This is for now a very simple device tree describing the core elements of the SoC. This is suitable (and tested) for the Kendryte KD233 development board, the Sipeed MAIX M1 Dan Dock board and the Sipeed MAIXDUINO board. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Sean Anderson <seanga2@gmail.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | riscv: Select required drivers for Kendryte SOCDamien Le Moal2020-04-031-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch selects drivers required for the Kendryte K210 SOC. Since K210 SoC based boards do not provide a device tree, this patch also enables the BUILTIN_DTB option. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | riscv: Add Kendryte K210 SoC supportChristoph Hellwig2020-04-031-0/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for the Kendryte K210 RISC-V SoC. For now, this support only provides a simple sysctl driver allowing to setup the CPU and uart clock. This support is enabled through the new Kconfig option SOC_KENDRYTE and defines the config option CONFIG_K210_SYSCTL to enable the K210 SoC sysctl driver compilation. The sysctl driver also registers an early SoC initialization function allowing enabling the general purpose use of the 2MB of SRAM normally reserved for the SoC AI engine. This initialization function is automatically called before the dt early initialization using the flat dt root node compatible property matching the value "kendryte,k210". Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> [Palmer: Add missing endmenu in Kconfig.socs] Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | riscv: Add SOC early init supportDamien Le Moal2020-04-035-0/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a mechanism for early SoC initialization for platforms that need additional hardware initialization not possible through the regular device tree and drivers mechanism. With this, a SoC specific initialization function can be called very early, before DTB parsing is done by parse_dtb() in Linux RISC-V kernel setup code. This can be very useful for early hardware initialization for No-MMU kernels booted directly in M-mode because it is quite likely that no other booting stage exist prior to the No-MMU kernel. Example use of a SoC early initialization is as follows: static void vendor_abc_early_init(const void *fdt) { /* * some early init code here that can use simple matches * against the flat device tree file. */ } SOC_EARLY_INIT_DECLARE("vendor,abc", abc_early_init); This early initialization function is executed only if the flat device tree for the board has a 'compatible = "vendor,abc"' entry; Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Anup Patel <anup.patel@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | riscv: Unaligned load/store handling for M_MODEDamien Le Moal2020-04-033-4/+395
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add handlers for unaligned load and store traps that may be generated by applications. Code heavily inspired from the OpenSBI project. Handling of the unaligned access traps is suitable for applications compiled with or without compressed instructions and is independent of the kernel CONFIG_RISCV_ISA_C option value. Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Anup Patel <anup.patel@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | RISC-V: Support cpu hotplugAtish Patra2020-03-317-2/+180
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enable support for cpu hotplug in RISC-V. It uses SBI HSM extension to online/offline any hart. As a result, the harts are returned to firmware once they are offline. If the harts are brought online afterwards, they re-enter Linux kernel as if a secondary hart booted for the first time. All booting requirements are honored during this process. Tested both on QEMU and HighFive Unleashed board with. Test result follows. --------------------------------------------------- Offline cpu 2 --------------------------------------------------- $ echo 0 > /sys/devices/system/cpu/cpu2/online [ 32.828684] CPU2: off $ cat /proc/cpuinfo processor : 0 hart : 0 isa : rv64imafdcsu mmu : sv48 processor : 1 hart : 1 isa : rv64imafdcsu mmu : sv48 processor : 3 hart : 3 isa : rv64imafdcsu mmu : sv48 processor : 4 hart : 4 isa : rv64imafdcsu mmu : sv48 processor : 5 hart : 5 isa : rv64imafdcsu mmu : sv48 processor : 6 hart : 6 isa : rv64imafdcsu mmu : sv48 processor : 7 hart : 7 isa : rv64imafdcsu mmu : sv48 --------------------------------------------------- online cpu 2 --------------------------------------------------- $ echo 1 > /sys/devices/system/cpu/cpu2/online $ cat /proc/cpuinfo processor : 0 hart : 0 isa : rv64imafdcsu mmu : sv48 processor : 1 hart : 1 isa : rv64imafdcsu mmu : sv48 processor : 2 hart : 2 isa : rv64imafdcsu mmu : sv48 processor : 3 hart : 3 isa : rv64imafdcsu mmu : sv48 processor : 4 hart : 4 isa : rv64imafdcsu mmu : sv48 processor : 5 hart : 5 isa : rv64imafdcsu mmu : sv48 processor : 6 hart : 6 isa : rv64imafdcsu mmu : sv48 processor : 7 hart : 7 isa : rv64imafdcsu mmu : sv48 Signed-off-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Anup Patel <anup@brainfault.org>
| * | | | RISC-V: Add supported for ordered booting method using HSMAtish Patra2020-03-316-3/+121
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, all harts have to jump Linux in RISC-V. This complicates the multi-stage boot process as every transient stage also has to ensure all harts enter to that stage and jump to Linux afterwards. It also obstructs a clean Kexec implementation. SBI HSM extension provides alternate solutions where only a single hart need to boot and enter Linux. The booting hart can bring up secondary harts one by one afterwards. Add SBI HSM based cpu_ops that implements an ordered booting method in RISC-V. This change is also backward compatible with older firmware not implementing HSM extension. If a latest kernel is used with older firmware, it will continue to use the default spinning booting method. Signed-off-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | RISC-V: Add SBI HSM extension definitionsAtish Patra2020-03-311-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SBI specification defines HSM extension that allows to start/stop a hart by a supervisor anytime. The specification is available at https://github.com/riscv/riscv-sbi-doc/blob/master/riscv-sbi.adoc Add those definitions here. Signed-off-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Anup Patel <anup.patel@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | RISC-V: Export SBI error to linux error mapping functionAtish Patra2020-03-312-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All SBI related extensions will not be implemented in sbi.c to avoid bloating. Thus, sbi_err_map_linux_errno() will be used in other files implementing that specific extension. Export the function so that it can be used later. Signed-off-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Anup Patel <anup.patel@wdc.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | RISC-V: Add cpu_ops and modify default booting methodAtish Patra2020-03-315-21/+147
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, all non-booting harts start booting after the booting hart updates the per-hart stack pointer. This is done in a way that, it's difficult to implement any other booting method without breaking the backward compatibility. Define a cpu_ops method that allows to introduce other booting methods in future. Modify the current booting method to be compatible with cpu_ops. Signed-off-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | RISC-V: Move relocate and few other functions out of __initAtish Patra2020-03-312-72/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The secondary hart booting and relocation code are under .init section. As a result, it will be freed once kernel booting is done. However, ordered booting protocol and CPU hotplug always requires these functions to be present to bringup harts after initial kernel boot. Move the required functions to a different section and make sure that they are in memory within first 2MB offset as trampoline page directory only maps first 2MB. Signed-off-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | RISC-V: Implement new SBI v0.2 extensionsAtish Patra2020-03-313-4/+270
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Few v0.1 SBI calls are being replaced by new SBI calls that follows v0.2 calling convention. Implement the replacement extensions and few additional new SBI function calls that makes way for a better SBI interface in future. Signed-off-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | RISC-V: Introduce a new config for SBI v0.1Atish Patra2020-03-313-23/+118
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We now have SBI v0.2 which is more scalable and extendable to handle future needs for RISC-V supervisor interfaces. Introduce a new config and move all SBI v0.1 code under that config. This allows to implement the new replacement SBI extensions cleanly and remove v0.1 extensions easily in future. Currently, the config is enabled by default. Once all M-mode software, with v0.1, is no longer in use, this config option and all relevant code can be easily removed. Signed-off-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | RISC-V: Add SBI v0.2 extension definitionsAtish Patra2020-03-311-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Few v0.1 SBI calls are being replaced by new SBI calls that follows v0.2 calling convention. This patch just defines these new extensions. Signed-off-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | RISC-V: Add basic support for SBI v0.2Atish Patra2020-03-313-73/+314
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The SBI v0.2 introduces a base extension which is backward compatible with v0.1. Implement all helper functions and minimum required SBI calls from v0.2 for now. All other base extension function will be added later as per need. As v0.2 calling convention is backward compatible with v0.1, remove the v0.1 helper functions and just use v0.2 calling convention. Signed-off-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | RISC-V: Mark existing SBI as 0.1 SBI.Atish Patra2020-03-311-19/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As per the new SBI specification, current SBI implementation version is defined as 0.1 and will be removed/replaced in future. Each of the function call in 0.1 is defined as a separate extension which makes easier to replace them one at a time. Rename existing implementation to reflect that. This patch is just a preparatory patch for SBI v0.2 and doesn't introduce any functional changes. Signed-off-by: Atish Patra <atish.patra@wdc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | riscv: Use macro definition instead of magic numberZong Li2020-03-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The KERN_VIRT_START defines the start virtual address of kernel space. Use this macro instead of magic number. Signed-off-by: Zong Li <zong.li@sifive.com> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
| * | | | riscv: Add support to dump the kernel page tablesZong Li2020-03-265-0/+340
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In a similar manner to arm64, x86, powerpc, etc., it can traverse all page tables, and dump the page table layout with the memory types and permissions. Add a debugfs file at /sys/kernel/debug/kernel_page_tables to export the page table layout to userspace. Signed-off-by: Zong Li <zong.li@sifive.com> Tested-by: Alexandre Ghiti <alex@ghiti.fr> Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>