summaryrefslogtreecommitdiffstats
path: root/arch/powerpc (follow)
Commit message (Collapse)AuthorAgeFilesLines
* Merge tag 'powerpc-5.14-6' of ↵Linus Torvalds2021-08-223-14/+31
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix random crashes on some 32-bit CPUs by adding isync() after locking/unlocking KUEP - Fix intermittent crashes when loading modules with strict module RWX - Fix a section mismatch introduce by a previous fix. Thanks to Christophe Leroy, Fabiano Rosas, Laurent Vivier, Murilo Opsfelder Araújo, Nathan Chancellor, and Stan Johnson. h# -----BEGIN PGP SIGNATURE----- * tag 'powerpc-5.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Fix set_memory_*() against concurrent accesses powerpc/32s: Fix random crashes by adding isync() after locking/unlocking KUEP powerpc/xive: Do not mark xive_request_ipi() as __init
| * powerpc/mm: Fix set_memory_*() against concurrent accessesMichael Ellerman2021-08-191-13/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Laurent reported that STRICT_MODULE_RWX was causing intermittent crashes on one of his systems: kernel tried to execute exec-protected page (c008000004073278) - exploit attempt? (uid: 0) BUG: Unable to handle kernel instruction fetch Faulting instruction address: 0xc008000004073278 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries Modules linked in: drm virtio_console fuse drm_panel_orientation_quirks ... CPU: 3 PID: 44 Comm: kworker/3:1 Not tainted 5.14.0-rc4+ #12 Workqueue: events control_work_handler [virtio_console] NIP: c008000004073278 LR: c008000004073278 CTR: c0000000001e9de0 REGS: c00000002e4ef7e0 TRAP: 0400 Not tainted (5.14.0-rc4+) MSR: 800000004280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 24002822 XER: 200400cf ... NIP fill_queue+0xf0/0x210 [virtio_console] LR fill_queue+0xf0/0x210 [virtio_console] Call Trace: fill_queue+0xb4/0x210 [virtio_console] (unreliable) add_port+0x1a8/0x470 [virtio_console] control_work_handler+0xbc/0x1e8 [virtio_console] process_one_work+0x290/0x590 worker_thread+0x88/0x620 kthread+0x194/0x1a0 ret_from_kernel_thread+0x5c/0x64 Jordan, Fabiano & Murilo were able to reproduce and identify that the problem is caused by the call to module_enable_ro() in do_init_module(), which happens after the module's init function has already been called. Our current implementation of change_page_attr() is not safe against concurrent accesses, because it invalidates the PTE before flushing the TLB and then installing the new PTE. That leaves a window in time where there is no valid PTE for the page, if another CPU tries to access the page at that time we see something like the fault above. We can't simply switch to set_pte_at()/flush TLB, because our hash MMU code doesn't handle a set_pte_at() of a valid PTE. See [1]. But we do have pte_update(), which replaces the old PTE with the new, meaning there's no window where the PTE is invalid. And the hash MMU version hash__pte_update() deals with synchronising the hash page table correctly. [1]: https://lore.kernel.org/linuxppc-dev/87y318wp9r.fsf@linux.ibm.com/ Fixes: 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines") Reported-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Murilo Opsfelder Araújo <muriloo@linux.ibm.com> Tested-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210818120518.3603172-1-mpe@ellerman.id.au
| * powerpc/32s: Fix random crashes by adding isync() after locking/unlocking KUEPChristophe Leroy2021-08-191-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit b5efec00b671 ("powerpc/32s: Move KUEP locking/unlocking in C") removed the 'isync' instruction after adding/removing NX bit in user segments. The reasoning behind this change was that when setting the NX bit we don't mind it taking effect with delay as the kernel never executes text from userspace, and when clearing the NX bit this is to return to userspace and then the 'rfi' should synchronise the context. However, it looks like on book3s/32 having a hash page table, at least on the G3 processor, we get an unexpected fault from userspace, then this is followed by something wrong in the verification of MSR_PR at end of another interrupt. This is fixed by adding back the removed isync() following update of NX bit in user segment registers. Only do it for cores with an hash table, as 603 cores don't exhibit that problem and the two isync increase ./null_syscall selftest by 6 cycles on an MPC 832x. First problem: unexpected WARN_ON() for mysterious PROTFAULT WARNING: CPU: 0 PID: 1660 at arch/powerpc/mm/fault.c:354 do_page_fault+0x6c/0x5b0 Modules linked in: CPU: 0 PID: 1660 Comm: Xorg Not tainted 5.13.0-pmac-00028-gb3c15b60339a #40 NIP: c001b5c8 LR: c001b6f8 CTR: 00000000 REGS: e2d09e40 TRAP: 0700 Not tainted (5.13.0-pmac-00028-gb3c15b60339a) MSR: 00021032 <ME,IR,DR,RI> CR: 42d04f30 XER: 20000000 GPR00: c000424c e2d09f00 c301b680 e2d09f40 0000001e 42000000 00cba028 00000000 GPR08: 08000000 48000010 c301b680 e2d09f30 22d09f30 00c1fff0 00cba000 a7b7ba4c GPR16: 00000031 00000000 00000000 00000000 00000000 00000000 a7b7b0d0 00c5c010 GPR24: a7b7b64c a7b7d2f0 00000004 00000000 c1efa6c0 00cba02c 00000300 e2d09f40 NIP [c001b5c8] do_page_fault+0x6c/0x5b0 LR [c001b6f8] do_page_fault+0x19c/0x5b0 Call Trace: [e2d09f00] [e2d09f04] 0xe2d09f04 (unreliable) [e2d09f30] [c000424c] DataAccess_virt+0xd4/0xe4 --- interrupt: 300 at 0xa7a261dc NIP: a7a261dc LR: a7a253bc CTR: 00000000 REGS: e2d09f40 TRAP: 0300 Not tainted (5.13.0-pmac-00028-gb3c15b60339a) MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 228428e2 XER: 20000000 DAR: 00cba02c DSISR: 42000000 GPR00: a7a27448 afa6b0e0 a74c35c0 a7b7b614 0000001e a7b7b614 00cba028 00000000 GPR08: 00020fd9 00000031 00cb9ff8 a7a273b0 220028e2 00c1fff0 00cba000 a7b7ba4c GPR16: 00000031 00000000 00000000 00000000 00000000 00000000 a7b7b0d0 00c5c010 GPR24: a7b7b64c a7b7d2f0 00000004 00000002 0000001e a7b7b614 a7b7aff4 00000030 NIP [a7a261dc] 0xa7a261dc LR [a7a253bc] 0xa7a253bc --- interrupt: 300 Instruction dump: 7c4a1378 810300a0 75278410 83820298 83a300a4 553b018c 551e0036 4082038c 2e1b0000 40920228 75280800 41820220 <0fe00000> 3b600000 41920214 81420594 Second problem: MSR PR is seen unset allthough the interrupt frame shows it set kernel BUG at arch/powerpc/kernel/interrupt.c:458! Oops: Exception in kernel mode, sig: 5 [#1] BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac Modules linked in: CPU: 0 PID: 1660 Comm: Xorg Tainted: G W 5.13.0-pmac-00028-gb3c15b60339a #40 NIP: c0011434 LR: c001629c CTR: 00000000 REGS: e2d09e70 TRAP: 0700 Tainted: G W (5.13.0-pmac-00028-gb3c15b60339a) MSR: 00029032 <EE,ME,IR,DR,RI> CR: 42d09f30 XER: 00000000 GPR00: 00000000 e2d09f30 c301b680 e2d09f40 83440000 c44d0e68 e2d09e8c 00000000 GPR08: 00000002 00dc228a 00004000 e2d09f30 22d09f30 00c1fff0 afa6ceb4 00c26144 GPR16: 00c25fb8 00c26140 afa6ceb8 90000000 00c944d8 0000001c 00000000 00200000 GPR24: 00000000 000001fb afa6d1b4 00000001 00000000 a539a2a0 a530fd80 00000089 NIP [c0011434] interrupt_exit_kernel_prepare+0x10/0x70 LR [c001629c] interrupt_return+0x9c/0x144 Call Trace: [e2d09f30] [c000424c] DataAccess_virt+0xd4/0xe4 (unreliable) --- interrupt: 300 at 0xa09be008 NIP: a09be008 LR: a09bdfe8 CTR: a09bdfc0 REGS: e2d09f40 TRAP: 0300 Tainted: G W (5.13.0-pmac-00028-gb3c15b60339a) MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 420028e2 XER: 20000000 DAR: a539a308 DSISR: 0a000000 GPR00: a7b90d50 afa6b2d0 a74c35c0 a0a8b690 a0a8b698 a5365d70 a4fa82a8 00000004 GPR08: 00000000 a09bdfc0 00000000 a5360000 a09bde7c 00c1fff0 afa6ceb4 00c26144 GPR16: 00c25fb8 00c26140 afa6ceb8 90000000 00c944d8 0000001c 00000000 00200000 GPR24: 00000000 000001fb afa6d1b4 00000001 00000000 a539a2a0 a530fd80 00000089 NIP [a09be008] 0xa09be008 LR [a09bdfe8] 0xa09bdfe8 --- interrupt: 300 Instruction dump: 80010024 83e1001c 7c0803a6 4bffff80 3bc00800 4bffffd0 486b42fd 4bffffcc 81430084 71480002 41820038 554a0462 <0f0a0000> 80620060 74630001 40820034 Fixes: b5efec00b671 ("powerpc/32s: Move KUEP locking/unlocking in C") Cc: stable@vger.kernel.org # v5.13+ Reported-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4856f5574906e2aec0522be17bf3848a22b2cd0b.1629269345.git.christophe.leroy@csgroup.eu
| * powerpc/xive: Do not mark xive_request_ipi() as __initNathan Chancellor2021-08-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Compiling ppc64le_defconfig with clang-14 shows a modpost warning: WARNING: modpost: vmlinux.o(.text+0xa74e0): Section mismatch in reference from the function xive_setup_cpu_ipi() to the function .init.text:xive_request_ipi() The function xive_setup_cpu_ipi() references the function __init xive_request_ipi(). This is often because xive_setup_cpu_ipi lacks a __init annotation or the annotation of xive_request_ipi is wrong. xive_request_ipi() is called from xive_setup_cpu_ipi(), which is not __init, so xive_request_ipi() should not be marked __init. Remove the attribute so there is no more warning. Fixes: cbc06f051c52 ("powerpc/xive: Do not skip CPU-less nodes when creating the IPIs") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210816185711.21563-1-nathan@kernel.org
* | Merge tag 'powerpc-5.14-5' of ↵Linus Torvalds2021-08-1513-62/+82
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix crashes coming out of nap on 32-bit Book3s (eg. powerbooks). - Fix critical and debug interrupts on BookE, seen as crashes when using ptrace. - Fix an oops when running an SMP kernel on a UP system. - Update pseries LPAR security flavor after partition migration. - Fix an oops when using kprobes on BookE. - Fix oops on 32-bit pmac by not calling do_IRQ() from timer_interrupt(). - Fix softlockups on CPU hotplug into a CPU-less node with xive (P9). Thanks to Cédric Le Goater, Christophe Leroy, Finn Thain, Geetika Moolchandani, Laurent Dufour, Laurent Vivier, Nicholas Piggin, Pu Lehui, Radu Rendec, Srikar Dronamraju, and Stan Johnson. * tag 'powerpc-5.14-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/xive: Do not skip CPU-less nodes when creating the IPIs powerpc/interrupt: Do not call single_step_exception() from other exceptions powerpc/interrupt: Fix OOPS by not calling do_IRQ() from timer_interrupt() powerpc/kprobes: Fix kprobe Oops happens in booke powerpc/pseries: Fix update of LPAR security flavor after LPM powerpc/smp: Fix OOPS in topology_init() powerpc/32: Fix critical and debug interrupts on BOOKE powerpc/32s: Fix napping restore in data storage interrupt (DSI)
| * powerpc/xive: Do not skip CPU-less nodes when creating the IPIsCédric Le Goater2021-08-121-11/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On PowerVM, CPU-less nodes can be populated with hot-plugged CPUs at runtime. Today, the IPI is not created for such nodes, and hot-plugged CPUs use a bogus IPI, which leads to soft lockups. We can not directly allocate and request the IPI on demand because bringup_up() is called under the IRQ sparse lock. The alternative is to allocate the IPIs for all possible nodes at startup and to request the mapping on demand when the first CPU of a node is brought up. Fixes: 7dcc37b3eff9 ("powerpc/xive: Map one IPI interrupt per node") Cc: stable@vger.kernel.org # v5.13 Reported-by: Geetika Moolchandani <Geetika.Moolchandani1@ibm.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Tested-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210807072057.184698-1-clg@kaod.org
| * powerpc/interrupt: Do not call single_step_exception() from other exceptionsChristophe Leroy2021-08-121-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | single_step_exception() is called by emulate_single_step() which is called from (at least) alignment exception() handler and program_check_exception() handler. Redefine it as a regular __single_step_exception() which is called by both single_step_exception() handler and emulate_single_step() function. Fixes: 3a96570ffceb ("powerpc: convert interrupt handlers to use wrappers") Cc: stable@vger.kernel.org # v5.12+ Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/aed174f5cbc06f2cf95233c071d8aac948e46043.1628611921.git.christophe.leroy@csgroup.eu
| * powerpc/interrupt: Fix OOPS by not calling do_IRQ() from timer_interrupt()Christophe Leroy2021-08-124-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An interrupt handler shall not be called from another interrupt handler otherwise this leads to problems like the following: Kernel attempted to write user page (afd4fa84) - exploit attempt? (uid: 1000) ------------[ cut here ]------------ Bug: Write fault blocked by KUAP! WARNING: CPU: 0 PID: 1617 at arch/powerpc/mm/fault.c:230 do_page_fault+0x484/0x720 Modules linked in: CPU: 0 PID: 1617 Comm: sshd Tainted: G W 5.13.0-pmac-00010-g8393422eb77 #7 NIP: c001b77c LR: c001b77c CTR: 00000000 REGS: cb9e5bc0 TRAP: 0700 Tainted: G W (5.13.0-pmac-00010-g8393422eb77) MSR: 00021032 <ME,IR,DR,RI> CR: 24942424 XER: 00000000 GPR00: c001b77c cb9e5c80 c1582c00 00000021 3ffffbff 085b0000 00000027 c8eb644c GPR08: 00000023 00000000 00000000 00000000 24942424 0063f8c8 00000000 000186a0 GPR16: afd52dd4 afd52dd0 afd52dcc afd52dc8 0065a990 c07640c4 cb9e5e98 cb9e5e90 GPR24: 00000040 afd4fa96 00000040 02000000 c1fda6c0 afd4fa84 00000300 cb9e5cc0 NIP [c001b77c] do_page_fault+0x484/0x720 LR [c001b77c] do_page_fault+0x484/0x720 Call Trace: [cb9e5c80] [c001b77c] do_page_fault+0x484/0x720 (unreliable) [cb9e5cb0] [c000424c] DataAccess_virt+0xd4/0xe4 --- interrupt: 300 at __copy_tofrom_user+0x110/0x20c NIP: c001f9b4 LR: c03250a0 CTR: 00000004 REGS: cb9e5cc0 TRAP: 0300 Tainted: G W (5.13.0-pmac-00010-g8393422eb77) MSR: 00009032 <EE,ME,IR,DR,RI> CR: 48028468 XER: 20000000 DAR: afd4fa84 DSISR: 0a000000 GPR00: 20726f6f cb9e5d80 c1582c00 00000004 cb9e5e3a 00000016 afd4fa80 00000000 GPR08: 3835202d 72777872 2d78722d 00000004 28028464 0063f8c8 00000000 000186a0 GPR16: afd52dd4 afd52dd0 afd52dcc afd52dc8 0065a990 c07640c4 cb9e5e98 cb9e5e90 GPR24: 00000040 afd4fa96 00000040 cb9e5e0c 00000daa a0000000 cb9e5e98 afd4fa56 NIP [c001f9b4] __copy_tofrom_user+0x110/0x20c LR [c03250a0] _copy_to_iter+0x144/0x990 --- interrupt: 300 [cb9e5d80] [c03e89c0] n_tty_read+0xa4/0x598 (unreliable) [cb9e5df0] [c03e2a0c] tty_read+0xdc/0x2b4 [cb9e5e80] [c0156bf8] vfs_read+0x274/0x340 [cb9e5f00] [c01571ac] ksys_read+0x70/0x118 [cb9e5f30] [c0016048] ret_from_syscall+0x0/0x28 --- interrupt: c00 at 0xa7855c88 NIP: a7855c88 LR: a7855c5c CTR: 00000000 REGS: cb9e5f40 TRAP: 0c00 Tainted: G W (5.13.0-pmac-00010-g8393422eb77) MSR: 0000d032 <EE,PR,ME,IR,DR,RI> CR: 2402446c XER: 00000000 GPR00: 00000003 afd4ec70 a72137d0 0000000b afd4ecac 00004000 0065a990 00000800 GPR08: 00000000 a7947930 00000000 00000004 c15831b0 0063f8c8 00000000 000186a0 GPR16: afd52dd4 afd52dd0 afd52dcc afd52dc8 0065a990 0065a9e0 00000001 0065fac0 GPR24: 00000000 00000089 00664050 00000000 00668e30 a720c8dc a7943ff4 0065f9b0 NIP [a7855c88] 0xa7855c88 LR [a7855c5c] 0xa7855c5c --- interrupt: c00 Instruction dump: 3884aa88 38630178 48076861 807f0080 48042e45 2f830000 419e0148 3c80c079 3c60c076 38841be4 386301c0 4801f705 <0fe00000> 3860000b 4bfffe30 3c80c06b ---[ end trace fd69b91a8046c2e5 ]--- Here the problem is that by re-enterring an exception handler, kuap_save_and_lock() is called a second time with this time KUAP access locked, leading to regs->kuap being overwritten hence KUAP not being unlocked at exception exit as expected. Do not call do_IRQ() from timer_interrupt() directly. Instead, redefine do_IRQ() as a standard function named __do_IRQ(), and call it from both do_IRQ() and time_interrupt() handlers. Fixes: 3a96570ffceb ("powerpc: convert interrupt handlers to use wrappers") Cc: stable@vger.kernel.org # v5.12+ Reported-by: Stan Johnson <userm57@yahoo.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c17d234f4927d39a1d7100864a8e1145323d33a0.1628611927.git.christophe.leroy@csgroup.eu
| * powerpc/kprobes: Fix kprobe Oops happens in bookePu Lehui2021-08-091-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When using kprobe on powerpc booke series processor, Oops happens as show bellow: / # echo "p:myprobe do_nanosleep" > /sys/kernel/debug/tracing/kprobe_events / # echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe/enable / # sleep 1 [ 50.076730] Oops: Exception in kernel mode, sig: 5 [#1] [ 50.077017] BE PAGE_SIZE=4K SMP NR_CPUS=24 QEMU e500 [ 50.077221] Modules linked in: [ 50.077462] CPU: 0 PID: 77 Comm: sleep Not tainted 5.14.0-rc4-00022-g251a1524293d #21 [ 50.077887] NIP: c0b9c4e0 LR: c00ebecc CTR: 00000000 [ 50.078067] REGS: c3883de0 TRAP: 0700 Not tainted (5.14.0-rc4-00022-g251a1524293d) [ 50.078349] MSR: 00029000 <CE,EE,ME> CR: 24000228 XER: 20000000 [ 50.078675] [ 50.078675] GPR00: c00ebdf0 c3883e90 c313e300 c3883ea0 00000001 00000000 c3883ecc 00000001 [ 50.078675] GPR08: c100598c c00ea250 00000004 00000000 24000222 102490c2 bff4180c 101e60d4 [ 50.078675] GPR16: 00000000 102454ac 00000040 10240000 10241100 102410f8 10240000 00500000 [ 50.078675] GPR24: 00000002 00000000 c3883ea0 00000001 00000000 0000c350 3b9b8d50 00000000 [ 50.080151] NIP [c0b9c4e0] do_nanosleep+0x0/0x190 [ 50.080352] LR [c00ebecc] hrtimer_nanosleep+0x14c/0x1e0 [ 50.080638] Call Trace: [ 50.080801] [c3883e90] [c00ebdf0] hrtimer_nanosleep+0x70/0x1e0 (unreliable) [ 50.081110] [c3883f00] [c00ec004] sys_nanosleep_time32+0xa4/0x110 [ 50.081336] [c3883f40] [c001509c] ret_from_syscall+0x0/0x28 [ 50.081541] --- interrupt: c00 at 0x100a4d08 [ 50.081749] NIP: 100a4d08 LR: 101b5234 CTR: 00000003 [ 50.081931] REGS: c3883f50 TRAP: 0c00 Not tainted (5.14.0-rc4-00022-g251a1524293d) [ 50.082183] MSR: 0002f902 <CE,EE,PR,FP,ME> CR: 24000222 XER: 00000000 [ 50.082457] [ 50.082457] GPR00: 000000a2 bf980040 1024b4d0 bf980084 bf980084 64000000 00555345 fefefeff [ 50.082457] GPR08: 7f7f7f7f 101e0000 00000069 00000003 28000422 102490c2 bff4180c 101e60d4 [ 50.082457] GPR16: 00000000 102454ac 00000040 10240000 10241100 102410f8 10240000 00500000 [ 50.082457] GPR24: 00000002 bf9803f4 10240000 00000000 00000000 100039e0 00000000 102444e8 [ 50.083789] NIP [100a4d08] 0x100a4d08 [ 50.083917] LR [101b5234] 0x101b5234 [ 50.084042] --- interrupt: c00 [ 50.084238] Instruction dump: [ 50.084483] 4bfffc40 60000000 60000000 60000000 9421fff0 39400402 914200c0 38210010 [ 50.084841] 4bfffc20 00000000 00000000 00000000 <7fe00008> 7c0802a6 7c892378 93c10048 [ 50.085487] ---[ end trace f6fffe98e2fa8f3e ]--- [ 50.085678] Trace/breakpoint trap There is no real mode for booke arch and the MMU translation is always on. The corresponding MSR_IS/MSR_DS bit in booke is used to switch the address space, but not for real mode judgment. Fixes: 21f8b2fa3ca5 ("powerpc/kprobes: Ignore traps that happened in real mode") Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210809023658.218915-1-pulehui@huawei.com
| * powerpc/pseries: Fix update of LPAR security flavor after LPMLaurent Dufour2021-08-071-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After LPM, when migrating from a system with security mitigation enabled to a system with mitigation disabled, the security flavor exposed in /proc is not correctly set back to 0. Do not assume the value of the security flavor is set to 0 when entering init_cpu_char_feature_flags(), so when called after a LPM, the value is set correctly even if the mitigation are not turned off. Fixes: 6ce56e1ac380 ("powerpc/pseries: export LPAR security flavor in lparcfg") Cc: stable@vger.kernel.org # v5.13+ Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210805152308.33988-1-ldufour@linux.ibm.com
| * powerpc/smp: Fix OOPS in topology_init()Christophe Leroy2021-08-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Running an SMP kernel on an UP platform not prepared for it, I encountered the following OOPS: BUG: Kernel NULL pointer dereference on read at 0x00000034 Faulting instruction address: 0xc0a04110 Oops: Kernel access of bad area, sig: 11 [#1] BE PAGE_SIZE=4K SMP NR_CPUS=2 CMPCPRO Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.13.0-pmac-00001-g230fedfaad21 #5234 NIP: c0a04110 LR: c0a040d8 CTR: c0a04084 REGS: e100dda0 TRAP: 0300 Not tainted (5.13.0-pmac-00001-g230fedfaad21) MSR: 00009032 <EE,ME,IR,DR,RI> CR: 84000284 XER: 00000000 DAR: 00000034 DSISR: 20000000 GPR00: c0006bd4 e100de60 c1033320 00000000 00000000 c0942274 00000000 00000000 GPR08: 00000000 00000000 00000001 00000063 00000007 00000000 c0006f30 00000000 GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000005 GPR24: c0c67d74 c0c67f1c c0c60000 c0c67d70 c0c0c558 1efdf000 c0c00020 00000000 NIP [c0a04110] topology_init+0x8c/0x138 LR [c0a040d8] topology_init+0x54/0x138 Call Trace: [e100de60] [80808080] 0x80808080 (unreliable) [e100de90] [c0006bd4] do_one_initcall+0x48/0x1bc [e100def0] [c0a0150c] kernel_init_freeable+0x1c8/0x278 [e100df20] [c0006f44] kernel_init+0x14/0x10c [e100df30] [c00190fc] ret_from_kernel_thread+0x14/0x1c Instruction dump: 7c692e70 7d290194 7c035040 7c7f1b78 5529103a 546706fe 5468103a 39400001 7c641b78 40800054 80c690b4 7fb9402e <81060034> 7fbeea14 2c080000 7fa3eb78 ---[ end trace b246ffbc6bbbb6fb ]--- Fix it by checking smp_ops before using it, as already done in several other places in the arch/powerpc/kernel/smp.c Fixes: 39f87561454d ("powerpc/smp: Move ppc_md.cpu_die() to smp_ops.cpu_offline_self()") Cc: stable@vger.kernel.org Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/75287841cbb8740edd44880fe60be66d489160d9.1628097995.git.christophe.leroy@csgroup.eu
| * powerpc/32: Fix critical and debug interrupts on BOOKEChristophe Leroy2021-08-073-41/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 32 bits BOOKE have special interrupts for debug and other critical events. When handling those interrupts, dedicated registers are saved in the stack frame in addition to the standard registers, leading to a shift of the pt_regs struct. Since commit db297c3b07af ("powerpc/32: Don't save thread.regs on interrupt entry"), the pt_regs struct is expected to be at the same place all the time. Instead of handling a special struct in addition to pt_regs, just add those special registers to struct pt_regs. Fixes: db297c3b07af ("powerpc/32: Don't save thread.regs on interrupt entry") Cc: stable@vger.kernel.org Reported-by: Radu Rendec <radu.rendec@gmail.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/028d5483b4851b01ea4334d0751e7f260419092b.1625637264.git.christophe.leroy@csgroup.eu
| * powerpc/32s: Fix napping restore in data storage interrupt (DSI)Christophe Leroy2021-08-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a DSI (Data Storage Interrupt) is taken while in NAP mode, r11 doesn't survive the call to power_save_ppc32_restore(). So use r1 instead of r11 as they both contain the virtual stack pointer at that point. Fixes: 4c0104a83fc3 ("powerpc/32: Dismantle EXC_XFER_STD/LITE/TEMPLATE") Cc: stable@vger.kernel.org # v5.13+ Reported-by: Finn Thain <fthain@linux-m68k.org> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/731694e0885271f6ee9ffc179eb4bcee78313682.1628003562.git.christophe.leroy@csgroup.eu
* | Merge tag 'powerpc-5.14-4' of ↵Linus Torvalds2021-08-012-1/+8
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Don't use r30 in VDSO code, to avoid breaking existing Go lang programs. - Change an export symbol to allow non-GPL modules to use spinlocks again. Thanks to Paul Menzel, and Srikar Dronamraju. * tag 'powerpc-5.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/vdso: Don't use r30 to avoid breaking Go lang powerpc/pseries: Fix regression while building external modules
| * powerpc/vdso: Don't use r30 to avoid breaking Go langMichael Ellerman2021-07-291-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Go runtime uses r30 for some special value called 'g'. It assumes that value will remain unchanged even when calling VDSO functions. Although r30 is non-volatile across function calls, the callee is free to use it, as long as the callee saves the value and restores it before returning. It used to be true by accident that the VDSO didn't use r30, because the VDSO was hand-written asm. When we switched to building the VDSO from C the compiler started using r30, at least in some builds, leading to crashes in Go. eg: ~/go/src$ ./all.bash Building Go cmd/dist using /usr/lib/go-1.16. (go1.16.2 linux/ppc64le) Building Go toolchain1 using /usr/lib/go-1.16. go build os/exec: /usr/lib/go-1.16/pkg/tool/linux_ppc64le/compile: signal: segmentation fault go build reflect: /usr/lib/go-1.16/pkg/tool/linux_ppc64le/compile: signal: segmentation fault go tool dist: FAILED: /usr/lib/go-1.16/bin/go install -gcflags=-l -tags=math_big_pure_go compiler_bootstrap bootstrap/cmd/...: exit status 1 There are patches in flight to fix Go[1], but until they are released and widely deployed we can workaround it in the VDSO by avoiding use of r30. Note this only works with GCC, clang does not support -ffixed-rN. 1: https://go-review.googlesource.com/c/go/+/328110 Fixes: ab037dd87a2f ("powerpc/vdso: Switch VDSO to generic C implementation.") Cc: stable@vger.kernel.org # v5.11+ Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210729131244.2595519-1-mpe@ellerman.id.au
| * powerpc/pseries: Fix regression while building external modulesSrikar Dronamraju2021-07-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With commit c9f3401313a5 ("powerpc: Always enable queued spinlocks for 64s, disable for others") CONFIG_PPC_QUEUED_SPINLOCKS is always enabled on ppc64le, external modules that use spinlock APIs are failing. ERROR: modpost: GPL-incompatible module XXX.ko uses GPL-only symbol 'shared_processor' Before the above commit, modules were able to build without any issues. Also this problem is not seen on other architectures. This problem can be workaround if CONFIG_UNINLINE_SPIN_UNLOCK is enabled in the config. However CONFIG_UNINLINE_SPIN_UNLOCK is not enabled by default and only enabled in certain conditions like CONFIG_DEBUG_SPINLOCKS is set in the kernel config. #include <linux/module.h> spinlock_t spLock; static int __init spinlock_test_init(void) { spin_lock_init(&spLock); spin_lock(&spLock); spin_unlock(&spLock); return 0; } static void __exit spinlock_test_exit(void) { printk("spinlock_test unloaded\n"); } module_init(spinlock_test_init); module_exit(spinlock_test_exit); MODULE_DESCRIPTION ("spinlock_test"); MODULE_LICENSE ("non-GPL"); MODULE_AUTHOR ("Srikar Dronamraju"); Given that spin locks are one of the basic facilities for module code, this effectively makes it impossible to build/load almost any non GPL modules on ppc64le. This was first reported at https://github.com/openzfs/zfs/issues/11172 Currently shared_processor is exported as GPL only symbol. Fix this for parity with other architectures by exposing shared_processor to non-GPL modules too. Fixes: 14c73bd344da ("powerpc/vcpu: Assume dedicated processors as non-preempt") Cc: stable@vger.kernel.org # v5.5+ Reported-by: marc.c.dionne@gmail.com Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210729060449.292780-1-srikar@linux.vnet.ibm.com
| * Merge branch 'fixes' into nextMichael Ellerman2021-07-265-8/+68
| |\ | | | | | | | | | | | | Merge our fixes branch, which contains some fixes that didn't make it into rc2 but which we'd like in next.
* | \ Merge tag 'net-5.14-rc4' of ↵Linus Torvalds2021-07-312-0/+12
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.14-rc4, including fixes from bpf, can, WiFi (mac80211) and netfilter trees. Current release - regressions: - mac80211: fix starting aggregation sessions on mesh interfaces Current release - new code bugs: - sctp: send pmtu probe only if packet loss in Search Complete state - bnxt_en: add missing periodic PHC overflow check - devlink: fix phys_port_name of virtual port and merge error - hns3: change the method of obtaining default ptp cycle - can: mcba_usb_start(): add missing urb->transfer_dma initialization Previous releases - regressions: - set true network header for ECN decapsulation - mlx5e: RX, avoid possible data corruption w/ relaxed ordering and LRO - phy: re-add check for PHY_BRCM_DIS_TXCRXC_NOENRGY on the BCM54811 PHY - sctp: fix return value check in __sctp_rcv_asconf_lookup Previous releases - always broken: - bpf: - more spectre corner case fixes, introduce a BPF nospec instruction for mitigating Spectre v4 - fix OOB read when printing XDP link fdinfo - sockmap: fix cleanup related races - mac80211: fix enabling 4-address mode on a sta vif after assoc - can: - raw: raw_setsockopt(): fix raw_rcv panic for sock UAF - j1939: j1939_session_deactivate(): clarify lifetime of session object, avoid UAF - fix number of identical memory leaks in USB drivers - tipc: - do not blindly write skb_shinfo frags when doing decryption - fix sleeping in tipc accept routine" * tag 'net-5.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits) gve: Update MAINTAINERS list can: esd_usb2: fix memory leak can: ems_usb: fix memory leak can: usb_8dev: fix memory leak can: mcba_usb_start(): add missing urb->transfer_dma initialization can: hi311x: fix a signedness bug in hi3110_cmd() MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver bpf: Fix leakage due to insufficient speculative store bypass mitigation bpf: Introduce BPF nospec instruction for mitigating Spectre v4 sis900: Fix missing pci_disable_device() in probe and remove net: let flow have same hash in two directions nfc: nfcsim: fix use after free during module unload tulip: windbond-840: Fix missing pci_disable_device() in probe and remove sctp: fix return value check in __sctp_rcv_asconf_lookup nfc: s3fwrn5: fix undefined parameter values in dev_err() net/mlx5: Fix mlx5_vport_tbl_attr chain from u16 to u32 net/mlx5e: Fix nullptr in mlx5e_hairpin_get_mdev() net/mlx5: Unload device upon firmware fatal error net/mlx5e: Fix page allocation failure for ptp-RQ over SF net/mlx5e: Fix page allocation failure for trap-RQ over SF ...
| * \ \ Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller2021-07-292-0/+12
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Daniel Borkmann says: ==================== pull-request: bpf 2021-07-29 The following pull-request contains BPF updates for your *net* tree. We've added 9 non-merge commits during the last 14 day(s) which contain a total of 20 files changed, 446 insertions(+), 138 deletions(-). The main changes are: 1) Fix UBSAN out-of-bounds splat for showing XDP link fdinfo, from Lorenz Bauer. 2) Fix insufficient Spectre v4 mitigation in BPF runtime, from Daniel Borkmann, Piotr Krysiuk and Benedict Schlueter. 3) Batch of fixes for BPF sockmap found under stress testing, from John Fastabend. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | bpf: Introduce BPF nospec instruction for mitigating Spectre v4Daniel Borkmann2021-07-292-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case of JITs, each of the JIT backends compiles the BPF nospec instruction /either/ to a machine instruction which emits a speculation barrier /or/ to /no/ machine instruction in case the underlying architecture is not affected by Speculative Store Bypass or has different mitigations in place already. This covers both x86 and (implicitly) arm64: In case of x86, we use 'lfence' instruction for mitigation. In case of arm64, we rely on the firmware mitigation as controlled via the ssbd kernel parameter. Whenever the mitigation is enabled, it works for all of the kernel code with no need to provide any additional instructions here (hence only comment in arm64 JIT). Other archs can follow as needed. The BPF nospec instruction is specifically targeting Spectre v4 since i) we don't use a serialization barrier for the Spectre v1 case, and ii) mitigation instructions for v1 and v4 might be different on some archs. The BPF nospec is required for a future commit, where the BPF verifier does annotate intermediate BPF programs with speculation barriers. Co-developed-by: Piotr Krysiuk <piotras@gmail.com> Co-developed-by: Benedict Schlueter <benedict.schlueter@rub.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Piotr Krysiuk <piotras@gmail.com> Signed-off-by: Benedict Schlueter <benedict.schlueter@rub.de> Acked-by: Alexei Starovoitov <ast@kernel.org>
* | | | | Merge tag 'libata-5.14-2021-07-30' of git://git.kernel.dk/linux-blockLinus Torvalds2021-07-301-1/+0
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull libata fixlets from Jens Axboe: - A fix for PIO highmem (Christoph) - Kill HAVE_IDE as it's now unused (Lukas) * tag 'libata-5.14-2021-07-30' of git://git.kernel.dk/linux-block: arch: Kconfig: clean up obsolete use of HAVE_IDE libata: fix ata_pio_sector for CONFIG_HIGHMEM
| * | | | | arch: Kconfig: clean up obsolete use of HAVE_IDELukas Bulwahn2021-07-301-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The arch-specific Kconfig files use HAVE_IDE to indicate if IDE is supported. As IDE support and the HAVE_IDE config vanishes with commit b7fb14d3ac63 ("ide: remove the legacy ide driver"), there is no need to mention HAVE_IDE in all those arch-specific Kconfig files. The issue was identified with ./scripts/checkkconfigsymbols.py. Fixes: b7fb14d3ac63 ("ide: remove the legacy ide driver") Suggested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/r/20210728182115.4401-1-lukas.bulwahn@gmail.com Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
* | | | | | Merge tag 'powerpc-5.14-3' of ↵Linus Torvalds2021-07-255-8/+68
|\ \ \ \ \ \ | | |_|_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix guest to host memory corruption in H_RTAS due to missing nargs check. - Fix guest triggerable host crashes due to bad handling of nested guest TM state. - Fix possible crashes due to incorrect reference counting in kvm_arch_vcpu_ioctl(). - Two commits fixing some regressions in KVM transactional memory handling introduced by the recent rework of the KVM code. Thanks to Nicholas Piggin, Alexey Kardashevskiy, and Michael Neuling. * tag 'powerpc-5.14-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: KVM: PPC: Book3S HV Nested: Sanitise H_ENTER_NESTED TM state KVM: PPC: Book3S: Fix H_RTAS rets buffer overflow KVM: PPC: Fix kvm_arch_vcpu_ioctl vcpu_load leak KVM: PPC: Book3S: Fix CONFIG_TRANSACTIONAL_MEM=n crash KVM: PPC: Book3S HV P9: Fix guest TM support
| * | | | | KVM: PPC: Book3S HV Nested: Sanitise H_ENTER_NESTED TM stateNicholas Piggin2021-07-231-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The H_ENTER_NESTED hypercall is handled by the L0, and it is a request by the L1 to switch the context of the vCPU over to that of its L2 guest, and return with an interrupt indication. The L1 is responsible for switching some registers to guest context, and the L0 switches others (including all the hypervisor privileged state). If the L2 MSR has TM active, then the L1 is responsible for recheckpointing the L2 TM state. Then the L1 exits to L0 via the H_ENTER_NESTED hcall, and the L0 saves the TM state as part of the exit, and then it recheckpoints the TM state as part of the nested entry and finally HRFIDs into the L2 with TM active MSR. Not efficient, but about the simplest approach for something that's horrendously complicated. Problems arise if the L1 exits to the L0 with a TM state which does not match the L2 TM state being requested. For example if the L1 is transactional but the L2 MSR is non-transactional, or vice versa. The L0's HRFID can take a TM Bad Thing interrupt and crash. Fix this by disallowing H_ENTER_NESTED in TM[T] state entirely, and then ensuring that if the L1 is suspended then the L2 must have TM active, and if the L1 is not suspended then the L2 must not have TM active. Fixes: 360cae313702 ("KVM: PPC: Book3S HV: Nested guest entry via hypercall") Cc: stable@vger.kernel.org # v4.20+ Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | | | | KVM: PPC: Book3S: Fix H_RTAS rets buffer overflowNicholas Piggin2021-07-231-3/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The kvmppc_rtas_hcall() sets the host rtas_args.rets pointer based on the rtas_args.nargs that was provided by the guest. That guest nargs value is not range checked, so the guest can cause the host rets pointer to be pointed outside the args array. The individual rtas function handlers check the nargs and nrets values to ensure they are correct, but if they are not, the handlers store a -3 (0xfffffffd) failure indication in rets[0] which corrupts host memory. Fix this by testing up front whether the guest supplied nargs and nret would exceed the array size, and fail the hcall directly without storing a failure indication to rets[0]. Also expand on a comment about why we kill the guest and try not to return errors directly if we have a valid rets[0] pointer. Fixes: 8e591cb72047 ("KVM: PPC: Book3S: Add infrastructure to implement kernel-side RTAS calls") Cc: stable@vger.kernel.org # v3.10+ Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
| * | | | | KVM: PPC: Fix kvm_arch_vcpu_ioctl vcpu_load leakNicholas Piggin2021-07-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vcpu_put is not called if the user copy fails. This can result in preempt notifier corruption and crashes, among other issues. Fixes: b3cebfe8c1ca ("KVM: PPC: Move vcpu_load/vcpu_put down to each ioctl case in kvm_arch_vcpu_ioctl") Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210716024310.164448-2-npiggin@gmail.com
| * | | | | KVM: PPC: Book3S: Fix CONFIG_TRANSACTIONAL_MEM=n crashNicholas Piggin2021-07-171-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When running CPU_FTR_P9_TM_HV_ASSIST, HFSCR[TM] is set for the guest even if the host has CONFIG_TRANSACTIONAL_MEM=n, which causes it to be unprepared to handle guest exits while transactional. Normal guests don't have a problem because the HTM capability will not be advertised, but a rogue or buggy one could crash the host. Fixes: 4bb3c7a0208f ("KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9") Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210716024310.164448-1-npiggin@gmail.com
| * | | | | KVM: PPC: Book3S HV P9: Fix guest TM supportNicholas Piggin2021-07-151-3/+22
| | |_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The conversion to C introduced several bugs in TM handling that can cause host crashes with TM bad thing interrupts. Mostly just simple typos or missed logic in the conversion that got through due to my not testing TM in the guest sufficiently. - Early TM emulation for the softpatch interrupt should be done if fake suspend mode is _not_ active. - Early TM emulation wants to return immediately to the guest so as to not doom transactions unnecessarily. - And if exiting from the guest, the host MSR should include the TM[S] bit if the guest was T/S, before it is treclaimed. After this fix, all the TM selftests pass when running on a P9 processor that implements TM with softpatch interrupt. Fixes: 89d35b2391015 ("KVM: PPC: Book3S HV P9: Implement the rest of the P9 path in C") Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210712013650.376325-1-npiggin@gmail.com
* | | | | Merge tag 'fallthrough-fixes-clang-5.14-rc3' of ↵Linus Torvalds2021-07-231-0/+1
|\ \ \ \ \ | |_|_|/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux Pull fallthrough fix from Gustavo Silva: "Fix a fall-through warning when building with -Wimplicit-fallthrough on PowerPC" * tag 'fallthrough-fixes-clang-5.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: powerpc/pasemi: Fix fall-through warning for Clang
| * | | | powerpc/pasemi: Fix fall-through warning for ClangGustavo A. R. Silva2021-07-191-0/+1
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the following fallthrough warning: arch/powerpc/platforms/pasemi/idle.c:45:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough] Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/lkml/60efbf18.d9n6eXv275OJcc7T%25lkp@intel.com/ Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
* | | | Merge tag 'arm64-fixes' of ↵Linus Torvalds2021-07-221-0/+10
|\ \ \ \ | |/ / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "A pair of arm64 fixes for -rc3. The straightforward one is a fix to our firmware calling stub, which accidentally started corrupting the link register on machines with SVE. Since these machines don't really exist yet, it wasn't spotted in -next. The other fix is a revert-and-a-bit of a patch originally intended to allow PTE-level huge mappings for the VMAP area on 32-bit PPC 8xx. A side-effect of this change was that our pXd_set_huge() implementations could be replaced with generic dummy functions depending on the levels of page-table being used, which in turn broke the boot if we fail to create the linear mapping as a result of using these functions to operate on the pgd. Huge thanks to Michael Ellerman for modifying the revert so as not to regress PPC 8xx in terms of functionality. Anyway, that's the background and it's also available in the commit message along with Link tags pointing at all of the fun. Summary: - Fix hang when issuing SMC on SVE-capable system due to clobbered LR - Fix boot failure due to missing block mappings with folded page-table" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge" arm64: smccc: Save lr before calling __arm_smccc_sve_check()
| * | | Revert "mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge"Jonathan Marek2021-07-211-0/+10
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit c742199a014de23ee92055c2473d91fe5561ffdf. c742199a014d ("mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge") breaks arm64 in at least two ways for configurations where PUD or PMD folding occur: 1. We no longer install huge-vmap mappings and silently fall back to page-granular entries, despite being able to install block entries at what is effectively the PGD level. 2. If the linear map is backed with block mappings, these will now silently fail to be created in alloc_init_pud(), causing a panic early during boot. The pgtable selftests caught this, although a fix has not been forthcoming and Christophe is AWOL at the moment, so just revert the change for now to get a working -rc3 on which we can queue patches for 5.15. A simple revert breaks the build for 32-bit PowerPC 8xx machines, which rely on the default function definitions when the corresponding page-table levels are folded, since commit a6a8f7c4aa7e ("powerpc/8xx: add support for huge pages on VMAP and VMALLOC"), eg: powerpc64-linux-ld: mm/vmalloc.o: in function `vunmap_pud_range': linux/mm/vmalloc.c:362: undefined reference to `pud_clear_huge' To avoid that, add stubs for pud_clear_huge() and pmd_clear_huge() in arch/powerpc/mm/nohash/8xx.c as suggested by Christophe. Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Fixes: c742199a014d ("mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge") Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> [mpe: Fold in 8xx.c changes from Christophe and mention in change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/linux-arm-kernel/CAMuHMdXShORDox-xxaeUfDW3wx2PeggFSqhVSHVZNKCGK-y_vQ@mail.gmail.com/ Link: https://lore.kernel.org/r/20210717160118.9855-1-jonathan@marek.ca Link: https://lore.kernel.org/r/87r1fs1762.fsf@mpe.ellerman.id.au Signed-off-by: Will Deacon <will@kernel.org>
* / / powerpc/smp: Fix fall-through warning for ClangGustavo A. R. Silva2021-07-141-0/+1
|/ / | | | | | | | | | | | | | | | | | | Fix the following fallthrough warning: arch/powerpc/platforms/powermac/smp.c:149:3: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough] Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/lkml/60ef0750.I8J+C6KAtb0xVOAa%25lkp@intel.com/ Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
* | Merge tag 'kbuild-v5.14' of ↵Linus Torvalds2021-07-101-3/+0
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Increase the -falign-functions alignment for the debug option. - Remove ugly libelf checks from the top Makefile. - Make the silent build (-s) more silent. - Re-compile the kernel if KBUILD_BUILD_TIMESTAMP is specified. - Various script cleanups * tag 'kbuild-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (27 commits) scripts: add generic syscallnr.sh scripts: check duplicated syscall number in syscall table sparc: syscalls: use pattern rules to generate syscall headers parisc: syscalls: use pattern rules to generate syscall headers nds32: add arch/nds32/boot/.gitignore kbuild: mkcompile_h: consider timestamp if KBUILD_BUILD_TIMESTAMP is set kbuild: modpost: Explicitly warn about unprototyped symbols kbuild: remove trailing slashes from $(KBUILD_EXTMOD) kconfig.h: explain IS_MODULE(), IS_ENABLED() kconfig: constify long_opts scripts/setlocalversion: simplify the short version part scripts/setlocalversion: factor out 12-chars hash construction scripts/setlocalversion: add more comments to -dirty flag detection scripts/setlocalversion: remove workaround for old make-kpkg scripts/setlocalversion: remove mercurial, svn and git-svn supports kbuild: clean up ${quiet} checks in shell scripts kbuild: sink stdout from cmd for silent build init: use $(call cmd,) for generating include/generated/compile.h kbuild: merge scripts/mkmakefile to top Makefile sh: move core-y in arch/sh/Makefile to arch/sh/Kbuild ...
| * | kbuild: require all architectures to have arch/$(SRCARCH)/KbuildMasahiro Yamada2021-05-261-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | arch/$(SRCARCH)/Kbuild is useful for Makefile cleanups because you can use the obj-y syntax. Add an empty file if it is missing in arch/$(SRCARCH)/. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
* | | Merge tag 'powerpc-5.14-2' of ↵Linus Torvalds2021-07-097-19/+22
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Fix crashes on 64-bit Book3E due to use of Book3S only mtmsrd instruction. Fix "scheduling while atomic" warnings at boot due to preempt count underflow. Two commits fixing our handling of BPF atomic instructions. Fix error handling in xive when allocating an IPI. Fix lockup on kernel exec fault on 603. Thanks to Bharata B Rao, Cédric Le Goater, Christian Zigotzky, Christophe Leroy, Guenter Roeck, Jiri Olsa, Naveen N. Rao, Nicholas Piggin, and Valentin Schneider" * tag 'powerpc-5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/preempt: Don't touch the idle task's preempt_count during hotplug powerpc/64e: Fix system call illegal mtmsrd instruction powerpc/xive: Fix error handling when allocating an IPI powerpc/bpf: Reject atomic ops in ppc32 JIT powerpc/bpf: Fix detecting BPF atomic instructions powerpc/mm: Fix lockup on kernel exec fault
| * | | powerpc/preempt: Don't touch the idle task's preempt_count during hotplugValentin Schneider2021-07-082-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Powerpc currently resets a CPU's idle task preempt_count to 0 before said task starts executing the secondary startup routine (and becomes an idle task proper). This conflicts with commit f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled"). which initializes all of the idle tasks' preempt_count to PREEMPT_DISABLED during smp_init(). Note that this was superfluous before said commit, as back then the hotplug machinery would invoke init_idle() via idle_thread_get(), which would have already reset the CPU's idle task's preempt_count to PREEMPT_ENABLED. Get rid of this preempt_count write. Fixes: f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled") Reported-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Bharata B Rao <bharata@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210707183831.2106509-1-valentin.schneider@arm.com
| * | | powerpc/64e: Fix system call illegal mtmsrd instructionNicholas Piggin2021-07-061-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BookE does not have mtmsrd, switch to use wrteei to enable MSR[EE]. Fixes: dd152f70bdc1 ("powerpc/64s: system call avoid setting MSR[RI] until we set MSR[EE]") Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210706051310.608992-1-npiggin@gmail.com
| * | | powerpc/xive: Fix error handling when allocating an IPICédric Le Goater2021-07-051-4/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a smatch warning: arch/powerpc/sysdev/xive/common.c:1161 xive_request_ipi() warn: unsigned 'xid->irq' is never less than zero. Fixes: fd6db2892eba ("powerpc/xive: Modernize XIVE-IPI domain with an 'alloc' handler") Cc: stable@vger.kernel.org # v5.13 Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210701152412.1507612-1-clg@kaod.org
| * | | powerpc/bpf: Reject atomic ops in ppc32 JITNaveen N. Rao2021-07-051-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 91c960b0056672 ("bpf: Rename BPF_XADD and prepare to encode other atomics in .imm") converted BPF_XADD to BPF_ATOMIC and updated all JIT implementations to reject JIT'ing instructions with an immediate value different from BPF_ADD. However, ppc32 BPF JIT was implemented around the same time and didn't include the same change. Update the ppc32 JIT accordingly. Fixes: 51c66ad849a7 ("powerpc/bpf: Implement extended BPF on PPC32") Cc: stable@vger.kernel.org # v5.13+ Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/426699046d89fe50f66ecf74bd31c01eda976ba5.1625145429.git.naveen.n.rao@linux.vnet.ibm.com
| * | | powerpc/bpf: Fix detecting BPF atomic instructionsNaveen N. Rao2021-07-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 91c960b0056672 ("bpf: Rename BPF_XADD and prepare to encode other atomics in .imm") converted BPF_XADD to BPF_ATOMIC and added a way to distinguish instructions based on the immediate field. Existing JIT implementations were updated to check for the immediate field and to reject programs utilizing anything more than BPF_ADD (such as BPF_FETCH) in the immediate field. However, the check added to powerpc64 JIT did not look at the correct BPF instruction. Due to this, such programs would be accepted and incorrectly JIT'ed resulting in soft lockups, as seen with the atomic bounds test. Fix this by looking at the correct immediate value. Fixes: 91c960b0056672 ("bpf: Rename BPF_XADD and prepare to encode other atomics in .imm") Reported-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4117b430ffaa8cd7af042496f87fd7539e4f17fd.1625145429.git.naveen.n.rao@linux.vnet.ibm.com
| * | | powerpc/mm: Fix lockup on kernel exec faultChristophe Leroy2021-07-051-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The powerpc kernel is not prepared to handle exec faults from kernel. Especially, the function is_exec_fault() will return 'false' when an exec fault is taken by kernel, because the check is based on reading current->thread.regs->trap which contains the trap from user. For instance, when provoking a LKDTM EXEC_USERSPACE test, current->thread.regs->trap is set to SYSCALL trap (0xc00), and the fault taken by the kernel is not seen as an exec fault by set_access_flags_filter(). Commit d7df2443cd5f ("powerpc/mm: Fix spurious segfaults on radix with autonuma") made it clear and handled it properly. But later on commit d3ca587404b3 ("powerpc/mm: Fix reporting of kernel execute faults") removed that handling, introducing test based on error_code. And here is the problem, because on the 603 all upper bits of SRR1 get cleared when the TLB instruction miss handler bails out to ISI. Until commit cbd7e6ca0210 ("powerpc/fault: Avoid heavy search_exception_tables() verification"), an exec fault from kernel at a userspace address was indirectly caught by the lack of entry for that address in the exception tables. But after that commit the kernel mainly relies on KUAP or on core mm handling to catch wrong user accesses. Here the access is not wrong, so mm handles it. It is a minor fault because PAGE_EXEC is not set, set_access_flags_filter() should set PAGE_EXEC and voila. But as is_exec_fault() returns false as explained in the beginning, set_access_flags_filter() bails out without setting PAGE_EXEC flag, which leads to a forever minor exec fault. As the kernel is not prepared to handle such exec faults, the thing to do is to fire in bad_kernel_fault() for any exec fault taken by the kernel, as it was prior to commit d3ca587404b3. Fixes: d3ca587404b3 ("powerpc/mm: Fix reporting of kernel execute faults") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/024bb05105050f704743a0083fe3548702be5706.1625138205.git.christophe.leroy@csgroup.eu
* | | | powerpc/mm: enable HAVE_MOVE_PMD supportAneesh Kumar K.V2021-07-081-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mremap HAVE_MOVE_PMD/PUD optimization time comparison for 1GB region: 1GB mremap - Source PTE-aligned, Destination PTE-aligned mremap time: 2292772ns 1GB mremap - Source PMD-aligned, Destination PMD-aligned mremap time: 1158928ns 1GB mremap - Source PUD-aligned, Destination PUD-aligned mremap time: 63886ns Link: https://lkml.kernel.org/r/20210616045735.374532-4-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Hugh Dickins <hughd@google.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | powerpc/book3s64/mm: update flush_tlb_range to flush page walk cacheAneesh Kumar K.V2021-07-083-18/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | flush_tlb_range is special in that we don't specify the page size used for the translation. Hence when flushing TLB we flush the translation cache for all possible page sizes. The kernel also uses the same interface when moving page tables around. Such a move requires us to flush the page walk cache. Instead of adding another interface to force page walk cache flush, update flush_tlb_range to flush page walk cache if the range flushed is more than the PMD range. A page table move will always involve an invalidate range more than PMD_SIZE. Running microbenchmark with mprotect and parallel memory access didn't show any observable performance impact. Link: https://lkml.kernel.org/r/20210616045735.374532-3-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Hugh Dickins <hughd@google.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | mm/mremap: allow arch runtime overrideAneesh Kumar K.V2021-07-081-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Patch series "Speedup mremap on ppc64", v8. This patchset enables MOVE_PMD/MOVE_PUD support on power. This requires the platform to support updating higher-level page tables without updating page table entries. This also needs to invalidate the Page Walk Cache on architecture supporting the same. This patch (of 3): Architectures like ppc64 support faster mremap only with radix translation. Hence allow a runtime check w.r.t support for fast mremap. Link: https://lkml.kernel.org/r/20210616045735.374532-1-aneesh.kumar@linux.ibm.com Link: https://lkml.kernel.org/r/20210616045735.374532-2-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | mm: rename p4d_page_vaddr to p4d_pgtable and make it return pud_t *Aneesh Kumar K.V2021-07-084-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No functional change in this patch. [aneesh.kumar@linux.ibm.com: m68k build error reported by kernel robot] Link: https://lkml.kernel.org/r/87tulxnb2v.fsf@linux.ibm.com Link: https://lkml.kernel.org/r/20210615110859.320299-2-aneesh.kumar@linux.ibm.com Link: https://lore.kernel.org/linuxppc-dev/CAHk-=wi+J+iodze9FtjM3Zi4j4OeS+qqbKxME9QN4roxPEXH9Q@mail.gmail.com/ Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Hugh Dickins <hughd@google.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | mm: rename pud_page_vaddr to pud_pgtable and make it return pmd_t *Aneesh Kumar K.V2021-07-084-5/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No functional change in this patch. [aneesh.kumar@linux.ibm.com: fix] Link: https://lkml.kernel.org/r/87wnqtnb60.fsf@linux.ibm.com [sfr@canb.auug.org.au: another fix] Link: https://lkml.kernel.org/r/20210619134410.89559-1-aneesh.kumar@linux.ibm.com Link: https://lkml.kernel.org/r/20210615110859.320299-1-aneesh.kumar@linux.ibm.com Link: https://lore.kernel.org/linuxppc-dev/CAHk-=wi+J+iodze9FtjM3Zi4j4OeS+qqbKxME9QN4roxPEXH9Q@mail.gmail.com/ Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Hugh Dickins <hughd@google.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Kalesh Singh <kaleshsingh@google.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | powerpc: convert to setup_initial_init_mm()Kefeng Wang2021-07-081-4/+1
| |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use setup_initial_init_mm() helper to simplify code. Link: https://lkml.kernel.org/r/20210608083418.137226-12-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Souptick Joarder <jrdr.linux@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | Merge tag 'tty-5.14-rc1' of ↵Linus Torvalds2021-07-051-1/+0
|\ \ \ | |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty / serial updates from Greg KH: "Here is the big set of tty and serial driver patches for 5.14-rc1. A bit more than normal, but nothing major, lots of cleanups. Highlights are: - lots of tty api cleanups and mxser driver cleanups from Jiri - build warning fixes - various serial driver updates - coding style cleanups - various tty driver minor fixes and updates - removal of broken and disable r3964 line discipline (finally!) All of these have been in linux-next for a while with no reported issues" * tag 'tty-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (227 commits) serial: mvebu-uart: remove unused member nb from struct mvebu_uart arm64: dts: marvell: armada-37xx: Fix reg for standard variant of UART dt-bindings: mvebu-uart: fix documentation serial: mvebu-uart: correctly calculate minimal possible baudrate serial: mvebu-uart: do not allow changing baudrate when uartclk is not available serial: mvebu-uart: fix calculation of clock divisor tty: make linux/tty_flip.h self-contained serial: Prefer unsigned int to bare use of unsigned serial: 8250: 8250_omap: Fix possible interrupt storm on K3 SoCs serial: qcom_geni_serial: use DT aliases according to DT bindings Revert "tty: serial: Add UART driver for Cortina-Access platform" tty: serial: Add UART driver for Cortina-Access platform MAINTAINERS: add me back as mxser maintainer mxser: Documentation, fix typos mxser: Documentation, make the docs up-to-date mxser: Documentation, remove traces of callout device mxser: introduce mxser_16550A_or_MUST helper mxser: rename flags to old_speed in mxser_set_serial_info mxser: use port variable in mxser_set_serial_info mxser: access info->MCR under info->slock ...
| * | Merge tag 'v5.13-rc6' into tty-nextGreg Kroah-Hartman2021-06-148-57/+49
| |\ \ | | | | | | | | | | | | | | | | | | | | We want the tty fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>