Merge tag 'powerpc-5.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix random crashes on some 32-bit CPUs by adding isync() after
locking/unlocking KUEP
- Fix intermittent crashes when loading modules with strict module RWX
- Fix a section mismatch introduced by a previous fix.
Thanks to Christophe Leroy, Fabiano Rosas, Laurent Vivier, Murilo
Opsfelder Araújo, Nathan Chancellor, and Stan Johnson.
* tag 'powerpc-5.14-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm: Fix set_memory_*() against concurrent accesses
powerpc/32s: Fix random crashes by adding isync() after locking/unlocking KUEP
powerpc/xive: Do not mark xive_request_ipi() as __init
Laurent reported that STRICT_MODULE_RWX was causing intermittent crashes
on one of his systems:
kernel tried to execute exec-protected page (c008000004073278) - exploit attempt? (uid: 0)
BUG: Unable to handle kernel instruction fetch
Faulting instruction address: 0xc008000004073278
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: drm virtio_console fuse drm_panel_orientation_quirks ...
CPU: 3 PID: 44 Comm: kworker/3:1 Not tainted 5.14.0-rc4+ #12
Workqueue: events control_work_handler [virtio_console]
NIP: c008000004073278 LR: c008000004073278 CTR: c0000000001e9de0
REGS: c00000002e4ef7e0 TRAP: 0400 Not tainted (5.14.0-rc4+)
MSR: 800000004280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 24002822 XER: 200400cf
...
NIP fill_queue+0xf0/0x210 [virtio_console]
LR fill_queue+0xf0/0x210 [virtio_console]
Call Trace:
fill_queue+0xb4/0x210 [virtio_console] (unreliable)
add_port+0x1a8/0x470 [virtio_console]
control_work_handler+0xbc/0x1e8 [virtio_console]
process_one_work+0x290/0x590
worker_thread+0x88/0x620
kthread+0x194/0x1a0
ret_from_kernel_thread+0x5c/0x64
Jordan, Fabiano & Murilo were able to reproduce and identify that the
problem is caused by the call to module_enable_ro() in do_init_module(),
which happens after the module's init function has already been called.
Our current implementation of change_page_attr() is not safe against
concurrent accesses, because it invalidates the PTE before flushing the
TLB and then installing the new PTE. That leaves a window during which
there is no valid PTE for the page; if another CPU tries to access the
page at that time, we see a fault like the one above.
We can't simply switch to set_pte_at()/flush TLB, because our hash MMU
code doesn't handle a set_pte_at() of a valid PTE. See [1].
But we do have pte_update(), which replaces the old PTE with the new,
meaning there's no window where the PTE is invalid. And the hash MMU
version hash__pte_update() deals with synchronising the hash page table
correctly.
[1]: https://lore.kernel.org/linuxppc-dev/87y318wp9r.fsf@linux.ibm.com/
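For illustration only, a minimal sketch of the two orderings (not the
literal patch; clr_bits/set_bits stand in for the actual bit masks):

  /* Old change_page_attr() ordering: the PTE is invalid between the
   * clear and the final set_pte_at(), so a concurrent access from
   * another CPU hits a fault like the one above.
   */
  pte_clear(&init_mm, addr, ptep);                 /* window opens  */
  flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
  set_pte_at(&init_mm, addr, ptep, new_pte);       /* window closes */

  /* pte_update() ordering: the old PTE is replaced by the new one in
   * one step, so there is never an invalid PTE, and hash__pte_update()
   * keeps the hash page table in sync.
   */
  pte_update(&init_mm, addr, ptep, clr_bits, set_bits, 0);
  flush_tlb_kernel_range(addr, addr + PAGE_SIZE);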
Fixes: 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines")
Reported-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Murilo Opsfelder Araújo <muriloo@linux.ibm.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210818120518.3603172-1-mpe@ellerman.id.au
This reverts commit c742199a014de23ee92055c2473d91fe5561ffdf.
c742199a014d ("mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge")
breaks arm64 in at least two ways for configurations where PUD or PMD
folding occur:
1. We no longer install huge-vmap mappings and silently fall back to
page-granular entries, despite being able to install block entries
at what is effectively the PGD level.
2. If the linear map is backed with block mappings, these will now
silently fail to be created in alloc_init_pud(), causing a panic
early during boot.
The pgtable selftests caught this, although a fix has not been
forthcoming and Christophe is AWOL at the moment, so just revert the
change for now to get a working -rc3 on which we can queue patches for
5.15.
A simple revert breaks the build for 32-bit PowerPC 8xx machines, which
rely on the default function definitions when the corresponding
page-table levels are folded, since commit a6a8f7c4aa7e ("powerpc/8xx:
add support for huge pages on VMAP and VMALLOC"), eg:
powerpc64-linux-ld: mm/vmalloc.o: in function `vunmap_pud_range':
linux/mm/vmalloc.c:362: undefined reference to `pud_clear_huge'
To avoid that, add stubs for pud_clear_huge() and pmd_clear_huge() in
arch/powerpc/mm/nohash/8xx.c as suggested by Christophe.
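For reference, stubs of this shape are all that is needed (a sketch
modelled on the generic fallbacks; see the commit for the final form):

  /* arch/powerpc/mm/nohash/8xx.c: satisfy the references from
   * mm/vmalloc.c when the PMD/PUD levels are folded. Returning 0
   * means "no huge entry was cleared", which is always true here.
   */
  int pmd_clear_huge(pmd_t *pmd)
  {
          return 0;
  }

  int pud_clear_huge(pud_t *pud)
  {
          return 0;
  }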
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: c742199a014d ("mm/pgtable: add stubs for {pmd/pub}_{set/clear}_huge")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
[mpe: Fold in 8xx.c changes from Christophe and mention in change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/linux-arm-kernel/CAMuHMdXShORDox-xxaeUfDW3wx2PeggFSqhVSHVZNKCGK-y_vQ@mail.gmail.com/
Link: https://lore.kernel.org/r/20210717160118.9855-1-jonathan@marek.ca
Link: https://lore.kernel.org/r/87r1fs1762.fsf@mpe.ellerman.id.au
Signed-off-by: Will Deacon <will@kernel.org>
Merge tag 'powerpc-5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Fix crashes on 64-bit Book3E due to use of Book3S only mtmsrd
instruction.
Fix "scheduling while atomic" warnings at boot due to preempt count
underflow.
Two commits fixing our handling of BPF atomic instructions.
Fix error handling in xive when allocating an IPI.
Fix lockup on kernel exec fault on 603.
Thanks to Bharata B Rao, Cédric Le Goater, Christian Zigotzky,
Christophe Leroy, Guenter Roeck, Jiri Olsa, Naveen N. Rao, Nicholas
Piggin, and Valentin Schneider"
* tag 'powerpc-5.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/preempt: Don't touch the idle task's preempt_count during hotplug
powerpc/64e: Fix system call illegal mtmsrd instruction
powerpc/xive: Fix error handling when allocating an IPI
powerpc/bpf: Reject atomic ops in ppc32 JIT
powerpc/bpf: Fix detecting BPF atomic instructions
powerpc/mm: Fix lockup on kernel exec fault
The powerpc kernel is not prepared to handle exec faults from the
kernel. In particular, the function is_exec_fault() will return 'false'
when an exec fault is taken by the kernel, because the check is based
on reading current->thread.regs->trap, which contains the trap from
userspace.
For instance, when provoking a LKDTM EXEC_USERSPACE test,
current->thread.regs->trap is set to SYSCALL trap (0xc00), and
the fault taken by the kernel is not seen as an exec fault by
set_access_flags_filter().
Commit d7df2443cd5f ("powerpc/mm: Fix spurious segfaults on radix
with autonuma") made it clear and handled it properly. But later on
commit d3ca587404b3 ("powerpc/mm: Fix reporting of kernel execute
faults") removed that handling, introducing test based on error_code.
And here is the problem, because on the 603 all upper bits of SRR1
get cleared when the TLB instruction miss handler bails out to ISI.
Until commit cbd7e6ca0210 ("powerpc/fault: Avoid heavy
search_exception_tables() verification"), an exec fault from kernel
at a userspace address was indirectly caught by the lack of entry for
that address in the exception tables. But after that commit the
kernel mainly relies on KUAP or on core mm handling to catch wrong
user accesses. Here the access is not wrong, so mm handles it.
It is a minor fault because PAGE_EXEC is not set;
set_access_flags_filter() should set PAGE_EXEC and voila.
But as is_exec_fault() returns false as explained in the beginning,
set_access_flags_filter() bails out without setting PAGE_EXEC flag,
which leads to a forever minor exec fault.
As the kernel is not prepared to handle such exec faults, the thing to
do is to fire in bad_kernel_fault() for any exec fault taken by the
kernel, as it was prior to commit d3ca587404b3.
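The shape of the fix is an early bail-out; a sketch (simplified, and
assuming the INTERRUPT_INST_STORAGE trap constant; the real patch is
in arch/powerpc/mm/fault.c):

  /* In bad_kernel_fault(): fail any exec fault taken in kernel mode
   * up front, regardless of error_code bits, since the kernel never
   * legitimately executes from a faulting page.
   */
  int is_exec = TRAP(regs) == INTERRUPT_INST_STORAGE;

  if (is_exec) {
          pr_crit_ratelimited("kernel tried to execute %s page (%lx) - exploit attempt? (uid: %d)\n",
                              address >= TASK_SIZE ? "exec-protected" : "user",
                              address,
                              from_kuid(&init_user_ns, current_uid()));
          return true;
  }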
Fixes: d3ca587404b3 ("powerpc/mm: Fix reporting of kernel execute faults")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/024bb05105050f704743a0083fe3548702be5706.1625138205.git.christophe.leroy@csgroup.eu
flush_tlb_range is special in that we don't specify the page size used for
the translation. Hence when flushing the TLB we flush the translation cache
for all possible page sizes. The kernel also uses the same interface when
moving page tables around. Such a move requires us to flush the page walk
cache.
Instead of adding another interface to force page walk cache flush, update
flush_tlb_range to flush page walk cache if the range flushed is more than
the PMD range. A page table move will always involve an invalidation
range of more than PMD_SIZE.
Running a microbenchmark with mprotect and parallel memory access didn't
show any observable performance impact.
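A sketch of the heuristic (the helper name is hypothetical; the real
change sits inside the radix flush_tlb_range path):

  /* Flush the page walk cache together with the TLB whenever the
   * range could have covered a page-table move: such moves always
   * invalidate at least PMD_SIZE worth of address space.
   */
  static inline bool range_needs_pwc_flush(unsigned long start,
                                           unsigned long end)
  {
          return (end - start) >= PMD_SIZE;
  }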
Link: https://lkml.kernel.org/r/20210616045735.374532-3-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Hugh Dickins <hughd@google.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
No functional change in this patch.
[aneesh.kumar@linux.ibm.com: m68k build error reported by kernel robot]
Link: https://lkml.kernel.org/r/87tulxnb2v.fsf@linux.ibm.com
Link: https://lkml.kernel.org/r/20210615110859.320299-2-aneesh.kumar@linux.ibm.com
Link: https://lore.kernel.org/linuxppc-dev/CAHk-=wi+J+iodze9FtjM3Zi4j4OeS+qqbKxME9QN4roxPEXH9Q@mail.gmail.com/
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Hugh Dickins <hughd@google.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
No functional change in this patch.
[aneesh.kumar@linux.ibm.com: fix]
Link: https://lkml.kernel.org/r/87wnqtnb60.fsf@linux.ibm.com
[sfr@canb.auug.org.au: another fix]
Link: https://lkml.kernel.org/r/20210619134410.89559-1-aneesh.kumar@linux.ibm.com
Link: https://lkml.kernel.org/r/20210615110859.320299-1-aneesh.kumar@linux.ibm.com
Link: https://lore.kernel.org/linuxppc-dev/CAHk-=wi+J+iodze9FtjM3Zi4j4OeS+qqbKxME9QN4roxPEXH9Q@mail.gmail.com/
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Hugh Dickins <hughd@google.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'powerpc-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- A big series refactoring parts of our KVM code, and converting some
to C.
- Support for ARCH_HAS_SET_MEMORY, and ARCH_HAS_STRICT_MODULE_RWX on
some CPUs.
- Support for the Microwatt soft-core.
- Optimisations to our interrupt return path on 64-bit.
- Support for userspace access to the NX GZIP accelerator on PowerVM on
Power10.
- Enable KUAP and KUEP by default on 32-bit Book3S CPUs.
- Other smaller features, fixes & cleanups.
Thanks to: Andy Shevchenko, Aneesh Kumar K.V, Arnd Bergmann, Athira
Rajeev, Baokun Li, Benjamin Herrenschmidt, Bharata B Rao, Christophe
Leroy, Daniel Axtens, Daniel Henrique Barboza, Finn Thain, Geoff Levand,
Haren Myneni, Jason Wang, Jiapeng Chong, Joel Stanley, Jordan Niethe,
Kajol Jain, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nicholas
Piggin, Nick Desaulniers, Paul Mackerras, Russell Currey, Sathvika
Vasireddy, Shaokun Zhang, Stephen Rothwell, Sudeep Holla, Suraj Jitindar
Singh, Tom Rix, Vaibhav Jain, YueHaibing, Zhang Jianhua, and Zhen Lei.
* tag 'powerpc-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (218 commits)
powerpc: Only build restart_table.c for 64s
powerpc/64s: move ret_from_fork etc above __end_soft_masked
powerpc/64s/interrupt: clean up interrupt return labels
powerpc/64/interrupt: add missing kprobe annotations on interrupt exit symbols
powerpc/64: enable MSR[EE] in irq replay pt_regs
powerpc/64s/interrupt: preserve regs->softe for NMI interrupts
powerpc/64s: add a table of implicit soft-masked addresses
powerpc/64e: remove implicit soft-masking and interrupt exit restart logic
powerpc/64e: fix CONFIG_RELOCATABLE build warnings
powerpc/64s: fix hash page fault interrupt handler
powerpc/4xx: Fix setup_kuep() on SMP
powerpc/32s: Fix setup_{kuap/kuep}() on SMP
powerpc/interrupt: Use names in check_return_regs_valid()
powerpc/interrupt: Also use exit_must_hard_disable() on PPC32
powerpc/sysfs: Replace sizeof(arr)/sizeof(arr[0]) with ARRAY_SIZE
powerpc/ptrace: Refactor regs_set_return_{msr/ip}
powerpc/ptrace: Move set_return_regs_changed() before regs_set_return_{msr/ip}
powerpc/stacktrace: Fix spurious "stale" traces in raise_backtrace_ipi()
powerpc/pseries/vas: Include irqdomain.h
powerpc: mark local variables around longjmp as volatile
...
The early bad fault or key fault test in do_hash_fault() ends up calling
into ___do_page_fault without having gone through an interrupt handler
wrapper (except the initial _RAW one). This can end up calling local irq
functions while the interrupt has not been reconciled, which will likely
cause crashes, and it trips up a later patch that adds more assertions.
pkey_exec_prot from selftests causes this path to be executed.
There is no real reason the key fault check should run before the
in_nmi() test. In fact, if a perf interrupt in the hash fault code did
a stack walk that somehow took a key fault, then running
___do_page_fault could cause another hash fault, causing problems.
Move the in_nmi() test first, and then do everything else inside the
regular interrupt handler function.
Fixes: 3a96570ffceb ("powerpc: convert interrupt handlers to use wrappers")
Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210630074621.2109197-2-npiggin@gmail.com
On SMP, setup_kuep() is also called from start_secondary() since
commit 86f46f343272 ("powerpc/32s: Initialise KUAP and KUEP in C").
start_secondary() is not an __init function.
Remove the __init marker from setup_kuep() and bail out when not
called on the first CPU, as the work is already done.
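A sketch of the resulting shape (assuming boot_cpuid identifies the
first CPU, as elsewhere in arch/powerpc):

  void setup_kuep(bool disabled)
  {
          /* start_secondary() also calls this; the one-time setup
           * must only run on the boot CPU.
           */
          if (smp_processor_id() != boot_cpuid)
                  return;

          /* ... one-time KUEP activation ... */
  }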
Fixes: 10248dcba120 ("powerpc/44x: Implement Kernel Userspace Exec Protection (KUEP)")
Fixes: 86f46f343272 ("powerpc/32s: Initialise KUAP and KUEP in C")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/8ee05934288994a65743a987acb1558f12c0c8c1.1624969450.git.christophe.leroy@csgroup.eu
On SMP, setup_kup() is also called from start_secondary().
start_secondary() is not an __init function.
Remove the __init marker from setup_kuep() and setup_kuap().
Fixes: 86f46f343272 ("powerpc/32s: Initialise KUAP and KUEP in C")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/42f4bd12b476942e4d5dc81c0e839d8871b20b1c.1624863319.git.christophe.leroy@csgroup.eu
Commit aaa229529244 ("powerpc/mm: Add physical address to Linux page
table dump") changed range coalescing to only combine ranges that are
both virtually and physically contiguous, in order to avoid erroneous
combination of unrelated mappings in IOREMAP space.
But in the VMALLOC space, mappings almost never have contiguous
physical pages, so the commit mentioned above leads to dumping one
line per page for vmalloc mappings.
Taking into account that vmalloc always leaves a gap between two
areas, we never get two mappings dumped as a single combination even
if they have the exact same flags. The only space that may have
encountered such an issue was the early IOREMAP, which does not use
the vmalloc engine. But previous commits added gaps between early IO
mappings, so it is not an issue anymore.
That commit created some difficulties with KASAN mappings, see
commit cabe8138b23c ("powerpc: dump as a single line areas mapping a
single physical page.") and with huge pages, see
commit b00ff6d8c1c3 ("powerpc/ptdump: Properly handle non standard
page size").
So, almost revert commit aaa229529244 to properly coalesce pages
mapped with the same flags as before; only keep the display of the
first physical address of the range, as it can be useful, especially
for IO mappings.
It brings powerpc back to the same level as other architectures and
simplifies the conversion to GENERIC PTDUMP.
With the patch:
---[ kasan shadow mem start ]---
0xf8000000-0xf8ffffff 0x07000000 16M huge rw present dirty accessed
0xf9000000-0xf91fffff 0x01434000 2M r present accessed
0xf9200000-0xf95affff 0x02104000 3776K rw present dirty accessed
0xfef5c000-0xfeffffff 0x01434000 656K r present accessed
---[ kasan shadow mem end ]---
Before:
---[ kasan shadow mem start ]---
0xf8000000-0xf8ffffff 0x07000000 16M huge rw present dirty accessed
0xf9000000-0xf91fffff 0x01434000 16K r present accessed
0xf9200000-0xf9203fff 0x02104000 16K rw present dirty accessed
0xf9204000-0xf9207fff 0x0213c000 16K rw present dirty accessed
0xf9208000-0xf920bfff 0x02174000 16K rw present dirty accessed
0xf920c000-0xf920ffff 0x02188000 16K rw present dirty accessed
0xf9210000-0xf9213fff 0x021dc000 16K rw present dirty accessed
0xf9214000-0xf9217fff 0x02220000 16K rw present dirty accessed
0xf9218000-0xf921bfff 0x023c0000 16K rw present dirty accessed
0xf921c000-0xf921ffff 0x023d4000 16K rw present dirty accessed
0xf9220000-0xf9227fff 0x023ec000 32K rw present dirty accessed
...
0xf93b8000-0xf93e3fff 0x02614000 176K rw present dirty accessed
0xf93e4000-0xf94c3fff 0x027c0000 896K rw present dirty accessed
0xf94c4000-0xf94c7fff 0x0236c000 16K rw present dirty accessed
0xf94c8000-0xf94cbfff 0x041f0000 16K rw present dirty accessed
0xf94cc000-0xf94cffff 0x029c0000 16K rw present dirty accessed
0xf94d0000-0xf94d3fff 0x041ec000 16K rw present dirty accessed
0xf94d4000-0xf94d7fff 0x0407c000 16K rw present dirty accessed
0xf94d8000-0xf94f7fff 0x041c0000 128K rw present dirty accessed
...
0xf95ac000-0xf95affff 0x042b0000 16K rw present dirty accessed
0xfef5c000-0xfeffffff 0x01434000 16K r present accessed
---[ kasan shadow mem end ]---
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c56ce1f5c3c75adc9811b1a5f9c410fa74183a8d.1618828806.git.christophe.leroy@csgroup.eu
The vmalloc system leaves a gap between allocated areas, which helps
catch overflows.
Do the same for IO areas which are allocated with early_ioremap_range()
until slab_is_available().
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/c433e358190fb5d47650463ea1ab755fc7b73e6e.1618828806.git.christophe.leroy@csgroup.eu
When using the Radix MMU our PGD is always 64K, and must be naturally
aligned.
For a 4K page size kernel that means page alignment of swapper_pg_dir is
not sufficient, leading to failure to boot.
Use the existing MAX_PTRS_PER_PGD, which has the correct value, and
avoid hard-coding 64K here.
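A sketch of the idea, deriving the alignment from the table size
rather than hard-coding it (section attributes elided; the exact
declaration may differ in the final patch):

  /* With Radix, MAX_PTRS_PER_PGD * sizeof(pgd_t) is 64K, so this
   * naturally aligns the PGD as the hardware requires.
   */
  pgd_t swapper_pg_dir[MAX_PTRS_PER_PGD]
          __aligned(MAX_PTRS_PER_PGD * sizeof(pgd_t));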
Fixes: e72421a085a8 ("powerpc: Define swapper_pg_dir[] in C")
Reported-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210624123420.2784187-1-mpe@ellerman.id.au
Pull in some more ppc KVM patches we are keeping in our topic branch.
In particular this brings in the series to add H_RPT_INVALIDATE.
Use set_memory_attr() instead of the PPC32 specific change_page_attr().
change_page_attr() was checking that the address was not mapped by
blocks and was handling highmem, but that's unneeded because the
affected pages can't be in highmem and block mapping verification
is already done by the callers.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[ruscur: rebase on powerpc/merge with Christophe's new patches]
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210609013431.9805-10-jniethe5@gmail.com
In addition to the set_memory_xx() functions, which allow changing
the memory attributes of not (yet) used memory regions, implement a
set_memory_attr() function to:
- set the final memory protection after init on currently used
kernel regions.
- enable/disable kernel memory regions in the scope of DEBUG_PAGEALLOC.
Unlike set_memory_xx(), which can act in three steps because the
regions are unused, this function must modify pages 'on the fly' as
the kernel is executing from them. At the moment only PPC32 will use
it, and changing page attributes on the fly is not an issue there.
Reported-by: kbuild test robot <lkp@intel.com>
[ruscur: cast "data" to unsigned long instead of int]
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210609013431.9805-9-jniethe5@gmail.com
The set_memory_{ro/rw/nx/x}() functions are required for
STRICT_MODULE_RWX, and are generally useful primitives to have. This
implementation is designed to be generic across powerpc's many MMUs.
It's possible that this could be optimised to be faster for specific
MMUs.
This implementation does not handle cases where the caller is attempting
to change the mapping of the page it is executing from, or if another
CPU is concurrently using the page being altered. These cases likely
shouldn't happen, but a more complex implementation with MMU-specific code
could safely handle them.
On hash, the linear mapping is not kept in the linux pagetable, so this
will not change the protection if used on that range. Currently these
functions are not used on the linear map so just WARN for now.
apply_to_existing_page_range() does not work on huge pages so for now
disallow changing the protection of huge pages.
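A sketch of the generic mechanism (simplified; the function-pointer
plumbing mentioned in the notes just below is elided):

  /* Apply a protection change to an already-mapped kernel range by
   * walking only the existing PTEs; huge mappings are rejected by
   * the walker, matching the restriction described above.
   */
  static int change_memory_attr(unsigned long addr, int numpages, long action)
  {
          unsigned long start = ALIGN_DOWN(addr, PAGE_SIZE);
          unsigned long size = numpages << PAGE_SHIFT;

          if (!numpages)
                  return 0;

          return apply_to_existing_page_range(&init_mm, start, size,
                                              change_page_attr, (void *)action);
  }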
[jpn: - Allow set memory functions to be used without Strict RWX
- Hash: Disallow certain regions
- Have change_page_attr() take function pointers to manipulate ptes
- Radix: Add ptesync after set_pte_at()]
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210609013431.9805-2-jniethe5@gmail.com
Merge some powerpc KVM patches from our topic branch.
In particular this brings in Nick's big series rewriting parts of the
guest entry/exit path in C.
Conflicts:
arch/powerpc/kernel/security.c
arch/powerpc/kvm/book3s_hv_rmhandlers.S
Update _tlbiel_pid() such that we can avoid build errors like below when
using this function in other places.
arch/powerpc/mm/book3s64/radix_tlb.c: In function ‘__radix__flush_tlb_range_psize’:
arch/powerpc/mm/book3s64/radix_tlb.c:114:2: warning: ‘asm’ operand 3 probably does not match constraints
114 | asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
| ^~~
arch/powerpc/mm/book3s64/radix_tlb.c:114:2: error: impossible constraint in ‘asm’
make[4]: *** [scripts/Makefile.build:271: arch/powerpc/mm/book3s64/radix_tlb.o] Error 1
With this fix, we can also drop the __always_inline in
__radix__flush_tlb_range_psize, which was added by commit e12d6d7d46a6
("powerpc/mm/radix: mark __radix__flush_tlb_range_psize() as
__always_inline").
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210610083639.387365-1-aneesh.kumar@linux.ibm.com
book3s/32 and 8xx don't use vmalloc for modules.
Print the modules area at startup as part of the virtual memory layout:
[ 0.000000] Kernel virtual memory layout:
[ 0.000000] * 0xffafc000..0xffffc000 : fixmap
[ 0.000000] * 0xc9000000..0xffafc000 : vmalloc & ioremap
[ 0.000000] * 0xb0000000..0xc0000000 : modules
[ 0.000000] Memory: 118480K/131072K available (7152K kernel code, 2320K rwdata, 1328K rodata, 368K init, 854K bss, 12592K reserved, 0K cma-reserved)
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/98394503e92d6fd6d8f657e0b263b32f21cf2790.1623438478.git.christophe.leroy@csgroup.eu
PTE_SIZE means the PTE page table size in most places, whereas in
hash_low.S it means the size of one entry in the table.
Rename it PTE_T_SIZE, and define it directly in hash_low.S instead of
going through asm-offsets.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/83a008a9fd6cc3f2bbcb470f592555d260ed7a3d.1623063174.git.christophe.leroy@csgroup.eu
Don't duplicate swapper_pg_dir[] in each platform's head.S
Define it in mm/pgtable.c
Define MAX_PTRS_PER_PGD because on book3s/64 PTRS_PER_PGD is
not a constant.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/5e3f1b8a4695c33ccc80aa3870e016bef32b85e1.1623063174.git.christophe.leroy@csgroup.eu
At the time being, empty_zero_page[] is defined in each
platform head.S.
Define it in mm/mem.c instead, and put it in the BSS section instead
of the DATA section. Commit 5227cfa71f9e ("arm64: mm: place
empty_zero_page in bss") explains why it is interesting to have
it in BSS.
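The common pattern looks like this (a sketch; the real patch also has
to remove the per-platform head.S definitions):

  /* One zero-filled page shared by all users; placed in BSS so it is
   * not carried in the kernel image as initialised data.
   */
  unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)]
          __page_aligned_bss;
  EXPORT_SYMBOL(empty_zero_page);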
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/5838caffa269e0957c5a50cc85477876220298b0.1623063174.git.christophe.leroy@csgroup.eu
DEBUG_HARDER is not user selectable.
Remove it together with related messages.
Also remove two pr_devel() messages that should
likely have been pr_hard().
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0f25109b0e12fdd1e6541dedbb2212cc53526a57.1622712515.git.christophe.leroy@csgroup.eu
DEBUG_CLAMP_LAST_CONTEXT was there in the old days to reduce the
number of contexts in order to ease debugging of context switching,
but that code has been quite stable for years now.
As it is not user selectable, remove it.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/da81837b452e8b9f1657b529b9c3050dc10b9770.1622712515.git.christophe.leroy@csgroup.eu
mmu_context handling has been there for years, so we would know if
there were problems with maps.
DEBUG_MAP_CONSISTENCY is not user selectable, remove it.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/6fe2b88956db53f8d6ee221525b2c5dc6aec82c6.1622712515.git.christophe.leroy@csgroup.eu
Everything can be done even when CONFIG_SMP is not selected.
Just use IS_ENABLED() where relevant and rely on GCC to optimise out
unneeded code and variables when CONFIG_SMP is not set.
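A generic illustration of the pattern (id/first_context are
placeholder names, not from the patch):

  /* No #ifdef CONFIG_SMP: the compiler evaluates IS_ENABLED() at
   * build time and discards the dead branch and its variables.
   */
  if (IS_ENABLED(CONFIG_SMP))
          id = first_context + smp_processor_id();
  else
          id = first_context;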
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/cc13b87b0f750a538621876ecc24c22a07e7c8be.1622712515.git.christophe.leroy@csgroup.eu
ppc8xx already has set_context() in C. The other platforms have it in
assembly. The only thing it does is write the context id into
SPRN_PID.
Do it in C.
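The C version is essentially two lines (a sketch; the exact prototype
per platform may differ):

  /* Write the new context id into the PID SPR; isync makes the
   * update take effect before any following translated access.
   */
  void set_context(unsigned long id, pgd_t *pgd)
  {
          mtspr(SPRN_PID, id);
          isync();
  }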
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a5d0759064f3831c6b88af49ef5d3b05ba1c4dad.1622712515.git.christophe.leroy@csgroup.eu
Instead of duplicating the update of BDI2000 pointers in
set_context(), do it directly from switch_mmu_context().
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4c54997edd3548fa54717915e7c6ebaf60f208c0.1622712515.git.christophe.leroy@csgroup.eu
On book3s/32, KUAP is provided by toggling Ks bit in segment registers.
One segment register addresses 256M of virtual memory.
At the time being, KUAP implements a complex logic to apply the
unlock/lock on the exact number of segments covering the user range
to access, with saving the boundaries of the range of segments in
a member of thread struct.
But most if not all user accesses are within a single segment.
Rework KUAP with a different approach:
- Open only one segment, the one corresponding to the starting
address of the range to be accessed.
- If a second segment is involved, it will generate a page fault. The
segment will then be open by the page fault handler.
The kuap member of thread struct will now contain:
- The start address of the current on going user access, that will be
used to know which segment to lock at the end of the user access.
- ~0 when no user access is open
- ~1 when additional segments are opened by a page fault.
Then, at lock time
- When only one segment is open, close it.
- When several segments are open, close all user segments.
Almost 100% of the time, only one segment will be involved.
In interrupts, inline the functions that unlock/lock all segments,
because not inlining them implies a lot of register save/restore.
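Schematically, the thread.kuap encoding described above amounts to
(illustrative names, not necessarily those of the patch):

  #define KUAP_NONE  (~0UL)  /* no user access window open           */
  #define KUAP_ALL   (~1UL)  /* fault handler opened extra segments  */
  /* otherwise: start address of the on-going user access, used to
   * know which segment to lock again when the access ends.
   */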
With the patch, writing value 128 in userspace in perf_copy_attr() is
done with 16 instructions:
3890: 93 82 04 dc stw r28,1244(r2)
3894: 7d 20 e5 26 mfsrin r9,r28
3898: 55 29 00 80 rlwinm r9,r9,0,2,0
389c: 7d 20 e1 e4 mtsrin r9,r28
38a0: 4c 00 01 2c isync
38a4: 39 20 00 80 li r9,128
38a8: 91 3c 00 00 stw r9,0(r28)
38ac: 81 42 04 dc lwz r10,1244(r2)
38b0: 39 00 ff ff li r8,-1
38b4: 91 02 04 dc stw r8,1244(r2)
38b8: 2c 0a ff fe cmpwi r10,-2
38bc: 41 82 00 88 beq 3944 <perf_copy_attr+0x36c>
38c0: 7d 20 55 26 mfsrin r9,r10
38c4: 65 29 40 00 oris r9,r9,16384
38c8: 7d 20 51 e4 mtsrin r9,r10
38cc: 4c 00 01 2c isync
...
3944: 48 00 00 01 bl 3944 <perf_copy_attr+0x36c>
3944: R_PPC_REL24 kuap_lock_all_ool
Before the patch it was 118 instructions. In reality only 42 are
executed in most cases, but GCC is not able to see that a properly
aligned user access cannot involve more than one segment.
5060: 39 1d 00 04 addi r8,r29,4
5064: 3d 20 b0 00 lis r9,-20480
5068: 7c 08 48 40 cmplw r8,r9
506c: 40 81 00 08 ble 5074 <perf_copy_attr+0x2cc>
5070: 3d 00 b0 00 lis r8,-20480
5074: 39 28 ff ff addi r9,r8,-1
5078: 57 aa 00 06 rlwinm r10,r29,0,0,3
507c: 55 29 27 3e rlwinm r9,r9,4,28,31
5080: 39 29 00 01 addi r9,r9,1
5084: 7d 29 53 78 or r9,r9,r10
5088: 91 22 04 dc stw r9,1244(r2)
508c: 7d 20 ed 26 mfsrin r9,r29
5090: 55 29 00 80 rlwinm r9,r9,0,2,0
5094: 7c 08 50 40 cmplw r8,r10
5098: 40 81 00 c0 ble 5158 <perf_copy_attr+0x3b0>
509c: 7d 46 50 f8 not r6,r10
50a0: 7c c6 42 14 add r6,r6,r8
50a4: 54 c6 27 be rlwinm r6,r6,4,30,31
50a8: 7d 20 51 e4 mtsrin r9,r10
50ac: 3c ea 10 00 addis r7,r10,4096
50b0: 39 29 01 11 addi r9,r9,273
50b4: 7f 88 38 40 cmplw cr7,r8,r7
50b8: 55 29 02 06 rlwinm r9,r9,0,8,3
50bc: 40 9d 00 9c ble cr7,5158 <perf_copy_attr+0x3b0>
50c0: 2f 86 00 00 cmpwi cr7,r6,0
50c4: 41 9e 00 4c beq cr7,5110 <perf_copy_attr+0x368>
50c8: 2f 86 00 01 cmpwi cr7,r6,1
50cc: 41 9e 00 2c beq cr7,50f8 <perf_copy_attr+0x350>
50d0: 2f 86 00 02 cmpwi cr7,r6,2
50d4: 41 9e 00 14 beq cr7,50e8 <perf_copy_attr+0x340>
50d8: 7d 20 39 e4 mtsrin r9,r7
50dc: 39 29 01 11 addi r9,r9,273
50e0: 3c e7 10 00 addis r7,r7,4096
50e4: 55 29 02 06 rlwinm r9,r9,0,8,3
50e8: 7d 20 39 e4 mtsrin r9,r7
50ec: 39 29 01 11 addi r9,r9,273
50f0: 3c e7 10 00 addis r7,r7,4096
50f4: 55 29 02 06 rlwinm r9,r9,0,8,3
50f8: 7d 20 39 e4 mtsrin r9,r7
50fc: 3c e7 10 00 addis r7,r7,4096
5100: 39 29 01 11 addi r9,r9,273
5104: 7f 88 38 40 cmplw cr7,r8,r7
5108: 55 29 02 06 rlwinm r9,r9,0,8,3
510c: 40 9d 00 4c ble cr7,5158 <perf_copy_attr+0x3b0>
5110: 7d 20 39 e4 mtsrin r9,r7
5114: 39 29 01 11 addi r9,r9,273
5118: 3c c7 10 00 addis r6,r7,4096
511c: 55 29 02 06 rlwinm r9,r9,0,8,3
5120: 7d 20 31 e4 mtsrin r9,r6
5124: 39 29 01 11 addi r9,r9,273
5128: 3c c6 10 00 addis r6,r6,4096
512c: 55 29 02 06 rlwinm r9,r9,0,8,3
5130: 7d 20 31 e4 mtsrin r9,r6
5134: 39 29 01 11 addi r9,r9,273
5138: 3c c7 30 00 addis r6,r7,12288
513c: 55 29 02 06 rlwinm r9,r9,0,8,3
5140: 7d 20 31 e4 mtsrin r9,r6
5144: 3c e7 40 00 addis r7,r7,16384
5148: 39 29 01 11 addi r9,r9,273
514c: 7f 88 38 40 cmplw cr7,r8,r7
5150: 55 29 02 06 rlwinm r9,r9,0,8,3
5154: 41 9d ff bc bgt cr7,5110 <perf_copy_attr+0x368>
5158: 4c 00 01 2c isync
515c: 39 20 00 80 li r9,128
5160: 91 3d 00 00 stw r9,0(r29)
5164: 38 e0 00 00 li r7,0
5168: 90 e2 04 dc stw r7,1244(r2)
516c: 7d 20 ed 26 mfsrin r9,r29
5170: 65 29 40 00 oris r9,r9,16384
5174: 40 81 00 c0 ble 5234 <perf_copy_attr+0x48c>
5178: 7d 47 50 f8 not r7,r10
517c: 7c e7 42 14 add r7,r7,r8
5180: 54 e7 27 be rlwinm r7,r7,4,30,31
5184: 7d 20 51 e4 mtsrin r9,r10
5188: 3d 4a 10 00 addis r10,r10,4096
518c: 39 29 01 11 addi r9,r9,273
5190: 7c 08 50 40 cmplw r8,r10
5194: 55 29 02 06 rlwinm r9,r9,0,8,3
5198: 40 81 00 9c ble 5234 <perf_copy_attr+0x48c>
519c: 2c 07 00 00 cmpwi r7,0
51a0: 41 82 00 4c beq 51ec <perf_copy_attr+0x444>
51a4: 2c 07 00 01 cmpwi r7,1
51a8: 41 82 00 2c beq 51d4 <perf_copy_attr+0x42c>
51ac: 2c 07 00 02 cmpwi r7,2
51b0: 41 82 00 14 beq 51c4 <perf_copy_attr+0x41c>
51b4: 7d 20 51 e4 mtsrin r9,r10
51b8: 39 29 01 11 addi r9,r9,273
51bc: 3d 4a 10 00 addis r10,r10,4096
51c0: 55 29 02 06 rlwinm r9,r9,0,8,3
51c4: 7d 20 51 e4 mtsrin r9,r10
51c8: 39 29 01 11 addi r9,r9,273
51cc: 3d 4a 10 00 addis r10,r10,4096
51d0: 55 29 02 06 rlwinm r9,r9,0,8,3
51d4: 7d 20 51 e4 mtsrin r9,r10
51d8: 3d 4a 10 00 addis r10,r10,4096
51dc: 39 29 01 11 addi r9,r9,273
51e0: 7c 08 50 40 cmplw r8,r10
51e4: 55 29 02 06 rlwinm r9,r9,0,8,3
51e8: 40 81 00 4c ble 5234 <perf_copy_attr+0x48c>
51ec: 7d 20 51 e4 mtsrin r9,r10
51f0: 39 29 01 11 addi r9,r9,273
51f4: 3c ea 10 00 addis r7,r10,4096
51f8: 55 29 02 06 rlwinm r9,r9,0,8,3
51fc: 7d 20 39 e4 mtsrin r9,r7
5200: 39 29 01 11 addi r9,r9,273
5204: 3c e7 10 00 addis r7,r7,4096
5208: 55 29 02 06 rlwinm r9,r9,0,8,3
520c: 7d 20 39 e4 mtsrin r9,r7
5210: 39 29 01 11 addi r9,r9,273
5214: 3c ea 30 00 addis r7,r10,12288
5218: 55 29 02 06 rlwinm r9,r9,0,8,3
521c: 7d 20 39 e4 mtsrin r9,r7
5220: 3d 4a 40 00 addis r10,r10,16384
5224: 39 29 01 11 addi r9,r9,273
5228: 7c 08 50 40 cmplw r8,r10
522c: 55 29 02 06 rlwinm r9,r9,0,8,3
5230: 41 81 ff bc bgt 51ec <perf_copy_attr+0x444>
5234: 4c 00 01 2c isync
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Export the ool handlers to fix build errors]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/d9121f96a7c4302946839a0771f5d1daeeb6968c.1622708530.git.christophe.leroy@csgroup.eu
PPC64 uses MMU features to enable/disable KUAP at boot time.
But feature fixups are applied way too early on PPC32.
Now that all KUAP related actions are in C following the
conversion of KUAP initial setup and context switch in C,
static branches can be used to enable/disable KUAP.
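A sketch of the static-branch pattern (the key name comes from the
export note below; exact usage sites are in the patch):

  /* Flipped once at boot; static_branch_unlikely() compiles to a
   * patched nop/branch rather than a runtime load-and-test.
   */
  DEFINE_STATIC_KEY_FALSE(disable_kuap_key);
  EXPORT_SYMBOL(disable_kuap_key);

  static __always_inline bool kuap_is_disabled(void)
  {
          return static_branch_unlikely(&disable_kuap_key);
  }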
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Export disable_kuap_key to fix build errors]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/cd79e8008455fba5395d099f9bb1305c039b931c.1622708530.git.christophe.leroy@csgroup.eu
PPC64 uses MMU features to enable/disable KUEP at boot time.
But feature fixups are applied way too early on PPC32.
Now that all KUEP related actions are in C following the
conversion of KUEP initial setup and context switch in C,
static branches can be used to enable/disable KUEP.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/7745a2c3a08ec46302920a3f48d1cb9b5469dbbb.1622708530.git.christophe.leroy@csgroup.eu
In order to selectively activate KUAP and KUEP in a following patch,
perform KUAP and KUEP initialisation in C.
Unlike PPC64, PPC32 doesn't have an early_setup_secondary(),
so do it in start_secondary().
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/87be72023448dd4e476744ed279b8c04b8d08a1c.1622708530.git.christophe.leroy@csgroup.eu
switch_mmu_context() does things that can easily be done in C.
For updating user segments, we have update_user_segments().
As mentioned in commit b5efec00b671 ("powerpc/32s: Move KUEP
locking/unlocking in C"), update_user_segments() has the loop
unrolled which is a significant performance gain.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/05c0875ad8220c03452c3a334946e207c6ca04d6.1622708530.git.christophe.leroy@csgroup.eu
In order to reuse it in switch_mmu_context(), this patch moves the
CTX_TO_VSID() macro into asm/book3s/32/mmu-hash.h.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/26b36ef2939234a04b37baf6ffe50cba81f5d1b7.1622708530.git.christophe.leroy@csgroup.eu
KUEP implements the update of user segment registers.
Move it into mmu-hash.h in order to use it from other places.
And inline kuep_lock() and kuep_unlock(). Inlining kuep_lock() is
important for system_call_exception(); otherwise system_call_exception()
has to save onto the stack the system call parameters that are used
just after, and doing that takes more instructions than kuep_lock()
itself.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/24591ca480d14a62ef910e38a5273d551262c4a2.1622708530.git.christophe.leroy@csgroup.eu
Avoids the #ifdef in mmu.c
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0b7a13d414837e58264edc336b89c2fe9f35f9bc.1622708530.git.christophe.leroy@csgroup.eu
PPC64 uses MMU features to enable/disable KUAP at boot time.
But feature fixups are applied way too early on PPC32.
But since commit c16728835eec ("powerpc/32: Manage KUAP in C"),
all KUAP is in C so it is now possible to use static branches.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/3dca510ce555335261a47c4799167da698f569c0.1622782111.git.christophe.leroy@csgroup.eu
Powerpc 44x has two bits for exec protection in TLBs: one
for user (UX) and one for supervisor (SX).
Clear SX on user pages in TLB miss handlers to provide KUEP.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/169310e08152aa1d96c979770291d165ec6896ae.1622616032.git.christophe.leroy@csgroup.eu
'struct ppc_inst' is an internal representation of an instruction, but
in-memory instructions are and will remain a table of 'u32' forever.
Replace all 'struct ppc_inst *' used for locating an instruction in
memory by 'u32 *'. This removes a lot of undue casts to 'struct
ppc_inst *'.
It also helps locate abuses of 'struct ppc_inst' dereferences.
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
[mpe: Fix ppc_inst_next(), use u32 instead of unsigned int]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/7062722b087228e42cbd896e39bfdf526d6a340a.1621516826.git.christophe.leroy@csgroup.eu
Merge misc updates from Andrew Morton:
"191 patches.
Subsystems affected by this patch series: kthread, ia64, scripts,
ntfs, squashfs, ocfs2, kernel/watchdog, and mm (gup, pagealloc, slab,
slub, kmemleak, dax, debug, pagecache, gup, swap, memcg, pagemap,
mprotect, bootmem, dma, tracing, vmalloc, kasan, initialization,
pagealloc, and memory-failure)"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (191 commits)
mm,hwpoison: make get_hwpoison_page() call get_any_page()
mm,hwpoison: send SIGBUS with error virutal address
mm/page_alloc: split pcp->high across all online CPUs for cpuless nodes
mm/page_alloc: allow high-order pages to be stored on the per-cpu lists
mm: replace CONFIG_FLAT_NODE_MEM_MAP with CONFIG_FLATMEM
mm: replace CONFIG_NEED_MULTIPLE_NODES with CONFIG_NUMA
docs: remove description of DISCONTIGMEM
arch, mm: remove stale mentions of DISCONIGMEM
mm: remove CONFIG_DISCONTIGMEM
m68k: remove support for DISCONTIGMEM
arc: remove support for DISCONTIGMEM
arc: update comment about HIGHMEM implementation
alpha: remove DISCONTIGMEM and NUMA
mm/page_alloc: move free_the_page
mm/page_alloc: fix counting of managed_pages
mm/page_alloc: improve memmap_pages dbg msg
mm: drop SECTION_SHIFT in code comments
mm/page_alloc: introduce vm.percpu_pagelist_high_fraction
mm/page_alloc: limit the number of pages on PCP lists when reclaim is active
mm/page_alloc: scale the number of pages that are batch freed
...
After removal of DISCONTIGMEM, the NEED_MULTIPLE_NODES and NUMA
configuration options are equivalent.
Drop CONFIG_NEED_MULTIPLE_NODES and use CONFIG_NUMA instead.
Done with
$ sed -i 's/CONFIG_NEED_MULTIPLE_NODES/CONFIG_NUMA/' \
$(git grep -wl CONFIG_NEED_MULTIPLE_NODES)
$ sed -i 's/NEED_MULTIPLE_NODES/NUMA/' \
$(git grep -wl NEED_MULTIPLE_NODES)
with manual tweaks afterwards.
[rppt@linux.ibm.com: fix arm boot crash]
Link: https://lkml.kernel.org/r/YMj9vHhHOiCVN4BF@linux.ibm.com
Link: https://lkml.kernel.org/r/20210608091316.3622-9-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'irq-core-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"Updates for the interrupt subsystem:
Core changes:
- Cleanup and simplification of common code to invoke the low level
interrupt flow handlers when this invocation requires irqdomain
resolution. Add the necessary core infrastructure.
- Provide a proper interface for modular PMU drivers to set the
interrupt affinity.
- Add a request flag which allows to exclude interrupts from spurious
interrupt detection. Useful especially for IPI handlers which
always return IRQ_HANDLED which turns the spurious interrupt
detection into a pointless waste of CPU cycles.
Driver changes:
- Bulk convert interrupt chip drivers to the new irqdomain low level
flow handler invocation mechanism.
- Add device tree bindings for the Renesas R-Car M3-W+ SoC
- Enable modular build of the Qualcomm PDC driver
- The usual small fixes and improvements"
* tag 'irq-core-2021-06-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
dt-bindings: interrupt-controller: arm,gic-v3: Describe GICv3 optional properties
irqchip: gic-pm: Remove redundant error log of clock bulk
irqchip/sun4i: Remove unnecessary oom message
irqchip/irq-imx-gpcv2: Remove unnecessary oom message
irqchip/imgpdc: Remove unnecessary oom message
irqchip/gic-v3-its: Remove unnecessary oom message
irqchip/gic-v2m: Remove unnecessary oom message
irqchip/exynos-combiner: Remove unnecessary oom message
irqchip: Bulk conversion to generic_handle_domain_irq()
genirq: Move non-irqdomain handle_domain_irq() handling into ARM's handle_IRQ()
genirq: Add generic_handle_domain_irq() helper
irqchip/nvic: Convert from handle_IRQ() to handle_domain_irq()
irqdesc: Fix __handle_domain_irq() comment
genirq: Use irq_resolve_mapping() to implement __handle_domain_irq() and co
irqdomain: Introduce irq_resolve_mapping()
irqdomain: Protect the linear revmap with RCU
irqdomain: Cache irq_data instead of a virq number in the revmap
irqdomain: Use struct_size() helper when allocating irqdomain
irqdomain: Make normal and nomap irqdomains exclusive
powerpc: Move the use of irq_domain_add_nomap() behind a config option
...
A bunch of PPC files are missing the inclusion of linux/of.h and
linux/irqdomain.h, relying on transitive inclusion from another
file.
As we are about to break this dependency, make sure these dependencies
are explicit.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"This covers all architectures (except MIPS) so I don't expect any
other feature pull requests this merge window.
ARM:
- Add MTE support in guests, complete with tag save/restore interface
- Reduce the impact of CMOs by moving them in the page-table code
- Allow device block mappings at stage-2
- Reduce the footprint of the vmemmap in protected mode
- Support the vGIC on dumb systems such as the Apple M1
- Add selftest infrastructure to support multiple configuration and
apply that to PMU/non-PMU setups
- Add selftests for the debug architecture
- The usual crop of PMU fixes
PPC:
- Support for the H_RPT_INVALIDATE hypercall
- Conversion of Book3S entry/exit to C
- Bug fixes
S390:
- new HW facilities for guests
- make inline assembly more robust with KASAN and co
x86:
- Allow userspace to handle emulation errors (unknown instructions)
- Lazy allocation of the rmap (host physical -> guest physical
address)
- Support for virtualizing TSC scaling on VMX machines
- Optimizations to avoid shattering huge pages at the beginning of
live migration
- Support for initializing the PDPTRs without loading them from
memory
- Many TLB flushing cleanups
- Refuse to load if two-stage paging is available but NX is not (this
has been a requirement in practice for over a year)
- A large series that separates the MMU mode (WP/SMAP/SMEP etc.) from
CR0/CR4/EFER, using the MMU mode everywhere once it is computed
from the CPU registers
- Use PM notifier to notify the guest about host suspend or hibernate
- Support for passing arguments to Hyper-V hypercalls using XMM
registers
- Support for Hyper-V TLB flush hypercalls and enlightened MSR bitmap
on AMD processors
- Hide Hyper-V hypercalls that are not included in the guest CPUID
- Fixes for live migration of virtual machines that use the Hyper-V
"enlightened VMCS" optimization of nested virtualization
- Bugfixes (not many)
Generic:
- Support for retrieving statistics without debugfs
- Cleanups for the KVM selftests API"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (314 commits)
KVM: x86: rename apic_access_page_done to apic_access_memslot_enabled
kvm: x86: disable the narrow guest module parameter on unload
selftests: kvm: Allows userspace to handle emulation errors.
kvm: x86: Allow userspace to handle emulation errors
KVM: x86/mmu: Let guest use GBPAGES if supported in hardware and TDP is on
KVM: x86/mmu: Get CR4.SMEP from MMU, not vCPU, in shadow page fault
KVM: x86/mmu: Get CR0.WP from MMU, not vCPU, in shadow page fault
KVM: x86/mmu: Drop redundant rsvd bits reset for nested NPT
KVM: x86/mmu: Optimize and clean up so called "last nonleaf level" logic
KVM: x86: Enhance comments for MMU roles and nested transition trickiness
KVM: x86/mmu: WARN on any reserved SPTE value when making a valid SPTE
KVM: x86/mmu: Add helpers to do full reserved SPTE checks w/ generic MMU
KVM: x86/mmu: Use MMU's role to determine PTTYPE
KVM: x86/mmu: Collapse 32-bit PAE and 64-bit statements for helpers
KVM: x86/mmu: Add a helper to calculate root from role_regs
KVM: x86/mmu: Add helper to update paging metadata
KVM: x86/mmu: Don't update nested guest's paging bitmasks if CR0.PG=0
KVM: x86/mmu: Consolidate reset_rsvds_bits_mask() calls
KVM: x86/mmu: Use MMU role_regs to get LA57, and drop vCPU LA57 helper
KVM: x86/mmu: Get nested MMU's root level from the MMU's role
...
|
Enable support for process-scoped invalidations from nested
guests and partition-scoped invalidations for nested guests.
Process-scoped invalidations for any level of nested guest
are handled by implementing an H_RPT_INVALIDATE handler in the
nested-guest exit path in L0.
Partition-scoped invalidation requests are forwarded to the
right nested guest, handled there, and passed down to L0
for eventual handling.
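A rough sketch of that dispatch (the helper names here are
hypothetical; only the split follows the description above):
partition-scoped requests, marked by the NESTED type bit, are routed
toward the nested guest, everything else is handled as process-scoped:

  /* Hypothetical sketch of the exit-path dispatch described above. */
  static long h_rpt_invalidate_dispatch(struct kvm_vcpu *vcpu, u64 id,
                                        u64 target, u64 type,
                                        u64 pg_sizes, u64 start, u64 end)
  {
          if (type & H_RPTI_TYPE_NESTED)
                  /* Partition-scoped: forward to the right nested
                   * guest; handled there and passed down to L0 for
                   * the eventual invalidation. */
                  return rpt_invalidate_nested(vcpu, id, type,
                                               pg_sizes, start, end);

          /* Process-scoped: handled directly in the exit path. */
          return rpt_invalidate_process(vcpu, id, type,
                                        pg_sizes, start, end);
  }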
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
[aneesh: Nested guest partition-scoped invalidation changes]
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
[mpe: Squash in fixup patch]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210621085003.904767-5-bharata@linux.ibm.com
|
H_RPT_INVALIDATE does two types of TLB invalidations:
1. Process-scoped invalidations for guests when LPCR[GTSE]=0.
   This is currently not used in KVM, as GTSE is not usually
   disabled in KVM.
2. Partition-scoped invalidations that an L1 hypervisor does on
   behalf of an L2 guest. This is currently handled by the
   H_TLB_INVALIDATE hcall, which this new hcall replaces.
This commit enables process-scoped invalidations for L1 guests.
Support for process-scoped and partition-scoped invalidations
from/for nested guests will be added separately.
Process-scoped tlbie invalidations from L1 and nested guests
need the RS register of the TLBIE instruction to contain both
the PID and the LPID. This patch introduces primitives that
execute the tlbie instruction with both PID and LPID set, in
preparation for the H_RPT_INVALIDATE hcall.
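A sketch of such a primitive (modelled loosely on the kernel's
existing __tlbie_pid() helper; treat the exact macros and constraints
as illustrative rather than the final patch). The PID goes in the
upper 32 bits of RS (ISA bits 0:31) and the LPID in the lower 32:

  /* Sketch: issue tlbie with RS carrying both PID and LPID. */
  static __always_inline void __tlbie_pid_lpid(unsigned long pid,
                                               unsigned long lpid,
                                               unsigned long ric)
  {
          unsigned long rb, rs, prs, r;

          rb = PPC_BIT(53);                       /* IS = 1 */
          /* PID in ISA bits 0:31, LPID in ISA bits 32:63 */
          rs = (pid << PPC_BITLSHIFT(31)) | (lpid & ~PPC_BITMASK(0, 31));
          prs = 1;                                /* process-scoped */
          r = 1;                                  /* radix format */

          asm volatile(PPC_TLBIE_5(%0, %4, %3, %2, %1)
                       : : "r"(rb), "i"(r), "i"(prs), "i"(ric), "r"(rs)
                       : "memory");
  }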
A description of H_RPT_INVALIDATE follows:

int64   /* H_Success: return code on successful completion */
        /* H_Busy: repeat the call with the same parameters */
        /* H_Parameter, H_P2, H_P3, H_P4, H_P5: invalid parameters */
hcall(const uint64 H_RPT_INVALIDATE, /* Invalidate RPT translation
                                        lookaside information */
      uint64 id,        /* PID/LPID to invalidate */
      uint64 target,    /* Invalidation target */
      uint64 type,      /* Type of lookaside information */
      uint64 pg_sizes,  /* Page sizes */
      uint64 start,     /* Start of Effective Address (EA)
                           range (inclusive) */
      uint64 end)       /* End of EA range (exclusive) */
Invalidation targets (target)
-----------------------------
Core MMU        0x01  /* All virtual processors in the partition */
Core local MMU  0x02  /* Current virtual processor */
Nest MMU        0x04  /* All nest/accelerator agents
                         in use by the partition */
A combination of the above can be specified,
except core and core local.

Type of translation to invalidate (type)
----------------------------------------
NESTED  0x0001  /* Invalidate nested guest partition-scope */
TLB     0x0002  /* Invalidate TLB */
PWC     0x0004  /* Invalidate Page Walk Cache */
PRT     0x0008  /* Invalidate caching of Process Table
                   Entries if NESTED is clear */
PAT     0x0008  /* Invalidate caching of Partition Table
                   Entries if NESTED is set */
A combination of the above can be specified.

Page size mask (pg_sizes)
-------------------------
4K         0x01
64K        0x02
2M         0x04
1G         0x08
All sizes  -1UL
A combination of the above can be specified;
all page sizes can be selected with -1.
Semantics: Invalidate radix tree lookaside information
matching the parameters given.
* Return H_P2, H_P3 or H_P4 if the target, type, or pg_sizes
  parameters are different from the defined values.
* Return H_PARAMETER if NESTED is set and pid is not a valid nested
  LPID allocated to this partition.
* Return H_P5 if (start, end) doesn't form a valid range. Start and
  end should be a valid Quadrant address and end > start.
* Return H_NotSupported if the partition is not running in radix
  translation mode.
* May invalidate more translation information than requested.
* If start = 0 and end = -1, set the range to cover all valid
  addresses. Else start and end should be aligned to 4kB (lower 12
  bits clear).
* If NESTED is clear, then invalidate process-scoped lookaside
  information. Else pid specifies a nested LPID, and the invalidation
  is performed on the nested guest partition table and nested guest
  partition-scope real addresses.
* If pid = 0 and NESTED is clear, then valid addresses are quadrant 3
  and quadrant 0 spaces. Else valid addresses are quadrant 0.
* Pages which are fully covered by the range are to be invalidated.
  Those which are only partially covered are considered outside the
  invalidation range, which allows a caller to optimally invalidate
  ranges that may contain mixed page sizes.
* Return H_SUCCESS on success.
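As a usage illustration (a sketch built from the H_RPTI_* constants
this series introduces and the existing plpar_hcall_norets() wrapper),
flushing the TLB and page-walk cache for one PID, on all processors
and the nest MMU, across all page sizes and the whole address range,
would look roughly like:

  /* Sketch: full process-scoped flush for one PID. */
  long rc = plpar_hcall_norets(H_RPT_INVALIDATE,
                               pid,                              /* id */
                               H_RPTI_TARGET_CMMU | H_RPTI_TARGET_NMMU,
                               H_RPTI_TYPE_TLB | H_RPTI_TYPE_PWC,
                               H_RPTI_PAGE_ALL,                  /* pg_sizes */
                               0, -1UL);                         /* whole EA range */
  if (rc != H_SUCCESS)
          pr_err("H_RPT_INVALIDATE failed: %ld\n", rc);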
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210621085003.904767-4-bharata@linux.ibm.com
|
Add a field to mmu_psize_def to store the page size encodings
used by the H_RPT_INVALIDATE hcall. Initialize it while scanning
the radix AP encodings, so it can later supply the required page
size encoding when issuing the hcall.
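A sketch of the shape of the change (field and helper names follow
this series; the struct is abbreviated to the relevant part):

  /* Abbreviated: mmu_psize_def gains the hcall's page size encoding. */
  struct mmu_psize_def {
          unsigned int    shift;          /* page shift */
          /* ... existing fields elided ... */
          unsigned long   h_rpt_pgsize;   /* H_RPT_INVALIDATE encoding:
                                             4K=0x01, 64K=0x02,
                                             2M=0x04, 1G=0x08 */
  };

  /* Filled in while the radix AP encodings are scanned, e.g.: */
  mmu_psize_defs[idx].h_rpt_pgsize = psize_to_rpti_pgsize(idx);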
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210621085003.904767-3-bharata@linux.ibm.com
|