| Commit message (Collapse) | Author | Age | Files | Lines |
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"The major fix here is for a filesystem corruption issue reported on
Apple M1 as a result of buggy management of the floating point
register state introduced in 6.8. I initially reverted one of the
offending patches, but in the end Ard cooked a proper fix so there's a
revert+reapply in the series.
Aside from that, we've got some CPU errata workarounds and misc other
fixes.
- Fix broken FP register state tracking which resulted in filesystem
corruption when dm-crypt is used
- Workarounds for Arm CPU errata affecting the SSBS Spectre
mitigation
- Fix lockdep assertion in DMC620 memory controller PMU driver
- Fix alignment of BUG table when CONFIG_DEBUG_BUGVERBOSE is
disabled"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/fpsimd: Avoid erroneous elide of user state reload
Reapply "arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD"
arm64: asm-bug: Add .align 2 to the end of __BUG_ENTRY
perf/arm-dmc620: Fix lockdep assert in ->event_init()
Revert "arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD"
arm64: errata: Add workaround for Arm errata 3194386 and 3312417
arm64: cputype: Add Neoverse-V3 definitions
arm64: cputype: Add Cortex-X4 definitions
arm64: barrier: Restore spec_bar() macro
|
| |\
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
* for-next/errata:
arm64: errata: Add workaround for Arm errata 3194386 and 3312417
arm64: cputype: Add Neoverse-V3 definitions
arm64: cputype: Add Cortex-X4 definitions
arm64: barrier: Restore spec_bar() macro
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Cortex-X4 and Neoverse-V3 suffer from errata whereby an MSR to the SSBS
special-purpose register does not affect subsequent speculative
instructions, permitting speculative store bypassing for a window of
time. This is described in their Software Developer Errata Notice (SDEN)
documents:
* Cortex-X4 SDEN v8.0, erratum 3194386:
https://developer.arm.com/documentation/SDEN-2432808/0800/
* Neoverse-V3 SDEN v6.0, erratum 3312417:
https://developer.arm.com/documentation/SDEN-2891958/0600/
To workaround these errata, it is necessary to place a speculation
barrier (SB) after MSR to the SSBS special-purpose register. This patch
adds the requisite SB after writes to SSBS within the kernel, and hides
the presence of SSBS from EL0 such that userspace software which cares
about SSBS will manipulate this via prctl(PR_GET_SPECULATION_CTRL, ...).
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240508081400.235362-5-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
arm64 provides an equivalent to the common kernel-mode FPU API, but in a
different header and using different function names. Add a wrapper
header, and export CFLAGS adjustments as found in lib/raid6/Makefile.
Link: https://lkml.kernel.org/r/20240329072441.591471-5-samuel.holland@sifive.com
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: WANG Xuerui <git@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull mm updates from Andrew Morton:
"The usual shower of singleton fixes and minor series all over MM,
documented (hopefully adequately) in the respective changelogs.
Notable series include:
- Lucas Stach has provided some page-mapping cleanup/consolidation/
maintainability work in the series "mm/treewide: Remove pXd_huge()
API".
- In the series "Allow migrate on protnone reference with
MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
MPOL_PREFERRED_MANY mode, yielding almost doubled performance in
one test.
- In their series "Memory allocation profiling" Kent Overstreet and
Suren Baghdasaryan have contributed a means of determining (via
/proc/allocinfo) whereabouts in the kernel memory is being
allocated: number of calls and amount of memory.
- Matthew Wilcox has provided the series "Various significant MM
patches" which does a number of rather unrelated things, but in
largely similar code sites.
- In his series "mm: page_alloc: freelist migratetype hygiene"
Johannes Weiner has fixed the page allocator's handling of
migratetype requests, with resulting improvements in compaction
efficiency.
- In the series "make the hugetlb migration strategy consistent"
Baolin Wang has fixed a hugetlb migration issue, which should
improve hugetlb allocation reliability.
- Liu Shixin has hit an I/O meltdown caused by readahead in a
memory-tight memcg. Addressed in the series "Fix I/O high when
memory almost met memcg limit".
- In the series "mm/filemap: optimize folio adding and splitting"
Kairui Song has optimized pagecache insertion, yielding ~10%
performance improvement in one test.
- Baoquan He has cleaned up and consolidated the early zone
initialization code in the series "mm/mm_init.c: refactor
free_area_init_core()".
- Baoquan has also redone some MM initializatio code in the series
"mm/init: minor clean up and improvement".
- MM helper cleanups from Christoph Hellwig in his series "remove
follow_pfn".
- More cleanups from Matthew Wilcox in the series "Various
page->flags cleanups".
- Vlastimil Babka has contributed maintainability improvements in the
series "memcg_kmem hooks refactoring".
- More folio conversions and cleanups in Matthew Wilcox's series:
"Convert huge_zero_page to huge_zero_folio"
"khugepaged folio conversions"
"Remove page_idle and page_young wrappers"
"Use folio APIs in procfs"
"Clean up __folio_put()"
"Some cleanups for memory-failure"
"Remove page_mapping()"
"More folio compat code removal"
- David Hildenbrand chipped in with "fs/proc/task_mmu: convert
hugetlb functions to work on folis".
- Code consolidation and cleanup work related to GUP's handling of
hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".
- Rick Edgecombe has developed some fixes to stack guard gaps in the
series "Cover a guard gap corner case".
- Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the
series "mm/ksm: fix ksm exec support for prctl".
- Baolin Wang has implemented NUMA balancing for multi-size THPs.
This is a simple first-cut implementation for now. The series is
"support multi-size THP numa balancing".
- Cleanups to vma handling helper functions from Matthew Wilcox in
the series "Unify vma_address and vma_pgoff_address".
- Some selftests maintenance work from Dev Jain in the series
"selftests/mm: mremap_test: Optimizations and style fixes".
- Improvements to the swapping of multi-size THPs from Ryan Roberts
in the series "Swap-out mTHP without splitting".
- Kefeng Wang has significantly optimized the handling of arm64's
permission page faults in the series
"arch/mm/fault: accelerate pagefault when badaccess"
"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"
- GUP cleanups from David Hildenbrand in "mm/gup: consistently call
it GUP-fast".
- hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault
path to use struct vm_fault".
- selftests build fixes from John Hubbard in the series "Fix
selftests/mm build without requiring "make headers"".
- Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
series "Improved Memory Tier Creation for CPUless NUMA Nodes".
Fixes the initialization code so that migration between different
memory types works as intended.
- David Hildenbrand has improved follow_pte() and fixed an errant
driver in the series "mm: follow_pte() improvements and acrn
follow_pte() fixes".
- David also did some cleanup work on large folio mapcounts in his
series "mm: mapcount for large folios + page_mapcount() cleanups".
- Folio conversions in KSM in Alex Shi's series "transfer page to
folio in KSM".
- Barry Song has added some sysfs stats for monitoring multi-size
THP's in the series "mm: add per-order mTHP alloc and swpout
counters".
- Some zswap cleanups from Yosry Ahmed in the series "zswap
same-filled and limit checking cleanups".
- Matthew Wilcox has been looking at buffer_head code and found the
documentation to be lacking. The series is "Improve buffer head
documentation".
- Multi-size THPs get more work, this time from Lance Yang. His
series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free"
optimizes the freeing of these things.
- Kemeng Shi has added more userspace-visible writeback
instrumentation in the series "Improve visibility of writeback".
- Kemeng Shi then sent some maintenance work on top in the series
"Fix and cleanups to page-writeback".
- Matthew Wilcox reduces mmap_lock traffic in the anon vma code in
the series "Improve anon_vma scalability for anon VMAs". Intel's
test bot reported an improbable 3x improvement in one test.
- SeongJae Park adds some DAMON feature work in the series
"mm/damon: add a DAMOS filter type for page granularity access recheck"
"selftests/damon: add DAMOS quota goal test"
- Also some maintenance work in the series
"mm/damon/paddr: simplify page level access re-check for pageout"
"mm/damon: misc fixes and improvements"
- David Hildenbrand has disabled some known-to-fail selftests ni the
series "selftests: mm: cow: flag vmsplice() hugetlb tests as
XFAIL".
- memcg metadata storage optimizations from Shakeel Butt in "memcg:
reduce memory consumption by memcg stats".
- DAX fixes and maintenance work from Vishal Verma in the series
"dax/bus.c: Fixups for dax-bus locking""
* tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits)
memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order
selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime
mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp
mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault
selftests: cgroup: add tests to verify the zswap writeback path
mm: memcg: make alloc_mem_cgroup_per_node_info() return bool
mm/damon/core: fix return value from damos_wmark_metric_value
mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED
selftests: cgroup: remove redundant enabling of memory controller
Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree
Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT
Docs/mm/damon/design: use a list for supported filters
Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command
Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
selftests/damon: classify tests for functionalities and regressions
selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None'
selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts
selftests/damon/_damon_sysfs: check errors from nr_schemes file reads
mm/damon/core: initialize ->esz_bp from damos_quota_init_priv()
selftests/damon: add a test for DAMOS quota goal
...
|
| | |/
| |/|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Nowadays, we call it "GUP-fast", the external interface includes functions
like "get_user_pages_fast()", and we renamed all internal functions to
reflect that as well.
Let's make the config option reflect that.
Link: https://lkml.kernel.org/r/20240402125516.223131-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
"Core:
- IOMMU memory usage observability - This will make the memory used
for IO page tables explicitly visible.
- Simplify arch_setup_dma_ops()
Intel VT-d:
- Consolidate domain cache invalidation
- Remove private data from page fault message
- Allocate DMAR fault interrupts locally
- Cleanup and refactoring
ARM-SMMUv2:
- Support for fault debugging hardware on Qualcomm implementations
- Re-land support for the ->domain_alloc_paging() callback
ARM-SMMUv3:
- Improve handling of MSI allocation failure
- Drop support for the "disable_bypass" cmdline option
- Major rework of the CD creation code, following on directly from
the STE rework merged last time around.
- Add unit tests for the new STE/CD manipulation logic
AMD-Vi:
- Final part of SVA changes with generic IO page fault handling
Renesas IPMMU:
- Add support for R8A779H0 hardware
... and a couple smaller fixes and updates across the sub-tree"
* tag 'iommu-updates-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (80 commits)
iommu/arm-smmu-v3: Make the kunit into a module
arm64: Properly clean up iommu-dma remnants
iommu/amd: Enable Guest Translation after reading IOMMU feature register
iommu/vt-d: Decouple igfx_off from graphic identity mapping
iommu/amd: Fix compilation error
iommu/arm-smmu-v3: Add unit tests for arm_smmu_write_entry
iommu/arm-smmu-v3: Build the whole CD in arm_smmu_make_s1_cd()
iommu/arm-smmu-v3: Move the CD generation for SVA into a function
iommu/arm-smmu-v3: Allocate the CD table entry in advance
iommu/arm-smmu-v3: Make arm_smmu_alloc_cd_ptr()
iommu/arm-smmu-v3: Consolidate clearing a CD table entry
iommu/arm-smmu-v3: Move the CD generation for S1 domains into a function
iommu/arm-smmu-v3: Make CD programming use arm_smmu_write_entry()
iommu/arm-smmu-v3: Add an ops indirection to the STE code
iommu/arm-smmu-qcom: Don't build debug features as a kernel module
iommu/amd: Add SVA domain support
iommu: Add ops->domain_alloc_sva()
iommu/amd: Initial SVA support for AMD IOMMU
iommu/amd: Add support for enable/disable IOPF
iommu/amd: Add IO page fault notifier handler
...
|
| |/ /
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Thanks to the somewhat asymmetrical nature, while removing
iommu_setup_dma_ops() from the arch_setup_dma_ops() flow, I managed to
forget that arm64's teardown path was also specific to iommu-dma. Clean
that up to match, otherwise probe deferral will lead to the arch code
erroneously removing DMA ops set elsewhere.
Reported-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/linux-iommu/Zi_LV28TR-P-PzXi@eriador.lumag.spb.ru/
Fixes: b67483b3c44e ("iommu/dma: Centralise iommu_setup_dma_ops()")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Acked-by: Will Deacon <will@kernel.org>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/d4cc20cbb0c45175e98dd76bf187e2ad6421296d.1714472573.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|\ \ \
| |_|/
|/| |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull modules updates from Luis Chamberlain:
"Finally something fun. Mike Rapoport does some cleanup to allow us to
take out module_alloc() out of modules into a new paint shedded
execmem_alloc() and execmem_free() so to make emphasis these helpers
are actually used outside of modules.
It starts with a non-functional changes API rename / placeholders to
then allow architectures to define their requirements into a new shiny
struct execmem_info with ranges, and requirements for those ranges.
Archs now can intitialize this execmem_info as the last part of
mm_core_init() if they have to diverge from the norm. Each range is a
known type clearly articulated and spelled out in enum execmem_type.
Although a lot of this is major cleanup and prep work for future
enhancements an immediate clear gain is we get to enable KPROBES
without MODULES now. That is ultimately what motiviated to pick this
work up again, now with smaller goal as concrete stepping stone"
* tag 'modules-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of
kprobes: remove dependency on CONFIG_MODULES
powerpc: use CONFIG_EXECMEM instead of CONFIG_MODULES where appropriate
x86/ftrace: enable dynamic ftrace without CONFIG_MODULES
arch: make execmem setup available regardless of CONFIG_MODULES
powerpc: extend execmem_params for kprobes allocations
arm64: extend execmem_info for generated code allocations
riscv: extend execmem_params for generated code allocations
mm/execmem, arch: convert remaining overrides of module_alloc to execmem
mm/execmem, arch: convert simple overrides of module_alloc to execmem
mm: introduce execmem_alloc() and execmem_free()
module: make module_memory_{alloc,free} more self-contained
sparc: simplify module_alloc()
nios2: define virtual address space for modules
mips: module: rename MODULE_START to MODULES_VADDR
arm64: module: remove unneeded call to kasan_alloc_module_shadow()
kallsyms: replace deprecated strncpy with strscpy
module: allow UNUSED_KSYMS_WHITELIST to be relative against objtree.
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Extend execmem parameters to accommodate more complex overrides of
module_alloc() by architectures.
This includes specification of a fallback range required by arm, arm64
and powerpc, EXECMEM_MODULE_DATA type required by powerpc, support for
allocation of KASAN shadow required by s390 and x86 and support for
late initialization of execmem required by arm64.
The core implementation of execmem_alloc() takes care of suppressing
warnings when the initial allocation fails but there is a fallback range
defined.
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Tested-by: Liviu Dudau <liviu@dudau.co.uk>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
* for-next/perf: (41 commits)
arm64: Add USER_STACKTRACE support
drivers/perf: hisi: hns3: Actually use devm_add_action_or_reset()
drivers/perf: hisi: hns3: Fix out-of-bound access when valid event group
drivers/perf: hisi_pcie: Fix out-of-bound access when valid event group
perf/arm-spe: Assign parents for event_source device
perf/arm-smmuv3: Assign parents for event_source device
perf/arm-dsu: Assign parents for event_source device
perf/arm-dmc620: Assign parents for event_source device
perf/arm-ccn: Assign parents for event_source device
perf/arm-cci: Assign parents for event_source device
perf/alibaba_uncore: Assign parents for event_source device
perf/arm_pmu: Assign parents for event_source devices
perf/imx_ddr: Assign parents for event_source devices
perf/qcom: Assign parents for event_source devices
Documentation: qcom-pmu: Use /sys/bus/event_source/devices paths
perf/riscv: Assign parents for event_source devices
perf/thunderx2: Assign parents for event_source devices
Documentation: thunderx2-pmu: Use /sys/bus/event_source/devices paths
perf/xgene: Assign parents for event_source devices
Documentation: xgene-pmu: Use /sys/bus/event_source/devices paths
...
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Currently, userstacktrace is unsupported for ftrace and uprobe
tracers on arm64. This patch uses the perf_callchain_user() code
as blueprint to implement the arch_stack_walk_user() which add
userstacktrace support on arm64.
Meanwhile, we can use arch_stack_walk_user() to simplify the
implementation of perf_callchain_user().
This patch is tested pass with ftrace, uprobe and perf tracers
profiling userstacktrace cases.
Tested-by: chenqiwu <qiwu.chen@transsion.com>
Signed-off-by: chenqiwu <qiwu.chen@transsion.com>
Link: https://lore.kernel.org/r/20231219022229.10230-1-qiwu.chen@transsion.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Let's use the newly-free PTE SW bit (58) to add support for uffd-wp.
The standard handlers are implemented for set/test/clear for both pte
and pmd. Additionally we must also track the uffd-wp state as a pte swp
bit, so use a free swap pte bit (3).
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Link: https://lore.kernel.org/r/20240503144604.151095-5-ryan.roberts@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Re-instate the CPUMASK_OFFSTACK option for arm64 when NR_CPUS > 256.
The bug that led to the initial revert was the cpufreq-dt code not
using zalloc_cpumask_var().
- Make the STARFIVE_STARLINK_PMU config option depend on 64BIT to
prevent compile-test failures on 32-bit architectures due to missing
writeq().
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
perf: starfive: fix 64-bit only COMPILE_TEST condition
ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
[ a.k.a. Revert "Revert "ARM64: Dynamically allocate cpumasks and
increase supported CPUs to 512""; originally reverted because of a
bug in the cpufreq-dt code not using zalloc_cpumask_var() ]
Currently defconfig selects NR_CPUS=256, but some vendors (e.g. Ampere
Computing) are planning to ship systems with 512 CPUs. So that all CPUs on
these systems can be used with defconfig, we'd like to bump NR_CPUS to 512.
Therefore this patch increases the default NR_CPUS from 256 to 512.
As increasing NR_CPUS will increase the size of cpumasks, there's a fear that
this might have a significant impact on stack usage due to code which places
cpumasks on the stack. To mitigate that concern, we can select
CPUMASK_OFFSTACK. As that doesn't seem to be a problem today with
NR_CPUS=256, we only select this when NR_CPUS > 256.
CPUMASK_OFFSTACK configures the cpumasks in the kernel to be
dynamically allocated. This was used in the X86 architecture in the
past to enable support for larger CPU configurations up to 8k cpus.
With that is becomes possible to dynamically size the allocation of
the cpu bitmaps depending on the quantity of processors detected on
bootup. Memory used for cpumasks will increase if the kernel is
run on a machine with more cores.
Further increases may be needed if ARM processor vendors start
supporting more processors. Given the current inflationary trends
in core counts from multiple processor manufacturers this may occur.
There are minor regressions for hackbench. The kernel data size
for 512 cpus is smaller with offstack than with onstack.
Benchmark results using hackbench average over 10 runs of
hackbench -s 512 -l 2000 -g 15 -f 25 -P
on Altra 80 Core
Support for 256 CPUs on stack. Baseline
7.8564 sec
Support for 512 CUs on stack.
7.8713 sec + 0.18%
512 CPUS offstack
7.8916 sec + 0.44%
Kernel size comparison:
text data filename Difference to onstack256 baseline
25755648 9589248 vmlinuz-6.8.0-rc4-onstack256
25755648 9607680 vmlinuz-6.8.0-rc4-onstack512 +0.19%
25755648 9603584 vmlinuz-6.8.0-rc4-offstack512 +0.14%
Tested-by: Eric Mackay <eric.mackay@oracle.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Christoph Lameter (Ampere) <cl@linux.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/37099a57-b655-3b3a-56d0-5f7fbd49d7db@gentwo.org
Link: https://lore.kernel.org/r/20240314125457.186678-1-m.szyprowski@samsung.com
[catalin.marinas@arm.com: use 'select' instead of duplicating 'config CPUMASK_OFFSTACK']
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Pull kvm updates from Paolo Bonzini:
"S390:
- Changes to FPU handling came in via the main s390 pull request
- Only deliver to the guest the SCLP events that userspace has
requested
- More virtual vs physical address fixes (only a cleanup since
virtual and physical address spaces are currently the same)
- Fix selftests undefined behavior
x86:
- Fix a restriction that the guest can't program a PMU event whose
encoding matches an architectural event that isn't included in the
guest CPUID. The enumeration of an architectural event only says
that if a CPU supports an architectural event, then the event can
be programmed *using the architectural encoding*. The enumeration
does NOT say anything about the encoding when the CPU doesn't
report support the event *in general*. It might support it, and it
might support it using the same encoding that made it into the
architectural PMU spec
- Fix a variety of bugs in KVM's emulation of RDPMC (more details on
individual commits) and add a selftest to verify KVM correctly
emulates RDMPC, counter availability, and a variety of other
PMC-related behaviors that depend on guest CPUID and therefore are
easier to validate with selftests than with custom guests (aka
kvm-unit-tests)
- Zero out PMU state on AMD if the virtual PMU is disabled, it does
not cause any bug but it wastes time in various cases where KVM
would check if a PMC event needs to be synthesized
- Optimize triggering of emulated events, with a nice ~10%
performance improvement in VM-Exit microbenchmarks when a vPMU is
exposed to the guest
- Tighten the check for "PMI in guest" to reduce false positives if
an NMI arrives in the host while KVM is handling an IRQ VM-Exit
- Fix a bug where KVM would report stale/bogus exit qualification
information when exiting to userspace with an internal error exit
code
- Add a VMX flag in /proc/cpuinfo to report 5-level EPT support
- Rework TDP MMU root unload, free, and alloc to run with mmu_lock
held for read, e.g. to avoid serializing vCPUs when userspace
deletes a memslot
- Tear down TDP MMU page tables at 4KiB granularity (used to be
1GiB). KVM doesn't support yielding in the middle of processing a
zap, and 1GiB granularity resulted in multi-millisecond lags that
are quite impolite for CONFIG_PREEMPT kernels
- Allocate write-tracking metadata on-demand to avoid the memory
overhead when a kernel is built with i915 virtualization support
but the workloads use neither shadow paging nor i915 virtualization
- Explicitly initialize a variety of on-stack variables in the
emulator that triggered KMSAN false positives
- Fix the debugregs ABI for 32-bit KVM
- Rework the "force immediate exit" code so that vendor code
ultimately decides how and when to force the exit, which allowed
some optimization for both Intel and AMD
- Fix a long-standing bug where kvm_has_noapic_vcpu could be left
elevated if vCPU creation ultimately failed, causing extra
unnecessary work
- Cleanup the logic for checking if the currently loaded vCPU is
in-kernel
- Harden against underflowing the active mmu_notifier invalidation
count, so that "bad" invalidations (usually due to bugs elsehwere
in the kernel) are detected earlier and are less likely to hang the
kernel
x86 Xen emulation:
- Overlay pages can now be cached based on host virtual address,
instead of guest physical addresses. This removes the need to
reconfigure and invalidate the cache if the guest changes the gpa
but the underlying host virtual address remains the same
- When possible, use a single host TSC value when computing the
deadline for Xen timers in order to improve the accuracy of the
timer emulation
- Inject pending upcall events when the vCPU software-enables its
APIC to fix a bug where an upcall can be lost (and to follow Xen's
behavior)
- Fall back to the slow path instead of warning if "fast" IRQ
delivery of Xen events fails, e.g. if the guest has aliased xAPIC
IDs
RISC-V:
- Support exception and interrupt handling in selftests
- New self test for RISC-V architectural timer (Sstc extension)
- New extension support (Ztso, Zacas)
- Support userspace emulation of random number seed CSRs
ARM:
- Infrastructure for building KVM's trap configuration based on the
architectural features (or lack thereof) advertised in the VM's ID
registers
- Support for mapping vfio-pci BARs as Normal-NC (vaguely similar to
x86's WC) at stage-2, improving the performance of interacting with
assigned devices that can tolerate it
- Conversion of KVM's representation of LPIs to an xarray, utilized
to address serialization some of the serialization on the LPI
injection path
- Support for _architectural_ VHE-only systems, advertised through
the absence of FEAT_E2H0 in the CPU's ID register
- Miscellaneous cleanups, fixes, and spelling corrections to KVM and
selftests
LoongArch:
- Set reserved bits as zero in CPUCFG
- Start SW timer only when vcpu is blocking
- Do not restart SW timer when it is expired
- Remove unnecessary CSR register saving during enter guest
- Misc cleanups and fixes as usual
Generic:
- Clean up Kconfig by removing CONFIG_HAVE_KVM, which was basically
always true on all architectures except MIPS (where Kconfig
determines the available depending on CPU capabilities). It is
replaced either by an architecture-dependent symbol for MIPS, and
IS_ENABLED(CONFIG_KVM) everywhere else
- Factor common "select" statements in common code instead of
requiring each architecture to specify it
- Remove thoroughly obsolete APIs from the uapi headers
- Move architecture-dependent stuff to uapi/asm/kvm.h
- Always flush the async page fault workqueue when a work item is
being removed, especially during vCPU destruction, to ensure that
there are no workers running in KVM code when all references to
KVM-the-module are gone, i.e. to prevent a very unlikely
use-after-free if kvm.ko is unloaded
- Grab a reference to the VM's mm_struct in the async #PF worker
itself instead of gifting the worker a reference, so that there's
no need to remember to *conditionally* clean up after the worker
Selftests:
- Reduce boilerplate especially when utilize selftest TAP
infrastructure
- Add basic smoke tests for SEV and SEV-ES, along with a pile of
library support for handling private/encrypted/protected memory
- Fix benign bugs where tests neglect to close() guest_memfd files"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (246 commits)
selftests: kvm: remove meaningless assignments in Makefiles
KVM: riscv: selftests: Add Zacas extension to get-reg-list test
RISC-V: KVM: Allow Zacas extension for Guest/VM
KVM: riscv: selftests: Add Ztso extension to get-reg-list test
RISC-V: KVM: Allow Ztso extension for Guest/VM
RISC-V: KVM: Forward SEED CSR access to user space
KVM: riscv: selftests: Add sstc timer test
KVM: riscv: selftests: Change vcpu_has_ext to a common function
KVM: riscv: selftests: Add guest helper to get vcpu id
KVM: riscv: selftests: Add exception handling support
LoongArch: KVM: Remove unnecessary CSR register saving during enter guest
LoongArch: KVM: Do not restart SW timer when it is expired
LoongArch: KVM: Start SW timer only when vcpu is blocking
LoongArch: KVM: Set reserved bits as zero in CPUCFG
KVM: selftests: Explicitly close guest_memfd files in some gmem tests
KVM: x86/xen: fix recursive deadlock in timer injection
KVM: pfncache: simplify locking and make more self-contained
KVM: x86/xen: remove WARN_ON_ONCE() with false positives in evtchn delivery
KVM: x86/xen: inject vCPU upcall vector when local APIC is enabled
KVM: x86/xen: improve accuracy of Xen timers
...
|
| | |
| | |
| | |
| | |
| | |
| | | |
It has no users anymore.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min
heap optimizations".
- Kuan-Wei Chiu has also sped up the library sorting code in the series
"lib/sort: Optimize the number of swaps and comparisons".
- Alexey Gladkov has added the ability for code running within an IPC
namespace to alter its IPC and MQ limits. The series is "Allow to
change ipc/mq sysctls inside ipc namespace".
- Geert Uytterhoeven has contributed some dhrystone maintenance work in
the series "lib: dhry: miscellaneous cleanups".
- Ryusuke Konishi continues nilfs2 maintenance work in the series
"nilfs2: eliminate kmap and kmap_atomic calls"
"nilfs2: fix kernel bug at submit_bh_wbc()"
- Nathan Chancellor has updated our build tools requirements in the
series "Bump the minimum supported version of LLVM to 13.0.1".
- Muhammad Usama Anjum continues with the selftests maintenance work in
the series "selftests/mm: Improve run_vmtests.sh".
- Oleg Nesterov has done some maintenance work against the signal code
in the series "get_signal: minor cleanups and fix".
Plus the usual shower of singleton patches in various parts of the tree.
Please see the individual changelogs for details.
* tag 'mm-nonmm-stable-2024-03-14-09-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (77 commits)
nilfs2: prevent kernel bug at submit_bh_wbc()
nilfs2: fix failure to detect DAT corruption in btree and direct mappings
ocfs2: enable ocfs2_listxattr for special files
ocfs2: remove SLAB_MEM_SPREAD flag usage
assoc_array: fix the return value in assoc_array_insert_mid_shortcut()
buildid: use kmap_local_page()
watchdog/core: remove sysctl handlers from public header
nilfs2: use div64_ul() instead of do_div()
mul_u64_u64_div_u64: increase precision by conditionally swapping a and b
kexec: copy only happens before uchunk goes to zero
get_signal: don't initialize ksig->info if SIGNAL_GROUP_EXIT/group_exec_task
get_signal: hide_si_addr_tag_bits: fix the usage of uninitialized ksig
get_signal: don't abuse ksig->info.si_signo and ksig->sig
const_structs.checkpatch: add device_type
Normalise "name (ad@dr)" MODULE_AUTHORs to "name <ad@dr>"
dyndbg: replace kstrdup() + strchr() with kstrdup_and_replace()
list: leverage list_is_head() for list_entry_is_head()
nilfs2: MAINTAINERS: drop unreachable project mirror site
smp: make __smp_processor_id() 0-argument macro
fat: fix uninitialized field in nostale filehandles
...
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Now that the minimum supported version of LLVM for building the kernel has
been bumped to 13.0.1, several conditions become tautologies, as they will
always be true because the build will fail during the configuration stage
for older LLVM versions. Drop them, as they are unnecessary.
Link: https://lkml.kernel.org/r/20240125-bump-min-llvm-ver-to-13-0-1-v1-5-f5ff9bda41c5@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: "Aneesh Kumar K.V (IBM)" <aneesh.kumar@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Conor Dooley <conor@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
| |/ /
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
reviews.llvm.org was LLVM's Phabricator instances for code review. It has
been abandoned in favor of GitHub pull requests. While the majority of
links in the kernel sources still work because of the work Fangrui has
done turning the dynamic Phabricator instance into a static archive, there
are some issues with that work, so preemptively convert all the links in
the kernel sources to point to the commit on GitHub.
Most of the commits have the corresponding differential review link in the
commit message itself so there should not be any loss of fidelity in the
relevant information.
Link: https://discourse.llvm.org/t/update-on-github-pull-requests/71540/172
Link: https://lkml.kernel.org/r/20240109-update-llvm-links-v1-2-eb09b59db071@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Fangrui Song <maskray@google.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Mykola Lysenko <mykolal@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Sumanth Korikkar has taught s390 to allocate hotplug-time page frames
from hotplugged memory rather than only from main memory. Series
"implement "memmap on memory" feature on s390".
- More folio conversions from Matthew Wilcox in the series
"Convert memcontrol charge moving to use folios"
"mm: convert mm counter to take a folio"
- Chengming Zhou has optimized zswap's rbtree locking, providing
significant reductions in system time and modest but measurable
reductions in overall runtimes. The series is "mm/zswap: optimize the
scalability of zswap rb-tree".
- Chengming Zhou has also provided the series "mm/zswap: optimize zswap
lru list" which provides measurable runtime benefits in some
swap-intensive situations.
- And Chengming Zhou further optimizes zswap in the series "mm/zswap:
optimize for dynamic zswap_pools". Measured improvements are modest.
- zswap cleanups and simplifications from Yosry Ahmed in the series
"mm: zswap: simplify zswap_swapoff()".
- In the series "Add DAX ABI for memmap_on_memory", Vishal Verma has
contributed several DAX cleanups as well as adding a sysfs tunable to
control the memmap_on_memory setting when the dax device is
hotplugged as system memory.
- Johannes Weiner has added the large series "mm: zswap: cleanups",
which does that.
- More DAMON work from SeongJae Park in the series
"mm/damon: make DAMON debugfs interface deprecation unignorable"
"selftests/damon: add more tests for core functionalities and corner cases"
"Docs/mm/damon: misc readability improvements"
"mm/damon: let DAMOS feeds and tame/auto-tune itself"
- In the series "mm/mempolicy: weighted interleave mempolicy and sysfs
extension" Rakie Kim has developed a new mempolicy interleaving
policy wherein we allocate memory across nodes in a weighted fashion
rather than uniformly. This is beneficial in heterogeneous memory
environments appearing with CXL.
- Christophe Leroy has contributed some cleanup and consolidation work
against the ARM pagetable dumping code in the series "mm: ptdump:
Refactor CONFIG_DEBUG_WX and check_wx_pages debugfs attribute".
- Luis Chamberlain has added some additional xarray selftesting in the
series "test_xarray: advanced API multi-index tests".
- Muhammad Usama Anjum has reworked the selftest code to make its
human-readable output conform to the TAP ("Test Anything Protocol")
format. Amongst other things, this opens up the use of third-party
tools to parse and process out selftesting results.
- Ryan Roberts has added fork()-time PTE batching of THP ptes in the
series "mm/memory: optimize fork() with PTE-mapped THP". Mainly
targeted at arm64, this significantly speeds up fork() when the
process has a large number of pte-mapped folios.
- David Hildenbrand also gets in on the THP pte batching game in his
series "mm/memory: optimize unmap/zap with PTE-mapped THP". It
implements batching during munmap() and other pte teardown
situations. The microbenchmark improvements are nice.
- And in the series "Transparent Contiguous PTEs for User Mappings"
Ryan Roberts further utilizes arm's pte's contiguous bit ("contpte
mappings"). Kernel build times on arm64 improved nicely. Ryan's
series "Address some contpte nits" provides some followup work.
- In the series "mm/hugetlb: Restore the reservation" Breno Leitao has
fixed an obscure hugetlb race which was causing unnecessary page
faults. He has also added a reproducer under the selftest code.
- In the series "selftests/mm: Output cleanups for the compaction
test", Mark Brown did what the title claims.
- Kinsey Ho has added the series "mm/mglru: code cleanup and
refactoring".
- Even more zswap material from Nhat Pham. The series "fix and extend
zswap kselftests" does as claimed.
- In the series "Introduce cpu_dcache_is_aliasing() to fix DAX
regression" Mathieu Desnoyers has cleaned up and fixed rather a mess
in our handling of DAX on archiecctures which have virtually aliasing
data caches. The arm architecture is the main beneficiary.
- Lokesh Gidra's series "per-vma locks in userfaultfd" provides
dramatic improvements in worst-case mmap_lock hold times during
certain userfaultfd operations.
- Some page_owner enhancements and maintenance work from Oscar Salvador
in his series
"page_owner: print stacks and their outstanding allocations"
"page_owner: Fixup and cleanup"
- Uladzislau Rezki has contributed some vmalloc scalability
improvements in his series "Mitigate a vmap lock contention". It
realizes a 12x improvement for a certain microbenchmark.
- Some kexec/crash cleanup work from Baoquan He in the series "Split
crash out from kexec and clean up related config items".
- Some zsmalloc maintenance work from Chengming Zhou in the series
"mm/zsmalloc: fix and optimize objects/page migration"
"mm/zsmalloc: some cleanup for get/set_zspage_mapping()"
- Zi Yan has taught the MM to perform compaction on folios larger than
order=0. This a step along the path to implementaton of the merging
of large anonymous folios. The series is named "Enable >0 order folio
memory compaction".
- Christoph Hellwig has done quite a lot of cleanup work in the
pagecache writeback code in his series "convert write_cache_pages()
to an iterator".
- Some modest hugetlb cleanups and speedups in Vishal Moola's series
"Handle hugetlb faults under the VMA lock".
- Zi Yan has changed the page splitting code so we can split huge pages
into sizes other than order-0 to better utilize large folios. The
series is named "Split a folio to any lower order folios".
- David Hildenbrand has contributed the series "mm: remove
total_mapcount()", a cleanup.
- Matthew Wilcox has sought to improve the performance of bulk memory
freeing in his series "Rearrange batched folio freeing".
- Gang Li's series "hugetlb: parallelize hugetlb page init on boot"
provides large improvements in bootup times on large machines which
are configured to use large numbers of hugetlb pages.
- Matthew Wilcox's series "PageFlags cleanups" does that.
- Qi Zheng's series "minor fixes and supplement for ptdesc" does that
also. S390 is affected.
- Cleanups to our pagemap utility functions from Peter Xu in his series
"mm/treewide: Replace pXd_large() with pXd_leaf()".
- Nico Pache has fixed a few things with our hugepage selftests in his
series "selftests/mm: Improve Hugepage Test Handling in MM
Selftests".
- Also, of course, many singleton patches to many things. Please see
the individual changelogs for details.
* tag 'mm-stable-2024-03-13-20-04' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (435 commits)
mm/zswap: remove the memcpy if acomp is not sleepable
crypto: introduce: acomp_is_async to expose if comp drivers might sleep
memtest: use {READ,WRITE}_ONCE in memory scanning
mm: prohibit the last subpage from reusing the entire large folio
mm: recover pud_leaf() definitions in nopmd case
selftests/mm: skip the hugetlb-madvise tests on unmet hugepage requirements
selftests/mm: skip uffd hugetlb tests with insufficient hugepages
selftests/mm: dont fail testsuite due to a lack of hugepages
mm/huge_memory: skip invalid debugfs new_order input for folio split
mm/huge_memory: check new folio order when split a folio
mm, vmscan: retry kswapd's priority loop with cache_trim_mode off on failure
mm: add an explicit smp_wmb() to UFFDIO_CONTINUE
mm: fix list corruption in put_pages_list
mm: remove folio from deferred split list before uncharging it
filemap: avoid unnecessary major faults in filemap_fault()
mm,page_owner: drop unnecessary check
mm,page_owner: check for null stack_record before bumping its refcount
mm: swap: fix race between free_swap_and_cache() and swapoff()
mm/treewide: align up pXd_leaf() retval across archs
mm/treewide: drop pXd_large()
...
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Patch series "Split crash out from kexec and clean up related config
items", v3.
Motivation:
=============
Previously, LKP reported a building error. When investigating, it can't
be resolved reasonablly with the present messy kdump config items.
https://lore.kernel.org/oe-kbuild-all/202312182200.Ka7MzifQ-lkp@intel.com/
The kdump (crash dumping) related config items could causes confusions:
Firstly,
CRASH_CORE enables codes including
- crashkernel reservation;
- elfcorehdr updating;
- vmcoreinfo exporting;
- crash hotplug handling;
Now fadump of powerpc, kcore dynamic debugging and kdump all selects
CRASH_CORE, while fadump
- fadump needs crashkernel parsing, vmcoreinfo exporting, and accessing
global variable 'elfcorehdr_addr';
- kcore only needs vmcoreinfo exporting;
- kdump needs all of the current kernel/crash_core.c.
So only enabling PROC_CORE or FA_DUMP will enable CRASH_CORE, this
mislead people that we enable crash dumping, actual it's not.
Secondly,
It's not reasonable to allow KEXEC_CORE select CRASH_CORE.
Because KEXEC_CORE enables codes which allocate control pages, copy
kexec/kdump segments, and prepare for switching. These codes are
shared by both kexec reboot and kdump. We could want kexec reboot,
but disable kdump. In that case, CRASH_CORE should not be selected.
--------------------
CONFIG_CRASH_CORE=y
CONFIG_KEXEC_CORE=y
CONFIG_KEXEC=y
CONFIG_KEXEC_FILE=y
---------------------
Thirdly,
It's not reasonable to allow CRASH_DUMP select KEXEC_CORE.
That could make KEXEC_CORE, CRASH_DUMP are enabled independently from
KEXEC or KEXEC_FILE. However, w/o KEXEC or KEXEC_FILE, the KEXEC_CORE
code built in doesn't make any sense because no kernel loading or
switching will happen to utilize the KEXEC_CORE code.
---------------------
CONFIG_CRASH_CORE=y
CONFIG_KEXEC_CORE=y
CONFIG_CRASH_DUMP=y
---------------------
In this case, what is worse, on arch sh and arm, KEXEC relies on MMU,
while CRASH_DUMP can still be enabled when !MMU, then compiling error is
seen as the lkp test robot reported in above link.
------arch/sh/Kconfig------
config ARCH_SUPPORTS_KEXEC
def_bool MMU
config ARCH_SUPPORTS_CRASH_DUMP
def_bool BROKEN_ON_SMP
---------------------------
Changes:
===========
1, split out crash_reserve.c from crash_core.c;
2, split out vmcore_infoc. from crash_core.c;
3, move crash related codes in kexec_core.c into crash_core.c;
4, remove dependency of FA_DUMP on CRASH_DUMP;
5, clean up kdump related config items;
6, wrap up crash codes in crash related ifdefs on all 8 arch-es
which support crash dumping, except of ppc;
Achievement:
===========
With above changes, I can rearrange the config item logic as below (the right
item depends on or is selected by the left item):
PROC_KCORE -----------> VMCORE_INFO
|----------> VMCORE_INFO
FA_DUMP----|
|----------> CRASH_RESERVE
---->VMCORE_INFO
/
|---->CRASH_RESERVE
KEXEC --| /|
|--> KEXEC_CORE--> CRASH_DUMP-->/-|---->PROC_VMCORE
KEXEC_FILE --| \ |
\---->CRASH_HOTPLUG
KEXEC --|
|--> KEXEC_CORE (for kexec reboot only)
KEXEC_FILE --|
Test
========
On all 8 architectures, including x86_64, arm64, s390x, sh, arm, mips,
riscv, loongarch, I did below three cases of config item setting and
building all passed. Take configs on x86_64 as exampmle here:
(1) Both CONFIG_KEXEC and KEXEC_FILE is unset, then all kexec/kdump
items are unset automatically:
# Kexec and crash features
# CONFIG_KEXEC is not set
# CONFIG_KEXEC_FILE is not set
# end of Kexec and crash features
(2) set CONFIG_KEXEC_FILE and 'make olddefconfig':
---------------
# Kexec and crash features
CONFIG_CRASH_RESERVE=y
CONFIG_VMCORE_INFO=y
CONFIG_KEXEC_CORE=y
CONFIG_KEXEC_FILE=y
CONFIG_CRASH_DUMP=y
CONFIG_CRASH_HOTPLUG=y
CONFIG_CRASH_MAX_MEMORY_RANGES=8192
# end of Kexec and crash features
---------------
(3) unset CONFIG_CRASH_DUMP in case 2 and execute 'make olddefconfig':
------------------------
# Kexec and crash features
CONFIG_KEXEC_CORE=y
CONFIG_KEXEC_FILE=y
# end of Kexec and crash features
------------------------
Note:
For ppc, it needs investigation to make clear how to split out crash
code in arch folder. Hope Hari and Pingfan can help have a look, see if
it's doable. Now, I make it either have both kexec and crash enabled, or
disable both of them altogether.
This patch (of 14):
Both kdump and fa_dump of ppc rely on crashkernel reservation. Move the
relevant codes into separate files: crash_reserve.c,
include/linux/crash_reserve.h.
And also add config item CRASH_RESERVE to control its enabling of the
codes. And update config items which has relationship with crashkernel
reservation.
And also change ifdeffery from CONFIG_CRASH_CORE to CONFIG_CRASH_RESERVE
when those scopes are only crashkernel reservation related.
And also rename arch/XXX/include/asm/{crash_core.h => crash_reserve.h} on
arm64, x86 and risc-v because those architectures' crash_core.h is only
related to crashkernel reservation.
[akpm@linux-foundation.org: s/CRASH_RESEERVE/CRASH_RESERVE/, per Klara Modin]
Link: https://lkml.kernel.org/r/20240124051254.67105-1-bhe@redhat.com
Link: https://lkml.kernel.org/r/20240124051254.67105-2-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Pingfan Liu <piliu@redhat.com>
Cc: Klara Modin <klarasmodin@gmail.com>
Cc: Michael Kelley <mhklinux@outlook.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
| |/ /
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
With the ptep API sufficiently refactored, we can now introduce a new
"contpte" API layer, which transparently manages the PTE_CONT bit for user
mappings.
In this initial implementation, only suitable batches of PTEs, set via
set_ptes(), are mapped with the PTE_CONT bit. Any subsequent modification
of individual PTEs will cause an "unfold" operation to repaint the contpte
block as individual PTEs before performing the requested operation.
While, a modification of a single PTE could cause the block of PTEs to
which it belongs to become eligible for "folding" into a contpte entry,
"folding" is not performed in this initial implementation due to the costs
of checking the requirements are met. Due to this, contpte mappings will
degrade back to normal pte mappings over time if/when protections are
changed. This will be solved in a future patch.
Since a contpte block only has a single access and dirty bit, the semantic
here changes slightly; when getting a pte (e.g. ptep_get()) that is part
of a contpte mapping, the access and dirty information are pulled from the
block (so all ptes in the block return the same access/dirty info). When
changing the access/dirty info on a pte (e.g. ptep_set_access_flags())
that is part of a contpte mapping, this change will affect the whole
contpte block. This is works fine in practice since we guarantee that
only a single folio is mapped by a contpte block, and the core-mm tracks
access/dirty information per folio.
In order for the public functions, which used to be pure inline, to
continue to be callable by modules, export all the contpte_* symbols that
are now called by those public inline functions.
The feature is enabled/disabled with the ARM64_CONTPTE Kconfig parameter
at build time. It defaults to enabled as long as its dependency,
TRANSPARENT_HUGEPAGE is also enabled. The core-mm depends upon
TRANSPARENT_HUGEPAGE to be able to allocate large folios, so if its not
enabled, then there is no chance of meeting the physical contiguity
requirement for contpte mappings.
Link: https://lkml.kernel.org/r/20240215103205.2607016-13-ryan.roberts@arm.com
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Barry Song <21cnbao@gmail.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|\ \ \
| | |/
| |/|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"The major features are support for LPA2 (52-bit VA/PA with 4K and 16K
pages), the dpISA extension and Rust enabled on arm64. The changes are
mostly contained within the usual arch/arm64/, drivers/perf, the arm64
Documentation and kselftests. The exception is the Rust support which
touches some generic build files.
Summary:
- Reorganise the arm64 kernel VA space and add support for LPA2 (at
stage 1, KVM stage 2 was merged earlier) - 52-bit VA/PA address
range with 4KB and 16KB pages
- Enable Rust on arm64
- Support for the 2023 dpISA extensions (data processing ISA), host
only
- arm64 perf updates:
- StarFive's StarLink (integrates one or more CPU cores with a
shared L3 memory system) PMU support
- Enable HiSilicon Erratum 162700402 quirk for HIP09
- Several updates for the HiSilicon PCIe PMU driver
- Arm CoreSight PMU support
- Convert all drivers under drivers/perf/ to use .remove_new()
- Miscellaneous:
- Don't enable workarounds for "rare" errata by default
- Clean up the DAIF flags handling for EL0 returns (in preparation
for NMI support)
- Kselftest update for ptrace()
- Update some of the sysreg field definitions
- Slight improvement in the code generation for inline asm I/O
accessors to permit offset addressing
- kretprobes: acquire regs via a BRK exception (previously done
via a trampoline handler)
- SVE/SME cleanups, comment updates
- Allow CALL_OPS+CC_OPTIMIZE_FOR_SIZE with clang (previously
disabled due to gcc silently ignoring -falign-functions=N)"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (134 commits)
Revert "mm: add arch hook to validate mmap() prot flags"
Revert "arm64: mm: add support for WXN memory translation attribute"
Revert "ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512"
ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512
kselftest/arm64: Add 2023 DPISA hwcap test coverage
kselftest/arm64: Add basic FPMR test
kselftest/arm64: Handle FPMR context in generic signal frame parser
arm64/hwcap: Define hwcaps for 2023 DPISA features
arm64/ptrace: Expose FPMR via ptrace
arm64/signal: Add FPMR signal handling
arm64/fpsimd: Support FEAT_FPMR
arm64/fpsimd: Enable host kernel access to FPMR
arm64/cpufeature: Hook new identification registers up to cpufeature
docs: perf: Fix build warning of hisi-pcie-pmu.rst
perf: starfive: Only allow COMPILE_TEST for 64-bit architectures
MAINTAINERS: Add entry for StarFive StarLink PMU
docs: perf: Add description for StarFive's StarLink PMU
dt-bindings: perf: starfive: Add JH8100 StarLink PMU
perf: starfive: Add StarLink PMU support
docs: perf: Update usage for target filter of hisi-pcie-pmu
...
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This reverts commit 50e3ed0f93f4f62ed2aa83de5db6cb84ecdd5707.
The SCTLR_EL1.WXN control forces execute-never when a page has write
permissions. While the idea of hardening such write/exec combinations is
good, with permissions indirection enabled (FEAT_PIE) this control
becomes RES0. FEAT_PIE introduces a slightly different form of WXN which
only has an effect when the base permission is RWX and the write is
toggled by the permission overlay (FEAT_POE, not yet supported by the
arm64 kernel). Revert the patch for now.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/ZfGESD3a91lxH367@arm.com
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This reverts commit 0499a78369adacec1af29340b71ff8dd375b4697.
Enabling CPUMASK_OFFSTACK on arm64 triggers a warning in the
dev_pm_opp_set_config() function followed by a failure to set the
regulators and cpufreq-dt probing error. There is no apparent reason why
this happens, so revert this commit until further investigation.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/c1f2902d-cefc-4122-9b86-d1d32911f590@samsung.com
|
| |\ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
* for-next/stage1-lpa2: (48 commits)
: Add support for LPA2 and WXN and stage 1
arm64/mm: Avoid ID mapping of kpti flag if it is no longer needed
arm64/mm: Use generic __pud_free() helper in pud_free() implementation
arm64: gitignore: ignore relacheck
arm64: Use Signed/Unsigned enums for TGRAN{4,16,64} and VARange
arm64: mm: Make PUD folding check in set_pud() a runtime check
arm64: mm: add support for WXN memory translation attribute
mm: add arch hook to validate mmap() prot flags
arm64: defconfig: Enable LPA2 support
arm64: Enable 52-bit virtual addressing for 4k and 16k granule configs
arm64: kvm: avoid CONFIG_PGTABLE_LEVELS for runtime levels
arm64: ptdump: Deal with translation levels folded at runtime
arm64: ptdump: Disregard unaddressable VA space
arm64: mm: Add support for folding PUDs at runtime
arm64: kasan: Reduce minimum shadow alignment and enable 5 level paging
arm64: mm: Add 5 level paging support to fixmap and swapper handling
arm64: Enable LPA2 at boot if supported by the system
arm64: mm: add LPA2 and 5 level paging support to G-to-nG conversion
arm64: mm: Add definitions to support 5 levels of paging
arm64: mm: Add LPA2 support to phys<->pte conversion routines
arm64: mm: Wire up TCR.DS bit to PTE shareability fields
...
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The AArch64 virtual memory system supports a global WXN control, which
can be enabled to make all writable mappings implicitly no-exec. This is
a useful hardening feature, as it prevents mistakes in managing page
table permissions from being exploited to attack the system.
When enabled at EL1, the restrictions apply to both EL1 and EL0. EL1 is
completely under our control, and has been cleaned up to allow WXN to be
enabled from boot onwards. EL0 is not under our control, but given that
widely deployed security features such as selinux or PaX already limit
the ability of user space to create mappings that are writable and
executable at the same time, the impact of enabling this for EL0 is
expected to be limited. (For this reason, common user space libraries
that have a legitimate need for manipulating executable code already
carry fallbacks such as [0].)
If enabled at compile time, the feature can still be disabled at boot if
needed, by passing arm64.nowxn on the kernel command line.
[0] https://github.com/libffi/libffi/blob/master/src/closures.c#L440
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-88-ardb+git@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
We typically enable support in defconfig for all architectural features
for which we can detect at runtime if the hardware actually supports
them.
Now that we have implemented support for LPA2 based 52-bit virtual
addressing in a way that should not impact 48-bit operation on non-LPA2
CPU, we can do the same, and enable 52-bit virtual addressing by
default.
Catalin adds:
Currently the "Virtual address space size" arch/arm64/Kconfig menu
entry sets different defaults for each page size. However, all are
overridden by the defconfig to 48 bits. Set the new default in
Kconfig and remove the defconfig line.
[ardb: squash follow-up fix from Catalin]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-86-ardb+git@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Update Kconfig to permit 4k and 16k granule configurations to be built
with 52-bit virtual addressing, now that all the prerequisites are in
place.
While at it, update the feature description so it matches on the
appropriate feature bits depending on the page size. For simplicity,
let's just keep ARM64_HAS_VA52 as the feature name.
Note that LPA2 based 52-bit virtual addressing requires 52-bit physical
addressing support to be enabled as well, as programming TCR.TxSZ to
values below 16 is not allowed unless TCR.DS is set, which is what
activates the 52-bit physical addressing support.
While supporting the converse (52-bit physical addressing without 52-bit
virtual addressing) would be possible in principle, let's keep things
simple, by only allowing these features to be enabled at the same time.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-85-ardb+git@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Allow the KASAN init code to deal with 5 levels of paging, and relax the
requirement that the shadow region is aligned to the top level pgd_t
size. This is necessary for LPA2 based 52-bit virtual addressing, where
the KASAN shadow will never be aligned to the pgd_t size. Allowing this
also enables the 16k/48-bit case for KASAN, which is a nice bonus.
This involves some hackery to manipulate the root and next level page
tables without having to distinguish all the various configurations,
including 16k/48-bits (which has a two entry pgd_t level), and LPA2
configurations running with one translation level less on non-LPA2
hardware.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-80-ardb+git@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
| | |/
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When LPA2 is enabled, bits 8 and 9 of page and block descriptors become
part of the output address instead of carrying shareability attributes
for the region in question.
So avoid setting these bits if TCR.DS == 1, which means LPA2 is enabled.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240214122845.2033971-74-ardb+git@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
| | | | |
| | \ | |
| |\ \ \
| | |_|/
| |/| |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
'for-next/misc', 'for-next/daif-cleanup', 'for-next/kselftest', 'for-next/documentation', 'for-next/sysreg' and 'for-next/dpisa', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* arm64/for-next/perf: (39 commits)
docs: perf: Fix build warning of hisi-pcie-pmu.rst
perf: starfive: Only allow COMPILE_TEST for 64-bit architectures
MAINTAINERS: Add entry for StarFive StarLink PMU
docs: perf: Add description for StarFive's StarLink PMU
dt-bindings: perf: starfive: Add JH8100 StarLink PMU
perf: starfive: Add StarLink PMU support
docs: perf: Update usage for target filter of hisi-pcie-pmu
drivers/perf: hisi_pcie: Merge find_related_event() and get_event_idx()
drivers/perf: hisi_pcie: Relax the check on related events
drivers/perf: hisi_pcie: Check the target filter properly
drivers/perf: hisi_pcie: Add more events for counting TLP bandwidth
drivers/perf: hisi_pcie: Fix incorrect counting under metric mode
drivers/perf: hisi_pcie: Introduce hisi_pcie_pmu_get_event_ctrl_val()
drivers/perf: hisi_pcie: Rename hisi_pcie_pmu_{config,clear}_filter()
drivers/perf: hisi: Enable HiSilicon Erratum 162700402 quirk for HIP09
perf/arm_cspmu: Add devicetree support
dt-bindings/perf: Add Arm CoreSight PMU
perf/arm_cspmu: Simplify counter reset
perf/arm_cspmu: Simplify attribute groups
perf/arm_cspmu: Simplify initialisation
...
* for-next/reorg-va-space:
: Reorganise the arm64 kernel VA space in preparation for LPA2 support
: (52-bit VA/PA).
arm64: kaslr: Adjust randomization range dynamically
arm64: mm: Reclaim unused vmemmap region for vmalloc use
arm64: vmemmap: Avoid base2 order of struct page size to dimension region
arm64: ptdump: Discover start of vmemmap region at runtime
arm64: ptdump: Allow all region boundaries to be defined at boot time
arm64: mm: Move fixmap region above vmemmap region
arm64: mm: Move PCI I/O emulation region above the vmemmap region
* for-next/rust-for-arm64:
: Enable Rust support for arm64
arm64: rust: Enable Rust support for AArch64
rust: Refactor the build target to allow the use of builtin targets
* for-next/misc:
: Miscellaneous arm64 patches
ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512
arm64: Remove enable_daif macro
arm64/hw_breakpoint: Directly use ESR_ELx_WNR for an watchpoint exception
arm64: cpufeatures: Clean up temporary variable to simplify code
arm64: Update setup_arch() comment on interrupt masking
arm64: remove unnecessary ifdefs around is_compat_task()
arm64: ftrace: Don't forbid CALL_OPS+CC_OPTIMIZE_FOR_SIZE with Clang
arm64/sme: Ensure that all fields in SMCR_EL1 are set to known values
arm64/sve: Ensure that all fields in ZCR_EL1 are set to known values
arm64/sve: Document that __SVE_VQ_MAX is much larger than needed
arm64: make member of struct pt_regs and it's offset macro in the same order
arm64: remove unneeded BUILD_BUG_ON assertion
arm64: kretprobes: acquire the regs via a BRK exception
arm64: io: permit offset addressing
arm64: errata: Don't enable workarounds for "rare" errata by default
* for-next/daif-cleanup:
: Clean up DAIF handling for EL0 returns
arm64: Unmask Debug + SError in do_notify_resume()
arm64: Move do_notify_resume() to entry-common.c
arm64: Simplify do_notify_resume() DAIF masking
* for-next/kselftest:
: Miscellaneous arm64 kselftest patches
kselftest/arm64: Test that ptrace takes effect in the target process
* for-next/documentation:
: arm64 documentation patches
arm64/sme: Remove spurious 'is' in SME documentation
arm64/fp: Clarify effect of setting an unsupported system VL
arm64/sme: Fix cut'n'paste in ABI document
arm64/sve: Remove bitrotted comment about syscall behaviour
* for-next/sysreg:
: sysreg updates
arm64/sysreg: Update ID_AA64DFR0_EL1 register
arm64/sysreg: Update ID_DFR0_EL1 register fields
arm64/sysreg: Add register fields for ID_AA64DFR1_EL1
* for-next/dpisa:
: Support for 2023 dpISA extensions
kselftest/arm64: Add 2023 DPISA hwcap test coverage
kselftest/arm64: Add basic FPMR test
kselftest/arm64: Handle FPMR context in generic signal frame parser
arm64/hwcap: Define hwcaps for 2023 DPISA features
arm64/ptrace: Expose FPMR via ptrace
arm64/signal: Add FPMR signal handling
arm64/fpsimd: Support FEAT_FPMR
arm64/fpsimd: Enable host kernel access to FPMR
arm64/cpufeature: Hook new identification registers up to cpufeature
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Currently defconfig selects NR_CPUS=256, but some vendors (e.g. Ampere
Computing) are planning to ship systems with 512 CPUs. So that all CPUs on
these systems can be used with defconfig, we'd like to bump NR_CPUS to 512.
Therefore this patch increases the default NR_CPUS from 256 to 512.
As increasing NR_CPUS will increase the size of cpumasks, there's a fear that
this might have a significant impact on stack usage due to code which places
cpumasks on the stack. To mitigate that concern, we can select
CPUMASK_OFFSTACK. As that doesn't seem to be a problem today with
NR_CPUS=256, we only select this when NR_CPUS > 256.
CPUMASK_OFFSTACK configures the cpumasks in the kernel to be
dynamically allocated. This was used in the X86 architecture in the
past to enable support for larger CPU configurations up to 8k cpus.
With that is becomes possible to dynamically size the allocation of
the cpu bitmaps depending on the quantity of processors detected on
bootup. Memory used for cpumasks will increase if the kernel is
run on a machine with more cores.
Further increases may be needed if ARM processor vendors start
supporting more processors. Given the current inflationary trends
in core counts from multiple processor manufacturers this may occur.
There are minor regressions for hackbench. The kernel data size
for 512 cpus is smaller with offstack than with onstack.
Benchmark results using hackbench average over 10 runs of
hackbench -s 512 -l 2000 -g 15 -f 25 -P
on Altra 80 Core
Support for 256 CPUs on stack. Baseline
7.8564 sec
Support for 512 CUs on stack.
7.8713 sec + 0.18%
512 CPUS offstack
7.8916 sec + 0.44%
Kernel size comparison:
text data filename Difference to onstack256 baseline
25755648 9589248 vmlinuz-6.8.0-rc4-onstack256
25755648 9607680 vmlinuz-6.8.0-rc4-onstack512 +0.19%
25755648 9603584 vmlinuz-6.8.0-rc4-offstack512 +0.14%
Tested-by: Eric Mackay <eric.mackay@oracle.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Christoph Lameter (Ampere) <cl@linux.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/37099a57-b655-3b3a-56d0-5f7fbd49d7db@gentwo.org
[catalin.marinas@arm.com: use 'select' instead of duplicating 'config CPUMASK_OFFSTACK']
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Per commit b3f11af9b2ce ("arm64: ftrace: forbid CALL_OPS with
CC_OPTIMIZE_FOR_SIZE"), GCC is silently ignoring `-falign-functions=N`
when passed `-Os`, causing functions to be improperly aligned. This
doesn't seem to be a problem with Clang though, where enabling CALL_OPS
with CC_OPTIMIZE_FOR_SIZE doesn't spit out any warnings at boot about
misaligned patch-sites. Only forbid CALL_OPS if GCC is used and we're
optimizing for size so that CALL_OPS can be used with clang optimizing
for size.
Cc: Jason Ling <jasonling@chromium.org>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: llvm@lists.linux.dev
Fixes: b3f11af9b2ce ("arm64: ftrace: forbid CALL_OPS with CC_OPTIMIZE_FOR_SIZE")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20240223064032.3463229-1-swboyd@chromium.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
| | |/
| |/|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Arm classifies some of its CPU errata as "rare", indicating that the
hardware error is unlikely to occur in practice. Given that the cost of
errata workarounds can often be significant in terms of power and
performance, don't enable workarounds for "rare" errata by default and
update our documentation to reflect that.
Cc: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20240209183916.25860-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This commit provides the build flags for Rust for AArch64. The core Rust
support already in the kernel does the rest. This enables the PAC ret
and BTI options in the Rust build flags to match the options that are
used when building C.
The Rust samples have been tested with this commit.
Signed-off-by: Jamie Cunliffe <Jamie.Cunliffe@arm.com>
Acked-by: Will Deacon <will@kernel.org>
Tested-by: Dirk Behme <dirk.behme@de.bosch.com>
Tested-by: Boqun Feng <boqun.feng@gmail.com>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Alice Ryhl <aliceryhl@google.com>
Tested-by: Fabien Parent <fabien.parent@linaro.org>
Link: https://lore.kernel.org/r/20231020155056.3495121-3-Jamie.Cunliffe@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
"As is pretty normal for this tree, there are changes all over the
place, especially for small fixes, selftest improvements, and improved
macro usability.
Some header changes ended up landing via this tree as they depended on
the string header cleanups. Also, a notable set of changes is the work
for the reintroduction of the UBSAN signed integer overflow sanitizer
so that we can continue to make improvements on the compiler side to
make this sanitizer a more viable future security hardening option.
Summary:
- string.h and related header cleanups (Tanzir Hasan, Andy
Shevchenko)
- VMCI memcpy() usage and struct_size() cleanups (Vasiliy Kovalev,
Harshit Mogalapalli)
- selftests/powerpc: Fix load_unaligned_zeropad build failure
(Michael Ellerman)
- hardened Kconfig fragment updates (Marco Elver, Lukas Bulwahn)
- Handle tail call optimization better in LKDTM (Douglas Anderson)
- Use long form types in overflow.h (Andy Shevchenko)
- Add flags param to string_get_size() (Andy Shevchenko)
- Add Coccinelle script for potential struct_size() use (Jacob
Keller)
- Fix objtool corner case under KCFI (Josh Poimboeuf)
- Drop 13 year old backward compat CAP_SYS_ADMIN check (Jingzi Meng)
- Add str_plural() helper (Michal Wajdeczko, Kees Cook)
- Ignore relocations in .notes section
- Add comments to explain how __is_constexpr() works
- Fix m68k stack alignment expectations in stackinit Kunit test
- Convert string selftests to KUnit
- Add KUnit tests for fortified string functions
- Improve reporting during fortified string warnings
- Allow non-type arg to type_max() and type_min()
- Allow strscpy() to be called with only 2 arguments
- Add binary mode to leaking_addresses scanner
- Various small cleanups to leaking_addresses scanner
- Adding wrapping_*() arithmetic helper
- Annotate initial signed integer wrap-around in refcount_t
- Add explicit UBSAN section to MAINTAINERS
- Fix UBSAN self-test warnings
- Simplify UBSAN build via removal of CONFIG_UBSAN_SANITIZE_ALL
- Reintroduce UBSAN's signed overflow sanitizer"
* tag 'hardening-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (51 commits)
selftests/powerpc: Fix load_unaligned_zeropad build failure
string: Convert helpers selftest to KUnit
string: Convert selftest to KUnit
sh: Fix build with CONFIG_UBSAN=y
compiler.h: Explain how __is_constexpr() works
overflow: Allow non-type arg to type_max() and type_min()
VMCI: Fix possible memcpy() run-time warning in vmci_datagram_invoke_guest_handler()
lib/string_helpers: Add flags param to string_get_size()
x86, relocs: Ignore relocations in .notes section
objtool: Fix UNWIND_HINT_{SAVE,RESTORE} across basic blocks
overflow: Use POD in check_shl_overflow()
lib: stackinit: Adjust target string to 8 bytes for m68k
sparc: vdso: Disable UBSAN instrumentation
kernel.h: Move lib/cmdline.c prototypes to string.h
leaking_addresses: Provide mechanism to scan binary files
leaking_addresses: Ignore input device status lines
leaking_addresses: Use File::Temp for /tmp files
MAINTAINERS: Update LEAKING_ADDRESSES details
fortify: Improve buffer overflow reporting
fortify: Add KUnit tests for runtime overflows
...
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
For simplicity in splitting out UBSan options into separate rules,
remove CONFIG_UBSAN_SANITIZE_ALL, effectively defaulting to "y", which
is how it is generally used anyway. (There are no ":= y" cases beyond
where a specific file is enabled when a top-level ":= n" is in effect.)
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Marco Elver <elver@google.com>
Cc: linux-doc@vger.kernel.org
Cc: linux-kbuild@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
| |
arc, arm64, parisc and powerpc all have their own Kconfig symbols
in place of the common CONFIG_PAGE_SIZE_4KB symbols. Change these
so the common symbols are the ones that are actually used, while
leaving the arhcitecture specific ones as the user visible
place for configuring it, to avoid breaking user configs.
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> (powerpc32)
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Helge Deller <deller@gmx.de> # parisc
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"I think the main one is fixing the dynamic SCS patching when full LTO
is enabled (clang was silently getting this horribly wrong), but it's
all good stuff.
Rob just pointed out that the fix to the workaround for erratum
#2966298 might not be necessary, but in the worst case it's harmless
and since the official description leaves a little to be desired here,
I've left it in.
Summary:
- Fix shadow call stack patching with LTO=full
- Fix voluntary preemption of the FPSIMD registers from assembly code
- Fix workaround for A520 CPU erratum #2966298 and extend to A510
- Fix SME issues that resulted in corruption of the register state
- Minor fixes (missing includes, formatting)"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Fix silcon-errata.rst formatting
arm64/sme: Always exit sme_alloc() early with existing storage
arm64/fpsimd: Remove spurious check for SVE support
arm64/ptrace: Don't flush ZA/ZT storage when writing ZA via ptrace
arm64: entry: simplify kernel_exit logic
arm64: entry: fix ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD
arm64: errata: Add Cortex-A510 speculative unprivileged load workaround
arm64: Rename ARM64_WORKAROUND_2966298
arm64: fpsimd: Bring cond_yield asm macro in line with new rules
arm64: scs: Work around full LTO issue with dynamic SCS
arm64: irq: include <linux/cpumask.h>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Implement the workaround for ARM Cortex-A510 erratum 3117295. On an
affected Cortex-A510 core, a speculatively executed unprivileged load
might leak data from a privileged load via a cache side channel. The
issue only exists for loads within a translation regime with the same
translation (e.g. same ASID and VMID). Therefore, the issue only affects
the return to EL0.
The erratum and workaround are the same as ARM Cortex-A520 erratum
2966298, so reuse the existing workaround.
Cc: stable@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20240110-arm-errata-a510-v1-2-d02bc51aeeee@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In preparation to apply ARM64_WORKAROUND_2966298 for multiple errata,
rename the kconfig and capability. No functional change.
Cc: stable@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20240110-arm-errata-a510-v1-1-d02bc51aeeee@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here are the set of driver core and kernfs changes for 6.8-rc1.
Nothing major in here this release cycle, just lots of small cleanups
and some tweaks on kernfs that in the very end, got reverted and will
come back in a safer way next release cycle.
Included in here are:
- more driver core 'const' cleanups and fixes
- fw_devlink=rpm is now the default behavior
- kernfs tiny changes to remove some string functions
- cpu handling in the driver core is updated to work better on many
systems that add topologies and cpus after booting
- other minor changes and cleanups
All of the cpu handling patches have been acked by the respective
maintainers and are coming in here in one series. Everything has been
in linux-next for a while with no reported issues"
* tag 'driver-core-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (51 commits)
Revert "kernfs: convert kernfs_idr_lock to an irq safe raw spinlock"
kernfs: convert kernfs_idr_lock to an irq safe raw spinlock
class: fix use-after-free in class_register()
PM: clk: make pm_clk_add_notifier() take a const pointer
EDAC: constantify the struct bus_type usage
kernfs: fix reference to renamed function
driver core: device.h: fix Excess kernel-doc description warning
driver core: class: fix Excess kernel-doc description warning
driver core: mark remaining local bus_type variables as const
driver core: container: make container_subsys const
driver core: bus: constantify subsys_register() calls
driver core: bus: make bus_sort_breadthfirst() take a const pointer
kernfs: d_obtain_alias(NULL) will do the right thing...
driver core: Better advertise dev_err_probe()
kernfs: Convert kernfs_path_from_node_locked() from strlcpy() to strscpy()
kernfs: Convert kernfs_name_locked() from strlcpy() to strscpy()
kernfs: Convert kernfs_walk_ns() from strlcpy() to strscpy()
initramfs: Expose retained initrd as sysfs file
fs/kernfs/dir: obey S_ISGID
kernel/cgroup: use kernfs_create_dir_ns()
...
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
To allow ACPI's _STA value to hide CPUs that are present, but not
available to online right now due to VMM or firmware policy, the
register_cpu() call needs to be made by the ACPI machinery when ACPI
is in use. This allows it to hide CPUs that are unavailable from sysfs.
Switching to GENERIC_CPU_DEVICES is an intermediate step to allow all
five ACPI architectures to be modified at once.
Switch over to GENERIC_CPU_DEVICES, and provide an arch_register_cpu()
that populates the hotpluggable flag. arch_register_cpu() is also the
interface the ACPI machinery expects.
The struct cpu in struct cpuinfo_arm64 is never used directly, remove
it to use the one GENERIC_CPU_DEVICES provides.
This changes the CPUs visible in sysfs from possible to present, but
on arm64 smp_prepare_cpus() ensures these are the same.
This patch also has the effect of moving the registration of CPUs from
subsys to driver core initialisation, prior to any initcalls running.
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/E1r5R3b-00Csza-Ku@rmk-PC.armlinux.org.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"Many singleton patches against the MM code. The patch series which are
included in this merge do the following:
- Peng Zhang has done some mapletree maintainance work in the series
'maple_tree: add mt_free_one() and mt_attr() helpers'
'Some cleanups of maple tree'
- In the series 'mm: use memmap_on_memory semantics for dax/kmem'
Vishal Verma has altered the interworking between memory-hotplug
and dax/kmem so that newly added 'device memory' can more easily
have its memmap placed within that newly added memory.
- Matthew Wilcox continues folio-related work (including a few fixes)
in the patch series
'Add folio_zero_tail() and folio_fill_tail()'
'Make folio_start_writeback return void'
'Fix fault handler's handling of poisoned tail pages'
'Convert aops->error_remove_page to ->error_remove_folio'
'Finish two folio conversions'
'More swap folio conversions'
- Kefeng Wang has also contributed folio-related work in the series
'mm: cleanup and use more folio in page fault'
- Jim Cromie has improved the kmemleak reporting output in the series
'tweak kmemleak report format'.
- In the series 'stackdepot: allow evicting stack traces' Andrey
Konovalov to permits clients (in this case KASAN) to cause eviction
of no longer needed stack traces.
- Charan Teja Kalla has fixed some accounting issues in the page
allocator's atomic reserve calculations in the series 'mm:
page_alloc: fixes for high atomic reserve caluculations'.
- Dmitry Rokosov has added to the samples/ dorectory some sample code
for a userspace memcg event listener application. See the series
'samples: introduce cgroup events listeners'.
- Some mapletree maintanance work from Liam Howlett in the series
'maple_tree: iterator state changes'.
- Nhat Pham has improved zswap's approach to writeback in the series
'workload-specific and memory pressure-driven zswap writeback'.
- DAMON/DAMOS feature and maintenance work from SeongJae Park in the
series
'mm/damon: let users feed and tame/auto-tune DAMOS'
'selftests/damon: add Python-written DAMON functionality tests'
'mm/damon: misc updates for 6.8'
- Yosry Ahmed has improved memcg's stats flushing in the series 'mm:
memcg: subtree stats flushing and thresholds'.
- In the series 'Multi-size THP for anonymous memory' Ryan Roberts
has added a runtime opt-in feature to transparent hugepages which
improves performance by allocating larger chunks of memory during
anonymous page faults.
- Matthew Wilcox has also contributed some cleanup and maintenance
work against eh buffer_head code int he series 'More buffer_head
cleanups'.
- Suren Baghdasaryan has done work on Andrea Arcangeli's series
'userfaultfd move option'. UFFDIO_MOVE permits userspace heap
compaction algorithms to move userspace's pages around rather than
UFFDIO_COPY'a alloc/copy/free.
- Stefan Roesch has developed a 'KSM Advisor', in the series 'mm/ksm:
Add ksm advisor'. This is a governor which tunes KSM's scanning
aggressiveness in response to userspace's current needs.
- Chengming Zhou has optimized zswap's temporary working memory use
in the series 'mm/zswap: dstmem reuse optimizations and cleanups'.
- Matthew Wilcox has performed some maintenance work on the writeback
code, both code and within filesystems. The series is 'Clean up the
writeback paths'.
- Andrey Konovalov has optimized KASAN's handling of alloc and free
stack traces for secondary-level allocators, in the series 'kasan:
save mempool stack traces'.
- Andrey also performed some KASAN maintenance work in the series
'kasan: assorted clean-ups'.
- David Hildenbrand has gone to town on the rmap code. Cleanups, more
pte batching, folio conversions and more. See the series 'mm/rmap:
interface overhaul'.
- Kinsey Ho has contributed some maintenance work on the MGLRU code
in the series 'mm/mglru: Kconfig cleanup'.
- Matthew Wilcox has contributed lruvec page accounting code cleanups
in the series 'Remove some lruvec page accounting functions'"
* tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (361 commits)
mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER
mm, treewide: introduce NR_PAGE_ORDERS
selftests/mm: add separate UFFDIO_MOVE test for PMD splitting
selftests/mm: skip test if application doesn't has root privileges
selftests/mm: conform test to TAP format output
selftests: mm: hugepage-mmap: conform to TAP format output
selftests/mm: gup_test: conform test to TAP format output
mm/selftests: hugepage-mremap: conform test to TAP format output
mm/vmstat: move pgdemote_* out of CONFIG_NUMA_BALANCING
mm: zsmalloc: return -ENOSPC rather than -EINVAL in zs_malloc while size is too large
mm/memcontrol: remove __mod_lruvec_page_state()
mm/khugepaged: use a folio more in collapse_file()
slub: use a folio in __kmalloc_large_node
slub: use folio APIs in free_large_kmalloc()
slub: use alloc_pages_node() in alloc_slab_page()
mm: remove inc/dec lruvec page state functions
mm: ratelimit stat flush from workingset shrinker
kasan: stop leaking stack trace handles
mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE
mm/mglru: add dummy pmd_dirty()
...
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") has
changed the definition of MAX_ORDER to be inclusive. This has caused
issues with code that was not yet upstream and depended on the previous
definition.
To draw attention to the altered meaning of the define, rename MAX_ORDER
to MAX_PAGE_ORDER.
Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
| |/ /
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Patch series "mm/mglru: Kconfig cleanup", v4.
This series is the result of the following discussion:
https://lore.kernel.org/47066176-bd93-55dd-c2fa-002299d9e034@linux.ibm.com/
It mainly avoids building the code that walks page tables on CPUs that
use it, i.e., those don't support hardware accessed bit. Specifically,
it introduces a new Kconfig to guard some of functions added by
commit bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
on CPUs like POWER9, on which the series was tested.
This patch (of 5):
Some architectures are able to set the accessed bit in PTEs when PTEs
are used as part of linear address translations.
Add CONFIG_ARCH_HAS_HW_PTE_YOUNG for such architectures to be able to
override arch_has_hw_pte_young().
Link: https://lkml.kernel.org/r/20231227141205.2200125-1-kinseyho@google.com
Link: https://lkml.kernel.org/r/20231227141205.2200125-2-kinseyho@google.com
Signed-off-by: Kinsey Ho <kinseyho@google.com>
Co-developed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Tested-by: Donet Tom <donettom@linux.vnet.ibm.com>
Acked-by: Yu Zhao <yuzhao@google.com>
Cc: kernel test robot <lkp@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|\ \ \
| |_|/
|/| |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab updates from Vlastimil Babka:
- SLUB: delayed freezing of CPU partial slabs (Chengming Zhou)
Freezing is an operation involving double_cmpxchg() that makes a slab
exclusive for a particular CPU. Chengming noticed that we use it also
in situations where we are not yet installing the slab as the CPU
slab, because freezing also indicates that the slab is not on the
shared list. This results in redundant freeze/unfreeze operation and
can be avoided by marking separately the shared list presence by
reusing the PG_workingset flag.
This approach neatly avoids the issues described in 9b1ea29bc0d7
("Revert "mm, slub: consider rest of partial list if acquire_slab()
fails"") as we can now grab a slab from the shared list in a quick
and guaranteed way without the cmpxchg_double() operation that
amplifies the lock contention and can fail.
As a result, lkp has reported 34.2% improvement of
stress-ng.rawudp.ops_per_sec
- SLAB removal and SLUB cleanups (Vlastimil Babka)
The SLAB allocator has been deprecated since 6.5 and nobody has
objected so far. We agreed at LSF/MM to wait until the next LTS,
which is 6.6, so we should be good to go now.
This doesn't yet erase all traces of SLAB outside of mm/ so some dead
code, comments or documentation remain, and will be cleaned up
gradually (some series are already in the works).
Removing the choice of allocators has already allowed to simplify and
optimize the code wiring up the kmalloc APIs to the SLUB
implementation.
* tag 'slab-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (34 commits)
mm/slub: free KFENCE objects in slab_free_hook()
mm/slub: handle bulk and single object freeing separately
mm/slub: introduce __kmem_cache_free_bulk() without free hooks
mm/slub: fix bulk alloc and free stats
mm/slub: optimize free fast path code layout
mm/slub: optimize alloc fastpath code layout
mm/slub: remove slab_alloc() and __kmem_cache_alloc_lru() wrappers
mm/slab: move kmalloc() functions from slab_common.c to slub.c
mm/slab: move kmalloc_slab() to mm/slab.h
mm/slab: move kfree() from slab_common.c to slub.c
mm/slab: move struct kmem_cache_node from slab.h to slub.c
mm/slab: move memcg related functions from slab.h to slub.c
mm/slab: move pre/post-alloc hooks from slab.h to slub.c
mm/slab: consolidate includes in the internal mm/slab.h
mm/slab: move the rest of slub_def.h to mm/slab.h
mm/slab: move struct kmem_cache_cpu declaration to slub.c
mm/slab: remove mm/slab.c and slab_def.h
mm/mempool/dmapool: remove CONFIG_DEBUG_SLAB ifdefs
mm/slab: remove CONFIG_SLAB code from slab common code
cpu/hotplug: remove CPUHP_SLAB_PREPARE hooks
...
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Remove CONFIG_SLAB, CONFIG_DEBUG_SLAB, CONFIG_SLAB_DEPRECATED and
everything in Kconfig files and mm/Makefile that depends on those. Since
SLUB is the only remaining allocator, remove the allocator choice, make
CONFIG_SLUB a "def_bool y" for now and remove all explicit dependencies
on SLUB or SLAB as it's now always enabled. Make every option's verbose
name and description refer to "the slab allocator" without refering to
the specific implementation. Do not rename the CONFIG_ option names yet.
Everything under #ifdef CONFIG_SLAB, and mm/slab.c is now dead code, all
code under #ifdef CONFIG_SLUB is now always compiled.
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Tested-by: David Rientjes <rientjes@google.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
|