linux - linux

	Commit message (Collapse)	Author	Age	Files	Lines
*	Merge tag 'x86-mm-2023-06-27' of ↵	Linus Torvalds	2023-06-27	2	-17/+4
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 mm updates from Ingo Molnar: - Remove Xen-PV leftovers from init_32.c - Fix __swp_entry_to_pte() warning splat for Xen PV guests, triggered on CONFIG_DEBUG_VM_PGTABLE=y * tag 'x86-mm-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Remove Xen-PV leftovers from init_32.c x86/mm: Fix __swp_entry_to_pte() for Xen PV guests
\| *	x86/mm: Remove Xen-PV leftovers from init_32.c	Juergen Gross	2023-06-09	1	-15/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are still some unneeded paravirt calls in arch/x86/mm/init_32.c. Remove them. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230609055100.12633-1-jgross@suse.com
\| *	x86/mm: Fix __swp_entry_to_pte() for Xen PV guests	Juergen Gross	2023-05-08	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Normally __swp_entry_to_pte() is never called with a value translating to a valid PTE. The only known exception is pte_swap_tests(), resulting in a WARN splat in Xen PV guests, as __pte_to_swp_entry() did translate the PFN of the valid PTE to a guest local PFN, while __swp_entry_to_pte() doesn't do the opposite translation. Fix that by using __pte() in __swp_entry_to_pte() instead of open coding the native variant of it. For correctness do the similar conversion for __swp_entry_to_pmd(). Fixes: 05289402d717 ("mm/debug_vm_pgtable: add tests validating arch helpers for core MM features") Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230306123259.12461-1-jgross@suse.com
* \|	Merge tag 'perf-core-2023-06-27' of ↵	Linus Torvalds	2023-06-27	10	-73/+174
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf events updates from Ingo Molnar: - Rework & fix the event forwarding logic by extending the core interface. This fixes AMD PMU events that have to be forwarded from the core PMU to the IBS PMU. - Add self-tests to test AMD IBS invocation via core PMU events - Clean up Intel FixCntrCtl MSR encoding & handling * tag 'perf-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Re-instate the linear PMU search perf/x86/intel: Define bit macros for FixCntrCtl MSR perf test: Add selftest to test IBS invocation via core pmu events perf/core: Remove pmu linear searching code perf/ibs: Fix interface via core pmu events perf/core: Rework forwarding of {task\|cpu}-clock events
\| * \|	perf: Re-instate the linear PMU search	Peter Zijlstra	2023-06-06	1	-13/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Full revert of commit 9551fbb64d09 ("perf/core: Remove pmu linear searching code"). Some architectures (notably arm/arm64) still relied on the linear search in order to find the PMU that consumes PERF_TYPE_{HARDWARE,HW_CACHE,RAW}. This will need a more thorought audit and cleanup. Reported-by: Nathan Chancellor <nathan@kernel.org> Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20230605101401.GL38236@hirez.programming.kicks-ass.net
\| * \|	perf/x86/intel: Define bit macros for FixCntrCtl MSR	Dapeng Mi	2023-05-08	2	-9/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Define bit macros for FixCntrCtl MSR and replace the bit hardcoding with these bit macros. This would make code be more human-readable. Perf commands 'perf stat -e "instructions,cycles,ref-cycles"' and 'perf record -e "instructions,cycles,ref-cycles"' pass. Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20230504072128.3653470-1-dapeng1.mi@linux.intel.com
\| * \|	perf test: Add selftest to test IBS invocation via core pmu events	Ravi Bangoria	2023-05-08	4	-0/+75
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	IBS pmu can be invoked via fixed set of core pmu events with 'precise_ip' set to 1. Add a simple event open test for all these events. Without kernel fix: $ sudo ./perf test -vv 76 76: AMD IBS via core pmu : --- start --- test child forked, pid 6553 Using CPUID AuthenticAMD-25-1-1 type: 0x0, config: 0x0, fd: 3 - Pass type: 0x0, config: 0x1, fd: -1 - Pass type: 0x4, config: 0x76, fd: -1 - Fail type: 0x4, config: 0xc1, fd: -1 - Fail type: 0x4, config: 0x12, fd: -1 - Pass test child finished with -1 ---- end ---- AMD IBS via core pmu: FAILED! With kernel fix: $ sudo ./perf test -vv 76 76: AMD IBS via core pmu : --- start --- test child forked, pid 7526 Using CPUID AuthenticAMD-25-1-1 type: 0x0, config: 0x0, fd: 3 - Pass type: 0x0, config: 0x1, fd: -1 - Pass type: 0x4, config: 0x76, fd: 3 - Pass type: 0x4, config: 0xc1, fd: 3 - Pass type: 0x4, config: 0x12, fd: -1 - Pass test child finished with 0 ---- end ---- AMD IBS via core pmu: Ok Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20230504110003.2548-5-ravi.bangoria@amd.com
\| * \|	perf/core: Remove pmu linear searching code	Ravi Bangoria	2023-05-08	1	-24/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Searching for the right pmu by iterating over all pmus is no longer required since all pmus now must be present in the 'pmu_idr' list. So, remove linear searching code. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20230504110003.2548-4-ravi.bangoria@amd.com
\| * \|	perf/ibs: Fix interface via core pmu events	Ravi Bangoria	2023-05-08	3	-28/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Although, IBS pmus can be invoked via their own interface, indirect IBS invocation via core pmu events is also supported with fixed set of events: cpu-cycles:p, r076:p (same as cpu-cycles:p) and r0C1:p (micro-ops) for user convenience. This indirect IBS invocation is broken since commit 66d258c5b048 ("perf/core: Optimize perf_init_event()"), which added RAW pmu under 'pmu_idr' list and thus if event_init() fails with RAW pmu, it started returning error instead of trying other pmus. Forward precise events from core pmu to IBS by overwriting 'type' and 'config' in the kernel copy of perf_event_attr. Overwriting will cause perf_init_event() to retry with updated 'type' and 'config', which will automatically forward event to IBS pmu. Without patch: $ sudo ./perf record -C 0 -e r076:p -- sleep 1 Error: The r076:p event is not supported. With patch: $ sudo ./perf record -C 0 -e r076:p -- sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.341 MB perf.data (37 samples) ] Fixes: 66d258c5b048 ("perf/core: Optimize perf_init_event()") Reported-by: Stephane Eranian <eranian@google.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20230504110003.2548-3-ravi.bangoria@amd.com
\| * \|	perf/core: Rework forwarding of {task\|cpu}-clock events	Ravi Bangoria	2023-05-08	2	-36/+51
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, PERF_TYPE_SOFTWARE is treated specially since task-clock and cpu-clock events are interfaced through it but internally gets forwarded to their own pmus. Rework this by overwriting event->attr.type in perf_swevent_init() which will cause perf_init_event() to retry with updated type and event will automatically get forwarded to right pmu. With the change, SW pmu no longer needs to be treated specially and can be included in 'pmu_idr' list. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20230504110003.2548-2-ravi.bangoria@amd.com
* \|	Merge tag 'locking-core-2023-06-27' of ↵	Linus Torvalds	2023-06-27	136	-4156/+9917
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: - Introduce cmpxchg128() -- aka. the demise of cmpxchg_double() The cmpxchg128() family of functions is basically & functionally the same as cmpxchg_double(), but with a saner interface. Instead of a 6-parameter horror that forced u128 - u64/u64-halves layout details on the interface and exposed users to complexity, fragility & bugs, use a natural 3-parameter interface with u128 types. - Restructure the generated atomic headers, and add kerneldoc comments for all of the generic atomic{,64,_long}_t operations. The generated definitions are much cleaner now, and come with documentation. - Implement lock_set_cmp_fn() on lockdep, for defining an ordering when taking multiple locks of the same type. This gets rid of one use of lockdep_set_novalidate_class() in the bcache code. - Fix raw_cpu_generic_try_cmpxchg() bug due to an unintended variable shadowing generating garbage code on Clang on certain ARM builds. * tag 'locking-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits) locking/atomic: scripts: fix ${atomic}_dec_if_positive() kerneldoc percpu: Fix self-assignment of __old in raw_cpu_generic_try_cmpxchg() locking/atomic: treewide: delete arch_atomic_() kerneldoc locking/atomic: docs: Add atomic operations to the driver basic API documentation locking/atomic: scripts: generate kerneldoc comments docs: scripts: kernel-doc: accept bitwise negation like ~@var locking/atomic: scripts: simplify raw_atomic() definitions locking/atomic: scripts: simplify raw_atomic_long() definitions locking/atomic: scripts: split pfx/name/sfx/order locking/atomic: scripts: restructure fallback ifdeffery locking/atomic: scripts: build raw_atomic_long() directly locking/atomic: treewide: use raw_atomic_<op>() locking/atomic: scripts: add trivial raw_atomic_<op>() locking/atomic: scripts: factor out order template generation locking/atomic: scripts: remove leftover "${mult}" locking/atomic: scripts: remove bogus order parameter locking/atomic: xtensa: add preprocessor symbols locking/atomic: x86: add preprocessor symbols locking/atomic: sparc: add preprocessor symbols locking/atomic: sh: add preprocessor symbols ...
\| * \|	locking/atomic: scripts: fix ${atomic}_dec_if_positive() kerneldoc	Mark Rutland	2023-06-16	4	-10/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ${atomic}_dec_if_positive() ops are unlike all the other conditional atomic ops. Rather than returning a boolean success value, these return the value that the atomic variable would be updated to, even when no update is performed. We missed this when adding kerneldoc comments, and the documentation for ${atomic}_dec_if_positive() erroneously states: \| Return: @true if @v was updated, @false otherwise. Ideally we'd clean this up by aligning ${atomic}_dec_if_positive() with the usual atomic op conventions: with ${atomic}_fetch_dec_if_positive() for those who care about the value of the varaible, and ${atomic}_dec_if_positive() returning a boolean success value. In the mean time, align the documentation with the current reality. Fixes: ad8110706f381170 ("locking/atomic: scripts: generate kerneldoc comments") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20230615132734.1119765-1-mark.rutland@arm.com
\| * \|	percpu: Fix self-assignment of __old in raw_cpu_generic_try_cmpxchg()	Nathan Chancellor	2023-06-08	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After commit c5c0ba953b8c ("percpu: Add {raw,this}_cpu_try_cmpxchg()"), clang built ARCH=arm and ARCH=arm64 kernels with CONFIG_INIT_STACK_NONE started panicking on boot in alloc_vmap_area(): [ 0.000000] kernel BUG at mm/vmalloc.c:1638! [ 0.000000] Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.0-rc2-ARCH+ #1 [ 0.000000] Hardware name: linux,dummy-virt (DT) [ 0.000000] pstate: 200000c9 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 0.000000] pc : alloc_vmap_area+0x7ec/0x7f8 [ 0.000000] lr : alloc_vmap_area+0x7e8/0x7f8 Compiling mm/vmalloc.c with W=2 reveals an instance of -Wshadow, which helps uncover that through macro expansion, '__old = (ovalp)' in raw_cpu_generic_try_cmpxchg() can become '__old = (&__old)' through raw_cpu_generic_cmpxchg(), which results in garbage being assigned to the inner __old and the cmpxchg not working properly. Add an extra underscore to __old in raw_cpu_generic_try_cmpxchg() so that there is no more self-assignment, which resolves the panics. Closes: https://github.com/ClangBuiltLinux/linux/issues/1868 Fixes: c5c0ba953b8c ("percpu: Add {raw,this}_cpu_try_cmpxchg()") Debugged-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20230607-fix-shadowing-in-raw_cpu_generic_try_cmpxchg-v1-1-8f0a3d930d43@kernel.org
\| * \|	locking/atomic: treewide: delete arch_atomic_*() kerneldoc	Mark Rutland	2023-06-05	7	-351/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently several architectures have kerneldoc comments for arch_atomic_(), which is unhelpful as these live in a shared namespace where they clash, and the arch_atomic_() ops are now an implementation detail of the raw_atomic_() ops, which no-one should use those directly. Delete the kerneldoc comments for arch_atomic_(), along with pseudo-kerneldoc comments which are in the correct style but are missing the leading '/**' necessary to be true kerneldoc comments. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-28-mark.rutland@arm.com
\| * \|	locking/atomic: docs: Add atomic operations to the driver basic API ↵	Paul E. McKenney	2023-06-05	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	documentation Add the generated atomic headers to driver-api/basics.rst in order to provide documentation for the Linux kernel's atomic operations. At the same time, dtop the x86 atomic header, which provides kerneldoc comments for some arch_atomic_() operations. The arch_atomic_() operations are now purely an implenentation detail of the raw_atomic_() ops, and outside of implementing the atomics, code should use the raw_atomic_() forms. [Mark: add atomic-{instrumented,long}.h, update commit message] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-27-mark.rutland@arm.com
\| * \|	locking/atomic: scripts: generate kerneldoc comments	Mark Rutland	2023-06-05	29	-7/+5940
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the atomics are documented in Documentation/atomic_t.txt, and have no kerneldoc comments. There are a sufficient number of gotchas (e.g. semantics, noinstr-safety) that it would be nice to have comments to call these out, and it would be nice to have kerneldoc comments such that these can be collated. While it's possible to derive the semantics from the code, this can be painful given the amount of indirection we currently have (e.g. fallback paths), and it's easy to be mislead by naming, e.g. * The unconditional void-returning ops only have relaxed variants without a _relaxed suffix, and can easily be mistaken for being fully ordered. It would be nice to give these a _relaxed() suffix, but this would result in significant churn throughout the kernel. * Our naming of conditional and unconditional+test ops is rather inconsistent, and it can be difficult to derive the name of an operation, or to identify where an op is conditional or unconditional+test. Some ops are clearly conditional: - dec_if_positive - add_unless - dec_unless_positive - inc_unless_negative Some ops are clearly unconditional+test: - sub_and_test - dec_and_test - inc_and_test However, what exactly those test is not obvious. A _test_zero suffix might be clearer. Others could be read ambiguously: - inc_not_zero // conditional - add_negative // unconditional+test It would probably be worth renaming these, e.g. to inc_unless_zero and add_test_negative. As a step towards making this more consistent and easier to understand, this patch adds kerneldoc comments for all generated atomic_() functions. These are generated from templates, with some common text shared, making it easy to extend these in future if necessary. I've tried to make these as consistent and clear as possible, and I've deliberately ensured: All ops have their ordering explicitly mentioned in the short and long description. * All test ops have "test" in their short description. * All ops are described as an expression using their usual C operator. For example: andnot: "Atomically updates @v to (@v & ~@i)" inc: "Atomically updates @v to (@v + 1)" Which may be clearer to non-naative English speakers, and allows all the operations to be described in the same style. * All conditional ops have their condition described as an expression using the usual C operators. For example: add_unless: "If (@v != @u), atomically updates @v to (@v + @i)" cmpxchg: "If (@v == @old), atomically updates @v to @new" Which may be clearer to non-naative English speakers, and allows all the operations to be described in the same style. * All bitwise ops (and,andnot,or,xor) explicitly mention that they are bitwise in their short description, so that they are not mistaken for performing their logical equivalents. * The noinstr safety of each op is explicitly described, with a description of whether or not to use the raw_ form of the op. There should be no functional change as a result of this patch. Reported-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-26-mark.rutland@arm.com
\| * \|	docs: scripts: kernel-doc: accept bitwise negation like ~@var	Mark Rutland	2023-06-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In some cases we'd like to indicate the bitwise negation of a parameter, e.g. ~@var This will be helpful for describing the atomic andnot operations, where we'd like to write comments of the form: Atomically updates @v to (@v & ~@i) Which kernel-doc currently transforms to: Atomically updates v to (v & ~i) Rather than the preferable form: Atomically updates v to (v & ~i) This is similar to what we did for '!@var' in commit: ee2aa7590398 ("scripts: kernel-doc: accept negation like !@var") This patch follows the same pattern that commit used to permit a '!' prefix on a param ref, allowing a '~' prefix on a param ref, cuasing kernel-doc to generate the preferred form above. Suggested-by: Akira Yokosawa <akiyks@gmail.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20230605070124.3741859-25-mark.rutland@arm.com
\| * \|	locking/atomic: scripts: simplify raw_atomic*() definitions	Mark Rutland	2023-06-05	26	-1077/+901
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently each ordering variant has several potential definitions, with a mixture of preprocessor and C definitions, including several copies of its C prototype, e.g. \| #if defined(arch_atomic_fetch_andnot_acquire) \| #define raw_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot_acquire \| #elif defined(arch_atomic_fetch_andnot_relaxed) \| static __always_inline int \| raw_atomic_fetch_andnot_acquire(int i, atomic_t v) \| { \| int ret = arch_atomic_fetch_andnot_relaxed(i, v); \| __atomic_acquire_fence(); \| return ret; \| } \| #elif defined(arch_atomic_fetch_andnot) \| #define raw_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot \| #else \| static __always_inline int \| raw_atomic_fetch_andnot_acquire(int i, atomic_t v) \| { \| return raw_atomic_fetch_and_acquire(~i, v); \| } \| #endif Make this a bit simpler by defining the C prototype once, and writing the various potential definitions as plain C code guarded by ifdeffery. For example, the above becomes: \| static __always_inline int \| raw_atomic_fetch_andnot_acquire(int i, atomic_t v) \| { \| #if defined(arch_atomic_fetch_andnot_acquire) \| return arch_atomic_fetch_andnot_acquire(i, v); \| #elif defined(arch_atomic_fetch_andnot_relaxed) \| int ret = arch_atomic_fetch_andnot_relaxed(i, v); \| __atomic_acquire_fence(); \| return ret; \| #elif defined(arch_atomic_fetch_andnot) \| return arch_atomic_fetch_andnot(i, v); \| #else \| return raw_atomic_fetch_and_acquire(~i, v); \| #endif \| } Which is far easier to read. As we now always have a single copy of the C prototype wrapping all the potential definitions, we now have an obvious single location for kerneldoc comments. At the same time, the fallbacks for raw_atomic_xhcg() are made to use 'new' rather than 'i' as the name of the new value. This is what the existing fallback template used, and is more consistent with the raw_atomic{_try,}cmpxchg() fallbacks. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-24-mark.rutland@arm.com
\| * \|	locking/atomic: scripts: simplify raw_atomic_long*() definitions	Mark Rutland	2023-06-05	2	-533/+349
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, atomic-long is split into two sections, one defining the raw_atomic_long_() ops for CONFIG_64BIT, and one defining the raw atomic_long_() ops for !CONFIG_64BIT. With many lines elided, this looks like: \| #ifdef CONFIG_64BIT \| ... \| static __always_inline bool \| raw_atomic_long_try_cmpxchg(atomic_long_t v, long old, long new) \| { \| return raw_atomic64_try_cmpxchg(v, (s64 )old, new); \| } \| ... \| #else / CONFIG_64BIT / \| ... \| static __always_inline bool \| raw_atomic_long_try_cmpxchg(atomic_long_t v, long old, long new) \| { \| return raw_atomic_try_cmpxchg(v, (int )old, new); \| } \| ... \| #endif The two definitions are spread far apart in the file, and duplicate the prototype, making it hard to have a legible set of kerneldoc comments. Make this simpler by defining the C prototype once, and writing the two definitions inline. For example, the above becomes: \| static __always_inline bool \| raw_atomic_long_try_cmpxchg(atomic_long_t v, long old, long new) \| { \| #ifdef CONFIG_64BIT \| return raw_atomic64_try_cmpxchg(v, (s64 )old, new); \| #else \| return raw_atomic_try_cmpxchg(v, (int )old, new); \| #endif \| } As we now always have a single copy of the C prototype wrapping all the potential definitions, we now have an obvious single location for kerneldoc comments. As a bonus, both the script and the generated file are somewhat shorter. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-23-mark.rutland@arm.com
\| * \|	locking/atomic: scripts: split pfx/name/sfx/order	Mark Rutland	2023-06-05	1	-3/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently gen-atomic-long.sh's gen_proto_order_variant() function combines the pfx/name/sfx/order variables immediately, unlike other functions in gen-atomic-.sh. This is fine today, but subsequent patches will require the individual individual pfx/name/sfx/order variables within gen-atomic-long.sh's gen_proto_order_variant() function. In preparation for this, split the variables in the style of other gen-atomic-.sh scripts. This results in no change to the generated headers, so there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-22-mark.rutland@arm.com
\| * \|	locking/atomic: scripts: restructure fallback ifdeffery	Mark Rutland	2023-06-05	27	-2845/+1860
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the various ordering variants of an atomic operation are defined in groups of full/acquire/release/relaxed ordering variants with some shared ifdeffery and several potential definitions of each ordering variant in different branches of the shared ifdeffery. As an ordering variant can have several potential definitions down different branches of the shared ifdeffery, it can be painful for a human to find a relevant definition, and we don't have a good location to place anything common to all definitions of an ordering variant (e.g. kerneldoc). Historically the grouping of full/acquire/release/relaxed ordering variants was necessary as we filled in the missing atomics in the same namespace as the architecture used. It would be easy to accidentally define one ordering fallback in terms of another ordering fallback with redundant barriers, and avoiding that would otherwise require a lot of baroque ifdeffery. With recent changes we no longer need to fill in the missing atomics in the arch_atomic_<op>() namespace, and only need to fill in the raw_atomic_<op>() namespace. Due to this, there's no risk of a namespace collision, and we can define each raw_atomic_<op> ordering variant with its own ifdeffery checking for the arch_atomic_<op> ordering variants. Restructure the fallbacks in this way, with each ordering variant having its own ifdeffery of the form: \| #if defined(arch_atomic_fetch_andnot_acquire) \| #define raw_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot_acquire \| #elif defined(arch_atomic_fetch_andnot_relaxed) \| static __always_inline int \| raw_atomic_fetch_andnot_acquire(int i, atomic_t v) \| { \| int ret = arch_atomic_fetch_andnot_relaxed(i, v); \| __atomic_acquire_fence(); \| return ret; \| } \| #elif defined(arch_atomic_fetch_andnot) \| #define raw_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot \| #else \| static __always_inline int \| raw_atomic_fetch_andnot_acquire(int i, atomic_t v) \| { \| return raw_atomic_fetch_and_acquire(~i, v); \| } \| #endif Note that where there's no relevant arch_atomic_<op>() ordering variant, we'll define the operation in terms of a distinct raw_atomic_<otherop>(), as this itself might have been filled in with a fallback. As we now generate the raw_atomic*_<op>() implementations directly, we no longer need the trivial wrappers, so they are removed. This makes the ifdeffery easier to follow, and will allow for further improvements in subsequent patches. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-21-mark.rutland@arm.com
\| * \|	locking/atomic: scripts: build raw_atomic_long*() directly	Mark Rutland	2023-06-05	5	-859/+345
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that arch_atomic() usage is limited to the atomic headers, we no longer have any users of arch_atomic_long_(), and can generate raw_atomic_long_() directly. Generate the raw_atomic_long_() ops directly. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-20-mark.rutland@arm.com
\| * \|	locking/atomic: treewide: use raw_atomic*_<op>()	Mark Rutland	2023-06-05	14	-42/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we have raw_atomic_<op>() definitions, there's no need to use arch_atomic_<op>() definitions outside of the low-level atomic definitions. Move treewide users of arch_atomic_<op>() over to the equivalent raw_atomic_<op>(). There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-19-mark.rutland@arm.com
\| * \|	locking/atomic: scripts: add trivial raw_atomic*_<op>()	Mark Rutland	2023-06-05	6	-312/+2033
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently a number of arch_atomic_<op>() functions are optional, and where an arch does not provide a given arch_atomic_<op>() we will define an implementation of arch_atomic*_<op>() in atomic-arch-fallback.h. Filling in the missing ops requires special care as we want to select the optimal definition of each op (e.g. preferentially defining ops in terms of their relaxed form rather than their fully-ordered form). The ifdeffery necessary for this requires us to group ordering variants together, which can be a bit painful to read, and is painful for kerneldoc generation. It would be easier to handle this if we generated ops into a separate namespace, as this would remove the need to take special care with the ifdeffery, and allow each ordering variant to be generated separately. This patch adds a new set of raw_atomic_<op>() definitions, which are currently trivial wrappers of their arch_atomic_<op>() equivalent. This will allow us to move treewide users of arch_atomic_<op>() over to raw atomic op before we rework the fallback generation to generate raw_atomic_<op> directly. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-18-mark.rutland@arm.com
\| * \|	locking/atomic: scripts: factor out order template generation	Mark Rutland	2023-06-05	1	-17/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently gen_proto_order_variants() hard codes the path for the templates used for order fallbacks. Factor this out into a helper so that it can be reused elsewhere. This results in no change to the generated headers, so there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-17-mark.rutland@arm.com
\| * \|	locking/atomic: scripts: remove leftover "${mult}"	Mark Rutland	2023-06-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We removed cmpxchg_double() and variants in commit: b4cf83b2d1da40b2 ("arch: Remove cmpxchg_double") Which removed the need for "${mult}" in the instrumentation logic. Unfortunately we missed an instance of "${mult}". There is no change to the generated header. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-16-mark.rutland@arm.com
\| * \|	locking/atomic: scripts: remove bogus order parameter	Mark Rutland	2023-06-05	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	At the start of gen_proto_order_variants(), the ${order} variable is not yet defined, and will be substituted with an empty string. Replace the current bogus use of ${order} with an empty string instead. This results in no change to the generated headers. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-15-mark.rutland@arm.com
\| * \|	locking/atomic: xtensa: add preprocessor symbols	Mark Rutland	2023-06-05	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some atomics can be implemented in several different ways, e.g. FULL/ACQUIRE/RELEASE ordered atomics can be implemented in terms of RELAXED atomics, and ACQUIRE/RELEASE/RELAXED can be implemented in terms of FULL ordered atomics. Other atomics are optional, and don't exist in some configurations (e.g. not all architectures implement the 128-bit cmpxchg ops). Subsequent patches will require that architectures define a preprocessor symbol for any atomic (or ordering variant) which is optional. This will make the fallback ifdeffery more robust, and simplify future changes. Add the required definitions to arch/xtensa. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-14-mark.rutland@arm.com
\| * \|	locking/atomic: x86: add preprocessor symbols	Mark Rutland	2023-06-05	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some atomics can be implemented in several different ways, e.g. FULL/ACQUIRE/RELEASE ordered atomics can be implemented in terms of RELAXED atomics, and ACQUIRE/RELEASE/RELAXED can be implemented in terms of FULL ordered atomics. Other atomics are optional, and don't exist in some configurations (e.g. not all architectures implement the 128-bit cmpxchg ops). Subsequent patches will require that architectures define a preprocessor symbol for any atomic (or ordering variant) which is optional. This will make the fallback ifdeffery more robust, and simplify future changes. Add the required definitions to arch/x86. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-13-mark.rutland@arm.com
\| * \|	locking/atomic: sparc: add preprocessor symbols	Mark Rutland	2023-06-05	2	-2/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some atomics can be implemented in several different ways, e.g. FULL/ACQUIRE/RELEASE ordered atomics can be implemented in terms of RELAXED atomics, and ACQUIRE/RELEASE/RELAXED can be implemented in terms of FULL ordered atomics. Other atomics are optional, and don't exist in some configurations (e.g. not all architectures implement the 128-bit cmpxchg ops). Subsequent patches will require that architectures define a preprocessor symbol for any atomic (or ordering variant) which is optional. This will make the fallback ifdeffery more robust, and simplify future changes. Add the required definitions to arch/sparc. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-12-mark.rutland@arm.com
\| * \|	locking/atomic: sh: add preprocessor symbols	Mark Rutland	2023-06-05	3	-0/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some atomics can be implemented in several different ways, e.g. FULL/ACQUIRE/RELEASE ordered atomics can be implemented in terms of RELAXED atomics, and ACQUIRE/RELEASE/RELAXED can be implemented in terms of FULL ordered atomics. Other atomics are optional, and don't exist in some configurations (e.g. not all architectures implement the 128-bit cmpxchg ops). Subsequent patches will require that architectures define a preprocessor symbol for any atomic (or ordering variant) which is optional. This will make the fallback ifdeffery more robust, and simplify future changes. Add the required definitions to arch/sh. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-11-mark.rutland@arm.com
\| * \|	locking/atomic: parisc: add preprocessor symbols	Mark Rutland	2023-06-05	1	-0/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some atomics can be implemented in several different ways, e.g. FULL/ACQUIRE/RELEASE ordered atomics can be implemented in terms of RELAXED atomics, and ACQUIRE/RELEASE/RELAXED can be implemented in terms of FULL ordered atomics. Other atomics are optional, and don't exist in some configurations (e.g. not all architectures implement the 128-bit cmpxchg ops). Subsequent patches will require that architectures define a preprocessor symbol for any atomic (or ordering variant) which is optional. This will make the fallback ifdeffery more robust, and simplify future changes. Add the required definitions to arch/parisc. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-10-mark.rutland@arm.com
\| * \|	locking/atomic: m68k: add preprocessor symbols	Mark Rutland	2023-06-05	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some atomics can be implemented in several different ways, e.g. FULL/ACQUIRE/RELEASE ordered atomics can be implemented in terms of RELAXED atomics, and ACQUIRE/RELEASE/RELAXED can be implemented in terms of FULL ordered atomics. Other atomics are optional, and don't exist in some configurations (e.g. not all architectures implement the 128-bit cmpxchg ops). Subsequent patches will require that architectures define a preprocessor symbol for any atomic (or ordering variant) which is optional. This will make the fallback ifdeffery more robust, and simplify future changes. Add the required definitions to arch/m68k. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-9-mark.rutland@arm.com
\| * \|	locking/atomic: hexagon: add preprocessor symbols	Mark Rutland	2023-06-05	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some atomics can be implemented in several different ways, e.g. FULL/ACQUIRE/RELEASE ordered atomics can be implemented in terms of RELAXED atomics, and ACQUIRE/RELEASE/RELAXED can be implemented in terms of FULL ordered atomics. Other atomics are optional, and don't exist in some configurations (e.g. not all architectures implement the 128-bit cmpxchg ops). Subsequent patches will require that architectures define a preprocessor symbol for any atomic (or ordering variant) which is optional. This will make the fallback ifdeffery more robust, and simplify future changes. Add the required definitions to arch/hexagon. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-8-mark.rutland@arm.com
\| * \|	locking/atomic: arm: add preprocessor symbols	Mark Rutland	2023-06-05	1	-2/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some atomics can be implemented in several different ways, e.g. FULL/ACQUIRE/RELEASE ordered atomics can be implemented in terms of RELAXED atomics, and ACQUIRE/RELEASE/RELAXED can be implemented in terms of FULL ordered atomics. Other atomics are optional, and don't exist in some configurations (e.g. not all architectures implement the 128-bit cmpxchg ops). Subsequent patches will require that architectures define a preprocessor symbol for any atomic (or ordering variant) which is optional. This will make the fallback ifdeffery more robust, and simplify future changes. Add the required definitions to arch/arm. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-7-mark.rutland@arm.com
\| * \|	locking/atomic: arc: add preprocessor symbols	Mark Rutland	2023-06-05	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some atomics can be implemented in several different ways, e.g. FULL/ACQUIRE/RELEASE ordered atomics can be implemented in terms of RELAXED atomics, and ACQUIRE/RELEASE/RELAXED can be implemented in terms of FULL ordered atomics. Other atomics are optional, and don't exist in some configurations (e.g. not all architectures implement the 128-bit cmpxchg ops). Subsequent patches will require that architectures define a preprocessor symbol for any atomic (or ordering variant) which is optional. This will make the fallback ifdeffery more robust, and simplify future changes. Add the required definitions to arch/arc. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-6-mark.rutland@arm.com
\| * \|	locking/atomic: make atomic*_{cmp,}xchg optional	Mark Rutland	2023-06-05	23	-265/+179
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Most architectures define the atomic/atomic64 xchg and cmpxchg operations in terms of arch_xchg and arch_cmpxchg respectfully. Add fallbacks for these cases and remove the trivial cases from arch code. On some architectures the existing definitions are kept as these are used to build other arch_atomic*() operations. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-5-mark.rutland@arm.com
\| * \|	locking/atomic: hexagon: remove redundant arch_atomic_cmpxchg	Mark Rutland	2023-06-05	1	-42/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Hexagon's implementation of arch_atomic_cmpxchg() is identical to its implementation of arch_cmpxchg(). Have it define arch_atomic_cmpxchg() in terms of arch_cmpxchg(), matching what it does for arch_atomic_xchg() and arch_xchg(). At the same time, remove the kerneldoc comments for hexagon's arch_atomic_xchg() and arch_atomic_cmpxchg(). The arch_atomic_*() namespace is shared by all architectures and the API should be documented centrally, and the comments aren't all that helpful as-is. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-4-mark.rutland@arm.com
\| * \|	locking/atomic: remove fallback comments	Mark Rutland	2023-06-05	8	-223/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently a subset of the fallback templates have kerneldoc comments, resulting in a haphazard set of generated kerneldoc comments as only some operations have fallback templates to begin with. We'd like to generate more consistent kerneldoc comments, and to do so we'll need to restructure the way the fallback code is generated. To minimize churn and to make it easier to restructure the fallback code, this patch removes the existing kerneldoc comments from the fallback templates. We can add new kerneldoc comments in subsequent patches. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-3-mark.rutland@arm.com
\| * \|	locking/atomic: arm: fix sync ops	Mark Rutland	2023-06-05	6	-7/+65
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The sync_() ops on arch/arm are defined in terms of the regular bitops with no special handling. This is not correct, as UP kernels elide barriers for the fully-ordered operations, and so the required ordering is lost when such UP kernels are run under a hypervsior on an SMP system. Fix this by defining sync ops with the required barriers. Note: On 32-bit arm, the sync_() ops are currently only used by Xen, which requires ARMv7, but the semantics can be implemented for ARMv6+. Fixes: e54d2f61528165bb ("xen/arm: sync_bitops") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-2-mark.rutland@arm.com
\| * \|	s390/cpum_sf: Convert to cmpxchg128()	Peter Zijlstra	2023-06-05	2	-14/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that there is a cross arch u128 and cmpxchg128(), use those instead of the custom CDSG helper. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132324.058821078@infradead.org
\| * \|	arch: Remove cmpxchg_double	Peter Zijlstra	2023-06-05	15	-371/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	No moar users, remove the monster. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.991907085@infradead.org
\| * \|	slub: Replace cmpxchg_double()	Peter Zijlstra	2023-06-05	3	-67/+137
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.924677086@infradead.org
\| * \|	x86,intel_iommu: Replace cmpxchg_double()	Peter Zijlstra	2023-06-05	2	-65/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.855976804@infradead.org
\| * \|	x86,amd_iommu: Replace cmpxchg_double()	Peter Zijlstra	2023-06-05	2	-8/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Vasant Hegde <vasant.hegde@amd.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.788955257@infradead.org
\| * \|	parisc: Raise minimal GCC version	Peter Zijlstra	2023-06-05	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	64-bit targets need the __int128 type, which for pa-risc means raising the minimum gcc version to 11. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Helge Deller <deller@gmx.de> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lkml.kernel.org/r/20230602143912.GI620383%40hirez.programming.kicks-ass.net
\| * \|	percpu: Wire up cmpxchg128	Peter Zijlstra	2023-06-05	7	-39/+240
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order to replace cmpxchg_double() with the newly minted cmpxchg128() family of functions, wire it up in this_cpu_cmpxchg(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.654945124@infradead.org
\| * \|	percpu: Add {raw,this}_cpu_try_cmpxchg()	Peter Zijlstra	2023-06-05	2	-4/+128
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the try_cmpxchg() form to the per-cpu ops. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.587480729@infradead.org
\| * \|	instrumentation: Wire up cmpxchg128()	Peter Zijlstra	2023-06-05	4	-6/+183
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Wire up the cmpxchg128 family in the atomic wrapper scripts. These provide the generic cmpxchg128 family of functions from the arch_ prefixed version, adding explicit instrumentation where needed. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.519237070@infradead.org
\| * \|	arch: Introduce arch_{,try_}_cmpxchg128{,_local}()	Peter Zijlstra	2023-06-05	6	-2/+177
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For all architectures that currently support cmpxchg_double() implement the cmpxchg128() family of functions that is basically the same but with a saner interface. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.452120708@infradead.org