summaryrefslogtreecommitdiffstats
path: root/scripts/atomic (follow)
Commit message (Collapse)AuthorAgeFilesLines
* locking/atomic: scripts: fix ${atomic}_sub_and_test() kerneldocCarlos Llamas2024-06-051-1/+1
| | | | | | | | | | | | | | | For ${atomic}_sub_and_test() the @i parameter is the value to subtract, not add. Fix the typo in the kerneldoc template and generate the headers with this update. Fixes: ad8110706f38 ("locking/atomic: scripts: generate kerneldoc comments") Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20240515133844.3502360-1-cmllamas@google.com
* locking/atomic: scripts: Clarify ordering of conditional atomicsMark Rutland2024-02-207-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conditional atomic operations (e.g. cmpxchg()) only provide ordering when the condition holds; when the condition does not hold, the location is not modified and relaxed ordering is provided. Where ordering is needed for failed conditional atomics, it is necessary to use smp_mb__before_atomic() and/or smp_mb__after_atomic(). This is explained tersely in memory-barriers.txt, and is implied but not explicitly stated in the kerneldoc comments for the conditional operations. The lack of an explicit statement has lead to some off-list queries about the ordering semantics of failing conditional operations, so evidently this is confusing. Update the kerneldoc comments to explicitly describe the lack of ordering for failed conditional atomic operations. For most conditional atomic operations, this is written as: | If (${condition}), atomically updates @v to (${new}) with ${desc_order} ordering. | Otherwise, @v is not modified and relaxed ordering is provided. For the try_cmpxchg() operations, this is written as: | If (${condition}), atomically updates @v to @new with ${desc_order} ordering. | Otherwise, @v is not modified, @old is updated to the current value of @v, | and relaxed ordering is provided. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Nhat Pham <nphamcs@gmail.com> Link: https://lore.kernel.org/r/20240209124010.2096198-1-mark.rutland@arm.com
* locking/atomic: Add generic support for sync_try_cmpxchg() and its fallbackUros Bizjak2023-10-092-16/+20
| | | | | | | | | | | | | | | | | | | | | | | | | Provide the generic sync_try_cmpxchg() function from the raw_ prefixed version, also adding explicit instrumentation. The patch amends existing scripts to generate sync_try_cmpxchg() locking primitive and its raw_sync_try_cmpxchg() fallback, while leaving existing macros from the try_cmpxchg() family unchanged. The target can define its own arch_sync_try_cmpxchg() to override the generic version of raw_sync_try_cmpxchg(). This allows the target to generate more optimal assembly than the generic version. Additionally, the patch renames two scripts to better reflect whet they really do. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: linux-kernel@vger.kernel.org
* locking/atomic: scripts: fix fallback ifdefferyMark Rutland2023-09-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit: 9257959a6e5b4fca ("locking/atomic: scripts: restructure fallback ifdeffery") The ordering fallbacks for atomic*_read_acquire() and atomic*_set_release() erroneously fall back to the implictly relaxed atomic*_read() and atomic*_set() variants respectively, without any additional barriers. This loses the ACQUIRE and RELEASE ordering semantics, which can result in a wide variety of problems, even on strongly-ordered architectures where the implementation of atomic*_read() and/or atomic*_set() allows the compiler to reorder those relative to other accesses. In practice this has been observed to break bit spinlocks on arm64, resulting in dentry cache corruption. The fallback logic was intended to allow ACQUIRE/RELEASE/RELAXED ops to be defined in terms of FULL ops, but where an op had RELAXED ordering by default, this unintentionally permitted the ACQUIRE/RELEASE ops to be defined in terms of the implicitly RELAXED default. This patch corrects the logic to avoid falling back to implicitly RELAXED ops, resulting in the same behaviour as prior to commit 9257959a6e5b4fca. I've verified the resulting assembly on arm64 by generating outlined wrappers of the atomics. Prior to this patch the compiler generates sequences using relaxed load (LDR) and store (STR) instructions, e.g. | <outlined_atomic64_read_acquire>: | ldr x0, [x0] | ret | | <outlined_atomic64_set_release>: | str x1, [x0] | ret With this patch applied the compiler generates sequences using the intended load-acquire (LDAR) and store-release (STLR) instructions, e.g. | <outlined_atomic64_read_acquire>: | ldar x0, [x0] | ret | | <outlined_atomic64_set_release>: | stlr x1, [x0] | ret To make sure that there were no other victims of the ifdeffery rewrite, I generated outlined copies of all of the {atomic,atomic64,atomic_long} atomic operations before and after commit 9257959a6e5b4fca. A diff of the generated assembly on arm64 shows that only the read_acquire() and set_release() operations were changed, and only lost their intended ordering: | [mark@lakrids:~/src/linux]% diff -u \ | <(aarch64-linux-gnu-objdump -d before-9257959a6e5b4fca.o) | <(aarch64-linux-gnu-objdump -d after-9257959a6e5b4fca.o) | --- /proc/self/fd/11 2023-09-19 16:51:51.114779415 +0100 | +++ /proc/self/fd/16 2023-09-19 16:51:51.114779415 +0100 | @@ -1,5 +1,5 @@ | | -before-9257959a6e5b4fca.o: file format elf64-littleaarch64 | +after-9257959a6e5b4fca.o: file format elf64-littleaarch64 | | | Disassembly of section .text: | @@ -9,7 +9,7 @@ | 4: d65f03c0 ret | | 0000000000000008 <outlined_atomic_read_acquire>: | - 8: 88dffc00 ldar w0, [x0] | + 8: b9400000 ldr w0, [x0] | c: d65f03c0 ret | | 0000000000000010 <outlined_atomic_set>: | @@ -17,7 +17,7 @@ | 14: d65f03c0 ret | | 0000000000000018 <outlined_atomic_set_release>: | - 18: 889ffc01 stlr w1, [x0] | + 18: b9000001 str w1, [x0] | 1c: d65f03c0 ret | | 0000000000000020 <outlined_atomic_add>: | @@ -1230,7 +1230,7 @@ | 1070: d65f03c0 ret | | 0000000000001074 <outlined_atomic64_read_acquire>: | - 1074: c8dffc00 ldar x0, [x0] | + 1074: f9400000 ldr x0, [x0] | 1078: d65f03c0 ret | | 000000000000107c <outlined_atomic64_set>: | @@ -1238,7 +1238,7 @@ | 1080: d65f03c0 ret | | 0000000000001084 <outlined_atomic64_set_release>: | - 1084: c89ffc01 stlr x1, [x0] | + 1084: f9000001 str x1, [x0] | 1088: d65f03c0 ret | | 000000000000108c <outlined_atomic64_add>: | @@ -2427,7 +2427,7 @@ | 207c: d65f03c0 ret | | 0000000000002080 <outlined_atomic_long_read_acquire>: | - 2080: c8dffc00 ldar x0, [x0] | + 2080: f9400000 ldr x0, [x0] | 2084: d65f03c0 ret | | 0000000000002088 <outlined_atomic_long_set>: | @@ -2435,7 +2435,7 @@ | 208c: d65f03c0 ret | | 0000000000002090 <outlined_atomic_long_set_release>: | - 2090: c89ffc01 stlr x1, [x0] | + 2090: f9000001 str x1, [x0] | 2094: d65f03c0 ret | | 0000000000002098 <outlined_atomic_long_add>: I've build tested this with a variety of configs for alpha, arm, arm64, csky, i386, m68k, microblaze, mips, nios2, openrisc, powerpc, riscv, s390, sh, sparc, x86_64, and xtensa, for which I've seen no issues. I was unable to build test for ia64 and parisc due to existing build breakage in v6.6-rc2. Fixes: 9257959a6e5b4fca ("locking/atomic: scripts: restructure fallback ifdeffery") Reported-by: Ming Lei <ming.lei@redhat.com> Reported-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Baokun Li <libaokun1@huawei.com> Link: https://lkml.kernel.org/r/20230919171430.2697727-1-mark.rutland@arm.com
* locking/atomic: scripts: fix ${atomic}_dec_if_positive() kerneldocMark Rutland2023-06-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | The ${atomic}_dec_if_positive() ops are unlike all the other conditional atomic ops. Rather than returning a boolean success value, these return the value that the atomic variable would be updated to, even when no update is performed. We missed this when adding kerneldoc comments, and the documentation for ${atomic}_dec_if_positive() erroneously states: | Return: @true if @v was updated, @false otherwise. Ideally we'd clean this up by aligning ${atomic}_dec_if_positive() with the usual atomic op conventions: with ${atomic}_fetch_dec_if_positive() for those who care about the value of the varaible, and ${atomic}_dec_if_positive() returning a boolean success value. In the mean time, align the documentation with the current reality. Fixes: ad8110706f381170 ("locking/atomic: scripts: generate kerneldoc comments") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20230615132734.1119765-1-mark.rutland@arm.com
* locking/atomic: scripts: generate kerneldoc commentsMark Rutland2023-06-0526-4/+399
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the atomics are documented in Documentation/atomic_t.txt, and have no kerneldoc comments. There are a sufficient number of gotchas (e.g. semantics, noinstr-safety) that it would be nice to have comments to call these out, and it would be nice to have kerneldoc comments such that these can be collated. While it's possible to derive the semantics from the code, this can be painful given the amount of indirection we currently have (e.g. fallback paths), and it's easy to be mislead by naming, e.g. * The unconditional void-returning ops *only* have relaxed variants without a _relaxed suffix, and can easily be mistaken for being fully ordered. It would be nice to give these a _relaxed() suffix, but this would result in significant churn throughout the kernel. * Our naming of conditional and unconditional+test ops is rather inconsistent, and it can be difficult to derive the name of an operation, or to identify where an op is conditional or unconditional+test. Some ops are clearly conditional: - dec_if_positive - add_unless - dec_unless_positive - inc_unless_negative Some ops are clearly unconditional+test: - sub_and_test - dec_and_test - inc_and_test However, what exactly those test is not obvious. A _test_zero suffix might be clearer. Others could be read ambiguously: - inc_not_zero // conditional - add_negative // unconditional+test It would probably be worth renaming these, e.g. to inc_unless_zero and add_test_negative. As a step towards making this more consistent and easier to understand, this patch adds kerneldoc comments for all generated *atomic*_*() functions. These are generated from templates, with some common text shared, making it easy to extend these in future if necessary. I've tried to make these as consistent and clear as possible, and I've deliberately ensured: * All ops have their ordering explicitly mentioned in the short and long description. * All test ops have "test" in their short description. * All ops are described as an expression using their usual C operator. For example: andnot: "Atomically updates @v to (@v & ~@i)" inc: "Atomically updates @v to (@v + 1)" Which may be clearer to non-naative English speakers, and allows all the operations to be described in the same style. * All conditional ops have their condition described as an expression using the usual C operators. For example: add_unless: "If (@v != @u), atomically updates @v to (@v + @i)" cmpxchg: "If (@v == @old), atomically updates @v to @new" Which may be clearer to non-naative English speakers, and allows all the operations to be described in the same style. * All bitwise ops (and,andnot,or,xor) explicitly mention that they are bitwise in their short description, so that they are not mistaken for performing their logical equivalents. * The noinstr safety of each op is explicitly described, with a description of whether or not to use the raw_ form of the op. There should be no functional change as a result of this patch. Reported-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-26-mark.rutland@arm.com
* locking/atomic: scripts: simplify raw_atomic*() definitionsMark Rutland2023-06-0523-92/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently each ordering variant has several potential definitions, with a mixture of preprocessor and C definitions, including several copies of its C prototype, e.g. | #if defined(arch_atomic_fetch_andnot_acquire) | #define raw_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot_acquire | #elif defined(arch_atomic_fetch_andnot_relaxed) | static __always_inline int | raw_atomic_fetch_andnot_acquire(int i, atomic_t *v) | { | int ret = arch_atomic_fetch_andnot_relaxed(i, v); | __atomic_acquire_fence(); | return ret; | } | #elif defined(arch_atomic_fetch_andnot) | #define raw_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot | #else | static __always_inline int | raw_atomic_fetch_andnot_acquire(int i, atomic_t *v) | { | return raw_atomic_fetch_and_acquire(~i, v); | } | #endif Make this a bit simpler by defining the C prototype once, and writing the various potential definitions as plain C code guarded by ifdeffery. For example, the above becomes: | static __always_inline int | raw_atomic_fetch_andnot_acquire(int i, atomic_t *v) | { | #if defined(arch_atomic_fetch_andnot_acquire) | return arch_atomic_fetch_andnot_acquire(i, v); | #elif defined(arch_atomic_fetch_andnot_relaxed) | int ret = arch_atomic_fetch_andnot_relaxed(i, v); | __atomic_acquire_fence(); | return ret; | #elif defined(arch_atomic_fetch_andnot) | return arch_atomic_fetch_andnot(i, v); | #else | return raw_atomic_fetch_and_acquire(~i, v); | #endif | } Which is far easier to read. As we now always have a single copy of the C prototype wrapping all the potential definitions, we now have an obvious single location for kerneldoc comments. At the same time, the fallbacks for raw_atomic*_xhcg() are made to use 'new' rather than 'i' as the name of the new value. This is what the existing fallback template used, and is more consistent with the raw_atomic{_try,}cmpxchg() fallbacks. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-24-mark.rutland@arm.com
* locking/atomic: scripts: simplify raw_atomic_long*() definitionsMark Rutland2023-06-051-18/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, atomic-long is split into two sections, one defining the raw_atomic_long_*() ops for CONFIG_64BIT, and one defining the raw atomic_long_*() ops for !CONFIG_64BIT. With many lines elided, this looks like: | #ifdef CONFIG_64BIT | ... | static __always_inline bool | raw_atomic_long_try_cmpxchg(atomic_long_t *v, long *old, long new) | { | return raw_atomic64_try_cmpxchg(v, (s64 *)old, new); | } | ... | #else /* CONFIG_64BIT */ | ... | static __always_inline bool | raw_atomic_long_try_cmpxchg(atomic_long_t *v, long *old, long new) | { | return raw_atomic_try_cmpxchg(v, (int *)old, new); | } | ... | #endif The two definitions are spread far apart in the file, and duplicate the prototype, making it hard to have a legible set of kerneldoc comments. Make this simpler by defining the C prototype once, and writing the two definitions inline. For example, the above becomes: | static __always_inline bool | raw_atomic_long_try_cmpxchg(atomic_long_t *v, long *old, long new) | { | #ifdef CONFIG_64BIT | return raw_atomic64_try_cmpxchg(v, (s64 *)old, new); | #else | return raw_atomic_try_cmpxchg(v, (int *)old, new); | #endif | } As we now always have a single copy of the C prototype wrapping all the potential definitions, we now have an obvious single location for kerneldoc comments. As a bonus, both the script and the generated file are somewhat shorter. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-23-mark.rutland@arm.com
* locking/atomic: scripts: split pfx/name/sfx/orderMark Rutland2023-06-051-3/+8
| | | | | | | | | | | | | | | | | | | Currently gen-atomic-long.sh's gen_proto_order_variant() function combines the pfx/name/sfx/order variables immediately, unlike other functions in gen-atomic-*.sh. This is fine today, but subsequent patches will require the individual individual pfx/name/sfx/order variables within gen-atomic-long.sh's gen_proto_order_variant() function. In preparation for this, split the variables in the style of other gen-atomic-*.sh scripts. This results in no change to the generated headers, so there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-22-mark.rutland@arm.com
* locking/atomic: scripts: restructure fallback ifdefferyMark Rutland2023-06-0524-207/+196
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the various ordering variants of an atomic operation are defined in groups of full/acquire/release/relaxed ordering variants with some shared ifdeffery and several potential definitions of each ordering variant in different branches of the shared ifdeffery. As an ordering variant can have several potential definitions down different branches of the shared ifdeffery, it can be painful for a human to find a relevant definition, and we don't have a good location to place anything common to all definitions of an ordering variant (e.g. kerneldoc). Historically the grouping of full/acquire/release/relaxed ordering variants was necessary as we filled in the missing atomics in the same namespace as the architecture used. It would be easy to accidentally define one ordering fallback in terms of another ordering fallback with redundant barriers, and avoiding that would otherwise require a lot of baroque ifdeffery. With recent changes we no longer need to fill in the missing atomics in the arch_atomic*_<op>() namespace, and only need to fill in the raw_atomic*_<op>() namespace. Due to this, there's no risk of a namespace collision, and we can define each raw_atomic*_<op> ordering variant with its own ifdeffery checking for the arch_atomic*_<op> ordering variants. Restructure the fallbacks in this way, with each ordering variant having its own ifdeffery of the form: | #if defined(arch_atomic_fetch_andnot_acquire) | #define raw_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot_acquire | #elif defined(arch_atomic_fetch_andnot_relaxed) | static __always_inline int | raw_atomic_fetch_andnot_acquire(int i, atomic_t *v) | { | int ret = arch_atomic_fetch_andnot_relaxed(i, v); | __atomic_acquire_fence(); | return ret; | } | #elif defined(arch_atomic_fetch_andnot) | #define raw_atomic_fetch_andnot_acquire arch_atomic_fetch_andnot | #else | static __always_inline int | raw_atomic_fetch_andnot_acquire(int i, atomic_t *v) | { | return raw_atomic_fetch_and_acquire(~i, v); | } | #endif Note that where there's no relevant arch_atomic*_<op>() ordering variant, we'll define the operation in terms of a distinct raw_atomic*_<otherop>(), as this itself might have been filled in with a fallback. As we now generate the raw_atomic*_<op>() implementations directly, we no longer need the trivial wrappers, so they are removed. This makes the ifdeffery easier to follow, and will allow for further improvements in subsequent patches. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-21-mark.rutland@arm.com
* locking/atomic: scripts: build raw_atomic_long*() directlyMark Rutland2023-06-052-6/+2
| | | | | | | | | | | | | | | Now that arch_atomic*() usage is limited to the atomic headers, we no longer have any users of arch_atomic_long_*(), and can generate raw_atomic_long_*() directly. Generate the raw_atomic_long_*() ops directly. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-20-mark.rutland@arm.com
* locking/atomic: scripts: add trivial raw_atomic*_<op>()Mark Rutland2023-06-053-12/+92
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently a number of arch_atomic*_<op>() functions are optional, and where an arch does not provide a given arch_atomic*_<op>() we will define an implementation of arch_atomic*_<op>() in atomic-arch-fallback.h. Filling in the missing ops requires special care as we want to select the optimal definition of each op (e.g. preferentially defining ops in terms of their relaxed form rather than their fully-ordered form). The ifdeffery necessary for this requires us to group ordering variants together, which can be a bit painful to read, and is painful for kerneldoc generation. It would be easier to handle this if we generated ops into a separate namespace, as this would remove the need to take special care with the ifdeffery, and allow each ordering variant to be generated separately. This patch adds a new set of raw_atomic_<op>() definitions, which are currently trivial wrappers of their arch_atomic_<op>() equivalent. This will allow us to move treewide users of arch_atomic_<op>() over to raw atomic op before we rework the fallback generation to generate raw_atomic_<op> directly. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-18-mark.rutland@arm.com
* locking/atomic: scripts: factor out order template generationMark Rutland2023-06-051-17/+17
| | | | | | | | | | | | | | Currently gen_proto_order_variants() hard codes the path for the templates used for order fallbacks. Factor this out into a helper so that it can be reused elsewhere. This results in no change to the generated headers, so there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-17-mark.rutland@arm.com
* locking/atomic: scripts: remove leftover "${mult}"Mark Rutland2023-06-051-1/+1
| | | | | | | | | | | | | | | | | We removed cmpxchg_double() and variants in commit: b4cf83b2d1da40b2 ("arch: Remove cmpxchg_double") Which removed the need for "${mult}" in the instrumentation logic. Unfortunately we missed an instance of "${mult}". There is no change to the generated header. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-16-mark.rutland@arm.com
* locking/atomic: scripts: remove bogus order parameterMark Rutland2023-06-051-1/+1
| | | | | | | | | | | | | | | | At the start of gen_proto_order_variants(), the ${order} variable is not yet defined, and will be substituted with an empty string. Replace the current bogus use of ${order} with an empty string instead. This results in no change to the generated headers. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-15-mark.rutland@arm.com
* locking/atomic: make atomic*_{cmp,}xchg optionalMark Rutland2023-06-052-0/+14
| | | | | | | | | | | | | | Most architectures define the atomic/atomic64 xchg and cmpxchg operations in terms of arch_xchg and arch_cmpxchg respectfully. Add fallbacks for these cases and remove the trivial cases from arch code. On some architectures the existing definitions are kept as these are used to build other arch_atomic*() operations. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-5-mark.rutland@arm.com
* locking/atomic: remove fallback commentsMark Rutland2023-06-057-58/+0
| | | | | | | | | | | | | | | | | | | | | Currently a subset of the fallback templates have kerneldoc comments, resulting in a haphazard set of generated kerneldoc comments as only some operations have fallback templates to begin with. We'd like to generate more consistent kerneldoc comments, and to do so we'll need to restructure the way the fallback code is generated. To minimize churn and to make it easier to restructure the fallback code, this patch removes the existing kerneldoc comments from the fallback templates. We can add new kerneldoc comments in subsequent patches. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-3-mark.rutland@arm.com
* arch: Remove cmpxchg_doublePeter Zijlstra2023-06-051-11/+4
| | | | | | | | | | | No moar users, remove the monster. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.991907085@infradead.org
* instrumentation: Wire up cmpxchg128()Peter Zijlstra2023-06-052-4/+4
| | | | | | | | | | | | | | Wire up the cmpxchg128 family in the atomic wrapper scripts. These provide the generic cmpxchg128 family of functions from the arch_ prefixed version, adding explicit instrumentation where needed. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.519237070@infradead.org
* locking/atomic: Correct (cmp)xchg() instrumentationMark Rutland2023-04-291-3/+3
| | | | | | | | | | | | | | | | | | | All xchg() and cmpxchg() ops are atomic RMWs, but currently we instrument these with instrument_atomic_write() rather than instrument_atomic_read_write(), missing the read aspect. Similarly, all try_cmpxchg() ops are non-atomic RMWs on *oldp, but we instrument these accesses with instrument_atomic_write() rather than instrument_read_write(), missing the read aspect and erroneously marking these as atomic. Fix the instrumentation for both points. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20230413160644.490976-1-mark.rutland@arm.com Cc: Linus Torvalds <torvalds@linux-foundation.org>
* locking/atomic: Add generic try_cmpxchg{,64}_local() supportUros Bizjak2023-04-292-1/+5
| | | | | | | | | | | | | | Add generic support for try_cmpxchg{,64}_local() and their falbacks. These provides the generic try_cmpxchg_local family of functions from the arch_ prefixed version, also adding explicit instrumentation. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230405141710.3551-2-ubizjak@gmail.com Cc: Linus Torvalds <torvalds@linux-foundation.org>
* atomics: Provide atomic_add_negative() variantsThomas Gleixner2023-03-282-7/+6
| | | | | | | | | | | | atomic_add_negative() does not provide the relaxed/acquire/release variants. Provide them in preparation for a new scalable reference count algorithm. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230323102800.101763813@linutronix.de
* Fix up more non-executable files marked executableLinus Torvalds2023-01-281-0/+0
| | | | | | | | | | | | | | | | | | | | | Joe found another DT file that shouldn't be executable, and that frustrated me enough that I went hunting with this script: git ls-files -s | grep '^100755' | cut -f2 | xargs grep -L '^#!' and that found another file that shouldn't have been marked executable either, despite being in the scripts directory. Maybe these two are the last ones at least for now. But I'm sure we'll be back in a few years, fixing things up again. Fixes: 8c6789f4e2d4 ("ASoC: dt-bindings: Add Everest ES8326 audio CODEC") Fixes: 4d8e5cd233db ("locking/atomics: Fix scripts/atomic/ script permissions") Reported-by: Joe Perches <joe@perches.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* kbuild: check sha1sum just once for each atomic headerMasahiro Yamada2022-09-281-33/+0
| | | | | | | | | | | It is unneeded to check the sha1sum every time. Create the timestamp files to manage it. Add '.' to clean-dirs because 'make clean' must visit ./Kbuild to clean up the timestamp files. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
* locking/atomic: Add generic try_cmpxchg64 supportUros Bizjak2022-05-182-14/+19
| | | | | | | | | Add generic support for try_cmpxchg64{,_acquire,_release,_relaxed} and their falbacks involving cmpxchg64. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220515184205.103089-2-ubizjak@gmail.com
* atomics: Fix atomic64_{read_acquire,set_release} fallbacksMark Rutland2022-02-112-2/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Arnd reports that on 32-bit architectures, the fallbacks for atomic64_read_acquire() and atomic64_set_release() are broken as they use smp_load_acquire() and smp_store_release() respectively, which do not work on types larger than the native word size. Since those contain compiletime_assert_atomic_type(), any attempt to use those fallbacks will result in a build-time error. e.g. with the following added to arch/arm/kernel/setup.c: | void test_atomic64(atomic64_t *v) | { | atomic64_set_release(v, 5); | atomic64_read_acquire(v); | } The compiler will complain as follows: | In file included from <command-line>: | In function 'arch_atomic64_set_release', | inlined from 'test_atomic64' at ./include/linux/atomic/atomic-instrumented.h:669:2: | ././include/linux/compiler_types.h:346:38: error: call to '__compiletime_assert_9' declared with attribute error: Need native word sized stores/loads for atomicity. | 346 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | | ^ | ././include/linux/compiler_types.h:327:4: note: in definition of macro '__compiletime_assert' | 327 | prefix ## suffix(); \ | | ^~~~~~ | ././include/linux/compiler_types.h:346:2: note: in expansion of macro '_compiletime_assert' | 346 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) | | ^~~~~~~~~~~~~~~~~~~ | ././include/linux/compiler_types.h:349:2: note: in expansion of macro 'compiletime_assert' | 349 | compiletime_assert(__native_word(t), \ | | ^~~~~~~~~~~~~~~~~~ | ./include/asm-generic/barrier.h:133:2: note: in expansion of macro 'compiletime_assert_atomic_type' | 133 | compiletime_assert_atomic_type(*p); \ | | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ./include/asm-generic/barrier.h:164:55: note: in expansion of macro '__smp_store_release' | 164 | #define smp_store_release(p, v) do { kcsan_release(); __smp_store_release(p, v); } while (0) | | ^~~~~~~~~~~~~~~~~~~ | ./include/linux/atomic/atomic-arch-fallback.h:1270:2: note: in expansion of macro 'smp_store_release' | 1270 | smp_store_release(&(v)->counter, i); | | ^~~~~~~~~~~~~~~~~ | make[2]: *** [scripts/Makefile.build:288: arch/arm/kernel/setup.o] Error 1 | make[1]: *** [scripts/Makefile.build:550: arch/arm/kernel] Error 2 | make: *** [Makefile:1831: arch/arm] Error 2 Fix this by only using smp_load_acquire() and smp_store_release() for native atomic types, and otherwise falling back to the regular barriers necessary for acquire/release semantics, as we do in the more generic acquire and release fallbacks. Since the fallback templates are used to generate the atomic64_*() and atomic_*() operations, the __native_word() check is added to both. For the atomic_*() operations, which are always 32-bit, the __native_word() check is redundant but not harmful, as it is always true. For the example above this works as expected on 32-bit, e.g. for arm multi_v7_defconfig: | <test_atomic64>: | push {r4, r5} | dmb ish | pldw [r0] | mov r2, #5 | mov r3, #0 | ldrexd r4, [r0] | strexd r4, r2, [r0] | teq r4, #0 | bne 484 <test_atomic64+0x14> | ldrexd r2, [r0] | dmb ish | pop {r4, r5} | bx lr ... and also on 64-bit, e.g. for arm64 defconfig: | <test_atomic64>: | bti c | paciasp | mov x1, #0x5 | stlr x1, [x0] | ldar x0, [x0] | autiasp | ret Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Link: https://lore.kernel.org/r/20220207101943.439825-1-mark.rutland@arm.com
* locking/atomics, kcsan: Add instrumentation for barriersMarco Elver2021-12-101-9/+32
| | | | | | | Adds the required KCSAN instrumentation for barriers of atomics. Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
* locking/atomic: add arch_atomic_long*()Mark Rutland2021-07-162-2/+7
| | | | | | | | | | | | | | | | | | Now that all architectures provide arch_{atomic,atomic64}_*(), we can build arch_atomic_long_*() atop these, which can be safely used in noinstr code. The regular atomic_long_*() wrappers are built atop these, as we do for {atomic,atomic64}_*() atop arch_{atomic,atomic64}_*(). We don't provide arch_* versions of the cond_read*() variants, as we don't have arch_* versions of the underlying atomic/atomic64 functions (nor the smp_cond_load*() helpers these are typically based on). Note that the headers in this patch under include/linux/atomic/ are generated by the scripts in scripts/atomic/. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210713105253.7615-5-mark.rutland@arm.com
* locking/atomic: centralize generated headersMark Rutland2021-07-164-12/+12
| | | | | | | | | | | | | | | The generated atomic headers are only intended to be included directly by <linux/atomic.h>, but are spread across include/linux/ and include/asm-generic/, where people mnay be encouraged to include them. This patch centralizes them under include/linux/atomic/. Other than the header guards and hashes, there is no change to any of the generated headers as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210713105253.7615-4-mark.rutland@arm.com
* locking/atomic: remove ARCH_ATOMIC remanantsMark Rutland2021-07-1621-91/+71
| | | | | | | | | | | | | Now that gen-atomic-fallback.sh is only used to generate the arch_* fallbacks, we don't need to also generate the non-arch_* forms, and can removethe infrastructure this needed. There is no change to any of the generated headers as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210713105253.7615-3-mark.rutland@arm.com
* locking/atomic: simplify ifdef generationMark Rutland2021-07-161-1/+1
| | | | | | | | | | | | | | | | | | | | | In gen-atomic-fallback.sh's gen_proto_order_variants(), we generate some ifdeferry with: | local basename="${arch}${atomic}_${pfx}${name}${sfx}" | ... | printf "#ifdef ${basename}\n" | ... | printf "#endif /* ${arch}${atomic}_${pfx}${name}${sfx} */\n\n" For clarity, use ${basename} for both sides, rather than open-coding the string generation. There is no change to any of the generated headers as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210713105253.7615-2-mark.rutland@arm.com
* locking/atomics: atomic-instrumented: simplify ifdefferyMark Rutland2021-05-261-49/+2
| | | | | | | | | | | | | | | | | | | Now that all architectures implement ARCH_ATOMIC, the fallbacks are generated before the instrumented wrappers are generated. Due to this, in atomic-instrumented.h we can assume that the whole set of atomic functions has been generated. Likewise, atomic-instrumented.h doesn't need to provide a preprocessor definition for every atomic it wraps. This patch removes the redundant ifdeffery. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210525140232.53872-34-mark.rutland@arm.com
* locking/atomic: delete !ARCH_ATOMIC remnantsMark Rutland2021-05-262-2/+0
| | | | | | | | | | | | | | | Now that all architectures implement ARCH_ATOMIC, we can make it mandatory, removing the Kconfig symbol and logic for !ARCH_ATOMIC. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210525140232.53872-33-mark.rutland@arm.com
* locking/atomics: Regenerate the atomics-check SHA1'sIngo Molnar2020-11-071-0/+0
| | | | | | | | | | | | | | | | | | | | | | | The include/asm-generic/atomic-instrumented.h checksum got out of sync, so regenerate it. (No change to actual code.) Also make scripts/atomic/gen-atomics.sh executable, to make it easier to use. The auto-generated atomic header signatures are now fine: thule:~/tip> scripts/atomic/check-atomics.sh thule:~/tip> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: linux-kernel@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
* Merge branch 'linus' into perf/kprobesIngo Molnar2020-11-072-6/+16
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: include/asm-generic/atomic-instrumented.h kernel/kprobes.c Use the upstream atomic-instrumented.h checksum, and pick the kprobes version of kernel/kprobes.c, which effectively reverts this upstream workaround: 645f224e7ba2: ("kprobes: Tell lockdep about kprobe nesting") Since the new code *should* be fine without nesting. Knock on wood ... Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * Merge branch 'kcsan' of ↵Ingo Molnar2020-10-091-6/+15
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into locking/core Pull KCSAN updates for v5.10 from Paul E. McKenney: - Improve kernel messages. - Be more permissive with bitops races under KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=y. - Optimize debugfs stat counters. - Introduce the instrument_*read_write() annotations, to provide a finer description of certain ops - using KCSAN's compound instrumentation. Use them for atomic RNW and bitops, where appropriate. Doing this might find new races. (Depends on the compiler having tsan-compound-read-before-write=1 support.) - Support atomic built-ins, which will help certain architectures, such as s390. - Misc enhancements and smaller fixes. Signed-off-by: Ingo Molnar <mingo@kernel.org>
| | * locking/atomics: Use read-write instrumentation for atomic RMWsMarco Elver2020-08-251-6/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use instrument_atomic_read_write() for atomic RMW ops. Cc: Will Deacon <will@kernel.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: <linux-arch@vger.kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
| * | locking/atomics: Check atomic-arch-fallback.h tooPaul Bolle2020-10-071-0/+1
| |/ | | | | | | | | | | | | | | | | | | The sha1sum of include/linux/atomic-arch-fallback.h isn't checked by check-atomics.sh. It's not clear why it's skipped so let's check it too. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lkml.kernel.org/r/20201001202028.1048418-1-pebolle@tiscali.nl
* / asm-generic/atomic: Add try_cmpxchg() fallbacksPeter Zijlstra2020-10-122-13/+79
|/ | | | | | | | | | | | Only x86 provides try_cmpxchg() outside of the atomic_t interfaces, provide generic fallbacks to create this interface from the widely available cmpxchg() function. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/159870621515.1229682.15506193091065001742.stgit@devnote2
* locking/atomics: Provide the arch_atomic_ interface to generic codePeter Zijlstra2020-06-251-0/+31
| | | | | | | | | | | | | Architectures with instrumented (KASAN/KCSAN) atomic operations natively provide arch_atomic_ variants that are not instrumented. It turns out that some generic code also requires arch_atomic_ in order to avoid instrumentation, so provide the arch_atomic_ interface as a direct map into the regular atomic_ interface for non-instrumented architectures. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
* Rebase locking/kcsan to locking/urgentThomas Gleixner2020-06-112-5/+7
|\ | | | | | | | | | | | | | | | | | | | | Merge the state of the locking kcsan branch before the read/write_once() and the atomics modifications got merged. Squash the fallout of the rebase on top of the read/write once and atomic fallback work into the merge. The history of the original branch is preserved in tag locking-kcsan-2020-06-02. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
| * asm-generic, atomic-instrumented: Use generic instrumented.hMarco Elver2020-03-211-16/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | This switches atomic-instrumented.h to use the generic instrumentation wrappers provided by instrumented.h. No functional change intended. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Marco Elver <elver@google.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>
| * asm-generic/atomic: Use __always_inline for fallback wrappersMarco Elver2020-01-0720-19/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use __always_inline for atomic fallback wrappers. When building for size (CC_OPTIMIZE_FOR_SIZE), some compilers appear to be less inclined to inline even relatively small static inline functions that are assumed to be inlinable such as atomic ops. This can cause problems, for example in UACCESS regions. While the fallback wrappers aren't pure wrappers, they are trivial nonetheless, and the function they wrap should determine the final inlining policy. For x86 tinyconfig we observe: - vmlinux baseline: 1315988 - vmlinux with patch: 1315928 (-60 bytes) Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marco Elver <elver@google.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
| * asm-generic/atomic: Use __always_inline for pure wrappersMarco Elver2020-01-072-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prefer __always_inline for atomic wrappers. When building for size (CC_OPTIMIZE_FOR_SIZE), some compilers appear to be less inclined to inline even relatively small static inline functions that are assumed to be inlinable such as atomic ops. This can cause problems, for example in UACCESS regions. By using __always_inline, we let the real implementation and not the wrapper determine the final inlining preference. For x86 tinyconfig we observe: - vmlinux baseline: 1316204 - vmlinux with patch: 1315988 (-216 bytes) This came up when addressing UACCESS warnings with CC_OPTIMIZE_FOR_SIZE in the KCSAN runtime: http://lkml.kernel.org/r/58708908-84a0-0a81-a836-ad97e33dbb62@infradead.org Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Marco Elver <elver@google.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
| * locking/atomics, kcsan: Add KCSAN instrumentationMarco Elver2019-11-161-2/+15
| | | | | | | | | | | | | | | | | | This adds KCSAN instrumentation to atomic-instrumented.h. Signed-off-by: Marco Elver <elver@google.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
* | locking/atomics: Flip fallbacks and instrumentationPeter Zijlstra2020-06-1121-65/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently instrumentation of atomic primitives is done at the architecture level, while composites or fallbacks are provided at the generic level. The result is that there are no uninstrumented variants of the fallbacks. Since there is now need of such variants to isolate text poke from any form of instrumentation invert this ordering. Doing this means moving the instrumentation into the generic code as well as having (for now) two variants of the fallbacks. Notes: - the various *cond_read* primitives are not proper fallbacks and got moved into linux/atomic.c. No arch_ variants are generated because the base primitives smp_cond_load*() are instrumented. - once all architectures are moved over to arch_atomic_ one of the fallback variants can be removed and some 2300 lines reclaimed. - atomic_{read,set}*() are no longer double-instrumented Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lkml.kernel.org/r/20200505134058.769149955@linutronix.de
* | asm-generic/atomic: Use __always_inline for fallback wrappersMarco Elver2020-06-1120-19/+21
|/ | | | | | | | | | | | | | | | | | | | | | | | | Use __always_inline for atomic fallback wrappers. When building for size (CC_OPTIMIZE_FOR_SIZE), some compilers appear to be less inclined to inline even relatively small static inline functions that are assumed to be inlinable such as atomic ops. This can cause problems, for example in UACCESS regions. While the fallback wrappers aren't pure wrappers, they are trivial nonetheless, and the function they wrap should determine the final inlining policy. For x86 tinyconfig we observe: - vmlinux baseline: 1315988 - vmlinux with patch: 1315928 (-60 bytes) [ tglx: Cherry-picked from KCSAN ] Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marco Elver <elver@google.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* locking/atomics: Use sed(1) instead of non-standard head(1) optionMichael Forney2019-06-251-1/+1
| | | | | | | | | | | | | | | | | | | POSIX says the -n option must be a positive decimal integer. Not all implementations of head(1) support negative numbers meaning offset from the end of the file. Instead, the sed expression '$d' has the same effect of removing the last line of the file. Signed-off-by: Michael Forney <mforney@mforney.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190618053306.730-1-mforney@mforney.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
* locking/atomics: Don't assume that scripts are executableAndrew Morton2019-04-191-1/+1
| | | | | | | | | | | | | | | | patch(1) doesn't set the x bit on files. So if someone downloads and applies patch-4.21.xz, their kernel won't build. Fix that by executing /bin/sh. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
* locking/atomics: Check atomic headers with sha1sumMark Rutland2019-02-132-6/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | We currently check the atomic headers at build-time to ensure they haven't been modified directly, and these checks require regenerating the headers in full. As this takes a few seconds, even when parallelized, this is too slow to run for every kernel build. Instead, we can generate a hash of each header as we generate them, which we can cheaply check at build time (~0.16s for all headers). This patch does so, updating headers with their hashes using the new gen-atomics.sh script. As some users apparently build the kernel wihout coreutils, lacking sha1sum, the checks are skipped in this case. Presumably, most developers have a working coreutils installation. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: anders.roxell@linaro.org Cc: linux-kernel@vger.kernel.rg Cc: naresh.kamboju@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org>