x86: atomic64 assembly improvements

In the "xchg" implementation, %ebx and %ecx don't need to be copied
into %eax and %edx respectively (this is only necessary when desiring
to only read the stored value).

In the "add_unless" implementation, swapping the use of %ecx and %esi
for passing arguments allows %esi to become an input only (i.e.
permitting the register to be re-used to address the same object
without reload).

In "{add,sub}_return", doing the initial read64 through the passed in
%ecx decreases a register dependency.

In "inc_not_zero", a branch can be eliminated by or-ing together the
two halves of the current (64-bit) value, and code size can be further
reduced by adjusting the arithmetic slightly.

v2: Undo the folding of "xchg" and "set".

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Link: http://lkml.kernel.org/r/4F19A2BC020000780006E0DC@nat28.tlf.novell.com
Cc: Luca Barbieri <luca@luca-barbieri.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
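The "inc_not_zero" remark can be illustrated with a rough C model. This is only a sketch: the function names below are hypothetical, the model is not atomic, and it is not the kernel's assembly (the diff shown on this page covers only "add_unless"). It shows why OR-ing the two 32-bit halves lets a single test decide whether the 64-bit value is zero.

```c
#include <stdint.h>

/* Hypothetical helper: zero test with one branch per 32-bit half. */
static int is_zero_two_branches(uint32_t lo, uint32_t hi)
{
	if (lo != 0)
		return 0;
	if (hi != 0)
		return 0;
	return 1;
}

/* Hypothetical helper: OR the halves, then test once.
 * (lo | hi) is zero exactly when both halves are zero. */
static int is_zero_or(uint32_t lo, uint32_t hi)
{
	return (lo | hi) == 0;
}

/* Non-atomic model of inc_not_zero: increment *v unless it is zero.
 * Returns 1 if the increment happened, 0 otherwise. */
static int inc_not_zero_model(uint64_t *v)
{
	uint32_t lo = (uint32_t)*v;
	uint32_t hi = (uint32_t)(*v >> 32);

	if (is_zero_or(lo, hi))
		return 0;
	*v += 1;
	return 1;
}
```

Both zero tests return the same result for every input; the OR form just needs one conditional jump instead of two.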
author: Jan Beulich <JBeulich@suse.com> 2012-01-20 17:22:04 +0100
committer: H. Peter Anvin <hpa@linux.intel.com> 2012-01-21 02:29:49 +0100
commit: cb8095bba6d24118135a5683a956f4f4fb5f17bb (patch)
tree: 25eff3732e8471e314591d0bc6ea41d96857c18b /arch/x86/lib/atomic64_386_32.S
parent: x86: Adjust asm constraints in atomic64 wrappers (diff)
1 files changed, 3 insertions, 3 deletions
diff --git a/arch/x86/lib/atomic64_386_32.S b/arch/x86/lib/atomic64_386_32.S
index e8e7e0d06f42..00933d5e992f 100644
--- a/arch/x86/lib/atomic64_386_32.S
+++ b/arch/x86/lib/atomic64_386_32.S
@@ -137,13 +137,13 @@ BEGIN(dec_return)
 RET_ENDP
 #undef v
 
-#define v %ecx
+#define v %esi
 BEGIN(add_unless)
-	addl %eax, %esi
+	addl %eax, %ecx
 	adcl %edx, %edi
 	addl  (v), %eax
 	adcl 4(v), %edx
-	cmpl %eax, %esi
+	cmpl %eax, %ecx
 	je 3f
 1:
 	movl %eax,  (v)
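As a reading aid for the hunk above, here is a rough C model of what the visible instructions compute. It is a sketch under assumptions, not kernel code: the name add_unless_model is hypothetical, the model is not atomic, and the part of the routine cut off after the store is assumed to compare the high halves in the same way as the low halves and to return nonzero only when the store happened. The mapping of the increment to %eax:%edx and of the "unless" value to %ecx:%edi follows from the instructions shown, with %esi (v) used only as the address.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, non-atomic model of add_unless: add a to *v unless
 * *v equals u. Returns true if the addition was stored. */
static bool add_unless_model(uint64_t *v, uint64_t a, uint64_t u)
{
	/* addl %eax, %ecx ; adcl %edx, %edi  ->  u + a */
	uint64_t u_plus_a = u + a;

	/* addl (v), %eax ; adcl 4(v), %edx   ->  *v + a */
	uint64_t sum = *v + a;

	/* Comparing the two sums is equivalent to comparing *v with u,
	 * because adding the same a (mod 2^64) to both preserves equality. */
	if (sum == u_plus_a)
		return false;

	/* movl %eax, (v) and the assumed matching store of the high half. */
	*v = sum;
	return true;
}
```

For example, with *v == 5, a == 3 and u == 5 the model leaves *v untouched and returns false, while u == 7 stores 8 and returns true.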