docs: arm: convert docs to ReST and rename to *.rst

Converts the ARM text files to ReST, preparing them to be an architecture book.

The conversion is actually:
- add blank lines and indentation in order to identify paragraphs;
- fix tables markups;
- add some lists markups;
- mark literal blocks;
- adjust title markups.

At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Reviewed-by: Corentin Labbe <clabbe.montjoie@gmail.com> # For sun4i-ss
author: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> 2019-04-14 20:51:10 +0200
committer: Mauro Carvalho Chehab <mchehab+samsung@kernel.org> 2019-07-15 14:20:24 +0200
commit: dc7a12bdfccd94c31f79e294f16f7549bd411b49 (patch)
tree: 81da5ca148347b94c4539234f50d4bca6465e2f8 /Documentation/arm/vlocks.txt
parent: docs: early-userspace: convert docs to ReST and rename to *.rst (diff)
download: linux-dc7a12bdfccd94c31f79e294f16f7549bd411b49.tar.xz linux-dc7a12bdfccd94c31f79e294f16f7549bd411b49.zip
1 files changed, 0 insertions, 211 deletions
diff --git a/Documentation/arm/vlocks.txt b/Documentation/arm/vlocks.txt
deleted file mode 100644
index 45731672c564..000000000000
--- a/Documentation/arm/vlocks.txt
+++ /dev/null
@@ -1,211 +0,0 @@
-vlocks for Bare-Metal Mutual Exclusion
-======================================
-
-Voting Locks, or "vlocks", provide a simple low-level mutual exclusion
-mechanism, with reasonable but minimal requirements on the memory
-system.
-
-These are intended to be used to coordinate critical activity among CPUs
-which are otherwise non-coherent, in situations where the hardware
-provides no other mechanism to support this and ordinary spinlocks
-cannot be used.
-
-
-vlocks make use of the atomicity provided by the memory system for
-writes to a single memory location.  To arbitrate, every CPU "votes for
-itself", by storing a unique number to a common memory location.  The
-final value seen in that memory location when all the votes have been
-cast identifies the winner.
-
-In order to make sure that the election produces an unambiguous result
-in finite time, a CPU will only enter the election in the first place if
-no winner has been chosen and the election does not appear to have
-started yet.
-
-
-Algorithm
----------
-
-The easiest way to explain the vlocks algorithm is with some pseudo-code:
-
-
-	int currently_voting[NR_CPUS] = { 0, };
-	int last_vote = -1; /* no votes yet */
-
-	bool vlock_trylock(int this_cpu)
-	{
-		/* signal our desire to vote */
-		currently_voting[this_cpu] = 1;
-		if (last_vote != -1) {
-			/* someone already volunteered himself */
-			currently_voting[this_cpu] = 0;
-			return false; /* not ourself */
-		}
-
-		/* let's suggest ourself */
-		last_vote = this_cpu;
-		currently_voting[this_cpu] = 0;
-
-		/* then wait until everyone else is done voting */
-		for_each_cpu(i) {
-			while (currently_voting[i] != 0)
-				/* wait */;
-		}
-
-		/* result */
-		if (last_vote == this_cpu)
-			return true; /* we won */
-		return false;
-	}
-
-	void vlock_unlock(void)
-	{
-		last_vote = -1;
-	}
-
-
-The currently_voting[] array provides a way for the CPUs to determine
-whether an election is in progress, and plays a role analogous to the
-"entering" array in Lamport's bakery algorithm [1].
-
-However, once the election has started, the underlying memory system
-atomicity is used to pick the winner.  This avoids the need for a static
-priority rule to act as a tie-breaker, or any counters which could
-overflow.
-
-As long as the last_vote variable is globally visible to all CPUs, it
-will contain only one value that won't change once every CPU has cleared
-its currently_voting flag.
-
-
-Features and limitations
-------------------------
-
- * vlocks are not intended to be fair.  In the contended case, the
-   _last_ CPU to attempt to take the lock is the one most likely to
-   win.
-
-   vlocks are therefore best suited to situations where it is necessary
-   to pick a unique winner, but it does not matter which CPU actually
-   wins.
-
- * Like other similar mechanisms, vlocks will not scale well to a large
-   number of CPUs.
-
-   vlocks can be cascaded in a voting hierarchy to permit better scaling
-   if necessary, as in the following hypothetical example for 4096 CPUs:
-
-	/* first level: local election */
-	my_town = towns[this_cpu >> 4];
-	I_won = vlock_trylock(my_town, this_cpu & 0xf);
-	if (I_won) {
-		/* we won the town election, let's go for the state */
-		my_state = states[this_cpu >> 8];
-		I_won = vlock_trylock(my_state, (this_cpu >> 4) & 0xf);
-		if (I_won) {
-			/* and so on */
-			I_won = vlock_trylock(the_whole_country, (this_cpu >> 8) & 0xf);
-			if (I_won) {
-				/* ... */
-			}
-			vlock_unlock(the_whole_country);
-		}
-		vlock_unlock(my_state);
-	}
-	vlock_unlock(my_town);
-
-
-ARM implementation
-------------------
-
-The current ARM implementation [2] contains some optimisations beyond
-the basic algorithm:
-
- * By packing the members of the currently_voting array close together,
-   we can read the whole array in one transaction (providing the number
-   of CPUs potentially contending the lock is small enough).  This
-   reduces the number of round-trips required to external memory.
-
-   In the ARM implementation, this means that we can use a single load
-   and comparison:
-
-	LDR	Rt, [Rn]
-	CMP	Rt, #0
-
-   ...in place of code equivalent to:
-
-	LDRB	Rt, [Rn]
-	CMP	Rt, #0
-	LDRBEQ	Rt, [Rn, #1]
-	CMPEQ	Rt, #0
-	LDRBEQ	Rt, [Rn, #2]
-	CMPEQ	Rt, #0
-	LDRBEQ	Rt, [Rn, #3]
-	CMPEQ	Rt, #0
-
-   This cuts down on the fast-path latency, as well as potentially
-   reducing bus contention in contended cases.
-
-   The optimisation relies on the fact that the ARM memory system
-   guarantees coherency between overlapping memory accesses of
-   different sizes, similarly to many other architectures.  Note that
-   we do not care which element of currently_voting appears in which
-   bits of Rt, so there is no need to worry about endianness in this
-   optimisation.
-
-   If there are too many CPUs to read the currently_voting array in
-   one transaction then multiple transactions are still required.  The
-   implementation uses a simple loop of word-sized loads for this
-   case.  The number of transactions is still fewer than would be
-   required if bytes were loaded individually.
-
-
-   In principle, we could aggregate further by using LDRD or LDM, but
-   to keep the code simple this was not attempted in the initial
-   implementation.
-
-
- * vlocks are currently only used to coordinate between CPUs which are
-   unable to enable their caches yet.  This means that the
-   implementation removes many of the barriers which would be required
-   when executing the algorithm in cached memory.
-
-   Packing of the currently_voting array does not work with cached
-   memory unless all CPUs contending the lock are cache-coherent, due
-   to cache writebacks from one CPU clobbering values written by other
-   CPUs.  (Though if all the CPUs are cache-coherent, you should
-   probably be using proper spinlocks instead anyway.)
-
-
- * The "no votes yet" value used for the last_vote variable is 0 (not
-   -1 as in the pseudocode).  This allows statically-allocated vlocks
-   to be implicitly initialised to an unlocked state simply by putting
-   them in .bss.
-
-   An offset is added to each CPU's ID for the purpose of setting this
-   variable, so that no CPU uses the value 0 for its ID.
-
-
-Colophon
---------
-
-Originally created and documented by Dave Martin for Linaro Limited, for
-use in ARM-based big.LITTLE platforms, with review and input gratefully
-received from Nicolas Pitre and Achin Gupta.  Thanks to Nicolas for
-grabbing most of this text out of the relevant mail thread and writing
-up the pseudocode.
-
-Copyright (C) 2012-2013  Linaro Limited
-Distributed under the terms of Version 2 of the GNU General Public
-License, as defined in linux/COPYING.
-
-
-References
-----------
-
-[1] Lamport, L. "A New Solution of Dijkstra's Concurrent Programming
-    Problem", Communications of the ACM 17, 8 (August 1974), 453-455.
-
-    https://en.wikipedia.org/wiki/Lamport%27s_bakery_algorithm
-
-[2] linux/arch/arm/common/vlock.S, www.kernel.org.
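
Note on the pseudocode in the removed file: the vlock_trylock()/vlock_unlock() pair can be sketched in portable C11 roughly as follows. This is only an illustration under stated assumptions, not the kernel's vlock.S code. The names struct vlock, vlock_init() and VLOCK_NO_VOTE, and the NR_CPUS value of 4, are hypothetical. Sequentially consistent atomics stand in for the uncached accesses and barriers that a real bare-metal implementation would need, and each caller is assumed to pass a distinct this_cpu.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4            /* illustrative value, not taken from the kernel */
#define VLOCK_NO_VOTE (-1)   /* "no votes yet", as in the pseudocode */

struct vlock {
	atomic_int currently_voting[NR_CPUS];
	atomic_int last_vote;
};

void vlock_init(struct vlock *v)
{
	for (int i = 0; i < NR_CPUS; i++)
		atomic_init(&v->currently_voting[i], 0);
	atomic_init(&v->last_vote, VLOCK_NO_VOTE);
}

bool vlock_trylock(struct vlock *v, int this_cpu)
{
	/* signal our desire to vote */
	atomic_store(&v->currently_voting[this_cpu], 1);
	if (atomic_load(&v->last_vote) != VLOCK_NO_VOTE) {
		/* an election already started or a winner exists: back off */
		atomic_store(&v->currently_voting[this_cpu], 0);
		return false;
	}

	/* vote for ourselves, then announce that we are done voting */
	atomic_store(&v->last_vote, this_cpu);
	atomic_store(&v->currently_voting[this_cpu], 0);

	/* wait until everyone else is done voting */
	for (int i = 0; i < NR_CPUS; i++)
		while (atomic_load(&v->currently_voting[i]) != 0)
			; /* spin */

	/* the last vote written to memory identifies the winner */
	return atomic_load(&v->last_vote) == this_cpu;
}

void vlock_unlock(struct vlock *v)
{
	atomic_store(&v->last_vote, VLOCK_NO_VOTE);
}
```

In this sketch only the CPU whose trylock returned true is expected to call vlock_unlock(); a loser that cleared last_vote would reopen the election while the winner still holds the lock.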
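
Note on the packed currently_voting array described under "ARM implementation" in the removed file: the difference between byte-wise and word-wise polling can be illustrated on a host machine with the C sketch below. It is an assumption-laden illustration rather than the code in vlock.S. The function names and NR_VOTERS are hypothetical, and memcpy() is used only as the portable way to express a single 32-bit read of four adjacent flag bytes.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* One flag byte per CPU; four CPUs is an illustrative choice. */
#define NR_VOTERS 4

/* Byte-at-a-time check: up to four separate loads and comparisons. */
bool anyone_voting_bytewise(const uint8_t flags[NR_VOTERS])
{
	for (int i = 0; i < NR_VOTERS; i++)
		if (flags[i] != 0)
			return true;
	return false;
}

/* Word-at-a-time check: one 32-bit read and one comparison. */
bool anyone_voting_wordwise(const uint8_t flags[NR_VOTERS])
{
	uint32_t word;

	/* copy the four flag bytes into one word; byte order is irrelevant */
	memcpy(&word, flags, sizeof(word));
	return word != 0;
}
```

Because the test is only "is any byte non-zero", it does not matter which flag ends up in which bits of the word, which matches the file's remark that endianness is not a concern for this optimisation.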