author     Mauro Carvalho Chehab <mchehab+samsung@kernel.org>    2019-04-14 20:51:10 +0200
committer  Mauro Carvalho Chehab <mchehab+samsung@kernel.org>    2019-07-15 14:20:24 +0200
commit     dc7a12bdfccd94c31f79e294f16f7549bd411b49 (patch)
tree       81da5ca148347b94c4539234f50d4bca6465e2f8 /Documentation/arm/cluster-pm-race-avoidance.txt
parent     docs: early-userspace: convert docs to ReST and rename to *.rst (diff)
download   linux-dc7a12bdfccd94c31f79e294f16f7549bd411b49.tar.xz
           linux-dc7a12bdfccd94c31f79e294f16f7549bd411b49.zip
docs: arm: convert docs to ReST and rename to *.rst
Converts the ARM text files to ReST, preparing them to become an
architecture book.
The conversion is actually:
- add blank lines and indentation in order to identify paragraphs;
- fix table markups;
- add some lists markups;
- mark literal blocks;
- adjust title markups.
In the new index.rst, add an :orphan: marker while it is not yet linked
from the main index.rst file, in order to avoid build warnings.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Reviewed-by: Corentin Labbe <clabbe.montjoie@gmail.com> # For sun4i-ss
Diffstat (limited to 'Documentation/arm/cluster-pm-race-avoidance.txt')
-rw-r--r-- | Documentation/arm/cluster-pm-race-avoidance.txt | 498 |
1 file changed, 0 insertions, 498 deletions
diff --git a/Documentation/arm/cluster-pm-race-avoidance.txt b/Documentation/arm/cluster-pm-race-avoidance.txt
deleted file mode 100644
index 750b6fc24af9..000000000000
--- a/Documentation/arm/cluster-pm-race-avoidance.txt
+++ /dev/null
@@ -1,498 +0,0 @@

Cluster-wide Power-up/power-down race avoidance algorithm
=========================================================

This file documents the algorithm which is used to coordinate CPU and
cluster setup and teardown operations and to manage hardware coherency
controls safely.

The section "Rationale" explains what the algorithm is for and why it is
needed.  "Basic model" explains general concepts using a simplified view
of the system.  The other sections explain the actual details of the
algorithm in use.


Rationale
---------

In a system containing multiple CPUs, it is desirable to have the
ability to turn off individual CPUs when the system is idle, reducing
power consumption and thermal dissipation.

In a system containing multiple clusters of CPUs, it is also desirable
to have the ability to turn off entire clusters.

Turning entire clusters off and on is a risky business, because it
involves performing potentially destructive operations affecting a group
of independently running CPUs, while the OS continues to run.  This
means that we need some coordination in order to ensure that critical
cluster-level operations are only performed when it is truly safe to do
so.

Simple locking may not be sufficient to solve this problem, because
mechanisms like Linux spinlocks may rely on coherency mechanisms which
are not immediately enabled when a cluster powers up.  Since enabling or
disabling those mechanisms may itself be a non-atomic operation (such as
writing some hardware registers and invalidating large caches), other
methods of coordination are required in order to guarantee safe
power-down and power-up at the cluster level.

The mechanism presented in this document describes a coherent memory
based protocol for performing the needed coordination.  It aims to be as
lightweight as possible, while providing the required safety properties.


Basic model
-----------

Each cluster and CPU is assigned a state, as follows:

    DOWN
    COMING_UP
    UP
    GOING_DOWN

        +---------> UP ----------+
        |                        v

    COMING_UP                GOING_DOWN

        ^                        |
        +--------- DOWN <--------+


DOWN:       The CPU or cluster is not coherent, and is either powered off or
            suspended, or is ready to be powered off or suspended.

COMING_UP:  The CPU or cluster has committed to moving to the UP state.
            It may be part way through the process of initialisation and
            enabling coherency.

UP:         The CPU or cluster is active and coherent at the hardware
            level.  A CPU in this state is not necessarily being used
            actively by the kernel.

GOING_DOWN: The CPU or cluster has committed to moving to the DOWN
            state.  It may be part way through the process of teardown and
            coherency exit.


Each CPU has one of these states assigned to it at any point in time.
The CPU states are described in the "CPU state" section, below.

Each cluster is also assigned a state, but it is necessary to split the
state value into two parts (the "cluster" state and "inbound" state) and
to introduce additional states in order to avoid races between different
CPUs in the cluster simultaneously modifying the state.  The
cluster-level states are described in the "Cluster state" section.
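Expressed in C, the basic model is just a four-state enumeration in which
each state has exactly one legal successor, following the cycle in the
diagram above.  The fragment below is purely illustrative; the type and
helper names are invented for this document and are not taken from the
kernel sources:

    #include <stdbool.h>

    /* Illustrative only: the four basic-model states. */
    enum basic_state {
            STATE_DOWN,
            STATE_COMING_UP,
            STATE_UP,
            STATE_GOING_DOWN,
    };

    /* Each state has exactly one legal successor, forming the cycle above. */
    static bool basic_transition_allowed(enum basic_state from, enum basic_state to)
    {
            switch (from) {
            case STATE_DOWN:        return to == STATE_COMING_UP;
            case STATE_COMING_UP:   return to == STATE_UP;
            case STATE_UP:          return to == STATE_GOING_DOWN;
            case STATE_GOING_DOWN:  return to == STATE_DOWN;
            }
            return false;
    }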
To help distinguish the CPU states from cluster states in this
discussion, the state names are given a CPU_ prefix for the CPU states,
and a CLUSTER_ or INBOUND_ prefix for the cluster states.


CPU state
---------

In this algorithm, each individual core in a multi-core processor is
referred to as a "CPU".  CPUs are assumed to be single-threaded:
therefore, a CPU can only be doing one thing at a single point in time.

This means that CPUs fit the basic model closely.

The algorithm defines the following states for each CPU in the system:

    CPU_DOWN
    CPU_COMING_UP
    CPU_UP
    CPU_GOING_DOWN

     cluster setup and
    CPU setup complete           policy decision
          +-----------> CPU_UP ------------+
          |                                v

    CPU_COMING_UP                   CPU_GOING_DOWN

          ^                                |
          +----------- CPU_DOWN <----------+
     policy decision          CPU teardown complete
    or hardware event


The definitions of the four states correspond closely to the states of
the basic model.

Transitions between states occur as follows.

A trigger event (spontaneous) means that the CPU can transition to the
next state as a result of making local progress only, with no
requirement for any external event to happen.


CPU_DOWN:

    A CPU reaches the CPU_DOWN state when it is ready for
    power-down.  On reaching this state, the CPU will typically
    power itself down or suspend itself, via a WFI instruction or a
    firmware call.

    Next state:     CPU_COMING_UP
    Conditions:     none

    Trigger events:

        a) an explicit hardware power-up operation, resulting
           from a policy decision on another CPU;

        b) a hardware event, such as an interrupt.


CPU_COMING_UP:

    A CPU cannot start participating in hardware coherency until the
    cluster is set up and coherent.  If the cluster is not ready,
    then the CPU will wait in the CPU_COMING_UP state until the
    cluster has been set up.

    Next state:     CPU_UP
    Conditions:     The CPU's parent cluster must be in CLUSTER_UP.
    Trigger events: Transition of the parent cluster to CLUSTER_UP.

    Refer to the "Cluster state" section for a description of the
    CLUSTER_UP state.


CPU_UP:

    When a CPU reaches the CPU_UP state, it is safe for the CPU to
    start participating in local coherency.

    This is done by jumping to the kernel's CPU resume code.

    Note that the definition of this state is slightly different
    from the basic model definition: CPU_UP does not mean that the
    CPU is coherent yet, but it does mean that it is safe to resume
    the kernel.  The kernel handles the rest of the resume
    procedure, so the remaining steps are not visible as part of the
    race avoidance algorithm.

    The CPU remains in this state until an explicit policy decision
    is made to shut down or suspend the CPU.

    Next state:     CPU_GOING_DOWN
    Conditions:     none
    Trigger events: explicit policy decision


CPU_GOING_DOWN:

    While in this state, the CPU exits coherency, including any
    operations required to achieve this (such as cleaning data
    caches).

    Next state:     CPU_DOWN
    Conditions:     local CPU teardown complete
    Trigger events: (spontaneous)
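Putting the four CPU states together, the power-down and power-up paths
of a single CPU look roughly like the sketch below.  This is a simplified
illustration only: the helper functions (write_cpu_state(),
read_cluster_state(), and so on) are hypothetical placeholders invented
for this document, not the kernel's actual MCPM interfaces, which are
listed under "Features and Limitations" at the end of this document.

    /*
     * Illustrative only: one CPU walking the state machine above.
     * All helpers are hypothetical placeholders, declared here just so
     * that the sketch is self-contained; they are not kernel APIs.
     */
    enum cpu_state     { CPU_DOWN, CPU_COMING_UP, CPU_UP, CPU_GOING_DOWN };
    enum cluster_state { CLUSTER_DOWN, CLUSTER_UP, CLUSTER_GOING_DOWN };

    extern void write_cpu_state(unsigned int cpu, enum cpu_state state);
    extern enum cluster_state read_cluster_state(unsigned int cluster);
    extern void cpu_exit_coherency(unsigned int cpu);
    extern void cpu_enter_lowpower(unsigned int cpu);
    extern void jump_to_kernel_resume_code(void);
    extern void relax(void);

    void cpu_power_down_path(unsigned int cpu)
    {
            /* Policy decision made: CPU_UP -> CPU_GOING_DOWN. */
            write_cpu_state(cpu, CPU_GOING_DOWN);

            /* Exit coherency, cleaning data caches as required. */
            cpu_exit_coherency(cpu);

            /* Teardown complete: CPU_GOING_DOWN -> CPU_DOWN. */
            write_cpu_state(cpu, CPU_DOWN);

            /* Now power off or suspend, e.g. via WFI or a firmware call. */
            cpu_enter_lowpower(cpu);
    }

    void cpu_power_up_path(unsigned int cpu, unsigned int cluster)
    {
            /* Woken by a power-up operation or hardware event. */
            write_cpu_state(cpu, CPU_COMING_UP);

            /* Wait until the cluster has been set up by the first man. */
            while (read_cluster_state(cluster) != CLUSTER_UP)
                    relax();

            /* Safe to resume: CPU_COMING_UP -> CPU_UP. */
            write_cpu_state(cpu, CPU_UP);
            jump_to_kernel_resume_code();
    }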
Cluster state
-------------

A cluster is a group of connected CPUs with some common resources.
Because a cluster contains multiple CPUs, it can be doing multiple
things at the same time.  This has some implications.  In particular, a
CPU can start up while another CPU is tearing the cluster down.

In this discussion, the "outbound side" is the view of the cluster state
as seen by a CPU tearing the cluster down.  The "inbound side" is the
view of the cluster state as seen by a CPU setting the cluster up.

In order to enable safe coordination in such situations, it is important
that a CPU which is setting up the cluster can advertise its state
independently of the CPU which is tearing down the cluster.  For this
reason, the cluster state is split into two parts:

    "cluster" state: The global state of the cluster; or the state
        on the outbound side:

        CLUSTER_DOWN
        CLUSTER_UP
        CLUSTER_GOING_DOWN

    "inbound" state: The state of the cluster on the inbound side.

        INBOUND_NOT_COMING_UP
        INBOUND_COMING_UP


The different pairings of these states result in six possible
states for the cluster as a whole:

                                 CLUSTER_UP
              +==========> INBOUND_NOT_COMING_UP -------------+
              #                                               |
                                                              |
         CLUSTER_UP     <----+                                |
      INBOUND_COMING_UP      |                                v

              ^             CLUSTER_GOING_DOWN       CLUSTER_GOING_DOWN
              #              INBOUND_COMING_UP <=== INBOUND_NOT_COMING_UP

        CLUSTER_DOWN         |                                |
      INBOUND_COMING_UP <----+                                |
                                                              |
              ^                                               |
              +===========     CLUSTER_DOWN      <------------+
                           INBOUND_NOT_COMING_UP

Transitions -----> can only be made by the outbound CPU, and
only involve changes to the "cluster" state.

Transitions ===##> can only be made by the inbound CPU, and only
involve changes to the "inbound" state, except where there is no
further transition possible on the outbound side (i.e., the
outbound CPU has put the cluster into the CLUSTER_DOWN state).

The race avoidance algorithm does not provide a way to determine
which exact CPUs within the cluster play these roles.  This must
be decided in advance by some other means.  Refer to the section
"Last man and first man selection" for more explanation.


CLUSTER_DOWN/INBOUND_NOT_COMING_UP is the only state where the
cluster can actually be powered down.

The parallelism of the inbound and outbound CPUs is observed by
the existence of two different paths from CLUSTER_GOING_DOWN/
INBOUND_NOT_COMING_UP (corresponding to GOING_DOWN in the basic
model) to CLUSTER_DOWN/INBOUND_COMING_UP (corresponding to
COMING_UP in the basic model).  The second path avoids cluster
teardown completely.

CLUSTER_UP/INBOUND_COMING_UP is equivalent to UP in the basic
model.  The final transition to CLUSTER_UP/INBOUND_NOT_COMING_UP
is trivial and merely resets the state machine ready for the
next cycle.
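Splitting the state in this way implies that the "cluster" word, the
"inbound" word, and each per-CPU state word must be independently
writable in coherent memory.  One possible layout is sketched below;
this is an assumption-laden illustration (the field names, the alignment
constant and the CPU count are invented here), not the layout actually
used by the kernel's MCPM code:

    /*
     * Illustrative only: one possible coherent-memory layout for the
     * split cluster state.  Each field is written by exactly one side
     * (outbound, inbound, or an individual CPU) and is padded out to an
     * assumed cache-writeback granule so that cache maintenance on one
     * field cannot clobber another.  Not the kernel's actual layout.
     */
    #define SYNC_ALIGN       64     /* assumed writeback granule */
    #define CPUS_PER_CLUSTER 4      /* assumed cluster size */

    struct cpu_sync {
            unsigned int state __attribute__((aligned(SYNC_ALIGN)));
    };

    struct cluster_sync {
            /* One entry per CPU; each written only by the CPU it describes. */
            struct cpu_sync cpus[CPUS_PER_CLUSTER];

            /* "cluster" state: written only by the outbound side. */
            unsigned int cluster __attribute__((aligned(SYNC_ALIGN)));

            /* "inbound" state: written only by the inbound side. */
            unsigned int inbound __attribute__((aligned(SYNC_ALIGN)));
    };

With a layout along these lines, the outbound CPU can clean its own
state out of its caches without risking overwriting the inbound CPU's
most recent update, and vice versa.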
Details of the allowable transitions follow.

The next state in each case is notated

    <cluster state>/<inbound state> (<transitioner>)

where the <transitioner> is the side on which the transition
can occur; either the inbound or the outbound side.


CLUSTER_DOWN/INBOUND_NOT_COMING_UP:

    Next state:     CLUSTER_DOWN/INBOUND_COMING_UP (inbound)
    Conditions:     none
    Trigger events:

        a) an explicit hardware power-up operation, resulting
           from a policy decision on another CPU;

        b) a hardware event, such as an interrupt.


CLUSTER_DOWN/INBOUND_COMING_UP:

    In this state, an inbound CPU sets up the cluster, including
    enabling of hardware coherency at the cluster level and any
    other operations (such as cache invalidation) which are required
    in order to achieve this.

    The purpose of this state is to do sufficient cluster-level
    setup to enable other CPUs in the cluster to enter coherency
    safely.

    Next state:     CLUSTER_UP/INBOUND_COMING_UP (inbound)
    Conditions:     cluster-level setup and hardware coherency complete
    Trigger events: (spontaneous)


CLUSTER_UP/INBOUND_COMING_UP:

    Cluster-level setup is complete and hardware coherency is
    enabled for the cluster.  Other CPUs in the cluster can safely
    enter coherency.

    This is a transient state, leading immediately to
    CLUSTER_UP/INBOUND_NOT_COMING_UP.  All other CPUs in the cluster
    should treat these two states as equivalent.

    Next state:     CLUSTER_UP/INBOUND_NOT_COMING_UP (inbound)
    Conditions:     none
    Trigger events: (spontaneous)


CLUSTER_UP/INBOUND_NOT_COMING_UP:

    Cluster-level setup is complete and hardware coherency is
    enabled for the cluster.  Other CPUs in the cluster can safely
    enter coherency.

    The cluster will remain in this state until a policy decision is
    made to power the cluster down.

    Next state:     CLUSTER_GOING_DOWN/INBOUND_NOT_COMING_UP (outbound)
    Conditions:     none
    Trigger events: policy decision to power down the cluster


CLUSTER_GOING_DOWN/INBOUND_NOT_COMING_UP:

    An outbound CPU is tearing the cluster down.  The selected CPU
    must wait in this state until all CPUs in the cluster are in the
    CPU_DOWN state.

    When all CPUs are in the CPU_DOWN state, the cluster can be torn
    down, for example by cleaning data caches and exiting
    cluster-level coherency.

    To avoid unnecessary teardown operations, the outbound CPU
    should check the inbound cluster state for asynchronous
    transitions to INBOUND_COMING_UP.  Alternatively, individual
    CPUs can be checked for entry into CPU_COMING_UP or CPU_UP.


    Next states:

    CLUSTER_DOWN/INBOUND_NOT_COMING_UP (outbound)
        Conditions:     cluster torn down and ready to power off
        Trigger events: (spontaneous)

    CLUSTER_GOING_DOWN/INBOUND_COMING_UP (inbound)
        Conditions:     none
        Trigger events:

            a) an explicit hardware power-up operation,
               resulting from a policy decision on another
               CPU;

            b) a hardware event, such as an interrupt.


CLUSTER_GOING_DOWN/INBOUND_COMING_UP:

    The cluster is (or was) being torn down, but another CPU has
    come online in the meantime and is trying to set up the cluster
    again.

    If the outbound CPU observes this state, it has two choices:

        a) back out of teardown, restoring the cluster to the
           CLUSTER_UP state;

        b) finish tearing the cluster down and put the cluster
           in the CLUSTER_DOWN state; the inbound CPU will
           set up the cluster again from there.

    Choice (a) permits the removal of some latency by avoiding
    unnecessary teardown and setup operations in situations where
    the cluster is not really going to be powered down.


    Next states:

    CLUSTER_UP/INBOUND_COMING_UP (outbound)
        Conditions:     cluster-level setup and hardware
                        coherency complete
        Trigger events: (spontaneous)

    CLUSTER_DOWN/INBOUND_COMING_UP (outbound)
        Conditions:     cluster torn down and ready to power off
        Trigger events: (spontaneous)


Last man and First man selection
--------------------------------

The CPU which performs cluster tear-down operations on the outbound side
is commonly referred to as the "last man".

The CPU which performs cluster setup on the inbound side is commonly
referred to as the "first man".

The race avoidance algorithm documented above does not provide a
mechanism to choose which CPUs should play these roles.


Last man:

When shutting down the cluster, all the CPUs involved are initially
executing Linux and hence coherent.  Therefore, ordinary spinlocks can
be used to select a last man safely, before the CPUs become
non-coherent.
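For example, a last man could be chosen with nothing more than a shared
counter protected by an ordinary spinlock, since coherency is still
enabled at this point.  The sketch below is illustrative only; the lock,
the counter and the helper name are invented for this document and are
not taken from the kernel sources:

    #include <linux/spinlock.h>
    #include <linux/types.h>

    /* Illustrative only: pick a last man while all CPUs are still coherent. */
    static DEFINE_SPINLOCK(cluster_users_lock);
    static int cluster_users = 4;       /* assumed number of CPUs in the cluster */

    static bool become_last_man(void)
    {
            bool last_man;

            spin_lock(&cluster_users_lock);
            last_man = (--cluster_users == 0);
            spin_unlock(&cluster_users_lock);

            /*
             * Only the last man proceeds to the outbound cluster teardown
             * (CLUSTER_GOING_DOWN); every other CPU only tears itself down.
             */
            return last_man;
    }

This only works while coherency is still enabled; once CPUs start
tearing down coherency, spinlocks can no longer be relied upon, which is
why the first man side, described below, needs a different mechanism.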
First man:

Because CPUs may power up asynchronously in response to external wake-up
events, a dynamic mechanism is needed to make sure that only one CPU
attempts to play the first man role and do the cluster-level
initialisation: any other CPUs must wait for this to complete before
proceeding.

Cluster-level initialisation may involve actions such as configuring
coherency controls in the bus fabric.

The current implementation in mcpm_head.S uses a separate mutual exclusion
mechanism to do this arbitration.  This mechanism is documented in
detail in vlocks.txt.


Features and Limitations
------------------------

Implementation:

    The current ARM-based implementation is split between
    arch/arm/common/mcpm_head.S (low-level inbound CPU operations) and
    arch/arm/common/mcpm_entry.c (everything else):

    __mcpm_cpu_going_down() signals the transition of a CPU to the
        CPU_GOING_DOWN state.

    __mcpm_cpu_down() signals the transition of a CPU to the CPU_DOWN
        state.

    A CPU transitions to CPU_COMING_UP and then to CPU_UP via the
        low-level power-up code in mcpm_head.S.  This could
        involve CPU-specific setup code, but in the current
        implementation it does not.

    __mcpm_outbound_enter_critical() and __mcpm_outbound_leave_critical()
        handle transitions from CLUSTER_UP to CLUSTER_GOING_DOWN
        and from there to CLUSTER_DOWN or back to CLUSTER_UP (in
        the case of an aborted cluster power-down).

        These functions are more complex than the __mcpm_cpu_*()
        functions due to the extra inter-CPU coordination which
        is needed for safe transitions at the cluster level.

    A cluster transitions from CLUSTER_DOWN back to CLUSTER_UP via
        the low-level power-up code in mcpm_head.S.  This
        typically involves platform-specific setup code,
        provided by the platform-specific power_up_setup
        function registered via mcpm_sync_init.

Deep topologies:

    As currently described and implemented, the algorithm does not
    support CPU topologies involving more than two levels (i.e.,
    clusters of clusters are not supported).  The algorithm could be
    extended by replicating the cluster-level states for the
    additional topological levels, and modifying the transition
    rules for the intermediate (non-outermost) cluster levels.


Colophon
--------

Originally created and documented by Dave Martin for Linaro Limited, in
collaboration with Nicolas Pitre and Achin Gupta.

Copyright (C) 2012-2013 Linaro Limited
Distributed under the terms of Version 2 of the GNU General Public
License, as defined in linux/COPYING.