Diffstat (limited to 'Documentation/RCU/Design')
-rw-r--r-- | Documentation/RCU/Design/Data-Structures/Data-Structures.html | 49 |
-rw-r--r-- | Documentation/RCU/Design/Requirements/Requirements.html | 7 |
2 files changed, 38 insertions, 18 deletions
diff --git a/Documentation/RCU/Design/Data-Structures/Data-Structures.html b/Documentation/RCU/Design/Data-Structures/Data-Structures.html
index 38d6d800761f..6c06e10bd04b 100644
--- a/Documentation/RCU/Design/Data-Structures/Data-Structures.html
+++ b/Documentation/RCU/Design/Data-Structures/Data-Structures.html
@@ -1097,7 +1097,8 @@ will cause the CPU to disregard the values of its counters on
 its next exit from idle.
 Finally, the <tt>rcu_qs_ctr_snap</tt> field is used to detect
 cases where a given operation has resulted in a quiescent state
-for all flavors of RCU, for example, <tt>cond_resched_rcu_qs()</tt>.
+for all flavors of RCU, for example, <tt>cond_resched()</tt>
+when RCU has indicated a need for quiescent states.
 
 <h5>RCU Callback Handling</h5>
 
@@ -1182,8 +1183,8 @@ CPU (and from tracing) unless otherwise stated.
 Its fields are as follows:
 
 <pre>
-  1 int dynticks_nesting;
-  2 int dynticks_nmi_nesting;
+  1 long dynticks_nesting;
+  2 long dynticks_nmi_nesting;
   3 atomic_t dynticks;
   4 bool rcu_need_heavy_qs;
   5 unsigned long rcu_qs_ctr;
@@ -1191,15 +1192,31 @@ Its fields are as follows:
 </pre>
 
 <p>The <tt>->dynticks_nesting</tt> field counts the
-nesting depth of normal interrupts.
-In addition, this counter is incremented when exiting dyntick-idle
-mode and decremented when entering it.
+nesting depth of process execution, so that in normal circumstances
+this counter has value zero or one.
+NMIs, irqs, and tracers are counted by the <tt>->dynticks_nmi_nesting</tt>
+field.
+Because NMIs cannot be masked, changes to this variable have to be
+undertaken carefully using an algorithm provided by Andy Lutomirski.
+The initial transition from idle adds one, and nested transitions
+add two, so that a nesting level of five is represented by a
+<tt>->dynticks_nmi_nesting</tt> value of nine.
 This counter can therefore be thought of as counting the number
 of reasons why this CPU cannot be permitted to enter dyntick-idle
-mode, aside from non-maskable interrupts (NMIs).
-NMIs are counted by the <tt>->dynticks_nmi_nesting</tt>
-field, except that NMIs that interrupt non-dyntick-idle execution
-are not counted.
+mode, aside from process-level transitions.
+
+<p>However, it turns out that when running in non-idle kernel context,
+the Linux kernel is fully capable of entering interrupt handlers that
+never exit and perhaps also vice versa.
+Therefore, whenever the <tt>->dynticks_nesting</tt> field is
+incremented up from zero, the <tt>->dynticks_nmi_nesting</tt> field
+is set to a large positive number, and whenever the
+<tt>->dynticks_nesting</tt> field is decremented down to zero,
+the the <tt>->dynticks_nmi_nesting</tt> field is set to zero.
+Assuming that the number of misnested interrupts is not sufficient
+to overflow the counter, this approach corrects the
+<tt>->dynticks_nmi_nesting</tt> field every time the corresponding
+CPU enters the idle loop from process context.
 
 </p><p>The <tt>->dynticks</tt> field counts the corresponding
 CPU's transitions to and from dyntick-idle mode, so that this counter
@@ -1231,14 +1248,16 @@ in response.
 <tr><th> </th></tr>
 <tr><th align="left">Quick Quiz:</th></tr>
 <tr><td>
-	Why not just count all NMIs?
-	Wouldn't that be simpler and less error prone?
+	Why not simply combine the <tt>->dynticks_nesting</tt>
+	and <tt>->dynticks_nmi_nesting</tt> counters into a
+	single counter that just counts the number of reasons that
+	the corresponding CPU is non-idle?
 </td></tr>
 <tr><th align="left">Answer:</th></tr>
 <tr><td bgcolor="#ffffff"><font color="ffffff">
-	It seems simpler only until you think hard about how to go about
-	updating the <tt>rcu_dynticks</tt> structure's
-	<tt>->dynticks</tt> field.
+	Because this would fail in the presence of interrupts whose
+	handlers never return and of handlers that manage to return
+	from a made-up interrupt.
 </font></td></tr>
 <tr><td> </td></tr>
 </table>
diff --git a/Documentation/RCU/Design/Requirements/Requirements.html b/Documentation/RCU/Design/Requirements/Requirements.html
index 62e847bcdcdd..49690228b1c6 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.html
+++ b/Documentation/RCU/Design/Requirements/Requirements.html
@@ -581,7 +581,8 @@ This guarantee was only partially premeditated.
 DYNIX/ptx used an explicit memory barrier for publication, but had nothing
 resembling <tt>rcu_dereference()</tt> for subscription, nor did it
 have anything resembling the <tt>smp_read_barrier_depends()</tt>
-that was later subsumed into <tt>rcu_dereference()</tt>.
+that was later subsumed into <tt>rcu_dereference()</tt> and later
+still into <tt>READ_ONCE()</tt>.
 The need for these operations made itself known quite suddenly at a
 late-1990s meeting with the DEC Alpha architects, back in the days when
 DEC was still a free-standing company.
@@ -2797,7 +2798,7 @@ RCU must avoid degrading real-time response for CPU-bound threads, whether
 executing in usermode (which is one use case for
 <tt>CONFIG_NO_HZ_FULL=y</tt>) or in the kernel.
 That said, CPU-bound loops in the kernel must execute
-<tt>cond_resched_rcu_qs()</tt> at least once per few tens of milliseconds
+<tt>cond_resched()</tt> at least once per few tens of milliseconds
 in order to avoid receiving an IPI from RCU.
 
 <p>
@@ -3128,7 +3129,7 @@ The solution, in the form of
 is to have implicit
 read-side critical sections that are delimited by voluntary context
 switches, that is, calls to <tt>schedule()</tt>,
-<tt>cond_resched_rcu_qs()</tt>, and
+<tt>cond_resched()</tt>, and
 <tt>synchronize_rcu_tasks()</tt>.
 In addition, transitions to and from userspace execution also delimit
 tasks-RCU read-side critical sections.
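
Note on the ->dynticks_nmi_nesting arithmetic described in the first hunk of Data-Structures.html: the "add one from idle, add two when nested" rule can be checked with a small userspace model. The sketch below is illustrative only and is not the kernel's code. The struct and the model_* function names are hypothetical, and the model assumes that a counter value of zero stands for "idle", whereas the kernel makes that decision from other per-CPU state.

```c
#include <assert.h>

/*
 * Illustrative userspace model only. All names here are hypothetical
 * and do not exist in the kernel.
 */
struct nesting_model {
	long nmi_nesting;	/* stands in for ->dynticks_nmi_nesting */
};

/*
 * Enter an NMI, irq, or tracer.
 * Assumption: a zero counter means the CPU was idle, so the first
 * entry adds one and every nested entry adds two.
 */
static void model_enter(struct nesting_model *m)
{
	m->nmi_nesting += (m->nmi_nesting == 0) ? 1 : 2;
}

/* Exit: undo the matching increment, returning to zero from one. */
static void model_exit(struct nesting_model *m)
{
	assert(m->nmi_nesting > 0);
	if (m->nmi_nesting == 1)
		m->nmi_nesting = 0;
	else
		m->nmi_nesting -= 2;
}

/* Closed form: nesting level n >= 1 is represented by 2n - 1. */
static long model_value_for_level(long level)
{
	return level > 0 ? 2 * level - 1 : 0;
}
```

Calling model_enter() five times on a zeroed model walks the counter through 1, 3, 5, 7, 9, which matches the patch text's statement that a nesting level of five is represented by a value of nine. In this model the counter is odd exactly when at least one handler is active.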