diff options
author | Paul E. McKenney <paulmck@linux.vnet.ibm.com> | 2018-04-11 06:17:56 +0200 |
---|---|---|
committer | Paul E. McKenney <paulmck@linux.vnet.ibm.com> | 2018-05-15 19:29:07 +0200 |
commit | 9036c2ffd596261d2067fc2d693dc4f0d7a51214 (patch) | |
tree | d17b25e148a4d22f6e309617bbdbf5cde5dc45aa /kernel/rcu | |
parent | Linux 4.17-rc1 (diff) | |
download | linux-9036c2ffd596261d2067fc2d693dc4f0d7a51214.tar.xz linux-9036c2ffd596261d2067fc2d693dc4f0d7a51214.zip |
rcu: Improve non-root rcu_cbs_completed() accuracy
When rcu_cbs_completed() is invoked on a non-root rcu_node structure,
it unconditionally assumes that two grace periods must complete before
the callbacks at hand can be invoked. This is overly conservative because
if that non-root rcu_node structure believes that no grace period is in
progress, and if the corresponding rcu_state structure's ->gpnum field
has not yet been incremented, then these callbacks may safely be invoked
after only one grace period has completed.
This change is required to permit grace-period start requests to use
funnel locking, which is in turn permitted to reduce root rcu_node ->lock
contention, which has been observed by Nick Piggin. Furthermore, such
contention will likely be increased by the merging of RCU-bh, RCU-preempt,
and RCU-sched, so it makes sense to take steps to decrease it.
This commit therefore improves the accuracy of rcu_cbs_completed() when
invoked on a non-root rcu_node structure as described above.
Reported-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Nicholas Piggin <npiggin@gmail.com>
Diffstat (limited to 'kernel/rcu')
-rw-r--r-- | kernel/rcu/tree.c | 15 |
1 files changed, 15 insertions, 0 deletions
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index 2a734692a581..f5ca72f2ed43 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1642,6 +1642,21 @@ static unsigned long rcu_cbs_completed(struct rcu_state *rsp, return rnp->completed + 1; /* + * If the current rcu_node structure believes that RCU is + * idle, and if the rcu_state structure does not yet reflect + * the start of a new grace period, then the next grace period + * will suffice. The memory barrier is needed to accurately + * sample the rsp->gpnum, and pairs with the second lock + * acquisition in rcu_gp_init(), which is augmented with + * smp_mb__after_unlock_lock() for this purpose. + */ + if (rnp->gpnum == rnp->completed) { + smp_mb(); /* See above block comment. */ + if (READ_ONCE(rsp->gpnum) == rnp->completed) + return rnp->completed + 1; + } + + /* * Otherwise, wait for a possible partial grace period and * then the subsequent full grace period. */ |