| Commit message (Collapse) | Author | Age | Files | Lines |
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney:
* Update RCU documentation.
* Miscellaneous fixes.
* Maintainership changes.
* Torture-test updates.
* Callback-offloading changes.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If there isn't a nohz_full= kernel parameter specified, then
tick_nohz_full_mask can legitimately be NULL. This can cause
problems when RCU's boot code tries to cpumask_or() this value into
rcu_nocb_mask. In addition, if NO_HZ_FULL_ALL=y, there is no point
in doing the cpumask_or() in the first place because this will cause
RCU_NOCB_CPU_ALL=y, which in turn will have all bits already set in
rcu_nocb_mask.
This commit therefore avoids the cpumask_or() if NO_HZ_FULL_ALL=y
and checks for !tick_nohz_full_running otherwise, this latter check
catching cases when there was no nohz_full= kernel parameter specified.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
| |\ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
'maintainers.2014.07.08b', 'nocbs.2014.07.07a' and 'torture.2014.07.07a' into HEAD
doc.2014.07.08a: Documentation updates.
fixes.2014.07.09a: Miscellaneous fixes.
maintainers.2014.07.08b: Maintainership updates.
nocbs.2014.07.07a: Callback-offloading fixes.
torture.2014.07.07a: Torture-test updates.
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Since the torture-test thread creation interface does not include
format string arguments, this commit makes sure the name can never be
accidentally processed as a format string.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Currently the post-processing complains about the lack of rcutorture
output when --buildonly is set and also emits misleading messages about
kernels being started and finishing. This commit suppresses these
complaints and messages.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
The CFcommon file must now be present, which makes using the current
scripts against old kernel versions cumbersome. This commit therefore
makes the CFcommon file be optional, so that old kernel versions can be
used with current torture scripts.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Enabling NO_HZ_FULL currently has the side effect of enabling callback
offloading on all CPUs. This results in lots of additional rcuo kthreads,
and can also increase context switching and wakeups, even in cases where
callback offloading is neither needed nor particularly desirable. This
commit therefore enables callback offloading on a given CPU only if
specifically requested at build time or boot time, or if that CPU has
been specifically designated (again, either at build time or boot time)
as a nohz_full CPU.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
| | | | |/
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
An 80-CPU system with a context-switch-heavy workload can require so
many NOCB kthread wakeups that the RCU grace-period kthreads spend several
tens of percent of a CPU just awakening things. This clearly will not
scale well: If you add enough CPUs, the RCU grace-period kthreads would
get behind, increasing grace-period latency.
To avoid this problem, this commit divides the NOCB kthreads into leaders
and followers, where the grace-period kthreads awaken the leaders each of
whom in turn awakens its followers. By default, the number of groups of
kthreads is the square root of the number of CPUs, but this default may
be overridden using the rcutree.rcu_nocb_leader_stride boot parameter.
This reduces the number of wakeups done per grace period by the RCU
grace-period kthread by the square root of the number of CPUs, but of
course by shifting those wakeups to the leaders. In addition, because
the leaders do grace periods on behalf of their respective followers,
the number of wakeups of the followers decreases by up to a factor of two.
Instead of being awakened once when new callbacks arrive and again
at the end of the grace period, the followers are awakened only at
the end of the grace period.
For a numerical example, in a 4096-CPU system, the grace-period kthread
would awaken 64 leaders, each of which would awaken its 63 followers
at the end of the grace period. This compares favorably with the 79
wakeups for the grace-period kthread on an 80-CPU system.
Reported-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
We can now designate reviewers in the MAINTAINERS file with the new
"R:" tag, so this commit teaches get_maintainers.pl to add their
email addresses.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Commit 51b1130eb582 ("rcutorture: Abstract rcu_torture_random()")
moved the file, so this commit updates the patterns.
Signed-off-by: Joe Perches <joe@perches.com>
cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Drop Dipankar Sarma at his request (https://lkml.org/lkml/2014/6/2/628),
add Josh Triplett based on long-term review, contributions, and
agreement to take on this role (https://lkml.org/lkml/2014/6/2/554).
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Adding Steven Rostedt, Mathieu Desnoyers, and Lai Jiangshan as designated
RCU reviewers based on recent emails:
o https://lkml.org/lkml/2014/6/2/578 (Steven)
o https://lkml.org/lkml/2014/6/2/621 (Mathieu)
o https://lkml.org/lkml/2014/6/3/897 (Lai)
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
| | | |/
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
A ksummit-discuss email thread looked at the difficulty recruiting
and retaining reviewers. Paul Walmsley also noted the need for patch
submitters to know who the key reviewers are and suggested adding an
"R:" tag to the MAINTAINERS file to record this information on a
per-subsystem basis. This commit does just that, and a subsequent
commit tags the designated reviewer for the RCU-related subsystems.
http://lists.linuxfoundation.org/pipermail/ksummit-discuss/2014-May/000830.html
Suggested-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This commit annotates rcu_report_unblock_qs_rnp() in order to fix the
following sparse warning:
kernel/rcu/tree_plugin.h:990:13: warning: context imbalance in 'rcu_report_unblock_qs_rnp' - unexpected unlock
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This commit annotates rcu_initiate_boost() fixes the following sparse
warning:
kernel/rcu/tree_plugin.h:1494:13: warning: context imbalance in 'rcu_initiate_boost' - unexpected unlock
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The __rcu_reclaim() function returned 0/1, which is not proper for a
function of type bool. This commit therefore converts to false/true.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The CONFIG_PROVE_RCU_DELAY Kconfig parameter doesn't appear to be very
effective at finding race conditions, so this commit removes it.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
[ paulmck: Remove definition and uses as noted by Paul Bolle. ]
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The __this_cpu_read() function produces better code than does
per_cpu_ptr() on both ARM and x86. For example, gcc (Ubuntu/Linaro
4.7.3-12ubuntu1) 4.7.3 produces the following:
ARMv7 per_cpu_ptr():
force_quiescent_state:
mov r3, sp @,
bic r1, r3, #8128 @ tmp171,,
ldr r2, .L98 @ tmp169,
bic r1, r1, #63 @ tmp170, tmp171,
ldr r3, [r0, #220] @ __ptr, rsp_6(D)->rda
ldr r1, [r1, #20] @ D.35903_68->cpu, D.35903_68->cpu
mov r6, r0 @ rsp, rsp
ldr r2, [r2, r1, asl #2] @ tmp173, __per_cpu_offset
add r3, r3, r2 @ tmp175, __ptr, tmp173
ldr r5, [r3, #12] @ rnp_old, D.29162_13->mynode
ARMv7 __this_cpu_read():
force_quiescent_state:
ldr r3, [r0, #220] @ rsp_7(D)->rda, rsp_7(D)->rda
mov r6, r0 @ rsp, rsp
add r3, r3, #12 @ __ptr, rsp_7(D)->rda,
ldr r5, [r2, r3] @ rnp_old, *D.29176_13
Using gcc 4.8.2:
x86_64 per_cpu_ptr():
movl %gs:cpu_number,%edx # cpu_number, pscr_ret__
movslq %edx, %rdx # pscr_ret__, pscr_ret__
movq __per_cpu_offset(,%rdx,8), %rdx # __per_cpu_offset, tmp93
movq %rdi, %r13 # rsp, rsp
movq 1000(%rdi), %rax # rsp_9(D)->rda, __ptr
movq 24(%rdx,%rax), %r12 # _15->mynode, rnp_old
x86_64 __this_cpu_read():
movq %rdi, %r13 # rsp, rsp
movq 1000(%rdi), %rax # rsp_9(D)->rda, rsp_9(D)->rda
movq %gs:24(%rax),%r12 # _10->mynode, rnp_old
Because this change produces significant benefits for these two very
diverse architectures, this commit makes this change.
Signed-off-by: Shan Wei <davidshan@tencent.com>
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Although NMI-based stack dumps are in principle more accurate, they are
also more likely to trigger deadlocks. This commit therefore replaces
all uses of trigger_all_cpu_backtrace() with rcu_dump_cpu_stacks(), so
that the CPU detecting an RCU CPU stall does the stack dumping.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Binding the grace-period kthreads to the timekeeping CPU resulted in
significant performance decreases for some workloads. For more detail,
see:
https://lkml.org/lkml/2014/6/3/395 for benchmark numbers
https://lkml.org/lkml/2014/6/4/218 for CPU statistics
It turns out that it is necessary to bind the grace-period kthreads
to the timekeeping CPU only when all but CPU 0 is a nohz_full CPU
on the one hand or if CONFIG_NO_HZ_FULL_SYSIDLE=y on the other.
In other cases, it suffices to bind the grace-period kthreads to the
set of non-nohz_full CPUs.
This commit therefore creates a tick_nohz_not_full_mask that is the
complement of tick_nohz_full_mask, and then binds the grace-period
kthread to the set of CPUs indicated by this new mask, which covers
the CONFIG_NO_HZ_FULL_SYSIDLE=n case. The CONFIG_NO_HZ_FULL_SYSIDLE=y
case still binds the grace-period kthreads to the timekeeping CPU.
This commit also includes the tick_nohz_full_enabled() check suggested
by Frederic Weisbecker.
Reported-by: Jet Chen <jet.chen@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: Created housekeeping_affine() and housekeeping_mask per
fweisbec feedback. ]
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
RCU priority boosting currently checks for boosting via a pointer in
task_struct. However, this is not needed: As Oleg noted, if the
rt_mutex is placed in the rcu_node instead of on the booster's stack,
the boostee can simply check it see if it owns the lock. This commit
makes this change, shrinking task_struct by one pointer and the kernel
by thirteen lines.
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The rcu_start_future_gp() function checks the current rcu_node's ->gpnum
and ->completed twice, once without ACCESS_ONCE() and once with it.
Which is pointless because we hold that rcu_node's ->lock at that point.
The intent was to check the current rcu_node structure and the root
rcu_node structure, the latter locklessly with ACCESS_ONCE(). This
commit therefore makes that change.
The reason that it is safe to locklessly check the root rcu_nodes's
->gpnum and ->completed fields is that we hold the current rcu_node's
->lock, which constrains the root rcu_node's ability to change its
->gpnum and ->completed fields. Of course, if there is a single rcu_node
structure, then rnp_root==rnp, and holding the lock prevents all changes.
If there is more than one rcu_node structure, then the code updates the
fields in the following order:
1. Increment rnp_root->gpnum to start new grace period.
2. Increment rnp->gpnum to initialize the current rcu_node,
continuing initialization for the new grace period.
3. Increment rnp_root->completed to end the current grace period.
4. Increment rnp->completed to continue cleaning up after the
old grace period.
So there are four possible combinations of relative values of these
four fields:
N N N N: RCU idle, new grace period must be initiated.
Although rnp_root->gpnum might be incremented immediately
after we check, that will just result in unnecessary work.
The grace period already started, and we try to start it.
N+1 N N N: RCU grace period just started. No further change is
possible because we hold rnp->lock, so the checks of
rnp_root->gpnum and rnp_root->completed are stable.
We know that our request for a future grace period will
be seen during grace-period cleanup.
N+1 N N+1 N: RCU grace period is ongoing. Because rnp->gpnum is
different than rnp->completed, we won't even look at
rnp_root->gpnum and rnp_root->completed, so the possible
concurrent change to rnp_root->completed does not matter.
We know that our request for a future grace period will
be seen during grace-period cleanup, which cannot pass
this rcu_node because we hold its ->lock.
N+1 N+1 N+1 N: RCU grace period has ended, but not yet been cleaned up.
Because rnp->gpnum is different than rnp->completed, we
won't look at rnp_root->gpnum and rnp_root->completed, so
the possible concurrent change to rnp_root->completed does
not matter. We know that our request for a future grace
period will be seen during grace-period cleanup, which
cannot pass this rcu_node because we hold its ->lock.
Therefore, despite initial appearances, the lockless check is safe.
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
[ paulmck: Update comment to say why the lockless check is safe. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The current approach to RCU priority boosting uses an rt_mutex strictly
for its priority-boosting side effects. The rt_mutex_init_proxy_locked()
function is used by the booster to initialize the lock as held by the
boostee. The booster then uses rt_mutex_lock() to acquire this rt_mutex,
which priority-boosts the boostee. When the boostee reaches the end
of its outermost RCU read-side critical section, it checks a field in
its task structure to see whether it has been boosted, and, if so, uses
rt_mutex_unlock() to release the rt_mutex. The booster can then go on
to boost the next task that is blocking the current RCU grace period.
But reasonable implementations of rt_mutex_unlock() might result in the
boostee referencing the rt_mutex's data after releasing it. But the
booster might have re-initialized the rt_mutex between the time that the
boostee released it and the time that it later referenced it. This is
clearly asking for trouble, so this commit introduces a completion that
forces the booster to wait until the boostee has completely finished with
the rt_mutex, thus avoiding the case where the booster is re-initializing
the rt_mutex before the last boostee's last reference to that rt_mutex.
This of course does introduce some overhead, but the priority-boosting
code paths are miles from any possible fastpath, and the overhead of
executing the completion will normally be quite small compared to the
overhead of priority boosting and deboosting, so this should be OK.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The m68k architecture aligns only to 16-bit boundaries, which can cause
the align-to-32-bits check in __call_rcu() to trigger. Because there is
currently no known potential need for more than one low-order bit, this
commit loosens the check to 16-bit boundaries.
Reported-by: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
RCU contains code of the following forms:
ACCESS_ONCE(x)++;
ACCESS_ONCE(x) += y;
ACCESS_ONCE(x) -= y;
Now these constructs do operate correctly, but they really result in a
pair of volatile accesses, one to do the load and another to do the store.
This can be confusing, as the casual reader might well assume that (for
example) gcc might generate a memory-to-memory add instruction for each
of these three cases. In fact, gcc will do no such thing. Also, there
is a good chance that the kernel will move to separate load and store
variants of ACCESS_ONCE(), and constructs like the above could easily
confuse both people and scripts attempting to make that sort of change.
Finally, most of RCU's read-modify-write uses of ACCESS_ONCE() really
only need the store to be volatile, so that the read-modify-write form
might be misleading.
This commit therefore changes the above forms in RCU so that each instance
of ACCESS_ONCE() either does a load or a store, but not both. In a few
cases, ACCESS_ONCE() was not critical, for example, for maintaining
statisitics. In these cases, ACCESS_ONCE() has been dispensed with
entirely.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
In kernels built with CONFIG_NO_HZ_FULL, tick_do_timer_cpu is constant
once boot completes. Thus, there is no need to wrap it in ACCESS_ONCE()
in code that is built only when CONFIG_NO_HZ_FULL. This commit therefore
removes the redundant ACCESS_ONCE().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Those two arrays are being passed to lockdep_init_map(), which expects
const char *, and are stored in lockdep_map the same way.
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The explicit local_irq_save() in __lock_task_sighand() is needed to avoid
a potential deadlock condition, as noted in a841796f11c90d53 (signal:
align __lock_task_sighand() irq disabling and RCU). However, someone
reading the code might be forgiven for concluding that this separate
local_irq_save() was completely unnecessary. This commit therefore adds
a comment referencing the shiny new block comment on rcu_read_unlock().
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
|
| | | |
| | | |
| | | |
| | | |
| | | | |
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
|
| | |/
| | |
| | |
| | |
| | |
| | | |
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
|
| | |
| | |
| | |
| | |
| | |
| | | |
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
It is possible to pair acquire and release barriers with other barriers,
so this commit adds them to the list in the SMP barrier pairing section.
Reported-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
[ paulmck: Updated pairing discussion as suggested by Peter Zijlstra. ]
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The kerneltrap.org site no longer works, so this commit updates it to
a working reference, namely gmane.
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
|
| |/
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This commit adds an example demonstrating that if a wake_up() doesn't
actually wake something up, no memory ordering is provided.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Commit ac1bea85781e (Make cond_resched() report RCU quiescent states)
fixed a problem where a CPU looping in the kernel with but one runnable
task would give RCU CPU stall warnings, even if the in-kernel loop
contained cond_resched() calls. Unfortunately, in so doing, it introduced
performance regressions in Anton Blanchard's will-it-scale "open1" test.
The problem appears to be not so much the increased cond_resched() path
length as an increase in the rate at which grace periods complete, which
increased per-update grace-period overhead.
This commit takes a different approach to fixing this bug, mainly by
moving the RCU-visible quiescent state from cond_resched() to
rcu_note_context_switch(), and by further reducing the check to a
simple non-zero test of a single per-CPU variable. However, this
approach requires that the force-quiescent-state processing send
resched IPIs to the offending CPUs. These will be sent only once
the grace period has reached an age specified by the boot/sysfs
parameter rcutree.jiffies_till_sched_qs, or once the grace period
reaches an age halfway to the point at which RCU CPU stall warnings
will be emitted, whichever comes first.
Reported-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Christoph Lameter <cl@gentwo.org>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
[ paulmck: Made rcu_momentary_dyntick_idle() as suggested by the
ktest build robot. Also fixed smp_mb() comment as noted by
Oleg Nesterov. ]
Merge with e552592e (Reduce overhead of cond_resched() checks for RCU)
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Currently, call_rcu() relies on implicit allocation and initialization
for the debug-objects handling of RCU callbacks. If you hammer the
kernel hard enough with Sasha's modified version of trinity, you can end
up with the sl*b allocators recursing into themselves via this implicit
call_rcu() allocation.
This commit therefore exports the debug_init_rcu_head() and
debug_rcu_head_free() functions, which permits the allocators to allocated
and pre-initialize the debug-objects information, so that there no longer
any need for call_rcu() to do that initialization, which in turn prevents
the recursion into the memory allocators.
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Looks-good-to: Christoph Lameter <cl@linux.com>
|
| | |
|
|\ \
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 bugfixes from Ted Ts'o:
"More bug fixes for ext4 -- most importantly, a fix for a bug
introduced in 3.15 that can end up triggering a file system corruption
error after a journal replay.
It shouldn't lead to any actual data corruption, but it is scary and
can force file systems to be remounted read-only, etc"
* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: fix potential null pointer dereference in ext4_free_inode
ext4: fix a potential deadlock in __ext4_es_shrink()
ext4: revert commit which was causing fs corruption after journal replays
ext4: disable synchronous transaction batching if max_batch_time==0
ext4: clarify ext4_error message in ext4_mb_generate_buddy_error()
ext4: clarify error count warning messages
ext4: fix unjournalled bg descriptor while initializing inode bitmap
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Fix potential null pointer dereferencing problem caused by e43bb4e612
("ext4: decrement free clusters/inodes counters when block group declared bad")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Ashish Sangwan <a.sangwan@samsung.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This fixes the following lockdep complaint:
[ INFO: possible circular locking dependency detected ]
3.16.0-rc2-mm1+ #7 Tainted: G O
-------------------------------------------------------
kworker/u24:0/4356 is trying to acquire lock:
(&(&sbi->s_es_lru_lock)->rlock){+.+.-.}, at: [<ffffffff81285fff>] __ext4_es_shrink+0x4f/0x2e0
but task is already holding lock:
(&ei->i_es_lock){++++-.}, at: [<ffffffff81286961>] ext4_es_insert_extent+0x71/0x180
which lock already depends on the new lock.
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&ei->i_es_lock);
lock(&(&sbi->s_es_lru_lock)->rlock);
lock(&ei->i_es_lock);
lock(&(&sbi->s_es_lru_lock)->rlock);
*** DEADLOCK ***
6 locks held by kworker/u24:0/4356:
#0: ("writeback"){.+.+.+}, at: [<ffffffff81071d00>] process_one_work+0x180/0x560
#1: ((&(&wb->dwork)->work)){+.+.+.}, at: [<ffffffff81071d00>] process_one_work+0x180/0x560
#2: (&type->s_umount_key#22){++++++}, at: [<ffffffff811a9c74>] grab_super_passive+0x44/0x90
#3: (jbd2_handle){+.+...}, at: [<ffffffff812979f9>] start_this_handle+0x189/0x5f0
#4: (&ei->i_data_sem){++++..}, at: [<ffffffff81247062>] ext4_map_blocks+0x132/0x550
#5: (&ei->i_es_lock){++++-.}, at: [<ffffffff81286961>] ext4_es_insert_extent+0x71/0x180
stack backtrace:
CPU: 0 PID: 4356 Comm: kworker/u24:0 Tainted: G O 3.16.0-rc2-mm1+ #7
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Workqueue: writeback bdi_writeback_workfn (flush-253:0)
ffffffff8213dce0 ffff880014b07538 ffffffff815df0bb 0000000000000007
ffffffff8213e040 ffff880014b07588 ffffffff815db3dd ffff880014b07568
ffff880014b07610 ffff88003b868930 ffff88003b868908 ffff88003b868930
Call Trace:
[<ffffffff815df0bb>] dump_stack+0x4e/0x68
[<ffffffff815db3dd>] print_circular_bug+0x1fb/0x20c
[<ffffffff810a7a3e>] __lock_acquire+0x163e/0x1d00
[<ffffffff815e89dc>] ? retint_restore_args+0xe/0xe
[<ffffffff815ddc7b>] ? __slab_alloc+0x4a8/0x4ce
[<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
[<ffffffff810a8707>] lock_acquire+0x87/0x120
[<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
[<ffffffff8128592d>] ? ext4_es_free_extent+0x5d/0x70
[<ffffffff815e6f09>] _raw_spin_lock+0x39/0x50
[<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
[<ffffffff8119760b>] ? kmem_cache_alloc+0x18b/0x1a0
[<ffffffff81285fff>] __ext4_es_shrink+0x4f/0x2e0
[<ffffffff812869b8>] ext4_es_insert_extent+0xc8/0x180
[<ffffffff812470f4>] ext4_map_blocks+0x1c4/0x550
[<ffffffff8124c4c4>] ext4_writepages+0x6d4/0xd00
...
Reported-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reported-by: Minchan Kim <minchan@kernel.org>
Cc: stable@vger.kernel.org
Cc: Zheng Liu <gnehzuil.liu@gmail.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Commit 007649375f6af2 ("ext4: initialize multi-block allocator before
checking block descriptors") causes the block group descriptor's count
of the number of free blocks to become inconsistent with the number of
free blocks in the allocation bitmap. This is a harmless form of fs
corruption, but it causes the kernel to potentially remount the file
system read-only, or to panic, depending on the file systems's error
behavior.
Thanks to Eric Whitney for his tireless work to reproduce and to find
the guilty commit.
Fixes: 007649375f6af2 ("ext4: initialize multi-block allocator before checking block descriptors"
Cc: stable@vger.kernel.org # 3.15
Reported-by: David Jander <david@protonic.nl>
Reported-by: Matteo Croce <technoboy85@gmail.com>
Tested-by: Eric Whitney <enwlinux@gmail.com>
Suggested-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The mount manpage says of the max_batch_time option,
This optimization can be turned off entirely
by setting max_batch_time to 0.
But the code doesn't do that. So fix the code to do
that.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We are spending a lot of time explaining to users what this error
means. Let's try to improve the message to avoid this problem.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Make it clear that values printed are times, and that it is error
since last fsck. Also add note about fsck version required.
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Cc: stable@vger.kernel.org
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The first time that we allocate from an uninitialized inode allocation
bitmap, if the block allocation bitmap is also uninitalized, we need
to get write access to the block group descriptor before we start
modifying the block group descriptor flags and updating the free block
count, etc. Otherwise, there is the potential of a bad journal
checksum (if journal checksums are enabled), and of the file system
becoming inconsistent if we crash at exactly the wrong time.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
git://git.linaro.org/people/mike.turquette/linux
Pull clock driver fixes from Mike Turquette:
"This batch of fixes is for a handful of clock drivers from Allwinner,
Samsung, ST & TI. Most of them are of the "this hardware won't work
without this fix" variety, including patches that fix platforms that
did not boot under certain configurations. Other fixes are the result
of changes to the clock core introduced in 3.15 that had subtle
impacts on the clock drivers.
There are no fixes to the clock framework core in this pull request"
* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux:
clk: spear3xx: Set proper clock parent of uart1/2
clk: spear3xx: Use proper control register offset
clk: qcom: HDMI source sel is 3 not 2
clk: sunxi: fix devm_ioremap_resource error detection code
clk: s2mps11: Fix double free corruption during driver unbind
clk: ti: am43x: Fix boot with CONFIG_SOC_AM33XX disabled
clk: exynos5420: Remove aclk66_peric from the clock tree description
clk/exynos5250: fix bit number for tv sysmmu clock
clk: s3c64xx: Hookup SPI clocks correctly
clk: samsung: exynos4: Remove SRC_MASK_ISP gates
clk: samsung: add more aliases for s3c24xx
clk: samsung: fix several typos to fix boot on s3c2410
clk: ti: set CLK_SET_RATE_NO_REPARENT for ti,mux-clock
clk: ti: am43x: Fix boot with CONFIG_SOC_AM33XX disabled
clk: ti: dra7: return error code in failure case
clk: ti: apll: not allocating enough data
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The uarts only work when the parent is ras_ahb_clk. The stale 3.5
based ST tree does this in the board file.
Add it to the clk init function. Not pretty, but the mess there is
amazing anyway.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The control register is at offset 0x10, not 0x0. This is wreckaged
since commit 5df33a62c (SPEAr: Switch to common clock framework).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
The HDMI PLL input to the tv mux is supposed to be 3, not 2. Fix
the code so that we can properly select the HDMI PLL.
Fixes: 6d00b56fe "clk: qcom: Add support for MSM8960's multimedia clock controller (MMCC)"
Reported-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
|
| |\ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tfiga/samsung-clk into clk-fixes-samsung
Samsung clock fixes for v3.16.
This pull request contains fixes for various issues found while testing
-rc versions of Linux 3.16. Mostly two kinds of patches:
* Fixes of incorrectly defined clocks
1) a37c82a clk: samsung: exynos4: Remove SRC_MASK_ISP gates
Issue present since v3.10.
2) 0b1643b clk/exynos5250: fix bit number for tv sysmmu clock
Issue present since v3.16.
3) 44ff025 clk: exynos5420: Remove aclk66_peric from the clock tree description
Issue present since v3.11.
* Adding things missed by original patches
1) cec1cde clk: samsung: fix several typos to fix boot on s3c2410
2) 34ece9e clk: samsung: add more aliases for s3c24xx
Both issues present since the driver was added in v3.16.
3) a92dda4 clk: s3c64xx: Hookup SPI clocks correctly
Issue present since v3.12.
|