| Commit message (Collapse) | Author | Age | Files | Lines |
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'rcu-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (21 commits)
rcu: RCU-based detection of stalled CPUs for Classic RCU, fix
rcu: RCU-based detection of stalled CPUs for Classic RCU
rcu: add rcu_read_lock_sched() / rcu_read_unlock_sched()
rcu: fix sparse shadowed variable warning
doc/RCU: fix pseudocode in rcuref.txt
rcuclassic: fix compiler warning
rcu: use irq-safe locks
rcuclassic: fix compilation NG
rcu: fix locking cleanup fallout
rcu: remove redundant ACCESS_ONCE definition from rcupreempt.c
rcu: fix classic RCU locking cleanup lockdep problem
rcu: trace fix possible mem-leak
rcu: just rename call_rcu_bh instead of making it a macro
rcu: remove list_for_each_rcu()
rcu: fixes to include/linux/rcupreempt.h
rcu: classic RCU locking and memory-barrier cleanups
rcu: prevent console flood when one CPU sees another AWOL via RCU
rcu, debug: detect stalled grace periods, cleanups
rcu, debug: detect stalled grace periods
rcu classic: new algorithm for callbacks-processing(v2)
...
|
| |\ |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
fix the !CONFIG_RCU_CPU_STALL_DETECTOR path:
kernel/rcuclassic.c: In function '__rcu_pending':
kernel/rcuclassic.c:609: error: too few arguments to function 'check_cpu_stall'
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This patch adds stalled-CPU detection to Classic RCU. This capability
is enabled by a new config variable CONFIG_RCU_CPU_STALL_DETECTOR, which
defaults disabled.
This is a debugging feature to detect infinite loops in kernel code, not
something that non-kernel-hackers would be expected to care about.
This feature can detect looping CPUs in !PREEMPT builds and looping CPUs
with preemption disabled in PREEMPT builds. This is essentially a port of
this functionality from the treercu patch, replacing the stall debug patch
that is already in tip/core/rcu (commit 67182ae1c4).
The changes from the patch in tip/core/rcu include making the config
variable name match that in treercu, changing from seconds to jiffies to
avoid spurious warnings, and printing a boot message when this feature
is enabled.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |\ \ |
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Add rcu_read_lock_sched() and rcu_read_unlock_sched() to rcupdate.h to match the
recently added write-side call_rcu_sched() and rcu_barrier_sched(). They also
match the no-so-recently-added synchronize_sched().
It will help following matching use of the update/read lock primitives. Those
new read lock will replace preempt_disable()/enable() used in pair with
RCU-classic synchronization.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
kernel/rcuclassic.c:564:18: warning: symbol 'flags' shadows an earlier one
kernel/rcuclassic.c:527:16: originally declared here
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
atomic_inc_not_zero(v) return 0 if *v = 0.
use spin_lock instead of write_lock for update lock.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |\ \ \ |
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
CC kernel/rcuclassic.o
kernel/rcuclassic.c: In function 'rcu_init_percpu_data':
kernel/rcuclassic.c:705: warning: comparison of distinct pointer types lacks a cast
kernel/rcuclassic.c:713: warning: comparison of distinct pointer types lacks a cast
flags should be unsigned long.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Some earlier tip/core/rcu patches caused RCU to incorrectly enable irqs
too early in boot. This caused Yinghai's repeated-kexec testing to
hit oopses, presumably due to so that device interrupts left over from
the prior kernel instance (which would oops the newly booting kernel
before it got a chance to reset said devices). This patch therefore
converts all the local_irq_disable()s in rcuclassic.c to local_irq_save().
Besides, I never did like local_irq_disable() anyway. ;-)
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
fix:
CC kernel/rcuclassic.o
kernel/rcuclassic.c: In function '__rcu_process_callbacks':
kernel/rcuclassic.c:561: error: 'flags' undeclared (first use in this function)
kernel/rcuclassic.c:561: error: (Each undeclared identifier is reported only once
kernel/rcuclassic.c:561: error: for each function it appears in.)
Declare missing variable flags.
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Given that the rcp->lock is now acquired from call_rcu(), which can be
invoked from irq-disable regions, all acquisitions need to disable irqs.
The following patch fixes this.
Although I don't have any reason to believe that this is the cause of
Yinghai's oops, it does need to be fixed.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Remove the redundant definition of ACCESS_ONCE() from rcupreempt.c in
favor of the one in compiler.h. Also merge the comment header from
rcupreempt.c's definition into that in compiler.h.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
On Fri, Aug 15, 2008 at 04:24:30PM +0200, Ingo Molnar wrote:
>
> Paul,
>
> one of your two recent RCU patches caused this lockdep splat in -tip
> testing:
>
> ------------------->
> Brought up 2 CPUs
> Total of 2 processors activated (6850.87 BogoMIPS).
> PM: Adding info for No Bus:platform
> khelper used greatest stack depth: 3124 bytes left
>
> =================================
> [ INFO: inconsistent lock state ]
> 2.6.27-rc3-tip #1
> ---------------------------------
> inconsistent {softirq-on-W} -> {in-softirq-W} usage.
> ksoftirqd/0/4 [HC0[0]:SC1[1]:HE1:SE0] takes:
> (&rcu_ctrlblk.lock){-+..}, at: [<c016d91c>] __rcu_process_callbacks+0x1ac/0x1f0
> {softirq-on-W} state was registered at:
> [<c01528e4>] __lock_acquire+0x3f4/0x5b0
> [<c0152b29>] lock_acquire+0x89/0xc0
> [<c076142b>] _spin_lock+0x3b/0x70
> [<c016d649>] rcu_init_percpu_data+0x29/0x80
> [<c075e43f>] rcu_cpu_notify+0xaf/0xd0
> [<c076458d>] notifier_call_chain+0x2d/0x60
> [<c0145ede>] __raw_notifier_call_chain+0x1e/0x30
> [<c075db29>] _cpu_up+0x79/0x110
> [<c075dc0d>] cpu_up+0x4d/0x70
> [<c0a769e1>] kernel_init+0xb1/0x200
> [<c01048a3>] kernel_thread_helper+0x7/0x10
> [<ffffffff>] 0xffffffff
> irq event stamp: 14
> hardirqs last enabled at (14): [<c01534db>] trace_hardirqs_on+0xb/0x10
> hardirqs last disabled at (13): [<c014dbeb>] trace_hardirqs_off+0xb/0x10
> softirqs last enabled at (0): [<c012b186>] copy_process+0x276/0x1190
> softirqs last disabled at (11): [<c0105c0a>] call_on_stack+0x1a/0x30
>
> other info that might help us debug this:
> no locks held by ksoftirqd/0/4.
>
> stack backtrace:
> Pid: 4, comm: ksoftirqd/0 Not tainted 2.6.27-rc3-tip #1
> [<c01504dc>] print_usage_bug+0x16c/0x1b0
> [<c0152455>] mark_lock+0xa75/0xb10
> [<c0108b75>] ? sched_clock+0x15/0x30
> [<c015289d>] __lock_acquire+0x3ad/0x5b0
> [<c0152b29>] lock_acquire+0x89/0xc0
> [<c016d91c>] ? __rcu_process_callbacks+0x1ac/0x1f0
> [<c076142b>] _spin_lock+0x3b/0x70
> [<c016d91c>] ? __rcu_process_callbacks+0x1ac/0x1f0
> [<c016d91c>] __rcu_process_callbacks+0x1ac/0x1f0
> [<c016d986>] rcu_process_callbacks+0x26/0x50
> [<c0132305>] __do_softirq+0x95/0x120
> [<c0132270>] ? __do_softirq+0x0/0x120
> [<c0105c0a>] call_on_stack+0x1a/0x30
> [<c0132426>] ? ksoftirqd+0x96/0x110
> [<c0132390>] ? ksoftirqd+0x0/0x110
> [<c01411f7>] ? kthread+0x47/0x80
> [<c01411b0>] ? kthread+0x0/0x80
> [<c01048a3>] ? kernel_thread_helper+0x7/0x10
> =======================
> calling init_cpufreq_transition_notifier_list+0x0/0x20
> initcall init_cpufreq_transition_notifier_list+0x0/0x20 returned 0 after 0 msecs
> calling net_ns_init+0x0/0x190
> net_namespace: 676 bytes
> initcall net_ns_init+0x0/0x190 returned 0 after 0 msecs
> calling cpufreq_tsc+0x0/0x20
> initcall cpufreq_tsc+0x0/0x20 returned 0 after 0 msecs
> calling reboot_init+0x0/0x20
> initcall reboot_init+0x0/0x20 returned 0 after 0 msecs
> calling print_banner+0x0/0x10
> Booting paravirtualized kernel on bare hardware
>
> <-----------------------
>
> my guess is on:
>
> commit 1f7b94cd3d564901f9e04a8bc5832ae7bfd690a0
> Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Date: Tue Aug 5 09:21:44 2008 -0700
>
> rcu: classic RCU locking and memory-barrier cleanups
>
> Ingo
Fixes a problem detected by lockdep in which rcu->lock was acquired
both in irq context and in process context, but without disabling from
process context.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
In the initialization of the RCU trace module, if
rcupreempt_debugfs_init() fails, we never free the the trace buffer.
This patch frees the trace buffer in case the debugfs fails.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Seems that I found a box that has a config that passes call_rcu_bh as a
function pointer (see net/sctp/sm_make_chunk.c), so declaring the
call_rcu_bh has a macro function isn't good enough.
This patch makes it just another name of call_rcu for rcupreempt.
Signed-off-by: Steven Rostedt <srostedt@redhat.org>
Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
All of the in-tree uses of list_for_each_rcu() have been converted to
list_for_each_entry_rcu(), so list_for_each_rcu() can now be removed.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Hello!
Compared tip/core/rcu to my latest patchset, and found the following
issues:
o the memory barrier in rcu_exit_nohz() somehow got out of place
(it is correct in mainline as of 2.6.26-rc7).
o There is a duplicate declaration of rcu_dyntick_sched.
The attached patch fixes these.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
This patch simplifies the locking and memory-barrier usage in the Classic
RCU grace-period-detection mechanism, incorporating Lai Jiangshan's
feedback from the earlier version (http://lkml.org/lkml/2008/8/1/400
and http://lkml.org/lkml/2008/8/3/43). Passed 10 hours of
rcutorture concurrent with CPUs being put online and taken offline on
a 128-hardware-thread Power machine. My apologies to whoever in the
Eastern Hemisphere was planning to use this machine over the Western
Hemisphere night, but it was sitting idle and...
So this is ready for tip/core/rcu.
This patch is in preparation for moving to a hierarchical
algorithm to allow the very large SMP machines -- requested by some
people at OLS, and there seem to have been a few recent patches in the
4096-CPU direction as well. The general idea is to move to a much more
conservative concurrency design, then apply a hierarchy to reduce
contention on the global lock by a few orders of magnitude (larger
machines would see greater reductions). The reason for taking a
conservative approach is that this code isn't on any fast path.
Prototype in progress.
This patch is against the linux-tip git tree (tip/core/rcu). If you
wish to test this against 2.6.26, use the following set of patches:
http://www.rdrop.com/users/paulmck/patches/2.6.26-ljsimp-1.patch
http://www.rdrop.com/users/paulmck/patches/2.6.26-ljsimpfix-3.patch
The first patch combines commits 5127bed588a2f8f3a1f732de2a8a190b7df5dce3
and 3cac97cbb14aed00d83eb33d4613b0fe3aaea863 from Lai Jiangshan
<laijs@cn.fujitsu.com>, and the second patch contains my changes.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
One small change needed to keep from flooding the console when one
CPU notices that another is AWOL. Unless I am missing something subtle.
Otherwise the cleanups look good!
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
small cleanups.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
this is a diagnostic patch for Classic RCU.
The approach is to record a timestamp at the beginning
of the grace period (in rcu_start_batch()), then have
rcu_check_callbacks() complain if:
1. it is running on a CPU that has holding up grace periods for
a long time (say one second). This will identify the culprit
assuming that the culprit has not disabled hardware irqs,
instruction execution, or some such.
2. it is running on a CPU that is not holding up grace periods,
but grace periods have been held up for an even longer time
(say two seconds).
It is enabled via the default-off CONFIG_DEBUG_RCU_STALL kernel parameter.
Rather than exponential backoff, it backs off to once per 30 seconds.
My feeling upon thinking on it was that if you have stalled RCU grace
periods for that long, a few extra printk() messages are probably the
least of your worries...
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: David Witbrodt <dawitbro@sbcglobal.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| |\ \ \ \ |
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
This is v2, it's a little deference from v1 that I
had send to lkml.
use ACCESS_ONCE
use rcu_batch_after/rcu_batch_before for batch # comparison.
rcutorture test result:
(hotplugs: do cpu-online/offline once per second)
No CONFIG_NO_HZ: OK, 12hours
No CONFIG_NO_HZ, hotplugs: OK, 12hours
CONFIG_NO_HZ=y: OK, 24hours
CONFIG_NO_HZ=y, hotplugs: Failed.
(Failed also without my patch applied, exactly the same bug occurred,
http://lkml.org/lkml/2008/7/3/24)
v1's email thread:
http://lkml.org/lkml/2008/6/2/539
v1's description:
The code/algorithm of the implement of current callbacks-processing
is very efficient and technical. But when I studied it and I found
a disadvantage:
In multi-CPU systems, when a new RCU callback is being
queued(call_rcu[_bh]), this callback will be invoked after the grace
period for the batch with batch number = rcp->cur+2 has completed
very very likely in current implement. Actually, this callback can be
invoked after the grace period for the batch with
batch number = rcp->cur+1 has completed. The delay of invocation means
that latency of synchronize_rcu() is extended. But more important thing
is that the callbacks usually free memory, and these works are delayed
too! it's necessary for reclaimer to free memory as soon as
possible when left memory is few.
A very simple way can solve this problem:
a field(struct rcu_head::batch) is added to record the batch number for
the RCU callback. And when a new RCU callback is being queued, we
determine the batch number for this callback(head->batch = rcp->cur+1)
and we move this callback to rdp->donelist if we find
that head->batch <= rcp->completed when we process callbacks.
This simple way reduces the wait time for invocation a lot. (about
2.5Grace Period -> 1.5Grace Period in average in multi-CPU systems)
This is my algorithm. But I do not add any field for struct rcu_head
in my implement. We just need to memorize the last 2 batches and
their batch number, because these 2 batches include all entries that
for whom the grace period hasn't completed. So we use a special
linked-list rather than add a field.
Please see the comment of struct rcu_data.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: Gautham Shenoy <ego@in.ibm.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
use a batch number(rcp->pending) instead of a flag(rcp->next_pending)
rcu_start_batch() need to change this flag, so mb()s is needed
for memory-access safe.
but(after this patch applied) rcu_start_batch() do not change
this batch number(rcp->pending), rcp->pending is managed by
__rcu_process_callbacks only, and troublesome mb()s are eliminated.
And codes look simpler and clearer.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: Gautham Shenoy <ego@in.ibm.com>
Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
selinux: Fix an uninitialized variable BUG/panic in selinux_secattr_to_sid()
selinux: use default proc sid on symlinks
file capabilities: uninline cap_safe_nice
Update selinux info in MAINTAINERS and Kconfig help text
SELinux: add gitignore file for mdp script
SELinux: add boundary support and thread context assignment
securityfs: do not depend on CONFIG_SECURITY
selinux: add support for installing a dummy policy (v2)
security: add/fix security kernel-doc
selinux: Unify for- and while-loop style
selinux: conditional expression type validation was off-by-one
smack: limit privilege by label
SELinux: Fix a potentially uninitialised variable in SELinux hooks
SELinux: trivial, remove unneeded local variable
SELinux: Trivial minor fixes that change C null character style
make selinux_write_opts() static
|
| |\ \ \ \ \ \ |
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
At some point during the 2.6.27 development cycle two new fields were added
to the SELinux context structure, a string pointer and a length field. The
code in selinux_secattr_to_sid() was not modified and as a result these two
fields were left uninitialized which could result in erratic behavior,
including kernel panics, when NetLabel is used. This patch fixes the
problem by fully initializing the context in selinux_secattr_to_sid() before
use and reducing the level of direct context manipulation done to help
prevent future problems.
Please apply this to the 2.6.27-rcX release stream.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
As we are not concerned with fine-grained control over reading of
symlinks in proc, always use the default proc SID for all proc symlinks.
This should help avoid permission issues upon changes to the proc tree
as in the /proc/net -> /proc/self/net example.
This does not alter labeling of symlinks within /proc/pid directories.
ls -Zd /proc/net output before and after the patch should show the difference.
Signed-off-by: Stephen D. Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
This reduces the kernel size by 289 bytes.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Acked-by: Andrew G. Morgan <morgan@kernel.org>
Signed-off-by: James Morris <jmorris@namei.org>
|
| | |\ \ \ \ \ \
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Conflicts:
MAINTAINERS
Thanks for breaking my tree :-)
Signed-off-by: James Morris <jmorris@namei.org>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Update the SELinux entry in MAINTAINERS and drop the obsolete information
from the selinux Kconfig help text.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Add gitignore file for scripts/selinux/mdp/mdp.
Signed-off-by: James Morris <jmorris@namei.org>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
The purpose of this patch is to assign per-thread security context
under a constraint. It enables multi-threaded server application
to kick a request handler with its fair security context, and
helps some of userspace object managers to handle user's request.
When we assign a per-thread security context, it must not have wider
permissions than the original one. Because a multi-threaded process
shares a single local memory, an arbitary per-thread security context
also means another thread can easily refer violated information.
The constraint on a per-thread security context requires a new domain
has to be equal or weaker than its original one, when it tries to assign
a per-thread security context.
Bounds relationship between two types is a way to ensure a domain can
never have wider permission than its bounds. We can define it in two
explicit or implicit ways.
The first way is using new TYPEBOUNDS statement. It enables to define
a boundary of types explicitly. The other one expand the concept of
existing named based hierarchy. If we defines a type with "." separated
name like "httpd_t.php", toolchain implicitly set its bounds on "httpd_t".
This feature requires a new policy version.
The 24th version (POLICYDB_VERSION_BOUNDARY) enables to ship them into
kernel space, and the following patch enables to handle it.
Signed-off-by: KaiGai Kohei <kaigai@ak.jp.nec.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
|
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | |
| | | | | | | | | |
Add a new Kconfig option SECURITYFS which will build securityfs support
but does not require CONFIG_SECURITY. The only current user of
securityfs does not depend on CONFIG_SECURITY and there is no reason the
full LSM needs to be built to build this fs.
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
|
| | |\ \ \ \ \ \ \ |
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
In August 2006 I posted a patch generating a minimal SELinux policy. This
week, David P. Quigley posted an updated version of that as a patch against
the kernel. It also had nice logic for auto-installing the policy.
Following is David's original patch intro (preserved especially
bc it has stats on the generated policies):
se interested in the changes there were only two significant
changes. The first is that the iteration through the list of classes
used NULL as a sentinel value. The problem with this is that the
class_to_string array actually has NULL entries in its table as place
holders for the user space object classes.
The second change was that it would seem at some point the initial sids
table was NULL terminated. This is no longer the case so that iteration
has to be done on array length instead of looking for NULL.
Some statistics on the policy that it generates:
The policy consists of 523 lines which contain no blank lines. Of those
523 lines 453 of them are class, permission, and initial sid
definitions. These lines are usually little to no concern to the policy
developer since they will not be adding object classes or permissions.
Of the remaining 70 lines there is one type, one role, and one user
statement. The remaining lines are broken into three portions. The first
group are TE allow rules which make up 29 of the remaining lines, the
second is assignment of labels to the initial sids which consist of 27
lines, and file system labeling statements which are the remaining 11.
In addition to the policy.conf generated there is a single file_contexts
file containing two lines which labels the entire system with base_t.
This policy generates a policy.23 binary that is 7920 bytes.
(then a few versions later...):
The new policy is 587 lines (stripped of blank lines) with 476 of those
lines being the boilerplate that I mentioned last time. The remaining
111 lines have the 3 lines for type, user, and role, 70 lines for the
allow rules (one for each object class including user space object
classes), 27 lines to assign types to the initial sids, and 11 lines for
file system labeling. The policy binary is 9194 bytes.
Changelog:
Aug 26: Added Documentation/SELinux.txt
Aug 26: Incorporated a set of comments by Stephen Smalley:
1. auto-setup SELINUXTYPE=dummy
2. don't auto-install if selinux is enabled with
non-dummy policy
3. don't re-compute policy version
4. /sbin/setfiles not /usr/sbin/setfiles
Aug 22: As per JMorris comments, made sure make distclean
cleans up the mdp directory.
Removed a check for file_contexts which is now
created in the same file as the check, making it
superfluous.
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: David Quigley <dpquigl@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Add security/inode.c functions to the kernel-api docbook.
Use '%' on constants in kernel-doc notation.
Fix several typos/spellos in security function descriptions.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: James Morris <jmorris@namei.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Replace "thing != NULL" comparisons with just "thing" to make
the code look more uniform (mixed styles were used even in the
same source file).
Signed-off-by: Vesa-Matti Kari <vmkari@cc.helsinki.fi>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
expr_isvalid() in conditional.c was off-by-one and allowed
invalid expression type COND_LAST. However, it is this header file
that needs to be fixed. That way the if-statement's disjunction's
second component reads more naturally, "if expr type is greater than
the last allowed value" ( rather than using ">=" in conditional.c):
if (expr->expr_type <= 0 || expr->expr_type > COND_LAST)
Signed-off-by: Vesa-Matti Kari <vmkari@cc.helsinki.fi>
Signed-off-by: James Morris <jmorris@namei.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
There have been a number of requests to make the Smack LSM
enforce MAC even in the face of privilege, either capability
based or superuser based. This is not universally desired,
however, so it seems desirable to make it optional. Further,
at least one legacy OS implemented a scheme whereby only
processes running with one particular label could be exempt
from MAC. This patch supports these three cases.
If /smack/onlycap is empty (unset or null-string) privilege
is enforced in the normal way.
If /smack/onlycap contains a label only processes running with
that label may be MAC exempt.
If the label in /smack/onlycap is the star label ("*") the
semantics of the star label combine with the privilege
restrictions to prevent any violations of MAC, even in the
presence of privilege.
Again, this will be independent of the privilege scheme.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: James Morris <jmorris@namei.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Fix a potentially uninitialised variable in SELinux hooks that's given a
pointer to the network address by selinux_parse_skb() passing a pointer back
through its argument list. By restructuring selinux_parse_skb(), the compiler
can see that the error case need not set it as the caller will return
immediately.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Hello,
Remove unneeded local variable:
struct avtab_node *newnode
Signed-off-by: Vesa-Matti Kari <vmkari@cc.helsinki.fi>
Signed-off-by: James Morris <jmorris@namei.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
Trivial minor fixes that change C null character style.
Signed-off-by: Vesa-Matti Kari <vmkari@cc.helsinki.fi>
Signed-off-by: James Morris <jmorris@namei.org>
|
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | |
| | | | | | | | | | |
This patch makes the needlessly global selinux_write_opts() static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: James Morris <jmorris@namei.org>
|
|\ \ \ \ \ \ \ \ \ \
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (38 commits)
sched debug: add name to sched_domain sysctl entries
sched: sync wakeups vs avg_overlap
sched: remove redundant code in cpu_cgroup_create()
sched_rt.c: resch needed in rt_rq_enqueue() for the root rt_rq
cpusets: scan_for_empty_cpusets(), cpuset doesn't seem to be so const
sched: minor optimizations in wake_affine and select_task_rq_fair
sched: maintain only task entities in cfs_rq->tasks list
sched: fixup buddy selection
sched: more sanity checks on the bandwidth settings
sched: add some comments to the bandwidth code
sched: fixlet for group load balance
sched: rework wakeup preemption
CFS scheduler: documentation about scheduling policies
sched: clarify ifdef tangle
sched: fix list traversal to use _rcu variant
sched: turn off WAKEUP_OVERLAP
sched: wakeup preempt when small overlap
kernel/cpu.c: create a CPU_STARTING cpu_chain notifier
kernel/cpu.c: Move the CPU_DYING notifiers
sched: fix __load_balance_iterator() for cfq with only one task
...
|
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | | |
add /proc/sys/kernel/sched_domain/cpu0/domain0/name, to make
it easier to see which specific scheduler domain remained at
that entry.
Since we process the scheduler domain tree and
simplify it, it's not always immediately clear during debugging
which domain came from where.
depends on CONFIG_SCHED_DEBUG=y.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | |
| | | | | | | | | | | |
While looking at the code I wondered why we always do:
sync && avg_overlap < migration_cost
Which is a bit odd, since the overlap test was meant to detect sync wakeups
so using it to specialize sync wakeups doesn't make much sense.
Hence change the code to do:
sync || avg_overlap < migration_cost
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
| | | | | | | | | | | | |
| | \ \ \ \ \ \ \ \ \ | |
| | \ \ \ \ \ \ \ \ \ | |
| | \ \ \ \ \ \ \ \ \ | |
| |\ \ \ \ \ \ \ \ \ \ \ \
| | | | | | | | | | | | | |
| | | | | | | | | | | | | |
| | | | | | | | | | | | | | |
'sched/urgent' into sched/core
|