diff options
author | Rob Herring <rob.herring@calxeda.com> | 2012-11-29 20:39:54 +0100 |
---|---|---|
committer | Russell King <rmk+kernel@arm.linux.org.uk> | 2012-12-03 12:16:36 +0100 |
commit | 14318efb322e2fe1a034c69463d725209eb9d548 (patch) | |
tree | cf99b8a06eca7abdb89ac87c7f35bebb6b3254f7 /arch/arm/kernel/smp.c | |
parent | ARM: 7582/2: rename kvm_seq to vmalloc_seq so to avoid confusion with KVM (diff) | |
download | linux-14318efb322e2fe1a034c69463d725209eb9d548.tar.xz linux-14318efb322e2fe1a034c69463d725209eb9d548.zip |
ARM: 7587/1: implement optimized percpu variable access
Use the previously unused TPIDRPRW register to store percpu offsets.
TPIDRPRW is only accessible in PL1, so it can only be used in the kernel.
This replaces 2 loads with a mrc instruction for each percpu variable
access. With hackbench, the performance improvement is 1.4% on Cortex-A9
(highbank). Taking an average of 30 runs of "hackbench -l 1000" yields:
Before: 6.2191
After: 6.1348
Will Deacon reported similar delta on v6 with 11MPCore.
The asm "memory clobber" are needed here to ensure the percpu offset
gets reloaded. Testing by Will found that this would not happen in
__schedule() which is a bit of a special case as preemption is disabled
but the execution can move cores.
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Diffstat (limited to 'arch/arm/kernel/smp.c')
-rw-r--r-- | arch/arm/kernel/smp.c | 4 |
1 files changed, 3 insertions, 1 deletions
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 7eacd84cdc9c..f3a2be5837aa 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -314,9 +314,10 @@ asmlinkage void __cpuinit secondary_start_kernel(void) current->active_mm = mm; cpumask_set_cpu(cpu, mm_cpumask(mm)); + cpu_init(); + printk("CPU%u: Booted secondary processor\n", cpu); - cpu_init(); preempt_disable(); trace_hardirqs_off(); @@ -372,6 +373,7 @@ void __init smp_cpus_done(unsigned int max_cpus) void __init smp_prepare_boot_cpu(void) { + set_my_cpu_offset(per_cpu_offset(smp_processor_id())); } void __init smp_prepare_cpus(unsigned int max_cpus) |