sched/fair: Make utilization tracking CPU scale-invariant

Besides the existing frequency scale-invariance correction factor, apply CPU scale-invariance correction factor to utilization tracking to compensate for any differences in compute capacity. This could be due to micro-architectural differences (i.e. instructions per seconds) between cpus in HMP systems (e.g. big.LITTLE), and/or differences in the current maximum frequency supported by individual cpus in SMP systems. In the existing implementation utilization isn't comparable between cpus as it is relative to the capacity of each individual CPU. Each segment of the sched_avg.util_sum geometric series is now scaled by the CPU performance factor too so the sched_avg.util_avg of each sched entity will be invariant from the particular CPU of the HMP/SMP system on which the sched entity is scheduled. With this patch, the utilization of a CPU stays relative to the max CPU performance of the fastest CPU in the system. In contrast to utilization (sched_avg.util_sum), load (sched_avg.load_sum) should not be scaled by compute capacity. The utilization metric is based on running time which only makes sense when cpus are _not_ fully utilized (utilization cannot go beyond 100% even if more tasks are added), where load is runnable time which isn't limited by the capacity of the CPU and therefore is a better metric for overloaded scenarios. If we run two nice-0 busy loops on two cpus with different compute capacity their load should be similar since their compute demands are the same. We have to assume that the compute demand of any task running on a fully utilized CPU (no spare cycles = 100% utilization) is high and the same no matter of the compute capacity of its current CPU, hence we shouldn't scale load by CPU capacity. Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/55CE7409.1000700@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
author: Dietmar Eggemann <dietmar.eggemann@arm.com> 2015-08-15 01:04:41 +0200
committer: Ingo Molnar <mingo@kernel.org> 2015-09-13 09:52:56 +0200
commit: e3279a2e6d697e00e74f905851ee7cf532f72b2d (patch)
tree: ed737d4afb27498454dd5caa1a91f92df582f047 /kernel/sched
parent: sched/fair: Convert arch_scale_cpu_capacity() from weak function to #define (diff)
download: linux-e3279a2e6d697e00e74f905851ee7cf532f72b2d.tar.xz
linux-e3279a2e6d697e00e74f905851ee7cf532f72b2d.zip
2 files changed, 5 insertions, 4 deletions
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 102cdf1e4e97..573dc98c6248 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2553,6 +2553,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 	u32 contrib;
 	int delta_w, scaled_delta_w, decayed = 0;
 	unsigned long scale_freq = arch_scale_freq_capacity(NULL, cpu);
+	unsigned long scale_cpu = arch_scale_cpu_capacity(NULL, cpu);
 
 	delta = now - sa->last_update_time;
 	/*
@@ -2596,7 +2597,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 			}
 		}
 		if (running)
-			sa->util_sum += scaled_delta_w;
+			sa->util_sum += scale(scaled_delta_w, scale_cpu);
 
 		delta -= delta_w;
 
@@ -2620,7 +2621,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 				cfs_rq->runnable_load_sum += weight * contrib;
 		}
 		if (running)
-			sa->util_sum += contrib;
+			sa->util_sum += scale(contrib, scale_cpu);
 	}
 
 	/* Remainder of delta accrued against u_0` */
@@ -2631,7 +2632,7 @@ __update_load_avg(u64 now, int cpu, struct sched_avg *sa,
 			cfs_rq->runnable_load_sum += weight * scaled_delta;
 	}
 	if (running)
-		sa->util_sum += scaled_delta;
+		sa->util_sum += scale(scaled_delta, scale_cpu);
 
 	sa->period_contrib += delta;
 
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index c0726d5fd6a3..167ab4844ee6 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1398,7 +1398,7 @@ unsigned long arch_scale_freq_capacity(struct sched_domain *sd, int cpu)
 static __always_inline
 unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
 {
-	if ((sd->flags & SD_SHARE_CPUCAPACITY) && (sd->span_weight > 1))
+	if (sd && (sd->flags & SD_SHARE_CPUCAPACITY) && (sd->span_weight > 1))
 		return sd->smt_gain / sd->span_weight;
 
 	return SCHED_CAPACITY_SCALE;
author	Dietmar Eggemann <dietmar.eggemann@arm.com>	2015-08-15 01:04:41 +0200
committer	Ingo Molnar <mingo@kernel.org>	2015-09-13 09:52:56 +0200
commit	e3279a2e6d697e00e74f905851ee7cf532f72b2d (patch)
tree	ed737d4afb27498454dd5caa1a91f92df582f047 /kernel/sched
parent	sched/fair: Convert arch_scale_cpu_capacity() from weak function to #define (diff)
download	linux-e3279a2e6d697e00e74f905851ee7cf532f72b2d.tar.xz linux-e3279a2e6d697e00e74f905851ee7cf532f72b2d.zip