summaryrefslogtreecommitdiffstats
path: root/drivers/cpuidle
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2014-04-01 21:48:54 +0200
committerLinus Torvalds <torvalds@linux-foundation.org>2014-04-01 21:48:54 +0200
commit4dedde7c7a18f55180574f934dbc1be84ca0400b (patch)
treed7cc511e8ba8ffceadf3f45b9a63395c4e4183c5 /drivers/cpuidle
parentMerge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kerne... (diff)
parentMerge branch 'pm-devfreq' (diff)
downloadlinux-4dedde7c7a18f55180574f934dbc1be84ca0400b.tar.xz
linux-4dedde7c7a18f55180574f934dbc1be84ca0400b.zip
Merge tag 'pm+acpi-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management updates from Rafael Wysocki: "The majority of this material spent some time in linux-next, some of it even several weeks. There are a few relatively fresh commits in it, but they are mostly fixes and simple cleanups. ACPI took the lead this time, both in terms of the number of commits and the number of modified lines of code, cpufreq follows and there are a few changes in the PM core and in cpuidle too. A new feature that already got some LWN.net's attention is the device PM QoS extension allowing latency tolerance requirements to be propagated from leaf devices to their ancestors with hardware interfaces for specifying latency tolerance. That should help systems with hardware-driven power management to avoid going too far with it in cases when there are latency tolerance constraints. There also are some significant changes in the ACPI core related to the way in which hotplug notifications are handled. They affect PCI hotplug (ACPIPHP) and the ACPI dock station code too. The bottom line is that all those notification now go through the root notify handler and are propagated to the interested subsystems by means of callbacks instead of having to install a notify handler for each device object that we can potentially get hotplug notifications for. In addition to that ACPICA will now advertise "Windows 2013" compatibility for _OSI, because some systems out there don't work correctly if that is not done (some of them don't even boot). On the system suspend side of things, all of the device suspend and resume callbacks, except for ->prepare() and ->complete(), are now going to be executed asynchronously as that turns out to speed up system suspend and resume on some platforms quite significantly and we have a few more optimizations in that area. Apart from that, there are some new device IDs and fixes and cleanups all over. In particular, the system suspend and resume handling by cpufreq should be improved and the cpuidle menu governor should be a bit more robust now. Specifics: - Device PM QoS support for latency tolerance constraints on systems with hardware interfaces allowing such constraints to be specified. That is necessary to prevent hardware-driven power management from becoming overly aggressive on some systems and to prevent power management features leading to excessive latencies from being used in some cases. - Consolidation of the handling of ACPI hotplug notifications for device objects. This causes all device hotplug notifications to go through the root notify handler (that was executed for all of them anyway before) that propagates them to individual subsystems, if necessary, by executing callbacks provided by those subsystems (those callbacks are associated with struct acpi_device objects during device enumeration). As a result, the code in question becomes both smaller in size and more straightforward and all of those changes should not affect users. - ACPICA update, including fixes related to the handling of _PRT in cases when it is broken and the addition of "Windows 2013" to the list of supported "features" for _OSI (which is necessary to support systems that work incorrectly or don't even boot without it). Changes from Bob Moore and Lv Zheng. - Consolidation of ACPI _OST handling from Jiang Liu. - ACPI battery and AC fixes allowing unusual system configurations to be handled by that code from Alexander Mezin. - New device IDs for the ACPI LPSS driver from Chiau Ee Chew. - ACPI fan and thermal optimizations related to system suspend and resume from Aaron Lu. - Cleanups related to ACPI video from Jean Delvare. - Assorted ACPI fixes and cleanups from Al Stone, Hanjun Guo, Lan Tianyu, Paul Bolle, Tomasz Nowicki. - Intel RAPL (Running Average Power Limits) driver cleanups from Jacob Pan. - intel_pstate fixes and cleanups from Dirk Brandewie. - cpufreq fixes related to system suspend/resume handling from Viresh Kumar. - cpufreq core fixes and cleanups from Viresh Kumar, Stratos Karafotis, Saravana Kannan, Rashika Kheria, Joe Perches. - cpufreq drivers updates from Viresh Kumar, Zhuoyu Zhang, Rob Herring. - cpuidle fixes related to the menu governor from Tuukka Tikkanen. - cpuidle fix related to coupled CPUs handling from Paul Burton. - Asynchronous execution of all device suspend and resume callbacks, except for ->prepare and ->complete, during system suspend and resume from Chuansheng Liu. - Delayed resuming of runtime-suspended devices during system suspend for the PCI bus type and ACPI PM domain. - New set of PM helper routines to allow device runtime PM callbacks to be used during system suspend and resume more easily from Ulf Hansson. - Assorted fixes and cleanups in the PM core from Geert Uytterhoeven, Prabhakar Lad, Philipp Zabel, Rashika Kheria, Sebastian Capella. - devfreq fix from Saravana Kannan" * tag 'pm+acpi-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (162 commits) PM / devfreq: Rewrite devfreq_update_status() to fix multiple bugs PM / sleep: Correct whitespace errors in <linux/pm.h> intel_pstate: Set core to min P state during core offline cpufreq: Add stop CPU callback to cpufreq_driver interface cpufreq: Remove unnecessary braces cpufreq: Fix checkpatch errors and warnings cpufreq: powerpc: add cpufreq transition latency for FSL e500mc SoCs MAINTAINERS: Reorder maintainer addresses for PM and ACPI PM / Runtime: Update runtime_idle() documentation for return value meaning video / output: Drop display output class support fujitsu-laptop: Drop unneeded include acer-wmi: Stop selecting VIDEO_OUTPUT_CONTROL ACPI / gpu / drm: Stop selecting VIDEO_OUTPUT_CONTROL ACPI / video: fix ACPI_VIDEO dependencies cpufreq: remove unused notifier: CPUFREQ_{SUSPENDCHANGE|RESUMECHANGE} cpufreq: Do not allow ->setpolicy drivers to provide ->target cpufreq: arm_big_little: set 'physical_cluster' for each CPU cpufreq: arm_big_little: make vexpress driver depend on bL core driver ACPI / button: Add ACPI Button event via netlink routine ACPI: Remove duplicate definitions of PREFIX ...
Diffstat (limited to 'drivers/cpuidle')
-rw-r--r--drivers/cpuidle/cpuidle.c3
-rw-r--r--drivers/cpuidle/driver.c2
-rw-r--r--drivers/cpuidle/governors/menu.c75
3 files changed, 47 insertions, 33 deletions
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 09d05ab262be..cb20fd915be8 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -85,7 +85,8 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
time_end = ktime_get();
- local_irq_enable();
+ if (!cpuidle_state_is_coupled(dev, drv, entered_state))
+ local_irq_enable();
diff = ktime_to_us(ktime_sub(time_end, time_start));
if (diff > INT_MAX)
diff --git a/drivers/cpuidle/driver.c b/drivers/cpuidle/driver.c
index 06dbe7c86199..136d6a283e0a 100644
--- a/drivers/cpuidle/driver.c
+++ b/drivers/cpuidle/driver.c
@@ -209,7 +209,7 @@ static void poll_idle_init(struct cpuidle_driver *drv)
state->exit_latency = 0;
state->target_residency = 0;
state->power_usage = -1;
- state->flags = 0;
+ state->flags = CPUIDLE_FLAG_TIME_VALID;
state->enter = poll_idle;
state->disabled = false;
}
diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/menu.c
index cf7f2f0e4ef5..71b523293354 100644
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -122,9 +122,8 @@ struct menu_device {
int last_state_idx;
int needs_update;
- unsigned int expected_us;
+ unsigned int next_timer_us;
unsigned int predicted_us;
- unsigned int exit_us;
unsigned int bucket;
unsigned int correction_factor[BUCKETS];
unsigned int intervals[INTERVALS];
@@ -257,7 +256,7 @@ again:
stddev = int_sqrt(stddev);
if (((avg > stddev * 6) && (divisor * 4 >= INTERVALS * 3))
|| stddev <= 20) {
- if (data->expected_us > avg)
+ if (data->next_timer_us > avg)
data->predicted_us = avg;
return;
}
@@ -289,7 +288,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
struct menu_device *data = &__get_cpu_var(menu_devices);
int latency_req = pm_qos_request(PM_QOS_CPU_DMA_LATENCY);
int i;
- int multiplier;
+ unsigned int interactivity_req;
struct timespec t;
if (data->needs_update) {
@@ -298,7 +297,6 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
}
data->last_state_idx = 0;
- data->exit_us = 0;
/* Special case when user has set very strict latency requirement */
if (unlikely(latency_req == 0))
@@ -306,13 +304,11 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
/* determine the expected residency time, round up */
t = ktime_to_timespec(tick_nohz_get_sleep_length());
- data->expected_us =
+ data->next_timer_us =
t.tv_sec * USEC_PER_SEC + t.tv_nsec / NSEC_PER_USEC;
- data->bucket = which_bucket(data->expected_us);
-
- multiplier = performance_multiplier();
+ data->bucket = which_bucket(data->next_timer_us);
/*
* if the correction factor is 0 (eg first time init or cpu hotplug
@@ -326,17 +322,26 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
* operands are 32 bits.
* Make sure to round up for half microseconds.
*/
- data->predicted_us = div_round64((uint64_t)data->expected_us *
+ data->predicted_us = div_round64((uint64_t)data->next_timer_us *
data->correction_factor[data->bucket],
RESOLUTION * DECAY);
get_typical_interval(data);
/*
+ * Performance multiplier defines a minimum predicted idle
+ * duration / latency ratio. Adjust the latency limit if
+ * necessary.
+ */
+ interactivity_req = data->predicted_us / performance_multiplier();
+ if (latency_req > interactivity_req)
+ latency_req = interactivity_req;
+
+ /*
* We want to default to C1 (hlt), not to busy polling
* unless the timer is happening really really soon.
*/
- if (data->expected_us > 5 &&
+ if (data->next_timer_us > 5 &&
!drv->states[CPUIDLE_DRIVER_STATE_START].disabled &&
dev->states_usage[CPUIDLE_DRIVER_STATE_START].disable == 0)
data->last_state_idx = CPUIDLE_DRIVER_STATE_START;
@@ -355,11 +360,8 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev)
continue;
if (s->exit_latency > latency_req)
continue;
- if (s->exit_latency * multiplier > data->predicted_us)
- continue;
data->last_state_idx = i;
- data->exit_us = s->exit_latency;
}
return data->last_state_idx;
@@ -390,36 +392,47 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev)
{
struct menu_device *data = &__get_cpu_var(menu_devices);
int last_idx = data->last_state_idx;
- unsigned int last_idle_us = cpuidle_get_last_residency(dev);
struct cpuidle_state *target = &drv->states[last_idx];
unsigned int measured_us;
unsigned int new_factor;
/*
- * Ugh, this idle state doesn't support residency measurements, so we
- * are basically lost in the dark. As a compromise, assume we slept
- * for the whole expected time.
+ * Try to figure out how much time passed between entry to low
+ * power state and occurrence of the wakeup event.
+ *
+ * If the entered idle state didn't support residency measurements,
+ * we are basically lost in the dark how much time passed.
+ * As a compromise, assume we slept for the whole expected time.
+ *
+ * Any measured amount of time will include the exit latency.
+ * Since we are interested in when the wakeup begun, not when it
+ * was completed, we must substract the exit latency. However, if
+ * the measured amount of time is less than the exit latency,
+ * assume the state was never reached and the exit latency is 0.
*/
- if (unlikely(!(target->flags & CPUIDLE_FLAG_TIME_VALID)))
- last_idle_us = data->expected_us;
+ if (unlikely(!(target->flags & CPUIDLE_FLAG_TIME_VALID))) {
+ /* Use timer value as is */
+ measured_us = data->next_timer_us;
+ } else {
+ /* Use measured value */
+ measured_us = cpuidle_get_last_residency(dev);
- measured_us = last_idle_us;
-
- /*
- * We correct for the exit latency; we are assuming here that the
- * exit latency happens after the event that we're interested in.
- */
- if (measured_us > data->exit_us)
- measured_us -= data->exit_us;
+ /* Deduct exit latency */
+ if (measured_us > target->exit_latency)
+ measured_us -= target->exit_latency;
+ /* Make sure our coefficients do not exceed unity */
+ if (measured_us > data->next_timer_us)
+ measured_us = data->next_timer_us;
+ }
/* Update our correction ratio */
new_factor = data->correction_factor[data->bucket];
new_factor -= new_factor / DECAY;
- if (data->expected_us > 0 && measured_us < MAX_INTERESTING)
- new_factor += RESOLUTION * measured_us / data->expected_us;
+ if (data->next_timer_us > 0 && measured_us < MAX_INTERESTING)
+ new_factor += RESOLUTION * measured_us / data->next_timer_us;
else
/*
* we were idle so long that we count it as a perfect
@@ -439,7 +452,7 @@ static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device *dev)
data->correction_factor[data->bucket] = new_factor;
/* update the repeating-pattern data */
- data->intervals[data->interval_ptr++] = last_idle_us;
+ data->intervals[data->interval_ptr++] = measured_us;
if (data->interval_ptr >= INTERVALS)
data->interval_ptr = 0;
}