summaryrefslogtreecommitdiffstats
path: root/drivers/cpufreq/cpufreq_conservative.c (follow)
Commit message (Collapse)AuthorAgeFilesLines
* cpufreq: governor: New sysfs show/store callbacks for governor tunablesViresh Kumar2016-03-091-47/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ondemand and conservative governors use the global-attr or freq-attr structures to represent sysfs attributes corresponding to their tunables (which of them is actually used depends on whether or not different policy objects can use the same governor with different tunables at the same time and, consequently, on where those attributes are located in sysfs). Unfortunately, in the freq-attr case, the standard cpufreq show/store sysfs attribute callbacks are applied to the governor tunable attributes and they always acquire the policy->rwsem lock before carrying out the operation. That may lead to an ABBA deadlock if governor tunable attributes are removed under policy->rwsem while one of them is being accessed concurrently (if sysfs attributes removal wins the race, it will wait for the access to complete with policy->rwsem held while the attribute callback will block on policy->rwsem indefinitely). We attempted to address this issue by dropping policy->rwsem around governor tunable attributes removal (that is, around invocations of the ->governor callback with the event arg equal to CPUFREQ_GOV_POLICY_EXIT) in cpufreq_set_policy(), but that opened up race conditions that had not been possible with policy->rwsem held all the time. Therefore policy->rwsem cannot be dropped in cpufreq_set_policy() at any point, but the deadlock situation described above must be avoided too. To that end, use the observation that in principle governor tunables may be represented by the same data type regardless of whether the governor is system-wide or per-policy and introduce a new structure, struct governor_attr, for representing them and new corresponding macros for creating show/store sysfs callbacks for them. Also make their parent kobject use a new kobject type whose default show/store callbacks are not related to the standard core cpufreq ones in any way (and they don't acquire policy->rwsem in particular). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Juri Lelli <juri.lelli@arm.com> Tested-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> [ rjw: Subject & changelog + rebase ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: governor: Move common tunables to 'struct dbs_data'Viresh Kumar2016-03-091-21/+17
| | | | | | | | | | | | There are a few common tunables shared between the ondemand and conservative governors. Move them to struct dbs_data to simplify code. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Juri Lelli <juri.lelli@arm.com> Tested-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: governor: Create generic macro for common tunablesViresh Kumar2016-03-091-4/+4
| | | | | | | | | | | | | | | | Some tunables are present in governor-specific structures, whereas one (min_sampling_rate) is located directly in struct dbs_data. There is a special macro for creating its sysfs attribute and the show/store callbacks, but since more tunables are going to be moved to struct dbs_data, a new generic macro for such cases will be useful, so add it and use it for min_sampling_rate. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Juri Lelli <juri.lelli@arm.com> Tested-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com> [ rjw: Subject & changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: governor: Rearrange governor data structuresRafael J. Wysocki2016-03-091-2/+4
| | | | | | | | | | | | | | | | The struct policy_dbs_info objects representing per-policy governor data are not accessible directly from the corresponding policy objects. To access them, one has to get a pointer to the struct cpu_dbs_info of policy->cpu and use the policy_dbs field of that which isn't really straightforward. To address that rearrange the governor data structures so the governor_data pointer in struct cpufreq_policy will point to struct policy_dbs_info (instead of struct dbs_data) and that will contain a pointer to struct dbs_data. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
* cpufreq: governor: Drop cpu argument from dbs_check_cpu()Rafael J. Wysocki2016-03-091-1/+1
| | | | | | | | | Since policy->cpu is always passed as the second argument to dbs_check_cpu(), it is not really necessary to pass it, because the function can obtain that value via its first argument just fine. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
* cpufreq: governor: Rename cpu_common_dbs_info to policy_dbs_infoRafael J. Wysocki2016-03-091-1/+1
| | | | | | | | | | | | | | The struct cpu_common_dbs_info structure represents the per-policy part of the governor data (for the ondemand and conservative governors), but its name doesn't reflect its purpose. Rename it to struct policy_dbs_info and rename variables related to it accordingly. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
* cpufreq: governor: Drop the gov pointer from struct dbs_dataRafael J. Wysocki2016-03-091-1/+1
| | | | | | | | | | | | | Since it is possible to obtain a pointer to struct dbs_governor from a pointer to the struct governor embedded in it with the help of container_of(), the additional gov pointer in struct dbs_data isn't really necessary. Drop that pointer and make the code using it reach the dbs_governor object via policy->governor. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
* cpufreq: governor: Rework cpufreq_governor_dbs()Rafael J. Wysocki2016-03-091-10/+1
| | | | | | | | | | | | | | Since it is possible to obtain a pointer to struct dbs_governor from a pointer to the struct governor embedded in it via container_of(), the second argument of cpufreq_governor_init() is not necessary. Accordingly, cpufreq_governor_dbs() doesn't need its second argument either and the ->governor callbacks for both the ondemand and conservative governors may be set to cpufreq_governor_dbs() directly. Make that happen. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Saravana Kannan <skannan@codeaurora.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
* cpufreq: governor: Rename some data types and variablesRafael J. Wysocki2016-03-091-4/+4
| | | | | | | | | | | | The ondemand and conservative governors are represented by struct common_dbs_data whose name doesn't reflect the purpose it is used for, so rename it to struct dbs_governor and rename variables of that type accordingly. No functional changes. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
* cpufreq: governor: Put governor structure into common_dbs_dataRafael J. Wysocki2016-03-091-37/+41
| | | | | | | | | | | | | | | | | | | For the ondemand and conservative governors (generally, governors that use the common code in cpufreq_governor.c), there are two static data structures representing the governor, the struct governor structure (the interface to the cpufreq core) and the struct common_dbs_data one (the interface to the cpufreq_governor.c code). There's no fundamental reason why those two structures have to be separate. Moreover, if the struct governor one is included into struct common_dbs_data, it will be possible to reach the latter from the policy via its policy->governor pointer, so it won't be necessary to pass a separate pointer to it around. For this reason, embed struct governor in struct common_dbs_data. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Saravana Kannan <skannan@codeaurora.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
* cpufreq: governor: Use common mutex for dbs_data protectionRafael J. Wysocki2016-03-091-1/+0
| | | | | | | | | | | | | | | | | | | | | | Every governor relying on the common code in cpufreq_governor.c has to provide its own mutex in struct common_dbs_data. However, there actually is no need to have a separate mutex per governor for this purpose, they may be using the same global mutex just fine. Accordingly, introduce a single common mutex for that and drop the mutex field from struct common_dbs_data. That at least will ensure that the mutex is always present and initialized regardless of what the particular governors do. Another benefit is that the common code does not need a pointer to a governor-related structure to get to the mutex which sometimes helps. Finally, it makes the code generally easier to follow. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Saravana Kannan <skannan@codeaurora.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
* cpufreq: governor: Replace timers with utilization update callbacksRafael J. Wysocki2016-03-091-4/+2
| | | | | | | | | | | | | | | Instead of using a per-CPU deferrable timer for queuing up governor work items, register a utilization update callback that will be invoked from the scheduler on utilization changes. The sampling rate is still the same as what was used for the deferrable timers and the added irq_work overhead should be offset by the eliminated timers overhead, so in theory the functional impact of this patch should not be significant. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
* cpufreq: Clean up default and fallback governor setupRafael J. Wysocki2016-02-051-4/+6
| | | | | | | | | | | The preprocessor magic used for setting the default cpufreq governor (and for using the performance governor as a fallback one for that matter) is really nasty, so replace it with __weak functions and overrides. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Saravana Kannan <skannan@codeaurora.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
* cpufreq: governor: Pass policy as argument to ->gov_dbs_timer()Viresh Kumar2015-12-071-3/+3
| | | | | | | | Pass 'policy' as argument to ->gov_dbs_timer() instead of cdbs and dbs_data. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: conservative: remove 'enable' fieldViresh Kumar2015-09-261-13/+18
| | | | | | | | | | | | | | | | Conservative governor has its own 'enable' field to check if conservative governor is used for a CPU or not This can be checked by policy->governor with 'cpufreq_gov_conservative' and so this field can be dropped. Because its not guaranteed that dbs_info->cdbs.shared will a valid pointer for all CPUs (will be NULL for CPUs that don't use ondemand/conservative governors), we can't use it anymore. Lets get policy with cpufreq_cpu_get_raw() instead. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: governor: split out common part of {cs|od}_dbs_timer()Viresh Kumar2015-07-211-20/+7
| | | | | | | | | | | | | | Some part of cs_dbs_timer() and od_dbs_timer() is exactly same and is unnecessarily duplicated. Create the real work-handler in cpufreq_governor.c and put the common code in this routine (dbs_timer()). Shouldn't make any functional change. Reviewed-and-tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: governor: Keep single copy of information common to policy->cpusViresh Kumar2015-07-211-8/+10
| | | | | | | | | | | | | | | | Some information is common to all CPUs belonging to a policy, but are kept on per-cpu basis. Lets keep that in another structure common to all policy->cpus. That will make updates/reads to that less complex and less error prone. The memory for cpu_common_dbs_info is allocated/freed at INIT/EXIT, so that it we don't reallocate it for STOP/START sequence. It will be also be used (in next patch) while the governor is stopped and so must not be freed that early. Reviewed-and-tested-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: governor: rename cur_policy as policyViresh Kumar2015-07-171-5/+5
| | | | | | | | | Just call it 'policy', cur_policy is unnecessarily long and doesn't have any special meaning. Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: governor: Name delayed-work as dworkViresh Kumar2015-07-171-1/+1
| | | | | | | | | Delayed work was named as 'work' and to access work within it we do work.work. Not much readable. Rename delayed_work as 'dwork'. Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: governor: Serialize governor callbacksViresh Kumar2015-06-151-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are several races reported in cpufreq core around governors (only ondemand and conservative) by different people. There are at least two race scenarios present in governor code: (a) Concurrent access/updates of governor internal structures. It is possible that fields such as 'dbs_data->usage_count', etc. are accessed simultaneously for different policies using same governor structure (i.e. CPUFREQ_HAVE_GOVERNOR_PER_POLICY flag unset). And because of this we can dereference bad pointers. For example consider a system with two CPUs with separate 'struct cpufreq_policy' instances. CPU0 governor: ondemand and CPU1: powersave. CPU0 switching to powersave and CPU1 to ondemand: CPU0 CPU1 store* store* cpufreq_governor_exit() cpufreq_governor_init() dbs_data = cdata->gdbs_data; if (!--dbs_data->usage_count) kfree(dbs_data); dbs_data->usage_count++; *Bad pointer dereference* There are other races possible between EXIT and START/STOP/LIMIT as well. Its really complicated. (b) Switching governor state in bad sequence: For example trying to switch a governor to START state, when the governor is in EXIT state. There are some checks present in __cpufreq_governor() but they aren't sufficient as they compare events against 'policy->governor_enabled', where as we need to take governor's state into account, which can be used by multiple policies. These two issues need to be solved separately and the responsibility should be properly divided between cpufreq and governor core. The first problem is more about the governor core, as it needs to protect its structures properly. And the second problem should be fixed in cpufreq core instead of governor, as its all about sequence of events. This patch is trying to solve only the first problem. There are two types of data we need to protect, - 'struct common_dbs_data': No matter what, there is going to be a single copy of this per governor. - 'struct dbs_data': With CPUFREQ_HAVE_GOVERNOR_PER_POLICY flag set, we will have per-policy copy of this data, otherwise a single copy. Because of such complexities, the mutex present in 'struct dbs_data' is insufficient to solve our problem. For example we need to protect fetching of 'dbs_data' from different structures at the beginning of cpufreq_governor_dbs(), to make sure it isn't currently being updated. This can be fixed if we can guarantee serialization of event parsing code for an individual governor. This is best solved with a mutex per governor, and the placeholder for that is 'struct common_dbs_data'. And so this patch moves the mutex from 'struct dbs_data' to 'struct common_dbs_data' and takes it at the beginning and drops it at the end of cpufreq_governor_dbs(). Tested with and without following configuration options: CONFIG_LOCKDEP_SUPPORT=y CONFIG_DEBUG_RT_MUTEXES=y CONFIG_DEBUG_PI_LIST=y CONFIG_DEBUG_SPINLOCK=y CONFIG_DEBUG_MUTEXES=y CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y CONFIG_LOCKDEP=y CONFIG_DEBUG_ATOMIC_SLEEP=y Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: governor: register notifier from cs_init()Viresh Kumar2015-06-151-11/+15
| | | | | | | | | | Notifiers are required only for conservative governor and the common governor code is unnecessarily polluted with that. Handle that from cs_init/exit() instead of cpufreq_governor_dbs(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: conservative: set requested_freq to policy max when it is over ↵Xiaoguang Chen2013-11-121-0/+3
| | | | | | | | | | | policy max When requested_freq is over policy->max, set it to policy->max. This can help to speed up decreasing frequency. Signed-off-by: Xiaoguang Chen <chenxg@marvell.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: conservative: fix requested_freq reduction issueXiaoguang Chen2013-11-071-1/+6
| | | | | | | | | | | | | | | When decreasing frequency, requested_freq may be less than freq_target, So requested_freq minus freq_target may be negative, But reqested_freq's unit is unsigned int, then the negative result will be one larger interger which may be even higher than requested_freq. This patch is to fix such issue. when result becomes negative, set requested_freq as the min value of policy. Signed-off-by: Xiaoguang Chen <chenxg@marvell.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: governors: Remove duplicate check of target freq in supported rangeStratos Karafotis2013-08-281-4/+0
| | | | | | | | | | | Function __cpufreq_driver_target() checks if target_freq is within policy->min and policy->max range. generic_powersave_bias_target() also checks if target_freq is valid via a cpufreq_frequency_table_target() call. So, drop the unnecessary duplicate check in *_check_cpu(). Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* Merge back earlier 'pm-cpufreq' materialRafael J. Wysocki2013-08-141-13/+1
|\
| * cpufreq: Use sizeof(*ptr) convetion for computing sizesViresh Kumar2013-08-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Chapter 14 of Documentation/CodingStyle says: The preferred form for passing a size of a struct is the following: p = kmalloc(sizeof(*p), ...); The alternative form where struct name is spelled out hurts readability and introduces an opportunity for a bug when the pointer variable type is changed but the corresponding sizeof that is passed to a memory allocator is not. This wasn't followed consistently in drivers/cpufreq, let's make it more consistent by always following this rule. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
| * cpufreq: Clean up header files included in the coreViresh Kumar2013-08-071-12/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch addresses the following issues in the header files in the cpufreq core: - Include headers in ascending order, so that we don't add same many times by mistake. - <asm/> must be included after <linux/>, so that they override whatever they need to. - Remove unnecessary includes. - Don't include files already included by cpufreq.h or cpufreq_governor.h. [rjw: Changelog] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* | cpufreq: rename ignore_nice as ignore_nice_loadViresh Kumar2013-08-071-10/+10
|/ | | | | | | | | | | | | This sysfs file was called ignore_nice_load earlier and commit 4d5dcc4 (cpufreq: governor: Implement per policy instances of governors) changed its name to ignore_nice by mistake. Lets get it renamed back to its original name. Reported-by: Martin von Gagern <Martin.vGagern@gmx.net> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 3.10+ <stable@vger.kernel.org> # 3.10+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: conservative: Use an inline function to evaluate freq_targetStratos Karafotis2013-04-011-12/+16
| | | | | | | | | | Use an inline function to evaluate freq_target to avoid duplicate code. Also, define a macro for the default frequency step. Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: conservative: Fix the logic in frequency decrease checkingStratos Karafotis2013-04-011-6/+2
| | | | | | | | | | | | | When we evaluate the CPU load for frequency decrease we have to compare the load against down_threshold. There is no need to subtract 10 points from down_threshold. Instead, we have to use the default down_threshold or user's selection unmodified. Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: conservative: Fix sampling_down_factor functionalityStratos Karafotis2013-04-011-3/+8
| | | | | | | | | | | | sampling_down_factor tunable is unused since commit 8e677ce83bf41ba9c74e5b6d9ee60b07d4e5ed93 (4 years ago). This patch restores the original functionality and documents the tunable. Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: governors: Calculate iowait time only when necessaryStratos Karafotis2013-04-011-1/+1
| | | | | | | | | | | | | | | Currently we always calculate the CPU iowait time and add it to idle time. If we are in ondemand and we use io_is_busy, we re-calculate iowait time and we subtract it from idle time. With this patch iowait time is calculated only when necessary avoiding the double call to get_cpu_iowait_time_us. We use a parameter in function get_cpu_idle_time to distinguish when the iowait time will be added to idle time or not, without the need of keeping the prev_io_wait. Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr> Acked-by: Viresh Kumar <viresh.kumar@linaro.,org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: conservative: Fix relation when decreasing frequencyNamhyung Kim2013-04-011-1/+1
| | | | | | | | | The relation should be CPUFREQ_RELATION_L to find optimal frequency when decreasing. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: conservative: Break out earlier on the lowest frequencyNamhyung Kim2013-04-011-6/+6
| | | | | | | | | If we're on the lowest frequency, no need to calculate new freq. Break out even earlier in this case. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: governors: Avoid unnecessary per cpu timer interruptsViresh Kumar2013-04-011-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | Following patch has introduced per cpu timers or works for ondemand and conservative governors. commit 2abfa876f1117b0ab45f191fb1f82c41b1cbc8fe Author: Rickard Andersson <rickard.andersson@stericsson.com> Date: Thu Dec 27 14:55:38 2012 +0000 cpufreq: handle SW coordinated CPUs This causes additional unnecessary interrupts on all cpus when the load is recently evaluated by any other cpu. i.e. When load is recently evaluated by cpu x, we don't really need any other cpu to evaluate this load again for the next sampling_rate time. Some sort of code is present to avoid that but we are still getting timer interrupts for all cpus. A good way of avoiding this would be to modify delays for all cpus (policy->cpus) whenever any cpu has evaluated load. This patch does this change and some related code cleanup. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: governor: Implement per policy instances of governorsViresh Kumar2013-04-011-75/+118
| | | | | | | | | | | | | | | | | | | | | | Currently, there can't be multiple instances of single governor_type. If we have a multi-package system, where we have multiple instances of struct policy (per package), we can't have multiple instances of same governor. i.e. We can't have multiple instances of ondemand governor for multiple packages. Governors directory in sysfs is created at /sys/devices/system/cpu/cpufreq/ governor-name/. Which again reflects that there can be only one instance of a governor_type in the system. This is a bottleneck for multicluster system, where we want different packages to use same governor type, but with different tunables. This patch uses the infrastructure provided by earlier patch and implements init/exit routines for ondemand and conservative governors. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: conservative: Fix typos in commentsStratos Karafotis2013-02-091-2/+2
| | | | | | | | Fix a couple of typos in comments. Signed-off-by: Stratos Karafotis <stratosk@semaphore.gr> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: governors: Remove code redundancy between governorsViresh Kumar2013-02-021-43/+9
| | | | | | | | | | | | | | | With the inclusion of following patches: 9f4eb10 cpufreq: conservative: call dbs_check_cpu only when necessary 772b4b1 cpufreq: ondemand: call dbs_check_cpu only when necessary code redundancy between the conservative and ondemand governors is introduced again, so get rid of it. [rjw: Changelog] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Fabio Baltieri <fabio.baltieri@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: governors: fix misuse of cdbs.cpuFabio Baltieri2013-02-021-2/+3
| | | | | | | | | | | Fix governors code to set all cpu's cdbs->cpu to the the actual cpu id and use cur_policy->cpu istead of cdbs->cpu to track current governor's leader cpu. Reported-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: governors: implement generic policy_is_sharedFabio Baltieri2013-02-021-1/+1
| | | | | | | | | | | Implement a generic helper function policy_is_shared() to replace the current dbs_sw_coordinated_cpus() at cpufreq level, so that it can be used by code other than cpufreq governors. Suggested-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: conservative: call dbs_check_cpu only when necessaryFabio Baltieri2013-02-021-6/+41
| | | | | | | | Modify conservative timer to not resample CPU utilization if recently sampled from another SW coordinated core. Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: handle SW coordinated CPUsRickard Andersson2013-02-021-1/+2
| | | | | | | | | | | | | | | This patch fixes a bug that occurred when we had load on a secondary CPU and the primary CPU was sleeping. Only one sampling timer was spawned and it was spawned as a deferred timer on the primary CPU, so when a secondary CPU had a change in load this was not detected by the cpufreq governor (both ondemand and conservative). This patch make sure that deferred timers are run on all CPUs in the case of software controlled CPUs that run on the same frequency. Signed-off-by: Rickard Andersson <rickard.andersson@stericsson.com> Signed-off-by: Fabio Baltieri <fabio.baltieri@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: governors: remove redundant codeViresh Kumar2012-11-151-389/+159
| | | | | | | | | | | | | | | Initially ondemand governor was written and then using its code conservative governor is written. It used a lot of code from ondemand governor, but copy of code was created instead of using the same routines from both governors. Which increased code redundancy, which is difficult to manage. This patch is an attempt to move common part of both the governors to cpufreq_governor.c file to come over above mentioned issues. This shouldn't change anything from functionality point of view. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* cpufreq: Move common part from governors to separate file, v2viresh kumar2012-11-151-34/+0
| | | | | | | | | | | | Multiple cpufreq governers have defined similar get_cpu_idle_time_***() routines. These routines must be moved to some common place, so that all governors can use them. So moving them to cpufreq_governor.c, which seems to be a better place for keeping these routines. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
* Merge tag 'pm-for-3.7-rc1' of ↵Linus Torvalds2012-10-031-0/+2
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael J Wysocki: - Improved system suspend/resume and runtime PM handling for the SH TMU, CMT and MTU2 clock event devices (also used by ARM/shmobile). - Generic PM domains framework extensions related to cpuidle support and domain objects lookup using names. - ARM/shmobile power management updates including improved support for the SH7372's A4S power domain containing the CPU core. - cpufreq changes related to AMD CPUs support from Matthew Garrett, Andre Przywara and Borislav Petkov. - cpu0 cpufreq driver from Shawn Guo. - cpufreq governor fixes related to the relaxing of limit from Michal Pecio. - OMAP cpufreq updates from Axel Lin and Richard Zhao. - cpuidle ladder governor fixes related to the disabling of states from Carsten Emde and me. - Runtime PM core updates related to the interactions with the system suspend core from Alan Stern and Kevin Hilman. - Wakeup sources modification allowing more helper functions to be called from interrupt context from John Stultz and additional diagnostic code from Todd Poynor. - System suspend error code path fix from Feng Hong. Fixed up conflicts in cpufreq/powernow-k8 that stemmed from the workqueue fixes conflicting fairly badly with the removal of support for hardware P-state chips. The changes were independent but somewhat intertwined. * tag 'pm-for-3.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (76 commits) Revert "PM QoS: Use spinlock in the per-device PM QoS constraints code" PM / Runtime: let rpm_resume() succeed if RPM_ACTIVE, even when disabled, v2 cpuidle: rename function name "__cpuidle_register_driver", v2 cpufreq: OMAP: Check IS_ERR() instead of NULL for omap_device_get_by_hwmod_name cpuidle: remove some empty lines PM: Prevent runtime suspend during system resume PM QoS: Use spinlock in the per-device PM QoS constraints code PM / Sleep: use resume event when call dpm_resume_early cpuidle / ACPI : move cpuidle_device field out of the acpi_processor_power structure ACPI / processor: remove pointless variable initialization ACPI / processor: remove unused function parameter cpufreq: OMAP: remove loops_per_jiffy recalculate for smp sections: fix section conflicts in drivers/cpufreq cpufreq: conservative: update frequency when limits are relaxed cpufreq / ondemand: update frequency when limits are relaxed properly __init-annotate pm_sysrq_init() cpufreq: Add a generic cpufreq-cpu0 driver PM / OPP: Initialize OPP table from device tree ARM: add cpufreq transiton notifier to adjust loops_per_jiffy for smp cpufreq: Remove support for hardware P-state chips from powernow-k8 ...
| * cpufreq: conservative: update frequency when limits are relaxedMichal Pecio2012-09-141-0/+1
| | | | | | | | | | | | | | | | | | | | Reevaluate CPU load and update frequency immediately whenever limits are changed. Currently conservative doesn't do that when limits are relaxed, wasting power on systems with relatively low sampling rate. Signed-off-by: Michal Pecio <mpecio@nvidia.com> Reviewed-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
| * PM / cpufreq: Initialise the cpu field during conservative governor startAmit Daniel Kachhap2012-09-041-0/+1
| | | | | | | | | | | | | | | | | | This change initialises the cpu id field of cs_cpu_dbs_info structure in conservative governor and keep this consistent with other governors. Similar initialisation is present in ondemand governor. Signed-off-by: Amit Daniel Kachhap <amit.daniel@samsung.com> Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
* | workqueue: make deferrable delayed_work initializer names consistentTejun Heo2012-08-211-1/+1
|/ | | | | | | | | | | | | | | | | | Initalizers for deferrable delayed_work are confused. * __DEFERRED_WORK_INITIALIZER() * DECLARE_DEFERRED_WORK() * INIT_DELAYED_WORK_DEFERRABLE() Rename them to * __DEFERRABLE_WORK_INITIALIZER() * DECLARE_DEFERRABLE_WORK() * INIT_DEFERRABLE_WORK() This patch doesn't cause any functional changes. Signed-off-by: Tejun Heo <tj@kernel.org>
* Merge branch 'sched/core' of ↵Martin Schwidefsky2011-12-191-21/+20
|\ | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into cputime-tip Conflicts: drivers/cpufreq/cpufreq_conservative.c drivers/cpufreq/cpufreq_ondemand.c drivers/macintosh/rack-meter.c fs/proc/stat.c fs/proc/uptime.c kernel/sched/core.c
| * sched/accounting: Change cpustat fields to an arrayGlauber Costa2011-12-061-20/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch changes fields in cpustat from a structure, to an u64 array. Math gets easier, and the code is more flexible. Signed-off-by: Glauber Costa <glommer@parallels.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul Tuner <pjt@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1322498719-2255-2-git-send-email-glommer@parallels.com Signed-off-by: Ingo Molnar <mingo@elte.hu>